This post is the first in a series where I learn about ML by applying it to the stock market. You can find the code for this blog post here.
Word2Vec is a simple but surprisingly powerful algorithm. It builds word
vectors that represent word meanings, and it learns those meanings solely
from the words that surround each word. You can then feed these word vectors
into other machine learning algorithms to improve their performance and to
uncover interesting abstractions.
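To make "surrounding words" concrete, here is a minimal sketch of how skip-gram style training pairs can be generated from a sentence. The function name `skipgram_pairs` and the `window` parameter are illustrative choices, not taken from the linked code, and a real Word2Vec pipeline also does subsampling and negative sampling, which are omitted here.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)                 # clamp the window at the start
        hi = min(len(tokens), i + window + 1)   # and at the end of the sentence
        for j in range(lo, hi):
            if j != i:                          # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


sentence = "the quick brown fox".split()
pairs = skipgram_pairs(sentence, window=1)
for center, context in pairs:
    print(center, "->", context)
```

Each pair becomes one training example: given the center word, the model is asked to predict the context word, and words that share contexts end up with similar vectors.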
What happens if we apply Word2Vec to the stock market?
Choosing the Stock2Vec Window
In Word2Vec, the window for each word is its surrounding words. For
Stock2Vec, how should we pick the surrounding stocks for each stock?
The stock market has plenty of variables to choose from: price, volume,
moving average, etc. After talking it through with my friend Yuhi,
who works in finance, we chose the price-to-earnings ratio, which
reflects the market's expectations for a company's growth.
Hopefully the model will learn something!
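One plausible way to turn price-to-earnings into a window is sketched below. It assumes that, for each trading day, stocks are sorted by P/E and that a stock's "surrounding stocks" are its neighbors in that ranking. This is an assumption made for illustration and may differ from what the linked code does. The tickers and values are made-up placeholders, and the helper names are hypothetical.

```python
def pe_sorted_sequence(pe_by_ticker):
    """Order one day's tickers by P/E, skipping missing or non-positive values."""
    valid = {t: pe for t, pe in pe_by_ticker.items() if pe is not None and pe > 0}
    return sorted(valid, key=valid.get)


def context_pairs(sequence, window=1):
    """Pair each ticker with its neighbors in the P/E ranking."""
    pairs = []
    for i, center in enumerate(sequence):
        lo = max(0, i - window)
        hi = min(len(sequence), i + window + 1)
        pairs.extend((center, sequence[j]) for j in range(lo, hi) if j != i)
    return pairs


# Placeholder data for a single day (not real market values).
day = {"AAA": 12.0, "BBB": 30.5, "CCC": 11.2, "DDD": None, "EEE": 29.0}
sequence = pe_sorted_sequence(day)
stock_pairs = context_pairs(sequence, window=1)
print(sequence)
print(stock_pairs)
```

Under this assumption each day's ranking plays the role of a sentence, so stocks that the market values similarly over many days would keep appearing in each other's windows.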
Building Stock2Vec
While reading the API documentation for Quantopian’s Zipline, I came across the Quandl Wiki Dataset, which has up to 40 years of free end-of-day data for 3,000 companies. I combined the price data with fundamentals data purchased from Sharadar, using Postgres to merge the two. The preprocessing code is here.
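The merge step can be sketched roughly as follows. To keep the example self-contained it uses Python's built-in `sqlite3` with an in-memory database as a stand-in for Postgres. The table names, column names, and rows are hypothetical placeholders, not the actual Quandl or Sharadar schemas. Joining on an exact date is also a simplification, since fundamentals are usually reported far less often than daily prices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE prices (ticker TEXT, date TEXT, close REAL);
    CREATE TABLE fundamentals (ticker TEXT, date TEXT, eps REAL);
    """
)

# Placeholder rows (not real market data).
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [("AAA", "2017-01-03", 50.0), ("BBB", "2017-01-03", 90.0), ("CCC", "2017-01-03", 20.0)],
)
conn.executemany(
    "INSERT INTO fundamentals VALUES (?, ?, ?)",
    [("AAA", "2017-01-03", 5.0), ("BBB", "2017-01-03", 3.0)],
)

# Inner join keeps only tickers present in both tables, then derives P/E
# as closing price divided by earnings per share.
rows = conn.execute(
    """
    SELECT p.ticker, p.date, p.close / f.eps AS pe
    FROM prices AS p
    JOIN fundamentals AS f
      ON p.ticker = f.ticker AND p.date = f.date
    WHERE f.eps > 0
    ORDER BY p.date, pe
    """
).fetchall()

for ticker, date, pe in rows:
    print(ticker, date, pe)
```

The `WHERE f.eps > 0` filter drops companies with zero or negative earnings, for which a price-to-earnings ratio is undefined or hard to interpret.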
I adapted the embeddings project that was part of the Udacity Deep Learning Class I just finished last week :), trained the model on FloydHub, and visualized the results with TensorBoard. You can find the training code here.
Link:
https://medium.com/towards-data-science/stock2vec-from-ml-to-p-e-2e6ba407c24
Original link:
http://weibo.com/1402400261/F5aFIo6PU?from=page_1005051402400261_profile&wvr=6&mod=weibotime&type=comment#_rnd1495972513908