Blog: Beating the Market
We’re back! After a slight hiatus, Linear Labs is back on the grind showing off our open source tech: BLOX, the simple and extensible machine learning library. If you need to catch up, feel free to check out our last article, where we introduced BLOX and demonstrated how easy it is to build a neural network model. In this article we’re going to demonstrate a new addition to the library, Reinforcement Learning, and how to become your own Wolf of Wall Street. Before we crack into the cool stuff, let’s go over what Reinforcement Learning is and how you might be more familiar with it than you think.
What is Reinforcement Learning? RL is a family of learning algorithms in which an agent (like a neural network) takes actions in order to maximize a reward. Think of it like this: remember when you were a kid in elementary school and your teacher would give you candy for being a good student? These rewards made you want to be as good as possible, so you could get as much candy as possible. Through positive and negative feedback, this type of reward system allowed you to learn right from wrong actions. This is RL in its most basic form. Your younger self is like the AI agent, the candy is the agent’s reward, and trying to get as much candy as you could is analogous to the agent optimizing its cost function.
Now, engineers and scientists use that same type of reward system to help teach machine learning models right from wrong. Doing this allows these agents to explore the actions they’re able to take. Based on the rewards of their environment (like a child’s classroom), the agent can learn that certain actions will give it a reward (candy) and try to get as much of that reward as possible. As it turns out, this is a fairly common thing that you’re probably aware of already.
Before you get too worried — NO I’M NOT TALKING ABOUT THE ROBOT UPRISING!
What I’m talking about are machine learning models trained through RL that are being used in video games like DOTA 2, League of Legends and many more.
During training these models are given the ability to choose what they want to do (move forward, attack, hide, etc.), and each action has a reward (positive, negative or zero). For example, attacking the other team might be a positive reward worth 1, whereas receiving damage from an opponent might be a reward of -1, to discourage actions that caused the agent to receive damage.
What makes this different than when you were a kid receiving a star next to your name on the chalkboard is that these models store their memory and are allowed to replay it and try different strategies. Basically, the machine learning models get to save the game, try something risky and if it doesn’t work out, they can try something different. But those are just video games, what if we applied RL to something different?
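To make the "save the game and replay it" idea a bit more concrete, here is a minimal sketch in plain Python. The action names, the reward values and the `train_from_replay` helper are all made up for illustration, and this is not how BLOX or any game-playing agent is actually implemented. It only shows the general pattern: store experiences, replay them, and nudge a value estimate toward the reward that was observed.

```python
import random
from collections import deque

# Hypothetical actions and rewards, echoing the +1 / -1 example above.
REWARDS = {"attack": 1.0, "rush_in_and_take_damage": -1.0, "hide": 0.0}


def train_from_replay(memory, passes=10, learning_rate=0.5, seed=0):
    """Estimate the value of each action by replaying stored experiences."""
    rng = random.Random(seed)
    values = {action: 0.0 for action in REWARDS}
    for _ in range(passes):
        # Replay every stored experience once per pass, in shuffled order.
        for action, reward in rng.sample(list(memory), len(memory)):
            # Nudge the estimate toward the reward that was observed.
            values[action] += learning_rate * (reward - values[action])
    return values


# The "save game": a bounded memory of (action, reward) experiences.
memory = deque(maxlen=1000)
for action, reward in REWARDS.items():
    memory.append((action, reward))

values = train_from_replay(memory)
best_action = max(values, key=values.get)
print(values)
print("preferred action:", best_action)
```

After a few passes the estimate for the rewarded action climbs toward 1 and the punished one sinks toward -1, so the agent ends up preferring the action that earned the candy.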
AI & Wall Street
We’re taking RL to Wall Street using our library BLOX.
BLOX was designed to allow people of all experience levels to quickly experiment, test and deploy machine learning models. There are a ton of machine learning libraries out there and even more tools that wrap around them to make these libraries easy to use. How is this any different?
Let’s look at some details about the library:
- BLOX relies on one of the most familiar data-interchange formats — JSON. JSON is easy to read, incredibly well supported and regardless of your programming background, you’re probably familiar with the syntax.
- It can be interfaced easily through the command line
- You can easily create your own extensions and integrate them into the library with a single line of code
- It automatically scales across all compute nodes
- Most importantly, it makes data science and engineering easy
Throughout the rest of this tutorial we’ll be demonstrating how easy it is to use BLOX, but for more information on how great it is, check out our overview tutorial.
Building a stock trading app
Like I mentioned, we’ll be using BLOX to build, train and deploy a machine learning model that is able to not only decide if you should buy, sell or hold stock, but how many shares. We’ll break this procedure into 4 steps.
1. Installing BLOX
In order to install BLOX, navigate to our GitHub page and download the source directly. Once the package is unpacked you can simply run
$ cd BLOX && sudo python setup.py install
Or if you prefer to do even less work….
$ pip install git+https://github.com/linearlabstech/blox
From here you’ll be able to import the BLOX module in any Python 3.6+ script.
2. Downloading The Data
In our repository you’ll find a folder titled Examples. Here you’ll find two examples of how to implement BLOX; for this article we only care about the one conveniently named “Day Trader”. In this folder you’ll find a script named data_downloader.py, which will be used for pre-processing the backtesting data and downloading our test data.
In our example, we provide the backtesting data for Microsoft ($MSFT) in CSV format from 2013–2018. If you’re looking to train on a different ticker, you can check out this Kaggle repo and download your own data.
Your data should look something like this, depending on your ticker
We’ll need to convert this CSV to a BLOX readable dataset. To do this run the code below.
$ python data_downloader.py --csv <your_ticker_data_file>.csv --output_file backtest.ds
What the script does is take the features of each historical trading day (Open, Close, High, Low and Volume) as input and use the next trading day’s average price as the target (Average Price = (Open Price + Close Price) / 2).
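If it helps to see that pairing spelled out, here is a rough sketch of how each day’s features could be matched with the next day’s average price. It assumes the CSV has columns named Date, Open, High, Low, Close and Volume, and that "average price" means the midpoint of the open and close. The sample rows are made up and `build_pairs` is a hypothetical helper, not the code inside data_downloader.py.

```python
import csv
import io

# Made-up sample rows; the column names are assumptions about the CSV layout.
SAMPLE_CSV = """Date,Open,High,Low,Close,Volume
2013-01-02,10.0,11.0,9.5,10.5,1000
2013-01-03,10.5,12.0,10.0,11.5,1200
2013-01-04,11.5,12.5,11.0,12.0,900
"""

FEATURES = ["Open", "Close", "High", "Low", "Volume"]


def build_pairs(csv_text):
    """Pair each day's features with the next day's average price."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    pairs = []
    # The last day has no "next day", so it only ever serves as a target.
    for today, tomorrow in zip(rows, rows[1:]):
        inputs = [float(today[name]) for name in FEATURES]
        # Assumed definition: midpoint of the next day's open and close.
        target = (float(tomorrow["Open"]) + float(tomorrow["Close"])) / 2
        pairs.append((inputs, target))
    return pairs


pairs = build_pairs(SAMPLE_CSV)
for inputs, target in pairs:
    print(inputs, "->", target)
```

Three trading days give two training pairs, because the final day can only be used as a target.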
Last, you’ll need to run the same script in order to get the test data, which is the stock data from the last 13 weeks:
$ python data_downloader.py -t <YOUR_TICKER>
<your_ticker_data_file> is a variable, for example I used
<YOUR_TICKER> is also a variable. For this I use
3. Training our model
Now it’s time to build your model! But before we do that, let’s talk about the agent’s environment (the rules and parameters of the training environment). In the agent’s environment we’ve established a few rules ahead of time:
- Our reward is the net value of making a sale, so making money is good, losing money is bad
- We’re starting off with $10,000 straight cash money
- We initially purchase 1 share of stock
- We’re only allowed one transaction a day and we cannot buy or sell more than 55 shares at a time
- We’ll periodically record things like our wallet (cash on hand), number of shares, account value and of course our net gain
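For a feel of how rules like these fit together, here is a toy environment in plain Python. It is only a sketch under a few assumptions: the class name `ToyTradingEnv` is made up, the first share is paid for out of the starting cash, and the reward is simplified to the day-over-day change in account value rather than whatever BLOX computes internally. Calling `step()` once per entry in the price list beyond the last day would run off the end of the data.

```python
from dataclasses import dataclass, field

START_CASH = 10_000.0   # starting wallet, from the rules above
MAX_TRADE = 55          # most shares bought or sold in one transaction


@dataclass
class ToyTradingEnv:
    """Hypothetical environment: one transaction per call to step()."""
    prices: list
    cash: float = START_CASH
    shares: int = 0
    day: int = 0
    history: list = field(default_factory=list)

    def __post_init__(self):
        # Initial purchase of one share at the first day's price (assumption).
        self.cash -= self.prices[0]
        self.shares = 1

    def account_value(self):
        return self.cash + self.shares * self.prices[self.day]

    def step(self, order):
        """Apply one order (positive buys, negative sells), then advance a day."""
        price = self.prices[self.day]
        order = max(-MAX_TRADE, min(MAX_TRADE, order))  # per-trade cap
        order = max(order, -self.shares)                # cannot sell shares we lack
        order = min(order, int(self.cash // price))     # cannot spend cash we lack
        before = self.account_value()
        self.cash -= order * price
        self.shares += order
        self.day += 1
        # Assumed reward: change in account value once the price has moved.
        reward = self.account_value() - before
        # Periodic record of wallet, share count and account value.
        self.history.append((self.cash, self.shares, self.account_value()))
        return reward


env = ToyTradingEnv(prices=[100.0, 110.0, 105.0])
first_reward = env.step(80)     # asks for 80 shares, capped at 55
second_reward = env.step(-500)  # asks to sell 500 shares, capped at 55
print(first_reward, second_reward, env.history)
```

Clamping the order inside the environment means the agent can propose anything it likes and the rules still hold.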
Alright, let’s begin. Using the provided config and network files, you’ll first want to train your model like so
$ blox -c train.cfg.json --train
This will start training your model on the backtesting data you just converted. After a few epochs (a full cycle through the training data), our net value looked a little something like this.
253.2% ROI after 5 mins of training!
Pretty sick, right? But let’s say you’re not totally blown away. Maybe you’re asking questions like
The stock market has grown quite a bit over the last half decade. How much would you have gained if you just bought in and did nothing?
That’s a really fair question, especially since you can purchase things like index funds that are tied to the market and usually slightly outperform mutual funds according to this article. So let’s take a look at MSFT’s performance over this time period and compare.
For the period of training data that we had, Microsoft’s stock value rose ~172%! That’s insane, but not 253.2% insane! So that puts us about 81 percentage points ahead of simply buying and holding.
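Here is the arithmetic behind that comparison as a small sketch. It assumes both strategies start from the same $10,000 and uses the two ROI figures quoted above. The helper names are made up for illustration.

```python
def roi_percent(initial_value, final_value):
    """Return on investment, as a percentage of the starting value."""
    return (final_value - initial_value) / initial_value * 100.0


def final_value(initial_value, roi):
    """Ending account value implied by a given ROI percentage."""
    return initial_value * (1.0 + roi / 100.0)


START = 10_000.0          # assumed identical starting cash for both strategies
AGENT_ROI = 253.2         # figure quoted above for the trained agent
BUY_AND_HOLD_ROI = 172.0  # approximate figure quoted above for MSFT

agent_final = final_value(START, AGENT_ROI)
hold_final = final_value(START, BUY_AND_HOLD_ROI)

# Difference between the two ROIs, in percentage points.
gap_in_points = AGENT_ROI - BUY_AND_HOLD_ROI
# How much more money the agent ends with, relative to buy-and-hold.
relative_gain = (agent_final / hold_final - 1.0) * 100.0

print(f"agent ends with        ${agent_final:,.0f}")
print(f"buy-and-hold ends with ${hold_final:,.0f}")
print(f"gap: {gap_in_points:.1f} percentage points")
print(f"relative gain: {relative_gain:.1f}%")
```

The two comparisons answer different questions: the gap in percentage points is the difference between the ROIs, while the relative gain (just under 30% here) is how much more money the agent finishes with than buy-and-hold.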
When I was discussing RL in the section above, I briefly mentioned agents trained to play DOTA 2. OpenAI trained agents to play it, and they were actually really good at it. In fact, they were so good at the game that, after training, the team noticed the agent had developed new strategies for winning. So I think we should ask: is there anything that we can learn from our agent in order to get better at trading?
To do this, let’s first examine our agent’s trading skills (our net value) while it was learning.
Hmmm, it seems like it wasn’t really returning a good ROI until after the fourth training epoch. Since we recorded various statistics like cash on hand and number of shares in our account, we can investigate those to help explain what changed.
Here we have the number of shares in our account while training. The dashed vertical lines mark roughly the end of each epoch, which means we can start to identify the pattern it’s learning for trading. In the first three epochs we see the agent making large trades frequently, which aren’t providing much value in the end. Afterwards, and specifically in the last epoch, we notice that it initially buys as many shares as it can, making only small trades to gain value. We then see an abrupt sale of almost all of its shares, followed later by a very large purchase. But why? Let’s look at Microsoft’s historical data to see what inspired that type of reaction.
Looking at the period when the agent takes its trading break, we see higher volatility in the stock price (2015–2016). It’s likely the agent starts to identify indicators of volatility and loss and can predict the best times to invest or sit it out.
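We can’t see which indicators the agent actually picked up on, but one conventional way to put a number on "volatility" is the rolling standard deviation of daily returns. The sketch below uses made-up prices and hypothetical helper names. It illustrates the measurement only, and says nothing about what the network has learned internally.

```python
import statistics


def daily_returns(prices):
    """Fractional price change from each day to the next."""
    return [(today - yesterday) / yesterday
            for yesterday, today in zip(prices, prices[1:])]


def rolling_volatility(prices, window):
    """Standard deviation of daily returns over a sliding window."""
    returns = daily_returns(prices)
    return [statistics.pstdev(returns[end - window:end])
            for end in range(window, len(returns) + 1)]


# Made-up price series: one drifting gently upward, one swinging wildly.
calm = [100.0, 101.0, 102.0, 103.0, 104.0]
choppy = [100.0, 110.0, 95.0, 112.0, 90.0]

calm_vol = rolling_volatility(calm, window=3)
choppy_vol = rolling_volatility(choppy, window=3)
print("calm:  ", calm_vol)
print("choppy:", choppy_vol)
```

A stretch like 2015–2016 would show up as a run of high values in a series like `choppy_vol`, which is the kind of signal a trader (human or otherwise) might use to sit things out.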
Ok, so we showed how our model can outperform the market and recognize patterns of volatility and loss, but neural nets are subject to overfitting — especially for small data sets. So how do we know we can generalize outside of our training data?
Let’s test our model on some other data that it hasn’t seen before. Let’s make sure we’re actually able to deliver on our gainz.
We’ll use BLOX again, but this time we’ll use the --test flag and the test data we already downloaded.
$ blox -c test.cfg.json --test
This command will run a single pass over the dataset we’ve selected. Below are our test results.
The results are in! Using our test data from the last 65 days, we see about 6.6% ROI. That’s awesome!
We demonstrated success using BLOX, but easy training isn’t the only reason it was built. We built BLOX because it makes it easy to scale and deploy ML models as a service. Since we’re dealing with money, the stock market and our library, we have to state:
DISCLAIMER: THIS IS ONLY FOR DEMONSTRATION PURPOSES PLEASE DON’T CONNECT YOUR TRADING ACCOUNT(S).
Now that we’ve gotten that out of the way, let’s actually turn our trading app into a microservice. To do this we’ll need to open three terminals on our computer. The first two will run instances of BLOX: one serves requests to a message broker, and the other consumes them and evaluates the ML model. The third runs the app that queries the model. In the first terminal run
$ blox -c test.cfg.json --serve
This will spin up a server to handle all the requests the model will be receiving. For the second terminal we’ll run
$ blox -c test.cfg.json --pipeline
The --pipeline flag initiates our ML pipeline. This will consume the data from our server and return the evaluated data. Finally, in the third terminal run
$ python day_trader_app.py -t <YOUR_TICKER>
This will download the current price of our stock and query our model on what action it thinks we should do.
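If the serve / pipeline / app split feels abstract, here is a toy version of the same pattern in a single Python process. In-memory queues stand in for the message broker, a thread stands in for the pipeline, and `fake_model` is a placeholder rule rather than a trained network. None of this is BLOX’s actual implementation, and every name below is made up for illustration.

```python
import queue
import threading

request_queue = queue.Queue()   # stands in for the message broker
response_queue = queue.Queue()  # carries evaluated results back to the app


def fake_model(features):
    """Placeholder for the trained model: returns a share count."""
    return 1 if features["price"] < features["moving_average"] else -1


def pipeline_worker():
    """Consume requests, evaluate the model, publish the results."""
    while True:
        message = request_queue.get()
        if message is None:  # sentinel value: shut the worker down
            break
        shares = fake_model(message["features"])
        response_queue.put({"id": message["id"], "shares": shares})


worker = threading.Thread(target=pipeline_worker)
worker.start()

# The "app" side: send two requests, then tell the worker to stop.
request_queue.put({"id": 1, "features": {"price": 95.0, "moving_average": 100.0}})
request_queue.put({"id": 2, "features": {"price": 105.0, "moving_average": 100.0}})
request_queue.put(None)
worker.join()

results = {}
while not response_queue.empty():
    reply = response_queue.get()
    results[reply["id"]] = reply["shares"]
print(results)
```

Keeping the producer and the consumer on opposite sides of a queue is what lets each piece be scaled or restarted on its own, which is the point of running them in separate terminals.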
In day_trader_app I’ve added a wallet to show you the performance of the model, but really what our model is returning is the number of shares you should purchase (negative means sell).
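As a tiny illustration of that sign convention, here is a hypothetical helper that turns a raw model output into a readable order. Rounding to whole shares is my assumption, and `describe_order` is not part of day_trader_app.py.

```python
def describe_order(model_output):
    """Translate a signed share count into (action, number_of_shares)."""
    shares = int(round(model_output))  # assumption: trade whole shares only
    if shares > 0:
        return ("buy", shares)
    if shares < 0:
        return ("sell", -shares)
    return ("hold", 0)


# Made-up model outputs, purely for illustration.
for output in (12.0, -3.4, 0.2):
    print(output, "->", describe_order(output))
```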
After these steps you should start to get print outs on your terminal like the ones you see below
There you go, now you have your very own app up and running, trading stocks in real time.
Now go download your own data, train your model and start trading today!
If you’re interested in learning more about RL and training agents to play games, OpenAI has a bunch of great resources.
OpenAI Gym: a toolkit for developing and comparing reinforcement learning algorithms (gym.openai.com)
Their gym library has lots of different games, from Blackjack to Atari’s Pong. Have fun and enjoy!