How To: Stock Predictor Using Machine Learning

Dexter Barahona
4 min read · Jan 12, 2020


Stock prices are always changing and are very unpredictable. In this tutorial, I am going to show you how to make a machine learning algorithm guess what a stock's price *might* be the next day, using a month's worth of past data.

My uncle has always been interested in stocks, and he is always trying to explain to me how they work and how you can make loads of money just by investing in companies that you think will succeed. Of course, this is a super simplified way of looking at it. My uncle also constantly tells me about the problems he encounters, and he once said to me:

“If only there was a way to know what the stock would look like tomorrow, then I could know when to sell.”

This brought up some interesting thoughts. Could I develop something to help him out? Maybe some kind of prototype, so that he could at least loosely estimate what the stock price would be the next day, based on data recorded over the previous month. I decided to open my laptop and get to work.

Part 1: Gathering The Data

When you create a machine learning model to act as a predictor, you always need a data set to serve as the foundation for predicting the next point.

In this case, I am going to use GOOG stock to test whether this machine learning model will work.

Gathering data can be a bit difficult, but luckily for me, there were some websites willing to give up old data.

GOOG_Stock Data (CSV file)

Once I collected the data, I put it into Excel and saved it as a CSV file so that the program could read the file and assess the data properly.
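Before moving on, it can help to check that the CSV has the columns the code in Part 2 relies on, which are "Date" and "Open". Here is a minimal sketch of that check. The sample rows, the extra column names, and the `load_stock_csv` helper are all made up for illustration; they are not real GOOG prices and not part of my original notebook.

```python
import io

import pandas as pd

# Hypothetical sample rows, NOT real GOOG prices; they only show the layout.
SAMPLE_CSV = """Date,Open,High,Low,Close
2019-12-02,100.0,101.0,99.0,100.5
2019-12-03,100.5,102.0,100.0,101.5
2019-12-04,101.5,103.0,101.0,102.0
"""


def load_stock_csv(source):
    """Read the CSV and check that it has the columns the tutorial uses."""
    df = pd.read_csv(source)
    missing = {"Date", "Open"} - set(df.columns)
    if missing:
        raise ValueError(f"CSV is missing required columns: {sorted(missing)}")
    return df


# An in-memory file stands in for the real CSV here.
df = load_stock_csv(io.StringIO(SAMPLE_CSV))
print(df.shape)  # (3, 5)
```

With the real file, you would pass its name (for example 'goog_predict.csv') instead of the in-memory sample.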

Part 2: Creating the Code

Now, I can walk you through the different parts of code that make the stock prediction possible.

The first part of working with any machine learning algorithm is to add the libraries needed:

import pandas as pd
import numpy as np
from sklearn.svm import SVR
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt

Importing LinearRegression gives us scikit-learn's linear regression model, which we will fit alongside the SVR (Support Vector Regression) models to compare their predictions.
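To get a feel for what LinearRegression does before using it on stock data, here is a small sketch on toy numbers. The data is invented and sits exactly on the line y = 2x + 1, so it says nothing about real prices; it only shows the fit-then-predict pattern.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data on the line y = 2x + 1 (invented, not stock prices).
# scikit-learn expects X to be 2-D: one row per sample, one column per feature.
X = np.array([[1], [2], [3], [4]])
y = np.array([3.0, 5.0, 7.0, 9.0])

model = LinearRegression()
model.fit(X, y)

# Ask the fitted model for the value at x = 5.
next_value = model.predict([[5]])[0]
print(next_value)
```

Because the toy points lie on a perfect line, the prediction for x = 5 comes out at about 11. Real stock data will not line up this neatly.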

Next we want to build the algorithm itself:

# Lines starting with # are comments that explain what each part of the code does

# These lists will hold the x and y data sets
dates = []
prices = []

# Since I'm using Google Colab, I can simply upload the data file into it
from google.colab import files
uploaded = files.upload()
df = pd.read_csv('goog_predict.csv')

# Preview the first 7 rows of the data frame (just to look at, not to test)
df.head(7)

# Now we want to know how many rows and columns we have
df.shape

# Look at the last row of data
df.tail(1)

# Keep all the data except for the last row
df = df.head(len(df)-1)
df

# Obtain the new shape of the data
df.shape

# Gather rows from the "Date" column
df_dates = df.loc[:, 'Date']

# Gather rows from the "Open" column
df_open = df.loc[:, 'Open']

# Independent data set x: the day number (the third '-'-separated part of each date)
for date in df_dates:
    dates.append([int(date.split('-')[2])])

# Dependent data set y: the opening price
for open_price in df_open:
    prices.append(float(open_price))

# Print the dates recorded
print(dates)
def predict_prices(dates, prices, x):
    # Create the 3 Support Vector Regression models
    svr_lin = SVR(kernel='linear', C=1e3)
    svr_poly = SVR(kernel='poly', C=1e3, degree=2)
    svr_rbf = SVR(kernel='rbf', C=1e3, gamma=0.1)

    # Train the SVR models
    svr_lin.fit(dates, prices)
    svr_poly.fit(dates, prices)
    svr_rbf.fit(dates, prices)

    # Create the Linear Regression model
    lin_reg = LinearRegression()

    # Train the Linear Regression model
    lin_reg.fit(dates, prices)

    # Plot the models on a graph to see which has the best fit
    plt.scatter(dates, prices, color='black', label='Data')
    plt.plot(dates, svr_rbf.predict(dates), color='red', label='SVR RBF')
    plt.plot(dates, svr_poly.predict(dates), color='blue', label='SVR Poly')
    plt.plot(dates, svr_lin.predict(dates), color='green', label='SVR Linear')
    plt.plot(dates, lin_reg.predict(dates), color='orange', label='Linear Reg')
    plt.xlabel('Days')
    plt.ylabel('Price')
    plt.title('Regression')
    plt.legend()
    plt.show()

    # Return one prediction per model: RBF, linear SVR, polynomial SVR, linear regression
    return svr_rbf.predict(x)[0], svr_lin.predict(x)[0], svr_poly.predict(x)[0], lin_reg.predict(x)[0]

# PREDICT! Ask every model for the price on day 28
predicted_price = predict_prices(dates, prices, [[28]])
print(predicted_price)
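The step that turns each date into a number is easy to miss, so here it is on its own as a small sketch. The `day_of_month` helper and the sample dates are mine, added for illustration, and the sketch assumes the dates are written as year-month-day separated by dashes, which is what the split on '-' in the code above relies on.

```python
def day_of_month(date_string):
    """Return the third '-'-separated part of a date string as an integer."""
    return int(date_string.split("-")[2])


# Hypothetical sample dates, assuming a 'YYYY-MM-DD' layout.
sample_dates = ["2019-12-02", "2019-12-03", "2019-12-09"]

# Each day is wrapped in its own list because scikit-learn expects
# a 2-D input: one row per sample, one column per feature.
features = [[day_of_month(d)] for d in sample_dates]
print(features)  # [[2], [3], [9]]
```

Using only the day number keeps things simple, but it also means the model only makes sense for data from a single month, since day 2 of one month and day 2 of the next would look identical to it.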

Part 3: Reading the Data

Once all the code has been run in Google Colab, you should get something like this:

Stock Prediction Regression model

We can assume that the next data point will be somewhere on the Linear Regression line, and this is going to make it much easier for my uncle to guess whether he should sell or keep his stock.
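The predict_prices function prints its four predictions as a plain tuple, which can be hard to read. Here is a minimal sketch that pairs each number with its model, in the same order the function returns them. The `label_predictions` helper is my own addition for illustration, and the numbers are placeholders, not actual predictions.

```python
def label_predictions(predicted):
    """Pair each value returned by predict_prices with the model that made it."""
    # Same order as the return statement in predict_prices.
    model_names = ("SVR RBF", "SVR Linear", "SVR Poly", "Linear Reg")
    if len(predicted) != len(model_names):
        raise ValueError("expected one prediction per model")
    return dict(zip(model_names, predicted))


# Placeholder numbers standing in for a real predicted_price tuple.
placeholder = (1.0, 2.0, 3.0, 4.0)
labeled = label_predictions(placeholder)
for name, value in labeled.items():
    print(f"{name}: {value:.2f}")
```

In the notebook, you would pass predicted_price instead of the placeholder tuple.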

I should mention that using machine learning to predict stocks is very difficult. This is simply a foundation for what we could do to predict using past data. It is very hard to predict with an algorithm because so many things affect stocks, such as current events, as well as past and future events.

Thank you for reading and hopefully following this tutorial with me!

If you want to see my algorithm in action please visit:

https://colab.research.google.com/drive/1KUdBL2iG_9LPnmHeNIjCFqdfLAz5xaxH

Social Media: 👤

If you enjoyed this article or had any questions or concerns please contact me at dexteralxbarahona@gmail.com

Connect with me on LinkedIn at https://www.linkedin.com/in/dexter-barahona-723314194

Instagram: DexterBarahona
