How to Predict Netflix Stock Price using Machine Learning in R

Step 1: Importing the required libraries

The table below lists the packages required for this R machine learning project:

Package     Uses
---------   ------------------------------------------------------------------------------
smooth      Smoothing techniques and forecasting models for time series analysis.
forecast    Forecasting time series data.
xts         Handling and manipulating time series data.
imputeTS    Functions for handling missing values in time series data.
fpp2        Datasets and additional forecasting tools.
tseries     Functions for time series analysis, including tests for stationarity.
ggfortify   Easy visualization of time series objects.
ggplot2     A popular package for creating complex and customizable plots in R.
quantmod    Tools to fetch, analyze, and visualize financial market data.

R
#Install and load libraries
#Smoothing techniques for time series analysis.
install.packages("smooth")
library(smooth)

# Used for forecasting time series data.
install.packages("forecast")
library(forecast)

#Used for handling and manipulating time series data
install.packages("xts")
library(xts)

#handle missing values in time series data
install.packages("imputeTS")
library(imputeTS)

#provides datasets
install.packages("fpp2")
library(fpp2)

#functions for time series analysis
install.packages("tseries")
library(tseries)

#visualization of time series objects 
install.packages("ggfortify")
library(ggfortify)

#customizable plots in R
install.packages("ggplot2")
library(ggplot2)

# fetch financial market data
install.packages("quantmod")
library(quantmod)
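Calling install.packages() on every run downloads packages again even when they are already present. A minimal sketch of an alternative, assuming you only want to install what is missing, follows. The helper name missing_packages is hypothetical and not part of any package, and the install call is guarded by interactive() in this sketch so that a script never downloads anything unattended.

```r
# Packages used in this project
required <- c("smooth", "forecast", "xts", "imputeTS", "fpp2",
              "tseries", "ggfortify", "ggplot2", "quantmod")

# Hypothetical helper: return the subset of `pkgs` that is not installed
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(required)

# Install only what is missing, and only in an interactive session
if (length(to_install) > 0 && interactive()) {
  install.packages(to_install)
}
```

After this, the library() calls from the block above can be run as before.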

Step 2: Loading the Netflix Stock Price Dataset

Here we load the Netflix stock price data. There are two ways to do this, depending on where the data comes from.

  • Loading dataset from CSV file
R
# Loading the required data from a local CSV file (if you use an external data set)
df = read.csv("/content/NFLX.csv")
  • Loading dataset from Finance websites
R
# Here we use the getSymbols() function to collect the data from Yahoo Finance
getSymbols('NFLX', from = '2002-01-01', to = '2024-01-01')
df = NFLX

# View dataset
head(df)

Output:

           NFLX.Open NFLX.High NFLX.Low NFLX.Close NFLX.Volume NFLX.Adjusted
2002-05-23  1.156429  1.242857 1.145714   1.196429   104790000      1.196429
2002-05-24  1.214286  1.225000 1.197143   1.210000    11104800      1.210000
2002-05-28  1.213571  1.232143 1.157143   1.157143     6609400      1.157143
2002-05-29  1.164286  1.164286 1.085714   1.103571     6757800      1.103571
2002-05-30  1.107857  1.107857 1.071429   1.071429    10154200      1.071429
2002-05-31  1.078571  1.078571 1.071429   1.076429     8464400      1.076429
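The two loading routes do not return the same kind of object: getSymbols() gives a time-indexed object with the dates as the index, while read.csv() gives a plain data frame with the date as an ordinary column. The sketch below shows what that means for the CSV route. It writes a three-row sample file so it can run on its own; the header layout (Date, Open, High, Low, Close, Adj Close, Volume) is an assumption based on the usual Yahoo Finance download format, so check the header of your own file.

```r
# Write a tiny sample file so the sketch is self-contained.
# The header layout is an assumption; verify it against your own CSV.
csv.path <- tempfile(fileext = ".csv")
writeLines(c(
  "Date,Open,High,Low,Close,Adj Close,Volume",
  "2002-05-23,1.156429,1.242857,1.145714,1.196429,1.196429,104790000",
  "2002-05-24,1.214286,1.225000,1.197143,1.210000,1.210000,11104800",
  "2002-05-28,1.213571,1.232143,1.157143,1.157143,1.157143,6609400"
), csv.path)

df.csv <- read.csv(csv.path)

# The Date column is read as text, so convert it and sort chronologically
df.csv$Date <- as.Date(df.csv$Date)
df.csv <- df.csv[order(df.csv$Date), ]

# With this layout the 4th column is Low, not Close,
# so select the closing price by name rather than by position
names(df.csv)[4]
close.csv <- df.csv$Close
```

Selecting columns by name keeps later steps correct whichever loading route was used.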

Step 3: Checking the dimension and missing values of our data

Here we check the dimensions of the dataset and count the missing values in each column.

R
# Check the dimension of the dataset
dim(df)

# Check the missing values of all the columns of the dataset
colSums(is.na(df))

Output:

[1] 5439    6

NFLX.Open NFLX.High NFLX.Low NFLX.Close NFLX.Volume NFLX.Adjusted
0 0 0 0 0 0
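This dataset has no missing values, so no imputation is needed here. If your own data does contain gaps, a minimal sketch of linear interpolation is shown below. It uses base R's approx() on a made-up vector so that it runs without extra packages; the imputeTS package loaded in Step 1 offers na_interpolation() for the same purpose.

```r
# Made-up closing prices with gaps (illustrative values only)
x <- c(1.20, NA, 1.16, 1.10, NA, NA, 1.07)

# Count the missing values
sum(is.na(x))

# Linearly interpolate the gaps from the neighbouring observed values
idx <- seq_along(x)
observed <- !is.na(x)
filled <- approx(idx[observed], x[observed], xout = idx)$y

filled
```

Note that approx() with its default settings leaves leading or trailing gaps as NA, because there is no observed value on one side to interpolate from.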

Step 4: Taking the summary of the data

We print summary statistics to get a basic idea of the dataset.

R
# Checking the summary of the data
summary(df)

Output:

     Index              NFLX.Open          NFLX.High          NFLX.Low
 Min.   :2002-05-23   Min.   :  0.3779   Min.   :  0.4107   Min.   :  0.3464
 1st Qu.:2007-10-16   1st Qu.:  4.1143   1st Qu.:  4.1936   1st Qu.:  4.0400
 Median :2013-03-13   Median : 33.9957   Median : 34.5543   Median : 33.5100
 Mean   :2013-03-11   Mean   :132.3833   Mean   :134.4291   Mean   :130.2730
 3rd Qu.:2018-08-04   3rd Qu.:255.3800   3rd Qu.:261.5600   3rd Qu.:249.5550
 Max.   :2023-12-29   Max.   :692.3500   Max.   :700.9900   Max.   :686.0900
   NFLX.Close        NFLX.Volume         NFLX.Adjusted
 Min.   :  0.3729   Min.   :   285600   Min.   :  0.3729
 1st Qu.:  4.1214   1st Qu.:  5922600   1st Qu.:  4.1214
 Median : 33.9600   Median : 10018000   Median : 33.9600
 Mean   :132.4029   Mean   : 15907149   Mean   :132.4029
 3rd Qu.:255.1150   3rd Qu.: 18833300   3rd Qu.:255.1150
 Max.   :691.6900   Max.   :323414000   Max.   :691.6900

Step 5: Plotting the data

We will use the chartSeries() function from the quantmod package, which is typically used for visualizing financial and stock market data. With type = 'auto', it automatically selects an appropriate chart type based on the data provided.

R
chartSeries(df, type = 'auto')

Output:

Predicting Stock Prices in R

Next, we visualize the distribution of the closing price as a first check on whether the data is stationary.

R
ggplot(df, aes(x = NFLX.Close))+
  geom_density(alpha = 0.5, fill = "blue") +
  geom_histogram(aes(y = after_stat(density)),
                 color = "black", 
                 fill = "lightgray", 
                 bins = 30, alpha = 0.4) +
  labs(title = "Density and Histogram of Close Price",
       x = "Close Price",
       y = "Density") +
  theme_minimal()

Output:

Predicting Stock Prices in R

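The closing price is clearly not normally distributed: most observations sit at low prices, with a long tail toward high prices. Together with the strong upward trend in the price chart, this suggests the series is non-stationary. A distribution plot alone cannot prove that, however, since non-normal data can still be stationary, so a formal stationarity test is the more reliable check.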
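A formal check of stationarity can be sketched with the Augmented Dickey-Fuller test, available as adf.test() in the tseries package loaded in Step 1. Its null hypothesis is that the series has a unit root (is non-stationary), so a small p-value is evidence of stationarity. The example below uses a simulated random walk with drift as a stand-in for the price series, which is an assumption made only so the code runs without downloading data; with the real data you would pass the closing price column instead.

```r
set.seed(123)

# Synthetic stand-in for a price series: a random walk with drift
price <- cumsum(0.1 + rnorm(500))

# First difference of the series (day-to-day change)
price.diff <- diff(price)

# Run the ADF test on the levels and on the differences,
# only if the tseries package is available
if (requireNamespace("tseries", quietly = TRUE)) {
  adf.level <- suppressWarnings(tseries::adf.test(price))
  adf.diff  <- suppressWarnings(tseries::adf.test(price.diff))

  # A large p-value: cannot reject non-stationarity
  # A small p-value: evidence that the series is stationary
  print(adf.level$p.value)
  print(adf.diff$p.value)
}
```

If the levels look non-stationary but the differenced series looks stationary, that points to an ARIMA model with d = 1, meaning the model works on the differenced series.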

Step 6: Model building

We extract the closing price into df.close and split it in an 80:20 ratio: the first 80% of the observations are used for training and the remaining 20% for testing (validation). Because this is a time series, the split is chronological rather than random.

We will then fit an ARIMA model on the training set to predict stock prices.

R
# df.close holds only the closing price (the 4th column of the getSymbols() data)
df.close = df[, 4]

# Train test split: use whole-number indices so the two sets do not overlap
n = nrow(df.close)
n.train = floor(0.8 * n)

df.close.train = df.close[1:n.train]       # first 80% of the rows
df.close.test = df.close[(n.train + 1):n]  # remaining 20% of the rows
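The arithmetic of the split is worth checking, because 80% of the row count is usually not a whole number and the train and test sets must not share any rows. Here is a minimal sketch using the 5439 rows reported in Step 3; the helper name split_index is hypothetical and it does not handle edge cases such as a fraction of 0 or 1.

```r
# Hypothetical helper: row indices for a chronological train/test split
split_index <- function(n, train.frac = 0.8) {
  n.train <- floor(train.frac * n)   # round down to a whole row count
  list(train = seq_len(n.train),
       test  = (n.train + 1):n)
}

idx <- split_index(5439)

length(idx$train)                        # 4351 rows for training
length(idx$test)                         # 1088 rows for testing
length(intersect(idx$train, idx$test))   # 0 rows shared
```

Every row lands in exactly one of the two sets, and all test rows come after all training rows in time.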

Step 7: Model Fitting

R
# Let auto.arima() search for the best ARIMA model on the training data
# (df.close.arima is just a naming convention)
df.close.arima = auto.arima(df.close.train,
                            seasonal = TRUE,
                            stepwise = TRUE,
                            nmodels = 100,
                            trace = TRUE,
                            biasadj = TRUE)

Output:

 Fitting models using approximations to speed things up...

ARIMA(2,1,2) with drift : 21853.71
ARIMA(0,1,0) with drift : 21847.69
ARIMA(1,1,0) with drift : 21848.52
ARIMA(0,1,1) with drift : 21847.56
ARIMA(0,1,0) : 21847.87
ARIMA(1,1,1) with drift : 21848.77
ARIMA(0,1,2) with drift : 21849.32
ARIMA(1,1,2) with drift : 21850.01
ARIMA(0,1,1) : 21847.64

Now re-fitting the best model(s) without approximations...

ARIMA(0,1,1) with drift : 21849.74

Best model: ARIMA(0,1,1) with drift
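The number printed next to each candidate in the trace is the information criterion used to rank the models (AICc by default in auto.arima()); lower is better. To illustrate what the selected model does, the sketch below fits an ARIMA(0,1,1) with drift and scores its forecasts on a hold-out set. It is not the code of this project: it uses simulated data and base R's arima(), with a time index passed as xreg to represent the drift term, so that it runs without extra packages. In the forecast package, Arima() with include.drift = TRUE plays the same role.

```r
set.seed(42)

# Synthetic stand-in for a price series: a random walk with drift
n <- 300
y <- cumsum(0.05 + rnorm(n))

# Chronological 80:20 split
n.train <- floor(0.8 * n)
train <- y[1:n.train]
test  <- y[(n.train + 1):n]

# ARIMA(0,1,1) with drift: after differencing, the time index
# becomes a constant, whose coefficient is the drift
t.train <- seq_len(n.train)
fit <- arima(train, order = c(0, 1, 1), xreg = t.train)

# Forecast as many steps ahead as there are test observations
h <- length(test)
t.test <- n.train + seq_len(h)
pred <- predict(fit, n.ahead = h, newxreg = t.test)

# Forecast errors on the hold-out set
rmse <- sqrt(mean((test - pred$pred)^2))
mae  <- mean(abs(test - pred$pred))

coef(fit)
c(RMSE = rmse, MAE = mae)
```

Because the MA(1) term only affects the first forecast step, long-horizon forecasts from this model are essentially a straight line with slope equal to the estimated drift.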

Netflix Stock Price Prediction & Forecasting using Machine Learning in R

Many people have recently turned their attention to the stock market because it offers high returns along with high risk. Put simply, a stock is ownership of a small part of a company: the more stock you hold, the larger your share of ownership. Using machine learning algorithms to predict a company’s stock price means forecasting the future value of that stock. Because stock prices are dynamic and volatile, driven by many factors, predicting them is challenging.

Table of Content

  • DataSet Used for Netflix Stock Price Prediction
  • Model Used for Netflix Stock Price Prediction
  • How to Predict Netflix Stock Price using Machine Learning in R
    • Step 1: Importing the required libraries
    • Step 2: Loading the Netflix Stock Price Dataset
    • Step 3: Checking the dimension and missing values of our data
    • Step 4: Taking the summary of the data
    • Step 5: Plotting the data
    • Step 6: Model building
    • Step 7: Model Fitting
  • Executing and Checking the Model Summary
    • Checking Accuracy of Netflix Stock Price Prediction Model
    • Performance Comparison on Netflix Stock Price Prediction Model on Training vs Test Data Set
  • Predict Netflix Stock Price
    • Calculate Test accuracy score

DataSet Used for Netflix Stock Price Prediction

For this R Machine Learning Project, we have used the “2002-01-01” to “2022-12-31” Netflix stock price data. This data can be fetched from either of the below sources:...

Model Used for Netflix Stock Price Prediction

Here we will use only the Close price of the Netflix stock for prediction and we will use the ARIMA (p, d, q) model for the prediction....

Executing and Checking the Model Summary

Now we will check the summary of the model....

Predict Netflix Stock Price

With the help of the ARIMA() function, we compare the model accuracy for different values of (p, d, q) and try to find the best predicted values....