Implementation of Removing Non-Stationarity

This section presents essential data preprocessing techniques for achieving stationarity in time series analysis. Techniques include detrending, seasonal adjustment, logarithmic transformation, and differencing, followed by stationarity tests to validate the transformations, ensuring robust and accurate analysis of the data.

Importing Necessary Libraries and Creating Sample Data

Python3

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Sample data: one year of daily standard-normal observations
date_rng = pd.date_range(start='2022-01-01', end='2022-12-31', freq='D')
ts = pd.Series(np.random.randn(len(date_rng)), index=date_rng)


Detrending using a rolling window

  1. ts_detrended = ts - ts.rolling(window=30).mean(): This calculates the detrended series by subtracting the rolling mean from the original time series ts. The rolling(window=30).mean() computes the rolling mean over a window of size 30.
  2. Plotting: This code plots both the original and detrended series using matplotlib.
Python3

# Detrending using a rolling window
ts_detrended = ts - ts.rolling(window=30).mean()

# Plot original and detrended series
plt.figure(figsize=(14, 7))
plt.plot(ts, label='Original')
plt.plot(ts_detrended, label='Detrended', linestyle='--')
plt.legend()
plt.show()

Output: (plot of the original series with the 30-day detrended series overlaid as a dashed line)

Test to determine stationarity

Python3

from statsmodels.tsa.stattools import adfuller

# Test for stationarity after detrending
result_detrended = adfuller(ts_detrended.dropna())
print(f'ADF Statistic (Detrended): {result_detrended[0]}')
print(f'p-value (Detrended): {result_detrended[1]}')
print(f'Critical Values (Detrended): {result_detrended[4]}')

Output:

ADF Statistic (Detrended): -18.559254822829608
p-value (Detrended): 2.0882820619850462e-30
Critical Values (Detrended): {'1%': -3.4500219858626227, '5%': -2.870206553997666, '10%': -2.571387268879483}


The p-value is far below common significance levels (e.g., 0.05), giving strong evidence against the null hypothesis. For the ADF test, the null hypothesis is that the series has a unit root (i.e., it is non-stationary). The small p-value lets us reject the null hypothesis and conclude that the detrended series is stationary.

The ADF statistic measures the strength of evidence against the null hypothesis of non-stationarity. Here it is strongly negative, well below all three critical values, which supports stationarity.
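The decision rule used above can be wrapped in a small helper. `interpret_adf` is a hypothetical convenience function (not part of statsmodels) that applies the usual rejection rule to the values returned by `adfuller`:

```python
def interpret_adf(adf_stat, p_value, critical_values, alpha=0.05):
    """Apply the standard ADF decision rule.

    Rejects the unit-root null (i.e., concludes stationarity) when the
    p-value is below alpha AND the statistic lies below (is more
    negative than) the critical value at the chosen level.
    """
    reject = p_value < alpha
    key = f'{int(alpha * 100)}%'  # e.g. 0.05 -> '5%'
    below_critical = adf_stat < critical_values.get(key, float('inf'))
    return {'stationary': reject and below_critical,
            'statistic': adf_stat,
            'p_value': p_value}

# Values from the detrended-series output above
print(interpret_adf(-18.5593, 2.09e-30,
                    {'1%': -3.4500, '5%': -2.8702, '10%': -2.5714}))
# 'stationary' is True: p-value < 0.05 and statistic < -2.8702
```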

Seasonal Adjustment

Python3

from statsmodels.tsa.seasonal import STL

# Seasonal adjustment. Note: seasonal=13 is the length of the seasonal
# smoother (must be odd), not the period; STL infers the period from
# the daily index frequency.
stl = STL(ts, seasonal=13)
res = stl.fit()
ts_seasonal_adj = ts - res.seasonal

# Plot original and seasonally adjusted series
plt.figure(figsize=(14, 7))
plt.plot(ts, label='Original')
plt.plot(ts_seasonal_adj, label='Seasonally Adjusted', linestyle='--')
plt.legend()
plt.show()

Output: (plot of the original and seasonally adjusted series)

Test for stationarity:

Python3

# Test for stationarity after seasonal adjustment
result_seasonal_adj = adfuller(ts_seasonal_adj.dropna())
print(f'ADF Statistic (Seasonally Adjusted): {result_seasonal_adj[0]}')
print(f'p-value (Seasonally Adjusted): {result_seasonal_adj[1]}')
print(f'Critical Values (Seasonally Adjusted): {result_seasonal_adj[4]}')

Output:

ADF Statistic (Seasonally Adjusted): -4.651034555303582
p-value (Seasonally Adjusted): 0.00010390367939221074
Critical Values (Seasonally Adjusted): {'1%': -3.4491725955218655, '5%': -2.8698334971428574, '10%': -2.5711883591836733}

The p-value (about 0.0001) is well below 0.05, giving strong evidence against the null hypothesis that the series has a unit root (i.e., is non-stationary). We therefore reject the null hypothesis and conclude that the seasonally adjusted series is stationary.

The ADF statistic is negative and below all three critical values, which likewise supports stationarity.
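Alongside a formal test, a quick informal check of stationarity is whether the rolling mean and rolling standard deviation stay roughly constant over time. A minimal sketch (`rolling_stability` is a hypothetical helper name, not a library function):

```python
import numpy as np
import pandas as pd

def rolling_stability(series, window=30):
    """Spread (max - min) of the rolling mean and rolling std.

    For a (weakly) stationary series both spreads should be small
    relative to the overall variability of the data."""
    roll_mean = series.rolling(window).mean().dropna()
    roll_std = series.rolling(window).std().dropna()
    return (roll_mean.max() - roll_mean.min(),
            roll_std.max() - roll_std.min())

rng = np.random.default_rng(0)
stationary = pd.Series(rng.standard_normal(365))
trended = stationary + 0.05 * np.arange(365)  # add a linear trend

# The trended series shows a much larger drift in its rolling mean.
print(rolling_stability(stationary))
print(rolling_stability(trended))
```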

Logarithmic Transformation

Python3

# Transformation (e.g., logarithmic)
# Note: np.log is undefined for non-positive values. This sample series
# contains negatives, so those entries become NaN and are dropped
# before the stationarity test below.
ts_log = np.log(ts)

# Plot original and transformed series
plt.figure(figsize=(14, 7))
plt.plot(ts, label='Original')
plt.plot(ts_log, label='Log Transformed', linestyle='--')
plt.legend()
plt.show()

Output: (plot of the original and log-transformed series)

Test for Stationarity

Python3

# Test for stationarity after variance stabilization (log transformation)
result_log = adfuller(ts_log.dropna())
print(f'ADF Statistic (Log Transformed): {result_log[0]}')
print(f'p-value (Log Transformed): {result_log[1]}')
print(f'Critical Values (Log Transformed): {result_log[4]}')

Output:

ADF Statistic (Log Transformed): -14.60629558553864
p-value (Log Transformed): 4.08969119294649e-27
Critical Values (Log Transformed): {'1%': -3.467004502498507, '5%': -2.8776444997243558, '10%': -2.575355189707274}

The p-value is again far below 0.05, so we reject the null hypothesis that the series has a unit root (i.e., is non-stationary) and conclude that the log-transformed series is stationary.

The ADF statistic is strongly negative, below all three critical values, which confirms this conclusion.
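Because `np.log` is only defined for strictly positive values, the transform above produces NaNs for the negative entries in this sample series. A common workaround, sketched below (the shift amount is an ad-hoc choice, not a prescribed rule), is to shift the series above zero before taking logs:

```python
import numpy as np
import pandas as pd

date_rng = pd.date_range(start='2022-01-01', end='2022-12-31', freq='D')
ts = pd.Series(np.random.randn(len(date_rng)), index=date_rng)

# Shift the series so its minimum becomes 1, then take logs.
# Remember the shift so the transform can be inverted with np.exp later.
shift = 1 - ts.min()
ts_log_safe = np.log(ts + shift)

print(ts_log_safe.isna().sum())  # 0 -- no undefined values remain
```

The inverse transform is `np.exp(ts_log_safe) - shift`, which recovers the original series exactly.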

Differencing to Remove Autocorrelation

Python3

# Differencing to reduce autocorrelation
ts_diff = ts.diff().dropna()

# Plot original and differenced series
plt.figure(figsize=(14, 7))
plt.plot(ts, label='Original')
plt.plot(ts_diff, label='Differenced', linestyle='--')
plt.legend()
plt.show()

Output: (plot of the original and differenced series)

Test for Stationarity

Python3

# Test for stationarity after differencing
result_diff = adfuller(ts_diff.dropna())
print(f'ADF Statistic (Differenced): {result_diff[0]}')
print(f'p-value (Differenced): {result_diff[1]}')
print(f'Critical Values (Differenced): {result_diff[4]}')

Output:

ADF Statistic (Differenced): -8.439660110734907
p-value (Differenced): 1.7773358987173984e-13
Critical Values (Differenced): {'1%': -3.4492815848836296, '5%': -2.8698813715275406, '10%': -2.5712138845950587}

The p-value is very small, giving strong evidence against the null hypothesis that the differenced series has a unit root (i.e., is non-stationary). We therefore reject the null hypothesis and conclude that the differenced series is stationary.

The ADF statistic is strongly negative, below all three critical values, which again supports stationarity.


