Implementation of Cross-correlation Analysis in Python
There are four main ways to perform cross-correlation analysis in Python:
- Manual Python function: Using basic Python functions and loops to compute cross-correlation.
- NumPy: Utilizing NumPy’s fast numerical operations for efficient cross-correlation computation.
- SciPy: Leveraging SciPy’s signal processing library for advanced cross-correlation calculations.
- Statsmodels: Employing Statsmodels for statistical analysis, including cross-correlation.
Method 1. Cross-correlation Analysis Using Python
To demonstrate the implementation, let’s generate a dataset comprising two time series signals, signal1 and signal2, using a combination of sine and cosine functions with added noise. This dataset simulates real-world scenarios where signals often exhibit complex patterns and noise.
In the code, we define two functions: mean, which computes the average of a list, and cross_correlation, which takes two signals x and y and returns their normalized correlation at zero lag (the Pearson correlation coefficient). Inside cross_correlation:
- mean(x) and mean(y): Calculate the mean of each signal.
- sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y)): Calculates the numerator of the formula by summing the products of the deviations of corresponding elements of x and y from their means.
- x_sq_diff and y_sq_diff: Calculate the sum of squared deviations for each signal.
- math.sqrt(x_sq_diff * y_sq_diff): Calculates the denominator of the formula by taking the square root of the product of the two sums of squared deviations.
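Written out, the quantity the function computes is the Pearson correlation coefficient, which is the normalized cross-correlation at zero lag. The symbols below ($x_i$, $y_i$, $\bar{x}$, $\bar{y}$, $n$) are notation introduced here for illustration and do not appear in the original code:

```latex
r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
              {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

Here $\bar{x}$ and $\bar{y}$ correspond to x_mean and y_mean, and the two sums under the square root correspond to x_sq_diff and y_sq_diff.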
import math
import random

# Generate signals: 100 samples spaced 0.1 apart
t = [i * 0.1 for i in range(100)]
signal1 = [math.sin(2 * math.pi * 2 * i) + 0.5 * math.cos(2 * math.pi * 3 * i) + random.normalvariate(0, 0.1) for i in t]
signal2 = [math.sin(2 * math.pi * 2 * i) + 0.5 * math.cos(2 * math.pi * 3 * i) + random.normalvariate(0, 0.1) for i in t]

# Define a function to calculate the mean
def mean(arr):
    return sum(arr) / len(arr)

# Function to calculate the normalized cross-correlation at zero lag
def cross_correlation(x, y):
    # Calculate means
    x_mean = mean(x)
    y_mean = mean(y)
    # Calculate numerator
    numerator = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    # Calculate denominator
    x_sq_diff = sum((a - x_mean) ** 2 for a in x)
    y_sq_diff = sum((b - y_mean) ** 2 for b in y)
    denominator = math.sqrt(x_sq_diff * y_sq_diff)
    correlation = numerator / denominator
    return correlation

correlation = cross_correlation(signal1, signal2)
print('Manual Correlation:', correlation)
Output (the exact value varies from run to run because the noise is random):
Manual Correlation: 0.9837294963190838
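The function above compares the two signals only at zero lag. As a minimal sketch of how the same idea extends to other lags, the example below shifts one series against the other and recomputes the coefficient over the overlapping samples. The helper names pearson and lagged_correlation, the 3-sample delay, and the fixed seed are assumptions made for this illustration, not part of the original code.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    num = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    den = math.sqrt(sum((a - x_mean) ** 2 for a in x) *
                    sum((b - y_mean) ** 2 for b in y))
    return num / den


def lagged_correlation(x, y, lag):
    """Correlate x[i] with y[i + lag] over the overlapping samples.

    A positive lag means y is compared at later positions than x.
    """
    if lag >= 0:
        return pearson(x[:len(x) - lag], y[lag:])
    return pearson(x[-lag:], y[:len(y) + lag])


random.seed(0)  # fixed seed so the illustration is repeatable
base = [random.gauss(0, 1) for _ in range(200)]
delay = 3  # hypothetical delay, in samples
delayed = [0.0] * delay + base[:-delay]  # copy of `base` shifted 3 samples later

# Evaluate the correlation for lags -10 .. 10 and pick the strongest one
scores = {lag: lagged_correlation(base, delayed, lag) for lag in range(-10, 11)}
best_lag = max(scores, key=scores.get)
print("Best lag:", best_lag)
```

Because delayed is an exact shifted copy of base, the correlation at the matching lag is essentially 1, while at other lags it stays near zero for this noise-like input.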
Method 2. Cross-correlation Analysis Using NumPy
NumPy’s corrcoef function returns the matrix of Pearson correlation coefficients for its inputs; the [0, 1] entry of that matrix is the zero-lag correlation between signal1 and signal2.
import numpy as np
# time array
t = np.arange(0, 10, 0.1)
# Generate signals
signal1 = np.sin(2 * np.pi * 2 * t) + 0.5 * np.cos(2 * np.pi * 3 * t) + np.random.normal(0, 0.1, len(t))
signal2 = np.sin(2 * np.pi * 2 * t) + 0.5 * np.cos(2 * np.pi * 3 * t) + np.random.normal(0, 0.1, len(t))
numpy_correlation = np.corrcoef(signal1, signal2)[0, 1]
print('NumPy Correlation:', numpy_correlation)
Output:
NumPy Correlation: 0.9796920509627758
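np.corrcoef gives a single number for zero lag. To look at every lag at once, one option is np.correlate with mode="full". The sketch below is an illustration under stated assumptions: the test signal is seeded white noise, the 5-sample delay is hypothetical, and the normalization shown (dividing by the product of the two signal norms) is one common convention rather than the only one.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded for repeatability
x = rng.standard_normal(200)
delay = 5  # hypothetical delay, in samples
y = np.concatenate([np.zeros(delay), x[:-delay]])  # y is x delayed by 5 samples

# Remove the means so the result reflects co-variation, not constant offsets
xc = x - x.mean()
yc = y - y.mean()

# Raw cross-correlation at every lag from -(len(xc) - 1) to len(yc) - 1
raw = np.correlate(yc, xc, mode="full")
lags = np.arange(-(len(xc) - 1), len(yc))

# Scale so that the values lie in [-1, 1]
normalized = raw / np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2))

best_lag = lags[np.argmax(normalized)]
print("Lag with the strongest correlation:", best_lag)
```

With this argument order, a positive best_lag means the first argument (y) trails the second (x) by that many samples.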
Method 3. Cross-correlation Analysis Using SciPy
SciPy’s pearsonr function is employed to calculate the correlation between signal1 and signal2. It returns the Pearson correlation coefficient, which measures the linear relationship between two datasets, together with a p-value, which the code below discards.
import numpy as np
from scipy.stats import pearsonr

# time array
t = np.arange(0, 10, 0.1)

# Generate signals
signal1 = np.sin(2 * np.pi * 2 * t) + 0.5 * np.cos(2 * np.pi * 3 * t) + np.random.normal(0, 0.1, len(t))
signal2 = np.sin(2 * np.pi * 2 * t) + 0.5 * np.cos(2 * np.pi * 3 * t) + np.random.normal(0, 0.1, len(t))

# pearsonr returns the coefficient and a p-value; only the coefficient is kept
scipy_correlation, _ = pearsonr(signal1, signal2)
print('SciPy Correlation:', scipy_correlation)
Output:
SciPy Correlation: 0.9865169592702046
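pearsonr lives in scipy.stats and reports only the zero-lag coefficient. For lag-by-lag analysis, SciPy’s signal module offers correlate and correlation_lags (the latter assumes SciPy 1.6 or newer). The following is a rough sketch, with a seeded noise signal and a hypothetical 7-sample delay chosen purely for illustration.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(1)  # seeded for repeatability
x = rng.standard_normal(300)
delay = 7  # hypothetical delay, in samples
y = np.concatenate([np.zeros(delay), x[:-delay]])  # y is x delayed by 7 samples

# Cross-correlate the mean-removed signals at every possible lag
corr = signal.correlate(y - y.mean(), x - x.mean(), mode="full")

# correlation_lags returns the lag that belongs to each entry of `corr`
lags = signal.correlation_lags(len(y), len(x), mode="full")

best_lag = lags[np.argmax(corr)]
print("Estimated delay of y relative to x:", best_lag)
```

Using correlation_lags avoids building the lag axis by hand, which is an easy place to introduce an off-by-one error.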
Method 4. Cross-correlation Analysis Using Statsmodels
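Statsmodels’ OLS class is used to fit a linear regression of signal1 on signal2, and the rsquared attribute of the fitted model is reported. Note that this value is a coefficient of determination, not a correlation coefficient. Because the code does not add an intercept term, statsmodels computes the uncentered R-squared, which for signals with a mean close to zero is approximately the square of the Pearson coefficient. The result is therefore not directly comparable with the values from the previous methods.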
import numpy as np
import statsmodels.api as sm

# time array
t = np.arange(0, 10, 0.1)

# Generate signals
signal1 = np.sin(2 * np.pi * 2 * t) + 0.5 * np.cos(2 * np.pi * 3 * t) + np.random.normal(0, 0.1, len(t))
signal2 = np.sin(2 * np.pi * 2 * t) + 0.5 * np.cos(2 * np.pi * 3 * t) + np.random.normal(0, 0.1, len(t))

# Regress signal1 on signal2 (no intercept) and read the R-squared of the fit;
# this is the uncentered R-squared, not the correlation coefficient itself
statsmodels_correlation = sm.OLS(signal1, signal2).fit().rsquared
print('Statsmodels Correlation:', statsmodels_correlation)
Output:
Statsmodels Correlation: 0.9730755677920275
Cross-correlation Analysis in Python
Cross-correlation analysis is a powerful technique in signal processing and time series analysis used to measure the similarity between two series at different time lags. It reveals how one series (reference) is correlated with the other (target) when shifted by a specific amount. This information is valuable in various domains, including finance (identifying stock market correlations), neuroscience (analyzing brain activity), and engineering (evaluating system responses).
In this article, we’ll explore four methods for performing cross-correlation analysis in Python, providing clear explanations and illustrative examples.