Concept of Rolling Average
A rolling average computes the mean of a fixed-size window of consecutive data points, then slides that window across the dataset one observation at a time, producing one average per window position. This dampens random fluctuations and highlights longer-term patterns in the data.
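As a minimal sketch of the idea, the window can be moved by hand in base R. The vector x, the window size k, and the helper name rolling_mean_manual are illustrative assumptions and do not come from any package.

```r
# Hypothetical helper: trailing rolling mean computed with an explicit loop
rolling_mean_manual <- function(x, k) {
  n <- length(x)
  out <- rep(NA_real_, n)              # positions without a full window stay NA
  if (k > n) return(out)               # window larger than the data: all NA
  for (i in k:n) {
    out[i] <- mean(x[(i - k + 1):i])   # average of the k values ending at i
  }
  out
}

x <- c(2, 4, 6, 8, 10)                 # illustrative data
rolling_mean_manual(x, k = 3)
```

Each non-NA result is the mean of the current value and the two values before it. Packages such as zoo provide optimized versions of the same computation.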
Benefits of Using Rolling Average
- Smoothing: Rolling averages help in smoothing out short-term fluctuations, making it easier to identify long-term trends.
- Noise Reduction: By averaging data over a period, rolling averages mitigate the effects of outliers and noise in the dataset.
- Forecasting: Rolling averages are commonly used in forecasting future values based on historical trends.
Methods to Calculate Rolling Average in R
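rollmean() Function from the zoo Package: The rollmean() function is provided by the zoo package. Because xts builds on zoo, it also works on xts objects. It computes rolling means, with arguments for the window size (k), the window alignment (align), and the padding of positions where a full window is not available (fill).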
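The effect of the window size, alignment, and padding arguments is easiest to see on a tiny vector. The sketch below assumes the zoo package is installed; the vector x and the variable names are illustrative.

```r
library(zoo)

x <- c(1, 2, 3, 4, 5)   # illustrative data

# Default: centered window, no padding, so the result is shorter than x
centered <- rollmean(x, k = 3)

# Right-aligned window padded with NA, so the result has the same length as x
trailing <- rollmean(x, k = 3, align = "right", fill = NA)

centered
trailing
```

With align = "right", each value summarizes the current observation and the ones before it, which is usually the natural choice for time series where future values are not yet known.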
Step 1: Installing and Loading Necessary Packages
Make sure that the zoo and xts packages are installed and loaded into the R environment before continuing.
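# Install the packages once (skip if they are already installed)
install.packages("zoo")
install.packages("xts")
# Load the packages in each new R session
library(zoo)
library(xts)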
Step 2: Create example dataset
Let’s construct a fictitious dataset with daily temperature values over a month:
# Create example dataset
dates <- seq(as.Date("2024-04-01"), by = "day", length.out = 30)
temperatures <- c(18, 20, 22, 23, 24, 25, 26, 27, 28, 29, 30, 32, 33, 34,
35, 36, 35, 33, 30, 28, 26, 25, 24, 23, 22, 20, 19, 18, 17, 16)
# Combine into a dataframe
temperature_data <- data.frame(date = dates, temperature = temperatures)
head(temperature_data)
Output:
date temperature
1 2024-04-01 18
2 2024-04-02 20
3 2024-04-03 22
4 2024-04-04 23
5 2024-04-05 24
6 2024-04-06 25
Step 3: Calculate rolling average using rollmean() function
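We’ll use the rollmean() function from the zoo package to calculate the rolling average with a 7-day window. Setting align = "right" makes each value the mean of the current day and the six preceding days, and fill = NA pads the first six positions, where a full window is not yet available.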
# Convert dataframe to xts object
temperature_xts <- xts(temperature_data$temperature, order.by = temperature_data$date)
# Calculate rolling average using rollmean() function
rolling_avg <- rollmean(temperature_xts, k = 7, align = "right", fill = NA)
rolling_avg
Output:
[,1]
2024-04-01 NA
2024-04-02 NA
2024-04-03 NA
2024-04-04 NA
2024-04-05 NA
2024-04-06 NA
2024-04-07 22.57143
2024-04-08 23.85714
2024-04-09 25.00000
2024-04-10 26.00000
2024-04-11 27.00000
2024-04-12 28.14286
2024-04-13 29.28571
2024-04-14 30.42857
2024-04-15 31.57143
2024-04-16 32.71429
2024-04-17 33.57143
2024-04-18 34.00000
2024-04-19 33.71429
2024-04-20 33.00000
2024-04-21 31.85714
2024-04-22 30.42857
2024-04-23 28.71429
2024-04-24 27.00000
2024-04-25 25.42857
2024-04-26 24.00000
2024-04-27 22.71429
2024-04-28 21.57143
2024-04-29 20.42857
2024-04-30 19.28571
The rolling average is computed for every date from April 1 to April 30:
- From April 1 to April 6, the output is NA because there aren’t enough data points to calculate a 7-day average.
- From April 7 onward, the rolling average reflects the average temperature of the past 7 days, ending on the current day.
The values rise steadily from 22.57143 on April 7 to a peak of 34.00000 on April 18, then decline to 19.28571 on April 30.
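As a quick sanity check, the first non-NA value can be reproduced in base R by averaging the first seven temperatures directly. The snippet repeats the temperature vector from Step 2 so that it runs on its own; the names first_window_mean, trailing_means, and peak_day are illustrative.

```r
temperatures <- c(18, 20, 22, 23, 24, 25, 26, 27, 28, 29, 30, 32, 33, 34,
                  35, 36, 35, 33, 30, 28, 26, 25, 24, 23, 22, 20, 19, 18, 17, 16)

# Mean of April 1 to April 7, which should match the April 7 rolling average
first_window_mean <- mean(temperatures[1:7])
first_window_mean   # 22.57143

# Trailing 7-day means for days 7 to 30, computed without any package
trailing_means <- sapply(7:30, function(i) mean(temperatures[(i - 6):i]))

# Day of the month on which the 7-day average is highest
peak_day <- which.max(trailing_means) + 6
peak_day            # 18
```

The result agrees with the rollmean() output above: the 7-day average starts at 22.57143 on April 7 and is highest on April 18.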
How to calculate a rolling average in R
In the R programming language, a rolling average (often called a moving average) is a calculation used in statistics and data analysis that summarizes data by generating a series of averages over successive subsets of the dataset. It is especially useful for smoothing fluctuations over time so that underlying trends can be seen more clearly.