Calculate the Interquartile Range in R Programming – IQR() Function
The interquartile range (IQR) is a measure of statistical dispersion, defined as the difference between the third quartile (Q3) and the first quartile (Q1) of a dataset. It describes the spread of the middle 50% of the values, so it is little affected by extreme values in the tails.
Here are the steps to compute the IQR:
- Sort the dataset from lowest to highest value.
- Find the first quartile (Q1), the value below which the lowest 25% of the data fall.
- Find the third quartile (Q3), the value below which the lowest 75% of the data fall.
- Subtract Q1 from Q3.
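The steps above can be sketched by hand in R. This is a minimal illustration, assuming a made-up sample vector `x` and using the base function quantile() with its default settings to locate the quartiles. The explicit sort() only makes the first step visible, since quantile() sorts internally.

```r
# Illustrative sample data (an assumption for this sketch)
x <- c(16, 5, 12, 8, 15, 5)

# Step 1: sort from lowest to highest
x_sorted <- sort(x)

# Steps 2 and 3: locate the first and third quartiles
q1 <- unname(quantile(x_sorted, 0.25))
q3 <- unname(quantile(x_sorted, 0.75))

# Step 4: the IQR is the difference between them
manual_iqr <- q3 - q1
print(manual_iqr)

# Check that the manual result agrees with the built-in function
print(isTRUE(all.equal(manual_iqr, IQR(x))))
```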
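The IQR() function in R calculates the interquartile range of a data set.
Mathematically, IQR = Q3 - Q1, where Q3 is the median of the upper half of the sorted values and Q1 is the median of the lower half. IQR() takes its quartiles from quantile(), whose default method interpolates between data points, so its result can differ slightly from this textbook rule.
R provides the built-in IQR() function to perform the above calculation.
Syntax: IQR(x, na.rm = FALSE, type = 7)
Parameters: x: numeric vector (the data set); na.rm: whether to drop missing values before computing; type: quantile algorithm passed on to quantile()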
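Quartiles can be defined in more than one way, which is why IQR() accepts a `type` argument that it passes on to quantile(). The sketch below uses an illustrative vector to compare the default (type 7, linear interpolation) with type 2 and with the median-of-halves rule computed by hand. For this particular even-length vector the last two coincide; that is not guaranteed for other data.

```r
x <- c(5, 5, 8, 12, 15, 16)

# Default: type 7 quantiles (linear interpolation between order statistics)
iqr_default <- IQR(x)

# Type 2 quantiles (averaging at discontinuities)
iqr_type2 <- IQR(x, type = 2)

# Median-of-halves rule, computed by hand for this even-length vector
x_sorted <- sort(x)
half <- length(x_sorted) / 2
iqr_halves <- median(tail(x_sorted, half)) - median(head(x_sorted, half))

print(c(default = iqr_default, type2 = iqr_type2, halves = iqr_halves))
```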
Calculate IQR for a vector:
R
# R program to calculate IQR value
# Defining vector
x <- c(5, 5, 8, 12, 15, 16)
# Print Interquartile range
print(IQR(x))
Output:
[1] 8.5
Calculate IQR for a matrix:
R
# R program to calculate IQR value
# Defining a matrix
x <- matrix(1:9, 3, 3)
# Print Interquartile range
print(IQR(x))
Output:
[1] 4
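IQR() treats a matrix as one long vector, so the result above describes all nine values pooled together. If a per-column figure is wanted instead, one option is apply() over the column margin. A small sketch with the same matrix:

```r
x <- matrix(1:9, 3, 3)

# IQR of all values pooled together
overall <- IQR(x)

# IQR of each column separately (MARGIN = 2 means columns)
per_column <- apply(x, 2, IQR)

print(overall)
print(per_column)
```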
Calculate IQR for a vector with missing values:
Setting na.rm = TRUE tells IQR() to ignore missing values (NA) when calculating the interquartile range.
R
# R program to calculate IQR value
# Defining vector
x <- c(5, 5, NA, 8, NA, 12, NA, 15, 16, 18)
# Print Interquartile range
print(IQR(x, na.rm = TRUE))
Output:
[1] 9
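To see why na.rm matters, the sketch below calls IQR() on the same vector without it. With the default na.rm = FALSE the call stops with an error rather than returning NA, so the example wraps it in tryCatch() to capture the message. It then shows that dropping the NA values by hand gives the same answer as na.rm = TRUE.

```r
x <- c(5, 5, NA, 8, NA, 12, NA, 15, 16, 18)

# Without na.rm = TRUE the call fails, so capture the error message
result <- tryCatch(
  IQR(x),
  error = function(e) paste("Error:", conditionMessage(e))
)
print(result)

# Dropping the NA values by hand is equivalent to na.rm = TRUE
print(IQR(x[!is.na(x)]))
print(IQR(x, na.rm = TRUE))
```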
Calculate IQR for a single column of a data frame:
We can load the built-in iris dataset and calculate the IQR of a single column.
R
# load the dataset.
data(iris)
# calculate IQR for single column.
IQR(iris$Petal.Length)
Output:
[1] 3.5
Calculate IQR for multiple columns of a data frame:
With the help of the sapply function, we can calculate the IQR for multiple columns.
R
# load library
library(dplyr)
# remove the non-numeric Species column from the dataset
data <- select(iris, -Species)
# calculate the IQR for all remaining columns
sapply(data, IQR)
Output:
Sepal.Length  Sepal.Width Petal.Length  Petal.Width
         1.3          0.5          3.5          1.5
The sapply function applies IQR() to every column of the dataset and returns the results as a named vector. We remove the Species column first because IQR() only works on numeric columns.
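If loading dplyr is not desirable, the same result can be reached in base R by keeping only the columns that is.numeric() reports as numeric. This is a sketch of one alternative, not a replacement for the approach above. The names `numeric_cols` and `iqr_by_column` are chosen here for illustration.

```r
data(iris)

# Keep only the numeric columns, whatever they are called
numeric_cols <- iris[sapply(iris, is.numeric)]

# IQR of each remaining column, returned as a named vector
iqr_by_column <- sapply(numeric_cols, IQR)
print(iqr_by_column)
```

Selecting by type rather than by name means the code keeps working if other non-numeric columns are added to the data frame.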