Another example, this time using a simulated dataset, is as follows:
R
# Load the required packages.
library(car)
library(ggplot2)

# Set the seed so the simulation can be replicated.
set.seed(123)

# Simulate the data: y depends linearly on x, plus Gaussian noise.
x <- rnorm(50)
y <- 3 * x + rnorm(50, mean = 0, sd = 0.5)
data <- data.frame(x, y)

# Fit a linear regression model of y on x.
model <- lm(y ~ x, data = data)

# Compute the standardised residuals.
# rstandard() is provided by the stats package, which R loads by default.
std.resid <- rstandard(model)

# Use a histogram to visualise the standardised residuals.
ggplot(data.frame(std.resid), aes(x = std.resid)) +
  geom_histogram(binwidth = 1.0, fill = "green") +
  xlab("Standardized Residuals") +
  ylab("Count") +
  ggtitle("Histogram of Standardized Residuals")
In this example, we begin by loading the required packages, ‘car’ and ‘ggplot2’. We then simulate a dataset of 50 observations: ‘x’ is drawn from a standard normal distribution, and ‘y’ is set to 3 times ‘x’ plus Gaussian noise with mean 0 and standard deviation 0.5.
We use the ‘lm( )’ function to fit a linear regression model of ‘y’ on ‘x’, the ‘rstandard( )’ function to obtain the standardised residuals, and ‘ggplot2’ to draw a histogram of those residuals. Note that ‘rstandard( )’ belongs to R's built-in ‘stats’ package rather than to ‘car’, so the ‘car’ package is not strictly needed for this step.
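To make clearer what ‘rstandard( )’ returns, the following is a minimal sketch that recomputes the standardised residuals by hand, assuming the usual definition for a linear model (raw residual divided by the residual standard error times the square root of one minus the leverage). It uses only base R, and the variable names ‘raw_resid’, ‘sigma_hat’, ‘leverage’ and ‘manual_std_resid’ are illustrative choices, not part of the original example.

```r
# Simulate data in the same way as the example above.
set.seed(123)
x <- rnorm(50)
y <- 3 * x + rnorm(50, mean = 0, sd = 0.5)
model <- lm(y ~ x)

# Ingredients of a standardised residual (names here are illustrative).
raw_resid <- residuals(model)       # observed minus fitted values
sigma_hat <- summary(model)$sigma   # residual standard error
leverage  <- hatvalues(model)       # diagonal of the hat matrix

# Assumed definition: raw residual / (sigma_hat * sqrt(1 - leverage)).
manual_std_resid <- raw_resid / (sigma_hat * sqrt(1 - leverage))

# Compare the hand-computed values with the built-in function.
all.equal(manual_std_resid, rstandard(model))
```

If the two vectors agree, the comparison confirms that the histogram is showing residuals rescaled to have roughly unit variance, which is why values beyond about 2 or 3 in absolute size are often treated as worth a closer look.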
Note that we call ‘set.seed( )’ with a fixed value (123) before simulating the data, so that the simulation produces the same results every time the code is run.
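The effect of the seed can be checked directly. The short sketch below, which uses only base R and hypothetical variable names (‘first_draw’, ‘second_draw’, ‘third_draw’), resets the seed to the same value before two draws and then draws a third time without resetting it.

```r
# Draw five random numbers after setting the seed.
set.seed(123)
first_draw <- rnorm(5)

# Reset the seed to the same value and draw again.
set.seed(123)
second_draw <- rnorm(5)

# Same seed, same sequence of draws.
identical(first_draw, second_draw)

# Drawing again without resetting the seed continues the random stream,
# so these values are not a repeat of the first draw.
third_draw <- rnorm(5)
identical(first_draw, third_draw)
```

This is why the seed is set once, before the data simulation: anyone rerunning the script from the top should obtain the same ‘x’ and ‘y’, and therefore the same fitted model and histogram.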