How to Use the coeftest() Function in R
In this example, we create a data frame in R to examine the relationship between a car’s fuel efficiency and its engine size and weight. We’ll use a hypothetical dataset with the following variables:
- mpg: Miles per gallon (fuel efficiency)
- engine_size: Engine size in liters
- weight: Weight of the car in pounds
# Create a hypothetical dataset
car_data <- data.frame(
  mpg = c(21, 23, 18, 25, 19, 22, 20, 24, 17, 26),
  engine_size = c(2.0, 2.2, 2.5, 1.8, 2.3, 2.1, 2.4, 1.9, 2.2, 2.6),
  weight = c(3000, 3200, 3500, 2800, 3300, 3100, 3400, 2900, 3600, 2700)
)
# View the data frame
car_data
Output:
   mpg engine_size weight
1   21         2.0   3000
2   23         2.2   3200
3   18         2.5   3500
4   25         1.8   2800
5   19         2.3   3300
6   22         2.1   3100
7   20         2.4   3400
8   24         1.9   2900
9   17         2.2   3600
10  26         2.6   2700
Next, we fit a multiple linear regression model to the data using the lm() function.
# Fit a multiple linear regression model
car_model <- lm(mpg ~ engine_size + weight, data = car_data)
We can then use the coeftest() function to perform a t-test for each fitted regression coefficient in the model.
# Load the lmtest package
library(lmtest)
# Perform t-test for each coefficient in the model
coeftest(car_model)
Output:
t test of coefficients:
              Estimate Std. Error t value  Pr(>|t|)
(Intercept) 50.2325103  4.5374997 11.0705  1.09e-05 ***
engine_size  0.6687243  1.5965455  0.4189 0.6878736
weight      -0.0095885  0.0013615 -7.0424 0.0002038 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Interpretation of the coeftest() Output
The coeftest() function outputs four columns.
- The first column, Estimate, provides the estimated coefficients for the intercept and each predictor variable.
- The second column, Std. Error, gives the standard error for each coefficient.
- The third column, t value (or z value when a normal approximation is used instead of the t distribution, for example for glm() fits or when df = Inf), gives the test statistic value.
- The fourth column, Pr(>|t|) (or Pr(>|z|) for a z-test), provides the two-tailed p-value for the hypothesis test.
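To make the relationship between the four columns concrete, the sketch below rebuilds them by hand with base R only. It assumes the default case for an lm() fit, where the test statistic is the estimate divided by its standard error and is compared with a t distribution on the residual degrees of freedom. The variable names (est, se, t_val, p_val, manual_table) are illustrative, not part of any package.

```r
# Recreate the example data and model so this block runs on its own
car_data <- data.frame(
  mpg = c(21, 23, 18, 25, 19, 22, 20, 24, 17, 26),
  engine_size = c(2.0, 2.2, 2.5, 1.8, 2.3, 2.1, 2.4, 1.9, 2.2, 2.6),
  weight = c(3000, 3200, 3500, 2800, 3300, 3100, 3400, 2900, 3600, 2700)
)
car_model <- lm(mpg ~ engine_size + weight, data = car_data)

est <- coef(car_model)                  # column 1: Estimate
se <- sqrt(diag(vcov(car_model)))       # column 2: Std. Error
t_val <- est / se                       # column 3: t value
df_resid <- df.residual(car_model)      # 10 observations - 3 coefficients

# Column 4: two-tailed p-value from the t distribution
p_val <- 2 * pt(abs(t_val), df = df_resid, lower.tail = FALSE)

manual_table <- cbind(Estimate = est, `Std. Error` = se,
                      `t value` = t_val, `Pr(>|t|)` = p_val)
print(manual_table)
```

The resulting matrix should agree with summary(car_model)$coefficients, which is a quick way to check the arithmetic.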
The t-test statistic and corresponding p-value are shown for each coefficient:
- Intercept: t = 11.0705, p = 1.09e-05
- engine_size: t = 0.4189, p = 0.6878736
- weight: t = -7.0424, p = 0.0002038
The p-value tests the null hypothesis that each coefficient is equal to zero, given the other predictors in the model. A small p-value (typically less than 0.05) leads us to reject the null hypothesis, suggesting that the predictor is statistically significant.
- The p-value for the weight coefficient (0.0002038) is less than the significance level (e.g., 0.05), so we reject the null hypothesis that the coefficient is zero. We conclude that the weight variable has a statistically significant effect on fuel efficiency.
- Conversely, for the engine_size variable, the p-value (0.6878736) is greater than the significance level. Thus, we fail to reject the null hypothesis for this coefficient: there is not enough evidence of a statistically significant relationship between engine size and fuel efficiency once weight is accounted for.
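The same decisions can be read off programmatically. The sketch below assumes the lmtest package is installed, treats the coeftest() result as a matrix that can be indexed by column name, and uses 0.05 as the significance level. The names alpha, ct, p_values and decision are illustrative.

```r
library(lmtest)

# Recreate the example data and model so this block runs on its own
car_data <- data.frame(
  mpg = c(21, 23, 18, 25, 19, 22, 20, 24, 17, 26),
  engine_size = c(2.0, 2.2, 2.5, 1.8, 2.3, 2.1, 2.4, 1.9, 2.2, 2.6),
  weight = c(3000, 3200, 3500, 2800, 3300, 3100, 3400, 2900, 3600, 2700)
)
car_model <- lm(mpg ~ engine_size + weight, data = car_data)

alpha <- 0.05                        # assumed significance level
ct <- coeftest(car_model)            # matrix-like table of test results
p_values <- ct[, "Pr(>|t|)"]         # named vector, one p-value per coefficient

# Compare each p-value with alpha and label the outcome
decision <- ifelse(p_values < alpha, "reject H0", "fail to reject H0")
print(data.frame(p_value = p_values, decision = decision))
```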
To visualize the results, we can create a plot that shows each coefficient estimate as a point, with error bars for its confidence interval. Here’s how.
# Extract coefficients and their standard errors
coefficients <- coef(car_model)
std_errors <- sqrt(diag(vcov(car_model)))
# Critical value for 95% intervals from the t distribution (residual df)
t_crit <- qt(0.975, df = df.residual(car_model))
# Create a data frame for plotting
plot_data <- data.frame(
  Coefficient = names(coefficients),
  Estimate = coefficients,
  Std_Error = std_errors
)
# Plot coefficients and confidence intervals
library(ggplot2)
ggplot(plot_data, aes(x = Coefficient, y = Estimate,
                      ymin = Estimate - t_crit * Std_Error,
                      ymax = Estimate + t_crit * Std_Error)) +
  geom_point(size = 3) +
  geom_errorbar(width = 0.2) +
  labs(title = "Coefficients and 95% Confidence Intervals",
       x = "Coefficient",
       y = "Estimate") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
Output: [Plot: coefficient estimates with error bars showing their confidence intervals]
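The intervals can also be inspected numerically rather than read from a plot. The sketch below uses base R's confint(), which for lm() fits relies on the t distribution with the residual degrees of freedom, and rebuilds the same intervals by hand for comparison. The names ci_t, ci_manual and t_crit are illustrative.

```r
# Recreate the example data and model so this block runs on its own
car_data <- data.frame(
  mpg = c(21, 23, 18, 25, 19, 22, 20, 24, 17, 26),
  engine_size = c(2.0, 2.2, 2.5, 1.8, 2.3, 2.1, 2.4, 1.9, 2.2, 2.6),
  weight = c(3000, 3200, 3500, 2800, 3300, 3100, 3400, 2900, 3600, 2700)
)
car_model <- lm(mpg ~ engine_size + weight, data = car_data)

# 95% confidence intervals based on the t distribution
ci_t <- confint(car_model, level = 0.95)

# The same intervals by hand: estimate +/- critical value * standard error
est <- coef(car_model)
se <- sqrt(diag(vcov(car_model)))
t_crit <- qt(0.975, df = df.residual(car_model))
ci_manual <- cbind(lower = est - t_crit * se, upper = est + t_crit * se)

print(ci_t)
print(t_crit)  # exceeds the normal value 1.96 because only 7 residual df remain
```

With so few observations, the t-based critical value is noticeably larger than 1.96, so intervals built from the normal approximation would be too narrow for this model.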
How to Use the coeftest() Function in R
In the R programming language, the coeftest() function from the lmtest package performs hypothesis tests on the estimated coefficients of a regression model; its companion function coefci() computes the corresponding confidence intervals. It is used after fitting a model with functions such as lm() (for linear regression), glm() (for generalized linear models), or any other function that returns a suitable object with coefficient estimates and their standard errors.
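Syntax: coeftest(x, vcov. = NULL, df = NULL, ...)
where:
- x: The fitted regression model object
- vcov.: Covariance matrix of the estimated coefficients, or a function that computes it from the model
- df: Degrees of freedom to be used. A finite positive value gives a t-test; otherwise a z-test based on the normal approximation is performed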
In this article, we will learn to use the coeftest() function. To do so, we first install the required packages: lmtest, which provides coeftest(), and sandwich, which provides heteroskedasticity-consistent covariance matrix estimators.
After installing them, load the packages into the R environment using the library() function:
install.packages("lmtest")
install.packages("sandwich")
# Load the required libraries
library(lmtest)
library(sandwich)
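With both packages loaded, the vcov. and df arguments from the syntax above can be tried on the car example. The sketch below assumes lmtest and sandwich are installed. The choice of the HC3 estimator and the object names (default_test, robust_test, z_test) are illustrative and not prescribed by the article.

```r
library(lmtest)
library(sandwich)

# Recreate the example data and model so this block runs on its own
car_data <- data.frame(
  mpg = c(21, 23, 18, 25, 19, 22, 20, 24, 17, 26),
  engine_size = c(2.0, 2.2, 2.5, 1.8, 2.3, 2.1, 2.4, 1.9, 2.2, 2.6),
  weight = c(3000, 3200, 3500, 2800, 3300, 3100, 3400, 2900, 3600, 2700)
)
car_model <- lm(mpg ~ engine_size + weight, data = car_data)

# Default: conventional standard errors, t-test on the residual df
default_test <- coeftest(car_model)

# vcov.: pass a heteroskedasticity-consistent covariance matrix from sandwich
robust_test <- coeftest(car_model, vcov. = vcovHC(car_model, type = "HC3"))

# df = Inf: use the normal approximation, so z values are reported
z_test <- coeftest(car_model, df = Inf)

print(robust_test)
print(z_test)
```

The coefficient estimates are identical in all three tables. Only the standard errors (with vcov.) or the reference distribution (with df) change, and with them the test statistics and p-values.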