How to Perform a Wald Test in R?
R has several packages that provide functions for performing Wald tests.
- lmtest: This package provides the waldtest() function, which can be used to perform Wald tests on coefficients in linear regression models.
- car: The car package provides the linearHypothesis() function, which can perform various types of hypothesis tests, including Wald tests, for linear regression models.
- aod: Although best known for its overdispersion analysis functions, the aod package also provides the wald.test() function, which performs Wald tests on the coefficients of regression models.
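As a brief sketch of the linearHypothesis() interface listed above, the following tests whether the coefficients of disp and hp are jointly zero in a linear model fitted to the built-in mtcars data. It assumes the car package is installed; the object names model and lh_result are illustrative only.

```r
# install.packages("car")  # run once if the package is not yet installed
library(car)

# Fit a linear model on the built-in mtcars data
model <- lm(mpg ~ disp + hp + wt, data = mtcars)

# Each string states one linear restriction on a coefficient;
# together they form the joint null hypothesis disp = 0 and hp = 0
lh_result <- linearHypothesis(model, c("disp = 0", "hp = 0"))

# The result is an anova-style table with columns such as F and Pr(>F)
print(lh_result)
```

For lm objects, linearHypothesis() reports an F test by default; passing test = "Chisq" requests the chi-squared version instead.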
Here we perform a Wald test using the lmtest package with R's built-in ‘mtcars’ dataset.
# Install the package (only needed once), then load it
install.packages("lmtest")
library(lmtest)
# Fit regression model
model <- lm(mpg ~ disp + hp + wt, data = mtcars)
# Perform a Wald test of whether the coefficients of disp and hp are jointly zero
wald_result <- waldtest(model, terms = c("disp", "hp"))
# Print the result
print(wald_result)
Output:
Wald test
Model 1: mpg ~ disp + hp + wt
Model 2: mpg ~ wt
  Res.Df Df     F   Pr(>F)
1     28
2     30 -2 5.983 0.006863 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
- Fit a linear regression model that predicts mpg from the disp, hp, and wt variables of the mtcars dataset.
- Use the waldtest() function from the lmtest package to test whether the coefficients of disp and hp are simultaneously equal to zero.
- Print the result of the Wald test, which indicates whether the null hypothesis (both coefficients are zero) is rejected.
- Model 1: This represents the more complex model, which includes the predictors disp, hp, and wt to predict mpg.
- Model 2: This represents the simpler model, which includes only the predictor wt to predict mpg.
- Res.Df: This indicates the residual degrees of freedom, which is the number of observations minus the number of parameters estimated in the model (32 - 4 = 28 for Model 1 and 32 - 2 = 30 for Model 2).
- Df: This represents the change in degrees of freedom between Model 1 and Model 2. The value -2 reflects that Model 2 estimates 2 fewer parameters than Model 1 because it drops two predictors.
- F: This is the test statistic for the Wald test. Under the null hypothesis that the coefficients dropped from Model 1 (those of disp and hp) are both zero, it follows an F-distribution with 2 and 28 degrees of freedom. In other words, it tests whether the additional predictors in Model 1 contribute significantly to the model.
- Pr(>F): This is the p-value associated with the F statistic. It is the probability of observing an F statistic at least as large as the one calculated if the null hypothesis were true. In this case, the p-value is 0.006863, which is less than 0.05, providing strong evidence against the null hypothesis.
- Signif. codes: The asterisks give a quick indication of the level of significance of the test. In this case, ** means the p-value lies between 0.001 and 0.01, so the result is significant at the 0.01 level.
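To make the F statistic in the output above less of a black box, here is a minimal sketch that recomputes it by hand with base R only. It assumes the usual Wald quadratic form built from the estimated coefficients and their covariance matrix; the names tested, n_restrictions, F_stat, and p_value are illustrative.

```r
# Refit the model from the example above (mtcars ships with base R)
model <- lm(mpg ~ disp + hp + wt, data = mtcars)

# Coefficients under test and the matching block of their covariance matrix
tested <- c("disp", "hp")
b <- coef(model)[tested]
V <- vcov(model)[tested, tested]

# Wald quadratic form b' V^{-1} b, divided by the number of restrictions
# to put it on the F scale
n_restrictions <- length(tested)
F_stat <- as.numeric(t(b) %*% solve(V) %*% b) / n_restrictions

# p-value from the F distribution with (restrictions, residual df) degrees of freedom
p_value <- pf(F_stat, df1 = n_restrictions, df2 = df.residual(model),
              lower.tail = FALSE)

print(c(F = F_stat, p = p_value))
```

For a linear model, this statistic coincides with the F statistic from comparing the two nested models, so it should reproduce the F and Pr(>F) values shown in the waldtest() output.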
In this example, we create a dataset with a binary outcome variable and several predictor variables, and then perform a logistic regression analysis with a Wald test using the aod package.
# Install the package (only needed once), then load it
install.packages("aod")
library(aod)
# Generate synthetic data
set.seed(123) # for reproducibility
n <- 100 # number of observations
# Predictor variables
x1 <- rnorm(n) # continuous predictor
x2 <- sample(0:1, n, replace = TRUE) # binary predictor
x3 <- rnorm(n) # continuous predictor
x4 <- sample(0:1, n, replace = TRUE) # binary predictor
# Outcome variable (binary)
y <- rbinom(n, 1, plogis(-1 + 0.5 * x1 + 0.8 * x2 - 0.3 * x3 + 0.6 * x4))
# Combine variables into a dataframe
data <- data.frame(y, x1, x2, x3, x4)
# Fit logistic regression model
model <- glm(y ~ x1 + x2 + x3 + x4, data = data, family = "binomial")
# Perform a Wald test of whether the coefficients of x3 and x4 are jointly zero
wald_result <- wald.test(b = coef(model), Sigma = vcov(model), Terms = c(4, 5))
# Print the result
print(wald_result)
Output:
Wald test:
----------
Chi-squared test:
X2 = 2.1, df = 2, P(> X2) = 0.36
- Set a seed for reproducibility, then generate synthetic data with 100 observations.
  - x1 and x3 are continuous predictors sampled from a standard normal distribution.
  - x2 and x4 are binary predictors that take the values 0 and 1 with equal probability (a Bernoulli distribution with p = 0.5).
  - y is a binary outcome variable generated from a logistic regression model.
- Fit a logistic regression model using the glm() function.
  - The formula y ~ x1 + x2 + x3 + x4 specifies the model with predictors x1, x2, x3, and x4.
  - We specify family = "binomial" to indicate logistic regression for binary outcomes.
- Use the wald.test() function to perform a Wald test.
  - b = coef(model) specifies the coefficient estimates obtained from the logistic regression model.
  - Sigma = vcov(model) specifies the variance-covariance matrix of the coefficients obtained from the logistic regression model.
  - Terms = c(4, 5) specifies the indices of the coefficients corresponding to x3 and x4 that we want to test simultaneously. The intercept is coefficient 1, so x3 and x4 are coefficients 4 and 5.
- Chi-squared test: This indicates that the test statistic is compared against a chi-squared distribution.
- X2: This is the value of the chi-squared test statistic. In this case, it is 2.1.
- df: This represents the degrees of freedom of the chi-squared distribution, which equals the number of coefficients tested (here 2, for x3 and x4).
- P(> X2): This is the p-value associated with the chi-squared test statistic. It is the probability of observing a chi-squared statistic at least as large as the one calculated if the null hypothesis were true. In this case, the p-value is 0.36, which is greater than 0.05, so we fail to reject the null hypothesis that the coefficients of x3 and x4 are both zero.
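The chi-squared statistic reported by wald.test() can also be reconstructed by hand, which shows how the b, Sigma, and Terms arguments fit together. The sketch below uses base R only and assumes the standard Wald quadratic form; the names idx, X2, and p_value are illustrative. The exact numbers depend on the random draws, which can differ between R versions even with the same seed.

```r
# Recreate the synthetic data and the logistic model from the example above
set.seed(123)
n <- 100
x1 <- rnorm(n)
x2 <- sample(0:1, n, replace = TRUE)
x3 <- rnorm(n)
x4 <- sample(0:1, n, replace = TRUE)
y <- rbinom(n, 1, plogis(-1 + 0.5 * x1 + 0.8 * x2 - 0.3 * x3 + 0.6 * x4))
model <- glm(y ~ x1 + x2 + x3 + x4, family = "binomial")

# Positions 4 and 5 are x3 and x4 (position 1 is the intercept)
idx <- c(4, 5)
b <- coef(model)[idx]
V <- vcov(model)[idx, idx]

# Wald statistic: quadratic form b' V^{-1} b
X2 <- as.numeric(t(b) %*% solve(V) %*% b)

# p-value from the chi-squared distribution, one df per tested coefficient
p_value <- pchisq(X2, df = length(idx), lower.tail = FALSE)

print(c(X2 = X2, df = length(idx), p = p_value))
```

Run in the same session as the aod example, the X2 and p values printed here should agree with those from wald.test() up to rounding.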
Uses of Wald Test in R
- Hypothesis Testing: The Wald test is often used to test specific hypotheses about the coefficients in a regression model.
- Comparison of Nested Models: The Wald test can be used to compare nested models, where one model is a special case of another.
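- Test Statistic: For a single coefficient, the Wald test statistic is the square of the ratio of the estimated coefficient (minus its hypothesized value, usually zero) to its standard error. For several coefficients tested jointly, it generalizes to a quadratic form in the estimates and their covariance matrix.
- P-value: The p-value associated with the Wald test is the probability of observing a test statistic at least as extreme as the one calculated, assuming the null hypothesis is true.
- Decision: If the p-value is below the chosen significance level (commonly 0.05), we reject the null hypothesis, indicating significance. Otherwise, we fail to reject the null hypothesis, suggesting non-significance.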
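The test statistic, p-value, and decision steps listed above can be sketched for a single coefficient with base R alone. The model below (a logistic regression of am on wt in mtcars) and the 0.05 significance level are assumptions chosen purely for illustration.

```r
# Illustrative model: logistic regression of transmission type on weight
fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Estimate and standard error of the wt coefficient
coef_table <- coef(summary(fit))
est <- coef_table["wt", "Estimate"]
se  <- coef_table["wt", "Std. Error"]

# Wald statistic for H0: coefficient = 0 is the squared ratio estimate / SE
z <- est / se
W <- z^2

# Compare W against a chi-squared distribution with 1 degree of freedom
p_value <- pchisq(W, df = 1, lower.tail = FALSE)

# Decision at an assumed 5% significance level
alpha <- 0.05
decision <- if (p_value < alpha) "reject H0" else "fail to reject H0"

print(c(z = z, W = W, p = p_value))
print(decision)
```

The z computed here is the same quantity that summary(fit) reports in its "z value" column, and the p-value matches its "Pr(>|z|)" column, since squaring a standard normal variable gives a chi-squared variable with 1 degree of freedom.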