Omitted Variable Bias: Wald Test in R

Last Update: February 21, 2022

An omitted variable Wald test can be run in R with the waldtest function from the lmtest package, which evaluates whether independent variables omitted from a linear regression help explain the dependent variable. The main parameters of waldtest are object, which together with the arguments that follow it takes the restricted and unrestricted linear regression lm objects, and test, a character string specifying whether to do an F-test ("F") or a chi-square test ("Chisq").

As an example, we test whether number of bathrooms is an omitted variable by comparing a restricted multiple linear regression of house prices on lot size and number of bedrooms against an unrestricted regression that also includes number of bathrooms, using the HousePrices data included in the AER package [1].

First, we load the AER package for the data and the lmtest package for the Wald test [2].

In [1]:
library(AER)
library(lmtest)

Second, we load the HousePrices data object from the AER package using the data function and print its first six rows and first four columns using the head function to view the data.frame structure.

In [2]:
data(HousePrices)
head(HousePrices[, 1:4])
Out [2]:
  price lotsize bedrooms bathrooms
1 42000    5850        3         1
2 38500    4000        2         1
3 49500    3060        3         1
4 60500    6650        3         1
5 61000    6360        2         1
6 66000    4160        3         1

Third, we fit the restricted multiple linear regression using the lm function and store the results in the mlr1 object. Within the lm function, the parameter formula = price ~ lotsize + bedrooms fits the restricted model, in which house price is explained by lot size and number of bedrooms.

In [3]:
mlr1 <- lm(formula = price ~ lotsize + bedrooms, data = HousePrices)
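
Before running the test, it can help to confirm what the restricted fit contains. The following is a minimal sketch that refits the same model so it runs on its own; it only uses the base R accessors coef, nobs and df.residual, and the residual degrees of freedom it reports should match the Res.Df value shown for Model 1 in the Wald test output.

```r
library(AER)  # provides the HousePrices data

data(HousePrices)

# Restricted model: price explained by lot size and number of bedrooms
mlr1 <- lm(formula = price ~ lotsize + bedrooms, data = HousePrices)

# Estimated intercept and the two slope coefficients
print(coef(mlr1))

# Number of observations used in the fit
print(nobs(mlr1))

# Residual degrees of freedom: observations minus estimated coefficients
print(df.residual(mlr1))
```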

Fourth, we fit the unrestricted multiple linear regression using the lm function, store the results in the mlr2 object and do the Wald test using the waldtest function. Within the lm function, the parameter formula = price ~ lotsize + bedrooms + bathrooms fits the unrestricted model, in which house price is explained by lot size, number of bedrooms and number of bathrooms. Within the waldtest function, the arguments object = mlr1 and mlr2 pass the restricted and unrestricted model fits, and test = "F" requests an F-test. Note that the mlr1 and mlr2 models and the waldtest parameter test = "F" are included only as educational examples and can be modified according to your needs.

In [4]:
mlr2 <- lm(formula = price ~ lotsize + bedrooms + bathrooms, data = HousePrices)
waldtest(object = mlr1, mlr2, test = "F")
Out [4]:
Wald test

Model 1: price ~ lotsize + bedrooms
Model 2: price ~ lotsize + bedrooms + bathrooms
  Res.Df Df      F    Pr(>F)    
1    543                        
2    542  1 122.41 < 2.2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
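
The output above reports F = 122.41 on 1 and 542 degrees of freedom with a p-value below 2.2e-16, so the restriction that number of bathrooms has a zero coefficient is rejected. As a cross-check, the sketch below recomputes the statistic by hand from the residual sums of squares of both models and compares it with the waldtest result. It assumes the default covariance matrix (no vcov argument); the names rss_r, rss_u, q, f_manual, f_wald, t_bath and chisq_wald are illustrative only.

```r
library(AER)     # HousePrices data
library(lmtest)  # waldtest()

data(HousePrices)
mlr1 <- lm(price ~ lotsize + bedrooms, data = HousePrices)              # restricted
mlr2 <- lm(price ~ lotsize + bedrooms + bathrooms, data = HousePrices)  # unrestricted

# F statistic from residual sums of squares:
# F = ((RSS_r - RSS_u) / q) / (RSS_u / df_u)
rss_r <- sum(residuals(mlr1)^2)
rss_u <- sum(residuals(mlr2)^2)
q <- df.residual(mlr1) - df.residual(mlr2)  # number of restrictions
f_manual <- ((rss_r - rss_u) / q) / (rss_u / df.residual(mlr2))
p_manual <- pf(f_manual, q, df.residual(mlr2), lower.tail = FALSE)

# F statistic reported by waldtest() with the default covariance matrix
f_wald <- waldtest(mlr1, mlr2, test = "F")$F[2]

# With a single restriction, F equals the squared t statistic of bathrooms
t_bath <- coef(summary(mlr2))["bathrooms", "t value"]

# Chi-square version of the test: its statistic equals q times F
chisq_wald <- waldtest(mlr1, mlr2, test = "Chisq")$Chisq[2]

print(c(manual = f_manual, waldtest = f_wald,
        t_squared = t_bath^2, chisq = chisq_wald))
print(p_manual)
```

Because only one variable is omitted here, the F statistic, the squared t statistic and the chi-square statistic coincide; with several omitted variables the chi-square statistic would be q times the F statistic. According to the lmtest documentation, the same comparison can also be specified from the unrestricted model alone, for example waldtest(mlr2, "bathrooms") or waldtest(mlr2, . ~ . - bathrooms).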

References

[1] Data Description: Sales prices of houses sold in the city of Windsor, Canada, during July, August and September, 1987.

Original Source: Anglin, P., and Gencay, R. (1996). Semiparametric Estimation of a Hedonic Price Function. Journal of Applied Econometrics, 11, 633–648.

[2] AER R Package: Christian Kleiber and Achim Zeileis. (2008). Applied Econometrics with R. Springer-Verlag, New York.

lmtest R Package: Achim Zeileis and Torsten Hothorn. (2002). Diagnostic Checking in Regression Relationships. R News, 2(3), 7–10.
