# Omitted Variable Bias: Wald Test in R

Last Update: February 21, 2022

The omitted variable bias Wald test can be done in R using the `waldtest` function from the `lmtest` package, which evaluates whether independent variables omitted from a linear regression help explain the dependent variable. The main parameters of `waldtest` are `object`, which together with any further model arguments supplies the restricted and unrestricted `lm` objects, and `test`, a character string specifying whether to do an F-test (`"F"`) or a chi-square test (`"Chisq"`).

As an example, we test whether the number of bathrooms is an omitted variable by comparing a restricted model with the unrestricted multiple linear regression of house prices explained by lot size, number of bedrooms and number of bathrooms, using the `HousePrices` data included in the `AER` package [1].

First, we load the `AER` package for the data and the `lmtest` package for the Wald test [2].

``````In [1]:
library(AER)
library(lmtest)
``````

Second, we load the `HousePrices` data object from the `AER` package using the `data` function and print the first six rows and first four columns using the `head` function to view the `data.frame` structure.

``````In [2]:
data(HousePrices)
head(HousePrices[, 1:4])
``````
``````Out [2]:
  price lotsize bedrooms bathrooms
1 42000    5850        3         1
2 38500    4000        2         1
3 49500    3060        3         1
4 60500    6650        3         1
5 61000    6360        2         1
6 66000    4160        3         1
``````

Third, we fit the restricted multiple linear regression using the `lm` function and store the results in the `mlr1` object. Within `lm`, the parameter `formula = price ~ lotsize + bedrooms` specifies the restricted model, where house price is explained by lot size and number of bedrooms only.

``````In [3]:
mlr1 <- lm(formula = price ~ lotsize + bedrooms, data = HousePrices)
``````
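Before comparing models, it can help to look at what the restricted fit contains. The following is a minimal sketch, assuming the `AER` package is installed; it prints the coefficient table and the residual degrees of freedom, which reappear as `Res.Df` for Model 1 in the Wald test output.

```r
library(AER)  # provides the HousePrices data

data(HousePrices)

# Restricted model: bathrooms is deliberately left out
mlr1 <- lm(price ~ lotsize + bedrooms, data = HousePrices)

# Coefficient table: estimates, standard errors, t values and p-values
print(coef(summary(mlr1)))

# Residual degrees of freedom = observations minus estimated coefficients
print(nobs(mlr1))
print(df.residual(mlr1))
```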

Fourth, we fit the unrestricted multiple linear regression using the `lm` function, store the results in the `mlr2` object and do the Wald test using the `waldtest` function. Within `lm`, the parameter `formula = price ~ lotsize + bedrooms + bathrooms` specifies the unrestricted model, where house price is explained by lot size, number of bedrooms and number of bathrooms. Within `waldtest`, `object = mlr1` passes the restricted model, `mlr2` passes the unrestricted model as an additional unnamed argument, and `test = "F"` requests an F-test. Note that the `mlr1` and `mlr2` models and the `test = "F"` setting are only educational examples and can be modified according to your needs.

``````In [4]:
mlr2 <- lm(formula = price ~ lotsize + bedrooms + bathrooms, data = HousePrices)
waldtest(object = mlr1, mlr2, test = "F")
``````
``````Out [4]:
Wald test

Model 1: price ~ lotsize + bedrooms
Model 2: price ~ lotsize + bedrooms + bathrooms
  Res.Df Df      F    Pr(>F)
1    543
2    542  1 122.41 < 2.2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
``````
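The output reports F = 122.41 with a p-value below 2.2e-16, so the null hypothesis that the `bathrooms` coefficient is zero is rejected and the number of bathrooms is a relevant variable that the restricted model omits. As a rough sketch of where that statistic comes from, the code below recomputes it from the residual sums of squares of both models, F = ((RSS_r - RSS_u) / q) / (RSS_u / df_u). This assumes the default covariance estimator of `waldtest`; with a custom `vcov` the two numbers would generally differ. The names `rss_r`, `rss_u`, `q`, `df_u`, `f_manual`, `f_wald` and `t_bath` are illustrative only.

```r
library(AER)     # provides the HousePrices data
library(lmtest)  # provides waldtest

data(HousePrices)
mlr1 <- lm(price ~ lotsize + bedrooms, data = HousePrices)              # restricted
mlr2 <- lm(price ~ lotsize + bedrooms + bathrooms, data = HousePrices)  # unrestricted

# Residual sums of squares of the restricted and unrestricted models
rss_r <- sum(residuals(mlr1)^2)
rss_u <- sum(residuals(mlr2)^2)

# Number of restrictions (omitted variables) and unrestricted residual df
q    <- df.residual(mlr1) - df.residual(mlr2)
df_u <- df.residual(mlr2)

# Classical F statistic and its p-value
f_manual <- ((rss_r - rss_u) / q) / (rss_u / df_u)
p_manual <- pf(f_manual, df1 = q, df2 = df_u, lower.tail = FALSE)

# Statistic reported by waldtest (second row of the returned table)
f_wald <- waldtest(mlr1, mlr2, test = "F")$F[2]

# With a single restriction, F also equals the squared t value of bathrooms
t_bath <- coef(summary(mlr2))["bathrooms", "t value"]

print(c(manual = f_manual, waldtest = f_wald, t_squared = t_bath^2))
print(p_manual)
```

Because only one variable is omitted here, the F-test is equivalent to the two-sided t-test on `bathrooms` in the unrestricted model; with several omitted variables only the joint F comparison applies.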

References

[1] Data Description: Sales prices of houses sold in the city of Windsor, Canada, during July, August and September, 1987.

Original Source: Anglin, P., and Gencay, R. (1996). Semiparametric Estimation of a Hedonic Price Function. Journal of Applied Econometrics, 11, 633–648.

[2] AER R Package: Christian Kleiber and Achim Zeileis. (2008). Applied Econometrics with R. Springer-Verlag, New York.

lmtest R Package: Achim Zeileis and Torsten Hothorn. (2002). Diagnostic Checking in Regression Relationships. R News, 2 (3): 7-10.
