# Heteroskedasticity: Breusch-Pagan and White Tests in R

Last Update: February 21, 2022

The Breusch-Pagan test for heteroskedasticity can be run in R with the `bptest` function from the `lmtest` package. It evaluates whether the independent variables of a linear regression explain the variance of its errors. The main parameters of `bptest` are `formula`, which takes the fitted `lm` model to be tested, and `varformula`, which takes a formula describing the independent variables used to explain the error variance.

The White test for heteroskedasticity can also be run with the `bptest` function from the `lmtest` package. It evaluates whether the independent variables and their squares explain the variance of the regression errors. The parameters are the same: `formula` takes the fitted `lm` model to be tested, and `varformula` takes a formula describing the independent variables and squared independent variables used to explain the error variance.

As an example, we run the Breusch-Pagan, White (no cross terms) and White (cross terms) tests on a multiple linear regression of house price explained by lot size and number of bedrooms, using the `HousePrices` data object included in the `AER` package.

First, we load the `AER` package for the data and the `lmtest` package for the Breusch-Pagan and White tests.

``````In :
library(AER)
library(lmtest)``````

Second, we load the `HousePrices` data object from the `AER` package using the `data` function and print the first six rows and first three columns using the `head` function to view the `data.frame` structure.

``````In :
data(HousePrices)
head(HousePrices[, 1:3])
``````Out :
price lotsize bedrooms
1 42000    5850        3
2 38500    4000        2
3 49500    3060        3
4 60500    6650        3
5 61000    6360        2
6 66000    4160        3``````

Third, we fit a multiple linear regression using the `lm` function and store the results in the `mlr` object. Within `lm`, the parameter `formula = price ~ lotsize + bedrooms` fits a model where house price is explained by lot size and number of bedrooms.

``````In :
mlr <- lm(formula = price ~ lotsize + bedrooms, data = HousePrices)``````

Fourth, we run the Breusch-Pagan test using the `bptest` function. Within `bptest`, the parameter `formula = mlr` passes the `mlr` model to be tested, and `varformula = ~ lotsize + bedrooms` gives the formula describing the independent variables used to explain the variance of the `mlr` model errors.

``````In :
bptest(formula = mlr, varformula = ~ lotsize + bedrooms, data = HousePrices)``````
``````Out :
studentized Breusch-Pagan test

data:  mlr
BP = 66.222, df = 2, p-value = 4.17e-15``````
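
The statistic reported by `bptest` can also be reproduced by hand, which may help clarify what the test measures. The sketch below assumes the default `studentize = TRUE` setting (Koenker's version), where the statistic equals the number of observations times the R-squared of an auxiliary regression of the squared residuals on the independent variables. The names `aux_data`, `aux`, `bp_manual`, `df_manual` and `p_manual` are introduced here for illustration only.

```r
library(AER)     # provides the HousePrices data
library(lmtest)  # provides bptest()

data(HousePrices)
mlr <- lm(formula = price ~ lotsize + bedrooms, data = HousePrices)

# Auxiliary regression: squared residuals explained by the same regressors
aux_data <- transform(HousePrices, sq_resid = residuals(mlr)^2)
aux <- lm(formula = sq_resid ~ lotsize + bedrooms, data = aux_data)

# Studentized statistic: n * R-squared of the auxiliary regression
n <- nobs(mlr)
bp_manual <- n * summary(aux)$r.squared

# Degrees of freedom: number of auxiliary regressors (intercept excluded)
df_manual <- length(coef(aux)) - 1
p_manual <- pchisq(bp_manual, df = df_manual, lower.tail = FALSE)

# Compare the manual value with the packaged test
bp_pkg <- bptest(formula = mlr)
print(c(manual = bp_manual, bptest = unname(bp_pkg$statistic)))
```

If the two printed values agree, the packaged test and the auxiliary regression are measuring the same thing: how much of the variation in squared residuals the independent variables explain.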

Fifth, we run the White test (no cross terms), again using the `bptest` function. The parameter `formula = mlr` passes the `mlr` model to be tested, and `varformula = ~ lotsize + I(lotsize^2) + bedrooms + I(bedrooms^2)` gives the formula describing the independent variables and squared independent variables used to explain the variance of the `mlr` model errors. Within `varformula`, the `I` function prevents `^` from being interpreted as a formula operator, so that it acts as the arithmetic power operator instead. Notice that `bptest` still prints the `studentized Breusch-Pagan test` title, because the White test (no cross terms) is run here as a Breusch-Pagan test with an extended `varformula`.

``````In :
bptest(formula = mlr, varformula = ~ lotsize + I(lotsize^2) + bedrooms + I(bedrooms^2), data = HousePrices)``````
``````Out :
studentized Breusch-Pagan test

data:  mlr
BP = 67.253, df = 4, p-value = 8.622e-14``````
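
To see why `I` is needed, the term labels of a formula can be inspected with base R's `terms` function. This is a small illustrative check rather than part of the test itself; the names `labels_plain` and `labels_squared` are introduced here for the example.

```r
# Without I(): `^` is a formula operator, so lotsize^2 collapses to lotsize
labels_plain <- attr(terms(~ lotsize + lotsize^2), "term.labels")

# With I(): the square is kept as a separate arithmetic term
labels_squared <- attr(terms(~ lotsize + I(lotsize^2)), "term.labels")

print(labels_plain)
print(labels_squared)
```

Without `I`, the squared term silently disappears, so the test would be an ordinary Breusch-Pagan test on the unsquared variables only.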

Sixth, we run the White test (cross terms) using the `bptest` function. The parameter `formula = mlr` passes the `mlr` model to be tested, and `varformula = ~ lotsize + I(lotsize^2) + lotsize*bedrooms + bedrooms + I(bedrooms^2)` gives the formula describing the independent variables, their squares and their product used to explain the variance of the `mlr` model errors. As before, the `I` function makes `^` act as the arithmetic power operator rather than as a formula operator. Notice that `bptest` still prints the `studentized Breusch-Pagan test` title, because the White test (cross terms) is run here as a Breusch-Pagan test with an extended `varformula`. Also notice that the White test (cross terms) is sensitive to both heteroskedasticity and misspecification of the model equation, so a rejection can reflect either one.

``````In :
bptest(formula = mlr, varformula = ~ lotsize + I(lotsize^2) + lotsize*bedrooms + bedrooms + I(bedrooms^2), data = HousePrices)``````
``````Out :
studentized Breusch-Pagan test

data:  mlr
BP = 67.324, df = 5, p-value = 3.69e-13``````
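
The `df = 5` in this output can be checked against the formula itself. In R formulas, `lotsize*bedrooms` expands to both main effects plus their interaction, and repeated terms are counted once. The sketch below uses base R's `terms` function to list the distinct terms; `white_cross` and `labels_cross` are names introduced here for illustration.

```r
# Variance formula used for the White test with cross terms
white_cross <- ~ lotsize + I(lotsize^2) + lotsize*bedrooms + bedrooms + I(bedrooms^2)

# List the distinct terms after R expands `*` and drops repeated terms
labels_cross <- attr(terms(white_cross), "term.labels")

print(labels_cross)
print(length(labels_cross))
```

The number of distinct terms should match the degrees of freedom reported by `bptest`, with `lotsize:bedrooms` appearing as the single cross term.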

Data Description: Sales prices of houses sold in the city of Windsor, Canada, during July, August and September, 1987.

Original Source: Anglin, P., and Gencay, R. (1996). Semiparametric Estimation of a Hedonic Price Function. Journal of Applied Econometrics, 11, 633–648.

AER R Package: Kleiber, C., and Zeileis, A. (2008). Applied Econometrics with R. Springer-Verlag, New York.

lmtest R Package: Zeileis, A., and Hothorn, T. (2002). Diagnostic Checking in Regression Relationships. R News, 2(3), 7–10.
