Last Update: February 21, 2022
Heteroskedasticity: Breusch-Pagan Test in R can be done using the lmtest package bptest function, which evaluates whether the independent variables of a linear regression explain the variance of its errors. The main parameters of the bptest function are formula, with the lm model to be tested, and varformula, with a formula describing the independent variables used to explain the variance of the model errors.
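As a rough illustration of the idea behind the test, the sketch below computes a studentized Breusch-Pagan statistic by hand using base R only. It assumes the studentized (Koenker) form, in which the statistic is the sample size times the R-squared of an auxiliary regression of squared residuals on the variance regressors. The simulated data and the names sim, fit, aux, bp_stat, bp_df and bp_p are hypothetical and not part of the original example.

```r
# Simulated data (hypothetical): the error spread grows with x1.
set.seed(1)
n  <- 200
x1 <- runif(n, 1, 10)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - x2 + rnorm(n, sd = x1)
sim <- data.frame(y, x1, x2)

# Main regression whose errors are tested.
fit <- lm(y ~ x1 + x2, data = sim)

# Auxiliary regression: squared residuals on the variance regressors.
sim$e2 <- residuals(fit)^2
aux <- lm(e2 ~ x1 + x2, data = sim)

# Statistic = n * R-squared, compared with a chi-squared distribution
# whose degrees of freedom equal the number of variance regressors.
bp_stat <- n * summary(aux)$r.squared
bp_df   <- length(coef(aux)) - 1
bp_p    <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)
c(statistic = bp_stat, df = bp_df, p.value = bp_p)
```

A small p-value here would suggest that the regressors help explain the error variance, which is the pattern the test is designed to detect.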
Heteroskedasticity: White Test in R can also be done using the lmtest package bptest function, which here evaluates whether the independent variables and squared independent variables of a linear regression explain the variance of its errors. The main parameters of the bptest function are formula, with the lm model to be tested, and varformula, with a formula describing the independent variables and squared independent variables used to explain the variance of the model errors.
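Writing the squared terms by hand becomes tedious with many regressors. The following is a minimal sketch of a helper that builds such a one-sided formula from variable names using base R's reformulate. The helper name white_varformula is hypothetical and is not part of lmtest.

```r
# Hypothetical helper: one-sided formula with each variable and its square.
white_varformula <- function(vars) {
  squares <- paste0("I(", vars, "^2)")
  # Interleave levels and squares: x1, I(x1^2), x2, I(x2^2), ...
  reformulate(as.vector(rbind(vars, squares)))
}

vf <- white_varformula(c("lotsize", "bedrooms"))
vf
attr(terms(vf), "term.labels")
```

Under these assumptions, the resulting formula could be passed as the varformula parameter of bptest in place of a hand-written one.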
As an example, we run Breusch-Pagan, White (no cross terms) and White (cross terms) tests on a multiple linear regression of house price explained by lot size and number of bedrooms, using the HousePrices data object included in the AER package [1].
First, we load the AER package for data and the lmtest package for the Breusch-Pagan and White tests [2].
In [1]:
library(AER)
library(lmtest)
Second, we load the HousePrices data object from the AER package using the data function and print the first six rows and first three columns using the head function to view the data.frame structure.
In [2]:
data(HousePrices)
head(HousePrices[, 1:3])
Out [2]:
  price lotsize bedrooms
1 42000    5850        3
2 38500    4000        2
3 49500    3060        3
4 60500    6650        3
5 61000    6360        2
6 66000    4160        3
Third, we fit the multiple linear regression using the lm function and store the results in the mlr object. Within the lm function, the parameter formula = price ~ lotsize + bedrooms fits a model where house price is explained by lot size and number of bedrooms.
In [3]:
mlr <- lm(formula = price ~ lotsize + bedrooms, data = HousePrices)
Fourth, we do the Breusch-Pagan test using the bptest function. Within the bptest function, the parameter formula = mlr supplies the mlr model to be tested and the parameter varformula = ~ lotsize + bedrooms supplies the formula describing the independent variables used to explain the variance of the mlr model errors.
In [4]:
bptest(formula = mlr, varformula = ~ lotsize + bedrooms, data = HousePrices)
Out [4]:
studentized Breusch-Pagan test
data: mlr
BP = 66.222, df = 2, p-value = 4.17e-15
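To see where the p-value in Out [4] comes from, the sketch below recomputes it from the reported statistic and degrees of freedom using the upper tail of the chi-squared distribution. The variable names are hypothetical, and the 5% threshold is only a conventional choice assumed here for illustration.

```r
bp_stat <- 66.222  # statistic reported in Out [4]
bp_df   <- 2       # one degree of freedom each for lotsize and bedrooms

# Upper-tail chi-squared probability of a statistic at least this large.
p_value <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)
p_value

# TRUE means homoskedasticity is rejected at the assumed 5% level.
p_value < 0.05
```

With a p-value this small, the null hypothesis of homoskedastic errors is rejected at any usual significance level.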
Fifth, we also do the White test (no cross terms) using the bptest function. Within the bptest function, the parameter formula = mlr supplies the mlr model to be tested and the parameter varformula = ~ lotsize + I(lotsize^2) + bedrooms + I(bedrooms^2) supplies the formula describing the independent variables and squared independent variables used to explain the variance of the mlr model errors. Within the varformula parameter, the I function is used so that the ^ operators are not interpreted as formula operators and act as arithmetic operators instead. Notice that the bptest function prints the title studentized Breusch-Pagan test, but because varformula includes the squared terms, the test performed is a White test (no cross terms).
In [5]:
bptest(formula = mlr, varformula = ~ lotsize + I(lotsize^2) + bedrooms + I(bedrooms^2), data = HousePrices)
Out [5]:
studentized Breusch-Pagan test
data: mlr
BP = 67.253, df = 4, p-value = 8.622e-14
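The role of the I function in the varformula above can be checked on a toy data frame. The sketch below compares the design matrix columns produced with and without I, using base R's model.matrix. The data frame d and the other variable names are hypothetical.

```r
d <- data.frame(x = 1:5)

# Without I(), ^ is a formula operator and x^2 collapses to x.
cols_plain <- colnames(model.matrix(~ x + x^2, data = d))

# With I(), ^ is arithmetic and a squared column is added.
cols_protected <- colnames(model.matrix(~ x + I(x^2), data = d))

cols_plain
cols_protected
```

Only the second formula yields a squared column, which is why the squared regressors in varformula are wrapped in I.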
Sixth, we additionally do the White test (cross terms) using the bptest function. Within the bptest function, the parameter formula = mlr supplies the mlr model to be tested and the parameter varformula = ~ lotsize + I(lotsize^2) + lotsize*bedrooms + bedrooms + I(bedrooms^2) supplies the formula describing the independent variables, squared independent variables and product of independent variables used to explain the variance of the mlr model errors. Within the varformula parameter, the I function is used so that the ^ operators are not interpreted as formula operators and act as arithmetic operators instead. Notice that the bptest function prints the title studentized Breusch-Pagan test, but because varformula includes the squared terms and the product term, the test performed is a White test (cross terms). Also, notice that the White test (cross terms) is sensitive to both heteroskedasticity and model equation misspecification, so a rejection can reflect either.
In [6]:
bptest(formula = mlr, varformula = ~ lotsize + I(lotsize^2) + lotsize*bedrooms + bedrooms + I(bedrooms^2), data = HousePrices)
Out [6]:
studentized Breusch-Pagan test
data: mlr
BP = 67.324, df = 5, p-value = 3.69e-13
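The five degrees of freedom in Out [6] follow from how R expands the varformula used in In [6]. The sketch below lists the expanded terms with base R's terms function. The variable names vf_cross and term_labels are hypothetical.

```r
vf_cross <- ~ lotsize + I(lotsize^2) + lotsize*bedrooms + bedrooms + I(bedrooms^2)

# lotsize*bedrooms expands to lotsize + bedrooms + lotsize:bedrooms,
# and duplicated main effects are kept only once.
term_labels <- attr(terms(vf_cross), "term.labels")
term_labels
length(term_labels)
```

The lotsize:bedrooms entry is the cross term, so the auxiliary regression has one regressor more than the no cross terms version.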
References
[1] Data Description: Sales prices of houses sold in the city of Windsor, Canada, during July, August and September, 1987.
Original Source: Anglin, P., and Gencay, R. (1996). Semiparametric Estimation of a Hedonic Price Function. Journal of Applied Econometrics, 11, 633–648.
[2] AER R Package: Christian Kleiber and Achim Zeileis. (2008). Applied Econometrics with R. Springer-Verlag, New York.
lmtest R Package: Achim Zeileis and Torsten Hothorn. (2002). Diagnostic Checking in Regression Relationships. R News, 2 (3): 7-10.