Last Update: February 21, 2022
Homogeneity of Regression Slopes: Dummy Variables in R can be done using the waldtest function from the lmtest package for evaluating whether the linear regression intercept and slopes are homogeneous across populations. The main parameters of the waldtest function are object, with the restricted and unrestricted linear regression lm objects, and test, with a string specifying whether to do an F-test or a chi-square test.
As an example, we can do a homogeneity Wald test on an unrestricted multiple linear regression of house prices explained by lot size, number of bedrooms and air conditioning as a dummy independent variable, using data included within the HousePrices object.
First, we load the AER package for the data and the lmtest package for the Wald test.
In :
    library(AER)
    library(lmtest)
Second, we load the HousePrices data object from the AER package using the data function and print the first six rows of its first three columns and tenth column using the head function to view the data.
In :
    data(HousePrices)
    head(HousePrices[, c(1:3, 10)])
Out :
      price lotsize bedrooms aircon
    1 42000    5850        3     no
    2 38500    4000        2     no
    3 49500    3060        3     no
    4 60500    6650        3     no
    5 61000    6360        2     no
    6 66000    4160        3    yes
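Because aircon is later used as a dummy independent variable, it can help to check how R stores and codes it before fitting any model. The following is a minimal sketch, assuming the AER package is installed and R's default treatment contrasts are in effect; it only inspects the column and does not change the data.

```r
library(AER)  # provides the HousePrices data set

data(HousePrices)

# aircon is stored as a factor, not as a numeric 0/1 column
class(HousePrices$aircon)
levels(HousePrices$aircon)

# Default treatment contrasts: the first level ("no") is the reference
# category coded 0, and the second level ("yes") is coded 1
contrasts(HousePrices$aircon)

# Number of houses in each air conditioning group
table(HousePrices$aircon)
```

The contrasts matrix has a single column named yes, which is why the dummy coefficient appears as airconyes in the lm output.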
Third, we fit the restricted multiple linear regression using the lm function and store the results within the mlr1 object. Within the lm function, parameter formula = price ~ lotsize + bedrooms fits the restricted model where house price is explained by its lot size and number of bedrooms, with the same intercept and slopes for houses with and without air conditioning.
In :
    mlr1 <- lm(formula = price ~ lotsize + bedrooms, data = HousePrices)
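In equation form, the restricted model and the unrestricted alternative it is compared against can be sketched as follows. The symbols are notation assumed here for illustration and do not come from the original text: D_i is the air conditioning dummy (1 for yes, 0 for no), the beta coefficients are shared by both groups, and the delta coefficients measure how the intercept and slopes differ for houses with air conditioning.

```latex
\begin{align*}
\text{Restricted:}\quad
  \text{price}_i &= \beta_0 + \beta_1\,\text{lotsize}_i + \beta_2\,\text{bedrooms}_i + \varepsilon_i \\
\text{Unrestricted:}\quad
  \text{price}_i &= \beta_0 + \beta_1\,\text{lotsize}_i + \beta_2\,\text{bedrooms}_i \\
                 &\quad + \delta_0\,D_i + \delta_1\,(\text{lotsize}_i \cdot D_i)
                   + \delta_2\,(\text{bedrooms}_i \cdot D_i) + u_i \\
H_0:\quad \delta_0 &= \delta_1 = \delta_2 = 0
\end{align*}
```

Under the null hypothesis the unrestricted model collapses to the restricted one, so the homogeneity test involves three restrictions.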
Fourth, again as an example, we fit the unrestricted multiple linear regression using the lm function, store the results within the mlr2 object and print the mlr2 object summary results using the summary.lm function. Within the lm function, parameter formula = price ~ lotsize + bedrooms + aircon + lotsize*aircon + bedrooms*aircon fits the unrestricted model where house price is explained by its lot size, number of bedrooms, air conditioning as a dummy independent variable, and the products of that dummy with lot size and with number of bedrooms. Notice that the lm function parameter formula can also be formula = price ~ lotsize*aircon + bedrooms*aircon because the * operator automatically includes the lotsize, bedrooms and aircon individual independent variables and their lotsize*aircon and bedrooms*aircon products within the model equation. Also, notice that the lm function automatically converts the aircon factor into a dummy variable, with the yes category coded as the 1 numeric value and the no category coded as the 0 numeric value. Additionally, notice that the aircon dummy independent variable was only included as an educational example which can be modified according to your needs.
In :
    mlr2 <- lm(formula = price ~ lotsize + bedrooms + aircon + lotsize*aircon + bedrooms*aircon, data = HousePrices)
    summary.lm(mlr2)
Out :
    Call:
    lm(formula = price ~ lotsize + bedrooms + aircon + lotsize * aircon + bedrooms * aircon, data = HousePrices)

    Residuals:
       Min     1Q Median     3Q    Max
    -67843 -12577  -1124   9250  90491

    Coefficients:
                         Estimate Std. Error t value Pr(>|t|)
    (Intercept)         1.536e+04  4.263e+03   3.603 0.000344 ***
    lotsize             4.621e+00  4.660e-01   9.915  < 2e-16 ***
    bedrooms            7.709e+03  1.326e+03   5.813 1.05e-08 ***
    airconyes          -1.423e+04  9.434e+03  -1.509 0.131932
    lotsize:airconyes   2.438e+00  8.824e-01   2.763 0.005921 **
    bedrooms:airconyes  6.125e+03  2.661e+03   2.302 0.021731 *
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 19370 on 540 degrees of freedom
    Multiple R-squared: 0.4785, Adjusted R-squared: 0.4737
    F-statistic: 99.09 on 5 and 540 DF, p-value: < 2.2e-16
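Two points from this step can be checked directly: that the long and short formulas describe the same model, and what the interaction coefficients imply for each group. The sketch below is illustrative only; it assumes the AER package is installed, and the names long_fit, short_fit and group_lines are hypothetical and not part of the original example.

```r
library(AER)  # provides the HousePrices data set

data(HousePrices)

# The same unrestricted model written with the long and the short formula
long_fit  <- lm(price ~ lotsize + bedrooms + aircon + lotsize*aircon + bedrooms*aircon,
                data = HousePrices)
short_fit <- lm(price ~ lotsize*aircon + bedrooms*aircon, data = HousePrices)

# Coefficient names and order may differ between the two fits,
# so compare fitted values rather than coefficients
all.equal(fitted(long_fit), fitted(short_fit))

# Implied intercept and slopes for each air conditioning group:
# the "yes" group adds the dummy and interaction coefficients to the base ones
b <- coef(long_fit)
group_lines <- rbind(
  no  = c(intercept = b[["(Intercept)"]],
          lotsize   = b[["lotsize"]],
          bedrooms  = b[["bedrooms"]]),
  yes = c(intercept = b[["(Intercept)"]] + b[["airconyes"]],
          lotsize   = b[["lotsize"]] + b[["lotsize:airconyes"]],
          bedrooms  = b[["bedrooms"]] + b[["bedrooms:airconyes"]])
)
group_lines
```

Each row of group_lines is the regression line for one group, which makes the size of the slope differences easier to read than the raw interaction terms.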
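Fifth, we do the Wald test using the waldtest function. Within the waldtest function, parameters object = mlr1, mlr2 include the restricted mlr1 and unrestricted mlr2 model results, and test = "F" includes the string to do an F-test. The null hypothesis is that the coefficients of aircon and of its products with lotsize and bedrooms are jointly zero, that is, that the intercept and slopes are homogeneous across both groups of houses. Notice that the mlr1 and mlr2 models and the waldtest function parameter test = "F" were only included as educational examples which can be modified according to your needs.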
In :
    waldtest(object = mlr1, mlr2, test = "F")
Out :
    Wald test

    Model 1: price ~ lotsize + bedrooms
    Model 2: price ~ lotsize + bedrooms + aircon + lotsize * aircon + bedrooms * aircon
      Res.Df Df     F    Pr(>F)
    1    543
    2    540  3 37.35 < 2.2e-16 ***
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
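As a cross-check, the F statistic above can be reproduced in two other ways: with the anova function from base R, and by hand from the residual sums of squares of the two models. The sketch below assumes the AER and lmtest packages are installed and that waldtest is used with its default covariance estimate. The names wald_f, classic_f, f_manual and wald_chisq are hypothetical and only used for illustration.

```r
library(AER)     # HousePrices data set
library(lmtest)  # waldtest function

data(HousePrices)

mlr1 <- lm(price ~ lotsize + bedrooms, data = HousePrices)
mlr2 <- lm(price ~ lotsize + bedrooms + aircon + lotsize*aircon + bedrooms*aircon,
           data = HousePrices)

# Wald F-test as in the text, and the classical nested-model F-test
wald_f    <- waldtest(mlr1, mlr2, test = "F")
classic_f <- anova(mlr1, mlr2)

# F statistic by hand from the residual sums of squares
rss_r <- deviance(mlr1)                        # restricted RSS
rss_u <- deviance(mlr2)                        # unrestricted RSS
q     <- df.residual(mlr1) - df.residual(mlr2) # number of restrictions
f_manual <- ((rss_r - rss_u) / q) / (rss_u / df.residual(mlr2))

c(wald = wald_f$F[2], anova = classic_f$F[2], manual = f_manual)

# Chi-square version of the same Wald test
wald_chisq <- waldtest(mlr1, mlr2, test = "Chisq")
wald_chisq
```

With three restrictions and a p-value far below 0.05, the test rejects homogeneity, which indicates that the intercept and slopes differ between houses with and without air conditioning. The chi-square variant tests the same restrictions using a chi-square reference distribution instead of an F distribution.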
 Data Description: Sales prices of houses sold in the city of Windsor, Canada, during July, August and September, 1987.
Original Source: Anglin, P., and Gencay, R. (1996). Semiparametric Estimation of a Hedonic Price Function. Journal of Applied Econometrics, 11, 633–648.
 AER R Package: Christian Kleiber and Achim Zeileis. (2008). Applied Econometrics with R. Springer-Verlag, New York.
lmtest R Package: Achim Zeileis and Torsten Hothorn. (2002). Diagnostic Checking in Regression Relationships. R News, 2 (3): 7-10.