# Homogeneity of Regression Slopes: Dummy Variables in R

Last Update: February 21, 2022

Homogeneity of regression slopes with dummy variables can be tested in R using the `waldtest` function from the `lmtest` package, which evaluates whether the linear regression intercept and slopes are homogeneous across populations. The main parameters of `waldtest` are `object`, followed by further model arguments, which together supply the restricted and unrestricted `lm` regression objects, and `test`, a string specifying whether to perform an F-test (`"F"`) or a chi-square test (`"Chisq"`).

As an example, we perform a homogeneity Wald test on an unrestricted multiple linear regression of house prices explained by lot size, number of bedrooms, and air conditioning as a dummy independent variable, using the `HousePrices` data set included in the `AER` package [1].
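In equation form, the comparison can be sketched as follows. The symbols $\beta$, $\delta$, $\varepsilon$ and $D_i$ are notation introduced here for illustration and are not taken from the package documentation; $D_i$ is assumed to equal 1 for houses with air conditioning and 0 otherwise.

```latex
\begin{aligned}
\text{Restricted:}\quad & \text{price}_i = \beta_0 + \beta_1\,\text{lotsize}_i + \beta_2\,\text{bedrooms}_i + \varepsilon_i \\
\text{Unrestricted:}\quad & \text{price}_i = \beta_0 + \beta_1\,\text{lotsize}_i + \beta_2\,\text{bedrooms}_i
  + \delta_0 D_i + \delta_1 D_i\,\text{lotsize}_i + \delta_2 D_i\,\text{bedrooms}_i + \varepsilon_i \\
H_0:\quad & \delta_0 = \delta_1 = \delta_2 = 0
\end{aligned}
```

Under $H_0$ both groups share the same intercept and slopes, so the unrestricted model collapses to the restricted one. The Wald test evaluates these three restrictions jointly.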

First, we load the `AER` package for the data and the `lmtest` package for the Wald test [2].

``````In [1]:
library(AER)
library(lmtest)
``````

Second, we load the `HousePrices` data object from the `AER` package using the `data` function and print the first six rows of its first three columns and tenth column using the `head` function to view the `data.frame` structure.

``````In [2]:
data(HousePrices)
# Print the first six rows of columns 1 to 3 and column 10
head(HousePrices[, c(1:3, 10)])
``````
``````Out [2]:
  price lotsize bedrooms aircon
1 42000    5850        3     no
2 38500    4000        2     no
3 49500    3060        3     no
4 60500    6650        3     no
5 61000    6360        2     no
6 66000    4160        3    yes
``````
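Because `aircon` is stored as a factor, it can help to check how R will code it before fitting any model. The following is a minimal sketch using base R functions only. It assumes the `AER` package is installed and is not part of the original walkthrough.

```r
library(AER)  # provides the HousePrices data set

data(HousePrices)

# aircon is a factor; the first level acts as the reference category
aircon_levels <- levels(HousePrices$aircon)
aircon_levels

# Default treatment contrasts: reference level coded 0, other level coded 1
aircon_coding <- contrasts(HousePrices$aircon)
aircon_coding

# Number of houses in each air conditioning group
table(HousePrices$aircon)
```

The reference level shown here is the group that the intercept and the plain slopes describe once the dummy variable enters a regression.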

Third, we fit the restricted multiple linear regression using the `lm` function and store the results in the `mlr1` object. Within `lm`, the parameter `formula = price ~ lotsize + bedrooms` fits the restricted model, where house price is explained by lot size and number of bedrooms only.

``````In [3]:
mlr1 <- lm(formula = price ~ lotsize + bedrooms, data = HousePrices)
``````

Fourth, we fit the unrestricted multiple linear regression using the `lm` function, store the results in the `mlr2` object, and print its summary using the `summary.lm` function. Within `lm`, the parameter `formula = price ~ lotsize + bedrooms + aircon + lotsize*aircon + bedrooms*aircon` fits the unrestricted model, where house price is explained by lot size, number of bedrooms, air conditioning as a dummy independent variable, and the interactions of air conditioning with lot size and with number of bedrooms. Notice that the formula can also be written as `formula = price ~ lotsize*aircon + bedrooms*aircon`, because the `*` operator automatically includes the individual independent variables `lotsize`, `bedrooms`, and `aircon` together with their products. Also notice that `lm` automatically converts the `aircon` factor into a dummy variable that takes the value `1` for the `yes` category and `0` for the `no` category, which appears as `airconyes` in the output. Finally, the `aircon` dummy independent variable is included only as an educational example and can be replaced according to your needs.

``````In [4]:
mlr2 <- lm(formula = price ~ lotsize + bedrooms + aircon + lotsize*aircon + bedrooms*aircon, data = HousePrices)
summary.lm(mlr2)
``````
``````Out [4]:
Call:
lm(formula = price ~ lotsize + bedrooms + aircon + lotsize *
aircon + bedrooms * aircon, data = HousePrices)

Residuals:
   Min     1Q Median     3Q    Max
-67843 -12577  -1124   9250  90491

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
(Intercept)         1.536e+04  4.263e+03   3.603 0.000344 ***
lotsize             4.621e+00  4.660e-01   9.915  < 2e-16 ***
bedrooms            7.709e+03  1.326e+03   5.813 1.05e-08 ***
airconyes          -1.423e+04  9.434e+03  -1.509 0.131932
lotsize:airconyes   2.438e+00  8.824e-01   2.763 0.005921 **
bedrooms:airconyes  6.125e+03  2.661e+03   2.302 0.021731 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 19370 on 540 degrees of freedom
Multiple R-squared:  0.4785,	Adjusted R-squared:  0.4737
F-statistic: 99.09 on 5 and 540 DF,  p-value: < 2.2e-16
``````
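One way to read these coefficients is that the fully interacted model reproduces two separate regressions, one per air conditioning group. The sketch below illustrates this. The names `mlr2_short`, `fit_no`, `fit_yes`, `implied_no`, and `implied_yes` are introduced here for illustration only.

```r
library(AER)  # provides the HousePrices data set

data(HousePrices)

# Unrestricted model with the long formula and with the shorter equivalent
mlr2 <- lm(price ~ lotsize + bedrooms + aircon + lotsize*aircon + bedrooms*aircon,
           data = HousePrices)
mlr2_short <- lm(price ~ lotsize*aircon + bedrooms*aircon, data = HousePrices)

# Both formulas describe the same model, so the fitted values should agree
same_fit <- all.equal(fitted(mlr2), fitted(mlr2_short))

# Intercept and slopes implied for each group by the interacted model
b <- coef(mlr2)
implied_no  <- b[c("(Intercept)", "lotsize", "bedrooms")]
implied_yes <- implied_no +
  b[c("airconyes", "lotsize:airconyes", "bedrooms:airconyes")]

# Separate regressions fitted on each group
fit_no  <- lm(price ~ lotsize + bedrooms, data = HousePrices,
              subset = aircon == "no")
fit_yes <- lm(price ~ lotsize + bedrooms, data = HousePrices,
              subset = aircon == "yes")

# Compare implied and separately estimated coefficients
same_no  <- all.equal(unname(implied_no),  unname(coef(fit_no)))
same_yes <- all.equal(unname(implied_yes), unname(coef(fit_yes)))
c(same_fit = isTRUE(same_fit), same_no = isTRUE(same_no), same_yes = isTRUE(same_yes))
```

Read this way, `airconyes` is the difference in intercepts between the two groups, while `lotsize:airconyes` and `bedrooms:airconyes` are the differences in slopes.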

Fifth, we perform the Wald test using the `waldtest` function. Within `waldtest`, the restricted model `mlr1` is passed as `object`, the unrestricted model `mlr2` is passed as the next argument, and `test = "F"` requests an F-test. Notice that the `mlr1` and `mlr2` models and the `test = "F"` parameter are included only as educational examples and can be modified according to your needs.

``````In [5]:
waldtest(object = mlr1, mlr2, test = "F")
``````
``````Out [5]:
Wald test

Model 1: price ~ lotsize + bedrooms
Model 2: price ~ lotsize + bedrooms + aircon + lotsize * aircon + bedrooms *
aircon
  Res.Df Df     F    Pr(>F)
1    543
2    540  3 37.35 < 2.2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
``````
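As a cross-check, for nested `lm` models with the default covariance matrix, the F statistic from `waldtest` should match the classical F-test from base R `anova`. The sketch below compares the two and also runs the chi-square variant. The object names `wald_f`, `anova_f`, and `wald_chisq` are introduced here for illustration.

```r
library(AER)     # provides the HousePrices data set
library(lmtest)  # provides waldtest

data(HousePrices)

# Restricted and unrestricted models as in the steps above
mlr1 <- lm(price ~ lotsize + bedrooms, data = HousePrices)
mlr2 <- lm(price ~ lotsize + bedrooms + aircon + lotsize*aircon + bedrooms*aircon,
           data = HousePrices)

# Wald F-test and classical nested-model F-test
wald_f  <- waldtest(mlr1, mlr2, test = "F")
anova_f <- anova(mlr1, mlr2)

# The second row of each table holds the test of the three restrictions
c(waldtest = wald_f$F[2], anova = anova_f$F[2])

# Chi-square version of the same Wald test
wald_chisq <- waldtest(mlr1, mlr2, test = "Chisq")
wald_chisq
```

In the output above, the F statistic of 37.35 on 3 restrictions with a p-value below 2.2e-16 rejects the null hypothesis that the `aircon` intercept shift and both interaction terms are jointly zero. This suggests that the regression intercept and slopes are not homogeneous across houses with and without air conditioning.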


References

[1] Data Description: Sales prices of houses sold in the city of Windsor, Canada, during July, August and September, 1987.

Original Source: Anglin, P., and Gencay, R. (1996). Semiparametric Estimation of a Hedonic Price Function. Journal of Applied Econometrics, 11, 633–648.

[2] AER R Package: Christian Kleiber and Achim Zeileis. (2008). Applied Econometrics with R. Springer-Verlag, New York.

lmtest R Package: Achim Zeileis and Torsten Hothorn. (2002). Diagnostic Checking in Regression Relationships. R News, 2 (3): 7-10.
