Linear Regression: Analysis of Variance ANOVA Table

Last Update: February 21, 2022

The linear regression analysis of variance (ANOVA) table decomposes the total variance of the dependent variable y into two components: the regression (explained) variance of the model fitted values \hat{y} and the residual (unexplained) variance of the model random errors \hat{e}. It is also used to evaluate whether adding independent variables improves the linear regression model.

As an example, we can fit a multiple linear regression with two independent variables using the formula \hat{y}_{i}=\hat{\beta}_{0}+\hat{\beta}_{1}x_{1i}+\hat{\beta}_{2}x_{2i}\;(1). Then, we can calculate the ANOVA table regression degrees of freedom with formula df_{reg}=p\;(2), residual degrees of freedom with formula df_{res}=n-p-1\;(3) and total degrees of freedom with formula df_{tot}=n-1\;(4), where p is the number of independent variables, n is the number of observations and the 1 subtracted in (3) accounts for the constant term. Note that df_{tot}=df_{reg}+df_{res}.

Next, we can estimate the regression sum of squares with formula ss_{reg}=\sum_{i=1}^{n}(\hat{y}_{i}-\bar{y})^{2}\;(5), the residual sum of squares with formula ss_{res}=\sum_{i=1}^{n}(y_{i}-\hat{y}_{i})^{2}=\sum_{i=1}^{n}\hat{e}_{i}^{2}\;(6) and the total sum of squares with formula ss_{tot}=\sum_{i=1}^{n}(y_{i}-\bar{y})^{2}\;(7), where \hat{y}_{i} are the regression fitted values, \bar{y} is the dependent variable mean, y_{i} are the dependent variable values and \hat{e}_{i} are the regression residuals. When the model is fitted by ordinary least squares with a constant term, these satisfy ss_{tot}=ss_{reg}+ss_{res}.

After that, we estimate the regression mean square with formula ms_{reg}=\frac{ss_{reg}}{df_{reg}}\;(8) and the residual mean square (mean squared error) with formula ms_{res}=\frac{ss_{res}}{df_{res}}\;(9). Finally, we estimate the F-statistic with formula F=\frac{ms_{reg}}{ms_{res}}\;(10) and do an F-test with the joint null hypothesis that the coefficients of the independent variables x_{1},x_{2} are equal to zero, H_{0}:\beta_{1}=\beta_{2}=0\;(11). Under this null hypothesis, F follows an F distribution with df_{reg} and df_{res} degrees of freedom. If the joint null hypothesis is rejected, then at least one coefficient is statistically different from zero, so including the independent variables x_{1} and/or x_{2} improves the linear regression model relative to a model with only the constant term.
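The calculation in formulas (1) to (10) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are made up (they are not the house price data), the function names solve_linear_system and anova_table are hypothetical, and the fit uses the ordinary least squares normal equations solved with plain Gaussian elimination from the standard library rather than a statistics package.

```python
# Minimal sketch: ANOVA table for y = b0 + b1*x1 + b2*x2 fitted by ordinary least squares.
# All data and function names here are illustrative assumptions.


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def anova_table(y, xs):
    """Return ANOVA quantities for an OLS fit of y on the regressors in xs."""
    n, p = len(y), len(xs)
    # Design matrix rows: [1, x1_i, ..., xp_i]; the leading 1 is the constant term.
    rows = [[1.0] + [x[i] for x in xs] for i in range(n)]
    k = p + 1
    # Normal equations (X'X) beta = X'y.
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    beta = solve_linear_system(xtx, xty)

    y_hat = [sum(bj * rj for bj, rj in zip(beta, r)) for r in rows]  # formula (1)
    y_bar = sum(y) / n

    df_reg, df_res, df_tot = p, n - p - 1, n - 1                     # formulas (2)-(4)
    ss_reg = sum((yh - y_bar) ** 2 for yh in y_hat)                  # formula (5)
    ss_res = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))         # formula (6)
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)                      # formula (7)
    ms_reg = ss_reg / df_reg                                         # formula (8)
    ms_res = ss_res / df_res                                         # formula (9)
    f_stat = ms_reg / ms_res                                         # formula (10)
    return {
        "df_reg": df_reg, "df_res": df_res, "df_tot": df_tot,
        "ss_reg": ss_reg, "ss_res": ss_res, "ss_tot": ss_tot,
        "ms_reg": ms_reg, "ms_res": ms_res, "f_stat": f_stat,
    }


# Made-up illustrative data with n = 8 observations and p = 2 independent variables.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0]
y = [3.1, 3.9, 7.2, 7.8, 11.1, 12.0, 14.9, 16.1]

table = anova_table(y, [x1, x2])
for name, value in table.items():
    print(f"{name}: {value}")
```

The F-statistic returned here would then be compared with a critical value of the F distribution with df_reg and df_res degrees of freedom (or converted to a p-value) to decide whether to reject the joint null hypothesis (11). The sketch leaves that step out because the Python standard library has no F distribution.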

Below is an example of an analysis of variance (ANOVA) table from a multiple linear regression of house price explained by lot size and number of bedrooms [1].

Table 1. Microsoft Excel® analysis of variance ANOVA table from multiple linear regression of house price explained by its lot size and number of bedrooms.

References

[1] Data Description: Sales prices of houses sold in the city of Windsor, Canada, during July, August and September, 1987.

Original Source: Anglin, P., and Gencay, R. (1996). Semiparametric Estimation of a Hedonic Price Function. Journal of Applied Econometrics, 11, 633–648.

Source: AER R Package HousePrices Object. Christian Kleiber and Achim Zeileis. (2008). Applied Econometrics with R. Springer-Verlag, New York.
