# Multicollinearity: Variance Inflation Factor

Last Update: February 21, 2022

Multicollinearity occurs when two or more independent variables $x_{1},...,x_{p}$ of a linear regression are highly correlated, which complicates isolating their individual explanatory relationships with the dependent variable $y$. It can be tested through the estimated variance inflation factor $vif_{j}$ of each independent variable in the model. As a rule of thumb, if the estimated variance inflation factor $vif_{j}$ of independent variable $j$ is between five and ten, then that variable might be highly correlated with the other independent variables. If $vif_{j}$ is greater than ten, then the variable is highly correlated with them.
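The five and ten cut-offs can be written as a small helper. This is only a minimal sketch: the function name `classify_vif` and the label returned for values below five are assumptions made for illustration, since the text above only describes the two upper ranges.

```python
def classify_vif(vif: float) -> str:
    """Label a variance inflation factor using the five/ten rule of thumb."""
    if vif > 10.0:
        # Greater than ten: the variable is treated as highly correlated.
        return "highly correlated"
    if vif >= 5.0:
        # Between five and ten: the variable might be highly correlated.
        return "might be highly correlated"
    # Below five: label chosen for this sketch, not stated in the text.
    return "no strong sign of multicollinearity"
```

For instance, `classify_vif(7.5)` falls in the middle range, while `classify_vif(12.0)` falls in the upper one.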

As an example, we can fit a multiple linear regression with five variables (one dependent and four independent) with formula $\hat{y}_{i}=\hat{\beta}_{0}+\hat{\beta}_{1}x_{1i}+\hat{\beta}_{2}x_{2i}+\hat{\beta}_{3}x_{3i}+\hat{\beta}_{4}x_{4i}\;(1)$. Then, we can estimate the variance inflation factor of independent variable $x_{1}$ individually with formula $vif_{1}=\frac{1}{1-r_{1}^2}\;(2)$. That is, $vif_{1}$ is equal to one divided by one minus the coefficient of multiple determination $r_{1}^2$ from the auxiliary multiple linear regression with formula $\hat{x}_{1i}=\hat{\beta}_{0}+\hat{\beta}_{1}x_{2i}+\hat{\beta}_{2}x_{3i}+\hat{\beta}_{3}x_{4i}\;(3)$. Notice that regression $(3)$ only includes the independent variables of regression $(1)$: the variable $x_{1}$ whose variance inflation factor is being estimated becomes the dependent variable, while the others remain independent variables. The coefficients $\hat{\beta}$ in $(3)$ are estimated separately and are not the same as those in $(1)$.
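The auxiliary regression approach in formulas $(2)$ and $(3)$ can be sketched in Python with NumPy. This is an illustrative sketch under stated assumptions, not the code behind the example in this post: the function name `variance_inflation_factors` is hypothetical, and the data are synthetic random numbers generated only to show a nearly duplicated variable.

```python
import numpy as np


def variance_inflation_factors(X: np.ndarray) -> np.ndarray:
    """Return one variance inflation factor per column of X.

    X is an (n, p) array holding only the independent variables,
    without an intercept column.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        target = X[:, j]                  # x_j becomes the dependent variable
        others = np.delete(X, j, axis=1)  # the rest stay independent variables
        design = np.column_stack([np.ones(n), others])  # add the intercept
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        residuals = target - design @ coef
        ss_res = residuals @ residuals
        ss_tot = ((target - target.mean()) ** 2).sum()
        r_squared = 1.0 - ss_res / ss_tot  # coefficient of determination r_j^2
        vifs[j] = 1.0 / (1.0 - r_squared)  # formula (2)
    return vifs


# Synthetic data (an assumption for illustration): x2 is nearly a copy of x1.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.1 * rng.normal(size=200)
x3 = rng.normal(size=200)
x4 = rng.normal(size=200)
vifs = variance_inflation_factors(np.column_stack([x1, x2, x3, x4]))
print(vifs)
```

In this setup the first two factors should come out far above ten and the last two close to one. Because $vif_{j}=\frac{1}{1-r_{j}^2}$, the thresholds of five and ten correspond to $r_{j}^2=0.8$ and $r_{j}^2=0.9$ respectively.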

Below is an example of the individually estimated variance inflation factors of the independent variables from a multiple linear regression of house price explained by lot size, number of bedrooms, number of bathrooms and number of stories [1].

References

[1] Data Description: Sales prices of houses sold in the city of Windsor, Canada, during July, August and September, 1987.

Original Source: Anglin, P., and Gencay, R. (1996). Semiparametric Estimation of a Hedonic Price Function. Journal of Applied Econometrics, 11, 633–648.

Source: AER R Package HousePrices Object. Christian Kleiber and Achim Zeileis. (2008). Applied Econometrics with R. Springer-Verlag, New York.
