Last Update: February 21, 2022
Multicollinearity in R can be tested using the vif function, which estimates the variance inflation factors of the independent variables in a multiple linear regression. The main parameter of the vif function is mod, the previously fitted lm model. The variance inflation factors can also be obtained as the main diagonal values of the inverse of the independent variables' correlation matrix, computed with the ginv function. The main parameter of the ginv function is X, the correlation matrix of the independent variables previously estimated using the cor function.
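The link between the two approaches can be written compactly. In the notation below, which is introduced here for illustration and does not appear in the original text, R_j^2 is the coefficient of determination from regressing the j-th independent variable on the remaining independent variables, and the bold R is the correlation matrix of the independent variables:

```latex
\mathrm{VIF}_j \;=\; \frac{1}{1 - R_j^2} \;=\; \left[\mathbf{R}^{-1}\right]_{jj}
```

A value of 1 means the variable is uncorrelated with the other independent variables; the value grows as R_j^2 approaches 1.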
As an example, we can test the multicollinearity of the independent variables in a multiple linear regression of house price explained by lot size and the number of bedrooms, bathrooms and stories, using the data included in the HousePrices object.
First, we load the packages AER for data, car for estimating variance inflation factors, MASS for estimating the inverse correlation matrix and corrplot for the inverse correlation matrix chart.
In :
library(AER)
library(car)
library(MASS)
library(corrplot)
Second, we load the HousePrices data object from the AER package using the data function. Then we preview the first six rows and first five columns of the data using the head function.
In :
data(HousePrices)
head(HousePrices[, 1:5])
Out :
  price lotsize bedrooms bathrooms stories
1 42000    5850        3         1       2
2 38500    4000        2         1       1
3 49500    3060        3         1       1
4 60500    6650        3         1       2
5 61000    6360        2         1       1
6 66000    4160        3         1       1
Third, we fit the multiple linear regression model using the lm function and store the outcome in the mlr object. Within the lm function, the parameter formula = price ~ lotsize + bedrooms + bathrooms + stories fits a model in which house price is explained by lot size and the number of bedrooms, bathrooms and stories. Then, we print the estimated variance inflation factors of the independent variables using the vif function. Within the vif function, the parameter mod = mlr passes the previously fitted lm model.
In :
mlr <- lm(formula = price ~ lotsize + bedrooms + bathrooms + stories, data = HousePrices)
vif(mod = mlr)
Out :
 lotsize bedrooms bathrooms  stories
1.047054 1.310851  1.239203 1.251087
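One way to see where these values come from: the variance inflation factor of an independent variable equals 1 / (1 - R²), where R² comes from an auxiliary regression of that variable on the remaining independent variables. The following is a minimal sketch for lotsize only; the object names aux and vif_lotsize are illustrative and not part of the original example.

```r
library(AER)   # HousePrices data
library(car)   # vif function

data(HousePrices)

# Main regression, same specification as the mlr model in this example
mlr <- lm(price ~ lotsize + bedrooms + bathrooms + stories, data = HousePrices)

# Auxiliary regression: lotsize explained by the other independent variables
aux <- lm(lotsize ~ bedrooms + bathrooms + stories, data = HousePrices)

# Variance inflation factor computed from the auxiliary R-squared
vif_lotsize <- 1 / (1 - summary(aux)$r.squared)
vif_lotsize

# Compare with the value reported by vif for the same variable
all.equal(vif_lotsize, unname(vif(mod = mlr)["lotsize"]))
```

The hand-computed value should agree with the lotsize entry printed by the vif function up to numerical precision. Repeating the auxiliary regression for each remaining variable should reproduce the other entries in the same way.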
Fourth, we create a data frame subset containing the independent variables and store it in the ivar object. Next, we compute the inverse of their correlation matrix using the ginv function and store the outcome in the ivaricor object; its main diagonal values are the estimated variance inflation factors of the independent variables. Within the ginv function, the parameter X = cor(ivar) passes the correlation matrix of the independent variables estimated using the cor function. We then label the rows and columns of ivaricor with the column names of ivar and print it.
In :
ivar <- HousePrices[, 2:5]
ivaricor <- ginv(X = cor(ivar))
colnames(ivaricor) <- colnames(ivar)
rownames(ivaricor) <- colnames(ivar)
ivaricor
Out :
               lotsize    bedrooms  bathrooms      stories
lotsize    1.047054041 -0.09909201 -0.1683001  0.007354973
bedrooms  -0.099092014  1.31085130 -0.3353444 -0.417827752
bathrooms -0.168300120 -0.33534441  1.2392031 -0.250688885
stories    0.007354973 -0.41782775 -0.2506889  1.251086952
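To compare the two approaches directly, the main diagonal can be extracted with the diag function and checked against the vif output. This is a small sketch rather than part of the original example: the object names vif_diag, vif_solve and vif_car are illustrative, and base R's solve function is used as an assumed alternative to ginv, which should give the same inverse whenever the correlation matrix is nonsingular.

```r
library(AER)   # HousePrices data
library(car)   # vif function
library(MASS)  # ginv function

data(HousePrices)
ivar <- HousePrices[, 2:5]
mlr <- lm(price ~ lotsize + bedrooms + bathrooms + stories, data = HousePrices)

# Variance inflation factors as the diagonal of the inverse correlation matrix
vif_diag <- diag(ginv(X = cor(ivar)))
names(vif_diag) <- colnames(ivar)

# Same diagonal using the ordinary matrix inverse from base R
vif_solve <- diag(solve(cor(ivar)))

# Values reported by the vif function for the fitted model
vif_car <- vif(mod = mlr)

# Both comparisons are expected to hold up to numerical precision
all.equal(unname(vif_diag), unname(vif_car))
all.equal(unname(vif_diag), unname(vif_solve))
```

The generalized inverse from ginv matters mainly when the correlation matrix is singular or nearly so, which is itself a sign of severe multicollinearity.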
Fifth, we can additionally visualize the estimated variance inflation factors as the main diagonal values of an inverse correlation matrix chart using the corrplot function. Within the corrplot function, the parameter corr = ivaricor specifies the matrix to visualize, method = "number" specifies the visualization method, and is.corr = FALSE indicates that the input matrix is not a correlation matrix (here, it is an inverse correlation matrix).
In :
corrplot(corr = ivaricor, method = "number", is.corr = FALSE)
Data Description: Sales prices of houses sold in the city of Windsor, Canada, during July, August and September, 1987.
Original Source: Anglin, P., and Gencay, R. (1996). Semiparametric Estimation of a Hedonic Price Function. Journal of Applied Econometrics, 11, 633–648.
AER R Package: Christian Kleiber and Achim Zeileis. (2008). Applied Econometrics with R. Springer-Verlag, New York.
car R Package: John Fox and Sanford Weisberg. (2019). An R Companion to Applied Regression. Third Edition. Sage, Thousand Oaks, CA.
MASS R Package: W. N. Venables and B. D. Ripley. (2002). Modern Applied Statistics with S. Fourth Edition. Springer, New York.
corrplot R Package: Taiyun Wei and Viliam Simko. (2021). R package ‘corrplot’: Visualization of a Correlation Matrix. Version 0.90.