Last Update: February 21, 2022
Multiple linear regression in Python can be fitted using the ols function found within the statsmodels.formula.api module. The main parameters of the ols function are formula, a model description string of the form "y ~ x1 + … + xp", and data, a data frame object that includes the model variables. Therefore, the code line ols(formula="y ~ x1 + x2", data=model_data).fit() fits a model of y explained by x1 and x2 using the variables included within the model_data data frame.
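As a minimal sketch of this pattern, the block below fits y ~ x1 + x2 on a small synthetic data frame. The contents of model_data, the column names and the generating coefficients (1, 2 and -3) are made up for illustration and are not part of the house price example. The response is generated without noise, so the estimates should match the generating values up to floating-point error. The formula interface adds the intercept term automatically.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: two predictors and a noise-free response (illustration only).
rng = np.random.default_rng(0)
model_data = pd.DataFrame({"x1": rng.normal(size=50), "x2": rng.normal(size=50)})
model_data["y"] = 1.0 + 2.0 * model_data["x1"] - 3.0 * model_data["x2"]

# The response goes on the left of "~" and the predictors on the right.
fit = smf.ols(formula="y ~ x1 + x2", data=model_data).fit()

# params is a pandas Series indexed by term name: Intercept, x1, x2.
print(fit.params)
```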
As an example, we can fit a multiple linear regression of house price explained by lot size and number of bedrooms using the data included within the HousePrices object of the AER R package.
First, we import the statsmodels package for data downloading and model fitting.
In :
import statsmodels.api as sm
import statsmodels.formula.api as smf
Second, we create the houseprices data object using the get_rdataset function and display the first five rows and first three columns of the data using the iloc indexer and the head data frame method to view its structure.
In :
houseprices = sm.datasets.get_rdataset(dataname="HousePrices", package="AER", cache=True).data
print(houseprices.iloc[:, 0:3].head())
Out :
     price  lotsize  bedrooms
0  42000.0     5850         3
1  38500.0     4000         2
2  49500.0     3060         3
3  60500.0     6650         3
4  61000.0     6360         2
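The column selection in this step can be tried without downloading anything. The sketch below rebuilds the five displayed rows by hand and adds a hypothetical fourth column named other, which only stands in for the remaining variables of the data set, to show what iloc[:, 0:3] keeps and drops.

```python
import pandas as pd

# The first three columns hold the five rows displayed above; "other" is a
# hypothetical placeholder column standing in for the remaining variables.
df = pd.DataFrame({
    "price": [42000.0, 38500.0, 49500.0, 60500.0, 61000.0],
    "lotsize": [5850, 4000, 3060, 6650, 6360],
    "bedrooms": [3, 2, 3, 3, 2],
    "other": ["a", "b", "c", "d", "e"],
})

# iloc[:, 0:3] keeps all rows and the columns at positions 0, 1 and 2
# (the end position 3 is excluded); head() then keeps the first five rows.
subset = df.iloc[:, 0:3].head()
print(list(subset.columns))
print(subset.shape)
```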
Third, we fit the model with the ols function using the variables within the houseprices data object, store the outcome within the mlr object and print its params attribute to observe the coefficient estimates. Within the ols function, the parameter formula="price ~ lotsize + bedrooms" fits a model where house price is explained by lot size and number of bedrooms.
In :
mlr = smf.ols(formula="price ~ lotsize + bedrooms", data=houseprices).fit()
print(mlr.params)
Out :
Intercept     5612.599731
lotsize          6.053022
bedrooms     10567.351501
dtype: float64
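To read these estimates, each additional unit of lot size is associated with a predicted price about 6.05 higher when the number of bedrooms is held fixed, and each additional bedroom with a predicted price about 10,567 higher when lot size is held fixed. The sketch below applies the fitted equation by hand to a hypothetical house with lot size 5000 and 3 bedrooms. The house is an assumption for illustration and is not a row of the data.

```python
# Coefficient estimates copied from the printed output above.
coefs = {"Intercept": 5612.599731, "lotsize": 6.053022, "bedrooms": 10567.351501}

# Hypothetical house (illustration only): lot size 5000, 3 bedrooms.
house = {"lotsize": 5000, "bedrooms": 3}

# Fitted value = intercept + sum of (coefficient * predictor value).
predicted_price = coefs["Intercept"] + sum(
    coefs[name] * value for name, value in house.items()
)
print(round(predicted_price, 2))  # 67579.76
```

With the fitted mlr object at hand, passing a data frame with lotsize and bedrooms columns to mlr.predict should give the same value up to rounding of the printed coefficients.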
 Data Description: Sales prices of houses sold in the city of Windsor, Canada, during July, August and September, 1987.
Original Source: Anglin, P., and Gencay, R. (1996). Semiparametric Estimation of a Hedonic Price Function. Journal of Applied Econometrics, 11, 633–648.
 Seabold, Skipper, and Josef Perktold. (2010). “statsmodels: Econometric and statistical modeling with python.” Proceedings of the 9th Python in Science Conference.