Last Update: February 21, 2022
Linearity in Parameters: Ramsey RESET Test in Python can be done using the linear_reset function from the statsmodels.stats.diagnostic module of the statsmodels package. The test evaluates whether non-linear combinations of the fitted values of a linear regression help explain the dependent variable. The main parameters of the linear_reset function are res, with the original model results; power, with the highest power of the variables added to the augmented model (an integer value includes powers 2 up to that value); test_type, which selects whether the augmented model adds powers of the original model fitted values, of the independent variables, or of the first principal component of the independent variables; and use_f, with a logical value on whether an F-test or a chi-square test should be done.
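To make the mechanics concrete, the following is a minimal sketch of the fitted-values variant of the test computed by hand with NumPy. It is not how statsmodels implements linear_reset. The synthetic data, the helper name ols_ssr and the variable names are assumptions introduced only for illustration. The idea is to fit the original model, add the squared fitted values as an extra regressor, and compare the two fits with an F-statistic.

```python
import numpy as np

# Synthetic data (an assumption for illustration): y depends on x quadratically,
# so a purely linear model in x is misspecified.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0.0, 5.0, n)
y = 1.0 + 2.0 * x + 0.5 * x**2 + rng.normal(0.0, 1.0, n)


def ols_ssr(X, y):
    """Hypothetical helper: least-squares fitted values and sum of squared residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    return fitted, float(np.sum((y - fitted) ** 2))


# Original (restricted) model: y ~ 1 + x
X_orig = np.column_stack([np.ones(n), x])
fitted, ssr_restricted = ols_ssr(X_orig, y)

# Augmented (unrestricted) model: add the squared fitted values (power=2)
X_aug = np.column_stack([X_orig, fitted**2])
_, ssr_unrestricted = ols_ssr(X_aug, y)

# F-statistic for the null hypothesis that the added term has a zero coefficient
q = X_aug.shape[1] - X_orig.shape[1]   # number of added regressors
df_denom = n - X_aug.shape[1]          # residual degrees of freedom, augmented model
f_stat = ((ssr_restricted - ssr_unrestricted) / q) / (ssr_unrestricted / df_denom)
print(f"F = {f_stat:.2f}, df_num = {q}, df_denom = {df_denom}")
```

A large F-statistic relative to the F distribution with df_num and df_denom degrees of freedom suggests that the added power of the fitted values has explanatory power, which is evidence against the linear functional form.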
As an example, we can do a Ramsey RESET test on a multiple linear regression of house price explained by lot size and number of bedrooms, using the HousePrices data object included within the AER R package [1].
First, we import the statsmodels package for data downloading, multiple linear regression fitting and the Ramsey RESET test [2].
In [1]:
import statsmodels.api as sm
import statsmodels.formula.api as smf
import statsmodels.stats.diagnostic as smd
Second, we create the houseprices data object using the get_rdataset function and display the first five rows and first three columns of the data using the print function and the head data frame method to view its structure.
In [2]:
houseprices = sm.datasets.get_rdataset(dataname="HousePrices", package="AER", cache=True).data
print(houseprices.iloc[:, 0:3].head())
Out [2]:
     price  lotsize  bedrooms
0  42000.0     5850         3
1  38500.0     4000         2
2  49500.0     3060         3
3  60500.0     6650         3
4  61000.0     6360         2
Third, we fit the model with the ols function using variables within the houseprices data object and store the results within the mlr object. Within the ols function, parameter formula="price ~ lotsize + bedrooms" fits a model where house price is explained by lot size and number of bedrooms.
In [3]:
mlr = smf.ols(formula="price ~ lotsize + bedrooms", data=houseprices).fit()
Fourth, as an example again, we do the Ramsey RESET test using the linear_reset function, store the results in the resettest object and print it. Within the linear_reset function, parameter res=mlr passes the original model results, power=2 adds a squared term to the augmented model, test_type="fitted" makes that term the square of the original model fitted values, and use_f=True does an F-test. Notice that the linear_reset function parameters power=2, test_type="fitted" and use_f=True were only included as educational examples and can be modified according to your needs.
In [4]:
resettest = smd.linear_reset(res=mlr, power=2, test_type="fitted", use_f=True)
print(resettest)
Out [4]:
<F test: F=array([[10.63462745]]), p=0.0011796522160904715, df_denom=542, df_num=1>
References
[1] Data Description: Sales prices of houses sold in the city of Windsor, Canada, during July, August and September, 1987.
Original Source: Anglin, P., and Gencay, R. (1996). Semiparametric Estimation of a Hedonic Price Function. Journal of Applied Econometrics, 11, 633–648.
[2] statsmodels Python package: Seabold, S., and Perktold, J. (2010). statsmodels: Econometric and Statistical Modeling with Python. Proceedings of the 9th Python in Science Conference.