# Simple Linear Regression in R

Last Update: February 21, 2022

Simple linear regression in R can be fitted with the lm function from the stats package. Its main parameters are formula, which takes a y ~ x model description, and data, which takes a data.frame object containing the model variables. For example, lm(y ~ x, data = model.data) fits the model $\hat{y}_{i} = \hat{\beta}_{0} + \hat{\beta}_{1} x_{i}$ using the variables in the model.data object.
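
To make the call concrete, here is a minimal sketch that fits the model on a small made-up data set and compares the estimates with the closed-form least-squares formulas. The values stored in model.data are invented purely for illustration and are not part of the AER data.

```r
# Hypothetical data, invented for illustration only
model.data <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(2.1, 3.9, 6.2, 7.8, 10.1)
)

# Fit y = b0 + b1 * x by ordinary least squares
fit <- lm(y ~ x, data = model.data)
coef(fit)

# The same estimates from the closed-form formulas:
# slope = cov(x, y) / var(x), intercept = mean(y) - slope * mean(x)
b1 <- cov(model.data$x, model.data$y) / var(model.data$x)
b0 <- mean(model.data$y) - b1 * mean(model.data$x)
c(b0, b1)
```

Both approaches should return the same intercept and slope, which shows what lm estimates in the single-predictor case.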

As an example, we fit a simple linear regression of house price on lot size using the HousePrices data object from the AER package [1].

First, we load the AER package for data and the ggplot2 package for charting [2].

In [1]:
library(AER)
library(ggplot2)

Second, we load the HousePrices data object from the AER package with the data function and print its first six rows and first two columns with the head function to view the data.frame structure.

In [2]:
data(HousePrices)
head(HousePrices[,1:2])
Out [2]:
  price lotsize
1 42000    5850
2 38500    4000
3 49500    3060
4 60500    6650
5 61000    6360
6 66000    4160

Third, we draw a scatter chart with the fitted regression line, setting se = FALSE so that its confidence interval is not displayed.

In [3]:
ggplot(data = HousePrices, aes(x = lotsize, y = price)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)
Out [3]:
[Figure: scatter chart of price against lotsize with the fitted regression line]
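
If the uncertainty around the line is of interest, the same kind of chart can be drawn with the confidence band switched on. The sketch below is one way to do this. It assumes the six rows printed in Out [2] as a small stand-in data set (demo.data is a hypothetical name) so that it runs without AER, which means the resulting chart differs from the one drawn from the full HousePrices object.

```r
library(ggplot2)

# Stand-in data: the six rows printed in Out [2], not the full data set
demo.data <- data.frame(
  price   = c(42000, 38500, 49500, 60500, 61000, 66000),
  lotsize = c(5850, 4000, 3060, 6650, 6360, 4160)
)

# se = TRUE adds a confidence band around the fitted line; level sets its coverage.
# formula = y ~ x states the straight-line fit explicitly.
p <- ggplot(data = demo.data, aes(x = lotsize, y = price)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE, level = 0.95)
p
```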

Fourth, we fit the model with the lm function using the variables in the HousePrices data object, store the result in the slr object and print its coefficient estimates. Within the lm function, the parameter formula = price ~ lotsize fits the model $\widehat{price} = \hat{\beta}_{0} + \hat{\beta}_{1} lotsize$, where house price is explained by lot size.

In [4]:
slr <- lm(formula = price ~ lotsize, data = HousePrices)
slr
Out [4]:
Call:
lm(formula = price ~ lotsize, data = HousePrices)

Coefficients:
(Intercept)      lotsize
  34136.192        6.599
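
The printed estimates can be read as a prediction rule: the intercept is the fitted price at a lot size of zero, and the slope is the change in fitted price for a one-unit increase in lot size. As a rough sketch, the helper below computes the fitted price for a given lot size. The name price.hat is hypothetical, and the coefficients are copied from the rounded output above, so results agree with the fitted model only up to rounding.

```r
# Coefficients copied from the rounded output of Out [4]
b0 <- 34136.192  # intercept
b1 <- 6.599      # slope for lotsize

# Hypothetical helper: fitted price for a given lot size
price.hat <- function(lotsize) b0 + b1 * lotsize

price.hat(5000)

# With the fitted slr object available, the corresponding call would be
# predict(slr, newdata = data.frame(lotsize = 5000))
```

For a lot size of 5000 this gives a fitted price of about 67131, that is, 34136.192 + 6.599 × 5000.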

References

[1] Data description: sale prices of houses sold in the city of Windsor, Canada, during July, August and September 1987. Original source: Anglin, P., and Gencay, R. (1996). Semiparametric Estimation of a Hedonic Price Function. Journal of Applied Econometrics, 11, 633–648.

[2] AER R package: Kleiber, C., and Zeileis, A. (2008). Applied Econometrics with R. Springer-Verlag, New York. ggplot2 R package: Wickham, H. (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag, New York.
