Last Update: June 21, 2022
ARIMA Models Identification: Correlograms in R can be done using forecast package ggAcf and ggPacf functions, which draw ggplot2 charts, for identifying ARIMA models autoregressive and moving average orders. Functions ggAcf and ggPacf visualize the autocorrelation and partial autocorrelation function correlograms. Their main parameters are x with time series data, ci with the statistical confidence level of the correlogram confidence interval and lag.max with the maximum lag at which to calculate the correlogram.
As an example, we can identify the autoregressive order p and moving average order q of a classical univariate ARIMA(p,d,q) model on a training range, using autocorrelation and partial autocorrelation function correlograms and the data included within datasets package AirPassengers object [1]. Notice that we also need to evaluate whether the training range time series should be used in levels or differenced d times, which determines the ARIMA model integration order d.
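As a minimal sketch of that evaluation, and not part of the original steps, the forecast package ndiffs and nsdiffs functions estimate how many regular and seasonal differences a series may need. The object names d_hat, D_hat and ddata are hypothetical, the training range matches the one delimited later in this example, and the estimates depend on the unit root tests used, so they are a starting point rather than a definitive answer.

```r
library(forecast)

# Training range: first ten years of AirPassengers
tdata <- window(x = AirPassengers, end = c(1958, 12))

# Estimated number of regular differences (KPSS test by default)
d_hat <- ndiffs(x = tdata)

# Estimated number of seasonal differences
D_hat <- nsdiffs(x = tdata)

print(c(d = d_hat, D = D_hat))

# First-differenced series, whose correlograms can be compared with the level ones
ddata <- diff(tdata, differences = 1)
```

The differenced ddata object can be passed to ggAcf and ggPacf in the same way as the level series.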
First, we load packages forecast for time series characteristics and correlogram functions, and ggplot2 for chart elements of the training range and autocorrelation, partial autocorrelation function correlogram charts [2].
In [1]:
library(forecast)
library(ggplot2)
Second, we create mdata model data object copied from datasets package AirPassengers object and print the first six months of data using head function to view the time series object structure.
In [2]:
mdata <- AirPassengers
head(mdata)
Out [2]:
     Jan Feb Mar Apr May Jun
1949 112 118 132 129 121 135
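Beyond the first rows, the time series attributes can be inspected directly. The following is a small sketch using base R functions only, with expected values shown as comments.

```r
mdata <- AirPassengers

class(mdata)      # "ts"
start(mdata)      # 1949 1
end(mdata)        # 1960 12
frequency(mdata)  # 12 observations per year (monthly data)
length(mdata)     # 144 observations
```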
Third, we delimit the training range for model fitting as the first ten years of data with window function and store the outcome within tdata object. Within window function, parameter x = mdata includes full range model data and end = c(1958, 12) includes training range end time. Then, we delimit the testing range for model forecasting as the last two years of data with window function and store the outcome within fdata object. Within window function, parameter x = mdata includes full range model data and start = c(1959, 1) includes testing range start time. Notice that training and testing ranges delimiting was only included as an educational example which can be modified according to your needs.
In [3]:
tdata <- window(x = mdata, end = c(1958, 12))
fdata <- window(x = mdata, start = c(1959, 1))
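As a quick sanity check on the split, the sketch below confirms that the two ranges are contiguous and together cover the full series. It uses only base R functions and repeats the object definitions so it can run on its own.

```r
mdata <- AirPassengers
tdata <- window(x = mdata, end = c(1958, 12))
fdata <- window(x = mdata, start = c(1959, 1))

length(tdata)  # 120 monthly observations (1949 to 1958)
length(fdata)  # 24 monthly observations (1959 to 1960)

# Training and testing ranges should add up to the full range
stopifnot(length(tdata) + length(fdata) == length(mdata))

end(tdata)     # 1958 12
start(fdata)   # 1959 1
```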
Fourth, we view training range data with autoplot and labs functions. Within autoplot function, parameter object = tdata includes training range data object. Within labs function, parameter y = "Air Passengers" includes vertical axis label and x = "Year" includes horizontal axis label.
In [4]:
autoplot(object = tdata) + labs(y = "Air Passengers", x = "Year")
Out [4]:

Figure 1. Training range data.
Fifth, we draw the training range time series autocorrelation function correlogram chart with ggAcf function. Within ggAcf function, parameter x = tdata includes training range data object, ci = 0.95 includes the ninety-five percent statistical confidence level of the correlogram confidence interval and lag.max = 24 includes a correlogram with twenty-four lags. Notice that ggAcf function parameters were only included as educational examples which can be modified according to your needs.
In [5]:
ggAcf(x = tdata, ci = 0.95, lag.max = 24)
Out [5]:

Figure 2. Training range time series autocorrelation function correlogram.
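To read the correlogram numerically, the autocorrelations can be compared against the confidence bounds by hand. This is a sketch, assuming the bounds follow the usual large-sample white noise approximation of plus or minus z divided by the square root of n. It uses base R acf rather than ggAcf, and the object names bound, rho and sig_lags are hypothetical.

```r
tdata <- window(x = AirPassengers, end = c(1958, 12))

ci <- 0.95
n <- length(tdata)

# Approximate white noise bounds: +/- z / sqrt(n)
bound <- qnorm((1 + ci) / 2) / sqrt(n)

# Numeric autocorrelations without drawing a chart
acf_out <- acf(x = tdata, lag.max = 24, plot = FALSE)
rho <- as.numeric(acf_out$acf)[-1]  # drop lag 0, which always equals 1

# Lags (in months) whose autocorrelation falls outside the bounds
sig_lags <- which(abs(rho) > bound)

print(round(bound, 3))
print(sig_lags)
```

Lags listed in sig_lags are the ones treated as statistically significant at the chosen confidence level.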
Sixth, we draw the training range time series partial autocorrelation function correlogram chart with ggPacf function. Within ggPacf function, parameter x = tdata includes training range data object, ci = 0.95 includes the ninety-five percent statistical confidence level of the correlogram confidence interval and lag.max = 24 includes a correlogram with twenty-four lags. Notice that ggPacf function parameters were only included as educational examples which can be modified according to your needs.
In [6]:
ggPacf(x = tdata, ci = 0.95, lag.max = 24)
Out [6]:

Figure 3. Training range time series partial autocorrelation function correlogram.
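The partial autocorrelations behind the chart can also be tabulated. The sketch below uses base R pacf instead of ggPacf and hypothetical object names phi and sig. One detail worth noting is that pacf output starts at lag 1, so there is no lag 0 entry to drop.

```r
tdata <- window(x = AirPassengers, end = c(1958, 12))

pacf_out <- pacf(x = tdata, lag.max = 24, plot = FALSE)
phi <- as.numeric(pacf_out$acf)  # first element corresponds to lag 1

# Same large-sample approximation for the ninety-five percent bounds
bound <- qnorm(0.975) / sqrt(length(tdata))
sig <- which(abs(phi) > bound)

# Table of lags (in months) outside the bounds
data.frame(lag = sig, pacf = round(phi[sig], 3))
```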
Seventh, we identify the training range time series ARIMA(p,d,q) model autoregressive p and moving average q orders.
- If the autocorrelation function ACF correlogram tails off gradually and the partial autocorrelation function PACF correlogram drops after p statistically significant lags, then we can observe the potential need of an autoregressive model AR(p) of order p.
- Alternatively, if the autocorrelation function ACF correlogram drops after q statistically significant lags and the partial autocorrelation function PACF correlogram tails off gradually, then we can observe the potential need of a moving average model MA(q) of order q.
- Otherwise, if the autocorrelation function ACF correlogram tails off gradually after q statistically significant lags and the partial autocorrelation function PACF correlogram tails off gradually after p statistically significant lags, then we can observe the potential need of an autoregressive moving average model ARMA(p,q) of orders p and q.
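These rules can be illustrated on simulated data where the true orders are known. The sketch below is not part of the AirPassengers example: the AR(2) and MA(1) coefficients, the seed, the sample size and the helper name sig_lags are all hypothetical choices, and sampling variation can produce a few spurious significant lags.

```r
set.seed(123)  # arbitrary seed, for reproducibility only
n <- 500

# Hypothetical processes chosen for illustration
ar2 <- arima.sim(model = list(ar = c(0.6, 0.25)), n = n)
ma1 <- arima.sim(model = list(ma = 0.7), n = n)

# Approximate ninety-five percent white noise bounds
bound <- qnorm(0.975) / sqrt(n)

# Helper (hypothetical name): lags whose sample correlation exceeds the bounds
sig_lags <- function(x, partial = FALSE, lag.max = 12) {
  if (partial) {
    r <- as.numeric(pacf(x, lag.max = lag.max, plot = FALSE)$acf)
  } else {
    r <- as.numeric(acf(x, lag.max = lag.max, plot = FALSE)$acf)[-1]
  }
  which(abs(r) > bound)
}

sig_lags(ar2, partial = TRUE)   # AR(2): PACF expected to drop near lag 2
sig_lags(ar2, partial = FALSE)  # AR(2): ACF expected to tail off gradually
sig_lags(ma1, partial = FALSE)  # MA(1): ACF expected to drop near lag 1
sig_lags(ma1, partial = TRUE)   # MA(1): PACF expected to tail off gradually
```

Comparing the four outputs against the rules above shows which correlogram cuts off and which one decays for each process.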
References
[1] Data Description: Monthly international airline passenger numbers in thousands from 1949 to 1960.
Original Source: Box, G. E. P., Jenkins, G. M. and Reinsel, G. C. (1976). “Time Series Analysis, Forecasting and Control”. Third Edition. Holden-Day. Series G.
Source: datasets R Package AirPassengers Object. R Core Team (2021). “R: A language and environment for statistical computing”. R Foundation for Statistical Computing, Vienna, Austria.
[2] forecast R Package. Hyndman R, Athanasopoulos G, Bergmeir C, Caceres G, Chhay L, O’Hara-Wild M, Petropoulos F, Razbash S, Wang E, Yasmeen F (2022). “forecast: Forecasting functions for time series and linear models”. R package version 8.16.
ggplot2 R Package. Hadley Wickham (2016). “ggplot2: Elegant Graphics for Data Analysis”. Springer-Verlag New York.