R packages:
# Install necessary packages
list_packages <- c("forecast", "readxl", "stargazer", "fpp",
"fpp2", "scales", "quantmod", "urca",
"vars", "tseries", "ggplot2", "dplyr")
new_packages <- list_packages[!(list_packages %in% installed.packages()[,"Package"])]
if(length(new_packages)) install.packages(new_packages)
# Load necessary packages
lapply(list_packages, require, character.only = TRUE)
After fitting a model, the next step is forecasting.
Before we introduce the forecast formula, let's go over some notation.
z_n: the last observation; z_1, \ldots, z_n are observed.
t = n: the forecast origin.
z_{n+l}: the value of the time series at lead time l, not yet observed.
\hat{z}_n(l): the forecast of z_{n+l} made at forecast origin n.
E_n[\cdot]: the conditional expectation given all observations up to and including the forecast origin n, i.e., E[\cdot \mid z_n, z_{n-1}, \ldots].
Definition: The best forecast of Z_{n+l} is the forecast \hat{z}_n(l) that minimizes the expected mean squared error E_n[(Z_{n+l} - \hat{z}_n(l))^2].
Based on this definition, we can prove that E_n[Z_{n+l}] is the best forecast of Z_{n+l}, i.e., E_n[Z_{n+l}] = E[Z_{n+l} \mid z_n, z_{n-1}, \ldots] is the best forecast.
So from now on, we take our forecast to be \hat{z}_n(l) = E_n[Z_{n+l}] = E[Z_{n+l} \mid z_n, z_{n-1}, \ldots].
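A brief sketch of the standard argument: for any forecast g that uses only the information up to time n, the cross term in the expansion below vanishes because E_n[Z_{n+l}] - g is known at time n, so
\begin{align*}
E_n[(Z_{n+l} - g)^2] &= E_n[(Z_{n+l} - E_n[Z_{n+l}])^2] + (E_n[Z_{n+l}] - g)^2 \\
&\geq E_n[(Z_{n+l} - E_n[Z_{n+l}])^2],
\end{align*}
with equality if and only if g = E_n[Z_{n+l}].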
As an example, consider the AR(1) model
\tilde{Z}_t = \phi_1 \tilde{Z}_{t-1} + a_t, \quad \text{i.e., } Z_{n+1} = \mu + \phi_1 (Z_n - \mu) + a_{n+1}.
We observe z_1, \ldots, z_n. Our one-step forecast is
\hat{z}_n(1) = E_n[Z_{n+1}] = \mu + \phi_1 (E_n[Z_n] - \mu) + E_n[a_{n+1}] = \mu + \phi_1 (z_n - \mu) + 0
By assumption E_n[a_{n+1}] = 0, because a_{n+1} takes place after z_n, z_{n-1}, \ldots. Therefore
\hat{z}_n(1) = \mu + \phi_1 (z_n - \mu)
Since \mu and \phi_1 are unknown, to forecast we replace the parameters by their estimates \hat{\mu} and \hat{\phi}_1, and our final forecast is
\hat{z}_n(1) = \hat{\mu} + \hat{\phi}_1 (z_n - \hat{\mu})
We can see that
\hat{z}_n(2) = E_n[Z_{n+2}] = \mu + \phi_1 (E_n[Z_{n+1}] - \mu) + E_n[a_{n+2}] = \mu + \phi_1 (\hat{z}_n(1) - \mu) = \mu + \phi_1^2 (z_n - \mu)
In general,
\hat{z}_n(l) = \mu + \phi_1^l (z_n - \mu)
for l = 1, 2, \ldots
Note that for a stationary AR(1), \phi_1^l \to 0 as l \to \infty because |\phi_1| < 1.
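To make this concrete, here is a minimal sketch (not part of the original analysis) that checks the AR(1) formula against predict() on simulated data; the seed and parameter values are arbitrary.
# Sketch: verify the AR(1) forecast formula on simulated data
set.seed(123)
z <- arima.sim(model = list(ar = 0.7), n = 200) + 10   # AR(1) with mean about 10
fit <- arima(z, order = c(1, 0, 0))
phi1 <- coef(fit)["ar1"]
mu   <- coef(fit)["intercept"]      # the "intercept" reported by arima() is the estimated mean
zn   <- z[length(z)]
l    <- 1:10
manual <- mu + phi1^l * (zn - mu)   # hat{z}_n(l) = mu + phi1^l (z_n - mu)
cbind(manual = manual, from_predict = predict(fit, n.ahead = 10)$pred)  # should agree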
Next, consider the MA(1) model
\tilde{Z}_t = a_t - \theta_1 a_{t-1}, \quad \text{i.e., } Z_t = \mu + a_t - \theta_1 a_{t-1}.
Case 1: l>1
Z_{n+l} = \mu + a_{n+l} - \theta_1 a_{n+l-1}, \quad \text{so} \quad \hat{z}_n(l) = E_n(Z_{n+l}) = \mu + 0 - 0 = \mu
Case 2: l=1
Since Z_{n+1} = \mu + a_{n+1} - \theta_1 a_n, we have \hat{z}_n(1) = E_n(Z_{n+1}) = \mu + 0 - \theta_1 E_n(a_n).
Since a_n depends on the current and past observations z_t, E_n(a_n) is no longer 0. Thus we estimate E_n(a_n) by \hat{a}_n = z_n - \hat{z}_{n-1}(1), the residual at time n.
Therefore,
\hat{z}_n(1) = \mu - \theta_1 \hat{a}_n
To get an actual forecast, we plug in \hat{\mu} and \hat{\theta}_1.
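As a check (not from the original notes), here is a minimal sketch on simulated data. Note that R's arima() writes the MA part with a plus sign, so R's ma1 corresponds to -\theta_1 in the notation above.
# Sketch: check the one-step MA(1) forecast on simulated data
set.seed(456)
z <- arima.sim(model = list(ma = -0.6), n = 200) + 5   # Z_t = 5 + a_t - 0.6 a_{t-1}
fit <- arima(z, order = c(0, 0, 1))
mu    <- coef(fit)["intercept"]
ma1   <- coef(fit)["ma1"]                    # ma1 = -theta_1 in the text's sign convention
a_hat <- fit$residuals                       # estimated innovations hat{a}_t
manual <- mu + ma1 * a_hat[length(a_hat)]    # = mu - theta_1 * hat{a}_n in the text's notation
c(manual, predict(fit, n.ahead = 1)$pred)    # the two values should essentially agree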
A general forecast formula for ARIMA
Suppose Z_t \sim ARIMA(p,d,q), which can be considered as Z_t \sim ARMA(p+d,\, q) (nonstationary):
Z_t = C + \Phi_1 Z_{t-1} + \cdots + \Phi_{p+d} Z_{t-p-d} + a_t - \theta_1 a_{t-1} - \cdots - \theta_q a_{t-q}
Z_{n+l} = C + \Phi_1 Z_{n+l-1} + \cdots + \Phi_{p+d} Z_{n+l-p-d} + a_{n+l} - \theta_1 a_{n+l-1} - \cdots - \theta_q a_{n+l-q}
The general formula is
\hat{z}_n(l) = E[Z_{n+l} \mid F_n] = C + \Phi_1 E[Z_{n+l-1} \mid F_n] + \cdots + \Phi_{p+d} E[Z_{n+l-p-d} \mid F_n] + E[a_{n+l} \mid F_n] - \theta_1 E[a_{n+l-1} \mid F_n] - \cdots - \theta_q E[a_{n+l-q} \mid F_n]
where
E[Z_s \mid F_n] = \begin{cases} z_s, & \text{if } s \leq n \\ \hat{z}_n(s-n), & \text{if } s > n \end{cases}
E[a_s \mid F_n] = \begin{cases} \hat{a}_s = z_s - \hat{z}_{s-1}(1), & \text{if } s \leq n \\ 0, & \text{if } s > n \end{cases}
Define
\hat{Z}_{n+l} = \begin{cases} \hat{z}_n(l), & \text{if } l \geq 1 \\ z_{n+l}, & \text{if } l \leq 0 \end{cases}
[a_s] = \begin{cases} \hat{a}_s, & \text{if } s \leq n \\ 0, & \text{if } s > n \end{cases}
Then
\hat{z}_n(l) = C + \Phi_1 \hat{Z}_{n+l-1} + \cdots + \Phi_{p+d} \hat{Z}_{n+l-p-d} - \theta_1 [a_{n+l-1}] - \cdots - \theta_q [a_{n+l-q}]
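As an illustration of the recursion, here is a sketch for an ARIMA(1,1,0) model fit to simulated data; in that case Z_t = (1+\phi_1) Z_{t-1} - \phi_1 Z_{t-2} + a_t with no constant and no MA terms, so the bracketed residual terms drop out.
# Sketch: the general forecast recursion done "by hand" for ARIMA(1,1,0)
set.seed(789)
z   <- cumsum(arima.sim(model = list(ar = 0.5), n = 150))  # an integrated AR(1)
fit <- arima(z, order = c(1, 1, 0))
phi <- coef(fit)["ar1"]
n   <- length(z)
h   <- 8
zhat <- c(z[n - 1], z[n], rep(NA, h))   # hat{Z}_{n+l} = z_{n+l} for l <= 0
for (l in 1:h) {
  zhat[l + 2] <- (1 + phi) * zhat[l + 1] - phi * zhat[l]
}
cbind(by_hand = zhat[-(1:2)], from_predict = predict(fit, n.ahead = h)$pred)  # should agree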
Next, consider the limit of the forecast as the lead time l grows. Let's use AR(1) as an example.
\hat{z}_n(l) = E_n[Z_{n+l}] = E_n[\mu + \phi_1 (Z_{n+l-1} - \mu) + a_{n+l}] = E_n[\mu + \phi_1^2 (Z_{n+l-2} - \mu) + \phi_1 a_{n+l-1} + a_{n+l}] = \cdots = E_n[\mu + a_{n+l} + \phi_1 a_{n+l-1} + \phi_1^2 a_{n+l-2} + \cdots] = \mu + 0 + \cdots + 0 + \phi_1^l a_n + \phi_1^{l+1} a_{n-1} + \cdots
Note that |\phi_1| < 1 for a stationary time series model, so \phi_1^l \to 0 as l \to \infty. Therefore, the limit of the forecast is
\lim_{l \to \infty} \hat{z}_n(l) = \mu
Therefore, the forecast converges to the mean for a stationary time series model. This result holds for AR(1), and it is true in general for all stationary AR, MA, and ARMA models.
In contrast, the forecasts from a nonstationary time series model do not revert to a fixed mean.
Consider a random walk Z_t = Z_{t-1} + a_t; we have
\begin{align*} \hat{z}_n(l)&= \begin{cases} z_n, \quad l=1\\ \hat{z}_n(l-1), \quad \quad l > 1 \end{cases} \end{align*}
So, we combine the cases and have
\hat{z}_n(l) = z_n \text{ for }l \geq 1
Thus, different realizations of the same random walk model will result in forecasts that converge to different values.
On the other hand, different realizations of the same stationary time series model will always produce forecasts that converge to the same constant, \mu.
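To illustrate the contrast, here is a minimal sketch on simulated data (the seed and series are made up).
# Sketch: mean reversion vs. "stay where you are" forecasts
set.seed(2024)
z_ar <- arima.sim(model = list(ar = 0.8), n = 120) + 50   # stationary AR(1) around mean 50
z_rw <- cumsum(rnorm(120))                                # random walk
fc_ar <- predict(arima(z_ar, order = c(1, 0, 0)), n.ahead = 40)$pred
fc_rw <- predict(arima(z_rw, order = c(0, 1, 0)), n.ahead = 40)$pred
tail(fc_ar, 3)   # approaches the estimated mean (close to 50)
tail(fc_rw, 3)   # stays flat at the last observation z_n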
Next, consider the limit of the variance of the forecast error of a stationary time series model. Let’s take AR(1) as an example.
\lim_{l \to \infty} \text{Var}(Z_{n+l} - \hat{z}_{n}(l))\\ = \lim_{l \to \infty} \sigma_a^2 \left(1 + \phi_1^2 + (\phi_1^2)^2 + \cdots + (\phi_1^{l-1})^2\right)\\ = \sigma_a^2 \sum_{j=0}^{\infty} (\phi_1^2)^j = \frac{\sigma_a^2}{1 - \phi_1^2}
Thus the prediction “bands” about the forecast profile become parallel lines for large lead times (l \to \infty).
In fact, for AR(1), we can show
\text{Var}(Z_{n+l} - \hat{z}_{n}(l)) = \sigma_a^2(1 + \phi_1^2 + \phi_1^4 + \cdots + \phi_1^{2(l-1)})
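A quick numerical check of this formula (a sketch on simulated data; predict()$se returns the forecast standard errors).
# Sketch: the AR(1) forecast error variance levels off at sigma_a^2 / (1 - phi_1^2)
set.seed(11)
z   <- arima.sim(model = list(ar = 0.7), n = 300)
fit <- arima(z, order = c(1, 0, 0))
phi <- coef(fit)["ar1"]
s2  <- fit$sigma2
l   <- 1:20
v_manual <- s2 * (1 - phi^(2 * l)) / (1 - phi^2)   # closed form of s2 * (1 + phi^2 + ... + phi^(2(l-1)))
v_pred   <- predict(fit, n.ahead = 20)$se^2
round(cbind(v_manual, v_pred), 3)                  # should essentially match; both approach s2 / (1 - phi^2)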
On the other hand, consider the limit of the variance of the forecast error for a nonstationary time series, for example, a random walk.
(1-B)Z_t = a_t\\ Z_t = (1-B)^{-1} a_t\\ Z_t = (1 + B + B^2 + B^3 + \cdots) a_t\\ Z_t = a_t + a_{t-1} + a_{t-2} + \cdots
Therefore, the variance becomes
\text{Var}(Z_{n+l} - \hat{z}_n(l)) = \sigma^2_a \sum_{j=0}^{l-1} 1 = \sigma^2_a \, l
So the variance \text{Var}(Z_{n+l} - \hat{z}_n(l)) tends to infinity as l \to \infty, and the prediction band expands.
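And the corresponding check for a random walk (again a sketch on simulated data).
# Sketch: for a random walk the forecast error variance grows linearly with the lead time
set.seed(12)
z   <- cumsum(rnorm(300))
fit <- arima(z, order = c(0, 1, 0))
l   <- 1:20
cbind(manual = fit$sigma2 * l, from_predict = predict(fit, n.ahead = 20)$se^2)  # should agree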
Case 1: business inventories
## Time Series:
## Start = 1
## End = 60
## Frequency = 1
## [1] 4.4 5.8 6.7 7.1 5.7 4.1 4.6 4.3 2.0 2.2 3.6 -2.2 -5.1 -4.9
## [15] 0.1 4.1 3.8 9.9 0.0 6.5 10.8 4.1 2.7 -2.9 -2.9 1.5 5.7 5.0
## [29] 7.9 6.8 7.1 4.1 5.5 5.1 8.0 5.6 4.5 6.1 6.7 6.1 10.6 8.6
## [43] 11.6 7.6 10.9 14.6 14.5 17.4 11.7 5.8 11.5 11.7 5.0 10.0 8.9 7.1
## [57] 8.3 10.2 13.3 6.2
It seems that a stationary model would be sufficient, so we first fit an AR(1).
##
## Call:
## arima(x = case1, order = c(1, 0, 0))
##
## Coefficients:
## ar1 intercept
## 0.6803 6.0423
## s.e. 0.0914 1.2870
##
## sigma^2 estimated as 10.88: log likelihood = -157.04, aic = 320.08
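Before checking stationarity more formally, here is a sketch of how forecasts from this AR(1) fit might be produced (assuming the series is stored in an object named case1; the horizon of 12 is arbitrary).
# Sketch (assuming the series is available as `case1`)
fit_ar1 <- arima(case1, order = c(1, 0, 0))   # reproduces the AR(1) fit shown above
fc <- predict(fit_ar1, n.ahead = 12)
fc$pred                    # point forecasts revert toward the estimated mean (about 6.04)
fc$pred + 1.96 * fc$se     # approximate 95% upper prediction limits
fc$pred - 1.96 * fc$se     # approximate 95% lower prediction limits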
However, ADF tests with different lag orders give conflicting results about stationarity.
##
## Augmented Dickey-Fuller Test
##
## data: case1
## Dickey-Fuller = -3.5868, Lag order = 1, p-value = 0.04192
## alternative hypothesis: stationary
##
## Augmented Dickey-Fuller Test
##
## data: case1
## Dickey-Fuller = -3.2623, Lag order = 2, p-value = 0.08614
## alternative hypothesis: stationary
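The two printouts differ only in the lag order k; a sketch of how they might have been produced (again assuming the series is stored as case1).
# Sketch: ADF tests with different lag orders
library(tseries)
adf.test(case1, alternative = "stationary", k = 1)   # p = 0.042: reject a unit root at the 5% level
adf.test(case1, alternative = "stationary", k = 2)   # p = 0.086: fail to reject at the 5% level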
So we also fit a nonstationary model, and use auto.arima
to fit a third model. We then compare their forecast results.
##
## Call:
## arima(x = case1, order = c(0, 1, 0))
##
##
## sigma^2 estimated as 13.07: log likelihood = -159.53, aic = 321.07
## Series: case1
## ARIMA(1,1,1)
##
## Coefficients:
## ar1 ma1
## 0.5017 -0.8767
## s.e. 0.1643 0.0918
##
## sigma^2 estimated as 11.34: log likelihood=-154.63
## AIC=315.26 AICc=315.7 BIC=321.5
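A sketch of a side-by-side comparison of the three sets of point forecasts (assuming case1 is available; the horizon is arbitrary).
# Sketch: compare forecasts from the three candidate models for case1
library(forecast)
fit_ar1  <- arima(case1, order = c(1, 0, 0))   # stationary AR(1)
fit_rw   <- arima(case1, order = c(0, 1, 0))   # random walk
fit_auto <- auto.arima(case1)                  # the fit above selected ARIMA(1,1,1)
h <- 12
cbind(ar1  = predict(fit_ar1, n.ahead = h)$pred,   # reverts toward the estimated mean
      rw   = predict(fit_rw,  n.ahead = h)$pred,   # stays flat at the last observation
      auto = forecast(fit_auto, h = h)$mean)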
Case 6: AT&T stock
We repeat a similar analysis for the AT&T stock data.
## Time Series:
## Start = 1
## End = 56
## Frequency = 1
## [1] 166.8 172.8 178.3 180.3 182.6 184.2 188.9 184.4 181.7 178.5 177.6
## [12] 181.0 186.5 185.7 186.4 186.3 189.3 190.6 191.7 196.1 189.3 192.6
## [23] 192.1 189.4 189.7 191.9 182.0 175.7 192.0 192.8 193.3 200.2 208.8
## [34] 211.4 214.4 216.3 221.8 217.1 214.0 202.4 191.7 183.9 185.2 194.5
## [45] 195.8 198.0 200.9 199.0 200.6 209.5 208.4 206.7 193.3 197.3 213.7
## [56] 225.1
##
## Augmented Dickey-Fuller Test
##
## data: case6
## Dickey-Fuller = -3.051, Lag order = 3, p-value = 0.1508
## alternative hypothesis: stationary
##
## Call:
## arima(x = case6, order = c(1, 0, 0))
##
## Coefficients:
## ar1 intercept
## 0.9416 194.6908
## s.e. 0.0530 10.8988
##
## sigma^2 estimated as 35.64: log likelihood = -180.61, aic = 367.21
##
## Call:
## arima(x = case6, order = c(0, 1, 0))
##
##
## sigma^2 estimated as 36.12: log likelihood = -176.68, aic = 355.36
## Series: case6
## ARIMA(0,1,1)
##
## Coefficients:
## ma1
## 0.3589
## s.e. 0.1245
##
## sigma^2 estimated as 32.75: log likelihood=-173.55
## AIC=351.1 AICc=351.33 BIC=355.11
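Finally, a sketch of forecasts from the ARIMA(0,1,1) fit selected above (assuming the series is stored as case6).
# Sketch (assuming the series is available as `case6`)
fit_011 <- arima(case6, order = c(0, 1, 1))   # the ARIMA(0,1,1) selected by auto.arima above
fc <- predict(fit_011, n.ahead = 12)
fc$pred   # nonstationary: flat forecasts at a level set by this realization, no mean reversion
fc$se     # the standard error keeps growing, so the prediction band widens with the lead time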