One of the reasons we estimate a regression model is to generate forecasts of the dependent variable.
Before we do any forecasting, we first need a concrete model that we can refer to.
Some students of econometrics are unclear about certain issues when it comes to using regression models for forecasting.
Keep in mind that forecasting does not necessarily require time-series data, although time-series data are the more common choice for forecasting.
Let’s begin with what we call a static linear regression model, which means that no lagged value of the dependent variable enters as a regressor in our model.
\(y_{t}=\beta_{1}+\beta_{2}x_{2t}+\beta_{3}x_{3t}+...+\beta_{k}x_{kt}+\varepsilon_{t}\) (1)
We assume that the error term in Eq(1) is white noise, or “well-behaved”, and that our model is estimated with a sample of \(T\) observations.
Now, let \(b_{i}\) denote the OLS estimator of \(\beta_{i}\), for \(i=1,2,...,k\).
That means the “fitted values” from our estimated model are:
\(y_{t}^{*}=b_{1}+b_{2}x_{2t}+b_{3}x_{3t}+...+b_{k}x_{kt}\) (2)
In Eq(2), we are looking only at the within-sample predictions of the estimated model.
These predictions are constructed using the point estimates of the regression coefficients and the actual observed values of the regressors, for \(t=1,2,...,T\).
Notice also that, to obtain the fitted values in Eq(2), the error term has been set to its assumed mean value of zero.
Suppose that, as time passes, we obtain an additional \(n\) observations (periods \(T+1\) to \(T+n\)) on \(y\) and all of the \(x\) variables, but we still use our OLS parameter estimates based on the original \(T\) observations.
We can now generate what we call ex post forecasts of the \(n\) additional observations.
We know the values of these data, but they have not been used in the estimation of the model. We can therefore see how well our model performs when it comes to forecasting these \(n\) values, because we know exactly what actually happened.
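The idea of ex post forecasting can be sketched outside Stata. The following is a minimal illustration in Python, assuming a single regressor and made-up numbers (the data, the function name `ols_simple`, and the use of the root mean squared error as a summary are all assumptions for illustration, not part of the original example). The model is estimated on the first \(T\) observations only, and the held-back \(n\) observations are then compared with their forecasts.

```python
import math

def ols_simple(x, y):
    """Return (b1, b2) for y = b1 + b2*x, estimated by ordinary least squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = sxy / sxx
    b1 = y_bar - b2 * x_bar
    return b1, b2

# Made-up illustrative data: 8 observations in total.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

T = 6  # estimation sample; the last n = 2 observations are held back

# Estimate using only the first T observations.
b1, b2 = ols_simple(x[:T], y[:T])

# Ex post forecasts: the actual x values are known, the error term is set to zero.
forecasts = [b1 + b2 * xi for xi in x[T:]]

# Because the actual y values are known, the forecast errors can be computed.
errors = [yi - fi for yi, fi in zip(y[T:], forecasts)]
rmse = math.sqrt(sum(e ** 2 for e in errors) / len(errors))

print("forecasts:", forecasts)
print("errors:   ", errors)
print("RMSE:     ", rmse)
```

A smaller RMSE over the held-back observations indicates that the model forecasts those periods more closely.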
We use the dataset consdpi.dta, which contains two time series, namely realcons and realdpi.
Before we estimate the long-run relationship, the two series must be tested for a unit root to make sure that both are stationary in first differences.
varsoc realcons
dfuller realcons, trend lags(#)
varsoc realdpi
dfuller realdpi, trend lags(#)
varsoc D.realcons
dfuller D.realcons, lags(#)
varsoc D.realdpi
dfuller D.realdpi, lags(#)
Here # is the number of lags, chosen on the basis of the information criteria reported by varsoc.
The unit root tests show that both time series are \(I(1)\); they are also cointegrated.
In this case, we can estimate the long-run relationship by regressing one series on the other, without differencing, using the data from 1950Q1 to 1983Q4.
reg realcons realdpi if tin(1950q1,1983q4)
So far we have estimated the model with the sample ending in 1983Q4, even though we also have data for 1984Q1 to 1985Q4.
Let’s use these remaining 8 observations for ex post forecasting.
regress realcons realdpi if tin(1950q1,1983q4)
estimates store consmodel
forecast create expostforecast, replace
forecast estimates consmodel
set seed 1
forecast solve, prefix(f_) begin(tq(1984q1)) end(tq(1985q4)) static simulate(betas, statistic(stddev, prefix(sd_)) reps(100))
Let’s now compare realcons and f_realcons (the forecast of realcons) over the forecast period.
list realcons f_realcons if tin(1984q1,1985q4)
Before we plot the data, let’s compute the upper and lower bounds of a 95% prediction interval for our forecast of realcons.
gen f_y_up = f_realcons + invnormal(0.975)*sd_realcons
gen f_y_dn = f_realcons + invnormal(0.025)*sd_realcons
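The arithmetic behind these bounds can be checked by hand. The sketch below, written in Python rather than Stata, assumes a normal approximation and uses a made-up point forecast and standard deviation; the function name `interval_bounds` is hypothetical. It relies on the fact that the 2.5% and 97.5% standard normal quantiles are equal in size and opposite in sign, which is why adding invnormal(0.025) times the standard deviation gives the lower bound.

```python
from statistics import NormalDist

def interval_bounds(point, sd, level=0.95):
    """Return (lower, upper) bounds of a symmetric normal-based interval."""
    # Upper-tail quantile, e.g. about 1.96 when level = 0.95.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return point - z * sd, point + z * sd

# Made-up values: a point forecast of 100 with a simulated std. dev. of 10.
lower, upper = interval_bounds(100.0, 10.0)
print("lower bound:", lower)
print("upper bound:", upper)

# The lower quantile is the negative of the upper one, so
# point + q(0.025)*sd is the same number as point - q(0.975)*sd.
q_low = NormalDist().inv_cdf(0.025)
print("same lower bound:", 100.0 + q_low * 10.0)
```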
Let’s plot these data:
twoway (line realcons year) (line f_realcons year) (line f_y_up year, lpattern(dash)) (line f_y_dn year, lpattern(dash)) if tin(1984q1,1985q4)
Now, think about forecasting in “real time”, and suppose that we estimate our model using a sample of \(T\) observations.
Then we want to forecast another \(n\) observations. At this point, we do not know the actual values of \(y_{t}\) for these data points. This is called ex ante forecasting.
When we deal with this type of forecasting, a practical problem arises. If we want to apply Eq(2) for period \(T+1\), then we need a value for each of the \(x\) variables in period \(T+1\).
In practice, there are two ways to supply values for the \(x\) variables: either we insert “educated guesses”, or we use some other forecasting method, such as an ARIMA model, to predict their future values.
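To make the second route concrete, here is a minimal Python sketch under strong simplifying assumptions. It extends the regressor with a random walk with drift, which is the simplest ARIMA-type model (ARIMA(0,1,0) with a constant), and is used here only as a stand-in for whatever model one would actually fit. The function names, coefficient values, and data are all hypothetical.

```python
def extrapolate_x(x_hist, n):
    """Forecast n future values of x with a random walk with drift.

    The drift is the average historical change, (x_T - x_1) / (T - 1).
    """
    drift = (x_hist[-1] - x_hist[0]) / (len(x_hist) - 1)
    return [x_hist[-1] + drift * h for h in range(1, n + 1)]

def ex_ante_forecast(b1, b2, x_future):
    """Apply the fitted static equation y* = b1 + b2*x to supplied x values."""
    return [b1 + b2 * xf for xf in x_future]

# Made-up history of the regressor and made-up coefficient estimates.
x_hist = [1.0, 2.0, 3.0]
b1, b2 = 1.0, 2.0

# Route 1: insert educated guesses for x directly.
guessed_x = [3.5, 4.5]
print("forecast from guesses:      ", ex_ante_forecast(b1, b2, guessed_x))

# Route 2: forecast x first, then plug those forecasts into the equation.
x_future = extrapolate_x(x_hist, 2)
print("forecast from extrapolation:", ex_ante_forecast(b1, b2, x_future))
```

Either way, the forecast of \(y\) is only as good as the values supplied for the regressors.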
The discussion so far has been based on what we usually call static forecasting, meaning that our regression model in Eq(1) is “static” (rather than dynamic) in form, because none of the variables on the right-hand side of the equation are lagged values of \(y_{t}\).
Now, let’s modify Eq(1) to include a lagged value of the dependent variable among the regressors:
\(y_{t}=\beta_{1}+\beta_{2}y_{t-1}+\beta_{3}x_{3t}+...+\beta_{k}x_{kt}+\varepsilon_{t}\) (3)
Eq(3) can be used to obtain either an ex post or an ex ante forecast for observation \(T+1\) as follows:
\(y_{T+1}^{*}=b_{1}+b_{2}y_{T}+b_{3}x_{3,T+1}+...+b_{k}x_{k,T+1}\) (4)
In Eq(4), at time \(T+1\) we already know the value of \(y_{t-1}\) (the observed value of \(y_{T}\)).
However, when we forecast \(y_{t}\) for period \(T+2\), there are two options available to us in the case of ex post forecasting:
a) We could insert the known value of \(y_{T+1}\) for \(y_{t-1}\) in the forecasting equation (together with values for \(x_{3,T+2}, x_{4,T+2}, ...\)).
b) Alternatively, we could insert the previously predicted value of \(y_{T+1}\), namely \(y_{T+1}^{*}\) from Eq(4), together with the appropriate \(x\) values.
The same options remain for forecasting in period \(T+3\), and so on.
The first option above is called static forecasting, while the second option is called dynamic forecasting.
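The difference between the two options can be sketched in a few lines of Python. This is an illustration only, assuming one lag of \(y\) and a single \(x\) regressor; the coefficient values, the data, and the function name `forecast_path` are made up. The only thing that changes between the two cases is which value is carried forward as the lag.

```python
def forecast_path(b, y_T, x_future, y_actual=None):
    """Forecast y for T+1, T+2, ... from y_t = b1 + b2*y_{t-1} + b3*x_t.

    If y_actual is given, each step uses the observed previous value of y
    (static forecasting, possible only ex post). If y_actual is None, each
    step feeds back the previous forecast (dynamic forecasting).
    """
    b1, b2, b3 = b
    forecasts = []
    lag = y_T  # the last in-sample value of y is always observed
    for h, x in enumerate(x_future):
        y_hat = b1 + b2 * lag + b3 * x
        forecasts.append(y_hat)
        # Choose what plays the role of y_{t-1} in the next period.
        lag = y_hat if y_actual is None else y_actual[h]
    return forecasts

# Made-up coefficient estimates and data for three forecast periods.
b = (1.0, 0.5, 2.0)
y_T = 10.0
x_future = [3.0, 4.0, 5.0]
y_actual = [14.0, 13.0, 15.0]

static_fc = forecast_path(b, y_T, x_future, y_actual)
dynamic_fc = forecast_path(b, y_T, x_future)

print("static: ", static_fc)
print("dynamic:", dynamic_fc)
```

In period \(T+1\) both options use the observed \(y_{T}\), so the two forecasts coincide there and can differ only from \(T+2\) onward.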
When we undertake ex ante forecasting for two or more periods ahead, we necessarily use dynamic forecasting, because we do not know the true values of the dependent variable outside the sample.
Once again, future values for the \(x\) variables will have to be obtained in some way or other, and this can be a major exercise in itself.
Let’s now estimate Eq(3) as a simple dynamic model, with only one lag of the dependent variable as a regressor:
reg realcons L.realcons realdpi if tin(1950q1,1983q4)
The results suggest that our model is not robust, and the residuals may exhibit some autocorrelation.
But for the purpose of illustrating the points discussed above, let’s ignore any problems that might exist.
Now, let’s generate some ex post static forecasts:
reg realcons L.realcons realdpi if tin(1950q1,1983q4)
estimates store consmodel
forecast create forecaststat, replace
forecast estimates consmodel
set seed 1
forecast solve, prefix(sf_) begin(tq(1984q1)) end(tq(1985q4)) static simulate(betas, statistic(stddev, prefix(ssd_)) reps(100))
and some ex post dynamic forecasts:
forecast create forecastdy, replace
forecast estimates consmodel
set seed 1
forecast solve, prefix(df_) begin(tq(1984q1)) end(tq(1985q4)) simulate(betas, statistic(stddev, prefix(dsd_)) reps(100))
Let’s now compare realcons with the static and dynamic forecasts of realcons (sf_realcons and df_realcons) over the forecast period.
list realcons sf_realcons df_realcons if tin(1984q1,1985q4)
Notice that the static and dynamic forecasts are identical in the first forecast period, as expected, but after that the values differ.
Before we plot the data, let’s compute the upper and lower bounds of the 95% prediction intervals for our static and dynamic forecasts of realcons.
gen sf_y_up = sf_realcons + invnormal(0.975)*ssd_realcons
gen sf_y_dn = sf_realcons + invnormal(0.025)*ssd_realcons
gen df_y_up = df_realcons + invnormal(0.975)*dsd_realcons
gen df_y_dn = df_realcons + invnormal(0.025)*dsd_realcons
Let’s plot these data:
twoway (line realcons year) (line sf_realcons year) (line sf_y_up year, lpattern(dash)) (line sf_y_dn year, lpattern(dash)) if tin(1984q1,1985q4)
twoway (line realcons year) (line df_realcons year) (line df_y_up year, lpattern(dash)) (line df_y_dn year, lpattern(dash)) if tin(1984q1,1985q4)
The graphs show that the actual values of realcons fall within the prediction bounds more often for the dynamic forecast than for the static forecast. On this evidence, the dynamic forecast is preferable to the static forecast for realcons.
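A reading of the graphs like this one can be backed up with a simple count. The Python sketch below computes the share of forecast periods in which the actual value lies between the lower and upper bounds. The numbers are made up purely to show the calculation and are not the realcons results; the function name `coverage` is hypothetical.

```python
def coverage(actual, lower, upper):
    """Share of periods in which the actual value lies within [lower, upper]."""
    inside = sum(1 for a, lo, hi in zip(actual, lower, upper) if lo <= a <= hi)
    return inside / len(actual)

# Made-up actual values and two competing sets of interval bounds.
actual = [1.0, 5.0, 9.0]

bounds_a_lower = [0.0, 6.0, 8.0]
bounds_a_upper = [2.0, 7.0, 10.0]

bounds_b_lower = [0.0, 4.0, 8.0]
bounds_b_upper = [2.0, 6.0, 10.0]

print("coverage, forecast A:", coverage(actual, bounds_a_lower, bounds_a_upper))
print("coverage, forecast B:", coverage(actual, bounds_b_lower, bounds_b_upper))
```

Coverage alone rewards wide intervals, so it is usually worth reading it alongside the size of the forecast errors themselves.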