MathType

Tuesday, 2 August 2016

Cointegration – Johansen Test with Stata (Time Series)




In the previous discussion we showed how to test for cointegration with the Engle and Granger test. That test is intuitive and easy to perform, and once we have mastered it we also see its limitations and why other tests exist. The Engle and Granger test has two main drawbacks. First, because its second step is an ADF test, all the problems of the ADF test apply here as well; in particular, the choice of the number of lags in the augmentation is critical. Second, the test assumes a single cointegrating vector, captured by the cointegrating regression. Care must therefore be taken when applying it to models with more than two variables. If two variables cointegrate, adding a third integrated variable to the model will not change the outcome of the test: if the third variable does not belong in the cointegrating vector, OLS estimation will simply set its parameter to zero, leaving the error process unchanged. The advantage of the procedure is that it is easy, and therefore relatively costless to apply compared with other approaches; with only two variables it can work quite well.

The superior test for cointegration is Johansen’s test (1995). Its weakness is that it relies on asymptotic properties and is sensitive to specification errors in limited samples.

The method starts with a VAR representation of the variables (the economic system we would like to investigate).

We have a \(p\)-dimensional process, integrated of order \(d\), \({{\text{x}}_{t}}\sim I\left( d \right)\), with VAR representation

                                \({{\text{x}}_{t}}=\text{v}+{{\text{A}}_{1}}{{\text{x}}_{t-1}}+...+{{\text{A}}_{k}}{{\text{x}}_{t-k}}+{{\varepsilon }_{t}}\)                                                        (1)


Typically, we will assume that the system is integrated of order one.

By using the difference operator \(\Delta =1-L\) , or \(L=1-\Delta \) , the VAR in levels can be transformed to a vector error correction model (VECM).

                \(\Delta {{x}_{t}}=v+{{\Gamma }_{1}}\Delta {{x}_{t-1}}+...+{{\Gamma }_{k-1}}\Delta {{x}_{t-k+1}}+\Pi {{x}_{t-1}}+{{\varepsilon }_{t}}\)               (2)

where the \({{\Gamma }_{i}}\)’s and \(\Pi \) are matrices of parameters. The VAR has \(k\) lags on each variable.

After transforming the model using \(L=1-\Delta \), we ‘lose’ one lag at the end, leading to \(k-1\) lags in the VECM.
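As a quick check of this transformation, take the simplest case of a VAR with two lags and coefficient matrices \({{\text{A}}_{1}}\) and \({{\text{A}}_{2}}\). The derivation below is only a sketch: it subtracts \({{\text{x}}_{t-1}}\) from both sides and rewrites \({{\text{x}}_{t-2}}\) as \({{\text{x}}_{t-1}}-\Delta {{\text{x}}_{t-1}}\).

```latex
\[
\begin{aligned}
\text{x}_{t} &= \text{v}+\text{A}_{1}\text{x}_{t-1}+\text{A}_{2}\text{x}_{t-2}+\varepsilon_{t} \\
\Delta \text{x}_{t} &= \text{v}+(\text{A}_{1}-I)\text{x}_{t-1}+\text{A}_{2}(\text{x}_{t-1}-\Delta \text{x}_{t-1})+\varepsilon_{t} \\
 &= \text{v}+\Gamma_{1}\Delta \text{x}_{t-1}+\Pi \text{x}_{t-1}+\varepsilon_{t},
 \qquad \Gamma_{1}=-\text{A}_{2},\quad \Pi =\text{A}_{1}+\text{A}_{2}-I
\end{aligned}
\]
```

With two lags in the VAR there is a single lagged difference in the VECM, which is the \(k-1\) rule stated above.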

The more compact form of the VECM becomes;
\(\Delta {{\text{x}}_{t}}=\text{v}+\sum\limits_{i=1}^{k-1}{{{\Gamma }_{i}}\Delta {{\text{x}}_{t-i}}}+\Pi {{\text{x}}_{t-1}}+{{\varepsilon }_{t}}\)                                              (3)


The number of cointegrating vectors equals the number of stationary relationships in the \(\Pi \) matrix. If there is no cointegration, all rows of \(\Pi \) must be filled with zeros. If there are stationary combinations, or stationary variables, some parameters in \(\Pi \) will be nonzero.

The rank of the \(\Pi \) matrix is the number of independent rows in \(\Pi \), and hence the number of cointegrating vectors. The rank of \(\Pi \) is given by the number of significant eigenvalues found in \(\hat{\Pi }\). Each significant eigenvalue represents a stationary relation.

From Eq(3), the test for cointegration distinguishes three cases;

·         If rank \(\Pi =0\), all \(\text{x}\)’s are non-stationary and there is no linear combination of the variables that is stationary.

·         If rank \(\Pi =p\), so that \(\Pi \) has full rank, then all variables in \({{\text{x}}_{t}}\) must be stationary.

·         If \(\Pi\) has reduced rank, \(0<r<p\), there are \(r\) cointegrating relations among the \(\text{x}\)’s. The cointegrating vectors are given by \(\Pi =\alpha \beta '\), where \({{\beta }_{i}}\) represents the \(i\)-th cointegrating vector and \({{\alpha }_{j}}\) represents the effect of each cointegrating vector on the \(\Delta {{x}_{p,t}}\) variables in the model.
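A small numerical sketch may help to show what reduced rank means. The numbers below are made up for illustration and are not estimates from any data set: with \(p=3\) variables and \(r=1\) cointegrating vector, \(\alpha \) and \(\beta \) are both \(3\times 1\), and their product is a \(3\times 3\) matrix whose rows are all multiples of \(\beta '\).

```stata
* Hypothetical adjustment coefficients (alpha) and cointegrating vector (beta)
matrix alpha = (-0.2 \ 0.1 \ 0.05)
matrix beta  = (1 \ -0.5 \ -0.5)

* Pi = alpha * beta' is 3 x 3, but every row is a multiple of beta'
matrix Pi = alpha * beta'
matrix list Pi

* Mata's rank() should report 1, the number of cointegrating vectors
mata: rank(st_matrix("Pi"))
```

Although \(\Pi \) has nine entries, it carries only one independent row, which is exactly the situation the Johansen test is designed to detect.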

Johansen derived two tests, the \(\lambda -\max \) (or maximum eigenvalue) test and the \(\lambda -\text{trace}\) (or trace) test.

The max test is constructed as;
\({{\lambda }_{\max }}\left[ {{H}_{1}}\left( r \right)|{{H}_{1}}\left( r+1 \right) \right]=-T\log \left( 1-{{{\hat{\lambda }}}_{r+1}} \right)\)                                (4)

for \(r=0,1,2,...,p-2,p-1\), where the estimated eigenvalues are ordered from largest to smallest, \({{\hat{\lambda }}_{1}}>{{\hat{\lambda }}_{2}}>...>{{\hat{\lambda }}_{p}}\). The null is that there exist \(r\) cointegrating vectors against the alternative of \(r+1\) vectors.

The trace test is
\({{\lambda }_{\text{trace}}}\left[ {{H}_{1}}\left( r \right)|{{H}_{0}} \right]=-T\sum\limits_{i=r+1}^{p}{\log \left( 1-{{{\hat{\lambda }}}_{i}} \right)}\)                              (5)

where the null hypothesis is \({{\lambda }_{i}}=0\) for \(i=r+1,...,p\), so only the first \(r\) eigenvalues are non-zero.
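The two statistics are closely linked. Using the shorthand \({{\lambda }_{\max }}\left( r \right)=-T\log \left( 1-{{\hat{\lambda }}_{r+1}} \right)\) for the maximum eigenvalue statistic of \(r\) against \(r+1\) vectors, and \({{\lambda }_{\text{trace}}}\left( r \right)\) for the trace statistic in (5) (the shorthand is introduced here only for this illustration), the sum in (5) splits term by term:

```latex
\[
\lambda_{\text{trace}}(r)=\sum_{j=r}^{p-1}\lambda_{\max}(j)
\quad\Longrightarrow\quad
\lambda_{\text{trace}}(r)-\lambda_{\text{trace}}(r+1)=\lambda_{\max}(r)
=-T\log\left(1-\hat{\lambda}_{r+1}\right)
\]
```

This gives a handy check on software output: the difference between two adjacent trace statistics should equal the corresponding max statistic.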

The trace test has been found to be the better test, since it appears to be more robust to skewness and excess kurtosis. Furthermore, the trace test can be adjusted for degrees of freedom, which can be important in small samples, by replacing \(T\) in the trace statistic with \(T-nk\), where \(n\) is the number of variables (\(p\) in our notation) and \(k\) is the lag length of the VAR (Reimers, 1992).
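To get a feel for the size of this adjustment, here is a sketch with made-up numbers: 100 observations, 3 variables, 2 lags and three hypothetical eigenvalues. None of these values come from the data used later in this post, and the scalar names are chosen only for this illustration.

```stata
* Hypothetical inputs, chosen only for illustration
scalar nobs = 100
scalar nvar = 3
scalar nlag = 2
scalar lam1 = 0.30
scalar lam2 = 0.10
scalar lam3 = 0.02

* Sum of log(1 - lambda_i) over i = 1,...,3, as in Eq(5) with r = 0
scalar sumlog = ln(1 - lam1) + ln(1 - lam2) + ln(1 - lam3)

* Unadjusted trace statistic, and the version with T replaced by T - nk
scalar trace0    = -nobs*sumlog
scalar trace0adj = -(nobs - nvar*nlag)*sumlog
display "trace statistic, r = 0           : " trace0
display "trace statistic, r = 0, adjusted : " trace0adj
```

With these inputs the statistic falls from roughly 48.2 to roughly 45.3, so the adjustment makes the test a little more conservative; the effect shrinks as the sample grows.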

Deterministic trends in a cointegrated VECM can stem from two distinct sources: the mean of the cointegrating relationship and the mean of the differenced series.

Allowing for a constant and a linear trend and assuming that there are \(r\) cointegration relations, we can rewrite the VECM in (3) as

                 \(\Delta {{\text{x}}_{t}}=\text{v}+\sum\limits_{i=1}^{k-1}{{{\Gamma }_{i}}\Delta {{\text{x}}_{t-i}}}+\alpha \beta '{{\text{x}}_{t-1}}+\delta t+{{\varepsilon }_{t}}\)                                                  (6)

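where \(\delta\) is a \(K\times 1\) vector of parameters and \(K\) is the number of variables (\(p\) above). Because Eq(6) models the differences of the data, the constant implies a linear time trend in the levels, and the time trend \(\delta t\) implies a quadratic time trend in the levels of the data. Often we want to include a constant or a linear trend in the differences without allowing for the higher-order trend this implies for the levels. VECMs exploit the properties of the matrix \(\alpha \) to achieve this flexibility.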

Because \(\alpha\) is a \(K\times r\) matrix of rank \(r\), we can rewrite the deterministic components in Eq(6) as

                                                \(\text{v = }\alpha \mu \text{+}\gamma \)                                                                                          (7)
                                                \(\delta t=\alpha \rho t+\tau t\)                                                                                               (8)

where \(\mu \) and \(\rho \) are \(r\times 1\) vectors of parameters and \(\gamma \) and \(\tau \) are \(K\times 1\) vectors of parameters. \(\gamma \) is orthogonal to \(\alpha \mu \) and \(\tau \) is orthogonal to \(\alpha \rho \), such that \(\gamma '\alpha \mu =0\) and \(\tau '\alpha \rho =0\).
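To see where this leads, substitute (7) and (8) into the deterministic part of Eq(6) and collect the terms that are multiplied by \(\alpha \). This is just the algebra implied by the definitions above:

```latex
\[
\begin{aligned}
\alpha \beta '\text{x}_{t-1}+\text{v}+\delta t
 &= \alpha \beta '\text{x}_{t-1}+\left( \alpha \mu +\gamma \right)+\left( \alpha \rho t+\tau t \right) \\
 &= \alpha \left( \beta '\text{x}_{t-1}+\mu +\rho t \right)+\gamma +\tau t
\end{aligned}
\]
```

The terms \(\mu \) and \(\rho t\) sit inside the cointegrating relations, while \(\gamma \) and \(\tau t\) act directly on the differenced series.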

Substituting (7) and (8) into Eq(6), so that \(\text{v}\) and \(\delta t\) are replaced by their components, the VECM can be written as below;

\(\Delta {{\text{x}}_{t}}=\sum\limits_{i=1}^{k-1}{{{\Gamma }_{i}}\Delta {{\text{x}}_{t-i}}}+\alpha \left( \beta '{{\text{x}}_{t-1}}+\mu +\rho t \right)+\gamma +\tau t+{{\varepsilon }_{t}}\)                         (9)


Placing restrictions on the trend terms in Eq(9) yields five cases;

Case 1 : No trend, \(\tau =0\), \(\rho =0\), \(\gamma =0\) and \(\mu =0\). The level data \({{\text{x}}_{t}}\) have no deterministic trends and the cointegration equations do not have intercepts;
\(\Delta {{\text{x}}_{t}}=\sum\limits_{i=1}^{k-1}{{{\Gamma }_{i}}\Delta {{\text{x}}_{t-i}}}+\alpha \beta '{{\text{x}}_{t-1}}+{{\varepsilon }_{t}}\)                                                                                (10)


This model is uninteresting because it assumes that all variables in the cointegrating vectors have the same mean.


Case 2 : Restricted constant, \(\tau =0\), \(\rho =0\) and \(\gamma =0\). The level data \({{\text{x}}_{t}}\) have no deterministic trends and the cointegration equations have intercepts;
\(\Delta {{\text{x}}_{t}}=\sum\limits_{i=1}^{k-1}{{{\Gamma }_{i}}\Delta {{\text{x}}_{t-i}}}+\alpha \left( \beta '{{\text{x}}_{t-1}}+\mu \right)+{{\varepsilon }_{t}}\)          (11)






Case 3 : Unrestricted constant, \(\tau =0\) and \(\rho =0\). The level data \({{\text{x}}_{t}}\) have linear trends but the cointegration equations have only intercepts;

\(\Delta {{\text{x}}_{t}}=\sum\limits_{i=1}^{k-1}{{{\Gamma }_{i}}\Delta {{\text{x}}_{t-i}}}+\alpha \left( \beta '{{\text{x}}_{t-1}}+\mu \right)+\gamma +{{\varepsilon }_{t}}\)          (12)


Case 4 : Restricted trend, \(\tau =0\). The level data \({{\text{x}}_{t}}\) and the cointegration equations have linear trends;
\(\Delta {{\text{x}}_{t}}=\sum\limits_{i=1}^{k-1}{{{\Gamma }_{i}}\Delta {{\text{x}}_{t-i}}}+\alpha \left( \beta '{{\text{x}}_{t-1}}+\mu +\rho t \right)+\gamma +{{\varepsilon }_{t}}\)                                      (13)


In practice, this is the model of last resort. If no meaningful cointegrating vectors are found using Case 2 or Case 3, a trend component in the vectors might do the trick. A trend in the cointegrating vectors can be understood as a growth-in-target problem (e.g. productivity growth or technological development that the model cannot account for).

Case 5 : Unrestricted trend, with no restriction placed on Eq(9). The level data \({{\text{x}}_{t}}\) have quadratic trends and the cointegrating equations have linear trends;

\(\Delta {{\text{x}}_{t}}=\sum\limits_{i=1}^{k-1}{{{\Gamma }_{i}}\Delta {{\text{x}}_{t-i}}}+\alpha \left( \beta '{{\text{x}}_{t-1}}+\mu +\rho t \right)+\gamma +\tau t+{{\varepsilon }_{t}}\)                         (14)


This model is quite unrealistic and should not be considered in applied work. The reason is the difficulty of motivating quadratic trends in a multivariate model. For example, from an economic point of view it is totally unrealistic to assume that technological or productivity growth is an ever-accelerating process.

For our discussion in Stata, we will use the data set macro.

First, we set our data;

tsset qtr, quarterly

Let us plot the variables linv, linc and lcons;
twoway(line linv qtr)(line linc qtr)(line lcons qtr)
 
 

The graph clearly shows that our variables trend together.

Next we perform unit root tests to make sure that our variables are integrated of the same order. For these tests, we use the SBIC to select the lag length.

varsoc linv
dfuller linv,trend lags(1)
varsoc D.linv
dfuller D.linv,lags(0)

 


varsoc linc
dfuller linc,trend lags(2)
varsoc D.linc
dfuller D.linc,lags(0)

 

varsoc lcons
dfuller lcons,trend lags(4)
varsoc D.lcons
dfuller D.lcons,lags(3)

 


The results show that, at the 10% significance level, all the variables are non-stationary in levels but stationary in first differences.

That means all the variables are \(I\left( 1 \right)\).
Stata provides the command vecrank to perform the Johansen test for cointegration.
It is part of official Stata, so it should already be available without any separate installation.

Before we perform the cointegration test, we first need to select an appropriate lag order for the VAR using an information criterion. To do this;

varbasic linv linc lcons,lags(1/10)
varsoc

The SBIC information criterion shows that the appropriate lag order is 2.
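An alternative route, sketched here, is to call varsoc directly on the three series without fitting a VAR first. The maxlag(10) option mirrors the ten lags considered above, and the reported criteria should lead to the same choice.

```stata
* Lag-order selection statistics (FPE, AIC, HQIC, SBIC) for lags 0 to 10
varsoc linv linc lcons, maxlag(10)
```

The lag with the smallest SBIC is marked with an asterisk in the output table.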
To perform the cointegration test for the variables linv, linc and lcons with the vecrank command under each of the five cases;

*Case 1:no trend

vecrank linv linc lcons, lags(2) trend(none) levela max

*Case 2:restricted constant

vecrank linv  linc lcons, lags(2) trend(rconstant) levela max

*Case 3:unrestricted constant

vecrank linv  linc lcons, lags(2) trend(constant) levela max

*Case 4:restricted trend

vecrank linv linc lcons, lags(2) trend(rtrend) levela max

*Case 5:unrestricted trend

vecrank linv linc lcons, lags(2) trend(trend) levela max
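Since the five commands differ only in the trend() option, they can also be run in a loop. This is a sketch using the same variables and lag order as above; the local macro name t is arbitrary.

```stata
* Run the Johansen test under each of the five trend specifications
foreach t in none rconstant constant rtrend trend {
    display _newline "Trend specification: `t'"
    vecrank linv linc lcons, lags(2) trend(`t') levela max
}
```

Running all five side by side makes it easier to see how sensitive the estimated rank is to the deterministic specification.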



Let us now look at the cointegration test results for the variables in Case 3;

 
The upper panel is for \({{\lambda }_{\text{trace}}}\)   and the lower panel is for \({{\lambda }_{\text{max}}}\).

At \(r=0\), the \({{\lambda }_{\text{trace}}}\) statistic exceeds its critical value of 29.68 at the 5% level, so we can reject the null hypothesis of no cointegrating equations (note that the figure quoted for this statistic, 13.6777, is smaller than 29.68 and so cannot be the value behind this rejection; check it against the vecrank output). At \(r=1\), the \({{\lambda }_{\text{trace}}}\) value of 10.8534 is less than its critical value of 15.41 at the 5% level, which means we fail to reject the null hypothesis that there is at most one cointegrating equation.

That means the Johansen test based on \({{\lambda }_{\text{trace}}}\) confirms that there is one cointegrating relationship among the variables linv, linc and lcons.
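With the rank settled, a natural next step is to estimate the VECM itself. The sketch below assumes the same lag order (2), the Case 3 trend specification and the one cointegrating equation found above; the two diagnostic commands are optional checks and not part of the Johansen test.

```stata
* Fit the VECM with one cointegrating equation and an unrestricted constant
vec linv linc lcons, lags(2) rank(1) trend(constant)

* LM test for residual autocorrelation, up to 2 lags
veclmar, mlag(2)

* Stability check: with 3 variables and rank 1, 2 unit moduli are imposed
vecstable
```

The vec output reports the estimated \(\alpha \) and \(\beta \) from \(\Pi =\alpha \beta '\), so the cointegrating vector can be read off directly.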
