# Cointegration

The assumption of stationarity of regressors and regressands is crucial for the properties of the OLS estimators discussed in Chapter 2. Under stationarity, the usual statistical results for the linear regression model, including consistency of the estimators, hold. When the variables are non-stationary, however, these results may fail.

## Spurious Regression

If there are trends in the data (deterministic or stochastic), OLS regression can produce spurious results. This is because the time trend will dominate other stationary variables, and the OLS estimators will pick up covariances generated by the time trends only. While the effects of deterministic trends can be removed from the regression either by including a time trend regressor or simply by de-trending the variables, non-stationary variables with stochastic trends may lead to invalid inferences.


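Consider, for example, two independent random walks,

$$Y_{1t} = Y_{1,t-1} + \eta_{1t}, \qquad Y_{2t} = Y_{2,t-1} + \eta_{2t},$$

where $\eta_{1t}$ and $\eta_{2t}$ are independent white noise processes.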

Both variables are non-stationary and independent of each other. In the regression $Y_{1t} = \beta_0 + \beta_1 Y_{2t} + \varepsilon_t$, the true slope parameter is $\beta_1 = 0$, so the OLS estimate $\hat{\beta}_1$ should be insignificant. In practice, however, such regressions produce high $R^2$ coefficients and highly significant t-statistics.

The problem with spurious regression is that the t- and F-statistics do not follow standard distributions. As shown in Phillips (1986), $\hat{\beta}_1$ does not converge in probability to zero, and $R^2$ converges to unity as $T \to \infty$, so that the model will appear to fit well even though it is misspecified.
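The spurious regression problem can be illustrated with a small Monte Carlo experiment. The sketch below is only an illustration under assumed settings: the sample size, the number of replications, the seed, and the helper names `ols_slope_tstat` and `spurious_rejection_rate` are all hypothetical choices, not part of the original text. It regresses one simulated random walk on another, independent one and counts how often the conventional t-test rejects a true slope of zero at the nominal 5% level.

```python
import numpy as np


def ols_slope_tstat(y, x):
    """Regress y on a constant and x; return (slope, t-statistic, R^2)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, k = X.shape
    sigma2 = resid @ resid / (n - k)            # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)       # conventional OLS covariance
    t_stat = beta[1] / np.sqrt(cov[1, 1])
    r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return beta[1], t_stat, r2


def spurious_rejection_rate(T=200, n_rep=500, seed=0):
    """Share of replications in which |t| > 1.96 although the true slope is 0."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_rep):
        # Two independent random walks (assumed data-generating process).
        y1 = np.cumsum(rng.standard_normal(T))
        y2 = np.cumsum(rng.standard_normal(T))
        _, t_stat, _ = ols_slope_tstat(y1, y2)
        rejections += int(abs(t_stat) > 1.96)
    return rejections / n_rep


rate = spurious_rejection_rate()
print(f"Share of |t| > 1.96 with a true slope of zero: {rate:.2f}")
```

If the t-statistic followed its standard distribution, the printed share would be close to 0.05; in this kind of experiment it is typically far larger.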

Regression with I(1) data only makes sense when the data are cointegrated.

## Cointegration

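Let $Y_t = (Y_{1t}, \ldots, Y_{kt})'$ denote a $k \times 1$ vector of I(1) time series. $Y_t$ is cointegrated if there exists a $k \times 1$ vector $\beta = (\beta_1, \ldots, \beta_k)'$ such that

$$\beta' Y_t = \beta_1 Y_{1t} + \cdots + \beta_k Y_{kt} \sim I(0).$$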

The non-stationary time series in $Y_t$ are cointegrated if there is a linear combination of them that is stationary. If some elements of $\beta$ are equal to zero, then only the subset of the time series in $Y_t$ with non-zero coefficients is cointegrated.

There may be different vectors $\beta$ such that $Z_t = \beta' Y_t$ is stationary. In general, there can be $0 < r < k$ linearly independent cointegrating vectors. Together, the cointegrating vectors form a cointegrating matrix $B$. This matrix is again not unique, so some normalization assumption is required to eliminate ambiguity from the definition.

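A typical normalization is

$$\beta = (1, -\beta_2, \ldots, -\beta_k)',$$

so that the cointegration relationship may be expressed as

$$\beta' Y_t = Y_{1t} - \beta_2 Y_{2t} - \cdots - \beta_k Y_{kt} \sim I(0).$$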
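To make the definition and the role of normalization concrete, the following sketch simulates a bivariate cointegrated system. The parameter value `beta2 = 0.5`, the sample size, and the seed are assumptions chosen purely for illustration. Each series is I(1), yet the combination with weights $(1, -\beta_2)'$ recovers the stationary deviation, and any rescaled vector does so as well, which is why a normalization is needed.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 2000
beta2 = 0.5                                  # hypothetical cointegrating parameter

y2 = np.cumsum(rng.standard_normal(T))       # I(1): a pure random walk
u = rng.standard_normal(T)                   # I(0): stationary deviation
y1 = beta2 * y2 + u                          # I(1), shares the trend of y2

Y = np.column_stack([y1, y2])
beta = np.array([1.0, -beta2])               # normalised cointegrating vector
z = Y @ beta                                 # z_t = y1_t - beta2 * y2_t

# Any non-zero multiple of beta is also a cointegrating vector.
z_scaled = Y @ (3.0 * beta)

print("variance of y2 (first half, second half):",
      y2[: T // 2].var(), y2[T // 2:].var())
print("variance of z  (first half, second half):",
      z[: T // 2].var(), z[T // 2:].var())
```

The half-sample variances of `z` should be of similar size, while those of the random walk `y2` generally are not, reflecting the difference between an I(0) and an I(1) series.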

## Error Correction Models

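Engle and Granger (1987) state that if a bivariate I(1) vector $Y_t = (Y_{1t}, Y_{2t})'$ is cointegrated with cointegrating vector $\beta = (1, -\beta_2)'$, then there exists an error correction model (ECM) of the form

$$
\begin{aligned}
\Delta Y_{1t} &= c_1 + \alpha_1 (Y_{1,t-1} - \beta_2 Y_{2,t-1}) + \sum_{j} \psi_{11}^{j} \Delta Y_{1,t-j} + \sum_{j} \psi_{12}^{j} \Delta Y_{2,t-j} + \varepsilon_{1t}, \\
\Delta Y_{2t} &= c_2 + \alpha_2 (Y_{1,t-1} - \beta_2 Y_{2,t-1}) + \sum_{j} \psi_{21}^{j} \Delta Y_{1,t-j} + \sum_{j} \psi_{22}^{j} \Delta Y_{2,t-j} + \varepsilon_{2t},
\end{aligned}
\tag{6.2.3}
$$

that links the short-term dynamics of $Y_{1t}$ and $Y_{2t}$ to their long-term relation. If both time series are I(1) but are cointegrated (have a long-term stationary relationship), there is a force that brings the error term back towards zero. If the cointegrating parameter $\beta_2$ is known, the error correction term is observable and the model can be estimated by OLS.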
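As a rough illustration of estimating an ECM by OLS when the cointegrating parameter is known, the sketch below simulates a bivariate system and regresses $\Delta Y_{1t}$ on a constant and the lagged error correction term $Y_{1,t-1} - \beta_2 Y_{2,t-1}$. The data-generating process, the values `alpha1 = -0.3` and `beta2 = 0.5`, and the omission of lagged differences are all simplifying assumptions made here, not the text's specification.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 2000
alpha1 = -0.3        # assumed adjustment speed of y1 towards equilibrium
beta2 = 0.5          # assumed (known) cointegrating parameter

e1 = rng.standard_normal(T)
e2 = rng.standard_normal(T)
y1 = np.zeros(T)
y2 = np.zeros(T)
for t in range(1, T):
    # y2 is a random walk; y1 corrects a share of last period's disequilibrium.
    y2[t] = y2[t - 1] + e2[t]
    y1[t] = y1[t - 1] + alpha1 * (y1[t - 1] - beta2 * y2[t - 1]) + e1[t]

# Error correction term lagged one period, aligned with the first differences.
z_lag = (y1 - beta2 * y2)[:-1]
dy1 = np.diff(y1)

# OLS of dy1 on a constant and the lagged error correction term.
X = np.column_stack([np.ones(T - 1), z_lag])
coef, *_ = np.linalg.lstsq(X, dy1, rcond=None)
alpha1_hat = coef[1]

print(f"estimated adjustment coefficient: {alpha1_hat:.3f} (true value {alpha1})")
```

A negative estimate of the adjustment coefficient corresponds to the force described above: when $Y_{1t}$ is above its long-term relation with $Y_{2t}$, it tends to fall in the next period.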

## Tests for Cointegration: The Engle-Granger Approach

Engle and Granger (1987) show that if there is a cointegrating vector, a simple two-step residual-based testing procedure can be employed to test for cointegration. In this case, a long-run equilibrium relationship between the components of $Y_t$ can be estimated by running the regression

$$Y_{1t} = \boldsymbol{\beta}_2' \mathbf{Y}_{2t} + u_t, \tag{6.2.4}$$

where $\mathbf{Y}_{2t} = (Y_{2t}, \ldots, Y_{kt})'$ is a $(k-1) \times 1$ vector. To test the null hypothesis that $Y_t$ is not cointegrated, we test whether the residuals satisfy $\hat{u}_t \sim I(1)$ against the alternative $\hat{u}_t \sim I(0)$. This can be done with any unit root test. The most commonly used is the augmented Dickey-Fuller test with a constant term and without a trend term. Because the residuals are estimated rather than observed, the standard Dickey-Fuller critical values do not apply; critical values for this test are tabulated in Phillips and Ouliaris (1990) and MacKinnon (1996).
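The two-step procedure can be sketched for the bivariate case as follows. This is a simplified illustration rather than a full implementation: the function name `engle_granger_tstat` is hypothetical, a constant is included in the first-step regression, and the second step is a plain Dickey-Fuller regression without lagged differences. The comparison value of roughly -3.34 is an approximate asymptotic 5% critical value for two variables with a constant and is an assumption here; the tables cited above should be consulted for the exact case.

```python
import numpy as np


def engle_granger_tstat(y1, y2):
    """Return (beta2_hat, Dickey-Fuller t-statistic of the step-1 residuals)."""
    # Step 1: static long-run regression y1 = c + beta2 * y2 + u.
    X = np.column_stack([np.ones_like(y2), y2])
    coef, *_ = np.linalg.lstsq(X, y1, rcond=None)
    u = y1 - X @ coef

    # Step 2: regression du_t = rho * u_{t-1} + e_t (no augmentation lags).
    du, u_lag = np.diff(u), u[:-1]
    rho = (u_lag @ du) / (u_lag @ u_lag)
    e = du - rho * u_lag
    se = np.sqrt((e @ e) / (len(du) - 1) / (u_lag @ u_lag))
    return coef[1], rho / se


# Simulated cointegrated pair (assumed process, for illustration only).
rng = np.random.default_rng(3)
T = 500
y2 = np.cumsum(rng.standard_normal(T))
y1 = 1.0 + 0.5 * y2 + rng.standard_normal(T)

beta2_hat, t_stat = engle_granger_tstat(y1, y2)
CRIT_5PCT = -3.34    # approximate value, see the caveat in the lead-in
print(f"beta2_hat = {beta2_hat:.3f}, t = {t_stat:.2f}")
print("reject 'no cointegration':", t_stat < CRIT_5PCT)
```

A t-statistic well below the critical value leads to rejecting the null hypothesis of no cointegration. With serially correlated residuals, lagged differences of the residuals would need to be added to the second-step regression.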

A potential problem with the Engle-Granger approach is that the cointegrating vector may not involve the $Y_{1t}$ component. In this case, the cointegrating vector will not be consistently estimated from the OLS regression, leading to spurious results. Also, if there is more than one cointegrating relation, the Engle-Granger approach cannot detect all of them.

Estimation of the static model (6.2.4) is equivalent to omitting the short-term components from the error correction model (6.2.3). If this results in autocorrelated residuals, the results still hold asymptotically, but the bias in finite samples may be severe. Because of this, it makes sense to estimate the full dynamic model. Since all variables in the ECM are I(0), the model can be consistently estimated by OLS. This approach performs better because it does not push the short-term dynamics into the residuals.