# Regression Model

## Introduction

This chapter introduces linear regression analysis together with the associated estimation and inference methods. Regression analysis is a widely used tool in financial econometrics: it is used to describe and evaluate relationships between financial variables and to perform forecasting tasks.

This chapter provides only a brief description of the main tools used in regression analysis. A more detailed discussion and deeper theoretical background can be found in Greene (2000), Hamilton (1994), Hayashi (2000), Verbeek (2008), Mills (1999), and Zivot and Wang (2006).

## Linear Regression Model

Consider the linear regression model

$$Y_i = X_i'\beta + u_i, \quad i = 1, \dots, n,$$

where $X_i = [1, X_{2i}, \dots, X_{ki}]'$ is a $k \times 1$ vector of explanatory variables, $\beta = (\beta_1, \dots, \beta_k)'$ is a $k \times 1$ vector of coefficients, and $u_i$ is a random error term. In matrix form the model is expressed as

$$Y = X\beta + u, \tag{2.2.2}$$

where $Y$ and $u$ are $n \times 1$ vectors and $X$ is an $n \times k$ matrix.

The standard assumptions of the linear regression model are:

1. the linear model (2.2.2) is correctly specified;

2. the regressors $X_i$ are uncorrelated with the error term $u_i$: $E[X_i u_i] = 0$ for all $i = 1, \dots, n$;

3. $E[X_i X_i'] = \Sigma_{XX}$ is of full rank $k$;

4. $u_i$ are independently and identically distributed (iid) with mean zero and constant variance $\sigma^2$.

Ordinary Least Squares (OLS) estimation is based on minimizing the residual sum of squares

$$RSS = \sum_{i=1}^{n} \hat u_i^2 = \sum_{i=1}^{n} (Y_i - X_i'\beta)^2.$$

The fitted model is

$$Y_i = X_i'\hat\beta + \hat u_i,$$

where $\hat\beta = (X'X)^{-1}X'Y$ and $\hat u_i = Y_i - \hat Y_i = Y_i - X_i'\hat\beta$. An unbiased estimator of the regression variance is

$$\hat\sigma^2 = \frac{RSS}{n-k}.$$

Under the assumptions described above, the OLS estimates $\hat\beta$ are consistent and asymptotically normally distributed. A consistent estimator of the asymptotic variance of the parameter estimator is

$$\widehat{\mathrm{var}}(\hat\beta) = \hat\sigma^2 (X'X)^{-1}. \tag{2.2.3}$$

Estimated standard errors $\mathrm{se}(\hat\beta_i)$ for the individual parameter estimators $\hat\beta_i$ are given by the square roots of the diagonal elements of (2.2.3).
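As an illustration, the OLS formulas above can be sketched in a few lines of NumPy. The data here are hypothetical, made up purely for the example.

```python
import numpy as np

# Hypothetical data: n = 5 observations, an intercept and one regressor (k = 2).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])          # n x k design matrix
Y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])
n, k = X.shape

# OLS estimate: beta_hat = (X'X)^{-1} X'Y
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# Residuals and the unbiased variance estimator sigma2 = RSS / (n - k)
resid = Y - X @ beta_hat
rss = resid @ resid
sigma2 = rss / (n - k)

# Estimated covariance matrix (2.2.3) and standard errors of beta_hat
cov_beta = sigma2 * np.linalg.inv(X.T @ X)
se = np.sqrt(np.diag(cov_beta))
```

Solving the normal equations with `np.linalg.solve` avoids forming the explicit inverse for the estimate itself; the inverse is only needed for the covariance matrix.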

Goodness of fit is summarized by the $R^2$ of the regression,

$$R^2 = 1 - \frac{RSS}{TSS},$$

where $TSS = \sum_{i=1}^{n} (Y_i - \bar Y)^2$. The coefficient $R^2$ measures the percentage of the variation of the dependent variable $Y$ that is explained by the variation of the regressors $X$. The usual $R^2$ has the undesirable feature of never decreasing as more variables are added to the regression, even if the extra variables are irrelevant. A common way to solve this problem is to adjust $R^2$ for degrees of freedom, which gives the adjusted coefficient

$$\bar R^2 = 1 - \frac{n-1}{n-k}\,(1 - R^2).$$

The adjusted $\bar R^2$ may decrease with the addition of variables with low explanatory power.
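A minimal NumPy sketch of both measures, using the same kind of made-up data as above:

```python
import numpy as np

# Hypothetical regression: intercept plus one regressor.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])
Y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])
n, k = X.shape
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
resid = Y - X @ beta_hat

rss = resid @ resid                        # residual sum of squares
tss = np.sum((Y - Y.mean()) ** 2)          # total sum of squares
r2 = 1.0 - rss / tss
# Adjusted R^2: penalize the number of regressors via degrees of freedom
r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - k)
```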

### Hypothesis testing

Suppose that we need to test the null hypothesis $H_0: \beta_j = 0$. The OLS test statistic for testing this hypothesis (also called the t-statistic) is

$$t = \frac{\hat\beta_j}{\mathrm{se}(\hat\beta_j)},$$

which is asymptotically distributed $N(0, 1)$ under the null hypothesis. With the additional assumption of iid Gaussian error terms, $\hat\beta_j$ is normally distributed and the t-statistic follows Student's t distribution with $n - k$ degrees of freedom.
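The ratio is straightforward to compute once the standard errors are available; the sketch below uses hypothetical data.

```python
import numpy as np

# t-statistics for H0: beta_j = 0 on made-up data.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])
Y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])
n, k = X.shape
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
resid = Y - X @ beta_hat
sigma2 = (resid @ resid) / (n - k)
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

t_stats = beta_hat / se    # compare with Student t quantiles, df = n - k
```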

More general linear restriction hypotheses of the form $H_0: R\beta = r$, where $R$ is a fixed $m \times k$ matrix of rank $m$ and $r$ is an $m \times 1$ vector, are tested using the Wald statistic

$$W = (R\hat\beta - r)' \left[ R\, \widehat{\mathrm{var}}(\hat\beta)\, R' \right]^{-1} (R\hat\beta - r).$$

Under the null, the Wald statistic is asymptotically distributed $\chi^2_m$. Under the Gaussian assumption on the residuals, $W/m \sim F_{m,\,n-k}$.
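A sketch of the Wald statistic for a single hypothetical restriction (the slope equals zero); with $m = 1$ the statistic reduces to the squared t-statistic.

```python
import numpy as np

# Wald test of H0: R beta = r on made-up data.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])
Y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])
n, k = X.shape
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
resid = Y - X @ beta_hat
sigma2 = (resid @ resid) / (n - k)
cov_beta = sigma2 * np.linalg.inv(X.T @ X)

# One restriction (m = 1): the slope coefficient equals zero.
R = np.array([[0.0, 1.0]])
r = np.array([0.0])
diff = R @ beta_hat - r
W = float(diff @ np.linalg.solve(R @ cov_beta @ R.T, diff))
# W is asymptotically chi2(m); W/m follows F(m, n - k) under Gaussian errors.
```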

The statistical significance of all of the regressors excluding the intercept is captured by the F-statistic

$$F = \frac{R^2/(k-1)}{(1-R^2)/(n-k)},$$

which is distributed $F_{k-1,\,n-k}$ under the null hypothesis that all slope coefficients are zero.
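The F-statistic follows directly from $R^2$; on hypothetical data with a single regressor it coincides with the squared slope t-statistic.

```python
import numpy as np

# F-statistic for the joint significance of all slope coefficients (here just one).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])
Y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])
n, k = X.shape
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
resid = Y - X @ beta_hat
r2 = 1.0 - (resid @ resid) / np.sum((Y - Y.mean()) ** 2)

# F = [R^2 / (k - 1)] / [(1 - R^2) / (n - k)], distributed F(k-1, n-k) under H0
F = (r2 / (k - 1)) / ((1.0 - r2) / (n - k))
```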

### Residual diagnostics

If the classical assumptions of the linear regression model do not hold, estimation may produce inconsistent or inefficient estimates. Several residual diagnostic statistics are usually reported along with the regression results to check the validity of the model.

Two common violations of the regression assumptions are heteroscedasticity and autocorrelation of the error terms. Heteroscedasticity means that the variances of the error terms are not constant from observation to observation. Autocorrelation means the presence of serial correlation between error terms. In both cases, the OLS estimator remains unbiased and consistent but is no longer efficient. Moreover, the standard formula (2.2.3) for computing the variance of the parameter estimators is no longer valid, which may lead to wrong conclusions. If the variance-covariance matrix of the error terms is $\mathrm{var}[u] = \sigma^2\Omega$, then

$$\mathrm{var}[\hat\beta] = \sigma^2 (X'X)^{-1} X'\Omega X (X'X)^{-1}.$$

One way of obtaining an efficient estimate of the regression parameters is to use the Generalized Least Squares (GLS) method. The GLS estimator is given by $\hat\beta_{GLS} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}Y$ with variance $\mathrm{var}[\hat\beta_{GLS}] = \sigma^2 (X'\Omega^{-1}X)^{-1}$.
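A minimal GLS sketch, assuming a known (and entirely hypothetical) diagonal $\Omega$ with error variances proportional to $x^2$; in that case GLS reduces to weighted least squares.

```python
import numpy as np

# GLS with a hypothetical known Omega: diagonal, variances proportional to x^2.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])
Y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])
Omega_inv = np.diag(1.0 / x ** 2)   # inverse of the assumed error covariance structure

# beta_GLS = (X' Omega^{-1} X)^{-1} X' Omega^{-1} Y
beta_gls = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ Y)
```

Equivalently, one can divide each row of $X$ and $Y$ by $x_i$ and run OLS on the transformed data.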

If the matrix $\Omega$ is not known, one can use White's heteroscedasticity-consistent estimator of the standard errors of the OLS estimators. The matrix

$$\widehat{\mathrm{var}}(\hat\beta) = (X'X)^{-1} \left( \sum_{i=1}^{n} \hat u_i^2 X_i X_i' \right) (X'X)^{-1}$$

can be used as an estimate of the true variance of the OLS estimator.
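The sandwich form above translates directly into NumPy; again the data are made up for illustration.

```python
import numpy as np

# White's heteroscedasticity-consistent covariance estimator (HC0 form).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])
Y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
resid = Y - X @ beta_hat

XtX_inv = np.linalg.inv(X.T @ X)
meat = X.T @ (resid[:, None] ** 2 * X)    # sum_i u_i^2 x_i x_i'
cov_white = XtX_inv @ meat @ XtX_inv
se_white = np.sqrt(np.diag(cov_white))
```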

There are several testing procedures to detect heteroscedasticity. The White test suggests estimating an auxiliary regression of the squared OLS residuals on a constant and all regressors, their squares, and their cross products. Under the null hypothesis of homoscedasticity, the $nR^2$ statistic is asymptotically distributed $\chi^2(q)$, where $q$ is the number of variables in the auxiliary regression minus one. If the value of the statistic is large, the null hypothesis of homoscedasticity of the residuals is rejected.
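A sketch of the White test on hypothetical data with one regressor, so the auxiliary regression contains a constant, the regressor, and its square (there are no cross products with a single regressor).

```python
import numpy as np

# White test: auxiliary regression of squared residuals on [1, x, x^2].
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])
Y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])
n = len(Y)
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
u2 = (Y - X @ beta_hat) ** 2

Z = np.column_stack([np.ones_like(x), x, x ** 2])   # auxiliary regressors
gamma = np.linalg.lstsq(Z, u2, rcond=None)[0]
fitted = Z @ gamma
r2_aux = 1.0 - np.sum((u2 - fitted) ** 2) / np.sum((u2 - u2.mean()) ** 2)
white_stat = n * r2_aux        # asymptotically chi2(q), q = 2 here
```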

Another test for heteroscedasticity is the Breusch-Pagan test. It suggests regressing the squared residuals from the initial regression, scaled by $\hat\sigma^2 = \sum \hat u_i^2 / n$, on a set of known variables $Z_t$ (these may include the regressors but are not restricted to them). Under the null of homoscedasticity, one half of the ESS from the auxiliary regression asymptotically follows a $\chi^2(p - 1)$ distribution, where $p$ is the number of auxiliary variables $Z_t$.
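A sketch of the Breusch-Pagan procedure on the same style of hypothetical data, with $Z_t$ taken to be a constant and the regressor:

```python
import numpy as np

# Breusch-Pagan test: regress scaled squared residuals on Z, take ESS / 2.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])
Y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
u2 = (Y - X @ beta_hat) ** 2

g = u2 / u2.mean()                          # u^2 scaled by sigma2 = sum(u^2)/n
Z = np.column_stack([np.ones_like(x), x])   # p = 2 auxiliary variables
gamma = np.linalg.lstsq(Z, g, rcond=None)[0]
ess = np.sum((Z @ gamma - g.mean()) ** 2)   # explained sum of squares
bp_stat = ess / 2.0                         # asymptotically chi2(p - 1)
```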

The most common diagnostic statistic for the presence of autocorrelation, based on the estimated residuals $\hat u_i$, is the Durbin-Watson statistic $DW$. It is defined as

$$DW = \frac{\sum_{i=2}^{n} (\hat u_i - \hat u_{i-1})^2}{\sum_{i=1}^{n} \hat u_i^2}.$$

For large $n$ the Durbin-Watson statistic can be approximated by $DW \approx 2(1 - \hat\rho)$, where $\hat\rho$ is the estimated correlation between $\hat u_i$ and $\hat u_{i-1}$. Thus, the range of values of $DW$ is from 0 to 4. Values of $DW$ around 2 indicate no serial correlation in the error terms, values less than 2 suggest positive serial correlation, and values greater than 2 suggest negative serial correlation.
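The statistic is a one-liner given the residuals; the residuals below come from the same hypothetical regression used throughout.

```python
import numpy as np

# Durbin-Watson statistic from OLS residuals of a hypothetical regression.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])
Y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
resid = Y - X @ beta_hat

# Numerator: sum of squared first differences; denominator: residual sum of squares.
dw = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)
```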

Exact critical values for a general case cannot be tabulated; however, Durbin and Watson (1950) established upper and lower bounds ($d_U$ and $d_L$ respectively) for the critical values. The testing procedure is as follows:

• if $DW < d_L$, we reject the null hypothesis of no autocorrelation in favour of positive first-order autocorrelation;

• if $DW > d_U$, we do not reject the null hypothesis;

• if $d_L \le DW \le d_U$, the test is inconclusive.

The bounds for the critical values under the alternative of negative autocorrelation are $4 - d_U$ and $4 - d_L$. The values of the bounds can be found in Savin and White (1977); some of them are tabulated in Table 2.1.

Table 2.1: Lower and upper bounds for 5% critical values of the Durbin-Watson test.

The Breusch-Godfrey test for autocorrelation considers the regression of the OLS residuals $\hat u_i$ upon their lag $\hat u_{i-1}$. This auxiliary regression produces an estimate of the first-order autocorrelation coefficient $\rho$ and provides a standard error for this estimate. The test is easily extended to higher orders of autocorrelation by including additional lags of the residual. Testing the null hypothesis of no autocorrelation is equivalent to testing the significance of the auxiliary regression.
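A minimal sketch of the first-order version: regress the residuals on their first lag and inspect the t-ratio of $\hat\rho$. The residuals again come from hypothetical data.

```python
import numpy as np

# Breusch-Godfrey (first order): regress residuals on their own first lag.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])
Y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
u = Y - X @ beta_hat

u_lag, u_cur = u[:-1], u[1:]
rho = (u_lag @ u_cur) / (u_lag @ u_lag)     # estimated first-order autocorrelation
e = u_cur - rho * u_lag                     # auxiliary-regression residuals
se_rho = np.sqrt((e @ e) / (len(u_cur) - 1) / (u_lag @ u_lag))
t_rho = rho / se_rho                        # compare with Student t quantiles
```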

Another common diagnostic for serial correlation is the Ljung-Box modified Q-statistic. The Q-statistic at lag $q$ is a test statistic for the null hypothesis of no autocorrelation up to order $q$ and is computed as

$$Q_q = n(n+2) \sum_{j=1}^{q} \frac{\hat\rho_j^2}{n-j},$$

where $\hat\rho_j$ is the $j$-th autocorrelation.
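The formula maps directly onto code; the residual series below is invented for the example.

```python
import numpy as np

def ljung_box_q(u, q):
    """Ljung-Box Q-statistic for lags 1..q of the series u."""
    n = len(u)
    u = u - u.mean()
    denom = u @ u
    stat = 0.0
    for j in range(1, q + 1):
        rho_j = (u[:-j] @ u[j:]) / denom    # j-th sample autocorrelation
        stat += rho_j ** 2 / (n - j)
    return n * (n + 2) * stat               # compare with chi2(q) quantiles

# Hypothetical residual series for illustration
u = np.array([0.3, -0.1, 0.2, -0.4, 0.1, 0.0, -0.2, 0.3])
Q2 = ljung_box_q(u, 2)
```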

The most frequently used diagnostic statistic for testing normality of the residuals is the Jarque-Bera test statistic. It measures the difference of the skewness and kurtosis of the series from those of the normal distribution. The statistic is computed as

$$JB = \frac{n}{6}\left(S^2 + \frac{(K-3)^2}{4}\right),$$

where $S$ is the skewness and $K$ is the kurtosis. We reject the null hypothesis of normality if the Jarque-Bera statistic exceeds the corresponding critical value.
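A sketch of the statistic, with moment-based skewness and kurtosis computed from scratch; the input series is hypothetical.

```python
import numpy as np

def jarque_bera(u):
    """Jarque-Bera normality statistic JB = n/6 * (S^2 + (K - 3)^2 / 4)."""
    n = len(u)
    d = u - u.mean()
    s = np.sqrt(np.mean(d ** 2))       # population standard deviation
    S = np.mean(d ** 3) / s ** 3       # sample skewness
    K = np.mean(d ** 4) / s ** 4       # sample kurtosis
    return n / 6.0 * (S ** 2 + (K - 3.0) ** 2 / 4.0)

# Hypothetical residual series for illustration
u = np.array([0.3, -0.1, 0.2, -0.4, 0.1, 0.0, -0.2, 0.3])
jb = jarque_bera(u)
```

Under the null, $JB$ is asymptotically $\chi^2(2)$, so at the 5% level the statistic is compared with roughly 5.99.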