# Regression Model

## Introduction

This chapter starts with an introduction to linear regression analysis and the associated estimation and inference methods. Regression analysis is a widely used tool in financial econometrics: regressions are used to describe and evaluate relationships between financial variables and to perform forecasting tasks.

This chapter provides only a brief description of the main tools used in regression analysis. More detailed discussion and deeper theoretical background can be found in Greene (2000), Hamilton (1994), Hayashi (2000), Verbeek (2008), Mills (1999), and Zivot and Wang (2006).

## Linear Regression Model

Consider the linear regression model

$$Y_i = X_i'\beta + u_i, \qquad i = 1, \dots, n, \tag{2.2.1}$$

where $X_i = [1, X_{2i}, \dots, X_{ki}]'$ is a $k \times 1$ vector of explanatory variables, $\beta = (\beta_1, \dots, \beta_k)'$ is a $k \times 1$ vector of coefficients, and $u_i$ is a random error term. In matrix form the model is expressed as

$$Y = X\beta + u, \tag{2.2.2}$$

where $Y$ and $u$ are $n \times 1$ vectors and $X$ is an $n \times k$ matrix.

The standard assumptions of the linear regression model are:

1. the linear model (2.2.2) is correctly specified;

2. the regressors $X_i$ are uncorrelated with the error term $u_i$: $E[X_i u_i] = 0$ for all $i = 1, \dots, n$;

3. $E[X_i X_i'] = \Sigma_{XX}$ is of full rank $k$;

4. $u_i$ are independently and identically distributed (iid) with mean zero and constant variance $\sigma^2$.

Ordinary Least Squares (OLS) estimation is based on minimizing the residual sum of squares $RSS = \sum_{i=1}^{n} \hat{u}_i^2$. The fitted model is

$$Y_i = X_i'\hat{\beta} + \hat{u}_i,$$

where

$$\hat{\beta} = (X'X)^{-1}X'Y$$

and $\hat{u}_i = Y_i - \hat{Y}_i = Y_i - X_i'\hat{\beta}$. An unbiased estimator of the regression variance is

$$\hat{\sigma}^2 = \frac{\sum_{i=1}^{n} \hat{u}_i^2}{n - k}.$$

Under the assumptions described above, the OLS estimates $\hat{\beta}$ are consistent and asymptotically normally distributed. A consistent estimator of the asymptotic variance of the parameter estimator is

$$\widehat{var}[\hat{\beta}] = \hat{\sigma}^2 (X'X)^{-1}. \tag{2.2.3}$$

Estimated standard errors $se(\hat{\beta}_i)$ for individual parameter estimators $\hat{\beta}_i$ are given by the square roots of the diagonal elements of (2.2.3).
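As a concrete sketch (assuming NumPy is available; the function name and data are illustrative, not from the text), the OLS estimates, the unbiased variance estimate, and the standard errors implied by (2.2.3) can be computed directly from the normal equations:

```python
import numpy as np

def ols(X, y):
    """OLS estimates, unbiased error variance, and standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y            # beta_hat = (X'X)^{-1} X'Y
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)    # RSS / (n - k)
    cov = sigma2 * XtX_inv              # estimator (2.2.3)
    se = np.sqrt(np.diag(cov))          # square roots of diagonal elements
    return beta, sigma2, se

# Exact linear data y = 1 + 2x: residuals and standard errors are (numerically) zero.
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones(4), x])
y = 1.0 + 2.0 * x
beta, sigma2, se = ols(X, y)
```

With noisy data the same function returns nonzero residual variance and standard errors.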

Goodness of fit is summarized by the $R^2$ of the regression, $R^2 = 1 - \frac{RSS}{TSS}$, where $TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$. The coefficient $R^2$ measures the percentage of the variation of the dependent variable $Y$ that is explained by the variation of the regressors $X$. The usual $R^2$ has the undesirable feature of never decreasing as more variables are added to the regression, even if the extra variables are irrelevant. A common way to solve the problem is to adjust $R^2$ for degrees of freedom; this gives the adjusted $R^2$,

$$\bar{R}^2 = 1 - \frac{n-1}{n-k}\left(1 - R^2\right).$$

The adjusted $R^2$ may decrease with the addition of variables with low explanatory power.
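The two goodness-of-fit measures can be computed side by side. The sketch below (NumPy assumed; the data are invented for illustration) shows the adjusted $R^2$ falling below the plain $R^2$:

```python
import numpy as np

def r_squared(X, y):
    """Plain and degrees-of-freedom-adjusted R^2 for an OLS fit."""
    n, k = X.shape
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ beta
    rss = resid @ resid
    tss = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - rss / tss
    adj_r2 = 1.0 - (n - 1) / (n - k) * (1.0 - r2)  # penalizes extra regressors
    return r2, adj_r2

x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones(4), x])
y = np.array([0.0, 1.0, 2.0, 2.0])
r2, adj = r_squared(X, y)
```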

### Hypothesis testing

Suppose that we need to test the null hypothesis

$$H_0: \beta_j = \beta_j^0.$$

The OLS test statistic for testing this hypothesis (also called the t-statistic) is

$$t = \frac{\hat{\beta}_j - \beta_j^0}{se(\hat{\beta}_j)},$$

which is asymptotically distributed $N(0, 1)$ under the null hypothesis. With the additional assumption of an iid Gaussian error term, $\hat{\beta}_j$ is normally distributed and the t-statistic follows Student's t distribution with $n - k$ degrees of freedom.

More general linear restriction hypotheses of the form $H_0: R\beta = r$, where $R$ is a fixed $m \times k$ matrix of rank $m$ and $r$ is an $m \times 1$ vector, are tested using the Wald statistic

$$W = (R\hat{\beta} - r)' \left[ R\, \widehat{var}[\hat{\beta}]\, R' \right]^{-1} (R\hat{\beta} - r).$$

Under the null, the Wald statistic is asymptotically distributed $\chi^2_m$. Under the Gaussian assumption on the residuals, $W/m \sim F_{m,\, n-k}$.

The statistical significance of all of the regressors excluding the intercept is captured by the F-statistic

$$F = \frac{R^2 / (k-1)}{(1 - R^2) / (n - k)},$$

which is distributed $F_{k-1,\, n-k}$ under the null hypothesis that all slope coefficients are zero.
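The F-statistic can be obtained from the $R^2$ alone. A sketch with invented data (NumPy assumed); with a single slope coefficient, $F$ equals the squared t-statistic for that slope:

```python
import numpy as np

def f_statistic(X, y):
    """F-statistic for H0: all slope coefficients are zero."""
    n, k = X.shape
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ beta
    r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return (r2 / (k - 1)) / ((1.0 - r2) / (n - k))

x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones(4), x])
y = np.array([0.0, 1.0, 2.0, 2.0])
F = f_statistic(X, y)   # compared against an F(k-1, n-k) critical value
```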

### Residual diagnostics

If the classical assumptions of the linear regression model do not hold, the estimates may be inconsistent or inefficient. Several residual diagnostic statistics are usually reported along with the regression results to check the validity of the model.

Two common problems with the regression assumptions are heteroscedasticity and autocorrelation of the error terms. Heteroscedasticity means that the variances of the error terms are not constant from observation to observation. Autocorrelation means the presence of serial correlation between error terms. In both cases, the OLS estimator is unbiased and consistent but no longer efficient. Moreover, the standard formula (2.2.3) for computing the variance of the parameter estimators is no longer valid, which may lead to wrong conclusions. If the variance-covariance matrix of the error terms is $var[u] = \sigma^2 \Omega$, then

$$var[\hat{\beta}] = \sigma^2 (X'X)^{-1} X'\Omega X (X'X)^{-1}.$$

One way of obtaining an efficient estimate of the regression parameters is to use the Generalized Least Squares (GLS) method. The GLS estimator is given by $\hat{\beta}_{GLS} = (X'\Omega^{-1}X)^{-1} X'\Omega^{-1} Y$ with variance

$$var[\hat{\beta}_{GLS}] = \sigma^2 (X'\Omega^{-1}X)^{-1}.$$

If the matrix $\Omega$ is not known, one can use White's heteroscedasticity-consistent estimator of the standard errors of the OLS estimators. The matrix

$$\widehat{var}[\hat{\beta}] = (X'X)^{-1} \left( \sum_{i=1}^{n} \hat{u}_i^2 X_i X_i' \right) (X'X)^{-1}$$

can be used as an estimate of the true variance of the OLS estimator.
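White's sandwich estimator is straightforward to assemble from the residuals. A sketch (NumPy assumed; the data are invented for illustration):

```python
import numpy as np

def white_cov(X, y):
    """White's heteroscedasticity-consistent (HC0) covariance of the OLS estimator."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    u = y - X @ beta
    meat = (X * u[:, None] ** 2).T @ X   # sum_i u_i^2 X_i X_i'
    return XtX_inv @ meat @ XtX_inv      # sandwich form

x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones(4), x])
y = np.array([0.0, 1.0, 2.0, 2.0])
cov = white_cov(X, y)
se_white = np.sqrt(np.diag(cov))         # robust standard errors
```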

There are several testing procedures to detect heteroscedasticity. The White test suggests estimating an auxiliary regression of the squared OLS residuals on a constant and all regressors, their squares, and their cross products. Under the null hypothesis of homoscedasticity, the $nR^2$ statistic is asymptotically distributed $\chi^2(q)$, where $q$ is the number of variables in the auxiliary regression minus one. If the value of the statistic is large, the null hypothesis of homoscedasticity in the residuals is rejected.
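For a model with a single regressor (so there are no cross products), the White test's $nR^2$ statistic can be sketched as follows (NumPy assumed; the helper name and data are invented):

```python
import numpy as np

def white_test_stat(X, y):
    """nR^2 statistic from White's auxiliary regression.

    X is assumed to be [1, x]; the auxiliary regressors are then 1, x, x^2."""
    n = X.shape[0]
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    u2 = (y - X @ beta) ** 2                        # squared OLS residuals
    Z = np.column_stack([X, X[:, 1] ** 2])          # constant, x, x^2
    gamma = np.linalg.lstsq(Z, u2, rcond=None)[0]
    e = u2 - Z @ gamma
    r2_aux = 1.0 - e @ e / np.sum((u2 - u2.mean()) ** 2)
    return n * r2_aux                               # ~ chi^2(q) under the null

x = np.arange(8.0)
X = np.column_stack([np.ones(8), x])
y = np.array([0.0, 1.1, 1.9, 3.2, 3.8, 5.1, 6.3, 6.6])
stat = white_test_stat(X, y)
```

Since $R^2 \in [0, 1]$, the statistic always lies between 0 and $n$; it would be compared against a $\chi^2(2)$ critical value here.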

Another test for heteroscedasticity is the Breusch-Pagan test. It suggests regressing the squared residuals from the initial regression, scaled by $\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n} \hat{u}_i^2$, on a set of known variables $Z_t$ (which may include the regressors but are not restricted to them). Under the null of homoscedasticity, the scaled $\frac{1}{2}ESS$ from the auxiliary regression asymptotically follows a $\chi^2(p-1)$ distribution, where $p$ is the number of auxiliary variables $Z_t$.

The most common diagnostic statistic for the presence of autocorrelation based on the estimated residuals $\hat{u}_i$ is the Durbin-Watson statistic $DW$. It is defined as

$$DW = \frac{\sum_{i=2}^{n} (\hat{u}_i - \hat{u}_{i-1})^2}{\sum_{i=1}^{n} \hat{u}_i^2}.$$

For large $n$ the Durbin-Watson statistic can be approximated as $DW \approx 2(1 - \hat{\rho})$, where $\hat{\rho}$ is the estimated correlation between $\hat{u}_i$ and $\hat{u}_{i-1}$. Thus, the range of values of $DW$ is from 0 to 4. Values of $DW$ around 2 indicate no serial correlation in the error terms, values less than 2 suggest positive serial correlation, and values greater than 2 suggest negative serial correlation.

**DW**Exact critical values for a general case cannot be tabulated; however, Durbin and Watson (1950) established upper and lower bounds * (du *and

*respectively) for the critical values. The testing procedure is as follows:*

**dL**• if * DW < dL *we reject the null hypothesis of no autocorrelation in favour of positive first-order autocorrelation;

• if * DW > du *we do not reject the null hypothesis

The bounds for critical values in the case of negative autocorrelation alternative are 4 * — du *and 4

*The values of the bounds can be found in Savin and White (1977); some of the are tabulated in Table 2.1.*

**— dL.****Table 2.1: Lower and Upper bounds for 5% critical values of the Durbin-Watson**

test

The Breusch-Godfrey test for autocorrelation considers the regression of the OLS residuals $\hat{u}_i$ upon their lag $\hat{u}_{i-1}$. This auxiliary regression produces an estimate of the first-order autocorrelation coefficient $\rho$ and provides a standard error for this estimate. In the general case the test is easily extended to higher orders of autocorrelation by including additional lags of the residual. Testing the null hypothesis of no autocorrelation is equivalent to testing the significance of the auxiliary regression.
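A first-order version of the auxiliary regression can be sketched as follows (NumPy assumed; the residual series is invented, and for simplicity this helper omits the original regressors that a full Breusch-Godfrey auxiliary regression would also include):

```python
import numpy as np

def breusch_godfrey_ar1(u):
    """Regress u_t on u_{t-1}; return the AR(1) coefficient and its t-statistic."""
    y, x = u[1:], u[:-1]
    rho = (x @ y) / (x @ x)          # first-order autocorrelation estimate
    e = y - rho * x
    s2 = e @ e / (len(y) - 1)        # residual variance of the auxiliary regression
    se = np.sqrt(s2 / (x @ x))       # standard error of rho
    return rho, rho / se

u = np.array([0.2, -0.1, 0.3, -0.4, 0.1, 0.2, -0.3, 0.0])
rho, t = breusch_godfrey_ar1(u)      # alternating signs: negative rho
```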

Another common diagnostic for serial correlation is the Ljung-Box modified Q-statistic. The Q-statistic at lag $q$ is a test statistic for the null hypothesis of no autocorrelation up to order $q$ and is computed as

$$Q_q = n(n+2) \sum_{j=1}^{q} \frac{\hat{\rho}_j^2}{n - j},$$

where $\hat{\rho}_j$ is the $j$-th autocorrelation.
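A direct implementation of the Q-statistic (NumPy assumed; the autocorrelations are computed around zero, which is appropriate for residuals from a regression with an intercept, and the residual series is artificial):

```python
import numpy as np

def ljung_box(u, q):
    """Ljung-Box Q-statistic for no autocorrelation up to order q."""
    n = len(u)
    denom = np.sum(u ** 2)
    Q = 0.0
    for j in range(1, q + 1):
        rho_j = np.sum(u[j:] * u[:-j]) / denom   # j-th autocorrelation
        Q += rho_j ** 2 / (n - j)
    return n * (n + 2) * Q

u = np.tile([1.0, -1.0], 50)   # strongly autocorrelated residuals
Q = ljung_box(u, q=2)          # far above any chi^2(2) critical value
```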

The most often used diagnostic statistic to test for normality of the residuals is the Jarque-Bera test statistic. It measures the difference of the skewness and kurtosis of the series from those of the normal distribution. The statistic is computed as

$$JB = \frac{n}{6} \left( S^2 + \frac{(K - 3)^2}{4} \right),$$

where $S$ is the skewness and $K$ is the kurtosis. We reject the null hypothesis of normality if the Jarque-Bera statistic exceeds the corresponding critical value.
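The statistic can be sketched directly from the moment definitions (NumPy assumed; the data are invented, a symmetric two-point series with $S = 0$ and $K = 1$):

```python
import numpy as np

def jarque_bera(u):
    """Jarque-Bera normality statistic from sample skewness and kurtosis."""
    n = len(u)
    d = u - u.mean()
    m2 = np.mean(d ** 2)
    S = np.mean(d ** 3) / m2 ** 1.5   # skewness (0 for a normal)
    K = np.mean(d ** 4) / m2 ** 2     # kurtosis (3 for a normal)
    return n / 6.0 * (S ** 2 + (K - 3.0) ** 2 / 4.0)

u = np.tile([1.0, -1.0], 6)   # S = 0, K = 1, so JB = n/6
jb = jarque_bera(u)
```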