One important assumption of the basic linear regression model is that the error term is uncorrelated with the explanatory variables. If the explanatory variables are in fact correlated with the error term, OLS yields inconsistent estimates of the model parameters. In this chapter we relax this assumption by adding equations to the model that explain where the correlation comes from, and we discuss the conditions that must be fulfilled to obtain consistent estimates.

Introduction

This chapter only scratches the surface of the issues involved in estimating simultaneous equations and should therefore be seen as an introduction to the subject. For the explanatory variables to be correlated with the error term, they must be treated as random, which was not the case in the previous chapters. Treating the explanatory variables as random does not in itself change the properties of the OLS estimators, but it opens up the possibility that they are correlated with the error term.

What are the consequences of an explanatory variable being correlated with the error term? The answer is very similar to the case of measurement errors in the explanatory variables, which make the estimates biased and inconsistent. To see this, consider the following simple macroeconomic model of income determination:
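A form of the model consistent with the description that follows can be sketched as below. It assumes a linear consumption function; the intercept symbol B_0 is introduced here for illustration, while B_1 is the slope coefficient discussed later in the text.

```latex
\begin{align}
Y_t &= C_t + I_t \tag{12.1} \\
C_t &= B_0 + B_1 Y_t + u_t \tag{12.2}
\end{align}
```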

where Yt is national income, It investment, Ct consumption expenditure, and ut a stochastic term. Equation (12.1) is an identity and an equilibrium condition. Hence the model is formulated under the condition of being in equilibrium, and the equation shows how national income is related to consumption and investment in equilibrium. The second equation, given by (12.2), is a behavioral equation, since it describes the behavior of consumption expenditure in this economy. Equations with stochastic error terms are to be considered behavioral. Since Yt and Ct are left hand variables in this system of equations, their values are determined by the model. We therefore say that Yt and Ct are endogenous variables. The model includes one additional variable, investment. It appears only on the right hand side of the system and no equation in the model determines it, so we say that it is an exogenous variable. That is, the value of investment is determined outside the model; it is predetermined. Since investment is determined outside the model, it is also uncorrelated with the stochastic term ut.

The system of equations can be solved for the two endogenous variables to obtain their reduced form, in which each endogenous variable is expressed in terms of the exogenous variable and the stochastic term only. To solve the system for Yt, we substitute (12.2) into (12.1) and solve for Yt. That results in the following expression:
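Assuming that (12.1) and (12.2) take the linear forms Y_t = C_t + I_t and C_t = B_0 + B_1 Y_t + u_t (B_0 is an illustrative intercept symbol) and that B_1 differs from 1, the substitution can be sketched step by step as:

```latex
\begin{align}
Y_t &= (B_0 + B_1 Y_t + u_t) + I_t \notag \\
(1 - B_1)\, Y_t &= B_0 + I_t + u_t \notag \\
Y_t &= \frac{B_0}{1 - B_1} + \frac{1}{1 - B_1}\, I_t + \frac{1}{1 - B_1}\, u_t \tag{12.3}
\end{align}
```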

To obtain the corresponding expression for consumption expenditure, we substitute (12.1) into (12.2) and solve for Ct:
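Under the same assumed linear forms, Y_t = C_t + I_t and C_t = B_0 + B_1 Y_t + u_t with B_1 different from 1, the steps would look roughly as follows:

```latex
\begin{align}
C_t &= B_0 + B_1 (C_t + I_t) + u_t \notag \\
(1 - B_1)\, C_t &= B_0 + B_1 I_t + u_t \notag \\
C_t &= \frac{B_0}{1 - B_1} + \frac{B_1}{1 - B_1}\, I_t + \frac{1}{1 - B_1}\, u_t \tag{12.4}
\end{align}
```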

With this setup it is easy to describe the consequences of estimating (12.2) while ignoring the fact that it is part of a system. From (12.3) we can see that Yt is a function of ut, which means that it is correlated with ut. Since Yt is correlated with ut, we cannot use OLS to estimate the coefficients of (12.2) without bias. Had consumption expenditure not been part of this system, one could have argued that Yt and ut are uncorrelated. Since it is part of the system, (12.3) shows how they are related.
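The correlation can be made explicit with a short calculation. The sketch assumes that the solved expression for income has the form Y_t = (B_0 + I_t + u_t)/(1 - B_1), that investment is uncorrelated with u_t as stated earlier, and that Var(u_t) = \sigma_u^2 (notation introduced here):

```latex
\begin{align*}
\operatorname{Cov}(Y_t, u_t)
  &= \frac{1}{1 - B_1}\,\bigl[\operatorname{Cov}(I_t, u_t) + \operatorname{Var}(u_t)\bigr] \\
  &= \frac{\sigma_u^2}{1 - B_1} \neq 0
\end{align*}
```

The covariance vanishes only if the error term has zero variance, so under these assumptions Y_t and u_t are necessarily correlated.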

It should now be clear that the OLS estimators are biased in small samples because of the correlation between Yt and ut. But are they also inconsistent? That is, if we increase the number of observations to a very large number, will the bias remain? To see this, consider the OLS estimator for B1:
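Assuming the consumption function is the simple regression C_t = B_0 + B_1 Y_t + u_t, the usual decomposition of the OLS slope estimator (written b_1 here, a symbol chosen for illustration) would be:

```latex
\begin{equation}
b_1 = \frac{\sum_t (Y_t - \bar{Y})(C_t - \bar{C})}{\sum_t (Y_t - \bar{Y})^2}
    = B_1 + \frac{\sum_t (Y_t - \bar{Y})\, u_t}{\sum_t (Y_t - \bar{Y})^2}
\tag{12.5}
\end{equation}
```

The second equality follows from C_t - \bar{C} = B_1 (Y_t - \bar{Y}) + (u_t - \bar{u}) together with \sum_t (Y_t - \bar{Y}) = 0.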

This expression was developed in chapter 3 (see (3.12)). Taking the expected value of the estimator gives:
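If the estimator is written as the true parameter plus a ratio of sums, as in the standard simple-regression decomposition (with b_1 as an illustrative symbol for the OLS slope), its expectation would read:

```latex
\begin{equation}
E[b_1] = B_1 + E\!\left[\frac{\sum_t (Y_t - \bar{Y})\, u_t}{\sum_t (Y_t - \bar{Y})^2}\right]
\tag{12.6}
\end{equation}
```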

The problem with the expectation on the right hand side is that Yt is a random variable that is correlated with ut, and for that reason we cannot proceed as in chapter 3. Furthermore, the expectation is a linear operator and a ratio is not a linear function, so in general E[a/b] ≠ E[a]/E[b], which further complicates the problem. Even though this makes it clear that the estimator is no longer unbiased, it does not tell us how the second component on the right hand side of (12.6) behaves in large samples. It can be shown that the limit of the OLS estimator is given by the following expression:
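One way to arrive at such a limit is sketched below. It assumes the model Y_t = C_t + I_t, C_t = B_0 + B_1 Y_t + u_t, that I_t and u_t are uncorrelated with finite variances \sigma_I^2 and \sigma_u^2 (notation introduced here), and that sample moments converge to their population counterparts. The exact notation of (12.7) may differ from this reconstruction.

```latex
\begin{align}
\operatorname{plim}\, b_1
  &= B_1 + \frac{\operatorname{Cov}(Y_t, u_t)}{\operatorname{Var}(Y_t)}
   = B_1 + \frac{\sigma_u^2 / (1 - B_1)}{(\sigma_I^2 + \sigma_u^2) / (1 - B_1)^2} \notag \\
  &= B_1 + (1 - B_1)\, \frac{\sigma_u^2}{\sigma_I^2 + \sigma_u^2}
\tag{12.7}
\end{align}
```

With 0 < B_1 < 1 the second term is positive, so under these assumptions OLS would overstate the marginal propensity to consume no matter how large the sample is.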

Equation (12.7) shows that in the limit the sample estimator still deviates from the population parameter, which means that the bias remains in large samples.
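The persistence of the bias can also be illustrated with a small simulation. The sketch below is not part of the original exposition: it assumes the linear model Y_t = C_t + I_t, C_t = B_0 + B_1 Y_t + u_t with normally distributed investment and error term, and all parameter values and function names are hypothetical choices made for illustration.

```python
import random


def simulate_ols_slope(n, b0=10.0, b1=0.6, mean_i=20.0, sd_i=2.0, sd_u=1.0, seed=0):
    """Simulate n observations of the assumed income model and
    return the OLS slope from regressing consumption on income."""
    rng = random.Random(seed)
    incomes, consumptions = [], []
    for _ in range(n):
        inv = rng.gauss(mean_i, sd_i)      # exogenous investment
        u = rng.gauss(0.0, sd_u)           # error term, independent of investment
        y = (b0 + inv + u) / (1.0 - b1)    # income from the solved system
        c = b0 + b1 * y + u                # structural consumption function
        incomes.append(y)
        consumptions.append(c)
    y_bar = sum(incomes) / n
    c_bar = sum(consumptions) / n
    s_yc = sum((y - y_bar) * (c - c_bar) for y, c in zip(incomes, consumptions))
    s_yy = sum((y - y_bar) ** 2 for y in incomes)
    return s_yc / s_yy


def ols_probability_limit(b1=0.6, sd_i=2.0, sd_u=1.0):
    """Large-sample limit of the OLS slope under the assumed model:
    B1 + (1 - B1) * var(u) / (var(I) + var(u))."""
    return b1 + (1.0 - b1) * sd_u ** 2 / (sd_i ** 2 + sd_u ** 2)


if __name__ == "__main__":
    print("true B1:", 0.6, "limit of OLS:", ols_probability_limit())
    for n in (100, 10_000, 200_000):
        print(n, simulate_ols_slope(n))
```

As the sample grows, the simulated slope should settle near the value returned by `ols_probability_limit` and not near the true B1, which is what inconsistency means in this setting.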

Correlation between the error term and the explanatory variables in a single equation model estimated by OLS leads to:

• Biased and inconsistent parameter estimates

• Invalid tests of hypothesis

• Biased and inconsistent forecasts
