Research Methods in Anthropology: Qualitative and Quantitative Approaches

HOW REGRESSION WORKS

To give you an absolutely clear idea of how the regression formula works, table 21.16 shows all the predictions along the regression line for the data in table 21.14.

Table 21.16 Regression Predictions for the Dependent Variable in Table 21.14

| For the country of | Where the infant mortality rate in 2008 was | Predict that the TFR will be | And compare that to the actual TFR in table 20.8 |
| --- | --- | --- | --- |
| Armenia | 22.2 | 1.20 + .056(22.2) = 2.44 | 1.79 |
| Chad | 48.7 | 1.20 + .056(48.7) = 3.93 | 5.78 |
| Ghana | 67.0 | 1.20 + .056(67.0) = 4.95 | 4.00 |
| El Salvador | 17.5 | 1.20 + .056(17.5) = 2.18 | 2.22 |
| Iran | 24.2 | 1.20 + .056(24.2) = 2.56 | 1.74 |
| Latvia | 8.3 | 1.20 + .056(8.3) = 1.66 | 1.48 |
| Namibia | 27.2 | 1.20 + .056(27.2) = 2.72 | 3.07 |
| Panama | 15.7 | 1.20 + .056(15.7) = 2.08 | 2.41 |
| Slovenia | 3.6 | 1.20 + .056(3.6) = 1.40 | 1.47 |
| Suriname | 20.5 | 1.20 + .056(20.5) = 2.35 | 2.29 |
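The predictions in table 21.16 can be reproduced with a few lines of Python. This is only a sketch: the intercept (1.20) and slope (.056) and the data come from the table, while the names `INTERCEPT`, `SLOPE`, `DATA`, and `predict_tfr` are made up for illustration.

```python
# Regression line from the text: predicted TFR = 1.20 + 0.056 * infant mortality rate.
INTERCEPT = 1.20
SLOPE = 0.056

# country -> (infant mortality rate in 2008, actual TFR), copied from table 21.16.
DATA = {
    "Armenia": (22.2, 1.79),
    "Chad": (48.7, 5.78),
    "Ghana": (67.0, 4.00),
    "El Salvador": (17.5, 2.22),
    "Iran": (24.2, 1.74),
    "Latvia": (8.3, 1.48),
    "Namibia": (27.2, 3.07),
    "Panama": (15.7, 2.41),
    "Slovenia": (3.6, 1.47),
    "Suriname": (20.5, 2.29),
}


def predict_tfr(imr):
    """Return the TFR predicted by the regression line for one infant mortality rate."""
    return INTERCEPT + SLOPE * imr


# Round to two decimals, as the table does.
predictions = {country: round(predict_tfr(imr), 2) for country, (imr, _) in DATA.items()}

for country, (imr, actual) in DATA.items():
    print(f"{country:12s} IMR={imr:5.1f}  predicted={predictions[country]:.2f}  actual={actual:.2f}")
```

Each printed row should match the corresponding row of table 21.16, with the prediction and the actual TFR side by side.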

We now have two predictors of TFR: (1) the mean TFR, which is our best guess when we have no data about some independent variable like infant mortality, and (2) the values produced by the regression equation when we do have information about something like infant mortality.

Each of these predictors produces a certain amount of error, or variance, which is the difference between the predicted number for the dependent variable and the actual measurement. This is also called the residual—that is, what’s left over after making your prediction using the regression equation. (To anticipate the discussion of multiple regression in chapter 22: The idea in multiple regression is to use two or more independent variables in order to reduce the size of the residuals.)

You’ll recall from chapter 20, in the section on variance and the standard deviation, that in the case of the mean, the total variance is the average of the squared deviations of the observations from the mean, $\sum (x - \bar{x})^2 / n$. In the case of the regression line predictors, the variance is the sum of the squared deviations from the regression line. Table 21.17 compares these two sets of errors, or variances, for the data in table 21.14.
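Written side by side, the two kinds of error look like this. The labels $E_{\text{old}}$ and $E_{\text{new}}$ and the symbol $\hat{y}_i$ for a regression prediction are notation assumed here for illustration; $y_i$ is a country's actual TFR, $\bar{y}$ is the mean TFR, and $x_i$ is the country's infant mortality rate.

```latex
E_{\text{old}} = \sum_{i=1}^{n} (y_i - \bar{y})^2,
\qquad
E_{\text{new}} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2,
\qquad
\hat{y}_i = 1.20 + 0.056\,x_i
```

Dividing either sum by $n$ turns it into an average. Because both sums run over the same $n = 10$ countries, comparing the sums or comparing the averages leads to the same proportion.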

Table 21.17 Comparison of the Error Produced by Guessing the Mean TFR in Table 21.14 and the Error Produced by Applying the Regression Equation for Each Guess

| Country | TFR (y) | Old error (y − ȳ)² | Prediction using the regression equation | New error (y − the prediction using the regression equation)² |
| --- | --- | --- | --- | --- |
| Armenia | 1.79 | 0.71 | 2.44 | 0.42 |
| Chad | 5.78 | 9.23 | 3.93 | 3.42 |
| El Salvador | 2.22 | 0.17 | 2.18 | 0.002 |
| Ghana | 4.00 | 1.88 | 4.95 | 0.90 |
| Iran | 1.74 | 0.79 | 2.56 | 0.67 |
| Latvia | 1.48 | 1.32 | 1.66 | 0.03 |
| Namibia | 3.07 | 0.19 | 2.72 | 0.12 |
| Panama | 2.41 | 0.05 | 2.08 | 0.11 |
| Slovenia | 1.47 | 1.35 | 1.40 | 0.005 |
| Suriname | 2.29 | 0.12 | 2.35 | 0.004 |
| | | Σ = 15.81 | | Σ = 5.68 |
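As a rough check on table 21.17, both error columns can be recomputed from the TFR values and the rounded predictions. The sketch below uses made-up names (`TFR`, `PREDICTED`, `old_error`, `new_error`). Recomputed figures need not match hand-rounded table entries to the last digit. On these numbers the new-error total comes out near 5.69, but the old-error entry for Chad recomputes to roughly 9.95 rather than the printed 9.23, which puts the recomputed old-error total nearer 16.5 than 15.81.

```python
# Actual TFR and the rounded regression predictions, copied from table 21.17.
TFR = {
    "Armenia": 1.79, "Chad": 5.78, "El Salvador": 2.22, "Ghana": 4.00,
    "Iran": 1.74, "Latvia": 1.48, "Namibia": 3.07, "Panama": 2.41,
    "Slovenia": 1.47, "Suriname": 2.29,
}
PREDICTED = {
    "Armenia": 2.44, "Chad": 3.93, "El Salvador": 2.18, "Ghana": 4.95,
    "Iran": 2.56, "Latvia": 1.66, "Namibia": 2.72, "Panama": 2.08,
    "Slovenia": 1.40, "Suriname": 2.35,
}

# Best guess with no information about infant mortality: the mean TFR.
mean_tfr = sum(TFR.values()) / len(TFR)

# Old error: squared deviation of each actual TFR from the mean.
old_error = {c: (y - mean_tfr) ** 2 for c, y in TFR.items()}
# New error: squared deviation of each actual TFR from its regression prediction.
new_error = {c: (y - PREDICTED[c]) ** 2 for c, y in TFR.items()}

old_total = sum(old_error.values())
new_total = sum(new_error.values())

print(f"mean TFR = {mean_tfr:.3f}")
for country in TFR:
    print(f"{country:12s} old={old_error[country]:.3f}  new={new_error[country]:.3f}")
print(f"old error total = {old_total:.2f}")
print(f"new error total = {new_total:.2f}")
```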

We now have all the information we need for a true PRE measure of association between two interval variables. Recall the formula for a PRE measure: the old error minus the new error, divided by the old error. For our example in table 21.14:

$$\text{PRE} = \frac{\text{old error} - \text{new error}}{\text{old error}} = \frac{15.81 - 5.68}{15.81} = 0.64$$

In other words: the proportionate reduction of error in guessing the TFR in table 21.14, given that you know the distribution of infant mortality rates and can apply a regression equation, compared to just guessing the mean of TFR, is 0.64, or 64%.
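The PRE arithmetic is short enough to spell out in code. This sketch takes the two totals exactly as printed in table 21.17; the function name `pre` and the constant names are made up for illustration.

```python
OLD_ERROR = 15.81  # total squared error when guessing the mean TFR (table 21.17)
NEW_ERROR = 5.68   # total squared error when using the regression equation (table 21.17)


def pre(old_error, new_error):
    """Proportionate reduction of error: (old error - new error) / old error."""
    return (old_error - new_error) / old_error


reduction = pre(OLD_ERROR, NEW_ERROR)
print(f"PRE = {reduction:.2f}, or {reduction:.0%}")  # prints "PRE = 0.64, or 64%"
```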

This quantity is usually referred to as r-squared (written r²), or the amount of variance accounted for by the independent variable. It is also called the coefficient of determination because it tells us how much of the variance in the dependent variable is predictable from (determined by) the scores of the independent variable. The Pearson product-moment correlation, written as r, is the square root of this measure, or, in this instance, 0.80. (We calculated r in table 21.15 by applying formula 21.22 and got r = 0.81. The difference is rounding error when we do these calculations by hand. You won’t get this error when you use a computer to do the calculations.)
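To see what a computer gives, here is a minimal sketch that computes r directly from the infant mortality and TFR columns. Formula 21.22 is not reproduced in this section, so the sketch assumes the standard deviation-score definition of Pearson's r; the function name `pearson_r` is made up for illustration. It also takes the square root of the hand-computed 0.64 for comparison.

```python
import math

# Infant mortality rate in 2008 and actual TFR, in the row order of table 21.16.
IMR = [22.2, 48.7, 67.0, 17.5, 24.2, 8.3, 27.2, 15.7, 3.6, 20.5]
TFR = [1.79, 5.78, 4.00, 2.22, 1.74, 1.48, 3.07, 2.41, 1.47, 2.29]


def pearson_r(xs, ys):
    """Pearson's r: sum of cross-products of deviations over the root of the
    product of the two sums of squared deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(IMR, TFR)
r_by_hand = math.sqrt(0.64)  # square root of the hand-computed PRE

print(f"r from the raw data   = {r:.2f}")          # prints 0.81
print(f"r-squared (raw data)  = {r * r:.3f}")
print(f"r from the PRE of .64 = {r_by_hand:.2f}")  # prints 0.80
```

Worked from the unrounded data, r-squared lands close to 0.65 rather than 0.64, which is why the square root comes out at 0.81.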

 