CALCULATION OF P VALUES
Although the investigator is better off relying on estimation rather than on tests of statistical significance for inference, for completeness we give the basic formulas from which traditional P values can be derived. These P values test the null hypothesis that exposure is not related to disease.
Risk Data
For risk data, we use the following expansion of the notation used earlier in the chapter:
| | Exposed | Unexposed | Total |
|---|---|---|---|
| Cases | a | b | M1 |
| Noncases | c | d | M0 |
| People at risk | N1 | N0 | T |
The P value testing the null hypothesis that exposure is not related to disease can be obtained from the following equation for χ:

$$\chi = \frac{a - \dfrac{N_1 M_1}{T}}{\sqrt{\dfrac{N_1 N_0 M_1 M_0}{T^2 (T - 1)}}} \tag{9-7}$$
For the data in Table 9-1, Equation 9-7 gives χ = -4.78.
The P value that corresponds to this χ statistic must be obtained from tables of the standard normal distribution (see Appendix). For a χ of -4.78 (minus sign indicates only that the exposed group had a lower risk than the unexposed group), the P value is very small (roughly 0.0000009). The Appendix tabulates values of χ only from -3.99 to +3.99.
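When χ falls outside the range of a printed table, the tail area can be computed directly. The following is a minimal sketch, not the text's own procedure: it assumes that Equation 9-7 is the usual single-table statistic, (a - N1·M1/T) divided by the square root of N1·N0·M1·M0 / (T²(T - 1)), and it uses hypothetical counts rather than the data of Table 9-1. The function names `chi_risk` and `lower_tail` are illustrative.

```python
import math


def chi_risk(a, b, c, d):
    """Chi statistic for a 2x2 table of risk data (assumed form of Equation 9-7)."""
    m1 = a + b   # total cases
    m0 = c + d   # total noncases
    n1 = a + c   # exposed people at risk
    n0 = b + d   # unexposed people at risk
    t = n1 + n0  # all people at risk
    expected = n1 * m1 / t                             # exposed cases expected under the null
    variance = n1 * n0 * m1 * m0 / (t ** 2 * (t - 1))  # null variance of a
    return (a - expected) / math.sqrt(variance)


def lower_tail(z):
    """Area under the standard normal curve below z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


# Hypothetical counts for illustration only (not Table 9-1).
chi = chi_risk(a=10, b=30, c=90, d=70)
print("chi =", round(chi, 2))
print("lower-tail P =", lower_tail(chi))

# Tail area for the chi value reported in the text.
print("lower-tail P for chi = -4.78:", lower_tail(-4.78))
```

For χ = -4.78 the lower-tail area comes out a little under 0.000001, in line with the rough value quoted above. `math.erfc` is used instead of subtracting a cumulative probability from 1 because it keeps precision in the far tails.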
Incidence Rate Data
For incidence rate data, we use the following notation, which is an expanded version of the table we used earlier:
| | Exposed | Unexposed | Total |
|---|---|---|---|
| Cases | a | b | M |
| Person-time | PT1 | PT0 | T |
for which we can use the following equation to calculate χ:

$$\chi = \frac{a - \dfrac{PT_1 \, M}{T}}{\sqrt{\dfrac{M \, PT_1 \, PT_0}{T^2}}}$$
Applying this equation to the data of Table 9-2 gives χ = -8.92.
This χ is so large in absolute value that the P value cannot be readily calculated. The P value corresponding to a χ of -8.92 is much smaller than $10^{-20}$, implying that the data are not readily consistent with a chance explanation.
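The rate-data statistic can be sketched in the same way. This example assumes the usual form for person-time data, in which the number of exposed cases a is compared with its null expectation PT1·M/T and the null variance is M·PT1·PT0/T². The function name `chi_rate` and the counts are hypothetical and are not the data of Table 9-2.

```python
import math


def chi_rate(a, b, pt1, pt0):
    """Chi statistic for incidence rate data (assumed person-time form)."""
    m = a + b      # total cases
    t = pt1 + pt0  # total person-time
    expected = pt1 * m / t             # exposed cases expected under the null
    variance = m * pt1 * pt0 / t ** 2  # null variance of a
    return (a - expected) / math.sqrt(variance)


# Hypothetical counts and person-years for illustration only (not Table 9-2).
chi = chi_rate(a=20, b=60, pt1=1000.0, pt0=1000.0)
print("chi =", round(chi, 2))
```

As with risk data, a negative χ indicates only that the exposed group had the lower rate.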
Case-Control Data
For case-control data, we can apply Equation 9-7 to the data in Table 9-3.
From the appendix table, we see that this result corresponds to a P value of 0.00022.