# CALCULATION OF P VALUES

Although the investigator is better off relying on estimation rather than tests of statistical significance for inference, for completeness we give the basic formulas from which traditional P values can be derived. These P values test the null hypothesis that exposure is not related to disease.

## Risk Data

For risk data, we use the following expansion of the notation used earlier in the chapter:

| | Exposed | Unexposed | Total |
|---|---|---|---|
| Cases | $a$ | $b$ | $M_1$ |
| Noncases | $c$ | $d$ | $M_0$ |
| People at risk | $N_1$ | $N_0$ | $T$ |

The P value testing the null hypothesis that exposure is not related to disease can be obtained from the following equation for $\chi$:

$$
\chi = \frac{a - \dfrac{N_1 M_1}{T}}{\sqrt{\dfrac{N_1 N_0 M_1 M_0}{T^2 (T - 1)}}} \tag{9-7}
$$

For the data in Table 9-1, Equation 9-7 gives $\chi = -4.78$. The P value that corresponds to this $\chi$ statistic must be obtained from tables of the standard normal distribution (see Appendix). For a $\chi$ of $-4.78$ (the minus sign indicates only that the exposed group had a lower risk than the unexposed group), the P value is very small (roughly 0.0000009). The Appendix tabulates values of $\chi$ only from $-3.99$ to $+3.99$.
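As a rough illustration of the calculation described above, the following Python sketch computes $\chi$ from the four cell counts of the risk-data table and converts it to a tail area of the standard normal distribution. It assumes the statistic has the standard form $\chi = (a - N_1 M_1 / T) / \sqrt{N_1 N_0 M_1 M_0 / [T^2 (T - 1)]}$. The function names and the counts in the example are hypothetical; they are not the data of Table 9-1.

```python
import math


def chi_risk(a, b, c, d):
    """Chi statistic for a 2x2 table of risk data (assumed form of Equation 9-7).

    a, b: exposed and unexposed cases; c, d: exposed and unexposed noncases.
    """
    n1, n0 = a + c, b + d      # people at risk: exposed, unexposed
    m1, m0 = a + b, c + d      # total cases, total noncases
    t = n1 + n0                # everyone at risk
    expected = n1 * m1 / t     # exposed cases expected under the null hypothesis
    variance = n1 * n0 * m1 * m0 / (t ** 2 * (t - 1))
    return (a - expected) / math.sqrt(variance)


def normal_tail_area(chi, two_sided=False):
    """Area of the standard normal distribution beyond |chi|, in one or both tails."""
    one_tail = 0.5 * math.erfc(abs(chi) / math.sqrt(2))
    return 2 * one_tail if two_sided else one_tail


# Hypothetical counts, chosen only to exercise the functions.
chi = chi_risk(a=10, b=20, c=90, d=80)
print(f"chi = {chi:.2f}")
print(f"one-tail area = {normal_tail_area(chi):.4f}")
print(f"two-tail area = {normal_tail_area(chi, two_sided=True):.4f}")
```

Because `math.erfc` evaluates the tail directly, the sketch also handles values of $\chi$ outside the range covered by the Appendix table. For $\chi = -4.78$ the one-tail area it returns is a little under 0.000001.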

## Incidence Rate Data

For incidence rate data, we use the following notation, which is an expanded version of the table we used earlier:

| | Exposed | Unexposed | Total |
|---|---|---|---|
| Cases | $a$ | $b$ | $M$ |
| Person-time | $PT_1$ | $PT_0$ | $T$ |

With this notation, we can use the following equation to calculate $\chi$:

$$
\chi = \frac{a - \dfrac{PT_1}{T} M}{\sqrt{\dfrac{M \, PT_1 \, PT_0}{T^2}}}
$$

Applying this equation to the data of Table 9-2 gives $\chi = -8.92$. This $\chi$ is so large in absolute value that the P value cannot be readily calculated. The P value corresponding to a $\chi$ of $-8.92$ is much smaller than $10^{-20}$, implying that the data are not readily consistent with a chance explanation.
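The incidence rate calculation can be sketched in the same way. The Python function below assumes the statistic has the standard form $\chi = (a - M \, PT_1 / T) / \sqrt{M \, PT_1 \, PT_0 / T^2}$. The function name and the example inputs are hypothetical; they are not the data of Table 9-2.

```python
import math


def chi_rate(a, b, pt1, pt0):
    """Chi statistic for incidence rate data (assumed standard form).

    a, b: exposed and unexposed cases; pt1, pt0: exposed and unexposed person-time.
    """
    m = a + b                  # total cases
    t = pt1 + pt0              # total person-time
    expected = m * pt1 / t     # exposed cases expected under the null hypothesis
    variance = m * pt1 * pt0 / t ** 2
    return (a - expected) / math.sqrt(variance)


# Hypothetical inputs: 12 exposed and 30 unexposed cases, equal person-time.
chi = chi_rate(a=12, b=30, pt1=5000.0, pt0=5000.0)
print(f"chi = {chi:.2f}")
```

The variance term reflects the fact that, under the null hypothesis and given $M$ total cases, the number of exposed cases is binomial with probability $PT_1 / T$, which gives a variance of $M (PT_1 / T)(PT_0 / T)$.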

## Case-Control Data

For case-control data, we can apply Equation 9-7 to the data in Table 9-3, with controls taking the place of noncases in the notation. From the Appendix table, we see that the resulting $\chi$ corresponds to a P value of 0.00022.