# Random Error and the Role of Statistics

Statistics plays two main roles in the analysis of epidemiologic data: first, to measure variability in the data in an effort to assess the role of chance, and second, to estimate effects after correcting for biases such as confounding. This chapter concentrates on the assessment of variability. The use of statistical approaches to control confounding is discussed in Chapters 10 and 12.

An epidemiologic study can be viewed as an exercise in measurement. As in any measurement, the goal is to obtain an accurate result, with as little error as possible. Both systematic error and random error can distort the measurement process. Chapter 7 describes the primary categories of systematic error. The error that remains after systematic error is eliminated is *random error.* Random error is nothing more than variability in the data that cannot be readily explained.

Sometimes random error stems from a genuinely random process, but often it does not. In randomized trials, some of the variability in the data reflects the random assignment of subjects to the study groups. In most epidemiologic studies, however, there is no random assignment to study groups. For example, in a cohort study that compares the outcome of pregnancy among women who drink heavily chlorinated water with the outcome among women who drink bottled water, it is not chance but the decisions or circumstances of the women themselves that determine the cohort in which each woman is grouped. The individual assignments to categories of water chlorination are not random; nevertheless, some of the variability in the outcome is considered to be random error. Much of this variation may reflect hidden biases and presumably could be accounted for by factors other than drinking water that affect the outcome of pregnancy. These factors may not have been measured among these women, or may not even have been discovered.
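The point about randomized trials can be illustrated with a small simulation. The sketch below is not drawn from any real study: the cohort size, the number of adverse outcomes, and the function name `simulate_null_trial` are all assumptions chosen for illustration. Each subject's outcome is fixed in advance, so the exposure has no effect at all, and the only thing that changes from one repetition to the next is the random assignment of subjects to the two groups.

```python
import random
import statistics


def simulate_null_trial(n_subjects=200, n_cases=20, n_repeats=1000, seed=0):
    """Repeatedly randomize a fixed cohort into two equal groups.

    Outcomes are fixed before assignment (hypothetical numbers), so any
    difference in risk between the groups arises from the random
    assignment alone.
    """
    rng = random.Random(seed)
    # 1 = adverse outcome, 0 = no adverse outcome; fixed for every repetition
    outcomes = [1] * n_cases + [0] * (n_subjects - n_cases)
    half = n_subjects // 2
    risk_differences = []
    for _ in range(n_repeats):
        rng.shuffle(outcomes)  # random assignment to group A or group B
        risk_a = sum(outcomes[:half]) / half
        risk_b = sum(outcomes[half:]) / (n_subjects - half)
        risk_differences.append(risk_a - risk_b)
    return risk_differences


diffs = simulate_null_trial()
print(f"mean risk difference:  {statistics.mean(diffs):.4f}")
print(f"standard deviation:    {statistics.stdev(diffs):.4f}")
print(f"largest absolute diff: {max(abs(d) for d in diffs):.4f}")
```

Under these assumptions the risk differences should center near zero yet vary from one randomization to the next. That spread is random error arising from a known random process. In a nonrandomized study such as the drinking water example, no such process exists, and the comparable variability is attributed to unmeasured or unknown factors instead.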