UNCONFOUNDED EFFECT ESTIMATES AND CONFIDENCE INTERVALS FROM STRATIFIED DATA
How does stratification control confounding? Confounding, as explained in Chapter 7, comes from the mixing of the effect of the confounding variable with the effect of the exposure. If a variable that is a risk factor for the disease is associated with the exposure in the study population, confounding will result. Confounding occurs because the comparison of exposed with unexposed people is also a comparison of those with differing distributions of the confounding factor. In the trigeminal neuralgia example, comparing men with women was also a comparison of younger people (ie, men in the study) with older people (ie, women in the study). Stratification creates subgroups in which the confounding factor either does not vary at all or does not vary much. Stratification by nominal scale variables, such as sex or country of birth, theoretically results in strata in which the variables of sex or country of birth do not vary; in actuality, there may still be some residual variability because some people may be misclassified into the wrong strata. Stratification by a continuously measured variable, such as age, will result in age categories within which age can vary, although over a restricted range. With either kind of variable, nominal scale or continuous, a stratified analysis proceeds under the assumption that within the categories of the stratification variable there is no meaningful variability of the potential confounding factor. If the stratification variable is continuous, such as age, the more categories that are used to form strata, the less variability by age there can be within those categories.
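The last point, that finer categories leave less room for the stratification variable to vary within a stratum, can be illustrated with a minimal sketch. The ages, the category widths, and the function name `max_within_stratum_range` are all hypothetical and chosen only for illustration; they are not data from the trigeminal neuralgia example.

```python
from collections import defaultdict

def max_within_stratum_range(ages, width):
    """Group ages into categories of the given width (in years) and
    return the largest spread of age found inside any one category."""
    strata = defaultdict(list)
    for age in ages:
        # Integer division assigns each age to a category, e.g. with
        # width 10, ages 40 through 49 all fall in category 4.
        strata[age // width].append(age)
    return max(max(members) - min(members) for members in strata.values())

# Hypothetical ages, for illustration only.
ages = [23, 27, 31, 38, 42, 45, 49, 53, 58, 61, 64, 67, 72, 79]

for width in (20, 10, 5):
    spread = max_within_stratum_range(ages, width)
    print(f"{width}-year categories: largest within-stratum age range = {spread}")
```

Narrower categories reduce the residual variability of age within strata, but they also leave fewer subjects in each stratum, so the choice of category width is a trade-off rather than a matter of always using the finest grouping.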
In some stratified analyses, the end result is nothing more than the presentation of the data within each of the strata, with rates, risks, or effect estimates reported for each stratum. Often, however, the investigator hopes to summarize the relation between exposure and disease over the strata. The methods that do so compare exposed and unexposed subjects within each stratum and then aggregate the information from these comparisons over all the strata. The two basic approaches to aggregating the information over strata are referred to as pooling and standardization.
Pooling is one method for obtaining unconfounded estimates of effect across a set of strata. When pooling is used, it comes with an important assumption: that the effect being estimated is constant across the strata. With this assumption, each stratum can be viewed as providing a separate estimate, referred to as a stratum-specific estimate, of the overall effect. The principle behind pooling is to take an average of these stratum-specific estimates of effect. The average is taken as a weighted average, which is a method of averaging that assigns more weight to some values than to others. In pooling, the weights are assigned so that the strata that provide the most information, which is to say the strata with the most data, get the most weight. This weighting is built directly into the formulas for obtaining the pooled estimate. When the data do not conform to the assumption that the effect is constant across all strata, pooling is not applicable. In that situation, it is still possible to obtain an unconfounded summary estimate of the effect over the strata using standardization, which is discussed later.
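The idea of pooling as a weighted average of stratum-specific estimates can be sketched in a few lines. The sketch below pools risk differences from two strata, using weights of the Mantel-Haenszel form (the product of the exposed and unexposed group sizes divided by the stratum total). That choice of weights is an assumption made here for illustration, since the passage above does not give a formula, and the counts and function names are hypothetical.

```python
def pooled_risk_difference(strata):
    """Weighted average of stratum-specific risk differences.

    Each stratum is a tuple (a, n1, b, n0):
      a  = cases among the exposed,   n1 = number exposed
      b  = cases among the unexposed, n0 = number unexposed
    The weight n1 * n0 / (n1 + n0) is an assumed, Mantel-Haenszel-style
    weight; it grows with the amount of data in the stratum.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for a, n1, b, n0 in strata:
        risk_difference = a / n1 - b / n0   # stratum-specific estimate
        weight = n1 * n0 / (n1 + n0)        # more data, more weight
        weighted_sum += weight * risk_difference
        weight_total += weight
    return weighted_sum / weight_total

def crude_risk_difference(strata):
    """Risk difference after collapsing all strata into a single table."""
    a = sum(s[0] for s in strata)
    n1 = sum(s[1] for s in strata)
    b = sum(s[2] for s in strata)
    n0 = sum(s[3] for s in strata)
    return a / n1 - b / n0

# Hypothetical counts: the risk difference is 0.05 in both strata.
strata = [
    (10, 100, 5, 100),    # stratum 1: risks 0.10 vs 0.05
    (40, 200, 60, 400),   # stratum 2: risks 0.20 vs 0.15
]

print(f"pooled risk difference: {pooled_risk_difference(strata):.4f}")
print(f"crude risk difference:  {crude_risk_difference(strata):.4f}")
```

In these made-up data the effect is the same in both strata, so the pooled estimate equals that common value, whereas the crude estimate from the collapsed table differs because the exposed and unexposed groups are distributed differently across the strata.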