# Example: Confidence Limits for a Risk or Prevalence

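Applying the interval formula, estimate ± Z × SE, to the example of 20 cases of influenza among 100 people, we can calculate a 90% confidence interval for the risk as follows. The risk estimate is 20/100 = 0.2, the multiplier for 90% confidence is Z = 1.645, and the standard error of a risk under the binomial model, with *a* cases among *N* people, is $\sqrt{a(N-a)/N^3}$. The lower bound would be

$$
R_L = 0.2 - 1.645 \sqrt{\frac{20 \times 80}{100^3}} = 0.2 - 1.645 \times 0.04 = 0.13
$$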

The upper bound can be obtained by substituting a plus sign for the minus sign in the calculation. Making this substitution gives a value of 0.27 for the upper bound. With 20 influenza cases in a population of 100 at risk, the 90% confidence interval for the risk estimate of 0.2 is 0.13 to 0.27.
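The arithmetic in this example can be checked with a short Python sketch. It assumes the usual binomial standard error, sqrt(a(N − a)/N³), and the multiplier Z = 1.645 for 90% confidence; the function name `risk_confidence_interval` is illustrative only and does not come from the text.

```python
import math


def risk_confidence_interval(cases, people, z=1.645):
    """Approximate confidence interval for a risk (binomial model).

    z=1.645 corresponds to 90% confidence; the standard error
    sqrt(a * (N - a) / N**3) is an assumption of this sketch.
    """
    risk = cases / people
    se = math.sqrt(cases * (people - cases) / people**3)
    return risk - z * se, risk + z * se


# 20 influenza cases among 100 people at risk
lower, upper = risk_confidence_interval(20, 100)
print(f"{lower:.2f} to {upper:.2f}")  # 0.13 to 0.27
```

Passing a different multiplier, such as `z=1.96`, would give the corresponding 95% interval.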

# Incidence Rate Data

For incidence rate data, we use *a* to represent cases and *PT* to represent person-time. Although the notation is similar to that for risk data, incidence rate data differ conceptually and statistically from risk data, which are described by the binomial model. For binomial data, the number of cases cannot exceed the total number of people at risk. In contrast, for rate data, the denominator does not relate to a specific number of people but rather to a time total. We do not know from the value of the person-time denominator, *PT*, how many people might have contributed time.

For statistical purposes, we invoke a model for incidence rate data that allows the number of cases to vary without any upper limit. It is the Poisson model.

We take *a/PT* as the estimate of the disease rate, and we calculate a confidence interval for the rate using Equation 9-1 with the following standard error:

$$
\mathrm{SE}(R) = \sqrt{\frac{a}{PT^2}} = \frac{\sqrt{a}}{PT}
$$

# Do Rates Always Describe Population Samples?

Some theoreticians propose that if a rate or risk is measured in an entire population, there is no point in calculating a confidence interval, because a confidence interval is intended to convey only the imprecision that comes from taking a sample from a population. According to this reasoning, if the entire population is measured instead of a sample, there is no sampling error to worry about and therefore no confidence interval to compute. There is another side to this argument, however. Others hold that even if the rate or risk is measured in an entire population, that population represents only a sample of people from a hypothetical superpopulation. In other words, the study population, even if enumerated completely without any sampling, represents merely a biologic sample of a larger set of people; therefore, a confidence interval is justified.

The validity of each argument may depend on the context. If one is measuring voter preference, it is the actual population in which one is interested, and the first argument is reasonable. For biologic phenomena, however, what happens in an actual population may be of less interest than the biologic norm that describes the superpopulation. Therefore, for biologic phenomena the second argument is more compelling.

# Example: Confidence Limits for an Incidence Rate

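Consider as an example a cancer incidence rate estimated from a registry that reports 8 cases of astrocytoma among 85,000 person-years at risk. The rate is 8/85,000 person-years, or 9.4 cases/100,000 person-years. With Z = 1.645 for 90% confidence, a lower 90% confidence limit for the rate would be estimated as

$$
R_L = \frac{8}{85{,}000} - 1.645 \times \frac{\sqrt{8}}{85{,}000} = 3.9/100{,}000 \ \text{person-years}
$$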

Using the plus sign instead of the minus sign in the equation gives 14.9/100,000 person-years for the upper bound.
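The same calculation can be sketched in Python as a check on the arithmetic. This sketch assumes the Poisson standard error sqrt(a)/PT and Z = 1.645 for 90% confidence; the function name `rate_confidence_interval` is hypothetical and used only for illustration.

```python
import math


def rate_confidence_interval(cases, person_time, z=1.645):
    """Approximate confidence interval for an incidence rate (Poisson model).

    z=1.645 corresponds to 90% confidence; the standard error
    sqrt(a) / PT is an assumption of this sketch.
    """
    rate = cases / person_time
    se = math.sqrt(cases) / person_time
    return rate - z * se, rate + z * se


# 8 astrocytoma cases among 85,000 person-years
lower, upper = rate_confidence_interval(8, 85_000)

# Express the limits per 100,000 person-years
print(f"{lower * 1e5:.1f} to {upper * 1e5:.1f} per 100,000 person-years")
# 3.9 to 14.9 per 100,000 person-years
```

Because the standard error depends only on the case count and the person-time total, the function needs no information about how many people contributed that time.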