# Stratified Sampling Formulas Illustrated with an Example

Suppose the sampling frame (the group from which people will be sampled) consists of 10,000 members, each of whom belongs to one of three strata, such that 60 percent of the frame is in stratum A, 10 percent is in stratum B, and the remaining 30 percent is in stratum C. Table E.1 shows the resulting size of each stratum in the second column. The third column shows the proportion that the number of people in each stratum represents within the sampling frame (e.g., there are 6,000 people in stratum A and 10,000 people in the sampling frame; 6,000/10,000 = 0.6, or 60 percent).

Table E.1. Stratified Sampling Example, in Which 3 Percent of Each Subpopulation Is Selected

| Strata | Number in Sampling Frame | Proportion of Sampling Frame | Number in Sample | Proportion of Sample | Probability of Selection |
|--------|--------------------------|------------------------------|------------------|----------------------|--------------------------|
| A      | 6,000                    | 0.6                          | 180              | 0.6                  | 0.03                     |
| B      | 1,000                    | 0.1                          | 30               | 0.1                  | 0.03                     |
| C      | 3,000                    | 0.3                          | 90               | 0.3                  | 0.03                     |
| Total  | 10,000                   | 1.0                          | 300              | 1.0                  | 0.03                     |

Now suppose that we have chosen to randomly select 3 percent of the people from each stratum (i.e., everyone in the stratum has a 0.03 probability of being selected), as shown in the far right column. Column 4 shows how many people need to be randomly sampled from each stratum to achieve the 0.03 probability of selection (e.g., 6,000 × 0.03 = 180, so 180 need to be selected from stratum A). Column 5 shows the proportion of the resulting sample from each stratum relative to the total number of people selected (e.g., out of the total sample of 300, 180 people [or 60 percent] will be members of stratum A; 180/300 = 0.6 or 60 percent). Note that the proportion of the sampling frame matches the proportion of the sample in the example. This example illustrates one type of stratified sampling: stratified random sampling using proportional allocation to strata.
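The arithmetic behind columns 3 through 5 of Table E.1 can be reproduced in a few lines. The following is a minimal Python sketch; the variable names (`frame_counts`, `selection_probability`, and so on) are illustrative and not part of the original example.

```python
# Stratum sizes from Table E.1 (sampling frame of 10,000 people).
frame_counts = {"A": 6000, "B": 1000, "C": 3000}
selection_probability = 0.03  # 3 percent is drawn from every stratum

frame_total = sum(frame_counts.values())

# Column 4: number to sample from each stratum.
# round() guards against floating-point error in the multiplication.
sample_counts = {
    stratum: round(count * selection_probability)
    for stratum, count in frame_counts.items()
}
sample_total = sum(sample_counts.values())

# Column 3: each stratum's share of the sampling frame.
frame_proportions = {s: c / frame_total for s, c in frame_counts.items()}

# Column 5: each stratum's share of the sample.
sample_proportions = {s: c / sample_total for s, c in sample_counts.items()}

for s in frame_counts:
    print(s, frame_counts[s], frame_proportions[s],
          sample_counts[s], sample_proportions[s])
print("Total", frame_total, sample_total)
```

Because the same selection probability is applied to every stratum, the entries of `frame_proportions` and `sample_proportions` agree stratum by stratum, which is the defining property of proportional allocation.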

We now detail the mathematics for this type of stratified sample. The sampling frame consists of $N$ people, each of whom is a member of one of $G$ well-defined, mutually exclusive subgroups of interest (strata), indexed by $g \in \{1, \ldots, G\}$. Let $N_g$ represent the size of stratum $g$ in the sampling frame. Since the strata are mutually exclusive, the individual strata sizes sum to $N$, i.e., $\sum_{g=1}^{G} N_g = N$, and each stratum represents $N_g/N$ of the total sampling frame. Further, suppose that a sample of size $n$ from the sampling frame is desired, such that each stratum represents $N_g/N$ of the total sample; i.e., the proportion of each stratum in the sample matches the proportion of each stratum in the sampling frame. The sample is generated by drawing a simple random sample of size $n_g = nN_g/N$ from each stratum, with each member of stratum $g$ having probability $n_g/N_g = n/N$ of being sampled. For the moment, we assume $n_g$ is an integer; see the additional considerations section below for the non-integer case. This sample includes the following features:

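1. The samples from each stratum sum to produce the desired overall sample size $n$:

   $$\sum_{g=1}^{G} n_g = \sum_{g=1}^{G} \frac{nN_g}{N} = \frac{n}{N}\sum_{g=1}^{G} N_g = n$$

2. The proportion of stratum $g$ found in the sample matches the proportion of stratum $g$ found in the sampling frame:

   $$\frac{n_g}{n} = \frac{nN_g/N}{n} = \frac{N_g}{N}$$

3. Each member of the sampling frame has an equal probability of selection, regardless of the stratum to which they belong:

   $$\frac{n_g}{N_g} = \frac{nN_g/N}{N_g} = \frac{n}{N}$$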
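The three features can be checked numerically. The sketch below is one possible Python implementation of proportional allocation, in which a stratum of size N_g contributes n × N_g / N members to a sample of size n. The function name `proportional_stratified_sample`, the made-up member IDs, and the seed are assumptions for illustration only. As in the text, the sketch assumes each stratum's sample size is an integer and raises an error otherwise.

```python
import random
from fractions import Fraction


def proportional_stratified_sample(strata, n, seed=None):
    """Draw a simple random sample of size n * N_g / N from each stratum.

    `strata` maps a stratum label to the list of its members.
    Hypothetical helper: assumes n * N_g / N is an integer for every stratum.
    """
    rng = random.Random(seed)
    N = sum(len(members) for members in strata.values())
    sample = {}
    for g, members in strata.items():
        N_g = len(members)
        if (n * N_g) % N != 0:
            raise ValueError(f"n * N_g / N is not an integer for stratum {g!r}")
        n_g = n * N_g // N
        # Simple random sample without replacement within the stratum.
        sample[g] = rng.sample(members, n_g)
    return sample


# Illustrative frame matching Table E.1; the member IDs are made up.
strata = {
    "A": [f"A{i}" for i in range(6000)],
    "B": [f"B{i}" for i in range(1000)],
    "C": [f"C{i}" for i in range(3000)],
}
n = 300
N = sum(len(members) for members in strata.values())
sample = proportional_stratified_sample(strata, n, seed=0)

# Feature 1: the stratum samples sum to n.
feature_1 = sum(len(s) for s in sample.values()) == n

# Feature 2: n_g / n equals N_g / N for every stratum.
feature_2 = all(
    Fraction(len(sample[g]), n) == Fraction(len(strata[g]), N) for g in strata
)

# Feature 3: the selection probability n_g / N_g equals n / N in every stratum.
feature_3 = all(
    Fraction(len(sample[g]), len(strata[g])) == Fraction(n, N) for g in strata
)

print(feature_1, feature_2, feature_3)
```

Exact fractions are used for the comparisons so that the checks do not depend on floating-point rounding. For this frame all three checks print `True`, whatever seed is used, because the features depend only on the stratum sample sizes and not on which members are drawn.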

While the simple random sample approach would also produce a sample of size n, with each member of the sampling frame having an equal probability of selection, this stratified random sampling approach also yields a sample distribution of the strata equivalent to that distribution in the sampling frame (feature 2 above).
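A small simulation can make this contrast concrete. The following Python sketch draws several simple random samples of 300 from a frame with the stratum sizes of Table E.1 and tallies how many members of each stratum appear. The seed and the number of repetitions are arbitrary choices made for illustration.

```python
import random
from collections import Counter

rng = random.Random(42)  # arbitrary seed, chosen only for repeatability

# Each frame member is represented by its stratum label (sizes from Table E.1).
frame = ["A"] * 6000 + ["B"] * 1000 + ["C"] * 3000
n = 300

# Simple random sampling: the stratum counts are left to chance.
srs_counts = [Counter(rng.sample(frame, n)) for _ in range(5)]
for draw, counts in enumerate(srs_counts, start=1):
    print(f"SRS draw {draw}:", {g: counts[g] for g in "ABC"})

# Proportional stratified sampling: the stratum counts are fixed by design.
stratified_counts = {
    g: n * N_g // len(frame) for g, N_g in Counter(frame).items()
}
print("Stratified:", stratified_counts)
```

Every simple random sample has 300 members in total, but its stratum counts will usually differ from 180, 30, and 90 and will change from draw to draw. The stratified design returns exactly those counts every time.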