APPENDIX E. Mathematics and Examples of Key Considerations for Stratified Sampling

In this appendix, we describe the statistical approach known as stratified random sampling and the mathematics associated with it. For illustration, we offer an example using a population similar to that in the firefighter context. Many details are beyond the scope of this report, however: any stratified sampling procedure would need to be tailored to the exact population in question, and the context and goals of the sampling would need to be clearly specified before sound direction could be given. Also, as we have noted in the preface and elsewhere, there are significant legal implications in handling race information in a selection context. We do not address those legal issues here, and we reiterate that any stratified process should be carefully reviewed by the city's legal counsel.

In Appendix D, we discussed the chance variability that can occur in the use of a simple random sample. We noted that with such a procedure all applicants would have an equal probability of selection; however, the proportion of a subgroup of interest represented in the applicant pool and in a simple random sample would naturally differ because of sampling variability. For example, in Figure D.1, we saw that when drawing a simple random sample of size 300 from an applicant pool that is 50 percent white, the resulting percentage of white applicants in the selected sample would fall between 45 and 55 percent approximately 90 percent of the time.
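The size of this chance variability can be checked with a short calculation. The sketch below is illustrative only and is not the method used to produce Figure D.1: it assumes the applicant pool is large enough that the number of white applicants in the sample can be treated as binomial, and the function name `prob_count_between` is a hypothetical helper introduced here. The result depends on whether the endpoints of the range are included and on the actual size of the pool, so it should be read as a rough check rather than an exact reproduction of the figure.

```python
from math import comb


def prob_count_between(n, p, k_lo, k_hi):
    """Probability that a Binomial(n, p) count lies in [k_lo, k_hi], inclusive.

    Assumption (not from the report): the pool is large relative to the
    sample, so draws are treated as independent with success probability p.
    """
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(k_lo, k_hi + 1)
    )


# Sample of 300 from a pool that is 50 percent white.
# 45 percent of 300 is 135 applicants; 55 percent of 300 is 165 applicants.
inclusive = prob_count_between(300, 0.5, 135, 165)
exclusive = prob_count_between(300, 0.5, 136, 164)
print(f"share within 45-55 percent, endpoints included: {inclusive:.3f}")
print(f"share within 45-55 percent, endpoints excluded: {exclusive:.3f}")
```

A narrower range, or a smaller sample, would capture a smaller share of the possible samples, which is the variability that stratification is meant to remove.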

Stratified random sampling is an alternative sampling method in which the sample is independently drawn from mutually exclusive subgroups^{[1]} of interest (i.e., strata) within the group of people from which the sample is being drawn (i.e., the sampling frame),^{[2]} thereby allowing the sampling properties of the subgroups to be refined. Stratification may be used as a feature of a sampling design to ensure the representation of properly defined subgroups within the sampling frame (Groves et al., 2004). In particular, proportional allocation to strata is a technique by which the sample is selected within each stratum with the same probability of selection (Groves et al., 2004), so that each member of the sampling frame has the same probability of being selected into the sample and each well-defined stratum is represented in the sample at the same rate at which it appears in the sampling frame. In this context, if all applicants were classified into the correct subgroups, proportional allocation would enable us to draw a stratified sample in which all subgroups of interest were represented in proportion to their presence in the applicant pool. For example, when using proportional allocation to draw a random sample stratified by race/ethnicity from an applicant pool that is 50 percent white, the resulting sample would also be 50 percent white.^{[3]}
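Proportional allocation as described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a recommended procedure: the function names `proportional_allocation` and `stratified_sample` are hypothetical, the stratum labels are placeholders, and the largest-remainder rule used to settle fractional allocations (the situation raised in footnote 3) is one possible choice that the report does not prescribe.

```python
import random
from fractions import Fraction


def proportional_allocation(stratum_sizes, sample_size):
    """Split sample_size across strata in proportion to stratum size.

    Fractional shares are rounded with a largest-remainder rule, which is
    an assumption of this sketch rather than a rule taken from the report.
    """
    total = sum(stratum_sizes.values())
    if not 0 <= sample_size <= total:
        raise ValueError("sample_size must be between 0 and the frame size")

    # Exact (possibly fractional) share for each stratum.
    exact = {s: Fraction(sample_size * size, total)
             for s, size in stratum_sizes.items()}
    # Start from the whole-number part of each share.
    alloc = {s: int(q) for s, q in exact.items()}
    # Hand the leftover seats to the strata with the largest remainders.
    shortfall = sample_size - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda s: exact[s] - alloc[s],
                          reverse=True)
    for s in by_remainder[:shortfall]:
        alloc[s] += 1
    return alloc


def stratified_sample(frame, sample_size, seed=None):
    """Draw an independent simple random sample within each stratum.

    frame maps a stratum label to the list of its members.
    """
    rng = random.Random(seed)
    sizes = {s: len(members) for s, members in frame.items()}
    alloc = proportional_allocation(sizes, sample_size)
    return {s: rng.sample(frame[s], alloc[s]) for s in frame}


# Hypothetical pool of 1,000 applicants, 50 percent in each stratum.
pool = {"white": 500, "nonwhite": 500}
print(proportional_allocation(pool, 300))
print(proportional_allocation(pool, 301))
```

With a sample of 300, each stratum receives exactly half of the sample. With a sample of 301, an exact 50 percent split is impossible, so one stratum must receive 151 and the other 150; which stratum receives the extra selection is a design decision that would need to be specified in advance.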

[1] Such groups are mutually exclusive if each member of the sampling frame belongs to one and only one of the groups.

[2] Note that in the case of drawing a stratified random sample of job applicants, for example, the sampling frame would differ depending on when in the selection process the stratified sampling was taking place.

[3] Depending on the sample size, the proportion of a subgroup in the sample may not exactly match its proportion in the sampling frame. In the example, a sample of size 301 could not contain 150.5 white applicants. See the section on additional considerations below for further discussion.
