There are two things you can do to get good samples. (1) You can ensure sample accuracy by making sure that every element in the population has an equal chance of being selected; that is, you can make sure the sample is unbiased. (2) You can ensure sample precision by increasing the size of unbiased samples. We’ve already discussed how to make samples unbiased. The next step is to decide how big a sample needs to be.

Sample size depends on: (1) the heterogeneity of the population or chunks of population (strata or clusters) from which you choose the elements; (2) how many population subgroups (that is, independent variables) you want to deal with simultaneously in your analysis; (3) the size of the phenomenon that you’re trying to detect; and (4) how precise you want your sample statistics (or parameter estimators) to be.

1. Heterogeneity of the population. When all elements of a population have the same score on some measure, a sample of 1 will do. Ask a lot of people to tell you how many days there are in a week and you’ll soon understand that a big sample isn’t going to uncover a lot of heterogeneity. But if you want to know what the average ideal family size is, you may need to cover a lot of social ground. People of different ethnicities, religions, incomes, genders, and ages may have very different ideas about this. (In fact, these independent variables may interact in complex ways. Multivariate analysis tells you about this interaction. We’ll get to this in chapter 22.)

FIGURE 5.2a.

Distribution of the ages of Ticuna household heads.

SOURCE: Adapted from data in A. Oyuela-Cacedo and J. J. Vieco Albarracin, "Approximacion cuantitativa a la organizacion social de los Ticuna del trapecio amazonico colombiano," Revista Colombiana de Antropologia, Vol. 35, pp. 146-79, figure 1, p. 157, 1999.

FIGURE 5.2b.

Distribution of the number of children in Ticuna households.

SOURCE: Adapted from data in A. Oyuela-Cacedo and J. J. Vieco Albarracin, "Approximacion cuantitativa a la organizacion social de los Ticuna del trapecio amazonico colombiano," Revista Colombiana de Antropologia, Vol. 35, pp. 146-79, table 6, p. 159, 1999.

2. The number of subgroups in the analysis. Remember the factorial design problem in chapter 4 on experiments? We had three independent variables, each with two attributes, so we needed eight groups (2³ = 8). It wouldn’t do you much good to have, say, one experimental subject in each of those eight groups. If you’re going to analyze all eight of the conditions in the experiment, you’ve got to fill each of the conditions with some reasonable number of subjects. If you have only 15 people in each of the eight conditions, then you need a sample of 120.

The same principle holds when you’re trying to figure out how big a sample you need for a survey. If you have four age groups and two genders, you wind up with an eight-cell sampling design.
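The arithmetic behind these designs is easy to check. Here is a minimal sketch in Python; the function name and the choice of 15 people per cell for the survey are illustrative assumptions, not part of any particular study.

```python
import math

def design_size(levels_per_variable, per_cell):
    """Return (number of cells, total sample) for a fully crossed design.

    levels_per_variable: number of attributes for each independent variable
    per_cell: how many subjects or respondents you want in every cell
    """
    cells = math.prod(levels_per_variable)
    return cells, cells * per_cell

# The experiment: three variables with two attributes each, 15 people per cell.
print(design_size([2, 2, 2], 15))   # (8, 120)

# The survey: four age groups by two genders, again assuming 15 per cell.
print(design_size([4, 2], 15))      # (8, 120)
```

Add a third variable with, say, three attributes to the survey and the number of cells triples, which is why each new independent variable is expensive.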

If all you want to know is a single proportion (like what percentage of people in a population approve or disapprove of something), then you need about 1,000 respondents to be 95% confident that your sample estimate is within plus or minus 3 percentage points of the population parameter. With about 100 respondents, you can be 95% confident only to within about plus or minus 10 points (more about confidence limits, normal distributions, standard deviations, and parameters in a minute). But if you want to know whether women factory workers in Rio de Janeiro who earn less than $300 per month have different opinions than, say, middle-class women in Rio whose family income is more than $600 per month, then you’ll need a bigger sample.
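The usual normal-approximation formula for a single proportion, n = z²p(1 − p)/e², gives a quick check on figures like these. What follows is a minimal sketch, assuming simple random sampling from a large population, the worst case of p = 0.5, and z = 1.96 for 95% confidence. The function names are illustrative.

```python
import math

def sample_size_for_proportion(margin, z=1.96, p=0.5):
    """Respondents needed so the estimate is within +/- margin (a proportion)."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

def margin_for_sample_size(n, z=1.96, p=0.5):
    """Half-width of the confidence interval you get from n respondents."""
    return z * math.sqrt(p * (1 - p) / n)

print(sample_size_for_proportion(0.03))        # 1068
print(sample_size_for_proportion(0.10))        # 97
print(round(margin_for_sample_size(100), 3))   # 0.098
```

Because the margin shrinks with the square root of n, cutting the margin from 10 points to 3 points takes roughly eleven times as many respondents.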

3. The size of the subgroup. If the population you are trying to study is rare and hard to find, and if you have to rely on a simple random sample of the entire population, you’ll need a very large initial sample. A needs assessment survey of people over 75 in Florida took 72,000 phone calls to get 1,647 interviews, or about 44 calls per interview (Henry 1990:88). This is because only 6.5% of Florida’s population was over 75 at the time of the survey. By contrast, the monthly Florida survey of 600 representative consumers takes about 5,000 calls (about eight per interview). That’s because just about everyone in the state 18 and older is a consumer and is eligible for the survey (Christopher McCarty, personal communication).
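You can work out the cost of a rare subgroup before you start dialing. The sketch below assumes a very simple model in which the number of contacts you need is the target number of interviews divided by the share of the population that is eligible and by the share of eligible people who complete the interview. The function names and the completion_rate parameter are hypothetical, introduced only for illustration.

```python
import math

def calls_per_interview(calls, interviews):
    """Average number of calls made for each completed interview."""
    return calls / interviews

def contacts_needed(target_interviews, eligibility_rate, completion_rate=1.0):
    """Expected contacts needed under the simple model described above.

    completion_rate is a hypothetical parameter: the share of eligible
    people reached who actually complete the interview.
    """
    return math.ceil(target_interviews / (eligibility_rate * completion_rate))

print(round(calls_per_interview(72000, 1647)))   # 44
print(round(calls_per_interview(5000, 600)))     # 8

# With only 6.5% of the population eligible, even perfect cooperation
# means screening about 15 people for every eligible one.
print(contacts_needed(1647, 0.065))              # 25339
```

Eligibility alone accounts for about 15 contacts per interview here. The rest of the 44 calls per interview presumably reflects unanswered calls and refusals, which this sketch lumps into the completion rate.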

The smaller the difference on any measure between two populations, the bigger the sample you need to detect that difference. Suppose you suspect that Blacks and Whites in a prison system have received different sentences for the same crime. Henry (1990:121) shows that a difference of 16 months in sentence length for the same crime would be detected with a sample of just 30 in each racial group (if the members of the sample were selected randomly, of course). To detect a difference of 3 months, however, you need 775 in each group.
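To see why small differences are so expensive, here is a rough sketch using the textbook normal-approximation formula for comparing two means, n per group = 2(z_α/2 + z_β)²σ²/Δ². The standard deviation of 20 months, the 5% significance level, and the 80% power are assumptions chosen for illustration. They are not Henry’s values, so the output will not reproduce his figures.

```python
import math
from statistics import NormalDist

def n_per_group(difference, sd, alpha=0.05, power=0.80):
    """Approximate sample size per group to detect a difference in means.

    Uses the two-sided normal approximation with equal group sizes and a
    common standard deviation (sd). All defaults are illustrative.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / difference ** 2)

ASSUMED_SD = 20.0  # hypothetical spread of sentence lengths, in months

for months in (16, 3):
    print(months, n_per_group(months, ASSUMED_SD))
```

Whatever standard deviation you assume, the required sample grows with the inverse square of the difference: going from a 16-month difference to a 3-month difference multiplies the sample by about (16/3)², or roughly 28.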

4. Precision. This one takes us into sampling theory.

FURTHER READING

Street intercept surveys: Ross et al. (2006); Waltermaurer et al. (2003).

Mall-intercept survey: Bruwer and Haydam (1996); Bush and Hair (1985); Gates and Solomon (1982); Hornik and Ellis (1988); Wang and Heitmeyer (2006).

Space sampling: Daley et al. (2001); Lang et al. (2004).