Systematic sampling is fine if you know that the sampling frame has 51,413 elements. What do you do when the size of the sampling frame is unknown? A big telephone book is an unnumbered sampling frame of unknown size. To use this kind of sampling frame, first determine the number of pages that actually contain listings. To do this, jot down the number of the first and last pages on which listings appear. Most phone books begin with a lot of pages that do not contain listings.

Suppose the listings begin on page 30 and end on page 520. Subtract 30 from 520 and add 1 (520 - 30 + 1 = 491) to calculate the number of pages that carry listings.

Then note the number of columns per page and the number of lines per column (count all the lines in a column, even the blank ones).

Suppose the phone book has three columns and 96 lines per column (this is quite typical). To take a random sample of 200 nonbusiness listings from this phone book, take a random sample of 400 page numbers (yes, 400) out of the 491 page numbers between 30 and 520. Just think of the pages as a numbered sampling frame of 491 elements. Next, take a sample of 400 column numbers. Since there are three columns, you want 400 random choices of the numbers 1, 2, 3. Finally, take a sample of 400 line numbers. Since there are 96 lines, you want 400 random numbers between 1 and 96.

BOX 5.1

PERIODICITY AND SYSTEMATIC SAMPLING

Systematic sampling usually produces a representative sample, but be aware of the periodicity problem. Suppose you're studying a big retirement community in South Florida. The development has 30 identical buildings. Each has six floors, with 10 apartments on each floor, for a total of 1,800 apartments. Now suppose that each floor has one big corner apartment that costs more than the others and attracts a slightly more affluent group of buyers. If you do a systematic sample of every 10th apartment, then, depending on where you entered the list of apartments, you'd have a sample of 180 corner apartments or no corner apartments at all.

David and Mary Hatch (1947) studied the Sunday society pages of the New York Times for the years 1932-1942. They found only stories about weddings of Protestants and concluded that the elite of New York must therefore be Protestant. Cahnman (1948) pointed out that the Hatches had studied only June issues of the Times. It seemed reasonable. After all, aren't most society weddings in June? Well, yes. Protestant weddings. Upper-class Jews married in other months. The Times covered those weddings, but the Hatches missed them.

You can avoid the periodicity problem by doing simple random sampling, but if that's not possible, another solution is to make two systematic passes through the population using different sampling intervals. Then you can compare the two samples on a few independent variables, like age or years of education. Any differences should be attributable to sampling error. If they're not, then you might have a periodicity problem.
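The periodicity problem described in Box 5.1 can be seen in a small simulation. The sketch below is only an illustration under stated assumptions: it assumes the apartment list is ordered building by building and floor by floor, and that the corner apartment is the first unit listed on each floor. The function names (`systematic_sample`, `corner_share`) and the second interval of 7 are hypothetical choices, and the share of corner apartments stands in for the comparison variables (age, years of education) mentioned in the box.

```python
import random

BUILDINGS, FLOORS, PER_FLOOR = 30, 6, 10

# Build the list of 1,800 apartments as (building, floor, unit, is_corner).
# Assumption: the list runs building by building, floor by floor, and the
# corner apartment is unit 1 on every floor.
apartments = [
    (b, f, u, u == 1)
    for b in range(1, BUILDINGS + 1)
    for f in range(1, FLOORS + 1)
    for u in range(1, PER_FLOOR + 1)
]


def systematic_sample(frame, interval, start):
    """Take every `interval`-th element, beginning at 0-based index `start`."""
    return frame[start::interval]


def corner_share(sample):
    """Proportion of sampled apartments that are corner apartments."""
    return sum(1 for apt in sample if apt[3]) / len(sample)


# Every 10th apartment, entering the list at two different points.
all_corners = systematic_sample(apartments, 10, start=0)
no_corners = systematic_sample(apartments, 10, start=3)
print(len(all_corners), corner_share(all_corners))
print(len(no_corners), corner_share(no_corners))

# A second pass with a different interval (7 is an arbitrary choice that
# does not line up with the 10-apartment floors).
second_pass = systematic_sample(apartments, 7, start=random.randrange(7))
print(len(second_pass), round(corner_share(second_pass), 3))
```

In the population, one apartment in 10 is a corner apartment. A pass with an interval of 10 returns either all corner apartments or none, while the pass with an interval of 7 returns roughly the population share. A gap that large between two passes is the kind of difference that sampling error alone would not explain.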

Match up the three sets of numbers and pick the sample of listings in the phone book. If the first random number between 30 and 520 is 116, go to page 116. If the first random number between 1 and 3 is 3, go to column 3. If the first random number between 1 and 96 is 43, count down 43 lines. Decide if the listing is eligible. It may be a blank line or a business. That’s why you generate 400 sets of numbers to get 200 good listings.
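The page, column, and line draws can be generated together. The following is a minimal sketch using the numbers from the example (pages 30 to 520, three columns, 96 lines per column, 400 draws). The function name `draw_positions` and its parameters are hypothetical. The text does not say what to do if the same position comes up twice; skipping repeats, as done here, is an assumption.

```python
import random


def draw_positions(first_page, last_page, columns, lines_per_column,
                   n_draws, seed=None):
    """Return n_draws distinct (page, column, line) triples chosen at random."""
    n_pages = last_page - first_page + 1  # e.g., 520 - 30 + 1 = 491
    total_positions = n_pages * columns * lines_per_column
    if n_draws > total_positions:
        raise ValueError("more draws requested than positions available")

    rng = random.Random(seed)
    seen = set()
    draws = []
    while len(draws) < n_draws:
        triple = (
            rng.randint(first_page, last_page),   # page number, inclusive
            rng.randint(1, columns),              # column on that page
            rng.randint(1, lines_per_column),     # line within the column
        )
        # Assumption: a position that comes up twice is skipped, so no
        # listing can be selected more than once.
        if triple not in seen:
            seen.add(triple)
            draws.append(triple)
    return draws


draws = draw_positions(30, 520, 3, 96, n_draws=400, seed=1)
for page, column, line in draws[:5]:
    print(f"page {page}, column {column}, line {line}")
```

Work through the triples in order, checking each position in the phone book and keeping only the eligible listings, until you have 200. If the 400 draws run out first, generate more.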

Telephone books don’t actually make good sampling frames—too many people have unlisted numbers (which is why we have random digit dialing—see chapter 9). But because everyone knows what a phone book looks like, it makes a good example for learning how to sample big, unnumbered lists of things, like the list of Catholic priests in Paraguay or the list of orthopedic surgeons in California.