# PROBABILITY PROPORTIONATE TO SIZE

The best estimates of a parameter are produced in samples taken from clusters of equal size. When clusters are not equal in size, then samples should be taken **PPS**, that is, with **probability proportionate to size**.

Suppose you had money and time to do 800 household interviews in a city of 50,000 households. You intend to select 40 blocks, out of a total of 280, and do 20 interviews in each block. You want each of the 800 households in the final sample to have exactly the same probability of being selected.

Should each block be equally likely to be chosen for your sample? No, because census blocks never contribute equally to the total population from which you will take your final sample. A block that has 100 households in it *should* have twice the chance of being chosen for 20 interviews as a block that has 50 households and half the chance of a block that has 200 households.

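When you get down to the block level, each household on a block with 100 residences has a 20% (20/100) chance of being selected for the sample; each household on a block with 300 residences has only a 6.7% (20/300) chance of being selected. PPS sampling corrects for this: the larger block is more likely to be chosen in the first stage, which offsets the smaller chance that each of its households has in the second stage, so every household in the city ends up with the same overall probability of selection.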
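The arithmetic behind this balance can be checked with a short sketch. It assumes the standard two-stage setup, in which a block's chance of selection is the number of blocks drawn times the block's share of all households; the function name `overall_probability` and the constants are illustrative only. The assumption holds as long as no block is so large that its selection chance would exceed 1.

```python
from fractions import Fraction

# Figures from the example in the text.
TOTAL_HOUSEHOLDS = 50_000
BLOCKS_TO_SELECT = 40
INTERVIEWS_PER_BLOCK = 20


def overall_probability(block_size):
    """Chance that one household on a block of this size ends up in the sample."""
    # Stage 1 (assumed PPS rule): the block's chance is proportional to its size.
    p_block = Fraction(BLOCKS_TO_SELECT * block_size, TOTAL_HOUSEHOLDS)
    # Stage 2: 20 households are drawn from the chosen block.
    p_household = Fraction(INTERVIEWS_PER_BLOCK, block_size)
    return p_block * p_household


for size in (50, 100, 200, 300):
    p = overall_probability(size)
    print(f"block of {size:>3} households: overall probability = {float(p):.3f}")
```

The block size cancels out of the product, so every household has the same chance of selection: 800 out of 50,000, or 1.6%.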

Lene Levy-Storms wanted to talk to older Samoan women in Los Angeles County about mammography. The problem was not that women were reluctant to talk about the subject. The problem was how to find a representative sample of older Samoan women in Los Angeles County.

From prior ethnographic research, Levy-Storms knew that Samoan women regularly attend churches where the minister is Samoan. She went to the president of the Samoan Federation of America in Carson, California, and he suggested nine cities in L.A. County where Samoans were concentrated. There were 60 churches with Samoan ministers in the nine cities, representing nine denominations. Levy-Storms asked each of the ministers to estimate the number of female church members who were over 50 years old. Based on these estimates, she chose a PPS sample of 40 churches (so that churches with more or fewer older women were properly represented). This gave her a sample of 299 Samoan women over 50. This clever sampling strategy really worked: Levy-Storms contacted the 299 women and wound up with 290 interviews, a 97% cooperation rate (Levy-Storms and Wallace 2003).

PPS sampling is called for under three conditions: (1) when you are dealing with large, unevenly distributed populations (such as cities that have high-rise and single-family neighborhoods); (2) when your sample is large enough to withstand being broken up into a lot of pieces (clusters) without substantially increasing the sampling error; and (3) when you have data on the population of many small blocks in a population and can calculate their respective proportionate contributions to the total population.
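Once you have size data for every cluster, one common way to draw the first-stage sample is systematic selection along the cumulative size totals. The sketch below is a minimal illustration of that technique, not a procedure described in the studies above. The function name `pps_systematic_sample` is hypothetical and the block sizes are invented for the demonstration.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def pps_systematic_sample(sizes, n_clusters, rng=None):
    """Return indices of clusters chosen by systematic PPS selection.

    A cluster larger than the sampling interval can be chosen more than once.
    """
    rng = rng or random.Random()
    cumulative = list(accumulate(sizes))   # running total of cluster sizes
    interval = cumulative[-1] / n_clusters  # sampling interval
    start = rng.random() * interval        # random start within the first interval
    chosen = []
    for k in range(n_clusters):
        point = start + k * interval
        # Find the cluster whose cumulative range contains this point;
        # the clamp guards against floating-point rounding at the very end.
        index = min(bisect_right(cumulative, point), len(sizes) - 1)
        chosen.append(index)
    return chosen


# Hypothetical data: 280 blocks with invented sizes of 50 to 300 households.
demo_rng = random.Random(0)
block_sizes = [demo_rng.randint(50, 300) for _ in range(280)]
selected_blocks = pps_systematic_sample(block_sizes, 40, demo_rng)
print(len(selected_blocks), "blocks selected")
```

Because each selection point lands in a block's stretch of the cumulative total, a block that covers a longer stretch (more households) is proportionately more likely to be hit.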