MEASURES OF DISPERSION I: RANGE AND INTERQUARTILE RANGE

After central tendency and shape, the next thing we want to know about data is how homogeneous or heterogeneous they are, that is, something about their dispersion or variation. The concept of dispersion is easily seen in the following example. Here are the ages of two groups of five people:

Group 1: 35 35 35 35 35

Group 2: 35 75 15 15 35

Both groups have an average age of 35, but Group 2 obviously has a lot more variation than Group 1. Consider MFRATIO in table 20.8. The mean number of males to 100 females in the world was 99.9 in 2010. One measure of variation around this mean across the 50 countries in our sample is the range, the difference between the highest and the lowest score. Inspecting the column for MFRATIO in table 20.8, we see that Latvia had the lowest ratio, with 85 males for every 100 females, and the United Arab Emirates (UAE) had the highest ratio, with 205. The range, then, is 205 − 85 = 120. The range is a useful statistic, but it is affected strongly by extreme scores. Without the UAE, the range is 24, a drop of 80%.
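The arithmetic behind the range can be sketched in a few lines of Python. This is a minimal illustration, not part of the original analysis: the helper name value_range is hypothetical, the ages are the two groups listed above, and the figures 205, 85, and 24 are the MFRATIO values quoted in the text.

```python
from statistics import mean


def value_range(values):
    """Return the range: the largest value minus the smallest value."""
    return max(values) - min(values)


# Ages of the two groups of five people from the example above.
group_1 = [35, 35, 35, 35, 35]
group_2 = [35, 75, 15, 15, 35]

for name, ages in (("Group 1", group_1), ("Group 2", group_2)):
    print(name, "mean:", mean(ages), "range:", value_range(ages))

# Dropping the single oldest person in Group 2 shows how strongly
# one extreme score can move the range.
group_2_trimmed = sorted(group_2)[:-1]
print("Group 2 without its maximum, range:", value_range(group_2_trimmed))

# MFRATIO figures quoted in the text: highest 205 (UAE), lowest 85 (Latvia),
# and a range of 24 once the UAE is left out.
full_range = 205 - 85
range_without_uae = 24
percent_drop = 100 * (full_range - range_without_uae) / full_range
print("Range:", full_range, "drop without the UAE (%):", percent_drop)
```

Both groups have the same mean, so the mean alone cannot distinguish them. The range separates them immediately, but it rests on just two scores, the highest and the lowest.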

The interquartile range avoids extreme scores, either high or low. The 75th percentile for MFRATIO is 100 and the 25th percentile is 95, so the interquartile range is 100 − 95 = 5. This tightens the range of scores, but sometimes it is the extreme scores that are of interest. The interquartile range of freshmen SAT scores at major universities tells you about the middle 50% of the incoming class. It doesn’t tell you if the university is recruiting athletes whose SAT scores are in the bottom 25%, the middle 50%, or the top 25% of scores at those universities (see Klein 1999).
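The contrast between the range and the interquartile range can be sketched as follows. The ten ratios here are hypothetical values made up for illustration, not the data in table 20.8, and the helper name interquartile_range is likewise an assumption. The sketch uses the "inclusive" method of Python's statistics.quantiles; other quartile conventions exist and can give slightly different percentiles for the same data.

```python
from statistics import quantiles


def interquartile_range(values):
    """Return the 75th percentile minus the 25th percentile."""
    # quantiles(..., n=4) returns the three cut points that split the
    # data into quarters: the 25th, 50th, and 75th percentiles.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def value_range(values):
    """Return the range: the largest value minus the smallest value."""
    return max(values) - min(values)


# Hypothetical sex ratios (males per 100 females), one with an extreme score.
with_extreme = [85, 93, 95, 96, 97, 98, 99, 100, 101, 205]
# The same hypothetical sample with the extreme score replaced by a typical one.
without_extreme = [85, 93, 95, 96, 97, 98, 99, 100, 101, 102]

for label, ratios in (("with extreme score", with_extreme),
                      ("without extreme score", without_extreme)):
    print(label,
          "range:", value_range(ratios),
          "interquartile range:", interquartile_range(ratios))
```

In this made-up sample the range changes sharply when the extreme score is replaced, while the interquartile range stays the same, because only the middle 50% of the scores enters the calculation.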