# of the Central Limit Theorem

Figure 6.5 shows the distribution of the 50 data points for GDP in table 6.1. The range is quite broad, from \$118 to \$978 per year per person, and the shape of the distribution is multimodal.
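To get a feel for the shape described here without the figure, the values in table 6.1 can be tallied into bins. This is a rough sketch only: the \$100 bin width (`BIN_WIDTH`) is an assumption made for illustration, and figure 6.5 may group the data differently.

```python
from collections import Counter

# PCGDP values (U.S. dollars) for the 50 countries in table 6.1.
PCGDP = [
    118, 151, 159, 195, 201, 211, 257, 271, 289, 291,
    330, 345, 354, 362, 368, 377, 377, 379, 386, 393,
    394, 403, 419, 428, 452, 483, 554, 555, 556, 598,
    612, 618, 618, 647, 692, 704, 704, 711, 762, 786,
    797, 815, 874, 908, 912, 953, 967, 974, 976, 978,
]

BIN_WIDTH = 100  # assumed bin width; figure 6.5 may use different bins

# Count how many countries fall in each $100-wide bin, keyed by the bin's lower edge.
bin_counts = Counter((value // BIN_WIDTH) * BIN_WIDTH for value in PCGDP)

print(f"range: ${min(PCGDP)} to ${max(PCGDP)}")
for lower in sorted(bin_counts):
    upper = lower + BIN_WIDTH - 1
    print(f"${lower}-${upper}  {'*' * bin_counts[lower]}")
```

With these bins the counts rise and fall more than once across the range, which is what "multimodal" means here.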

The actual mean of the data in table 6.1 (that is, the parameter we want to estimate) is \$533.28. There are 2,118,760 samples of size 5 that can be taken from 50 elements. Table 6.3 shows the means from 10 samples of five countries chosen at random from the data in table 6.1.

Table 6.1 Per Capita Gross Domestic Product (PCGDP) in U.S. Dollars for the 50 Poorest Countries in the World, 2007

| Country | PCGDP | Country | PCGDP |
| --- | --- | --- | --- |
| Burundi | 118 | Burkina Faso | 483 |
| DR-Congo | 151 | Mali | 554 |
| Zimbabwe | 159 | Tajikistan | 555 |
| Liberia | 195 | Comoros | 556 |
| Ethiopia | 201 | Cambodia | 598 |
| Guinea-Bissau | 211 | Haiti | 612 |
| Malawi | 257 | Benin | 618 |
| Eritrea | 271 | N. Korea | 618 |
| Niger | 289 | Ghana | 647 |
| Somalia | 291 | Chad | 692 |
| Sierra Leone | 330 | Kyrgyzstan | 704 |
| Afghanistan | 345 | Uzbekistan | 704 |
| Rwanda | 354 | Laos | 711 |
| Mozambique | 362 | Kiribati | 762 |
| Tanzania | 368 | Kenya | 786 |
| Gambia | 377 | Lesotho | 797 |
| Madagascar | 377 | Viet Nam | 815 |
| Myanmar | 379 | Mauritania | 874 |
| Togo | 386 | Senegal | 908 |
| Timor-Leste | 393 | Sao Tome and Principe | 912 |
| Central African Rep. | 394 | Papua New Guinea | 953 |
| Uganda | 403 | Yemen | 967 |
| Nepal | 419 | Zambia | 974 |
| Bangladesh | 428 | India | 976 |
| Guinea | 452 | Solomon Islands | 978 |

SOURCE: United Nations, Dept. of Economic and Social Affairs, Economic and Social Development. http://unstats.un.org/unsd/demographic/products/socind/inc-eco.htm.

Table 6.2 All Samples of Two from Five Elements

| Sample | Mean | Cumulative mean |
| --- | --- | --- |
| Uzbekistan and Senegal | (704 + 908)/2 = 806.0 | 806.0 |
| Uzbekistan and Guinea | (704 + 452)/2 = 578.0 | 1,384.0 |
| Uzbekistan and Rwanda | (704 + 354)/2 = 529.0 | 1,913.0 |
| Uzbekistan and Liberia | (704 + 195)/2 = 449.5 | 2,362.5 |
| Senegal and Guinea | (908 + 452)/2 = 680.0 | 3,042.5 |
| Senegal and Rwanda | (908 + 354)/2 = 631.0 | 3,673.5 |
| Senegal and Liberia | (908 + 195)/2 = 551.5 | 4,225.0 |
| Guinea and Rwanda | (452 + 354)/2 = 403.0 | 4,628.0 |
| Guinea and Liberia | (452 + 195)/2 = 323.5 | 4,951.5 |
| Liberia and Rwanda | (195 + 354)/2 = 274.5 | 5,226.0 |

$\bar{x} = 5{,}226/10 = 522.6$
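The counts and the arithmetic in table 6.2 can be checked with a few lines of Python. The sketch below lists every sample of two from the five cases, averages the sample means, and counts the samples of size 5 that can be drawn from 50 elements. The variable names are illustrative only.

```python
from itertools import combinations
from math import comb
from statistics import mean

# The five PCGDP values used in table 6.2 (U.S. dollars).
five_cases = {
    "Uzbekistan": 704,
    "Senegal": 908,
    "Guinea": 452,
    "Rwanda": 354,
    "Liberia": 195,
}

# Every possible sample of two from the five cases, with its mean.
pair_means = {
    pair: mean([five_cases[name] for name in pair])
    for pair in combinations(five_cases, 2)
}

for pair, pair_mean in pair_means.items():
    print(f"{pair[0]} and {pair[1]}: {pair_mean:.1f}")

# The mean of all the sample means equals the mean of the five cases.
mean_of_pair_means = mean(pair_means.values())
mean_of_cases = mean(five_cases.values())
print(f"mean of the {len(pair_means)} sample means: {mean_of_pair_means:.1f}")
print(f"mean of the five cases: {mean_of_cases:.1f}")

# Number of distinct samples of size 5 from 50 elements.
n_samples_of_five = comb(50, 5)
print(f"samples of 5 from 50: {n_samples_of_five:,}")
```

Listing all 10 pairs is trivial, but listing all 2,118,760 samples of five is the kind of exhaustive check nobody does in practice, which is why the text turns to a handful of random samples instead.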

Even in this small set of 10 samples, the mean is \$504.72—quite close to the actual mean of \$533.28. Figure 6.6 (left) shows the distribution of these samples. It has the look of a normal distribution straining to happen. Figure 6.6 (right) shows 20 samples of five from the 50 countries in table 6.1. The strain toward the normal curve is unmistakable and the mean of those 20 samples is \$505.18.
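The experiment described here is easy to repeat. The following is a minimal sketch, assuming simple random sampling without replacement. The helper `sample_means` and the seed are illustrative choices, so the numbers it prints will not match table 6.3 or figure 6.6, which come from the author's own draws.

```python
import random
from statistics import mean

# PCGDP values (U.S. dollars) for the 50 countries in table 6.1.
PCGDP = [
    118, 151, 159, 195, 201, 211, 257, 271, 289, 291,
    330, 345, 354, 362, 368, 377, 377, 379, 386, 393,
    394, 403, 419, 428, 452, 483, 554, 555, 556, 598,
    612, 618, 618, 647, 692, 704, 704, 711, 762, 786,
    797, 815, 874, 908, 912, 953, 967, 974, 976, 978,
]


def sample_means(values, n_samples, sample_size, rng):
    """Return the means of n_samples random samples drawn without replacement."""
    return [mean(rng.sample(values, sample_size)) for _ in range(n_samples)]


population_mean = mean(PCGDP)
print(f"population mean: {population_mean:.2f}")

rng = random.Random(2007)  # arbitrary seed, used only to make the run repeatable
for n_samples in (10, 20):
    means = sample_means(PCGDP, n_samples, 5, rng)
    print(f"{n_samples} samples of 5: mean of sample means = {mean(means):.2f}")
```

Changing the seed, or replacing `(10, 20)` with larger counts, shows how much the mean of the sample means moves from one run to the next.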

FIGURE 6.4. Five cases and the distribution of samples of size 2 from those cases.

FIGURE 6.5. The distribution of the 50 data points for GDP in table 6.1.

Table 6.3 10 Means from Samples of Size 5 Taken from the 50 Elements in Table 6.1

522.6, 652.8, 434.4, 461.2, 586.2, 489.2, 468.2, 458.6, 465, 509

Mean = 504.72; Standard Deviation = 67.51

The problem is that in real research, we don’t get to take 10 or 20 samples. We have to make do with one. The first sample of five elements that I took had a mean of \$522.60, pretty close to the actual mean of \$533.28. But it’s very clear from table 6.3 that any one sample of five elements from table 6.1 could be off by a lot. The sample means range, after all, from \$434.40 to \$652.80. That’s a very big spread when the real average we’re trying to estimate is \$533.28. Still, as you can see from figure 6.6, as we add samples, the mean of the samples gets closer and closer to the parameter we’re trying to estimate, and the distribution of the means of the samples looks more and more like the normal distribution.
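The summary figures reported with table 6.3 can be reproduced from its 10 values. The sketch below assumes the standard deviation is the sample standard deviation (n − 1 in the denominator), which is the version that matches the reported 67.51. It also shows how far the worst single sample misses the parameter.

```python
from statistics import mean, stdev

# The 10 sample means reported in table 6.3 (U.S. dollars).
TABLE_6_3 = [522.6, 652.8, 434.4, 461.2, 586.2, 489.2, 468.2, 458.6, 465, 509]
POPULATION_MEAN = 533.28  # actual mean of table 6.1, as given in the text

mean_of_means = mean(TABLE_6_3)
sd_of_means = stdev(TABLE_6_3)  # sample SD: n - 1 in the denominator (assumed)

# Signed error of each single sample, and the one that misses by the most.
errors = [m - POPULATION_MEAN for m in TABLE_6_3]
worst_miss = max(errors, key=abs)

print(f"mean of the 10 sample means: {mean_of_means:.2f}")
print(f"standard deviation: {sd_of_means:.2f}")
print(f"range: {min(TABLE_6_3):.2f} to {max(TABLE_6_3):.2f}")
print(f"largest miss by a single sample: {worst_miss:+.2f}")
```

The largest miss, about \$120 on a parameter of \$533.28, is the practical meaning of "could be off by a lot".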

We are much closer to answering the question: How big does a sample have to be?