# The Variance

The variance is a measure of dispersion that avoids two problems with the raw differences from the mean: they total zero, and some of them are negative. You calculate it by squaring each record's difference from the mean, totaling the squared differences, and then dividing that total by the number of records.
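The calculation can be sketched in a few lines of Python. This is only an illustration: the `amounts` values and the `population_variance` helper are made up for the example and are not part of IDEA. The standard library's `statistics.pvariance` serves as a cross-check.

```python
import statistics


def population_variance(values):
    """Average of the squared differences from the mean (divides by n)."""
    mean = sum(values) / len(values)
    squared_diffs = [(v - mean) ** 2 for v in values]
    return sum(squared_diffs) / len(values)


# Hypothetical data, chosen so the arithmetic is easy to follow.
amounts = [2, 4, 4, 4, 5, 5, 7, 9]

# The raw differences from the mean always total zero,
# which is why they are squared before being totaled.
mean = sum(amounts) / len(amounts)
print(sum(v - mean for v in amounts))   # 0.0

print(population_variance(amounts))     # 4.0
print(statistics.pvariance(amounts))    # same value from the standard library
```

The mean here is 5, the squared differences total 32, and 32 divided by 8 records gives a variance of 4.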

IDEA's field statistics provide the population variance of the dataset.

# The Standard Deviation

The most common measure of variability is the standard deviation. It describes the typical distance of the numbers from the center, or average, of the data. The standard deviation is the square root of the variance.

You calculate it by squaring each of the differences from the mean, totaling the squared differences, dividing that total by the number of records, and then taking the square root of the result. The standard deviation tells us how variable the data is, that is, how far the numbers typically lie from the mean. The further the numbers deviate from the mean, the larger the standard deviation.

The standard deviation can also be used as a relative measure, both between the numbers in a data set and between different data sets. The standard deviation itself is expressed in the same units as the data, but once a number's distance from the mean is expressed as a ratio to the standard deviation, differences in scale or base no longer matter. If you were comparing test scores, whether they are marked out of 100 or 125, a score's distance from the mean in standard deviations can be compared without any additional calculations.
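As a minimal sketch of those steps, assuming the same kind of made-up data (the `amounts` list and the `population_std_dev` helper are hypothetical, not something IDEA exposes):

```python
import math
import statistics


def population_std_dev(values):
    """Square root of the population variance (total of squares divided by n)."""
    n = len(values)
    mean = sum(values) / n
    total_squared_diffs = sum((v - mean) ** 2 for v in values)
    return math.sqrt(total_squared_diffs / n)


# Hypothetical data: mean is 5 and the squared differences total 32.
amounts = [2, 4, 4, 4, 5, 5, 7, 9]

print(population_std_dev(amounts))   # 2.0
print(statistics.pstdev(amounts))    # same value from the standard library
```

A standard deviation of 2 means the values typically sit about 2 units from the mean of 5, in the same units as the data.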

## Standard Deviation of a Population versus Standard Deviation of a Sample

The standard deviation calculation discussed earlier is the population standard deviation. The total of the squared differences is divided by the number of records, n.

When calculating the standard deviation of a sample, the total of the squared differences is divided by the number of records minus 1, or n - 1. Subtracting 1 is a correction factor: the smaller denominator produces a slightly larger standard deviation for the sample, which should be more representative of the true standard deviation of the population.
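Written as formulas, the two calculations differ only in the denominator. The notation is an assumption made for this illustration and does not come from IDEA: $\sigma$ and $\mu$ denote the population standard deviation and mean, $s$ and $\bar{x}$ the sample standard deviation and mean, and $x_i$ the individual records.

$$
\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \mu\right)^2}
\qquad\qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}
$$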

The accuracy of a sample's standard deviation increases with the sample size. As the sample size, n, grows, the sample standard deviation gets closer to the standard deviation of the population, and the correction factor has less impact. Dividing by 4 instead of 5 (a sample size of 5) changes the result far more than dividing by 499 instead of 500 (a sample size of 500).
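The shrinking impact of the correction can be checked numerically. The sketch below is illustrative only: `std_dev`, `correction_ratio`, and the `amounts` data are hypothetical names and values. For the same records, the figure divided by n - 1 is larger than the figure divided by n by a factor of the square root of n / (n - 1).

```python
import math


def std_dev(values, sample=False):
    """Standard deviation dividing by n - 1 (sample) or n (population)."""
    n = len(values)
    mean = sum(values) / n
    total_squared_diffs = sum((v - mean) ** 2 for v in values)
    denominator = n - 1 if sample else n
    return math.sqrt(total_squared_diffs / denominator)


def correction_ratio(n):
    """Sample figure divided by population figure for the same n records."""
    return math.sqrt(n / (n - 1))


amounts = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data
print(std_dev(amounts))               # population: divides by 8
print(std_dev(amounts, sample=True))  # sample: divides by 7, so slightly larger

print(round(correction_ratio(5), 4))    # 1.118
print(round(correction_ratio(500), 4))  # 1.001
```

With 5 records the correction raises the result by roughly 12 percent; with 500 records the increase is about a tenth of a percent.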

IDEA's field statistics provide both the population and the sample standard deviation of the dataset.

Z-scores, or standard scores, tell us how many standard deviations a number is away from the mean. They are a good example of the standard deviation being applied in a formula. Z-scores are discussed further along in the book.
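As a brief preview, a z-score is commonly computed as the number minus the mean, divided by the standard deviation. The sketch below assumes that definition with the population standard deviation. The `z_scores` helper and both score lists are made up for illustration, with the second list being the first rescaled from a base of 100 to a base of 125.

```python
import statistics


def z_scores(values):
    """Distance of each value from the mean, in population standard deviations."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical test scores: the same results marked out of 100 and out of 125.
scores_out_of_100 = [60, 70, 80, 90]
scores_out_of_125 = [75.0, 87.5, 100.0, 112.5]

print([round(z, 2) for z in z_scores(scores_out_of_100)])  # [-1.34, -0.45, 0.45, 1.34]
print([round(z, 2) for z in z_scores(scores_out_of_125)])  # same z-scores
```

The raw standard deviations of the two lists differ because the scales differ, but the z-scores match, which is what makes them comparable across bases.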