Mechanics
Applying the chi-square test is quite simple. The following paragraphs outline the process and provide a template for practical applications of this statistical method.
Types of Data
The chi-square test is a non-parametric test, also called a distribution-free test. Nonparametric tests can be used when any one of the following conditions pertains to the data (McHugh, 2013):
1. The level of measurement of all the variables is nominal or ordinal.
2. The original data were measured on an interval or ratio scale but violate one of the following assumptions of a parametric test:
   a. The distribution of the data is seriously skewed, violating the assumption that the dependent variable is approximately normally distributed.
   b. The data violate the assumption of equal variance, or homoscedasticity.
   c. For any of a number of reasons, the continuous data were collapsed into a small number of categories, so the data are no longer interval or ratio. For example, the number of vehicles in a household could be used in a chi-square test if it were categorized into groups (e.g., no vehicle versus one or more vehicles).
Assumptions
As with any statistic, there are requirements for its appropriate use, which are called assumptions of the statistic. The assumptions of the chi-square include:
1. The data in the cells are frequencies, or counts of cases. Categorical data may be displayed in a contingency table. For example, Table 8.1 shows counts of cases regarding two variables—trip modes and housing types.
Table 8.1 Contingency Table: Counts of Survey Respondents

| Mode (1 = walk, 2 = bike, 3 = transit, 4 = auto, 0 = others) | Housing type 0 (others) | Housing type 1 (single-family detached) | Total |
|---|---|---|---|
| 0 | 591 | 3,303 | 3,894 |
| 1 | 1,979 | 6,424 | 8,403 |
| 2 | 272 | 1,470 | 1,742 |
| 3 | 963 | 1,424 | 2,387 |
| 4 | 14,087 | 101,856 | 115,943 |
| Total | 17,892 | 114,477 | 132,369 |
2. The levels (or categories) of the variables are mutually exclusive. That is, a particular subject fits into one and only one level of each variable; you cannot ride both transit and an automobile at the same time. In Table 8.1, if you asked respondents “What is your means of transportation to school or work?” and allowed them to select multiple modes used in a single trip (e.g., both walk and transit), you could not apply the chi-square test to examine differences among modes.
3. Each subject may contribute data to one and only one cell in the χ² table. If, for example, the same subjects are tested over time such that the comparisons are of the same subjects at Time 1, Time 2, Time 3, etc., then χ² may not be used. This kind of data is called paired samples.
4. The expected value of the cell should be five or more in at least 80 percent of the cells, and no cell should have an expected value of less than one (Yates, Moore, & McCabe, 1999). This assumption is most likely to be met if the sample size equals at least the number of cells multiplied by five. You will see what the expected value means later; a quick way to check this assumption is sketched below.
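As a rough illustration of assumption 4, the following Python sketch builds a contingency table from individual survey records and checks the expected-count rule. The DataFrame and its column names (mode, housing_type) are hypothetical placeholders, not data from the example in this chapter.

```python
import pandas as pd
from scipy.stats.contingency import expected_freq

# Hypothetical respondent-level records: one row per person, with the
# travel mode (0-4) and housing type (0 or 1) they reported.
df = pd.DataFrame({
    "mode":         [0, 1, 4, 4, 3, 2, 4, 1, 4, 0],
    "housing_type": [0, 1, 1, 1, 0, 1, 1, 0, 1, 1],
})

observed = pd.crosstab(df["mode"], df["housing_type"])  # contingency table of counts
expected = expected_freq(observed.to_numpy())           # expected counts if the variables were unrelated

share_ok = (expected >= 5).mean()                       # share of cells with expected count >= 5
print(observed)
print(f"{share_ok:.0%} of cells have an expected count of 5 or more; "
      f"smallest expected count: {expected.min():.2f}")
```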
Hypotheses in the Chi-Square Test
In this step, we create both a null hypothesis and an alternative hypothesis. The null hypothesis is that there is no difference in the proportion of occurrences in each category; in other words, the variables are not related in any way. In contrast, the alternative hypothesis, or the research hypothesis, is that there is a difference in the proportion of occurrences in each category. If you are interested in the relationship between two variables, you can word the alternative hypothesis as follows: there is a statistically significant relationship between the variables.
Going back to our example, our null hypothesis would be that there is no difference in the proportion of occurrences in each category, and so the two variables (trip modes and housing types) are not related. Our alternative hypothesis is the exact opposite: the relationship between trip modes and housing types is statistically significant. If the p-value is lower than 0.05, it indicates that there is less than a 5 percent chance that the values of each category are randomly distributed. In other words, there is a difference in the proportion of occurrences in each category. In this case, the null hypothesis is rejected and the alternative hypothesis is accepted as true.
Calculate the Test Statistic
The chi-square test statistic is obtained by contrasting the observed frequencies with the expected frequencies. The expected frequencies represent the number of observations that would be found in each cell if the null hypothesis were true or, in other words, if the categorical variables were unrelated. The chi-square equation is:

χ² = Σ (o − e)² / e
In this equation, χ² is chi-square, o is the observed frequency of each cell, and e is the expected frequency, or the number of observations that would be found if the null hypothesis were true.
The expected value for each cell is calculated as:

e = (M_R × M_C) / n

In this equation, M_R represents the row marginal value, or the sum of the cell's row; M_C represents the column marginal value, or the sum of the cell's column; and n is the total sample size. For the first cell in Table 8.1 (mode 0, housing type 0), the math is as follows: (3,894 × 17,892) / 132,369 = 526.3. Table 8.2 provides the results of this calculation for each cell.
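As a sketch of this calculation in Python, the expected counts for the Table 8.1 data can be obtained from the row and column marginals with an outer product:

```python
import numpy as np

# Observed counts from Table 8.1: rows = mode 0-4, columns = housing type 0/1
observed = np.array([
    [   591,   3303],
    [  1979,   6424],
    [   272,   1470],
    [   963,   1424],
    [ 14087, 101856],
])

row_totals = observed.sum(axis=1)   # M_R: 3,894; 8,403; 1,742; 2,387; 115,943
col_totals = observed.sum(axis=0)   # M_C: 17,892 and 114,477
n = observed.sum()                  # total sample size, 132,369

expected = np.outer(row_totals, col_totals) / n  # e = (M_R * M_C) / n for every cell
print(expected.round(1))            # first cell is about 526.3, matching Table 8.2
```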
Once the expected values have been calculated, the cell χ² values, (o − e)² / e, are calculated for each cell. Then they are summed to obtain the χ² statistic for the table.
Why square the difference between observed and expected frequencies? It is to get rid of the minus signs and provide a set of measures whose sum reflects the aggregate degree of difference that actually exists between the observed and expected patterns of frequencies. So a large value of the χ² statistic would not support the null hypothesis and thus lead to its rejection. Conversely, a small value of the χ² statistic indicates a high probability of the result occurring by chance, and you can conclude that no association between the two variables exists.
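A short, self-contained sketch of the full calculation in Python (using the Table 8.1 counts again) looks like this; it should come out to roughly 2,394, the statistic reported later in this section.

```python
import numpy as np

# Observed counts from Table 8.1: rows = mode 0-4, columns = housing type 0/1
observed = np.array([
    [   591,   3303],
    [  1979,   6424],
    [   272,   1470],
    [   963,   1424],
    [ 14087, 101856],
])

# Expected counts under the null hypothesis: e = (row total * column total) / n
n = observed.sum()
expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / n

cell_chi2 = (observed - expected) ** 2 / expected   # (o - e)^2 / e for each cell
print(cell_chi2.round(1))
print(f"chi-square statistic: {cell_chi2.sum():.1f}")  # roughly 2,394 for these data
```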
Table 8.2 Observed Versus Expected Counts
| Mode (1 = walk, 2 = bike, 3 = transit, 4 = auto, 0 = others) | | Housing type 0 (others) | Housing type 1 (single-family detached) | Total |
|---|---|---|---|---|
| 0 | Count | 591 | 3,303 | 3,894 |
| | Expected count | 526.3 | 3,367.7 | 3,894.0 |
| 1 | Count | 1,979 | 6,424 | 8,403 |
| | Expected count | 1,135.8 | 7,267.2 | 8,403.0 |
| 2 | Count | 272 | 1,470 | 1,742 |
| | Expected count | 235.5 | 1,506.5 | 1,742.0 |
| 3 | Count | 963 | 1,424 | 2,387 |
| | Expected count | 322.6 | 2,064.4 | 2,387.0 |
| 4 | Count | 14,087 | 101,856 | 115,943 |
| | Expected count | 15,671.7 | 100,271.3 | 115,943.0 |
| Total | Count | 17,892 | 114,477 | 132,369 |
| | Expected count | 17,892.0 | 114,477.0 | 132,369.0 |
In the end, you have computed a chi-square test statistic of a certain value. This test statistic is compared to a critical value of chi-square to determine the level of statistical significance. Figure 8.1 shows chi-square frequency distributions for different degrees of freedom (df). The area under these curves to the right of a given value represents the probability of getting a chi-square value that great or greater by random chance. You can see, for example, that for 2 or 3 df you will almost never get a chi-square value greater than 10 by chance, while for 10 df you may get a value of chi-square greater than 10 by chance more than 5 percent of the time. For 5 df, it is not obvious how much area lies to the right of a value of 10, and you would have to consult a chi-square table to determine the level of statistical significance. In fact, the critical value of chi-square at the 0.05 significance level for 5 df is 11.07. If you computed a chi-square value of 10, you could not reject the null hypothesis, since you could get a value this great by chance more than 5 percent of the time.
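The tail areas described above can be reproduced with a small Python sketch using scipy's chi-square distribution:

```python
from scipy.stats import chi2

# Probability of exceeding a chi-square value of 10 by chance, for several df
for df in (2, 3, 5, 10):
    print(f"df = {df:2d}: P(chi-square > 10) = {chi2.sf(10, df):.3f}")

# Critical value at the 0.05 significance level for 5 df (about 11.07)
print(f"critical value for df = 5, alpha = 0.05: {chi2.ppf(0.95, 5):.2f}")
```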
Determine the Degrees of Freedom and a Critical Value
We are getting close to drawing some conclusions; however, we cannot interpret the test statistic without considering the degrees of freedom (df).
The chi-square table requires the df in order to determine the significance level of the statistic. The df for a χ² table is calculated with the formula:

df = (r − 1) × (c − 1)

where r is the number of rows (the categories of one variable) and c is the number of columns (the categories of the other variable).
For the preceding example, the 5 × 2 table has (5 − 1) × (2 − 1) = 4 degrees of freedom. In other words, the degrees of freedom are the product of each variable's number of categories minus one.
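The corresponding critical value can be looked up without a printed table; a short sketch, continuing the earlier snippets, is:

```python
from scipy.stats import chi2

r, c = 5, 2                              # rows and columns of the contingency table
dof = (r - 1) * (c - 1)                  # 4 degrees of freedom
critical_value = chi2.ppf(0.95, dof)     # about 9.49 at the 0.05 significance level
print(dof, round(critical_value, 2))
```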

Figure 8.1 Chi-Square Distributions for Different Degrees of Freedom
In this final step of our analysis, we take all of the information we have obtained in earlier steps and begin to pull it together to draw a conclusion. We will assess the test statistic against the critical value at our chosen level of significance (usually 0.05) and either reject or fail to reject our null hypothesis. For Table 8.2, the computed chi-square statistic is 2,394 and we can reject the null hypothesis at the 0.05 level or beyond (way beyond). That is, we can say with 95 percent or greater confidence that travel mode varies by housing type, or equivalently, that the two variables are related to one another.
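In practice, the whole test is usually run with a single library call. The sketch below uses scipy's chi2_contingency on the Table 8.1 counts and should report a statistic of roughly 2,394 with 4 degrees of freedom and a p-value far below 0.05.

```python
import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([
    [   591,   3303],
    [  1979,   6424],
    [   272,   1470],
    [   963,   1424],
    [ 14087, 101856],
])

stat, p_value, dof, expected = chi2_contingency(observed)
print(f"chi-square = {stat:.1f}, df = {dof}, p-value = {p_value:.3g}")
if p_value < 0.05:
    print("Reject the null hypothesis: travel mode and housing type are related.")
else:
    print("Fail to reject the null hypothesis.")
```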
Strength Test for the Chi-Square
The researcher’s work is not quite done yet. Finding a significant difference merely means that there is less than a 5 percent chance that the observed pattern would arise if the values of each category were randomly distributed, that is, if the two variables were unrelated. But recall that statistical significance is not equivalent to practical significance. Statistical significance depends on both the strength of the relationship between the two variables and the number of cases. Because the chi-square statistic grows with the size of the counts, you will likely get statistically significant values when you have a large sample, even if the association between the variables is weak.
For the chi-square, the most commonly used strength tests are the phi coefficient and Cramer’s V. Both depend only on the strength of the relationship between the two categorical variables in a contingency table, not on the sample size.
Phi is used with 2 × 2 contingency tables, and Cramer’s V is used with larger tables. Phi and Cramer’s V take values between 0 and 1. Phi removes the effect of sample size by dividing chi-square by n (the sample size) and taking the square root: phi = √(χ² / n). V removes the effect of sample size by taking the square root of chi-square divided by the product of n and m, where m is the smaller of (rows − 1) and (columns − 1): V = √(χ² / (n × m)). Since phi and V have known distributions, statistical software packages can give us the significance level of the computed phi or V value.
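As a final sketch, Cramer’s V for the Table 8.1 data can be computed directly from the chi-square statistic; with chi-square of about 2,394 and n = 132,369 it works out to roughly 0.13, suggesting a fairly weak association despite the highly significant test result.

```python
import numpy as np

def cramers_v(chi2_stat: float, n: int, n_rows: int, n_cols: int) -> float:
    """Cramer's V: sqrt(chi-square / (n * m)), where m = min(rows - 1, cols - 1)."""
    m = min(n_rows - 1, n_cols - 1)
    return float(np.sqrt(chi2_stat / (n * m)))

def phi(chi2_stat: float, n: int) -> float:
    """Phi coefficient, appropriate only for 2 x 2 tables: sqrt(chi-square / n)."""
    return float(np.sqrt(chi2_stat / n))

# The example table is 5 x 2, so Cramer's V (not phi) is the appropriate measure.
print(round(cramers_v(2394.0, 132369, 5, 2), 2))   # roughly 0.13, a weak association
```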