# A Note on the Decomposition of Health Inequality by Population Subgroups in the Case of Ordinal Variables

- Introduction
- The Decomposition of Health Inequality by Population Subgroups
- The Proposal of Kobus and Miloś (2012)
- The Gini-Related Index of Lv et al. (2015)
- The Properties of the Index Introduced by Lv et al. (2015)
- Decomposing by Population Subgroups the Gini-Related Index Proposed by Lv et al. (2015)
- An Empirical Illustration
- References

**Pundarik Mukhopadhaya and Jacques Silber**

## Introduction

"Global inequality takes many dimensions. Not only is there great inequality across the peoples of the world in material standards of living, but there are also dramatic inequalities in health. The inhabitants of poor countries not only have lower real incomes, but they are also more often sick, and they live shorter lives. These international correlations between income and health should affect the way that we think about the level and distribution of global wellbeing. They also need to be understood if we are to be effective in reducing global deprivation in either income or health" (Deaton 2006).

There is certainly a strong need to better grasp the determinants and impact of health inequality. However, efforts must also be devoted to measurement issues, in particular because data on health often appear as ordinal variables, as in the case of self-assessments of health. The present note is an attempt in this direction. We show that one of the health inequality indices recently derived axiomatically by Lv et al. (2015) is related to the Gini index and hence may be decomposed by population subgroups into the sum of three elements: between-groups inequality, within-groups inequality, and a residual reflecting the extent of overlap between the distributions of health status in the various population subgroups. Such a breakdown is useful for policy makers, as it should allow them to focus their attention on the component of health inequality to which priority should be given.

This note is organized as follows. Section 3.2 describes the methodology while Section 3.3 gives a simple numerical illustration.

## The Decomposition of Health Inequality by Population Subgroups

### The Proposal of Kobus and Miloś (2012)

These authors used one of the indices derived by Abul Naga and Yalcin (2008) and defined as

where $F_k$ refers to the cumulative relative frequency of category $k$, $K$ to the number of health categories, and *me* to the category corresponding to the median of the distribution.

Kobus and Miloś (2012) proposed a generalization $I_{ab}$ of this index, which is expressed as

with $a > 0$ and $b > 0$.

If $a = 1$ and $b = 1$, we get the index $I_{AY}$. Note that when $a > b$, the index is more sensitive to inequality below the median, the opposite being true when $b > a$.

Kobus and Miloś (2012) compute the (relative) contribution $C_l$ of a given population subgroup $l$ to health inequality in the whole population as

$$C_l = \frac{r_l \, I_{ab}^{l}}{I_{ab}} \tag{3.3}$$

where $r_l$ is the weight of population subgroup $l$ in the whole population, $I_{ab}^{l}$ is the generalized Abul Naga-Yalcin index for population subgroup $l$, and $I_{ab}$ is the value of this index in the total population.

### The Gini-Related Index of Lv et al. (2015)

#### The Properties of the Index Introduced by Lv et al. (2015)

Lv et al. (2015) proposed an axiomatic derivation of measures of health inequality in the case of ordinal variables and came up with two families of indices. We focus on the index they derived that is related to the Gini index. This index, $I_{LWX}$, is defined as

$$I_{LWX} = \frac{2}{K-1} \sum_{h=1}^{K} \sum_{k=1}^{K} |h - k| \, f_h f_k \tag{3.4}$$

where $K$ is the total number of health categories, the latter referring for example to degrees of satisfaction with health; $h$ and $k$ refer to the ranks of two different health categories, the categories being ranked by increasing degrees of satisfaction; and $f_h$ and $f_k$ are the relative frequencies of these two categories.^{[1]}

Note that Lv et al. (2015) stressed that the index in (3.4) is identical to one of the ordinal segregation indices proposed by Reardon (2009) and considered by Lazar and Silber (2013) as a possible health inequality index. Using our previous notation, this index (see Reardon, 2009) is defined as

$$I_{LWX} = \frac{4}{K-1} \sum_{k=1}^{K-1} F_k \, (1 - F_k) \tag{3.5}$$

where $F$ refers to the cumulative relative frequency distribution, so that $F_k = \sum_{h=1}^{k} f_h$.
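The equivalence between the pairwise form of the index and Reardon's cumulative form can be checked numerically. The sketch below assumes that the pairwise form is $\frac{2}{K-1}\sum_h\sum_k |h-k| f_h f_k$ and that the cumulative form is $\frac{4}{K-1}\sum_{k=1}^{K-1} F_k(1-F_k)$; the function names are illustrative, and the frequencies are those of the total population in the empirical illustration.

```python
def lwx_pairwise(f):
    """Pairwise form: (2 / (K - 1)) * sum_h sum_k |h - k| * f_h * f_k."""
    K = len(f)
    delta = sum(abs(h - k) * f[h] * f[k] for h in range(K) for k in range(K))
    return 2.0 / (K - 1) * delta


def lwx_cumulative(f):
    """Cumulative form: (4 / (K - 1)) * sum over k < K of F_k * (1 - F_k)."""
    K = len(f)
    total = 0.0
    cum = 0.0
    for share in f[:-1]:  # the last category has F_K = 1 and contributes nothing
        cum += share
        total += cum * (1.0 - cum)
    return 4.0 / (K - 1) * total


# Relative frequencies of the four health categories (total population).
f_total = [0.35, 0.15, 0.20, 0.30]
print(round(lwx_pairwise(f_total), 4))    # 0.9167
print(round(lwx_cumulative(f_total), 4))  # 0.9167
```

Both forms also return 1 for the most polarized distribution, with half of the population in the lowest category and half in the highest.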

In deriving (3.4), Lv et al. (2015) assumed that the following axioms^{[2]} hold:

- Normalization: this axiom assumes that, if everyone has the same health status, then health inequality is equal to 0.
- Simple Aversion to Median-Preserving Spreads: a "median-preserving" increase in the spread of a frequency distribution increases its inequality. Note that the idea of aversion to median-preserving spreads appears already in Allison and Foster (2004) and Abul Naga and Yalcin (2008).
- Invariance to Parallel Shifts: this axiom assumes that, when all the health statuses are clustered on two categories, a "parallel" shift of the entire frequency distribution leaves the health inequality index unchanged, although pushing each category one level up or one level down may change the level of "overall" health status.
- Additivity: the idea here is that an index of health inequality is the sum of the inequalities between all possible couples of individuals.
- Independence: this axiom requires that, for any two health categories $i$ and $j$, the change in health inequality, when there is no change in the relative frequency $f_i$ of category $i$ but a change $\delta f_j$ in the relative frequency of category $j$, is independent of the level of the relative frequency $f_j$.
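Three of these axioms lend themselves to a quick numerical check. The sketch below is only an illustration, not a proof: it assumes the index takes the pairwise form $\frac{2}{K-1}\sum_h\sum_k |h-k| f_h f_k$, and the frequency vectors are hypothetical examples chosen for the purpose.

```python
def lwx_index(f):
    """(2 / (K - 1)) * sum_h sum_k |h - k| * f_h * f_k for frequencies f."""
    K = len(f)
    delta = sum(abs(h - k) * f[h] * f[k] for h in range(K) for k in range(K))
    return 2.0 / (K - 1) * delta


# Normalization: everyone in the same category gives zero inequality.
same_status = [0.0, 1.0, 0.0, 0.0]

# Median-preserving spread: mass moves from the median category (the
# second of three) toward both extremes; the median category is unchanged.
before_spread = [0.2, 0.6, 0.2]
after_spread = [0.3, 0.4, 0.3]

# Parallel shift: a two-category distribution moved two levels up.
low_pair = [0.4, 0.6, 0.0, 0.0]
high_pair = [0.0, 0.0, 0.4, 0.6]

print(lwx_index(same_status))                                     # 0.0
print(lwx_index(before_spread) < lwx_index(after_spread))         # True
print(abs(lwx_index(low_pair) - lwx_index(high_pair)) < 1e-12)    # True
```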

#### Decomposing by Population Subgroups the Gini-Related Index Proposed by Lv et al. (2015)

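Using (3.4) it is easy to observe that

$$\sum_{h=1}^{K} \sum_{k=1}^{K} |h - k| \, f_h f_k = \Delta \tag{3.6}$$

where $\Delta$ is Gini's mean difference of the distribution of the ranks of the different categories. Using (3.4) and (3.6) we may then rewrite (3.4) as

$$I_{LWX} = \frac{2}{K-1} \, \Delta \tag{3.7}$$

Remembering now that the Gini index $I_G$ of the distribution of these ranks may be expressed (see Kendall and Stuart, 1969) as

$$I_G = \frac{\Delta}{2 \bar{h}} \tag{3.8}$$

where $\bar{h} = \sum_{h=1}^{K} f_h \, h$ refers to the mean rank in the distribution of ranks, we can combine expressions (3.4) to (3.8) to derive that

$$I_{LWX} = \frac{4 \bar{h}}{K-1} \, I_G \tag{3.9}$$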

Silber (1989), however, showed that the Gini index $I_G$ could also be expressed as

$$I_G = e' G s \tag{3.10}$$

where $e'$ is a row vector of the relative frequencies $f_h$, the latter being classified by decreasing rank of the health categories (e.g., by decreasing degree of satisfaction with one's health), and $s$ is a column vector of the shares $s_h$ in "total rank value," these shares being also classified by decreasing rank of the health categories. In other words, these shares are defined as $s_h = f_h (h / \bar{h})$, where $h$ is the rank of health category $h$ and $\bar{h}$ the average rank in the population.

Finally $G$ is a square matrix, called the G-matrix (see Silber, 1989), whose typical element $g_{ij}$ is equal to 0 if $i = j$, to $-1$ if $j > i$, and to $+1$ if $i > j$.
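The G-matrix formulation can be compared with the mean-difference formulation of the Gini index. The following sketch assumes $I_G = e'Gs$ with $s_h = f_h h / \bar{h}$ and, alternatively, $I_G = \Delta / (2\bar{h})$ with $\Delta = \sum_h\sum_k |h-k| f_h f_k$; the function names are illustrative, and the frequencies are those of the total population in the empirical illustration.

```python
def gini_from_mean_difference(f, ranks):
    """Gini index of the ranks as Delta / (2 * mean rank)."""
    mean_rank = sum(fh * h for fh, h in zip(f, ranks))
    delta = sum(abs(h - k) * fh * fk
                for fh, h in zip(f, ranks)
                for fk, k in zip(f, ranks))
    return delta / (2.0 * mean_rank)


def gini_from_g_matrix(f, ranks):
    """Gini index of the ranks as e' G s, categories sorted by decreasing rank."""
    mean_rank = sum(fh * h for fh, h in zip(f, ranks))
    order = sorted(range(len(f)), key=lambda i: ranks[i], reverse=True)
    e = [f[i] for i in order]                          # relative frequencies
    s = [f[i] * ranks[i] / mean_rank for i in order]   # shares in total rank value
    n = len(order)
    # G-matrix: 0 on the diagonal, -1 above it (j > i), +1 below it (i > j).
    G = [[0 if i == j else (-1 if j > i else 1) for j in range(n)]
         for i in range(n)]
    return sum(e[i] * G[i][j] * s[j] for i in range(n) for j in range(n))


f_total = [0.35, 0.15, 0.20, 0.30]
ranks = [1, 2, 3, 4]
print(round(gini_from_mean_difference(f_total, ranks), 4))  # 0.2806
print(round(gini_from_g_matrix(f_total, ranks), 4))         # 0.2806
```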

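As mentioned in Silber (1989), the Gini index $I_G$ may be decomposed by population subgroups as follows:

$$I_G = I_{\text{within}} + I_{\text{between}} + I_{\text{overlap}} \tag{3.11}$$

where $I_{\text{within}}$, $I_{\text{between}}$, and $I_{\text{overlap}}$ refer to the within-groups Gini index, the between-groups Gini index, and the overlap component of the overall Gini index.

Combining (3.8) and (3.11) we derive that

$$\Delta = \Delta_{\text{between}} + \Delta_{\text{within}} + \Delta_{\text{overlap}} \tag{3.12}$$

where $\Delta_{\text{between}}$, $\Delta_{\text{within}}$, and $\Delta_{\text{overlap}}$ refer respectively to the between-groups mean difference, the within-groups mean difference, and the overlap component of the mean difference.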

Note that the within-groups Gini index may be written (see Silber, 1989) as

$$I_{\text{within}} = \sum_{l=1}^{L} r_l \, q_l \, I_{G,l} \tag{3.13}$$

where $L$ refers to the total number of population subgroups, $r_l$ to the share of population subgroup $l$ in the total population, and $q_l$ to the share of population subgroup $l$ in the "total rank value" of the whole population. In other words, $q_l = r_l \sum_{h=1}^{K} f_h^l \, h / \bar{h} = r_l \bar{h}^l / \bar{h}$, where $f_h^l$ is the relative frequency of category $h$ in population subgroup $l$, $h$ the rank of category $h$, and $\bar{h}^l$ the average rank in subpopulation $l$. Finally $I_{G,l}$ is the Gini index of the distribution of ranks in population subgroup $l$.

The between-groups Gini index $I_{\text{between}}$ is defined (see Silber, 1989) as

$$I_{\text{between}} = r' G q \tag{3.14}$$

where $r'$ is a row vector of the population shares $r_l$, $q$ a column vector of the shares $q_l$, both sets of shares being ranked by decreasing values of the average ranks $\bar{h}^l$ of the different population subgroups, and $G$ is an $L$ by $L$ G-matrix.

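Finally $I_{\text{overlap}}$ is a residual obtained by deducting the sum $(I_{\text{within}} + I_{\text{between}})$ from the total Gini index^{[3]} $I_G$.

Combining (3.8) and (3.13) we derive that

$$I_{\text{within}} = \sum_{l=1}^{L} r_l \, q_l \, \frac{\Delta_l}{2 \bar{h}^l} \tag{3.15}$$

where $\Delta_l$ is the mean difference of the distribution of ranks in population subgroup $l$. However, we may also define $I_{\text{within}}$ as

$$I_{\text{within}} = \frac{\Delta_{\text{within}}}{2 \bar{h}} \tag{3.16}$$

where $\Delta_{\text{within}}$ refers to the within-groups mean difference of the distribution of ranks.

Combining (3.15) and (3.16), and recalling that $q_l = r_l \bar{h}^l / \bar{h}$, we conclude that

$$\Delta_{\text{within}} = \bar{h} \sum_{l=1}^{L} r_l \, q_l \, \frac{\Delta_l}{\bar{h}^l} = \sum_{l=1}^{L} r_l^2 \, \Delta_l \tag{3.17}$$

Using (3.14), and defining the between-groups mean difference $\Delta_{\text{between}}$ as

$$\Delta_{\text{between}} = 2 \bar{h} \, I_{\text{between}} \tag{3.18}$$

we derive that

$$\Delta_{\text{between}} = 2 \, r' G \, [r_l \bar{h}^l] \tag{3.19}$$

where $[r_l \bar{h}^l]$ refers to a column vector whose typical element is $r_l \bar{h}^l$, the shares $r_l$ and the mean ranks $\bar{h}^l$ in (3.19) being ranked by decreasing values of the mean ranks $\bar{h}^l$.

Combining (3.7), (3.12), (3.17) and (3.19) we finally conclude that

$$I_{LWX} = \frac{2}{K-1} \left( \Delta_{\text{between}} + \Delta_{\text{within}} + \Delta_{\text{overlap}} \right) \tag{3.20}$$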

Equation (3.20) allows one to determine the contributions to overall health inequality of the within population subgroups health inequality, the between population subgroups health inequality, as well as the contribution of the degree of overlap between the distributions of the individuals by health category in the different population subgroups.
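The three-way breakdown can be sketched in a few lines of code. The sketch assumes that the within-groups mean difference is $\sum_l r_l^2 \Delta_l$, that the between-groups mean difference is $2\,r'G\,[r_l \bar{h}^l]$ with groups sorted by decreasing mean rank, and that the overlap term is the residual. The function names and the counts for the two overlapping subgroups are hypothetical.

```python
def mean_difference(f):
    """Gini's mean difference of the ranks: sum_h sum_k |h - k| f_h f_k."""
    K = len(f)
    return sum(abs(h - k) * f[h] * f[k] for h in range(K) for k in range(K))


def mean_rank(f):
    """Average rank, with categories ranked 1..K."""
    return sum(fh * (h + 1) for h, fh in enumerate(f))


def decompose(counts):
    """Split the overall mean difference into within, between and overlap.

    counts maps each subgroup name to its list of counts per health category.
    """
    K = len(next(iter(counts.values())))
    n = sum(sum(c) for c in counts.values())
    share = {g: sum(c) / n for g, c in counts.items()}
    freq = {g: [x / sum(c) for x in c] for g, c in counts.items()}
    total_freq = [sum(c[h] for c in counts.values()) / n for h in range(K)]

    within = sum(share[g] ** 2 * mean_difference(freq[g]) for g in counts)

    # Between term via the G-matrix, groups sorted by decreasing mean rank.
    order = sorted(counts, key=lambda g: mean_rank(freq[g]), reverse=True)
    r = [share[g] for g in order]
    v = [share[g] * mean_rank(freq[g]) for g in order]
    L = len(order)
    between = 2.0 * sum(
        r[i] * (0 if i == j else (-1 if j > i else 1)) * v[j]
        for i in range(L) for j in range(L)
    )

    total = mean_difference(total_freq)
    overlap = total - within - between  # residual
    index = 2.0 / (K - 1) * total
    return {"within": within, "between": between,
            "overlap": overlap, "index": index}


# Hypothetical overlapping subgroups, four health categories each.
result = decompose({"A": [4, 3, 2, 1], "B": [1, 2, 3, 4]})
print({name: round(value, 4) for name, value in result.items()})
```

With these hypothetical counts the two distributions overlap, so the residual is strictly positive; multiplying each component by $2/(K-1)$ gives its contribution to the index.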

A simple numerical illustration is given in the next section.

## An Empirical Illustration

Assume two population subgroups, A and B, and four health categories. We will assume that the distributions of the individuals of these two population subgroups, between the four health categories, do not overlap. Here are the data concerning the distribution of the individuals between the various health categories (Tables 3.1-3.3).

These data allow us first to compute the average ranks.

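For the total population the average rank $\bar{h}$ is

$$\bar{h} = (0.35)(1) + (0.15)(2) + (0.2)(3) + (0.3)(4) = 2.45$$

For population subgroup A we compute the average rank $\bar{h}^A$ as

$$\bar{h}^A = (0.7)(1) + (0.3)(2) = 1.3$$

Finally, for population subgroup B we compute the average rank $\bar{h}^B$ as

$$\bar{h}^B = (0.4)(3) + (0.6)(4) = 3.6$$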

TABLE 3.1: Number of Observations

| Population Subgroup | Lowest Health Category (1) | Second Lowest Health Category (2) | Second Highest Health Category (3) | Highest Health Category (4) |
|---|---|---|---|---|
| A | 7 | 3 | 0 | 0 |
| B | 0 | 0 | 4 | 6 |
| Total population | 7 | 3 | 4 | 6 |

TABLE 3.2: Relative Frequencies

| Population Subgroup | Lowest Health Category (1) | Second Lowest Health Category (2) | Second Highest Health Category (3) | Highest Health Category (4) |
|---|---|---|---|---|
| A | 0.7 | 0.3 | 0 | 0 |
| B | 0 | 0 | 0.4 | 0.6 |
| Total population | 0.35 | 0.15 | 0.2 | 0.3 |

TABLE 3.3: Cumulative Relative Frequencies

| Population Subgroup | Lowest Health Category (1) | Second Lowest Health Category (2) | Second Highest Health Category (3) | Highest Health Category (4) |
|---|---|---|---|---|
| A | 0.7 | 1 | 1 | 1 |
| B | 0 | 0 | 0.4 | 1 |
| Total population | 0.35 | 0.50 | 0.70 | 1 |

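Using (3.5) we easily derive that the Reardon (2009)-Lv et al. (2015) index for the whole population is equal to

$$I_{LWX} = \frac{4}{3} \left[ (0.35)(0.65) + (0.5)(0.5) + (0.7)(0.3) \right] = \frac{4}{3}(0.6875) \approx 0.9167$$

Using (3.7) we then derive the following mean differences:

$$\Delta = \frac{3}{2} \cdot \frac{4}{3}(0.6875) = 1.375, \qquad \Delta_A = 2(0.7)(0.3) = 0.42, \qquad \Delta_B = 2(0.4)(0.6) = 0.48$$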

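Using now (3.17) we derive that the within-groups mean difference $\Delta_{\text{within}}$ is expressed as

$$\Delta_{\text{within}} = (0.5)^2 (0.42) + (0.5)^2 (0.48) = 0.225$$

Using (3.19) we then derive that the between-groups mean difference $\Delta_{\text{between}}$ is expressed as

$$\Delta_{\text{between}} = 2 \begin{pmatrix} 0.5 & 0.5 \end{pmatrix} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} (0.5)(3.6) \\ (0.5)(1.3) \end{pmatrix} = 2(0.9 - 0.325) = 1.15$$

It is easy to prove that $\Delta_{\text{between}}$ in the case of two population subgroups may also be expressed as

$$\Delta_{\text{between}} = 2 \, r_A r_B \, |\bar{h}^B - \bar{h}^A| = 2(0.5)(0.5)(3.6 - 1.3) = 1.15$$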

We then derive that the sum $(\Delta_{\text{between}} + \Delta_{\text{within}}) = 1.15 + 0.225 = 1.375$.

Using (3.4) we can multiply this last result by $2/(K-1)$, that is by $2/3$, and we obtain a value of approximately 0.9167, which is exactly the value of the Reardon (2009)-Lv et al. (2015) index that we computed directly above.
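The arithmetic of this illustration can be reproduced directly from the counts in Table 3.1. The sketch below assumes the two-subgroup shortcut $\Delta_{\text{between}} = 2 r_A r_B |\bar{h}^B - \bar{h}^A|$ and the within-groups formula $\sum_l r_l^2 \Delta_l$; the helper names are illustrative.

```python
counts = {"A": [7, 3, 0, 0], "B": [0, 0, 4, 6]}  # Table 3.1
K = 4
n = sum(sum(c) for c in counts.values())


def rel_freq(c):
    """Relative frequencies from raw counts."""
    total = sum(c)
    return [x / total for x in c]


def mean_rank(f):
    """Average rank, with categories ranked 1..K."""
    return sum(fh * (h + 1) for h, fh in enumerate(f))


def mean_difference(f):
    """Gini's mean difference of the ranks: sum_h sum_k |h - k| f_h f_k."""
    return sum(abs(h - k) * f[h] * f[k]
               for h in range(len(f)) for k in range(len(f)))


f_A, f_B = rel_freq(counts["A"]), rel_freq(counts["B"])
f_total = rel_freq([a + b for a, b in zip(counts["A"], counts["B"])])
r_A, r_B = sum(counts["A"]) / n, sum(counts["B"]) / n

delta_total = mean_difference(f_total)
delta_within = r_A ** 2 * mean_difference(f_A) + r_B ** 2 * mean_difference(f_B)
delta_between = 2 * r_A * r_B * abs(mean_rank(f_B) - mean_rank(f_A))
index = 2.0 / (K - 1) * (delta_within + delta_between)

print(round(mean_rank(f_total), 4), round(mean_rank(f_A), 4),
      round(mean_rank(f_B), 4))                           # 2.45 1.3 3.6
print(round(delta_total, 4))                              # 1.375
print(round(delta_within, 4), round(delta_between, 4))    # 0.225 1.15
print(round(index, 4))                                    # 0.9167
```

Because the two subgroups occupy disjoint health categories, the within and between terms add up to the overall mean difference and no residual remains.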

In the case of more than two distributions we can proceed similarly. If the distributions are overlapping, then obviously the sum of the between- and within-groups mean differences will not be equal to the overall mean difference. There will remain a residual which is in fact a measure of the degree of overlap of these distributions.

## References

Abul Naga, R. H. and Yalcin, T. (2008). Inequality Measurement for Ordered Response Health Data, *Journal of Health Economics* 27: 1614-1625.

Allison, R. A. and Foster, J. (2004). Measuring Health Inequalities Using Qualitative Data, *Journal of Health Economics* 23: 505-524.

Dagum, C. (1997). A New Approach to the Decomposition of the Gini Income Inequality Ratio, *Empirical Economics* 22: 515-531.

Deaton, A. (2006). Global Patterns of Income and Health: Facts, Interpretations, and Policies, WIDER Annual Lecture, Helsinki, September 29th 2006. NBER Working Paper Series. 10.3386/w12735. https://www.nber.org/system/files/working_papers/w12735/w12735.pdf.

Kendall, M. G. and Stuart, A. (1969). *The Advanced Theory of Statistics*, London: Charles Griffin and Company Limited.

Kobus, M. and Miloś, P. (2012). Inequality Decomposition by Population Subgroups for Ordinal Data, *Journal of Health Economics* 31: 15-21.

Lazar, A. and Silber, J. (2013). On the Cardinal Measurement of Health Inequality When only Ordinal Information is Available on Individual Health Status, *Health Economics* 22: 106-113.

Lv, G., Wang, Y. and Xu, Y. (2015). On a New Class of Measures for Health Inequality Based on Ordinal Data, *Journal of Economic Inequality* 13: 465-477.

Reardon, S. F. (2009). Measures of Ordinal Segregation, *Research on Economic Inequality*, vol. 17, pp. 129-155, Bingley, UK: Emerald.

Silber, J. (1989). Factor Components, Population Subgroups and the Computation of the Gini Index of Inequality, *The Review of Economics and Statistics* LXXI: 107-115.

- [1] The coefficient $2/(K-1)$ was introduced by Lv et al. (2015) to make sure that their index lies between 0 and 1. It is easy to observe that when inequality is maximal, that is, when half of the population is in the lowest health category and half in the highest health category, nobody being in the other health categories, the index $I_{LWX}$ would, absent this coefficient, be equal to $(K-1)/2$.
- [2] For a more rigorous definition of these axioms, see Lv et al. (2015).
- [3] Dagum (1997) gave a direct formulation of $I_{\text{overlap}}$ which avoids considering it as a residual.