Multidimensional First-Order Dominance Comparisons of Population Wellbeing
Abstract and Keywords
This chapter conveys the concept of first-order dominance (FOD) with particular focus on applications to multidimensional population welfare comparisons. It gives an account of the fundamental equivalent definitions of FOD both in the one-dimensional and multidimensional setting, illustrated with simple numerical examples. An implementable method for detecting dominances that relies on linear programming is explained along with a bootstrapping procedure that yields additional information relative to what can be obtained from dominance comparisons alone. The chapter discusses strengths and weaknesses of FOD compared to other multidimensional population comparison concepts, and describes practical tools that enable the reader to easily use it.
Keywords: first-order dominance, multidimensional welfare, multidimensional wellbeing, multidimensional population welfare comparison, linear programming, bootstrapping procedure
3.1 Introduction
A central question in applied welfare economics is how to make comparisons of population wellbeing across groups or over time. Appropriate comparison concepts have many potential uses. For example, if a study is able to detect that one population group is clearly worse off than another (i.e. is overall poorer or has less social welfare), society might wish to undertake policies aimed at narrowing this gap. Also, since reducing poverty or improving social welfare over time is often a key objective for public policies and reforms, the ability to make relevant comparisons over time is crucial for the formulation of meaningful goals and for policy evaluation.
The traditional approach to comparing population wellbeing is the use of a social welfare (or poverty) measure based on a one-dimensional individual wellbeing indicator, typically a monetary variable such as income or wealth. However, it has long been recognized that poverty and wellbeing are multidimensional phenomena, which are not adequately represented by a single income variable. As Sen (1976) points out, there is good reason to think that sometimes a richer person may have lower wellbeing than a poorer person; e.g. if he is disabled. This has given rise to numerous proposals of appropriate dimensions to include in multidimensional welfare analyses, including (but not restricted to) health and education (World Bank 1990) as well as standards of living (Sen 1988) to name a few.
Multidimensional welfare is often measured by aggregating multiple dimensions and weighting each dimension (see e.g. Alkire and Foster 2011; Roelen and Gassmann 2008; Rippin 2010). The method is covered in depth in Alkire et al. (2015), a comprehensive treatment of multidimensional poverty measurement and analysis, which the reader is encouraged to consult. The weighting is primarily made in order to reflect societal judgements about different dimensions as well as to be able to obtain a single measure of welfare for a given population. This aggregation procedure enables the analyst to rank the populations. Furthermore, the approach is very convenient and can easily be justified when there exists a reasonably high degree of consensus about which weights should be applied. There is, however, no natural and generally agreed methodology for obtaining these weights. Often, it is not easy to say if one dimension is more important than another, and even when it is, quantifying by how much is often very difficult and perhaps not even meaningful to people.
The challenges described above have motivated the development of methods for comparing population wellbeing, poverty, or inequality with multidimensional indicators that are methodologically ‘robust’ in the sense that the conclusions obtained do not rely on predetermined weights on each dimension. In the context of applied welfare economics, such methods were popularized by Atkinson and Bourguignon (1982) who showed how stochastic dominance techniques for comparisons of probability distributions can be used to make comparisons of populations across broad classes of underlying social welfare functions. Such techniques have been further refined and applied by, e.g., Atkinson and Bourguignon (1987), Bourguignon (1989), Atkinson (1992), Bourguignon and Chakravarty (2003), Duclos et al. (2006, 2007), Gravel et al. (2009), Gravel and Mukhopadhyay (2010), Muller and Trannoy (2011), Gravel and Moyes (2012), and many others.
These contributions apply dominance concepts, which rely on assumptions that are typically formulated in terms of a specified sign on the second- (and higher-)order partial- or cross-derivatives of the underlying individual utility function considered by a utilitarian planner. This leads to so-called lower- or upper-orthant dominance (or even more demanding concepts). For example, Duclos and Échevin (2011) assume substitutability between health and income, i.e. an underlying utility function with a negative cross-partial derivative between health and income.
These concepts, while considerably more robust than methods relying on given weights, do not apply to ordinal data, where only the ranking of outcomes along each dimension is known to the analyst (based on a more-is-better assumption) but no information is available regarding, for example, the complementarity/substitutability relationship across the dimensions. However, welfare indicators are often ordinal in nature. For example, a higher educational attainment (e.g. a university degree) is considered to be better than a lower (e.g. primary school), but quantifying by how much is not easily done and perhaps not even meaningful.
A natural concept for making comparisons of population distributions with multidimensional ordinal data is first-order dominance (FOD), also known as the usual (stochastic) order in the probability theory literature (see e.g. Shaked and Shanthikumar 2007). A finite (population, probability) distribution A first-order dominates distribution B if one can obtain distribution B from A by shifting (population, probability) mass within A from preferred to less preferred outcomes (where a less preferred outcome is not better in any dimension and is strictly worse in at least one dimension). Hence, if one distribution first-order dominates another, it is unambiguously better than the other. Thus, under the assumption that outcomes within each distribution can be ranked—e.g. we prefer the child attending school as opposed to not—the FOD approach provides a maximally robust way of making comparisons of multidimensional welfare. Technically, it does so without making any assumptions on utility functions and/or social welfare functions other than a more-is-better assumption. No additional assumptions are required about the strength of preferences for each dimension, nor about the relative desirability of changes between levels within or between dimensions (Arndt et al. 2012).
The absence of restrictive assumptions in the FOD approach makes the concept not only robust but also intuitively appealing. However, robustness comes at a cost. First, the result of comparing two distributions may be indeterminate. In other words, it may happen that distribution A does not dominate B and B does not dominate A. This makes the analyst unable to distinguish groups A and B according to wellbeing based on the selected indicators. Second, the FOD approach provides no information about whether a dominating distribution is slightly or substantially better than a dominated distribution. This chapter will discuss a way of mitigating these costs by applying a bootstrapping approach that provides a measure for the probability of observing dominances under resampling. This can serve both as a robustness check for the magnitude of the dominances observed and for the probability of observing dominance. Furthermore, if one is willing to accept the likelihood of performing well in head-to-head comparisons with other groups as an indicator of the relative wellbeing of a group, a full ranking of the groups can be calculated (Arndt et al. 2016).
The remainder of this chapter is structured as follows: section 3.2 provides an overview of the theory of FOD with definitions and intuitive explanations using examples. Section 3.3 discusses faster checking algorithms and an alternative dominance criterion to FOD; and lastly, section 3.4 sums up and concludes.
3.2 Theory and Examples
This section provides the basic definitions and theory of the FOD approach, illustrated with some simple examples. Furthermore, a practical linear programming method for detecting dominances is described and the bootstrapping procedure is explained.
3.2.1 One-Dimensional FOD
Suppose first that the outcome of interest is one-dimensional. The outcome could, for example, be individual income (or wealth). In this case, there is a natural ordering of outcomes (assuming that a higher income is better), but only this single dimension is taken into account.
3.2.1.1 Notations and Definitions
Let X denote a finite set of real-valued outcomes. Let the distribution of wellbeing of population A be described by a probability mass function^{1} f over X, i.e. $\sum_{x\in X}f\left(x\right)=1$ and $f\left(x\right)\ge 0$ for all $x\in X$. Similarly, let the distributions of populations B and C be described by the probability mass functions g and h respectively.
Table 3.1. Distributions f, g, and h (per cent), one-dimensional
| Income | Population A (f) | Population B (g) | Population C (h) |
|---|---|---|---|
| 0 (deprived) | 35 | 50 | 40 |
| 1 (not deprived) | 65 | 50 | 60 |
| Total | 100 | 100 | 100 |
Source: Authors’ hypothetical example
As a very simple example, suppose that there are only two possible outcomes, $X=\left\{0,1\right\}$. We will always assume that higher numbers are better, so 0 is the bad outcome (‘income-deprived’), and 1 is the good outcome (‘not income-deprived’). In this situation, a population distribution is completely described by its share of individuals being income-deprived. Table 3.1 shows distributions for three hypothetical populations.
In the one-dimensional case, f first-order dominates g if and only if any of the following (equivalent) conditions hold:^{2}
(a) g can be obtained from f by a finite number of shifts of probability mass in f from one outcome to another that is worse.
(b) Social welfare is at least as high for f as for g for any non-decreasing additively separable social welfare function, i.e. ${{\displaystyle \sum}}_{x\in X}f\left(x\right)w\left(x\right)\ge {{\displaystyle \sum}}_{x\in X}g\left(x\right)w\left(x\right)$ for any weakly increasing real function w(∙).
(c) $F\left(x\right)\le G\left(x\right)$ for all $x\in X$, where F(∙) and G(∙) are the cumulative distribution functions (CDFs)^{3} corresponding to f and g.
Condition (a) provides perhaps the most intuitive definition of FOD. It provides a natural criterion for the case where one distribution is unambiguously better than another. Condition (b) is a robustness property in relation to social welfare comparisons and thus provides a link to welfare economics (and to expected utility theory in the case of a probability distribution). If there is FOD, social welfare will be at least as high for the dominating population no matter the functional form of the social welfare function as long as w(∙) is weakly increasing. For ordinal data, this condition on w(∙) simply means that outcomes can be ranked from worse to better. As noted in the Introduction, no additional assumptions are required. Condition (c) turns out to be equivalent to the first two conditions and is useful for checking FOD.
3.2.1.2 Checking One-Dimensional FOD
In the one-dimensional case, FOD can be checked in a simple and effective way with direct application of condition (c). To illustrate, consider the CDFs F(∙), G(∙), and H(∙) corresponding to the three probability mass functions, f, g, and h, respectively; cf. Table 3.1. We have $F\left(0\right)=0.35$, $G\left(0\right)=0.50$, $H\left(0\right)=0.40$, and of course $F\left(1\right)=G\left(1\right)=H\left(1\right)=1$. These are shown in Figure 3.1 (the black line illustrates F(x), the gray line illustrates G(x), and the dotted line illustrates H(x)).
As can be seen from the graph, $F\left(x\right)\le H\left(x\right)\le G\left(x\right)$ for all $x\in X$. More precisely, $F\left(0\right)<H\left(0\right)<G\left(0\right)$ whereas $F\left(1\right)=G\left(1\right)=H\left(1\right)=1$. Hence f dominates both g and h. Since $H\left(x\right)\le G\left(x\right)$ for all $x\in X$, h dominates g.
Condition (a) also provides an intuitive way of explaining dominances. For example, it can be seen from Table 3.1 that f dominates g since g can be obtained from f by shifting probability mass from one outcome to another that is worse. More precisely, shifting fifteen percentage points from (1) to (0) in f yields exactly g.
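The CDF check in condition (c) is easy to automate. The following is a minimal sketch, assuming each distribution is given as a list of shares (in per cent, as in Table 3.1) ordered from the worst to the best outcome; the function names `cdf` and `dominates_1d` are illustrative and not part of the chapter.

```python
from itertools import accumulate

def cdf(pmf):
    """Cumulative sums of a pmf whose entries run from worst to best outcome."""
    return list(accumulate(pmf))

def dominates_1d(f, g):
    """Condition (c): f first-order dominates g if F(x) <= G(x) for every x."""
    return all(F <= G for F, G in zip(cdf(f), cdf(g)))

# Table 3.1, shares in per cent for outcomes 0 (deprived) and 1 (not deprived).
f = [35, 65]
g = [50, 50]
h = [40, 60]

print(dominates_1d(f, g), dominates_1d(f, h), dominates_1d(h, g))  # True True True
print(dominates_1d(g, f))  # False
```

Working with integer percentages keeps the comparisons exact; with floating-point shares a small tolerance would be advisable.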
3.2.2 Multidimensional FOD
Now suppose that the outcome is multidimensional. In the case of two dimensions, these could, for example, be income and health. In the case of three dimensions, one may wish to add educational attainment, and so on.
3.2.2.1 Notations and Definitions
Let Y be a finite set of (multidimensional) outcomes. A distribution of wellbeing of population A is described by a probability mass function f over Y, i.e. $\sum_{y\in Y}f\left(y\right)=1$ and $f\left(y\right)\ge 0$ for all $y\in Y$. Similarly, let the distributions of populations B and C be described by the probability mass functions g and h respectively.
To illustrate, suppose that the two dimensions each have two possible outcomes, 0 or 1. Thus, $Y=\left\{\left(0,0\right),\left(0,1\right),\left(1,0\right),\left(1,1\right)\right\}$. One dimension could be income (dimension I) and the other dimension could be health (dimension II). Then the outcome (0,0) for a person means that she is deprived in both dimensions, while the outcome (1,0) means that she is not deprived in the first dimension (I) but deprived in the second dimension (II), and so on.
Table 3.2. Distributions f, g, and h (per cent), two-dimensional
Population A (f)

| I (income) | II (health): 0 (deprived: bad) | II (health): 1 (not deprived: good) | Total |
|---|---|---|---|
| 0 (deprived: poor) | 10 | 25 | 35 |
| 1 (not deprived: rich) | 25 | 40 | 65 |
| Total | 35 | 65 | 100 |

Population B (g)

| I (income) | II (health): 0 (deprived: bad) | II (health): 1 (not deprived: good) | Total |
|---|---|---|---|
| 0 (deprived: poor) | 25 | 25 | 50 |
| 1 (not deprived: rich) | 25 | 25 | 50 |
| Total | 50 | 50 | 100 |

Population C (h)

| I (income) | II (health): 0 (deprived: bad) | II (health): 1 (not deprived: good) | Total |
|---|---|---|---|
| 0 (deprived: poor) | 30 | 10 | 40 |
| 1 (not deprived: rich) | 10 | 50 | 60 |
| Total | 40 | 60 | 100 |
Source: Authors’ hypothetical example
Suppose that the three probability mass functions f, g, and h are as shown in Table 3.2. Note that the distributions over dimension I (income) in the rightmost column are identical to those in Table 3.1, representing a situation where the same populations are considered but an additional dimension is now taken into account.
In the case of multidimensional outcomes, f first-order dominates g if and only if any of the following (equivalent) conditions hold:^{4}
(A) g can be obtained from f by a finite number of shifts of probability mass from one outcome to another that is worse.
(B) Social welfare is at least as high for f as for g for any non-decreasing additively separable social welfare function, i.e. ${{\displaystyle \sum}}_{y\in Y}f\left(y\right)w\left(y\right)\ge {{\displaystyle \sum}}_{y\in Y}g\left(y\right)w\left(y\right)$ for any weakly increasing real function w(∙).
(C) ${{\displaystyle \sum}}_{y\in Z}g\left(y\right)\ge {{\displaystyle \sum}}_{y\in Z}f\left(y\right)$ for any lower comprehensive set $Z\subseteq Y$.^{5}
Each condition is the natural multidimensional extension of its counterpart for the one-dimensional case. Again, note that condition (A) provides an intuitive criterion for one distribution being unambiguously better than another, (B) provides a foundation in welfare economics but one that is not conveniently amenable to testing, while (C) provides a directly testable condition that may not be particularly intuitive.
3.2.2.2 Checking Multidimensional FOD
First, due to the intuitive nature, we appeal to condition (A). It can be seen from Table 3.2 that f dominates g, since it is possible to obtain g from f by shifting probability mass in f from better to worse outcomes. More precisely, shifting fifteen percentage points of probability mass from (1,1) to (0,0) in f yields exactly g, which implies that f dominates g. The distribution f is thus unambiguously preferred to the distribution g.
Now consider f and h. Neither does f dominate h, nor does h dominate f. Intuitively, f would be better if what matters most is minimizing the share of the population deprived in dimension II (health), since $f\left(0,0\right)+f\left(1,0\right)=35<h\left(0,0\right)+h\left(1,0\right)=40$. Conversely, h would be better if what matters most is maximizing the share of the population not deprived in either dimension, since $h\left(1,1\right)=50>f\left(1,1\right)=40$. Consequently, no dominances are detected since no assumptions are made about the relative importance of the different dimensions. Note that the conclusion that f does not dominate h is in contrast to the one-dimensional case. This illustrates that the conclusions might change when more dimensions are added to the analysis. It is therefore important to bear in mind that an analysis applying few welfare indicators may conclude that one population dominates another whereas a multidimensional FOD analysis of the same populations with more indicators may be indeterminate. Attention should therefore be given to including important dimensions that cover overall wellbeing reasonably well.
As mentioned, condition (C) provides a direct method for checking multidimensional FOD. In our example, f dominates g if and only if the following four inequalities are jointly satisfied:^{6}
(i) $g\left(0,0\right)\ge f\left(0,0\right)$
(ii) $g\left(0,0\right)+g\left(0,1\right)\ge f\left(0,0\right)+f\left(0,1\right)$
(iii) $g\left(0,0\right)+g\left(1,0\right)\ge f\left(0,0\right)+f\left(1,0\right)$
(iv) $g\left(0,0\right)+g\left(1,0\right)+g\left(0,1\right)\ge f\left(0,0\right)+f\left(1,0\right)+f\left(0,1\right)$.
Considering the distributions in Table 3.2 in relation to the four inequalities above, it can be seen that, when comparing f and g, each of the four inequalities (i)–(iv) is (strictly) satisfied: (i) $g\left(0,0\right)\ge f\left(0,0\right)$ since $0.25>0.10$, (ii) $g\left(0,0\right)+g\left(0,1\right)\ge f\left(0,0\right)+f\left(0,1\right)$ since $0.25+0.25>0.10+0.25$, (iii) $g\left(0,0\right)+g\left(1,0\right)\ge f\left(0,0\right)+f\left(1,0\right)$ since $0.25+0.25>0.10+0.25$, and (iv) $g\left(0,0\right)+g\left(1,0\right)+g\left(0,1\right)\ge f\left(0,0\right)+f\left(1,0\right)+f\left(0,1\right)$ since $0.25+0.25+0.25>0.10+0.25+0.25$. Hence, f dominates g, as already observed. When comparing each of the other distribution pairs, it can easily be seen that at least one of the inequalities is violated. For example, when comparing f and h, inequalities (i)–(iii) are satisfied whereas inequality (iv) is not, since $0.30+0.10+0.10<0.10+0.25+0.25$.
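For small outcome sets, condition (C) can be checked by brute force: enumerate every lower comprehensive set and compare the mass the two distributions place on it. The sketch below does this for Table 3.2, with shares in per cent. The helper names `leq`, `lower_sets`, and `dominates` are illustrative assumptions, and the enumeration grows exponentially with the number of outcomes, so it is only meant for toy examples.

```python
from itertools import combinations

def leq(a, b):
    """Componentwise order: a is no better than b in every dimension."""
    return all(x <= y for x, y in zip(a, b))

def lower_sets(outcomes):
    """Yield every non-empty lower comprehensive subset of `outcomes`."""
    for r in range(1, len(outcomes) + 1):
        for Z in combinations(outcomes, r):
            # Z is lower comprehensive if it contains everything below its members.
            if all(y2 in Z for y in Z for y2 in outcomes if leq(y2, y)):
                yield Z

def dominates(f, g):
    """Condition (C): g has at least as much mass as f on every lower set."""
    outcomes = sorted(f)
    return all(sum(g[y] for y in Z) >= sum(f[y] for y in Z)
               for Z in lower_sets(outcomes))

# Table 3.2, shares in per cent; keys are (income, health).
f = {(0, 0): 10, (0, 1): 25, (1, 0): 25, (1, 1): 40}
g = {(0, 0): 25, (0, 1): 25, (1, 0): 25, (1, 1): 25}
h = {(0, 0): 30, (0, 1): 10, (1, 0): 10, (1, 1): 50}

print(dominates(f, g))                   # True
print(dominates(f, h), dominates(h, f))  # False False
```

For the two-by-two case the generator returns the four sets behind inequalities (i)–(iv) plus the full set Y, on which both distributions trivially sum to 100.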
3.2.2.3 Detecting FOD in Practice
Criterion (C) provides a simple method for detecting dominance, which can be visually perceived in cases with few outcomes such as in the example with two binary indicators giving four different outcomes in total. However, the number of inequalities to be checked increases drastically when more dimensions and levels are added. For real-world applications, computationally efficient algorithms for checking dominance are required (Range and Østerdal 2013). Mosler and Scarsini (1991) and Dyckerhoff and Mosler (1997) show that, appealing to definition (A), checking FOD corresponds to determining if a certain linear program has a feasible solution. FOD can thus be determined using a linear programming package. The first empirical implementation of this approach was provided by Arndt et al. (2012) in a study of child poverty in Mozambique and Vietnam.
Let A and B be two populations characterized by probability mass functions f and g respectively. For outcomes $y$ and $y'$ with $y'\le y$, let $t_{y,y'}$ be the amount of probability mass transferred from outcome $y$ to $y'$. Note that the first subscript denotes the source of the transfer whereas the second denotes the destination.
Given the conditions outlined above, population A dominates population B if and only if there exists a feasible solution to the following linear program:^{7}
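A formulation consistent with the transfer variables defined above, and with the worked example on Table 3.2, can be sketched as follows. The exact layout is an assumption rather than a reproduction of the original display: each outcome starts with mass $f(y)$, receives transfers from weakly better outcomes, sends transfers to weakly worse outcomes, and must end with mass $g(y)$.

```latex
\begin{aligned}
& f(y) \;+\; \sum_{y' \in Y:\; y' \ge y,\; y' \ne y} t_{y',y}
       \;-\; \sum_{y' \in Y:\; y' \le y,\; y' \ne y} t_{y,y'} \;=\; g(y)
  && \text{for all } y \in Y, \\
& t_{y,y'} \ge 0
  && \text{for all } y, y' \in Y \text{ with } y' \le y .
\end{aligned}
```

Only feasibility matters here, so any objective function (for instance, a constant) can be supplied to the solver.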
To provide an example, let us return to Table 3.2. There, f dominates g since it is possible to obtain g from f by shifting fifteen percentage points of probability mass from outcome (1,1) to (0,0); cf. condition (A). In terms of the linear program in equation (3.1), this implies that for $y=\left(0,0\right)$ and $y'=\left(1,1\right)$, $t_{y',y}=0.15$ in order for the equality to be fulfilled for $y=\left(0,0\right)$. Furthermore, for $y=\left(1,1\right)$ and $y'=\left(0,0\right)$, $t_{y,y'}=0.15$ in order for the equality to be fulfilled for $y=\left(1,1\right)$. With all other transfers set to zero, the equalities for $y=\left(0,1\right)$ and $y=\left(1,0\right)$ hold as well, because f and g coincide at these outcomes. It is thus possible to fulfil all the constraints for all $y\in Y$, whereby f dominates g. An implementation of the FOD approach using linear programming in GAMS is reviewed in Chapter 4.
3.2.3 Mitigating the Limitations of FOD
As mentioned previously, due to the absence of strong assumptions such as predetermined weights, there are some inherent limitations to the FOD approach. First, when comparing two groups, it may be the case that no dominations are found and hence the FOD approach might yield an indeterminate result. This provides little information about the populations’ relative wellbeing, as was the case when considering f and h with multiple dimensions in section 3.2.2.2. Second, the FOD approach provides no information about the strength of dominance. For example, if population A dominates population B, the FOD check itself provides no information as to whether A is marginally or substantially better than B. Both of these limitations can be mitigated using a bootstrapping approach as described below.
3.2.3.1 Bootstrapping
To mitigate the limitations mentioned in section 3.2.3, a bootstrapping procedure can be applied (Arndt et al. 2012). In general, bootstrapping is a procedure that relies on random sampling with replacement from the original dataset. When comparing populations A and B, J samples of size K are drawn with replacement for each population group where $K\le N$, N being the number of individuals in that population in the original sample. In the bootstrap procedure shown in Chapter 4, the samples are drawn in clusters from each stratum with $K=N$. When a cluster is drawn, all households in that cluster are drawn. Due to the drawing with replacement, each cluster (and thus household) may appear more than once. The FOD approach is then applied to each of the J bootstrap samples. When these repeated bootstrap samples are compared using the FOD approach, the final output can be interpreted as an empirical probability that population A dominates population B since the original sample is a subsample from a larger population. These probabilities yield significantly more information than only applying FOD to the original data where, for example, an indeterminate result will make it impossible to draw further conclusions about the comparative wellbeing of the two populations. The bootstrapping procedure thus enables the analyst to extract some information about the strength of conclusions based on the probability of dominance under resampling.
For example, with bootstrapping, we may find that occasionally A dominates B and occasionally the inverse is true, but most of the time the results are indeterminate, i.e. rough equality of A and B. Alternatively, we may find that the probability that A dominates B is fairly high, the probability that B dominates A is very low or zero, and the probability of an indeterminate result is somewhat low, i.e. likely dominance of A over B, or we might find that A dominates B almost always, i.e. solid dominance of A over B.
As a concrete example, say that population A dominates B in 995 of the $J=1,000$ bootstrap samples and that A dominates C in 870 of the bootstrap samples. This corresponds to a 99.5 per cent chance of A dominating B and an 87 per cent chance of A dominating C. In this example, A is thus better than C (likely dominance of A over C) and considerably better than B (solid dominance of A over B). If no dominations between B and C are obtained in the original sample, the ranking of these is ambiguous. However, if, for example, C dominates B in four (0.4 per cent) of the bootstrap samples whereas B dominates C in eighty (8 per cent) of the bootstrap samples, B is seemingly better than C (though rough equality of B and C).
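The resampling logic can be sketched in a few lines. The example below is a simplification and not the procedure of Chapter 4: it resamples individuals rather than clusters within strata, uses $K=N$, and relies on a brute-force check of condition (C), so it suits only small outcome sets. The function names, the seed, and the synthetic samples built from Table 3.2 are all illustrative assumptions.

```python
import random
from collections import Counter
from itertools import combinations

def leq(a, b):
    """Componentwise order: a is no better than b in every dimension."""
    return all(x <= y for x, y in zip(a, b))

def dominates(f, g, outcomes):
    """Condition (C): g has at least as much mass as f on every lower set."""
    for r in range(1, len(outcomes) + 1):
        for Z in combinations(outcomes, r):
            is_lower = all(y2 in Z for y in Z for y2 in outcomes if leq(y2, y))
            if is_lower and sum(g[y] for y in Z) < sum(f[y] for y in Z):
                return False
    return True

def bootstrap_fod(sample_a, sample_b, outcomes, J=1000, seed=0):
    """Shares of J resamples in which A dominates B and in which B dominates A."""
    rng = random.Random(seed)
    n_a, n_b = len(sample_a), len(sample_b)
    a_wins = b_wins = 0
    for _ in range(J):
        # Resample individuals with replacement, K = N in each group.
        count_a = Counter(rng.choices(sample_a, k=n_a))
        count_b = Counter(rng.choices(sample_b, k=n_b))
        # Rescale counts to the common total n_a * n_b so shares compare exactly.
        fa = {y: count_a[y] * n_b for y in outcomes}
        fb = {y: count_b[y] * n_a for y in outcomes}
        a_dom = dominates(fa, fb, outcomes)
        b_dom = dominates(fb, fa, outcomes)
        if a_dom and not b_dom:
            a_wins += 1
        elif b_dom and not a_dom:
            b_wins += 1
        # Otherwise the draw is indeterminate (or the two resamples coincide).
    return a_wins / J, b_wins / J

outcomes = [(0, 0), (0, 1), (1, 0), (1, 1)]
# Hypothetical samples of 100 individuals matching f and g in Table 3.2.
sample_a = [(0, 0)] * 10 + [(0, 1)] * 25 + [(1, 0)] * 25 + [(1, 1)] * 40
sample_b = [(0, 0)] * 25 + [(0, 1)] * 25 + [(1, 0)] * 25 + [(1, 1)] * 25
p_a_over_b, p_b_over_a = bootstrap_fod(sample_a, sample_b, outcomes)
print(p_a_over_b, p_b_over_a)
```

The remaining share, one minus the two returned values, is the empirical probability of an indeterminate comparison.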
Furthermore, if one is willing to accept the tendency to outperform other groups as an overall relative indicator of population wellbeing, it is possible to provide an intuitive ranking of all population groups via the Copeland (1951) method, which is analogous to the way in which teams are ranked by assigning points to wins, draws, and losses from matchups in a sports tournament. For instance, for each population group (n population groups in total), one can count how many of the $\left(n-1\right)$ other population groups it dominates and from that subtract the number of times it is dominated by these other groups. This yields a score in the interval $\left[-\left(n-1\right),n-1\right]$ which can then be normalized to the interval $\left[-1,1\right]$ (see e.g. Arndt et al. 2016).
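The scoring rule just described amounts to a few lines of bookkeeping. The sketch below assumes the pairwise results are stored in a nested dictionary `dominates[a][b]` (a hypothetical layout) and uses the dominances found among the Table 3.2 distributions, where A dominates B and all other comparisons are indeterminate.

```python
def copeland_scores(dominates):
    """Normalized Copeland score: (dominances won - dominances lost) / (n - 1)."""
    groups = list(dominates)
    n = len(groups)
    scores = {}
    for a in groups:
        wins = sum(dominates[a][b] for b in groups if b != a)
        losses = sum(dominates[b][a] for b in groups if b != a)
        scores[a] = (wins - losses) / (n - 1)
    return scores

# dominates[a][b] is True when group a dominates group b (Table 3.2 results).
dominates = {
    "A": {"A": False, "B": True, "C": False},
    "B": {"A": False, "B": False, "C": False},
    "C": {"A": False, "B": False, "C": False},
}

print(copeland_scores(dominates))  # {'A': 0.5, 'B': -0.5, 'C': 0.0}
```

Replacing the True/False entries with bootstrap dominance probabilities would give a probability-weighted variant of the same score.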
3.3 Further Considerations
3.3.1 Faster Solution Algorithms
The linear programming approach presented provides a practical method for checking FOD for many applied problems. For most applications, the method is computationally fast enough to allow for a great number of pairwise comparisons as well as, if desired, hundreds of bootstrap repetitions for each pair of distributions compared. In applications with multiple binary indicators, the linear programming method is particularly suitable.^{8}
However, the linear programming approach is not the fastest possible way to make FOD comparisons. Using a network flow formulation of the problem, as outlined in Preston (1974) or Hansel and Troallic (1978), it is possible to check FOD via computation of the maximum flow. As discussed in Range and Østerdal (2013), the problem of checking FOD for multidimensional distributions can also be formulated as a special bipartite network problem related to the classical transportation problem. Generally, these formulations are computationally faster than the linear programming method. In particular, for the bivariate case, Range and Østerdal (2013) provide an algorithm for checking FOD where the worst-case computational complexity grows linearly in the size of the problem (determined by the total number of outcomes).
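To illustrate the maximum-flow idea, the sketch below builds a bipartite network in which a source supplies $f(y)$ to each outcome, each outcome may ship mass to any weakly worse outcome, and each outcome delivers $g(y)$ to a sink; f dominates g exactly when the maximum flow equals the total mass. It uses a generic textbook Edmonds-Karp routine, not the specialized linear-time algorithm of Range and Østerdal (2013). Integer masses (per cent) and the node labels are assumptions made for the example.

```python
from collections import deque

def leq(a, b):
    """Componentwise order: a is no better than b in every dimension."""
    return all(x <= y for x, y in zip(a, b))

def max_flow(cap, s, t):
    """Edmonds-Karp maximum flow on a dict-of-dicts residual capacity graph."""
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        # Bottleneck capacity along the path found.
        push, v = float("inf"), t
        while parent[v] is not None:
            push = min(push, cap[parent[v]][v])
            v = parent[v]
        # Update residual capacities along the path.
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= push
            cap[v][u] += push
            v = u
        flow += push

def dominates_flow(f, g):
    """True if all mass of f can be routed weakly downwards to reproduce g."""
    outcomes = list(f)
    total = sum(f.values())
    nodes = ["s", "t"] + [("src", y) for y in outcomes] + [("dst", y) for y in outcomes]
    cap = {n: {} for n in nodes}

    def add(u, v, c):
        cap[u][v] = c
        cap[v].setdefault(u, 0)  # reverse arc for the residual graph

    for y in outcomes:
        add("s", ("src", y), f[y])
        add(("dst", y), "t", g[y])
        for y2 in outcomes:
            if leq(y2, y):
                # No arc ever needs to carry more than the total mass.
                add(("src", y), ("dst", y2), total)
    return max_flow(cap, "s", "t") == total

f = {(0, 0): 10, (0, 1): 25, (1, 0): 25, (1, 1): 40}
g = {(0, 0): 25, (0, 1): 25, (1, 0): 25, (1, 1): 25}
h = {(0, 0): 30, (0, 1): 10, (1, 0): 10, (1, 1): 50}
print(dominates_flow(f, g), dominates_flow(f, h))  # True False
```

The answers agree with the checks of conditions (A) and (C) in section 3.2.2.2.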
3.3.2 Alternative Dominance Criterion
Besides FOD, there are many other dominance criteria in the literature. In general, these alternative criteria impose stronger assumptions on the underlying utility or social welfare functions. A comprehensive overview of alternative dominance criteria is outside the scope of this chapter; see e.g. Shaked and Shanthikumar (2007) for an extensive review. However, we will compare FOD with the lower-orthant dominance ordering, which is one of the most frequently used alternative dominance criteria in welfare economics.
As mentioned in the Introduction, the FOD approach differs from the criteria for robust welfare comparisons of the Atkinson–Bourguignon type (see Atkinson and Bourguignon 1982; Atkinson and Bourguignon 1987; Bourguignon 1989; Atkinson 1992). These are variations of orthant stochastic orderings (see Dyckerhoff and Mosler 1997) even though the name first-order dominance has sometimes been used synonymously with orthant stochastic orderings in the welfare economics literature (e.g. Atkinson and Bourguignon 1982). Orthant dominance is not suitable for ordinal data. However, if one assumes substitutability between dimensions (as in e.g. Duclos and Échevin 2011, where substitutability between health and income is assumed, i.e. an underlying utility function with a negative cross-partial derivative), a criterion less restrictive than FOD can be used. In particular, f orthant dominates g if and only if:
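In the notation of section 3.2.2.1, a lower-orthant condition of this kind can be written as below. This is a sketch inferred from the binary-indicator inequalities (i_{0})–(iii_{0}) and should be read as an assumption about the form of the original display: the mass that g places at or below any outcome y is at least the mass that f places there.

```latex
(C_{0})\qquad
\sum_{y' \in Y:\; y' \le y} g(y') \;\ge\; \sum_{y' \in Y:\; y' \le y} f(y')
\qquad \text{for all } y \in Y .
```

Each set $\{y' \in Y : y' \le y\}$ is a lower comprehensive set, so this condition checks only a subset of the inequalities required by condition (C).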
The label (C_{0}) is used to indicate that this condition relates to condition (C) in the case of multidimensional FOD in section 3.2.2.1. However, condition (C_{0}) is less restrictive than (C). This implies that condition (C_{0}) may be satisfied even though conditions (A), (B), and (C) are not. For a two-dimensional comparison with binary indicators (as in Table 3.2), f orthant dominates g if and only if each of the following three conditions is satisfied:
(i_{0}) $g\left(0,0\right)\ge f\left(0,0\right)$
(ii_{0}) $g\left(0,0\right)+g\left(0,1\right)\ge f\left(0,0\right)+f\left(0,1\right)$
(iii_{0}) $g\left(0,0\right)+g\left(1,0\right)\ge f\left(0,0\right)+f\left(1,0\right)$.
The labels (i_{0})–(iii_{0}) are used to indicate that these conditions relate to conditions (i)–(iii) in section 3.2.2.2. Note that the fourth inequality (iv) need not be satisfied for orthant dominance. Returning to f and h in Table 3.2, recall that neither f first-order dominates h, nor does h first-order dominate f. However, in the case of orthant dominance, it can be seen that: (i_{0}) $h\left(0,0\right)\ge f\left(0,0\right)$ since $0.30>0.10$, (ii_{0}) $h\left(0,0\right)+h\left(0,1\right)\ge f\left(0,0\right)+f\left(0,1\right)$ since $0.30+0.10>0.10+0.25$, and (iii_{0}) $h\left(0,0\right)+h\left(1,0\right)\ge f\left(0,0\right)+f\left(1,0\right)$ since $0.30+0.10>0.10+0.25$. Hence f orthant dominates h even though f does not first-order dominate h.
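The comparison of f and h can be reproduced with a short check of the joint cumulative masses. The sketch below is illustrative only: the function name `lower_orthant_dominates` is an assumption, and shares are entered in per cent as in Table 3.2.

```python
def lower_orthant_dominates(f, g):
    """True if g places at least as much mass as f at or below every outcome."""
    outcomes = list(f)

    def mass_below(p, y):
        # Joint cumulative mass: everything weakly worse than y in all dimensions.
        return sum(p[z] for z in outcomes if all(a <= b for a, b in zip(z, y)))

    return all(mass_below(g, y) >= mass_below(f, y) for y in outcomes)

# Table 3.2, shares in per cent; keys are (income, health).
f = {(0, 0): 10, (0, 1): 25, (1, 0): 25, (1, 1): 40}
h = {(0, 0): 30, (0, 1): 10, (1, 0): 10, (1, 1): 50}

print(lower_orthant_dominates(f, h))  # True
print(lower_orthant_dominates(h, f))  # False
```

The set behind inequality (iv) is not a lower orthant, which is why it drops out of this check and f can orthant dominate h without first-order dominating it.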
3.4 Conclusion
Population wellbeing is increasingly recognized as a multidimensional phenomenon that is not adequately described by a single dimension (e.g. by income only). Several methods of measuring and comparing welfare have been proposed where application of a weighting or counting scheme to different dimensions is used. It is often, however, difficult to determine these weights. Due to the sensitivity of the outcome to the weights applied, different conclusions about welfare rankings are likely to occur if the weighting scheme differs from one analysis to another. While comparisons using, for example, lower orthant (stochastic) orderings following Atkinson and Bourguignon (1982) are considerably more ‘robust’ than applying weighting schemes, they typically apply conditions formulated in terms of the second- (or higher-)order cross-partial derivative and do not apply to ordinal data.
The first-order (stochastic) dominance (FOD) approach requires only that the outcomes in each dimension can be ranked from worse to better. The FOD approach can be applied to ordinal multidimensional data, enabling the analyst to perform wellbeing comparisons across population groups with a minimum of assumptions imposed. FOD is thus robust across all possible weighting schemes. This advantage is accompanied by limitations in that the FOD approach can yield indeterminate outcomes and does not directly provide information with respect to degree of dominance. A bootstrapping approach can be used to obtain more information, thus mitigating these limitations. Moreover, a Copeland approach can be used to obtain a ranking (i.e. a complete and transitive ordering) of all groups being compared.
Finally, it is worth mentioning that even though the FOD approach and bootstrapping procedure enable the analyst to rank population welfare without assumptions about weights, the analysis should ideally be performed together with alternative welfare measurements, which provide cardinal information about the relative size of wellbeing differences under fixed-weight assumptions. As Ferreira (2011) puts it, looking at a few core, truly irreducible, dimensions and applying dominance analysis (as well as a number of indices) is likely to contribute to the design and targeting of policy actions. Thus, FOD comparisons should form part of a broader population wellbeing analysis strategy.
References
Alkire, S. and J. Foster (2011). ‘Counting and Multidimensional Poverty Measurement’, Journal of Public Economics, 95(7–8): 476–87.
Alkire, S., J. Foster, S. Seth, M. E. Santos, J. M. Roche, and P. Ballon (2015). Multidimensional Poverty Measurement and Analysis. Oxford: Oxford University Press.
Arndt, C., R. Distante, M. A. Hussain, L. P. Østerdal, P. L. Huong, and M. Ibraimo (2012). ‘Ordinal Welfare Comparisons with Multiple Discrete Indicators: A First Order Dominance Approach and Application to Child Poverty’, World Development, 40(11): 2290–301.
Arndt, C., M. A. Hussain, V. Salvucci, F. Tarp, and L. P. Østerdal (2016). ‘Poverty Mapping Based on First-Order Dominance with an Example from Mozambique’, Journal of International Development, 28(1): 3–21.
Atkinson, A. B. (1992). ‘Measuring Poverty and Differences in Family Composition’, Economica, 59(233): 1–16.
Atkinson, A. B. and F. Bourguignon (1982). ‘The Comparison of Multi-Dimensioned Distributions of Economic Status’, Review of Economic Studies, 49(2): 183–201.
Atkinson, A. B. and F. Bourguignon (1987). ‘Income Distribution and Differences in Need’, in G. R. Feiwel (ed.), Arrow and the Foundations of the Theory of Economic Policy. New York: New York University Press, 350–70.
Bourguignon, F. (1989). ‘Family Size and Social Utility: Income Distribution Dominance Criteria’, Journal of Econometrics, 42(1): 67–80.
Bourguignon, F. and S. R. Chakravarty (2003). ‘The Measurement of Multidimensional Poverty’, Journal of Economic Inequality, 1(1): 25–49.
Copeland, A. H. (1951). ‘A “Reasonable” Social Welfare Function’, University of Michigan Seminar on Applications of Mathematics to the Social Sciences.
Duclos, J.-Y. and D. Échevin (2011). ‘Health and Income: A Robust Comparison of Canada and the US’, Journal of Health Economics, 30(2): 293–302.
Duclos, J.-Y., D. E. Sahn, and S. D. Younger (2006). ‘Robust Multidimensional Poverty Comparisons’, Economic Journal, 116(514): 943–68.
Duclos, J.-Y., D. E. Sahn, and S. D. Younger (2007). ‘Robust Multidimensional Poverty Comparisons with Discrete Indicators of Well-Being’, in S. P. Jenkins and J. Micklewright (eds), Inequality and Poverty Re-Examined. Oxford: Oxford University Press, 185–206.
Dyckerhoff, R. and K. Mosler (1997). ‘Orthant Orderings of Discrete Random Vectors’, Journal of Statistical Planning and Inference, 62(2): 193–205.
Ferreira, F. H. G. (2011). ‘Poverty Is Multidimensional. But What Are We Going to Do About It?’, Journal of Economic Inequality, 9(3): 493–5.
Gravel, N. and P. Moyes (2012). ‘Ethically Robust Comparisons of Bidimensional Distributions with an Ordinal Attribute’, Journal of Economic Theory, 147(4): 1384–1426.
Gravel, N., P. Moyes, and B. Tarroux (2009). ‘Robust International Comparisons of Distributions of Disposable Income and Regional Public Goods’, Economica, 76(303): 432–61.
Gravel, N. and A. Mukhopadhyay (2010). ‘Is India Better Off Today than 15 Years Ago? A Robust Multidimensional Answer’, Journal of Economic Inequality, 8(2): 173–95.
Hansel, G. and J. Troallic (1978). ‘Mesures marginales et théorème de Ford-Fulkerson’, Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 43(3): 245–51.
Hussain, M. A. (2016). ‘EU Country Rankings’ Sensitivity to the Choice of Welfare Indicators’, Social Indicators Research, 125(1): 1–17.
Kamae, T., U. Krengel, and G. L. O’Brien (1977). ‘Stochastic Inequalities on Partially Ordered Spaces’, Annals of Probability, 5(6): 899–912.
Lehmann, E. L. (1955). ‘Ordered Families of Distributions’, Annals of Mathematical Statistics, 26(3): 399–419.
Levhari, D., J. Paroush, and B. Peleg (1975). ‘Efficiency Analysis for Multivariate Distributions’, Review of Economic Studies, 42(1): 87–91.
Mosler, K. C. and M. R. Scarsini (1991). Stochastic Orders and Decision under Risk. Hayward, CA: Institute of Mathematical Statistics.
Muller, C. and A. Trannoy (2011). ‘A Dominance Approach to the Appraisal of the Distribution of Well-Being across Countries’, Journal of Public Economics, 95(3–4): 239–46.
Preston, C. (1974). ‘A Generalization of the FKG Inequalities’, Communications in Mathematical Physics, 36(3): 233–41.
Range, T. M. and L. P. Østerdal (2013). ‘Checking Bivariate First Order Dominance’, Discussion Papers on Business and Economics No. 9/2013. Odense: Department of Business and Economics, University of Southern Denmark.
Rippin, N. (2010). ‘Poverty Severity in a Multidimensional Framework: The Issue of Inequality between Dimensions’, Courant Research Centre, Poverty, Equity and Growth Discussion Paper 47. Göttingen: University of Göttingen.
Roelen, K. and F. Gassmann (2008). ‘Measuring Child Poverty and Wellbeing: A Literature Review’, Maastricht Graduate School of Governance Working Paper Series 2008/WP001. Maastricht: Maastricht University.
Sen, A. K. (1976). ‘Poverty: An Ordinal Approach to Measurement’, Econometrica, 44(2): 219–31.
Sen, A. K. (1988). The Standard of Living. Cambridge: Cambridge University Press.
Shaked, M. and J. G. Shanthikumar (2007). Stochastic Orders. New York: Springer Science and Business Media.
Strassen, V. (1965). ‘The Existence of Probability Measures with Given Marginals’, Annals of Mathematical Statistics, 36(2): 423–39.
World Bank (1990). World Development Report 1990: Poverty. New York: Oxford University Press for the World Bank.
Østerdal, L. P. (2010). ‘The Mass Transfer Approach to Multivariate Discrete First Order Stochastic Dominance: Direct Proof and Implications’, Journal of Mathematical Economics, 46(6): 1222–8.
Notes:
(^{1}) A probability mass function is a function that to each outcome assigns the probability of that outcome. In the context of population comparisons, it assigns the share of the population in that outcome.
(^{2}) Note that FOD is conventionally defined in the weak sense, i.e. a distribution always dominates itself.
(^{3}) A CDF expresses the probability that the real-valued outcome X takes a value less than or equal to x.
(^{4}) The equivalence between (B) and (C) was shown by Lehmann (1955). It was also proved independently by Levhari et al. (1975). Kamae et al. (1977) observed that the equivalence between (A) and (C) is a consequence of Strassen’s Theorem (Strassen 1965). See also Østerdal (2010).
(^{5}) A set Z ⊆ Y is lower comprehensive if y ∈ Z, z ∈ Y, and z ≤ y implies z ∈ Z.
(^{6}) Note that the fifth inequality $g\left(0,0\right)+g\left(1,0\right)+g\left(0,1\right)+g\left(1,1\right)\ge f\left(0,0\right)+f\left(1,0\right)+f\left(0,1\right)+f\left(1,1\right)$ is always satisfied with equality by the definition of the probability mass functions since $\sum f\left(y\right)=\sum g\left(y\right)=1$.
(^{7}) The approach differs slightly from that outlined by Mosler and Scarsini (1991) and Dyckerhoff and Mosler (1997). In particular, the transfers here are absolute and not relative. Note furthermore that most linear programming packages require the specification of an objective function. This can be defined as an arbitrary constant function.
(^{8}) Even with binary indicators, the linear programming approach can become computationally demanding, but only when the number of dimensions is large (Hussain et al. 2016).