Multilevel Methods for Public Health Research
Abstract and Keywords
This chapter begins by outlining the conceptual motivation behind multilevel analyses and by identifying a core set of research questions that this approach addresses. It then introduces the idea of multilevel structures and discusses simple and complex multilevel models. It emphasizes that the key strength of multilevel models lies in modeling heterogeneity at different levels and shows how multilevel models can be extended to additional contextual levels (e.g., neighborhoods nested within regions). The estimation procedures underlying such models are discussed, showing how a multilevel framework can provide a general, unified approach to data analysis and how this can be achieved by extensions to the basic hierarchical structure of individuals nested within contexts. The chapter concludes with a discussion of issues that researchers should be aware of when applying multilevel methods.
Where you live makes a difference to your health over and above who you are (Jones and Moon, 1993; Roberts, 1999; Berkman and Kawachi, 2000; Macintyre, 2000). People’s lives are lived in different settings, including residential neighborhoods, workplaces, and schools as well as more macrolevel contexts such as metropolitan areas, regions, and states. Over and above individual influences on health, researchers are increasingly emphasizing the role that contexts and environments play in shaping health and health inequalities in the population.
Such a conceptual perspective is intrinsically multilevel, that is, factors that affect health are viewed as simultaneously operating at the level of individuals and at the level of contexts. The term multilevel has also been used to advocate a multidisciplinary perspective on public health (Anderson, 1999). Our use of the term, however, is from an analytical perspective, that is, in relation to the levels of analysis in public health research. We begin by outlining the conceptual motivation behind multilevel analyses and by identifying a core set of research questions that this approach addresses. We then introduce the idea of multilevel structures and discuss simple and complex multilevel models. We emphasize that the key strength of multilevel models lies in modeling heterogeneity at different levels. After introducing the basic structure of a multilevel model, using the example of individuals nested within neighborhoods, we show how this framework can be extended to additional contextual levels (e.g., neighborhoods nested within regions). The estimation procedures underlying such models are then discussed. Our aim is to show how a multilevel framework can provide a general, unified approach to data analysis and how this can be achieved by extensions to the basic hierarchical structure of individuals nested within contexts. We conclude with a discussion of issues that researchers should be aware of when applying multilevel methods.
WHY MULTILEVEL METHODS AND ANALYSES?
The problem of the “ecological fallacy” (Robinson, 1950; Selvin, 1958) is well known in epidemiology (Susser, 1994). It refers to the invalid transfer of results obtained at the ecological level to the individual level. A symmetrical fallacy, known as the “individualistic fallacy”, occurs by failing to take into account the ecology, or context, within which individual relationships happen (Alker, Jr., 1969). The issue common to both types of fallacy is, therefore, the failure to recognize the existence of unique relationships observable at multiple levels, each being important in its own right. Specifically, we can think of an individual relationship (e.g., poor individuals are more likely to have poor health); an ecological–contextual relationship (e.g., places with a high percentage of poor individuals are more likely to have higher rates of poor health); and an individual–contextual relationship (e.g., the greatest likelihood of being in poor health is found for poor individuals in places with a high percentage of poor people).
From a statistical standpoint, if individual data are aggregated to a contextual level, then information is lost and statistical analysis loses power. If data are disaggregated to the individual level, but are not independent of one another, then the result is fewer independent data values. Ordinary statistical tests demand that data values for individual observations be independent. Failing to recognize the dependent nature of the data values, along with the source of the dependency, can lead to finding significant relationships where none exist. A multilevel methodological and statistical perspective provides one comprehensive framework to address the above concerns.
CONCEPTUAL CONSIDERATIONS IN MULTILEVEL PUBLIC HEALTH RESEARCH
Contextual and Compositional Sources of Variation
Evidence for variations in poor health between different settings, or contexts, can arise from factors that are intrinsic to, and measured at, the contextual level. In other words, the variation is due to what can be described as contextual, area, or ecological effects. Alternatively, variations between places may be compositional, that is, certain types of people who are more likely to be in poor health due to their individual characteristics happen to live in certain places. The question, therefore, is not whether variations exist between different settings (they always do), but what is their source, that is, are the variations across settings compositional or contextual? The notions of contextual and compositional sources of variation have general relevance and are applicable whether the context is administrative (e.g., political boundaries), temporal (e.g., different time periods), or institutional (e.g., schools or hospitals). The research question underlying this concern is: are there significant contextual differences in individual health between settings (such as neighborhoods), after taking into account the individual compositional characteristics of the neighborhood?
Beyond disentangling the contextual and compositional sources of variation, contextual differences may be complex such that they may not be the same for all types of people. For example, while neighborhood contexts may matter for the health outcomes of one population group (e.g., low social class), they may not have any influence on the health status of other groups (e.g., high social class). The research question in this case is: are the contextual neighborhood differences in poor health different for different types of population groups, after taking into account the individual composition of the neighborhood?
Within particular contexts one group’s health experience may be more or less variable than another’s over and above the average differences. For example, people of low social class, in addition to being contextually heterogeneous in terms of health outcomes, may experience more variability compared to other groups. The related question is: are individual differences in poor health different for different types of population groups, after taking into account neighborhood context and the average effects of individual demographic and socioeconomic factors?
Contextual differences, in addition to people’s characteristics, may also be influenced by the different characteristics of neighborhoods. Stated differently, individual differences may interact with contexts. For example, poor people (individual characteristic) may experience different levels of health depending on the poverty level (place characteristic) of the area in which they live. The research question of interest is: what is the average relationship between individual poor health and neighborhood-level socioeconomic characteristics, and does the effect of neighborhood-level socioeconomic characteristics on individual health differ for different types of individuals based on their demographic and socioeconomic characteristics, after taking into account the complex effects of individual demographic and socioeconomic factors, as well as the neighborhoods in which individuals reside?
Multiple Hierarchical Contexts
Contextual settings themselves can be conceptualized and measured at multiple levels such that individual health experiences are not simply influenced by people’s proximate environment (e.g., neighborhoods) but also their macroecologic settings (e.g., states). Moreover, neighborhoods rarely exist in a vacuum, and considering their broader contextual settings can be vital given the functional interconnectedness between geographic levels. An analysis of health variations should consider both the immediate contextual setting of people (e.g., neighborhoods) and also the macrocontextual settings to which both people and neighborhoods belong (e.g., states). A related question of interest is: what additional contextual levels are relevant for the health outcome under consideration, and what is the relative importance of different contextual levels?
Changing People, Changing Places
Contexts change over time, as do the circumstances and health of people. Simultaneously incorporating time and space dimensions involves asking the following research question: while the prevalence of poor health may have declined over time, have neighborhood contextual disparities declined, and, if so, for which type of population groups?
Health outcomes themselves are often interrelated. For instance, people often engage simultaneously in high-risk behaviors, such as smoking and excess drinking. Each of these behaviors typically has a qualitative (yes or no) and a quantitative (how much) aspect. For instance, whether a person smokes may reveal nothing about the number of cigarettes smoked. There may be neighborhoods where few people smoke, but those who do, smoke heavily; an average figure would be very misleading. An ideal modeling approach should allow consideration of multiple responses and allow us to ask: are neighborhoods with a high percentage of smokers also high in the number of cigarettes smoked, and/or are neighborhoods that are high on smoking also high on drinking, after taking into account individual, compositional differences?
Overlapping “Cross-Classified” Contexts
Not only are contexts hierarchically multiple, they may also overlap. For instance, health behaviors such as smoking may be influenced not only by the neighborhoods in which people live but also by their work environment. Clearly, workplaces and residential neighborhoods need not be nested neatly one within the other. Thus, the relevant question is: what is the relative contribution of different contextual settings that may not be nested within one another, but overlap (e.g., neighborhoods and workplaces)? A related situation is one in which individual health behaviors are influenced not only by the characteristics of the neighborhood in which they occur but also by characteristics of adjoining areas.
As can be seen, the focus of the conceptual approach outlined above is on ascertaining heterogeneity in health either through multiple contexts, multiple times, or multiple outcomes. Standard statistical approaches, however, cannot deal with these requirements because (1) they operate at a single level, and (2) the emphasis is on modeling average relationships and not on the underlying heterogeneity per se. Multilevel statistical methods provide a unified and powerful approach to address these issues (de Leeuw and Kreft, 1986; Bryk and Raudenbush, 1992; Longford, 1993; Goldstein, 1995).
Multilevel methods are pertinent when the research problem under investigation has a multilevel structure and/or when a process is thought to operate at more than one level. It is important to note that multilevel models are not simply about modeling average relationships; their distinctive strength lies in modeling the different sources of variability that underlie such relationships at different levels of analysis.
MULTILEVEL STRUCTURES: AN OVERVIEW
It is well known that once groupings are created (consisting of individuals), even if their origins are essentially “random”, individuals end up being influenced by their group membership. Such groupings can be spatial (e.g., areas) or nonspatial (e.g., ethnicities), and in this chapter our focus is on the former. Hierarchies are one way of representing the dependent, or correlated, nature of the relationship between individuals and their groups. Thus, for instance, we can conceptualize a two-level structure of many level-1 units (e.g., individuals) nested within fewer level-2 groups (e.g., neighborhoods).
Multilevel structures, however, may also arise as a consequence of study design. For reasons of cost and efficiency, many large-scale surveys adopt a multistage design. For example, a survey of health status might involve a three-stage design, with regions sampled first, then neighborhoods, and then individuals. A design of this kind generates a three-level hierarchical structure of individuals at level 1, nested within neighborhoods at level 2, which in turn are nested in regions at level 3. Individuals living in the same neighborhood can be expected to be more alike (i.e., they are autocorrelated, or clustered) than they would be if the sample were truly random. Similar autocorrelation can be expected for neighborhoods within a region. As a consequence, such a clustered sample does not contain as much information as simple random samples of similar size, and ignoring this autocorrelation can result in an increased risk of finding a relationship where none exists (Skinner et al., 1989).
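The loss of information from clustering can be made concrete with the widely used design-effect approximation for equal-sized clusters, 1 + (m − 1)ρ, where m is the number of individuals per neighborhood and ρ is the intra-neighborhood correlation. The sketch below is only illustrative: it assumes a simple two-stage design with equal cluster sizes (not the three-stage design described above), and the function names and numbers are hypothetical.

```python
def design_effect(cluster_size, icc):
    """Approximate design effect for equal-sized clusters: 1 + (m - 1) * rho."""
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent observations the clustered sample is roughly worth."""
    return n_total / design_effect(cluster_size, icc)


# Hypothetical survey: 2000 individuals, 20 per neighborhood, correlation 0.05.
deff = design_effect(20, 0.05)
n_eff = effective_sample_size(2000, 20, 0.05)
```

Under these assumed values, even a modest within-neighborhood correlation nearly halves the effective sample size, which is why standard errors computed as if observations were independent tend to be too small.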
While the conventional approach to such correlated data structures is to treat the clustering as a nuisance that needs to be minimized and/or adjusted/corrected, multilevel models view such hierarchical structures as a feature of the population that is of substantive interest. Indeed, “once you know that hierarchies exist, you see them everywhere” (Kreft and de Leeuw, 1998). Individuals, neighborhoods, and regions are seen as distinct structures of the population that should be measured and modeled.
INTRODUCING MULTILEVEL CONCEPTS: VARYING RELATIONSHIPS
One of the main attractions of multilevel models for public health research is their ability to allow relationships to vary across different contextual settings.
In Figure 4–2(c)–(f) the contextual variations in poor health–age are allowed to become more complex. In Figure 4–2(c), the pattern is such that neighborhoods make very little difference for the young, but there is a greater degree of neighborhood variation in poor health among the old. Conversely, Figure 4–2(d) shows relatively large neighborhood differentials in poor health for the young. Figure 4–2(e) shows some neighborhoods where the young are in poor health, and others where it is the old. The final graph, Figure 4–2(f), shows that there is no overall, or average, relationship between poor health–age (the single thicker line is horizontal), but specific neighborhoods have distinctive relationships.
The different patterns in Figure 4–2 are achieved by allowing the average (fixed) “intercept” and the average (fixed) “slope” to vary (be random) across neighborhoods. Multilevel models specify the different intercepts and slopes for each context as coming from a distribution at a higher level. The different forms of relationships represented in Figure 4–2(c)–(f) are a result of how the intercepts and slopes are associated. The graphical models in Figure 4–2(c)–(f) are also called “random-slopes” or “random coefficients” models because the patterns are achieved by allowing the fixed slope to vary across neighborhoods. Figure 4–2(b), meanwhile, is the simplest form of multilevel model and is referred to as a “random-intercepts” or “variance components” model, as only intercepts are allowed to vary across neighborhoods.
For instance, in Figure 4–2(c) the relationship between poor health and age is strongest in neighborhoods (a steeper slope) where poor health rates are quite high for average age groups (a high intercept). Stated differently, there is a positive association between the intercepts and the slopes. In Figure 4–2(d) high intercepts are shown to be associated with shallower slopes, that is, a negative association between the slopes and the intercepts. The complex criss-crossing in Figure 4–2(e) results from a lack of pattern between the intercepts and the slopes, such that the poor health rates of a neighborhood at average age tell us nothing about the direction and magnitude of the poor health–age relationship. The distinctive feature of Figure 4–2(f) results from the slopes varying around zero. In other words, while typically there is no poor health–age relationship, in some neighborhoods the slope is positive, in others negative. In this case, a single-level model would reveal no relationship whatsoever between poor health and age, and as such the “average” relationship would not occur anywhere.
FROM GRAPHS TO EQUATIONS
All statistical regression equations have the same underlying function, which can be expressed algebraically as:
Response = Fixed/Average Parameters + (Random/Variance Parameters)
We begin with a single individual level regression model:
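In the notation used in the rest of this section, such a model can be sketched as follows. This is a reconstruction from the surrounding definitions rather than a quoted formula: y_i is the poor health score of individual i, x_{1i} is that individual's age, and e_{0i} is the individual residual.

```latex
\begin{equation}
y_{i} = \beta_{0} + \beta_{1} x_{1i} + e_{0i}
\tag{1}
\end{equation}
```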
The antithesis of this individual model (in which health depends only on individual characteristics, such as age) is one in which health depends only on the neighborhood in which a person lives. This is achieved by specifying a micro model, with the response, intercept, and the individual random term now indexed to distinguish between j neighborhoods:
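A form consistent with this description, offered as a reconstruction from the surrounding text, pairs a micro model for individual i in neighborhood j with a macro model for the neighborhood intercept. The numbering as (2) and (3) is assumed from the way the text later refers to these equations.

```latex
\begin{align}
y_{ij} &= \beta_{0j} + e_{0ij} \tag{2} \\
\beta_{0j} &= \beta_{0} + u_{0j} \tag{3}
\end{align}
```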
The poor health rate in each of the j neighborhoods now depends on the fixed average, β0, plus a random difference allowed to vary for each neighborhood (u0j). Because the neighborhood differences are allowed to vary according to a higher level distribution (and making the usual IID assumptions), this distribution can be summarized by its overall mean, β0, and its variance, σ²u0. This model presumes, as does much of census-based mapping of health outcomes, that a description of poor health can be summarized by a single rate for each place.
Both models considered so far are potentially deficient. In equation (1), a single individual model is fitted to all neighborhoods, thereby suppressing important contextual differences that may underlie average relationships. For instance, if the graphical model in Figure 4–2(f) is true, with no average relationship between poor health and age but each neighborhood showing positive or negative relationships, the individual model would be extremely misleading.
In relation to equations (2) and (3), meanwhile, the apparent neighborhood differences might be artifacts of the differential composition of neighborhood populations. Consequently, the model may be overemphasizing or underestimating the “true” contextual differences between neighborhoods. For example, an apparently high neighborhood specific rate could be merely the result of that neighborhood having a larger number of older people, a group who, in general, are more likely to be in poor health (composition-based neighborhood difference).
The converse is also possible, whereby genuinely large contextual effects are masked by failing to control for a neighborhood’s composition. Such a result can occur, for instance, when a neighborhood with a genuinely high rate of poor health has relatively high numbers of young people, who, despite enjoying a lower rate of poor health on average, are nonetheless more likely to be ill compared to the old in other neighborhoods.
In a model that fails to adequately specify individual characteristics, the context (the difference a place makes) is confounded with the compositional (what is in a place). This can be remedied by combining the individual-only model specified in equation (1) with the context-only model specified in equations (2) and (3).
Central to developing a multilevel model is the specification of models at each desired level and their combination into an overall model. Equation (1) can be rewritten as a revised micro model with the response, intercept, and the individual random term now suitably indexed with the subscript j to distinguish neighborhoods:
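A sketch of this random-intercepts specification, reconstructed from the definitions in the text, is given below. The first line is the revised micro model, the second is the macro model for the neighborhood intercept, and the third is what results from substituting one into the other. The original equation numbers are not recoverable with certainty, so none are shown.

```latex
\begin{align}
y_{ij} &= \beta_{0j} + \beta_{1} x_{1ij} + e_{0ij} \\
\beta_{0j} &= \beta_{0} + u_{0j} \\
y_{ij} &= \beta_{0} + \beta_{1} x_{1ij} + \left( u_{0j} + e_{0ij} \right)
\end{align}
```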
Such a model assumes that neighborhoods are uniformly high or low in terms of poor health rates and is equivalent to Figure 4–2(b). Such an assumption may be overly simplistic, for the age effect may vary across neighborhoods. Incorporating this complexity requires that all of the β parameters are indexed in the micro model, such that:
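Reconstructed from the surrounding description, the micro model with both coefficients indexed by neighborhood, together with one macro model for each coefficient, can be sketched as:

```latex
\begin{align}
y_{ij} &= \beta_{0j} + \beta_{1j} x_{1ij} + e_{0ij} \\
\beta_{0j} &= \beta_{0} + u_{0j} \\
\beta_{1j} &= \beta_{1} + u_{1j}
\end{align}
```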
As before, substituting the macro models into the micro model gives us:
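In the same notation, the combined random-slopes model would take the following form (a reconstruction, with the random part collected in parentheses):

```latex
\begin{equation}
y_{ij} = \beta_{0} + \beta_{1} x_{1ij} + \left( u_{0j} + u_{1j} x_{1ij} + e_{0ij} \right)
\end{equation}
```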
The key change is that the age effect in neighborhood j in equation (10) consists of a fixed average age effect across all neighborhoods, β1, and a differential age effect that is specific to each neighborhood, u1j. The novel features of a multilevel model are, therefore, the level-2 random terms (u0j, u1j) at the neighborhood level.
The model in equation (12) does not, however, allow for heterogeneity between individuals within neighborhoods. Indeed, the standard assumption of ordinary least squares regression models is that residuals at level 1 have a constant variance (the assumption of homoskedasticity). As the variability about the fitted average line is presumed to be constant, it is summarized in a single variance term, σ²e0. Such homoskedastic assumptions may be quite unrealistic; people of different ages may be differentially variable in terms of health status. While older people may have similar health status, young people may be much more variable.
Anticipating and modeling heteroskedasticity, or heterogeneity, at the individual level is particularly important in multilevel analysis, as there may be confounding across levels: what may appear to be contextual heterogeneity (level 2) could be due to a failure to take account of the between-individual (within context) heterogeneity (level 1). In addition, heterogeneity at level 1 also has implications for the fixed part predictions and inferences. Heterogeneity at level 1 can be incorporated by allowing the fixed parameter, β1, associated with age, x1ij, to vary at the individual level, giving us the following multilevel model:
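Written with the constant x_{0ij} (equal to 1 for every individual) made explicit, a form consistent with the description that follows is sketched below. It is a reconstruction from the terms the text names (u_{0j}, u_{1j}, e_{0ij}, e_{1ij}) rather than a quoted formula.

```latex
\begin{equation}
y_{ij} = \beta_{0} x_{0ij} + \beta_{1} x_{1ij}
       + \left( u_{0j} x_{0ij} + u_{1j} x_{1ij} + e_{0ij} x_{0ij} + e_{1ij} x_{1ij} \right)
\tag{13}
\end{equation}
```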
The model now estimates two sets of residuals at level 1: e0ij, which is associated with x0ij, the constant; and e1ij, which is associated with x1ij, individual age. Using the model specified in equation (13), we now discuss the key characteristics of a multilevel statistical model.
MODELING INDIVIDUAL AND CONTEXTUAL HETEROGENEITY
Multilevel models are essentially concerned with modeling both the average and the variation around the average. To accomplish this, they consist of two sets of parameters: those summarizing the average relationship(s) and those summarizing the variation around the average at both the level of individuals and neighborhoods. Thus, in equation (13) the parameters β0 and β1 are fixed and give the average poor health–age relationship. The remaining subscripted parameters in the brackets are random (allowed to vary) and represent the differences in poor health between neighborhoods and between individuals within neighborhoods.
Representing the between-neighborhood differences in equation (13) are two terms, u0j and u1j, associated with x0ij and x1ij, respectively. However, it is not the neighborhood-specific values that are estimated by multilevel models. Rather, they estimate the variance and the covariance based on the underlying distribution of the neighborhood-specific differences. Making the usual IID assumptions, the neighborhood differences at level 2 can be summarized through a variance–covariance parameter matrix consisting of the intercept variance (σ²u0), slope variance (σ²u1), and covariance (σu0u1). Following a well-known result (Weisberg, 1980), the combined variability for two random variables at level 2 can be written as:
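Applying the standard result for the variance of a weighted sum of two correlated random variables, the level-2 variance function would be as follows (a reconstruction using the parameter names given in the text):

```latex
\begin{equation}
\operatorname{Var}\left( u_{0j} x_{0ij} + u_{1j} x_{1ij} \right)
  = \sigma^{2}_{u0} x_{0ij}^{2} + 2 \sigma_{u0u1} x_{0ij} x_{1ij} + \sigma^{2}_{u1} x_{1ij}^{2}
\tag{14}
\end{equation}
```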
In terms of differences between individuals at level-1, there are also two terms in equation (13). Making the same assumptions and following a similar procedure, we obtain:
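By the same result, the level-1 variance function would take the parallel form below (again a reconstruction, with e replacing u throughout):

```latex
\begin{equation}
\operatorname{Var}\left( e_{0ij} x_{0ij} + e_{1ij} x_{1ij} \right)
  = \sigma^{2}_{e0} x_{0ij}^{2} + 2 \sigma_{e0e1} x_{0ij} x_{1ij} + \sigma^{2}_{e1} x_{1ij}^{2}
\tag{15}
\end{equation}
```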
Thus, the between-neighborhood variation and between-individual (within neighborhood) variation is estimated in relation to age based on the variance parameters for the constant and the predictor variable and their covariance at each level. Thus, the fixed parameters give the average poor health–age relationship, while the random parameters combine to form quadratic functions representing differences between neighborhoods and between individuals (within neighborhoods).
Heterogeneity as a Quadratic Function
More generally, the sign and magnitude of the three random parameters at each level reflect the structure and form of the heterogeneity in the poor health–age relationships. Figure 4–3 presents one set of illustrative results. Figure 4–3(a) shows the shape of the between-neighborhood variation in relation to age; Figure 4–3(b) shows the shape of the between-individual variation in relation to age. Figure 4–3(c) and 4–3(d), meanwhile, plot the average poor health–age relationship (solid line) and the corresponding 95% predictive intervals for the neighborhood-level population heterogeneity (dashed lines) and individual-level heterogeneity (dashed lines). These are approximated as the predicted regression line based on the fixed part ± 1.96 times the square root of the estimated variance.
Heterogeneity as a Linear Function
Specifying a quadratic variation between individuals and between neighborhoods may not be appropriate, and instead of differences increasing or decreasing at an accelerating or decelerating rate, they may change at a linear rate. We can, therefore, specify the heterogeneity at each level as a linear function of age. While we would still write the model in equation (13), we would only estimate one variance and one covariance at each level rather than the full set of variances and covariances. Thus, we would specify:
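Dropping the slope-variance term at each level while keeping the intercept variance and the covariance gives linear variance functions of the following form (a reconstruction in the notation used above):

```latex
\begin{align}
\text{level 2:} \quad & \sigma^{2}_{u0} x_{0ij}^{2} + 2 \sigma_{u0u1} x_{0ij} x_{1ij} \\
\text{level 1:} \quad & \sigma^{2}_{e0} x_{0ij}^{2} + 2 \sigma_{e0e1} x_{0ij} x_{1ij}
\end{align}
```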
Although estimating a covariance when there is only one variance might seem contradictory, such a specification is entirely feasible (Goldstein, 1995; Rasbash et al., 2000). Put differently, if all three random parameters in equations (14) and (15) are significant, then the differences will be seen as a complex quadratic function of the individual predictor, age. On the other hand, if the slope variances σ²u1 and σ²e1, which contribute to the quadratic functional form, are not substantial and/or are zero, but the covariance terms σu0u1 x0ij x1ij and σe0e1 x0ij x1ij are significant and substantial, then the between-neighborhood and between-individual variation will be seen as a linear function of age. In light of this, and especially when we have several predictor variables that are allowed to vary at both levels, it is important to view all the parameters jointly, and it is the functional form of the heterogeneity that should be interpreted.
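The way the three random parameters combine into a functional form can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, the parameter values are invented, and age is assumed to be centered so that x = 0 is the average age. It evaluates the variance function with and without a slope variance and converts a variance into approximate 95% bounds around a fixed-part prediction.

```python
import math


def variance_function(x, var_intercept, cov_intercept_slope, var_slope=0.0, x0=1.0):
    """Variance at predictor value x implied by a random intercept and slope.

    var_intercept * x0**2 + 2 * cov_intercept_slope * x0 * x + var_slope * x**2
    With var_slope == 0 the function is linear in x; with the covariance
    also zero it is constant.
    """
    return (var_intercept * x0 ** 2
            + 2.0 * cov_intercept_slope * x0 * x
            + var_slope * x ** 2)


def predictive_interval(fixed_prediction, variance, z=1.96):
    """Approximate 95% bounds: fixed-part prediction +/- z * sqrt(variance)."""
    if variance < 0:
        # A linear variance function can turn negative outside a sensible range of x.
        raise ValueError("variance function is negative at this predictor value")
    half_width = z * math.sqrt(variance)
    return fixed_prediction - half_width, fixed_prediction + half_width


# Hypothetical estimates, evaluated 20 years below, at, and 20 years above average age.
ages = [-20.0, 0.0, 20.0]
quadratic = [variance_function(a, 4.0, 0.05, 0.01) for a in ages]
linear = [variance_function(a, 4.0, -0.05) for a in ages]
```

With these assumed values the quadratic function is smallest near the average age and grows toward both extremes, while the linear function with a negative covariance declines steadily with age.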
Figure 4–4 illustrates one set of possible results for a model based on linear variance functions. For both between-neighborhood variation, Figure 4–4(a), and between-individual (within-neighborhood) variation, Figure 4–4(b), these are negative. As before, they would be based on nonzero variances and negative “covariances,” although now there would be only two of the former, one at each level. The neighborhood differences again occur around the average relationship in which older people are less healthy, with smaller between-neighborhood, Figure 4–4(c), and smaller between-individual (within-neighborhood) differences, Figure 4–4(d).
Variance as a Constant Function
Rather than specify quadratic or linear functions, it may, of course, be appropriate to specify a constant function, such that between-neighborhood and between-individual within-neighborhood variation is unchanging with age, thus specifying:
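With only the intercept variance retained at each level, the variance functions reduce to constants (a reconstruction in the same notation, recalling that x_{0ij} equals 1 for every individual):

```latex
\begin{align}
\text{level 2:} \quad & \sigma^{2}_{u0} x_{0ij}^{2} = \sigma^{2}_{u0} \\
\text{level 1:} \quad & \sigma^{2}_{e0} x_{0ij}^{2} = \sigma^{2}_{e0}
\end{align}
```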
Because the variance functions are based on random parameters, they relate to the broader population of neighborhoods rather than simply the specific sampled neighborhoods. This way of handling neighborhood heterogeneity is in direct contrast to techniques such as analysis of variance/analysis of covariance (ANOVA/ANCOVA) or the specification of indicator neighborhood dummies in the fixed part. These approaches are neither efficient nor parsimonious (Jones and Bullen, 1994). Because they use traditional OLS estimation procedures, they are unable to handle the between-individual heterogeneity because this violates the assumption of homoskedasticity. At the same time, inferences of between-neighborhood heterogeneity are based on only the specific neighborhoods explicitly identified and not the wider population from which they are drawn.
It should, however, be noted that at the neighborhood level, predictions of specific relationships (u0j, u1j) can be obtained once the overall variance functions have been estimated. Thus, a multilevel estimation procedure can be viewed as a two-stage process. In the first, the overall variance functions are estimated together with the fixed parameters. In the second, these overall fixed and random parameters are combined with neighborhood-specific intercepts and slopes. If a particular neighborhood has few observations or there is little variation in the predictor variable(s), the predictions for such a neighborhood will be down-weighted or shrunk toward the overall fixed relationship (Morris, 1983). A reliably estimated within-neighborhood relationship will, however, be largely immune to this shrinkage. In Bayesian terminology these predictions are known as the posterior residual estimates. By using shrinkage estimators, multilevel models have the potential to avoid the misestimation problems caused by small numbers and sampling fluctuations in traditional methods based on single-level regressions (Jones and Bullen, 1994).
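The shrinkage described here can be illustrated for the simplest case. The sketch below assumes a random-intercepts model only (not the random-slopes case discussed above), in which the predicted neighborhood residual is the raw mean residual multiplied by the factor σ²u0 / (σ²u0 + σ²e0 / n_j), where n_j is the number of individuals sampled in neighborhood j. The function names and numerical values are hypothetical.

```python
def shrinkage_factor(var_between, var_within, n_j):
    """Weight given to a neighborhood's own data: approaches 1 as n_j grows."""
    return var_between / (var_between + var_within / n_j)


def shrunken_residual(raw_mean_residual, var_between, var_within, n_j):
    """Predicted neighborhood residual after shrinking toward zero (the overall average)."""
    return shrinkage_factor(var_between, var_within, n_j) * raw_mean_residual


# Two hypothetical neighborhoods with the same raw departure from the overall
# average (2.0) but very different sample sizes.
small = shrunken_residual(2.0, var_between=1.0, var_within=9.0, n_j=3)
large = shrunken_residual(2.0, var_between=1.0, var_within=9.0, n_j=90)
```

Under these assumed variances, the neighborhood with three respondents is pulled most of the way back toward the overall average, whereas the neighborhood with ninety respondents retains most of its raw difference.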
In the preceding paragraphs, we showed how multilevel models are not just concerned with the “average” or “fixed effect,” but about how people, groups, and neighborhoods vary. This is achieved through the specification of variance functions based on random parameters. Crucially, there are no built-in assumptions about the heterogeneity that exists at a particular level. Instead, it is possible to specify differential functional forms (constant, linear, or quadratic) at each level and evaluate which receives the best empirical support from the data. For instance, as shown in Figure 4–6, while the between-neighborhood variation can be a positive quadratic function of age, Figure 4–6(a), the between-individual (within neighborhood) could decrease with age according to a linear function, Figure 4–6(b). In Figure 4–6(c) the dashed lines represent 95% predictive intervals for neighborhood-level population heterogeneity, and in Figure 4–6(d) the dashed lines represent the bounds for the individual heterogeneity, after taking account of the neighborhood heterogeneity, around the average regression line. While older people are less variable, Figure 4–6(d), there are larger neighborhood differences for such people, Figure 4–6(c).
VARIANCE PARTITIONING IN MULTILEVEL MODELS
In multilevel models, residual variation in the response is partitioned into components that can be attributed to the different levels of analysis. Much interest is focused on the amount of variation attributable to the higher level.
Equation (20) divides the level 2 variance by the total variance (level 2 + level 1 variance). This statistic is also known as the Intra-Unit Correlation (in survey literature referred to as intra-class correlation, ICC). ICC gives us the correlation between two individuals within the same level 2 unit but with different xi values for different individuals. As a result, the variance partition coefficient (VPC) and the ICC have the same formula in a random-intercepts model, as the xi values relate only to the constant, x0, which is the same for each individual. However, in a complex random-slopes model, as specified in equations (12) or (13), we have a variance function at level 2 that is related to the individual predictor variable, x1, and as such we cannot have a summary ICC statistic, because x1 can take different values for different individuals, hence the terminology, VPC (Goldstein et al., 2002). The VPC in complex random-slopes models as specified in equation (12) is given by:
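Taking the level-2 variance function as the numerator and adding the constant level-1 variance in the denominator, a form consistent with this description is sketched below. It is a reconstruction that assumes the constant x_{0ij} = 1 and a homoskedastic level 1.

```latex
\begin{equation}
\mathrm{VPC}(x_{1ij}) =
  \frac{\sigma^{2}_{u0} + 2 \sigma_{u0u1} x_{1ij} + \sigma^{2}_{u1} x_{1ij}^{2}}
       {\sigma^{2}_{u0} + 2 \sigma_{u0u1} x_{1ij} + \sigma^{2}_{u1} x_{1ij}^{2} + \sigma^{2}_{e0}}
\end{equation}
```

When the slope variance and the covariance are both zero, this expression reduces to σ²u0 / (σ²u0 + σ²e0), the ratio of level 2 variance to total variance that applies in the random-intercepts case.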
MODELING CATEGORICAL PREDICTORS
While, so far, our example was based on a continuous predictor (age), multilevel models can readily analyze categorical predictor variables. Figure 4–7 illustrates the interpretation of neighborhood heterogeneity with categorical predictors. We consider social class as a two-category individual variable, high social class and low social class, and these are shown on the horizontal x-axis, with the response being a continuous score of poor health (y-axis) in Figure 4–7.
Figure 4–7(a) presents the simplest outcome: differences between social groups but no variation between neighborhoods. With only one fixed average for each group, it shows an individual-level model in which the same relationship is fitted to all neighborhoods. Figure 4–7(b) represents a two-level model with each of six neighborhoods having its own poor health–social class relationship. The thick solid lines represent the average poor-health rates for the two groups, while the symbol lines (one for each neighborhood) represent the variation between neighborhoods around the average line. Because the individual relationship between social class and poor health is also shown in the model, the graph implies that the variation between neighborhoods is not solely due to the varying social composition of neighborhoods and is, therefore, contextual. The neighborhood differences, however, are assumed to be simple, such that neighborhoods that are high for one group are also high for the other and vice versa (similar to the random-intercepts model) as shown by the similar ordering of symbols for both social class categories in Figure 4–7(b). Thus, while there is a (contextual) geography of poor health, it can be summarized in one map.
We can, however, anticipate the neighborhood variation to be significantly different for the two groups. This difference consists of two dimensions. First, the amount (range) of neighborhood variation can be different
When individual categorical predictors are allowed to vary at their own level (that is, at level 1), it is important to note that individuals cannot belong to, for example, both social class groups. Investigating whether the individual social class effect (compositional) varies by neighborhood, such that neighborhoods matter differently for high social class and low social class (contextual heterogeneity) and whether one group is more variable than the other (individual heterogeneity) would require the following statistical form:
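The displayed form can be sketched from the variable and residual definitions used in the surrounding text. The reconstruction below is inferred from those definitions rather than taken from the original display, with $x_{0ij}$ the constant and $x_{1ij}$ the contrast dummy for low social class:

```latex
y_{ij} = \beta_0 x_{0ij} + \beta_1 x_{1ij}
       + \left( u_{0j} x_{0ij} + u_{1j} x_{1ij} + e_{1ij} z_{1ij} + e_{2ij} z_{2ij} \right)
\tag{22}
```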
In order to implement equation (22), we have created two new separate indicator variables: $z_{1ij}$ is the indicator variable for low social class (1 if low social class, 0 otherwise), and $z_{2ij}$ is the indicator variable for high social class (1 if high social class, 0 otherwise). It is important to note that the new indicator variables are associated only with the level 1 residual terms. The fixed parameter, $\beta_0$, associated with the constant, $x_{0ij}$, gives the average poor health score for high social class, and $\beta_1$, associated with the contrast-coded dummy, $x_{1ij}$, estimates the average differential for low social class. Thus, the average poor health score for low social class would be given as $\beta_0 + \beta_1$.
Making the usual IID assumptions, the residuals at level 2 ($u_{0j}$, $u_{1j}$) and at level 1 ($e_{1ij}$, $e_{2ij}$) can be summarized through a set of variances and covariances. Thus, $\sigma^2_{u0}$ would estimate the between-neighborhood variation for high social class, while $\sigma^2_{u1}$ gives the differential variance for low social class, that is, the extent to which the variance for low social class is different from high social class. The between-neighborhood variation for low social class would be given by $\sigma^2_{u0} + 2\sigma_{u0u1} + \sigma^2_{u1}$.
If the covariance term is positive ($+\sigma_{u0u1}$), then the variation for low social class will be greater compared to that for high social class, and neighborhoods that have high rates for one group will tend to be relatively higher for the other. A negative covariance ($-\sigma_{u0u1}$), meanwhile, could imply either that low social class is less variable or that neighborhoods that are high for one group are relatively low for the other, as was shown in Figure 4–7(c). The exact interpretation would depend on the relative size of the covariance in relation to the variance, $\sigma^2_{u1}$.
At the individual level, $\sigma^2_{e2}$ gives the variability for high social class, while $\sigma^2_{e1}$ directly estimates the variability for low social class. Because individuals cannot belong to more than one category, there is not enough information to estimate the full “quadratic” function at level 1, as was specified in equation (15) in the case of the continuous predictor variable.
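The arithmetic behind these group-specific variances can be made concrete with a minimal sketch. The function names and all numerical values below are hypothetical and chosen only for illustration; they are not estimates from any data set.

```python
def between_neighborhood_variance(var_u0, cov_u0u1, var_u1, x1):
    """Variance of u0j + u1j * x1 for the 0/1 low-social-class dummy x1."""
    return var_u0 + 2.0 * cov_u0u1 * x1 + var_u1 * x1 ** 2


def between_individual_variance(var_e1, var_e2, low_class):
    """Level 1 variance: one separate term per category, not a difference."""
    return var_e1 if low_class else var_e2


# Hypothetical variance components (illustrative only, not estimates).
var_u0, cov_u0u1, var_u1 = 4.0, -1.5, 2.0
var_e1, var_e2 = 30.0, 20.0

high_between = between_neighborhood_variance(var_u0, cov_u0u1, var_u1, x1=0)
low_between = between_neighborhood_variance(var_u0, cov_u0u1, var_u1, x1=1)
print("neighborhood variance, high social class:", high_between)
print("neighborhood variance, low social class:", low_between)
print("individual variance, low social class:",
      between_individual_variance(var_e1, var_e2, low_class=True))
```

With these illustrative values, twice the negative covariance outweighs the differential variance, so the between-neighborhood variance for low social class is smaller than that for high social class.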
Allowing different specifications in different parts is an important characteristic of a multilevel model. For instance, in the above illustration, the fixed effect of social class was specified as a difference, as was the contextual heterogeneity at the neighborhood level. Put differently, the model in equation (22) ascertains the average social class gap and the extent to which the social class gap varies across neighborhoods at level 2. However, at level 1, individual heterogeneity between social class groups was specified separately and not as a difference from the other. Furthermore, the “linear” formulation that was discussed in relation to continuous predictor variables at the neighborhood level can be extended to categorical predictor variables as well (Bullen et al., 1997). This flexibility is particularly useful when we have a range of categorical predictors with each of the categorical predictors having two or more categories.
An attractive feature of multilevel models—one that is commonly used in health research—is their ability to model contextuality as a function of characteristics that relate to neighborhoods in addition to individual characteristics. At the same time, the nature and type of interactions between individual characteristics and neighborhood characteristics can also be assessed.
We illustrate the idea of such cross-level interactions by building on our running example of a two-level model (individuals at level 1 within neighborhoods at level 2), with the response being a score for poor health for each individual. We consider the categorical individual predictor, social class (with high social class as a reference and low social class specified as a contrast indicator variable) and a continuous neighborhood-level contextual predictor (e.g., socioeconomic deprivation index). Figure 4–8 portrays a range of hypothetical graphical models. In Figures 4–8(a)–(h), the y-axis represents the poor health score and the x-axis represents the neighborhood socioeconomic deprivation index. The dashed line represents low social class, and the solid line represents high social class.
Figure 4–8(a) shows marked differences between high social class and low social class but no contextual effect for neighborhood socioeconomic deprivation (all individual, no contextual). Figure 4–8(b) represents the converse: a small difference between the two social groups but a large contextual effect of socioeconomic deprivation (all contextual, no individual). The parallel lines in Figure 4–8(c) and 4–8(d) show both individual and contextual effects. In Figure 4–8(c) the neighborhood socioeconomic deprivation is shown to have a detrimental effect on the health of the individuals, and the reverse is shown in Figure 4–8(d). The key point is that the contextual effect of socioeconomic deprivation is seen to be the same for both high social class and low social class. Put differently, while neighborhood socioeconomic deprivation explains the prevalence of poor health, it does not account for the inequalities in health between the social class groups.
In Figure 4–8(e) contextual effects are different for different groups. They are shown to be positive for high social class and negative for low social class, such that in neighborhoods with the highest level of socioeconomic deprivation, health inequalities are minimal. Thus, neighborhood-level socioeconomic deprivation is not only related to average health achievements but also shapes social inequalities in health. Figure 4–8(f) represents the case in which contextual effects are strong enough to invert the individual effects. Figures 4–8(g) and 4–8(h) show models in which nonlinear terms are of importance, such that the smallest or largest group inequalities in health are found at “average” levels of socioeconomic deprivation and not at the extreme levels of neighborhood socioeconomic deprivation.
Relating this to equation (22), a neighborhood-level continuous predictor variable, socioeconomic deprivation, $n_j$, referring to the context of the neighborhood, is now introduced. Crucially, the contextual variables in the multilevel model are specified in the macromodels and then combined into the overall model. Thus, the underlying macromodels for equation (22) are now:
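A sketch of these macromodels, inferred from the description of the $\alpha$ parameters in the text rather than copied from the original display, would be:

```latex
\begin{aligned}
\beta_{0j} &= \beta_0 + \alpha_0 n_j + u_{0j} \\
\beta_{1j} &= \beta_1 + \alpha_1 n_j + u_{1j}
\end{aligned}
```

Substituting both into the micro model yields a term $\alpha_1 n_j x_{1ij}$, which is the cross-level interaction between neighborhood deprivation and individual social class.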
The separate specification of micro and macromodels correctly recognizes that the contextual variables are predictors of between-neighborhood differences, after allowing for individual compositional variables. The α parameters represent the relationship between neighborhood differences (after controlling for the individual variable, social class) and the contextual variable, n j. Thus, α0 assesses the relationship between high social class (at the individual level) and the socioeconomic deprivation of the neighborhood. The parameter α1 represents the differential contextual effect for low social class. This formulation makes clear that it is only through multilevel models that cross-level interactions between individual and contextual characteristics can be robustly specified and estimated.
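The fixed part implied by such a specification can be illustrated with a short sketch. The function names and coefficient values are hypothetical; they are chosen so that the contextual effect is positive for high social class and negative for low social class, loosely mimicking the pattern described for Figure 4–8(e).

```python
def predicted_poor_health(beta0, beta1, alpha0, alpha1, deprivation, low_class):
    """Fixed-part prediction with a cross-level interaction.

    beta0: average score for high social class at deprivation 0
    beta1: average differential for low social class at deprivation 0
    alpha0: contextual effect of deprivation for high social class
    alpha1: differential contextual effect for low social class
    """
    x1 = 1 if low_class else 0
    return beta0 + alpha0 * deprivation + (beta1 + alpha1 * deprivation) * x1


def social_class_gap(beta1, alpha1, deprivation):
    """Low-minus-high difference in predicted poor health at a given deprivation."""
    return beta1 + alpha1 * deprivation


# Hypothetical coefficients (illustrative only).
beta0, beta1, alpha0, alpha1 = 10.0, 8.0, 1.0, -2.0

for deprivation in (0.0, 2.0, 4.0):
    high = predicted_poor_health(beta0, beta1, alpha0, alpha1, deprivation, False)
    low = predicted_poor_health(beta0, beta1, alpha0, alpha1, deprivation, True)
    print(deprivation, high, low, social_class_gap(beta1, alpha1, deprivation))
```

When `alpha1` is zero the two lines are parallel, as in Figures 4–8(c) and 4–8(d); a nonzero `alpha1` makes the social class gap depend on neighborhood deprivation.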
MULTIPLE SPATIAL CONTEXTS
Most existing accounts of multilevel methods have been restricted to two-level structures, typically with individuals at level 1 and places at level 2. In this section we extend the model to consider the multiplicity of spatial levels in public health. For instance, in the United States, geographical units such as block groups (BGs), census tracts (CTs), counties, and states may each exert a differential influence on health in the population. Despite this, most research examining the effects of context on health has conceptualized contextual effects at only one level of geography.
Multiple hierarchical geographic levels may be needed to explain the mechanisms by which contexts at different levels affect health. The multiplicity of geographic levels raises a fundamental issue: determining the number of levels necessary to analyze a particular health outcome and the relative importance of different levels. Failure to address this issue can result in variability being attributed to the wrong contextual level. Consider, for example, a hierarchy of different geographic levels in which BGs are nested within CTs, which in turn are nested within counties within states. If poor health had a strong dependence at the BG level but the analysis only considered the CT level, incorrect inferences would be made at both the individual level and the CT level.
To appreciate the importance and implications of including this additional spatial level (level 3), a series of graphical typologies is useful (Subramanian, Duncan, et al., 2001). For the purposes of clarity and ease of understanding, we start with the simple case, shown in Figure 4–9, in which we assume that the differences between places at two spatial levels are the same for both the social class groups. We continue with the use of the term neighborhoods to represent level 2 spatial units and introduce the term regions to represent level 3 spatial units.
In Figure 4–9, the y-axis represents the individual score for poor health, the solid thick line represents the fixed average, the thinner solid lines represent the regions, while the dashed and the dotted lines represent neighborhoods within regions A and B, respectively. In Figure 4–9(a) it can be seen that while regions vary significantly around the average line, such that one is high (Region B) and one is low (Region A), the neighborhoods within each lie close to their respective region lines. This suggests that there is no need to include neighborhoods as a level and that a structure of individuals nested within regions is sufficient to capture the main source of geographic variation. In Figure 4–9(b) the converse is portrayed: while the differences between regions are insignificant (i.e., they are grouped close to the overall average line), those between neighborhoods are substantial. This would suggest the greater importance of the neighborhood level compared to the region level. Finally, Figure 4–9(c) anticipates a situation with significant variation at both the region and neighborhood levels.
Ascertaining the relative importance of different spatial scales, after taking into account (individual) compositional effects, can provide important clues about the level “at which the action lies.” A multilevel framework is ideally and readily suited to this task. Thus, underlying Figure 4–9 is a multilevel model based on a three-level structure of individuals (level 1) nested within neighborhoods (level 2) nested within regions (level 3). The micro model can be written as:
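One way to write such a three-level random-intercepts model is sketched below. It assumes that social class enters only the fixed part, in line with the simplifying assumption behind Figure 4–9; the exact form and numbering of the original display are not recoverable here, so this is an illustration in the chapter's notation rather than the original equation.

```latex
y_{ijk} = \beta_0 x_{0ijk} + \beta_1 x_{1ijk}
        + \left( v_{0k} + u_{0jk} + e_{0ijk} \right),
\qquad
v_{0k} \sim N(0, \sigma^2_{v0}),\;
u_{0jk} \sim N(0, \sigma^2_{u0}),\;
e_{0ijk} \sim N(0, \sigma^2_{e0})
```

Here $i$ indexes individuals, $j$ neighborhoods, and $k$ regions.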
Depending on the relative size of the neighborhood- and region-level variance terms ($\sigma^2_{u0}$ and $\sigma^2_{v0}$, respectively) that summarize the place-specific differentials at each level, this model would produce one of the patterns shown in Figure 4–9.
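A simple way to compare the levels is to express each variance component as a share of the total residual variance. The sketch below uses hypothetical variance estimates and a hypothetical function name; it only illustrates the arithmetic.

```python
def variance_shares(var_region, var_neighborhood, var_individual):
    """Share of total residual variance attributable to each level."""
    total = var_region + var_neighborhood + var_individual
    return {
        "region": var_region / total,
        "neighborhood": var_neighborhood / total,
        "individual": var_individual / total,
    }


# Hypothetical estimates resembling the pattern of Figure 4-9(a):
# large region variance, small neighborhood variance.
shares = variance_shares(var_region=6.0, var_neighborhood=1.0, var_individual=13.0)
for level, share in shares.items():
    print(f"{level}: {share:.2f}")
```

Swapping the region and neighborhood values would instead resemble the pattern described for Figure 4–9(b).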
MULTILEVEL RESIDUAL MAPPING
While it is the variances that are estimated in a multilevel model at each of the specified levels, it is possible to estimate place-specific (posterior) residuals at each of the contextual levels. Residual mapping is an extremely useful application of multilevel models, especially when interest lies in simultaneous multiple geographies and when all the units at each of the geographic levels can be observed in the analysis (e.g., the census) (Subramanian et al., 2001). In order to appreciate this, Figure 4–10 unpacks the way in which residuals are constructed when there are two spatial levels.
The region-specific residuals ($v_{0k}$) at level 3 represent the difference from the fixed average line, $\beta_0$. For example, Region A will have a negative residual given its lower rate of poor health compared to the overall average; Region B, in contrast, will have a positive residual given its higher rate compared to the average. Neighborhood-specific residuals
Such ideas are extremely useful for social policy (Goldstein and Spiegelhalter, 1996). As an example, consider Neighborhood 2 in Region A, and Neighborhood 2 in Region B in Figure 4–10. Both neighborhoods are seen to be performing well, with low rates of poor health (negative neighborhood residuals). The similarity in neighborhood effects is, however, occurring in entirely different contexts and as such may be telling quite different stories. While low rates in Neighborhood 2 in Region A are being achieved within a favorable context (a low-rate region), in Neighborhood 2 in Region B they are occurring within an unfavorable context (a high-rate region).
As this example illustrates, we have a nuanced way of evaluating and monitoring the performance of particular places. One possibility, as shown in Figure 4–11, is to have a simple fourfold typology of neighborhood health performance: Type I—unhealthy neighborhoods in unhealthy regions; Type II—unhealthy neighborhoods in healthy regions; Type III—healthy neighborhoods in unhealthy regions; and Type IV—healthy neighborhoods in healthy regions.
The purpose of such typologies is not simply methodological, but substantive and practical. For instance, Type I neighborhoods are doubly disadvantaged (“unhealthy” neighborhoods in “unhealthy” regions), while Type IV neighborhoods suggest a virtuous reinforcement of contextual advantage (“healthy” neighborhoods in “healthy” regions). For Type II and Type III neighborhoods, meanwhile, contextual advantage at one level offsets disadvantage at the other. Determining the cut-off points for what can be considered “healthy” and “unhealthy” is critical, and care must be taken while identifying specific places, an issue to which we shall return later in this chapter. Nonetheless, our aim here is to illustrate the potential of a multilevel approach for evaluative and monitoring exercises that are usually of interest for public health departments.
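The fourfold typology can be sketched as a small classification rule on the estimated residuals. The function name and the cut-off of zero are assumptions made purely for illustration; as noted above, the choice of cut-off is critical, and this sketch ignores the uncertainty attached to each residual.

```python
def neighborhood_type(region_residual, neighborhood_residual, cutoff=0.0):
    """Classify a neighborhood by its own residual and its region's residual.

    Residuals are on the poor-health scale, so values above the cutoff
    are treated as "unhealthy". The cutoff of 0.0 is an assumption.
    """
    unhealthy_region = region_residual > cutoff
    unhealthy_neighborhood = neighborhood_residual > cutoff
    if unhealthy_neighborhood and unhealthy_region:
        return "Type I"    # unhealthy neighborhood in unhealthy region
    if unhealthy_neighborhood:
        return "Type II"   # unhealthy neighborhood in healthy region
    if unhealthy_region:
        return "Type III"  # healthy neighborhood in unhealthy region
    return "Type IV"       # healthy neighborhood in healthy region


# Hypothetical residuals: a low-rate neighborhood in a low-rate region,
# and a low-rate neighborhood in a high-rate region.
print(neighborhood_type(region_residual=-1.2, neighborhood_residual=-0.4))
print(neighborhood_type(region_residual=1.5, neighborhood_residual=-0.4))
```

In practice one would also want to use the interval around each residual before labeling a specific place.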
PARAMETER ESTIMATION IN MULTILEVEL STATISTICAL MODELS
In this section we provide a brief overview of the estimation strategies that are used to fit multilevel models. Using observed data, a multilevel model estimates the regression coefficients (fixed parameters) and the variance components (random parameters). These parameters are usually estimated by maximum likelihood (ML), which chooses the parameter values that maximize the so-called likelihood function, that is, the probability of observing the sample data given the parameter values. ML estimators, therefore, are parameter estimates that maximize the probability of finding the sample data we have actually found (Hox, 1995). The ML estimators can be computed using the Newton–Raphson, Fisher scoring, iterative generalized least squares, or expectation maximization algorithms (Longford, 1993).
Computing the ML estimators requires an iterative procedure. At the outset, starting values for the various parameter estimates (usually based on the ordinary least squares regression estimates) are generated. In the next step the computation procedure improves upon the starting values to produce better estimates via generalized least squares. This step is repeated (iterated) until the changes in the estimates between two successive iterations become very small, indicating convergence, with the parameter estimates now being ML estimators. Lack of convergence could suggest model misspecification in the fixed part, misspecification of the variance–covariance structure (either too simple or too complex), or small sample sizes at different levels.
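The iterate-until-convergence logic can be illustrated with a minimal sketch. Note the substitution: instead of the generalized least squares updates described above, this sketch uses the expectation maximization (EM) algorithm, applied to the simplest case of a two-level random-intercepts model with no predictors. The function name, the crude starting values, and the data are all hypothetical.

```python
def fit_random_intercept_em(groups, tol=1e-10, max_iter=10000):
    """ML estimates for y_ij = mu + u_j + e_ij via EM.

    groups: list of lists, one inner list of responses per neighborhood.
    Returns a dict with mu, var_u (level 2), var_e (level 1), iterations,
    and a convergence flag.
    """
    n_total = sum(len(g) for g in groups)
    n_groups = len(groups)

    # Starting values: grand mean, total variance split equally across levels.
    mu = sum(sum(g) for g in groups) / n_total
    total_var = sum((y - mu) ** 2 for g in groups for y in g) / n_total
    var_u = var_e = total_var / 2.0

    for iteration in range(1, max_iter + 1):
        # E-step: posterior mean and variance of each neighborhood effect u_j.
        post_means, post_vars = [], []
        for g in groups:
            n_j = len(g)
            shrinkage = var_u / (var_u + var_e / n_j)
            post_means.append(shrinkage * (sum(g) / n_j - mu))
            post_vars.append(var_u * (1.0 - shrinkage))

        # M-step: update the parameters given the expected u_j.
        new_mu = sum(y - u for g, u in zip(groups, post_means) for y in g) / n_total
        new_var_u = sum(u * u + v for u, v in zip(post_means, post_vars)) / n_groups
        new_var_e = sum(
            sum((y - new_mu - u) ** 2 for y in g) + len(g) * v
            for g, u, v in zip(groups, post_means, post_vars)
        ) / n_total

        # Convergence check: largest change between successive iterations.
        change = max(abs(new_mu - mu), abs(new_var_u - var_u), abs(new_var_e - var_e))
        mu, var_u, var_e = new_mu, new_var_u, new_var_e
        if change < tol:
            return {"mu": mu, "var_u": var_u, "var_e": var_e,
                    "iterations": iteration, "converged": True}

    return {"mu": mu, "var_u": var_u, "var_e": var_e,
            "iterations": max_iter, "converged": False}


# Hypothetical poor-health scores for four neighborhoods of four people each.
scores = [[2, 4, 3, 5], [8, 7, 9, 10], [5, 6, 4, 7], [12, 11, 13, 10]]
result = fit_random_intercept_em(scores)
print(result)
```

The loop stops when the largest change in any parameter between two successive iterations falls below `tol`, which mirrors the convergence criterion described above; production software uses more efficient algorithms and handles predictors and complex variance structures.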
Two different varieties of ML estimators are used in the available software for multilevel modeling. One is the full information maximum likelihood (FIML), in which both the regression coefficients and the variance components are included in the likelihood function. The other is the restricted maximum likelihood (REML), and here only the variance components are included in the likelihood function. The difference is that FIML treats the estimates for the regression coefficients as known quantities when the variance components are estimated, while REML treats them as estimates that carry some amount of uncertainty (Bryk and Raudenbush, 1992; Goldstein, 1995). While REML is more realistic and is recommended, especially when the number of groupings is small (Bryk and Raudenbush, 1992), FIML is computationally less demanding and allows for greater comparison across different model specifications.
The ML theory is based on several assumptions, and three that are critical from an applied perspective are (1) the random parameters at all levels are normally distributed; (2) the level 2 random parameters are independent of the level 1 random parameters; and (3) the sample size is large and tends to infinity. In practice, these assumptions will, at best, be met only approximately. Violations of these assumptions could lead to bias of the estimators and incorrect standard errors. In recent years, however, Bayesian estimation using Gibbs sampling (Gilks et al., 1996; Browne, 2002) and quasi-likelihood estimation together with bias correction procedures (Goldstein and Rasbash, 1996) have been developed as alternatives. For inference, interval estimates are obtained directly from Gibbs sampling and, for likelihood-based estimation, via large-sample deviance statistics or bootstrapping.
We now turn to identifying some key extensions to the multilevel models outlined so far. We consider two types of extensions: the first relates to complex multilevel structures and the second relates to modeling specifications.
EXTENSIONS TO MULTILEVEL STRUCTURES
It is important to note that redesigning the data structure is a way by which some problematic issues are circumvented within multilevel frameworks.
Modeling Spatially Aggregated Data
While we have so far discussed the multilevel structure in terms of individuals at level 1 and places at level 2, we argue that a similar framework of people within places can be established using routinely available aggregate data (e.g., census and mortality data). As is well known, analyses of aggregated data confound the microscale of people and the macroscale of places. Although regrettable, this situation is usually tolerated owing to the other obvious attractions of these data sets (e.g., large, extensive coverage of places at multiple levels). A multilevel approach offers a solution to this problem (Subramanian et al., 2001).
Table 4–1 provides hypothetical data on deaths for two social groups in a format typical for spatially aggregated data. Thus, in Area 1, 9 out of 50 in the low social class category died in a particular year; in Area 2,
Table 4–1. Hypothetical Counts of Death and Total Population by Social Class by Areas

Counts of Death out of Total Population

| Area | Low Social Class | High Social Class |
|---|---|---|
| 1 | 9 out of 50 | 2 out of 50 |
| 2 | 5 out of 9 | 5 out of 95 |
|  | 10 out of 80 | 0 out of 50 |
| 50 | 20 out of 90 | 0 out of 0 |
Five points need to be made about this table. First, it is vital to note that underlying Table 4–1 is simply a set of individual records that happens to be presented in a tabular format but that can easily be changed into an individual record format. Second, just as individuals nest within areas, producing a two-level hierarchical data structure, so do the cells presented in Table 4–1. This is shown in Figure 4–12. Third, although here the data is cross-tabulated by only one individual characteristic, exactly the same principles apply when there is a greater degree of cross-tabulation. Fourth, if in an area there are no people of a particular type (e.g., missing high social class in Area 50 in Table 4–1), this poses no special problems, as multilevel data structures can be unbalanced as shown in Figure 4–12. Finally, there is good reason for invoking the notion of cells even when data is available in an individual record format because the amount of information, and therefore the associated computing time, can be reduced without any substantial loss of information.
Consequently, routinely available aggregated data can readily be adapted to a multilevel data structure with table cells at level 1 (representing the population groups) nested within places at level 2. The counts within each cell give the number of people with the outcome of interest (e.g., number of deaths) together with the “denominator” (the total population). The proportion so formed becomes the response variable, and the cell characteristics, meanwhile, are the individual predictor variables. Such a structure now lends itself to all the analytical capabilities that were discussed earlier (Subramanian et al., 2001).
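The restructuring of a table such as Table 4–1 into level 1 cells nested within level 2 areas can be sketched as follows. The record layout and function names are assumptions made for illustration; the counts are those given in the text for Areas 1, 2, and 50.

```python
# (area, social_class, deaths, population) for Areas 1, 2 and 50 of Table 4-1.
table_cells = [
    (1, "low", 9, 50), (1, "high", 2, 50),
    (2, "low", 5, 9), (2, "high", 5, 95),
    (50, "low", 20, 90), (50, "high", 0, 0),
]


def cells_to_level1_records(cells):
    """Turn table cells into level 1 records with a proportion response."""
    records = []
    for area, social_class, deaths, population in cells:
        if population == 0:
            # No people of this type in the area: the cell is simply absent,
            # which leaves an unbalanced but valid two-level structure.
            continue
        records.append({
            "area": area,                                   # level 2 identifier
            "low_class": 1 if social_class == "low" else 0, # cell-level predictor
            "deaths": deaths,
            "population": population,                       # denominator
            "proportion": deaths / population,              # response
        })
    return records


def group_by_area(records):
    """Collect level 1 cells under their level 2 area."""
    by_area = {}
    for record in records:
        by_area.setdefault(record["area"], []).append(record)
    return by_area


records = cells_to_level1_records(table_cells)
by_area = group_by_area(records)
for area, cells in by_area.items():
    print(area, [(c["low_class"], c["proportion"]) for c in cells])
```

Area 50 ends up with a single cell, which illustrates how an unbalanced structure arises without any special handling.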
Nonhierarchical Cross-Classified Structures
Individuals live their lives in a number of overlapping settings, such as neighborhood, workplace, home, and so on. Such contexts do not always lend themselves to a neat hierarchical structure. Instead, the different settings may overlap at the same level, producing a crossed structure. The importance of such structures has been long recognized and they are now technically and computationally tractable (Goldstein, 1994; Jones et al., 1998). The “quasi-hierarchical” format employed within cross-classified multilevel models enables an assessment of the relative importance of a number of different, overlapping contexts after allowing for the differential composition of each. Such models identify contexts that have a confounding influence, thus ascertaining the contexts that have the greatest significance. For example, a cross-classified model of health behavior (e.g., smoking) could be formulated with individuals at level 1 and both residential neighborhoods and workplaces at level 2, as shown in Figure 4–13(a). If account is not taken of this cross-classified structure, what may appear to be between-workplace variation could actually be between-neighborhood variation, and vice versa.
A related structure occurs if, for a single level 2 classification (e.g., neighborhoods), level 1 units (e.g., individuals) may belong to more than one level 2 unit. The individual can be considered to belong simultaneously to several neighborhoods, with the contributions of each neighborhood being weighted in relation to its distance (if the interest is spatial) from the individual.
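A minimal sketch of such distance-based weighting is shown below. The text says only that contributions are weighted in relation to distance; the inverse-distance rule, the function names, and the numbers used here are assumptions chosen for illustration.

```python
def membership_weights(distances):
    """Inverse-distance weights that sum to 1 across neighborhoods.

    distances: dict mapping neighborhood id to a positive distance
    from the individual. Inverse-distance weighting is an assumption.
    """
    inverse = {name: 1.0 / d for name, d in distances.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def weighted_neighborhood_effect(weights, residuals):
    """Combine neighborhood residuals using the membership weights."""
    return sum(weights[name] * residuals[name] for name in weights)


# Hypothetical distances (e.g., in km) and neighborhood residuals.
distances = {"A": 1.0, "B": 2.0, "C": 4.0}
residuals = {"A": 0.7, "B": -0.7, "C": 1.4}

weights = membership_weights(distances)
print(weights)
print(weighted_neighborhood_effect(weights, residuals))
```

Nearer neighborhoods receive larger weights, and normalizing the weights to sum to 1 keeps the combined effect on the same scale as a single neighborhood residual.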
Repeated Measures of People and Places
Health outcomes and behaviors as well as their causal mechanisms are rarely stable and invariant over time, producing data structures that involve repeated measures. Two possibilities arise depending on the unit that is repeatedly measured. When individuals are repeatedly measured within a panel design, the outcomes taken at different times form level 1. The same outcomes measured over different times are nested within individuals at level 2, which in turn nest within higher-level units (e.g., neighborhoods). This structure is shown in Figure 4–13(b) and allows the assessment of individual change within a contextual setting.
The other possibility is repeated cross-sectional surveys in which places are monitored at regular time intervals (repeatedly measuring places over time). The structure would then be individuals at level 1, time/years within places at level 2, and places at level 3, as shown in Figure 4–13(c). Such a structure permits an investigation of trends within geographic settings controlling for their compositional make-up. Multilevel models could be used
Multiple Yet Related Outcomes
Multivariate multilevel models can handle situations in which a number of different but related response measurements are made on individuals (Duncan, 1997). The key feature is that the set of responses (outcomes) is nested within individuals. The response could be a set of outcomes that relate to, for instance, different aspects of health behavior (e.g., smoking and drinking). Crucially, such responses could be a mixture of “quality” (do you smoke/do you drink) and “quantity” (how many/how much). A multilevel structure on different aspects of health behavior could be measurements (e.g., smoking and drinking both at level 1) nested within individuals (level 2) within neighborhoods (level 3). The substantive benefit of this approach is that it is possible to assess whether different types of behavior are related to individual characteristics in the same or different ways. Moreover, the residual covariances at level 2 and level 3 measure the “correlation” of behaviors between individuals and between places. Technical benefits in terms of efficiency result if the response is correlated and if there are many missing responses, as in matrix sample designs. Figure 4–13(d) presents a structure in which the responses at level 1 capture four different aspects of health behaviors, and Figure 4–13(e) portrays the idea of “mixed” (quality and quantity) responses on a particular aspect of health behavior. Thus, for example, person 1 in place 1 is a smoker and smokes 20 cigarettes, while person 3 in place 1 is a non-smoker and as such the response related to number of cigarettes smoked does not apply.
While for the purpose of clarity and ease of understanding we have discussed each of the multilevel structures separately, readers are urged to think about these structures in an integrated manner. For instance, in a model of health behaviors, in addition to a mixed multivariate structure (e.g., smoke or not, how many; drink or not, how much), individuals could be repeatedly measured across multiple time periods and could in turn be cross-classified by neighborhoods and workplaces. The mixed multivariate response would then be the level 1 units that are nested within time periods at level 2, within individuals at level 3, and neighborhoods and workplaces at level 4.
EXTENSIONS TO MODEL SPECIFICATIONS
We have already discussed how multilevel methods offer an extremely powerful framework to (1) disentangle compositional and contextual effects; (2) model between-context and between-individual (within-context) heterogeneity; (3) model interaction effects between individual and contextual characteristics; and (4) model variation across multiple spatial scales. In this section we draw attention to additional model specifications that we consider important.
Modeling Heterogeneity at Multiple Spatial Levels
The three-level framework presented in equation (29) considered between-neighborhood variation at level 2 and between-region variation at level 3 as a constant function that was unchanging across the social class groups. This assumption can be relaxed such that between-context variation at both the spatial levels can be modeled as a function of social class. Such models are extremely useful in order to explore the relative importance of different spatial levels for different types of population groups. For instance, for individuals of low social class it may be the regions that matter more than the neighborhoods, while it might be the reverse for those of high social class. In addition, such models allow a mapping of the differential geographies for different social classes at each of the geographic levels.
Allowing the Effect of Contextual Variables to Vary
Typically, contextual variables (e.g., neighborhood-level socioeconomic index) are modeled in the fixed part of the multilevel model. However, as we emphasize in this chapter, a unique advantage of multilevel models is their ability to model variability in the fixed average relationships, both at their own level (that is, the level at which they are observed and measured) and at higher levels. For instance, the effect of neighborhood-level socioeconomic deprivation may not be uniform across all regions and may vary across the regions. Furthermore, highly deprived neighborhoods may also be characterized by a greater degree of heterogeneity. Considering both these formulations is vital to develop a rich empirical description of the role of neighborhood socioeconomic deprivation on poor health and to condition the inferences and predictions that are derived based on fixed average relationships.
Interaction between Contextual Levels
The notion of cross-level interaction that we discussed earlier using Figure 4–8 can be usefully extended to contextual variables at different geographic levels (e.g., one relating to neighborhood characteristics and the other representing region characteristics). Such interactions explore the influence of a contextual variable at the region level for different types of neighborhoods. This idea is particularly significant given the direct interest in characterizing and measuring places. Furthermore, this extension is also intrinsic to earlier arguments in which we emphasized the importance of interpreting neighborhood patterning in relation to regions, given their functional interconnectedness. For example, a neighborhood characteristic (e.g., low or high socioeconomic deprivation index) and a region characteristic (e.g., per capita expenditure on health) may interact such that the effect of region health expenditure depends on the type of neighborhood: the same level of expenditure may produce better results in low-deprivation neighborhoods than in high-deprivation neighborhoods.
Nonlinear Multilevel Models
So far we have illustrated the methodological concepts by considering a continuous response variable that has a normal distribution. However, a large number of outcomes of interest in public health research are not continuous and do not have Gaussian (normal) distributional properties. While not discussed in detail here, multilevel models are capable of handling a wide range of responses, and “generalized multilevel models” exist to deal with binary outcomes and proportions (such as logit, log–log, and probit models), multiple categories (such as multinomial and ordered multinomial models), and counts (such as Poisson and negative binomial distribution models) (Leyland and Goldstein, 2001). Indeed, all these outcomes can be modeled using any of the hierarchical and nonhierarchical structures discussed previously (Rasbash et al., 2000).
These models work, in effect, by assuming a specific, non-normal distribution for the random part at level 1, while maintaining the normality assumptions for random parts at higher levels. Consequently, much of the discussion presented in this chapter focusing on the neighborhood and region level (higher contextual levels) would continue to hold regardless of the nature of the response variable. It may, however, be noted that the computation of the VPC, which we discussed earlier, is not as straightforward in complex nonlinear models as it is in normal models and is an issue of applied methodological research (Goldstein et al., 2002; Browne, Subramanian et al., 2002). Research developments are currently underway in which multilevel perspectives have been extended to survival and event history models, meta-analysis, structural equation modeling, instrumental variable analysis, and factor analysis (Goldstein, 1998).
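To give a sense of why the VPC is less direct for non-normal responses, the sketch below shows one commonly described approximation for a two-level random intercepts logistic model: the latent-variable formulation, in which the level 1 variance is fixed at the variance of the standard logistic distribution (π²/3). This is only one of several approaches, it assumes no overdispersion, and the function name and the neighborhood-level variances used here are hypothetical illustrations rather than estimates from any study.

```python
import math


def vpc_logistic_latent(sigma2_u: float) -> float:
    """Approximate VPC for a random intercepts logistic model.

    Assumption: latent-variable formulation, where the level 1 variance
    is taken to be pi^2 / 3 (variance of the standard logistic distribution).
    sigma2_u is the between-neighborhood variance on the logit scale.
    """
    if sigma2_u < 0:
        raise ValueError("variance must be non-negative")
    level1_variance = math.pi ** 2 / 3
    return sigma2_u / (sigma2_u + level1_variance)


# Hypothetical neighborhood-level variances on the logit scale.
for sigma2_u in (0.1, 0.5, 1.0):
    print(f"sigma2_u = {sigma2_u:.1f}  VPC = {vpc_logistic_latent(sigma2_u):.3f}")
```

Unlike the normal case, the level 1 variance here is not estimated from the data but implied by the assumed link function, which is one reason different approximations can give different answers.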
MULTILEVEL METHODS: A CRITICAL PERSPECTIVE
There has been an enthusiastic rush to use multilevel modeling techniques in recent years (for an excellent review of multilevel applications in health research see Diez Roux, 2001, 2002, in press). In their enthusiasm, researchers have often overlooked certain fundamental methodological issues that may have critical conceptual and empirical implications. Having discussed the nature and scope of multilevel methods, we now turn to some of these key issues.
The validity of multilevel models relies entirely on the researchers’ conceptualization and operationalization of the analytical levels. Practical convenience often has guided the selection and identification of contexts. For instance, in the United States, block groups and/or census tracts (spatial administrative units) are commonly used to define the neighborhood setting. Whether such administrative units accurately delimit the boundaries of what constitutes a “neighborhood” is debatable. Related to this issue are problems of missing levels and outcome-contingent hierarchies.
First, recent studies have shown that the variance apportioned to different levels may be over- or underestimated depending on the nature and number of the levels that are ignored. While there are technical reasons to expect the apportioned variance to change between a two-level and a three-level model (Hutchison and Healy, 2001; Tranmer and Steel, 2001), ignoring a level also has implications for neighborhood-level inferences and for the fixed part estimates. While we do not advocate abandoning the use of administrative units to define contexts, there is a need to be conceptually clear about the selection (or omission) of levels in the analysis.
Second, the conceptual and operational multilevel structure may differ depending on the outcome that is being analyzed. Often, regardless of the health outcome, the same set of hierarchical levels is used. Future applications, therefore, need to not only justify the analysis in terms of the choice of levels but also in terms of the extent to which the choice depends on the outcome under investigation.
Closely related to this issue is the level-contingent nature of different contextual predictor variables. For instance, certain contextual variables (e.g., the extent of income inequality in an area) may be more meaningful at a higher level of aggregation (e.g., region) than at lower levels of aggregation (e.g., neighborhoods), while others, such as cognitive perceptions of social capital, may be more meaningful at lower levels of aggregation. In summary, multilevel models are only as good (or as bad) as the underlying theories used to justify the levels and the choice of covariates at each of the identified levels.
Endogeneity of Contextual Effects
Because individuals do, to some extent, choose where to live, “unobserved” individual or family factors can be mistaken for neighborhood effects. Similar unobserved factors may also characterize neighborhoods and other spatial levels of analysis. This problem of endogeneity, whereby an unobserved variable is related to both a set of predictors and the response, is only beginning to be addressed in the context of multilevel methods. The issue of endogeneity is even more complex in multilevel analysis because the unmeasured influences of variables omitted from the fixed part get incorporated into the random part of the model, thereby violating the assumption that the regressors and the model disturbances are independent (Rice et al., 1998). Three ways of dealing with this issue have been suggested in the multilevel literature. The first is to include data that actually “measure the crucial omitted variable” (Duncan and Raudenbush, 1999). The second is to apply specially developed multilevel instrumental variable estimation techniques (Spencer, 1998; Spencer and Fielding, 2000), which extend the standard solution to endogeneity problems in single-level regression to multilevel regression models. The third is to use a longitudinal, repeated measures fixed-effects model in which panel observations for those who change neighborhoods are nested within a cross-classified structure, with time-varying covariates at each level of the analysis (Rasbash and Goldstein, 1994). This strategy, of course, is extremely data intensive and computationally demanding. It is recommended that future applications be sensitive to the causal implications that the issue of endogeneity poses for multilevel research.
Limits to Context-Specific Predictions
Multilevel models, through the estimation of the posterior residuals at the higher level, provide an extremely useful way of measuring and monitoring the performance of higher-level units (e.g., neighborhoods, hospitals) (Goldstein and Spiegelhalter, 1996). Useful as this is, it is important to realize that the primary function of multilevel models is to model population heterogeneity at different levels (e.g., individuals, neighborhoods) and not to generate context-specific predictions. Because multilevel models treat the neighborhoods as a sample realized from a population of neighborhoods, the main focus is on the variability between neighborhoods rather than the specific effect of each neighborhood. Therefore, while predictions for specific neighborhoods are possible, these are not simply point estimates; degrees of uncertainty are associated with them as well as with any ranking that derives from them (Goldstein and Spiegelhalter, 1996). Specifically, neighborhood-specific estimates depend on the sample size in specific neighborhoods. Neighborhoods with small sample sizes will have large confidence intervals; they will also contribute little to the estimation of the population parameters given the precision weighting used (Jones and Bullen, 1994). These considerations are extremely important before “naming” (and “shaming”) specific places and institutions.
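The dependence of neighborhood-specific estimates on neighborhood sample size can be illustrated with a minimal Python sketch. It assumes a two-level random intercepts model for a normally distributed response and treats the variance components as known, whereas in practice they are estimated. The function name, the variance values, and the raw residual are hypothetical and serve only to show the pattern of precision-weighted shrinkage.

```python
import math


def posterior_residual(raw_mean_residual, n_j, sigma2_u, sigma2_e):
    """Shrunken neighborhood residual with an approximate 95% interval.

    raw_mean_residual: mean of the raw residuals in neighborhood j
    n_j:               number of sampled individuals in neighborhood j
    sigma2_u:          between-neighborhood variance (assumed known here)
    sigma2_e:          between-individual variance (assumed known here)
    """
    # Precision weight: approaches 1 as n_j grows, 0 as n_j shrinks.
    weight = sigma2_u / (sigma2_u + sigma2_e / n_j)
    estimate = weight * raw_mean_residual
    # Posterior variance of the residual under the normal model.
    variance = sigma2_u * (1.0 - weight)
    half_width = 1.96 * math.sqrt(variance)
    return weight, estimate, (estimate - half_width, estimate + half_width)


# Same raw residual, different neighborhood sample sizes (hypothetical values).
for n_j in (5, 50, 500):
    weight, estimate, (low, high) = posterior_residual(0.5, n_j, 0.2, 1.0)
    print(f"n_j={n_j:3d}  weight={weight:.2f}  estimate={estimate:.2f}  "
          f"interval=({low:.2f}, {high:.2f})")
```

With few individuals the estimate is pulled strongly toward the overall mean and its interval is wide, which is why rankings built on such estimates call for caution.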
Power and Sample Size Considerations
As we have emphasized, multilevel models are not about modeling each neighborhood separately, but, rather, the sample of neighborhoods is seen as one realization from a population of neighborhoods. When designing a powerful multilevel study it is important, therefore, to consider two things: the determination of sample sizes at the various levels of analysis, and ensuring the property of exchangeability. We first discuss the issue of sampling in multilevel analysis.
It is vital that the study design have “adequate” numbers of units at all levels of analysis. In general, increasing sample sizes at all levels makes estimates and their standard errors more accurate, at least to some extent. The analysis of binomial data in particular requires larger samples than the analysis of normally distributed data (Hox, 1998). Determining the numbers of level 1 and level 2 units needed for efficiency, unbiasedness, and consistency of parameter estimates is not entirely straightforward, especially if we are interested in the random slopes component. In a two-level random intercepts model, the sample design question is analogous to computing the effective sample size in two-stage cluster sampling (Kish, 1965). The effective sample size of a two-stage cluster sampling design, $n_{\mathrm{eff}}$, is computed by:
$$n_{\mathrm{eff}} = \frac{n}{1 + (n_{\mathrm{clus}} - 1)\,\rho}$$
where $n$ is the total number of individuals in the study, that is, the actual sample size; $n_{\mathrm{clus}}$ is the number of individuals per neighborhood; and $\rho$ is the intraclass correlation (ICC). However, the analogy is not straightforward for random slopes models, because the ICC for these models is a function of the independent variable, as was shown in equation (21). Neither are such calculations straightforward in three-level models.
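As a rough illustration of the effective sample size calculation for two-stage cluster sampling described above, the following Python sketch divides the actual sample size by the design effect, 1 + (n_clus − 1)ρ. The function name and the numbers are hypothetical; they are chosen only to show how quickly clustering erodes the information in a sample.

```python
def effective_sample_size(n: int, n_clus: int, rho: float) -> float:
    """Effective sample size for a two-stage cluster sample.

    n:      total number of individuals (actual sample size)
    n_clus: number of individuals per neighborhood (assumed equal)
    rho:    intraclass correlation, between 0 and 1
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie between 0 and 1")
    design_effect = 1.0 + (n_clus - 1) * rho
    return n / design_effect


# Hypothetical design: 30 neighborhoods of 30 individuals each.
for rho in (0.0, 0.05, 0.20):
    n_eff = effective_sample_size(n=900, n_clus=30, rho=rho)
    print(f"rho = {rho:.2f}  effective sample size = {n_eff:.1f}")
```

Even a modest intraclass correlation of 0.05 reduces 900 individuals to an effective sample of roughly 367 in this hypothetical design.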
Consensus has yet to be developed on precise power calculations within multilevel models. Some argue for a sample of at least 30 groups with at least 30 individuals in each group (Kreft, 1996). This advice is considered sound provided interest is largely in the fixed parameters. Modification to this “rule” is advised if interest is in estimating cross-level interactions and/or variance and covariance components (Hox, 1998). For the former a 50/20 rule is generally recommended (about 50 neighborhoods with at least 20 individuals per neighborhood), and for a variance–covariance components model about 100 neighborhoods with about 10 individuals per neighborhood is suggested. With so few individuals per neighborhood, however, one has to be cautious about making neighborhood-specific predictions. These “rules” take into account that there are costs attached to data collection, such that if the number of neighborhoods is increased, the number of individuals per neighborhood decreases (Snijders and Bosker, 1993; Snijders, 2001).
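The cost trade-off behind these rules of thumb can be sketched under strong simplifying assumptions. The Python example below assumes a fixed budget, a linear cost function (a fixed cost per neighborhood plus a cost per individual), a two-level random intercepts model with known variance components, and interest only in the precision of the overall mean. All names and values are hypothetical, and the sketch is not a substitute for a formal power analysis.

```python
def design_under_budget(n_per, budget, cost_nbhd, cost_indiv, sigma2_u, sigma2_e):
    """Number of neighborhoods affordable and variance of the overall mean.

    n_per:      individuals sampled per neighborhood
    budget:     total budget (arbitrary cost units)
    cost_nbhd:  fixed cost of adding one neighborhood
    cost_indiv: cost of adding one individual within a neighborhood
    """
    n_nbhd = budget // (cost_nbhd + cost_indiv * n_per)
    if n_nbhd < 1:
        raise ValueError("budget too small for even one neighborhood")
    # Sampling variance of the overall mean in a balanced two-level design.
    variance = (sigma2_u + sigma2_e / n_per) / n_nbhd
    return n_nbhd, variance


# Hypothetical costs and variance components.
for n_per in (10, 20, 30):
    n_nbhd, variance = design_under_budget(
        n_per, budget=10_000, cost_nbhd=50, cost_indiv=5,
        sigma2_u=0.1, sigma2_e=0.9,
    )
    print(f"{n_nbhd:3d} neighborhoods x {n_per:2d} individuals: "
          f"variance of mean = {variance:.5f}")
```

In this hypothetical setting, spreading the budget over more neighborhoods with fewer individuals each gives a more precise overall mean, at the price of less precise neighborhood-specific estimates.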
Exchangeability of the Sample
Multilevel models treat the higher-level units as a sample drawn from a common population, and inferences are made about this population. However, just because multilevel models operate in this way does not guarantee that they are appropriate in any particular instance. Crucial exchangeability judgments exist that are often neglected (Morris, 1995). Specifically, researchers need to ensure that the sample of neighborhoods does come from, can be exchanged with, and is similar to the population that they wish to make inferences about, with this being true for each specific neighborhood for which they have data. If there are reasons to believe that certain neighborhoods are truly independent or that they come from different populations, they should not be regarded as exchangeable with the remaining random sample of neighborhoods and, as such, should be treated as fixed effects. While one option is to perform diagnostics after the model is fitted and/or conduct multilevel analysis with those neighborhoods that are believed to share exchangeable properties and without those neighborhoods that are believed to violate the exchangeability assumption, a conceptually sound approach is to carefully plan the selection of neighborhoods at the design stage, analogous to the sampling of individuals in a survey (Draper, 1995).
Multilevel models, in conclusion, have several features that make them attractive for public health research. In this chapter we sought to explain and emphasize how these methods offer an extremely flexible yet unified framework for conceptualizing and investigating substantive ideas related to contextuality and heterogeneity. Specifically, variability and the correlated nature of data structures are seen as the norm, not an aberration, and consequently multilevel methods neither ignore nor adjust for them, but rather anticipate and model them. In doing so, we showed that these methods encourage and foster refinement in our thinking about different levels of causation.
At the same time, the full potential of multilevel methodologies is yet to be realized. Reviewing applications of multilevel methods in public health research to date reveals three methodological motivations: (1) the need to obtain more accurate estimates and standard errors of the individual correlates that influence health (that is, the need to adjust for any autocorrelation in the response); (2) to establish the average fixed contribution of compositional and contextual factors; and (3) to establish fixed cross-level interaction effects between individual and contextual factors. While not discounting the relevance of these motivations, the methodological focus has been, and continues to be, on the fixed part of the multilevel model, with very little focus on the random part. Furthermore, research applications have also not moved beyond the simple two-level structure of individuals at a lower level nested within a spatial setting at a higher level.
Multilevel methods, we have argued, provide a theoretical and technical framework that can help reconceptualize much of public health research. Specifically, they compel the researcher to reflect on the multilevel nature of causal processes and raise questions that are not simply about fixed averages but rather about the variability and heterogeneity of populations. Indeed, multilevel methods are changing the way we think about “individual effects” and “contextual effects.” From an initial view of interpreting these effects in terms of “who you are in relation to where you are,” multilevel methods encourage us to think along the lines of “who you are depends on where you are.” While the methodological capability of multilevel models to disentangle compositional (individual) and contextual effects related to spatial variations has been well demonstrated, the multilevel framework has also redefined the construct of individual compositional explanations. Indeed, as Macintyre and Ellaway (2000) point out, “your SES and income are partly a product of your place of upbringing, rather than being intrinsically personal attributes.”
While being attractive both conceptually and technically, multilevel methods are also undoubtedly complex and should not be approached in simplistic terms. Indeed, simplistic use of complex methodological tools can lead to interpretive confusion and a potential overstatement of what may validly be concluded from a given piece of research (Draper, 1995). These comments are not meant to discourage the use of multilevel methods. Rather, as we have emphasized, multilevel models can raise new research agendas and provide important insights into existing knowledge. At the same time, as one of the pioneers in this field reminds us, multilevel methods, like all statistical methods, should be used “with care and understanding” (Goldstein, 1995).
Alker, HA, Jr. (1969). A typology of ecological fallacies. In Dogan M and Rokkan S, eds.: Quantitative Ecological Analysis. Cambridge, Mass: Massachusetts Institute of Technology Press, pp. 69–86.
Anderson NB (1999). Solving the puzzles of socioeconomic status and health: The need for integrated, multilevel, interdisciplinary research. Ann NY Acad Sci 896: 302–312.
Berkman LF and Kawachi I, eds. (2000). Social Epidemiology. New York: Oxford University Press.
Browne WJ (2002). MCMC Estimation in MLwiN, Version 1.0. London: Centre for Multilevel Modelling, University of London.
Browne WJ, Subramanian SV, et al. (2002). Variance Partitioning in Multilevel Logistic Models that Exhibit Over-dispersion. London: Centre for Multilevel Modelling.
Bryk AS and Raudenbush SW (1992). Hierarchical Linear Models: Applications and Data Analysis Methods. Newbury Park, Calif: Sage Publications.
Bullen N, Jones K, et al. (1997). Modelling complexity: Analysing between-individual and between-place variation—a multilevel tutorial. Environment and Planning A 29(4): 585–609.
de Leeuw J and Kreft IGG (1986). Random coefficients models for multilevel analysis. Journal of Educational Statistics 11: 57–85.
Diez Roux AV (2001). Investigating neighborhood and area effects on health. Am J Public Health 91(11): 1783–1789.
Diez Roux AV (2002). Invited commentary: Places, people, and health. Am J Epidemiol 155(6): 516–519.
Diez Roux AV (2003). The examination of neighborhood effects on health: Conceptual and methodological issues related to the presence of multiple levels of organization. In Kawachi I and Berkman LF, eds.: Neighborhoods and Health. New York: Oxford University Press, pp. 43–64.
Draper D (1995). Inference and hierarchical modeling in the social sciences. Journal of Educational and Behavioral Statistics 20: 115–147.
Duncan C (1997). Applying mixed multivariate multilevel models in geographical research. In Westert GP and Verhoeff RN, eds.: Place and People: Multilevel Modelling in Geographical Research. Utrecht: Royal Dutch Geographical Society and Faculty of Geographical Sciences, Utrecht University, pp. 100–115.
Duncan G and Raudenbush S (1999). Assessing the effects of context in studies of child and youth development. Educational Psychologist 34: 29–41.
Gilks W, Richardson S, et al. (1996). Markov Chain Monte Carlo in Practice. London: Chapman & Hall.
Goldstein H (1994). Multilevel cross-classified models. Sociological Methods and Research 22: 364–375.
Goldstein H (1995). Multilevel Statistical Models. London: Arnold.
Goldstein H (1998). Multilevel models. In Armitage P and Colton T, eds.: Encyclopaedia of Biostatistics. Vol. 4. Chichester: Wiley, pp. 2725–2731.
Goldstein H, Browne W, et al. (2002). Partitioning Variation in Multilevel Models. London: Institute of Education, University of London.
Goldstein H and Rasbash J (1996). Improved approximations for multilevel models with binary responses. J R Stat Soc A 159: 505–513.
Goldstein H and Spiegelhalter D (1996). League tables and their limitations: Statistical issues in comparisons of institutional performance (with discussion). J R Stat Soc A 159: 385–443.
Hox J (1998). Multilevel modeling: When and why. In Balderjahn I, Mather R, and Schader M, eds.: Classification, data analysis, and data highways. New York: Springer Verlag, pp. 147–154.
Hox J (1995). Applied Multilevel Analysis. Amsterdam: TT-Publikaties.
Hutchison D and Healy M (2001). The effect of variance component estimates of ignoring a level in a multilevel model. Multilevel Modelling Newsletter 13(2): 4–5.
Jones K and Bullen N (1994). Contextual models of urban house prices: A comparison of fixed- and random-coefficient models developed by expansion. Economic Geography 70: 252–272.
Jones K, Gould MI, et al. (1998). Multiple contexts as cross-classified models: The labor vote in the British general elections of 1992. Geographical Analysis 30: 65–93.
Jones K and Moon G (1993). Medical geography: Taking space seriously. Progress in Human Geography 17(4): 515–524.
Kish L (1965). Survey Sampling. New York: Wiley.
Kreft I and de Leeuw J (1998). Introducing Multilevel Models. London: Sage.
Kreft IGG (1996). Are Multilevel Techniques Necessary? An Overview Including Simulation Studies. Los Angeles: California State University Press.
Leyland AH and Goldstein H, eds. (2001). Multilevel Modelling of Health Statistics. Wiley Series in Probability and Statistics. Chichester, England: Wiley.
Longford N (1993). Random Coefficient Models. Oxford: Clarendon Press.
Macintyre S (2000). The social patterning of health: Bringing the social context back in. Medical Sociology Newsletter 26: 14–19.
Macintyre S and Ellaway A (2000). Ecological approaches: Rediscovering the role of physical and social environment. In Berkman LF and Kawachi I, eds.: Social Epidemiology. New York: Oxford University Press, pp. 332–348.
Morris C (1983). Parametric empirical Bayes. Journal of the American Statistical Association 78: 47–65.
Morris C (1995). Hierarchical models for educational data: An overview. Journal of Educational and Behavioral Statistics 20: 190–199.
Rasbash J, Browne W, et al. (2000). A User’s Guide to MLwiN, Version 2.1. London: Multilevel Models Project, Institute of Education, University of London.
Rasbash J and Goldstein H (1994). Efficient analysis of mixed hierarchical and cross-classified random structures using a multilevel model. Journal of Educational and Behavioural Statistics 19(4): 337–350.
Rice N, Jones A, et al. (1998). Multilevel models where the random effects are correlated with the fixed predictors: A conditioned iterative generalised least squares estimator (CIGLS). Multilevel Modelling Newsletter 10(1): 10–14.
Roberts S (1999). Socioeconomic composition and health: The independent contribution of community socioeconomic context. Annual Review of Sociology 25: 489–516.
Robinson WS (1950). Ecological correlations and the behaviour of individuals. American Sociological Review 15: 351–357.
Selvin HC (1958). Durkheim’s suicide and problems of empirical research. American Journal of Sociology 63: 607–619.
Skinner C, Holt D, et al., eds. (1989). The Analysis of Complex Surveys. New York: Wiley.
Snijders TAB (2001). Sampling. In Leyland AH and Goldstein H, eds.: Multilevel Modelling of Health Statistics. Chichester, Engl: Wiley, pp. 159–174.
Snijders TAB and Bosker RJ (1993). Standard errors and sample sizes for two-level research. Journal of Educational Statistics 18: 237–259.
Spencer N (1998). Consistent parameter estimation for lagged multilevel models. Statistics Technical Report Paper 1. Hertfordshire, University of Hertfordshire Business School, Report No. UHBS: 19.
Spencer N and Fielding A (2000). An instrumental variable consistent estimation procedure to overcome the problem of endogenous variables in multilevel models. Multilevel Modelling Newsletter 12(1): 4–7.
Subramanian SV, Duncan C, et al. (2001). Multilevel perspectives on modeling census data. Environment and Planning A 33(3): 399–417.
Susser M (1994). The logic in the ecological. Am J Public Health 84: 825–835.
Tranmer M and Steel DG (2001). Ignoring a level in a multilevel model: Evidence from UK census data. Environment and Planning A 33: 941–948.
Weisberg S (1980). Applied Linear Regression. New York: Wiley.
For fundamental ideas underlying multilevel models, see:
Raudenbush SW and Bryk AS (2002). Hierarchical Linear Models: Applications and Data Analysis Methods. 2nd edition. Thousand Oaks, California: Sage Publications.
Goldstein H (1995). Multilevel Statistical Models. 2nd edition. London: Arnold. (An electronic version of this book can be downloaded free from the following website: http://www.arnoldpublishers.com/support/goldstein.htm; accessed September 9, 2002.)
Longford N (1993). Random Coefficient Models. Oxford: Clarendon Press.
For an applied perspective on multilevel models, see:
Bullen N, Jones K and Duncan C (1997). Modelling Complexity: Analysing Between-Individual and Between-Place Variation—A Multilevel Tutorial. Environment and Planning A 29(4): 585–609.
Hox J (2002). Multilevel Analysis: Techniques and Applications. Mahwah, NJ: Lawrence Erlbaum Associates.
Leyland AH and Goldstein H, eds. (2001). Multilevel Modelling of Health Statistics. Wiley Series in Probability and Statistics. John Wiley & Sons Ltd.: Chichester.
Snijders T and Bosker R (1999). Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modeling. London: Sage Publications.
For hands-on, tutorial-based learning, see:
http://multilevel.ioe.ac.uk and http://tramss.data-archive.ac.uk/Software/MLwiN.asp (accessed September 9, 2002).
Rasbash J, Browne W, et al. (2000). A User’s Guide to MLwiN, Version 2.1. London: Multilevel Models Project, Institute of Education, University of London. (An electronic version of this book can be downloaded free from http://multilevel.ioe.ac.uk/download/manuals.html; accessed September 9, 2002.)
Browne WJ (2002). MCMC Estimation in MLwiN, Version 1.0. London: Centre for Multilevel Modelling, University of London. (An electronic version of this book can be downloaded free from http://multilevel.ioe.ac.uk/dev/develop.html; accessed September 9, 2002.)