Implications of Total Energy Intake for Epidemiologic Analyses
Nutritional Epidemiology

Walter C. Willett

Print publication date: 1998

Print ISBN-13: 9780195122978

Published to Oxford Scholarship Online: September 2009

DOI: 10.1093/acprof:oso/9780195122978.001.0001

PRINTED FROM OXFORD SCHOLARSHIP ONLINE (www.oxfordscholarship.com). (c) Copyright Oxford University Press, 2015. All Rights Reserved. Under the terms of the licence agreement, an individual user may print out a PDF of a single chapter of a monograph in OSO for personal use (for details see http://www.oxfordscholarship.com/page/privacy-policy).

Chapter: (p.273) 11 Implications of Total Energy Intake for Epidemiologic Analyses
Source: Nutritional Epidemiology
Author(s): Walter Willett
Publisher: Oxford University Press
DOI: 10.1093/acprof:oso/9780195122978.003.11

Abstract and Keywords

This chapter deals with the importance of considering total energy intakes in nutritional epidemiology. Topics discussed include the physiologic determinants of energy utilization, adjustment for energy intake in epidemiologic analyses, and alternate approaches to adjust for total energy intake.

Keywords:   total energy intake, nutritional epidemiology, epidemiological studies, energy utilization, alternative approaches

Total energy intake deserves special consideration in nutritional epidemiology for three reasons:

  1. The level of energy intake may be a primary determinant of disease.

  2. Individual differences in total energy intake produce variation in intake of specific nutrients unrelated to dietary composition because the consumption of most nutrients is positively correlated with total energy intake. This added variation may be extraneous, and thus a source of error, in many analyses.

  3. When energy intake is associated with disease but is not a direct cause, the effects of specific nutrients may be distorted or confounded by total energy intake.

Before examining these three issues in detail, the physiologic aspects of energy utilization and the determinants of variation in energy intake in epidemiologic studies are discussed. In accordance with common practice, total caloric intake is used synonymously with total energy intake in this chapter, although use of joules as a unit of measurement is more correct.

PHYSIOLOGIC DETERMINANTS OF ENERGY UTILIZATION

Physiologists have partitioned energy expenditure into several components: resting metabolic rate, thermogenic effect of food, physical activity, and adaptive thermogenesis (Horton, 1983) (Fig. 11–1). Resting (p.274)

Figure 11–1. Components of energy expenditure. RMR, resting metabolic requirement; TEF, thermogenic effect of food; TEE, total energy expenditure; AT, adaptive thermogenesis. From Horton, 1983: reproduced with permission.

metabolic requirements are quantitatively the most important, representing approximately 60% of total energy expenditure in most individuals. The thermogenic effect of food (which is the metabolic cost of absorbing and processing carbohydrate, protein, and fat) varies with the sources of energy (Donato and Hegsted, 1985), but is only about 10% of the total. Adaptive thermogenesis represents the capacity of an individual to conserve or expend energy in response to variable intake of food or, perhaps, temperature extremes. In humans, adaptive thermogenesis is defined differently by various investigators (Sjostrom, 1985) and is difficult to measure. It has been estimated to be less than ± 10% of calories. In a moderately active individual, physical activity accounts for approximately 30% of energy intake (Horton, 1983).

Determinants of Between-Person Variation in Total Energy Intake

Although it is helpful to consider the average values for physiologic components of energy expenditure, epidemiologists are primarily interested in the determinants of variation in energy intake between individuals. Although many specific factors influence energy intake, they can be considered as three general categories: body size, metabolic efficiency, and physical activity. Departures from energy balance, that is, change in body energy stores due to intake above or below expenditure, also account for part of the observed variation among persons (Fig. 11–2).

Body size affects the amount of energy needed for resting metabolic activity and to sustain physical exertion. On the basis of careful measurements, such as those by Jequier and Schutz (1983) using a specially designed respiratory chamber, it appears that body size is a primary determinant of energy expenditure (Fig. 11–3), particularly at low levels of physical activity. These authors found that 24-hour energy expenditure was a linear function of body weight, which accounted for 74% of the variance in expenditure. Using similar methods, it has further been demonstrated that energy (p.275)

Figure 11–2. Components of between-person variation in energy intake. The relative sizes of these components vary depending on the population being studied.

expenditure is primarily related to lean body mass rather than to fat mass (Ravussin et al., 1986). It must be noted that these study groups were heterogeneous with respect to age and gender and were atypically enriched with obese subjects, all of which would tend to inflate variation attributable to weight or lean body mass. More important, as noted by these authors, the usual physical activity of subjects in these studies was constrained by their restriction to a small metabolic chamber.

Among free-living subjects of the same gender and similar size, neither height nor weight accounts for a major proportion of the between-person variation in total caloric intake (Thomson and Billewicz, 1961; Gordon et al., 1984). For example, neither measure of body size was significantly correlated with caloric intake among a group of 194 women aged 34 to 59 years who collected four 1-week weighed diet records over 1 year (Willett et al., 1985) (Spearman r = 0.08 for height and r = −0.10 for weight, unpublished data). Underreporting of intake by clearly obese subjects has been documented (Prentice et al., 1986) and could explain some of the lack of a positive association between weight and energy intake. Although body size does not appear to be the major determinant of variation in energy intake among free-living subjects within a specific age and sex group, if analyses were conducted on groups markedly heterogeneous in size, which would be true if both men and women were included, then body size would become a more important determinant.

Although energy expenditure at rest accounts for the majority of absolute energy intake, resting energy expenditure does not vary greatly among individuals of similar age and sex. Thus, physical activity assumes a relatively large role in determining the variation in energy expenditure among individuals unconstrained by a metabolic chamber. The positive relationship between physical activity and energy intake has been appreciated for years (Johnson et al., 1956; Saltzman and Roberts, 1995) and has been clearly demonstrated by Morris and colleagues (1977) in an epidemiologic study among a population of bank clerks (Table 11–1). Among the 194 women previously mentioned, the correlation between a physical activity questionnaire and caloric intake based on 28 days of diet recording was 0.22 (unpublished data). The obese tend to be less physically active, and this relationship is sufficient to explain the weak inverse relation between relative weight and caloric intake that has been observed in most epidemiologic studies (Gordon et al., 1981; Romieu et al., 1988). The relationship also presumably underlies a long-term decline in per capita caloric intake in the United States despite an increasing prevalence of obesity (van Itallie, 1978; Abraham and Carroll, 1979; National Center for Health Statistics, 1979). The true proportion of variation in energy intake accounted for by physical activity varies substantially among populations and is likely to be seriously underestimated in most studies because of difficulty in accurately measuring physical activity. Ravussin and colleagues (1986) have demonstrated that even motor activity within the confines of a respiratory chamber (“fidgeting”) varies dramatically between persons and can account for hundreds of kilocalories per day. Such differences in activity would not be detected by typical (p.276)

Figure 11–3. Relationship between body weight and 24-hour energy expenditure, based on respiration-chamber measurements. (From Jequier and Schutz, 1983; reproduced with permission.)

questionnaires. Thus, it appears likely that physical activity, which includes both fine motor and major muscle movement, is the dominant explanation for between-person differences in energy intake. Indeed, in most instances total energy intake can be interpreted as a crude measure of physical activity, particularly after controlling for body size, age, and sex.

Metabolic efficiency may contribute to individual differences in caloric intake; metabolically inefficient persons require greater amounts of energy to maintain their level of activity and weight. The mechanisms and determinants of metabolic efficiency (including differences in absorption and the general category of thermogenesis) are poorly defined in humans and are beyond the scope of this chapter. Individual differences apparently exist, and under carefully controlled conditions some subjects gain weight more rapidly than others who have a similar caloric intake (Sims et al., 1973; Saltzman and Roberts, 1995). However, these differences in metabolic efficiency among individuals, particularly between obese and nonobese persons, are relatively small and inconsistently observed. In a number of studies increased energy intake has appeared to reduce metabolic efficiency, in other words to increase thermogenesis (Miller, 1973; Himms-Hagen, 1984; Woo et al., 1985; Leibel et al., 1995). However, recent work using doubly-labeled water indicates that extra calories added to the diets of most persons are mainly stored as fat, suggesting that these individuals are already operating at maximal thermogenesis (Roberts et al., 1993). However, some modest capacity to increase metabolic efficiency (reduce thermogenesis) exists during underfeeding, thus aggravating attempts to lose weight. This degree of adaptive thermogenesis is modest, probably (p.277) less than 10% of total energy intake (Webb, 1985; Saltzman and Roberts, 1995).

Unfortunately, there is no practical method of measuring metabolic efficiency in an epidemiologic setting. The data of Jequier and Schutz (1983) (Fig. 11–3), however, suggest that there is relatively little between-person variation in energy expenditure after physical activity has been restricted and weight is accounted for. Thus, the contribution of metabolic efficiency to between-person variation in energy expenditure remains poorly defined but is not likely to be large.

The net balance of energy intake in relation to body size, metabolic efficiency, and physical activity determines whether a person gains or loses weight. In the absence of compensatory mechanisms, relatively small changes maintained over long periods would have a profound effect on body weight. For example, if an adult male who consumes 2,500 kcal per day increases his caloric consumption by only 2%, with other factors remaining constant, a theoretical 20-kg weight gain will result over a 10-year period (theoretical weight change over 10 years = 2,500 kcal per day × 0.02 × 365 days per year × 10 years/9,000 kcal per kg of fat = 20.3 kg). In reality, the increase in weight will not be so dramatic, because the additional energy cost of maintaining and moving the added mass eventually equals the increment in energy intake and a new steady state is attained.

Table 11–1. Association of leisure time physical activity and energy intake among London bus drivers

                                   Physical activity in leisure time
Thirds of distribution of men
for energy intake                  Little      Moderate    Much
Low third                          16          9           4
Middle third                       9           12          3
High third                         5           10          9

From Morris et al., 1977.

Hofstetter and colleagues (1986) have used the cross-sectional data in Figure 11–3, in which 1 kg in weight corresponds to about 20 kcal per day, to estimate the ultimate weight gain associated with a given change in energy availability. In this manner, the 2% change in energy intake (50 kcal/day) would result in an ultimate change of 2.5 kg, which is still readily measurable epidemiologically. (It must be noted that such calculations of projected weight changes from cross-sectional data are likely to be somewhat inaccurate for several reasons, including that weight in the cross-sectional data is largely lean body mass, whereas changes in weight caused by simple increases in caloric intake would be primarily due to adipose.) Careful, long-term studies of the effects of small increments in energy intake on body weight would be most useful because most studies have been done using large increments (Webb, 1985).
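The two weight-change calculations above can be reproduced in a few lines. The following is only an illustrative sketch: the function names are hypothetical, and the constants (9,000 kcal per kg of fat and about 20 kcal per day per kg of body weight) are the values quoted in the text rather than independently derived figures.

```python
# Constants quoted in the text (not independently derived here).
KCAL_PER_KG_FAT = 9000        # energy stored in 1 kg of fat
KCAL_PER_DAY_PER_KG = 20      # cross-sectional slope read from Figure 11-3


def theoretical_gain_kg(baseline_kcal, fractional_increase, years):
    """Weight gain if every surplus kilocalorie were stored as fat,
    with no compensating rise in energy expenditure."""
    surplus_per_day = baseline_kcal * fractional_increase
    return surplus_per_day * 365 * years / KCAL_PER_KG_FAT


def steady_state_gain_kg(baseline_kcal, fractional_increase):
    """Ultimate weight gain once the cost of carrying the added mass
    equals the surplus (the cross-sectional approach described above)."""
    surplus_per_day = baseline_kcal * fractional_increase
    return surplus_per_day / KCAL_PER_DAY_PER_KG


print(round(theoretical_gain_kg(2500, 0.02, 10), 1))   # 20.3
print(round(steady_state_gain_kg(2500, 0.02), 1))      # 2.5
```

The contrast between the two results (about 20 kg versus 2.5 kg) is the point of the passage: the first figure ignores the rising energy cost of a heavier body, whereas the second assumes that cost eventually absorbs the whole increment.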

During short periods, such as months, the proportion of energy intake accounted for by balance (i.e., weight gain or loss) is larger if some persons are experiencing rapid weight change. Over the long-term, however, balance can account for only a very small part of between-person differences in energy intake. The true between-person variation in long-term caloric intake is poorly defined. Using 28 days of diet recording per subject (which minimizes variation within persons), we observed a standard deviation of 323 kcal/day (mean 1,620) among middle-aged women (Willett et al., 1985). This amount is somewhat less than the standard deviation of 473 kcal/day for women (mean 1,793) estimated by Beaton and coworkers (1979) from multiple 24-hour recalls using an analysis-of-variance model.

Because differences in total energy intake between individuals are largely determined by physical activity, body size, and metabolic efficiency, it is apparent that an epidemiologic study of only energy intake in (p.278) relation to disease risk is difficult or impossible to interpret. Energy intake is measurable only crudely (errors being considerably larger than 2%) with standard questionnaires or interviews; physical activity is probably measured even more crudely than diet; and metabolic efficiency is essentially unmeasurable in an epidemiologic setting. Thus, it would be difficult if not impossible to partition an individual’s total energy intake with sufficient precision so as to measure the balance available after accounting for physical activity and metabolic efficiency because this balance would be computed as the difference between two crudely measured variables minus an unmeasurable variable. It is therefore not surprising that the degree of obesity is not strongly correlated with energy intake in cross-sectional studies.

Although the interpretation of energy intake data in epidemiologic analyses may be difficult, simple and readily available measures of weight and height can be extremely useful as alternatives to direct measures of energy intake. The presence of high relative weight implies that, sometime in life, a positive balance between energy intake and energy expenditure has occurred. Even more useful, a change in weight implies a positive or negative energy balance during that time. The interpretation of data on weight and height, however, is potentially complicated because individuals apparently have some ability to compensate for increased or decreased caloric intake by changing their thermogenesis (i.e., reducing or increasing metabolic efficiency). Moreover, individuals may vary in their capacity to respond in this way (Miller, 1973). It is thus conceivable that excess caloric intake could increase disease risk in a manner that was mediated by a thermogenic response (i.e., related to increased metabolic rate) rather than by the accumulation of fat. The interpretation of epidemiologic data relating to weight and energy intake, therefore, depends in an important way on how body weight responds to increased energy intake. In model 1 (Fig. 11–4), increased energy intake is fully compensated by adaptive thermogenesis up to a certain point, and weight gain occurs only after a threshold increase in caloric intake is exceeded. In model 2, any long-term increase in energy intake causes weight gain; any compensatory increase in thermogenesis occurs only in conjunction with weight gain.

The implication of model 2 for epidemiologists is that the absence of a relationship between obesity and risk of disease implies that the disease is not caused by excess energy intake. Failure to observe an association between relative weight and risk of disease thus provides evidence that changes in energy intake alone do not influence disease occurrence. This may be appreciated by considering the hypothetical situation where even a small increase in energy intake, say 5%, with physical activity held constant raises the risk of disease. If model 2 applies, this difference in intake would produce an easily measurable weight gain; therefore, individuals who had increased their energy consumption would weigh more, and we would observe a positive association between relative weight and risk of disease. If model 1 were correct, an absence of association with obesity would not exclude the possibility that increased energy intake causes a higher risk of disease by inducing a fully compensating higher level of thermogenesis. Recognizing that individuals may vary in their response to energy intake, model 2 would need to apply only to an appreciable proportion of the population, not necessarily to all individuals, to be relevant epidemiologically.

The data of Donato and Hegsted (1985), based on rats, indicate that gain in body weight is a linear function of energy availability, that is, model 2 applies. Based on a review of metabolic studies among humans, Woo and colleagues (1985) suggested that any adaptive response in thermogenesis occurs only after some change in adiposity, and this has been supported by more recent evidence (Saltzman and Roberts, 1995). Thus, it seems appropriate to (p.279)

Figure 11–4. Alternate models of response to an increase in long-term energy intake. In model 1, increased thermogenesis prevents weight gain up to a certain increment in energy intake. In model 2, any compensatory increase in energy expenditure occurs only in conjunction with weight gain.

interpret a lack of association with relative weight in a specific study as evidence against a direct causal effect of total energy intake on risk of disease.
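The difference between the two models can be made concrete with a small sketch. Everything here is an assumption for illustration: the function names, the 100 kcal/day threshold in model 1, and the choice to convert surplus energy to weight with the roughly 20 kcal per day per kg figure quoted earlier in the chapter. The source describes the models only qualitatively.

```python
KCAL_PER_DAY_PER_KG = 20  # slope quoted earlier in the chapter (Figure 11-3)


def gain_model_1(extra_kcal_per_day, threshold_kcal=100.0):
    """Model 1: adaptive thermogenesis absorbs the increase up to a
    threshold (hypothetical value); only the excess leads to weight gain."""
    excess = max(0.0, extra_kcal_per_day - threshold_kcal)
    return excess / KCAL_PER_DAY_PER_KG


def gain_model_2(extra_kcal_per_day):
    """Model 2: any long-term increase in intake leads to weight gain."""
    return max(0.0, extra_kcal_per_day) / KCAL_PER_DAY_PER_KG


# A 2% and a 5% increase on a 2,500 kcal/day baseline.
for extra in (50.0, 125.0):
    print(extra, gain_model_1(extra), gain_model_2(extra))
```

Under model 2 even the 50 kcal/day increment shows up as a weight difference, which is why a null association with relative weight argues against an effect of energy intake. Under model 1 the same increment leaves weight unchanged, so relative weight would carry no information about it.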

Relation of Energy Intake With Specific Nutrient Intake

Intakes of most nutrients in free-living populations tend to be positively correlated with total energy intake (Jain et al., 1980; Lyon et al., 1983; Gordon et al., 1984). Data based on four 1-week diet records were used to examine these correlations among 194 women (see Table 11–2, last row). Correlations were particularly strong for fat, protein, and carbohydrate (which contribute to energy intake); in this population alcohol intake was quite modest and was only weakly correlated with total energy consumption. Every other nutrient examined, however, was also correlated with total energy intake even though many did not contribute to energy. For example, the correlation with energy was 0.24 for fiber, 0.25 for vitamin A, and 0.19 for vitamin C. This tendency for all nutrients, even minerals and vitamins, to be correlated with total energy intake results from the tendency of larger, more active, and less metabolically efficient persons to eat more food in general.

These interrelations are further complicated by the observation that the composition of diets may vary by level of total caloric intake, depending on the behavior of the population. For example, as shown in Table 11–3, women with lower energy intake tended to have a proportionally higher intake of fiber than women with higher energy intake. These correlations between caloric intake and specific nutrient intakes further highlight the need to consider total energy intake when interpreting associations between specific nutrients and disease in epidemiologic studies.

ADJUSTMENT FOR ENERGY INTAKE IN EPIDEMIOLOGIC ANALYSES

When relationships with disease are analyzed, nutritional factors may be examined in terms of absolute amount (crude intake) (p.280)

Table 11–2. Correlations (Spearman r) between intakes of specific nutrients (a)

                                         1       2       3       4       5       6       7       8       9      10      11      12
 1. Protein
 2. Total fat           crude         0.44
                        density      −0.13
                        adjusted     −0.14
 3. Saturated fat       crude         0.42    0.93
                        density      −0.10    0.74
                        adjusted     −0.10    0.74
 4. Polyunsaturated fat crude         0.28    0.73    0.52
                        density      −0.08    0.56    0.02
                        adjusted     −0.11    0.56    0.02
 5. Total carbohydrate  crude         0.39    0.56    0.47    0.41
                        density      −0.34   −0.48   −0.43   −0.15
                        adjusted     −0.23   −0.52   −0.57   −0.14
 6. Crude fiber         crude         0.36    0.01   −0.06    0.12    0.45
                        density       0.32   −0.44   −0.51   −0.07    0.35
                        adjusted      0.24   −0.45   −0.51   −0.08    0.45
 7. Vitamin A           crude         0.45    0.08    0.03    0.10    0.66    0.66
                        density       0.40   −0.33   −0.38   −0.08    0.13    0.64
                        adjusted      0.37   −0.32   −0.36   −0.08    0.19    0.61
 8. Sucrose             crude         0.21    0.53    0.46    0.37    0.18    0.18    0.14
                        density      −0.54   −0.18   −0.19   −0.13    0.60   −0.17   −0.20
                        adjusted     −0.45   −0.21   −0.25   −0.11    0.56   −0.05   −0.12
 9. Vitamin B6          crude         0.56    0.15    0.14    0.12    0.65    0.65    0.57    0.22
                        density       0.44   −0.49   −0.39   −0.22    0.26    0.61    0.53   −0.23
                        adjusted      0.41   −0.48   −0.38   −0.22    0.33    0.59    0.59   −0.16
10. Vitamin C           crude         0.34   0.002   −0.05   −0.00    0.61    0.61    0.13    0.13    0.55
                        density       0.32   −0.47   −0.46   −0.14    0.17    0.62    0.60   −0.12    0.53
                        adjusted      0.26   −0.45   −0.44   −0.15    0.34    0.59    0.58   −0.04    0.51
11. Cholesterol         crude         0.60    0.51    0.55    0.26    0.05    0.05    0.25    0.18    0.24    0.12
                        density       0.58    0.16    0.22   −0.02   −0.36   −0.00   −0.17   −0.34    0.09    0.11
                        adjusted      0.50    0.19    0.26   −0.04   −0.32   −0.07    0.13   −0.30    0.05    0.03
12. Alcohol             crude        −0.03    0.04    0.12   −0.01   −0.15   −0.14   −0.03   −0.06   −0.06   −0.06   −0.04
                        density      −0.16   −0.12    0.06   −0.13   −0.50   −0.19   −0.10   −0.14   −0.16   −0.14   −0.15
                        adjusted     −0.13   −0.12    0.05   −0.13   −0.54   −0.18   −0.07   −0.21   −0.14   −0.12   −0.12
    Energy                            0.59    0.86    0.81    0.59    0.82    0.24    0.25    0.71    0.40    0.19    0.47    0.15

Column numbers refer to the numbered nutrients in the row labels.

(a) For each comparison, top r value is for crude nutrient intake, middle value is for nutrient density, and the bottom value is for calorie-adjusted (using regression analysis) nutrient intake. Data are based on the individual means of 28 days of diet recording by each of 194 women.

(p.281) (p.282)

Table 11–3. Correlations (Pearson r) between total caloric intake, crude nutrient intake, nutrient densities, and calorie-adjusted nutrient intakes (a)

Nutrient              Calories vs.    Nutrient        Calorie-        Calories vs.    Calories vs.    Nutrient
                      crude           density (b)     adjusted (c)    nutrient        calorie-        density (b) vs.
                      nutrient        vs. crude       vs. crude       density (b)     adjusted (c)    calorie-
                                      nutrient        nutrient                                        adjusted (c)
Protein               0.60            0.31            0.80            −0.57           0.000           0.82
Total fat             0.88            0.52            0.48            0.05            0.000           0.99
Saturated fat         0.81            0.66            0.58            0.09            0.000           0.99
Polyunsaturated fat   0.64            0.77            0.77            −0.01           0.000           0.99
Sucrose               0.72            0.91            0.70            0.33            0.000           0.93
Cholesterol           0.47            0.70            0.88            −0.31           0.000           0.95
Carbohydrate          0.86            0.72            0.52            0.26            0.000           0.97
Fiber                 0.34            0.82            0.94            −0.26           0.000           0.97
Vitamin A-ns          0.34            0.84            0.94            −0.23           0.000           0.97
Vitamin A-ws          0.25            0.90            0.97            −0.19           0.000           0.98
Vitamin B6-ns         0.46            0.76            0.89            0.22            0.000           0.97
Vitamin B6-ws         0.15            0.98            0.99            −0.06           0.000           0.99
Vitamin C-ns          0.28            0.88            0.96            −0.20           0.000           0.98
Vitamin C-ws          0.15            0.96            0.99            −0.13           0.000           0.99

(a) Data are based on the individual means of 28 days of dietary recording by each of 194 women and on four 1-week diet records. All values were transformed using natural logarithm to improve normality. ns, without supplements; ws, with supplements.

(b) Nutrient density is the nutrient divided by calories.

(c) Calorie-adjusted using regression analysis.

or in relation to total caloric intake. The analytic approach depends on both the nature of the biologic relationship and the public health considerations. If a nutrient is metabolized in approximate proportion to total caloric intake (such as the macronutrients and some vitamins), nutrient intake is most likely biologically important in relation to caloric intake. To the extent that energy intake reflects body size, adjustment for total energy intake is usually appropriate, as an absolute amount of a specific nutrient tends to have less of an effect for a larger, higher energy-consuming person than for a smaller person. In some situations, we may be unsure whether it is the absolute amount of a nutrient or the amount in relation to total caloric intake that is most biologically relevant. (Of course, the biologically relevant relationship with caloric intake may actually be complex and nonlinear.) If a nutrient selectively affects an organ system that is uncorrelated with body size (e.g., the central nervous system), or if physical activity does not affect its metabolism, absolute intake may be most relevant. For example, menstrual blood losses are not thought to increase with physical activity; thus iron requirements to prevent anemia among premenopausal women might be related to absolute intake.

It is interesting to note that if absolute nutrient intake rather than intake in relation to calories is biologically most relevant, caloric intake should be associated with disease, as intakes of virtually all nutrients are positively correlated with total caloric intake. For example, if higher absolute intake of a nutrient is a cause of disease, then those who consume more total food due to being large, active, or metabolically inefficient should be at higher risk of disease. In the example above, we would expect to see a greater risk of iron deficiency anemia among women with lower total energy intake due to lower physical activity. Conversely, a lack of association (p.283) between total energy intake and disease can be taken as evidence against the importance of absolute nutrient intake, but not against the importance of nutrient composition of the diet.

Because a person’s long-term total caloric intake is largely determined by body size, physical activity, and metabolic efficiency, even relatively small changes in caloric intake cannot be made unless changes in weight or physical activity also occur. In the absence of such changes, therefore, most alterations in absolute nutrient intake must be accomplished by changing the composition of the diet rather than the total amount of food. For this reason, Hegsted (1985) has suggested that dietary recommendations should be made in reference to total caloric intake; for example, it has been recommended that we reduce total fat intake to 30% of total caloric intake. Therefore, from a practical or public health standpoint, nutrient intake in relation to total caloric intake (i.e., the qualitative aspect or composition of diet) is most relevant. For this reason, in epidemiologic studies, nutrient intakes adjusted for total energy intake, rather than absolute nutrient intakes, are of primary interest in relation to disease risk. Adjusting nutrient intakes for total energy intake in epidemiologic studies can also be viewed as being analogous to animal experiments or metabolic studies in humans; to determine whether an effect is due to a nutrient per se, it is essential that the diets being compared are isocaloric.
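Tables 11–2 and 11–3 report two ways of expressing a nutrient relative to energy: nutrient density (the nutrient divided by calories) and a calorie-adjusted intake obtained by regression analysis. The sketch below illustrates one plausible reading of the regression approach, taking the adjusted value to be the residual from a least-squares fit of nutrient on calories. The data are synthetic and purely hypothetical (only the calorie mean and standard deviation echo figures quoted earlier), the helper names are invented for this example, and the log transformation used for Table 11–3 is omitted.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def calorie_adjust(nutrient, calories):
    """Residuals from an ordinary least-squares fit of nutrient on calories."""
    n = len(calories)
    mx, my = sum(calories) / n, sum(nutrient) / n
    slope = (sum((c - mx) * (v - my) for c, v in zip(calories, nutrient))
             / sum((c - mx) ** 2 for c in calories))
    intercept = my - slope * mx
    return [v - (intercept + slope * c) for c, v in zip(calories, nutrient)]


# Synthetic, hypothetical data: 194 subjects; fiber loosely tied to calories.
rng = random.Random(0)
calories = [rng.gauss(1620, 323) for _ in range(194)]
fiber = [4.0 + 0.002 * c + rng.gauss(0, 1.2) for c in calories]

density = [f / c for f, c in zip(fiber, calories)]   # nutrient per kcal
adjusted = calorie_adjust(fiber, calories)           # regression residuals

print("calories vs. crude nutrient   :", round(pearson(calories, fiber), 2))
print("calories vs. nutrient density :", round(pearson(calories, density), 2))
print("calories vs. calorie-adjusted :", round(pearson(calories, adjusted), 2))
```

By construction the residuals are uncorrelated with calories, which matches the 0.000 column in Table 11–3. Nutrient density carries no such guarantee and can remain correlated with calories in either direction.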

As reports of diet and disease relationships have often been made without adjustment for total caloric intake, the implications of this approach are discussed first. Because body size, physical activity, and metabolic efficiency contribute to the variation of specific nutrient intakes, associations between nutrients and diseases that are actually independent of these factors are weakened. That is, differences in the levels of these factors cause variation in energy intake and, secondarily, variation in intake of specific nutrients that may be extraneous or irrelevant to occurrence of disease. For example, tall and physically active women tend to have high absolute fat intakes on the basis of their size and exercise level alone. If the relevant exposure for breast cancer risk is fat intake independent of body size and physical activity (i.e., the fat composition of the diet), failure to account for these factors results in misclassification of exposure that is likely to be largely random. The influence of removing extraneous variation can be appreciated in studies that have examined the correlation between nutrient intake, such as specific carotenoids, and levels of these nutrients in blood or adipose, which presumably are more directly reflective of biologic effects. In general, adjustment for total energy intake has increased associations between calculated nutrient intakes and levels in blood or adipose (Willett et al., 1983; London et al., 1991; Ascherio et al., 1992; Hunter et al., 1992), sometimes to only a small degree, but in other instances substantially. To partially address this issue, nutrient intakes are sometimes divided by a measure of body size, such as intake per kilogram of body weight (Sopko et al., 1984). Unfortunately, it is seldom possible to account for the effects of physical activity and metabolic efficiency in a similar manner because these are difficult to measure accurately.

When total caloric intake is associated with disease, the interpretation of individual nutrient intake is complex, and the consequences of failing to account for energy intake may be far more serious. As has been pointed out by Lyon and colleagues (1983), specific nutrients tend to be associated with disease simply on the basis of their correlation with caloric intake. For example, in nearly every study of diet and coronary heart disease, subjects who subsequently develop disease have lower total caloric intake on the average than do those who remain free of disease (Morris et al., 1977; Garcia-Palmieri et al., 1980; Gordon (p.284) et al., 1981; Kromhout and de Lezenne Coulander, 1984; McGee et al., 1984b). As a result, intakes of specific nutrients also tend to be lower among cases than among noncases.

These relationships are illustrated in analyses of prospectively collected data on dietary intake and coronary heart disease incidence among men living in Honolulu, Puerto Rico, and Framingham (Gordon et al., 1981). Among the Honolulu men, for example, the crude intakes of 9 of 11 nutrients (including total calories) were lower among those with subsequent coronary heart disease, and for two nutrients there was no difference (Table 11–4). In this situation, it is helpful to consider possible reasons for an observed difference in caloric intake between men who developed coronary disease and those who remained free of disease. Difference in body size is an unlikely explanation as men who subsequently develop coronary heart disease tend, if anything, to weigh more than those who do not. Variation in level of metabolic efficiency is usually impossible to eliminate as an explanation. On the other hand, several investigators have clearly demonstrated that decreased physical activity is associated with an increased risk of coronary heart disease (Morris et al., 1977; Paffenbarger et al., 1978). Although differences in physical activity provide the most likely explanation for the low caloric intake associated with coronary heart disease, this explanation was not universally appreciated (Garcia-Palmieri et al., 1980; McGee et al., 1984a). Thus, an appropriate interpretation of the inverse association between total caloric intake and risk of coronary heart disease is not that one should increase food intake to avoid a myocardial infarction, but rather that an increase in physical activity may reduce the risk of disease. This example, incidentally, illustrates the need to be guided by an understanding of biologic relationships when interpreting statistical associations to avoid absurd conclusions.

Because variation in caloric intake between persons largely reflects physical activity, size, and metabolic efficiency, an association between a specific nutrient and disease is not likely to be of primary etiologic importance if that association is simply the result of a difference in caloric intake. For this reason, Morris and coworkers (1977) have pointed out that it “would not be instructive to present data relating crude nutrient intakes with disease in a situation in which caloric intake has an important relationship with the outcome.”

It is, of course, possible that overeating or undereating (caloric excess or deficiency) is a primary cause of a disease. In this situation, higher intakes of nutrients that contribute to calories (proteins, fats, carbohydrates, and alcohol) might be considered as the primary exposures that lead to increased total caloric intake, which in turn causes disease. It could be argued that adjustment for caloric intake in this situation would represent “overcontrol” of a variable in the causal pathway. Before attributing an effect to a specific nutrient, however, the burden is on the epidemiologist to demonstrate that the association of this nutrient with disease is independent of caloric intake. For example, perhaps excessive caloric intake increases the risk of colon cancer, and dietary fat is associated with this disease because of its high caloric content. Before implicating fat per se as a specific cause, however, it would be essential to demonstrate that this effect is not shared by protein or carbohydrate when these are eaten in equicaloric amounts. Otherwise, a reduction in the fat content of the diet that was replaced by an increase in carbohydrate or protein on a calorie-by-calorie basis would have no effect on disease occurrence: This would only happen when the total caloric intake was also changed. The desirability of relating nutrient intake to total caloric intake has been discussed in a thoughtful correspondence with respect to studies of coronary heart disease (Kushi et al., 1985; Shekelle et al., 1985). Recognizing the need to adjust for the effect of total food consumption, a number of investigators have employed “nutrient densities” to control for the effect of total caloric intake.

Table 11–4. Age-adjusted means of crude nutrient intakes and nutrient intakes as a percentage of total calories according to subsequent coronary heart disease (CHD) death or myocardial infarction (MI) (a)

                          Crude intakes                       Intakes as % of calories
                          No CHD            MI or CHD death   No CHD            MI or CHD death
                          (n = 7,008)       (n = 164)         (n = 7,008)       (n = 164)
Total calories (kcal)     2,319             2,149 (b)
Total protein (g)         95                93                16.6              17.4 (c)
Total fat (g)             87                86                33.4              35.6 (c)
Saturated fat (g)         32                31                12.3              12.9 (c)
Monounsaturated fat (g)   33                32                12.8              13.6 (b)
Polyunsaturated fat (g)   16                16                6.0               6.7
Total carbohydrate (g)    264               242 (b)           46.2              45.4
Sugar (g)                 46                46                7.9               8.2
Starch (g)                165               151 (b)           29.2              28.5
Other carbohydrate (g)    52                45 (c)            9.1               8.7
Cholesterol (mg)          555               530
Alcohol (g)               14                5 (b)             3.8               1.7 (b)

(a) Data are based on a cohort of 7,172 Honolulu men aged 46–64 years initially free of CHD. From Tables 4 and 8 of Gordon et al., 1981.

(b) p<0.01.

(c) p<0.05.

Analyses of Diet–Disease Relationships by the Use of Nutrient Densities

Nutrient densities are measures of dietary composition computed by dividing nutrient values by total caloric intake; they provide a convenient way to describe foods or diets. An analogous approach for macronutrients is to express intake as a percentage of total caloric intake; for purposes of discussion, both approaches are referred to as nutrient density. Nutrient density has the appeal of simplicity and practicality; unfortunately, this is actually a complex variable that can lead to confusion when used to address diet–disease relationships. Such a variable has two components: the nutrient intake and the inverse of total caloric intake. The relative contributions of nutrient intake and total caloric intake to between-person differences in nutrient density are related to the ratio of their variability. Thus, as the between-person variation in the specific nutrient intake decreases, the nutrient density value approaches the inverse of caloric intake (multiplied by a constant).

When energy intake is unrelated to disease, dividing nutrient intakes by total calories may have the desired effect of reducing the variation in nutrient intake that is due to differences in size, activity, and metabolic efficiency. The division, however, also can create unwanted variation. Particularly when a specific nutrient has a weak correlation with total energy intake or has a low variability, dividing by total calories creates a variable that is, in fact, highly related to the factor whose effect we wish to remove, that is, caloric intake. In addition, methodologic error in measuring total energy intake could potentially contribute to variation in nutrient density as a result of this division. The basic principle involved is that dividing by a variable does not necessarily remove or “control for” the effect of that variable.

Like any ratio (or cross-product term), the nutrient density can also be viewed statistically as an interaction, which has troubled some persons. However, this ratio is in itself biologically meaningful; it would be expected that the effect of an absolute intake of a specific nutrient would be greater at low energy intakes (e.g., for a small person) than at high energy intakes (e.g., for a large person) (Willett, 1990).

As with absolute nutrient intakes, the use of nutrient densities has serious potential pitfalls when total energy intake is itself associated with disease. Because a nutrient density variable contains the inverse of energy intake as a component, nutrient densities tend to be associated with disease in the direction opposite to that of total caloric intake, even when the nutrient itself has no association with disease independent of energy intake. The data of Gordon and colleagues (1981), which present nutrient intakes as percentages of total calories, again illustrate this point. Because coronary heart disease is associated with low caloric intake, nutrient densities (or intakes as percentages of total calories) tend to be positively associated with disease (Table 11–4). In this instance, dividing by total calories has changed the direction of association with coronary heart disease for protein and total, saturated, monounsaturated, and polyunsaturated fat, and four of these differences become statistically significant. Differences between cases and controls have essentially disappeared for the three measures of carbohydrate intake that were statistically significant in the crude analysis. The potential for artifact created by the use of nutrient densities in this example can most vividly be appreciated by realizing that any random variable divided by total energy intake would be positively associated with risk of coronary heart disease.
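
The artifact described above can be illustrated with a small simulation. The following is a minimal sketch, not an analysis of any real data set: the energy distribution (mean 2,300 kcal, SD 500) and the independent "nutrient" (mean 100, SD 10) are invented purely for illustration. A variable generated independently of energy intake becomes strongly inversely correlated with energy once it is divided by energy; if low energy intake predicts disease, such a ratio would therefore appear positively associated with disease.

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 5000

# Hypothetical energy intakes (kcal/day); the mean and SD are illustrative only.
energy = [max(800.0, rng.gauss(2300, 500)) for _ in range(n)]
# A hypothetical "nutrient" drawn independently of energy, with low variability.
nutrient = [max(1.0, rng.gauss(100, 10)) for _ in range(n)]

# Nutrient density: the nutrient divided by total energy intake.
density = [z / e for z, e in zip(nutrient, energy)]

r_crude = pearson(nutrient, energy)    # close to zero by construction
r_density = pearson(density, energy)   # strongly negative
print(round(r_crude, 2), round(r_density, 2))
```

The smaller the variability of the numerator relative to that of energy intake, the closer the density comes to a rescaled inverse of energy intake.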

In some studies, the reason for an association between energy intake and risk of disease may be obscure. In a carefully conducted Canadian case–control investigation of large bowel cancer by Jain and coworkers (1980), cancer patients reported higher caloric intake than did controls but did not weigh more than the controls (see crude intakes, Table 11–5). In addition, cancer cases consistently reported higher intakes of fat than did noncases, which was interpreted to “support the hypothesis that high dietary fat intake is causally associated with cancer of the colon and rectum.” In interpreting these findings, it is again useful to consider possible explanations for the difference in caloric intake between cases and controls. It seems unlikely that higher physical activity by cancer cases would explain their higher energy intake; indeed, many studies to the contrary have been published (Garabrant et al., 1984; Vena et al., 1985; Giovannucci, 1995). We cannot dismiss the possibility that cases have a metabolic abnormality that renders them less efficient in their utilization of food energy. For example, it is conceivable that subjects who absorb food poorly present more substrate to their fecal flora, which metabolize this to carcinogenic substances and thus increase the risk of large bowel cancer. The case–control difference in caloric intake is unlikely to be the result of simple overeating in this study as cases did not weigh more than controls, even several years before diagnosis.

In addition to biologic factors, recall bias cannot be dismissed as an explanation for the findings of Jain and colleagues (1980) as this was a case–control study. Whatever biologic or methodologic factors contribute to the association of caloric intake with colon cancer, any differences in the intakes of specific nutrients between cases and controls that result from the strong association of caloric intake with cancer must be regarded as secondary. For illustrative purposes, we recalculated intakes as nutrient densities instead of crude values as originally presented; the findings were dramatically altered (Table 11–5). The association with total fat intake essentially disappears for men and is largely eliminated for women. On the other hand, strong inverse associations are seen for fiber and vitamin C intakes expressed as nutrient densities, which had no association with cancer in the crude analysis. This nutrient density analysis, however, overstates the protective association of fiber and vitamin C and underestimates the effect of fat, because dividing by caloric intake produces inverse associations even when these nutrients are not independently associated with disease. Jain and colleagues (1980) recognized the potential for confounding by total caloric intake and stated that the effects of fat and total caloric intake were difficult to separate as it was not possible to enter both simultaneously in a logistic model because of their high correlation. Instead, they considered fat rather than calories to be the primary factor due to its stronger (albeit slight) association with cancer and the findings of previous animal studies. On the basis of the data presented in the original article, the findings for fat are difficult to interpret as the positive association of this and other nutrients is overstated in crude analyses and understated in nutrient density analyses. If it could be demonstrated that the positive association between caloric intake and colon cancer incidence is related to a real difference in metabolic efficiency between cases and noncases rather than to methodologic bias, this would be an important increment in knowledge, even though the association would not represent a primary etiologic effect of diet and, therefore, would have no direct implications for nutritional advice.

Table 11–5. Case minus control differences in crude and nutrient density intakes expressed as a percentage of case value (a)

                          Case–control difference (%)
                          Crude intake (original analysis)        Nutrient density intake (recalculation)
                          Males               Females             Males               Females
                          Colon     Rectum    Colon     Rectum    Colon     Rectum    Colon     Rectum
Calories
  Neighborhood controls   9 (b)     7         12 (c)    17 (c)
  Hospital controls       1         9 (d)     6 (d)     11 (b)
Total fat
  Neighborhood controls   8 (b)     6         15 (c)    22 (c)    0         −1        4         7
  Hospital controls       2         11 (d)    10 (d)    15 (b)    2         2         4         5
Saturated fat
  Neighborhood controls   13 (b)    8         16 (c)    27 (c)    4         1         5         12
  Hospital controls       6         13 (d)    9 (d)     19 (c)    5         4         2         8
Crude fiber
  Neighborhood controls   −5        −3        1         2         −15       −11       −12       −17
  Hospital controls       −2        5         7         5         −3        −5        1         −7
Vitamin C
  Neighborhood controls   −3        4         −4        0         −13       −3        −18       −20
  Hospital controls       −2        6         −2        −3        −3        −6        −9        −16

(a) Data are calculated from Table 5 of a case–control study of colon and rectal cancer conducted among Canadian men and women between 1976 and 1978 (Jain et al., 1980). No tests of statistical significance are available for nutrient density data.

(b) p<0.01.

(c) p<0.002.

(d) p<0.05.

From Willett and Stampfer, 1986.

The study of Jain and coworkers (1980) is presented as an example; these authors have provided additional data from their study (Howe et al., 1986, 1997) demonstrating that saturated fat was positively associated with risk of colon cancer independent of energy intake, but fiber intake was not independently related to risk. In a meta-analysis of case–control studies (Howe et al., 1997), a strong and consistent positive association was again seen with total energy intake. After controlling for total energy, no association was seen with total or saturated fat, and a strong inverse association was seen with fiber intake. However, results from subsequent large prospective cohort studies do not support a positive relation between energy intake and risk of colon cancer. In all of these studies, the relationship of total energy intake with risk of colon cancer was inverse (Willett et al., 1990; Bostick et al., 1994; Giovannucci et al., 1994; Goldbohm et al., 1994); in two studies this was statistically significant (Bostick et al., 1994; Goldbohm et al., 1994). The prospective findings are thus consistent with the clear evidence of a protective effect of physical activity against colon cancer and strongly suggest that the case–control findings with total energy intake were due to methodologic bias. This discordance raises serious concerns regarding the validity of case–control studies of diet and cancer. Because both the meta-analysis of case–control studies and most cohort studies found little association between fat intake and colon cancer risk after adjusting for total energy intake, this adjustment may lead to correct conclusions by accounting for bias in reporting of overall food intake in some situations, but not necessarily in all circumstances.

ALTERNATE APPROACHES TO ADJUST FOR TOTAL ENERGY INTAKE

For reasons already discussed, it is usually desirable in epidemiologic analyses to employ a measure of nutrient intake that is independent of total energy intake, particularly when energy intake is associated with disease. In this section four analytic strategies and their relationships to each other are considered: the “energy-adjusted” method, the standard multivariate method, the “energy decomposition” method, and the multivariate nutrient density method.

Energy-Adjusted or Residual Method

“Energy-adjusted” nutrient intakes are computed as the residuals from the regression model with total caloric intake as the independent variable and absolute nutrient intake as the dependent variable (Fig. 11–5). The nutrient residuals by definition provide a measure of nutrient intake uncorrelated with total energy intake. Conceptually, this procedure isolates the variation in nutrient intake due only to the nutrient composition of the diet from the overall variation in nutrient intake, which is due to both composition and overall food consumption. For macronutrients, if expressed in units of calories (e.g., calories from fat), the residuals can also be conceptualized as the substitution of that nutrient for a similar number of calories from other sources (Kipnis et al., 1993). This model can also be viewed as analogous to animal or metabolic feeding studies in which total energy is held constant, but the amount of the nutrient being evaluated is varied between the groups being compared. For an energy-bearing nutrient, a decision needs to be made regarding the other dietary components for which the nutrient will be exchanged. Because residuals have a mean of zero and include negative values, they do not provide an intuitive sense of actual nutrient intake. It may, therefore, be desirable to add a constant; logical choices are the predicted nutrient intake for the mean energy intake of the study population or a round number for energy intake near the population mean (Fig. 11–5).
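
The computation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the author's code: the function names `fit_line` and `energy_adjust` are hypothetical, the data are simulated, and the regression is done on the original (untransformed) scale. The constant added back is the intake predicted at the mean caloric intake of the group.

```python
import random

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

def energy_adjust(nutrient, calories):
    """Residual-method energy adjustment.

    Each returned value is the subject's residual from the regression of
    nutrient on calories, plus the nutrient intake predicted at the mean
    caloric intake of the group.
    """
    intercept, slope = fit_line(calories, nutrient)
    mean_cal = sum(calories) / len(calories)
    constant = intercept + slope * mean_cal
    return [constant + (nut - (intercept + slope * cal))
            for nut, cal in zip(nutrient, calories)]

# Simulated data; every number here is hypothetical.
rng = random.Random(1)
calories = [rng.gauss(1600, 350) for _ in range(500)]
fat = [0.04 * c + rng.gauss(0, 9) for c in calories]

adjusted = energy_adjust(fat, calories)
```

By construction the adjusted values keep the original mean intake but have zero slope (and therefore zero correlation) with calories, which is the property the residual method is meant to provide.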

To illustrate adjustment for total energy intake by regression analysis, we have used daily intakes of total calories and total fat based on the means of four 1-week diet records kept by each of 194 women, as described previously (Willett et al., 1985).


Figure 11–5. Calorie-adjusted intake = a + b, where a = residual for subject from regression model with nutrient intake as the dependent variable and total caloric intake as the independent variable and b = the expected nutrient intake for a person with mean caloric intake. (From Willett and Stampfer, 1986; reproduced with permission.)

With 28 days of recording per subject, the effect of day-to-day variation has been sufficiently dampened such that these values can be assumed to reasonably represent each subject’s long-term intake. The unadjusted intake of total fat has a reasonably wide distribution (mean = 68.9, 1 standard deviation = 17.0 g per day; Fig. 11–6). Because total fat and total caloric intake are highly correlated (r = 0.86), adjustment for total caloric intake reduces the variation in fat intake substantially (mean = 68.9, 1 standard deviation = 8.7 g per day; shaded area in Fig. 11–6). Nevertheless, the degree of variation remaining is realistic in relation to current dietary recommendations; the 10th percentile (median of the lowest quintile) represents 33% of calories from fat, and the 90th percentile represents 44% of calories from fat. Thus, sufficient variation exists in this population to test the effectiveness of a 25% reduction in the proportion of calories accounted for by fat, as recommended by the National Heart, Lung, and Blood Institute (Anonymous, 1985).
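
The reported reduction in variation can be checked against a standard property of simple linear regression: the standard deviation of the residuals equals the standard deviation of the dependent variable multiplied by the square root of one minus the squared correlation. The worked arithmetic below assumes a regression on the original scale; because the caption to Figure 11–6 notes that residuals were computed on the log scale, the agreement should be read as approximate.

```latex
\[
\mathrm{SD}_{\text{residual}}
  = \mathrm{SD}_{\text{fat}} \sqrt{1 - r^{2}}
  = 17.0 \times \sqrt{1 - 0.86^{2}}
  = 17.0 \times \sqrt{0.2604}
  \approx 17.0 \times 0.510
  \approx 8.7 \ \text{g per day}
\]
```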

Figure 11–6. Distribution of total fat intake with (shaded area) and without (dark line) adjustment for total caloric intake. Data are based on four 1-week diet records completed by 194 Boston-area women aged 34 to 59 years. Calorie-adjusted values were calculated as described in the text, with residuals computed in the natural log scale to improve normality. (From Willett and Stampfer, 1986; reproduced with permission.)

When total energy intake is an important predictor of disease, total caloric intake should be included in the model with the nutrient calorie-adjusted term (model 2, Table 11–6). This approach is preferable to entering only calorie-adjusted nutrient into a model, as the random error (and width of confidence limits for the effect of the nutrient) may be reduced if caloric intake has an important association with the outcome independent of nutrient intake. Gail and coworkers (1984) have also shown that uncorrelated variables can confound each other in nonlinear models, including logistic regression. Another advantage of using model 2 rather than model 1 is that the full effect of total caloric intake can be observed.

Shekelle et al. (1987) have calculated calorie-adjusted nutrient intakes using regression analysis and examined their correlations with nutrient densities; these correlations were consistently high (greater than 0.90). On this basis they suggested that, although the calorie-adjusted values were theoretically preferable, the use of nutrient densities may not necessarily lead to materially different conclusions in epidemiologic analyses. We observed similarly high correlations (Table 11–3), but were not confident that the difference between calorie-adjusted values and nutrient densities would always be unimportant, as the major concern is the potential for confounding by calories. As shown in Table 11–3, the correlation between calorie-adjusted values and calories is, by definition, zero. Several nutrient density values, however, particularly for those nutrients less strongly correlated with total energy intake, were moderately correlated with caloric intake. For example, the correlation of calories with protein/calories was −0.57 and with fiber/calories was −0.26, meaning that the potential for confounding by total calories is substantial if energy intake is related to disease. Because the magnitudes of these correlations are determined partly by food choices rather than by laws of nature, these correlations will vary from one group to another. The degree of confounding depends on the strength of association between energy intake and disease as well as nutrient intake and disease; therefore, it seems that the distinction between nutrient densities and regression-adjusted values is likely to have practical importance in some, although certainly not all, instances.

Table 11–6. Alternative disease risk models for addressing the correlations of specific nutrient intakes with total energy intake in epidemiologic analyses

Model 1 (Residual Method):                       Disease = b1 Nutrient residual (a)
Model 2 (Residual Method):                       Disease = b1 Nutrient residual + b2 Calories
Model 3 (Standard Multivariate Method):          Disease = b3 Calories + b4 Nutrient
Model 4 (Energy Partition Method):               Disease = b5 Calnutrient (b) + b6 Calother (c)
Model 5 (Multivariate Nutrient Density Method):  Disease = b7 Nutrient/Calories + b8 Calories

(a) “Nutrient residual” is the residual from the regression of a specific nutrient on calories.

(b) Calnutrient represents calories provided by the specific nutrient.

(c) Calother represents calories from sources other than the specific nutrient.

It appears that this distinction was of major importance for fiber intake in the case–control study by Jain and coworkers (1980) because the nutrient density data in Table 11–5 imply a protective association whereas the regression-adjusted analysis reportedly did not.

The Standard Multivariate Method

The residual approach of calorie adjustment (model 1, Table 11–6) is similar, although not identical, to including both caloric intake and absolute nutrient intake as terms in a multiple regression model with disease outcome as the dependent variable (model 3). The coefficient for the nutrient term in this multivariate model (b4) is identical to that for the calorie-adjusted nutrient term in a univariate model (b1). With total caloric intake also in the calorie-adjusted model (model 2), the standard error as well as the coefficient for the calorie-adjusted nutrient term (b1) is identical to that for the nutrient term (b4) in the standard multivariate model with nutrient and calories.
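
The equalities among these coefficients can be verified numerically. The sketch below is an illustration under explicit assumptions: it uses a continuous outcome and ordinary least squares rather than logistic regression, the data are simulated, and the helper `ols_slopes` is a hypothetical name written for this example. It also shows that the calorie coefficient differs between models 2 and 3, since b2 equals b3 plus b4 times the slope of nutrient on calories.

```python
import random

def ols_slopes(columns, y):
    """Least-squares slopes of y on the given columns.

    Variables are centered first, so the intercept drops out; the normal
    equations are then solved by Gauss-Jordan elimination.
    """
    n = len(y)
    my = sum(y) / n
    yc = [v - my for v in y]
    cols = []
    for col in columns:
        m = sum(col) / n
        cols.append([v - m for v in col])
    k = len(cols)
    # Augmented matrix [X'X | X'y] built from the centered data.
    A = [[sum(a * b for a, b in zip(cols[r], cols[c])) for c in range(k)]
         + [sum(a * b for a, b in zip(cols[r], yc))] for r in range(k)]
    for p in range(k):
        piv = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[piv] = A[piv], A[p]
        for r in range(k):
            if r != p:
                f = A[r][p] / A[p][p]
                A[r] = [a - f * b for a, b in zip(A[r], A[p])]
    return [A[r][k] / A[r][r] for r in range(k)]

# Simulated data; all values are hypothetical.
rng = random.Random(2)
n = 400
calories = [rng.gauss(1600, 350) for _ in range(n)]
fat = [0.04 * c + rng.gauss(0, 9) for c in calories]
outcome = [0.002 * c + 0.05 * f + rng.gauss(0, 1)
           for c, f in zip(calories, fat)]

# Nutrient residual: fat regressed on calories.
(slope_fc,) = ols_slopes([calories], fat)
mean_c, mean_f = sum(calories) / n, sum(fat) / n
resid = [f - (mean_f + slope_fc * (c - mean_c))
         for c, f in zip(calories, fat)]

(b1_model1,) = ols_slopes([resid], outcome)             # Model 1
b1_model2, b2 = ols_slopes([resid, calories], outcome)  # Model 2
b3, b4 = ols_slopes([calories, fat], outcome)           # Model 3
```

In a linear model these identities are exact. With logistic regression, models 2 and 3 remain reparameterizations of one another, but the agreement between model 1 and model 3 should not be assumed to hold exactly.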

Although similar in some respects, the standard multivariate model (model 3) creates complexities in interpretation not shared by model 2. If calories and nutrient are simply entered as separate terms, the coefficient for calories (b3) represents calories independent of the specific nutrient, which may have a meaning distinctly different from total energy intake. For example, if fat is the specific nutrient in model 3, then the term for calories has the meaning of carbohydrate plus protein (not considering the possible contribution of alcohol). Thus the inclusion of a specific nutrient together with calories in a model fundamentally changes the biologic meaning of calories. The coefficient for calories in this model (b3) may, therefore, fail to attain significance when total energy intake, in fact, has a significant and important relation with disease. In contrast, the two terms in model 2 clearly address two distinct and clear questions: Is total energy intake associated with disease? Is the nutrient composition of the diet related to disease? The simple and clear meaning of the calorie-adjusted intakes also makes them attractive for bivariate analyses and data presentation. For example, if interest exists in the nutrient composition of the diet, represented by the nutrient residual, it is important to know more about this variable, such as the foods that contribute to its intake and the validity of its measurement as assessed in comparisons with other methods. These issues cannot be easily addressed unless the variable of interest is expressed as a single term.

The use of the standard multivariate model can also create confusion between the distributions of crude nutrient intake and the nutrient intake independent of energy intake. For example, if the nutrient is used as a continuous variable, a single coefficient is obtained; to convert this to a relative risk, a somewhat arbitrary increment in the nutrient intake is needed. Unless the residuals have been computed and examined, an increment based on the crude nutrient intake, for example the 90th versus the 10th percentile, might be chosen. However, this degree of variation in the nutrient intake independent of energy intake would probably not actually exist in the population, and the relative risk would thus be unrealistically large. A relative risk based on the 90th versus 10th percentile of the residual nutrient intake, which would be a smaller increment, would be more appropriate.
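
To give a sense of scale, the sketch below compares the two increments using the standard deviations of crude and calorie-adjusted total fat reported earlier in this chapter for the 194 women (17.0 and 8.7 g per day). Two assumptions are made only for illustration: both distributions are treated as normal, and the coefficient `beta` is an invented log relative risk per gram, not an estimate from any study.

```python
import math
from statistics import NormalDist

# Distance of the 90th percentile from the mean, in standard deviations.
z90 = NormalDist().inv_cdf(0.90)

sd_crude = 17.0     # g/day, unadjusted total fat (reported in the text)
sd_adjusted = 8.7   # g/day, calorie-adjusted total fat (reported in the text)

# 90th minus 10th percentile under the assumed normal distribution.
span_crude = 2 * z90 * sd_crude
span_adjusted = 2 * z90 * sd_adjusted

beta = 0.01  # hypothetical log relative risk per gram of fat
rr_crude_increment = math.exp(beta * span_crude)
rr_adjusted_increment = math.exp(beta * span_adjusted)

print(round(span_crude, 1), round(span_adjusted, 1))
```

Under these assumptions the crude increment is roughly twice the increment that exists independent of energy intake, so a relative risk scaled to the crude increment describes a contrast that few subjects in the population would actually experience.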

Some authors have voiced concern over the simultaneous inclusion of strongly correlated variables in the same model, which will frequently occur using the standard multivariate model in nutritional studies. McGee and colleagues (1984a) have noted that widely divergent results are obtained when highly correlated variables are entered in multiple logistic models using various inclusion criteria, and they suggest that variables with correlation coefficients of more than approximately 0.60 not be simultaneously included. The issue of collinearity is, however, better viewed in biologic rather than purely statistical terms. The first problem created by including two strongly correlated variables in a model is that their meaning may change in a manner that is not readily appreciated. The example of calories and fat is noted above, and the problem of height becoming a measure of body composition when weight is included in a model was noted in Chapter 10. The second problem resulting from collinearity is that one or both variables may have a markedly reduced degree of independent variation when they are entered simultaneously. Rather than using an arbitrary level of correlation as a criterion for unacceptable collinearity, however, a more informative approach is to examine the degree of residual variation in the variables of interest and judge whether the remaining differences among individuals are worthy of study. For example, even though the correlation between total energy and total fat intakes was 0.86 in the data displayed in Figure 11–5, the residual variation in fat after adjusting for total calories was found to be of potential interest. If the residual variation in the variable of interest, such as fat intake, is not large enough to be informative, the issue is not statistical but relates to the nature of the study population; the only solution is to find another population with a wider residual variation. 
When faced with highly correlated nutritional variables, models 1 and 2 allow a clear specification of the meaning of the variables, as well as the opportunity to evaluate the residual variation in nutrient intake directly.

The “Energy Decomposition” or “Energy Partition” Method

Howe and colleagues (1986) have presented an alternative model in which they entered separate terms for energy from a specific macronutrient, such as fat, and for energy from other sources, meaning protein, carbohydrate, and alcohol (model 4, Table 11–6). To extend the comparison with animal or metabolic studies, this model implies that more of the nutrient would simply be added to one diet, keeping the other nutrients constant. Thus, this is not an “isocaloric” comparison, and any observed association with the nutrient can still be confounded by total energy intake. In this model, the coefficient for the specific macronutrient (b5) represents the full effect of the nutrient unconfounded by other sources of energy (b6), but this model does not directly address the question of whether energy from the specific nutrient has an association with disease not shared by other sources of energy. To address this issue would require determining whether the magnitude of the coefficient for the specific nutrient (b5) was actually different from the coefficient for other sources of energy (b6); in other words, the appropriate focus should be the difference b5 − b6. It is not adequate merely to note that the nutrient coefficient (b5) is significant, whereas b6 is not; even when all sources of energy have the same relation with disease on a calorie-for-calorie basis, the coefficient for other calories (b6) might not be significant simply because of low between-person variation in this factor. In fact, it can be shown that the difference in these coefficients (b5 − b6) and the standard error of this difference are identical to the coefficient and standard error for the nutrient residual (b1) in model 2 (Howe, 1989; Pike et al., 1989).
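
The relationship among the coefficients follows from a short substitution. The derivation below is a sketch that assumes the nutrient is expressed in calories, so that total calories equal the calories from the nutrient plus the calories from other sources; intercepts are omitted, as in Table 11–6.

```latex
\begin{align*}
\text{Model 3:}\quad \text{Disease}
  &= b_3\,\text{Calories} + b_4\,\text{Cal}_{\text{nutrient}} \\
  &= b_3\,(\text{Cal}_{\text{nutrient}} + \text{Cal}_{\text{other}})
     + b_4\,\text{Cal}_{\text{nutrient}} \\
  &= (b_3 + b_4)\,\text{Cal}_{\text{nutrient}} + b_3\,\text{Cal}_{\text{other}} \\[4pt]
\text{Model 4:}\quad \text{Disease}
  &= b_5\,\text{Cal}_{\text{nutrient}} + b_6\,\text{Cal}_{\text{other}} \\[4pt]
\Rightarrow\quad b_5 &= b_3 + b_4, \qquad b_6 = b_3, \qquad b_5 - b_6 = b_4 = b_1
\end{align*}
```

The last equality uses the identity between the nutrient coefficient in the standard multivariate model (b4) and the nutrient residual coefficient (b1) in model 2.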

Although the “energy decomposition” model may provide insight in some instances, its coefficients may be misleading unless interpreted with care, particularly when total energy intake has a noncausal relationship with disease. For example, use of this model in the example of coronary heart disease noted previously could easily indicate a protective association for fat intake as its effect would still be confounded by total energy consumption secondary to differences in physical activity. An additional limitation of this model is that it cannot be readily extended to nutrients that do not contribute to energy intake.

Multivariate Nutrient Density Model

Another approach, used in the study by Jain and colleagues (1980) to examine the effect of fiber, is to compute the nutrient density and then enter both this and the total energy in a multiple logistic regression model (model 5). The coefficient for the nutrient density term (b7) represents the relation of the nutrient composition of the diet with disease, holding total energy intake constant; thus, this method is an “isocaloric” analysis and does control for confounding by energy intake. This model overcomes the primary statistical problem associated with the use of the nutrient density alone, while retaining its attractive features of general recognition and intuitive interpretation as a measure of dietary composition. The coefficient for calories in this model (b8) will generally be interpretable as representing the effect of calories in the usual biologic sense because nutrient densities are not inherently part of or highly correlated with total energy intake.

The multivariate nutrient density model may be particularly advantageous when body size (and thus total energy intake) varies greatly among subjects because models 2 and 3 imply that the nutrient residual has a similar effect for subjects with high and low energy intake. That a given increment in nutrient intake would have the same effect in a very small subject (with low energy intake) as in a very large subject (with high energy intake) is not plausible. Among specific age–sex groups of human populations, variation in size and total energy is not great, so that this is usually not a major issue. However, among other species, such as dogs (Sonnenschein et al., 1991), body size can vary more than 10-fold, making the use of the multivariate nutrient density model particularly attractive.

The Energy Determinant Method

An alternative approach, in theory, would be to include the major determinants of energy intake (body size, physical activity, and metabolic efficiency) as separate variables in a multivariate model. Unfortunately, measurements of these variables are usually not available in epidemiologic studies. It could be informative, however, to include as many of these variables as possible along with total energy intake as independent variables. Because energy intake and disease outcome may differ in their relationships with body components such as lean mass and fat, it would be desirable to include both height and a measure of fatness uncorrelated with height as separate terms in a multivariate model. The residual of weight on height, computed as described previously for calorie-adjusted nutrients, provides a measure of weight uncorrelated with height. Modeled in this way, height represents lean body mass as it has a linear relationship with total body water in adults (Mellits and Cheek, 1970), and weight independent of height primarily represents fat in middle-aged subjects.

The interpretation of energy intake as an independent variable depends on which other terms are included. If one assumes a steady state of energy balance, energy intake adjusted for body size and physical activity has the meaning of metabolic efficiency. When adjusted for height and weight only, total energy intake has the meaning of physical activity and metabolic efficiency combined. In real applications, these interpretations must be tempered with the knowledge that physical activity is probably measured only crudely and, in case–control studies, that energy intake may also reflect an overall bias in the measurement of dietary intake.

Implications of Non-Normality and Heteroscedasticity

Actual dietary data generally do not have the simple, approximately normal distributions as in Figure 11–5. More typically, energy intake and the nutrient intake are skewed toward higher values, and the variation in nutrient intake (and thus the residuals) is greater at higher total energy intake (Fig. 11–7, for example, using saturated fat). The lack of constant variation in the residuals across level of the independent variable (heteroscedasticity) is in principle a violation of usual regression assumptions and, if ignored, has serious implications for the various methods of energy adjustment. It has been pointed out (M. Maclure, personal communication) that if the residuals from heteroscedastic data are divided into categories, subjects in both the highest and lowest categories will tend to have the highest energy intake. Even though the residuals are overall uncorrelated with energy intake, the residuals would still be confounded by energy intake; if energy intake was positively associated with disease risk, this would create a U-shaped relation between energy-adjusted intake (residuals) and disease risk. If energy were included in the model, this would control for its confounding effect, but the association with disease would be dominated by subjects with high energy intake because they would be over-represented in the extreme quintiles. This is particularly worrisome because the more extreme energy intakes may represent the least reliable data. Transformations, such as taking logarithms of the variables, are typically used to create residuals with a more constant variance across the independent variable (see Fig. 11–8). As a result, subjects will contribute similarly to information on dietary composition and disease

Figure 11–7. Intake of saturated fat by quintiles of total energy intake in the Health Professionals Follow-Up Study (n = 10,000).

risk regardless of their energy intake. However, the use of a logarithmic transformation means that nutrient intake is now expressed as a proportional difference (i.e., disease risk would be described for a percentage change in nutrient intake). As noted above, nutrients expressed as residuals have an unfamiliar scale (and even more so when logarithmically transformed); a back transformation can be made by adding a constant (e.g., the predicted

Figure 11–8. Log-transformed saturated fat and energy intake data from Figure 11–7.

value for log of 2,000 calories) and then taking the antilog. However, in the interpretation of data it must be remembered that the effect of the energy-adjusted nutrient intake will be expressed for the specified energy intake (e.g., 2,000 calories) and would vary with higher or lower intakes. That the effect of an absolute intake of a nutrient would vary by total energy intake (i.e., an interaction exists) is consistent with our general biologic understanding, as discussed in the context of nutrient density. Indeed, it is interesting that the logarithmically transformed nutrient residual has in common with the nutrient density the concept that its influence on disease risk is related to a relative change in intake.
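
The logarithmic residual and its back transformation can be sketched as follows. The intake values are hypothetical, the 2,000-calorie reference follows the example in the text, and the function name is illustrative only.

```python
import math


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def log_energy_adjusted(nutrient, energy, reference_energy=2000.0):
    """Energy-adjusted nutrient intake from a log-log residual model.

    The residual of log(nutrient) on log(energy) is shifted by the predicted
    log intake at `reference_energy` and then exponentiated.
    """
    log_energy = [math.log(e) for e in energy]
    log_nutrient = [math.log(v) for v in nutrient]
    intercept, slope = ols_fit(log_energy, log_nutrient)
    predicted_at_ref = intercept + slope * math.log(reference_energy)
    adjusted = []
    for le, ln in zip(log_energy, log_nutrient):
        residual = ln - (intercept + slope * le)
        adjusted.append(math.exp(residual + predicted_at_ref))
    return adjusted


# Hypothetical intakes: total energy (kcal/day) and saturated fat (g/day).
energy = [1500.0, 1800.0, 2000.0, 2400.0, 3000.0]
sat_fat = [15.0, 22.0, 24.0, 26.0, 40.0]
adjusted_sat_fat = log_energy_adjusted(sat_fat, energy)
print([round(a, 1) for a in adjusted_sat_fat])
```

Because the residual is added on the log scale, each adjusted value equals the predicted intake at the reference energy multiplied by a subject-specific factor, which is why differences are read as proportional rather than absolute.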

Although the effects of heteroscedasticity are most transparent in the use of residuals as a measure of dietary composition, the same issues exist with the standard multivariate model, but may not be appreciated because the residuals from one independent variable regressed on another are not typically examined. In the “energy decomposition” model, the impact of non-normally distributed variables is less clear, but it is likely that variability in the nutrient of interest could differ by level of energy intake from other sources in some circumstances. These interrelationships deserve careful examination in any particular application.
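
One way to examine these interrelationships is a small simulation, sketched below. The data are simulated, not real: energy intake is drawn from a right-skewed distribution, and the scatter of saturated fat is assumed to grow in proportion to energy. All parameter values are arbitrary choices made for illustration.

```python
import random


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mean(values):
    return sum(values) / len(values)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 5000

# Simulated (not real) data: energy is right-skewed, and the scatter of
# saturated fat around its expected value grows in proportion to energy.
energy = [rng.lognormvariate(7.6, 0.3) for _ in range(n)]
sat_fat = [0.012 * e * (1.0 + 0.2 * rng.gauss(0.0, 1.0)) for e in energy]

intercept, slope = ols_fit(energy, sat_fat)
resid = [f - (intercept + slope * e) for e, f in zip(energy, sat_fat)]

# Mean energy intake within each quintile of the residual (lowest first).
order = sorted(range(n), key=lambda i: resid[i])
size = n // 5
quintile_mean_energy = [
    mean([energy[i] for i in order[q * size:(q + 1) * size]])
    for q in range(5)
]
print([round(m) for m in quintile_mean_energy])
```

Under these assumptions the lowest and highest residual quintiles both have a higher mean energy intake than the middle quintile, even though the residuals are uncorrelated with energy overall.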

More Complex Models

In the analytic approaches discussed previously, only one macronutrient at a time was considered in addition to total energy intake. In principle, these approaches could be extended to include other nutrients as well. For example, using the energy-adjustment approach (model 2), one can compute calorie-adjusted residuals for both protein and fat and include both along with total calories in the same model, or one can use the energy decomposition method to enter energy from fat, protein, and carbohydrate as three separate terms. The capacity to include multiple energy-adjusted nutrients in a model simultaneously will be limited by their intercorrelations and the size of the dataset; if the variables are substantially intercorrelated, the degree of independent variation will quickly become small. However, it will frequently be important to include at least two nutrients at a time. For example, in a study of fiber intake and coronary heart disease risk (Rimm et al., 1996), a critical part of the analysis was to demonstrate that the apparent protective effect of fiber was not explained by lower intake of saturated fat or higher intake of constituents of fruits and vegetables other than fiber. The inclusion of additional nutrient terms to these models should be done with caution as the interpretation of even the two-variable models can be complex.

In the simple multivariate nutrient density model that includes just one type of fat plus total energy intake, the coefficient estimates the effect of substituting that fat for the same amount of energy from the average mix of other macronutrients in that population. The mix would include other fats, proteins, and carbohydrates, which hardly provides a clear comparison. A more completely specified model can be used to make comparisons more explicit. For example, Hu et al. (1997) used a multivariate nutrient density model to study the effects of specific types of fat in relation to risk of coronary heart disease. The model included saturated fat, monounsaturated fat, polyunsaturated fat, trans fat, and protein (all expressed as percent of energy) as well as alcohol, total energy intake, and other established risk factors for coronary disease. Because carbohydrate was the only macronutrient not included in the model, the coefficient for a specific type of fat or protein estimated the effect of substituting a specified percentage of energy from that nutrient for the same percentage of energy from carbohydrate. Although the choice of the comparison macronutrient left unspecified is somewhat arbitrary, the comparison to carbohydrate is typical in metabolic studies because it is the largest source of energy in most diets. The same, more completely specified, model can also be used to estimate the effect of substituting one type of fat for another. As done by Hu et al. (1997), for example, the effect of substituting polyunsaturated fat for the same percentage of energy from saturated fat can be estimated as the difference between their coefficients. The confidence interval for this estimated effect can be calculated from the pooled variance derived from the variance-covariance matrix for the model.
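
The arithmetic for a substitution effect can be sketched as follows. The coefficients, variances, and covariance are hypothetical numbers, not results from Hu et al. (1997), and a Wald-type 95% interval is assumed.

```python
import math


def substitution_effect(b_new, b_old, var_new, var_old, cov_new_old, z=1.96):
    """Difference between two coefficients with a Wald-type 95% interval.

    Var(b_new - b_old) = Var(b_new) + Var(b_old) - 2 * Cov(b_new, b_old).
    """
    diff = b_new - b_old
    se = math.sqrt(var_new + var_old - 2.0 * cov_new_old)
    return diff, se, (diff - z * se, diff + z * se)


# Hypothetical log relative-risk coefficients (per 5% of energy) and the
# matching entries of the model's variance-covariance matrix.
b_poly, b_sat = -0.30, 0.20
var_poly, var_sat, cov_poly_sat = 0.010, 0.010, 0.002

diff, se, (lower, upper) = substitution_effect(
    b_poly, b_sat, var_poly, var_sat, cov_poly_sat
)

# Exponentiate to express the substitution as a relative risk.
relative_risk = math.exp(diff)
rr_interval = (math.exp(lower), math.exp(upper))
print(round(relative_risk, 2), tuple(round(v, 2) for v in rr_interval))
```

The covariance term matters because coefficients for nutrients in the same model are rarely independent; omitting it would misstate the width of the interval.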

Categorization of Nutritional Variables

In the preceding discussion of energy adjustment, nutrients have been considered as continuous variables. However, in many analyses nutrients are grouped by quartiles or other categories. Reasons for using categories include the capacity to compute relative risks for actual groups of subjects, the avoidance of imposing a dose–response relationship (such as linear) that does not actually exist, and the ability to avoid undue influences of outlier values. Alternative arguments exist for using continuous variables, including the maximization of statistical power; a thorough analysis will usually involve both approaches.

In conducting categorical nutrient analyses, it is important to recognize that the statistical interchangeability of the standard multivariate, energy partition, and residual models does not apply. This phenomenon was demonstrated by Kushi and coworkers (1992) and was examined in detail by Brown and coworkers (1994). In the example provided by Kushi et al., it was noted that for quartiles of dietary fat the standard multivariate and energy partition models gave higher relative risks but wider confidence intervals than the residual method; the findings from the nutrient density method were similar to those from the residual method. This phenomenon arises for several reasons. The most basic difference is that the categories for the nutrient (say fat) in the standard multivariate model are based on the crude (marginal) distribution of fat intake, and the categories for residuals are made after removing variation due to total energy intake. Thus, many individuals are no longer in the same category because someone can have a high crude fat intake but a low fat composition of their diet. Also, as shown earlier, the range in crude fat intake will be much wider than for energy-adjusted fat intake. Thus, the relative risk across quintiles from the standard multivariate categories will correspond to a greater difference in fat intake than for the residual quintiles; this will tend to create higher relative risks. However, because of the high collinearity between fat and total energy, the confidence intervals will be wider for the standard multivariate model. Brown and colleagues (1994) have shown that in the categorical analysis statistical power will be higher with the residual or nutrient density methods. For these reasons, categorical analyses based on the standard multivariate and energy partition models are probably best avoided.
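
The difference between crude and residual categories can be seen in a small simulation, sketched below. The data are simulated under an assumed linear relation between fat and energy with constant scatter; the parameter values and helper names are illustrative only.

```python
import random


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def quintile_labels(values):
    """Label each value 0 (lowest fifth) to 4 (highest fifth) by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * 5 // len(values)
    return labels


def top_minus_bottom(values, labels):
    """Difference in mean value between the highest and lowest quintile."""
    top = [v for v, q in zip(values, labels) if q == 4]
    bottom = [v for v, q in zip(values, labels) if q == 0]
    return sum(top) / len(top) - sum(bottom) / len(bottom)


rng = random.Random(1)  # fixed seed so the sketch is reproducible
n = 2000

# Simulated (not real) intakes: energy in kcal/day, fat in g/day.
energy = [rng.gauss(2000.0, 400.0) for _ in range(n)]
fat = [0.04 * e + rng.gauss(0.0, 8.0) for e in energy]

intercept, slope = ols_fit(energy, fat)
fat_resid = [f - (intercept + slope * e) for e, f in zip(energy, fat)]

crude_q = quintile_labels(fat)
resid_q = quintile_labels(fat_resid)

# Subjects whose quintile changes, and the top-versus-bottom contrast in
# grams of fat under each categorization.
moved = sum(1 for a, b in zip(crude_q, resid_q) if a != b)
crude_contrast = top_minus_bottom(fat, crude_q)
resid_contrast = top_minus_bottom(fat_resid, resid_q)
print(moved / n, round(crude_contrast, 1), round(resid_contrast, 1))
```

In this setup many subjects change quintile, and the extreme crude quintiles are separated by a larger difference in fat intake than the extreme residual quintiles, which is consistent with the higher relative risks described above.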

Implications for Food-Frequency Questionnaire Data

The preceding discussion assumes that accurate, quantitative data are available for analysis. Because of the need for rapid, inexpensive methods to assess long-term intake in large numbers of subjects, many epidemiologists use simple or semiquantitative food-frequency questionnaires that are less than perfectly accurate. The meaning of energy intake computed from such questionnaires may be less clear than that from more quantitative methods. To the extent that subjects with higher caloric intakes simply consume larger portion sizes rather than more food items, nutrient intakes may be inherently adjusted for total caloric intake. This adjustment, however, is likely to be only partial at most because many food items (e.g., eggs, bread, and apples) come in predetermined units. With any method of assessing energy intake, subjects may either overestimate or underestimate their overall intake. As suggested by the greater standard deviation for total energy intake estimated from food-frequency questionnaires than from diet records (see Chapter 6), it is likely that these tendencies are greater with food-frequency questionnaires.

Although energy intake data from food-frequency questionnaires may be imperfect and thus not fully represent the effects of body size, activity, metabolism, and energy balance, it would still be appropriate to use this measure for the computation of energy-adjusted intakes as described previously. To the extent that this adjustment also reduces extraneous between-person variation due to general overreporting or underreporting of food intake, a further gain in accuracy may be obtained in some instances (see Chapter 6). However, improvement in validity by energy adjustment should be regarded as a secondary benefit rather than the primary justification for energy adjustment.

Energy intake measured by dietary records or 24-hour recall has repeatedly been found to be 10 to 30% lower in comparisons with total energy expenditure measured by doubly-labeled water, energy intake needed to maintain weight, or minimal estimates needed for survival (calculated from basal metabolic rates adjusted for age, gender, height, and weight) (Black et al., 1993; de Vries et al., 1994; Klesges et al., 1995). In general, underreporting has been greater among women and obese persons. Although some have used evidence of underreporting of energy intake to cast doubt on any measurements of dietary intake, total energy intake is rarely of direct interest in epidemiologic studies for reasons described above. Indeed, a major objective in analysis is typically to remove variation in nutrient intake due to energy intake. Moreover, systematic biases do not hinder the capacity to find important associations in epidemiologic studies. Whether underestimation of total energy intake is associated with dietary composition is of some importance to epidemiologists. In an Australian dietary survey, 29% of subjects had implausibly low energy intakes (Smith et al., 1994). However, mean nutrient intakes expressed as nutrient densities did not change appreciably when underreporters were excluded, indicating that underreporting was not associated with dietary composition. Also, Nelson and Bingham (1997) found no relation between energy-adjusted intakes of fat, protein, and carbohydrate and underreporting of total energy intake assessed with doubly-labeled water. Similarly, Lissner and Lindroos (1994) and Martin and colleagues (1996) found substantial underreporting of total energy intake assessed by 24-hour recall among obese women, but there was no evidence of underreporting of macronutrients expressed as a percentage of energy. 
Thus, while underreporting of total energy intake is an important issue in some circumstances, it is not a major issue in epidemiologic analyses because dietary composition is the primary focus; moreover, the major correlates of underreporting such as age, gender, and body fat are accounted for in typical analyses.

Although adjustment for total energy intake based on a food-frequency questionnaire should reduce confounding by energy intake, control of confounding may not be complete because both the nutrient and the energy intake are imperfectly measured (Greenland, 1980). Data from a validation study can be extremely useful to evaluate the degree to which confounding has been controlled. For example, within the Nurses’ Health Study a reasonably strong association was observed between risk of a certain disease and intake of both energy and total fat. The association with energy-adjusted fat was slightly less strong, but still potentially important. We were concerned, however, that the effect of energy-adjusted fat might be due to residual confounding by total energy intake, which was measured imperfectly by the questionnaire. Therefore, we examined the correlation between energy-adjusted fat intake measured by questionnaire and total energy intake measured by diet record in a validation study (presumably a very good measure of intake). The minimal correlation observed (r = 0.01) indicated that the association observed for energy-adjusted fat intake was not materially confounded by total energy intake. The control of confounding obtained by energy adjustment, however, was not complete for all nutrients, which further indicates the usefulness of the validation study data, as the degree and direction of confounding due to imperfectly measured total energy intake would have been difficult to predict.
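
A check of this kind can be sketched as follows. The six subjects and their intakes are hypothetical and do not come from the Nurses' Health Study; the variable names are illustrative.

```python
import math


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical validation-study subjects: questionnaire energy (kcal/day),
# questionnaire fat (g/day), and diet-record energy (kcal/day).
ffq_energy = [1600.0, 1900.0, 2100.0, 2300.0, 2600.0, 3000.0]
ffq_fat = [60.0, 78.0, 80.0, 95.0, 98.0, 125.0]
record_energy = [1750.0, 1800.0, 2250.0, 2200.0, 2500.0, 2700.0]

# Energy-adjusted fat from the questionnaire (residual method).
intercept, slope = ols_fit(ffq_energy, ffq_fat)
ffq_fat_adjusted = [
    f - (intercept + slope * e) for e, f in zip(ffq_energy, ffq_fat)
]

# Correlation with the better measure of energy; a value near zero would
# suggest little residual confounding by total energy intake.
r_residual_confounding = pearson_r(ffq_fat_adjusted, record_energy)
print(round(r_residual_confounding, 2))
```

By construction the adjusted intake is uncorrelated with questionnaire energy, so the informative comparison is with energy measured by the diet record.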

SUMMARY

Associations between intakes of specific nutrients and disease cannot be considered primary effects of diet if they are simply the result of differences in total energy intake between cases and noncases resulting from differences in body size, physical activity, and metabolic efficiency. Epidemiologic studies of diet and disease should therefore be principally directed at the effects of the nutrient composition of the diet independent of total energy intake. This can be accomplished by the use of the nutrient density if total energy intake is also included as a covariate or by other methods that adjust nutrient intake for energy intake using regression analysis.

Although pitfalls in the manipulation and interpretation of energy intake data in epidemiologic studies have been emphasized, these considerations also highlight the importance of obtaining a measurement of total energy intake. For instance, if a questionnaire obtained information on only saturated fat intake in a study of coronary heart disease, it is possible that an inverse or no association would be found even if high saturated fat composition of the diet truly caused coronary disease, as the energy intake of cases is likely to be less than that of noncases. Such a finding could be appropriately interpreted if an estimate of total energy intake were available.

The relationships between dietary factors and disease are complex. Even with carefully collected measures of intake, consideration of the biologic implications of various analytic approaches is needed to avoid misleading conclusions.

REFERENCES

Abraham, S., and M. D. Carroll (1979). Food consumption patterns in the United States and their potential impact on the decline in coronary heart disease mortality. In Havlik, R. J., and Feinleib, M. (eds.): Proceedings of the Conference on the Decline in Coronary Heart Disease Mortality (DHEW publication No. 79-1610). Washington, DC: National Institutes of Health, pp 253–281.

Anonymous (1985). Consensus Conference. Lowering blood cholesterol to prevent heart disease. JAMA 253, 2080–2086.

Ascherio, A., M. J. Stampfer, G. A. Colditz, E. B. Rimm, L. Litin, and W. C. Willett (1992). Correlations of vitamin A and E intakes with the plasma concentrations of carotenoids and tocopherols among American men and women. J Nutr 122, 1792–1801.

Beaton, G. H., J. Milner, P. Corey, V. McGuire, M. Cousins, E. Stewart, M. de Ramos, D. Hewitt, P. V. Grambsch, N. Kassim, and J. A. Little (1979). Sources of variance in 24-hour dietary recall data: Implications for nutrition study design and interpretation. Am J Clin Nutr 32, 2546–2549.

Black, A. E., A. M. Prentice, G. R. Goldberg, S. A. Jebb, S. A. Bingham, M. B. E. Livingstone, and W. A. Coward (1993). Measurements of total energy expenditure provide insights into the validity of dietary measurement of energy intake. J Am Diet Assoc 93, 572–579.

Bostick, R. M., J. D. Potter, L. H. Kushi, T. A. Sellers, K. A. Steinmetz, D. R. McKenzie, S. M. Gapstur, and A. R. Folsom (1994). Sugar, meat, and fat intake, and non-dietary risk factors for colon cancer incidence in Iowa women (United States). Cancer Causes Control 5, 38–52.

Brown, C. C., V. Kipnis, L. S. Freedman, A. M. Hartman, A. Schatzkin, and S. Wacholder (1994). Energy adjustment methods for nutritional epidemiology: The effect of categorization. Am J Epidemiol 139, 323–338.

de Vries, J. H., P. L. Zock, R. P. Mensink, and M. B. Katan (1994). Underestimation of energy intake by 3-d records compared with energy intake to maintain body weight in 269 nonobese adults. Am J Clin Nutr 60, 855–860.

Donato, K., and D. M. Hegsted (1985). Efficiency of utilization of various sources of energy for growth. Proc Natl Acad Sci USA 82, 4866–4870.

Gail, M. H., S. Wieand, and S. Piantadosi (1984). Biased estimates of treatment effect in randomized experiments with nonlinear regressions and omitted covariates. Biometrika 71, 431–444.

Garabrant, D. H., J. M. Peters, T. M. Mack, and L. Bernstein (1984). Job activity and colon cancer risk. Am J Epidemiol 119, 1005–1014.

Garcia-Palmieri, M. R., P. Sorlie, J. Tillotson, R. Costas, Jr., E. Cordero, and M. Rodriguez (1980). Relationship of dietary intake to subsequent coronary heart disease incidence: The Puerto Rico Heart Health Program. Am J Clin Nutr 33, 1818–1827.

Giovannucci, E., E. B. Rimm, M. J. Stampfer, G. A. Colditz, A. Ascherio, and W. C. Willett (1994). Intake of fat, meat, and fiber in relation to risk of colon cancer in men. Cancer Res 54, 2390–2397.

Giovannucci, E., A. Ascherio, E. B. Rimm, G. A. Colditz, M. J. Stampfer, and W. C. Willett (1995). Physical activity, obesity, and risk for colon cancer and adenoma in men. Ann Intern Med 122, 327–334.

Goldbohm, R. A., P. A. van den Brandt, P. van’t Veer, H. A. M. Brants, E. Dorant, F. Sturmans, and R. J. J. Hermus (1994). A prospective cohort study on the relation between meat consumption and the risk of colon cancer. Cancer Res 54, 718–723.

Gordon, T., M. Fisher, and B. M. Rifkind (1984). Some difficulties inherent in the interpretation of dietary data from free-living populations. Am J Clin Nutr 39, 152–156.

Gordon, T., A. Kagan, M. Garcia-Palmieri, W. B. Kannel, W. J. Zukel, J. Tillotson, P. Sorlie, and M. Hjortland (1981). Diet and its relation to coronary heart disease and death in three populations. Circulation 63, 500–515.

Greenland, S. (1980). The effect of misclassification in the presence of covariates. Am J Epidemiol 112, 564–569.

Hegsted, D. M. (1985). Dietary standards: Dietary planning and nutrition education. Clin Nutr 4, 159–163.

Himms-Hagen, J. (1984). Thermogenesis in brown adipose tissue as an energy buffer: Implications for obesity. N Engl J Med 311, 1549–1558.

Hofstetter, A., Y. Schutz, E. Jequier, and J. Wahren (1986). Increased 24-hour energy expenditure in cigarette smokers. N Engl J Med 314, 79–82.

Horton, E. S. (1983). Introduction: An overview of the assessment and regulation of energy balance in humans. Am J Clin Nutr 38, 972–977.

Howe, G. R. (1989). Re: “Total energy intake: Implications for epidemiologic analyses” (letter). Am J Epidemiol 129, 1314–1315.

Howe, G. R., A. B. Miller, and M. Jain (1986). Re: “Total energy intake: Implications for epidemiologic analyses” (letter). Am J Epidemiol 124, 157–159.

Howe, G. R., K. J. Aronson, E. Benito, R. Castelleto, J. Cornee, S. Duffy, R. P. Gallagher, J. M. Iscovich, J. Dengao, R. Kaaks, G. A. Kune, S. Kune, H. P. Lee, M. Lee, A. B. Miller, R. K. Peters, J. D. Potter, E. Riboli, M. L. Slattery, D. Trichopoulos, A. Tuyns, A. Tzonou, L. E. Watson, A. S. Whittemore, A. H. Wu Williams, et al. (1997). The relationship between dietary fat intake and risk of colorectal cancer: Evidence from the combined analysis of 13 case–control studies. Cancer Causes Control 8, 215–228.

Hu, F. B., M. J. Stampfer, J. E. Manson, E. Rimm, G. A. Colditz, B. A. Rosner, C. H. Hennekens, and W. C. Willett (1997). Dietary fat intake and the risk of coronary heart disease in women. N Engl J Med 337, 1491–1499.

Hunter, D. J., E. B. Rimm, F. M. Sacks, M. J. Stampfer, G. A. Colditz, L. B. Litin, and W. C. Willett (1992). Comparison of measures of fatty acid intake by subcutaneous fat aspirate, food frequency questionnaire, and diet records in a free-living population of US men. Am J Epidemiol 135, 418–427.

Jain, M., G. M. Cook, F. G. Davis, M. G. Grace, G. R. Howe, and A. B. Miller (1980). A case–control study of diet and colorectal cancer. Int J Cancer 26, 757–768.

Jequier, E., and Y. Schutz (1983). Long-term measurements of energy expenditure in humans using a respiration chamber. Am J Clin Nutr 38, 989–998.

Johnson, M. L., B. S. Burke, and J. Mayer (1956). Relative importance of inactivity and overeating in the energy balance of obese high school girls. Am J Clin Nutr 4, 37–44.

Kipnis, V., L. S. Freedman, C. C. Brown, A. M. Hartman, A. Schatzkin, and S. Wacholder (1993). Interpretation of energy adjustment models for nutritional epidemiology. Am J Epidemiol 137, 1376–1380.

Klesges, R. C., L. H. Eck, and J. W. Ray (1995). Who underreports dietary intake in a dietary recall? Evidence from the Second National Health and Nutrition Examination Survey. J Consulting Clin Psych 63, 438–444.

Kromhout, D., and C. de Lezenne Coulander (1984). Diet, prevalence and 10-year mortality from coronary heart disease in 871 middle-aged men: The Zutphen Study. Am J Epidemiol 119, 733–741.

Kushi, L. H., R. A. Lew, F. J. Stare, C. R. Ellison, M. el Lozy, G. Bourke, L. Daly, I. Graham, N. Hickey, R. Mulcahy, and J. Kevaney (1985). Diet and 20-year mortality from coronary heart disease: The Ireland-Boston Diet-Heart study. N Engl J Med 312, 811–818.

Kushi, L. H., T. A. Sellers, J. D. Potter, C. L. Nelson, R. G. Munger, S. A. Kaye, and A. R. Folsom (1992). Dietary fat and post-menopausal breast cancer. JNCI 84, 1092–1099.

Leibel, R. L., M. Rosenbaum, and J. Hirsch (1995). Changes in energy expenditure resulting from altered body weight. N Engl J Med 332, 621–628.

Lissner, L., and A. K. Lindroos (1994). Is dietary underreporting macronutrient-specific? Eur J Clin Nutr 48, 453–454.

London, S. J., F. M. Sacks, J. Caesar, M. J. Stampfer, E. Siguel, and W. C. Willett (1991). Fatty acid composition of subcutaneous adipose tissue and diet in post-menopausal US women. Am J Clin Nutr 54, 340–345.

Lyon, J. L., J. W. Gardner, D. W. West, and A. M. Mahoney (1983). Methodological issues in epidemiological studies of diet and cancer. Cancer Res 43(suppl), 2392–2396.

Martin, L. J., W. Su, P. J. Jones, G. A. Lockwood, D. L. Tritchler, and N. F. Boyd (1996). Comparison of energy intakes determined by food records and doubly labeled water in women participating in a dietary-intervention trial. Am J Clin Nutr 63, 483–490.

McGee, D., D. Reed, and K. Yano (1984a). The results of logistic analyses when the variables are highly correlated: An empirical example using diet and CHD incidence. J Chronic Dis 37, 713–719.

McGee, D. L., D. M. Reed, K. Yano, A. Kagan, and J. Tillotson (1984b). Ten-year incidence of coronary heart disease in the Honolulu Heart Program: Relationship to nutrient intake. Am J Epidemiol 119, 667–676.

Mellits, E. D., and D. B. Cheek (1970). The assessment of body water and fatness from infancy to adulthood. Monogr Soc Res Child Dev 35, 12–26.

Miller, D. S. (1973). Overfeeding in man. In Bray, G. A. (ed.): Obesity in Perspective DHEW Publication No. 75–708. Washington, DC: National Institutes of Health.

Morris, J. N., J. W. Marr, and D. G. Clayton (1977). Diet and heart: A postscript. BMJ 2, 1307–1314.

National Center for Health Statistics (1979). Weight and Height of Adults 18–74 Years of Age: United States, 1971–74. Hyattsville, MD: National Center for Health Statistics.

Nelson, M., and S. Bingham (1997). Food consumption and nutrient intake. In Margetts, B., and Nelson, M. (eds.): Design Concepts in Nutritional Epidemiology, 2nd ed. New York: Oxford University Press, pp 123–170.

Paffenbarger, R. S., Jr., A. L. Wing, and R. T. Hyde (1978). Physical activity as an index of heart attack risk in college alumni. Am J Epidemiol 108, 161–175.

Pike, M. C., L. Bernstein, and R. K. Peters (1989). Re: “Total energy intake: implications for epidemiologic analyses” (letter). Am J Epidemiol 129, 1312–1315.

Prentice, A. M., A. E. Black, W. A. Coward, H. L. Davies, G. R. Goldberg, P. R. Murgatroyd, J. Ashford, M. Sawyer, and R. G. Whitehead (1986). High levels of energy expenditure in obese women. Br Med J Clin Res 292, 983–987.

Ravussin, E., S. Lillioja, T. E. Anderson, L. Christin, and C. Bogardus (1986). Determinants of 24-hour energy expenditure in man: Methods and results using a respiratory chamber. J Clin Invest 78, 1568–1578.

Rimm, E. B., A. Ascherio, E. Giovannucci, D. Spiegelman, M. J. Stampfer, and W. C. Willett (1996). Vegetable, fruit, and cereal fiber intake and risk of coronary heart disease among men. JAMA 275, 447–451.

Roberts, S. B., P. Fuss, W. J. Evans, M. B. Heyman, and V. R. Young (1993). Energy expenditure, aging, and body composition. J Nutr 123, 474–480.

Romieu, I., W. C. Willett, M. J. Stampfer, G. A. Colditz, L. Sampson, B. Rosner, C. H. Hennekens, and F. E. Speizer (1988). Energy intake and other determinants of relative weight. Am J Clin Nutr 47, 406–412.

Saltzman, E., and S. B. Roberts (1995). The role of energy expenditure in energy regulation: Findings from a decade of research. Nutr Rev 53, 209–220.

Shekelle, R. B., M. Z. Nichaman, and W. J. Raynor, Jr. (1987). Re: “Total energy intake: Implications for epidemiologic analyses” (letter). Am J Epidemiol 126, 980–983.

Shekelle, R. B., O. Paul, and J. Stamler (1985). Diet and coronary heart disease (letter). N Engl J Med 313, 120.

Sims, E. A., E. Danforth, Jr., E. S. Horton, G. A. Bray, J. A. Glennon, and L. B. Salans (1973). Endocrine and metabolic effects of experimental obesity in man. Recent Prog Horm Res 29, 457–496.

Sjostrom, L. (1985). A Review of Weight Maintenance and Weight Changes in Relation to Energy Metabolism and Body Composition. Recent Advances in Obesity Research. Proceedings of the 4th International Congress on Obesity. Westport, CT: Food and Nutrition Press.

Smith, W. T., K. L. Webb, and P. F. Heywood (1994). The implications of underreporting in dietary studies. Austral J Public Health 18, 311–314.

Sonnenschein, E., L. Glickman, M. Goldschmidt, and L. McKee (1991). Body conformation, diet, and risk of breast cancer in pet dogs: A case–control study. Am J Epidemiol 133, 694–703.

Sopko, G., D. R. Jacobs, Jr., and H. L. Taylor (1984). Dietary measures of physical activity. Am J Epidemiol 120, 900–911.

Thomson, A. M., and W. Z. Billewicz (1961). Height, weight and food intake in man. Br J Nutr 15, 241–252.

Van Itallie, T. B. (1978). Dietary fiber and obesity. Am J Clin Nutr 31(suppl), 43S–52S.

Vena, J. E., S. Graham, M. Zielezny, M. K. Swanson, R. E. Barnes, and J. Nolan (1985). Lifetime occupational exercise and colon cancer. Am J Epidemiol 122, 357–365.

Webb, P. (1985). The exchange of matter and energy in lean and overweight men and women: A calorimetric study of overeating, balanced intake and undereating. Int J Obesity 9(suppl 2), 139–145.

Willett, W. C. (1987). Implications of total energy intake for epidemiologic studies of breast and large-bowel cancer. Am J Clin Nutr 45(suppl), 354S–360S.

Willett, W. C. (1990). Total energy intake and nutrient composition: Dietary recommendations for epidemiologists. Int J Cancer 46, 770–771.

Willett, W. C., G. R. Howe, and L. H. Kushi (1997). Adjustment for total energy intake in epidemiologic studies. Am J Clin Nutr 65(suppl), 1220S–1228S.

Willett, W. C., L. Sampson, M. J. Stampfer, B. Rosner, C. Bain, J. Witschi, C. H. Hennekens, and F. E. Speizer (1985). Reproducibility and validity of a semiquantitative food frequency questionnaire. Am J Epidemiol 122, 51–65.

Willett, W. C., and M. J. Stampfer (1986). Total energy intake: Implications for epidemiologic analyses. Am J Epidemiol 124, 17–27.

Willett, W. C., M. J. Stampfer, G. A. Colditz, B. A. Rosner, and F. E. Speizer (1990). Relation of meat, fat, and fiber intake to the risk of colon cancer in a prospective study among women. N Engl J Med 323, 1664–1672.

Willett, W. C., M. J. Stampfer, B. A. Underwood, F. E. Speizer, B. Rosner, and C. H. Hennekens (1983). Validation of a dietary questionnaire with plasma carotenoid and alpha-tocopherol levels. Am J Clin Nutr 38, 631–639.

Woo, R., R. Daniels-Kush, and E. S. Horton (1985). Regulation of energy balance. Annu Rev Nutr 5, 411–433.

Parts of this chapter are reproduced from Willett and Stampfer (1986) and Willett (1987) with permission of the original publishers. Several of the ideas on various analytic strategies evolved during a March 1988 workshop sponsored by the SEARCH program of the International Agency for Cancer Research, Lyon, France. Much is owed to discussions with James Marshall, Geoffrey Howe, Larry Kushi, and Larry Freedman, which are partly embodied in Willett et al., 1997.