Critical Appraisal of Epidemiological Studies and Clinical Trials

Mark Elwood

Print publication date: 2007

Print ISBN-13: 9780198529552

Published to Oxford Scholarship Online: September 2009

DOI: 10.1093/acprof:oso/9780198529552.001.0001

Chapter 11 Critical appraisal of a randomized trial of a preventive agent

J. Mark Elwood

Oxford University Press

Abstract and Keywords

This chapter presents an example of the application of the scheme for critical appraisal: a randomized clinical trial entitled ‘Prevention of neural tube defects: results of the Medical Research Council Vitamin Study’, published in The Lancet in 1991. This large individually randomized trial has produced a result which has extremely high internal validity, with observation bias and confounding being ruled out as alternative explanations.

Keywords:   randomized trial, critical appraisal, neural tube defects, internal validity

In this chapter we will review an important randomized trial, published in The Lancet, 20 July 1991, 338, 131–37 [1]. In contrast with the trial reviewed in Chapter 10, this concerned a preventive strategy. The summary and methods sections are reproduced here, with permission from Elsevier, and the full paper can be accessed at www.thelancet.com.

Prevention of neural tube defects: results of the Medical Research Council Vitamin Study

MRC Vitamin Study Research Group


A randomized double-blind prevention trial with a factorial design was conducted at 33 centres in seven countries to determine whether supplementation with folic acid (one of the vitamins in the B group) or a mixture of seven other vitamins (A, D, B1, B2, B6, C, and nicotinamide) around the time of conception can prevent neural tube defects (anencephaly, spina bifida, encephalocoele). A total of 1817 women at high risk of having a pregnancy with a neural tube defect, because of a previous affected pregnancy, were allocated at random to one of four groups—namely, folic acid, other vitamins, both, or neither. 1195 had a completed pregnancy in which the fetus or infant was known to have or not have a neural tube defect; 27 of these had a known neural tube defect, 6 in the folic-acid groups and 21 in the two other groups, a 72 per cent protective effect (relative risk 0.28, 95 per cent confidence interval 0.12–0.71). The other vitamins showed no significant protective effect (relative risk 0.80, 95 per cent CI 0.32–1.72). There was no demonstrable harm from the folic acid supplementation, though the ability of the study to detect rare or slight adverse effects was limited. Folic acid supplementation starting before pregnancy can now be firmly recommended for all women who have had an affected pregnancy, and public health measures should be taken to ensure that the diet of all women who may bear children contains an adequate amount of folic acid.

Methods

The study was an international, multicentre, double-blind randomized trial involving 33 centres (17 in the UK and 16 in six other countries). Women with a previous pregnancy affected by a neural tube defect, not associated with the autosomal recessive disorder Meckel’s syndrome, were eligible for the study if they were planning another pregnancy and were not already taking vitamin supplements. Women with epilepsy were excluded in case the folic acid supplementation adversely affected their treatment. Antenatal diagnosis of neural tube defects was available at all centres in the study. The effect of supplementation both with folic acid and with a selection of other vitamins was investigated by use of a factorial study design. Women were allocated at random to one of four supplementation groups, the supplements containing folic acid, other vitamins, both, or neither, in the following way:


Group    Folic acid    Other vitamins
A        Yes           No
B        Yes           Yes
C        No            No
D        No            Yes

Comparison of the outcome in groups A and B with those in groups C and D tested the effect of folic acid supplementation; comparison of the outcomes in groups B and D with those in groups A and C tested the effect of the other vitamins. Separate sets of random allocations were used for each centre to ensure that there would be approximately equal numbers of women in each supplementation group at each centre.
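To illustrate how a separate allocation list per centre can keep the four groups approximately balanced within each centre, here is a minimal sketch using permuted blocks of four. The excerpt does not say which allocation algorithm the trial used, so the block method, the function name `allocation_list`, the centre names and sizes, and the seed are all assumptions made only for illustration.

```python
import random
from collections import Counter

# Group labels follow the factorial layout described above:
# A = folic acid only, B = both, C = neither, D = other vitamins only.
GROUPS = ["A", "B", "C", "D"]

def allocation_list(n_women, rng):
    """Return n_women allocations built from shuffled blocks of A, B, C, D.

    Permuted blocks are an assumed method, not the one documented for the trial.
    """
    allocations = []
    while len(allocations) < n_women:
        block = GROUPS[:]      # one of each group per block
        rng.shuffle(block)     # random order within the block
        allocations.extend(block)
    return allocations[:n_women]

# One independent list per centre (hypothetical centre names and sizes).
rng = random.Random(2024)
centres = {"centre_1": 22, "centre_2": 9}
lists = {name: allocation_list(n, rng) for name, n in centres.items()}

for name, allocs in lists.items():
    counts = Counter(allocs)
    # Within a centre, group sizes can differ by at most one.
    print(name, dict(counts))
```

Because each list is built block by block, no centre can end up with group sizes that differ by more than one, whatever number of women it recruits.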

The capsules used in the study were prepared by the Boots Company and packaged in 2-week calendar ‘blister’ packs. Women in the trial were asked to take a single capsule each day from the date of randomization until 12 weeks of pregnancy (estimated from the first day of the last menstrual period). Capsules for those in the folic-acid groups contained 4 mg of folic acid, the larger of the two doses used in the previous studies being chosen because a negative result with the lower dose would have left the matter open. Capsules for those in the multivitamin groups contained vitamin A 4000 U, D 400 U, B1 1.5 mg, B2 1.5 mg, B6 1.0 mg, C 40 mg, and nicotinamide 15 mg. The control substance in the capsules was dried ferrous sulphate 120 mg and di-calcium phosphate 240 mg. The potency of the capsules was independently checked every three months by Hoffmann La Roche in Basel, Switzerland. The trial was double-blind, in that neither the doctor nor the patient knew which regimen had been allocated. It was agreed that the groups to which patients were allocated would normally be revealed only at the end of the trial. The randomization was carried out through the Clinical Trials Service Unit in Oxford.

Women invited to join the trial were given a week to decide if they wished to take part, so that they could consider the matter at leisure and discuss the matter further with others if they wished. All patients were given a printed information leaflet about the trial.

No special advice was specified regarding diet. On entry into the trial, samples of blood and urine were collected and sent to the central trial office in the Department of Environmental and Preventive Medicine at St Bartholomew’s Hospital for folic acid analysis, performed by radioimmunoassay (Amersham International). Patients were then given capsules and requested to attend every three months so that a note could be made of their general health and how many capsules they had taken. Blood and urine samples were collected at each visit for dispatch to the trial office laboratory and a further supply of capsules was given. The last visit took place in the 12th week of pregnancy. The outcomes of all completed pregnancies were recorded, including details of any fetal malformation, sex, birthweight, and head circumference. In the event of a termination of pregnancy or miscarriage the fetus was examined if possible. A woman remained in the trial until she had a pregnancy in which the fetus could be classified as having a neural tube defect or not (‘informative pregnancy’). If, for example, she had a miscarriage and the fetus was not examined, she remained in the study in the same randomization group until the end of the trial or until she had an informative pregnancy. In this way each woman contributed no more than one informative event to the study. The final results are based on the outcome of all informative pregnancies. Whenever a neural tube defect (anencephaly, spina bifida cystica, or encephalocoele) was reported, independent corroboration was sought, with a necropsy report if one was performed, or a description of the lesion for independent review at the trial centre in London (done without knowledge of the allocated group). To monitor possible toxicity associated with the supplementation, forms were provided for the notification of any medical event arising among the women in the trial irrespective of whether this was thought to be associated with the capsules. 
The health of each child born into the study was ascertained annually by sending a questionnaire to the mother on the infant’s first, second and third birthday. This part of the study is continuing. The results of the study, available only to the principal investigator, the study administrator, and the data monitoring committee, were reviewed every six months to enable the study to be stopped early if, as indeed occurred, a clear result emerged.

A. Description of the evidence

  1. What was the exposure or intervention?

  2. What was the outcome?

  3. What was the study design?

  4. What was the study population?

  5. What was the main result?

The objective of this study was to test whether dietary supplementation with different types of vitamins could reduce the recurrence risk of neural tube defects in pregnancies in women who had already had at least one affected birth. The background, described in the introduction, notes that two intervention studies for such high-risk women had been done, with inconclusive results. One of the studies [2] was large, and showed a large and statistically significant decrease in recurrence risk in women taking a combination of folic acid with seven other vitamins, but as the study was not randomized, the difference might have been due to selection factors or other confounding factors. The second study was randomized and used only folic acid [3], but was too small and showed a non-significant benefit when analysed in an intention to treat manner, although the published results, analysed by compliance with treatment, showed a significant effect. Thus, from these trials, it was uncertain whether there was a true preventive action of dietary supplementation, and it was not clear if folic acid as a single agent or in combination with other vitamins was the more effective agent. As a result, a large randomized trial of high-risk women was designed.

The intervention was a dietary supplementation. Two types of supplement were tested: folic acid alone, 4 mg per day, and several vitamins other than folic acid. An efficient design to test two different interventions, a 2 × 2 factorial design, was used, in which the women were randomized into four groups and then received folic acid, other vitamins, both folic acid and other vitamins, or neither. Four identical sets of capsules were prepared; three sets contained the three combinations of supplementation, and the capsules for the fourth group contained only ferrous sulphate and calcium phosphate, which were also included in all the other groups and are regarded as a placebo.

The outcome was the occurrence of a further neural tube defect. These defects comprise a set of congenital defects of the developing nervous system, and can usually be recognized easily in live births and still births, and in induced and spontaneous abortuses if appropriate examination is done. The expected frequency of such outcomes in pregnancies to women who have already had an affected baby is around 2–5 per cent.

The study design was a prospective double-blind randomized trial. A calculation of anticipated study size showed that approximately 2000 pregnancies would be needed, and therefore the study was international, involving 33 centres in seven countries. The eligible population comprised women with a previous pregnancy with an ‘isolated’ neural tube defect, i.e. excluding defects associated with Meckel’s syndrome. This syndrome includes neural tube defects, but because it has an autosomal recessive inheritance pattern, it has a substantially higher recurrence risk than the more common situation of an isolated neural tube defect, which was the focus of this study. Because folic acid interacts with drugs used to control epilepsy, women with epilepsy were excluded from this study.

The main result (Ex. 11.1) consists of two comparisons. One is between all women randomized to take folic acid and all women randomized not to take folic acid. The second is between all women randomized to take other vitamins and all those randomized not to take other vitamins. The outcome variable is prevalence at birth, i.e. the number of affected babies over the total number of births. The results are shown in 2 × 2 tables, and the relative risk calculated as the ratio of the prevalence rates. A Mantel–Haenszel analysis is appropriate.

Ex. 11.1. Results of Medical Research Council trial: main treatment comparisons. From MRC Vitamin Study Research Group [1]

The main result is that six of the 593 births to mothers allocated folic acid were affected (1.0 per cent) compared with 21 of 602 births in those not receiving folic acid (3.5 per cent), giving a relative risk of 0.29 with 95 per cent confidence limits of 0.12 to 0.71 (see Appendix Table 2 for the calculations). The prevalence in births to women receiving other vitamins was 2.0 per cent compared with 2.5 per cent in those not receiving other vitamins, giving a relative risk of 0.80. This modest reduction in risk is not statistically significant, with 95 per cent confidence limits of 0.37 to 1.72.
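The folic acid figures can be checked with a short calculation. The sketch below uses the usual log-based standard error for a ratio of two proportions; that this matches the formula in Appendix Table 2 is an assumption, and the function name `relative_risk_ci` is introduced here only for illustration.

```python
import math

def relative_risk_ci(cases_exposed, n_exposed, cases_unexposed, n_unexposed, z=1.96):
    """Relative risk with a log-based confidence interval (95 per cent for z = 1.96)."""
    risk_exposed = cases_exposed / n_exposed
    risk_unexposed = cases_unexposed / n_unexposed
    rr = risk_exposed / risk_unexposed
    # Standard error of ln(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(
        1 / cases_exposed - 1 / n_exposed
        + 1 / cases_unexposed - 1 / n_unexposed
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Folic acid comparison from the text: 6 of 593 versus 21 of 602 births.
rr, lower, upper = relative_risk_ci(6, 593, 21, 602)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
# RR = 0.29, 95% CI 0.12 to 0.71
```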

B. Internal validity: consideration of non-causal explanations

6. Are the results likely to be affected by observation bias?

We must first assess if the recorded results in terms of pregnancy outcome are likely to be accurate, or whether they could be influenced by observation bias. There are two main issues. First, is the outcome of the pregnancy observable? A major problem with any study of birth defects is that what is observed is the net result of two processes: the occurrence of the birth defect, very early in pregnancy, and the survival of the affected fetus in utero. If affected fetuses do not survive, but are expelled as spontaneous abortuses early in the pregnancy, then the outcomes of pregnancy will not be measurable in a routine study. If they survive to present as later spontaneous abortions, as stillbirths, or as live births, then they can be observed. For these reasons the information on all pregnancies occurring, and whether these were ‘informative’, i.e. they could be examined to ascertain if a neural tube defect was present, is important (Ex. 11.2). Overall, 75 per cent of the women randomized had a pregnancy known to the study, and 88 per cent of these pregnancies were informative. The non-informative pregnancies could be different in terms of the occurrence of neural tube defects, and some pregnancies may have occurred without being recognized within the study. The important issue is whether there is any indication of differences between the mothers allocated folic acid and those not allocated folic acid; in fact, the two groups were extremely similar (Ex. 11.2). There was a considerably higher number of terminations of informative pregnancies in mothers not receiving folic acid. This will include pregnancies terminated after recognition of a neural tube defect following antenatal diagnosis, as this service was available to all women in this study. There is also a somewhat higher number (three compared with none) of non-informative terminations, i.e. terminations of pregnancy for which information was not available to the study authors; these may include some neural tube defects. Overall, it seems unlikely that any differences in the ascertainment of pregnancy outcomes would influence the result.

Ex. 11.2. Results of Medical Research Council trial: other outcomes of pregnancy. From MRC Vitamin Study Research Group [1]

The second issue is the completeness and accuracy of ascertainment of neural tube defects in informative pregnancies. In general, neural tube defects are severe abnormalities that can be accurately recognized and classified if careful examination is performed, although this is more difficult for a termination than it is for a live birth. There are also minor and even subclinical forms of the condition, so that some biological occurrences of defects may not be recognized. The trial protocol called for independent corroboration of a report of the neural tube defect. The necropsy report or a description of the lesion was sent to the trial centre for independent review, and this was done without knowledge of the allocated group. A total of 27 neural tube defects were reported, and in the discussion the authors comment that in 23 cases the women had a termination of pregnancy after antenatal diagnosis, and in four there was a live birth, two of which survived. For 18 of the 25 dead cases there was a necropsy report, and there were confirmatory descriptions in the remaining seven. The authors state that they are confident about the reliability of the diagnoses. The reports of neural tube defects are likely to be valid. The main limitation is that it is difficult to be certain that all neural tube defects were counted; some may have occurred in non-informative pregnancies, some may not have been recognized at all, and some could have been missed in informative pregnancies, although that seems unlikely.

The main potential problem is less complete ascertainment in mothers allocated folic acid, as that direction of bias would contribute to the observed result. If the reported frequency of other abnormalities were also lower in pregnancies to women allocated folic acid, it would suggest general under-ascertainment or that folic acid prevents other abnormalities. In fact, the number of other abnormalities was slightly higher in the group allocated folic acid, suggesting that under-ascertainment did not occur (Ex. 11.2).

7. Are the results likely to be affected by confounding?

The primary strength of the randomized trial design is in protection against confounding. The previous large non-randomized trial of vitamin supplementation showed a statistically significant benefit [2], but because the women allocated the supplement were self-selected by their attendance at a specialized unit, that group may have had other features which put them at a lower risk of recurrence of neural tube defects. From the general epidemiology of these defects, factors such as higher social class, residence in low-risk geographical areas, lower parity, or age could reduce risk. In this study, randomization was used on a substantial number of mothers. In Table II of the paper the four groups resulting from the randomization are compared in terms of age, previous pregnancy history, and the number of previous pregnancies affected by neural tube defect, and information is also given on social class for the women within the UK; all of these show almost identical distributions between the four groups. The randomization was done within each centre, so that the distribution of mothers receiving or not receiving folic acid is similar over the different centres. This design provides good reassurance that the baseline characteristics of the different groups of mothers were the same.

Another major problem with the previous non-randomized study was that the women allocated the supplement were aware of it, and had probably received more information about the disease and their pregnancy than had the comparison women. Therefore the women receiving the supplement may have actively made other changes, such as improving their own diet in other ways, which may have confounded the effect of the supplement. The randomized trial used a double-blind design, so that all women involved were given the same information, and were given identical capsules. They were not given any special advice about diet. Many of the women involved in the trial may have taken other actions, such as modifying their diet, changing their consumption of substances such as alcohol, or changing their lifestyle in terms of exercise and occupation, and so on. The strength of the double-blind design is that all women received the same information, and therefore the extent to which they made other modifications should have been the same irrespective of the supplementation they received. The one danger that still remains, even with the double-blind design, is the possibility that all the women in the trial would make dramatic changes to other aspects of their diet, so that their total vitamin intake would increase to the point where the marginal effect of the supplementation would be irrelevant. Partly as a protection against this, and more directly to assess whether women actually took the allocated capsules, serum folic acid levels were assessed from blood samples at various times. These results (Table VII of the paper) show substantially higher serum folic acid levels in those allocated folic acid than in others, which both indicates compliance and shows that the folic acid levels were not increased in the whole group to the extent of removing the effect of the supplement.

8. Are the results likely to be affected by chance variation?

As noted above, the simple analysis of the results shows that the reduction in risk associated with folic acid supplementation is very unlikely to have occurred by chance, whereas the small reduction in risk associated with other vitamins is well within the likely range of chance variation. The results of this study were monitored by an independent review group, using a sequential method of analysis [4] which is described in the paper. The trial was terminated when the cumulative difference between the number of neural tube defects in the folic-acid group compared with the non-folic-acid group reached a preset boundary (Ex. 11.3). Because the trial was stopped at a point that depended on the results at that time, the relative risk estimate produced by standard methods is somewhat exaggerated. The relative risk needs to be calculated by a different method which allows for this early stopping; this gives a result of 0.33, with 95 per cent confidence limits of 0.06 to 0.80. This is not quite as extreme as the more simply calculated relative risk of 0.29. As the simpler estimate is very similar, it can be used to explore the results of this study further.

The relative risks and confidence limits shown in Ex. 11.1 can be reproduced by applying the formulae in Appendix Table 2, using the variance formulae in section 4A. The test-based confidence limits are very similar.


Ex. 11.3. Sequential analysis, showing cumulative difference between number of neural tube defects (NTDs) in the folic-acid and non-folic-acid groups plotted against total number of neural tube defects occurring in the trial. The boundaries of the diagram define the stopping points of the study. The methods used to determine the boundaries are described in the paper. From MRC Vitamin Study Research Group [1]

C. Internal validity: consideration of positive features of causation

9. Is there a correct time relationship?

The time relationship is clear in this prospective intervention trial. It is stated that 7 per cent of women with informative pregnancies had stopped taking their capsules before they became pregnant, usually because they lost interest in participation. Specific results are not reported for these women; if the number were larger, some useful information on the effectiveness of the intervention if it is stopped before pregnancy would be given.

10. Is the relationship strong?

The overall relationship here is strong, with a 67 per cent reduction in risk. However, it is relevant that the result of the previous non-randomized trial was equally large, but that did not adequately deal with the possibility of uncontrolled confounding. Indeed, an even larger variation in risk was seen in an observational study comparing the recurrence risks in women grouped by the hospital which they had previously attended, and these large differences in risk remained unexplained by any known differences in supplementation, diet, or other factors [5]. Therefore, while the strength of the relationship is of critical importance in terms of the implementation and impact of these results, in terms of the interpretation of causality, it assists, but the main strength comes from the double-blind randomized design in reducing the possibility of confounding or observation bias.

11. Is there a dose–response relationship?

There is no opportunity here to assess any dose–response relationship, as only a single dose was used. The questions of whether a lower dose would be equally effective, or a higher dose would be more effective, are not resolved.

12. Are the results consistent within the study?

Consistency within the study can be looked at in a number of ways (Ex. 11.4). The geographical distribution breaks down primarily into centres within and outside the UK, and the results are virtually identical for each of these groups. There are two major categories of neural tube defect, anencephalus and spina bifida. The results show a protective effect for each defect, and although the relative risk is lower for spina bifida than it is for anencephalus, the confidence limits overlap. A major difference between the results for the two defects would raise a question of differences in ascertainment, or indicate a true biological difference. It would be possible to analyse the results by other characteristics such as maternal age or previous pregnancy history, but this is not done in this paper.

13. Is there any specificity within the study?

The most important specificity is that the protective effect was shown for folic acid, but not for the other set of vitamins. This distinction is based on a specific hypothesis set up in advance, and therefore can be accepted. The authors conclude that the result is specific to folic acid supplementation, a conclusion of considerable practical importance. However, the results did show some reduction in risk with the other vitamins, although it was not statistically significant, and so some benefit of the other vitamins cannot be ruled out. Although the numbers are insufficient for confident results, there is no evidence that the risk was lower in those who received both folic acid and other vitamins than in those who received folic acid alone (Ex. 11.1). There is no evidence of specificity in terms of the particular defect, as discussed above.


Ex. 11.4. Results of Medical Research Council trial: main treatment comparison within subgroups. From MRC Vitamin Study Research Group [1]

Conclusions with regard to internal validity

The internal validity of this study is extremely high. It is difficult to make a good argument for any explanation of the main result other than that of causality. Because of the randomization, the groups of women being compared are very similar in several factors related to their risk. The occurrence of informative pregnancies, and the documentation of defects resulting from those pregnancies have been assessed consistently and in identical fashion for all women in this study. The main result is consistent with the pre-defined hypothesis and is strong, and the possibility of it occurring by chance variation is extremely remote. The results are consistent within different subgroups.

D. External validity: generalization of the results

14. Can the study results be applied to the eligible population?

The eligible population consists of the next occurring pregnancy in each woman identified as eligible in the participating centres. The fact that a high proportion of pregnancies were informative is an important strength of the study. The study was designed to ensure that each woman contributed no more than one informative event, so that each informative pregnancy is an independent observation. Each woman remained in the trial until she had an informative pregnancy, or until the end of the trial; women who had an uninformative pregnancy remained in the study in the same randomization group until an informative pregnancy resulted or the trial ended.

The relationship between the participants and the total eligible population is not clear. Women invited to join the trial were given a week to decide if they wished to take part. There is no information given on how eligible women were identified, what proportion of them were invited to participate, or what proportion of those invited to participate did so. Therefore the participants were a selected subgroup of all eligible women. This is relevant only if the effect of the supplementation could be different in eligible women who did not participate; it is difficult to see how this would occur. We might expect that the participants would be more interested and more knowledgeable about the pregnancy and their risk state than non-participants, and so might be more active in making other changes such as other dietary alterations. Such an effect would apply to all the randomized groups, and its effect, if any, would be to reduce the effect of supplementation. Therefore the results should apply to the eligible population.

15. Can the study results be applied to the source population?

The source population consists of all women who had a previous affected pregnancy who were seen in any of the participating centres. The eligibility criteria were to be planning another pregnancy, not be already taking vitamin supplements, and not to have epilepsy. The first two criteria may exclude women who are both least and most prepared for a further pregnancy. As with the previous issue, the relevant question concerns the generalizability of the result of the study, and these exclusion criteria do not seem likely to limit that.

16. Can the study results be applied to other relevant populations?

The generalization of the result to appropriate target populations is a very important issue. The underlying biological hypothesis being tested was that vitamin supplementation reduces the recurrence risk of these defects in the reproductive population in general, i.e. on a worldwide basis. We have to ask to what extent the women who had the opportunity to take part in the trial are likely to be representative of all at-risk women, i.e. those who have had a previous affected child. Neural tube defects have a complex epidemiological pattern, and are much more common in some countries than others. Their aetiology may differ in different societies. They are less common in Oriental, Asian, and African populations, and much more common in European populations, particularly those of British origin [6]. The centres participating were in Britain, Australia, and Canada, which have substantial British populations, and in Hungary and Israel, which have lower population risks of neural tube defects. It seems unlikely that there were many mothers of Asian or African origin in this study, and there could be differences in the aetiology of these defects in such women.

Therefore the results are likely to apply, on a worldwide basis, to mothers of European origin who have had a previous affected pregnancy. Direct information for non-European ethnic groups would be useful, but in its absence it is reasonable to generalize the results to such groups in addition. This is particularly so as the application of these results may have a high benefit-to-cost ratio. The results suggest that a dietary supplement, which does not appear to have any major toxic consequences and is readily available and cheap, can produce a substantial reduction in risk for high-risk mothers, but antenatal diagnosis where culturally and legally acceptable would still be offered as a further protection. It seems very reasonable to apply these results to high-risk mothers, on a worldwide basis.

The more challenging question is whether these results show that supplementation with folic acid will reduce the frequency of neural tube defects occurring for the first time in births to mothers without any previous history. A reasonable case is made in this paper that the results would apply. It is generally thought that neural tube defects have an environmental and a genetic component in their causation. We can consider these two factors as joint causes, under the simple models described in Chapter 3. If their joint effect is multiplicative, the relative risk for the environmental cause will be the same in recurrences as in first affected births, and the absolute effect (risk difference) will be greater in recurrences. If the joint effect is additive, the absolute effect will be the same, and the relative risk for the environmental cause will be less in recurrences than in first affected births. Therefore on these models, the result from this study is either a valid estimate or an underestimate of the relative risk from folate supplementation for general population low-risk pregnancies. The situation in which it could be an overestimate is that of synergy, where the benefit of folate supplementation is particularly great in, or even confined to, mothers with a genetic factor which also increases their overall risk. That would apply, for example, if high-risk mothers had a genetic defect hampering their folate metabolism, such that the supplement is beneficial even with normal dietary intakes.
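The contrast between the multiplicative and additive models can be illustrated with a short numerical sketch. All the values below are hypothetical and chosen only to show the pattern; they are not taken from the trial. A genetic factor (G) stands for the high-risk mothers in whom recurrences occur, and an environmental factor (E) stands for an exposure such as low folate intake.

```python
# Illustrative sketch: joint effects of a genetic factor (G) and an
# environmental factor (E) under two simple models.
# Every number here is a hypothetical assumption, not trial data.

BASE_RISK = 0.001  # assumed risk with neither G nor E


def risk_multiplicative(has_g, has_e, rr_g=20.0, rr_e=3.0):
    """Risk when the relative risks of G and E multiply."""
    risk = BASE_RISK
    if has_g:
        risk *= rr_g
    if has_e:
        risk *= rr_e
    return risk


def risk_additive(has_g, has_e, excess_g=0.019, excess_e=0.002):
    """Risk when the excess risks of G and E add."""
    risk = BASE_RISK
    if has_g:
        risk += excess_g
    if has_e:
        risk += excess_e
    return risk


def effect_of_e(risk_fn, has_g):
    """Return (relative risk, risk difference) for E within one stratum of G."""
    exposed = risk_fn(has_g, True)
    unexposed = risk_fn(has_g, False)
    return exposed / unexposed, exposed - unexposed


for label, risk_fn in (("multiplicative", risk_multiplicative),
                       ("additive", risk_additive)):
    for has_g in (False, True):
        rr, rd = effect_of_e(risk_fn, has_g)
        stratum = "recurrences (G present)" if has_g else "first occurrences (G absent)"
        print(f"{label:>14} | {stratum:<29} | RR = {rr:.2f} | RD = {rd:.4f}")
```

Under the multiplicative model the relative risk for E is the same in both strata while the risk difference is larger where G is present; under the additive model the risk difference is the same and the relative risk is smaller where G is present. A synergy model, in which E has an effect only when G is present, is not covered by this sketch.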

However, the application of the result to the prevention of first occurrences of neural tube defects involves more than accepting the argument of causality. Even if the causal relationship applies, with the same relative risk, the balance of potential benefit to potential harm is dramatically different. These defects (p.401) occur very early in embryological development, before the pregnancy is recognized, and so folic acid supplementation would have to be offered on a population basis to all women who could become pregnant, raising a unique public health challenge. While the risk without supplementation for women who have had a previous abnormal baby is around 2–5 per 100, the corresponding risk in mothers without such a history is around 1–2 per 1000. Therefore the number of women who would need supplementation to prevent one defect (the number needed to treat (NNT)) would rise from around 50 to 5000. Issues of cost, any possibility of side effects, and the social and ethical issues related to mass medication all become relevant.
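The way the number needed to treat scales with baseline risk can be sketched as follows. The relative risk of 0.3 is an assumed illustrative value, not a figure stated in this section, and the function name is hypothetical. The calculation is per pregnancy and ignores the fact that not every woman offered supplementation will conceive, so it shows the inverse relationship between baseline risk and NNT rather than reproducing the rounded figures in the text.

```python
# Illustrative sketch: NNT = 1 / absolute risk reduction, where
# absolute risk reduction = baseline risk * (1 - relative risk).

ASSUMED_RELATIVE_RISK = 0.3  # assumption for illustration only


def number_needed_to_treat(baseline_risk, relative_risk=ASSUMED_RELATIVE_RISK):
    """Number of pregnancies needing supplementation to prevent one defect."""
    if not 0 < baseline_risk <= 1:
        raise ValueError("baseline_risk must be in (0, 1]")
    if not 0 <= relative_risk < 1:
        raise ValueError("relative_risk must be in [0, 1) for a protective effect")
    absolute_risk_reduction = baseline_risk * (1 - relative_risk)
    return 1 / absolute_risk_reduction


# Baseline risks quoted in the text: 2-5 per 100 after a previous
# affected pregnancy, 1-2 per 1000 with no such history.
scenarios = {
    "recurrence, 5 per 100": 0.05,
    "recurrence, 2 per 100": 0.02,
    "first occurrence, 2 per 1000": 0.002,
    "first occurrence, 1 per 1000": 0.001,
}

for label, risk in scenarios.items():
    print(f"{label:<30} NNT = {number_needed_to_treat(risk):.0f}")
```

Whatever relative risk is assumed, a 20-fold fall in baseline risk produces a 20-fold rise in the NNT, which is why the balance of benefit to harm changes so sharply between high-risk and general population pregnancies.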

E. Comparison of these results with other evidence

17. Are the results consistent with other evidence, particularly evidence from studies of similar or more powerful study design?

This is the most powerful study design applied to this problem. The results are generally consistent with earlier studies using weaker designs. In the discussion the authors review the previous small randomized trial and the large non-randomized intervention study, which both showed a benefit, and also mention unpublished interim results of a further trial which will be discussed below. They also note six observational studies using cohort or case–control designs, of which all but one showed an association consistent with protection from increased folic acid consumption. However, the consistency question is not particularly relevant, as this study has the strongest design and is much more rigorous than any of the previous work. Therefore if the results of this trial had shown no benefit, the previous results from non-randomized trials or observational studies would not have offset that null result.

18. Does the total evidence suggest any specificity?

The specificity of the result with regard to folic acid alone is important. In the previous work, the large non-randomized study used several vitamins including folic acid. The observational studies are understandably very weak in separating the effects of different vitamins. Some are concerned with total dietary intakes, and women who consume high folate levels also tend to have high intakes of a wide range of other vitamins; others assess supplementation, which is rarely specific to particular preparations, or used and recalled consistently enough to assist. The previous randomized study used folic acid alone and suggested a benefit, but was too small to be conclusive. Therefore the factorial design of the trial, designed to evaluate this specificity, is very important.

(p.402) 19. Are the results plausible in terms of the biological mechanism?

There is no discussion of biological plausibility or the mechanisms of the effect in this paper. These defects had been produced in experimental animals by folic acid deficiency and by folate antagonists, but as the defects can be produced in animals by a wide range of dietary changes or external agents, these results did not appear particularly striking or important. When this trial was published in 1991 there was no accepted mechanism by which folic acid would protect against these defects. It seemed surprising to many that supplementation with a relatively low dose of a common vitamin could produce a dramatic reduction in the frequency of a common and severe congenital defect, whose occurrence had been studied intensively for several decades. However, this trial is a good example of an empirical advance preceding knowledge of mechanisms. These results stimulated a search for the mechanism by which the protective effect is produced, and considerable progress has been made, which will be reviewed briefly below.

20. If a major effect is shown, is it coherent with the distribution of exposure and the outcome?

The results show a major effect, and therefore coherence can be anticipated. However, coherence as it applies to the factors affecting the recurrence risk of these defects is difficult to establish, as there is little information available on factors, apart from the type of family history, that could relate to the degree of maternal folic acid deficiency as well as to any postulated genetic mechanism. But if, as the authors claim in the discussion, the results also apply to the primary prevention of defects, then the descriptive epidemiology of neural tube defects should fit with this new result. The epidemiology of these defects is complex, showing great variation in geography, time, season, maternal social circumstances, and so on, and the major limitation in judging coherence is that the variability in folic acid consumption is much less well documented. Folic acid is not a particularly easy dietary constituent to measure, and biologically available folic acid may differ considerably from the folate content of foods. There is little information on how the folic acid consumption or tissue levels in pregnant women vary by place, time, or personal or social factors. The demonstration that women in the lower social classes in Britain had low folate intakes was one of the factors leading to the use of supplementation, but this association was very non-specific, as such women have lower intakes of many nutrients. The authors note two issues of coherence. (p.403) The UK has one of the highest rates of neural tube defects in the world, and yet is not known to be particularly deficient in folic acid; the authors suggest that the UK population may indeed have low folic acid intakes, or that the types of folic acids in foods eaten there may be relatively poorly absorbed. 
They also suggest there could be an interaction with a genetically controlled disturbance of folic acid metabolism, a hypothesis which could of course explain much of the other variability of neural tube defect occurrence. They also note that studies of serum or red cell folate levels in women with affected and unaffected pregnancies have not in general shown substantial differences, and suggest that this may be because the range of values within a population is too narrow to demonstrate such differences easily.

Summary of external validity

In summary, the primary result that folic acid reduces the recurrence rates of these defects can be applied on a worldwide basis. The size of the effect may vary, as the contribution of folic acid deficiency to aetiology may vary in different communities, because of differences in both the defect and the background level of folic acid intake.

The main issue in generalizability is whether these results on recurrences can be applied to first occurrences of these defects. A reasonable case on both biological and epidemiological grounds is made by the authors, but more direct evidence would be useful. The results are consistent with most observational studies of first occurrences of the defect. At the time of publication, the mechanism for the effect was basically unknown.


This large individually randomized trial has produced a result which has extremely high internal validity, with observation bias and confounding being ruled out as alternative explanations (Ex. 11.5). The result for folic acid is very unlikely to be due to chance, although the precision of the result is still modest, as shown by the fairly wide confidence limits. The association has an appropriate time relationship, is strong, and is consistent by the type of defect and geographical area, and the benefit appears to be specific to folic acid rather than to other multivitamins. Generalization of the result to other high-risk women for the prevention of recurrence is reasonable. Generalization of the result to the situation of the first occurrence of these defects raises further questions, but is certainly strongly suggested by this study. (p.404)

Ex. 11.5. Summary of assessment of the randomized trial of prevention of recurrence of neural tube defects (MRC Vitamin Study Research Group [1])

(p.405) Progress subsequent to this study

This is the strongest study of the six discussed in this section of the book. The study answered an important question and gave a definite result, which was accepted worldwide as relevant and important. This acceptance comes from the strengths of the study: a large international randomized double-blind trial of a feasible intervention.

The publication of this study in July 1991 led to rapid changes in clinical and public health practice. The primary result, that supplementation with 4 mg of folic acid per day would reduce the recurrence rate of these defects, was adopted quickly by health care authorities and professional associations worldwide as standard clinical practice for high-risk women; for example, within a month, recommendations for high-risk women were issued by the US Centers for Disease Control (CDC) and by the Chief Medical Officer in England. The rapid acceptance of these results contrasts with the reception of the non-randomized trial by Smithells et al. in 1981 [2], although that study showed an even stronger protective effect with a relative risk of 0.15. Although that trial undoubtedly led to some clinical adoption of supplementation for high-risk women, most health authorities and professional associations did not advocate a clear policy because of the limitations of the study discussed in this chapter. The non-randomized study used a lower dose (0.36 mg folic acid per day) combined with other vitamins, and so the question of whether a lower-dose preparation would be effective was raised, but in the absence of evidence for toxicity, there was little enthusiasm to develop a trial on high-risk women to answer this question.

The more challenging issue was what action to take with regard to first occurrences of these defects. Could the results of this randomized trial, on recurrences of the defect, be applied to first occurrences? If not, another even larger trial would be needed to answer that question. Such a trial was being designed in China, and had reached the pilot study phase, but it was abandoned once the MRC trial results were known [7]. Observational studies continued in China [8]. The situation was considerably helped by the publication in 1992 of a study of primary prevention in Hungary, which used a randomized design with a multivitamin supplement including folic acid at a dose of 0.8 mg. This demonstrated a protective effect with regard to first occurrences [9,10]. This supportive result made subsequent public health decisions considerably easier. Before the end of 1992, both the CDC and the Chief Medical Officer in England issued recommendations that all women capable of becoming pregnant should take a supplement of 0.4 mg folic acid per day, in addition to ensuring a good diet. This lower dose was chosen as the Hungarian randomized (p.406) study used 0.8 mg, the Smithells non-randomized study used 0.36 mg, and various observational studies showed benefits at similar low doses, and toxicity of the 4 mg dose recommended for high-risk women could not be excluded. Similar recommendations were made in many other countries.

Therefore the fact that folate supplementation reduces the risk of these defects was rapidly accepted. Discussion then concentrated on how best to apply this new knowledge, with the main challenge being that advice given once a woman knows she is pregnant is too late. The main approaches have been health education messages to encourage women in the reproductive age groups to take folic acid in anticipation of a possible pregnancy, and increasing the folic acid content of foodstuffs, such as flour and cereals, by permissive or compulsory legislative moves, to increase the total folic acid consumption on a community basis [11,12]. For example, Canada and the USA introduced requirements to fortify flour and grain products in 1996 and 1998, respectively, giving about 70 μg extra folate per day, while Australia and the UK allow fortification but do not require it [13,14]. However, a review of trends in the occurrence of neural tube defects up to 1998 in 10 countries found no appreciable effect of recommendations made since 1992 [15]. In 2003, it was estimated that 40 countries had preventive programmes, but that fewer than 10 per cent of the 240 000 cases per year of preventable anencephalus and spina bifida were being prevented [16].

This trial also raises several general issues about the conduct of research. As pointed out in an editorial accompanying the trial result [17], Smithells and his colleagues wished to carry out a randomized trial in the 1970s, but permission to do so was refused by an ethics committee which insisted that all women at increased risk should be offered the supplement. This led to the non-randomized trial, giving results which, while strong and statistically significant, were not regarded as definitive by most authorities. Therefore this ethics committee decision delayed the development of a fully randomized trial by some 10 years, and the considerable debate about it delayed recruitment into the MRC trial, so that it took 8 years to accumulate enough participating women to produce the results discussed above. This highlights the need for ethics committees to assess the scientific and ethical consequences of their decisions. On the other hand, the results of the non-randomized trial were correct; they demonstrated a large benefit for vitamin supplementation. Should this evidence from the non-randomized trial, in addition to the considerable number of observational studies, have been accepted as showing the preventive action of vitamin supplementation? By delaying public health policy until the results of the randomized trial came in 10 years later, many preventable defects must have occurred. This situation can be contrasted with the situations where randomized trials have produced results (p.407) contradicting the conclusions from earlier studies, such as with beta-carotene, discussed in Chapter 6.

This trial also stimulated a great deal more work on the biochemistry of folic acid and the mechanisms of this preventive action. Much of this concentrated on the metabolic processes relating to folic acid, vitamin B12, and DNA production. Specific genetic defects have been identified, such as a defect in the gene for 5,10-methylenetetrahydrofolate reductase [18], which may make some women less able to utilize folic acid, so that they require a higher dietary intake for normal DNA synthesis [19,20]. Other research has shown that women with pregnancies with neural tube defects have autoantibodies to folate receptors [21].

Research has also been stimulated on whether folic acid supplementation can prevent other congenital abnormalities, such as cleft lip and palate [22], and on the role of folic acid in chronic diseases such as heart disease and colon cancers. The evidence for the routine use of vitamin supplementation for such prevention is still uncertain [23,24]. However, Professor Wald, who led the MRC trial described in this chapter, is one of those proposing that the routine use of folic acid (0.8 mg daily), with a statin to lower cholesterol, drugs to reduce blood pressure, and aspirin, in a combined ‘Polypill’ for everyone over age 55 could prevent 80 per cent of cardiovascular disease, with adverse symptoms in 8–15 per cent [25,26].


Bibliography references:

1. MRC (Medical Research Council) Vitamin Study Research Group. Prevention of neural tube defects: results of the Medical Research Council Vitamin Study. Lancet 1991; 338: 131–137.

2. Smithells RW, Sheppard S, Schorah CJ, et al. Apparent prevention of neural tube defects by periconceptional vitamin supplementation. Arch Dis Child 1981; 56: 911–918.

3. Laurence KM, James N, Miller MH, Tennant GB, Campbell H. Double-blind randomised controlled trial of folate treatment before conception to prevent recurrence of neural-tube defects. BMJ 1981; 282: 1509–1511.

4. Armitage P. Sequential Medical Trials (2nd edn). Oxford: Blackwell Scientific, 1975.

5. MacCarthy PA, Dalrymple IJ, Duignan NM, et al. Recurrence rates of neural tube defects in Dublin maternity hospitals. Ir Med J 1983; 76: 78–79.

6. Elwood JM, Little J, Elwood JH. Epidemiology and Control of Neural Tube Defects. Oxford: Oxford University Press, 1992.

7. Oakley GP, Jr., Erickson JD, James LM, Mulinare J, Cordero JF. Prevention of folic acid-preventable spina bifida and anencephaly. Ciba Found Symp 1994; 181: 212–231.

8. Berry RJ, Li Z, Erickson JD, et al. Prevention of neural-tube defects with folic acid in China. China–U.S. Collaborative Project for Neural Tube Defect Prevention. N Engl J Med 1999; 341: 1485–1490.

9. Czeizel AE, Dudas I. Prevention of the first occurrence of neural tube defects by periconceptional vitamin supplementation. N Engl J Med 1992; 327: 1832–1835.

(p.408) 10. Czeizel AE, Dudas I, Metneki J. Pregnancy outcomes in a randomised controlled trial of periconceptional multivitamin supplementation. Final report. Arch Gynecol Obstet 1994; 255: 131–139.

11. Bentley JR, Ferrini RL, Hill LL. American College of Preventive Medicine Public Policy Statement. Folic acid fortification of grain products in the U.S. to prevent neural tube defects. Am J Prev Med 1999; 16: 264–267.

12. Wald NJ, Bower C. Folic acid and the prevention of neural tube defects. BMJ 1995; 310: 1019–1020.

13. Wald NJ. Folic acid and the prevention of neural-tube defects. N Engl J Med 2004; 350: 101–103.

14. Bower C, de Klerk N, Milne E, et al. Plenty of evidence on mandatory folate fortification. Aust N Z J Public Health 2006; 30: 81–82.

15. Botto LD, Lisi A, Robert-Gnansia E, et al. International retrospective cohort study of neural tube defects in relation to folic acid recommendations: are the recommendations working? BMJ 2005; 330: 571.

16. Oakley GP Jr, Bell KN, Weber MB. Recommendations for accelerating global action to prevent folic acid-preventable birth defects and other folate-deficiency diseases: meeting of experts on preventing folic acid-preventable neural tube defects. Birth Defects Res A Clin Mol Teratol 2004; 70: 835–837.

17. Anonymous. Folic acid and neural tube defects. Lancet 1991; 338: 153–154.

18. Whitehead AJ, Gallagher P, Mills JL, et al. A genetic defect in 5,10 methylenetetrahydrofolate reductase in neural tube defects. Q J Med 1995; 88: 763–766.

19. Lucock M. Folic acid: nutritional biochemistry, molecular biology, and role in disease processes. Mol Genet Metab 2000; 71: 121–138.

20. Mitchell LE, Adzick NS, Melchionne J, Pasquariello PS, Sutton LN, Whitehead AS. Spina bifida. Lancet 2004; 364: 1885–1895.

21. Rothenberg SP, da Costa MP, Sequeira JM, et al. Autoantibodies against folate receptors in women with a pregnancy complicated by a neural-tube defect. N Engl J Med 2004; 350: 134–142.

22. Botto LD, Olney RS, Erickson JD. Vitamin supplements and the risk for congenital anomalies other than neural tube defects. Am J Med Genet C Semin Med Genet 2004; 125: 12–21.

23. US Preventive Services Task Force. Routine vitamin supplementation to prevent cancer and cardiovascular disease. Nutr Clin Care 2003; 6: 102–107.

24. Strohle A, Wolters M, Hahn A. Folic acid and colorectal cancer prevention: molecular mechanisms and epidemiological evidence (Review). Int J Oncol 2005; 26: 1449–1464.

25. Wald NJ, Law MR. A strategy to reduce cardiovascular disease by more than 80%. BMJ 2003; 326: 1419.

26. Anonymous. Combination pharmacotherapy for cardiovascular disease. Ann Intern Med 2005; 143: 593–599.