The Cement of the Universe: A Study of Causation

J. L. Mackie

Print publication date: 1980

Print ISBN-13: 9780198246428

Published to Oxford Scholarship Online: November 2003

DOI: 10.1093/0198246420.001.0001

Appendix Eliminative Methods of Induction

(Mill's Methods of Induction)

Source:
The Cement of the Universe
Author(s):

J. L. Mackie

Publisher:
Oxford University Press
DOI:10.1093/0198246420.005.0001

Abstract and Keywords

In this section, Mackie provides an outline of Mill's methods of induction. He sets out a formal account of Mill's methods, employing a variation of Aristotelian logic. The methods are those of agreement and difference (both in their simple and complex variants), the method of residues, and the method of concomitant variation. Some criticisms concerning the possibility of employing these methods are raised and answered.

Keywords:   agreement, concomitant variation, disagreement, induction, logic, Mill

1. Introduction

John Stuart Mill, in his System of Logic 1 set forth and discussed five methods of experimental inquiry, calling them the method of agreement, the method of difference, the joint method of agreement and difference, the method of residues, and the method of concomitant variation. He maintained that these are the methods by which we both discover and demonstrate causal relationships, and that they are of fundamental importance in scientific investigation. In calling them eliminative methods Mill drew a rather forced analogy with the elimination of terms in an algebraic equation. But we can use this name in a different sense: all these methods work by eliminating rival candidates for the role of cause.

The inductive character of these methods may well be questioned. W. E. Johnson called them demonstrative methods of induction,2 and they can be set out as valid forms of deductive argument: they involve no characteristically inductive steps, and no principles of confirmation or corroboration are relevant to them. But in each case the conclusion is a generalization wider than the observation which helps to establish it; these are examples of ampliative induction.

The general nature of these methods may be illustrated by examples of the two simplest, those of agreement and of difference.

Mill's canon for the method of agreement runs: ‘If two or more instances of the phenomenon under investigation have only one circumstance in common, the circumstance in which alone all the instances agree is the cause (or effect) of the given phenomenon.’ For example, if a number of people who are suffering from a certain disease have all gone for a considerable time without fresh fruit or vegetables, but have in other respects had quite different diets, have lived in different conditions, have different hereditary backgrounds, and so on, so that the lack of fresh fruit and vegetables is the only feature common to all of them, then we can conclude that the lack of fresh fruit and vegetables is the cause of this particular disease.

Mill's canon for the method of difference runs: ‘If an instance in which the phenomenon under investigation occurs, and an instance in which it does not occur, have every circumstance in common save one, that one occurring in the former; the circumstance in which alone the two instances differ, is the effect, or the cause, or an indispensable part of the cause, of the phenomenon.’ For example, if two exactly similar pieces of iron are heated in a charcoal‐burning furnace and hammered into shape in exactly similar ways, except that the first is dipped into water after the final heating while the second is not, and the first is found to be harder than the second, then the dipping into water while it is hot is the cause of such extra hardness—or at least an essential part of the cause, for the hammering, the charcoal fire, and so on may also be needed. For all this experiment shows, merely dipping iron while hot into water might not increase its hardness.

The method of agreement, then, picks out as the cause the one common feature in a number of otherwise different cases where the effect occurs; the method of difference picks out as the cause the one respect in which a case where the effect occurs differs from an otherwise exactly similar case where the effect does not occur. But in both the conclusion is intended to say more than that this was the cause of that effect in this instance (or this group of instances). The conclusion in our first example is that this particular disease is always produced by a lack of fresh fruit and vegetables, and in our second example that dipping iron which has been heated and hammered in a particular way into water while it is hot always hardens it.

There are many weaknesses in Mill's description of these methods, but there is no need to criticize his account in detail. The interesting questions are whether there are any valid demonstrative methods of this sort, and if so whether any of them, or any approximations to any of them, have a place in either scientific or commonsense inquiry. Several reconstructions of the methods have been offered; the most thorough treatment I know of is that of von Wright, but I find it somewhat unclear.3 I shall, therefore, attempt another reconstruction.

In giving a formal account of the reasoning involved in these methods, I shall use an old‐fashioned sort of logic, traditional (roughly Aristotelian) logic but with complex (conjunctive and disjunctive) terms. The letters ‘A’, ‘B’, and so on stand for kinds of event or situation, or, what comes to the same thing, for features the possession of which makes an event or situation one of this or that kind. These conditions (Mill's ‘circumstances’) can therefore be present or absent on particular occasions (in particular instances). To say that all A are B would be to say that whenever feature A is present, so is feature B. To say that A is necessary for B is to say that whenever B is present, A is present, and to say that A is sufficient for B is to say that whenever A is present, B is present also. A conjunctive feature AB is present whenever and only when both A and B are present. A disjunctive feature (A or B) is present whenever at least one of A and B is present. A negative feature not‐A (written Ā) is present whenever and only when A is absent. The following argument forms are obviously valid:

  (i) All A are C. Therefore, all AB are C.
  (ii) All A are BC. Therefore, all A are B (and all A are C).
  (iii) All (A or B) are C. Therefore, all A are C (and all B are C).
  (iv) All A are B. Therefore, all A are (B or C).
  (v) All A are C and all B are C. Therefore, all (A or B) are C.
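As an illustration that is not part of the original text, the five forms can be checked mechanically by treating each feature as the set of occasions on which it is present, so that ‘All X are Y’ becomes a subset relation, conjunction becomes intersection, and disjunction becomes union. The function names `all_are` and `forms_hold` and the four-occasion universe are assumptions made for this sketch; an exhaustive check over a small universe is a sanity check, not a proof.

```python
from itertools import combinations, product

UNIVERSE = range(4)  # four hypothetical occasions


def subsets(universe):
    """Every set of occasions on which a feature might be present."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def all_are(x, y):
    """'All X are Y': whenever X is present, Y is present."""
    return x <= y


def forms_hold(a, b, c):
    """Check forms (i) to (v) for one choice of occasion sets for A, B, C."""
    return all([
        # (i) All A are C, therefore all AB are C
        (not all_are(a, c)) or all_are(a & b, c),
        # (ii) All A are BC, therefore all A are B and all A are C
        (not all_are(a, b & c)) or (all_are(a, b) and all_are(a, c)),
        # (iii) All (A or B) are C, therefore all A are C and all B are C
        (not all_are(a | b, c)) or (all_are(a, c) and all_are(b, c)),
        # (iv) All A are B, therefore all A are (B or C)
        (not all_are(a, b)) or all_are(a, b | c),
        # (v) All A are C and all B are C, therefore all (A or B) are C
        (not (all_are(a, c) and all_are(b, c))) or all_are(a | b, c),
    ])


ALL_VALID = all(forms_hold(a, b, c)
                for a, b, c in product(subsets(UNIVERSE), repeat=3))
print(ALL_VALID)  # True
```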

In expounding the reasoning implicit in these methods, I shall also use the letters ‘X’, ‘Y’, ‘Z’ as variables whose instances are the features or kinds of occurrence represented by ‘A’, ‘B’, and so on, and I shall take the liberty of quantifying with respect to them; for example, to say that for some X, X is necessary for B will be to say that some feature occurs whenever B occurs; this statement would be true if, say, all B are C. There is nothing unsound in these procedures, and I hope nothing obscure. The slight unfamiliarity of these techniques is, I believe, more than compensated for by the fact that for this particular task they are more economical than the obvious alternatives.4

To avoid unnecessary complications, let us assume that the conclusion reached by any application of one of these methods (other than that of concomitant variation which we shall leave aside for the present) is to have the form ‘Such‐and‐such is a cause of such‐and‐such a kind of event or phenomenon’, where a ‘cause’ will, in general, be both necessary and sufficient for the phenomenon, though in some variants of the methods it will be taken as necessary only, or as sufficient only. Let us assume also that we can distinguish in some way causes from effects, and think of the methods only as identifying causes. (Mill, as the canons quoted show, mixes this task up with that of identifying effects, and so far as their logical form is concerned the methods could be applied to any problem about conditions that are necessary for something, or sufficient, or both, but their most interesting applications, and the ones for which the required assumptions are most plausible, are to the identification of causes.) However, the cause will be such, and will be necessary or sufficient or both, in relation to some field, that is, some set of background conditions: our question is, say, ‘What causes this disease in human beings living in ordinary conditions, breathing air, and so on?’ or, ‘What causes the greater‐than‐ordinary hardness in iron in ordinary circumstances and at ordinary temperatures?’ To say that A is necessary for B in the field F will be to say that whenever B occurs in (or in relation to) something that satisfies the conditions summed up as F, A occurs there also (which will be expressed accurately enough by ‘All FB are A’), and so on. I shall use ‘P’ to represent the ‘phenomenon’ whose ‘cause’ is being sought, and I shall call an occasion on which P is present a positive instance, and one on which it is absent a negative instance. The observation that supports the conclusion will be an observation of the presence or absence of various conditions, each of which might be causally relevant, in one or more (positive or negative) instances.

But since, as I said, the conclusion regularly goes beyond the observation, and yet each method is supposed to be demonstrative, that is, deductively valid, the conclusion must be drawn not from the observation alone, but from it in conjunction with an assumption. This pattern, assumption and observation together entailing a conclusion, is characteristic of all these methods. And there are obvious proportional relations between the three items. The less rigorous the assumption, the stronger the observation needs to be if we are to get the same, or perhaps even any, conclusion. With the same observation, a less rigorous assumption will yield a weaker conclusion, if any. And so on.

But what sort of assumption is required, and what is it for it to be more or less rigorous? Since we are to arrive at the conclusion that a certain condition is, in the sense indicated above, a cause of the phenomenon, and to do so by eliminating rivals, we must assume at the start that there is some condition which, in relation to the field, is necessary and sufficient (or which is necessary, or which is sufficient) for this phenomenon, and that it is to be found somewhere within a range of conditions that is restricted in some way.

For a formal exposition, it is easiest to take the assumption as indicating some set (not necessarily finite) of possibly relevant causal features (Mill's ‘circumstances’ or ‘antecedents’). Initially I shall speak in terms of a list of such possible causes (p‐cs), but, as I shall show, we can in the end dispense with any such list. A p‐c, since it is possibly causally relevant in relation to the field in question, must—like the phenomenon itself—be something that is sometimes present and sometimes absent within that field: it must not be one of the conditions that together constitute the field.

But are we to assume that a p‐c acts singly, if it acts at all? If the p‐cs are A, B, C, etc., are we to assume that the cause of P in F will be either A by itself or B by itself, and so on? Or are we to allow that it might be a conjunction, say AC, so that P occurs in F when and only when both A and C are present? Are we to allow that the actual (necessary and sufficient) cause might be a disjunction, say (B or D), so that P occurs in F whenever B occurs, and whenever D occurs, but only when at least one of these occurs? Are we to allow that our p‐cs may include counteracting causes, so that the actual cause of P in F may be, say, the absence of C (that is, not‐C, or C̄), or perhaps BC̄, so that P occurs in F when and only when B is present and C is absent at the same time?

There are in fact valid methods with assumptions of different kinds, from the most rigorous, which requires that the actual cause should be just one of the p‐cs by itself, through those which progressively admit negations, conjunctions, and disjunctions of p‐cs and combinations of these, to the least rigorous, which says merely that the actual cause is built up out of some of the p‐cs in some way. There are in fact eight possible kinds of assumption, namely that the actual cause is:

  1. one of the p‐cs.
  2. either one of the p‐cs or a negation of one.
  3. either a p‐c or a conjunction of p‐cs.
  4. either a p‐c or a disjunction of p‐cs.
  5. a p‐c, or the negation of a p‐c, or a conjunction each of whose members is a p‐c or the negation of a p‐c.
  6. a p‐c, or the negation of a p‐c, or a disjunction each of whose members is a p‐c or the negation of a p‐c.
  7. a p‐c, or a conjunction of p‐cs, or a disjunction each of whose members is a p‐c or a conjunction of p‐cs.
  8. a p‐c; or the negation of a p‐c; or a conjunction each of whose members is a p‐c or the negation of one; or a disjunction each of whose members is a p‐c, or the negation of one, or a conjunction each of whose members is a p‐c or the negation of one.

Analogy with the use of disjunctive normal form in the propositional calculus makes it easy to show that any condition made up in any way by negation, conjunction, and disjunction from a set of p‐cs will be equivalent to some condition allowed by this eighth kind of assumption, which is therefore the least rigorous kind of assumption possible. The form of the observation determines whether a method is a variant of the method of agreement, or difference, and so on. But since each form of observation may be combined with various kinds of assumption, there will be not just one method of agreement, but a series of variants using assumptions of different kinds, and similarly a series of variants of the method of difference, and so on. To classify all these variants we may use a decimal numbering, letting the figure before the decimal point (from 1 to 8) indicate the kind of assumption, and the first figure after the decimal point the form of observation, thus:

  1. a variant of the method of agreement.
  2. a variant of the method of difference.
  3. a variant of the joint method (interpreted as an ‘indirect method of difference’).
  4. a new but related method.

Further figures in the second place after the decimal point will be used for further subdivisions.

A complete survey would take up too much space, but some of the main possibilities will be mentioned.

2. Agreement and Difference—Simple Variants

Positive Method of Agreement

Let us begin with an assumption of the first kind, that there is some necessary and sufficient condition Z for P in F—that is, for some Z all FP are Z and all FZ are P—and Z is identical with one of the p‐cs A, B, C, D, E. We obtain a variant of the method of agreement (1.12) by combining this assumption with the following observation: a set of one or more positive instances such that one p‐c, say A, is present in each, but for every other p‐c there is an instance from which that p‐c is absent. This yields the conclusion that A is necessary and sufficient for P in F.

For example, the observation might be:

          A   B   C   D   E
    I 1   p   a   p   ·   a
    I 2   p   p   a   a   ·

where ‘p’ indicates that the p‐c is present, ‘a’ that it is absent, and a dot that it may be either present or absent without affecting the argument. I 1 and I 2 are positive instances: I 1 shows that neither B nor E is necessary for P in F, and hence that neither can be the Z that satisfies the assumption; similarly, I 2 shows that neither C nor D can be this Z. Only A can be, and therefore must be, this Z; that is, A is both necessary and sufficient for P in F.
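The elimination step of the positive method of agreement can be sketched in Python, as an illustration only. The function name `positive_agreement` and the encoding of instances as dictionaries (with `None` standing for a dot, that is, a value that does not matter) are assumptions of this sketch, not notation from the text.

```python
PCS = ["A", "B", "C", "D", "E"]

# The two positive instances of the example: True = present (p),
# False = absent (a), None = either (the dot in the table).
I1 = {"A": True, "B": False, "C": True, "D": None, "E": False}
I2 = {"A": True, "B": True, "C": False, "D": False, "E": None}


def positive_agreement(instances, pcs):
    """Return the p-cs not shown to be unnecessary.

    A p-c is eliminated as soon as it is known to be absent from
    some positive instance, since P occurred there without it.
    """
    return [pc for pc in pcs
            if not any(inst[pc] is False for inst in instances)]


print(positive_agreement([I1, I2], PCS))  # ['A']
```

Given the assumption that exactly one of the listed p‐cs is the required condition, a single survivor is the conclusion of the method; with more than one survivor the observation is simply too weak.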

Since this reasoning eliminates candidates solely on the ground that the observation shows them not to be necessary, there is another variant (1.11) which assumes only that there is some necessary condition Z for P in F and (with the same observation) concludes only that A is necessary for P in F.

Negative Method of Agreement

Besides this positive method of agreement in which candidates are eliminated as not being necessary because they are absent from positive instances, there are corresponding variants of a negative method of agreement in which they are eliminated as not sufficient because they are present in negative instances. The required observation consists of one or more negative instances such that one p‐c, say A, is absent from each instance, but for every other p‐c there is an instance in which it is present. For example:

          A   B   C   D   E
    N 1   a   p   ·   ·   ·
    N 2   a   ·   p   p   ·
    N 3   a   ·   ·   ·   p

If the assumption was that one of the p‐cs is sufficient for P in F, this observation would show (1.13) that A is sufficient, while if the assumption was that one of the p‐cs is both necessary and sufficient, the same observation yields the conclusion (1.14) that A is both necessary and sufficient.
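The negative method mirrors the positive one, and a minimal sketch under the same assumed encoding (dictionaries with `None` for a dot; the name `negative_agreement` is hypothetical) looks like this:

```python
PCS = ["A", "B", "C", "D", "E"]

# The three negative instances of the example: True = present (p),
# False = absent (a), None = either (the dot in the table).
N1 = {"A": False, "B": True, "C": None, "D": None, "E": None}
N2 = {"A": False, "B": None, "C": True, "D": True, "E": None}
N3 = {"A": False, "B": None, "C": None, "D": None, "E": True}


def negative_agreement(instances, pcs):
    """Return the p-cs not shown to be insufficient.

    A p-c is eliminated as soon as it is known to be present in
    some negative instance, since it occurred there without P.
    """
    return [pc for pc in pcs
            if not any(inst[pc] is True for inst in instances)]


print(negative_agreement([N1, N2, N3], PCS))  # ['A']
```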

Method of Difference

The simplest variant of this method (1.2) combines the assumption that one of the p‐cs is both necessary and sufficient for P in F with this observation: a positive instance I 1 and a negative instance N 1 such that of the p‐cs present in I 1 one, say A, is absent from N 1, but the rest are present in N 1. For example:

          A   B   C   D   E
    I 1   p   p   p   a   ·
    N 1   a   p   p   ·   p

Here D is eliminated because it is absent from I 1 and hence not necessary, and B, C, and E because they are present in N 1 and hence not sufficient. Only A therefore can be, and so must be, the Z that is both necessary and sufficient for P in F. Note that with an assumption of this first kind it would not matter if, say, E were absent from I 1 and/or D were present in N 1: the presence of the actual cause A in I 1 but not in N 1 need not be the only relevant difference between the instances. But this would matter if we went on to an assumption of the second kind. We may also remark that the method of difference, unlike some variants of the method of agreement, requires the assumption that there is some condition that is both necessary and sufficient for P in F. As we shall see with variants 4.2 and 8.2, the conclusion may not fully specify the resulting necessary and sufficient condition, and the factor picked out as (in another sense) the cause is guaranteed only to be an inus condition or better; but the assumption needed is that something is both necessary and sufficient.
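The two grounds of elimination used in variant 1.2 can be written out as a short Python sketch. The function name `difference_method` and the dictionary encoding (with `None` for a dot) are assumptions made for illustration.

```python
PCS = ["A", "B", "C", "D", "E"]

# One positive and one negative instance, as in the example:
# True = present (p), False = absent (a), None = either (dot).
I1 = {"A": True, "B": True, "C": True, "D": False, "E": None}
N1 = {"A": False, "B": True, "C": True, "D": None, "E": True}


def difference_method(positive, negative, pcs):
    """Return the p-cs that could still be necessary and sufficient."""
    survivors = []
    for pc in pcs:
        not_necessary = positive[pc] is False   # absent although P occurred
        not_sufficient = negative[pc] is True   # present although P did not
        if not (not_necessary or not_sufficient):
            survivors.append(pc)
    return survivors


print(difference_method(I1, N1, PCS))  # ['A']
```

Filling either dot with a definite value leaves the result unchanged, which is the point made above about E in I 1 and D in N 1 under an assumption of the first kind.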

Joint Method of Agreement and Difference

Mill's exposition of this method mixes up what we may call a double method of agreement, that is, the use of the positive and negative methods of agreement together, with what Mill himself calls an indirect method of difference, in which the jobs done by the single positive instance and the single negative instance in the method of difference are shared between a set of positive and/or a set of negative instances. The latter is the more interesting and distinctive. Its simplest variant (1.3) combines the assumption that one of the p‐cs is both necessary and sufficient for P in F with this observation: a set Si of positive instances and a set Sn of negative instances such that one of the p‐cs, say A, is present throughout Si and absent throughout Sn, but each of the other p‐cs is either absent from at least one positive instance or present in at least one negative instance. For example:

               A   B   C   D   E
    Si   I 1   p   p   p   p   a
         I 2   p   p   p   a   p
    Sn   N 1   a   p   a   a   a
         N 2   a   a   p   a   a

This assumption and this observation together entail that A is necessary and sufficient for P in F. As the example shows, none of the other p‐cs, B, C, D, E, could be so, given such an observation: yet Si by itself does not yield this conclusion by the positive method of agreement, nor Sn by the negative method, nor does any pair of positive and negative instances from those shown yield this conclusion by the method of difference.
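The claim that only the pooled sets isolate A can be checked with a small sketch. The helper names `row` and `survivors` are hypothetical, and the instances are the four rows of the example (two positive, two negative).

```python
from itertools import product

PCS = ["A", "B", "C", "D", "E"]


def row(values):
    """Build an instance from a string such as 'ppppa' (p present, a absent)."""
    return {pc: v == "p" for pc, v in zip(PCS, values)}


def survivors(positives, negatives):
    """P-cs never absent from a positive instance, never present in a negative one."""
    return [pc for pc in PCS
            if all(inst[pc] for inst in positives)
            and not any(inst[pc] for inst in negatives)]


S_I = [row("ppppa"), row("pppap")]   # I 1, I 2
S_N = [row("apaaa"), row("aapaa")]   # N 1, N 2

print(survivors(S_I, S_N))  # ['A']            joint method
print(survivors(S_I, []))   # ['A', 'B', 'C']  positive agreement alone
print(survivors([], S_N))   # ['A', 'D', 'E']  negative agreement alone

# Method of difference on each single pair of instances.
PAIRS = [survivors([i], [n]) for i, n in product(S_I, S_N)]
print(PAIRS)
```

Every single pair leaves three candidates standing, so none of the simpler methods settles the matter on its own with these instances.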

3. Agreement and Difference—Complex Variants

When we go on to an assumption of the second kind, allowing that the actual cause may be the negation of a p‐c, we need slightly stronger observations. Thus for variants of the positive method of agreement (2.11 and 2.12) we need this: two or more positive instances such that one p‐c (or a negation of a p‐c), say A, is present in each instance, but for every other p‐c there is an instance in which it is present and an instance from which it is absent. The observation described for 1.11 and 1.12 above would be too weak: for example, if D were absent from both I 1 and I 2 as we there allowed, D̄ would not be eliminated, and we could not conclude that A was the actual cause.

For the corresponding variant of the method of difference (2.2) we need this: a positive instance I 1 and a negative instance N 1 such that one p‐c (or a negation of a p‐c), say A, is present in I 1 and absent from N 1, but each other p‐c is either present in both I 1 and N 1 or absent from both. For example:

          A   B   C   D   E
    I 1   p   p   a   a   p
    N 1   a   p   a   a   p

Since B is present in N 1, B is not sufficient for P in F, but since B is present in I 1, B̄ is not necessary for P in F. Since C is absent from I 1, C is not necessary for P in F, but since C is absent from N 1, C̄ is not sufficient for P in F. Thus neither B nor C, nor either of their negations, can be both necessary and sufficient for P in F; D and E are ruled out similarly, so, given the assumption that some p‐c or negation of a p‐c is both necessary and sufficient, A must be so. This is the classic difference observation described by Mill, in which the only possibly relevant difference between the instances is the presence in I 1 of the factor identified as the actual cause; but we need this, rather than the weaker observation of 1.2, only when we allow that the actual cause may be the negation of a p‐c.
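A sketch of variant 2.2, in which each p‐c and each negation of a p‐c is a candidate, shows why the stronger observation is needed. The names `present` and `difference_with_negations` are hypothetical, `None` again encodes a dot, and negations are printed as `not-X` purely for readability.

```python
PCS = ["A", "B", "C", "D", "E"]

# The classic difference observation given for 2.2.
I1_STRICT = {"A": True, "B": True, "C": False, "D": False, "E": True}
N1_STRICT = {"A": False, "B": True, "C": False, "D": False, "E": True}

# The weaker observation that sufficed for 1.2 (None = dot).
I1_WEAK = {"A": True, "B": True, "C": True, "D": False, "E": None}
N1_WEAK = {"A": False, "B": True, "C": True, "D": None, "E": True}


def present(instance, name, polarity):
    """Is the candidate (name if polarity else its negation) present? None if unknown."""
    value = instance[name]
    return None if value is None else value == polarity


def difference_with_negations(positive, negative, pcs):
    """Candidates (p-cs and negations) that may still be necessary and sufficient."""
    survivors = []
    for name in pcs:
        for polarity in (True, False):
            not_necessary = present(positive, name, polarity) is False
            not_sufficient = present(negative, name, polarity) is True
            if not (not_necessary or not_sufficient):
                survivors.append(name if polarity else "not-" + name)
    return survivors


print(difference_with_negations(I1_STRICT, N1_STRICT, PCS))  # ['A']
print(difference_with_negations(I1_WEAK, N1_WEAK, PCS))      # ['A', 'not-D', 'not-E']
```

With the weaker observation the negations of D and E survive, because nothing settles whether D is absent from N 1 or E is absent from I 1.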

The joint method needs, along with this weaker assumption, a similarly strengthened observation: each of the p‐cs other than A must be either present in both a positive instance and a negative instance or absent from both a positive instance and a negative instance; this variant (2.3) then still yields the conclusion that A is both necessary and sufficient for P in F.

We consider next an assumption of the third kind, that the actual cause is either a p‐c or a conjunction of p‐cs. This latter possibility seems to be at least part of the complication Mill described as an intermixture of effects. This possibility does not affect the positive method of agreement, since if a conjunction is necessary, each of its conjuncts is necessary, and candidates can therefore be eliminated as before. But since the conjuncts in a sufficient (or necessary and sufficient) condition may not be severally sufficient, the negative method of agreement is seriously affected. The observation described and exemplified above for 1.13 and 1.14 would now leave it open that, say, BC rather than A was the required sufficient (or necessary and sufficient) condition, for if C were absent from N 1, B from N 2, and either from N 3, then BC as a whole might still be sufficient. We can, indeed, still use this method with a much stronger observation (3.14), namely a single negative instance N 1 from which one p‐c, say A, is absent, but every other p‐c is present. This will show (with our present assumption) that no p‐c other than A, and no conjunction of p‐cs that does not contain A, is sufficient for P in F. But even this does not show that the actual cause is A itself, but merely that it is either A or a conjunction in which A is a conjunct. (We may symbolize this by saying that the cause is (A . . .), where the dots indicate that other conjuncts may form part of the cause, and the dots are underlined, while A is not, to indicate that A must appear in the formula for the actual cause, but that other conjuncts may or may not appear.)

The corresponding variant of the method of difference (3.2) needs only the same observation as 1.2; but it, too, yields only the weaker conclusion that (A . . .) is necessary and sufficient for P in F (but therefore that A itself is necessary, but perhaps not sufficient). For while in the example given for 1.2 above B, C, D, and E singly are still eliminated as they were in 1.2, and any conjunctions such as BC which, being present in I 1, might be necessary, are eliminated because they are also present in N 1 and hence are not sufficient, a conjunction such as AB, which contains A, is both present in I 1 and absent from N 1, and therefore might be both necessary and sufficient. Thus this assumption and this observation show only that A is, as Mill put it, ‘the cause, or an indispensable part of the cause’. The full cause is represented by the formula ‘(A . . .)’, provided that only p‐cs which are present in I 1 can replace the dots.
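To see concretely which conjunctions survive in variant 3.2, one can enumerate every conjunction of p‐cs against the observation given for 1.2. This is a sketch under assumed names (`conjunction_status`, `surviving_conjunctions`), with `None` standing for a dot in the table.

```python
from itertools import combinations

PCS = ["A", "B", "C", "D", "E"]
I1 = {"A": True, "B": True, "C": True, "D": False, "E": None}
N1 = {"A": False, "B": True, "C": True, "D": None, "E": True}


def conjunction_status(instance, members):
    """True if all members are known present, False if one is known absent, else None."""
    values = [instance[m] for m in members]
    if any(v is False for v in values):
        return False
    if all(v is True for v in values):
        return True
    return None


def surviving_conjunctions(positive, negative, pcs):
    """Conjunctions of p-cs not shown to be unnecessary or insufficient."""
    survivors = []
    for size in range(1, len(pcs) + 1):
        for members in combinations(pcs, size):
            not_necessary = conjunction_status(positive, members) is False
            not_sufficient = conjunction_status(negative, members) is True
            if not (not_necessary or not_sufficient):
                survivors.append("".join(members))
    return survivors


SURVIVORS = surviving_conjunctions(I1, N1, PCS)
print(SURVIVORS)  # ['A', 'AB', 'AC', 'AE', 'ABC', 'ABE', 'ACE', 'ABCE']
```

Every survivor contains A and none contains D, which is absent from I 1; E can appear only because its presence in I 1 was left open by the dot.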

The corresponding variant of the joint method (3.3) needs a single negative instance instead of the set Sn, for the same reason as in 3.14, and the cause is identified only as (A . . .).

With an assumption of the fourth kind, that the actual cause is either a p‐c or a disjunction of p‐cs, the negative method of agreement (4.13 and 4.14) works as in 1.13 and 1.14, since if a disjunction is sufficient, each of its disjuncts is so. It is now the positive method of agreement that suffers. For with the observation given for 1.12 above, the necessary and sufficient condition might be, say, (B or C); for this disjunction is present in both I 1 and I 2, though neither of its disjuncts is present in both. Thus the observation of 1.12 would leave the result quite undecided. We need (for 4.12) a much stronger observation, namely a single positive instance in which one p‐c, say A, is present but from which every other p‐c is absent, but even this shows only that the cause is—with the same interpretation of the symbols as above—(A or . . .). This assumption (that the cause may be a disjunction of p‐cs) allows the possibility of what Mill called a plurality of causes, each disjunct being a ‘cause’ in the sense of being a sufficient condition. What we have just noted is the well‐known point that this possibility undermines the method of agreement.
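How badly a plurality of causes undermines the positive method of agreement can be shown by enumerating disjunctions. The sketch below uses assumed names (`disjunction_status`, `surviving_disjunctions`) and represents a disjunction as a tuple of p‐c names; `None` encodes a dot.

```python
from itertools import combinations

PCS = ["A", "B", "C", "D", "E"]

# The observation given for 1.12 (None = dot).
I1 = {"A": True, "B": False, "C": True, "D": None, "E": False}
I2 = {"A": True, "B": True, "C": False, "D": False, "E": None}

# The stronger observation for 4.12: A present, every other p-c absent.
I_STRONG = {"A": True, "B": False, "C": False, "D": False, "E": False}


def disjunction_status(instance, members):
    """True if some member is known present, False if all are known absent, else None."""
    values = [instance[m] for m in members]
    if any(v is True for v in values):
        return True
    if all(v is False for v in values):
        return False
    return None


def surviving_disjunctions(positives, pcs):
    """Disjunctions of p-cs not known to be absent from any positive instance."""
    return [members
            for size in range(1, len(pcs) + 1)
            for members in combinations(pcs, size)
            if not any(disjunction_status(inst, members) is False
                       for inst in positives)]


WEAK = surviving_disjunctions([I1, I2], PCS)
STRONG = surviving_disjunctions([I_STRONG], PCS)
print(len(WEAK), ("B", "C") in WEAK)           # 25 True
print(len(STRONG), all("A" in d for d in STRONG))  # 16 True
```

With the weaker observation, 25 of the 31 candidate disjunctions remain, including (B or C); with the single stronger instance only those containing A remain.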

The method of difference, on the other hand, still survives and still needs only the observation of 1.2; this eliminates all p‐cs other than A, and all disjunctions that do not contain A, either as being not sufficient because they are present in N 1 or as not necessary because they are absent from I 1. The only disjunctions not eliminated are those that are present in I 1 but absent from N 1, and these must contain A. Thus this observation, with the present assumption, still shows that (A or . . .) is necessary and sufficient for P in F; that is, the actual cause is either A itself or a disjunction one of whose disjuncts is A and the others are p‐cs absent from N 1. Hence A itself, the differential feature, is sufficient for P in F but may not be necessary.

The joint method with this assumption (4.3) needs a single positive instance instead of the set Si, but can still use a set of negative instances, and it too identifies the cause as (A or . . .).

As the assumptions are relaxed further, the method of agreement needs stronger and stronger observations. For example in 6.12, a variant of the positive method with the assumption that there is a necessary and sufficient condition which may be a p‐c, or a negation of one, or a disjunction of p‐cs and/or negations of p‐cs, the observation needed is this: a set Si of positive instances such that one p‐c, say A, is present in each, but that for every other possible combination of the other p‐cs and their negations there is an instance in which this combination is present (that is, if there are n other p‐cs, we need 2ⁿ different instances). This observation will eliminate every disjunction that does not contain A (showing it not to be necessary) and will thus show that (A or . . .) is necessary and sufficient, and hence that A itself is sufficient, for P in F. A corresponding variant of the negative method of agreement (5.14) shows that (A . . .) is necessary and sufficient, and hence A itself necessary, for P in F. This is a curious reversal of roles, since in the simplest variants the positive method of agreement identified a necessary condition and the negative one a sufficient condition.

The method of difference, however, continues to need only the observation prescribed for 1.2, if negations are not admitted, or, if negations are admitted, that prescribed for 2.2. But the conclusions become progressively weaker, that is, the cause is less and less completely specified. By far the most important variant of this method—indeed it is the most important of all those that deal in agreement or disagreement or any combination of them—is 8.2. This has an assumption of the eighth kind, in effect that for some Z, Z is necessary and sufficient for P in F, and Z is a condition represented by some formula in disjunctive normal form all of whose constituents are taken from the p‐cs. With the observation of 2.2, this yields the conclusion that (A . . . or . . .) is necessary and sufficient for P in F. For every condition built up in any way from p‐cs other than A will either be present in both I 1 and N 1, and so not sufficient (because it is present in N 1), or absent from both I 1 and N 1, and so not necessary (because it is absent from I 1); also any condition in normal form in which A occurs only negated will similarly be absent from I 1 if it is absent from N 1, so it will either be absent from both, and hence ruled out as not necessary, or it will be present in N 1, and hence ruled out as not sufficient. Consequently Z must be identical with some condition in disjunctive normal form in which A occurs unnegated, that is, with something covered by the expression ‘(A . . . or . . .)’. Since each disjunct in such a necessary and sufficient condition is itself sufficient, this observation, in which the presence of A in I 1 is the only possibly relevant difference between I 1 and N 1, shows even with the least rigorous kind of assumption that A is at least a necessary part of a sufficient condition for P in F (this sufficient condition being (A . . .)), that is, that A is an inus condition of P in F, or better. 
(It is this method 8.2 that has been discussed in Chapter 3; as was said there, the conclusion can also be expressed thus: For some X and for some Y (which may, however, be null), all F (AX or Y) are P, and all FP are (AX or Y).)
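For a small case, the reasoning behind 8.2 can be checked by brute force. The sketch below assumes only three p‐cs (A, B, C) and represents every condition that can be built from them as the set of full combinations in which it is present, which amounts to its full disjunctive normal form. The names `candidates`, `ignores_a`, and `inus_or_better` are hypothetical, and a finite enumeration is an illustration, not a general proof.

```python
from itertools import product

# All 8 combinations of (A, B, C), True = present.
POINTS = list(product([True, False], repeat=3))
I1 = (True, True, False)    # positive instance: A and B present, C absent
N1 = (False, True, False)   # negative instance: differs from I1 only in A


def candidates():
    """Yield each of the 256 conditions as the set of combinations where it holds."""
    for bits in product([False, True], repeat=len(POINTS)):
        yield frozenset(p for p, bit in zip(POINTS, bits) if bit)


def ignores_a(z):
    """True if the condition's presence never depends on A."""
    return all(((True,) + rest in z) == ((False,) + rest in z)
               for rest in product([True, False], repeat=2))


def inus_or_better(z):
    """True if, for some setting of B and C, Z is present with A and absent without it."""
    return any((True,) + rest in z and (False,) + rest not in z
               for rest in product([True, False], repeat=2))


# A necessary condition must be present in I1; a sufficient one absent from N1.
SURVIVORS = [z for z in candidates() if I1 in z and N1 not in z]

print(len(SURVIVORS))                            # 64
print(any(ignores_a(z) for z in SURVIVORS))      # False
print(all(inus_or_better(z) for z in SURVIVORS)) # True
```

No condition built only from B and C survives, and every surviving condition has a disjunct in which A appears unnegated and cannot be dropped.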

The joint method, as an indirect method of difference, ceases to work once we allow both conjunctions and disjunctions of p‐cs as candidates for the role of actual cause. But what we called the double method of agreement, which in the simplest variants involved redundancy, the positive and negative halves merely, as Mill says, corroborating one another, comes into its own with our eighth, least rigorous, kind of assumption. In 8.12, as in 6.12, if there are n p‐cs other than A, the set of 2ⁿ positive instances with A present in each but with the other p‐cs present and absent in all possible combinations will show that (A or . . .) is necessary and sufficient and hence that A is sufficient. Similarly in 8.14, as in 5.14, the corresponding set of 2ⁿ negative instances with A absent from each will show that (A . . .) is necessary and sufficient and hence that A is necessary. Putting the two observations together, we could conclude that A is both necessary and sufficient for P in F.

A new method, similar in principle, can be stated as follows (8.4): if there are n p‐cs in all, and we observe 2^n instances (positive or negative) which cover all possible combinations of p‐cs and their negations, then (assuming that some condition somehow built up out of these p‐cs is both necessary and sufficient) the disjunction of all the conjunctions found in positive instances is both necessary and sufficient for P in F. For example, if there are only three p‐cs, A, B, and C, and we have the observation set out in the following table:

P    A    B    C
a    p    p    p
p    p    p    a
p    p    a    p
a    p    a    a
a    a    p    p
p    a    p    a
a    a    a    p
a    a    a    a

then (ABC¯ or AB¯C or A¯BC¯) is necessary and sufficient for P in F. For if these are the only possibly relevant factors, each combination of p‐cs and their negations along with which P occurs at least once must be sufficient for P, and each such combination in whose presence P is absent must be non‐sufficient for P; but the disjunction of all the sufficient conditions must be both necessary and sufficient, on the assumption that some condition is so.
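
Method 8.4 is mechanical enough to be sketched in a few lines of Python. The sketch below is an illustration, not part of the original exposition: the function name `method_8_4` and the use of "~" to mark a negated p‐c are assumptions made here, while the eight observations are those of the example, written as (P, A, B, C) with "p" for present and "a" for absent.

```python
# Observations from the example: (P, A, B, C), "p" = present, "a" = absent.
OBSERVATIONS = [
    ("a", "p", "p", "p"),
    ("p", "p", "p", "a"),
    ("p", "p", "a", "p"),
    ("a", "p", "a", "a"),
    ("a", "a", "p", "p"),
    ("p", "a", "p", "a"),
    ("a", "a", "a", "p"),
    ("a", "a", "a", "a"),
]
PCS = ("A", "B", "C")

def method_8_4(observations, pcs):
    """Return the disjuncts of the necessary and sufficient condition."""
    # The method requires every combination of p-cs and negations to be observed.
    combos = {row[1:] for row in observations}
    if len(combos) != 2 ** len(pcs):
        raise ValueError("instances must cover every combination of p-cs")
    disjuncts = []
    for p, *factors in observations:
        if p == "p":  # keep the conjunctions found in positive instances
            disjuncts.append(
                "".join(n if v == "p" else "~" + n for n, v in zip(pcs, factors))
            )
    return disjuncts

print(" or ".join(method_8_4(OBSERVATIONS, PCS)))  # AB~C or A~BC or ~AB~C
```

The coverage check matters: if some combination were unobserved, the disjunction of the observed positive conjunctions could be shown sufficient but not necessary.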

We find thus that while we must recognize very different variants of these methods according to the different kinds of assumptions used, and while the reasoning which validates the simplest variants fails when it is allowed that the actual cause may be constituted by negations, conjunctions and disjunctions of p‐cs combined in various ways, nevertheless there are valid demonstrative methods which use even the least rigorous kind of assumption, that is, which assume only that there is some necessary and sufficient condition for P in F, made up in some way from a certain restricted set of p‐cs. But with an assumption of this kind we must be content either to extract (by 8.2) a very incomplete conclusion from the classic difference observation or to get more complete conclusions (by 8.12, 8.14, the combination of these two, or 8.4) only from a large number of diverse instances in which the p‐cs are present or absent in systematically varied ways.

There are two very important extensions of these methods, which consist in the relaxing of restrictions which have been imposed for the sake of clarity in exposition. First, since in every case the demonstration proceeds by eliminating certain candidates, it makes no difference if what survives, what is not eliminated, is not a single p‐c but a cluster of p‐cs which in the observed instances are always present or absent together: the conclusion in each case will be as stated above but with a symbol for the cluster replacing ‘A’. For example, if in 2.2 we have, say, both A and B present in I 1 and both A and B absent from N 1, but each other p‐c either present in both or absent from both, it follows that the cluster (A, B) is the cause in the sense that the actual cause lies somewhere within this cluster. Given that this cluster will either be present as a whole or absent as a whole, its presence is necessary and sufficient for P in F. A similar observation in 8.2 would show that, subject to the same proviso, this cluster is an inus condition of P in F, or better, and hence that either A or B is an inus condition, but perhaps each is an inus condition and perhaps their conjunction (AB) is so. Secondly, in order to eliminate candidates it is not really necessary to list them first. The observation required for 2.2 or 8.2, for instance, is only that a certain p‐c (or, as we have just seen, a certain cluster of p‐cs) should be present in I 1 but absent from N 1 while in every other respect that might be causally relevant these two instances are alike. Or for 8.12 we need only observe that A is present throughout a set of diverse positive instances while all the other factors that might be causally relevant occur, and fail to occur, in all possible combinations in these instances. 
To be sure, we could check this conclusively only if we had somehow counted and listed the possibly relevant factors, but without this we could have a mass of evidence that at least seemed to approximate to the required observation.
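
The first extension, from a single surviving p‐c to a surviving cluster, amounts to collecting everything on which I 1 and N 1 differ. A minimal Python sketch follows; the function name `differing_factors` and the four-factor instances are hypothetical and chosen only to reproduce the (A, B) case described above.

```python
def differing_factors(i1, n1):
    """Split the p-cs on which the positive and negative instance differ.

    Both instances are dicts mapping a p-c name to True (present) or
    False (absent). Returns (present only in I1, present only in N1).
    """
    present_only_in_i1 = sorted(k for k in i1 if i1[k] and not n1[k])
    present_only_in_n1 = sorted(k for k in i1 if not i1[k] and n1[k])
    return present_only_in_i1, present_only_in_n1

I1 = {"A": True, "B": True, "C": True, "D": False}    # P occurs
N1 = {"A": False, "B": False, "C": True, "D": False}  # P does not occur

cluster, negated = differing_factors(I1, N1)
print(cluster)  # ['A', 'B']
print(negated)  # []
```

The second list is returned because, where negations are admitted as candidates, a p‐c present only in N 1 would put its negation into the surviving cluster.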

4. The Method of Residues

We have, in the wide range of variants so far indicated, covered only three of Mill's five methods. His method of residues needs only brief treatment: it can be interpreted as a variant of the method of difference in which the negative instance is not observed but constructed on the basis of already known causal laws.

Suppose, for example, that a positive instance I 1 has been observed as follows:

       A    B    C    D    E
I 1    p    p    a    p    a

Then if we had, to combine with this, a negative instance in which B and D were present and from which A, C, and E were absent, we could infer (according to the kind of assumption made) either by 2.2 that A was necessary and sufficient for P in F, or by 8.2 that (A . . . or . . .) was so, and so on. But if previous inductive inquiries (of whatever sort) can be taken to have established laws from which it follows that given A¯BC¯DE¯ in the field F, P would not result, there is no need to observe N 1; we already know all that an observation of N 1 could tell us, and so one of the above‐mentioned conclusions follows from I 1 alone along with the appropriate assumption.

Again, if the effect or phenomenon in which we are interested can be measured, we can reason as follows. Suppose that we observe a positive instance I 1, with the factors present and absent as above, in which there occurs a quantity x 1 of the effect in question, and suppose that our previously‐established laws enable us to calculate that given A¯BC¯DE¯ in F there would be a quantity x 2 of this effect; then we can regard the difference (x 1 − x 2) as the phenomenon P which is present in I 1 but absent from (the calculated) N 1. With an assumption of the first, second, fourth, or sixth kind—that is, any assumption which does not allow conjunctive terms in the cause—we could conclude that the cause of P in this instance I 1 was A alone, and hence that A is sufficient for the differential quantity (x 1 − x 2) in F.

To make an assumption of any of these four kinds is to assume that effects of whatever factors are actually relevant are merely additive, and that is why we can conclude that the extra factor in I 1, namely A, itself produces the extra effect (x 1 − x 2). But with an assumption of the third, fifth, seventh, or eighth kind, which allows conjunctive causes and hence Mill's intermixture of effects, we could conclude only that a (sufficient) cause of (x 1 − x 2) was (A . . .). Given the other factors that were present in both I 1 and N 1, A was sufficient for this differential effect; but it does not follow that A is sufficient for this in relation to F as a whole.
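
The quantitative version of the method of residues can be illustrated with a short Python sketch. Everything numerical here is hypothetical: the "known laws" are represented, for simplicity, as fixed additive contributions of B and D, and the observed quantity is invented. The point is only to show how the negative instance N 1 is calculated rather than observed.

```python
# Hypothetical, previously established laws: the quantity of the effect
# that each already understood factor contributes in the field F.
KNOWN_CONTRIBUTIONS = {"B": 3.0, "D": 1.5}

def calculated_quantity(present_factors):
    """x2: the quantity the known laws predict for the constructed N1."""
    return sum(KNOWN_CONTRIBUTIONS.get(f, 0.0) for f in present_factors)

observed_x1 = 6.5                     # measured in I1, where A, B, D are present
x2 = calculated_quantity(["B", "D"])  # N1 is calculated, not observed
residue = observed_x1 - x2            # the phenomenon P whose cause is sought

print(x2, residue)  # 4.5 2.0
```

On an assumption that excludes conjunctive causes, the residue is ascribed to A alone. On an assumption that allows them, the same figure shows only that A, together with the factors common to I 1 and N 1, was sufficient for the extra quantity.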

Though Mill does not say so, it is obvious that such a use of constructed, calculated, instances is in principle possible with all the methods, not only with the method of difference in the way outlined here.

5. Methods of Concomitant Variation

Whereas Mill called this just one method, there is in fact a system of concomitant variation methods mirroring the various presence‐and‐absence methods we have been studying. These too will be forms of ampliative induction: we shall argue from a covariation observed in some cases or over some limited period to a general rule of covariation that covers unobserved instances as well. These methods, then, work with a concept of cause which makes the (full) cause of some quantitative phenomenon that on which the magnitude of this phenomenon functionally depends: causation is now the converse of dependent covariation. As I have argued in Chapter 6, this functional dependence concept is best regarded as a development and refinement of ‘neolithic’ concepts of causation defined in terms of necessity and (perhaps) sufficiency, that is, in terms of conditional presences and absences of features. To let it count as causation, functional dependence requires, of course, the addition of the relation of causal priority. This indeed plays some part in the formal methods that we are studying in so far as it is used in deciding what are possibly relevant causal factors (that is, what can count as p‐cs) and in interpreting a covariation as a directed causal relation, but we can take these points for granted and leave them aside in most of the analysis.

The typical form of a functional causal regularity will then be that the magnitude of P in the field F is always such‐and‐such a function of the magnitude of the factors, say, A, B, C, and D, which we can write:

PF = f (A, B, C, D).
Where such a regularity holds, we may call the whole right hand side of this equation, that is, the set of actually relevant factors together with the function f which determines how they are relevant, the full cause of P in F, while we may call each of the actually relevant factors a partial cause. The full cause is that on which the magnitude of P in F wholly depends; a partial cause is something on whose magnitude the magnitude of P in F partly depends.

A thorough investigation of such a functional dependence would involve two tasks, the identifying of the various partial causes and the determination of the function f. Only the first of these two tasks can be performed by concomitant variation methods analogous to those already surveyed.

There are concomitant variation analogues of both the method of agreement and the method of difference; that is, there are ways of arguing to a dependence of P on, say, A both from the observation of cases where P and A remain constant while other possibly relevant factors vary and from the observation of cases where P and A vary while other possibly relevant factors remain constant.

As before, we need an assumption as well as an observation, but we have a choice between different kinds of assumption. The more rigorous kind (corresponding to the second kind of assumption above) would be that in F the magnitude of P wholly depends in some way on the magnitude of X, where X is identical with just one of the p‐cs. The less rigorous and in general more plausible kind (corresponding to the eighth kind of assumption above) would be that in F the magnitude of P wholly depends in some way on the magnitudes of one or more factors X, X′, X″, etc., where each of these is identical with one of the p‐cs.

The simpler covariance analogue of the method of difference combines the more rigorous assumption with this observation: over some period or over some range of instances in F, P has varied while A has varied but all the other p‐cs have remained constant. None of the other p‐cs, then, can be identical with the X in the assumption, so A must be so. That is, the conclusion is that the magnitude of P in F depends wholly in some way not yet determined on that of A.

The more complex covariance difference analogue combines this same observation with the less rigorous kind of assumption above. The observation does not now show that some p‐c other than A, say B, is not a partial cause. But it does show that the magnitude of P in F cannot depend wholly, in any way, on any set of factors that does not include A, for every function of every such set has remained constant while P has varied. This ensures, therefore, that A is a partial cause, but leaves it open whether the full cause is simply of the form f (A) or whether there are other partial causes as well. This observation and this assumption, therefore, show that a full cause of P in F is f (A, . . .). Repeated applications of this method, with other factors being varied one at a time, could fill in further partial causes, but could not close the list.
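
The eliminative step in the covariance difference analogues can be sketched in Python. The three p‐cs and the observed magnitudes below are hypothetical. The test used is the one stated above: P cannot depend wholly on a set of factors if those factors have stayed constant while P has varied.

```python
from itertools import combinations

PCS = ("A", "B", "C")

# Hypothetical observations: A and P vary, B and C remain constant.
OBSERVATIONS = [
    {"A": 1, "B": 5, "C": 2, "P": 10},
    {"A": 2, "B": 5, "C": 2, "P": 14},
    {"A": 3, "B": 5, "C": 2, "P": 21},
]

def could_be_full_cause(factors, observations):
    """P can depend wholly on `factors` only if equal values of those
    factors always go with an equal value of P."""
    seen = {}
    for obs in observations:
        key = tuple(obs[f] for f in factors)
        if seen.setdefault(key, obs["P"]) != obs["P"]:
            return False
    return True

# Candidate sets of partial causes, from single p-cs up to all of them.
surviving = [
    subset
    for size in range(1, len(PCS) + 1)
    for subset in combinations(PCS, size)
    if could_be_full_cause(subset, OBSERVATIONS)
]
print(surviving)  # [('A',), ('A', 'B'), ('A', 'C'), ('A', 'B', 'C')]
```

With the more rigorous assumption only the single-factor candidates are in play, and A alone survives. With the less rigorous assumption every surviving set contains A, but the observation does not decide between them, which is why the list of partial causes stays open.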

The simple covariance analogue of the method of agreement combines our more rigorous kind of assumption above with this observation: over some period or range of instances in F, P has remained constant while A has remained constant while every other p‐c has varied. We want to argue here that since B, say, has varied while P has not, B cannot be identical with the assumed X on whose magnitude that of P depends. But this does not follow. It might be that P's magnitude varied with and in sole dependence upon the magnitude of B, and yet that P was not responsive to every change in the magnitude of B: there might be flat stretches, plateaux, in the curve for PF = f (B), with values for B plotted on the x‐axis and values for PF on the y‐axis. So to eliminate B we must either strengthen the assumption by adding that P is responsive to every change in the magnitude of X, or strengthen the observation to include every possible degree of variation of every p‐c other than A, or strengthen both assumption and observation together, so that we can say that every p‐c has varied to a degree to which, if this p‐c were our X, P would be responsive. Only in some such way could we validly reach the conclusion that the magnitude of P in F depends in some way on that of A alone.

A complex covariance analogue of the method of agreement encounters much the same problem as complex variants of the original method. Even if P has remained constant while all the p‐cs other than A have varied, this does not, with our less rigorous assumption, exclude the possibility that PF = f (B, C), say. For it might be that the actual variations of B and C and the actual function f were such that the effects of these variations cancelled out: f might be such as to allow the actual changes in B and C to compensate for one another. It will not help now to strengthen the assumption to include the claim that P is responsive to all variations in the Xs. But will it do if we greatly strengthen the observation, to claim that all the p‐cs other than A have varied independently (and hence in all possible combinations) over the whole possible range or magnitude of each? Even this claim—which is in any case so extreme that we could not hope to do more than approximate to its fulfilment—would yield only the conclusion that the magnitude of P in F does not depend wholly on p‐cs other than A, that is, that A is at least a partial cause. It does not, as we might be tempted to suppose, show that all p‐cs other than A are irrelevant, and that P depends only upon A. For it might be that at some values of A changes in the values of all the other p‐cs were ineffectual, and yet that at other values of A some variations in some of the other p‐cs made a difference. Consequently, if with an assumption of our less rigorous kind we are to be able to close the list of partial causes, nothing less than an analogue of 8.4 will do. That is, we must be able to say that A, as well as all the other p‐cs, has joined in this mutually independent variation over the whole possible range of each. And then if the value for P has remained constant for each constant value of A, we can conclude that P's magnitude depends on that of A alone.
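
Two of the pitfalls just described, compensating variations and factors that matter only at some values of A, can be made concrete with a pair of toy functions in Python. Both functions are hypothetical laws invented for illustration; nothing in the argument depends on their particular form.

```python
def f_compensating(b, c):
    """Hypothetical law: P depends on B and C only through their difference."""
    return b - c

# B and C both vary, yet P stays constant because the changes cancel out.
observations = [(1, 0), (2, 1), (5, 4)]
print({f_compensating(b, c) for b, c in observations})  # {1}

def f_interacting(a, b):
    """Hypothetical law: B makes a difference only when A is non-zero."""
    return a * b

# With A held at 0, every variation of B is ineffectual ...
print({f_interacting(0, b) for b in range(5)})  # {0}
# ... but at another value of A the same variations make a difference.
print(sorted({f_interacting(2, b) for b in range(5)}))  # [0, 2, 4, 6, 8]
```

In the first case the observation required by the agreement analogue is satisfied although B and C are both partial causes. In the second, B has varied freely while P and A stayed constant, and B is still a partial cause, which is why A itself must join in the independent variation before the list can be closed.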

As I have said, these concomitant variation methods only identify partial causes of the functional dependence sort, and may in an extreme case tend to close the list of partial causes: they do not in themselves help with the task of determining the function f. But there is another device, still within the range of eliminative induction methods, which at least makes a start on this task. As we saw when discussing the method of residues, we can regard a quantitative difference in some magnitude, something of the form (x 1 − x 2), as the effect or phenomenon whose cause is identified by some application of the method of difference. Various observations interpreted in this way will ascribe such‐and‐such a change in the quantity of P to such‐and‐such a change in the magnitude of, say, A at certain (temporarily fixed) values of the other p‐cs, and will so at least impose constraints on the function f. That is, even when we are dealing with functional dependences there is an important place for applications of the neolithic presence‐and‐absence method of difference.

6. Uses and Applications of These Methods

I have so far been concerned to show only that there are many demonstratively valid methods of these sorts. But in reaching a more exact formulation of them for this purpose I have incidentally removed some of the more obvious objections to the view that such methods can be and are applied in practice. Thus the introduction of the notion of a field gives these methods the more modest task of finding the cause of a phenomenon only in relation to some field, rather than that of finding a condition which is absolutely necessary and sufficient. By contrasting the p‐cs with the field we have freed the user of the method of agreement from having to make the implausible claim that his instances have only one ‘circumstance’ in common. He has merely to claim that they have only one of the p‐cs in common, while conceding that whatever features constitute the field or are constant throughout the field will belong to all the instances, and that there may be other common features too, but ones that the user has initially judged not to be possibly relevant to this phenomenon.

Similarly, the user of the method of difference has only to claim that no possibly relevant feature other than the one he is picking out as (part of) the cause is present in I 1 but not in N 1. Also, we have taken explicit account of the ways in which the possibilities of counteracting causes, a plurality of causes, an intermixture of effects, and so on, need to be allowed for, and we have seen how valid conclusions can still be drawn in the face of these complications, provided that we note explicitly the incompleteness of the conclusions we can now draw or the much greater strength of the observations required for complete conclusions (for example in 8.4).

By making explicit the need for assumptions we have abandoned any pretence that these methods in themselves solve or remove the problem of induction. If the requisite observations can be made, the ultimate justification for any conclusion reached by one of these methods will depend on the justification for the assumption used, and since this proposition is general in form, any reliance that we place on it or on its consequences will have to be backed by some other kind of inductive reasoning or confirmation or corroboration. The eliminative methods cannot be the whole of the logic of scientific discovery. But it does not follow that they cannot be any part of this. It is, I think, instructive to see just how far ‘deduction from the phenomena’ (in Newton's phrase) can go, in the light of some not implausible assumptions.

It is worth stressing the precise form of assumption that our analysis has shown to be required. It is not a general uniformity of nature, a universal determinism, but merely that this particular phenomenon, in this particular field, should have some cause. But we have also found that whereas our ordinary concept of cause requires a cause only to be necessary in the circumstances, the assumption needed for almost all our methods is that something (though this will usually not be the factor that is eventually picked out as, in a popular sense, the cause) should be both necessary and sufficient for this phenomenon in this field. We do have to assume what might be called particular determinism. We must reject the view of Mill that there is some one ‘law of causality’ to which every application of these methods appeals. But at the other extreme we should stop short of the view of Wittgenstein in the Tractatus that this ‘law’ dissolves into the mere logical form of laws in general. Specific deterministic assumptions play a part that makes them in at least one way prior to the laws discovered with their help.

What may seem more of a problem is that we have to assume a somehow restricted range of possibly relevant factors, p‐cs. But, as we shall see, some inquiries are conducted against a background of general knowledge that supplies such a restriction; in others we simply assume that the only items that are possible causes for some result, some change, are other perceptible changes in the spatial neighbourhood and not long before. And what counts as near enough in space and time, and as being perceptible, can itself be a matter of working hypotheses. If a conclusion demonstrated with the help of some particular limitation of p‐cs is subsequently falsified, that limitation can and must be relaxed: we must look further afield or try other ways of detecting changes that we did not at first perceive. It must be emphasized that though the eliminative methods are themselves demonstrative, there is no hostility whatever between the use of them and the use of hypotheses whether as working assumptions or otherwise. The demonstrative methods need not be associated with any theory that the method of inquiry as a whole is watertight, or simply aggregative, or mechanical.

The eliminative methods are in fact constantly in use in the sense that innumerable procedures of search, discovery, and checking have implicit within them the forms of reasoning that I have tried to formulate and make precise. I have argued in Chapter 3 that all our basic causal knowledge is founded on what are in effect applications of the method of difference, and that this knowledge has, in consequence, just the form (of elliptical double regularities) that one would therefore expect it to have. It is by before‐and‐after contrast observations that we know that fire burns, that it cooks meat, that rattlesnakes are poisonous; Becquerel discovered that the radium he carried in a bottle in his pocket caused a burn by noticing that the presence of the radium was the only possibly relevant difference between the time when the inflammation developed and the earlier time when it did not, or between the part of his skin where the inflammation appeared and other parts.

But the causal relations revealed in this way need not be so general as to be of scientific interest. Fault‐finding procedures aim only at discovering why this particular machine does something wrong, and hence how it can be put right; but they, too, depend on trying one change after another and seeing what results.

Suppose that a new drug is being tested. It is administered to some subject, and some change (good or bad) is noticed in the subject soon afterwards. There is a prima facie case for supposing that the administration of this drug can cause—that is, is an inus condition of—that change. But why, if the method of difference is a demonstrative method, is it only a prima facie case? Simply because the experimenter cannot be sure that the requirements for that method's observation have been met: some other relevant change may have occurred at about the same time. But if the experiment is repeated and we keep on getting the same result, it becomes less and less likely that on each occasion when the drug was administered, some one other, unnoticed but relevant, change also occurred. But (as I said in Chapter 3) such repetition should not be taken as a resort to the method of agreement; it is not the repetition of the AP sequence as such that we are relying on: we are using the repetition merely to check whether on each occasion the requirements for the method of difference were met. If we keep getting the same result, we can say that it is probable that they were met, that no other relevant change occurred, and therefore (given the assumption) that the conclusion demonstrated by the method of difference holds. But repetition can have another function as well. Even if we are reasonably sure that a particular experiment (or set of experiments) has shown that A is an inus condition of P in F (or better), we may repeat it under varying conditions in the hope—which may be fulfilled—of showing that various factors which were present in our original I 1 and N 1, and so might be essential conjuncts with A in the full cause (A . . . or . . .), are not in fact essential: we are reducing the indeterminacy represented by the first row of dots in this formula.
Such repetition under varying conditions is an application of the method of agreement, and turns the whole procedure into something that can be called (in yet another sense) a joint method: it is in fact an approximation to 8.4.

As I have said in Chapter 3, the controlled experiment, in which a control case is deliberately set up alongside the experimental case and made to match it as closely as possible except with regard to the item under test, is another well‐known application of the method of difference.

It is often supposed, and the examples I have given might encourage the belief, that these methods operate only at superficial levels of science, that they establish causal connections only between directly perceivable features. But they are subject to no such restriction. A physicist may take himself to be firing deuterons at bismuth atoms when all that a superficial observer could perceive was his twiddling knobs and switches on a large and impressive piece of apparatus; but if the physicist subsequently finds evidence that some of the former bismuth atoms are now behaving like polonium atoms, it will be method of difference reasoning that enables him to say that deuteron bombardment of bismuth can produce polonium. Of course in such applications the conclusion reached by a method which is in itself demonstrative is established only subject to the proviso that the interpretation of the antecedent conditions and of the result is correct. But once we have detached the use of these methods from the claim that they achieve certainty, this is quite in order. It is also very obvious that the eliminative methods can be used, not only as part of a mere response to experiences that force themselves upon us, but also in procedures for testing hypotheses, for finding answers to questions that we ask.

Mill thought that these were methods both of discovery and of proof; and so they can be, but not very easily at once. Rather, an observation which appears to conform to the requirements for one of the methods may suggest that a certain causal relationship holds, and a causal hypothesis—whether suggested in this way or in some other—may be tested and perhaps confirmed by a more thorough survey or by a carefully constructed experiment, the results of each being interpreted by another application of the appropriate method.

The method of agreement is frequently used in medical contexts—locating a source of infection, identifying the substance to which someone has an allergic reaction, deciding what bacillus is responsible for a certain disease, and so on. It is particularly in such contexts that a previously developed background of theory supplies the fairly rigorous kind of assumption needed for the simpler variants of this method. In a country in which typhoid is rare, we may hope that if a dozen cases occur within a few days in one town they all result from a single source of infection. A person may be allergic to just one substance, and his response to it may not depend much on other circumstances. But in these applications too the eliminative method of reasoning may well be combined with asking questions, framing and testing hypotheses, rather than the passive reception of information.

The methods of concomitant variation, particularly those that are counterparts of the method of difference, along with statistical procedures that can be regarded as elaborations of these methods, are constantly used when one or more factors are varied while other possibly relevant factors are held constant, and calculations are based on the observed results. In practice the two tasks which I distinguished, the identifying of actual partial causes of a phenomenon and the determination of the function f, tend to be mixed up together; but the separation of these tasks in our analysis may help to distinguish different degrees of conclusiveness with which different parts of an hypothesis have been confirmed.

7. Criticisms of These Methods

We have already met most of the stock objections to these methods. One still outstanding is that they take for granted the most important aspect of inquiry, the detection, isolation, and analysis of the possibly relevant factors. Mill, of course, does not ignore this: he devotes a whole book to ‘operations subsidiary to induction’. But what most needs to be stressed is that there is no need for a finally satisfactory analysis of factors before the eliminative methods can be applied: as I argued in Chapter 3, we can start using the methods with a very rough distinction of factors and obtain some correct results, which can be later refined by what I called the progressive localization of a cause. Also, though classification may be subsidiary to induction, it is not purely preliminary to it. Things are classified mainly by their causal properties; it will be by using, in effect, the method of difference that we find that this material regularly does such‐and‐such to that other material, and so recognize it, and other samples that act similarly, as all specimens of some one kind of stuff. A mass of elementary eliminative inductions helps to provide the setting for more sophisticated ones.

Another stock criticism is that causal relations are not the whole, or even the most important, concern of science. (We have already rejected, in Chapter 6, the argument that science is not concerned with causation at all.) We can concede that science has other concerns as well, but this criticism is otherwise to be rebutted by insisting on a wide concept of causation and by seeing that these methods assist the discovery of causes in this wide sense. Especially since entities on all levels are identified largely by dispositional properties, that is, by their causal powers, the detection of causation by particular objects as well as the discovery of causal regularities runs through all processes of discovery.

Thirdly, it is often thought that the use of and reliance on these methods is somehow a discredited and discreditable alternative to the hypothetico‐deductive method. In fact there is no incompatibility, hostility, or even any necessary difference of spirit between the use of these two procedures. Though the eliminative methods are in themselves demonstrative, they are naturally and unavoidably used in contexts which make their conclusions only tentative. Their assumptions require to be confirmed in ways in which these methods cannot themselves confirm them. And the Popperian principle that hypotheses are corroborated by being subjected to severe tests—that is, ones which, if an hypothesis is false, are likely to show it to be false—is admirably illustrated by the methods of agreement and difference. The hypothesis that A is necessary for P in F is severely tested by looking for instances of P which fall within the field F but otherwise differ widely in all conceivably relevant ways, and seeing whether A is present in them all; if the hypothesis stands up to this test, it is by the method of agreement that we draw the (in practice still tentative) conclusion. The hypothesis that A is an indispensable part of a sufficient condition for P in F that was present in I 1 is severely tested by finding or constructing an instance that is as like I 1 as possible in all conceivably relevant ways, except that it lacks A, and seeing whether P occurs there or not. And if it does not, this counts as the N 1 from which, along with I 1, we draw our conclusion in accordance with the method of difference.

Notes:

(1) Book III, Chs. 8–10.

(2) Logic, Part II, Ch. 10.

(3) G. H. von Wright, A Treatise on Induction and Probability. Earlier accounts are those of Johnson (referred to above, p. 297) and C. D. Broad, ‘The Principles of Demonstrative Induction’, Mind, xxxix (1930), 302–17 and 426–39.

(4) The classic exposition of this extension of traditional logic to allow for complex terms is in J. N. Keynes's Formal Logic (4th edition, 1906), Appendix C, pp. 468–538. It was much discussed by John Anderson in lectures at Sydney University. For our present purposes we can take the universal propositions as not having existential import, and thus avoid the need for restrictions on the valid argument forms.