The Emergent Multiverse: Quantum Theory according to the Everett Interpretation

David Wallace

Print publication date: 2012

Print ISBN-13: 9780199546961

Published to Oxford Scholarship Online: September 2012

DOI: 10.1093/acprof:oso/9780199546961.001.0001


Appendix C

Formal Proofs of Decision-Theoretic Results

In this appendix, I give formal proofs (and in some cases, formal statements) of various decision-theoretic results stated in Part II and Appendix B.

C.1 Proof of the Additive Representation Theorem

Theorem 1 (Additive Representation Theorem). Suppose that E is a synchronic decision problem and that ≻ is a preference order for some set 𝓑 of bets for the decision problem. If

  • A1 The set 𝓑 includes all possible bets;

  • A2 The event space of E is the set of subsets of some finite set of atomic events;

  • B1 Any two constant bets with the same payoff are equivalent: if r is a reward and 𝓔1, 𝓔2 are chance setups, then Cr(𝓔1) ~ Cr(𝓔2);

  • B2 There is an associative, commutative map ⊕ from pairs of rewards to rewards, and some reward 0 with r ⊕ 0 = r for any reward r, such that

    (i) If 𝓕1 ⪰ 𝓕2 and 𝒢1 ⪰ 𝒢2, then 𝓕1 ⊕ 𝒢1 ⪰ 𝓕2 ⊕ 𝒢2;

    (ii) If in addition 𝓕1 ≻ 𝓕2, then 𝓕1 ⊕ 𝒢1 ≻ 𝓕2 ⊕ 𝒢2;

  • B3 There is a reward r such that Cr(𝓔) is not equivalent to C 0(𝓔);

  • C1a For any bets 𝓕, 𝒢, there is some positive integer n such that n𝓕 ≻ 𝒢 (where n𝓕 just means 𝓕 added to itself n times);

  • C1b For any bets 𝓕, 𝒢 such that 𝓕 ≺ C0(𝓔) ≺ 𝒢, there is some positive integer n such that n𝓕 ⊕ 𝒢 ≺ C0(𝓔);

  • C1c For any bets 𝓕, 𝒢 such that 𝓕 ≻ C0(𝓔) ≻ 𝒢, there is some positive integer n such that n𝓕 ⊕ 𝒢 ≻ C0(𝓔);

  • C2a For any bets 𝓕, 𝒢 such that 𝓕 ≻ 𝒢 ≻ C0(𝓔), there is some positive integer n such that n𝓕 ≻ (n + 1)𝒢;

  • C2b For any bets 𝓕, 𝒢 such that 𝓕 ≺ 𝒢 ≺ C0(𝓔), there is some positive integer n such that n𝓕 ≺ (n + 1)𝒢.

then ≻ is uniquely represented by some probability measure Pr.

Proof. For simplicity, we start by constructing the utility function 𝒰. By B1, the ordering ≻ determines a well-defined ordering on some subset of the set of rewards; by A1, that subset is in fact the whole set of rewards. We write that ordering also as ≻.

By B3, there is some reward r such that either r ≻ 0 or 0 ≻ r; we will assume the former. (If actually all rewards are less desirable than 0, the proof can just be repeated with ≻ replaced by ≺.) By definition set 𝒰 (0) = 0 and 𝒰(r) = 1.

Now for any reward s ≻ 0, we define 𝒰(s) by

𝒰(s) = lub{m/n : ns ≻ mr}.

(That some such upper bound exists is a consequence of C1a and the Bolzano–Weierstrass theorem.) It is straightforward to see that this is additive: 𝒰(s ⊕ t) = 𝒰(s) + 𝒰(t). More importantly, it represents ≻: 𝒰(s) > 𝒰(t) iff s ≻ t. To see this, first suppose 𝒰(s) > 𝒰(t). Then there must be some m, n such that ns ≻ mr but mr ⪰ nt, so by transitivity, ns ≻ nt. But if t ⪰ s, by repeated application of B2 it would follow that nt ⪰ ns. It follows that s ≻ t.

Conversely, suppose that s ≻ t. To begin with, assume that t ≻ 0. By C2a, there is an integer N such that Ns ≻ (N + 1)t. Now let {mi} and {ni} be sequences of positive integers such that {mi/ni} is an increasing sequence and limi mi/ni = 𝒰(t). Then by definition, for each i, nit ≻ mir, and by B2 again, (N + 1)nit ≻ (N + 1)mir. By transitivity and B2 again, we get Nnis ≻ (N + 1)mir. So 𝒰(s) ≥ (N + 1)mi/Nni for all i, and so 𝒰(s) ≥ ((N + 1)/N)𝒰(t); since 𝒰(t) > 0, it follows that 𝒰(s) > 𝒰(t).

If instead t ≺ s ≺ 0, the same argument can be run with ≺ and ≻ interchanged, using C2b instead of C2a. Finally, if s ≻ 0 ≻ t, all we need to show is that if s ≻ 0 then 𝒰(s) > 0 and that if t ≺ 0 then 𝒰(t) < 0. To show the former, we use C1a to find some N such that Ns ≻ r; it follows that 𝒰(s) ≥ 1/N > 0. To show the latter, we apply a similar argument using C1b.
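The lub construction above can be sketched numerically. A minimal toy model, assuming a hidden real-valued utility that generates the qualitative ordering "ns ≻ mr"; the hidden value 0.37 and the cutoff max_n are arbitrary choices:

```python
from fractions import Fraction

# Toy model: "n copies of s ≻ m copies of r" is generated by a hidden real
# value for s, on a scale where the hidden value of r is 1. The construction
# should recover (approximately) the hidden value of s.
def prefers(n, s_val, m, r_val):
    """n*s ≻ m*r in the toy ordering."""
    return n * s_val > m * r_val

def approx_utility(s_val, r_val=1.0, max_n=200):
    """Approximate lub{m/n : n*s ≻ m*r} by scanning denominators up to max_n."""
    best = Fraction(0)
    for n in range(1, max_n + 1):
        m = 0
        # Find the largest m with n*s ≻ m*r, staying faithful to the definition.
        while prefers(n, s_val, m + 1, r_val):
            m += 1
        best = max(best, Fraction(m, n))
    return float(best)

u = approx_utility(0.37)   # hidden utility 0.37 on a scale where u(r) = 1
assert abs(u - 0.37) < 0.01
```

The approximation approaches the least upper bound from below as max_n grows, mirroring the increasing sequences {mi/ni} used in the proof.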

In fact, we can extend 𝒰 to non-constant bets very straightforwardly: the definition of 𝒰 works as well for non-constant as for constant bets. That is, we define

EU(𝓕) = lub{m/n : n𝓕 ≻ mCr(𝓔)}

for some arbitrary 𝓔. If we define Pr(E|𝓔) = EU(Cr,0(E|𝓔)), it follows via A2 and B2 that EU(𝓕) is indeed the expected utility of 𝓕 with respect to this probability measure and to 𝒰.

This proves the existence of a representation. To prove uniqueness, let Pr be the probability measure already constructed, and suppose that Pr′ is another probability measure which represents ≻. By definition there is some utility function 𝒱 such that the expected utility generated by Pr′ and 𝒱 represents ≻ via the MEU rule. 𝒱 must be an increasing function f(𝒰) of the already-constructed utility function 𝒰.

Let 𝓔 be a chance setup and E be an event in 𝒪𝓔 with Pr(E|𝓔) not equal to 0, 1 or 1/2 (if there are no such events, uniqueness is trivial), and for the moment suppose that Pr(E) is rational, Pr(E) = m/n say. This means that Cnr,0(E|𝓔) ~ Cmr(𝓔), and so Pr′(E) = f(m)/f(n). Since km/kn = m/n, for consistency we must have f(km) = f(k)f(m) and so (iterating) f(m^M) = f(m)^M. Putting l = n − m and considering the complement ¬E of E in 𝒪𝓔, we similarly get that f(l^L) = f(l)^L.

Now, f is an increasing function, so we require that f(k)^K > f(l)^L iff k^K > l^L; taking logarithms of both sides, the requirement is that K ln f(k) > L ln f(l) iff K ln k > L ln l. Since this is true for arbitrary K and L, by constructing a sequence we can show that ln f(k)/ln f(l) = ln k/ln l. If we write f(k) = k^λ and f(l) = l^μ for some λ, μ, it follows immediately that μ = λ. Hence Pr′(E) = Pr(E)^λ and Pr′(¬E) = Pr(¬E)^λ. Since we require Pr′(E) + Pr′(¬E) = 1, this forces λ = 1, i.e. Pr′(E) = Pr(E). The extension to irrational probabilities is a straightforward if tedious argument from limiting sequences of rational numbers.
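The final normalization step can be checked numerically: with Pr′ = Pr^λ, additivity of Pr′ on a two-cell partition forces λ = 1, since p^λ + (1 − p)^λ is strictly decreasing in λ for 0 < p < 1 and equals 1 exactly at λ = 1. A small sketch (the value p = 0.3 is an arbitrary choice):

```python
def g(lam, p=0.3):
    # Total probability the candidate measure Pr' = Pr**lam assigns to {E, ¬E}
    return p**lam + (1 - p)**lam

# g(1) = 1 exactly, and g is strictly decreasing in lam, so normalization
# Pr'(E) + Pr'(¬E) = 1 pins down lam = 1.
assert abs(g(1.0) - 1.0) < 1e-12
assert g(0.5) > 1.0 and g(2.0) < 1.0
samples = [g(0.1 * k) for k in range(1, 31)]
assert all(a > b for a, b in zip(samples, samples[1:]))  # strictly decreasing
```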

C.2 Proof of the Synchronic Likelihood and Synchronic Preference Theorems

Theorem 2 (Synchronic Likelihood Theorem). Suppose that

  • SL1 O𝓔|𝓔 ≻ ∅|𝓔.

  • SL2 O𝓔|𝓔 ⪰ E|𝓔 ⪰ ∅|𝓔 for any 𝓔 and any E ∈ 𝓔.

  • SL3 If E, F, G ⊂ O𝓔 and E ∧ G = F ∧ G = ∅, then E|𝓔 ≻ F|𝓔 iff E ∨ G|𝓔 ≻ F ∨ G|𝓔.

  • SL4 ≻ is partitionable.

Then ≻ is uniquely almost represented.

Proof. Our first step is to show that, if SL3 holds, then a stronger composition rule also holds.

Lemma 1 (Additivity Lemma). Suppose that ≻ is an ordering for an event space E satisfying

If E ∧ G = F ∧ G = ∅, then E ⪰ F iff E ∨ G ⪰ F ∨ G,

and that A, B, E, F satisfy A ∧ B = E ∧ F = ∅. Then if A ⪰ E and B ⪰ F, A ∨ B ⪰ E ∨ F. Furthermore, the result continues to hold if ⪰ is replaced by ≻.

This extension is intuitively plausible, but rather fiddly to prove. Define C and G as the complements of A ∨ B and E ∨ F respectively; then elements like A ∧ F and C ∧ E form a ninefold partition of 𝓔. For simplicity let us write elements of the partition as (e.g.) AF. The proof now proceeds by repeated use of the composition rule displayed above.

Start with A and E: A ⪰ E, so AE ∨ AF ∨ AG ⪰ AE ∨ BE ∨ CE, so AE ∨ AF ∨ AG ∨ BG ⪰ AE ∨ BE ∨ CE ∨ BG.

Now do B and F: B ⪰ F, so BE ∨ BF ∨ BG ⪰ AF ∨ BF ∨ CF, so BE ∨ BG ⪰ AF ∨ CF, so BE ∨ BG ∨ AE ∨ CE ⪰ AF ∨ CF ∨ AE ∨ CE.

Combining these: AE ∨ AF ∨ AG ∨ BG ⪰ AF ∨ CF ∨ AE ∨ CE, so AE ∨ AF ∨ AG ∨ BG ∨ BE ∨ BF ⪰ AF ∨ CF ∨ AE ∨ CE ∨ BE ∨ BF.

Rearranging: A(E ∨ F ∨ G) ∨ B(E ∨ F ∨ G) ⪰ (A ∨ B ∨ C)E ∨ (A ∨ B ∨ C)F; that is, A ∨ B ⪰ E ∨ F, and the lemma is proved.

(Since this proof may appear incomprehensible at first sight, I recommend the following way of visualizing it. We represent the event space as a 3-by-3 grid, with A–C labelling the rows and E–G the columns; we represent events by shading in squares (so that AE ∨ CG is represented by a grid with the top left and bottom right squares shaded in). The equivalences between events are now equivalences between squares, and the application of additivity now comes down to the following rule: if two grids are identical, they remain so if we shade in a square which was unshaded in both grids, or erase a square which was shaded in in both grids. The stages of the proof can now be represented graphically as a series of shadings or unshadings of pairs of grids.)
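The grid picture can also be checked mechanically. A minimal sketch, modelling the likelihood order by an additive weight on the nine cells; the particular weights are an arbitrary choice satisfying A ⪰ E and B ⪰ F:

```python
from itertools import product

# The 3-by-3 grid of the additivity lemma: rows A, B, C and columns E, F, G.
# An additive weight on the nine cells stands in for the likelihood order.
cells = {rc: w for rc, w in zip(product("ABC", "EFG"),
                                [4, 1, 2, 1, 3, 2, 1, 1, 5])}

def weight(event):               # an event is a set of (row, column) cells
    return sum(cells[c] for c in event)

def row(r): return {(r, c) for c in "EFG"}   # A, B, C as events
def col(c): return {(r, c) for r in "ABC"}   # E, F, G as events

A, B = row("A"), row("B")
E, F = col("E"), col("F")
assert weight(A) >= weight(E) and weight(B) >= weight(F)  # A ⪰ E, B ⪰ F

# Each use of the composition rule shades or erases the same cells on both
# sides, preserving the comparison; the conclusion is A ∨ B ⪰ E ∨ F.
assert weight(A | B) >= weight(E | F)
```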

We begin the main proof by proving that if F1 ∨ … ∨ FN = O𝓕 and G1 ∨ … ∨ GN = O𝒢 are partitions satisfying Fi|𝓕 ~ Fj|𝓕 and Gi|𝒢 ~ Gj|𝒢, then Fi|𝓕 ~ Gj|𝒢. We do so via the ‘preferred’ N-element partition E1 ∨ … ∨ EN = O𝓔N guaranteed by partitionability. The definition of partitionability tells us that there exists an event space 𝓔∗ and an N²-element partition Eij of O𝓔∗, such that

  • Eij|𝓔∗ ~ Ei′j′|𝓔∗.

  • Ei∗ = Ei1 ∨ … ∨ EiN satisfies Ei∗|𝓔∗ ~ Ei|𝓔N.

  • Fj∗ = E1j ∨ … ∨ ENj satisfies Fj∗|𝓔∗ ~ Fj|𝓕.

Applying the additivity lemma repeatedly tells us that, if Ei∗|𝓔∗ ≻ Fj∗|𝓔∗, then O𝓔∗|𝓔∗ ≻ O𝓔∗|𝓔∗, which is obviously contradictory. So the result follows.

Given this result, it follows in particular that

E|𝓔 ≻ E1 ∨ ⋯ ∨ EM|𝓔N iff E|𝓔 ≻ E1 ∨ ⋯ ∨ EkM|𝓔kN.

So for any rational number α = M/N, we can consistently write E|𝓔 ≻ α to indicate that E|𝓔 ≻ E1 ∨ … ∨ EM|𝓔N.

We now define the probability function as follows:

Pr(E) = lub{α : E ≻ α}.

Suppose E ∧ F = ∅. If E ≻ M/N and F ≻ M′/N′, by the additivity lemma E ∨ F ≻ (M/N) + (M′/N′). So Pr(E ∨ F) ≥ Pr(E) + Pr(F); running the same argument for ≺ shows that Pr(E ∨ F) ≤ Pr(E) + Pr(F), so Pr is additive. Clearly, Pr(O𝓔) = 1.

Suppose Pr(E) > Pr(F); then there is some rational number M/N such that E ≻ M/N ≻ F; hence, E ≻ F. So Pr almost represents ≻.
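The least-upper-bound definition of Pr, and its approximate additivity on disjoint events, can be illustrated numerically. A toy model, assuming a hidden measure that generates the comparisons against equiprobable partitions; the values 1/3 and 1/4 and the bound max_N are arbitrary:

```python
from fractions import Fraction

# Toy model of Pr(E) = lub{α : E ≻ α}: "E ≻ M/N" compares E's hidden measure
# with M cells of an N-cell equiprobable partition.
def recover(mu_E, max_N=64):
    best = Fraction(0)
    for N in range(1, max_N + 1):
        M = 0
        while M < N and mu_E > Fraction(M + 1, N):   # does E ≻ (M+1)/N hold?
            M += 1
        best = max(best, Fraction(M, N))
    return best

pE, pF = recover(Fraction(1, 3)), recover(Fraction(1, 4))
pEF = recover(Fraction(1, 3) + Fraction(1, 4))       # E ∨ F for disjoint E, F
# Recovered values sit just below the hidden ones and are nearly additive,
# approaching Pr(E ∨ F) = Pr(E) + Pr(F) as the partitions get finer.
assert pE <= Fraction(1, 3) and pF <= Fraction(1, 4)
assert abs(float(pE + pF - pEF)) < 0.02
```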

Theorem 3 (Synchronic Preference Theorem). Suppose ≻ is a total ordering on bets for some synchronic decision problem E, satisfying:

  • SP1 If f, g are bets on the same chance setup 𝓔 and E1, … En is a partition of O𝓔 such that f−1(Ei) and g−1(Ei) are single-valued (such a partition can always be found), then

    1. If f−1(Ei)|𝓔 ⪰ g−1(Ei)|𝓔 for all non-null Ei, then f ⪰ g.

    2. If in addition f−1(Ei)|𝓔 ≻ g−1(Ei)|𝓔 for at least one non-null Ei, then f ≻ g.

      (This is sometimes called the dominance principle.)

  • SP2 ≻ defines a well-defined ordering on the set of rewards: that is, Cr(𝓔) ∼ Cr(𝓕).

  • SP3 ≻ defines a well-defined ordering ≻E on the restriction of bets to any event E: that is, it satisfies the sure thing principle.

  • SP4 ≻ defines a well-defined likelihood ordering: that is, it satisfies probabilism.

  • SP5 The likelihood order determined by ≻ is partitionable.

Then the preference ordering is uniquely almost represented.

Proof. Our first step is to prove that (at least for finite bets on a fixed chance setup) a bet is completely characterized by the probabilities it assigns to each reward. This we do via SP3. For suppose E and F are two events in 𝒪𝓔 with equal probability, that f is a reward-valued partition such that f −1(E) = r and f −1(F) = s, and that g is obtained from f by swapping the rewards on E and F.

Now, define f′ and g′ by f′(r) = E, f′(s) = 𝒪 𝓔 − E and g′ (r) = F, g′(s) = 𝒪𝓔 − F. By SP3, if 〈f,𝓔〉 ⪰ 〈g,𝓔〉, then also 〈f ′,𝓔〉 ⪰ 〈g′,𝓔〉. But this contradicts the definition of probability.

Since any two finite bets on 𝓔 which assign the same probability to each reward can be obtained from one another by finitely many permutations, our first result follows.

The remainder of the proof will be easier to follow if we assume that for any finite set of positive real numbers p1, …, pn summing to 1, 𝓔 can be partitioned into n sets E1, …, En with Pr(Ei|𝓔) = pi. This is not in fact guaranteed to be true, but the idealisation is harmless: the partition can be simulated for p1, …, pn rational by using the partitioning axiom SP5, and if the numbers are irrational we can approximate them arbitrarily well with rational numbers (this is one place where looking only for almost representations makes life simpler). I leave the technical details to the reader.

So: let r, s, t be any three rewards with r ≻ t ≻ s. For any event E ∈ 𝓔, and any α < 1, I define αE to be some arbitrary subset of E with Pr(αE) = αPr(E); with respect to r and s, I define the E-utility of t as

𝒰E(t) = lub{α : Ct(E) ≻E Cr,s(αE)}.

(A moment’s reflection will show that in the special case where E = O𝓔, this is a formalization of our informal definition of utility, on a scale where 𝒰(r) = 1 and 𝒰(s) = 0.)

By our previous result, if E and F are equiprobable then 𝒰E(t) = 𝒰F(t). If E and F are not equiprobable, but Pr(E)/Pr(F) is a rational number (M/N, say), we can decompose E into M sets and F into N sets such that all M + N sets are equiprobable; it follows again that 𝒰E(t) = 𝒰F(t). And if Pr(E)/Pr(F) is irrational, a standard least-upper-bound argument establishes the same thing again. We conclude that for given 𝓔, all the E-utilities are the same, and we can without ambiguity just refer to the 𝓔-utility of t, 𝒰𝓔(t). By definition,

Ct(𝓔) ≻ Cr,s(α𝒪𝓔|𝓔)

whenever α < 𝒰𝓔(t), and

Ct(𝓔) ≺ Cr,s(α𝒪𝓔|𝓔)

whenever α > 𝒰𝓔(t). Now let 𝓔 and 𝓕 be two distinct chance setups, and suppose that 𝒰𝓔(t) > 𝒰𝓕(t). Then there must exist α such that

Ct(𝓔) ≻ Cr,s(α𝒪𝓔|𝓔)

and

Cr,s(α𝒪𝓕|𝓕) ≻ Ct(𝓕).

But by the definition of probability Cr,s(α𝒪𝓔|𝓔) ~ Cr,s(α𝒪𝓕|𝓕), and by SP2 Ct(𝓔) ~ Ct(𝓕), so this is impossible. It follows that the 𝓔-utility is actually independent of 𝓔 as well, and that we can talk unproblematically just of the utility of t (with respect to r and s).

We now consider any comparable set Ƭ of rewards all of which lie between r and s, and restrict our attention to finite bets with outcomes in Ƭ. We have seen that for preference purposes any such bet can be characterized by the probabilities assigned to each reward; we can then represent a bet unambiguously as a real-valued function F on Ƭ satisfying

Σt∈Ƭ F(t) = 1.

Our strategy is now to replace every reward except r and s themselves with a mixture of r and s. In particular, suppose that φ+ and φ− are functions on Ƭ satisfying 0 ≤ φ±(t) ≤ 1, φ+(t) > 𝒰(t), and φ−(t) < 𝒰(t). If for some particular t, the bet gives reward t on outcome E, then it follows from the definition of E-utility that partitioning E into α(t)E and (1 − α(t))E, and changing the bet to give reward r on α(t)E and s on (1 − α(t))E, will produce a bet preferred to the original one if α(t) > 𝒰(t) and vice versa if α(t) < 𝒰(t).

If we make such substitutions sequentially for all outcomes of the bet, we obtain a bet whose only outcomes are r and s, and which gives r with probability

Σt∈Ƭ α(t)F(t).

If α(t) ≤ 𝒰(t) for all t, then this bet is definitely not preferred to the original bet; if α(t) > 𝒰(t) for all t, then it definitely is preferred. It follows via transitivity that if we define

EU(F) = Σt∈Ƭ 𝒰(t)F(t)

then F is preferred to G whenever EU(F) > EU(G). But this, of course, is the result we seek.
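The substitution argument can be sketched directly: every intermediate reward is traded for a mixture of r and s, and the resulting r/s lottery has the same expected utility as the original bet. A minimal illustration with invented utilities and bets, on a scale where 𝒰(r) = 1 and 𝒰(s) = 0:

```python
# A bet is a map from rewards to probabilities; utilities are invented.
utility = {"r": 1.0, "t1": 0.7, "t2": 0.4, "s": 0.0}

def expected_utility(bet):
    return sum(utility[t] * p for t, p in bet.items())

def reduce_to_rs(bet):
    """Replace each reward t by the mixture u(t)·r + (1 - u(t))·s."""
    p_r = sum(utility[t] * p for t, p in bet.items())
    return {"r": p_r, "s": 1.0 - p_r}

F = {"t1": 0.5, "t2": 0.5}
G = {"r": 0.3, "t2": 0.2, "s": 0.5}
# The reduced lottery has the same expected utility as the original bet ...
assert abs(expected_utility(reduce_to_rs(F)) - expected_utility(F)) < 1e-12
# ... so preference between bets goes by expected utility.
assert expected_utility(F) > expected_utility(G)
```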

C.3 Proof of the Diachronic Representation Theorem

Theorem 4 (Diachronic Representation Theorem). Suppose that

  • Ƥ is a diachronic decision problem satisfying reward availability;

  • ≻ is a solution to Ƥ satisfying macrostate indifference and diachronic consistency;

  • ≻ is diachronically partitionable: for any available history h, any pair of rewards r, s, and any N, there is an act AN available at end(h), and an N-element partition E1N, … ENN of end(h), such that

    1. 〈AN, {E1N, … ENN}〉 is independent of any act available at end(h).

    2. ANr,s(Ei) ~h ANr,s(Ej).

  • The preference order ≻h at h is probabilistic: that is, the likelihood order over events that it determines is independent of the bets used to define that order.

Then the preference order ≻h at h satisfies axioms SP1–SP4 of Theorem 3, and hence is uniquely almost representable; furthermore, it is Bayesian (that is, uniquely almost represented by a Bayesian probability measure on any comparable set of acts).

Proof. To show that ≻h is partitionable, fix N, r and s, and (in the terminology used in section B.6 to define synchronic partitioning) take 𝓔N = AN. Given any other act B, we know there exist acts A′, B′ such that OA′·B = OB′·AN and (A′ · B)f = (B′ · AN)f. Take 𝓔∗ = OA′·B.

We now need to construct embeddings of OB and OAN into 𝓔∗, and we do so as follows:

  • If E ⊂ OB, ν(E) = OA′|E.

  • If F ⊂ OAN, ν′(F) = OB′|F.

SP3 now follows from the definition of independence. SP1–SP2 have been shown to follow from macrostate indifference and diachronic consistency; SP4 is an immediate consequence of probabilism.

It follows that ≻h is uniquely almost representable, but it remains to be shown that it is Bayesian. To show this, first notice that if A is an act available at end(h) and f is a reward partition on its outcome space taking values in some set of comparable rewards, by our earlier constructions we have

EUh(Af) = Σr Pr(f(r)|A, h) × V(r).

Now let B be available at OA and E ⊂ OA. Define B1 = B|E and B2 = B|OA−E, and let f, f′ be reward partitions on OB which agree on restriction to OB2. Then we have

EUh[(B · A)f] = Σr Pr(f(r) ∧ OB1|B · A, h) × V(r) + Σr Pr(f(r) ∧ OB2|B · A, h) × V(r)

and

EUh[(B · A)f′] = Σr Pr(f′(r) ∧ OB1|B · A, h) × V(r) + Σr Pr(f(r) ∧ OB2|B · A, h) × V(r).

Subtracting out the common part and dividing by a common factor, we have that (B · A)f ≻h (B · A)f′ iff

Σr [Pr(f(r) ∧ OB1|B · A, h)/Pr(OB1|B · A, h)] × V(r) > Σr [Pr(f′(r) ∧ OB1|B · A, h)/Pr(OB1|B · A, h)] × V(r).

But by diachronic consistency, (B · A)f ≻h (B · A)f′ iff (B|E)f ≻〈E,A〉·h (B|E)f′. By assumption Pr uniquely almost represents ≻〈E,A〉·h; that is, it is the only probability measure such that (B|E)f ≻〈E,A〉·h (B|E)f′ iff

Σr Pr(f(r)|B|E, 〈E, A〉 · h) × V(r) > Σr Pr(f′(r)|B|E, 〈E, A〉 · h) × V(r).

It follows (replacing B1 with B|E) that

Pr(F|B|E, 〈E, A〉 · h) = Pr(F|B · A, h)/Pr(OB|E|B · A, h).

That Pr(OB |E|B · A, h) = Pr(E|A, h) follows again from the construction of Pr (and thus, indirectly, from macrostate indifference).

C.4 Statement and Proof of the Everettian Inference Theorem

I take as a starting point the definition of a quantum decision problem given in Chapter 5, of a state-dependent solution to that problem as specified in the same chapter, and of a (non-state-dependent) solution to the problem as specified in Chapter 6. Note also that a quantum decision problem is a plain decision problem in the sense of Appendix B, and the two definitions of solution coincide. In particular, the definitions in Appendix B of ordering, diachronic consistency, macrostate indifference and probabilism apply directly to solutions of quantum decision problems.

I now state formally some additional constraints on quantum decision problems and their solutions. Recall that for any act U available at E, and any partition 𝓕 of 𝓔, the set

{OF(U) : F ∈ 𝓕}

is a positive operator valued measure (POVM) on E, which I call the 𝓕 -POVM of U, and that a POVM is available at E iff for some partition 𝓕, some act available at E has that POVM as its 𝓕 -POVM.

I need only one additional constraint on a decision problem.

Rich structuring. Suppose that 𝓕 is an available set of macrostates and that for each F ∈ 𝓕, XF is a (finite or countable) set of positive operators on F satisfying ΣX∈XF X ≤ 1F. Then there is a compatible act function 𝒰 for 𝓕 such that for each F ∈ 𝓕, the characteristic POVM of 𝒰(F) contains all the operators in XF.

And I need the following additional constraints on solutions to a problem.

Solution continuity. If h is an available history and U, U′ are available at end(h), and U ≻h U′, then in the space of unitary maps from end(h) into ℋ there are neighbourhoods (in norm topology) 𝒩, 𝒩′ of U, U′ respectively such that any act in 𝒩 available at end(h) is preferred (at h) to any act in 𝒩′ available at end(h).

Noncontextuality. For any available history h and any U, V available at end(h), if U and V have the same characteristic POVM then U ~h V.

I now state and prove, in succession, the noncontextual inference theorem and the Everettian inference theorem.

Noncontextuality Theorem. Suppose that:

  • Ƥ is a quantum decision problem satisfying reward availability, problem continuity, branching availability, and rich structuring.

  • ≻h is a solution to Ƥ satisfying macrostate indifference, ordering, diachronic consistency, solution continuity, and probabilism.

Then ≻h is represented by some density operator iff it is noncontextual.

Proof. If ≻h is represented by some density operator, trivially it is noncontextual and probabilistic; the trick is proving the converse. I prove it in three steps. First I will establish that ≻h is uniquely almost represented by some probability measure. Secondly I will prove that this probability function can be represented by a density operator. Finally, I prove that ≻h is represented, not just almost represented.

To prove the first step, fix an available history h and a partition 𝓕 = F1, …, FN of 𝓔; by branching availability, there is an 𝓕-POVM available at end(h) whose element associated with Fi is (1/N)1end(h). Let U be the act generating this POVM: then bets (in the technical sense of Appendix B) on Fi ∧ 𝒪U and Fj ∧ 𝒪U generate the same ℜ-POVM and so are equally likely at h (again, in the sense of Appendix B).

Now, for any act V available at end(h), let U′ be an act available at 𝒪V whose 𝓕-partition at 〈𝒪V, V〉·h associates (1/N)1𝒪V to each element of 𝓕. It should now be apparent that U is independent (in the sense of Appendix B) of V, and so ≻h is diachronically partitionable. As such, it follows from the diachronic representation theorem of Appendix B that ≻h is uniquely almost represented.

The second step is essentially a variant on the POVM version of Gleason’s theorem proved by Caves et al. (2004). Specialize to the trivial history ∅ and the starting-point macrostate M, define

X = {U†ΠEU : U is available at M, E ⊂ OU}

and define a function Λ on X by

Λ(U†ΠEU) = Pr(E|U, h0).

(That this is well-defined follows from noncontextuality.) By the rich structuring axiom and the additivity of Pr, Λ satisfies Λ(X + Y) = Λ(X) + Λ(Y) whenever X + Y ≤ 1M, and hence Λ((M/N)X) = (M/N)Λ(X). Diachronic consistency entails that if p > q then Λ(pX) > Λ(qX), so it follows that Λ is linear on X, and so extends uniquely to the span of X in the real linear space of bounded self-adjoint operators on M. If this span is not the whole space, we extend it to the whole space arbitrarily; we can then extend it uniquely to an antilinear functional on the whole space of bounded operators on M. Since that space is a Hilbert space with inner product 〈X, Y〉 = Tr(X†Y), there is a unique operator ρ such that Tr(ρX) = Λ(X).
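The last move, recovering ρ from the linear functional via the Hilbert–Schmidt inner product, can be illustrated in dimension 2. A sketch in which a known state rho0 stands in for the functional the preferences determine; the normalized Pauli matrices are used as an orthonormal basis of the Hermitian 2×2 matrices:

```python
import numpy as np

# Orthonormal Hermitian basis under <X, Y> = Tr(X† Y): Tr(Bi† Bj) = δij.
I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
basis = [M / np.sqrt(2) for M in (I2, sx, sy, sz)]

rho0 = np.array([[0.7, 0.2 - 0.1j],
                 [0.2 + 0.1j, 0.3]])   # an arbitrary valid density matrix

def Lam(X):
    # Stand-in for the functional the proof constructs from preferences.
    return np.trace(rho0 @ X).real

# Riesz-style representation: rho = Σ_i Λ(B_i) B_i reproduces the functional.
rho = sum(Lam(B) * B for B in basis)
assert np.allclose(rho, rho0)
```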

Finally, we define ρh recursively: if h is an available history, U is available at end(h), and hE is shorthand for 〈E, U〉 · h, then

ρ∅ = ρ

and

ρhE = ΠEUρhU†ΠE / Tr(ΠEUρhU†ΠE).

If Tr(U†ΠEUρh) = Pr(E|U, h), then for any V available at OU and any F ⊂ 𝒪V we have

Tr(V|E†ΠFV|E ρhE) = Tr(V|E†ΠFV|E ΠEUρhU†ΠE) / Tr(ΠEUρhU†ΠE) = Tr(U†ΠEV|E†ΠFV|EUρh) / Tr(ΠEUρhU†ΠE)

= Tr((VU)†ΠFVUρh) / Tr(ΠEUρhU†ΠE) = Pr(F|VU, h)/Pr(E|U, h) = Pr(F|V|E, hE)

as required.
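This recursive update and the resulting chain rule Pr(F|VU, h) = Pr(E|U, h) · Pr(F|V|E, hE) can be checked numerically. A sketch whose dimension, random seed, and choice of E and F are arbitrary; F is taken inside V's image of E, so that branches from outside E cannot reach it:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unitary(n):
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)       # QR of a complex Gaussian gives a unitary
    return q

dim = 4
a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
rho = a @ a.conj().T
rho /= np.trace(rho)             # a generic density operator rho_h

U, V = random_unitary(dim), random_unitary(dim)
P_E = np.diag([1.0, 1.0, 0.0, 0.0]).astype(complex)   # projector for event E

post = P_E @ U @ rho @ U.conj().T @ P_E
pr_E = np.trace(post).real       # Pr(E|U, h)
rho_hE = post / pr_E             # the recursively defined rho_hE

v0 = V[:, 0]                     # V e_0, with e_0 lying inside E
P_F = np.outer(v0, v0.conj())    # projector for event F

pr_F_given_VU = np.trace(P_F @ V @ U @ rho @ U.conj().T @ V.conj().T).real
pr_F_cond = np.trace(P_F @ V @ rho_hE @ V.conj().T).real
assert np.isclose(pr_F_given_VU, pr_E * pr_F_cond)    # the chain rule
```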

The third step follows from problem and solution continuity; I leave it to the reader.

Everettian Inference Theorem. Suppose that:

  • Ƥ is a quantum decision problem satisfying reward availability, problem continuity, branching availability and rich structuring;

  • ≻h is a solution to Ƥ satisfying ordering, diachronic consistency and solution continuity;

  • ≻ψ is a state-dependent solution to Ƥ, satisfying ordering, diachronic consistency, solution continuity, branching indifference, macrostate indifference, and state supervenience, and defined on an act-closed set of states which contains all states in the starting point of ≻h;

  • ≻ψ and ≻h are compatible.

Then ≻h is represented by a density operator.

Proof. First, note that by the Born Rule theorem, there is some quasi-unique utility function such that ≻ψ is given by the MEU principle with respect to that utility function and the Born-rule probabilities. Noncontextuality and macrostate indifference of ≻h follow immediately.

Next, take two rewards r and s with r ≻ s (if there are not two such rewards, the conclusion is trivially true) and divide ℜ arbitrarily into two subsets ℜr and ℜs. If we define r ∗ and s ∗ as the direct sums of ℜr and ℜs respectively, there will be a new decision problem Ƥ ∗ with the same acts, events and macrostates as Ƥ but with rewards r ∗, s ∗.

For any event E there will be an act VE available at E such that if t ∈ ℜr, VE maps t ∧ E into r and similarly mutatis mutandis for s. We can define a solution ≻ h * to the decision problem Ƥ ∗ as follows:

U ≻h∗ U′ iff V𝒪UU ≻h V𝒪U′U′.

Since there are only two rewards in Ƥ∗, ≻h∗ is trivially probabilistic; by the noncontextuality theorem, it is represented by some density operator ρ. Clearly, ρ also represents the original solution so long as we restrict our attention to acts whose outcomes lie in r ∨ s (for any such act U, V𝒪U = 1𝒪U).

Now let 𝒰 be a utility function for the state-dependent solution satisfying 𝒰(r) = 1 and 𝒰(s) = 0 (there is exactly one such function, of course). For any available h and any reward t satisfying r ⪰ t ⪰ s, suppose that Ut and Ur,s are acts available at end(h) whose ℜ-POVMs are defined as follows: the ℜ-POVM of Ut assigns 1end(h) to t and 0 to all other rewards; the ℜ-POVM of Ur,s assigns 𝒰(t)1end(h) to r, (1 − 𝒰(t))1end(h) to s, and 0 to all other rewards. For any ψ ∈ end(h), Ut ~ψ Ur,s; so by compatibility, Ut ~h Ur,s.

Now for any available h, and any act U available at end(h) whose possible rewards all lie between r and s, I write the ℜ-POVM of U as a function from rewards between r and s to positive operators: Ot(U) is the positive operator assigned to t. By the above result, if U′ is another such act available at end(h) with

Or(U′) = Σt∈ℜ: r⪰t⪰s 𝒰(t)Ot(U),
Os(U′) = Σt∈ℜ: r⪰t⪰s (1 − 𝒰(t))Ot(U),
Ot(U′) = 0 otherwise,

then by the above result and diachronic consistency, U ~h U′; by branching availability, such an act is available. Since EUρh(U) = EUρh(U′), it follows that ρ represents ≻h for any acts whose outcomes lie between r and s. Since r and s were arbitrary, the result follows.

C.5 Statement and Proof of the Classical Inference Theorems

I define a classical decision problem as specified by:

  • A topological space ℋ. Given a countable set 𝒮 of subsets of ℋ, I write ∨𝒮 ≡ ∪𝒮 and ∧𝒮 ≡ ∩𝒮; given subsets E and F, I define E ∨ F = ∨{E, F} and likewise for ∧.

  • A complete Boolean algebra 𝓔 of Borel subsets of ℋ, the event space. (So 𝓔 contains ℋ and is closed under ∨, ∧, and taking the complement.) I define a partition of an event E to be a countable set of mutually disjoint events whose disjunction is E.

  • For each E ∈ 𝓔, a set 𝒰E of continuous maps from E into ℋ, which we call the set of acts available at E. We write 𝒪U for the smallest event containing the range of the act U and require that the choice of available acts satisfies:

    1. Restriction: If E, F ∈ 𝓔 and F ⊂ E, then if U is available at E then the map U|F, defined by U|F(x) = U(x) whenever x ∈ F, is available at F.

    2. Composition: If U is available at E, and V is available at 𝒪U, then VU is available at E.

    3. Indolence: For any event E, if there are any acts available at E then the identity 1E is available at E. (More precisely, the embedding map of E into ℋ is available at E.)

    4. Continuation: If U is available at some E, then there is some act available at 𝒪U.

    5. Irreversibility: If U is available at E ∨ F and E ∧ F = ∅, then 𝒪U|E ∧ 𝒪U|F = ∅.

    6. Extension: If F ⊂ E and U is available at F, there is some act V available at E such that U = V|F.

  • A partition ℜ of ℋ by events (that is, a set of mutually disjoint elements of 𝓔 whose disjunction is ℋ), the set of rewards.

A classical decision problem is a diachronic decision problem: as such, many of the concepts of general diachronic decision theory (in particular, the concepts of solution and history) carry over to it.

If E is the starting point of a solution ≻h, and ρ is a [finite] Borel probability measure on E, then we define ρh as a measure on end(h) recursively by

ρ∅ = ρ

and

ρ〈E,U〉·h(X) = ρh(U−1(X ∧ 𝒪U)) / ρh(U−1(E ∧ 𝒪U))

whenever the right-hand side is defined (here 𝒪U is the range of U, as in the quantum case). And we say that ρ [almost] represents ≻h provided that the probability measure

Pr(E|U, h) = ρh(U−1(E ∧ 𝒪U))

[almost] represents ≻h.
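The classical recursion can be sketched on a finite toy example. The four-point state space and the acts U and V are invented for illustration; V keeps the images of E and its complement disjoint, as irreversibility requires:

```python
from fractions import Fraction as Fr

# A measure rho on four states, with deterministic acts as maps on states.
rho = {0: Fr(1, 2), 1: Fr(1, 4), 2: Fr(1, 8), 3: Fr(1, 8)}
U = {0: 0, 1: 0, 2: 1, 3: 2}     # act U; outcomes 0, 1 make up the event E
V = {0: "a", 1: "b", 2: "c"}     # act V, available on U's outcome space

def prob(measure, act, event):
    # Pr(E|U, h) = rho_h(U^{-1}(E ∧ O_U))
    return sum(p for x, p in measure.items() if act[x] in event)

def condition(measure, act, E):
    # rho_{<E,U>·h}, as a measure on act's outcomes inside E
    z = prob(measure, act, E)
    out = {}
    for x, p in measure.items():
        if act[x] in E:
            out[act[x]] = out.get(act[x], 0) + p / z
    return out

E = {0, 1}
rho_hE = condition(rho, U, E)
VU = {x: V[U[x]] for x in rho}   # the composed act VU
# Chain rule: Pr(F|VU, h) = Pr(E|U, h) · Pr(F|V, hE)
assert prob(rho, VU, {"b"}) == prob(rho, U, E) * prob(rho_hE, V, {"b"})
```

Exact rational arithmetic makes the chain-rule identity hold as an equality rather than up to rounding.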

For any act U available at E, and any partition 𝓕 of ℋ by events, I define the 𝓕 -partition generated by U to be a map f from elements of 𝓕 to subsets of E, defined by


f(F) = U−1(F ∧ 𝒪U).

(I will conflate f and the range of f where no confusion would be caused by this.) A subset of E is available at E iff it is a member of the characteristic partition of some act available at E.

I say that a classical decision problem is richly structured if, given any set X of disjoint subsets of an event all of which are available at the event, there is some act available at that event and some partition 𝓕 of ℋ such that all elements of X are elements of the 𝓕 -partition generated by the act. And I say that a solution to the decision problem is classically noncontextual if for any two acts U and V available at end(h), if U and V have the same ℜ-partition then U ~h V.

Finally, a classical decision problem is extension-friendly if, given any event E and any real function f of the available subsets of E which is additive on disjoint sets and satisfies f(∅) = 0, f is extendible, though not necessarily uniquely, to a Borel measure on E. (This will be satisfied if, for instance, the set of available acts is large enough that all Borel subsets of E are available at E, although of course this is a very strong idealization.)

We can now prove the

Classical noncontextual inference theorem. Suppose that:

  • Ƥ is a classical decision problem satisfying reward availability, rich structuring and extension-friendliness;

  • ≻h is a solution to Ƥ satisfying ordering, diachronic consistency, and macrostate indifference;

  • ≻h is diachronically partitionable;

  • ≻h is probabilistic;

  • ≻h is classically noncontextual.

Then ≻h is almost represented by some finite Borel measure.

Proof. By the diachronic representation theorem, there is a probability measure Pr on Ƥ which almost represents ≻h. By classical noncontextuality, Pr(E|U, h) = Pr(F|V, h) whenever U−1(E ∧ 𝒪U) = V−1(F ∧ 𝒪V).

Specialize to the starting-point event M of ≻h, and the trivial history h0. If E ⊂ M is available at M then there is some event F and some act U available at M such that U−1(F ∧ 𝒪U) = E; without contradiction we can define μ(E) = Pr(F|U, h0). By rich structuring, if E1, …, En are available at M, there is some single act U and some events F1, …, Fn such that Ei = U−1(Fi ∧ 𝒪U); hence,

Σi μ(Ei) = Σi Pr(Fi|U, h0) = Pr(∨iFi|U, h0) = μ(∨iEi);

so μ is additive on disjoint sets. By extension-friendliness, there is some Borel measure ρ on M which agrees with μ on available subsets of M.
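In a finite model the additivity of μ can be checked directly. The following Python sketch is purely illustrative (the atom set and weights are invented, not from the text): events are sets of atomic events, μ sums the weights induced by the conditional probabilities Pr(F|U, h0), and additivity on disjoint sets holds by construction.

```python
from fractions import Fraction

# Hypothetical finite model: each atomic event carries a weight,
# standing in for the induced values Pr(F | U, h0).
atoms = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}

def mu(event):
    """Additive set function: mu(E) = sum of atom weights in E."""
    return sum(atoms[x] for x in event)

E1, E2 = frozenset({"a"}), frozenset({"b", "c"})
assert E1.isdisjoint(E2)
# Additivity on disjoint sets: mu(E1 ∨ E2) = mu(E1) + mu(E2)
assert mu(E1 | E2) == mu(E1) + mu(E2)
assert mu(frozenset()) == 0
```

In the finite case μ is already a measure; extension-friendliness is only needed when the available subsets fall short of a full Borel σ-algebra.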

We now prove recursively that Pr(E|U, h) = ρh(U−1(E ∧ 𝒪U)). Suppose it true for a given h, that U is available at end(h), that E is a subevent of 𝒪U, and that W is available at E. Then there is some act V available at 𝒪U such that V|E = W, and we have (writing hE as shorthand for 〈E, U〉 · h)

ρhE((V|E)−1(F ∧ 𝒪V|E)) = ρh((VU)−1(F ∧ 𝒪VU)) / ρh(U−1(E ∧ 𝒪U)) = Pr(F|VU, h) / Pr(E|U, h) = Pr(F|V|E, hE).
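The step from ρh to ρhE is ordinary conditioning on the cell U−1(E ∧ 𝒪U). A minimal Python sketch (toy finite measure; the state names and weights are illustrative, not from the text):

```python
from fractions import Fraction

def condition(rho, cell):
    """Classical update: restrict the measure rho to `cell` and renormalize,
    mirroring rho_hE(X) = rho_h(X ∧ cell) / rho_h(cell)."""
    total = sum(w for x, w in rho.items() if x in cell)
    return {x: w / total for x, w in rho.items() if x in cell}

rho_h = {"x1": Fraction(1, 2), "x2": Fraction(1, 4), "x3": Fraction(1, 4)}
cell = {"x1", "x2"}          # plays the role of U^{-1}(E ∧ O_U)
rho_hE = condition(rho_h, cell)

assert sum(rho_hE.values()) == 1          # conditioned measure is normalized
assert rho_hE["x1"] == Fraction(2, 3)     # (1/2) / (3/4)
```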

We now define a state-dependent solution to a classical decision problem as an association, to each event E and each x ∈ E, of a preference order ≻x over the acts available at E (E will usually be taken as tacit). A state-dependent solution satisfies ordering if ≻x is a total ordering for each x; diachronic consistency if V ⪰U(x) W iff VU ⪰x WU; macrostate indifference if U ~x V whenever U(x) and V(x) are in the same reward; and state supervenience if U ~x V whenever U(x) = V(x). A state-dependent solution is compatible with a solution iff, whenever U ⪰x V for all x, U ⪰ V.

A classical decision problem satisfies erasure iff for any reward r, any subevents E1, E2 ⊂ r, and any x1 ∈ E1, x2 ∈ E2, there are acts U1, U2 such that Ui is available at Ei and U1(x1) = U2(x2) ∈ r.

Now we can prove the

Classical realist inference theorem. Suppose that:

  • Ƥ is a classical decision problem satisfying reward availability, rich structuring, extension-friendliness, and erasure;

  • ≻x is a state-dependent solution to Ƥ satisfying ordering, diachronic consistency, macrostate indifference, and state supervenience;

  • ≻h is a solution to Ƥ satisfying ordering and diachronic consistency;

  • ≻h is diachronically partitionable;

  • ≻h is probabilistic;

  • ≻h is compatible with ≻x.

Then ≻h is almost represented by some finite Borel measure.


Proof. It suffices to prove that ≻h is macrostate-indifferent and noncontextual; macrostate indifference is immediate from compatibility and the macrostate indifference of ≻x. For noncontextuality, suppose that U and V are available at end(h) and have the same ℜ-partition. For any x ∈ end(h), U(x) and V(x) are in the same reward (r, say); hence by erasure there are acts U′ and V′, available at 𝒪U and 𝒪V respectively, such that U′U(x) = V′V(x) ∈ r. By macrostate indifference, U′ ~U(x) 1𝒪U, and so by diachronic consistency U′U ~x U; the same is true mutatis mutandis for V and V′. State supervenience entails that U′U ~x V′V; hence U ~x V. Since x was arbitrary, noncontextuality now follows from the compatibility of ≻x and ≻h.

C.6 Statement and Proof of the Everettian Epistemic Theorem

A quantum structure for a diachronic decision problem is specified by:

  • a Hilbert space ℋ;

  • a Boolean algebra isomorphism E → EQ of the event space 𝓔 onto a Boolean algebra 𝓔Q of subspaces of ℋ including ℋ itself;

  • for each event E, a map A → UA from acts available at E to unitary maps from EQ into ℋ, satisfying

  1. (i) (𝒪A)Q is the smallest element of 𝓔Q containing the range of UA;

  2. (ii) UAB = UAUB;

  3. (iii) UA|E = UA|EQ.

Fairly clearly, this makes the diachronic decision problem into a quantum decision problem whose events and macrostates coincide.
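Condition (ii) requires the assigned unitaries to respect composition of acts. A toy NumPy check (the 2×2 rotations are invented stand-ins for UA and UB, not from the text):

```python
import numpy as np

def rot(theta):
    """2x2 rotation matrix; a stand-in for the unitary assigned to an act."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

U_A, U_B = rot(0.3), rot(1.1)
U_AB = U_A @ U_B     # condition (ii): the composite act AB is assigned U_A U_B

# Composition preserves unitarity, and for rotations the angles add:
assert np.allclose(U_AB @ U_AB.T, np.eye(2))
assert np.allclose(U_AB, rot(0.3 + 1.1))
```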

A partially quantum decision problem is a triple 〈Ƥ, Q, X〉 where Ƥ is a diachronic decision problem, Q is a subproblem of Ƥ, and X is a quantum structure for Q.

Everettian Epistemic theorem. Suppose that:

  • 〈Ƥ, Q, X〉 is a partially quantum decision problem satisfying reward availability;

  • ≻h is a solution to 〈Ƥ, Q, X〉 satisfying ordering, diachronic consistency, macrostate indifference, probabilism, diachronic partitioning and noninfinitesimality;

  • The quantum decision problem 〈Q, X〉 satisfies branching availability and rich structuring;

  • ≻ψ is a solution to 〈Q, X〉 satisfying ordering, diachronic consistency, macrostate indifference, and state supervenience;

  • The restriction of ≻h to 〈Q, X〉 is compatible with ≻ψ.

Then:
  1. (i) ≻h is uniquely represented by a probability measure Pr;

  2. (ii) If end(h) ∧ Q ≠ ∅, there is a density operator ρh on end(h) ∧ Q such that if we write EQ ≡ E ∧ Q and EC ≡ E ∧ ¬Q, we have

Pr(E|A, h) = Pr(EC|A, h) + Tr(ρh UA|end(h)Q† ΠEQ UA|end(h)Q) Pr(Q|1end(h), h).

  3. (iii) ρh obeys the update rule

ρhE = ΠEQ UA|end(h)Q ρh UA|end(h)Q† ΠEQ / Tr(ΠEQ UA|end(h)Q ρh UA|end(h)Q†)
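The update rule in (iii) is Lüders conditionalization: evolve ρh with the act's unitary, project with ΠEQ, and renormalize by the trace. A minimal NumPy sketch (toy two-dimensional ρh; the projector and unitary are invented for illustration):

```python
import numpy as np

def luders_update(rho, U, P):
    """rho_hE = P U rho U† P / Tr(P U rho U†), as in the update rule."""
    post = P @ U @ rho @ U.conj().T @ P
    return post / np.trace(post)

rho_h = np.diag([0.7, 0.3])                  # toy density operator on end(h) ∧ Q
U = np.array([[0.0, 1.0], [1.0, 0.0]])       # stand-in for U_{A|end(h)Q}
P = np.diag([1.0, 0.0])                      # projector Π_{E_Q}

rho_hE = luders_update(rho_h, U, P)
assert np.isclose(np.trace(rho_hE), 1.0)               # updated state is normalized
assert (np.linalg.eigvalsh(rho_hE) >= -1e-12).all()    # and positive
```

Since Π is idempotent and the trace is cyclic, normalizing by Tr(Π U ρ U†) or Tr(Π U ρ U† Π) gives the same result.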

Proof. (i) is just a special case of the Diachronic Representation Theorem. For the rest, it follows from the Everettian Inference theorem that the restriction of ≻h to 〈Q, X〉 is represented by some density operator ρh, and (C.35) is just a special case of (B.31).


(1) See e.g. Apostol (1974: 54).

(2) Technical detail: Λ(X) ≤ ||X||, where ||X|| = sup{||Xx||: ||x|| ≤ 1} is the operator norm. One of the forms of the Hahn–Banach theorem (see e.g. Rudin 1991) is that any linear functional on a subspace of a normed space which is bounded by that norm may be extended to a linear functional on the whole space.