(p.465) Appendix C Formal Proofs of DecisionTheoretic Results
(p.465) Appendix C Formal Proofs of DecisionTheoretic Results
(p.465) Appendix C
Formal Proofs of DecisionTheoretic Results
In this appendix, I give formal proofs (and in some cases, formal statements) of various decisiontheoretic results stated in Part II and Appendix B.
C.1 Proof of the Additive Representation Theorem
Theorem 1 (Additive Representation Theorem). Suppose that E is a synchronic decision problem and that ≻ is a preference order for some set 𝓑 of bets for the decision problem. If
A1 The set 𝓑 includes all possible bets;
A2 The event space of E is the set of subsets of some finite set of atomic events;
B1 Any two constant bets with the same payoff are equivalent: if r is a reward and 𝓔 _{1}, 𝓔 _{2} are chance setups, then C_{r}(𝓔 _{1}) ~ C_{r}(𝓔 _{2});
B2 There is an associative, commutative map ⊕ from pairs of rewards to rewards, and some reward 0 such that r ⊕ 0 = r for any reward r, such that
(i) If 𝓕 _{1} ⪰ 𝓕 _{2} and 𝒢 _{1} ⪰ 𝒢 _{2}, then 𝓕 _{1} ⊕ 𝒢 _{1} 𝓕 _{2} ⊕ 𝒢 _{2};
(ii) If in addition 𝓕 _{1} ≻ 𝓕 _{2}, then 𝓕 _{1} ⊕ 𝒢 _{1} ≻ 𝓕 _{2} ⊕ 𝒢 _{2};
B3 There is a reward r such that C_{r}(𝓔) is not equivalent to C _{0}(𝓔);
C1a For any bets 𝓕, 𝒢, there is some positive integer n such that n𝓕 ≻ 𝒢(where n𝓔 just means 𝓔 added to itself n times);
C1b For any bets 𝓕, 𝒢 such that 𝒢 ≻ C _{0}(𝓔), there is some positive integer n such that n𝓕 + 𝒢 ≺ C _{0}(𝓔);
C1c For any bets 𝓕, 𝒢 such that 𝒢 ≺ C _{0}(𝓔), there is some positive integern such that n𝓕 + 𝒢 ≺ C _{0}(𝓔);
(p.466) C2a For any bets 𝓕, 𝒢 such that 𝓕 ≻ 𝒢 ≻ C _{0}(𝓔), there is some positive integer n such that n𝓕 ≺ (n + 1)𝒢;
C2b For any bets 𝓕, 𝒢 such that 𝓕 ≺ 𝒢 ≻ C _{0}(𝓔), there is some positive integer n such that n𝓕 ≺ (n + 1)𝒢.
then ≻ is uniquely represented by some probability measure Pr.
Proof. For simplicity, we start by constructing the utility function 𝒰. By B1, the ordering ≻ determines a welldefined ordering on some subset of 𒄜; by A1, that subset is in fact the whole of 𒄜. We write that ordering also as ≻.
By B3, there is some reward r such that either r ≻ 0 or 0 ≻ r; we will assume the former. (If actually all rewards are less desirable than 0, the proof can just be repeated with ≻ replaced by ≺.) By definition set 𝒰 (0) = 0 and 𝒰(r) = 1.
Now for any reward s ≻ 0, we define 𝒰(s) by
(That some such upper bound exists is a consequence of C1a and the Bolzano–Weierstrass Theorem.^{1}) It is straightforward to see that this is additive: 𝒰(s + t) = 𝒰(s) + 𝒰(t). More importantly, it represents ≻: 𝒰(s) 〉 𝒰(t) iff s ≻ t. To see this, firstly suppose 𝒰(s) 〉 𝒰(t). Then there must be some m, n such that ns ≻ mr but mr ⪰ nt, so by transitivity, ns ≻ nt. But if t ⪰ s, by repeated application of B2 it follows that nt ⪰ ns. It follows that s ≻ t.
Conversely, suppose that s ≻ t. To begin with, assume that t ≻ 0. By C2a, there is an integer N such that N_{s} ≻ (N + 1)t. Now let {m_{i}} and {n_{i}} be sequences of positive integers such that {m_{i}/n_{i}} is an increasing sequence and lim_{i} m_{i}/n_{i} = 𝒰 (t). Then by definition, for each i n_{i}t ≻ m_{i} r, and by B2 again, (N + 1)n_{i}t ≻ (N + 1)m_{i}r. By transitivity and B2 again, we get Nn_{i}s ≻ (N + 1) m_{i}r. So 𝒰(s) ≥ (N + 1)m_{i}/Nn_{i} for all i, and so 𝒰(s) ≥ (N + 1)/N䶰(t).
If instead t ≺ s ≺ 0, the same argument can be run with ≺ and ≻ interchanged, using C2b instead of C2a. Finally, if s ≻ 0 ≻ t, all we need to show is that if s ≻ 0 then 𝒰(s) 〉 0 and that if t ≺ 0 then 𝒰(t) 〈 0. To show the former, we use C1a to find some N such that N_{s} ≺ R; it follows that 𝒰(s) 〉 1/N. To show the latter, we apply a similar argument using C1b.
(p.467) In fact, we can extend 𝒰 to nonconstant bets very straightforwardly: the definition of 𝒰 works as well for nonconstant as for constant bets. That is, we define
for some arbitrary 𝓔. If we define Pr(E𝓔) = EU(C _{r,0}(E𝓔)), it follows via A2 and B2 that EU(F) is indeed the expected utility of 𝓕 with respect to this probability measure and to 𝒰.
This proves the existence of a representation. To prove uniqueness, let Pr be the probability measure already constructed, and suppose that Pr′ is another probability measure which represents ≻. By definition there is some utility function 𝒱 such that the expected utility generated by Pr′ and 𝒱 represents ≻ via the MEU rule. 𝒱 must be an increasing function f(𝒰) of the alreadyconstructed utility function 𝒰.
Let 𝓔 be a chance setup and E be an event in O _{𝓔} with Pr(E𝓔) not equal to 0, 1 or 1/2 (if there are no such events, uniqueness is trivial), and for the moment suppose that Pr(E) is rational, Pr(E) = m/n say. This means that C _{nr,0}(E𝓔) ~ C _{mr}(𝓔), and so Pr′ (E) = f(m)/f(n). Since km/kn = m/n, for consistency we must have f(km) = f(k)f(m) and so (iterating) f(m ^{M}) = f(m)^{M}). Putting l = n − m and considering the complement ¬E of E in 𝒪_{𝓔}, we similarly get that f(L ^{L}) = f(l)^{L}.
Now, f is an increasing function, so we require that f(k)^{K} 〉 f(l)^{L} iff k^{K} 〉 l^{L} ; taking logarithms of both sides, the requirement is that Klnf (k) 〉 L lnf(l) iff Klnk 〉 L ln l. Since this is true for arbitrary K and L, by constructing a sequence we can show that ln f(k)/ln f(l) = ln k/ln l. If we write f(k) = k ^{λ} and f(l) = l ^{μ} for some λ, μ, it follows immediately that μ = λ. Hence Pr′ (E)= Pr(E)^{λ} and Pr′(¬E) = Pr(¬E)^{λ}. Since we require Pr′ (E) + Pr′ (¬E)= 1, this forces λ = 1, i.e. Pr′(E) = Pr(E). The extension to irrational probabilities is a straightforward if tedious argument from limiting sequences of rational numbers.
C.2 Proof of the Synchronic Likelihood and Synchronic Preference Theorems
Theorem 2 (Synchronic Likelihood Theorem). Suppose that
SL1 O_{𝓔}𝓔 ≻ ∅𝓔.
SL2 O_{𝓔}𝓔 ⪰ E𝓔 ∅𝓔 for any 𝓔 and any E ∈ 𝓔.
(p.468) SL3 If E, F,G ⊂ O𝓔 and E ∧ G = F∧ G = ∅, then E𝓔 ≻ F𝓔 iff E ∨ G𝓔≻ ≻F ∨ G𝓔.
SL4 ≻ is partitionable.
Then ≻ is uniquely almost represented.
Proof. Our first step is to show that, if SL3 holds, then a stronger composition rule is also possible.
Lemma 1. (additivity lemma). Suppose that ≻ is an ordering for an event space E satisfying
$$If\text{}E\wedge G=F\wedge G=\overline{)0},then\text{}E\underset{\xaf}{\succ}F\text{}iff\text{}E\vee G\underset{\xaf}{\succ}F\vee G.$$(C.3)and that A, B, E, F satisfy A ∧ B = E ∧F = ∅ Then if A ⪰ E and B ⪰ F, A ∨ B ⪰ E ∨ F. Furthermore, the result continues to hold if ⪰ is replaced by≻.
This extension is intuitively plausible, but rather fiddly to prove. Define C and G as the complements of A∨ B and E∨ F respectively; then elements like A ∧ F and C∧ E form a ninefold partition of 𝓔. For simplicity let us write elements of the partition as (e.g.) AF. The proof now proceeds by repeated use of C.3.
Start with A and E: A⪰E, so AE∨AF ∨ AG⪰AE ∨ BE ∨ CE, so AE ∨ AF ∨ AG ∨ BG ⪰ AE ∨ BE ∨ CE ∨ BG.
Now do B and F: B⪰F so BE ∨ BF ∨ BG⪰AF ∨ BF ∨ CF, so BE ∨ BG⪰AF ∨ CF, so BE ∨ BG ∨ AE ∨ CE⪰AF ∨ CF ∨ AE ∨ CE.
Combining these: AE ∨ AF ∨ AG ∨ BG⪰AF ∨ CF ∨ AE ∨ CE, so AE ∨ AF ∨ AG ∨ BG ∨ BE ∨ BF⪰AF ∨ CF ∨ AE ∨ CE ∨ BE ∨ BF.
Rearranging: A(E ∨ F ∨ G) ∨ B(E ∨ F ∨ G)≼(A ∨ B ∨ C)E ∨ (A ∨ B ∨ C)F; that is, A ∨ BE ∨ F, and the lemma is proved.
(Since this proof may appear incomprehensible at first sight, I recommend the following way of visualizing it. We represent the event space as a 3by3 grid, with A–C labelling the rows and E–G the columns; we represent events by shading in squares (so that AE ∨ CG is represented by a grid with the top left and bottom right squares shaded in). The equivalences between events are now equivalences between squares, and the application of additivity now comes down to the following rule: if two grids are identical, they remain so if we shade in a square which was unshaded in both grids, or erase a square which was shaded in in both grids. The stages of the proof can now be represented graphically as a series of shadings or unshadings of pairs of grids.)
(p.469) We begin the main proof by proving that if F _{1} ∨ … ∨ F_{N} = O_{𝓕} and G _{1} ∨ … ∨ G_{N} = O _{𝔖} are partitions satisfying F_{i}𝓕 ~ F_{j}𝓕 and G_{i}𝔖 ~ G_{j}𝔖, then F_{i}𝓕 ~ G_{j}𝔖. We do so via the ‘preferred’ Nelement partition E _{1} ∨ … E_{N} = O_{𝓔N} guaranteed by partitionability. The definition of partitionability tells us that there exists an event space 𝓔^{∗} and an N _{2} element partition E_{ij} of o𝓔^{∗},such that
• E_{ij}𝓔∗ ~ E_{i′j′}𝓔∗.
• ${E}_{i}^{*}={E}_{i1}\vee \dots \vee {E}_{iN}{\epsilon}^{*}~{E}_{i}\epsilon .$
• ${F}_{j}^{*}={E}_{1j}\vee \dots \vee {E}_{Nj}{\epsilon}^{*}~{F}_{j}\mathcal{F}.$
Applying the additivity lemma repeatedly tells us that, if ${E}_{i}^{*}{\epsilon}^{*}\succ {F}_{j}^{*}{\epsilon}^{*}$,then O _{𝓔 ∗} 𝓔 ∗ ≻ O _{𝓔 ∗}𝓔∗—which obviously is contradictory. So the result follows.
Given this result, it follows in particular that
$$E\left\epsilon \succ {E}_{1}\vee \cdots \vee {E}_{M}\right{\epsilon}_{N}\text{iff}\text{}E\left\epsilon \succ {E}_{1}\vee \cdots \vee {E}_{kM}\right{\epsilon}_{kN}.$$(C.4)So for any rational number α = M /N, we can consistently write E𝓔 ≻ α to indicate that E𝓔 ≻ E _{1} ∨ … ∨ E_{M}𝓔_{N}.
We now define the probability function as follows:
$$\mathrm{Pr}E=\mathrm{lub}\left\{\alpha :E\succ \alpha \right\}.$$(C.5)Suppose E ∧ F = ∅. If E ≻ M /N and F ≻ M′/N′, by the additivity lemma E ∨ F ≻ (M/N) + (M′/N′). So Pr(E ∨ F) ≥ Pr(E) + Pr(F); running the same argument for ∨ shows that Pr(E ∨ F) ≤ Pr(E) + Pr(F), so Pr is dditive. Clearly, Pr(𝓔) = 1.
Suppose Pr(E) 〉 Pr(F); then there is some rational number M/N such that E ≻ M/N ≻ F; hence, E ≻ F. So Pr almost represents ≻.
Theorem 3 (Synchronic Preference Theorem). Suppose ≻ is a total ordering on bets for some synchronic decision problem E, satisfying:
SP1 If f, g are bets on the same chance setup 𝓔 and E_{1}, … E_{n} is a partition of O_{𝓕} such that f^{−1} (E_{i}) and g^{−1}(E_{i}) are singlevalued (such a partition can always be found), then
1. If f^{−1} (E_{i})𝓔 ⪰ g^{−1} (E_{i})𝓔 for all nonnull E_{i}, then f ⪰ g.
2. If in addition f^{−1}(E_{i})𝓔 ≻ g^{−1} (E_{i})𝓔 for at least one nonnull E_{i}, then f ≻ g.
(This is sometimes called the dominance principle.)
SP2 ≻ defines a welldefined ordering on the set of rewards: that is, C_{r}(𝓔) ∼ C_{r}(𝓕).
(p.470) SP3 ≻ defines a welldefined ordering ≻_{E} on the restriction of bets to any event E: that is, it satisfies the sure thing principle.
SP4 ≻ definesawelldefinedlikelihoodordering: that is, it satisfies probabilism.
SP5 The likelihood order determined by ≻ is partitionable.
Then the preference ordering is uniquely almost represented.
Proof. Our first step is to prove that (at least for finite bets on a fixed chance setup) a bet is completely characterized by the probabilities it assigns to each reward. This we do via SP3. For suppose E and F are two events in 𝒪_{𝓔} with equal probability, that f is a rewardvalued partition such that f ^{−1}(E) = r and f ^{−1}(F) = s, and that g is obtained from f by swapping the rewards on E and F.
Now, define f′ and g′ by f′(r) = E, f′(s) = 𝒪 _{𝓔} − E and g′ (r) = F, g′(s) = 𝒪_{𝓔} − F. By SP3, if 〈f,𝓔〉 ⪰ 〈g,𝓔〉, then also 〈f ′,𝓔〉 ⪰ 〈g′,𝓔〉. But this contradicts the definition of probability.
Since any two finite bets on 𝓔 which assign the same probability to each reward can be obtained from one another by finitely many permutations, our first result follows.
The remainder of the proof will be easier to follow if we assume that for any finite set of positive real numbers p _{1}… p_{n} summing to 1, 𝓔 can be partitioned into N sets E _{1},…E_{n} with Pr(E_{i} 𝓔 = p_{i}. This is not in fact guaranteed to be true, but the idealisation is harmless: the partition canbe simulated for p _{1} .… p_{n} rational by using the partitioning axiom SP5, and if the numbers are irrational we can approximate them arbitrarily well with rational numbers (this is one place where looking only for almost representations makes life simpler). I leave the technical details to the reader.
So: let r, s, t be any three rewards with r ≻ t ≻ s. For any set E ∨ 𝓔, and any α 〈 1, I define αE to be some arbitrary subset of E with Pr(αE) = αPr(E); with respect to r and s, I define the Eutility of t as
$${\mathcal{U}}_{E}\left(t\right)=\mathrm{lub}\left(\alpha :{C}_{t}\left(E\right){\succ}_{E}{C}_{r,s}\left(\alpha E\right)\right).$$(C.6)(A moment’s reflection will show that in the special case where E = O_{𝓔}, this is a formalization of our informal definition of utility, on a scale where 𝒰 (r) = 0 and 𝒰 (s) = 1.)
By our previous result, if E and F are equiprobable then 𝒰 _{E}(t) = 𝒰 _{F}(t). If E and F are not equiprobable, but Pr(E)/Pr(F) is a rational number (M/N, say), we can decompose E into M sets and F into N sets such that all M + N sets are equiprobable; it follows again that 𝒰_{E}(t) = 𝒰_{F}(t). And if Pr(E)/Pr(F) is irrational, a standard leastupperbound argument (p.471) establishes the same thing again. We conclude that for given 𝓔, all the Eutilities are the same, and we can without ambiguity just refer to the 𝓔 utility of t, 𝒰 _{𝓔}(t). By definition,
$${C}_{t}\left(\epsilon \right)\succ {C}_{r,s}\left(\alpha {\mathcal{O}}_{E}\epsilon \right)$$(C.7)whenever α 〈 𝒰 _{𝓔}(t) and
$${C}_{t}\left(\epsilon \right)\prec {C}_{r,s}\left(\alpha {\mathcal{O}}_{E}\epsilon \right)$$(C.8)whenever α 〉 𝒰 _{𝓔}(t). Now let 𝓔 and 𝓕 be two distinct chance setups, and suppose that 𝒰 _{𝓔}(t) 〉 𝒰 _{𝓕}(t). Then there must exist α such that
$${C}_{t}\left(\epsilon \right)\succ {C}_{r,s}\left(\alpha {\mathcal{O}}_{E}\epsilon \right)$$(C.9)and
$${C}_{r,s}\left(\alpha {\mathcal{O}}_{F}\mathcal{F}\right)\succ {C}_{t}\left(\mathcal{F}\right).$$(c.10)But by the definition of probability C _{r, s}(α𝒪_{E}𝓔) ~ C _{r, s}(α𝒪_{F}𝓕), and by SP2 C_{t}(𝓔) ~ C _{t}(𝓕), so this is impossible. It follows that the 𝓔 −utility is actually independent of 𝓔 as well, and that we can talk unproblematically just of the utility of t (with respect to r and s).
We now consider any comparable set Ƭ of rewards all of which lie between r and s, and restrict our attention to finite bets with outcomes in Ƭ. We have seen that for preference purposes any such bet can be characterized by the probabilities assigned to each reward; we can then represent a bet unambiguously as a realvalued function F on Ƭsatisfying
$$\sum _{t\in \mathcal{T}}F\left(t\right)}=1.$$(C.11)Our strategy is now to replace every reward except r and s themselves with a mixture of r and s. In particular, suppose that φ _{+} and φ _{−} are functions of Ƭ satisfying 0 ≤ φ_{±}(t) ≤ 1, φ_{+}(t) 〉 𝒰 (t), and φ _{−} t 〈 𝒰 (t). If for some particular t, the bet gives reward t on outcome E, then it follows from the definition of Eutility that partitioning E into α(t)E and (1 − α(t))E, and changing the bet to give reward r on α(t)E and s on (1 − α(t))E, will produce a bet preferred to the original one if α(t) 〉 𝒰 (t) and vice versa if α(t) 〈 𝒰(t).
If we make such substitutions sequentially for all outcomes of the bet, we obtain a bet whose only outcomes are r and s, and which gives r with probability
$$\sum _{t\in \mathcal{T}}\alpha \left(t\right)}F\left(t\right).$$(C.12)(p.472) If for all tα(t) ≤ 𝒰(t) then this bet is definitely not preferred to the original bet; if for all tα(t) 〉𝒰(t), then it definitely is preferred. It follows via transitivity that if we define
$$EU\left(F\right)={\displaystyle \sum _{t\in \mathcal{T}}\mathcal{U}\left(t\right)}F\left(t\right)$$(C.13)then F is preferred to G whenever EU(F) 〉 EU(G). But this, of course, is the result we seek.
C.3 Proof of the Diachronic Representation Theorem
Theorem 4 (Diachronic Representation Theorem).Suppose that
• Ƥ is a diachronic decision problem satisfying reward availability;
• ≻ is a solution to Ƥ satisfying macrostate indifference and diachronic consistency;
• ≻ is diachronically partitionable:for any available history h, any pair of rewards r, s, and any N, there is an act A^{N} available at end(h), and an Nelement partition ${E}_{1}^{N}\dots {E}_{N}^{N}$ of end(h), such that
1 $\u3008{A}^{N},\left\{{E}_{1}^{N},\dots {E}_{N}^{N}\right\}\u3009$ is independent of any act available at end(h).
2 ${A}_{r,s}^{N}\left({E}_{i}\right){\sim}_{h}{A}_{r,s}^{N}\left({E}_{j}\right)$.
• The preference order ≻_{h} at h is probabilistic: that is, the likelihood order over events that it determines is independent of the bets used to define that order.
Then the preference order ≻_{h} at h satisfies axioms SP1 − SP4 of theorem 4, and hence is uniquely almost representable; furthermore, it is Bayesian(that is, uniquely almost represented by a Bayesianprobability measure on any comparable set of acts).
Proof. To show that ≻_{h} is partitionable, fix N, r and s, and (in the terminology used in section B.6 to define synchronic partitioning) take 𝓔_{N} = A^{N}. Given any other act B, we know there exist acts A′, B′ such that O _{A′·B} = O _{B′·A} ^{N} and (A′ · B)_{f} = (B′ · A^{N})_{f}. Take 𝓔∗ = O _{A′ · ·B.}
We now need to construct embeddings of O_{B} and O_{AN} into 𝓔∗, and we do so as follows:

• If E ⊂ O_{B}, ν(E) = O_{A′}_{E}.

• If F ⊂ O _{A N}, ν′(F) = O _{B′F}.
(p.473) SP3 now follows from the definition of independence. SP1 – 2 have been shown to follow from macrostate indifference and diachronic consistency; SP4 is an immediate consequence of probabilism.
It follows that ≻is uniquely almost representable, but it remains to be shown that it is Bayesian. To show this, first notice that if A is an act available at end(h) and f is a reward partition on its outcome space taking values in some set of comparable rewards, by our earlier constructions we have
Now let B be available at O_{A} and E ⊂ O_{A}. Define B _{1} = B_{E} and B _{2} = B_{O} A _{−E}, and let f, f′ be reward partitions on O_{B} which agree on restriction to O _{B 2}. Then we have
and
Subtracting out the common part and dividing by a common factor, we have that (B · A)_{f≻h}(B · A)_{f′} if
But by diachronic consistency, (B · A)_{f≻h}(B · A)_{f′} iff (B_{E})_{f≻〈 E,A〉·h}(B_{E})_{f′}. By assumption Pr uniquely almost represents ≻ that is, by assumption it is the only probability measure such that (B ·A)_{f≻h}(B·A)_{f′} if
It follows (replacing B _{1} with B_{E}) that
That Pr(O_{B} _{E}B · A, h) = Pr(EA, h) follows again from the construction of Pr (and thus, indirectly, from macrostate indifference).
C.4 Statement and proof of the Everettian Inference theorem
I take as a starting point the definition of a quantum decision problem given in Chapter 5, of a statedependent solution to that problem as specified in the same chapter, and of a (nonstatedependent) solution to the problem as specified in chapter 6. Note also that a quantum decision problem is a plain decision problem in the sense of Appendix B, and the two definitions of solution coincide. In particular, the definitions in Appendix B of ordering, diachronic consistency, macrostate indifference and probabilism apply directly to solutions of quantum decision problems.
I now state formally some additional constraints on quantum decision problems and their solutions. Recall that for any act U available at E, and any partition 𝓕 of 𝓔, the set
is a positive operator valued measure (POVM) on E, which I call the 𝓕 POVM of U, and that a POVM is available at E iff for some partition 𝓕, some act available at E has that POVM as its 𝓕 POVM.
I need only one additional constraint on a decision problem.
Rich structuring. Suppose that 𝓕 is an available set of macrostates and that for each F ℇ 𝓕, X_{F} is a (finite or countable) set of positive operators on F satisfying ∑_{XℇXF} X ≤ 1_{F}. Then there is a compatible act function 𝒰 for 𝓕 such that for each F ℇ 𝓕, the characteristic POVM of 𝒰 (F) contains all the operators in X_{F}.
And I need the following additional constraints on solutions to a problem.
(p.475) Solution continuity. If h is an available history and U, U′ are available at end(h), and U ≻ _{h} U′, then in the space of unitary maps from end(h) into ℋ there are neighbourhoods (in norm topology) 𝒩, 𝒩′ of U, U′ respectively such that any act in 𝒩 available at end(h) is preferred (at h) to any act in 𝒩′ available at end(h).
Noncontextuality. For any available history h and any U, V available at end(h), if U and V have the same characteristic POVM then U ~_{h}V.
I now state and prove, in succession, the noncontextual inference theorem and the Everettian inference theorem.
Noncontextuality Theorem. Suppose that:
• Ƥ is a quantum decision problem satisfying reward availability, problem continuity, branching availability, and rich structuring.
• ≻_{h} is a solution to Ƥ satisfying macrostate indifference, ordering, diachronic consistency, solution continuity, and probabilism.
Then ≻_{h} is represented by some density operator iff it is noncontextual.
Proof.If ≻_{h} is represented by some density operator, trivially it is noncontextual and probabilistic; the trick is proving the converse. I prove it in three steps. First I will establish that ≻_{h} is uniquely almost represented by some probability measure. Secondly I will prove that this probability function can be represented by a density operator. Finally, I prove that ≻_{h} is represented, not just almost represented.
To prove the first step, fix an available history h and a partition 𝓕 = F _{1}, … F_{n} of 𝓔 ; by branching availability, there is an 𝓕 POVM available at end(h) whose element associated with F_{i} is (1/N) 1_{end(h)}. Let U be the act generating this POVM: then bets (in the technical sense of Appendix B) on F_{i} ∧ 𝒪_{U} and F_{j} ∧ 𝒪_{U} generate the same ℜPOVM and so are equally likely at h (again, in the sense of Appendix B).
Now, for any act V available at end(h), let U′ be an act available at 𝒪_{V} whose 𝓕 partition at 〈𝒪_{V}, V〉·h associates (1/N) 1_{𝒪 V} to each element of 𝓕. It should now be apparent that U is independent (in the sense of Appendix B) of V, and so ≻_{h} is diachronically partitionable. As such, it follows from the diachronic representation theorem of Appendix B that ≻_{h} is uniquely almost represented.
The second step is essentially a variant on the POVM version of Gleason’s theorem proved by Caves et al. (2004). Specialize to the trivial history ∅ and the startingpoint macrostate M, define (p.476)
and define a functionon X by
(That this iswelldefined follows from noncontextuality). By therich structuring axiom and the additivity of Pr, Λ satisfies Λ(X + Y) = Λ (X) + Λ(Y) whenever X + Y ≤ 1_{M} and hence Λ(M/NX) = M /N(X). Diachronic consistency entails that if p 〉 q then Λ(pX) 〉 Λ(qX), so it follows that Λ is linear on X, and so extends uniquely to the span of X in the real linear space of bounded selfadjoint operators on M. If this span is not the whole space, we extend it to the whole space arbitrarily;^{2} we can then extend it uniquely to an antilinear functional on the whole space of bounded selfadjoint operators on M. Since that space is a Hilbert space with inner product 〈X, Y〉 = Tr(X ^{†} Y), there is a unique operator ρ such that Tr(ρX) = Λ(X).
Finally, we define ρ_{h} recursively: if h is an available history, U is available at end(h), and h_{E} is shorthand for 〈E, U〉 · h, then
and
If Tr(U ^{†}∏_{E} Uρ_{h}) = Pr(EU, h), then for any V available at O_{U} and any F ⊂ 𝒪_{V} we have
as required.
The third step follows from problem and solution continuity; I leave it to the reader.
Everettian Inference theorem.Suppose that:
• Ƥ is a quantum decision problem satisfying reward availability, problem continuity, branching availability and rich structuring;
• ≻_{h} is a solution to Ƥ satisfying ordering, diachronic consistency and solution continuity;
• ≻^{ψ} is a statedependent solution to Ƥ, satisfying ordering, diachronic consistency, solution continuity, branching indifference, macrostate indifference, and state supervenience, and defined on an actclosed set of states which contains all states in the starting point of ≻_{h};
• ≻^{ψ} and ≻_{h} are compatible.
Then ≻_{h} is represented by a density operator.
Proof.First, note that by the Born Rule theorem, there is some quasiunique utility function such that ≻^{ψ} is given by the MEU principle with respect to that utility function and the Bornrule probabilities. Noncontextuality and macrostate indifference of ≻_{h} follow immediately.
Next, take two rewards r and s with r ≻ s (if there are not two such rewards, the conclusion is trivially true) and divide ℜ arbitrarily into two subsets ℜ_{r} and ℜ_{s}. If we define r ^{∗} and s ^{∗} as the direct sums of ℜ_{r} and ℜ_{s} respectively, there will be a new decision problem Ƥ ^{∗} with the same acts, events and macrostates as Ƥ but with rewards r ^{∗}, s ^{∗}.
For any event E there will be an act V_{E} available at E such that if t ∈ ℜ_{r}, V_{E} maps t ∧ E into r and similarly mutatis mutandis for s. We can define a solution ${\succ}_{h}^{*}$ to the decision problem Ƥ ^{∗} as follows:
Since there are only two rewards in Ƥ ^{∗}, ≻^{∗} is trivially probabilistic; by the noncontextual inference theorem, it is represented by some density operator ρ. Clearly, ρ also represents the original solution so long as we (p.478) restrict our attention to acts whose outcomes lie in r ∨ s (for any such act U, V_{𝒪U}= 1_{𝒪U}).
Now let 𝒰 be a utility function for the statedependent solution satisfying 𝒰(r) = 1 and 𝒰(s) = 0 (there is exactly one such function, of course). For any available h and any reward t satisfying r ⪰ t ⪰ s, suppose that U_{t} and U_{r} _{,s} are acts available at end(h) whose ℜPOVMs are defined as follows: the ℜPOVM of U_{t} assigns 1_{end(h)} to t and 0 to all other rewards; the ℜPOVM of U_{r} _{,s} assigns 𝒰(t) 1_{end(h)} to r, (1 − 𝒰(t)) 1_{end(h)} to s, and 0 to all other rewards. For any ψ ∈ end(h) U _{t} ~ ^{ψ} U_{r,s}, so by compatibility, U_{t} ~_{h} U_{r,s}
Now for any available h, and any act U available at end(h) whose possible rewards all lie between r and s, I write the ℜPOVM of U as a function from rewards between r and s to positive operators:O_{t}(U) is the positive operator assigned to t. By the above result, if U′ is another such act available at end(h) with
then by the above result and diachronic consistency, U ~_{h} U′; by branching availability, such an act is available. Since EU_{ρh}(U) = EU_{ρh}(U′), it follows that ρ represents ≻_{h} for any acts whose outcomes lie between r and s. Since r and s were arbitrary, the result follows.
C.5 Statement and proof of the classical inference theorems
I define a classical decision problem as specified by:

• A topological space ℋ. Given a countable set 𝒮 of subsets of ℋ, I write ∨𝒮 ≡ ∪𝒮 and ∧𝒮 ≡ ∩𝒮 ; Given subspaces E and F, I define E ∨ F = ∨{E, F} and likewise for ∧.

• A complete Boolean algebra 𝓔 of Borel subsets of ℋ, the eventspace. (So 𝓔 contains ℋ and is closed under ∨, ∧, and taking the complement.) I define a partition of an event E to be a countable setof mutually disjoint events whose conjunction is E.
(p.479)

• For each E ∈ 𝓔, a set 𝒰_{E} of continuous maps from E into ℋ, which we call the set of acts available at E. We write 𝒪_{U} for the smallest event containing the range of the act U and require that the choice of available acts satisfies:

1. Restriction: If E, F ∈ 𝓔 and F ⊂ E, then if U is available at E then the map U_{F}, defined by U(x) = U _{F}(x) whenever x ∈ F, is available at F.

2. Composition: If U is available at E, and V is available at 𝒪_{U}, then VU is available at E.

3. Indolence: For any event E, if there are any acts available at E then the identity 1_{E} is available at E. (More precisely, the embedding map of E into ℋ is available at E.)

4. Continuation: If U is available at some E, then there is some act available at 𝒪_{U}.

5. Irreversibility: If U is available at E ∨ F and E ∧ F = ∅, 𝒪_{U}_{E} ∧ 𝒪_{U}_{F} = ∅.

6. Extension: If F ⊂ E and U is available at F, there is some act V available at E such that U = V F.


• A partition ℜ of 𝓔 (that is, a set of mutually disjoint elements of 𝓔 whose disjunction is ℋ), the set of rewards.
A classical decision problem is a diachronic decision problem: as such, many of the concepts of general diachronic decision theory (in particular, the concepts of solution and history) carry over to it.
If E is the starting point of a solution ≻_{h}, and ρ is a [finite] Borel probability measure on E, then we define ρ_{h} as a measure on end(h) recursively by
and
whenever the righthand side is defined (here 𝒪_{U} is the range of U, as in the quantum case). And we say that ρ [almost] represents ≻_{h} provided that the probability measure
[almost] represents ≻_{h}.
For any act U available at E, and any partition 𝓕 of ℋ by events, I define the 𝓕 partition generated by U to be a map f from elements of 𝓕 to subsets of E, defined by
(I will conflate f and the range of f where no confusion would be caused by this.) A subset of E is available at E iff it is a member of the characteristic partition of some act available at E.
I say that a classical decision problem is richly structured if, given any set X of disjoint subsets of an event all of which are available at the event, there is some act available at that event and some partition 𝓕 of ℋ such that all elements of X are elements of the 𝓕 partition generated by the act. And I say that a solution to the decision problem is classically noncontextual if for any two acts U and V available at end(h), if U and V have the same ℜpartition then U ~h V.
Finally, a classical decision problem is extensionfriendly if, given any event E and any real function f of the available subsets of E which is additive on disjoint sets and satisfies f(∅) = 0, f is extendible, though not necessarily uniquely, to a Borel measure on E. (This will be satisfied if, for instance, the set of available acts is large enough that all Borel subsets of E are available at E, although of course this is a very strong idealization.)
We can now prove the
Classical noncontextual inference theorem.Suppose that:
• Ƥ is a classical decision problem satisfying reward availability, rich structuring and extensionfriendliness;
• ≻_{h} is a solution to Ƥ satisfying ordering, diachronic consistency, and macrostate indifference;
• ≻_{h} is diachronically partitionable;
• ≻_{h} is probabilistic;
• ≻_{h} is classically noncontextual.
Then ≻_{h} is almost represented by some finite Borel measure.
Proof. By the diachronic representation theorem, there is a probability measure Pr on Ƥ which almost represents ≻_{h}. Pr(EU, h) = Pr(FV, h) whenever U ^{−1} (E ∧ 𝒪_{U}) = V ^{−1} (F ∧ 𝒪_{V}).
Specialize to the startingpoint event M of ≻_{h}, and the trivial history h _{0}. If E ⊂ M is available at M then there is some event F and some act U available at M such that U ^{−1} (F ∧ 𝒪_{U}) = E; without contradiction we can define μ(E) = Pr(FU, h _{0}). By rich structuring, if E _{1},…E_{n} are available at M, there is some single act U and some events F _{1} … F_{n} such that E_{i}= U ^{−1}(F_{i} ∧ 𝒪_{U}); hence,
(p.481) so μ is additive on disjoint sets. By extensionfriendliness, there is some Borel measure ρ on M which agrees with μ on available subsets of M.
We now prove recursively that Pr(EU, h) = ρ_{h}(U ^{−1} (E ∧ 𝒪_{h})). Suppose it true for given h and suppose U is available at end(h), that E is a subevent of 𝒪_{U} and that W is available at E. Then there is some act V available at 𝒪_{U} such that V_{E} = W, and we have (writing h_{E} as shorthand for 〈E, U〉 · h)
We now define a statedependent solution to a classical decision problem as an association, to each event E and each x ⊂ E, of a preference order ≻x over the acts available at E(E will usually be taken as tacit). A statedependent solution satisfies ordering if ≻^{x} is a total ordering for each x, diachronic consistency if V ⪰^{U(x)} W iff VU ⪰^{x} WU, macrostate indifference if U ~_{x} V whenever U(x) and V(x) are in the same reward, and x state supervenience if U ~^{x} V whenever U(x) = V(x). A statedependent solution is compatible with a solution iff, whenever U ⪰^{x} V for all x, U ⪰ V.
A classical decision problem satisfies erasure iff for any reward r, any subevents E _{1}, E _{2} ⊂ r, and any x _{1}, x _{2} ∈ E _{1}, E _{2}, there are acts U _{1}, U _{2} such that U_{i} is available at E_{i} and U _{1}(x _{1}) = U _{2}(x _{2}) ∈ r.
Now we can prove the
Classical realist inference theorem. Suppose that:
• Ƥ is a classical decision problem satisfying reward availability, rich structuring, extensionfriendliness, and erasure;
• ≻^{x} is a statedependent solution to Ƥ satisfying ordering, diachronic consistency, macrostate indifference, and state supervenience;
• ≻_{h} is a solution to Ƥ satisfying ordering and diachronic consistency;
• ≻_{h} is diachronically partionable;
• ≻_{h} is probabilistic;
• ≻_{h} is compatible with ≻^{x}.
Then ≻_{h} is almost represented by some finite Borel measure.
Proof.It suffices to prove that ≻^{h} is macrostateindifferent and noncontextual; macrostate indifference is immediate from compatibility and the macrostateindifference of ≻^{x}. For noncontextuality, suppose that U and V are available at end(h) and have the same ℜpartition. For any x ∈ end(h), U (x) and V (x) are in the same reward (r, say); hence by erasure there are acts U′ and V′ available at 𝒪_{U} and 𝒪_{V} respectively such U′U(x) that U′U(x) = V′V(x) ∈ r. By macrostate indifference, U′ ~^{U(x)} 1_{𝒪U} and so by diachronic consistency U′U ~ ^{x} U ; the same is true mutatis mutandis for V and V′. State supervenience means that U′U ~^{x}x V′V, hence U ~ ^{x} V. Since x was arbitrary, noncontextuality now follows from the compatibility of ≻_{x} and ≻_{h}.
C.6 Statement and proof of the Everettian Epistemic theorem
A quantum structure for a diachronic decision problem is specified by:

• a Hilbert space ℋ;

• a Boolean algebra isomorphism E → E_{Q} of the event space 𝓔 onto a Boolean algebra 𝓔_{Q} of subspaces of ℋ including ℋ itself;

• For each event E, a map A → UA from acts available at E to unitary maps from EQ into ℋ, satisfying

(i) (𝒪_{A})_{Q} is the smallest element of 𝓔_{Q} containing the range of UA.

(ii) U_{AB} = U_{A}U_{B}.

(iii) U_{AE} = U_{A}_{EQ}.
Fairly clearly, this makes the diachronic decision problem into a quantum decision problem whose events and macrostates coincide.
A partially quantum decision problem is a triple 〈Ƥ, Q, X〉 is a diachronic decision problem, Q is a subproblem of Ƥ, and X is a quantum structure for Q.
Everettian Epistemic theorem.Suppose that:
• 〈Ƥ, Q, X〉 is a partially quantum decision problem satisfying reward availability;
• ≻_{h} is a solution to 〈Ƥ, Q, X〉 satisfying ordering, diachronic consistency, macrostate indifference, probabilism, diachronic partitioning and noninfinitesimality;
 (p.483)
• The quantum decision problem 〈Q, X〉 satisfies branching availability and rich structuring;
• ≻^{ψ} is a solution to 〈Q, X〉 satisfying ordering, diachronic consistency,
• macrostate indifference, and state supervenience;
• The restriction of ≻_{h} to 〈Q, X〉 is compatible with ψ.
Then:
(i) ≻_{h} is uniquely represented by a probability measure Pr;
(ii) If end(h) ∧ Q ≠ ∅, there is a density operator ρ_{h} on end(h) ∧ Q such that if we write E_{Q} ≡ E ∧ Q and E_{C} ≡ E ∧ ¬Q, we have
$$+\text{Tr}\left(\rho h{U}_{A{E}_{Q}}^{\u2020}{\Pi}_{{E}_{Q}}{U}_{A{E}_{Q}}\right)\text{Pr}\left(Q{1}_{\text{end}\left(h\right),}h\right).$$(C.34)(iii) ρ_{h} obeys the update rule
$$\rho {h}_{E}=\frac{{\Pi}_{EQ}UA\rho h{U}_{A\text{end}\left(h\right)Q}^{\u2020}{\Pi}_{EQ}}{\text{Tr}\left({\Pi}_{EQ}{U}_{A\text{end}{\left(h\right)}_{Q}}{\rho}_{h}{U}_{A\text{end}\left(h\right)Q}^{\u2020}\right)}$$(C.35)
Proof.(i) is just a special case of the Diachronic Representation Theorem. For the rest, it follows from the Everettian Inference theorem that the restriction of ≻_{h} is represented by some density operator ρ_{h}, and (C.35) is just a special case of (B.31).
Notes:
(^{2}) Technical detail: Λ(X) ≤ X, where X = sup{X_{x}: x ≤ 1} is the operator norm. One of the forms of the Hahn–Banach theorem (see e.g. Rudin 1991) is that any linear functional on a subspace of a normed space which is bounded by that norm may be extended to a linear functional on the whole space.