Probabilities in Physics

Claus Beisbart and Stephan Hartmann

Print publication date: 2011

Print ISBN-13: 9780199577439

Published to Oxford Scholarship Online: September 2011

DOI: 10.1093/acprof:oso/9780199577439.001.0001


PRINTED FROM OXFORD SCHOLARSHIP ONLINE (www.oxfordscholarship.com). (c) Copyright Oxford University Press, 2019. All Rights Reserved.

The Past Histories of Molecules

4 The Past Histories of Molecules
Craig Callender

Oxford University Press

Abstract and Keywords

This chapter unfolds a central philosophical problem of statistical mechanics. This problem lies in a clash between the Static Probabilities offered by statistical mechanics and the Dynamic Probabilities provided by classical or quantum mechanics. The chapter looks at the Boltzmann and Gibbs approaches in statistical mechanics and construes some of the great controversies in the field — for instance the Reversibility Paradox — as instances of this conflict. It furthermore argues that a response to this conflict is a critical choice that shapes one's understanding of statistical mechanics itself, namely, whether it is to be conceived as a special or fundamental science. The chapter details some of the pitfalls of the latter ‘globalist’ position and seeks defensible ground for a kind of ‘localist’ alternative.

Keywords:   statistical mechanics, probabilities, static, Reversibility Paradox, special science, Boltzmann, Gibbs

The philosophical foundations of thermodynamics and statistical mechanics can seem a bewildering labyrinth. You fall in by innocently wondering why microscopic systems trace out patterns consistent with macroscopic thermodynamics. The next thing you know, you're then wandering around an intricate structure of paths each corresponding to a different research program in statistical mechanics. Boltzmann, Gibbs, Krylov, Khinchin, Jaynes, Lanford, and others all offer ways out. But the paths are different and never seem to converge. The hedge of the maze is too high for you to gain a peek at the overall landscape. You're left wondering whether the paths' respective advocates are even answering the same question. Confusion ensues.

The philosophical foundations of statistical mechanics contrast with, say, the foundations of quantum mechanics. In quantum mechanics there is marked difference of opinion about the best solution to the measurement problem. Is it Bohr's, Wigner's, Bohm's, Ghirardi's, or Everett's? The answers differ, yet all participants more or less agree on the problem. In foundations of statistical mechanics, however, the various programs differ even over the questions. What are fundamental questions of statistical mechanics? Recovering thermodynamics from mechanical considerations? Justifying certain inferences? Explaining why certain techniques—e.g. Gibbs' phase averaging—work?

Amidst this chaos, it is important for those new to the topic to bear in mind that there are some big and deep issues in common dispute among the different programs in statistical mechanics. These questions are not always explicitly discussed, but they loom large in the background. What I hope to do in this essay is bring some of these issues to the foreground.

Foremost among these issues is the conflict between what we might call Static and Dynamic Probabilities. The Static Probabilities are provided by statistical mechanics; the Dynamic ones given by classical or quantum mechanics. In what follows, I'll read some of the great episodes in the foundations of statistical mechanics as instances or foreshadowings of this conflict. Then we'll tackle the conflict itself. Should the Static Probabilities ‘track’ the microscopic dynamics? This is the question behind the famous Reversibility Paradox. This paradox, I'll argue, is a crucial juncture for the interpretation of statistical mechanics. Different responses to it shape one's view of statistical mechanics itself, and in particular, whether it is to be conceived as a special or a fundamental science.

1 Probability: The price of a mechanical theory of heat1

Scientists devised kinetic theory and then statistical mechanics with the hope of providing a so-called ‘dynamical’ or ‘mechanical’ theory of heat. At its most general, the idea is to account for thermal phenomena not with new entities or forces, but rather, as Clausius puts it, with a new type of motion. The motion is that of particles governed by Newtonian mechanics or some variant thereof, e.g. Hamiltonian mechanics. The idea that heat, temperature, and other thermal phenomena are ‘nought but molecules in motion’ (Maxwell, quoted in Campbell & Garnett 1884, p. 410) existed for hundreds and perhaps thousands of years. Galileo and Gassendi each spoke approvingly of such a possibility, and Newton could even offer a kind of mechanical explanation of Boyle's gas law. Until the middle of the nineteenth century, however, no very sophisticated or powerful mechanical theory existed. Although there were early forerunners (e.g. Bernoulli 1738, Herapath 1821), it wasn't until the work of Clausius, Maxwell, and Boltzmann that kinetic theory and then statistical mechanics really blossomed.

The key to success was the introduction of probability. It is not possible to trace the phenomena of thermodynamics back to mechanics without the introduction of probabilistic notions. Bernoulli, Herapath, and Clausius all introduce probabilistic methods in their work. Not surprisingly, given that probability theory itself was developing, the notions used are often very blunt instruments. Probability might not even be explicitly invoked. For instance, an assumption of some of these early models is that the gas particles in a box are confined to travel on six independent beams moving in each of the six directions. Nothing probabilistic there, if taken literally. But of course a claim like this is really calculational shorthand for a claim about the average velocities—a hint of the ‘equiprobability’ assumptions to come.

Later the work gets much more sophisticated in Maxwell. Introducing the crucial notion of the distribution or density function

$$f(v)\,dv = \text{the probability that the velocity is between } v \text{ and } v + dv,$$

Maxwell seeks the unique distribution of velocities in a gas that will leave it in thermal equilibrium, in which f is constant in time. With a few somewhat shaky assumptions, Maxwell derives that this distribution of speeds ($v = \sqrt{\mathbf{v}^2}$) is (in contemporary terms)

$$f(v) \propto v^2 e^{-m v^2 / 2kT}, \tag{1}$$

where m is the molecular mass, k is a constant, and T is the temperature. The distribution describes a normal distribution, a bell-shaped curve peaking around the average value and declining on either side, multiplied by $v^2$. This distribution implies that the energy is equally distributed among the degrees of freedom of the system. Crucial to this derivation in 1860 is an assumption that the velocity components of an individual molecule are statistically independent of one another, i.e.

$$f(v_i, v_j, v_k) = f(v_i)\, f(v_j)\, f(v_k),$$

where (i,j,k) run through the indexes for the three orthogonal directions (x, y, z).
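The link between independent velocity components and the speed distribution can be checked numerically. The following is a minimal sketch, not Maxwell's derivation: it assumes units in which m = kT = 1, takes it as given that each velocity component is Gaussian with variance kT/m, and uses hypothetical helper names (`sample_speeds`, `maxwell_mean_speed`). It compares the sampled mean speed with the value $\sqrt{8kT/\pi m}$ implied by a density proportional to $v^2 e^{-mv^2/2kT}$, and checks that the mean kinetic energy is close to $\tfrac{3}{2}kT$, i.e. $\tfrac{1}{2}kT$ per degree of freedom.

```python
import math
import random

def sample_speeds(n, m=1.0, kT=1.0, seed=0):
    """Draw n speeds from three independent Gaussian velocity components."""
    rng = random.Random(seed)
    sigma = math.sqrt(kT / m)  # standard deviation of each component (assumed Gaussian)
    speeds = []
    for _ in range(n):
        vx, vy, vz = (rng.gauss(0.0, sigma) for _ in range(3))
        speeds.append(math.sqrt(vx * vx + vy * vy + vz * vz))
    return speeds

def maxwell_mean_speed(m=1.0, kT=1.0):
    """Mean speed of a density proportional to v^2 exp(-m v^2 / 2kT)."""
    return math.sqrt(8.0 * kT / (math.pi * m))

speeds = sample_speeds(100_000)
empirical_mean = sum(speeds) / len(speeds)
# Mean kinetic energy per molecule with m = 1; equipartition suggests 3/2 kT.
mean_kinetic_energy = sum(0.5 * v * v for v in speeds) / len(speeds)

print(empirical_mean, maxwell_mean_speed())
print(mean_kinetic_energy)
```

The agreement is statistical only; it illustrates that the independence assumption is doing real work, not that the assumption is dynamically justified.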

Seeking to improve this ‘precarious’ argument (Maxwell 1867, p. 66), Maxwell 1867 provides a new derivation of (1). This time Maxwell assumes the system is in equilibrium, and seeks the analytical conditions on a distribution function that will maintain that fact. In his argument, Maxwell now makes an assumption of probabilistic independence with respect to the velocities of colliding particles, in contrast to independence of the velocity components of single particles. Boltzmann's 1872 work is built upon the same infamous posit, which is often called the Stoßzahlansatz, or ‘Hypothesis of Molecular Chaos.’ The postulate states that for particle pairs that will collide, but not necessarily for particle pairs that have just collided, the following holds:

$$f^{(2)}(\mathbf{v}_1, \mathbf{v}_2) = f(\mathbf{v}_1)\, f(\mathbf{v}_2),$$

where $f^{(2)}$ is the distribution function for pairs of molecules (he considers only binary collisions). In words it says that the probability to find a pair of molecules with velocities in $d^3v_1$ around $\mathbf{v}_1$ and $d^3v_2$ around $\mathbf{v}_2$ is equal to the product of the probabilities to find a molecule in $d^3v_1$ around $\mathbf{v}_1$, namely $f(\mathbf{v}_1)\,d^3v_1$, and to find one in $d^3v_2$ around $\mathbf{v}_2$, namely $f(\mathbf{v}_2)\,d^3v_2$. Call the velocities after the collision $\mathbf{v}_1', \mathbf{v}_2'$. Because of the time-reversal invariance of the laws, Maxwell assumes that the Stoßzahlansatz holds also for reversed collisions, in which the initial velocities $\mathbf{v}_1', \mathbf{v}_2'$ are changed to $\mathbf{v}_1, \mathbf{v}_2$. He further assumes that, in equilibrium, the number of pairs of particles that switch their initial velocities $\mathbf{v}_1, \mathbf{v}_2$ to final velocities $\mathbf{v}_1', \mathbf{v}_2'$ in unit time is equal to the number of pairs that switch their velocities from $\mathbf{v}_1', \mathbf{v}_2'$ to $\mathbf{v}_1, \mathbf{v}_2$ in that same unit time. All this implies that

$$f(\mathbf{v}_1)\, f(\mathbf{v}_2) = f(\mathbf{v}_1')\, f(\mathbf{v}_2') \quad \text{for suitable } \mathbf{v}_1, \mathbf{v}_2 \text{ and } \mathbf{v}_1', \mathbf{v}_2'.$$

Maxwell claims that his distribution (1) satisfies this condition and that any other doesn't (see Uffink 2007).

The question we'll face is whether static probability principles like the Stoßzahlansatz are compatible with the underlying mechanics. After all, from mechanics it seems we know that they're not! If two molecules collide (either in the past or the future, since the mechanics is time-symmetric), then the velocity of the one depends on the velocity of the other. If a moving molecule hits a stationary one (according to, say, the center-of-mass frame), then how fast the formerly still one is going hangs on how fast the one who hit it was going. Surely the molecules' velocities are not probabilistically independent of one another in a system with collisions, at least according to most understandings of ‘probabilistic independence.’

In any case, with this posit and a host of other assumptions Boltzmann seems able to work miracles (literally). Not content to assume a system is in equilibrium, Boltzmann aims to show how it got there from nonequilibrium. The idea was to explain, mechanically, the centerpiece of classical thermodynamics, the Second Law of Thermodynamics. According to this law, the entropy of adiabatically isolated systems will never decrease with time. Boltzmann thus sought a demonstration that for an arbitrary nonequilibrium distribution function f(v, t), this function would approach equilibrium and, once there, stay there. The result was the famous H-Theorem.

Define the H-function as

$$H(t) = \int d^3v\, f(\mathbf{v}, t) \log f(\mathbf{v}, t),$$

where we assume f(v, t) is a solution of the Boltzmann Equation and the gas is spatially homogeneous. Boltzmann proves that H is always nonincreasing with time and that the only stationary solution of his equation is Maxwell's distribution, (1). This claim, plus a few other heuristic considerations, yields the identification of entropy with $-H$. Putting all these new facts and suggestions together, it appears that we have a proof, from mostly mechanical assumptions, of perhaps the most significant piece of thermodynamics, the transition from nonequilibrium to equilibrium. Take any nonequilibrium low-entropy distribution f(v, t) that is a solution to Boltzmann's Equation, then in the limit of time this distribution is the equilibrium Maxwellian distribution.
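To get a feel for the H-function without solving the Boltzmann Equation, one can compare its value on two velocity distributions with the same mean kinetic energy. The sketch below is a one-dimensional illustration under stated assumptions: the distributions (a Gaussian and a uniform density of equal variance), the midpoint integration rule, and the names `h_functional`, `gaussian`, and `uniform` are all choices made here, not part of the text. The Gaussian, the one-dimensional analogue of the Maxwellian, yields the smaller H, consistent with its role as the state toward which H decreases.

```python
import math

def h_functional(f, lo, hi, steps=20_000):
    """Midpoint-rule estimate of H = integral of f log f over [lo, hi]."""
    dv = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        fv = f(lo + (i + 0.5) * dv)
        if fv > 0.0:  # f log f tends to 0 as f tends to 0
            total += fv * math.log(fv) * dv
    return total

SIGMA = 1.0  # common standard deviation, i.e. equal mean kinetic energy

def gaussian(v):
    return math.exp(-v * v / (2 * SIGMA**2)) / math.sqrt(2 * math.pi * SIGMA**2)

HALF_WIDTH = SIGMA * math.sqrt(3.0)  # uniform on [-a, a] has variance a^2 / 3

def uniform(v):
    return 1.0 / (2 * HALF_WIDTH) if abs(v) <= HALF_WIDTH else 0.0

h_gauss = h_functional(gaussian, -10.0, 10.0)
h_unif = h_functional(uniform, -10.0, 10.0)
print(h_gauss, h_unif)  # the Gaussian value is the lower of the two
```

This compares static values of H only; it says nothing about whether the dynamics actually carries one distribution into the other, which is the contested step.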

A result this big attracts scrutiny. Before too long, the statistical postulate at the heart of Boltzmann's argument became the subject of a great controversy that continues to this day. Taken at face value, the H-Theorem seems to prove what one a priori knows can't be proved from a purely classical-mechanical basis.2 As Boltzmann's contemporaries pointed out, classical mechanics has at least two features that are incompatible with Boltzmann's conclusion: it is quasi-periodic and time-reversal invariant. Given these mechanical features, how could Boltzmann have proved what he claimed? Because Clausius and Maxwell were after equilibrium and not also the rise to equilibrium, they did not threaten then-known features of mechanics. But, in retrospect, we might say that their probabilistic assumptions merited such critical scrutiny too. Applying the Stoßzahlansatz in both temporal directions at equilibrium might be said to be twice as objectionable as Boltzmann's application, rather than unobjectionable. However, with probability newly injected into physics, its interpretation inconsistently applied, and the whole field awash in simplifying assumptions, conflict between such statistical postulates and the dynamics was hard to see.

Later in life Maxwell spent a great deal of time reflecting on the introduction of probability into physics. He came to believe that the ‘statistical method’ was incompatible with the ‘dynamical method.’ But even in 1867, describing his work in a book review, one sees a hint that Maxwell already saw trouble:

I carefully abstain from asking the molecules which enter [the volume under consideration] where they last started from. I only count them and register their mean velocities, avoiding all personal enquiries which would only get me in trouble. (Garber, Brush & Everitt 1995, Document 15)

While discretion is undoubtedly a virtue, is it compatible with a thoroughgoing mechanics? Not if the molecules, like shadowy figures in detective novels, have secret past histories …

2 Statistical mechanics

Moving forward in time, these issues survived the change from kinetic theory to statistical mechanics. There is no hard and fast line between kinetic theory and statistical mechanics. Typically one thinks of the difference as between whether probabilities are being used with respect to the state of the entire system (statistical mechanics) or to aspects, like collision angles, of a single system (kinetic theory).

Today in statistical mechanics one finds a divide between Gibbsian formulations, which are actually what are used by practicing physicists, and descendants of a Boltzmannian theory, which surfaces primarily in foundational discussions. In the foundations of the subject, there are disputes between these two camps as well as internecine battles within each. Here, I can't do justice to all the intricacies that arise. Fortunately, the topic I want to discuss crosscuts Gibbsian and Boltzmannian lines to a certain extent. We can make do with only a cursory introduction to each, and then get on with the main arguments. Let's begin with some concepts and terminology both approaches have in common.

We can describe the exact microscopic state of an unconstrained classical Hamiltonian system by a point in an abstract space. Let $\Gamma$ be a 6N-dimensional abstract space spanned by the possible locations and momenta of each particle; then a point $X \in \Gamma$, where $X = (q_1, p_1, q_2, p_2, \ldots, q_N, p_N)$, details all the positions and momenta of all the particles in the system at a given time. X's movement through time is determined by the particular Hamiltonian, H(X), and Hamilton's equations of motion. Since energy is conserved, this evolution is restricted to a (6N − 1)-dimensional hypersurface of $\Gamma$, which we'll dub $\Gamma_E$.

Both approaches place a measure over this energy hypersurface. A measure is a device that, in the present case, allows one to speak of sets of infinite numbers as having ‘sizes.’ On the real-number line, there are as many numbers between 1 and 2 as between 1 and 3 (a continuous infinity), yet intuitively the second interval is larger. Measures let us say this. When a particular one, called Lebesgue measure, is adapted to the unit interval, it assigns values via $|b - a|$, for real numbers a and b. In this sense the set of numbers $[1,2] = \{x \in \mathbb{R} \mid 1 \le x \le 2\}$ is half as large as the set of numbers [1,3]. We face similar problems in defining N-dimensional volumes to regions of phase space. Because position and momentum take on real-number values, every region in phase space has an infinite number of microstates in it. To speak of sizes, volumes, and so forth, we need to impose a measure on phase space. A very natural one is Liouville measure $\mu$, a measure that weights points in an intuitively uniform manner. The Liouville measure is just the Lebesgue measure used above when adapted to the canonical representation of $\Gamma$ in terms of positions and momenta.

An important feature of $\mu$ and the Hamiltonian dynamics can now be stated: The dynamics is measure-preserving. If $A_t$ stands for the time-development of the points in region A after time t, then a theorem known as Liouville's Theorem implies that

$$\mu(A) = \mu(A_t).$$

As the set A changes with time, its ‘size’ nevertheless remains invariant.
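Measure preservation can be seen directly in a two-dimensional phase space. The sketch below is an illustration under assumptions chosen here: a single degree of freedom, two exactly solvable Hamiltonians (a free particle and a unit-frequency harmonic oscillator), and a polygonal region whose area is computed with the shoelace formula. Because both flows are linear maps, the image of a polygon is exactly the polygon on the image vertices, so the area comparison is exact up to rounding.

```python
import math

def free_flow(point, t, mass=1.0):
    """Exact flow for H = p^2 / 2m: a shear of the (q, p) plane."""
    q, p = point
    return (q + p * t / mass, p)

def oscillator_flow(point, t):
    """Exact flow for H = (q^2 + p^2) / 2: a rotation of the (q, p) plane."""
    q, p = point
    return (q * math.cos(t) + p * math.sin(t),
            -q * math.sin(t) + p * math.cos(t))

def polygon_area(vertices):
    """Shoelace formula for the area enclosed by a polygon."""
    twice_area = 0.0
    n = len(vertices)
    for i in range(n):
        q1, p1 = vertices[i]
        q2, p2 = vertices[(i + 1) % n]
        twice_area += q1 * p2 - q2 * p1
    return abs(twice_area) / 2.0

region_A = [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)]  # unit square
sheared = [free_flow(x, 0.7) for x in region_A]        # becomes a parallelogram
rotated = [oscillator_flow(x, 0.7) for x in region_A]  # rigidly rotated square

print(polygon_area(region_A), polygon_area(sheared), polygon_area(rotated))
```

The free-particle case is the more telling one: the square is visibly deformed, yet its area is unchanged.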

2.1 Boltzmann

The modern Boltzmannian picture is based, somewhat loosely, on the ‘geometrical’ picture first offered in Boltzmann 1877 and elsewhere. The idea begins with the notion of a macrostate. A macrostate M in thermodynamics is a state that has certain values of pressure, volume, temperature, and so on. Many microstates $X \in \Gamma_E$ give rise to each macrostate, of course. Using the resources described above, we can consider volumes $\mu(M)$ of $\Gamma_E$ that correspond to the set of all X's that would realize macrostate M. The set of all such volumes partitions $\Gamma_E$.

The Boltzmannian now identifies equilibrium as the macrostate corresponding to the largest (or to one among the largest, see Lavis 2005) macrostate in phase space. This identification is grounded in Boltzmann's (1877) famous ‘combinatorial argument.’ Here Boltzmann showed that for an ideal gas, the macrostate with the largest proportion of volume (that is, the greatest ‘number’ of microstates compatible with it) is the one whose distribution function corresponds to a local Maxwellian (a function identical locally to the Maxwell distribution). Recall that the Maxwell distribution is the equilibrium distribution.

This result suggests a new ‘geometrical’ interpretation of entropy. Define the Boltzmann entropy of a system with macrostate M as

$$S_B = k \log \mu(\Gamma_M),$$

where k is Boltzmann's constant. The macrostate with the largest volume, i.e. the highest entropy, is by definition the equilibrium macrostate. Notice that this entropy, unlike the classical thermodynamic entropy, is defined in and out of equilibrium. In equilibrium, it will take the same value as the Gibbs fine-grained entropy (defined below) if N is large. Outside equilibrium, the entropy can take different values and will exist so long as a well-defined macrostate does.
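The combinatorial idea behind this definition can be illustrated with a discrete toy model. The sketch below is an assumption-laden stand-in, not Boltzmann's own calculation: N labelled particles each sit in the left or right half of a box, a macrostate is the number in the left half, and counting microstates replaces the Liouville measure $\mu$. The names `multiplicity` and `boltzmann_entropy` are hypothetical, and k is set to 1.

```python
import math

def multiplicity(n_total, n_left):
    """Number of microstates with n_left of n_total particles in the left half."""
    return math.comb(n_total, n_left)

def boltzmann_entropy(n_total, n_left, k=1.0):
    """S_B = k log(size of macrostate), with counting in place of volume."""
    return k * math.log(multiplicity(n_total, n_left))

N = 100
entropies = {n: boltzmann_entropy(N, n) for n in range(N + 1)}

# The equilibrium macrostate is the one of largest size, hence highest entropy.
equilibrium = max(entropies, key=entropies.get)

# Share of all 2^N microstates lying in macrostates close to the even split.
fraction_near_half = sum(multiplicity(N, n) for n in range(40, 61)) / 2**N

print(equilibrium, fraction_near_half)
```

Even at N = 100 the macrostates near the even split already contain the great majority of microstates; for realistic particle numbers the dominance is far more extreme.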

Earlier we saw that Boltzmann devised a theory of the movement from nonequilibrium to equilibrium. In the case of the dilute gas, he is able to show that the Boltzmann entropy inherits all these nice features. That is, he is able to show that (at least in the case of the ideal gas3) when N is large, $S_B$ is essentially equivalent to H, which in turn was already plausibly identified with the thermodynamic entropy. Endowing the measure with probabilistic significance and hoping the dynamics cooperate, it suggests, following Earman (2006), Boltzmann's version of the Second Law:

If at some time $t_0$ the $S_B(t_0)$ of a system is low relative to the entropy of the equilibrium state, then for later times $t > t_0$, so long as $(t - t_0)$ is small compared to the recurrence time of the system, it is highly probable that $S_B(t) > S_B(t_0)$.

The picture is that when $\mu(\Gamma_{M_t}) > \mu(\Gamma_{M_{t_0}})$, the system will have an overwhelming likelihood of transiting from $M_{t_0}$ to $M_t$.

We can connect this result with the earlier Boltzmann as follows.4 The Boltzmann Equation describes the evolution of the distribution function f(x, v) over a certain span of time, and this evolution is one toward equilibrium. But we learn from the Reversibility Paradox (see below) that not every microstate will do so. Relative to the measure, however, most of them will. So let $\Gamma_\delta \subset \Gamma_E$ be the set of all particle configurations X that have distance $\delta > 0$ from f(x, v). A typical point $X \in \Gamma_\delta$ is one whose solution (a curve $t \mapsto X(t)$) for some reasonable span of time stays close to the solution of the Boltzmann Equation (a curve $t \mapsto f_t(x, v)$).

An atypical point $X \in \Gamma_\delta$ is one that departs from the solution to the Boltzmann Equation. The claim is then that, measure-theoretically, most nonequilibrium points $X \in \Gamma_\delta$ are typical. The expectation, proven only in limited cases, is that the weight of good points grows as N increases. The Boltzmannian wants to understand this as providing warrant for the belief that the microstate underlying any nonequilibrium macrostate one observes is almost certainly one subsequently heading toward equilibrium. The desired conclusion hangs on highly nontrivial claims, in particular the claim that the solution to Hamilton's equations of motion for typical points follows the solution to the Boltzmann Equation.

2.2 Gibbs

The biggest difference between Boltzmann and Gibbs is that Boltzmann defines equilibrium and entropy in terms of individual systems, whereas Gibbs does not. Gibbs 1981 [1902] instead works with ensembles of systems. The ensemble is a fictitious collection of the continuous infinity of microstates that yield the same macroscopic state. Gibbsian statistical mechanics works by imposing a density function ρ(q, p, t) on the ensemble and then calculating various functions of this density. These functions will correspond to the observables.

Gibbs wants to interpret the density as a probability density, so the function $\rho$ is normalized. Using this probability density one calculates what are called phase or ensemble averages. These are expectation values of functions f(X) on phase space:

$$\langle f \rangle = \int_{\Gamma} f(X)\, \rho(X)\, d\Gamma_E.$$

Gibbs claims that every experimentally observable function corresponds to some phase average, with the crucial exceptions of entropy, temperature, and the chemical potential. The so-called ‘fine-grained entropy,’ from which temperature is derived, is instead given by

$$S_G(\rho) = -k \int_{\Gamma} \rho \log(\rho)\, d\Gamma. \tag{2}$$

The temperature and chemical potential are then recovered from the above ingredients.

The framework we have described is empty until we specify a precise probability distribution. Which one do we choose? There are, after all, an infinite number of them. Gibbs primarily wants to recover equilibrium thermodynamics and, like Maxwell before him, understands unchanging macroscopic properties as the hallmark of equilibrium. He therefore insists that the probability distribution be stationary, i.e. ∂ρ/∂t = 0. However, every distribution that is a function of the Hamiltonian is stationary; stationarity doesn't single one ensemble out. What singles out the relevant distribution, according to Gibbs, are the constraints on the system and other desirable features. If we keep the energy and particle number constant, then Gibbs looks for the distribution that maximizes the fine-grained entropy. This distribution, called the microcanonical probability measure, or microcanonical ensemble, is the measure uniform on $\Gamma_E$ and zero elsewhere:

$$\rho(X) = R\, \delta(H(X) - E).$$

Here δ is a Dirac delta‐function and R a renormalization constant. One can see that it is a stationary measure by observing that it depends on X only via H(X), yet H(X) is a constant of the motion. For systems with different constraints there are different distributions, most famously the canonical ensemble (a system in contact with a heat bath) and the grand canonical ensemble (an ‘open’ system, in which particle number is not fixed).

Whichever ensemble is chosen, the probability of finding the microstate in some particular region A of $\Gamma_E$ is given by

$$P_t(A) = \int_A \rho(X, t)\, d\Gamma_E. \tag{3}$$

This claim is a central feature of the Gibbsian approach.
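A finite toy model makes the Gibbsian recipe concrete. The sketch below is a discrete analogue under assumptions made here, not Gibbs's continuous construction: four two-level units with energies 0 or 1 replace phase space, sums replace integrals, the ‘energy shell’ is the set of microstates with total energy 2, and the density is uniform on that shell and zero elsewhere. All names (`hamiltonian`, `probability`, `gibbs_entropy`, `region_A`) are hypothetical, and k is set to 1.

```python
import itertools
import math
from fractions import Fraction

N_UNITS = 4
ENERGY = 2

def hamiltonian(x):
    """Total energy of a microstate: the number of excited units."""
    return sum(x)

microstates = list(itertools.product((0, 1), repeat=N_UNITS))
shell = [x for x in microstates if hamiltonian(x) == ENERGY]

# Discrete microcanonical density: uniform on the shell, zero elsewhere.
rho = {x: (Fraction(1, len(shell)) if x in shell else Fraction(0))
       for x in microstates}

def probability(region):
    """Discrete analogue of integrating the density over a region A."""
    return sum(rho[x] for x in region)

def gibbs_entropy(density, k=1.0):
    """Discrete analogue of the fine-grained entropy, -k * sum(rho log rho)."""
    return -k * sum(float(p) * math.log(float(p))
                    for p in density.values() if p > 0)

# Region A: all microstates in which the first unit is excited.
region_A = [x for x in microstates if x[0] == 1]

print(probability(region_A), gibbs_entropy(rho), math.log(len(shell)))
```

Note that the probability of region A is fixed by the constraint alone; nothing about how the system arrived on the shell enters the calculation.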

3 The clash between static and dynamic probabilities

We now arrive at the central problem of this paper. The difficulty is more or less the same worry we've voiced about Maxwell and the earlier Boltzmann. Are the static‐probability posits compatible with the underlying mechanics? Before, we were worried about independence assumptions holding at a time; now we're worried about an even more basic probability assumption holding at a time. Both the Gibbsian and Boltzmannian frameworks are committed to a static‐probability rule:

(SP) If a system is in macrostate M at t, and X is a microstate that corresponds to M, then the probability that X lies in a subset A of $\Gamma_M$ is r, where r is a real number between 0 and 1, inclusive.

For the Boltzmannian, the probability that the actual microstate lies in A is given by $\mu(A)/\mu(\Gamma_E)$. Supplemented with an argument that macrostates closer to equilibrium occupy a greater proportion of $\Gamma_E$, the Boltzmannian can then try to explain the increase in entropy as the result of an improbable-to-probable transition. For the Gibbsian, as we have just witnessed, the probability that the actual microstate lies in some region A is calculated with the microcanonical ensemble (if the system is isolated; a different ensemble if not). This ensemble is a probability measure imposed on the microstates themselves; via this measure one can calculate the probability in SP via equation (3). The theories differ over scope, the Boltzmannian applying his framework to equilibrium and nonequilibrium, the Gibbsian to only equilibrium. Yet they are equally committed to SP, the Boltzmannian through $\mu(A)/\mu(\Gamma_E)$ and the Gibbsian through (3).

When one reflects on SP one sees it is actually quite remarkable. Merely on the basis of knowing the macrostate of a system, the Gibbsian and Boltzmannian immediately know the probability of the corresponding microstates (strictly, the probability of regions of microstates). When one steps back and considers matters from a purely microscopic and mechanical perspective, this ability to know the probability of microstates based on such limited information at one time seems almost miraculous. It's as if I reliably know the probability you're in the kitchen based merely on the restriction that you're in the house. Surely, one might think, the probability instead hangs on the time of day, when you last ate, how tired you are, what your favorite television show is, and so on. Miracle or not, statistical mechanics proceeds quite successfully with the continual invocation of SP.

The exact microdynamics for the system, by contrast, does take the past history of the system into account. Indeed, it takes everything that bears on the probability of some system evolving into some microstate into account. Instead of providing static probabilities, the exact microdynamics dictates that systems evolve with specific transition probabilities. Transition probabilities stipulate the chances of a system evolving to a later or earlier state given the present state. In a deterministic theory, like classical mechanics, all these chances are either one or zero; in an indeterministic theory, like many versions of quantum mechanics, the chances can be between one and zero.

The question, at its most general level, is whether a marriage can be arranged between the transition chances provided by mechanics and statistical mechanics’ SP. Can the two be consistently combined? By ‘consistently combined’ I don't mean specifying the same exact probability to every event at every time. SP probabilities are unconditional on previous states, whereas dynamical probabilities aren't, so they typically won't specify the same numerical value, most obviously when the microevolution is deterministic. I mean something looser, namely, whether SP is probabilistically disconfirmed, on its own terms, by consistently supposing that the ensemble of systems it operates on always evolves according to the dynamics.

As a purely logical exercise, it's easy to produce examples with static and transition chances in harmony. It can happen. For a trivial example consider a Bernoulli process, which is a discrete-time stochastic process consisting of a finite or infinite sequence of independent random variables taking on one of two possible values (e.g. heads or tails). Coin flips are the canonical example of Bernoulli processes. The neat thing about Bernoulli processes is that the probability that a variable will take on a particular value, e.g. heads, is constant for all variables in the sequence. Bernoulli processes are memory-less: the static probability (e.g. chance of heads on flip n) is independent of what transpires in the past and future. Trivially, the dynamic transition probabilities match the static probabilities. For a much less trivial example, note with Albert (2000) that the Born probability distribution from quantum mechanics turns out to be a static-probability rule that is perfectly consistent with the underlying deterministic dynamics of the Bohm interpretation (see Callender 2007 for some of the subtleties of this claim). The task is not impossible, even if, intuitively, we require a kind of delicate contrivance between the two to maintain consistency.
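One simple way to make ‘harmony’ between static and transition chances precise is with a two-state Markov chain: a static rule is consistent with the transition probabilities when pushing it forward one step returns the same distribution. The sketch below rests on that assumption, which is an illustration chosen here rather than the author's own formalization; the matrices and the names `push_forward` and `is_consistent` are hypothetical.

```python
def push_forward(dist, transition):
    """Evolve a distribution one step: new[j] = sum_i dist[i] * transition[i][j]."""
    n = len(dist)
    return [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]

def is_consistent(static, transition, tol=1e-12):
    """True if the static rule is left unchanged by one step of the dynamics."""
    evolved = push_forward(static, transition)
    return all(abs(a - b) < tol for a, b in zip(static, evolved))

# Bernoulli process (fair coin): the next outcome ignores the current one,
# so every row of the transition matrix equals the static distribution.
bernoulli = [[0.5, 0.5],
             [0.5, 0.5]]

# A chain with memory: the next state depends on the current state.
chain = [[0.9, 0.1],
         [0.3, 0.7]]

print(is_consistent([0.5, 0.5], bernoulli))   # static rule matches the dynamics
print(is_consistent([0.75, 0.25], chain))     # a delicately matched static rule
print(is_consistent([0.5, 0.5], chain))       # a static rule the dynamics undermines
```

The chain with memory admits a consistent static rule, but only the specially matched one; a uniform rule imposed at every time step would be disconfirmed by the rule's own evolution.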

3.1 The Reversibility Paradox

The Reversibility Paradox provides an illustration of the charge that SP‐chances do not fit with the classical‐mechanical transition probabilities.

In terms of the Boltzmannian picture, the Reversibility Paradox can be put as follows. Nowhere in our theory did the direction of time play a role; since the physics is time-symmetric, the reasoning can be reversed so that entropy increase toward the past is most probable too. In particular, make the following assumptions (Earman 2006): (i) the microscopic dynamics is time-reversible, (ii) the measure of a region in phase space is equal to the measure of the time-reversed region: $\mu(A) = \mu(A^T)$ for all A, and (iii) if M is a macrostate with corresponding volume $\mu(\Gamma_M)$, then $M^T$ is also a macrostate with corresponding volume $\mu(\Gamma_{M^T})$. The first two assumptions follow from the Hamiltonian mechanics we're using; the third simply from the Boltzmannian framework. With these assumptions in place it is possible to prove that $P(M(t_0 + \tau) \mid M(t_0))$, the probability that the system is in macrostate $M(t_0 + \tau)$ at time $t_0 + \tau$, given it was in macrostate $M(t_0)$ at $t_0$, is equal to $P(M^T(t_0 - \tau) \mid M(t_0))$, where $\tau > 0$. Therefore, if it's likely that the initial Boltzmann entropy $S_B(M(t_0))$ rises after $\tau$ to $S_B(M(t_0 + \tau))$, it is also likely that, many seconds before, the entropy $S_B(M^T(t_0 - \tau))$ was higher than at $t_0$.
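The symmetry at issue can be watched in a small reversible model. The sketch below uses the Kac ring, a standard toy model that is not discussed in the text and is brought in here purely as an illustration: balls on a ring are white (+1) or black (−1), each step moves every ball one site clockwise, and a ball changes color when it crosses a marked edge. The dynamics is deterministic and exactly invertible. The ring size, marker density, seed, and function names are all assumptions made here. Starting from the low-entropy macrostate ‘all white’ and evolving in either time direction drives the color mix toward the even split.

```python
import random

def make_markers(n_sites, density, seed=0):
    """Randomly mark edges; a ball crossing a marked edge changes color."""
    rng = random.Random(seed)
    return [rng.random() < density for _ in range(n_sites)]

def step_forward(colors, markers):
    """Move each ball from site i to site i + 1, flipping it on a marked edge."""
    n = len(colors)
    new = [0] * n
    for i in range(n):
        new[(i + 1) % n] = -colors[i] if markers[i] else colors[i]
    return new

def step_backward(colors, markers):
    """Exact inverse of step_forward: move each ball from site i + 1 to site i."""
    n = len(colors)
    old = [0] * n
    for i in range(n):
        moved = colors[(i + 1) % n]
        old[i] = -moved if markers[i] else moved
    return old

def evolve(colors, markers, steps, step):
    for _ in range(steps):
        colors = step(colors, markers)
    return colors

def black_fraction(colors):
    return sum(1 for c in colors if c == -1) / len(colors)

N_SITES = 10_000
markers = make_markers(N_SITES, density=0.1)
initial = [1] * N_SITES                      # low-entropy macrostate: all white
future = evolve(initial, markers, 20, step_forward)
past = evolve(initial, markers, 20, step_backward)

print(black_fraction(initial), black_fraction(future), black_fraction(past))
```

Applying the uniform-measure reasoning to the all-white state thus ‘predicts’ a more mixed state 20 steps later and, equally, 20 steps earlier, which is the retrodiction the paradox turns on.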

To this problem the modern Boltzmannian answers: the framework is, as it must be, time‐symmetric, but time‐asymmetric input will yield time‐asymmetric output. Add the time‐asymmetric assumption that the initial state of the system has low Boltzmann entropy. Then, if low enough, we can expect time‐asymmetric increase in entropy. Problem solved. Eventually, the entropy will decrease and there will even be recurrence for systems with bound phase spaces, but calculations suggest that it will be a very, very long time before this occurs for any macroscopic system. If it happens, that is a prediction of the theory. We only want to recover thermodynamics from mechanics when thermodynamics actually (approximately) holds.

The above answer is fine, so far as it goes. If I start out with a system at time $t_0$ that is a freshly poured cup of coffee, immerse it in room temperature, then, if all the assumptions hold up, most likely the coffee cup will relax to room temperature by time $t_0 + \tau$. Success! However, let's think about this more carefully. If I repeat this procedure at time $t_0 + \tau$, I will again place a uniform probability distribution over all the microstates compatible with the macrostate at $t_0 + \tau$. Because the dynamics is time-reversible, I can evolve this probability distribution backwards and make a prediction for $t_0$. That prediction would be that only a small proportion of the earlier microstates should be further from equilibrium rather than closer to equilibrium (because according to the reapplication of SP at $t_0 + \tau$, only a small proportion of the microstates are atypical ones). So this derived probability distribution at $t_0$ conflicts with the originally assumed probability distribution at $t_0$. The original one weighted the low-entropy states heavily whereas the ‘second’ one weighted the higher-entropy states heavily. This is a flat-out contradiction. Yet all we did to create this inconsistency is use the very procedure that works so well, but this time jumping back a time-step, fully in accord with the dynamical laws.

The Gibbsian faces essentially the same difficulty, although it's exacerbated by a few extra factors special to Gibbs. Consider a gas in equilibrium at $t_0$ in an isolated chamber of volume $V$, with one wall connected to a piston. Slide the moveable wall out a way so the chamber is now of volume $V'$, where $V' = 2V$. After some time $t$ the gas is relaxed to a new equilibrium. The Gibbsian will describe the gas at time $t_0$ with the microcanonical ensemble $\rho_0$; and later at $t_1$ she will describe it with a new ensemble $\rho_1$. Naturally, since the distributions are with respect to different volumes, $\rho_0 \neq \rho_1$. Gibbsians calculate the entropy change by finding the difference in the values of $S_G$ associated with each ensemble. Our question is: Is there any way of getting from one distribution to the other via mechanics? With the reversibility worry above in mind, can I evolve $\rho_1$ backward in time and get $\rho_0$? Because we are dealing with a situation with a time‐dependent Hamiltonian, the distributions can nontrivially evolve with time, unlike in the case wherein the Hamiltonian is time‐independent and $\rho$ is stationary. Yet we already know what will happen: nothing, at least to the entropy. Due to Liouville's Theorem, the fine‐grained entropy remains constant. Since we know that the entropies at those two times are different, that means there is no way to get back to the original distribution. The same goes for the time‐reversed distributions. Even worse, in the Gibbsian case there is even a problem in the future direction, for if we evolve $\rho_0$ forward in time we get the wrong fine‐grained entropy; or if we evolve $\rho_1$ backward in time we again get the wrong entropy. Since we can't get from one empirically correct distribution to another via the dynamics, it seems the standard procedure is not justifiable from a mechanistic perspective. As Sklar (1993) complains, it is ‘not fair to choose a new ensemble description of the system at a later time’ (p. 54).

Phrased in terms of Gibbs entropy, this problem is known as the ‘paradox of the fine‐grained entropy.’ The fine‐grained entropy cannot change with time, as one quickly sees by taking the time‐derivative of (2) and noting Liouville's Theorem. Some followers of Gibbs at this stage introduce a new entropy, the so‐called ‘coarse‐grained entropy.’ The coarse‐grained entropy, unlike $S_G$, can (p.95) change with time. I haven't space to discuss this move here.5 Suffice to say, apart from its other difficulties, one can repeat the Reversibility Paradox for this entropy too.
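
The constancy of the fine‐grained entropy can be sketched in a few lines. The derivation below assumes that equation (2) is the standard Gibbs entropy, $S_G = -k_B \int \rho \ln \rho \, d\Gamma$, which is not reproduced in this section. The symbols $\Gamma$ (phase space) and $v$ (the Hamiltonian phase‐space velocity field) are notation introduced here for illustration, and $\rho$ is assumed to vanish fast enough at the boundary of phase space for the surface term to drop out.

```latex
\begin{align}
\frac{\partial \rho}{\partial t}
  &= -\nabla\cdot(\rho v) = -\,v\cdot\nabla\rho
  && \text{(continuity, with Liouville: } \nabla\cdot v = 0\text{)} \\
\frac{dS_G}{dt}
  &= -k_B \int \frac{\partial \rho}{\partial t}\,(\ln\rho + 1)\, d\Gamma \\
  &= k_B \int (v\cdot\nabla\rho)\,(\ln\rho + 1)\, d\Gamma \\
  &= k_B \int v\cdot\nabla(\rho\ln\rho)\, d\Gamma \\
  &= k_B \int \nabla\cdot(v\,\rho\ln\rho)\, d\Gamma = 0 .
\end{align}
```

The last line uses $\nabla\cdot v = 0$ once more, together with the divergence theorem. Nothing in the sketch requires the Hamiltonian to be time‐independent, since the flow stays divergence‐free either way, which is why moving the piston does not help.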

Stepping back, we see that SP conflicts with the dynamics in one of two ways it could. There are two possible conflicts, one corresponding to each direction of time, but we're assuming a solution to one of them. First, SP might not be compatible with the dynamics in the forward direction of time. Since SP works so well in this direction, however, we turn a blind eye to this question. We implicitly assume that the dynamics will cooperate. Some mixing‐like behavior (see e.g. Earman 2006) is assumed to be operative, but it's important to note that we haven't shown this. Second, and worse, SP works dismally in the past direction. For any system in equilibrium at any time, it's always possible that it reached that state through a myriad of different past macroscopic histories. The system may have always been in equilibrium, just that moment settled down to equilibrium, or relaxed to equilibrium twenty minutes ago. Guidance about the past history based on SP can't get this right.6

At this point, one can start looking at different programs in the foundations of statistical mechanics. One might examine attempts to single out the Lebesgue probability measure as exceedingly special, attempts to derive this measure from more fundamental physics, or attempts to weaken one's commitment to this measure by showing that many other measures also would work. As Sklar (2006) emphasizes, each of these different approaches (and there are more still!) yields a very different understanding of SP.

Nevertheless, in this paper I want to continue our focus on the Reversibility Paradox and the compatibility of SP with the underlying mechanics. I see this issue as a crucial juncture in the foundations of the subject. The different paths leading from distinct answers produce theories that are radically dissimilar, theories with different research questions, and theories that are radically at odds when it comes to the wishes of the founding fathers of statistical mechanics. Bringing this out is what I hope to do in the sequel.

4 From local to global

What I want to do now is show that answering this problem while keeping the original motivations of the mechanists somewhat intact seemingly calls for moving the theory in a dramatically ‘global’ direction. Until now, we have spoken of SP as being applied to coffee cups, boxes of gas, and so on, i.e. small, (p.96) relatively isolated systems. However, Boltzmann ended up applying his theory to the universe at large, treating the entire universe as one isolated system:

That the mixture was not complete from the start, but rather that the world began in a very unlikely state, this must be counted amongst the fundamental hypotheses of the whole theory. (Boltzmann 1974, p. 172)7

Boltzmann exasperates some commentators for taking this lunge toward totality, given the limitations of his demonstrations.8 But within Boltzmann's framework, one must admit that there are certain natural pressures to go global. If one is after compatibility between SP and the mechanical framework, it does no good to ignore the serious pressures to go global. The choice over how far to extend the Reversibility Paradox is tantamount to the question of whether statistical mechanics is a special or fundamental science, a local or global theory.

The most straightforward way to avoid the Reversibility Paradox is to imagine imposing the initial probability distribution when the system begins. No worries about bad retrodictions present themselves if the system doesn't exist at those times. The trouble‐making past is eliminated. Of course, if the ‘first’ state is one of high entropy, this maneuver won't do us any good. So we need to additionally suppose that the ‘first’ state is one of extremely low entropy. This recipe (cut off the past, impose the probabilities on this first state) is clearly an effective way of solving the problem.

Moreover, thinking in terms of coffee cups, ice cubes, and laboratory setups, this assumption is perfectly natural. All of these systems do in fact begin their ‘lives’ initially in low entropy. Until they become energetically open on a macroscopic scale, SP works wonderfully for such systems. We could also imagine specifying very precisely the degree of energetic isolation necessary for a system to count as beginning its life. The proposal, then, is that we impose SP at (roughly) the first moment low‐entropy macroscopic systems become suitably isolated.

This position is more or less Reichenbach's 1956 ‘Branch‐System’ Hypothesis.9 The ‘branches’ are the energetically isolated subsystems of the universe to which the statistical postulate is applied. Davies (1974), the physicist, as well (p.97) as a slew of contemporary philosophers, have advocated it. It's easy to see the appeal: it solves the Reversibility Paradox with a modest cost.

However, it's hardly clear that this solution is compatible with the underlying dynamics. Albert (2000) has made some forceful objections to the branch view, to which Frigg (2008) and Winsberg (2004a) have each replied. Albert raises many problems quickly, the first few of which center on the vagueness of the whole branching theory. When does a branch come into life? Frigg and Winsberg respond that this vagueness is benign. As mentioned, one can imagine various well‐defined recipes for defining macroscopically energetically isolated systems. In my view, while vagueness can ultimately be a worry, it's not the main threat to branches. The real action lies in Albert's warning that

serious questions would remain as to the logical consistency of all these statistical‐hypotheses‐applied‐to‐individual‐branch‐systems with one another, and with the earlier histories of the branch systems those branch systems branched off from. (Albert 2000, p. 89)

Forget the first, ‘one another’ worry and concentrate on the second, ‘earlier histories’ concern. This concern is essentially that branching hasn't solved the Reversibility Paradox. Just as it's ‘not fair’ to use the dynamics in one temporal direction and ignore it in the other, it doesn't seem fair to ignore the past histories of coffee cups, boxes of gas, and so on. True, we human beings using natural languages don't call the highly disjoint and scattered pre‐formation stages of the coffee cup ‘a cup of coffee,’ but so what? The microstate is still there, it still evolves according to the dynamics, and the dynamics is still time‐reversible. All of these features together suggest that we can legitimately backward‐time‐evolve that static probability distribution. When we do, their prior histories will create havoc, as we know from the Reversibility Paradox.

Put another way, branches, lives, and cups of coffee aren't part of the vocabulary of microphysics. Are ensembles of microstates, just because they evolve into regions of phase space associated with human beings calling them ‘cups of coffee,’ now supposed to scramble and distribute themselves uniformly?10

In response to this worry, Winsberg looks squarely at the grand project and abandons it, so far as I can tell. He writes:

Where does the branch‐systems proposal get the temerity, then, to flagrantly disregard the clear and present authority of the microlaws and simply stipulate that the microconditions (or, at least, the probability distribution of a set of possible microconditions) just are such and such, as an objective, empirical and contingent fact about the world. But this worry, like (p.98) all worries, has a presupposition. … If the worry is about the authority of the microlaws, the proponent of the framework conception can simply reply that it is a mistake to think that the microlaws need offer us a complete description of the universe. (2004a, pp. 715–16)

Echoing Winsberg, Frigg points out that if one doesn't view the mechanical laws as universal, then this objection has no force.

That of course is certainly correct. Yet what is the resulting picture? It's essentially a branching view of the underlying mechanics too. When the branch systems develop, then they start operating according to classical mechanics, statistical mechanics, and thermodynamics, all at once. Before that …? The past mechanical histories of particles have been eliminated. What occasioned the start of all these laws? On this view, you can't ask.

If one was worried about the vagueness and relativity of the branch proposal before, one really worries now. Now, when the brancher states that the coffee cup is a branch with respect to the house, the house with respect to the Earth, and the Earth with respect to the solar system (pretending that each is approximately energetically isolated), we really have headaches. If we admit that, say, the Earth is evolving classical‐mechanically (and surely Newton was onto something), then aren't all the parts of the Earth also evolving classical‐mechanically, whether or not they are branch systems? But then can't we talk about the prior mechanical history of the coffee cup?

The contemporary brancher appears to solve the Reversibility Paradox by throwing the baby out with the bathwater. We have lots of evidence for the existence of both Static and Dynamic Probabilities. How can we have both? The contemporary brancher answers that we can have both only by rejecting the basic assumption of the project, that systems are always evolving according to mechanics. Maybe one's big‐picture views in philosophy of science will countenance such a pick‐and‐choose approach to the laws of nature. We can't tackle this question here. What's certainly the case is that we've now abandoned the project of Clausius, Maxwell, and Boltzmann (and probably even Reichenbach, the founder of branching). To the extent that this project is worthwhile, then, we have reason to keep working. Later we will see a return of the branching picture, but one perhaps not as radical as the contemporary one envisioned here.

In light of the above problems, it's easy to see that avoiding the Reversibility Paradox seems to demand a global solution. Impose the SP on your coffee cup, then we need to worry about its pre‐formation stages as a lump of clay in a factory. Put it on the lump of clay, then we need to worry about its state when the clay was part of a riverbed, and so on. Same goes for my coffee cup, made from the same factory, divided from the same lump of clay. The logic of the explanation leads inexorably to the earliest and spatially greatest‐in‐extent state mechanically connected to anything we now judge an instance of thermodynamics at work.

(p.99) Boltzmann's lunge for a global understanding of statistical mechanics was not a spectacular lapse of judgment; within his framework it was instead a natural product of the demand for consistency and for a mechanical explanation of why SP works.

In the standard model of cosmology, the earliest and widest state of the universe is associated with the Big Bang. The resulting picture is to impose SP on an early state of the universe while simultaneously claiming that that state is one of especially low entropy. This claim, that the entropy of the extremely early universe was very low, is dubbed the ‘Past Hypothesis’ by Albert (2000).11

We have gained consistency with the mechanics by pushing SP back to the beginning of the universe. That doesn't tell us what to use now, however. For the global state of the universe now, Albert advises us to abandon the original SP (which holds only at the Big Bang) and modify it for use at later times. The modification goes as follows. Let the Past State (the state the Past Hypothesis stipulates) be at time $t_0$. The new measure for some later time $t$, where $t > t_0$, is the old SP except changed by conditionalizing on both the current macrostate and the Past State. That is, consider all the microstates at time $t$ compatible with the current macrostate and the Past State macrostate. These microstates form a set, a proper subset of the set of microstates compatible with the current macrostate. Intuitively put, these are the microstates compatible with the current macrostate that allegedly don't give us reversibility headaches. Put the uniform (Lebesgue) measure on this restricted set. Then the chance that any microstate is in region $A$ is determined the usual way, i.e. by the proportion of the measure of $A$ with respect to the measure of this restricted set. Let's call this new probability principle SP*.
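
The construction of SP* can be written compactly. In the sketch below the symbols are illustrative shorthand, not notation taken from Albert or from this chapter: $\mu$ is the Lebesgue measure on phase space, $M_t$ is the set of microstates compatible with the macrostate at time $t$, $M_{PH}$ is the set of microstates compatible with the Past State at $t_0$, and $\phi_{t_0 \to t}$ is the dynamical flow carrying microstates from $t_0$ to $t$. The restricted set $R_t$ is assumed to have nonzero measure.

```latex
\begin{align}
R_t &= M_t \cap \phi_{t_0 \to t}(M_{PH}), \qquad R_t \subsetneq M_t, \\
P_{SP^*}(A) &= \frac{\mu(A \cap R_t)}{\mu(R_t)} .
\end{align}
```

On this reading the original SP is the special case in which the conditioning on $M_{PH}$ is dropped, so that $R_t$ is replaced by $M_t$ itself.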

Stepping back from the details, we seem to have achieved our goal. We wanted to know how SP could be compatible with the dynamics. The answer is twofold: One, push back the static probabilities to the beginning of the universe, and two, replace the static probabilities used at later times with a new probability, that encoded in SP*, one that makes essentially the same probabilistic predictions for the future as the old static probability (presumably, since there is no ‘Future Hypothesis’).

5 Problems with going global

Extending the SP backward and outward undeniably has many benefits, but it also brings with it some costs. One common objection is that it's not clear (p.100) what a probability distribution over the state of the whole universe even means (Torretti 2007). Barring far‐fetched scenarios wherein our universe is selected from a giant urn of universes, what could it mean for an entire world history to be probable or improbable? My own view about this objection is that it shouldn't ultimately be decisive. If one grants that the above theory constitutes a good physical explanation of entropy increase, then it's the job of philosophers of probability to devise a notion of probability suitable to this theory. It would be remiss to abandon the theory just because philosophers and others haven't yet produced such a notion. Furthermore, there are plenty of theories of chance that would fill the role of providing an objective notion of probability without requiring an absurd ensemble of universes (e.g. Loewer 2004). None of these theories are perfect, but some are promising and work on them is ongoing. More worrisome to me are what I'll call the Subsystem, Definability, and Imperialism Concerns—and their entanglement.

Starting with the first, we began talking about relatively isolated subsystems of the universe and then moved to the universe as a whole: can we go back again to talk of subsystems? After all, our empirical experience is with the thermodynamics of subsystems of the universe. If we only recover an explanation compatible with mechanics that applies to the whole world, then this is a Pyrrhic victory. What follows for isolated subsystems from the fact that the global entropy is likely to increase? To me, this question is among the most pressing facing the Globalist solution.

The worry isn't that the odd cup of coffee might not go to equilibrium. That is precisely what we expect if the above theory works and the universe lasts long enough, i.e. we expect fluctuations. This pseudo‐problem assumes the theory works for most thermodynamic terrestrial subsystems. The real worry is that it doesn't. Why does it follow from the entropy increase of the system that the entropy defined for subsystems also increases? After all, the subsystems correspond in phase space to subspaces of smaller dimension, all of which have zero volume. So even if one assumed the system is additive or extensive—which one shouldn't—adding up a bunch of logarithms of zero volumes doesn't get us anywhere. It's a mistake to rely on the intuition that if the system's entropy is increasing then ‘most’ of the subsystems' entropies are increasing too.

Certain physical considerations can also be marshaled in support of skepticism. Notice that in terrestrial systems, where gravity is approximately uniform, entropy‐increasing systems tend to expand through their available volumes. When faced with the smooth early state of the universe, however, which suggests an initially high entropy, people immediately remark that because gravity is attractive, we should expect low‐entropy states to be spread out in the configuration sector of phase space. None of this talk is at all rigorous (Callender 2010, Earman 2006, Wallace 2010a). Never mind. Suppose it's roughly correct.

(p.101) Then the Past Hypothesis's entropy seems to be low ‘primarily because of the gravitational contribution, whereas that contribution is irrelevant for the kinds of subsystems of interest to us’ (Earman 2006, p. 419). The low entropy is driving the rise of structure in the cosmos, the creation of clusters, galaxies, stars, and so on. Compared to that, the cooling of the coffee in my cup is small beans. As Earman points out, we can't expect the entropy rise in the gravitational degrees of freedom to help with entropy increase in the nongravitational degrees of freedom, for the time‐scales are all wrong. The stars, for instance, are effectively ‘fixed’ during the lifetime of my morning coffee. The gravitational case appears to be precisely the sort of nightmare case imagined just above wherein it doesn't matter to the global entropy increase if locally, here on Earth, entropy never increased in small isolated systems. The entropy increase occasioned by the rise of galactic structure swamps anything that happens here on our little planet. Moreover, it's hardly clear that entropy is additive in the gravitational context (Callender 2010), which is an assumption charitably underlying this reasoning.

In reply, the Boltzmannian must cross her fingers and hope that the dynamics is kind. Recall that any subsystem corresponds, in phase space, to a lower‐dimensional subspace of the original phase space. The hope must be that when we sample the original, approximately uniform distribution onto this subspace and renormalize we again find a distribution that is approximately uniform. Pictorially, imagine a plane and a thoroughly fibrillated set of points on this plane, a set so fibrillated that it corresponds to an approximately uniform measure. Now draw a line at random through this plane and color in the points on the line that intersect the fibrillated set. Are these colored points themselves approximately uniformly distributed? That is what the Boltzmannian needs for local thermodynamic systems, except with vastly many higher dimensions originally and much greater dimensional gaps between the phase space and subspaces.

How are we to evaluate this reply? Based on experience with many types of systems, some physicists don't balk at the thought of such fibrillation. They see it in some of the systems they deal with and in the absence of constraints ask why things shouldn't be so fibrillated. Others, such as Winsberg (2004b), think it false. The unhappy truth, however, is that we simply have no idea how to evaluate this claim in general. My own attitude is to note this assumption as a large one and move on. No good will come of ‘intuitions’ one way or the other on a topic in which we're largely in the dark. That's not to say indirect arguments can't be mustered. Suppose we had reasons for confidence in the mechanics, in the thermodynamics, and in this Boltzmannian story being the only way to reconcile the two. Then we would have an indirect argument that the dynamics must be thus‐and‐so to support claims in which we have great confidence. Yet (p.102) that's a far cry from any direct evidence for the specific dynamical claim under consideration.

The second problem of going Global is whether the Past Hypothesis is actually definable. This has always been a worry for empiricists like Reichenbach and Grünbaum (1963), who have been skeptical that an entropy of the entire universe (especially if open) makes sense. The worry has renewed life in the recent paper of Earman (2006). Let's not worry about cosmic inflation periods, the baryogenesis that allegedly led to the dominance of matter over antimatter, the spontaneous symmetry‐breaking that purportedly led to our forces, and so on. Stick with the less speculative physics found at, say, $10^{-11}$ seconds into the universe's history. Forget about dark energy and dark matter. Putting all that to one side, for confirmation of Boltzmann's insight one still needs to understand the Boltzmann entropy in generally relativistic space‐times. That is highly nontrivial. Worse, when Earman tries to define the Boltzmann framework in the limited cases wherein one has measures over solution spaces, the Boltzmann theory collapses into nonsense.

I can think of two responses available to the Globalist. First, one might advocate rolling up one's sleeves and finding a way to write the Past Hypothesis in terms of the most fundamental physics. That Earman found it impossible in the known physics of some ideal systems is a long way from a no‐go theorem implying it's impossible. Second, one could imagine placing the probability distribution on states that are approximately classical. Wait until the universe cools down, flattens out, etc., and can be described with physics that sustains the Boltzmann apparatus. The downside of this approach is that it's not exactly clear how to conceive of those earlier nonclassical states. They evolved to the later ones. If they evolved via time‐reversible physics, then the Reversibility Paradox beckons. What do we say about the probability distribution during the earlier time‐periods?

The third problem is Imperialism. To be fair, it's controversial whether this is a vice or a virtue. The issue is this: the probability measure over initial microstates will give us probabilities over an awful lot more than claims in thermodynamics. Thermodynamics tells us, abstractly speaking, that macrostates of a certain type, say A, reliably tend to coexist with or evolve into macrostates of another type, say B. Boltzmann explains this by asserting that most of the microstates compatible with A are also in B or evolve into B, respectively. But there are lots of other sciences that claim one type of macrostate is regularly preceded by another type. The current theory will provide a probability for these claims too. There are all sorts of counterfactual‐supporting generalizations, whether enshrined in what we'd call a science or not, that also make such claims. As Albert (2000) notices, the probability measure will make likely (or unlikely) the proposition that spatulas tend to be located in kitchens. On this view, all regularities turn (p.103) out to be probabilistic corollaries of physics plus this probability distribution. Science becomes unified in a way Newton and Einstein never dared to dream. This is because the chances, on this view, are right there with the quarks, gluons, and whatever else is part of the fundamental inventory of the world.

The reaction by some to this imperialism is: Great! Loewer (2009), for instance, uses these consequences to great effect in the explanation of the special sciences. By contrast, others are shocked by the ambition of the theory (Leeds 2003; Callender & Cohen 2008). One doesn't have to subscribe to Cartwright's (1999) views in philosophy of science (that causal generalizations typically hold only in tightly circumscribed, nearly ideal experimental scenarios) to feel that the reach of the theory outstrips the evidence here. Even in statistical mechanics there are those who question the reach of the Boltzmannian picture. Schrödinger (1950) long ago rejected it, for instance, because it works only for reasonably dilute systems. Though it can be extended a bit (see Goldstein & Lebowitz 2004), there is a serious question over how much of thermodynamics is recovered by Boltzmann and SP*. Outside of thermodynamics there is simply not a shred of evidence that SP* is underlying nonthermodynamic regularities. True, we place uniform probabilities over many possibilities in daily life and in the special sciences. Will the coin land heads or tails? Where will the balanced pencil land when I release the tip? Some special sciences may even construct models using Lebesgue measure. None of these, however, are SP*, at least as far as we can tell. SP* is defined with respect to a very special state space, one spanned by positions and momenta. It's logically possible that the probability distributions used elsewhere are truncations of SP*, when SP* is translated into the language of the nonthermodynamic regularities, e.g. translated into ‘heads and tails’ talk. But we lack any positive evidence to believe this is so. See Callender & Cohen 2010, Sec. 4, for more on imperialism.

For the above reasons, many readers may wish to retreat from the Global understanding of Boltzmann. To me the interesting question is whether there is any way to do this while remaining true to the original intentions of the founders of the theory. Or put differently—since one needn't be an ‘originalist’ with respect to the interpretation of science any more than one need be of political constitutions—can one withdraw to a more local understanding of statistical mechanics while salvaging the core of (e.g.) Boltzmann's beautiful explanation of entropy increase?

6 Interlude: Subjectivism, instrumentalism

Embrace a subjective interpretation of statistical‐mechanical probabilities, some people have thought, and all of our problems vanish. Or embrace instrumentalism about SP, viewing it only as a tool for predicting macroscopic futures, and again the Reversibility Paradox dissolves. Gibbs himself is sometimes (p.104) interpreted as endorsing both lines of thought. Others have subscribed to one or the other (cf. Jos Uffink's contribution in this volume, pp. 25–49).

I won't argue here that neither position is tenable. They may well be right, in the end; however, what I do think is true is that neither position can underwrite mechanical explanations of the kind we've been envisaging, nor are they necessary to resist Globalism. To the extent that we're searching for positions that can do both, neither is successful. This consequence is less obvious with subjectivism, our first topic, than instrumentalism, our second.

Subjectivism comes in many varieties, but the basic idea is that a probability distribution is a feature of the epistemic states of agents. The agent is ignorant of the true microscopic history of the system, and this lack of knowledge justifies assigning probabilities over the possible states of the system. Bayesians, Jaynesians, logicists, and others all give rules for how these probabilities are to be assigned. In the current context, the idea common to all is that because probabilities are a feature of an agent's information, there isn't a deep worry about whether this tracks the physical dynamics. The question, however, is whether statistical‐mechanical explanations survive the transition to subjectivism. There are two points to make here.

First, if we conceive the problem as the conflict between the Static and Dynamic Probabilities, then it's not at all clear how the interpretation of probabilities matters. If the two probabilities conflict, they conflict no matter how interpreted. This is a point Albert (2000, p. 86) makes. Unless the subjectivist thinks they can just assign their degrees of belief as or when they see fit, then, adopting subjectivism doesn't help. Of course it's possible to impose these probabilities in a time‐biased way to try to avoid the Reversibility Paradox. This is what Gibbs occasionally sounds like he is endorsing. But what reason could there be for doing this? The subjectivist would be saying that she is happy imposing SP‐probabilities and letting the dynamics evolve these in one temporal direction but not the other, even though the dynamics are time‐reversible. This decision just seems capricious. Worse, what rationale can there be for imposing a probability distribution over present events that one knows is inconsistent with what we know of the past? Consider, for instance, Jaynes' famous objective Bayesian position, whereby one is supposed to maximize entropy subject to all known constraints. As Sklar (1993, p. 258) puts it, on what basis does Jaynes feel he can ignore ‘known facts about the historical origin of the system in question’ when doing this? I venture that the Reversibility Paradox threatens any sensible subjectivism as much as any nonsubjective approach.

Second, do we still get the explanation we seek on a subjectivist account? Albert writes:

Can anybody seriously think that it is somehow necessary, that it is somehow a priori, that the particles that make up the material world must arrange themselves in accord with (p.105) what we know, with what we happen to have looked into? Can anybody seriously think that our merely being ignorant of the exact microconditions of thermodynamic systems plays some part in bringing it about, in making it the case, that (say) milk dissolves in coffee? How could that be? (2000, p. 64)

If we think of an explanation of some phenomenon as revealing in part the causal chain of said phenomenon, Albert is here imagining an explanation including subjective probabilities in that causal chain.

To put the issue vividly, ask: why should one flee from a gaseous poison released into a closed room? Now, of course, no one actually thinks credences play a causal role in making the gas spread through its available volume. SP‐probabilities aren't pushers or pullers in the world. The difference between objective and subjective accounts is that the former tell us why gases go to equilibrium, whereas the latter tell us why we ought to believe that gases move towards equilibrium. Both recommend fleeing, assuming some utilities such as preferring to live rather than die. The objectivist will flee because she will set her rational credences equal to the objective probabilities (as recommended by the so‐called Principal Principle). The subjectivist will flee presumably because her account states that the credences here should come directly from statistical mechanics (subjectively understood). At bottom, then, the complaint is really this: why should we believe the gas likely moves toward equilibrium unless it's objectively likely to move toward equilibrium? The thought is that the subjective probabilities should bottom out in something objectively probable.

Whatever one makes of such a claim, I think we should acknowledge that the Boltzmannian framework and attempted explanation is in fact an attempt to explain why gases go to equilibrium, not merely why we should believe certain things and not others. Inasmuch as we want an interpretation that shares the goal of providing the mechanical underpinnings of thermodynamics, subjectivism won't help.

Finally, let us briefly acknowledge that an instrumentalism about SP is of course another reaction to our troubles. On this view, SP is a tool for making predictions about the macroscopic futures of systems. Although SP makes claims about the distribution of microstates underlying macrostates, this probability distribution is to be interpreted as a useful fiction. So once again we must renounce Boltzmann‐style explanations. For the instrumentalist, the reason why fluctuations happen as they do, the reason why it makes sense to run from a poisonous gas released into a room, and so on, is not that the microstates are probabilistically distributed thus‐and‐so. (12)

(p.106) 7 Special sciences

In closing, let me describe a way of regarding SP that is at once Local, but embraces neither subjectivism, instrumentalism, nor the more radical versions of branching. The claim, originally hinted at in Callender 1997, is motivated by thinking through the idea of statistical mechanics being a so‐called special science, a nonfundamental science like biology or economics. The picture is essentially a branching view of statistical mechanics shorn of Reichenbach's attempt to get time‐asymmetry out and of Winsberg's rejection of the universal reach of the fundamental dynamical laws. (13) No doubt the picture described will not satisfy those with Globalist inclinations, nor does it do the hard detailed work the branching view demands. Yet I think there is value in trying to spell out a consistent Localist understanding. As we saw, the Globalist picture comes with its share of problems. It is therefore of interest to see whether the fan of SP and Boltzmann‐style explanations need be committed, on pain of inconsistency, to Globalism.

What we know is that some subsystems of the world (generally those characterized as thermodynamic) are such that when SP is applied to them, it generates probabilities that are predictively successful. Call these subsystems for which SP works SP‐systems. The world is populated with indefinitely many SP‐systems. Globalism is the claim that what explains this fact is that the universe as a whole is an SP‐system. The universe being an SP‐system is supposed to entail (though the details on how are left out) the formation of what we might call local SP*‐systems. Pictorially:

Fig. 1. Globalism.


Fig. 2. Liberal Globalism.

That Globalism is not forced upon us, at least not in this form, is seen by the logical possibility of what we might call Liberal Globalism. Liberal Globalism notices that many other probability distributions over initial conditions will ‘work,’ i.e. make probable the generalizations of thermodynamics, in addition to the standard one. David Albert (private communication) then suggests the following strategy. Take the set {SP_i} of all such probability distributions that work. There will be uncountably many of these. Dictate that physics is committed to those propositions on which {SP_i} plus the dynamical laws all agree. Such a picture will not have ‘imperialistic’ tendencies, or at least on its face it will not. For it will be agnostic about claims ‘finer’ than those about the thermodynamic macrostates. That is, ecology doesn't follow from knowing only the thermodynamic states of objects. This information is hidden at a finer level; but at this level, the claim is, the probability distributions will disagree and so the theory makes no claim. The advantage of this position, if it works, is that it isn't committed to any one probability distribution doing the job, nor does it have so many ‘imperialistic’ consequences, i.e. claims about the nonthermodynamic. (Whether the latter is correct is disputed by Albert.) See Fig. 2 for a pictorial representation. Liberal Globalism is still a version of Globalism, however. Local SP‐systems are still explained by the percolation of the primordial distribution. Are there any alternatives?
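The Liberal Globalist strategy of committing only to what every admissible distribution agrees on can be illustrated with a deliberately crude toy. Everything in the sketch below is an assumption made for illustration: the four microstates, the two macrostates, the two candidate distributions, and the names `CANDIDATES`, `prob`, and `verdict` are hypothetical, and the dynamical laws are left out entirely. It shows only the logical shape of the proposal, in which coarse claims receive a verdict and finer ones receive none.

```python
from fractions import Fraction as F

# Hypothetical toy model: four microstates grouped into two macrostates.
MACROSTATES = {"equilibrium": {0, 1, 2}, "non_equilibrium": {3}}

# Two made-up members of a family of distributions that "work";
# each maps a microstate to its probability.
CANDIDATES = [
    {0: F(1, 3), 1: F(1, 3), 2: F(1, 4), 3: F(1, 12)},
    {0: F(1, 2), 1: F(1, 4), 2: F(1, 6), 3: F(1, 12)},
]


def prob(dist, microstates):
    """Probability that a distribution assigns to a set of microstates."""
    return sum(dist[m] for m in microstates)


def verdict(proposition, candidates=CANDIDATES):
    """Return True or False when all candidates agree, else None (no commitment)."""
    answers = {proposition(d) for d in candidates}
    return answers.pop() if len(answers) == 1 else None


# A coarse claim about a macrostate: every candidate agrees on it.
coarse = verdict(lambda d: prob(d, MACROSTATES["equilibrium"]) > F(1, 2))

# A finer claim about individual microstates: the candidates disagree.
fine = verdict(lambda d: d[0] > d[1])

print(coarse, fine)
```

On this toy picture the theory asserts the macrostate claim and stays silent on the microstate comparison, which is the sense in which the position is meant to avoid ‘imperialistic’ consequences.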

Here is one. Call it Simple Localism. This position just takes the existence of SP‐systems as a boundary condition. It doesn't try to explain these SP‐systems using SP itself. Nor does it attempt to explain their existence using any prior statistical‐mechanical probability at all, contrary to even Liberal Globalism, nor does it offer any explanation of SP‐systems at all (for a pictorial representation, see Fig. 3). Simple Localism faces various challenges. On its face it seems to turn a blind eye on what looks like a conspiracy of sorts. Boltzmann seems to have this in mind when he advocates Globalism: (p.108)

This [temporal asymmetry due to initial conditions] is not to be understood in the sense that for each experiment one must specially assume just certain initial conditions and not the opposite ones which are likewise possible; rather it is sufficient to have a uniform basic assumption about the initial properties of the mechanical picture of the world, from which it follows with logical necessity that, when bodies are always interacting, they must always be found in the correct initial conditions. (1964, p. 442)

The thought seems to be that it would be very unlikely to have to assume of each subsystem that its entropy was initially low. Horwich (1987) also presses this point. The idea is that Globalism provides a common‐cause explanation for what would otherwise be unexplained correlations. We could also worry less about temporal directedness and more about frequencies. Isn't it miraculous that all these systems use the same SP‐probability? The Simple Localist looks to be running an explanatory deficit.

Put like this, however, it seems like every special science is running an explanatory deficit. The special sciences—economics, geology, biology, etc.—don't explain why the objects of those sciences arrange themselves so regularly into the patterns that they do. They just assume this. Let's pursue this thought a little and see if it can help the Simple Localist.

Fig. 3. Simple Localism.

Consider biology, in particular evolutionary biology. It is a theory of living organisms and how these organisms evolve with time. Like statistical mechanics, the theory has a complicated probabilistic apparatus, providing both forward transition probabilities (e.g. expected frequencies of offspring in subsequent generations, given earlier ones) and static probabilities of the type we've considered (e.g. genotype probabilities at Hardy–Weinberg equilibrium). To what bits of matter does this elaborate probabilistic theory apply? As mentioned, it is a theory of life, so the probabilistic apparatus of natural selection is ‘turned on’ when branches of the universe deemed living obtain. That raises the question: What is life? One way of answering this question is to specify some characteristics essential to life, such as mouths, legs, or, more plausibly, metabolism. But another way is to implicitly define life as that to which the probabilistic apparatus applies. That is precisely the way John Maynard Smith and Eörs Szathmáry (p.109) (1999) define life, namely, ‘entities are alive if they have the properties of multiplication, variation and heredity’ (1999, p. 3). Bits of matter that multiply, vary, and pass information along to subsequent generations, will then evolve features like mouths, legs, and metabolisms. However, life itself is defined as that to which the probabilistic apparatus of natural selection can be applied.
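For readers who want the two kinds of biological probability mentioned above made concrete, here is a minimal sketch of the textbook one‐locus, two‐allele case. The Hardy–Weinberg proportions p², 2pq, q² are standard; the function names, the allele labels, and the choice of p = 0.3 are assumptions introduced only for illustration, and the forward step assumes random mating with no selection, mutation, or drift.

```python
def hardy_weinberg(p):
    """Static genotype probabilities given the frequency p of allele A."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    q = 1.0 - p  # frequency of the other allele, a
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


def next_allele_frequency(genotypes):
    """Forward step: frequency of A among gametes forming the next generation.

    Assumes random mating and no selection, mutation, or drift.
    """
    return genotypes["AA"] + 0.5 * genotypes["Aa"]


# Illustrative value only: allele A has frequency 0.3.
static = hardy_weinberg(0.3)
p_next = next_allele_frequency(static)

print(static)
print(p_next)
```

Under these idealizing assumptions the allele frequency is unchanged from one generation to the next, which is why the genotype proportions count as an equilibrium. Nothing in the calculation says how the population of multiplying, varying, heritable entities arose in the first place.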

Of course one can push back the question: How and why do organisms develop the properties of multiplication, variation, and heredity? That is a question tackled in chemistry and elsewhere in origins‐of‐life research. The only point I want to make about this research is that it does not take the probabilities found in natural selection and then turn around and explain the formation of entities that (e.g.) multiply with these probabilities. But if successful—and we have some models of how this might go—it would, in one perfectly good sense of ‘explain,’ explain the origin of life.

As witnessed, a committed Globalist believes that all the special sciences are a manifestation of SP percolating through different levels of structure. Thus he may search deep into chemistry in an attempt to explain the origin of multiplying entities with SP. Or he may point out that the mutations that drive variation are probabilistic and assert that these are the SP*‐probabilities at work. All of that may be so. But unless one is already committed to Globalism, there isn't the slightest reason to believe this will pan out—never mind that it has to be this way.

Let's press the analogy further. If fundamental physics is time‐reversible, then we can imagine a counterpart of the Reversibility Paradox for evolutionary biology. The living creatures to which natural selection applies are composed of particles. These particles obey time‐reversible laws of physics, so we can evolve these systems backward in time. At the microlevel the particles don't care whether we call them living or not. These systems at one time formed from nonliving particles. Shouldn't the probabilities be consistent with the past histories of the particles comprising living organisms too?

Adopting the Maynard Smith–Szathmáry line on life, the answer is No. The probabilities apply to living creatures, and living creatures are those entities to which the probabilities apply. And even if we don't adopt the Maynard Smith–Szathmáry line, the answer might still be No. The probabilities need not apply to nonliving entities themselves. Life arises. This is a kind of boundary condition of the theory. When life exists, then the probabilities are operant. But evolutionary biology is, and can be, agnostic about whether these probabilities (or any objective probabilities) apply to the formation of living creatures themselves. That doesn't imply that we don't explain the formation of such creatures; it merely means that we don't necessarily use the probabilities themselves in so doing.

(p.110) The same story can be repeated for virtually any special science using probabilities. Ecology and economics, for example, both make heavy use of transition and even static probabilities. But they are not themselves committed to applying these probabilities, or any probabilities, to the formation of rabbits or markets, respectively. Somehow rabbits and economic markets appear. When this happens they become the subjects of fruitful probabilistic theories—theories so fruitful that we may wish to be realists and objectivists about the probabilities invoked. None of this is antimechanist or antiphysicalist. Rabbits and markets supervene upon physical particles and fields. However, why these rabbits and markets develop with the frequencies they do is something about which the special science is agnostic.

Think of statistical mechanics the same way. It's a new special science, one that grounds and unifies a lot of macroscopic behavior. It too is restricted to certain kinds of systems, in this case macroscopic systems whose energy and entropy are approximately extensive. (Surely a better characterization can be given, but this will do for now.) The claim is that the SP‐probabilities only kick in when we have systems meeting such a description. Once they develop, one uses SP to great effect. But one does not use it to describe and explain the frequency with which such systems develop in the first place. The science of statistical mechanics is about systems with certain features. Yet it's no part of the science to say anything about the frequencies of these systems themselves. You will look in vain for a statistical mechanics textbook telling you how many thermodynamic systems you should expect to find in San Diego. A new science, one that may or may not invoke SP, is required for that. One might think, for instance, of the science needed to explain the largest local source of low entropy, the sun. Astrophysics explains this, just as chemistry (etc.) explains the origin of life.

A better version of Localism then is Special Science Localism. Special Science Localism allows for explanations of SP‐systems, unlike Simple Localism, but is agnostic/atheistic about these explanations invoking SP itself (for a pictorial representation, see Fig. 4). It is motivated by the usual picture of the special sciences as a set of patterns that are physically contingent and not probabilistic corollaries of physics. Unlike Winsberg's version of branching, the special‐sciences view doesn't claim that macroscopic systems aren't composed of particles that always evolve according to microdynamical laws. They are. It's simply not placing a probability distribution over the possible initial conditions of the universe at large.

Fig. 4. Special Science Localism.

The Globalist can still see a conspiracy lurking. For the Globalist, conspiracies are everywhere. Rabbits execute complicated patterns described by ecology. Yet they are flesh, blood, and fur, and those things are composed of particles. How do these particles ‘know’ to evolve in ecology‐pattern‐preserving (p.111) ways rather than non‐ecology‐pattern‐preserving ways? Same for the particles in dollar bills and their preservation of economic patterns. Same for the particles in living systems and the preservation of biological patterns. Sure, the Special Science Localist can explain, in some sense of ‘explain,’ the origin of SP‐systems. But that doesn't change anything, for it still took a remarkable contrivance of particle trajectories to get an SP‐system in the first place.

In response the Special Science Localist can go on the offense or defense. On the offense, the Localist can remind the Globalist of the Subsystem Concern. Recall that Globalists face a major difficulty in showing how their theory of global entropy increase has any implications for garden‐variety local thermodynamic systems. For the theory to apply to subsystems, we need to take it on faith that the dynamics evolves the volume in phase space associated with any subsystem of interest in such a fibrillated manner that SP* approximately holds for it. As Winsberg (2004b) notes, at this point the advocate of branching can cry ‘foul.’ We criticized branching for assuming that subsystems of the universe magically scramble to have their microstates distributed according to SP just when we deem a branch to have come into being. But now the Globalist wants the dynamics to scramble the microstates in such a way that conditionalizing on earlier, more global SPs results in a mini‐SP (SP*) working for my coffee cup. How is that any better? We're really being given the framework for a common‐cause explanation without the detailed cause.

On the defense, one can try to motivate a picture of the various sciences and their interrelationships wherein this conspiracy worry is alleviated. The details of this theory are found in Callender & Cohen 2010, but the rough idea is as follows. Think of the laws of nature, causes, and so on, as the result of systemizing certain domains described by various natural kinds. Thus, ecology (p.112) is the best systemization of certain biological kinds, chemistry of chemical kinds, and so on. Each systemization may make use of its own probability measure on its own state space. But no one of them is metaphysically distinguished. They are each just different windows onto the same world. Now, when it's possible to see the same entity through more than one window, there is no guarantee that what is typical with respect to one systemization's measure is also typical with respect to the other's. In general it won't be. From the perspective of physics, ecological patterns look conspiratorial; but equally, from the perspective of ecology, the fact that rabbits all fall downwards is conspiratorial. Without the assumption that physics is doing the real pushing and pulling in the world, there is no reason to privilege the perceived conspiracy from physics. And if it's a problem for everyone, it's a problem for no one.

Finally, North (forthcoming) launches many objections against Callender 1997's proposal to think of statistical mechanics as a special science. Some of these objections have been treated here, explicitly or implicitly. One that hasn't been covered and that has some intuitive bite is that statistical mechanics, unlike biology, economics, and so on, ranges over the very same variables as mechanics. Ecology ranges over offspring, biology over alleles, but statistical mechanics ranges over the very same positions and momenta dealt with in mechanics. North is correct that this is a disanalogy. Yet I'm not sure it's enough to worry the Localist. Assuming physicalism, offspring and alleles are complicated functions of positions and momenta too. Metaphysically speaking, there isn't really a difference. Furthermore, it seems the question hangs on how we interpret the chances in statistical mechanics. Viewed as part of the fundamental theory, then yes, the theory looks fundamental. But we've been exploring a position wherein the fundamental theory is one that doesn't imply the statistical‐mechanical chances. From this perspective statistical mechanics is a theory including nonfundamental predicates, namely, the chances.

Something like this Localist position, I urge, is what instrumentalists, subjectivists, and branchers have sought when retreating from Globalism. Yet Localism does not require these positions. Just as a biologist can be a realist and objectivist about fitness, so too can a statistical mechanic be a realist and objectivist about the statistical‐mechanical probabilities. Instrumentalism and subjectivism may, after all, be correct; but they are not forced upon us in the name of resisting Globalism. Additionally, just as a biologist need not reject the mechanical basis of biological systems, neither does a statistical mechanic need reject, like Winsberg, the possible universal applicability of mechanical laws.

Whether Localism or Globalism is ultimately correct is a question I leave for the reader. I am content carving space for a Localist alternative to Globalism that preserves the mechanical aspects of statistical‐mechanical explanations.

(p.113) Acknowledgments

For comments on this paper I thank Claus Beisbart, Jonathan Cohen, Carl Hoefer, Tim Maudlin, Jos Uffink, Christian Wüthrich, audiences at Rutgers and UC Davis, and an anonymous referee. (p.114)


(1) For the history of statistical mechanics, see Brush 1976, 2003, Garber, Brush & Everitt 1995, Harman 1998, and the contribution by J. Uffink in this volume (pp. 25–49).

(2) For the most detailed discussion of the critique from Boltzmann's contemporaries and also many references, see Brown, Myrvold & Uffink 2009.

(3) Uffink reminds me that this proviso is ironic, given that the ideal gas lacks collisions but collisions drive the H‐Theorem.

(4) For a general discussion, see Goldstein 2002 and references therein. For the specific formulation here, see Spohn 1991, p. 151.

(5) For discussion of the coarse‐grained entropy, see Callender 1999 and Sklar 1993.

(6) This is a point Schrödinger (1950) emphasizes and that Davey (2008) uses to ground his claim that no justification for any probability posit like SP can ever be had. As we'll see, one response to Davey is to push back SP to the beginning of the universe, thereby eliminating the past histories that stymie the justification of SP. That move will bring additional worries, however.

(7) Incidentally, note the remainder of the quote in the context of the dispute about whether the low‐entropy past itself demands explanation: ‘… and we can say that the reason for it is as little known as that for why the universe is and it is not otherwise.’ See Callender 2004 for a defense of this Boltzmannian line.

(8) Torretti 2007, n. 30, p. 748.

(9) Reichenbach thought he could also get the time‐asymmetry of thermodynamics from the branching. Sklar (1993) points out many of the problems with this argument. Contemporary branchers, like Winsberg, want to disassociate branching from this argument. In language we'll use later, they acknowledge that branching will need lots of ‘mini’–Past Hypotheses.

(10) Maybe this sounds crazier than it is. Many systems are prepared in low‐entropy states; if the preparation procedure somehow makes likely the useful application of a uniform distribution, then we have a reason for the microstates scrambling into the right distribution. But then the question is pushed back to the distribution of the preparers …

(11) How early should the Past Hypothesis be imposed? Strictly speaking, since time is a continuum, there is no first moment of time. But the earlier it's placed, the better. The Reversibility Paradox threatens, of course. If imposed just when the universe begins its third second of existence, for instance, then for all the reasons mentioned, the theory predicts two seconds of anti‐thermodynamic behavior. For this reason, it should be imposed on that moment of time by which we're confident the world is thermodynamic.

(12) North (forthcoming) interprets Leeds 2003 as an instrumentalist approach. But I'm not entirely confident that he isn't better seen as some version of the ‘special sciences’ position detailed below.

(13) Drory (2008) also defends a branch‐type view with these features, but his is motivated by an interesting re‐consideration of the Reversibility Paradox. As I understand his position, it is entirely complementary to the one motivated here.