Mathematical and Physical Appendix - Oxford Scholarship
The Comprehensibility of the Universe: A New Conception of Science

Nicholas Maxwell

Print publication date: 2003

Print ISBN-13: 9780199261550

Published to Oxford Scholarship Online: October 2011

DOI: 10.1093/acprof:oso/9780199261550.001.0001


PRINTED FROM OXFORD SCHOLARSHIP ONLINE ( (c) Copyright Oxford University Press, 2015. All Rights Reserved. Under the terms of the licence agreement, an individual user may print out a PDF of a single chapter of a monograph in OSO for personal use (for details see Subscriber: null; date: 27 September 2016

(p.255) Mathematical and Physical Appendix

The Comprehensibility of the Universe
Oxford University Press

1. Differential Equations

For those not familiar with the notion of a differential equation, here is a very brief explanation. Differential equations are equations that involve derivatives, such as dy/dx, $d^2y/dx^2$, and so on. Derivatives express the rate at which one quantity, y, changes with respect to another quantity, x. Thus, if s is distance travelled by a body in time t, then ds/dt expresses rate of change of distance, s, with respect to time, t, that is velocity, v; and $d^2s/dt^2$ expresses rate of change of rate of change of distance with time, that is acceleration, a. This is also rate of change of velocity with time, dv/dt. We thus have:

$$v = \frac{ds}{dt}, \qquad a = \frac{dv}{dt} = \frac{d^2s}{dt^2}.$$

Given that we plot on a graph distance s against time t to form a curve, then ds/dt, the instantaneous velocity at time t, is, from a geometric standpoint, the slope of the tangent to the curve at time t. If the curve is specified by the function $s = At^n$, where A is a constant and n an integer, then $ds/dt = nAt^{n-1}$.

Examples of differential equations are:

  (a) ds/dt = 0

  (b) $d^2s/dt^2 = a$ (where a is some constant)

  (c) y · dy/dx + x = 0

  (d) $x \cdot d^2y/dx^2 - dy/dx = 0$.

A differential equation, typically, specifies an infinite family of curves or, equivalently, of functions. This is best understood by working backwards, from one such function to the corresponding differential equation. Consider (a) above. Given the above interpretation of s and t, this equation asserts that velocity is zero. This represents infinitely many horizontal straight lines, specified by infinitely many functions of the form $s = s_0$, where $s_0$ is the distance from the origin at t = 0. As the velocity of the body = 0, it remains where it is for all time. Given $s = s_0$, if we differentiate with respect to t, we obtain the differential equation (a). From infinitely many functions, $s = s_0$, corresponding to the infinitely many values of $s_0$, we obtain one differential equation, ds/dt = 0.

A somewhat more interesting example is obtained if we consider equation (b). This asserts that acceleration is a constant. Given all objects accelerating in the same direction at a constant rate a, these objects may have started from a variety of initial places, $s_0$, with a variety of initial velocities, $v_0$, at time t = 0. Thus (b) applies to a doubly infinite class of curves, corresponding to the doubly infinite set of values that $s_0$ and $v_0$ can have. These curves are represented by the (p.256) equations $s = (a/2)t^2 + v_0t + s_0$. Differentiating any of these equations twice we obtain first $ds/dt = at + v_0$, and then $d^2s/dt^2 = a$ (that is, equation (b) above). The result of differentiating twice is to lose the two constants, $s_0$ and $v_0$. By adding particular values, particular initial conditions, of $s_0$ and $v_0$, to the differential equation $d^2s/dt^2 = a$, we pick out the corresponding curve, or path pursued by the object, from the infinitely many possible curves or paths.
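The relation between equation (b) and its family of solutions can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the sample values of a, v0, and s0 are chosen here purely for illustration.

```python
def closed_form(a, v0, s0, t):
    """Position from the solution s = (a/2) t^2 + v0 t + s0."""
    return 0.5 * a * t * t + v0 * t + s0


def integrate(a, v0, s0, t_end, steps=10_000):
    """Step d2s/dt2 = a forward in time from the initial conditions (s0, v0)."""
    dt = t_end / steps
    s, v = s0, v0
    for _ in range(steps):
        # For constant acceleration this update is exact up to rounding.
        s += v * dt + 0.5 * a * dt * dt
        v += a * dt
    return s


def second_difference(f, t, h=1e-3):
    """Finite-difference estimate of the second derivative of f at t."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)
```

Whatever values of v0 and s0 are supplied, `second_difference` applied to `closed_form` returns (approximately) the same constant a, which is the sense in which differentiating twice loses the two constants; supplying particular values of v0 and s0 to `integrate` picks out one curve from the family.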

Similar points apply to (c) and (d), and other differential equations.
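As a worked illustration of how similar points apply to (c) and (d), the following sketch integrates both equations. The symbols C, C_1, C_2, and p are names introduced here for the constants of integration and an auxiliary variable; they do not appear in the original text.

```latex
% (c): y (dy/dx) + x = 0
\frac{d}{dx}\left(\tfrac{1}{2}y^2 + \tfrac{1}{2}x^2\right) = y\frac{dy}{dx} + x = 0
\quad\Longrightarrow\quad x^2 + y^2 = C \quad (C \ge 0).

% (d): x (d^2y/dx^2) - dy/dx = 0; put p = dy/dx
x\frac{dp}{dx} = p
\quad\Longrightarrow\quad p = 2C_1 x
\quad\Longrightarrow\quad y = C_1 x^2 + C_2 .
```

On this reading, the first-order equation (c) specifies a singly infinite family of curves (circles centred on the origin), while the second-order equation (d) specifies a doubly infinite family (the parabolas y = C_1 x^2 + C_2, including the horizontal lines with C_1 = 0).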

The essential point, from the standpoint of unity in physics, is that infinitely many diverse functions, each specifying the distinctive way in which some particular physical system evolves in space and time, can be specified by just one differential equation. This latter specifies what all the different functions have in common: there is a fixed, common, unchanging relationship between the way in which the rates of various variable quantities change with respect to one another. This, or rather what exists physically which determines this, is what U is. In the case of NT, U is that which exists physically, everywhere at all times, which determines that particles will move in accordance with F = ma, $F = Gm_1m_2/d^2$, and $F_{\text{total}} = \sum F_i$. These differential equations determine, for any system of particles, with initial positions, velocities, and masses specified, the precise paths that these particles will pursue, the precise functions that determine these paths.
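To make explicit how these three laws combine into differential equations, here is a sketch for a system of N point particles. The labels i and j, the position vectors r_i, and the count N are notation assumed here for illustration; they are not taken from the original text.

```latex
m_i \frac{d^2\mathbf{r}_i}{dt^2}
  = \mathbf{F}_{\text{total},\,i}
  = \sum_{j \ne i} \frac{G\, m_i m_j}{|\mathbf{r}_j - \mathbf{r}_i|^{2}}\,
    \frac{\mathbf{r}_j - \mathbf{r}_i}{|\mathbf{r}_j - \mathbf{r}_i|},
\qquad i = 1, \dots, N .
```

Each of these equations is second order in time, so, as with equation (b) of Section 1, specifying the initial positions and velocities picks out one set of paths from the infinite family of solutions.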

2. Maxwell’s Equations of the Electromagnetic Field

Maxwell’s equations (actually only given this form by Heaviside and Lorentz) are:

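(1) $\nabla \cdot \mathbf{E} = 4\pi\varrho$
(2) $\nabla \cdot \mathbf{B} = 0$
(3) $\nabla \times \mathbf{E} = -\dfrac{1}{c}\dfrac{\partial \mathbf{B}}{\partial t}$
(4) $\nabla \times \mathbf{B} = \dfrac{1}{c}\dfrac{\partial \mathbf{E}}{\partial t} + \dfrac{4\pi}{c}\mathbf{j}$
(5) [equation image not recovered]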

Here are a few words of explanation as to what these equations mean (in addition to those given in Chapter 4, Section 11) for those not familiar with vector analysis and classical electrodynamics. E and B are the electric and magnetic fields of force respectively, each a vector field which assigns a vector to each space-time point, and which varies in a continuous way through space and time. (Vectors are written in bold type thus: E and B.) A vector field may be thought of as assigning a tiny arrow to each space-time point which varies in length and direction with changes in space and time. The length and direction of the arrow, at any spatial point and instant, represents the strength and direction of the field at that space-time point. A vector field might represent flowing water, in which case the arrow at each space-time point would specify the velocity of the water at that point and instant.

(p.257) ∇ · E means $\partial E_x/\partial x + \partial E_y/\partial y + \partial E_z/\partial z$, where $E_x$, $E_y$, and $E_z$ are the x, y, and z components of the vector E. E, and its x, y, and z components, change, in general, with respect to changing positions at any given time, and with respect to changing time. $\partial E_x/\partial x$, a partial derivative, tells us how $E_x$ changes with respect to a change in the direction of the x axis, all other changes being kept constant. ∇ · E thus gives us the sum of the rate of change of E in space with respect to the three spatial directions, x, y, and z. ∇ · E is a measure of the extent to which there is net ‘flow’ of electric field out of, or into, a region of space (all the arrows pointing away from, or towards, some common point). Postulate (1), ∇ · E = 4πϱ, thus tells us that the net ‘flow’ of electric field out of a region is proportional to the charge density within that region. Postulate (2), ∇ · B = 0, tells us that there are no sources (or sinks) for the magnetic field, that is, no isolated magnetic poles. The magnetic field is always the result of the changing electric field and/or the motion of electric charge, in accordance with postulate (4). If B represented the flow of water, then ∇ · B = 0 would express the fact that nowhere does water flow into, or flow away from, the given quantity of water (by means of some fixed pipe or drain, for example).

∇ × E, itself a vector, means:

$$\nabla \times \mathbf{E} = \left(\frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z}\right)\mathbf{i} + \left(\frac{\partial E_x}{\partial z} - \frac{\partial E_z}{\partial x}\right)\mathbf{j} + \left(\frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y}\right)\mathbf{k}$$

where i, j, and k are unit vectors that point in the positive directions of the x, y, and z coordinates. ∇ × E is a measure of the extent to which the ‘flow’ of E (the spatial array of arrows that represent E) has a circular motion or pattern, so that if E represented the flow of water, ∇ × E ≠ 0 would express the fact that there are whirlpools. Postulate (3) specifies the manner in which the circular flow of E is related to the change of B with time; and postulate (4) specifies the way in which the circular flow of B is related to the change of E with time and/or the electric current j.
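The ‘outflow’ and ‘whirlpool’ readings of ∇ · and ∇ × can be tried out numerically. The sketch below is an illustration only: the two sample fields (`radial`, which points away from the origin, and `whirlpool`, which circulates about the z axis) and all function names are assumptions introduced here, and the derivatives are estimated by central differences rather than computed exactly.

```python
def partial(f, point, axis, h=1e-5):
    """Central-difference estimate of the partial derivative of f along one axis."""
    plus = list(point)
    minus = list(point)
    plus[axis] += h
    minus[axis] -= h
    return (f(*plus) - f(*minus)) / (2.0 * h)


def divergence(field, point):
    """dF_x/dx + dF_y/dy + dF_z/dz at the given point."""
    return sum(
        partial(lambda x, y, z, i=i: field(x, y, z)[i], point, i) for i in range(3)
    )


def curl(field, point):
    """The three components of the curl of the field at the given point."""
    def d(component, axis):
        return partial(lambda x, y, z: field(x, y, z)[component], point, axis)

    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))


def radial(x, y, z):
    """Arrows pointing away from the origin: a pure 'source' pattern."""
    return (x, y, z)


def whirlpool(x, y, z):
    """Arrows circulating about the z axis: a pure 'whirlpool' pattern."""
    return (-y, x, 0.0)
```

For these sample fields, `radial` has non-zero divergence and zero curl, while `whirlpool` has zero divergence (like B in postulate (2)) and non-zero curl.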

3. The Role of Symmetry and Group Theory in Physics

Symmetry, we have seen, is an important ingredient of unity. But what is symmetry? What is its role, in general, in theoretical physics? Symmetry in physics is a vast and complex topic; here I make only a few remarks just sufficient for the purposes of this book.1

To say of some object that it exhibits such-and-such a symmetry is to say that if the object is changed in some way, it remains the same. Thus a homogeneous sphere (on whose surface there are no distinguishing marks) can be rotated about its centre through any angle, and it remains the same; a cube, by contrast, can only be rotated through certain multiples of 90° about its centre if it is to remain unchanged. These are examples of continuous and discrete symmetries respectively. Such spatial symmetries, both continuous and discrete, are important in theoretical physics.

(p.258) The notion of symmetry is not, however, restricted to the spatial; whenever any kind of object, however non-spatial or abstract, can be thought of as being changed, as a result of one or other of a number of possible operations, $O_1, O_2, \ldots$, being performed on the object, it becomes possible to speak of the object exhibiting some kind of symmetry (on analogy with the cases of the sphere and cube).

From a mathematical point of view, the object exhibits a symmetry if and only if the set of operations that leave things unchanged, $O_1, O_2, \ldots, O_N$, forms a group, i.e. satisfies the axioms of group theory. These state: among $O_1, \ldots, O_N$ there is the identity operation, I, which does nothing; any two operations, $O_p$ and $O_q$, can be combined to form a third, $O_r$, so that $O_qO_p = O_r$; repeated operations are associative, that is such that $O_r(O_qO_p) = (O_rO_q)O_p$; and finally, every operation, $O_r$, has its inverse, $O_r^{-1}$, which is such that $O_r^{-1}O_r = I$ (the identity operation). It may, or may not, be the case that $O_rO_s = O_sO_r$. If this does hold, the symmetry (and group) is said to be Abelian, and if it does not hold, non-Abelian. The symmetry of a circle in two dimensions is Abelian, whereas the symmetry of a sphere in three dimensions is non-Abelian. Rotate the circle through angle α ($O_\alpha$) and then through angle β ($O_\beta$), that is, in sum, perform the operation $O_\beta O_\alpha$; the outcome is the same if these operations are done in reverse order, which means: $O_\beta O_\alpha = O_\alpha O_\beta$. This equation does not hold, however, for the sphere; rotating the sphere about one axis X, and then another axis Y (both through the centre of the sphere) in general gives a different outcome if these operations are performed in reverse order. The symmetry of the sphere is non-Abelian. Groups, like the symmetries they represent, may be either continuous or discrete. And just as some symmetries can be thought of as being made up of component symmetries, so some groups possess (proper) subgroups. (For T* to be a proper subgroup of G, it is necessary and sufficient that both T* and G satisfy the group axioms, and that G includes T* in the sense that all operations in T* are also in G but not vice versa, and T* consists of more than the identity element.)
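The contrast between the Abelian symmetry of the circle and the non-Abelian symmetry of the sphere can be illustrated with rotation matrices. This is a minimal sketch under the assumption (not made in the text at this point) that rotations are represented by 2 × 2 and 3 × 3 matrices; the function names are chosen here for illustration.

```python
import math


def matmul(a, b):
    """Combine two operations: apply b first, then a."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def close(a, b, tol=1e-12):
    """True if two matrices agree element by element within tol."""
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def rot2(theta):
    """Rotation of the plane (the circle) through angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def rot_x(theta):
    """Rotation of space (the sphere) about the X axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_y(theta):
    """Rotation of space (the sphere) about the Y axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


alpha, beta = 0.7, 1.1
circle_commutes = close(matmul(rot2(beta), rot2(alpha)), matmul(rot2(alpha), rot2(beta)))
sphere_commutes = close(matmul(rot_y(beta), rot_x(alpha)), matmul(rot_x(alpha), rot_y(beta)))
```

With these sample angles, `circle_commutes` is true and `sphere_commutes` is false; the same helpers can be used to check the identity, inverse, and associativity axioms for particular rotations.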

Symmetry arises in physics in a number of different ways.

  1. Symmetry arises because certain composite physical systems (atoms, molecules, crystals) exhibit spatial symmetries.

  2. Symmetry arises because postulated fundamental physical entities exhibit spatial symmetries (for example, the Newtonian point-particle which exhibits spherical symmetry).

  3. There are symmetries associated with space and time.

  4. There are symmetries of laws and theories (which include symmetries of type 2 and 3).

It is the symmetries of theories that is our concern here. To say that a theory, T, exhibits a symmetry is to say that a characteristic change may be made to any isolated system evolving in accordance with T, and the way the system evolves will be unaffected by the change. Consider, for example, symmetry with respect (p.259) to rotation. Take any isolated system, evolving in accordance with T, and rotate the entire system about any axis: if the rotated system evolves precisely as before (and this is true of all isolated systems to which T applies), then T exhibits rotational symmetry. In referring to the symmetries of T we are really referring to the symmetries of all possible evolutions predicted by T.

Typical symmetries of this type, associated with flat space-time, are invariance with respect to change of (1) initial spatial orientation, (2) initial spatial position, (3) initial time of occurrence, (4) inertial motion.2 In each case, we take any isolated system, S, and change the initial state with respect merely to one or other of (1) to (4). The theory, T, has the associated symmetry if the evolution of any isolated system, S, to which T applies, is unaffected by the change.

What does it mean to say that the evolution of a system S is unaffected by a change in the initial conditions of S? In performing an operation $O_t$ on the state of S at time t (where $O_t$ is a change in orientation, location, or whatever) we are, in effect, creating a new system, S*. What does it mean to say that the evolution of S and of S* are the same, when these are two distinct systems in the space of all possible systems to which T applies?

One way of explicating this is to say that it means that the same change performed in reverse at any later time will return one to the original system. In other words, given a system S, and a second possible system S*, got from S by performing the operation $O_t$ on the state of S at time $t_1$, we require that the reverse operation, $O_t^{-1}$, performed on S* at any later time $t_2$, will return S* to S. If we are considering rotational symmetry, we may rotate S through any angle at time $t_1$ to create S*, and then at any later time $t_2$ rotate S* through the same angle in reverse to re-create S.3

There is an obvious sense in which all systems that can be obtained from one system, S, in this way, as a result of the initial state of S being changed by operations {O} associated with a symmetry, can be regarded as different versions of the same system, the same evolution. All systems obtained from S merely by rotations are in effect different versions of the same system. This is a general feature of a symmetry. It has the effect of dividing the space of possible evolutions predicted by the theory, T, into infinitely many equivalence classes,4 all the evolutions in any one equivalence class being obtainable from any one evolution in that equivalence class by the operations {O} associated with the symmetry. In the case of rotational symmetry, one such equivalence class consists of all systems obtained from one system by all possible rotations.

In addition to the symmetries associated with space-time that we have considered so far, there are also so-called ‘internal’ symmetries. These latter arise when the initial state is changed in some way other than a change with respect to space and time, and the change leaves the evolution of all systems predicted by the theory unaffected (where this is understood as before). The distinction between space-time and ‘internal’ symmetries corresponds, roughly, to symmetries that arise as a result of the nature of space-time, and those that (p.260) arise as a result of the nature of ‘matter’—the nature of everything physical that exists in addition to space-time (particles, forces, fields).

Examples of internal symmetries are the global and local gauge invariance of MT, QED, QEWD, and QCD, discussed in Chapter 4.

So far we have been considering continuous symmetries, but physical theories also exhibit symmetries that are discrete. Examples are symmetry with respect to time-reversal, charge conjugation, and parity. A theory, T, is time-symmetric if, given any evolving isolated system S to which T applies, the evolution with the direction of time reversed (so that the future is the past, and the past is the future) would also accord with the predictions of T. T exhibits parity symmetry if, given any S to which T applies, a system equivalent to the mirror image of S would also accord with the predictions of T. Finally, T exhibits charge conjugation symmetry if, given any system S (to which T applies) that is made up of fundamental particles, the system that is obtained from S by replacing each particle with its antiparticle also evolves in accordance with T.

At one time it was more or less taken for granted that fundamental physical theories would exhibit all three of these discrete symmetries. Parity symmetry seemed especially obvious and immune to doubt, since it is equivalent to demanding that fundamental physical laws make no distinction between left-handedness and right-handedness. And yet in 1956 it was found to be false. In that year, Yang and Lee proposed that the weak interaction might violate parity symmetry. The conjecture was tested as follows. The nuclei of a radioactive isotope of cobalt were placed in a magnetic field in such a way that the spinning nuclei were aligned by the field. The nuclei decay by means of the weak interaction, and emit electrons. If parity obtains, electrons will be emitted in equal numbers in the direction of the field, and in the opposite direction. In fact electrons were found to be emitted preferentially in the opposite direction to the field. This violates parity symmetry.

It was quickly realized that the experiment does not refute a new discrete symmetry that can be formed by combining charge conjugation, C, and parity, P. According to this new symmetry, CP, a theory, T, is CP-symmetric if, given a system S to which T applies, T also applies to S*, obtained from S by considering the mirror image of S, and replacing all particles by antiparticles. It was subsequently found, in 1964, that particles called neutral kaons, which decay by means of the weak interaction, do so in a manner which violates CP-symmetry.

We can, however, put parity, P, charge conjugation, C, and time-reversal, T, together to form a new discrete symmetry, CPT. A fundamental theorem (the Lüders-Pauli theorem) demands that quantum field theories must comply with CPT-symmetry. Thus, given that the weak force violates CP-symmetry, it can only observe CPT-symmetry if it violates T-symmetry. This is an astonishing result: it means that just one of the four basic forces fails to be symmetric with respect to time-reversal.5

(p.261) All this, incidentally, illustrates a basic feature of AOE. Symmetry principles can be regarded as non-empirical methodological principles governing choice of theory in physics; but they can also be regarded as physical principles, either true or false, associated with level 3 blueprint ideas. However obviously true such principles may appear to be, they may nevertheless be false; but in discovering such principles to be false, new principles need to be discovered if physics is to continue. Level 4 physicalism is retained, even if more specific level 3 versions of physicalism are rejected.

It is, I hope, quite clear from the above account of symmetry that the symmetry of a theory, T, has everything to do with the content of T, and nothing to do with the form of T, in the first instance at least. This deserves to be emphasized, as symmetry is often characterized in terms of changes which leave the form of a theory unchanged.

It is, however, always possible to choose terminology which is such that the symmetries of the physical content of T are reflected in the symmetries of the form of T: all that needs to be done is to formulate T using terminology that satisfies the same symmetry principles as the physical reality which T postulates. Thus, theories that presuppose flat space-time and are invariant with respect to changes of spatial location and orientation become terminologically invariant if formulated in terms of vectors. The result of building the physical symmetries of T into the terminology of T is to create two versions of the symmetry principles, usually called the active and passive. The active version is the one we have been considering above: the evolution of a physical system remains unchanged by a change in initial conditions (such as a change in spatial location). The passive version considers, not a change in the physical system, but a corresponding change in the description of the system: the location of the coordinate system, in terms of which the system is described, is changed. Physical symmetries are fully reflected in terminology if, corresponding to every physical change of initial conditions that leaves the evolution unchanged, there is a change in the description (e.g. a change of coordinate system) which leaves the description of the evolution of the physical system unchanged in the same way. A symmetry given its passive interpretation indicates that certain terminological conventions have been adopted: it is the active, not the passive, version of a symmetry that has real physical content.

I have a final point to make about symmetry concerning the connection between symmetry and conservation principles. According to a famous theorem of Emmy Noether, for every theory that can be given a Hamiltonian or Lagrangian formulation,6 every continuous symmetry gives rise to a conservation principle. Invariance with respect to spatial location and invariance with respect to orientation give rise to conservation of linear and angular momentum respectively; invariance with respect to time of occurrence gives rise to conservation of energy. Gauge invariance of classical electromagnetism (MT) gives rise to conservation of charge.
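The link between symmetry and conservation can be seen in the simplest case of a single coordinate. The following is a sketch of that special case only, not of Noether's theorem in its generality; the Lagrangian L(q, \dot q, t), the coordinate q, the momentum p, and the energy function H are standard notation assumed here and not used in the original text.

```latex
% Equation of motion (Euler-Lagrange):
\frac{d}{dt}\frac{\partial L}{\partial \dot q} = \frac{\partial L}{\partial q}.

% Invariance under change of location, q -> q + \epsilon, means \partial L/\partial q = 0, so
\frac{dp}{dt} = 0, \qquad p \equiv \frac{\partial L}{\partial \dot q}
\quad \text{(conservation of momentum)}.

% Invariance under change of time of occurrence means \partial L/\partial t = 0.
% With H \equiv \dot q\, p - L, the equation of motion gives
\frac{dH}{dt} = \dot q\left(\frac{dp}{dt} - \frac{\partial L}{\partial q}\right)
              - \frac{\partial L}{\partial t}
            = -\frac{\partial L}{\partial t} = 0
\quad \text{(conservation of energy)}.
```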

(p.262) 4. Is Symmetry Necessary for Unity, or is it Just One Possible Ingredient?

The first step in tackling this question is to clear up a possible terminological confusion. Throughout the book I argue that, in order to be a precise version of physicalism, a (potential) theory of everything, T, must postulate the existence of a U which is invariant throughout all phenomena, and which determines the way in which phenomena occur. It is important to appreciate, however, that to say that T is invariant in this sense of postulating an invariant U is not at all the same as to say that T satisfies certain invariance or symmetry principles, in the way we have just been discussing. A theory, T, might be invariant with respect to position, orientation, time, uniform (inertial) motion, and yet fail miserably to postulate an invariant U: an example would be the aberrant version of Newtonian theory, discussed in Chapter 2, according to which gold spheres, in certain circumstances, attract each other in accordance with an inverse cube law of gravitation.

Nevertheless, that a theory, T, is invariant throughout all phenomena to which it applies is a kind of symmetry of the theory. The group, G, of this symmetry is the group of all one-one mappings of possible evolutions predicted by T into possible evolutions predicted by T. One might think that this means that the idea of T being ‘invariant’ throughout the possible phenomena to which it applies can be explicated in terms of the notions of symmetry and group theory; but this is not the case. Suppose that T* is an aberrant version of T; the form of T* matches that of T throughout the space of all possible evolutions except for a small region R in this space, where T* is quite different from T. In this case, T* exhibits a symmetry whose group is $G_{T^*}$. (If points in R according to T* can be put into one-one correspondence with points in R according to T, then (but only then) will $G_T$ and $G_{T^*}$ be formally the same, even though they are given different physical interpretations.) Appealing to symmetry, and group theory, in this way, does not differentiate between the invariant theory T and the non-invariant, aberrant theory T*. In order to make the distinction we would need to appeal to the symmetry: ‘the theory remains invariant throughout the space of possible evolutions’. If $\{G_1\}$ are all the physically interpreted groups corresponding to this symmetry, for all possible invariant theories {T}, then $G_T$ is a member of $\{G_1\}$ whereas $G_{T^*}$ is not. But here, in distinguishing between T and T* we are appealing to the very thing we are trying to explicate, namely ‘invariance throughout all phenomena to which the theory applies’.

That a physical theory, T, is invariant (i.e. has an invariant content) throughout all phenomena to which it applies is a kind of symmetry of T; but it is not a symmetry of T in the sense in which this notion is used in physics, as explicated above. The two notions of symmetry are, however, related: corresponding to the symmetries of T (as physicists use the term) there is a symmetry group, S, and this is a subgroup of G, where this is defined as above.7

What restrictions do we need to place on the one-one mappings that are elements of G to arrive at physical symmetries of the type discussed above? (p.263) There appear to be three constraints that must be imposed on the one-one mappings of G to turn this symmetry into physical symmetries in the conventional sense.

  1. The one-one mappings must be such that the space of all possible evolutions is divided up into infinitely many equivalence classes, such that any evolution, predicted by T, is only mapped onto evolutions in the same equivalence class.

  2. The one-one mappings must constitute some common, invariant, operation, O, such as changing the location or orientation of the initial state of the system.

  3. This operation must commute with the ‘time-evolution operator’, $t(t_1 \to t_2)$, which takes any state, at time $t_1$, of an isolated system (which evolves in accordance with T) to the state at time $t_2$. That is, given any state, S, of any isolated system evolving in accordance with T, $O \cdot t(t_1 \to t_2)(S) = t(t_1 \to t_2) \cdot O(S)$.

One-one mappings of G which satisfy these three conditions form groups of symmetries of T.
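The third condition, that the operation commutes with time evolution, can be illustrated with a toy system. The sketch below is an assumption-laden example, not drawn from the text: it uses a unit-mass particle in a plane bound by springs with angular frequencies `wx` and `wy` along the two axes, takes the operation O to be a rotation of the initial state about the origin, and uses the exact evolution of that system.

```python
import math


def evolve(state, t, wx, wy):
    """Time evolution over an interval t for a state (x, y, vx, vy)."""
    x, y, vx, vy = state
    return (
        x * math.cos(wx * t) + vx / wx * math.sin(wx * t),
        y * math.cos(wy * t) + vy / wy * math.sin(wy * t),
        -x * wx * math.sin(wx * t) + vx * math.cos(wx * t),
        -y * wy * math.sin(wy * t) + vy * math.cos(wy * t),
    )


def rotate(state, theta):
    """The operation O: rotate position and velocity through angle theta."""
    x, y, vx, vy = state
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y, c * vx - s * vy, s * vx + c * vy)


def commutes(state, theta, t, wx, wy, tol=1e-9):
    """Check that rotating then evolving equals evolving then rotating."""
    a = rotate(evolve(state, t, wx, wy), theta)
    b = evolve(rotate(state, theta), t, wx, wy)
    return all(abs(p - q) <= tol for p, q in zip(a, b))


sample = (1.0, 0.5, -0.2, 0.8)
isotropic = commutes(sample, 0.9, 2.3, 1.5, 1.5)    # same spring in every direction
anisotropic = commutes(sample, 0.9, 2.3, 1.5, 2.5)  # different springs along x and y
```

For this sample state, rotation commutes with the evolution when the two frequencies are equal and fails to commute when they differ, so only the first toy law has rotational symmetry in the sense of condition 3.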

What relevance do the symmetries of T (in this sense) have for the unity of T? Is it possible that T might satisfy minimum requirements for unity, and yet possess no symmetries?8

One symmetry does seem to be absolutely necessary for unity, namely symmetry with respect to the passage of time. One could of course imagine that physical laws change with the passage of time, but any such change would need to be uniform, or invariantly related to some factor (such as the size of the universe perhaps), if any semblance of unity of the true theory of everything, T, is to be preserved. Both possibilities would ensure that T has a symmetry with respect to time.

The unity of T does not seem to require that there is symmetry with respect to uniform velocity, spatial orientation, or even spatial position. One could imagine a quasi-Aristotelian universe, W, in which events occur in accordance with a unified theory of everything, T, even though absolute position, orientation, and velocity all exist. We can imagine that this universe has a special position, 0, all objects experiencing a force $F_0$, directed towards 0, and such that $F_0 = -kR_0$, where $R_0$ is the vector from 0 to the object in question, and k is a constant. As a further Aristotelian law we have: F = mv, where v is velocity. Motions are governed by a differential equation, and yet symmetry with respect to location, orientation, and uniform velocity are all violated.
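The differential equation governing motion in this quasi-Aristotelian universe can be written out and solved. This is a sketch under the assumptions that the velocity of an object is the rate of change of its position vector from 0 and that the central force is the only force acting; the displacement d is a symbol introduced here for illustration.

```latex
% Combine the two laws, with velocity v = dR_0/dt:
m\frac{d\mathbf{R}_0}{dt} = -k\,\mathbf{R}_0
\quad\Longrightarrow\quad
\mathbf{R}_0(t) = \mathbf{R}_0(0)\, e^{-kt/m}.

% Shifting the initial position by a fixed displacement d does not simply shift the path:
\bigl(\mathbf{R}_0(0) + \mathbf{d}\bigr)\, e^{-kt/m}
\;\neq\; \mathbf{R}_0(0)\, e^{-kt/m} + \mathbf{d}
\qquad (t > 0,\ \mathbf{d} \neq 0).
```

The equation is first order, so a single initial condition (the starting position) fixes the path, and the failure of the shifted solution to match the shifted path is the violation of symmetry with respect to location.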

It is also true, however, that a spatial symmetry of a kind does exist in this Aristotelian universe, W: the force on objects has spherical symmetry about the spatial point 0. We could imagine another Aristotelian universe, W*, in which the force varies in some fixed but arbitrary fashion with distance and direction. Such a non-symmetrical universe would lack unity.

I conclude that symmetries of some kind are a necessary ingredient of unity, although they need not be of the type associated with theoretical physics as we know it.

(p.264) Symmetry is especially relevant when it comes to unification by synthesis, as we saw in Chapter 4 in connection with the unifying powers of MT, and of QED, QEWD, and QCD (the locally gauge invariant character of these latter theories being essential to their capacity to unify by synthesis). As we have seen, one can even account for the imperfect unity by synthesis of QEWD in terms of the group structure of its locally gauge invariant symmetry. This has the form of a direct product of two distinct groups, one associated with the particles $W^+$, $W^-$, and $W^0$, the other associated with the particle $V^0$. Unification by synthesis is, I suggest, inherently a matter of discovering symmetries associated with the elements to be unified.

It deserves to be noted, in passing, that symmetry cannot be invoked to overcome Goodmanesque difficulties, of the kind discussed in Section 15 of Chapter 4, not at least if symmetry is invoked in a purely formal sense. Any object (or dynamic structure), however apparently non-symmetrical, could always be construed to exhibit symmetry, if the group of operations associated with the symmetry are sufficiently ‘alien’ and Goodmanesque in character. Consider an object, O, that is not remotely spherical; suppose, now, by ‘rigid rotation’ we mean ‘rotation that so deforms the object being rotated that the net effect is to leave O unaffected by a rotation in this sense’. With respect to this Goodmanesque interpretation of ‘rigid rotation’, O has spherical symmetry. In a similar fashion, the quasi-Aristotelian universe, W*, just considered, which lacks spherical symmetry, could be construed by Goodmanesque aliens to possess spherical symmetry (of a peculiar, alien type). An appeal to symmetry does not of itself overcome Goodmanesque difficulties; these need to be dealt with in the manner indicated in Section 15 of Chapter 4.

5. Groups and Matrices

The continuous groups important in physics are (or have the same group-structure as) groups of matrices.

A matrix is a rectangular (or square) array of numbers, real or complex, such that it can be added to and multiplied by other matrices in specified ways. Thus, if A, B, and C are 3 × 3 matrices, with A × B = C, the matrix C is defined as follows. Let

$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$$

and let B and C, similarly, have elements $b_{ij}$ and $c_{ij}$, with i and j running from 1 to 3. Then A × B = C is defined to be such that $c_{ij} = \sum_k a_{ik}b_{kj}$. The elements of the i-th row of A are multiplied with corresponding elements of the j-th column of B, and the products are then added. Thus $c_{23} = a_{21} \times b_{13} + a_{22} \times b_{23} + a_{23} \times b_{33}$. (In general A × B ≠ B × A.) With this definition of the matrix product, all N × N matrices form a group, called U(N), as long as the matrices are ‘unitary’. A matrix, A, is unitary if its inverse, $A^{-1}$, is equal to its transpose conjugate. The transpose of a matrix, A, is formed by exchanging off-diagonal elements with each other symmetrically (the diagonal (p.265) running from the top left-hand corner to the bottom right-hand corner). The conjugate of A is the matrix obtained from A by exchanging each element with its complex conjugate, so that x + iy becomes x − iy. Restricting the elements of U(N) to unitary elements ensures that each element has an inverse (which otherwise is not in general the case). The unit member of U(N) is the N × N matrix consisting of ones on the diagonal, and zeros everywhere else. It is not hard to show that, given these definitions, all N × N unitary matrices satisfy the axioms of group theory.

TABLE A1 Continuous groups and matrices important in physics

Group name    Matrices in group
U(N)          N × N unitary matrices
SU(N)         N × N unitary matrices with determinant = 1
O(N)          N × N real orthogonal matrices
SO(N)         N × N real orthogonal matrices with determinant = 1

If, in addition, it is required that the matrices have determinant equal to 1, the resulting group is called SU(N) (‘S’ for ‘special’). If, on the other hand, the matrices of U(N) are restricted to those that have only real numbers as elements, the resulting group is called O(N); in this case each matrix is such that its inverse is equal to its transpose, such matrices being called ‘orthogonal’. If, in addition, the demand is made that the determinant of these matrices equals one, the group is called SO(N). SO(3), in particular, is the symmetry group of the sphere.

All this may be summed up in Table A1.
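As a rough illustration of the distinction between O(N) and SO(N) for N = 3, the sketch below tests whether a real 3 × 3 matrix is orthogonal (inverse equal to transpose) and whether its determinant is 1. The function names and the sample matrices (a rotation about the z-axis and a mirror reflection) are assumptions made for the example.

```python
import math

def transpose(m):
    """Exchange rows and columns."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def det3(m):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def is_orthogonal(m, tol=1e-12):
    """M is orthogonal when M times its transpose is the unit matrix."""
    p = matmul(m, transpose(m))
    n = len(m)
    return all(abs(p[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))

def classify(m, tol=1e-12):
    """Say whether a real 3 x 3 matrix lies in SO(3), only in O(3), or in neither."""
    if not is_orthogonal(m, tol):
        return "not in O(3)"
    return "SO(3)" if abs(det3(m) - 1.0) <= tol else "O(3) but not SO(3)"

def rotation_z(theta):
    """Rotation through angle theta about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

R = rotation_z(0.7)
mirror = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]  # reflection in the x-y plane
```

The rotation is classified as a member of SO(3), as is the product of two rotations; the reflection is orthogonal but has determinant −1, so it belongs to O(3) only.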

6. Introduction to Quantum Theory

The central enigma of the quantum domain is vividly apparent in the famous two-slit experiment. Quantum systems—photons, electrons, or even atoms—having a precise momentum are directed at a two-slitted screen. Behind the screen, the systems that go through the slits are detected by a photographic plate (or its equivalent). The intensity of the beam of quantum systems may be so low that on average there is only one system (photon, electron, or atom) in the apparatus at any one time.

In appropriate conditions, interference bands are detected by the photographic plate, a result that can be readily explained if it is assumed that each quantum system is an extended wave-like entity with wavelength λ = h/p, where h is Planck’s constant and p is the momentum of the quantum systems in the direction of flight. The wave-like system passes through both slits. At certain regions on the photographic plate crests from one slit arrive simultaneously with troughs from the other slit; the waves cancel each other out. At other regions, the waves arrive in phase from the two slits, and thus reinforce each other. The outcome is the bands detected by the photographic plate. Essentially the same effect arises when ocean waves enter a harbour with two entrances: at certain places on the beach, the waves interfere destructively, and the water is still; at other places the waves reinforce each other. It is all but impossible to see how this quantum-experimental result (and countless others) can be explained in any other way except by supposing that the quantum systems are extended wave-like entities.
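To give a feel for the numbers, here is a small sketch that computes the wavelength h/p for an electron and locates the first bright and dark bands. The electron speed, the slit separation, and the idealized far-field formula for two very narrow slits (relative intensity cos²(π d sin θ / wavelength)) are assumptions introduced for the illustration and are not taken from the text.

```python
import math

H_PLANCK = 6.62607015e-34       # Planck's constant h, in joule seconds
M_ELECTRON = 9.1093837015e-31   # electron rest mass, in kilograms

def de_broglie_wavelength(momentum):
    """Wavelength h / p of a quantum system with momentum p."""
    return H_PLANCK / momentum

def two_slit_intensity(theta, wavelength, slit_separation):
    """Relative intensity (between 0 and 1) at angle theta for two ideal narrow slits."""
    phase = math.pi * slit_separation * math.sin(theta) / wavelength
    return math.cos(phase) ** 2

speed = 1.0e6                   # assumed electron speed in m/s (non-relativistic)
momentum = M_ELECTRON * speed
wavelength = de_broglie_wavelength(momentum)
d = 1.0e-6                      # assumed slit separation in metres

# Waves from the two slits arrive in phase when the path difference
# d * sin(theta) is a whole wavelength, and cancel when it is half a wavelength.
theta_bright = math.asin(wavelength / d)
theta_dark = math.asin(wavelength / (2 * d))
```

Under these assumptions the wavelength comes out a little above 7 × 10⁻¹⁰ m; the intensity function returns a value close to 1 at `theta_bright` and close to 0 at `theta_dark`.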

But the very same experiment (and countless others) also establishes—so it seems—that individual quantum systems cannot possibly be extended wavelike systems. This is because each quantum system is detected as a minute dot on the photographic plate. The wave-like photon or electron that passes through the two slits, and is spread over the whole of the photographic plate, interacts with just one silver-bromide molecule in some minute region on the photographic plate so that the molecule is dissociated and a silver atom is deposited on the plate. (This then becomes a dot of millions of silver atoms when the plate is developed.) Each individual photon or electron interacts with the photographic plate as if it is a highly localized particle, with a definite trajectory through space; it is only when millions of such interactions are taken into account that an interference pattern begins to emerge. The wave-like aspect of the photon or electron is only detected experimentally via a great number of particle-like detections.
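The way an interference pattern emerges only from a great many dot-like detections can be mimicked with a simple simulation. The sketch below assumes an idealized pattern on the plate, proportional to cos²(πx / fringe spacing), and draws each detection position at random from it by rejection sampling; the plate width, fringe spacing, number of dots, and random seed are all arbitrary choices for the example.

```python
import math
import random

def relative_intensity(x, fringe_spacing):
    """Idealized two-slit pattern: brightest at whole multiples of fringe_spacing."""
    return math.cos(math.pi * x / fringe_spacing) ** 2

def detect_one(rng, half_width, fringe_spacing):
    """Return the position of one dot, sampled by rejection from the pattern."""
    while True:
        x = rng.uniform(-half_width, half_width)
        if rng.random() < relative_intensity(x, fringe_spacing):
            return x

def histogram(positions, half_width, n_bins):
    """Count the dots falling in each of n_bins equal strips of the plate."""
    counts = [0] * n_bins
    width = 2 * half_width / n_bins
    for x in positions:
        index = min(int((x + half_width) / width), n_bins - 1)
        counts[index] += 1
    return counts

rng = random.Random(0)                    # fixed seed so the run is repeatable
half_width, fringe_spacing = 2.0, 1.0
dots = [detect_one(rng, half_width, fringe_spacing) for _ in range(20000)]
counts = histogram(dots, half_width, n_bins=40)

bright = counts[19] + counts[20]          # strips on either side of x = 0
dark = counts[24] + counts[25]            # strips on either side of x = 0.5
```

Any single entry of `dots` is just one position and reveals nothing wave-like; it is only the histogram of many dots that shows far more detections in the bright strips than in the dark ones.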

Modern QT was first developed by Heisenberg in 1925. From the outset, Heisenberg sought to develop the theory in such a way that it was restricted to predicting the outcome of performing measurements on quantum systems, so that the problem of the paradoxical character of quantum systems could be avoided. Later, in 1926, Schrödinger developed wave mechanics with the idea that the theory would describe the real wave-like character of quantum systems. But Schrödinger’s version of QT, so interpreted, could not do justice to the particle-like aspect of quantum systems. In 1926 Born proposed that Schrödinger’s wave function should be interpreted as determining the probability of detecting the quantum system in a small region of space if a position measurement is performed on the system. Schrödinger proved that his version of QT, and Heisenberg’s, are experimentally equivalent. Heisenberg, Born, Bohr, Dirac, and others (but not Einstein or Schrödinger) adopted the view that the new QT had to be interpreted in such a way that it is restricted to making probabilistic predictions about the outcome of performing measurements on quantum systems, it being impossible, and unnecessary, to specify the nature of a quantum system when not being measured.

In short, because the creators of QT did not know how to develop a consistent model of quantum systems capable of doing justice to particle-like and wave-like aspects of quantum systems, they were forced to develop the theory in such a way that it is restricted to predicting the outcome of performing measurements on quantum systems (with unfortunate consequences for the theory: see Section 2 of Chapter 7).

Here, in the very briefest outline, is the basic structure of the theory that emerged: ‘orthodox’ quantum theory (OQT). I indicate only the Schrödinger version of the theory.

Corresponding to the wave-like aspect of a quantum system such as an electron, there is a ‘wave’ function, Ψ(x,y,z,t), which assigns a complex number to each point in space (x,y,z) at a given time t. (A complex number c is a number of the form a + ib, where a and b are real numbers, and i = √(−1).)

According to OQT, Ψ(x,y,z,t), or Ψ for short, changes in two quite distinct ways, depending on whether a measurement does not or does occur.

First, if no measurement is performed, Ψ(x,y,z,t) changes with the passage of time in a fully deterministic fashion in accordance with Schrödinger’s time-dependent equation:

(1) $i\hbar\,\frac{\partial \Psi}{\partial t} = -\frac{\hbar^{2}}{2m}\,\nabla^{2}\Psi + V(x,y,z)\,\Psi$

Here, as before, i = √(−1), and ħ = h/2π, where h is Planck’s constant; m is the mass of the system, and V(x,y,z) is the potential at each point (x,y,z), which determines the force experienced by the system at (x,y,z). In the one-dimensional case, the force, F(x), at the point x, due to the potential V(x) at that point, is given by F(x) = −dV(x)/dx. The more rapidly V(x) changes with x (i.e. the greater the slope of the graph of V(x)), the greater the force on the system, the force always pointing in the direction in which V(x) decreases. ∂/∂t means differentiate once with respect to time. ∇² means ∂²/∂x² + ∂²/∂y² + ∂²/∂z²; thus ∇²Ψ means that Ψ is to be differentiated twice with respect to position.
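The relation between potential and force in the one-dimensional case can be illustrated numerically. The sketch below assumes a simple bowl-shaped potential V(x) = ½kx² (with an arbitrary constant k) and estimates F(x) = −dV(x)/dx by a central difference; neither the potential nor the function names come from the text.

```python
def potential(x, k=3.0):
    """Illustrative potential V(x) = (1/2) k x^2; k is an assumed constant."""
    return 0.5 * k * x * x

def force(v, x, h=1e-5):
    """Central-difference estimate of F(x) = -dV/dx for a potential function v."""
    return -(v(x + h) - v(x - h)) / (2 * h)

f_right = force(potential, 2.0)    # V rises to the right here, so the force points left
f_left = force(potential, -2.0)    # V rises to the left here, so the force points right
f_bottom = force(potential, 0.0)   # bottom of the bowl: the slope, and the force, vanish
```

In each case the sign of the force is such that it points towards lower values of V(x), and its size grows with the steepness of the graph of V(x).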

The equation tells us that, apart from constant factors, the rate of change of Ψ(x,y,z,t) with time is equal to minus the rate of change of rate of change of Ψ(x,y,z,t) with respect to space plus the potential at the given spatial point, (x,y,z), multiplied by Ψ(x,y,z,t). That changes in Ψ(x,y,z,t) with respect to space and time are interrelated in this way suffices to determine Ψ(x,y,z,t) for any t given Ψ(x,y,z,t₀) for some initial t = t₀.
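A short worked check may help here. Taking the standard one-dimensional form of the time-dependent equation with no potential (V = 0), and assuming a plane wave with wave number k and angular frequency ω (symbols introduced only for this illustration), substitution shows how the time and space derivatives are tied together:

```latex
% One-dimensional free particle (V = 0), standard form of the equation:
%   i\hbar\,\partial\Psi/\partial t = -(\hbar^2/2m)\,\partial^2\Psi/\partial x^2
\begin{align*}
\Psi(x,t) &= A\,e^{i(kx - \omega t)} \\
i\hbar\,\frac{\partial \Psi}{\partial t}
  &= i\hbar\,(-i\omega)\,\Psi = \hbar\omega\,\Psi \\
-\frac{\hbar^{2}}{2m}\,\frac{\partial^{2} \Psi}{\partial x^{2}}
  &= -\frac{\hbar^{2}}{2m}\,(ik)^{2}\,\Psi = \frac{\hbar^{2}k^{2}}{2m}\,\Psi \\
\hbar\omega &= \frac{\hbar^{2}k^{2}}{2m}
\end{align*}
```

With the usual identifications p = ħk (equivalent to wavelength = h/p) and E = ħω, which are assumed here rather than derived, the last line is the familiar relation E = p²/2m for a free particle.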

Secondly, if a measurement is performed, an apparently probabilistic change in general occurs, in that the measurement detects more or less precisely some value of a so-called ‘observable’, such as position, momentum, energy, or angular momentum. The result is determined probabilistically by the quantum state of the measured system at the moment of measurement. Thus, in the case of position, the probability of detecting the system (an electron, say) in volume element dV is given by |Ψ|² dV.
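The position rule can be tried out in one dimension, where the probability of detection between a and b is the integral of |Ψ|² over that interval. The sketch below assumes a normalized Gaussian wave packet (width `sigma` and wave number `k` are arbitrary illustrative values) and estimates the integral with the midpoint rule.

```python
import cmath
import math

def psi(x, sigma=1.0, k=5.0):
    """Assumed one-dimensional Gaussian wave packet, normalized to unit total probability."""
    norm = (2 * math.pi * sigma ** 2) ** -0.25
    return norm * math.exp(-x ** 2 / (4 * sigma ** 2)) * cmath.exp(1j * k * x)

def detection_probability(a, b, steps=20000):
    """Midpoint-rule estimate of the integral of |psi|^2 from a to b."""
    dx = (b - a) / steps
    return sum(abs(psi(a + (n + 0.5) * dx)) ** 2 for n in range(steps)) * dx

total = detection_probability(-10.0, 10.0)           # effectively the whole line
within_one_sigma = detection_probability(-1.0, 1.0)  # the interval from -sigma to +sigma
```

The complex phase factor in `psi` has no effect on |Ψ|², so the detection probabilities depend only on the Gaussian envelope: the total is 1 to good accuracy and roughly two-thirds of detections fall within one `sigma` of the centre.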

This postulate, first put forward by Max Born in 1926, can be generalized to include the measurement of other so-called ‘observables’: momentum, energy, spin. In general, mathematical operations performed on the Ψ-function first determine a range of possible values, aᵢ, that may be obtained if a measurement of such-and-such an observable, A, is performed, and secondly determine the probabilities, pᵢ, of obtaining these values. Corresponding to any observable, A, there is a specific mathematical operator, Â, which acts on state functions, Ψ, to produce new state functions. Any such operator (corresponding to an observable) is such that there is a set of state functions {Φᵢ} such that:

  1. (i) If Φᵢ is a member of {Φᵢ}, then ÂΦᵢ = aᵢΦᵢ, where aᵢ is a real number.

  2. (ii) Any state function Ψ can be represented uniquely in the form: Ψ = Σ cᵢΦᵢ, where the cᵢ are complex numbers, and Σ|cᵢ|² = 1.

In this case, the generalized Born postulate asserts:

(2) Given a system is in a state Ψ, if observable A is measured, the value aᵢ will be obtained with probability |cᵢ|², where aᵢ and cᵢ are determined as in (i) and (ii).
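Conditions (i) and (ii) and the generalized postulate can be illustrated with the smallest possible case, a system with just two basis states, so that state functions become two-component vectors and operators become 2 × 2 matrices. The particular operator, its eigenvalues and eigenstates, and the state `psi` below are assumptions chosen for the example.

```python
import math

def inner(u, v):
    """Inner product of u with v, conjugating the components of u."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def apply_operator(op, v):
    """Act with the matrix op on the vector v."""
    return [sum(op[i][k] * v[k] for k in range(len(v))) for i in range(len(op))]

s = 1 / math.sqrt(2)
A = [[0, 1], [1, 0]]               # illustrative operator for an observable
eigenvalues = [1, -1]              # the possible measured values
eigenstates = [[s, s], [s, -s]]    # the corresponding state vectors
psi = [0.6, 0.8]                   # a normalized state: 0.36 + 0.64 = 1

# The coefficient of each eigenstate in the expansion of psi is its
# inner product with psi; its squared modulus is the probability.
coefficients = [inner(phi, psi) for phi in eigenstates]
probabilities = [abs(c) ** 2 for c in coefficients]
mean_value = sum(p * a for p, a in zip(probabilities, eigenvalues))
```

For this choice the value +1 is obtained with probability close to 0.98 and the value −1 with probability close to 0.02, the two probabilities summing to 1.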

A geometrical interpretation of all this is available. Ψ can be regarded as a vector in an abstract space called Hilbert space; the functions {Φᵢ} can be regarded as unit vectors pointing along the coordinate axes of a coordinate system in Hilbert space. cᵢΦᵢ is the projection of Ψ onto the i-th coordinate axis. The Schrödinger equation has the effect of rotating the vector corresponding to Ψ in Hilbert space, thus changing the values of the complex numbers {cᵢ}, and so changing the probabilities of obtaining one or other of the possible values of the observable A if a measurement of A is made. If Ψ = Φᵢ, then the probability of obtaining the value aᵢ of A is 1.
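The geometrical picture can be made concrete in a toy two-dimensional case. In the sketch below an ordinary rotation through a fixed angle stands in for the effect of the Schrödinger equation over some interval of time; this is an assumption made purely for illustration (real evolution is a complex, unitary rotation fixed by the potential). The two components of the state are its coefficients along the two basis vectors.

```python
import math

def rotate(state, angle):
    """Rotate a two-component state vector through the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return [c * state[0] - s * state[1], s * state[0] + c * state[1]]

def probabilities(state):
    """Squared moduli of the components along the two basis vectors."""
    return [abs(component) ** 2 for component in state]

psi_start = [1.0, 0.0]                        # the state coincides with the first basis vector
psi_later = rotate(psi_start, math.pi / 6)    # rotated through 30 degrees

p_start = probabilities(psi_start)
p_later = probabilities(psi_later)
```

Before the rotation the first value is certain; afterwards the probabilities are close to 0.75 and 0.25. The length of the vector is unchanged, so the probabilities still sum to 1.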

In the case of a system consisting of two ‘particles’, 1 and 2, with coordinates (x1,y1,z1), and (x2,y2,z2) respectively, the state function is a function of six-dimensional ‘configuration’ space, Ψ(x1,y1,z1,x2,y2,z2) at any given time t. |Ψ(x1,y1,z1,x2,y2,z2)|² dv1 dv2 represents the probability of detecting particle 1 in volume element dv1 and, simultaneously, particle 2 in volume element dv2, granted that the appropriate position measurement is performed.

In the case of a two-particle system, the potential function V(x1,y1,z1,x2,y2,z2) can be interpreted as representing the force between the two particles when 1 is at (x1,y1,z1) and 2 is at (x2,y2,z2).

One striking feature of the theory is its highly non-local character. Once two particles, 1 and 2, have interacted, they remain ‘quantum-entangled’ even if widely separated spatially, in such a way that the particles do not have separate states. A measurement performed on particle 1 instantaneously affects the quantum state of 2, even though 1 and 2 are vast distances apart. The manner in which the measurement on 1 instantaneously affects the quantum state of 2 is such, however, that it is not possible to transmit a signal by its means. No local measurement performed on 2 can determine whether or not a measurement has been performed on 1. The change in the quantum state of 2 as a result of the measurement performed on 1 cannot be detected experimentally by means of measurements performed locally on 2.
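The claim that no local measurement on 2 can reveal what has been done to 1 can be checked in a toy model. The sketch below assumes two systems with two states each, prepared in an entangled state with equal amplitudes for ‘both in state 0’ and ‘both in state 1’, and lets the measurement on particle 1 be made along a basis rotated by an adjustable angle `alpha`. The state, the measurement bases, and the function names are illustrative assumptions, not taken from the text.

```python
import math

s = 1 / math.sqrt(2)
# psi[a][b] is the amplitude for particle 1 in state a and particle 2 in state b.
psi = [[s, 0.0], [0.0, s]]

def basis(alpha):
    """Two orthogonal real unit vectors: the measurement basis chosen for particle 1."""
    return [[math.cos(alpha), math.sin(alpha)],
            [-math.sin(alpha), math.cos(alpha)]]

def joint_probabilities(alpha):
    """p[i][j]: particle 1 gives outcome i in the rotated basis, particle 2 is found in state j."""
    e = basis(alpha)
    return [[abs(sum(e[i][a] * psi[a][j] for a in range(2))) ** 2
             for j in range(2)] for i in range(2)]

def marginal_for_particle_2(alpha):
    """Statistics seen at particle 2 alone, whatever the outcome at particle 1."""
    p = joint_probabilities(alpha)
    return [p[0][j] + p[1][j] for j in range(2)]

def conditional_for_particle_2(alpha, outcome_1):
    """Statistics at particle 2 given a particular outcome at particle 1."""
    row = joint_probabilities(alpha)[outcome_1]
    return [value / sum(row) for value in row]

# Statistics at particle 2 if nothing at all is measured on particle 1.
unmeasured = [sum(abs(psi[a][j]) ** 2 for a in range(2)) for j in range(2)]
```

Whatever the value of `alpha`, `marginal_for_particle_2(alpha)` stays at one half for each outcome, the same as `unmeasured`, so the local statistics at 2 carry no signal. The conditional probabilities, by contrast, do depend on the setting and outcome at 1, which is where the entanglement shows up once results from both sides are compared.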


(1.) For a marvellous, entirely non-technical account of the role of symmetry and group theory in theoretical physics, see Zee (1986). For more technical expositions of the elements of group theory relevant to physics, in increasing difficulty, see Rosen (1983); Joshi (1982); Isham (1989); Jones (1990). See also Moriyasu (1983).

(2.) Inertial motion is uniform motion in a straight line with respect to any one of infinitely many privileged reference frames—so-called ‘inertial’ reference frames—all moving with uniform velocity with respect to each other, and in terms of which the (relevant) laws of physics (such as Newton’s law of gravitation) are valid. This last clause rules out a set of reference frames all accelerating off at the same constant rate in some direction, or a set of rotating reference frames.

(3.) Physicists usually characterize a symmetry of a theory, T, as corresponding to a type of physical transformation, performed on isolated systems to which T applies, that leaves the Hamiltonian unchanged. (The Hamiltonian is an expression for the total energy of the system.) It is not hard to show, however, that the requirement that the Hamiltonian is invariant under such transformations is equivalent to the demand that the transformations commute with the time evolution operator, that is, leave the evolutions of the systems unaffected. This latter way of characterizing a symmetry seems, however, closer to the intuitive idea of what a symmetry of a theory is.

(4.) Equivalence classes are defined as follows. Given a set of objects (such as all possible systems to which a theory, T, applies), then a binary relation, R, between the objects (such as ‘…can be rotated into…’) divides the set into mutually exclusive subsets called equivalence classes if and only if the following holds. Given any objects (systems), s1, s2, and s3, in any one subset, then: (1) s1 R s1; (2) s1 R s2 implies s2 R s1; and (3) s1 R s2 and s2 R s3 imply s1 R s3.

(5.) For a more detailed account of the downfall of the discrete symmetries, P, CP, and T, see Pais (1986: Ch. 20).

(6.) For a discussion and proof of Noether’s theorem, see Goldstein (1980: 588–96).

(7.) This point is made by Houtappel et al. (1965). See also Redhead (1975).

(8.) A quick answer is that if T is devoid of symmetry, then it will fail to satisfy many, if not all, of the facets of unity discussed in Sects. 5 and 6 of Ch. 3. These eight facets of unity do not, however, define ‘unity’: it is certainly the case that some of these possible facets of unity will turn out to be irrelevant to the particular way in which the universe is unified. The question is: Could they all be irrelevant? Could there be a meaningful notion of unity devoid of symmetry?