Analytical Mechanics for Relativity and Quantum Mechanics

Oliver Johns

Print publication date: 2011

Print ISBN-13: 9780191001628

Published to Oxford Scholarship Online: December 2013

DOI: 10.1093/acprof:oso/9780191001628.001.0001


(p.607) Appendix E

Geometry of Phase Space

In Section 20.9, we asserted without proof that a binary number α can always be found that will make the matrix (∂Y/∂p) defined in eqn (20.64) nonsingular. Proof of that fact requires an excursion into the geometry of phase space.

In the present reference chapter we define general abstract linear vector spaces and state some of their important properties. We then apply the general theory to define a vector space in phase space, and prove some theorems of importance for the theory of canonical transformations.

E.1 Abstract Vector Space

A linear vector space is a set of objects called vectors that obey the following axioms, abstracted from the properties of the three-dimensional Cartesian displacement vectors defined in Appendix A. In the following, we will denote vectors in an abstract vector space by the same bold, serif typeface as has been used for the special case of Cartesian threevectors.

E.1.1 Axioms

  1. Closure under addition and scalar multiplication: If v and w are any vectors, then addition is defined by some rule such that v + w is also a vector. If α is any scalar (assumed here to be real numbers) and v is any vector, then scalar multiplication is defined by some rule such that α v and v α are also vectors.

  2. Commutativity: v + w = w + v and α v = v α.

  3. Associativity of addition: (u + v) + w = u + (v + w).

  4. Associativity of scalar multiplication: (αβ)v = α(β v).

  5. Existence of additive identity: There is a vector ∅ such that every vector v satisfies the equation v + ∅ = v. The vector ∅ is also referred to as the null vector.

  6. Existence of additive inverse: For every vector v, there is a vector denoted −v that is a solution of the equation v + (−v) = ∅.

  7. Linearity in scalars: (α + β)v = α v + β v.

  8. Linearity in vectors: α(v + w) = α v + α w.

  9. Multiplication by one: 1v = v.

E.1.2 Derived properties

The axioms167 can be used to prove the following additional properties of linear vector spaces:

  1. The additive identity is unique: If v + u = v for all vectors v, then u = ∅.

  2. The additive inverse is unique: If v + u = ∅, then u = −v.

  3. Law of cancellation: v + u = v + w implies u = w.

  4. Unique solution: If u satisfies the equation v + u = w, then u = w + (−v).

  5. Multiplication of a vector by scalar zero: 0v = ∅.

  6. Multiplication of additive identity by a scalar: α∅ = ∅.

  7. Multiplication by a negative number: (−1)v = −v.

  8. Inverse of the inverse: −(−v) = v.

  9. Simplified notations: Since the above axioms and additional properties imply that α v + β(−w) = α v + (−β)w, we need not be so careful about placement of the minus sign. Both of these equal expressions are commonly denoted simply as α v − β w. Thus the solution in derived property 4 is written u = w − v. Also, properties 5 and 6 allow one to drop the distinction between the scalar 0 and the additive identity ∅. The equation in axiom 5 becomes v + 0 = v. Leading zeros are usually dropped; for example, 0 − v is written −v.

E.1.3 Linear independence and dimension

The set of vectors v_1, v_2,…, v_r is linearly dependent (LD) if and only if there exist scalars α_1, α_2,…, α_r, not all zero, such that

α_1 v_1 + α_2 v_2 + … + α_r v_r = 0   (E.1)

In the opposite case, the set is linearly independent (LI) if and only if eqn (E.1) implies that α_1 = α_2 = … = α_r = 0.

The vector space has dimension N if and only if it contains an LI set of N vectors, but every set in it with more than N vectors is LD. The dimension of a linear vector space is often indicated by labeling the space as V_N.

Let e_1, e_2,…, e_N be any LI set of N vectors in a vector space V_N of dimension N. Such an LI set is called a basis for the vector space, and is said to span it. Any vector x in V_N can be expanded as

x = Σ_{i=1}^{N} x_i e_i = x_1 e_1 + x_2 e_2 + … + x_N e_N   (E.2)

The numbers x_i are called the components of vector x relative to this basis. For a given basis, the components x_i of a vector x are uniquely determined.

A vector space may have many different bases. But all of these bases will have the same number of LI vectors in them. And the vector space can be specified uniquely by listing any one of its bases.

Vector equations are equivalent to equations between components (in a given basis). The equation u = α v + β w is true if and only if u_i = α v_i + β w_i is true for all i = 1,…, N. This latter relation is often written as an equation of column matrices. If one defines

[v] = (v_1, v_2,…, v_N)ᵀ   (E.3)

with a similar definition for other vectors, the equality of components may be written as [u] = α[v] + β[w].

Any LI set of vectors x_1, x_2,…, x_m with m ≤ N can be extended to a basis in V_N. If e_1, e_2,…, e_N is some known basis for V_N, then vectors e_{k_{m+1}}, e_{k_{m+2}},…, e_{k_N} can be selected from it so that x_1,…, x_m, e_{k_{m+1}}, e_{k_{m+2}},…, e_{k_N} is an LI set of N vectors and so forms a basis in V_N.

E.2 Subspaces

A subspace of V N is a linear vector space all of whose member vectors are also members of V N. Denoting a subspace as ℝ, it follows from the closure axiom for vector spaces that if x and y are members of ℝ, then (α x + β y) must also be a member of ℝ. Thus subspaces are different from subsets in set theory. If one constructs a subspace by listing a set of vectors, then the subspace must also contain all linear sums of those vectors and not just the vectors in the original list.

Subspaces may be specified by listing an LI set of vectors that spans the space. For example, if V_N has a basis e_1,…, e_N then the set of all vectors x = α e_1 + β e_4 + γ e_5, where α, β, γ may take any values, forms a subspace.

Subspaces may also be specified by stating some criterion for the inclusion of vectors in the subspace. For example, the set of threevectors with components (0, y, z) forms a subspace of V_3, but the set with components (1, y, z) does not, since the sum of two such vectors does not have a 1 as its first component.

The null vector is a member of every subspace. If ℝ contains only the null vector, then we write ℝ = 0.

E.2.1 Linear Sum and Intersection

If ℝ and 𝕊 are subspaces of V N, then we define their linear sum (ℝ + 𝕊 ) as the set of all vectors x = r + s where r ∈ ℝ (which is read as “r is a member of ℝ”) and s ∈ 𝕊. The linear sum (ℝ + 𝕊 ) of two subspaces is itself a subspace of V N.

If ℝ and 𝕊 are subspaces of V_N, then we define their intersection (ℝ ∩ 𝕊) as the set of all vectors x that are members of both subspaces, x ∈ ℝ and x ∈ 𝕊. The intersection (ℝ ∩ 𝕊) of two subspaces is itself a subspace of V_N.168

Subspaces are vector spaces. Just as was done for V_N, the dimension N_R of ℝ is defined as the number of vectors in the largest LI set it contains, and this set then forms a basis for ℝ. The notation dim(ℝ) = N_R will also be used, meaning that the dimension of ℝ is N_R. This notation can be used to write the dimension rule,

dim(ℝ + 𝕊) = dim(ℝ) + dim(𝕊) − dim(ℝ ∩ 𝕊)   (E.4)
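The dimension rule is easy to check numerically. The following pure-Python sketch (our own illustration, not part of the text; the helper `rank` and the two sample subspaces are ours) computes subspace dimensions as matrix ranks over the rationals, taking ℝ = span{e_1, e_2} and 𝕊 = span{e_2, e_3} in V_3.

```python
from fractions import Fraction

def rank(rows):
    """Row-reduce a list of spanning vectors over the rationals; count pivots."""
    m = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

# Spanning lists for the sample subspaces R = span{e1, e2}, S = span{e2, e3}
R = [[1, 0, 0], [0, 1, 0]]
S = [[0, 1, 0], [0, 0, 1]]

dim_R, dim_S = rank(R), rank(S)
dim_sum = rank(R + S)                 # linear sum = span of both lists together
dim_int = dim_R + dim_S - dim_sum     # dimension rule, solved for dim(R ∩ S)
print(dim_R, dim_S, dim_sum, dim_int)  # 2 2 3 1
```

Here dim(ℝ ∩ 𝕊) = 1 corresponds to the common line span{e_2}, as expected.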

If (ℝ ∩ 𝕊) = 0 then we say that ℝ and 𝕊 are disjoint. Two subspaces are disjoint when their only common vector is the null vector.

In the special case in which (ℝ + 𝕊) = V_N and (ℝ ∩ 𝕊) = 0, we say that ℝ and 𝕊 are complementary subspaces or complements relative to V_N. This complementarity relation will be denoted V_N = ℝ ⊕ 𝕊. Complementary subspaces are disjoint and their linear sum is the whole of the space V_N. Since the null subspace (consisting only of the null vector) has zero dimension by definition, the dimension rule gives N = N_R + N_S for complements.

If ℝ and 𝕊 are complements, and if ℝ has a basis r_1,…, r_{N_R} and 𝕊 has a basis s_1,…, s_{N_S}, then the set r_1,…, r_{N_R}, s_1,…, s_{N_S} is an LI set and forms a basis for V_N. It follows that every vector x ∈ V_N can then be written as x = r + s where r and s are unique vectors in ℝ and 𝕊, respectively.

Conversely, suppose that V_N has some basis e_1, e_2,…, e_N and we segment that basis into two disjoint LI sets of vectors, say e_1,…, e_n and e_{n+1},…, e_N. Then if we define ℝ to be the subspace spanned by e_1,…, e_n and define 𝕊 to be the subspace spanned by e_{n+1},…, e_N, it follows that ℝ and 𝕊 are complements relative to V_N.

Every subspace ℝ has some subspace 𝕊 such that ℝ and 𝕊 are complements. But this 𝕊 is not unique. So it is incorrect to speak of “the” complement of a given subspace.

The notation ℝ ⊃ 𝕊 means that every x ∈ 𝕊 is also a member of ℝ. We say that ℝ contains 𝕊. Hence (ℝ + 𝕊) ⊃ 𝕊, (ℝ + 𝕊) ⊃ ℝ, ℝ ⊃ (ℝ ∩ 𝕊), 𝕊 ⊃ (ℝ ∩ 𝕊).

E.3 Linear Operators

A linear operator 𝒜 in vector space V_N operates on vectors x ∈ V_N and produces other vectors y ∈ V_N. We write this operation as y = 𝒜x. The assumed linearity property is

𝒜(α x + β y) = α 𝒜x + β 𝒜y   (E.5)

It follows that, for a given basis e_1,…, e_N, there is a unique N-rowed square matrix A associated with each operator 𝒜. Since any x ∈ V_N may be expanded in the basis as x = x_1 e_1 + … + x_N e_N, it follows from linearity that

y = 𝒜x = 𝒜(x_1 e_1 + … + x_N e_N) = x_1 𝒜e_1 + … + x_N 𝒜e_N   (E.6)

But 𝒜e_k is also a vector in V_N and hence may also be expanded in the basis, as

𝒜e_k = Σ_{i=1}^{N} e_i A_ik   (E.7)

where the matrix elements A_ik are uniquely determined. Then

Σ_{i=1}^{N} y_i e_i = y = Σ_{k=1}^{N} x_k Σ_{i=1}^{N} e_i A_ik = Σ_{i=1}^{N} (Σ_{k=1}^{N} A_ik x_k) e_i   (E.8)

together with the uniqueness of the expansion of y in the basis implies that

y_i = Σ_{k=1}^{N} A_ik x_k   or, in matrix form,   [y] = A[x]   (E.9)

E.3.1 Nonsingular Operators

Since canonical transformations are nonsingular, we will be interested here in nonsingular operators. By definition, an operator 𝒜 is nonsingular if it satisfies the condition that 𝒜x = 0 only when x = 0.

An operator 𝒜 is nonsingular if and only if it possesses an inverse. For nonsingular operators, the equation y = 𝒜x can be solved uniquely for x. This inverse relation is denoted as x = 𝒜−1 y where the operator 𝒜−1 is called the inverse of 𝒜. The inverse 𝒜−1 is also nonsingular.

In matrix form, the definition of a nonsingular operator translates to the requirement that A [x] = 0 only when [x] = 0. In Corollary B.19.2, such nonsingular matrices were shown to have a nonzero determinant and hence to possess an inverse matrix A −1. If A is the matrix of 𝒜 in some basis, then A −1 is the matrix of the inverse operator 𝒜−1.

E.3.2 Nonsingular Operators and Subspaces

If ℝ is a subspace of V_N and 𝒜 is a nonsingular linear operator, then 𝒜ℝ is also a subspace. It consists of all vectors y = 𝒜x where x ∈ ℝ. The following properties hold for nonsingular operators acting on subspaces:

𝒜(ℝ + 𝕊) = (𝒜ℝ) + (𝒜𝕊)   and   𝒜(ℝ ∩ 𝕊) = (𝒜ℝ) ∩ (𝒜𝕊)   (E.10)

If r_1,…, r_{N_R} are a basis in ℝ, then 𝒜r_1,…, 𝒜r_{N_R} are an LI set and form a basis in 𝒜ℝ. Thus, nonsingular operators also have the property

dim(𝒜ℝ) = dim(ℝ)   (E.11)

It follows from eqns (E.10, E.11) that if ℝ and 𝕊 are complements and 𝒜 is nonsingular, then 𝒜ℝ and 𝒜𝕊 are also complements:

V_N = ℝ ⊕ 𝕊   if and only if   V_N = (𝒜ℝ) ⊕ (𝒜𝕊)   (E.12)

E.4 Vectors in Phase Space

A set of values for the variables q_0,…, q_D, p_0,…, p_D of extended phase space defines a point in what is called a differentiable manifold. We want to establish a linear vector space such that the differentials of these variables dq_0,…, dq_D, dp_0,…, dp_D are components of vectors in an abstract linear vector space of dimension 2D + 2.169 Introducing basis vectors q̂_0,…, q̂_D, p̂_0,…, p̂_D, a typical vector may be written as

dγ = dq_0 q̂_0 + … + dq_D q̂_D + dp_0 p̂_0 + … + dp_D p̂_D   (E.13)

We may also use the symplectic notation introduced in Section 19.3 to write eqn (E.13) as

dγ = dγ_0 γ̂_0 + … + dγ_{2D+1} γ̂_{2D+1}   (E.14)

where q̂_0,…, q̂_D = γ̂_0,…, γ̂_D and p̂_0,…, p̂_D = γ̂_{D+1},…, γ̂_{2D+1}.

So far in this chapter, we have made no mention of inner products of vectors. The natural definition for the inner product in phase space is what will be called the symplectic inner product, or symplectic metric. It is defined by assigning the products of the basis vectors to be, for i, j = 0,…, (2D + 1),

γ̂_i ∘ γ̂_j = s_ij   (E.15)

where s_ij are the matrix elements of the symplectic matrix s defined in Section 19.4. We use the small circle between the vectors rather than the usual dot to emphasize that this is not the standard Cartesian form of the inner product.

Inner products or metrics introduced into linear vector spaces are required to obey certain properties.

  1. Linearity: x ∙ (α y + β z) = α x ∙ y + β x ∙ z.

  2. Non-degeneracy: x ∙ y = 0 for all x ∈ V_N if and only if y = 0.

  3. Symmetry: x ∙ y = y ∙ x.

  4. Positive definiteness: If x ≠ 0 then x ∙ x > 0.

The inner product defined in eqn (E.15) is assumed to have the first property. It has the second property since the matrix s is nonsingular. But it does not satisfy properties 3 and 4. Equation (E.15) therefore does not define a proper metric but what is referred to in mathematics texts as a structure function. However, we will continue to refer to it as a metric, following the somewhat looser usage in physics. (The inner product of two timelike or lightlike fourvectors in Minkowski space, for example, violates property 4 and yet g_ij is generally called a metric.)

Using the assumed linearity of the inner product, we can write the inner product of any two vectors as

dγ ∘ dϕ = Σ_{i=0}^{2D+1} Σ_{j=0}^{2D+1} dγ_i s_ij dϕ_j   (E.16)

from which it follows that properties 3 and 4 above are replaced by:

  3′. Anti-symmetry: dγ ∘ dϕ = −dϕ ∘ dγ.

  4′. Nullity: dγ ∘ dγ = 0 for every vector dγ.
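The antisymmetry and nullity properties can be verified numerically. The sketch below is our own illustration, not part of the text: it builds s for D = 1 (assuming the block form with +U in the upper right and −U in the lower left, as in Section 19.4), and checks properties 3′ and 4′ on arbitrary test vectors; the function names are ours.

```python
# Symplectic inner product in extended phase space for D = 1,
# so the space has dimension 2D + 2 = 4.
D = 1
n = 2 * D + 2

def symplectic_matrix(n):
    """Assumed block form: s = [[0, U], [-U, 0]] with U the unit matrix."""
    half = n // 2
    s = [[0] * n for _ in range(n)]
    for i in range(half):
        s[i][half + i] = 1       # upper-right block  +U
        s[half + i][i] = -1      # lower-left block   -U
    return s

def circ(x, y, s):
    """Symplectic inner product  x ∘ y = sum_ij x_i s_ij y_j."""
    return sum(x[i] * s[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))

s = symplectic_matrix(n)
dg = [1, 2, 3, 4]    # components of d-gamma (arbitrary test vector)
df = [5, -1, 0, 2]   # components of d-phi   (arbitrary test vector)

print(circ(dg, df, s))                     # -7 for these test vectors
print(circ(dg, df, s) + circ(df, dg, s))   # 0: antisymmetry (property 3')
print(circ(dg, dg, s))                     # 0: nullity (property 4')
```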

E.5 Canonical Transformations in Phase Space

In Section 19.3 a canonical transformation q, p → Q, P was written in symplectic coordinates as γ → Γ. Then a (2D + 2) × (2D + 2) Jacobian matrix with components J_ij = ∂Γ_i(γ)/∂γ_j was defined. It follows from the chain rule that the differentials of the symplectic coordinates transform with this Jacobian matrix J just as the derivatives were shown to do in eqn (19.24):

dΓ_i = Σ_{j=0}^{2D+1} J_ij dγ_j   (E.17)

In discussing canonical transformations in phase space it is simplest to adopt the active definition of canonical transformations discussed in Section 20.14. Thus the transformation of differentials in eqn (E.17) is assumed to transform the vector d γ in eqn (E.14) into a new vector

dΓ = 𝒥 dγ = Σ_{i=0}^{2D+1} dΓ_i γ̂_i   (E.18)

in the same basis. As is always the case for active transformations, the vector dγ is transformed into a new vector dΓ while the basis vectors γ̂_i are not transformed.

The symplectic inner product defined in Section E.4 has the following important property.

Theorem E.5.1: Invariance of Symplectic Inner Product

The symplectic inner product is invariant under canonical transformations. If 𝒥 is the operator of any canonical transformation and d γ and d ϕ are any two phase-space vectors, then

(𝒥 dγ) ∘ (𝒥 dϕ) = dγ ∘ dϕ   (E.19)

Proof: Writing eqn (E.19) in component form, we want to prove that

Σ_{i=0}^{2D+1} Σ_{j=0}^{2D+1} (Σ_{k=0}^{2D+1} J_ik dγ_k) s_ij (Σ_{l=0}^{2D+1} J_jl dϕ_l) = Σ_{k=0}^{2D+1} Σ_{l=0}^{2D+1} dγ_k (Jᵀ s J)_kl dϕ_l   (E.20)

is equal to

Σ_{k=0}^{2D+1} Σ_{l=0}^{2D+1} dγ_k s_kl dϕ_l   (E.21)

The Lagrange bracket condition, eqn (19.54), states that any canonical transformation has Jᵀ s J = s, which proves the present theorem.       ◻
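Theorem E.5.1 can be spot-checked numerically. The sketch below is our own illustration: the shear matrix J is an arbitrary sample linear canonical transformation for D = 0 (a two-dimensional phase space), and the helper names are ours. It first verifies the Lagrange bracket condition Jᵀ s J = s and then checks invariance of the symplectic inner product on two sample vectors.

```python
# Numerical check of the invariance theorem for D = 0, with s = [[0, 1], [-1, 0]].

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

s = [[0, 1], [-1, 0]]
J = [[1, 1], [0, 1]]   # sample canonical transformation (a shear, det J = 1)

# Lagrange bracket condition J^T s J = s
assert matmul(matmul(transpose(J), s), J) == s

def circ(x, y):
    """Symplectic inner product with the 2x2 matrix s above."""
    return sum(x[i] * s[i][j] * y[j] for i in range(2) for j in range(2))

def apply(A, x):
    return [sum(A[i][k] * x[k] for k in range(len(x))) for i in range(len(A))]

dg, df = [2, 3], [-1, 4]   # arbitrary test vectors
print(circ(apply(J, dg), apply(J, df)) == circ(dg, df))   # True: invariance
```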

E.6 Orthogonal Subspaces

If two phase-space vectors have a zero inner product dγ ∘ dϕ = 0, then we will say that they are orthogonal. But we must note that “orthogonality” in a vector space with a symplectic metric does not have the same geometrical meaning as in the space of threevectors. For example, as seen in property 4′ of Section E.4, every vector is orthogonal to itself! We will adopt the notation dγ ⊥ dϕ to indicate that the vectors are orthogonal and obey dγ ∘ dϕ = 0. Thus dγ ⊥ dγ for every vector dγ.

This idea of orthogonality can be extended to subspaces. If we have two subspaces ℝ and 𝕊, then we will write ℝ ⊥ 𝕊 if dr ∘ ds = 0 for every dr ∈ ℝ and ds ∈ 𝕊. Also a subspace can be self-orthogonal, with ℝ ⊥ ℝ, which means that dx ∘ dy = 0 for every dx and dy in ℝ.

It follows from Theorem E.5.1 that mutual- and self-orthogonality of subspaces is invariant under canonical transformations. If, as we did in Section E.3, we denote by 𝒥ℝ the set of all vectors dx = 𝒥dr where dr ∈ ℝ, then it follows that for any canonical transformation 𝒥 and any subspaces ℝ and 𝕊

ℝ ⊥ 𝕊   if and only if   (𝒥ℝ) ⊥ (𝒥𝕊)   (E.22)

Thus, in the special case in which ℝ = 𝕊, we also have that

ℝ ⊥ ℝ   if and only if   (𝒥ℝ) ⊥ (𝒥ℝ)   (E.23)

Lemma 19.7.1 proved that the matrix of any canonical transformation is nonsingular. Hence eqn (E.11) applies to canonical transformations, with the result that

dim(𝒥ℝ) = dim(ℝ)   (E.24)

for any subspace ℝ.

E.7 A Special Canonical Transformation

In Section 20.6 we defined a transformation (linear, with constant coefficients) from the variables Q, P to the new variables X, Y. Taking the differentials of the defining eqn (20.40), it can be written as

dX_k = ᾱ_k dQ_k − α_k dP_k   and   dY_k = α_k dQ_k + ᾱ_k dP_k   (E.25)

If we define symplectic coordinates dΓ_0,…, dΓ_{2D+1} = dQ_0,…, dQ_D, dP_0,…, dP_D as was done in Section 19.3, and also define dΛ_0,…, dΛ_{2D+1} = dX_0,…, dX_D, dY_0,…, dY_D using a similar pattern, then eqn (E.25) may be written as

dΛ_i = Σ_{j=0}^{2D+1} A_ij dΓ_j   or   [dΛ] = A[dΓ]   (E.26)

where the (2D + 2) × (2D + 2) matrix A is composed of diagonal (D + 1) × (D + 1) blocks a and ā. These blocks are defined in terms of the binary digits α_k in eqn (20.38) by a_ij = δ_ij α_j and ā_ij = δ_ij ᾱ_j. The matrix A is

A = ( ā  −a
      a   ā )   (E.27)

As the reader may verify, the matrix A is both canonical (obeying the definition eqn (19.37), for example) and orthogonal. Thus the operator 𝒜 is nonsingular, and defines an active canonical transformation. The inverse matrix is

A⁻¹ = Aᵀ = ( ā   a
            −a   ā )   (E.28)
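The claim that A is both canonical and orthogonal can be spot-checked. The sketch below is our own illustration (the choice D = 1 with α = (1, 0) is arbitrary, and the helper names are ours): it assembles A from its diagonal blocks and verifies A Aᵀ = U and Aᵀ s A = s.

```python
# Check that the special block matrix A is orthogonal and canonical
# for D = 1 with the illustrative choice alpha = (1, 0).

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

D = 1
alpha = [1, 0]
abar = [1 - a for a in alpha]
n = 2 * D + 2
half = n // 2

# A = [[abar, -a], [a, abar]] with diagonal (D+1)x(D+1) blocks
A = [[0] * n for _ in range(n)]
for k in range(half):
    A[k][k] = abar[k]                 # upper-left   block  abar
    A[k][half + k] = -alpha[k]        # upper-right  block  -a
    A[half + k][k] = alpha[k]         # lower-left   block   a
    A[half + k][half + k] = abar[k]   # lower-right  block  abar

# Symplectic matrix, assumed block form [[0, U], [-U, 0]]
s = [[0] * n for _ in range(n)]
for i in range(half):
    s[i][half + i] = 1
    s[half + i][i] = -1

U = [[1 if i == j else 0 for j in range(n)] for i in range(n)]

print(matmul(A, transpose(A)) == U)               # True: orthogonal
print(matmul(matmul(transpose(A), s), A) == s)    # True: canonical
```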

(p.615) E.8 Special Self-Orthogonal Subspaces

Before proving the main theorems of this chapter, we require some preliminary definitions and lemmas.

  1. Define a subspace ℚ to be all vectors of the form dγ^(q) = Σ_{i=0}^{D} dq_i q̂_i. As can be confirmed using eqn (E.16), ℚ ⊥ ℚ. Since ℚ is spanned by (D + 1) basis vectors, dim(ℚ) = D + 1.

  2. Define a subspace ℙ to be all vectors of the form dγ^(p) = Σ_{i=0}^{D} dp_i p̂_i. As can be confirmed using eqn (E.16), ℙ ⊥ ℙ. Since ℙ is spanned by (D + 1) basis vectors, dim(ℙ) = D + 1.

  3. Given any canonical transformation 𝒥, define subspaces 𝕄 = 𝒥ℚ and ℕ = 𝒥ℙ. It follows from the results in Section E.6 that 𝕄 ⊥ 𝕄, ℕ ⊥ ℕ, and dim(𝕄) = D + 1 = dim(ℕ).

  4. Given any canonical transformation 𝒥, and the special canonical transformation 𝒜 for any choice of the α_k, define subspaces 𝕎 = 𝒜𝕄 = 𝒜𝒥ℚ and ℤ = 𝒜ℕ = 𝒜𝒥ℙ. It follows from the results in Section E.6 that 𝕎 ⊥ 𝕎, ℤ ⊥ ℤ, and dim(𝕎) = D + 1 = dim(ℤ).

  5. Given the special canonical transformation 𝒜 for any choice of the α_k, define the subspaces 𝔼 = 𝒜⁻¹ℚ and 𝔽 = 𝒜⁻¹ℙ. It follows from the results in Section E.6 that 𝔼 ⊥ 𝔼, 𝔽 ⊥ 𝔽, and dim(𝔼) = D + 1 = dim(𝔽).

  6. By construction ℚ ∩ ℙ = 0 and ℚ + ℙ = V_{2D+2}, where V_{2D+2} denotes the whole of the vector space, of dimension (2D + 2). Thus ℚ and ℙ are complements relative to V_{2D+2}, denoted V_{2D+2} = ℚ ⊕ ℙ. It then follows from eqn (E.12) that the following pairs are also complements: V_{2D+2} = 𝕄 ⊕ ℕ, V_{2D+2} = 𝕎 ⊕ ℤ, and V_{2D+2} = 𝔼 ⊕ 𝔽.

  7. Define an operator 𝒮 whose associated matrix is s, the symplectic matrix. Then, as can be verified by writing out the component equation, ℙ = 𝒮ℚ. From the definitions in property 5 (𝔼 = 𝒜⁻¹ℚ and 𝔽 = 𝒜⁻¹ℙ) it follows that 𝔽 = 𝒜⁻¹ℙ = 𝒜⁻¹𝒮ℚ = 𝒜⁻¹𝒮𝒜𝔼 = 𝒜ᵀ𝒮𝒜𝔼 = 𝒮𝔼, where we used the orthogonality of A to write A⁻¹ = Aᵀ and the Lagrange bracket condition eqn (19.54) to write Aᵀ s A = s for the canonical transformation A.

The scheme of subspace relations can be summarized as

    𝔼 ←(𝒜⁻¹)— ℚ —(𝒥)→ 𝕄 —(𝒜)→ 𝕎
    𝒮↓          𝒮↓                        (E.29)
    𝔽 ←(𝒜⁻¹)— ℙ —(𝒥)→ ℕ —(𝒜)→ ℤ

where subspaces in the upper row and the corresponding subspaces in the lower row are all complements.

The following lemma will also be required.

Lemma E.8.1: Maximum Dimension of Self-Orthogonal Subspace

If ℝ is any self-orthogonal subspace, with ℝ ⊥ ℝ, then ℝ ∩ (𝒮ℝ) = 0 and N_R = dim(ℝ) ≤ (D + 1).

Proof: The operator 𝒮 defined in property 7 above is nonsingular and is a canonical transformation. Its nonsingularity is proved in Section 19.4. As is proved in eqn (19.29), 𝒮 is also orthogonal, with 𝒮⁻¹ = 𝒮ᵀ. Thus s s sᵀ = s, which shows that 𝒮 satisfies the Poisson bracket condition eqn (19.37) and is a canonical transformation.

Let dx be any vector in ℝ ∩ (𝒮ℝ), so that dx is in both ℝ and 𝒮ℝ. Since dx ∈ ℝ, it follows that 𝒮dx is in 𝒮ℝ. But dx is also in 𝒮ℝ. Hence both dx and 𝒮dx are in 𝒮ℝ.

It follows from eqn (E.23) that 𝒮ℝ ⊥ 𝒮ℝ. Hence dx ∘ 𝒮dx = 0 must hold. Writing this equation out in terms of components and using s² = −U gives

0 = Σ_{i=0}^{2D+1} Σ_{j=0}^{2D+1} dx_i s_ij (Σ_{k=0}^{2D+1} s_jk dx_k) = −Σ_{i=0}^{2D+1} (dx_i)²   (E.30)

which implies dx_i = 0 for every i value. Hence dx is the null vector, which implies that ℝ ∩ (𝒮ℝ) = 0.

Since 𝒮 is a nonsingular operator, eqn (E.11) implies that dim(𝒮ℝ) = dim(ℝ) = N_R. Since ℝ ∩ (𝒮ℝ) = 0, it then follows from the dimension rule eqn (E.4) that

dim(ℝ + (𝒮ℝ)) = dim(ℝ) + dim(𝒮ℝ) = 2N_R   (E.31)

But every subspace in V_{2D+2} must have a dimension less than or equal to (2D + 2). Thus 2N_R ≤ (2D + 2) and so N_R ≤ (D + 1), as was to be proved.     ◻
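The two computational facts used in the proof of Lemma E.8.1 are easy to verify numerically. The sketch below is our own illustration for D = 1 (assuming the block form of s used earlier; the variable names are ours): it checks s² = −U and that dx ∘ (𝒮dx) is minus the sum of squared components.

```python
# Numerical check of the facts used in Lemma E.8.1 for D = 1.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

n, half = 4, 2
s = [[0] * n for _ in range(n)]
for i in range(half):
    s[i][half + i] = 1    # assumed block form [[0, U], [-U, 0]]
    s[half + i][i] = -1

negU = [[-1 if i == j else 0 for j in range(n)] for i in range(n)]
print(matmul(s, s) == negU)   # True: s^2 = -U

dx = [1, -2, 3, 5]            # arbitrary test vector
sdx = [sum(s[i][k] * dx[k] for k in range(n)) for i in range(n)]
inner = sum(dx[i] * s[i][j] * sdx[j] for i in range(n) for j in range(n))
print(inner == -sum(x * x for x in dx))   # True: dx ∘ (S dx) = -sum(dx_i^2)
```

So dx ∘ (𝒮dx) = 0 indeed forces every component of dx to vanish, as the proof requires.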

E.9 Arnold's Theorem

The following theorem allows us to prove that every canonical transformation has a mixed generating function.

Theorem E.9.1: Arnold's Theorem

If ℕ is any self-orthogonal subspace of dimension (D + 1), there is some choice of the binary digits α_k used to define 𝒜 in Section E.7 such that ℕ and the self-orthogonal subspace 𝔼 defined in property 5 of Section E.8 are disjoint,

ℕ ∩ 𝔼 = 0   (E.32)

Proof: The proof first gives a definite procedure for choosing the α k values. We then prove that the resulting 𝔼 has the property eqn (E.32).170

Starting with ℕ and the self-orthogonal subspace ℙ defined in property 2 of Section E.8, consider the intersection ℕ ∩ ℙ. This intersection is a self-orthogonal subspace since (ℕ ∩ ℙ) ⊂ ℙ and ℙ is self-orthogonal. Denote the dimension of ℕ ∩ ℙ by n. It thus has a basis consisting of an LI set of vectors x_0,…, x_{n−1}. Since (ℕ ∩ ℙ) ⊂ ℙ, this basis can be extended to a basis for ℙ by adding vectors p̂_{k_n},…, p̂_{k_D} selected from the full ℙ basis p̂_0,…, p̂_D as needed. Then

x_0,…, x_{n−1}, p̂_{k_n},…, p̂_{k_D}   (E.33)

are (D + 1) linearly independent vectors that form a basis for ℙ.

Choose the binary digits α_k so that 0 = α_{k_0} = … = α_{k_{n−1}} and 1 = α_{k_n} = … = α_{k_D}, where k_0,…, k_{n−1} are all those indices not selected for the p̂_{k_i} in eqn (E.33). (The k_0,…, k_D are therefore an arrangement of the integers 0, 1,…, D.) With this definition of the α_k, the definitions in Section E.7 may be used to verify that the (D + 1)-dimensional subspace 𝔼 = 𝒜⁻¹ℚ will be spanned by the basis

q̂_{k_0},…, q̂_{k_{n−1}}, p̂_{k_n},…, p̂_{k_D}   (E.34)
Using this definition of 𝔼, we now proceed to the proof of eqn (E.32).

Inspection of eqn (E.34) shows that the intersection 𝔼 ∩ ℙ is a self-orthogonal subspace of dimension (D + 1 − n) spanned by p̂_{k_n},…, p̂_{k_D}. Also, we have already seen that ℕ ∩ ℙ is a self-orthogonal subspace of dimension n spanned by x_0,…, x_{n−1}. Thus, eqn (E.33) shows that ℙ has a basis consisting of a basis of ℕ ∩ ℙ concatenated with a basis of 𝔼 ∩ ℙ. It follows from the discussion in Section E.2.1 that ℕ ∩ ℙ and 𝔼 ∩ ℙ are complements relative to ℙ and that

ℙ = (ℕ ∩ ℙ) + (𝔼 ∩ ℙ)   and   (ℕ ∩ ℙ) ∩ (𝔼 ∩ ℙ) = 0   (E.35)

Since (ℕ ∩ ℙ) ⊂ ℙ and ℙ is self-orthogonal, it follows that (ℕ ∩ ℙ) ⊥ ℙ. Similarly, (𝔼 ∩ ℙ) ⊥ ℙ. Also, since ℕ ∩ 𝔼 and ℕ ∩ ℙ are both contained in the self-orthogonal subspace ℕ, it follows that (ℕ ∩ 𝔼) ⊥ (ℕ ∩ ℙ). Similarly, (ℕ ∩ 𝔼) ⊥ (𝔼 ∩ ℙ) since 𝔼 is self-orthogonal. Therefore

(ℕ ∩ 𝔼) ⊥ {(ℕ ∩ ℙ) + (𝔼 ∩ ℙ)} = ℙ   (E.36)
where eqn (E.35) was used to get the last equality.

Now ℙ is a self-orthogonal subspace of dimension (D + 1). If ℕ ∩ 𝔼 ≠ 0 and it were not true that (ℕ ∩ 𝔼) ⊂ ℙ, then there would be a vector not in ℙ that is symplectically orthogonal to itself and to all vectors in ℙ. Adjoining that vector to ℙ would produce a self-orthogonal subspace of dimension (D + 2), which is proved impossible by Lemma E.8.1. Thus, whether ℕ ∩ 𝔼 = 0 or not, it is true that

(ℕ ∩ 𝔼) ⊂ ℙ   (E.37)
Thus, using eqn (E.35),

ℕ ∩ 𝔼 = (ℕ ∩ 𝔼) ∩ ℙ = (ℕ ∩ ℙ) ∩ (𝔼 ∩ ℙ) = 0   (E.38)
as was to be proved. ◻

E.10 Existence of a Mixed Generating Function

In Section 20.9 we demonstrated that every canonical transformation can be generated by a mixed generating function F(q, Y). A crucial point in that demonstration was the assertion, without proof, that the binary number α defined in Section 20.6 can always be chosen so that the matrix (∂Y/∂p) defined in eqn (20.64) is nonsingular. Using Arnold's theorem, we can now provide the proof of this assertion.

Theorem E.10.1: Existence of Mixed Generating Function

Assuming a general canonical transformation q, p → Q, P, the variables X, Y were defined in eqn (20.63) as

X_k(q, p) = ᾱ_k Q_k(q, p) − α_k P_k(q, p)   and   Y_k(q, p) = α_k Q_k(q, p) + ᾱ_k P_k(q, p)   (E.39)

The digits α_k of the binary number α = α_D … α_1 α_0 used in this definition can always be chosen so that the (D + 1)-dimensional square matrix (∂Y/∂p) defined by its matrix elements

(∂Y/∂p)_ij = ∂Y_i(q, p)/∂p_j   (E.40)

will have a nonzero determinant and hence be nonsingular.

Proof: As in Theorem E.9.1, we continue to use the subspace definitions from Section E.8. From Arnold's theorem, we know that there is some choice of α such that ℕ ∩ 𝔼 = 0. Multiplying by the nonsingular operator 𝒜 defined in Section E.7 and using eqn (E.10) gives

0 = 𝒜(ℕ ∩ 𝔼) = (𝒜ℕ) ∩ (𝒜𝔼) = ℤ ∩ ℚ   (E.41)

By its definition in property 4 of Section E.8, ℤ = 𝒜𝒥ℙ, where ℙ is the subspace defined in property 2, consisting of vectors of the form dγ^(p) = Σ_{i=0}^{D} dp_i p̂_i. Using the definition of symplectic coordinates from eqn (E.26), vectors in ℤ are

dΛ^(p) = Σ_{i=0}^{2D+1} dΛ_i^(p) γ̂_i = Σ_{i=0}^{D} ( dX_i^(p) q̂_i + dY_i^(p) p̂_i )   (E.42)

where the components come from the matrix equation

[dΛ^(p)] = ( [dX^(p)], [dY^(p)] )ᵀ = A J ( [0], [dp] )ᵀ   (E.43)

Now suppose that at least one of the dp_i is nonzero, so that [dp] ≠ 0. If [dY^(p)] = 0 in that case, it would follow that a nonnull vector in ℤ would be of the form Σ_{i=0}^{D} dX_i^(p) q̂_i. But such a vector is entirely expressed in the q̂_0,…, q̂_D basis and therefore is a vector in ℚ. This is impossible, since, by eqn (E.41), there are no nonnull vectors in both ℤ and ℚ. Therefore it is impossible to have [dp] ≠ 0 and [dY^(p)] = 0.

Examination of eqn (E.39) shows that we can also use the chain rule and eqn (E.40) to write [dY^(p)] as

[dY^(p)] = (∂Y/∂p) [dp]   (E.44)

We have proved above that [dY^(p)] = 0 implies [dp] = 0. By Corollary B.19.2, this is the necessary and sufficient condition for (∂Y/∂p) in eqn (E.44) to be nonsingular and have a nonzero determinant. Thus |∂Y/∂p| ≠ 0, as was to be proved.     ◻
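A minimal worked instance of Theorem E.10.1 (our own illustration, not from the text) is the exchange transformation Q = p, P = −q for D = 0, whose Jacobian is the symplectic matrix itself and hence canonical. With α_0 = 0 we get Y_0 = P = −q, so (∂Y/∂p) = [0] is singular, while α_0 = 1 gives Y_0 = Q = p and (∂Y/∂p) = [1]. The block names Jqp and Jpp below are our own notation for the Jacobian blocks ∂Q/∂p and ∂P/∂p.

```python
# Illustration of choosing the binary digit alpha for the exchange
# transformation Q = p, P = -q (D = 0), an assumed sample transformation.

Jqp = [[1]]   # dQ/dp block of the Jacobian
Jpp = [[0]]   # dP/dp block of the Jacobian

def dY_dp(alpha):
    """Since Y = alpha*Q + abar*P, (dY/dp) = alpha*(dQ/dp) + abar*(dP/dp)."""
    abar = 1 - alpha
    return [[alpha * Jqp[0][0] + abar * Jpp[0][0]]]

print(dY_dp(0))   # [[0]] -> zero determinant: alpha = 0 fails
print(dY_dp(1))   # [[1]] -> nonzero determinant: alpha = 1 works
```

The theorem guarantees that such a successful choice of α exists for every canonical transformation, not just for this sample.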


(167) See Chapter 2 of Mirsky (1961) or Chapter 7 of Birkhoff and MacLane (1977). Different texts differ as to which are the axioms and which are the derived properties. However, there is general agreement that linear vector spaces do satisfy the whole list of axioms and derived properties given here.

(168) It might appear that a distributive rule would hold for linear sums and intersections, but it does not. In general ℝ ∩ (𝕊 + 𝕋) ≠ (ℝ ∩ 𝕊) + (ℝ ∩ 𝕋). For example, consider vectors in the Cartesian x-y plane V_2. Let ℝ be the one-dimensional subspace with Cartesian components (r, r), 𝕊 the one with (s, 0), and 𝕋 the one with (0, t). Then ℝ ∩ 𝕊 = 0 and ℝ ∩ 𝕋 = 0, so (ℝ ∩ 𝕊) + (ℝ ∩ 𝕋) = 0. But ℝ ∩ (𝕊 + 𝕋) = ℝ since (𝕊 + 𝕋) spans the whole of V_2.

(169) In modern differential geometry, a curve through a given point of the manifold is defined by writing γ_i = γ_i(β) for each coordinate i = 0,…, (2D + 1). Then the derivatives dγ_i(β)/dβ, taken at the given point, are used as the components of what is called a tangent vector. The differentials that we use for vector components are related to these tangent vectors by dγ_i = (dγ_i(β)/dβ)dβ. In either case, the geometrical idea is that dγ is to represent an arbitrary displacement starting from some point of the manifold. Since we have used differentials (in the sense of the word defined in Section D.12) throughout this text, we will continue to use them here. Those more familiar with the differential-geometric notation may mentally replace the dq_k, etc., by components of the corresponding tangent vectors.

(170) This theorem and its proof are derived from Chapters 8 and 9 of Arnold (1989).