Appendix E
Geometry of Phase Space
In Section 20.9, we asserted without proof that a binary number α can always be found that will make the matrix (∂Y/∂p) defined in eqn (20.64) nonsingular. Proof of that fact requires an excursion into the geometry of phase space.
In the present reference chapter we define general abstract linear vector spaces and state some of their important properties. We then apply the general theory to define a vector space in phase space, and prove some theorems of importance for the theory of canonical transformations.
E.1 Abstract Vector Space
A linear vector space is a set of objects called vectors that obey the following axioms, abstracted from the properties of the three-dimensional Cartesian displacement vectors defined in Appendix A. In the following, we will denote vectors in an abstract vector space by the same bold, serif typeface as has been used for the special case of Cartesian threevectors.
E.1.1 Axioms

1. Closure under addition and scalar multiplication: If v and w are any vectors, then addition is defined by some rule such that v + w is also a vector. If α is any scalar (assumed here to be a real number) and v is any vector, then scalar multiplication is defined by some rule such that αv and vα are also vectors.
2. Commutativity: v + w = w + v and αv = vα.
3. Associativity of addition: (u + v) + w = u + (v + w).
4. Associativity of scalar multiplication: (αβ)v = α(β v).
5. Existence of additive identity: There is a vector ∅ such that every vector v satisfies the equation v + ∅ = v. The vector ∅ is also referred to as the null vector.
6. Existence of additive inverse: For every vector v, there is a vector denoted −v that is a solution of the equation v + (−v) = ∅.
7. Linearity in scalars: (α + β)v = α v + β v.
8. Linearity in vectors: α(v + w) = α v + α w.
9. Multiplication by one: 1v = v.
E.1.2 Derived properties
The axioms¹⁶⁷ can be used to prove the following additional properties of linear vector spaces:
1. The additive identity is unique: If v + u = v for all vectors v, then u = ∅.
2. The additive inverse is unique: If v + u = ∅, then u = −v.
3. Law of cancellation: v + u = v + w implies u = w.
4. Unique solution: If u satisfies the equation v + u = w then u = w + (−v).
5. Multiplication of a vector by scalar zero: 0v = ∅.
6. Multiplication of additive identity by a scalar: α∅ = ∅.
7. Multiplication by a negative number: (−1)v = −v.
8. Inverse of the inverse: −(−v) = v.
9. Simplified notations: Since the above axioms and additional properties imply that αv + β(−w) = αv + (−β)w, we need not be so careful about placement of the minus sign. Both of these equal expressions are commonly denoted simply as αv − βw. Thus the solution in derived property 4 is written u = w − v. Also, properties 5 and 6 allow one to drop the distinction between the scalar 0 and the additive identity ∅. The equation in axiom 5 becomes v + 0 = v. Leading zeros are usually dropped; for example, 0 − v is written −v.
E.1.3 Linear independence and dimension
The set of vectors v_1, v_2, …, v_r is linearly dependent (LD) if and only if there exist scalars α_1, α_2, …, α_r, not all zero, such that

α_1 v_1 + α_2 v_2 + ⋯ + α_r v_r = 0

A set of vectors that is not LD is called linearly independent (LI).
The vector space has dimension N if and only if it contains an LI set of N vectors, but every set in it with more than N vectors is LD. The dimension of a linear vector space is often indicated by labeling the space as V_N.
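As a concrete numerical illustration of the LD/LI definitions (the vectors and the rank test here are illustrative choices, not from the text): a set of vectors is LI exactly when the matrix whose columns are those vectors has rank equal to the number of vectors.

```python
# Numerical check of linear (in)dependence via matrix rank.
import numpy as np

v1 = np.array([1.0, 0.0, 2.0])
v2 = np.array([0.0, 1.0, 1.0])
v3 = v1 + 2.0 * v2          # deliberately dependent on v1 and v2

# Rank equals the number of columns exactly when the columns are LI.
li_pair = np.linalg.matrix_rank(np.column_stack([v1, v2])) == 2
ld_triple = np.linalg.matrix_rank(np.column_stack([v1, v2, v3])) == 3

print(li_pair)    # v1, v2 are LI
print(ld_triple)  # v1, v2, v3 are LD, so this is False
```

The rank test is equivalent to asking whether some nontrivial scalars α_i make the sum vanish, since such scalars form a null vector of the column matrix.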
Let e_1, e_2, …, e_N be any LI set of N vectors in a vector space V_N of dimension N. Such an LI set is called a basis for the vector space, and is said to span it. Any vector x in V_N can be expanded as

x = x_1 e_1 + x_2 e_2 + ⋯ + x_N e_N

where the scalars x_i are called the components of x in that basis.
A vector space may have many different bases. But all of these bases will have the same number of LI vectors in them. And the vector space can be specified uniquely by listing any one of its bases.
Vector equations are equivalent to equations between components (in a given basis). The equation u = αv + βw is true if and only if u_i = αv_i + βw_i is true for all i = 1, …, N. This latter relation is often written as an equation of column matrices. If one defines the column matrix [v] to have the components v_1, …, v_N, and similarly for [u] and [w], then the component relation becomes [u] = α[v] + β[w].
Any LI set of vectors x_1, x_2, …, x_m with m ⩽ N can be extended to a basis of V_N. If e_1, e_2, …, e_N is some known basis for V_N, then vectors e_{k_{m+1}}, e_{k_{m+2}}, …, e_{k_N} can be selected from it so that x_1, …, x_m, e_{k_{m+1}}, e_{k_{m+2}}, …, e_{k_N} is an LI set of N vectors and so forms a basis of V_N.
E.2 Subspaces

A subspace of V_N is a linear vector space all of whose member vectors are also members of V_N. Denoting a subspace as ℝ, it follows from the closure axiom for vector spaces that if x and y are members of ℝ, then (αx + βy) must also be a member of ℝ. Thus subspaces are different from subsets in set theory. If one constructs a subspace by listing a set of vectors, then the subspace must also contain all linear sums of those vectors and not just the vectors in the original list.
Subspaces may be specified by listing an LI set of vectors that spans the space. For example, if V_N has a basis e_1, …, e_N then the set of all vectors x = αe_1 + βe_4 + γe_5, where α, β, γ may take any values, forms a subspace.
Subspaces may also be specified by stating some criterion for the inclusion of vectors in the subspace. For example, the set of threevectors with components (0, y, z) forms a subspace of V_3, but the set with components (1, y, z) does not, since the sum of two such vectors does not have a 1 as its first component.
The null vector is a member of every subspace. If ℝ contains only the null vector, then we write ℝ = 0.
E.2.1 Linear Sum and Intersection
If ℝ and 𝕊 are subspaces of V N, then we define their linear sum (ℝ + 𝕊 ) as the set of all vectors x = r + s where r ∈ ℝ (which is read as “r is a member of ℝ”) and s ∈ 𝕊. The linear sum (ℝ + 𝕊 ) of two subspaces is itself a subspace of V N.
If ℝ and 𝕊 are subspaces of V_N, then we define their intersection (ℝ ∩ 𝕊) as the set of all vectors x that are members of both subspaces, x ∈ ℝ and x ∈ 𝕊. The intersection (ℝ ∩ 𝕊) of two subspaces is itself a subspace of V_N.¹⁶⁸
Subspaces are vector spaces. Just as was done for V_N, the dimension N_R of ℝ is defined as the number of vectors in the largest LI set it contains, and this set then forms a basis for ℝ. The notation dim(ℝ) = N_R will also be used, meaning that the dimension of ℝ is N_R. This notation can be used to write the dimension rule

dim(ℝ) + dim(𝕊) = dim(ℝ + 𝕊) + dim(ℝ ∩ 𝕊)

which relates the dimensions of the linear sum and the intersection of any two subspaces.
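The dimension rule can be checked on a small concrete example (an illustrative sketch; the particular subspaces are chosen here, not taken from the text). In V_3, let ℝ = span{e_1, e_2} and 𝕊 = span{e_2, e_3}; by inspection the intersection is span{e_2}.

```python
# Numerical illustration of the dimension rule:
#   dim(R) + dim(S) = dim(R + S) + dim(R ∩ S)
import numpy as np

R = np.array([[1, 0, 0], [0, 1, 0]]).T  # columns span R = span{e1, e2}
S = np.array([[0, 1, 0], [0, 0, 1]]).T  # columns span S = span{e2, e3}

dim_R = np.linalg.matrix_rank(R)                    # 2
dim_S = np.linalg.matrix_rank(S)                    # 2
dim_sum = np.linalg.matrix_rank(np.hstack([R, S]))  # rank of spanning set of R + S
dim_int = 1                                         # R ∩ S = span{e2}, by inspection

print(dim_R + dim_S == dim_sum + dim_int)           # the dimension rule holds
```

The linear sum R + S is spanned by the union of the two spanning sets, which is why its dimension is the rank of the concatenated column matrix.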
If (ℝ ∩ 𝕊) = 0 then we say that ℝ and 𝕊 are disjoint. Two subspaces are disjoint when their only common vector is the null vector.
In the special case in which (ℝ + 𝕊) = V_N and (ℝ ∩ 𝕊) = 0, we say that ℝ and 𝕊 are complementary subspaces, or complements relative to V_N. Complementary subspaces are disjoint and their linear sum is the whole of the space V_N. Since the subspace consisting only of the null vector has zero dimension by definition, the dimension rule gives N = N_R + N_S for complements.
If ℝ and 𝕊 are complements, and if ℝ has a basis r_1, …, r_{N_R} and 𝕊 has a basis s_1, …, s_{N_S}, then the set r_1, …, r_{N_R}, s_1, …, s_{N_S} is an LI set and forms a basis for V_N. It follows that every vector x ∈ V_N can then be written as x = r + s, where r and s are unique vectors in ℝ and 𝕊, respectively.
Conversely, suppose that V_N has some basis e_1, e_2, …, e_N and we segment that basis into two disjoint LI sets of vectors, say e_1, …, e_n and e_{n+1}, …, e_N. Then if we define ℝ to be the subspace spanned by e_1, …, e_n and define 𝕊 to be the subspace spanned by e_{n+1}, …, e_N, it follows that ℝ and 𝕊 are complements relative to V_N.
Every subspace ℝ has some subspace 𝕊 such that ℝ and 𝕊 are complements. But this 𝕊 is not unique. So it is incorrect to speak of “the” complement of a given subspace.
The notation ℝ ⊃ 𝕊 means that every x ∈ 𝕊 is also a member of ℝ. We say that ℝ contains 𝕊. Hence (ℝ + 𝕊) ⊃ 𝕊, (ℝ + 𝕊) ⊃ ℝ, ℝ ⊃ (ℝ ∩ 𝕊), 𝕊 ⊃ (ℝ ∩ 𝕊).
E.3 Linear Operators
A linear operator 𝒜 in a vector space V_N operates on vectors x ∈ V_N and produces other vectors y ∈ V_N. We write this operation as y = 𝒜x. The assumed linearity property is

𝒜(αx + βy) = α𝒜x + β𝒜y

for any scalars α, β and any vectors x, y.
E.3.1 Nonsingular Operators
Since canonical transformations are nonsingular, we will be interested here in nonsingular operators. By definition, an operator 𝒜 is nonsingular if it satisfies the condition that 𝒜x = 0 only when x = 0.
An operator 𝒜 is nonsingular if and only if it possesses an inverse. For nonsingular operators, the equation y = 𝒜x can be solved uniquely for x. This inverse relation is denoted as x = 𝒜⁻¹y, where the operator 𝒜⁻¹ is called the inverse of 𝒜. The inverse 𝒜⁻¹ is also nonsingular.
In matrix form, the definition of a nonsingular operator translates to the requirement that A[x] = 0 only when [x] = 0. In Corollary B.19.2, such nonsingular matrices were shown to have a nonzero determinant and hence to possess an inverse matrix A⁻¹. If A is the matrix of 𝒜 in some basis, then A⁻¹ is the matrix of the inverse operator 𝒜⁻¹.
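The matrix criterion can be illustrated numerically (the particular matrices below are illustrative choices): a nonsingular matrix has nonzero determinant and annihilates only the zero vector, while a singular matrix kills some nonzero vector.

```python
# A matrix A is nonsingular iff det(A) != 0, iff A @ x = 0 forces x = 0.
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 1.0]])   # det = 1, nonsingular
B = np.array([[1.0, 2.0], [2.0, 4.0]])   # det = 0, singular

det_A = np.linalg.det(A)
x = np.linalg.inv(A) @ np.zeros(2)        # the unique solution of A x = 0

# B is singular: it annihilates a nonzero vector.
null_vec = np.array([2.0, -1.0])
residual = B @ null_vec

print(abs(det_A) > 1e-12)                 # True: A has an inverse
print(np.allclose(x, 0.0))                # True: only x = 0 solves A x = 0
print(np.allclose(residual, 0.0))         # True: B kills (2, -1) != 0
```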
E.3.2 Nonsingular Operators and Subspaces
If ℝ is a subspace of V_N and 𝒜 is a nonsingular linear operator, then 𝒜ℝ is also a subspace. It consists of all vectors y = 𝒜x where x ∈ ℝ. The following properties hold for nonsingular operators acting on subspaces:

𝒜(ℝ + 𝕊) = 𝒜ℝ + 𝒜𝕊     and     𝒜(ℝ ∩ 𝕊) = (𝒜ℝ) ∩ (𝒜𝕊)     (E.10)
If r_1, …, r_{N_R} is a basis of ℝ, then 𝒜r_1, …, 𝒜r_{N_R} is an LI set and forms a basis of 𝒜ℝ. Thus nonsingular operators also have the property

dim(𝒜ℝ) = dim(ℝ)     (E.11)

It also follows from eqns (E.10) and (E.11) that if ℝ and 𝕊 are complements relative to V_N, then 𝒜ℝ and 𝒜𝕊 are complements relative to V_N.     (E.12)
E.4 Vectors in Phase Space
A set of values for the variables q_0, …, q_D, p_0, …, p_D of extended phase space defines a point in what is called a differentiable manifold. We want to establish a linear vector space such that the differentials of these variables dq_0, …, dq_D, dp_0, …, dp_D are components of vectors in an abstract linear vector space of dimension 2D + 2.¹⁶⁹ Introducing basis vectors q_0, …, q_D, p_0, …, p_D, a typical vector may be written as

dγ = dq_0 q_0 + ⋯ + dq_D q_D + dp_0 p_0 + ⋯ + dp_D p_D     (E.14)
So far in this chapter, we have made no mention of inner products of vectors. The natural definition for the inner product in phase space is what will be called the symplectic inner product, or symplectic metric. Denoting the basis vectors collectively by e_0, …, e_{2D+1} = q_0, …, q_D, p_0, …, p_D, it can be defined by defining the products of the basis vectors to be, for i, j = 0, …, (2D + 1),

e_i ∘ e_j = s_{ij}     (E.15)

where s is the symplectic matrix.
Inner products or metrics introduced into linear vector spaces are required to obey certain properties.
1. Linearity: x ∙ (α y + β z) = α x ∙ y + β x ∙ z.
2. Non-degeneracy: x ∙ y = 0 for all x ∈ V_N if and only if y = 0.
3. Symmetry: x ∙ y = y ∙ x.
4. Positive definiteness: If x ≠ 0 then x ∙ x > 0.
The inner product defined in eqn (E.15) is assumed to have the first property. It has the second property since the matrix s is nonsingular. But it does not satisfy properties 3 and 4. Equation (E.15) therefore does not define a proper metric but what is referred to in mathematics texts as a structure function. However, we will continue to refer to it as a metric, following the somewhat looser usage in physics. (The inner product of two timelike or lightlike fourvectors in Minkowski space, for example, violates property 4 and yet g_{ij} is generally called a metric.)
Using the assumed linearity of the inner product, we can write the inner product of any two vectors as

dγ ∘ dϕ = Σ_{i,j=0}^{2D+1} dγ_i s_{ij} dϕ_j = [dγ]ᵀ s [dϕ]     (E.16)

In place of properties 3 and 4 above, the symplectic inner product has the properties:
3′. Anti-symmetry: dγ ∘ dϕ = −dϕ ∘ dγ.
4′. Nullity: dγ ∘ dγ = 0 for every vector dγ.
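Properties 3′ and 4′, along with non-degeneracy, can be verified numerically for a small phase space (a sketch assuming D = 1, so the space has dimension 4; the block form of s used here matches the ordering q_0, q_1, p_0, p_1):

```python
# The symplectic inner product dγ ∘ dφ = [dγ]^T s [dφ], with the
# symplectic matrix s = [[0, U], [-U, 0]] in the (q, p) ordering.
import numpy as np

D = 1
n = D + 1
U, Z = np.eye(n), np.zeros((n, n))
s = np.block([[Z, U], [-U, Z]])

def symp(a, b):
    """Symplectic inner product of two component vectors."""
    return a @ s @ b

rng = np.random.default_rng(0)
a, b = rng.normal(size=2 * n), rng.normal(size=2 * n)

anti = np.isclose(symp(a, b), -symp(b, a))   # property 3': anti-symmetry
null = np.isclose(symp(a, a), 0.0)           # property 4': nullity
nondeg = abs(np.linalg.det(s)) > 1e-12       # s nonsingular -> non-degenerate

print(anti, null, nondeg)
```

Nullity follows directly from anti-symmetry with b = a, which is why every phase-space vector has a vanishing symplectic "length".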
E.5 Canonical Transformations in Phase Space
In Section 19.3 a canonical transformation q, p → Q, P was written in symplectic coordinates as γ → Γ. Then a (2D + 2) × (2D + 2) Jacobian matrix with components J_{ij} = ∂Γ_i(γ)/∂γ_j was defined. It follows from the chain rule that the differentials of the symplectic coordinates transform with this Jacobian matrix J just as the derivatives were shown to do in eqn (19.24),

[dΓ] = J [dγ]     (E.17)
In discussing canonical transformations in phase space it is simplest to adopt the active definition of canonical transformations discussed in Section 20.14. Thus the transformation of differentials in eqn (E.17) is assumed to transform the vector dγ in eqn (E.14) into a new vector

dΓ = 𝒥 dγ     (E.18)

where 𝒥 is the operator whose matrix is the Jacobian matrix J.
The symplectic inner product defined in Section E.4 has the following important property.
Theorem E.5.1: Invariance of Symplectic Inner Product
The symplectic inner product is invariant under canonical transformations. If 𝒥 is the operator of any canonical transformation and dγ and dϕ are any two phase-space vectors, then

(𝒥dγ) ∘ (𝒥dϕ) = dγ ∘ dϕ     (E.19)
Proof: Writing eqn (E.19) in component form, we want to prove that

[dγ]ᵀ Jᵀ s J [dϕ] = [dγ]ᵀ s [dϕ]

This follows at once from the Lagrange bracket condition Jᵀ s J = s, eqn (19.54), which holds for any canonical transformation. ∎
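The invariance can be demonstrated numerically for a sample canonical transformation (an illustrative sketch: the "shear" matrix below, with C symmetric, is one standard way to build a matrix satisfying Jᵀ s J = s; it is chosen here for the demonstration, not taken from the text):

```python
# Check J dγ ∘ J dφ = dγ ∘ dφ for a sample canonical transformation.
import numpy as np

D = 1
n = D + 1
U, Z = np.eye(n), np.zeros((n, n))
s = np.block([[Z, U], [-U, Z]])

C = np.array([[1.0, 2.0], [2.0, -3.0]])   # any symmetric matrix
J = np.block([[U, C], [Z, U]])            # a symplectic "shear"

lagrange_ok = np.allclose(J.T @ s @ J, s)  # Lagrange bracket condition

rng = np.random.default_rng(1)
dg, df = rng.normal(size=2 * n), rng.normal(size=2 * n)
invariant = np.isclose((J @ dg) @ s @ (J @ df), dg @ s @ df)

print(lagrange_ok, invariant)
```

The second check is exactly the component form of the proof above: once Jᵀ s J = s holds, invariance of the inner product is automatic for every pair of vectors.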
E.6 Orthogonal Subspaces
If two phase-space vectors have a zero inner product dγ ∘ dϕ = 0 then we will say that they are orthogonal. But we must note that "orthogonality" in a vector space with a symplectic metric will not have the same geometrical meaning as in the space of threevectors. For example, as seen in property 4′ of Section E.4, every vector is orthogonal to itself! We will adopt the notation dγ ⊥ dϕ to indicate that the vectors are orthogonal and obey dγ ∘ dϕ = 0. Thus dγ ⊥ dγ for every vector dγ.
This idea of orthogonality can be extended to subspaces. If we have two subspaces ℝ and 𝕊 then we will write ℝ ⊥ 𝕊 if dr ∘ ds = 0 for every dr ∈ ℝ and ds ∈ 𝕊. Also a subspace can be self-orthogonal, with ℝ ⊥ ℝ, which means that dx ∘ dy = 0 for every dx and dy in ℝ.
It follows from Theorem E.5.1 that mutual- and self-orthogonality of subspaces is invariant under canonical transformations. If, as we did in Section E.3, we denote by 𝒥ℝ the set of all vectors dx = 𝒥dr where dr ∈ ℝ, then it follows that for any canonical transformation 𝒥 and any subspaces ℝ and 𝕊, ℝ ⊥ 𝕊 if and only if (𝒥ℝ) ⊥ (𝒥𝕊), and in particular

ℝ ⊥ ℝ if and only if (𝒥ℝ) ⊥ (𝒥ℝ)     (E.23)
Lemma 19.7.1 proved that the matrix of any canonical transformation is nonsingular. Hence eqn (E.11) applies to canonical transformations, with the result that

dim(𝒥ℝ) = dim(ℝ)

for any canonical transformation 𝒥 and any subspace ℝ.
E.7 A Special Canonical Transformation
In Section 20.6 we defined a transformation (linear, with constant coefficients) from the variables Q, P to the new variables X, Y, parameterized by the binary digits α_k. Taking the differentials of the defining eqn (20.40), that transformation can be written as

dφ = 𝒜 dΓ     where     dφ = Σ_{k=0}^{D} ( dX_k q_k + dY_k p_k )

and 𝒜 is the nonsingular operator whose matrix is the constant matrix of eqn (20.40). That matrix is both orthogonal and canonical.
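One consistent realization of such a binary-digit transformation can be checked numerically. Note the convention here is an assumption for illustration, not necessarily the book's eqn (20.40): for α_k = 0 the pair (Q_k, P_k) is left alone, and for α_k = 1 it is rotated, (X_k, Y_k) = (−P_k, Q_k). Any such choice is simultaneously orthogonal and canonical:

```python
# Assumed "flip" convention (illustrative): alpha_k = 0 -> identity on
# (Q_k, P_k); alpha_k = 1 -> (X_k, Y_k) = (-P_k, Q_k).
import numpy as np

D = 1
n = D + 1
alpha = np.array([1.0, 0.0])          # one choice of the binary digits
a, b = np.diag(alpha), np.diag(1.0 - alpha)
U, Z = np.eye(n), np.zeros((n, n))
s = np.block([[Z, U], [-U, Z]])

A = np.block([[b, -a], [a, b]])       # matrix of the operator in (q, p) ordering

orthogonal = np.allclose(A.T @ A, np.eye(2 * n))  # A^-1 = A^T
canonical = np.allclose(A.T @ s @ A, s)           # Lagrange bracket condition

print(orthogonal, canonical)
```

Both properties are used later, in property 7 of Section E.8, so this sketch confirms that they can coexist for every choice of the α_k under the assumed convention.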
E.8 Special Self-Orthogonal Subspaces
Before proving the main theorems of this chapter, we require some preliminary definitions and lemmas.
1. Define a subspace ℚ to be all vectors of the form dγ = dq_0 q_0 + ⋯ + dq_D q_D, with no p-basis components. As can be confirmed using eqn (E.16), ℚ ⊥ ℚ. Since ℚ is spanned by (D + 1) basis vectors, dim(ℚ) = D + 1.
2. Define a subspace ℙ to be all vectors of the form dγ = dp_0 p_0 + ⋯ + dp_D p_D, with no q-basis components. As can be confirmed using eqn (E.16), ℙ ⊥ ℙ. Since ℙ is spanned by (D + 1) basis vectors, dim(ℙ) = D + 1.
3. Given any canonical transformation 𝒥, define subspaces 𝕄 = 𝒥ℚ and ℕ = 𝒥ℙ. It follows from the results in Section E.6 that 𝕄 ⊥ 𝕄, ℕ ⊥ ℕ, and dim(𝕄) = D + 1 = dim(ℕ).
4. Given any canonical transformation 𝒥, and the special canonical transformation 𝒜 for any choice of the α_k, define subspaces 𝕎 = 𝒜𝕄 = 𝒜𝒥ℚ and ℤ = 𝒜ℕ = 𝒜𝒥ℙ. It follows from the results in Section E.6 that 𝕎 ⊥ 𝕎, ℤ ⊥ ℤ, and dim(𝕎) = D + 1 = dim(ℤ).
5. Given the special canonical transformation 𝒜 for any choice of the α_k, define the subspaces 𝔼 = 𝒜⁻¹ℚ and 𝔽 = 𝒜⁻¹ℙ. It follows from the results in Section E.6 that 𝔼 ⊥ 𝔼, 𝔽 ⊥ 𝔽, and dim(𝔼) = D + 1 = dim(𝔽).
6. By construction, ℚ ∩ ℙ = 0 and ℚ + ℙ = V_{2D+2}, where V_{2D+2} denotes the whole of the vector space, of dimension (2D + 2). Thus ℚ and ℙ are complements relative to V_{2D+2}. It then follows from eqn (E.12) that the following pairs are also complements: 𝕄 and ℕ, 𝕎 and ℤ, and 𝔼 and 𝔽.
7. Define an operator 𝒮 whose associated matrix is s, the symplectic matrix. Then, as can be verified by writing out the component equation, ℙ = 𝒮ℚ. From the definitions in property 5 (𝔼 = 𝒜⁻¹ℚ and 𝔽 = 𝒜⁻¹ℙ) it follows that 𝔽 = 𝒜⁻¹𝒮𝒜𝔼 = 𝒜ᵀ𝒮𝒜𝔼 = 𝒮𝔼, where we used the orthogonality of 𝒜 to write 𝒜⁻¹ = 𝒜ᵀ and the Lagrange bracket condition, eqn (19.54), to write 𝒜ᵀ𝒮𝒜 = 𝒮 for the canonical transformation 𝒜.
The scheme of subspace relations can be summarized as

𝔼 = 𝒜⁻¹ℚ,  𝕄 = 𝒥ℚ,  𝕎 = 𝒜𝒥ℚ     and     𝔽 = 𝒜⁻¹ℙ = 𝒮𝔼,  ℕ = 𝒥ℙ,  ℤ = 𝒜𝒥ℙ
The following lemma will also be required.
Lemma E.8.1: Maximum Dimension of Self-Orthogonal Subspace
If ℝ is any self-orthogonal subspace, with ℝ ⊥ ℝ, then ℝ ∩ (𝒮ℝ) = 0 and N_R = dim(ℝ) ≤ (D + 1).
Proof: The operator 𝒮 defined in property 7 above is nonsingular and is a canonical transformation. Its nonsingularity is proved in Section 19.4. As is proved in eqn (19.29), 𝒮 is also orthogonal, with 𝒮⁻¹ = 𝒮ᵀ. Thus s s sᵀ = s, which shows that 𝒮 satisfies the Poisson bracket condition, eqn (19.37), and is a canonical transformation.
Let dx be any vector in ℝ ∩ (𝒮ℝ), so that dx is in both ℝ and 𝒮ℝ. Since dx ∈ ℝ, it follows that 𝒮dx is in 𝒮ℝ. But dx is also in 𝒮ℝ. Hence both dx and 𝒮dx are in 𝒮ℝ.
It follows from eqn (E.23) that 𝒮ℝ ⊥ 𝒮ℝ. Hence dx ∘ 𝒮dx = 0 must hold. Writing this equation out in terms of components and using s² = −U gives

0 = dx ∘ 𝒮dx = [dx]ᵀ s s [dx] = −[dx]ᵀ[dx] = −Σ_i (dx_i)²

which implies dx = 0. Thus ℝ ∩ (𝒮ℝ) = 0. The dimension rule of Section E.2.1 then gives dim(ℝ) + dim(𝒮ℝ) = dim(ℝ + 𝒮ℝ) ≤ 2D + 2, and since eqn (E.11) gives dim(𝒮ℝ) = dim(ℝ) = N_R, it follows that N_R ≤ D + 1. ∎
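The lemma can be illustrated on the simplest self-orthogonal subspace of maximal dimension, the q-only subspace ℚ of property 1 (a sketch with D = 1; the subspace choice is illustrative):

```python
# Lemma E.8.1 illustrated: the q-only subspace Q is self-orthogonal,
# and S·Q is disjoint from Q, so dim(Q) = D + 1 is the maximum allowed.
import numpy as np

D = 1
n = D + 1
U, Z = np.eye(n), np.zeros((n, n))
s = np.block([[Z, U], [-U, Z]])

Q = np.eye(2 * n)[:, :n]        # columns q_0, q_1 span the subspace Q
SQ = s @ Q                      # the image S·Q (spans the p-only subspace)

self_orth = np.allclose(Q.T @ s @ Q, 0.0)          # Q ⊥ Q
rank_sum = np.linalg.matrix_rank(np.hstack([Q, SQ]))

print(self_orth)                # Q is self-orthogonal
print(rank_sum == 2 * n)        # Q + S·Q is the whole space, so Q ∩ S·Q = 0
```

Since dim(Q) + dim(S·Q) already fills the whole (2D + 2)-dimensional space, no self-orthogonal subspace larger than D + 1 can exist, in agreement with the lemma.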
E.9 Arnold's Theorem
The following theorem allows us to prove that every canonical transformation has a mixed generating function.
Theorem E.9.1: Arnold's Theorem
If ℕ is any self-orthogonal subspace of dimension (D + 1), there is some choice of the binary digits α_k used to define 𝒜 in Section E.7 such that ℕ and the self-orthogonal subspace 𝔼 defined in property 5 of Section E.8 are disjoint,

ℕ ∩ 𝔼 = 0
Proof: Starting with ℕ and the self-orthogonal subspace ℙ defined in property 2 of Section E.8, consider the intersection ℕ ∩ ℙ. This intersection is a self-orthogonal subspace since (ℕ ∩ ℙ) ⊂ ℙ and ℙ is self-orthogonal. Denote the dimension of ℕ ∩ ℙ by n. It thus has a basis consisting of an LI set of vectors x_0, …, x_{n−1}. Since (ℕ ∩ ℙ) ⊂ ℙ, this basis can be extended to a basis for ℙ by adding vectors p_{k_n}, …, p_{k_D} selected from the full ℙ basis p_0, …, p_D as needed. Then ℙ is spanned by the basis

x_0, …, x_{n−1}, p_{k_n}, …, p_{k_D}     (E.33)
Choose the binary digits α_k so that 1 = α_{k_0} = ⋯ = α_{k_{n−1}} and 0 = α_{k_n} = ⋯ = α_{k_D}, where k_0, …, k_{n−1} are all those indices not selected for the p_{k_i} in eqn (E.33). (The k_0, …, k_D are therefore an arrangement of the integers 0, 1, …, D.) With this definition of the α_k, the definitions in Section E.7 may be used to verify that the (D + 1)-dimensional subspace 𝔼 = 𝒜⁻¹ℚ will be spanned by the basis

q_{k_0}, …, q_{k_{n−1}}, p_{k_n}, …, p_{k_D}     (E.34)
Inspection of eqn (E.34) shows that the intersection 𝔼 ∩ ℙ is a self-orthogonal subspace of dimension (D + 1 − n) spanned by p_{k_n}, …, p_{k_D}. Also, we have already seen that ℕ ∩ ℙ is a self-orthogonal subspace of dimension n spanned by x_0, …, x_{n−1}. Thus, eqn (E.33) shows that ℙ has a basis consisting of a basis of ℕ ∩ ℙ concatenated with a basis of 𝔼 ∩ ℙ. It follows from the discussion in Section E.2.1 that ℕ ∩ ℙ and 𝔼 ∩ ℙ are complements relative to ℙ, with

(ℕ ∩ ℙ) + (𝔼 ∩ ℙ) = ℙ     and     (ℕ ∩ ℙ) ∩ (𝔼 ∩ ℙ) = 0
Since (ℕ ∩ ℙ) ⊂ ℙ and ℙ is self-orthogonal, it follows that (ℕ ∩ ℙ) ⊥ ℙ. Similarly, (𝔼 ∩ ℙ) ⊥ ℙ. Also, since ℕ ∩ 𝔼 and ℕ ∩ ℙ are both contained in the self-orthogonal subspace ℕ, it follows that (ℕ ∩ 𝔼) ⊥ (ℕ ∩ ℙ). Similarly, (ℕ ∩ 𝔼) ⊥ (𝔼 ∩ ℙ), since 𝔼 is self-orthogonal. Therefore, being orthogonal to both summands of ℙ = (ℕ ∩ ℙ) + (𝔼 ∩ ℙ),

(ℕ ∩ 𝔼) ⊥ ℙ
Now ℙ is a self-orthogonal subspace of dimension (D + 1). If ℕ ∩ 𝔼 ≠ 0 and it were not true that (ℕ ∩ 𝔼) ⊂ ℙ, then there would be a vector not in ℙ that is symplectically orthogonal to all vectors in ℙ. Adjoining that vector to ℙ would give a self-orthogonal subspace of dimension (D + 2), which is proved impossible by Lemma E.8.1. Thus, whether ℕ ∩ 𝔼 = 0 or not, it is true that

(ℕ ∩ 𝔼) ⊂ ℙ

But ℕ ∩ 𝔼 is also contained in both ℕ and 𝔼, so (ℕ ∩ 𝔼) ⊂ (ℕ ∩ ℙ) and (ℕ ∩ 𝔼) ⊂ (𝔼 ∩ ℙ). Since ℕ ∩ ℙ and 𝔼 ∩ ℙ are complements relative to ℙ, their intersection is 0, and therefore ℕ ∩ 𝔼 = 0, as was to be proved. ∎
E.10 Existence of a Mixed Generating Function
In Section 20.9 we demonstrated that every canonical transformation can be generated by a mixed generating function F(q, Y). A crucial point in that demonstration was the assertion, without proof, that the binary number α defined in Section 20.6 can always be chosen so that the matrix (∂ Y/∂ p) defined in eqn (20.64) is nonsingular. Using Arnold's theorem, we can now provide the proof of this assertion.
Theorem E.10.1: Existence of a Mixed Generating Function
Assuming a general canonical transformation q, p → Q, P, the variables X, Y were defined in eqn (20.63) in terms of the binary digits α_k. The binary number α may always be chosen so that the matrix (∂Y/∂p) is nonsingular.
Proof: As in Theorem E.9.1, we continue to use the subspace definitions from Section E.8. From Arnold's theorem, we know that there is some choice of α such that ℕ ∩ 𝔼 = 0. Multiplying by the nonsingular operator 𝒜 defined in Section E.7 and using eqn (E.10) gives

0 = 𝒜(ℕ ∩ 𝔼) = (𝒜ℕ) ∩ (𝒜𝔼) = ℤ ∩ ℚ     (E.41)

where 𝒜𝔼 = 𝒜𝒜⁻¹ℚ = ℚ and, by property 4 of Section E.8, 𝒜ℕ = ℤ.
By its definition in property 4 of Section E.8, ℤ = 𝒜𝒥ℙ, where ℙ is the subspace defined in property 2, consisting of vectors of the form dγ = dp_0 p_0 + ⋯ + dp_D p_D. Using the definition of the symplectic coordinates from eqn (E.26), vectors in ℤ are of the form

dφ = dX_0(p) q_0 + ⋯ + dX_D(p) q_D + dY_0(p) p_0 + ⋯ + dY_D(p) p_D

where [dX(p)] = (∂X/∂p)[dp] and [dY(p)] = (∂Y/∂p)[dp] are the differentials obtained when only the dp_i are allowed to be nonzero.
Now suppose that at least one of the dp_i is nonzero, so that [dp] ≠ 0. If [dY(p)] = 0 in that case, it would follow that a vector in ℤ would be of the form dφ = dX_0(p) q_0 + ⋯ + dX_D(p) q_D. But such a vector is entirely expressed in the q_0, …, q_D basis and therefore is a vector in ℚ. (It cannot be the null vector, since 𝒜𝒥 is nonsingular and [dp] ≠ 0.) This is impossible, since, by eqn (E.41), there are no nonzero vectors in both ℤ and ℚ. Therefore it is impossible to have [dp] ≠ 0 and [dY(p)] = 0. It follows that (∂Y/∂p)[dp] = 0 only when [dp] = 0, which is precisely the condition for the matrix (∂Y/∂p) to be nonsingular. ∎
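A minimal numerical instance of the theorem can be given for D = 0 (a sketch using an assumed flip convention Y = αQ + (1 − α)P, which is illustrative and not necessarily the book's eqn (20.63)). For the exchange transformation Q = p, P = −q, the matrix (∂Y/∂p) is singular for α = 0 but nonsingular for α = 1, showing why the freedom to choose α is essential:

```python
# D = 0 instance: exchange transformation Q = p, P = -q, with the
# assumed convention Y = alpha*Q + (1 - alpha)*P.
import numpy as np

def dY_dp(alpha):
    # Q = p, P = -q  =>  Y = alpha*p - (1 - alpha)*q, so ∂Y/∂p = alpha.
    return np.array([[alpha]])

rank0 = np.linalg.matrix_rank(dY_dp(0.0))   # alpha = 0: (∂Y/∂p) is singular
rank1 = np.linalg.matrix_rank(dY_dp(1.0))   # alpha = 1: (∂Y/∂p) is nonsingular

print(rank0, rank1)
```

For the exchange transformation the naive choice Y = P fails, but flipping the single binary digit repairs it, exactly as the theorem guarantees for every canonical transformation.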
(167) See Chapter 2 of Mirsky (1961) or Chapter 7 of Birkhoff and MacLane (1977). Different texts differ as to which are the axioms and which are the derived properties. However, there is general agreement that linear vector spaces do satisfy the whole list of axioms and derived properties given here.
(168) It might appear that a distributive rule would hold for linear sums and intersections, but it does not. In general ℝ ∩ (𝕊 + 𝕋) ≠ ℝ ∩ 𝕊 + ℝ ∩ 𝕋. For example, consider vectors in the Cartesian x–y plane V_2. Let ℝ be the one-dimensional subspace with Cartesian components (r, r), 𝕊 with (s, 0), and 𝕋 with (0, t). Then ℝ ∩ 𝕊 = 0 and ℝ ∩ 𝕋 = 0, so (ℝ ∩ 𝕊 + ℝ ∩ 𝕋) = 0. But ℝ ∩ (𝕊 + 𝕋) = ℝ since (𝕊 + 𝕋) spans the whole of V_2.
(169) In modern differential geometry, a curve through a given point of the manifold is defined by writing γ_i = γ_i(β) for each coordinate i = 0, …, (2D + 1). Then the derivatives dγ_i(β)/dβ, taken at the given point, are used as the components of what is called a tangent vector. The differentials that we use for vector components are related to these tangent vectors by dγ_i = (dγ_i(β)/dβ) dβ. In either case, the geometrical idea is that dγ is to represent an arbitrary displacement starting from some point of the manifold. Since we have used differentials (in the sense of the word defined in Section D.12) throughout this text, we will continue to use them here. Those more familiar with the differential-geometric notation may mentally replace the dq_k, etc., by components of the corresponding tangent vectors.