APPENDIX E GEOMETRY OF PHASE SPACE
In Section 18.9, we asserted without proof that a binary number α can always be found that will make the matrix (∂Y/∂p) defined in eqn (18.64) nonsingular. Proof of that fact requires an excursion into the geometry of phase space.
In the present reference chapter we define general abstract linear vector spaces and state some of their important properties. We then apply the general theory to define a vector space in phase space, and prove some theorems of importance for the theory of canonical transformations.
E.1 Abstract Vector Space
A linear vector space is a set of objects called vectors that obey the following axioms, abstracted from the properties of the three-dimensional Cartesian displacement vectors defined in Appendix A. In the following, we will denote vectors in an abstract vector space by the same bold, serif typeface as has been used for the special case of Cartesian threevectors.
1. Closure under addition and scalar multiplication: If v and w are any vectors, then addition is defined by some rule such that v + w is also a vector. If α is any scalar (scalars are assumed here to be real numbers) and v is any vector, then scalar multiplication is defined by some rule such that αv and vα are also vectors.
2. Commutativity: v + w = w + v and αv = vα.
3. Associativity of addition: (u + v) + w = u + (v + w).
4. Associativity of scalar multiplication: (αβ)v = α(βv).
5. Existence of additive identity: There is a vector ⊘ such that every vector v satisfies the equation v + ⊘ = v. Vector ⊘ is also referred to as the null vector.
6. Existence of additive inverse: For every vector v, there is a vector denoted −v that is a solution of the equation v + (−v) = ⊘.
7. Linearity in scalars: (α + β)v = αv + βv.
8. Linearity in vectors: α(v + w) = αv + αw.
9. Multiplication by one: 1v = v.
E.1.2 Derived properties
The axioms (see note 130) can be used to prove the following additional properties of linear vector spaces:
1. The additive identity is unique: If v + u = v for all vectors v, then u = ⊘.
2. The additive inverse is unique: If v + u = ⊘, then u = −v.
3. Law of cancellation: v + u = v + w implies u = w.
4. Unique solution: If u satisfies the equation v + u = w then u = w + (−v).
5. Multiplication of a vector by scalar zero: 0v = ⊘.
6. Multiplication of additive identity by a scalar: α⊘ = ⊘.
7. Multiplication by a negative number: (−1)v = −v.
8. Inverse of the inverse: −(−v) = v.
9. Simplified notations: Since the above axioms and additional properties imply that αv + β(−w) = αv + (−β)w, we need not be so careful about placement of the minus sign. Both of these equal expressions are commonly denoted simply as αv − βw. Thus the solution in derived property 4 is written u = w − v. Also, properties 5 and 6 allow one to drop the distinction between the scalar 0 and the additive identity ⊘. The equation in axiom 5 becomes v + 0 = v. Leading zeros are usually dropped; for example, 0 − v is written −v.
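As an illustration of how the derived properties follow from the axioms, derived property 5 can be obtained in a few lines. This is only one possible route, given as a sketch; the numbering of the axioms is the one used in the list above.

```latex
% Step 1: show that v + 0v = v.
\begin{align*}
\mathbf{v} + 0\mathbf{v} &= 1\mathbf{v} + 0\mathbf{v} && \text{(axiom 9)}\\
                         &= (1 + 0)\mathbf{v}         && \text{(axiom 7)}\\
                         &= 1\mathbf{v} = \mathbf{v}  && \text{(axiom 9)}
\end{align*}
% Step 2: add the inverse -v to both sides and regroup.
\begin{align*}
(-\mathbf{v}) + (\mathbf{v} + 0\mathbf{v}) &= (-\mathbf{v}) + \mathbf{v}\\
\bigl((-\mathbf{v}) + \mathbf{v}\bigr) + 0\mathbf{v} &= \oslash && \text{(axioms 3, 2, 6)}\\
\oslash + 0\mathbf{v} &= \oslash && \text{(axioms 2, 6)}\\
0\mathbf{v} &= \oslash && \text{(axioms 2, 5)}
\end{align*}
```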
E.1.3 Linear independence and dimension
The set of vectors v 1, v 2,…, v r is linearly dependent (LD) if and only if there exist scalars α1, α2,…, αr, not all zero, such that
$$\alpha_1 \mathbf{v}_1 + \alpha_2 \mathbf{v}_2 + \cdots + \alpha_r \mathbf{v}_r = 0 \tag{E.1}$$
In the opposite case, the set is linearly independent (LI) if and only if eqn (E.1) implies that α1 = α2 = … = αr = 0.
The vector space has dimension N if and only if it contains an LI set of N vectors, but every set in it with more than N vectors is LD. The dimension of a linear vector space is often indicated by labeling the space as VN.
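Let e 1, e 2,…, e N be any LI set of N vectors in a vector space VN of dimension N. Such an LI set is called a basis for the vector space, and is said to span it. Any vector x in VN can be expanded as
$$\mathbf{x} = x_1 \mathbf{e}_1 + x_2 \mathbf{e}_2 + \cdots + x_N \mathbf{e}_N = \sum_{i=1}^{N} x_i \mathbf{e}_i$$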
The numbers xi are called components of vector x relative to this basis. For a given basis, the components xi of a vector x are uniquely determined.
A vector space may have many different bases. But all of these bases will have the same number of LI vectors in them. And the vector space can be specified uniquely by listing any one of its bases.
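The definitions of LI sets and of dimension can be tried out numerically. The following is a minimal sketch, assuming that vectors are supplied as tuples of components relative to some basis; the helper names `rank` and `is_linearly_independent` are illustrative and are not taken from the text. A set is LI exactly when the rank of the list equals the number of vectors in it.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of equal-length vectors, by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def is_linearly_independent(vectors):
    """True when only the all-zero set of scalars satisfies eqn (E.1)."""
    return rank(vectors) == len(vectors)


# Hypothetical example data in V_3.
basis = [(1, 0, 0), (1, 1, 0), (1, 1, 1)]   # an LI set of three vectors
dependent = [(1, 2, 3), (2, 4, 6)]          # second vector is twice the first
```

Adding any fourth vector to `basis` gives an LD set, in line with the statement that every set with more than N vectors in VN is LD.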
Vector equations are equivalent to equations between components (in a given basis). The equation u = αv + βw is true if and only if ui = αvi + βwi is true for all i = 1,…, N. This latter relation is often written as an equation of column matrices. If one defines
$$[x] = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{pmatrix}$$
with a similar definition for other vectors, the equality of components may be written as [u] = α[v] + β[w].
Any LI set of vectors x 1, x 2,…, x m for m ≤ N can be extended to a basis in VN. If e 1, e 2,…, e N is some known basis for VN, then (N − m) vectors e k(m+1),…, e kN can be selected from it so that x 1,…, x m, e k(m+1),…, e kN is an LI set of N vectors and so forms a basis in VN.
E.2 Subspaces
A subspace of VN is a linear vector space all of whose member vectors are also members of VN. Denoting a subspace as ℝ, it follows from the closure axiom for vector spaces that if x and y are members of ℝ, then (αx + βy) must also be a member of ℝ. Thus subspaces are different from subsets in set theory. If one constructs a subspace by listing a set of vectors, then the subspace must also contain all linear sums of those vectors and not just the vectors in the original list.
Subspaces may be specified by listing an LI set of vectors that spans the space. For example, if VN has a basis e 1,…, e N then the set of all vectors x = αe 1 + βe 4+ γ e 5, where α, β, γ may take any values, forms a subspace.
Subspaces may also be specified by stating some criterion for the inclusion of vectors in the subspace. For example, the set of threevectors with components (0, y, z) forms a subspace of V 3, but the set with components (1, y, z) does not, since the sum of two such vectors does not have a 1 as its first component.
The null vector is a member of every subspace. If ℝ contains only the null vector, then we write ℝ = 0.
E.2.1 Linear Sum and Intersection
If ℝ and S are subspaces of VN, then we define their linear sum (ℝ + S) as the set of all vectors x = r + s where r ∈ ℝ (which is read as “r is a member of ℝ”) and s ∈ S. The linear sum (ℝ + S) of two subspaces is itself a subspace of VN.
If ℝ and S are subspaces of VN, then we define their intersection (ℝ ∩ S) as the set of all vectors x that are members of both subspaces, x ∈ ℝ and x ∈ S. The intersection (ℝ ∩ S) of two subspaces is itself a subspace of VN (see note 131).
Subspaces are vector spaces. Just as was done for VN, the dimension NR of ℝ is defined as the number of vectors in the largest LI set it contains, and this set is then a basis for ℝ. The notation dim(ℝ) = NR will also be used, meaning that the dimension of ℝ is NR. This notation can be used to write the dimension rule,
$$\dim(\mathbb{R} + S) + \dim(\mathbb{R} \cap S) = \dim(\mathbb{R}) + \dim(S)$$
If (ℝ ∩ S) = 0 then we say that ℝ and S are disjoint. Two subspaces are disjoint when their only common vector is the null vector.
In the special case in which (ℝ + S) = VN and (ℝ ∩ S) = 0, we say that ℝ and S are complementary subspaces, or complements relative to VN. Complementary subspaces are disjoint and their linear sum is the whole of the space VN. Since the empty subspace (consisting only of the null vector) has zero dimension by definition, the dimension rule gives N = NR + NS for complements.
If ℝ and S are complements, and if ℝ has a basis r 1,…, r NR and S has a basis s 1,…, s NS, then the set r 1,…, r NR, s 1,…, s NS is an LI set and forms a basis for VN. It follows that every vector x ∈ VN can then be written as x = r + s where r and s are unique vectors in ℝ and S, respectively.
Conversely, suppose that VN has some basis e 1, e 2,…, e N and we segment that basis into two disjoint LI sets of vectors, say e 1,…, e n and e n+1,…, e N. Then if we define ℝ to be the subspace spanned by e 1,…, e n and define S to be the subspace spanned by e n+1,…, e N, it follows that ℝ and S are complements relative to VN.
Every subspace ℝ has some subspace S such that ℝ and S are complements. But this S is not unique. So it is incorrect to speak of “the” complement of a given subspace.
The notation ℝ ⊃ S means that every x ∈ S is also a member of ℝ. We say that ℝ contains S. Hence (ℝ + S) ⊃ S, (ℝ + S) ⊃ ℝ, ℝ ⊃ (ℝ ∩ S), S ⊃ (ℝ ∩ S).
E.3 Linear Operators
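A linear operator 𝒜 in vector space VN operates on vectors x ∈ VN and produces other vectors y ∈ VN. We write this operation as y = 𝒜x. The assumed linearity property is
$$\mathcal{A}(\alpha \mathbf{x} + \beta \mathbf{y}) = \alpha\, \mathcal{A}\mathbf{x} + \beta\, \mathcal{A}\mathbf{y}$$
It follows that, for a given basis e 1,…, e N, there is a unique N-rowed square matrix A associated with each operator 𝒜. Since any x ∈ VN may be expanded in the basis, as x = x 1 e 1 + ··· + xN e N, it follows from linearity that
$$\mathbf{y} = \mathcal{A}\mathbf{x} = x_1 \mathcal{A}\mathbf{e}_1 + \cdots + x_N \mathcal{A}\mathbf{e}_N$$
But 𝒜e k is also a vector in VN and hence may also be expanded in the basis, as
$$\mathcal{A}\mathbf{e}_k = \sum_{i=1}^{N} \mathbf{e}_i A_{ik}$$
where the matrix elements Aik are uniquely determined. Then
$$\sum_{i=1}^{N} y_i \mathbf{e}_i = \mathbf{y} = \sum_{k=1}^{N} x_k \mathcal{A}\mathbf{e}_k = \sum_{i=1}^{N} \left( \sum_{k=1}^{N} A_{ik} x_k \right) \mathbf{e}_i$$
together with the uniqueness of the expansion of y in the basis implies that
$$y_i = \sum_{k=1}^{N} A_{ik} x_k \qquad \text{or} \qquad [y] = A\,[x]$$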
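The construction of the matrix from the action of the operator on the basis vectors can be sketched in a few lines. The sketch assumes the standard basis of VN and uses a made-up linear operator `shear`; the names `matrix_of` and `apply` are likewise illustrative only. Column k of the matrix holds the components of the operator applied to e k.

```python
def matrix_of(op, n):
    """Build the n-by-n matrix of a linear operator from its action on the basis."""
    columns = []
    for k in range(n):
        e_k = [1 if i == k else 0 for i in range(n)]
        columns.append(op(e_k))          # components of op(e_k)
    # Element A[i][k] is the i-th component of op(e_k).
    return [[columns[k][i] for k in range(n)] for i in range(n)]


def apply(A, x):
    """Component form y_i = sum_k A_ik x_k."""
    return [sum(A[i][k] * x[k] for k in range(len(x))) for i in range(len(A))]


def shear(x):
    """A hypothetical linear operator on V_2, used only as an example."""
    return [x[0] + 2 * x[1], 3 * x[1]]


A = matrix_of(shear, 2)
x = [5, -1]
y_from_matrix = apply(A, x)
y_from_operator = shear(x)
```

Both routes give the same components, which is the content of the relation [y] = A[x].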
E.3.1 Nonsingular Operators
Since canonical transformations are nonsingular, we will be interested here in nonsingular operators. By definition, an operator 𝒜 is nonsingular if it satisfies the condition that 𝒜x = 0 only when x = 0.
An operator is nonsingular if and only if it possesses an inverse. For nonsingular operators, the equation y = 𝒜x can be solved uniquely for x. This inverse relation is denoted as x = 𝒜⁻¹y where the operator 𝒜⁻¹ is called the inverse of 𝒜. The inverse 𝒜⁻¹ is also nonsingular.
In matrix form, the definition of a nonsingular operator translates to the requirement that A[x] = 0 only when [x] = 0. In Corollary B.19.2, such nonsingular matrices were shown to have a nonzero determinant and hence to possess an inverse matrix A⁻¹. If A is the matrix of 𝒜 in some basis, then A⁻¹ is the matrix of the inverse operator 𝒜⁻¹.
E.3.2 Nonsingular Operators and Subspaces
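If ℝ is a subspace of VN and 𝒜 is a nonsingular linear operator, then 𝒜ℝ is also a subspace. It consists of all vectors y = 𝒜x where x ∈ ℝ. The following properties hold for nonsingular operators acting on subspaces:
$$\mathcal{A}(\mathbb{R} + S) = \mathcal{A}\mathbb{R} + \mathcal{A}S, \qquad \mathcal{A}(\mathbb{R} \cap S) = (\mathcal{A}\mathbb{R}) \cap (\mathcal{A}S) \tag{E.10}$$
If r 1,…, r NR are an LI set and form a basis in ℝ, then 𝒜r 1,…, 𝒜r NR are an LI set and form a basis in 𝒜ℝ. Thus, nonsingular operators also have the property
$$\dim(\mathcal{A}\mathbb{R}) = \dim(\mathbb{R})$$
Combined with eqn (E.10), this gives
$$\text{if } \mathbb{R} \text{ and } S \text{ are complements, then } \mathcal{A}\mathbb{R} \text{ and } \mathcal{A}S \text{ are also complements} \tag{E.12}$$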
E.4 Vectors in Phase Space
A set of values for the variables q 0,…, qD, p 0,…, pD of extended phase space defines a point in what is called a differentiable manifold. We want to establish a linear vector space such that the differentials of these variables dq 0,…, dqD, dp 0,…, dpD are components of vectors in an abstract linear vector space of dimension 2D + 2 (see note 132). Introducing basis vectors q 0,…, q D, p 0,…, p D, a typical vector may be written as
$$d\boldsymbol{\gamma} = \sum_{i=0}^{D} \left( dq_i\, \mathbf{q}_i + dp_i\, \mathbf{p}_i \right) = \sum_{i=0}^{2D+1} d\gamma_i\, \boldsymbol{\gamma}_i \tag{E.14}$$
where q 0,…, q D =γ0,…, γD and p 0,…, p D =γ(D+1),…, γ(2D+1).
So far in this chapter, we have made no mention of inner products of vectors. The natural definition for the inner product in phase space is what will be called the symplectic inner product, or symplectic metric. It can be defined by defining the products of the basis vectors to be, for i, j = 0,…, (2D + 1),
$$\boldsymbol{\gamma}_i \circ \boldsymbol{\gamma}_j = s_{ij} \tag{E.15}$$
where sij are the matrix elements of the symplectic matrix s defined in Section 17.4. We use the small circle between the vectors rather than the usual dot to emphasize that this is not the standard Cartesian form of the inner product.
Inner products or metrics introduced into linear vector spaces are required to obey certain properties.
1. Linearity: x · (αy + βz) = αx · y + βx · z.
2. Non-degeneracy: x · y = 0 for all x ∈ VN if and only if y = 0.
3. Symmetry: x · y = y · x.
4. Positive definiteness: If x ≠ 0 then x· x > 0.
The inner product defined in eqn (E.15) is assumed to have the first property. It has the second property since the matrix s is nonsingular. But it does not satisfy properties 3 and 4. Equation (E.15) therefore does not define a proper metric but what is referred to in mathematics texts as a structure function. However, we will continue to refer to it as a metric, following the somewhat looser usage in physics. (The inner product of two timelike or lightlike fourvectors in Minkowski space, for example, violates property 4 and yet gij is generally called a metric.)
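Using the assumed linearity of the inner product, we can write the inner product of any two vectors as
$$d\boldsymbol{\gamma} \circ d\boldsymbol{\phi} = \sum_{i=0}^{2D+1} \sum_{j=0}^{2D+1} d\gamma_i \left( \boldsymbol{\gamma}_i \circ \boldsymbol{\gamma}_j \right) d\phi_j = \sum_{i=0}^{2D+1} \sum_{j=0}^{2D+1} d\gamma_i\, s_{ij}\, d\phi_j \tag{E.16}$$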
from which it follows that properties 3 and 4 above are replaced by:
3′. Anti-symmetry: dγ ◦ dϕ =−dϕ ◦ dγ
4′. Nullity: dγ ◦ dγ = 0 for every vector dγ.
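Properties 3′ and 4′ can be checked directly from the component form of eqn (E.16). The sketch below assumes that the symplectic matrix has the block form with +U in the upper-right block and −U in the lower-left block, and that components are ordered as (dq 0,…, dqD, dp 0,…, dpD); both are assumptions about the conventions of Section 17.4, and the function names are illustrative. The two properties hold for either overall sign of s.

```python
def symplectic_matrix(D):
    """Assumed block form s = [[0, U], [-U, 0]] with (D+1)-rowed blocks."""
    n = D + 1
    s = [[0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        s[i][n + i] = 1       # upper-right block +U
        s[n + i][i] = -1      # lower-left block -U
    return s


def symplectic_product(x, y, s):
    """Component form of eqn (E.16): sum over i, j of x_i s_ij y_j."""
    size = len(x)
    return sum(x[i] * s[i][j] * y[j] for i in range(size) for j in range(size))


s = symplectic_matrix(1)          # D = 1, so vectors have four components
d_gamma = [1, 2, 3, 4]            # hypothetical (dq0, dq1, dp0, dp1)
d_phi = [5, 6, 7, 8]

forward = symplectic_product(d_gamma, d_phi, s)
backward = symplectic_product(d_phi, d_gamma, s)
self_product = symplectic_product(d_gamma, d_gamma, s)
```

Here `forward` and `backward` differ only in sign, and `self_product` vanishes, as properties 3′ and 4′ require.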
E.5 Canonical Transformations in Phase Space
In Section 17.3 a canonical transformation q, p → Q, P was written in symplectic coordinates as γ → Γ. Then a (2D + 2) × (2D + 2) Jacobian matrix with components Jij = ∂Γi(γ)/∂γj was defined. It follows from the chain rule that the differentials of the symplectic coordinates transform with this Jacobian matrix J just as the derivatives were shown to do in eqn (17.24),
$$d\Gamma_i = \sum_{j=0}^{2D+1} J_{ij}\, d\gamma_j \tag{E.17}$$
In discussing canonical transformations in phase space it is simplest to adopt the active definition of canonical transformations discussed in Section 18.14. Thus the transformation of differentials in eqn (E.17) is assumed to transform the vector dγ in eqn (E.14) into a new vector
$$d\boldsymbol{\Gamma} = \mathcal{J}\, d\boldsymbol{\gamma} = \sum_{i=0}^{2D+1} d\Gamma_i\, \boldsymbol{\gamma}_i$$
in the same basis. As is always the case for active transformations, the vector dγ is transformed into a new vector dΓ while the basis vectors γ i are not transformed.
The symplectic inner product defined in Section E.4 has the following important property.
Theorem E.5.1: Invariance of Symplectic Inner Product
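The symplectic inner product is invariant under canonical transformations. If 𝒥 is the operator of any canonical transformation and dγ and dϕ are any two phase-space vectors, then
$$(\mathcal{J}\, d\boldsymbol{\gamma}) \circ (\mathcal{J}\, d\boldsymbol{\phi}) = d\boldsymbol{\gamma} \circ d\boldsymbol{\phi} \tag{E.19}$$
Proof: Writing eqn (E.19) in component form, we want to prove that
$$(\mathcal{J}\, d\boldsymbol{\gamma}) \circ (\mathcal{J}\, d\boldsymbol{\phi}) = \sum_{i,j,k,l} \left( J_{ik}\, d\gamma_k \right) s_{ij} \left( J_{jl}\, d\phi_l \right) = \sum_{k,l} d\gamma_k \left( J^{\mathrm{T}} s\, J \right)_{kl} d\phi_l$$
is equal to
$$d\boldsymbol{\gamma} \circ d\boldsymbol{\phi} = \sum_{k,l} d\gamma_k\, s_{kl}\, d\phi_l$$
The Lagrange bracket condition, eqn (17.54), states that any canonical transformation has Jᵀ s J = s, which proves the present theorem.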
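The theorem can be illustrated with a linear canonical transformation, whose Jacobian matrix is constant. The matrix `J` below is a hypothetical example (a shear with a symmetric upper-right block), and the block form assumed for s is again +U in the upper-right block and −U in the lower-left block; neither is taken from the text. The sketch first checks the condition Jᵀ s J = s and then the invariance of the inner product for two sample vectors.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def apply(M, x):
    return [sum(M[i][k] * x[k] for k in range(len(x))) for i in range(len(M))]


def symplectic_product(x, y, s):
    size = len(x)
    return sum(x[i] * s[i][j] * y[j] for i in range(size) for j in range(size))


# Assumed symplectic matrix for D = 1.
s = [[0, 0, 1, 0],
     [0, 0, 0, 1],
     [-1, 0, 0, 0],
     [0, -1, 0, 0]]

# Hypothetical linear canonical transformation: Q = q + B p, P = p, B symmetric.
J = [[1, 0, 1, 2],
     [0, 1, 2, 5],
     [0, 0, 1, 0],
     [0, 0, 0, 1]]

lagrange_ok = matmul(matmul(transpose(J), s), J) == s

d_gamma = [1, 2, 3, 4]
d_phi = [5, 6, 7, 8]
before = symplectic_product(d_gamma, d_phi, s)
after = symplectic_product(apply(J, d_gamma), apply(J, d_phi), s)
```

With these sample values, `before` and `after` agree, as eqn (E.19) requires.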
E.6 Orthogonal Subspaces
If two phase-space vectors have a zero inner product dγ ◦ dϕ = 0 then we will say that they are orthogonal. But we must note that “orthogonality” in a vector space with a symplectic metric will not have the same geometrical meaning as in the space of threevectors. For example, as seen in property 4′ of Section E.4, every vector is orthogonal to itself! We will adopt the notation dγ ⊥ dϕ to indicate that the vectors are orthogonal and obey dγ ◦ dϕ = 0. Thus dγ ⊥ dγ for every vector dγ.
This idea of orthogonality can be extended to subspaces. If we have two subspaces ℝ and S then we will write ℝ ⊥ S if d r ◦ d s = 0 for every d r ∈ ℝ and d s ∈ S. Also a subspace can be self-orthogonal, with ℝ ⊥ ℝ, which means that d x ◦ d y = 0 for every d x and d y in ℝ.
It follows from Theorem E.5.1 that mutual- and self-orthogonality of subspaces is invariant under canonical transformations. If, as we did in Section E.3, we denote by 𝒥ℝ the set of all vectors d x = 𝒥 d r where d r ∈ ℝ, then it follows that for any canonical transformation 𝒥 and any subspaces ℝ and S
$$\mathbb{R} \perp S \quad \text{implies} \quad (\mathcal{J}\mathbb{R}) \perp (\mathcal{J}S)$$
Thus, in the special case in which ℝ = S, we also have that
$$\mathbb{R} \perp \mathbb{R} \quad \text{implies} \quad (\mathcal{J}\mathbb{R}) \perp (\mathcal{J}\mathbb{R}) \tag{E.23}$$
for any subspace ℝ.
E.7 A Special Canonical Transformation
In Section 18.6 we defined a transformation (linear, with constant coefficients) from the variables Q, P to the new variables X, Y. Taking the differentials of the defining eqn (18.40) it can be written as
$$dX_k = \bar{\alpha}_k\, dQ_k - \alpha_k\, dP_k, \qquad dY_k = \alpha_k\, dQ_k + \bar{\alpha}_k\, dP_k \tag{E.25}$$
for k = 0,…, D.
If we define symplectic coordinates dΓ0,…, dΓ(2D+1) = dQ 0,…, dQD, dP 0,…, dPD as was done in Section 17.3 and also define dΛ0,…, dΛ(2D+1) = dX 0,…, dXD, dY 0,…, dYD using a similar pattern, then eqn (E.25) may be written as
$$[d\Lambda] = A\, [d\Gamma] \tag{E.26}$$
where the (2D + 2) × (2D + 2) matrix A is composed of diagonal (D + 1) × (D + 1) blocks a and ā. These blocks are defined in terms of the αk binary digits in eqn (18.38) by aij = δijαi and āij = δijᾱj. The matrix A is
$$A = \begin{pmatrix} \bar{a} & -a \\ a & \bar{a} \end{pmatrix}$$
As the reader may verify, the matrix A is both canonical (obeying the definition eqn (17.37), for example) and orthogonal. Thus the operator 𝒜 is nonsingular, and defines an active canonical transformation. The inverse matrix is
$$A^{-1} = A^{\mathrm{T}} = \begin{pmatrix} \bar{a} & a \\ -a & \bar{a} \end{pmatrix}$$
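The verification left to the reader can be sketched numerically. The code below assumes the block layout with ā on the diagonal, −a in the upper-right block and a in the lower-left block, where a and ā are diagonal matrices of the digits αk and ᾱk = 1 − αk; it also assumes the block form of s with +U in the upper-right block. These conventions, and the function names, are assumptions of this sketch rather than statements taken from the text.

```python
def special_matrix(alpha):
    """Matrix A for binary digits alpha = [alpha_0, ..., alpha_D] (assumed layout)."""
    n = len(alpha)
    A = [[0] * (2 * n) for _ in range(2 * n)]
    for k, a_k in enumerate(alpha):
        abar_k = 1 - a_k
        A[k][k] = abar_k          # upper-left block: abar
        A[k][n + k] = -a_k        # upper-right block: -a
        A[n + k][k] = a_k         # lower-left block: a
        A[n + k][n + k] = abar_k  # lower-right block: abar
    return A


def symplectic_matrix(n):
    s = [[0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        s[i][n + i] = 1
        s[n + i][i] = -1
    return s


def identity(size):
    return [[1 if i == j else 0 for j in range(size)] for i in range(size)]


def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


alpha = [1, 0, 1]                 # hypothetical digits for D = 2
A = special_matrix(alpha)
s = symplectic_matrix(len(alpha))

is_orthogonal = matmul(A, transpose(A)) == identity(2 * len(alpha))
is_canonical = matmul(matmul(A, s), transpose(A)) == s
```

Both checks succeed for every choice of binary digits, because each digit contributes either an identity block or a quarter-turn block in the corresponding (qk, pk) plane.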
E.8 Special Self-Orthogonal Subspaces
Before proving the main theorems of this chapter, we require some preliminary definitions and lemmas.
1. Define a subspace ℚ to be all vectors of the form dq 0 q 0 + ··· + dqD q D, with all dp components zero. As can be confirmed using eqn (E.16), ℚ ⊥ ℚ. Since ℚ is spanned by (D + 1) basis vectors, dim(ℚ) = D + 1.
2. Define a subspace ℙ to be all vectors of the form dp 0 p 0 + ··· + dpD p D, with all dq components zero. As can be confirmed using eqn (E.16), ℙ ⊥ ℙ. Since ℙ is spanned by (D + 1) basis vectors, dim(ℙ) = D + 1.
3. Given any canonical transformation 𝒥, define the subspaces M = 𝒥ℚ and ℕ = 𝒥ℙ. It follows from the results in Section E.6 that M ⊥ M, ℕ ⊥ ℕ, and dim(M) = D + 1 = dim(ℕ).
4. Given any canonical transformation 𝒥, and the special canonical transformation 𝒜 for any choice of the αk, define subspaces W = 𝒜M = 𝒜𝒥ℚ and ℤ = 𝒜ℕ = 𝒜𝒥ℙ. It follows from the results in Section E.6 that W ⊥ W, ℤ ⊥ ℤ, and dim(W) = D + 1 = dim(ℤ).
5. Given the special canonical transformation 𝒜 for any choice of the αk, define the subspaces E = 𝒜⁻¹ℚ and F = 𝒜⁻¹ℙ. It follows from the results in Section E.6 that E ⊥ E, F ⊥ F, and dim(E) = D + 1 = dim(F).
6. By construction ℚ ∩ ℙ = 0 and ℚ + ℙ = V(2D+2) where V(2D+2) denotes the whole of the vector space, of dimension (2D + 2). Thus ℚ and ℙ are complements relative to V(2D+2). It then follows from eqn (E.12) that the following pairs are also complements: M and ℕ, W and ℤ, E and F.
7. Define an operator S whose associated matrix is s, the symplectic matrix. Then, as can be verified by writing out the component equation, ℙ = Sℚ. From the definitions in property 5 (E = 𝒜⁻¹ℚ and F = 𝒜⁻¹ℙ) it follows that F = 𝒜⁻¹ℙ = 𝒜⁻¹Sℚ = 𝒜⁻¹S𝒜E = 𝒜ᵀS𝒜E = SE, where we used the orthogonality of A to write A⁻¹ = Aᵀ and the Lagrange bracket condition eqn (17.54) to write Aᵀ s A = s for the canonical transformation 𝒜.
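The scheme of subspace relations can be summarized as
$$\begin{array}{ccccccc}
E & \xrightarrow{\;\mathcal{A}\;} & \mathbb{Q} & \xrightarrow{\;\mathcal{J}\;} & M & \xrightarrow{\;\mathcal{A}\;} & W \\
F & \xrightarrow{\;\mathcal{A}\;} & \mathbb{P} & \xrightarrow{\;\mathcal{J}\;} & \mathbb{N} & \xrightarrow{\;\mathcal{A}\;} & \mathbb{Z}
\end{array}$$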
where subspaces in the upper row and the corresponding subspaces in the lower row are all complements.
The following lemma will also be required.
Lemma E.8.1: Maximum Dimension of Self-Orthogonal Subspace
If ℝ is any self-orthogonal subspace, with ℝ ⊥ ℝ, then ℝ ∩ (Sℝ) = 0 and NR = dim(ℝ) ≤ (D + 1).
Proof: The operator S defined in property 7 above is nonsingular and is a canonical transformation. Its nonsingularity is proved in Section 17.4. As is proved in eqn (17.29), S is also orthogonal, with S⁻¹ = Sᵀ. Thus S S Sᵀ = S, which shows that S satisfies the Poisson bracket condition eqn (17.37) and is a canonical transformation.
Let d x be any vector in ℝ∩ (Sℝ),so that d x is in both ℝ and Sℝ. Since d x∈ℝ, it follows that S d x is in Sℝ. But d x is also in Sℝ. Hence both d x and S d x are in Sℝ.
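It follows from eqn (E.23) that Sℝ ⊥ Sℝ. Hence d x ◦ S d x = 0 must hold. Writing this equation out in terms of components and using (s)² = −U gives
$$d\mathbf{x} \circ S\, d\mathbf{x} = \sum_{i,j,k} dx_i\, s_{ij}\, s_{jk}\, dx_k = -\sum_{i=0}^{2D+1} (dx_i)^2 = 0$$
which implies dxi = 0 for every i value. Hence d x is the null vector, which implies that ℝ ∩ (Sℝ) = 0.
Since ℝ and Sℝ are disjoint, and since the nonsingular operator S gives dim(Sℝ) = dim(ℝ) = NR, the dimension rule gives dim(ℝ + Sℝ) = 2NR. But every subspace in V(2D+2) must have a dimension less than or equal to (2D + 2). Thus 2NR ≤ (2D + 2) and so NR ≤ (D + 1), as was to be proved.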
E.9 Arnold’s Theorem
The following theorem allows us to prove that every canonical transformation has a mixed generating function.
Theorem E.9.1: Arnold’s Theorem
If ℕ is any self-orthogonal subspace of dimension (D + 1), there is some choice of the binary digits αk used to define 𝒜 in Section E.7 such that ℕ and the self-orthogonal subspace E defined in property 5 of Section E.8 are disjoint,
$$\mathbb{N} \cap E = 0 \tag{E.32}$$
Starting with ℕ and the self-orthogonal subspace ℙ defined in property 2 of Section E.8, consider the intersection ℕ ∩ ℙ. This intersection is a self-orthogonal subspace since (ℕ ∩ ℙ) ⊂ ℙ and ℙ is self-orthogonal. Denote the dimension of ℕ ∩ ℙ by n. It thus has a basis consisting of an LI set of vectors x 0,…, x (n−1). Since (ℕ ∩ ℙ) ⊂ ℙ, this basis can be extended to a basis for ℙ by adding vectors p kn,…, p kD selected from the full ℙ basis p 0,…, p D as needed. Then
$$\mathbf{x}_0, \ldots, \mathbf{x}_{(n-1)},\ \mathbf{p}_{k_n}, \ldots, \mathbf{p}_{k_D} \tag{E.33}$$
are (D + 1) linearly independent vectors that form a basis for ℙ.
Choose the binary digits αk so that αk0 = ··· = αk(n−1) = 0 and αkn = ··· = αkD = 1, where k 0,…, k (n−1) are all those indices not selected for the p ki in eqn (E.33). (The k 0,…, kD are therefore an arrangement of the integers 0, 1,…, D.) With this definition of the αk, the definitions in Section E.7 may be used to verify that the (D + 1)-dimensional subspace E = 𝒜⁻¹ℚ will be spanned by the basis
$$\mathbf{q}_{k_0}, \ldots, \mathbf{q}_{k_{(n-1)}},\ \mathbf{p}_{k_n}, \ldots, \mathbf{p}_{k_D} \tag{E.34}$$
Using this definition of E, we now proceed to the proof of eqn (E.32).
Inspection of eqn (E.34) shows that the intersection E ∩ ℙ is a self-orthogonal subspace of dimension (D + 1 − n) spanned by p kn,…, p kD. Also, we have already seen that ℕ ∩ ℙ is a self-orthogonal subspace of dimension n spanned by x 0,…, x (n−1). Thus, eqn (E.33) shows that ℙ has a basis consisting of a basis of ℕ ∩ ℙ concatenated with a basis of E ∩ ℙ. It follows from the discussion in Section E.2.1 that ℕ ∩ ℙ and E ∩ ℙ are complements relative to ℙ and that
$$(\mathbb{N} \cap \mathbb{P}) + (E \cap \mathbb{P}) = \mathbb{P}, \qquad (\mathbb{N} \cap \mathbb{P}) \cap (E \cap \mathbb{P}) = 0 \tag{E.35}$$
Since (ℕ ∩ ℙ) ⊂ ℙ and ℙ is self-orthogonal, it follows that (ℕ ∩ ℙ) ⊥ ℙ. Similarly, (E ∩ ℙ) ⊥ ℙ. Also, since ℕ ∩ E and ℕ ∩ ℙ are both contained in the self-orthogonal subspace ℕ, it follows that (ℕ ∩ E) ⊥ (ℕ ∩ ℙ). Similarly, (ℕ ∩ E) ⊥ (E ∩ ℙ) since E is self-orthogonal. Therefore
$$(\mathbb{N} \cap E) \perp \bigl( (\mathbb{N} \cap \mathbb{P}) + (E \cap \mathbb{P}) \bigr) = \mathbb{P}$$
Now ℙ is a self-orthogonal subspace of dimension (D + 1). If ℕ ∩ E ≠ 0, and if it is not true that (ℕ ∩ E) ⊂ ℙ, then there will be a vector not in ℙ that is symplectically orthogonal to all vectors in ℙ. Together with ℙ, this vector would span a self-orthogonal subspace of dimension (D + 2), which is proved impossible by Lemma E.8.1. Thus, whether ℕ ∩ E = 0 or not, it is true that
$$(\mathbb{N} \cap E) \subset \mathbb{P}$$
Thus, using eqn (E.35),
$$\mathbb{N} \cap E = (\mathbb{N} \cap E) \cap \mathbb{P} = (\mathbb{N} \cap \mathbb{P}) \cap (E \cap \mathbb{P}) = 0$$
as was to be proved.
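For small D the theorem can be explored by brute force over all binary numbers α. The sketch below assumes the convention in which a digit αk = 0 places q k in E and a digit αk = 1 places p k in E (the sign of the basis vector does not affect the span); it also uses the fact that two subspaces with LI bases are disjoint exactly when the concatenated bases are LI. The subspace `N_basis` is a hypothetical self-orthogonal example for D = 1, spanned by q 0 and p 1, and all function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def rank(vectors):
    """Rank of a list of equal-length vectors, by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def E_basis(alpha):
    """Basis of E under the assumed convention: q_k if alpha_k = 0, else p_k."""
    n = len(alpha)
    basis = []
    for k, a_k in enumerate(alpha):
        v = [0] * (2 * n)
        v[k if a_k == 0 else n + k] = 1
        basis.append(v)
    return basis


def disjoint_choices(N_basis):
    """All digit tuples (alpha_0, ..., alpha_D) for which N and E share only 0."""
    n = len(N_basis)
    return [alpha for alpha in product((0, 1), repeat=n)
            if rank(N_basis + E_basis(alpha)) == 2 * n]


# Components ordered as (q0, q1, p0, p1); N is spanned by q0 and p1.
N_basis = [[1, 0, 0, 0], [0, 0, 0, 1]]
choices = disjoint_choices(N_basis)   # [(1, 0)]
```

For this example only α0 = 1, α1 = 0 works, which agrees with the construction in the proof: ℕ ∩ ℙ is spanned by p 1, the basis of ℙ is completed with p 0, and E is then spanned by q 1 and p 0.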
E.10 Existence of a Mixed Generating Function
In Section 18.9 we demonstrated that every canonical transformation can be generated by a mixed generating function F(q, Y). A crucial point in that demonstration was the assertion, without proof, that the binary number α defined in Section 18.6 can always be chosen so that the matrix (∂Y/∂p) defined in eqn (18.64) is nonsingular. Using Arnold’s theorem, we can now provide the proof of this assertion.
Theorem E.10.1: Existence of Mixed Generating Function
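Assuming a general canonical transformation q, p → Q, P, the variables X, Y were defined in eqn (18.63) as
$$X_k = \bar{\alpha}_k\, Q_k(q, p) - \alpha_k\, P_k(q, p), \qquad Y_k = \alpha_k\, Q_k(q, p) + \bar{\alpha}_k\, P_k(q, p)$$
The digits αk of the binary number α = αD ··· α1α0 used in this definition can always be chosen so that the (D + 1)-dimensional square matrix (∂Y/∂p) defined by its matrix elements
$$\left( \frac{\partial Y}{\partial p} \right)_{ij} = \frac{\partial Y_i(q, p)}{\partial p_j}$$
will have a nonzero determinant and hence be nonsingular.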
Proof: As in Theorem E.9.1, we continue to use the subspace definitions from Section E.8. From Arnold’s theorem, we know that there is some choice of α such that ℕ ∩ E = 0. Multiplying by the nonsingular operator 𝒜 defined in Section E.7 and using eqn (E.10) gives
$$\mathcal{A}(\mathbb{N} \cap E) = (\mathcal{A}\mathbb{N}) \cap (\mathcal{A}E) = \mathbb{Z} \cap \mathbb{Q} = 0 \tag{E.41}$$
By its definition in property 4 of Section E.8, ℤ = 𝒜𝒥ℙ where ℙ is the subspace defined in property 2, consisting of vectors of the form
$$\sum_{i=0}^{D} dp_i\, \mathbf{p}_i$$
Using the definition of symplectic coordinates from eqn (E.26), vectors in ℤ are
$$\sum_{i=0}^{D} \left( dX_i^{(p)}\, \mathbf{q}_i + dY_i^{(p)}\, \mathbf{p}_i \right)$$
where the components come from the matrix equation
$$[dX^{(p)}] = \left( \frac{\partial X}{\partial p} \right) [dp], \qquad [dY^{(p)}] = \left( \frac{\partial Y}{\partial p} \right) [dp] \tag{E.44}$$
Now suppose that at least one of the dpi is nonzero, so that [dp] ≠ 0. If [dY(p)] = 0 in that case, it would follow that a vector in ℤ would be of the form
$$\sum_{i=0}^{D} dX_i^{(p)}\, \mathbf{q}_i$$
Since 𝒜𝒥 is nonsingular and [dp] ≠ 0, this vector is not the null vector. But such a vector is entirely expressed in the q 0,…, q D basis and therefore is a vector in ℚ. This is impossible, since, by eqn (E.41), there are no non-null vectors in both ℤ and ℚ. Therefore it is impossible to have [dp] ≠ 0 and [dY(p)] = 0.
We have proved above that [dY(p)] = 0 implies [dp] = 0. By Corollary B.19.2, this is the necessary and sufficient condition for (∂Y/∂p) in eqn (E.44) to be nonsingular and have a nonzero determinant. Thus |∂Y/∂p| ≠ 0, as was to be proved.
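The statement of the theorem can be illustrated for a linear canonical transformation, for which the Jacobian matrix is constant. The sketch assumes the convention Yk = αk Qk + ᾱk Pk, so that row i of (∂Y/∂p) is αi times row i of (∂Q/∂p) plus ᾱi times row i of (∂P/∂p). The matrix `J` is a hypothetical example for D = 1 that exchanges the pair with index 0 (Q0 = p0, P0 = −q0) and leaves the pair with index 1 unchanged; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def rank(vectors):
    """Rank of a list of equal-length vectors, by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def dY_dp(J, alpha):
    """Matrix (dY/dp) built from the upper-right and lower-right blocks of J."""
    n = len(alpha)
    return [[alpha[i] * J[i][n + j] + (1 - alpha[i]) * J[n + i][n + j]
             for j in range(n)] for i in range(n)]


def good_alphas(J):
    """All digit tuples (alpha_0, ..., alpha_D) making (dY/dp) nonsingular."""
    n = len(J) // 2
    return [alpha for alpha in product((0, 1), repeat=n)
            if rank(dY_dp(J, alpha)) == n]


# Rows give (Q0, Q1, P0, P1); columns give (q0, q1, p0, p1).
J = [[0, 0, 1, 0],    # Q0 = p0
     [0, 1, 0, 0],    # Q1 = q1
     [-1, 0, 0, 0],   # P0 = -q0
     [0, 0, 0, 1]]    # P1 = p1

allowed = good_alphas(J)   # [(1, 0)]
```

The choice α = 0 fails here because (∂P/∂p) is singular, but α0 = 1, α1 = 0 gives a nonsingular (∂Y/∂p), so a mixed generating function F(q, Y) exists with Y0 = Q0 and Y1 = P1.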
(130) See Chapter 2 of Mirsky (1961) or Chapter 7 of Birkhoff and MacLane (1977). Different texts differ as to which are the axioms and which are the derived properties. However, there is general agreement that linear vector spaces do satisfy the whole list of axioms and derived properties given here.
(131) It might appear that a distributive rule would hold for linear sums and intersections, but it does not. In general ℝ ∩ (S + T) ≠ ℝ ∩ S + ℝ ∩ T. For example, consider vectors in the Cartesian x-y plane V 2. Let ℝ be the one-dimensional subspace with Cartesian components (r, r), S with (s, 0), and T with (0, t). Then ℝ ∩ S = 0 and ℝ ∩ T = 0, so (ℝ ∩ S + ℝ ∩ T) = 0. But ℝ ∩ (S + T) = ℝ since (S + T) spans the whole of V 2.
(132) In modern differential geometry, a curve through a given point of the manifold is defined by writing γi = γi(β) for each coordinate i = 0,…,(2D + 1). Then the derivatives dγi(β)/dβ, taken at the given point, are used as the components of what is called a tangent vector. The differentials that we use for vector components are related to these tangent vectors by dγi = (dγi(β)/dβ) dβ. In either case, the geometrical idea is that the dγ is to represent an arbitrary displacement starting from some point of the manifold. Since we have used differentials (in the sense of the word defined in Section D.12) throughout this text, we will continue to use them here. Those more familiar with the differential-geometric notation may mentally replace the dqk, etc., by components of the corresponding tangent vectors.