John Garrison and Raymond Chiao

Print publication date: 2008

Print ISBN-13: 9780198508861

Published to Oxford Scholarship Online: September 2008

DOI: 10.1093/acprof:oso/9780198508861.001.0001


(p.645) Appendix A Mathematics

Source:
Quantum Optics
Publisher:
Oxford University Press


A.1 Vector analysis

Our conventions for elementary vector analysis are as follows. The unit vectors corresponding to the Cartesian coordinates x, y, z are ux, uy, uz. For a general vector v, we denote the unit vector in the direction of v by ṽ = v/|v|.

The scalar product of two vectors is a · b = axbx + ayby + azbz, or

$\mathbf{a} \cdot \mathbf{b} = \sum_{i=1}^{3} a_i b_i ,$
(A.1)

where (a1, a2, a3) = (ax, ay, az), etc. Since expressions like this occur frequently, we will use the Einstein summation convention: repeated vector indices are to be summed over; that is, the expression aibi is understood to imply the sum in eqn (A.1). The summation convention will only be employed for three‐dimensional vector indices. The cross product is

$(\mathbf{a} \times \mathbf{b})_i = \epsilon_{ijk} a_j b_k ,$
(A.2)

where the alternating tensor ϵijk is defined by

$\epsilon_{ijk} = \begin{cases} 1 & \text{if } ijk \text{ is an even permutation of } 123, \\ -1 & \text{if } ijk \text{ is an odd permutation of } 123, \\ 0 & \text{otherwise.} \end{cases}$
(A.3)
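
As an illustration of the summation convention, the following Python sketch evaluates the scalar product of eqn (A.1) and the cross product of eqn (A.2) by explicit sums over the repeated indices. The function names, and the use of the index values 0, 1, 2 in place of 1, 2, 3, are choices made for this example only.

```python
def levi_civita(i, j, k):
    """Alternating tensor for indices drawn from {0, 1, 2}.

    The product (i-j)(j-k)(k-i) equals +2 for even permutations,
    -2 for odd permutations, and 0 when any two indices coincide.
    """
    return (i - j) * (j - k) * (k - i) // 2


def dot(a, b):
    """Scalar product a_i b_i, with the repeated index i summed."""
    return sum(a[i] * b[i] for i in range(3))


def cross(a, b):
    """Cross product (a x b)_i = eps_ijk a_j b_k, with j and k summed."""
    return [
        sum(levi_civita(i, j, k) * a[j] * b[k]
            for j in range(3) for k in range(3))
        for i in range(3)
    ]


a = [1, 2, 3]
b = [4, 5, 6]
c = cross(a, b)
```

Since a × b is perpendicular to both factors, `dot(a, c)` and `dot(b, c)` both vanish, which gives a quick consistency check on the index bookkeeping.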

A.2 General vector spaces

A complex vector space is a set ℏ on which the following two operations are defined.

(1) Multiplication by scalars. For every pair (α, ψ), where α is a scalar, i.e. a complex number, and ψ ∈ ℏ, there is a unique element of ℏ that is denoted by αψ.

(2) Vector addition. For every pair ψ, φ of vectors in ℏ there is a unique element of ℏ denoted by ψ + φ.

The two operations satisfy (a) α(βψ) = (αβ) ψ, and (b) α (ψ + φ) = αψ + αφ. It is assumed that there is a special null vector, usually denoted by 0, such that α0 = 0 and ψ + 0 = ψ. If the scalars are restricted to real numbers these conditions define a real vector space.

Ordinary displacement vectors, r, belong to a real vector space denoted by ℝ3. The set ℂn of n‐tuples ψ = (ψ1, …, ψn), where each component ψi is a complex number, defines a complex vector space with component‐wise operations:

$\alpha \psi = (\alpha \psi_1, \ldots, \alpha \psi_n) ,$
$\psi + \phi = (\psi_1 + \phi_1, \ldots, \psi_n + \phi_n) .$
(A.4)

Each vector in ℝ3 or ℂn is specified by a finite number of components, so these spaces are said to be finite dimensional.

The set of complex functions, C (ℝ), of a single real variable defines a vector space with point‐wise operations:

$(\alpha \psi)(x) = \alpha \psi(x) ,$
(A.5)
$(\psi + \phi)(x) = \psi(x) + \phi(x) ,$
(A.6)

where α is a scalar, and ψ (x) and φ (x) are members of C (ℝ). This space is said to be infinite dimensional, since a general function is not determined by any finite set of values.

For any subset u ⊂ ℏ, the set of all linear combinations of vectors in u is called the span of u, written as span (u). A family B ⊂ ℏ is a basis for ℏ if ℏ = span (B), i.e. every vector in ℏ can be expressed as a linear combination of vectors in B. In this situation ℏ is said to be spanned by B.

A linear operator M is a rule that assigns a new vector Mψ to each vector ψ ∈ ℏ, such that

$M(\alpha \psi + \beta \phi) = \alpha M\psi + \beta M\phi$
(A.7)

for any pair of vectors ψ and φ, and any scalars α and β. The action of a linear operator M on ℏ is completely determined by its action on the vectors of a basis B.

A.3 Hilbert spaces

A.3.1 Definition

An inner product on a vector space ℏ is a rule that assigns a complex number, denoted by (φ, ψ), to every pair of elements φ and ψ ∈ ℏ, with the following properties:

$(\psi, \phi) = (\phi, \psi)^* ,$
(A.8a)
$(\phi, \alpha\psi + \beta\chi) = \alpha (\phi, \psi) + \beta (\phi, \chi) ,$
(A.8b)
$(\psi, \psi) \ge 0 ,$
(A.8c)
$(\psi, \psi) = 0 \ \text{if and only if} \ \psi = 0 .$
(A.8d)

An inner product space is a vector space equipped with an inner product. The inner product satisfies the Cauchy‐Schwarz inequality:

$|(\phi, \psi)|^2 \le (\phi, \phi)(\psi, \psi) .$
(A.9)

Two vectors are orthogonal if (φ, ψ) = 0. If 𝕱 is a subspace of ℏ, then the orthogonal complement of 𝕱 is the subspace 𝕱⊥ of vectors orthogonal to every vector in 𝕱. The norm ‖ψ‖ of ψ is defined as $\|\psi\| = \sqrt{(\psi, \psi)}$, so that ‖ψ‖ = 0 implies ψ = 0. Vectors with ‖ψ‖ = 1 are said to be normalized. A set of vectors is complete if the only vector orthogonal to every vector in the set is the null vector. Each complete set contains a basis for the space. A vector space with a countable basis set, B = {φ(1), φ(2), …}, is said to be separable. The vector spaces relevant to quantum theory are all separable. A basis for which (φ(n), φ(m)) = δnm holds is called orthonormal. Every vector in ℏ can be uniquely expanded in an orthonormal basis, e.g.

$\psi = \sum_n \psi_n \phi^{(n)} ,$
(A.10)

where the expansion coefficients are ψn = (φ(n),ψ).
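
The expansion (A.10) can be checked numerically in ℂ2. The sketch below assumes the inner product that conjugates its first argument and is linear in its second, which is the convention implied by ψn = (φ(n), ψ); the particular basis, the test vector, and all function names are arbitrary choices made for illustration.

```python
import math


def inner(phi, psi):
    """Inner product (phi, psi): conjugate the first argument."""
    return sum(p.conjugate() * q for p, q in zip(phi, psi))


def expand(basis, psi):
    """Expansion coefficients psi_n = (phi^(n), psi)."""
    return [inner(e, psi) for e in basis]


def reconstruct(basis, coeffs):
    """Rebuild psi = sum_n psi_n phi^(n) from its coefficients."""
    dim = len(basis[0])
    return [sum(c * e[i] for c, e in zip(coeffs, basis)) for i in range(dim)]


s = 1 / math.sqrt(2)
basis = [[s, s], [s, -s]]          # an orthonormal basis of C^2
psi = [1 + 2j, 3 - 1j]             # arbitrary test vector

coeffs = expand(basis, psi)
psi_back = reconstruct(basis, coeffs)

# For an orthonormal basis the squared norm is the sum of |psi_n|^2.
norm_sq_from_coeffs = sum(abs(c) ** 2 for c in coeffs)
norm_sq_direct = inner(psi, psi).real
```

Up to rounding error, `psi_back` reproduces `psi` and the two expressions for the squared norm agree.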

A sequence ψ1, ψ2,…, ψk,… of vectors in ℏ is convergent if

$\lim_{k, l \to \infty} \| \psi_k - \psi_l \| = 0 .$
(A.11)

A vector ψ is a limit of the sequence if

$\lim_{k \to \infty} \| \psi_k - \psi \| = 0 .$
(A.12)

A Hilbert space is an inner product space that contains the limits of all convergent sequences.

A.3.2 Examples

The finite‐dimensional spaces ℝ3 and ℂN are both Hilbert spaces. The inner product for ℝ3 is the familiar dot product, and for ℂN it is

$(\phi, \psi) = \sum_{n=1}^{N} \phi_n^* \psi_n .$
(A.13)

If we constrain the complex functions ψ (x) by the normalizability condition

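$\int_{-\infty}^{\infty} dx \, |\psi(x)|^2 < \infty ,$
(A.14)

then the Cauchy‐Schwarz inequality for integrals,

$\left| \int_{-\infty}^{\infty} dx \, \phi^*(x) \psi(x) \right|^2 \le \int_{-\infty}^{\infty} dx \, |\phi(x)|^2 \int_{-\infty}^{\infty} dx \, |\psi(x)|^2 ,$
(A.15)

is sufficient to guarantee that the inner product defined by

$(\phi, \psi) = \int_{-\infty}^{\infty} dx \, \phi^*(x) \psi(x)$
(A.16)

makes the vector space of complex functions into a Hilbert space, which is called L2(ℝ).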

A.3.3 Linear operators

Let A be a linear operator acting on ℏ; then the domain of A, called D(A), is the subspace of vectors ψ ∈ ℏ such that ‖Aψ‖ < ∞. An operator A is positive definite if (ψ, Aψ) ≥ 0 for all ψ ∈ D(A), and it is bounded if ‖Aψ‖ < b‖ψ‖, where b is a constant independent of ψ. The norm of an operator is defined by

$\| A \| = \sup_{\psi \ne 0} \frac{\| A\psi \|}{\| \psi \|} ,$
(A.17)

so a bounded operator is one with finite norm.

If Aψ = λψ, where λ is a complex number and ψ is a vector in the Hilbert space, then λ is an eigenvalue and ψ is an eigenvector of A. In this case λ is said to belong to the point spectrum of A. The eigenvalue λ is nondegenerate if the eigenvector ψ is unique (up to a multiplicative factor). If ψ is not unique, then λ is degenerate. The linearly‐independent solutions of Aψ = λψ form a subspace called the eigenspace for λ, and the dimension of the eigenspace is the degree of degeneracy for λ. The continuous spectrum of A is the set of complex numbers λ such that: (1) λ is not an eigenvalue, and (2) the operator λ − A does not have an inverse.

The adjoint (hermitian conjugate) A† of A is defined by

$(\phi, A^\dagger \psi) = (A\phi, \psi) ,$
(A.18)

and A is self‐adjoint (hermitian) if D(A†) = D(A) and (φ, Aψ) = (Aφ, ψ). Bounded self‐adjoint operators have real eigenvalues and a complete orthonormal set of eigenvectors. For unbounded self‐adjoint operators, the point and continuous spectra are subsets of the real numbers. Note that (ψ, A†Aψ) = (φ, φ), where φ = Aψ, so that

$(\psi, A^\dagger A \psi) \ge 0 ,$
(A.19)

i.e. A†A is positive definite.

A self‐adjoint operator P satisfying

$P^2 = P$
(A.20)

is called a projection operator; it has only a point spectrum consisting of {0,1}. Consider the set of vectors Pℏ, consisting of all vectors of the form Pψ as ψ ranges over ℏ. This is a subspace of ℏ, since

$\alpha P\psi + \beta P\phi = P(\alpha\psi + \beta\phi)$
(A.21)

shows that every linear combination of vectors in Pℏ is also in Pℏ. Conversely, let 𝕾 be a subspace of ℏ and {φ(n)} an orthonormal basis for 𝕾. The operator P, defined by

$P\psi = \sum_n \phi^{(n)} (\phi^{(n)}, \psi) ,$
(A.22)

is a projection operator, since

$P^2\psi = \sum_n \sum_m \phi^{(n)} (\phi^{(n)}, \phi^{(m)}) (\phi^{(m)}, \psi) = \sum_n \phi^{(n)} (\phi^{(n)}, \psi) = P\psi .$
(A.23)

Thus there is a one‐to‐one correspondence between projection operators and subspaces of ℏ. Let P and Q be projection operators and suppose that the vectors in Pℏ are orthogonal to the vectors in Qℏ; then PQ = QP = 0 and P and Q are said to be orthogonal projections. In the extreme case that 𝕾 = ℏ, the expansion (A.10) shows that P is the identity operator, Pψ = ψ.
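
These statements are easy to verify for small matrices. The following sketch, in which the helper names and the choice of subspaces in ℂ3 are assumptions made for illustration, builds the matrix of a projection operator from an orthonormal basis of its subspace and checks that P² = P, that two orthogonal projections satisfy PQ = QP = 0, and that the projections onto a subspace and onto its orthogonal complement add up to the identity.

```python
import math


def projector(basis):
    """Matrix P_ij = sum_n phi_i^(n) conj(phi_j^(n)) for an orthonormal list."""
    dim = len(basis[0])
    return [
        [sum(v[i] * complex(v[j]).conjugate() for v in basis)
         for j in range(dim)]
        for i in range(dim)
    ]


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def close(a, b, tol=1e-12):
    """Entry-wise comparison of two matrices within a tolerance."""
    return all(abs(x - y) < tol
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))


s = 1 / math.sqrt(2)
P = projector([[1, 0, 0], [0, s, s]])    # two-dimensional subspace
Q = projector([[0, s, -s]])              # its orthogonal complement

zero = [[0] * 3 for _ in range(3)]
identity = [[1 if i == j else 0 for j in range(3)] for i in range(3)]
total = [[P[i][j] + Q[i][j] for j in range(3)] for i in range(3)]

idempotent = close(matmul(P, P), P)
orthogonal = close(matmul(P, Q), zero) and close(matmul(Q, P), zero)
resolves_identity = close(total, identity)
```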

A self‐adjoint operator with pure point spectrum {λ1, λ2, …} has the spectral resolution

$A = \sum_n \lambda_n P_n ,$
(A.24)

where Pn is the projection operator onto the subspace of eigenvectors with eigenvalue λn. The spectral resolution for a self‐adjoint operator A with a continuous spectrum is

$A = \int \lambda \, dM(\lambda) ,$
(A.25)

where dM(λ) is an operator‐valued measure defined by the following statement: for each subset Δ of the real line,

$M(\Delta) = \int_{\Delta} dM(\lambda)$
(A.26)

is the projection operator onto the subspace of vectors ψ such that ‖(λ − A)⁻¹ψ‖ < ∞ for all λ ∉ Δ (Riesz and Sz.‐Nagy, 1955, Chap. VIII, Sec. 120).

A linear operator U is unitary if it preserves inner products, i.e.

$(U\psi, U\phi) = (\psi, \phi) ,$
(A.27)

for any pair of vectors ψ, φ in the Hilbert space. A necessary and sufficient condition for unitarity is that the operator is norm preserving, i.e.

$\| U\psi \| = \| \psi \| .$
(A.28)

The spectral resolution for a unitary operator with a pure point spectrum is

$Display mathematics$
(A.29)

and for a continuous spectrum

$Display mathematics$
(A.30)

A linear operator N is said to be a normal operator if

$N^\dagger N = N N^\dagger .$
(A.31)

The hermitian and unitary operators are both normal. The hermitian operators N1 = (N + N†)/2 and N2 = (N − N†)/2i satisfy N = N1 + iN2 and [N1, N2] = 0. Normal operators therefore have the spectral resolutions

$Display mathematics$
(A.32)

for a point spectrum, and

$Display mathematics$
(A.33)

for a continuous spectrum.

A.3.4 Matrices

A linear operator X acting on an N‐dimensional Hilbert space, with basis {f(1), …, f(N)}, is represented by the N × N matrix

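$X_{mn} = (f^{(m)}, X f^{(n)}) .$
(A.34)

The operator and its matrix are both called X. The matrix for the product XY of two operators is the matrix product

$(XY)_{mn} = \sum_{k=1}^{N} X_{mk} Y_{kn} .$
(A.35)

The determinant of X is defined as

$\det(X) = \sum_{n_1 \cdots n_N} \epsilon_{n_1 n_2 \cdots n_N} X_{1 n_1} X_{2 n_2} \cdots X_{N n_N} ,$
(A.36)

where the generalized alternating tensor is

$\epsilon_{n_1 n_2 \cdots n_N} = \begin{cases} 1 & \text{if } n_1 n_2 \cdots n_N \text{ is an even permutation of } 12 \cdots N, \\ -1 & \text{if } n_1 n_2 \cdots n_N \text{ is an odd permutation of } 12 \cdots N, \\ 0 & \text{otherwise.} \end{cases}$
(A.37)

The trace of X is

$\mathrm{Tr}(X) = \sum_{n=1}^{N} X_{nn} .$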
(A.38)
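
For small matrices the determinant can be evaluated directly as a signed sum over permutations (the Leibniz formula), which is what a definition in terms of the generalized alternating tensor amounts to, since the tensor vanishes unless its indices form a permutation. The sketch below is meant only to make the index structure concrete; the function names are our own, and the cost grows like N!, so it is not a practical algorithm.

```python
from itertools import permutations


def permutation_sign(perm):
    """+1 for an even permutation, -1 for an odd one (count inversions)."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def det(x):
    """Sum over permutations of sign * X[0][n_0] * X[1][n_1] * ..."""
    n = len(x)
    total = 0
    for perm in permutations(range(n)):
        term = permutation_sign(perm)
        for row in range(n):
            term *= x[row][perm[row]]
        total += term
    return total


def trace(x):
    """Sum of the diagonal elements X[n][n]."""
    return sum(x[n][n] for n in range(len(x)))


X = [[2, 0, 1],
     [1, 3, 2],
     [1, 1, 4]]
det_X = det(X)
trace_X = trace(X)
```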

The transpose matrix XT is defined by (XT)nm = Xmn. The adjoint matrix X† is the complex conjugate of the transpose: (X†)nm = Xmn*. A matrix X is symmetric if X = XT, self‐adjoint or hermitian if X = X†, and unitary if X†X = XX† = I, where I is the N × N identity matrix. Unitary transformations preserve the inner product. The hermitian and unitary matrices both belong to the larger class of normal matrices defined by X†X = XX†.

A matrix X is positive definite if all of its eigenvalues are real and non‐negative. This immediately implies that the determinant and trace of the matrix are both non‐negative. An equivalent definition is that X is positive definite if

$(\phi, X\phi) = \sum_{n, m} \phi_n^* X_{nm} \phi_m \ge 0$
(A.39)

for all vectors φ. For a positive‐definite matrix X, there is a matrix Y such that X = YY†.

The normal matrices have the following important properties (Mac Lane and Birkhoff, 1967, Sec. XI–10).

Theorem A.1 (i) If f is an eigenvector of the normal matrix Z with eigenvalue z, then f is an eigenvector of Z† with eigenvalue z*, i.e. Zf = zf ⟹ Z†f = z*f.

(ii) Every normal matrix has a complete, orthonormal set of eigenvectors.

Thus hermitian matrices have real eigenvalues and unitary matrices have eigenvalues of modulus 1.

A.4 Fourier transforms

A.4.1 Continuous transforms

In the mathematical literature it is conventional to denote the Fourier (integral) transform of a function f(x) of a single, real variable by

$\tilde{f}(k) = \int_{-\infty}^{\infty} dx \, f(x) e^{-ikx} ,$
(A.40)

so that the inverse Fourier transform is

$f(x) = \int_{-\infty}^{\infty} \frac{dk}{2\pi} \, \tilde{f}(k) e^{ikx} .$
(A.41)

The virtue of this notation is that it reminds us that the two functions are, generally, drastically different, e.g. if f(x) = 1, then f̃(k) = 2πδ(k).

On the other hand, the ∼ is a typographical nuisance in any discussion involving many uses of the Fourier transform. For this reason, we will sacrifice precision for convenience. In our convention, the Fourier transform is indicated by the same letter, and the distinction between the functions is maintained by paying attention to the arguments.

The Fourier transform pair is accordingly written as

$f(k) = \int_{-\infty}^{\infty} dx \, f(x) e^{-ikx} ,$
(A.42)
$f(x) = \int_{-\infty}^{\infty} \frac{dk}{2\pi} \, f(k) e^{ikx} .$
(A.43)

This is analogous to the familiar idea that the meaning of a vector V is independent of the coordinate system used, despite the fact that the components (Vx,Vy,Vz) of V are changed by transforming to a new coordinate system. From this point of view, the functions f(x) and f(k) are simply different representations of the same physical quantity. Confusion is readily avoided by paying attention to the physical significance of the arguments, e.g. x denotes a point in position space, while k denotes a point in the reciprocal space or k‐space.

If the position‐space function f(x) is real, then the Fourier transform satisfies

$f^*(k) = f(-k) .$
(A.44)

When the position variable x is replaced by the time t, it is customary in physics to use the opposite sign convention:

$f(\omega) = \int_{-\infty}^{\infty} dt \, f(t) e^{i\omega t} ,$
(A.45)
$f(t) = \int_{-\infty}^{\infty} \frac{d\omega}{2\pi} \, f(\omega) e^{-i\omega t} .$
(A.46)

Fourier transforms of functions of several variables, typically f(r), are defined similarly:

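$f(\mathbf{k}) = \int d^3r \, f(\mathbf{r}) e^{-i\mathbf{k} \cdot \mathbf{r}} ,$
(A.47)
$f(\mathbf{r}) = \int \frac{d^3k}{(2\pi)^3} \, f(\mathbf{k}) e^{i\mathbf{k} \cdot \mathbf{r}} ,$
(A.48)

where the integrals are over position space and reciprocal space (k‐space) respectively. If f(r) is real then

$f^*(\mathbf{k}) = f(-\mathbf{k}) .$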
(A.49)

Combining these conventions for a space‐time function f(r, t) yields the transform pair

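$f(\mathbf{k}, \omega) = \int d^3r \int dt \, f(\mathbf{r}, t) e^{-i(\mathbf{k} \cdot \mathbf{r} - \omega t)} ,$
(A.50)
$f(\mathbf{r}, t) = \int \frac{d^3k}{(2\pi)^3} \int \frac{d\omega}{2\pi} \, f(\mathbf{k}, \omega) e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} .$
(A.51)

The last result is simply the plane‐wave expansion of f(r, t). If f(r, t) is real, then the Fourier transform satisfies

$f^*(\mathbf{k}, \omega) = f(-\mathbf{k}, -\omega) .$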
(A.52)

Two related and important results on Fourier transforms—which we quote for the one‐ and three‐dimensional cases—are Parseval's theorem:

$Display mathematics$
(A.53)
$Display mathematics$
(A.54)

and the convolution theorem:

$Display mathematics$
(A.55)
$Display mathematics$
(A.56)
$Display mathematics$
(A.57)
$Display mathematics$
(A.58)

These results are readily derived by using the delta function identities (A.95) and (A.96).

A.4.2 Fourier series

It is often useful to simplify the mathematics of the one‐dimensional continuous transform by considering the functions to be defined on a finite interval (−L/2, L/2) and imposing periodic boundary conditions. The basis vectors are still of the form uk (x) = C exp(ikx), but the periodicity condition, uk (−L/2) = uk (L/2), restricts k to the discrete values

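$k = \frac{2\pi n}{L} , \quad n = 0, \pm 1, \pm 2, \ldots$
(A.59)

Normalization requires $C = 1/\sqrt{L}$, so the transform is

$f_k = \frac{1}{\sqrt{L}} \int_{-L/2}^{L/2} dx \, f(x) e^{-ikx} ,$
(A.60)

and the inverse transform f(x) is

$f(x) = \frac{1}{\sqrt{L}} \sum_k f_k e^{ikx} .$
(A.61)

The continuous transform is recovered in the limit L → ∞ by first using eqn (A.60) to conclude that

$\sqrt{L} \, f_k \to f(k) \quad \text{as } L \to \infty ,$
(A.62)

and writing the inverse transform as

$f(x) = \frac{1}{L} \sum_k \sqrt{L} \, f_k e^{ikx} .$
(A.63)

The difference between neighboring k‐values is Δk = 2π/L, so this equation can be recast as

$f(x) = \sum_k \frac{\Delta k}{2\pi} \sqrt{L} \, f_k e^{ikx} \to \int_{-\infty}^{\infty} \frac{dk}{2\pi} \, f(k) e^{ikx} .$
(A.64)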
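
The passage from the sum over discrete k‐values to the integral can be seen numerically. The sketch below is an illustration under stated assumptions, not part of the original development: it takes the Gaussian f(x) = exp(−x²/2), whose continuous transform is f(k) = √(2π) exp(−k²/2), and evaluates the sum over k = 2πn/L with weight Δk/2π. The function names and the cutoff `n_max` are our own choices.

```python
import cmath
import math


def f_of_k(k):
    """Continuous transform of exp(-x^2/2): sqrt(2 pi) exp(-k^2/2)."""
    return math.sqrt(2 * math.pi) * math.exp(-k * k / 2)


def discrete_sum(x, L, n_max):
    """Sum over k = 2 pi n / L of (dk / 2 pi) f(k) exp(i k x)."""
    dk = 2 * math.pi / L
    total = 0j
    for n in range(-n_max, n_max + 1):
        k = n * dk
        total += dk / (2 * math.pi) * f_of_k(k) * cmath.exp(1j * k * x)
    return total


x = 0.5
exact = math.exp(-x * x / 2)

small_box = discrete_sum(x, L=2.0, n_max=100)    # period comparable to width
large_box = discrete_sum(x, L=20.0, n_max=100)   # period much larger than width

error_small = abs(small_box - exact)
error_large = abs(large_box - exact)
```

For the small box the sum represents the periodic repetition of the Gaussian, so it differs visibly from f(x); for the large box the neighboring copies are negligible and the sum agrees with f(x) to high accuracy.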

In Cartesian coordinates the three‐dimensional discrete transform is defined on a rectangular parallelepiped with dimensions Lx, Ly, Lz. The one‐dimensional results then imply

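$f_{\mathbf{k}} = \frac{1}{\sqrt{V}} \int_V d^3r \, f(\mathbf{r}) e^{-i\mathbf{k} \cdot \mathbf{r}} ,$
(A.65)

where the k‐vector is restricted to

$\mathbf{k} = \frac{2\pi n_x}{L_x} \mathbf{u}_x + \frac{2\pi n_y}{L_y} \mathbf{u}_y + \frac{2\pi n_z}{L_z} \mathbf{u}_z , \quad n_x, n_y, n_z = 0, \pm 1, \pm 2, \ldots ,$
(A.66)

and V = LxLyLz. The inverse transform is

$f(\mathbf{r}) = \frac{1}{\sqrt{V}} \sum_{\mathbf{k}} f_{\mathbf{k}} e^{i\mathbf{k} \cdot \mathbf{r}} ,$
(A.67)

and the integral transform is recovered by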

$Display mathematics$
(A.68)

The sum and integral over k are related by

$Display mathematics$
(A.69)

which in turn implies

$Display mathematics$
(A.70)

A.5 Laplace transforms

Another useful idea—which is closely related to the one‐dimensional Fourier transform—is the Laplace transform defined by

$\tilde{f}(\zeta) = \int_0^{\infty} dt \, e^{-\zeta t} f(t) .$
(A.71)

In this case, we will use the standard mathematical notation f̃(ζ), since we do not use Laplace transforms as frequently as Fourier transforms. The inverse transform is

$f(t) = \int_{\zeta_0 - i\infty}^{\zeta_0 + i\infty} \frac{d\zeta}{2\pi i} \, e^{\zeta t} \tilde{f}(\zeta) .$
(A.72)

The line (ζ0 − i∞, ζ0 + i∞) in the complex ζ‐plane must lie to the right of any poles in the transform function f̃(ζ).

The identity

$\int_0^{\infty} dt \, e^{-\zeta t} \frac{df}{dt} = \zeta \tilde{f}(\zeta) - f(0)$
(A.73)

is useful in treating initial value problems for sets of linear, differential equations. Thus to solve the equations

$Display mathematics$
(A.74)

with a constant matrix V, and initial data fn(0), one takes the Laplace transform to get

$Display mathematics$
(A.75)

This set of algebraic equations can be solved to express f̃n(ζ) in terms of fn(0). Inverting the Laplace transform yields the solution in the time domain.

The convolution theorem for Laplace transforms is

$Display mathematics$
(A.76)

where the integration contour is to the right of any poles of both g∼(ζ) and f∼(ζ).

An important point for applications to physics is that poles in the Laplace transform correspond to exponential time dependence. For example, the function f(t) = exp(zt) has the transform

$\tilde{f}(\zeta) = \frac{1}{\zeta - z} .$
(A.77)

More generally, consider a function f̃(ζ) with N simple poles in ζ:

$Display mathematics$
(A.78)

where the complex numbers z1, …, zN are all distinct. The inverse transform is

$Display mathematics$
(A.79)

where ζ0 > max[Re z1, …, Re zN]. The contour can be closed by a large semicircle in the left half plane, and for N > 1 the contribution from the semicircle can be neglected. The integral is therefore given by the sum of the residues,

$Display mathematics$
(A.80)

which explicitly exhibits f(t) as a sum of exponentials.
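
A concrete case may be helpful. The sketch below assumes the simplest transform with N distinct simple poles, f̃(ζ) = 1/[(ζ − z1)⋯(ζ − zN)]; this particular form, like the function names, is chosen for illustration and is not taken from the text. The residue of exp(ζt) f̃(ζ) at ζ = zn is exp(zn t) divided by the product of (zn − zm) over m ≠ n, so f(t) is a sum of N exponentials.

```python
import cmath


def residues(poles):
    """Residue of 1 / prod_m (zeta - z_m) at each simple pole z_n."""
    result = []
    for n, z_n in enumerate(poles):
        denominator = 1
        for m, z_m in enumerate(poles):
            if m != n:
                denominator *= (z_n - z_m)
        result.append(1 / denominator)
    return result


def f_time(t, poles):
    """Inverse transform: sum over poles of residue_n * exp(z_n t)."""
    return sum(r * cmath.exp(z * t)
               for r, z in zip(residues(poles), poles))


# Two real poles: f(t) = exp(-t) - exp(-2t).
value = f_time(1.0, [-1, -2])

# For N > 1 the residues sum to zero, so f(0) = 0.
initial = f_time(0.0, [-1, -2, 1j])
```

Poles with negative real part give decaying exponentials, while a complex‐conjugate pair of poles would give a damped oscillation.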

A.6 Functional analysis

A.6.1 Linear functionals

In normal usage, a function, e.g. f(x), is a rule assigning a unique value to each value of its argument. The argument is typically a point in some finite‐dimensional space, e.g. the real numbers ℝ, the complex numbers ℂ, three‐dimensional space ℝ3, etc. The values of the function are also points in a finite‐dimensional space. For example, the classical electric field is represented by a function ε (r) that assigns a vector—a point in ℝ3—to each position r in ℝ3.

A rule, X, assigning a value to each point f in an infinite‐dimensional space 𝔐 (which is usually a space of functions) is called a functional and written as X [f]. The square brackets surrounding the argument are intended to distinguish functionals from functions of a finite number of variables.

If 𝔐 is a vector space, e.g. a Hilbert space, then a functional Y [f] that obeys

$Y[\alpha f + \beta g] = \alpha Y[f] + \beta Y[g]$
(A.81)

for all scalars α, β and all functions f,g ϵ 𝔐, is called a linear functional. The family, 𝔐′, of linear functionals on 𝔐 is called the dual space of 𝔐. The dual space is also a vector space, with linear combinations of its elements defined by

$(\alpha X + \beta Y)[f] = \alpha X[f] + \beta Y[f]$
(A.82)

for all f ϵ 𝔐.

A.6.2 Generalized functions

In Section 3.1.2 the definition (3.18) and the rule (3.21) are presented with the cavalier disregard for mathematical niceties that is customary in physics. There are, however, some situations in which more care is required. For these contingencies we briefly outline a more respectable treatment. The chief difficulty is the existence of the integrals defining the operators s(−∇²). This problem can be overcome by restricting the functions ϕ(r) in eqn (3.18) to good functions (Lighthill, 1964, Chap. 2), i.e. infinitely‐differentiable functions that fall off faster than any power of |r|. The Fourier transform of a good function is also a good function, so all of the relevant integrals exist, as long as s(|k|) does not grow exponentially at large |k|. The examples we need are all of the form |k|^α, where −1 ≤ α ≤ 1, so eqns (3.18) and (3.21) are justified. For physical applications the really important assumption is that all functions can be approximated by good functions.

A generalized function is a linear functional, say G[ϕ], defined on the good functions, i.e.

$G[\alpha \phi + \beta \psi] = \alpha G[\phi] + \beta G[\psi]$
(A.83)

for any scalars α, β and any good functions ϕ, ψ. A familiar example is the delta function. The rule

$\delta_{\mathbf{R}}[\phi] = \int d^3r \, \delta(\mathbf{r} - \mathbf{R}) \phi(\mathbf{r}) = \phi(\mathbf{R})$
(A.84)

maps the function ϕ(r) into the single number ϕ(R). In this language, the transverse delta function $\Delta^{\perp}_{ij}(\mathbf{r} - \mathbf{r}')$ is also a generalized function. An alternative terminology, often found in the mathematical literature, labels good functions as test functions and generalized functions as distributions.

In quantum field theory, the notion of a generalized function is extended to linear functionals sending good functions to operators, i.e. for each good function ϕ,

$Display mathematics$
(A.85)

Such functionals are called operator‐valued generalized functions. For any density operator ρ describing a physical state, X[ϕ] defines an ordinary (c‐number) generalized function Xρ[ϕ] by

$X_\rho[\phi] = \mathrm{Tr}(\rho X[\phi]) .$
(A.86)

A.7 Improper functions

A.7.1 The Heaviside step function

The step function θ(x) is defined by

$\theta(x) = \begin{cases} 1 & \text{for } x > 0, \\ 0 & \text{for } x < 0, \end{cases}$
(A.87)

and it has the useful representation

$Display mathematics$
(A.88)

which is proved using contour integration.

A.7.2 The Dirac delta function

A Standard properties

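(1) If the function f(x) has isolated, simple zeros at the points x1, x2, … then

$\delta(f(x)) = \sum_i \frac{1}{\left| df/dx \right|_{x = x_i}} \, \delta(x - x_i) .$
(A.89)

The multidimensional generalization of this rule is

$\delta(\mathbf{f}(\mathbf{x})) = \sum_i \frac{1}{\left| \det(\partial \mathbf{f} / \partial \mathbf{x}) \right|_{\mathbf{x} = \mathbf{x}_i}} \, \delta(\mathbf{x} - \mathbf{x}_i) ,$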
(A.90)

where x = (x1, x2,…, xN), f(x) = (f1 (x), f2 (x),…, fN (x)),

$Display mathematics$
$Display mathematics$
(A.91)

the Jacobian ∂f/∂x is the N × N matrix with components ∂fn/∂xm, and xi satisfies fn(xi) = 0, for n = 1, …, N.

(2) The derivative of the delta function is defined by

$\int dx \, \frac{d\delta(x - a)}{dx} f(x) = - \left. \frac{df}{dx} \right|_{x = a} .$
(A.92)

(3) By using contour integration methods one gets

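$\lim_{\epsilon \to 0} \frac{1}{x \pm i\epsilon} = P \frac{1}{x} \mp i\pi \delta(x) ,$
(A.93)

where P is the principal part defined by

$\int_{-\infty}^{\infty} dx \, P \frac{1}{x} f(x) = \lim_{\epsilon \to 0} \left[ \int_{-\infty}^{-\epsilon} dx \, \frac{f(x)}{x} + \int_{\epsilon}^{\infty} dx \, \frac{f(x)}{x} \right] .$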
(A.94)

(4) The definition of the Fourier transform yields

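$\int_{-\infty}^{\infty} \frac{dk}{2\pi} \, e^{ik(x - x')} = \delta(x - x')$
(A.95)

in one dimension, and

$\int \frac{d^3k}{(2\pi)^3} \, e^{i\mathbf{k} \cdot (\mathbf{r} - \mathbf{r}')} = \delta(\mathbf{r} - \mathbf{r}')$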
(A.96)

in three dimensions.

(5) The step function satisfies

$\frac{d\theta(x)}{dx} = \delta(x) .$
(A.97)

(6) The end‐point rule is

$Display mathematics$
(A.98)

(7) The three‐dimensional delta function δ (r − r′) is defined as

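$\delta(\mathbf{r} - \mathbf{r}') = \delta(x - x') \, \delta(y - y') \, \delta(z - z') ,$
(A.99)

and is expressed in polar coordinates by

$\delta(\mathbf{r} - \mathbf{r}') = \frac{1}{r^2} \, \delta(r - r') \, \delta(\cos\theta - \cos\theta') \, \delta(\phi - \phi') .$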
(A.100)
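
As a simple illustration of property (1), offered here as a standard textbook example rather than a result quoted from the text, take f(x) = x² − a² with a ≠ 0. The zeros x = ±a are simple, and |df/dx| = 2|a| at both of them, so the rule for a delta function of a function gives

$\delta(x^2 - a^2) = \frac{1}{2|a|} \left[ \delta(x - a) + \delta(x + a) \right] .$

The condition that the zeros be simple is essential: for a = 0 the derivative vanishes at the zero and the expression is undefined.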

B A special representation of the delta function

In many calculations, particularly in perturbation theory, one encounters functions of the form

$Display mathematics$
(A.101)

which have the limit

$Display mathematics$
(A.102)

provided that the integral

$Display mathematics$
(A.103)

exists.

A.7.3 Integral kernels

The definition of a generalized function as a linear rule assigning a complex number to each good function can be extended to a linear rule that maps a good function, e.g. f(t), to another good function g(t). The linear nature of the rule means that it can always be expressed in the form

$g(t) = \int dt' \, W(t, t') f(t') .$
(A.104)

For a fixed value of t, W (t, t′) defines a generalized function of t′ which is called an integral kernel. This definition is easily extended to functions of several variables, e.g. f(r). The delta function, the Heaviside step function, etc. are examples of integral kernels. An integral kernel is positive definite if

$\int dt \int dt' \, f^*(t) W(t, t') f(t') \ge 0$
(A.105)

for every good function f(t).

A.8 Probability and random variables

A.8.1 Axioms of probability

The abstract definition of probability starts with a set ω of events and a probability function P that assigns a numerical value to every subset of ω. In principle, ω could be any set, but in practice it is usually a subset of ℝN or ℂN, or a subset of the integers. The essential properties of probabilities are contained in the axioms (Gardiner, 1985, Chap. 2):

(1) P(S)≥0 for all S⊂ ω;

(2) P(ω) = 1;

(3) if S1, S2, … is a discrete (countable) collection of nonoverlapping sets, i.e.

$S_i \cap S_j = \emptyset \quad \text{for } i \ne j ,$
(A.106)

then

$P(S_1 \cup S_2 \cup \cdots) = \sum_j P(S_j) .$
(A.107)

The familiar features 0 ≤ P (S) ≤ 1, P (∅) = 0, and P (S′) = 1 − P (S), where S′ is the complement of S, are immediate consequences of the axioms. If ω is a discrete (countable) set, then one writes P(x) = P({x}), where {x} is the set consisting of the single element x. If ω is a continuous (uncountable) set, then it is customary to introduce a probability density p(x) so that

$P(S) = \int_S dx \, p(x) ,$
(A.108)

where dx is the natural volume element on ω.

If ω = ℝn, the probability density is a function of n variables: p(x1,x2,…, xn). The marginal distribution of xj is then defined as

$p_j(x_j) = \int dx_1 \cdots dx_{j-1} \, dx_{j+1} \cdots dx_n \, p(x_1, x_2, \ldots, x_n) .$
(A.109)

The joint probability for two sets S and T is P (S ∩T); this is the probability that an event in S is also in T. This is more often expressed with the notation

$P(S, T) = P(S \cap T) ,$
(A.110)

which is used in the text. The conditional probability for S given T is

$P(S | T) = \frac{P(S \cap T)}{P(T)} = \frac{P(S, T)}{P(T)} ;$
(A.111)

this is the probability that x ∈ S, given that x ∈ T.

The compound probability rule is just eqn (A.111) rewritten as

$P(S, T) = P(S | T) P(T) .$
(A.112)

This can be generalized to joint probabilities for more than two outcomes by applying it several times, e.g.

$P(S, T, R) = P(S | T, R) P(T, R)$
$= P(S | T, R) P(T | R) P(R) .$
(A.113)

Dividing both sides by P (R) yields the useful rule

$P(S, T | R) = P(S | T, R) P(T | R) .$
(A.114)

Two sets of events S and T are said to be independent or statistically independent if the joint probability is the product of the individual probabilities:

$P(S, T) = P(S) P(T) .$
(A.115)
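
These rules can be made concrete with a finite event space. The sketch below is a toy example of our own, not one from the text: the events are the 36 equally likely outcomes of throwing two dice, and the sets S, T, R and the helper functions are hypothetical names. Exact fractions are used so that the identities hold without rounding.

```python
from fractions import Fraction
from itertools import product

# Event space: ordered pairs (first die, second die), all equally likely.
omega = set(product(range(1, 7), repeat=2))


def prob(event):
    """P(S) for a subset S of omega under the uniform distribution."""
    return Fraction(len(event & omega), len(omega))


def cond(s, t):
    """Conditional probability P(S | T) = P(S and T) / P(T)."""
    return prob(s & t) / prob(t)


S = {x for x in omega if x[0] % 2 == 0}     # first die shows an even number
T = {x for x in omega if x[1] > 4}          # second die shows 5 or 6
R = {x for x in omega if sum(x) >= 10}      # total is at least 10

# S and T refer to different dice, so they are independent.
independent_ST = prob(S & T) == prob(S) * prob(T)

# S and R are correlated, so the joint probability does not factorize.
independent_SR = prob(S & R) == prob(S) * prob(R)

# Compound probability rule: P(S, R) = P(S | R) P(R).
compound_ok = prob(S & R) == cond(S, R) * prob(R)
```

Here P(S | R) = 2/3 exceeds P(S) = 1/2, since a large total makes a six on the first die more likely.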

A.8.2 Random variables

A random variable X is a function X (x) defined on the event space ω. The function can take on values in ω or in some other set. For example, if ω = ℝ, then X(t) could be a complex number or an integer. The average value of a random variable is

$\langle X \rangle = \int dx \, p(x) X(x) .$
(A.116)

If the function X does take on values in ω, and is one‐one, i.e. X (x1) = X (x2) implies x1 = x2, then the distinction between X (x) and x is often ignored.
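
For a discrete event space the integral in the average is replaced by a sum over events weighted by P(x). As a minimal sketch, with a two‐dice event space and function names that are assumptions made for illustration, the total shown by the dice is a random variable taking integer values, and its average and variance follow directly:

```python
from fractions import Fraction
from itertools import product

# Event space for two dice; each outcome x has probability P(x) = 1/36.
omega = list(product(range(1, 7), repeat=2))
P = {x: Fraction(1, len(omega)) for x in omega}


def average(X):
    """Discrete analogue of the average: sum over x of P(x) X(x)."""
    return sum(P[x] * X(x) for x in omega)


def total(x):
    """Random variable: the sum of the two dice."""
    return x[0] + x[1]


mean_total = average(total)
mean_square = average(lambda x: total(x) ** 2)
variance = mean_square - mean_total ** 2
```

The random variable `total` is not one‐one, since several outcomes share the same total, so in this case the distinction between X(x) and x cannot be ignored.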