Quantum Optics

John Garrison and Raymond Chiao

Print publication date: 2008

Print ISBN-13: 9780198508861

Published to Oxford Scholarship Online: September 2008

DOI: 10.1093/acprof:oso/9780198508861.001.0001


Appendix A

Mathematics

A.1 Vector analysis

Our conventions for elementary vector analysis are as follows. The unit vectors corresponding to the Cartesian coordinates $x$, $y$, $z$ are $\mathbf{u}_x$, $\mathbf{u}_y$, $\mathbf{u}_z$. For a general vector $\mathbf{v}$, we denote the unit vector in the direction of $\mathbf{v}$ by $\tilde{\mathbf{v}} = \mathbf{v}/|\mathbf{v}|$.

The scalar product of two vectors is $\mathbf{a} \cdot \mathbf{b} = a_x b_x + a_y b_y + a_z b_z$, or

$$\mathbf{a} \cdot \mathbf{b} = \sum_{i=1}^{3} a_i b_i,$$
(A.1)

where $(a_1, a_2, a_3) = (a_x, a_y, a_z)$, etc. Since expressions like this occur frequently, we will use the Einstein summation convention: repeated vector indices are to be summed over; that is, the expression $a_i b_i$ is understood to imply the sum in eqn (A.1). The summation convention will only be employed for three-dimensional vector indices. The cross product is

$$(\mathbf{a} \times \mathbf{b})_i = \epsilon_{ijk}\, a_j b_k,$$
(A.2)

where the alternating tensor $\epsilon_{ijk}$ is defined by

$$\epsilon_{ijk} = \begin{cases} 1 & ijk \text{ is an even permutation of } 123,\\ -1 & ijk \text{ is an odd permutation of } 123,\\ 0 & \text{otherwise.} \end{cases}$$
(A.3)
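Since the summation convention is precisely an instruction to contract repeated indices, eqns (A.2) and (A.3) can be checked numerically. A minimal sketch (Python with NumPy; the names eps, a, b are ours, not the book's):

```python
import numpy as np

# Alternating tensor eps[i, j, k], eqn (A.3).
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0   # even permutations of 123
    eps[i, k, j] = -1.0  # odd permutations of 123

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# (a x b)_i = eps_{ijk} a_j b_k, eqn (A.2); einsum sums the repeated j, k.
cross = np.einsum('ijk,j,k->i', eps, a, b)
assert np.allclose(cross, np.cross(a, b))
```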

A.2 General vector spaces

A complex vector space is a set ℌ on which the following two operations are defined.

  (1) Multiplication by scalars. For every pair (α, ψ), where α is a scalar, i.e. a complex number, and ψ ∈ ℌ, there is a unique element of ℌ that is denoted by αψ.

  (2) Vector addition. For every pair ψ, φ of vectors in ℌ there is a unique element of ℌ denoted by ψ + φ.

The two operations satisfy (a) α(βψ) = (αβ) ψ, and (b) α (ψ + φ) = αψ + αφ. It is assumed that there is a special null vector, usually denoted by 0, such that α0 = 0 and ψ + 0 = ψ. If the scalars are restricted to real numbers these conditions define a real vector space.

Ordinary displacement vectors, r, belong to a real vector space denoted by $\mathbb{R}^3$. The set $\mathbb{C}^n$ of n-tuplets $\psi = (\psi_1, \dots, \psi_n)$, where each component $\psi_i$ is a complex number, defines a complex vector space with component-wise operations:

$$\alpha\psi = (\alpha\psi_1, \dots, \alpha\psi_n),$$
$$\psi + \varphi = (\psi_1 + \varphi_1, \dots, \psi_n + \varphi_n).$$
(A.4)

Each vector in ℝ3 or ℂn is specified by a finite number of components, so these spaces are said to be finite dimensional.

The set of complex functions, C (ℝ), of a single real variable defines a vector space with point‐wise operations:

$$(\alpha\psi)(x) = \alpha\,\psi(x),$$
(A.5)
$$(\psi + \varphi)(x) = \psi(x) + \varphi(x),$$
(A.6)

where α is a scalar, and ψ (x) and φ (x) are members of C (ℝ). This space is said to be infinite dimensional, since a general function is not determined by any finite set of values.

For any subset U ⊂ ℌ, the set of all linear combinations of vectors in U is called the span of U, written as span(U). A family B ⊂ ℌ is a basis for ℌ if ℌ = span(B), i.e. every vector in ℌ can be expressed as a linear combination of vectors in B. In this situation ℌ is said to be spanned by B.

A linear operator is a rule that assigns a new vector Mψ to each vector ψ ∈ ℌ, such that

$$M(\alpha\psi + \beta\varphi) = \alpha M\psi + \beta M\varphi$$
(A.7)

for any pair of vectors ψ and φ, and any scalars α and β. The action of a linear operator M on ℌ is completely determined by its action on the vectors of a basis B.

A.3 Hilbert spaces

A.3.1 Definition

An inner product on a vector space ℌ is a rule that assigns a complex number, denoted by (φ, ψ), to every pair of elements φ and ψ ∈ ℌ, with the following properties:

$$(\varphi, \alpha\psi + \beta\chi) = \alpha(\varphi, \psi) + \beta(\varphi, \chi),$$
(A.8a)
$$(\varphi, \psi) = (\psi, \varphi)^*,$$
(A.8b)
$$0 \leq (\varphi, \varphi) < \infty,$$
(A.8c)
$$(\varphi, \varphi) = 0 \text{ if and only if } \varphi = 0.$$
(A.8d)

An inner product space is a vector space equipped with an inner product. The inner product satisfies the Cauchy-Schwarz inequality:

$$|(\varphi, \psi)|^2 \leq (\varphi, \varphi)\, (\psi, \psi).$$
(A.9)

Two vectors are orthogonal if (φ, ψ) = 0. If 𝕱 is a subspace of ℌ, then the orthogonal complement of 𝕱 is the subspace $\mathfrak{F}^{\perp}$ of vectors orthogonal to every vector in 𝕱. The norm ‖ψ‖ of ψ is defined as $\|\psi\| = \sqrt{(\psi, \psi)}$, so that ‖ψ‖ = 0 implies ψ = 0. Vectors with ‖ψ‖ = 1 are said to be normalized. A set of vectors is complete if the only vector orthogonal to every vector in the set is the null vector. Each complete set contains a basis for the space. A vector space with a countable basis set, $B = \{\varphi^{(1)}, \varphi^{(2)}, \dots\}$, is said to be separable. The vector spaces relevant to quantum theory are all separable. A basis for which $(\varphi^{(n)}, \varphi^{(m)}) = \delta_{nm}$ holds is called orthonormal. Every vector in ℌ can be uniquely expanded in an orthonormal basis, e.g.

$$\psi = \sum_{n=1}^{\infty} \psi_n\, \varphi^{(n)},$$
(A.10)

where the expansion coefficients are $\psi_n = (\varphi^{(n)}, \psi)$.

A sequence $\psi_1, \psi_2, \dots, \psi_k, \dots$ of vectors in ℌ is convergent if

$$\|\psi_k - \psi_j\| \to 0 \text{ as } k, j \to \infty.$$
(A.11)

A vector ψ is a limit of the sequence if

$$\|\psi_k - \psi\| \to 0 \text{ as } k \to \infty.$$
(A.12)

A Hilbert space is an inner product space that contains the limits of all convergent sequences.

A.3.2 Examples

The finite-dimensional spaces $\mathbb{R}^3$ and $\mathbb{C}^N$ are both Hilbert spaces. The inner product for $\mathbb{R}^3$ is the familiar dot product, and for $\mathbb{C}^N$ it is

$$(\psi, \varphi) = \sum_{n=1}^{N} \psi_n^*\, \varphi_n.$$
(A.13)
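The inner product (A.13) and the Cauchy-Schwarz inequality (A.9) are easy to spot-check numerically. A minimal sketch (Python/NumPy; the helper inner and the random vectors are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5
psi = rng.normal(size=N) + 1j * rng.normal(size=N)
phi = rng.normal(size=N) + 1j * rng.normal(size=N)

def inner(a, b):
    # (a, b) = sum_n a_n^* b_n, eqn (A.13); antilinear in the first slot.
    return np.sum(np.conj(a) * b)

# Hermitian symmetry, eqn (A.8b).
assert np.isclose(inner(phi, psi), np.conj(inner(psi, phi)))
# Cauchy-Schwarz inequality, eqn (A.9).
assert abs(inner(phi, psi))**2 <= inner(phi, phi).real * inner(psi, psi).real
```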

If we constrain the complex functions ψ (x) by the normalizability condition

$$\int_{-\infty}^{\infty} dx\, |\psi(x)|^2 < \infty,$$
(A.14)

then the Cauchy-Schwarz inequality for integrals,

$$\left| \int_{-\infty}^{\infty} dx\, \psi^*(x)\, \varphi(x) \right|^2 \leq \int_{-\infty}^{\infty} dx\, |\psi(x)|^2 \int_{-\infty}^{\infty} dx\, |\varphi(x)|^2,$$
(A.15)

is sufficient to guarantee that the inner product defined by

$$(\psi, \varphi) = \int_{-\infty}^{\infty} dx\, \psi^*(x)\, \varphi(x)$$
(A.16)

makes the vector space of complex functions into a Hilbert space, which is called $L^2(\mathbb{R})$.

A.3.3 Linear operators

Let A be a linear operator acting on ℌ; then the domain of A, called D(A), is the subspace of vectors ψ ∈ ℌ such that ‖Aψ‖ < ∞. An operator A is positive definite if (ψ, Aψ) ≥ 0 for all ψ ∈ D(A), and it is bounded if ‖Aψ‖ ≤ b‖ψ‖, where b is a constant independent of ψ. The norm of an operator is defined by

$$\|A\| = \max \frac{\|A\psi\|}{\|\psi\|} \quad \text{for } \psi \neq 0,$$
(A.17)

so a bounded operator is one with finite norm.

If Aψ = λψ, where λ is a complex number and ψ is a vector in the Hilbert space, then λ is an eigenvalue and ψ is an eigenvector of A. In this case λ is said to belong to the point spectrum of A. The eigenvalue λ is nondegenerate if the eigenvector ψ is unique (up to a multiplicative factor). If ψ is not unique, then λ is degenerate. The linearly-independent solutions of Aψ = λψ form a subspace called the eigenspace for λ, and the dimension of the eigenspace is the degree of degeneracy for λ. The continuous spectrum of A is the set of complex numbers λ such that: (1) λ is not an eigenvalue, and (2) the operator λ − A does not have a bounded inverse.

The adjoint (hermitian conjugate) $A^\dagger$ of A is defined by

$$(\psi, A^\dagger \varphi) = (\varphi, A\psi)^*,$$
(A.18)

and A is self-adjoint (hermitian) if $D(A^\dagger) = D(A)$ and (φ, Aψ) = (Aφ, ψ). Bounded self-adjoint operators have real eigenvalues and a complete orthonormal set of eigenvectors. For unbounded self-adjoint operators, the point and continuous spectra are subsets of the real numbers. Note that $(\psi, A^\dagger A \psi) = (\varphi, \varphi)$, where φ = Aψ, so that

$$(\psi, A^\dagger A \psi) \geq 0,$$
(A.19)

i.e. $A^\dagger A$ is positive definite.

A self‐adjoint operator, P, satisfying

$$P^2 = P$$
(A.20)

is called a projection operator; it has only a point spectrum consisting of {0, 1}. Consider the set of vectors Pℌ, consisting of all vectors of the form Pψ as ψ ranges over ℌ. This is a subspace of ℌ, since

$$\alpha P\varphi + \beta P\chi = P(\alpha\varphi + \beta\chi)$$
(A.21)

shows that every linear combination of vectors in Pℌ is also in Pℌ. Conversely, let 𝕾 be a subspace of ℌ and $\{\varphi^{(n)}\}$ an orthonormal basis for 𝕾. The operator P, defined by

$$P\psi = \sum_n (\varphi^{(n)}, \psi)\, \varphi^{(n)},$$
(A.22)

is a projection operator, since

$$P^2\psi = \sum_n (\varphi^{(n)}, \psi)\, P\varphi^{(n)} = \sum_n (\varphi^{(n)}, \psi)\, \varphi^{(n)} = P\psi.$$
(A.23)

Thus there is a one-to-one correspondence between projection operators and subspaces of ℌ. Let P and Q be projection operators and suppose that the vectors in Pℌ are orthogonal to the vectors in Qℌ; then PQ = QP = 0 and P and Q are said to be orthogonal projections. In the extreme case 𝕾 = ℌ, the expansion (A.10) shows that P is the identity operator, Pψ = ψ.
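In finite dimensions, eqns (A.20)-(A.23) can be illustrated directly: build P from an orthonormal basis of a subspace and verify that it is a self-adjoint idempotent. A sketch (Python/NumPy; the dimensions and the random basis are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
dim, m = 6, 3

# Random orthonormal basis phi^(1..m) for a subspace (QR of a random matrix).
basis, _ = np.linalg.qr(rng.normal(size=(dim, m)) + 1j * rng.normal(size=(dim, m)))

# P psi = sum_n (phi^(n), psi) phi^(n), eqn (A.22); as a matrix, P = sum |phi><phi|.
P = basis @ basis.conj().T

assert np.allclose(P @ P, P)        # P^2 = P, eqn (A.20)
assert np.allclose(P, P.conj().T)   # P is self-adjoint
```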

A self-adjoint operator with pure point spectrum $\{\lambda_1, \lambda_2, \dots\}$ has the spectral resolution

$$A = \sum_n \lambda_n P_n,$$
(A.24)

where Pn is the projection operator onto the subspace of eigenvectors with eigenvalue λn. The spectral resolution for a self‐adjoint operator A with a continuous spectrum is

$$A = \int \lambda\, d\mu(\lambda),$$
(A.25)

where dμ(λ) is an operator-valued measure defined by the following statement: for each subset Δ of the real line,

$$P(\Delta) = \int_\Delta d\mu(\lambda)$$
(A.26)

is the projection operator onto the subspace of vectors ψ such that $\|(\lambda - A)^{-1}\psi\| < \infty$ for all λ ∉ Δ (Riesz and Sz.-Nagy, 1955, Chap. VIII, Sec. 120).

A linear operator U is unitary if it preserves inner products, i.e.

$$(U\psi, U\varphi) = (\psi, \varphi)$$
(A.27)

for any pair of vectors ψ, φ in the Hilbert space. A necessary and sufficient condition for unitarity is that the operator is norm preserving, i.e.

$$(U\psi, U\psi) = (\psi, \psi) \text{ for all } \psi \text{ if and only if } U \text{ is unitary.}$$
(A.28)

The spectral resolution for a unitary operator with a pure point spectrum is

$$U = \sum_n e^{i\theta_n} P_n, \quad \theta_n \text{ real},$$
(A.29)

and for a continuous spectrum

$$U = \int e^{i\theta}\, d\mu(\theta), \quad \theta \text{ real}.$$
(A.30)

A linear operator N is said to be a normal operator if

$$[N, N^\dagger] = 0.$$
(A.31)

The hermitian and unitary operators are both normal. The hermitian operators $N_1 = (N + N^\dagger)/2$ and $N_2 = (N - N^\dagger)/2i$ satisfy $N = N_1 + iN_2$ and $[N_1, N_2] = 0$. Normal operators therefore have the spectral resolutions

$$N = \sum_n (x_n P_{1n} + i y_n P_{2n}), \quad [P_{1n}, P_{2m}] = 0,$$
(A.32)

for a point spectrum, and

$$N = \int x\, d\mu_1(x) + i \int y\, d\mu_2(y), \quad \left[ \int_{\Delta_1} d\mu_1(x), \int_{\Delta_2} d\mu_2(y) \right] = 0,$$
(A.33)

for a continuous spectrum.

A.3.4 Matrices

A linear operator X acting on an N-dimensional Hilbert space, with basis $\{f^{(1)}, \dots, f^{(N)}\}$, is represented by the N × N matrix

$$X_{mn} = (f^{(m)}, X f^{(n)}).$$
(A.34)

The operator and its matrix are both called X. The matrix for the product XY of two operators is the matrix product

$$(XY)_{mn} = \sum_{k=1}^{N} X_{mk} Y_{kn}.$$
(A.35)

The determinant of X is defined as

$$\det(X) = \sum_{n_1 \cdots n_N} \epsilon_{n_1 \cdots n_N}\, X_{1 n_1} \cdots X_{N n_N},$$
(A.36)

where the generalized alternating tensor is

$$\epsilon_{n_1 \cdots n_N} = \begin{cases} 1 & n_1 \cdots n_N \text{ is an even permutation of } 12 \cdots N,\\ -1 & n_1 \cdots n_N \text{ is an odd permutation of } 12 \cdots N,\\ 0 & \text{otherwise.} \end{cases}$$
(A.37)

The trace of X is

$$\operatorname{Tr} X = \sum_{n=1}^{N} X_{nn}.$$
(A.38)

The transpose matrix $X^T$ is defined by $X^T_{nm} = X_{mn}$. The adjoint matrix $X^\dagger$ is the complex conjugate of the transpose: $X^\dagger_{nm} = X^*_{mn}$. A matrix X is symmetric if $X = X^T$, self-adjoint or hermitian if $X = X^\dagger$, and unitary if $X^\dagger X = X X^\dagger = I$, where I is the N × N identity matrix. Unitary transformations preserve the inner product. The hermitian and unitary matrices both belong to the larger class of normal matrices defined by $X^\dagger X = X X^\dagger$.

A matrix X is positive definite if all of its eigenvalues are real and non-negative. This immediately implies that the determinant and trace of the matrix are both non-negative. An equivalent definition is that X is positive definite if

$$\varphi^\dagger X \varphi \geq 0$$
(A.39)

for all vectors φ. For a positive-definite matrix X, there is a matrix Y such that $X = Y Y^\dagger$.
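The converse is also easy to check numerically: any matrix of the form $YY^\dagger$ satisfies eqn (A.39). A sketch (Python/NumPy; the random Y is illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 4
Y = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))

# X = Y Y^dagger is positive definite in the sense of eqn (A.39).
X = Y @ Y.conj().T

eigvals = np.linalg.eigvalsh(X)          # X is hermitian, so eigvalsh applies
assert np.all(eigvals >= -1e-12)         # real, non-negative spectrum
phi = rng.normal(size=N) + 1j * rng.normal(size=N)
assert (phi.conj() @ X @ phi).real >= 0  # phi^dagger X phi >= 0
```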

The normal matrices have the following important properties (Mac Lane and Birkhoff, 1967, Sec. XI-10).

Theorem A.1 (i) If f is an eigenvector of the normal matrix Z with eigenvalue z, then f is an eigenvector of $Z^\dagger$ with eigenvalue $z^*$, i.e. $Zf = zf \implies Z^\dagger f = z^* f$.

(ii) Every normal matrix has a complete, orthonormal set of eigenvectors.

Thus hermitian matrices have real eigenvalues and unitary matrices have eigenvalues of modulus 1.
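These consequences of Theorem A.1 can be illustrated numerically (a Python/NumPy sketch; the random matrices are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 5
A = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))

H = (A + A.conj().T) / 2   # hermitian, hence normal
U, _ = np.linalg.qr(A)     # unitary, hence normal

# Hermitian matrices have real eigenvalues ...
assert np.allclose(np.linalg.eigvals(H).imag, 0, atol=1e-10)
# ... and unitary matrices have eigenvalues of modulus 1.
assert np.allclose(np.abs(np.linalg.eigvals(U)), 1)
```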

A.4 Fourier transforms

A.4.1 Continuous transforms

In the mathematical literature it is conventional to denote the Fourier (integral) transform of a function f(x) of a single, real variable by

$$\tilde{f}(k) = \int_{-\infty}^{\infty} dx\, f(x)\, e^{-ikx},$$
(A.40)

so that the inverse Fourier transform is

$$f(x) = \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, \tilde{f}(k)\, e^{ikx}.$$
(A.41)

The virtue of this notation is that it reminds us that the two functions are, generally, drastically different, e.g. if f(x) = 1, then $\tilde{f}(k) = 2\pi\delta(k)$.

On the other hand, the ∼ is a typographical nuisance in any discussion involving many uses of the Fourier transform. For this reason, we will sacrifice precision for convenience. In our convention, the Fourier transform is indicated by the same letter, and the distinction between the functions is maintained by paying attention to the arguments.

The Fourier transform pair is accordingly written as

$$f(k) = \int_{-\infty}^{\infty} dx\, f(x)\, e^{-ikx},$$
(A.42)
$$f(x) = \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, f(k)\, e^{ikx}.$$
(A.43)

This is analogous to the familiar idea that the meaning of a vector V is independent of the coordinate system used, despite the fact that the components $(V_x, V_y, V_z)$ of V are changed by transforming to a new coordinate system. From this point of view, the functions f(x) and f(k) are simply different representations of the same physical quantity. Confusion is readily avoided by paying attention to the physical significance of the arguments, e.g. x denotes a point in position space, while k denotes a point in the reciprocal space or k-space.

If the position‐space function f(x) is real, then the Fourier transform satisfies

$$f^*(k) = [f(k)]^* = f(-k).$$
(A.44)

When the position variable x is replaced by the time t, it is customary in physics to use the opposite sign convention:

$$f(\omega) = \int_{-\infty}^{\infty} dt\, f(t)\, e^{i\omega t},$$
(A.45)
$$f(t) = \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, f(\omega)\, e^{-i\omega t}.$$
(A.46)

Fourier transforms of functions of several variables, typically f(r), are defined similarly:

$$f(\mathbf{k}) = \int d^3r\, f(\mathbf{r})\, e^{-i\mathbf{k} \cdot \mathbf{r}},$$
(A.47)
$$f(\mathbf{r}) = \int \frac{d^3k}{(2\pi)^3}\, f(\mathbf{k})\, e^{i\mathbf{k} \cdot \mathbf{r}},$$
(A.48)

where the integrals are over position space and reciprocal space (k‐space) respectively. If f(r) is real then

$$f^*(\mathbf{k}) = f(-\mathbf{k}).$$
(A.49)

Combining these conventions for a space‐time function f(r, t) yields the transform pair

$$f(\mathbf{k}, \omega) = \int d^3r \int_{-\infty}^{\infty} dt\, f(\mathbf{r}, t)\, e^{-i(\mathbf{k} \cdot \mathbf{r} - \omega t)},$$
(A.50)
$$f(\mathbf{r}, t) = \int \frac{d^3k}{(2\pi)^3} \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, f(\mathbf{k}, \omega)\, e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)}.$$
(A.51)

The last result is simply the plane‐wave expansion of f(r, t). If f(r, t) is real, then the Fourier transform satisfies

$$f^*(\mathbf{k}, \omega) = f(-\mathbf{k}, -\omega).$$
(A.52)

Two related and important results on Fourier transforms—which we quote for the one‐ and three‐dimensional cases—are Parseval's theorem:

$$\int_{-\infty}^{\infty} dt\, f^*(t)\, g(t) = \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, f^*(\omega)\, g(\omega),$$
(A.53)
$$\int d^3r\, f^*(\mathbf{r})\, g(\mathbf{r}) = \int \frac{d^3k}{(2\pi)^3}\, f^*(\mathbf{k})\, g(\mathbf{k}),$$
(A.54)

and the convolution theorem:

$$h(t) = \int_{-\infty}^{\infty} dt'\, f(t - t')\, g(t') \text{ if and only if } h(\omega) = f(\omega)\, g(\omega),$$
(A.55)
$$h(\omega) = \int_{-\infty}^{\infty} \frac{d\omega'}{2\pi}\, f(\omega - \omega')\, g(\omega') \text{ if and only if } h(t) = f(t)\, g(t),$$
(A.56)
$$h(\mathbf{r}) = \int d^3r'\, f(\mathbf{r} - \mathbf{r}')\, g(\mathbf{r}') \text{ if and only if } h(\mathbf{k}) = f(\mathbf{k})\, g(\mathbf{k}),$$
(A.57)
$$h(\mathbf{k}) = \int \frac{d^3k'}{(2\pi)^3}\, f(\mathbf{k} - \mathbf{k}')\, g(\mathbf{k}') \text{ if and only if } h(\mathbf{r}) = f(\mathbf{r})\, g(\mathbf{r}).$$
(A.58)

These results are readily derived by using the delta function identities (A.95) and (A.96).
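Parseval's theorem can also be verified by brute-force quadrature, staying close to the sign conventions (A.45) and (A.53). A sketch (Python/NumPy; the grids and the Gaussian test functions are arbitrary choices):

```python
import numpy as np

# Direct quadrature check of Parseval's theorem, eqns (A.45) and (A.53).
t = np.linspace(-20, 20, 2001)
dt = t[1] - t[0]
f = np.exp(-t**2)               # test functions (arbitrary Gaussians)
g = np.exp(-(t - 1)**2 / 2)

omega = np.linspace(-8, 8, 801)
domega = omega[1] - omega[0]
kernel = np.exp(1j * np.outer(omega, t))   # e^{+i omega t}, eqn (A.45)
f_w = kernel @ f * dt
g_w = kernel @ g * dt

lhs = np.sum(f * g) * dt                   # int dt f*(t) g(t), f real here
rhs = np.sum(np.conj(f_w) * g_w).real * domega / (2 * np.pi)
assert np.isclose(lhs, rhs, rtol=1e-6)
```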

A.4.2 Fourier series

It is often useful to simplify the mathematics of the one-dimensional continuous transform by considering the functions to be defined on a finite interval (−L/2, L/2) and imposing periodic boundary conditions. The basis vectors are still of the form $u_k(x) = C e^{ikx}$, but the periodicity condition, $u_k(-L/2) = u_k(L/2)$, restricts k to the discrete values

$$k = \frac{2\pi n}{L} \quad (n = 0, \pm 1, \pm 2, \dots).$$
(A.59)

Normalization requires $C = 1/\sqrt{L}$, so the transform is

$$f_k = \frac{1}{\sqrt{L}} \int_{-L/2}^{L/2} dx\, f(x)\, e^{-ikx},$$
(A.60)

and the inverse transform f(x) is

$$f(x) = \frac{1}{\sqrt{L}} \sum_k f_k\, e^{ikx}.$$
(A.61)

The continuous transform is recovered in the limit L → ∞ by first using eqn (A.60) to conclude that

$$\sqrt{L}\, f_k \to f(k) \text{ as } L \to \infty,$$
(A.62)

and writing the inverse transform as

$$f(x) = \frac{1}{L} \sum_k \sqrt{L}\, f_k\, e^{ikx}.$$
(A.63)

The difference between neighboring k-values is Δk = 2π/L, so this equation can be recast as

$$f(x) = \sum_k \frac{\Delta k}{2\pi} \sqrt{L}\, f_k\, e^{ikx} \to \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, f(k)\, e^{ikx}.$$
(A.64)
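Relation (A.62) between the series coefficients and the integral transform can be checked for a Gaussian, whose transform $f(k) = \sqrt{\pi}\, e^{-k^2/4}$ is known exactly. A sketch (Python/NumPy; L, the mode number n, and the grid are arbitrary):

```python
import numpy as np

# Check eqn (A.62): sqrt(L) f_k -> f(k) for f(x) = exp(-x^2).
L = 40.0
x = np.linspace(-L / 2, L / 2, 4001)
dx = x[1] - x[0]
f = np.exp(-x**2)

n = 10
k = 2 * np.pi * n / L                                     # allowed k, eqn (A.59)
f_k = np.sum(f * np.exp(-1j * k * x)) * dx / np.sqrt(L)   # eqn (A.60)

f_cont = np.sqrt(np.pi) * np.exp(-k**2 / 4)               # exact transform
assert np.isclose((np.sqrt(L) * f_k).real, f_cont, rtol=1e-6)
assert abs(f_k.imag) < 1e-12
```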

In Cartesian coordinates the three-dimensional discrete transform is defined on a rectangular parallelepiped with dimensions $L_x$, $L_y$, $L_z$. The one-dimensional results then imply

$$f_{\mathbf{k}} = \frac{1}{\sqrt{V}} \int_V d^3r\, f(\mathbf{r})\, e^{-i\mathbf{k} \cdot \mathbf{r}},$$
(A.65)

where the k‐vector is restricted to

$$\mathbf{k} = \frac{2\pi n_x}{L_x}\mathbf{u}_x + \frac{2\pi n_y}{L_y}\mathbf{u}_y + \frac{2\pi n_z}{L_z}\mathbf{u}_z,$$
(A.66)

and $V = L_x L_y L_z$. The inverse transform is

$$f(\mathbf{r}) = \frac{1}{\sqrt{V}} \sum_{\mathbf{k}} f_{\mathbf{k}}\, e^{i\mathbf{k} \cdot \mathbf{r}},$$
(A.67)

and the integral transform is recovered by

$$\sqrt{V}\, f_{\mathbf{k}} \to f(\mathbf{k}) \text{ as } V \to \infty.$$
(A.68)

The sum and integral over k are related by

$$\frac{1}{V} \sum_{\mathbf{k}} \to \int \frac{d^3k}{(2\pi)^3},$$
(A.69)

which in turn implies

$$V \delta_{\mathbf{k}, \mathbf{k}'} \to (2\pi)^3\, \delta(\mathbf{k} - \mathbf{k}').$$
(A.70)

A.5 Laplace transforms

Another useful idea—which is closely related to the one‐dimensional Fourier transform—is the Laplace transform defined by

$$\tilde{f}(\zeta) = \int_0^\infty dt\, e^{-\zeta t}\, f(t).$$
(A.71)

In this case, we will use the standard mathematical notation $\tilde{f}(\zeta)$, since we do not use Laplace transforms as frequently as Fourier transforms. The inverse transform is

$$f(t) = \int_{\zeta_0 - i\infty}^{\zeta_0 + i\infty} \frac{d\zeta}{2\pi i}\, e^{\zeta t}\, \tilde{f}(\zeta).$$
(A.72)

The line $(\zeta_0 - i\infty, \zeta_0 + i\infty)$ in the complex ζ-plane must lie to the right of any poles of the transform function $\tilde{f}(\zeta)$.

The identity

$$\widetilde{\left( \frac{df}{dt} \right)}(\zeta) = \zeta \tilde{f}(\zeta) - f(0)$$
(A.73)

is useful in treating initial value problems for sets of linear, differential equations. Thus to solve the equations

$$\frac{df_n}{dt} = \sum_m V_{nm}\, f_m,$$
(A.74)

with a constant matrix V, and initial data $f_n(0)$, one takes the Laplace transform to get

$$\zeta \tilde{f}_n(\zeta) - \sum_m V_{nm}\, \tilde{f}_m(\zeta) = f_n(0).$$
(A.75)

This set of algebraic equations can be solved to express $\tilde{f}_n(\zeta)$ in terms of $f_n(0)$. Inverting the Laplace transform yields the solution in the time domain.
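To make eqns (A.74) and (A.75) concrete, the following sketch (Python/NumPy; the 2 × 2 matrix V, the initial data, and the test value of ζ are arbitrary) compares the algebraic solution for $\tilde{f}(\zeta)$ with a direct quadrature of the defining integral (A.71):

```python
import numpy as np

# df/dt = V f, f(0) = f0; by eqn (A.75), f~(zeta) = (zeta I - V)^{-1} f0.
V = np.array([[-1.0, 0.5], [0.2, -2.0]])
f0 = np.array([1.0, -1.0])
zeta = 0.7                       # to the right of both eigenvalues of V

algebraic = np.linalg.solve(zeta * np.eye(2) - V, f0)

# Time-domain solution via the eigendecomposition of V: f(t) = S e^{wt} S^{-1} f0.
w, S = np.linalg.eig(V)
c = np.linalg.solve(S, f0)
t = np.linspace(0, 40, 40001)
ft = (np.exp(np.outer(t, w)) * c) @ S.T

# Forward transform int_0^infty dt e^{-zeta t} f(t), eqn (A.71), by quadrature.
numeric = np.sum(np.exp(-zeta * t)[:, None] * ft, axis=0) * (t[1] - t[0])
assert np.allclose(algebraic, numeric, atol=1e-3)
```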

The convolution theorem for Laplace transforms is

$$\int_0^t dt'\, g(t - t')\, f(t') = \int_{\zeta_0 - i\infty}^{\zeta_0 + i\infty} \frac{d\zeta}{2\pi i}\, \tilde{g}(\zeta)\, \tilde{f}(\zeta)\, e^{\zeta t},$$
(A.76)

where the integration contour is to the right of any poles of both $\tilde{g}(\zeta)$ and $\tilde{f}(\zeta)$.

An important point for applications to physics is that poles in the Laplace transform correspond to exponential time dependence. For example, the function f(t) = exp(zt) has the transform

$$\tilde{f}(\zeta) = \frac{1}{\zeta - z}.$$
(A.77)

More generally, consider a function $\tilde{f}(\zeta)$ with N simple poles in ζ:

$$\tilde{f}(\zeta) = \frac{1}{(\zeta - z_1) \cdots (\zeta - z_N)},$$
(A.78)

where the complex numbers $z_1, \dots, z_N$ are all distinct. The inverse transform is

$$f(t) = \int_{\zeta_0 - i\infty}^{\zeta_0 + i\infty} \frac{d\zeta}{2\pi i}\, \frac{e^{\zeta t}}{(\zeta - z_1) \cdots (\zeta - z_N)},$$
(A.79)

where $\zeta_0 > \max[\operatorname{Re} z_1, \dots, \operatorname{Re} z_N]$. The contour can be closed by a large semicircle in the left half plane, and for N > 1 the contribution from the semicircle can be neglected. The integral is therefore given by the sum of the residues,

$$f(t) = \sum_{n=1}^{N} e^{z_n t} \prod_{j \neq n} \frac{1}{z_n - z_j},$$
(A.80)

which explicitly exhibits f(t) as a sum of exponentials.
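The residue formula (A.80) can be tested by building f(t) from the residue sum and checking that its forward transform reproduces eqn (A.78). A sketch (Python/NumPy; the pole positions and the test point ζ are arbitrary):

```python
import numpy as np

# Check eqn (A.80) for N = 3 distinct simple poles.
z = np.array([-1.0, -2.0 + 1.0j, -2.0 - 1.0j])

def f_time(t):
    # f(t) = sum_n e^{z_n t} prod_{j != n} 1/(z_n - z_j), eqn (A.80)
    total = 0.0
    for n in range(len(z)):
        others = np.delete(z, n)
        total = total + np.exp(z[n] * t) / np.prod(z[n] - others)
    return total

zeta = 0.5
t = np.linspace(0, 60, 60001)
numeric = np.sum(np.exp(-zeta * t) * f_time(t)) * (t[1] - t[0])  # eqn (A.71)
exact = 1.0 / np.prod(zeta - z)                                  # eqn (A.78)
assert np.isclose(numeric, exact, atol=1e-4)
```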

A.6 Functional analysis

A.6.1 Linear functionals

In normal usage, a function, e.g. f(x), is a rule assigning a unique value to each value of its argument. The argument is typically a point in some finite‐dimensional space, e.g. the real numbers ℝ, the complex numbers ℂ, three‐dimensional space ℝ3, etc. The values of the function are also points in a finite‐dimensional space. For example, the classical electric field is represented by a function ε (r) that assigns a vector—a point in ℝ3—to each position r in ℝ3.

A rule, X, assigning a value to each point f in an infinite‐dimensional space 𝔐 (which is usually a space of functions) is called a functional and written as X [f]. The square brackets surrounding the argument are intended to distinguish functionals from functions of a finite number of variables.

If 𝔐 is a vector space, e.g. a Hilbert space, then a functional Y [f] that obeys

$$Y[\alpha f + \beta g] = \alpha Y[f] + \beta Y[g]$$
(A.81)

for all scalars α, β and all functions f, g ∈ 𝔐, is called a linear functional. The family, 𝔐′, of linear functionals on 𝔐 is called the dual space of 𝔐. The dual space is also a vector space, with linear combinations of its elements defined by

$$(\alpha X + \beta Y)[f] = \alpha X[f] + \beta Y[f]$$
(A.82)

for all f ∈ 𝔐.

A.6.2 Generalized functions

In Section 3.1.2 the definition (3.18) and the rule (3.21) are presented with the cavalier disregard for mathematical niceties that is customary in physics. There are, however, some situations in which more care is required. For these contingencies we briefly outline a more respectable treatment. The chief difficulty is the existence of the integrals defining the operators $s(-\nabla^2)$. This problem can be overcome by restricting the functions φ(r) in eqn (3.18) to good functions (Lighthill, 1964, Chap. 2), i.e. infinitely-differentiable functions that fall off faster than any power of |r|. The Fourier transform of a good function is also a good function, so all of the relevant integrals exist, as long as s(|k|) does not grow exponentially at large |k|. The examples we need are all of the form $|k|^\alpha$, where −1 ≤ α ≤ 1, so eqns (3.18) and (3.21) are justified. For physical applications the really important assumption is that all functions can be approximated by good functions.

A generalized function is a linear functional, say G[ϕ], defined on the good functions, i.e.

$$G[\varphi] \text{ is a complex number, and } G[\alpha\varphi + \beta\psi] = \alpha G[\varphi] + \beta G[\psi]$$
(A.83)

for any scalars α, β and any good functions ϕ, ψ. A familiar example is the delta function. The rule

$$\int d^3r\, \delta(\mathbf{r} - \mathbf{R})\, \varphi(\mathbf{r}) = \varphi(\mathbf{R})$$
(A.84)

maps the function φ(r) into the single number φ(R). In this language, the transverse delta function $\Delta^{\perp}_{ij}(\mathbf{r} - \mathbf{r}')$ is also a generalized function. An alternative terminology, often found in the mathematical literature, labels good functions as test functions and generalized functions as distributions.

In quantum field theory, the notion of a generalized function is extended to linear functionals sending good functions to operators, i.e. for each good function ϕ,

$$X[\varphi] \text{ is an operator, and } X[\alpha\varphi + \beta\psi] = \alpha X[\varphi] + \beta X[\psi].$$
(A.85)

Such functionals are called operator-valued generalized functions. For any density operator ρ describing a physical state, X[φ] defines an ordinary (c-number) generalized function $X_\rho[\varphi]$ by

$$X_\rho[\varphi] = \operatorname{Tr}(\rho\, X[\varphi]).$$
(A.86)

A.7 Improper functions

A.7.1 The Heaviside step function

The step function θ(x) is defined by

$$\theta(x) = \begin{cases} 1 & \text{for } x > 0,\\ 0 & \text{for } x < 0, \end{cases}$$
(A.87)

and it has the useful representation

$$\theta(x) = \lim_{\epsilon \to 0^+} \int_{-\infty}^{\infty} \frac{ds}{2\pi i}\, \frac{e^{isx}}{s - i\epsilon},$$
(A.88)

which is proved using contour integration.

A.7.2 The Dirac delta function

A Standard properties

(1) If the function f(x) has isolated, simple zeros at the points $x_1, x_2, \dots$, then

$$\delta(f(x)) = \sum_i \frac{1}{\left| \left( \frac{df}{dx} \right)_{x = x_i} \right|}\, \delta(x - x_i).$$
(A.89)

The multidimensional generalization of this rule is

$$\delta(\mathbf{f}(\mathbf{x})) = \sum_i \frac{1}{\left| \det\left( \partial \mathbf{f} / \partial \mathbf{x} \right)_{\mathbf{x} = \mathbf{x}^{(i)}} \right|}\, \delta(\mathbf{x} - \mathbf{x}^{(i)}),$$
(A.90)

where $\mathbf{x} = (x_1, x_2, \dots, x_N)$, $\mathbf{f}(\mathbf{x}) = (f_1(\mathbf{x}), f_2(\mathbf{x}), \dots, f_N(\mathbf{x}))$,

$$\delta(\mathbf{f}(\mathbf{x})) = \delta(f_1(\mathbf{x})) \cdots \delta(f_N(\mathbf{x})),$$
$$\delta(\mathbf{x} - \mathbf{x}^{(i)}) = \delta(x_1 - x_1^{(i)}) \cdots \delta(x_N - x_N^{(i)}),$$
(A.91)

the Jacobian $\partial \mathbf{f} / \partial \mathbf{x}$ is the N × N matrix with components $\partial f_n / \partial x_m$, and $\mathbf{x}^{(i)}$ satisfies $f_n(\mathbf{x}^{(i)}) = 0$, for n = 1, …, N. (A numerical check of the one-dimensional rule (A.89) is sketched at the end of this list.)

(2) The derivative of the delta function is defined by

$$\int_{-\infty}^{\infty} dx\, f(x)\, \frac{d}{dx}\delta(x - a) = -\left( \frac{df}{dx} \right)_{x = a}.$$
(A.92)

(3) By using contour integration methods one gets

$$\lim_{\epsilon \to 0^+} \frac{1}{x + i\epsilon} = P\frac{1}{x} - i\pi\,\delta(x),$$
(A.93)

where P is the principal part defined by

$$P \int_{-\infty}^{\infty} dx\, \frac{f(x)}{x} = \lim_{a \to 0^+} \left\{ \int_{-\infty}^{-a} dx\, \frac{f(x)}{x} + \int_a^{\infty} dx\, \frac{f(x)}{x} \right\}.$$
(A.94)

(4) The definition of the Fourier transform yields

$$\int_{-\infty}^{\infty} dt\, e^{i(\omega - \nu)t} = 2\pi\,\delta(\omega - \nu)$$
(A.95)

in one dimension, and

$$\int d^3r\, e^{i(\mathbf{k} - \mathbf{q}) \cdot \mathbf{r}} = (2\pi)^3\, \delta(\mathbf{k} - \mathbf{q})$$
(A.96)

in three dimensions.

(5) The step function satisfies

$$\frac{d}{dx}\theta(x) = \delta(x).$$
(A.97)

(6) The end‐point rule is

$$\int_a^{\infty} dx\, \delta(x - a)\, f(x) = \frac{1}{2} f(a).$$
(A.98)

(7) The three‐dimensional delta function δ (r − r′) is defined as

$$\delta(\mathbf{r} - \mathbf{r}') = \delta(x - x')\, \delta(y - y')\, \delta(z - z'),$$
(A.99)

and is expressed in polar coordinates by

$$\delta(\mathbf{r} - \mathbf{r}') = \frac{1}{r^2}\, \delta(r - r')\, \delta(\cos\theta - \cos\theta')\, \delta(\varphi - \varphi').$$
(A.100)
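Here is the numerical check of property (A.89) promised above, with the delta function replaced by a narrow Gaussian (a Python/NumPy sketch; the test functions and the width sigma are arbitrary choices):

```python
import numpy as np

# Check eqn (A.89) with f(x) = x^2 - 1: simple zeros at +/-1, |f'| = 2 there,
# so int g(x) delta(f(x)) dx should equal g(1)/2 + g(-1)/2.
x = np.linspace(-5, 5, 200001)
dx = x[1] - x[0]
g = x + 2.0
f = x**2 - 1.0

sigma = 1e-3  # narrow Gaussian standing in for the delta function
delta_f = np.exp(-f**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

numeric = np.sum(g * delta_f) * dx
exact = (1.0 + 2.0) / 2.0 + (-1.0 + 2.0) / 2.0   # = 2.0
assert np.isclose(numeric, exact, rtol=1e-3)
```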

B A special representation of the delta function

In many calculations, particularly in perturbation theory, one encounters functions of the form

$$\xi(\omega, t) = \frac{\eta(\omega t)}{\omega},$$
(A.101)

which have the limit

$$\lim_{t \to \infty} \xi(\omega, t) = \xi_0\, \delta(\omega),$$
(A.102)

provided that the integral

$$\xi_0 = \int_{-\infty}^{\infty} du\, \frac{\eta(u)}{u}$$
(A.103)

exists.
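For example, η(u) = sin u gives ξ₀ = π, since ∫ du sin(u)/u = π; smearing ξ(ω, t) against a smooth test function g(ω) then approaches π g(0) for large t. A sketch (Python/NumPy; the grid and the test function are arbitrary):

```python
import numpy as np

# For eta(u) = sin u, xi(omega, t) = sin(omega t)/omega -> pi delta(omega),
# eqns (A.101)-(A.103).
omega = np.linspace(-20.0, 20.0, 400001)
domega = omega[1] - omega[0]
g = np.exp(-omega**2)                   # smooth test function, g(0) = 1

t = 100.0
xi = t * np.sinc(omega * t / np.pi)     # sin(omega t)/omega, finite at omega = 0
smeared = np.sum(g * xi) * domega       # int d omega g(omega) xi(omega, t)
assert np.isclose(smeared, np.pi, rtol=1e-4)
```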

A.7.3 Integral kernels

The definition of a generalized function as a linear rule assigning a complex number to each good function can be extended to a linear rule that maps a good function, e.g. f(t), to another good function g(t). The linear nature of the rule means that it can always be expressed in the form

$$g(t) = \int_{-\infty}^{\infty} dt'\, W(t, t')\, f(t').$$
(A.104)

For a fixed value of t, W (t, t′) defines a generalized function of t′ which is called an integral kernel. This definition is easily extended to functions of several variables, e.g. f(r). The delta function, the Heaviside step function, etc. are examples of integral kernels. An integral kernel is positive definite if

$$\int_{-\infty}^{\infty} dt \int_{-\infty}^{\infty} dt'\, f^*(t)\, W(t, t')\, f(t') \geq 0$$
(A.105)

for every good function f(t).

A.8 Probability and random variables

A.8.1 Axioms of probability

The abstract definition of probability starts with a set Ω of events and a probability function P that assigns a numerical value to every subset of Ω. In principle, Ω could be any set, but in practice it is usually a subset of $\mathbb{R}^N$ or $\mathbb{C}^N$, or a subset of the integers. The essential properties of probabilities are contained in the axioms (Gardiner, 1985, Chap. 2):

(1) P(S) ≥ 0 for all S ⊂ Ω;

(2) P(Ω) = 1;

(3) if $S_1, S_2, \dots$ is a discrete (countable) collection of nonoverlapping sets, i.e.

$$S_i \cap S_j = \emptyset \text{ for } i \neq j,$$
(A.106)

then

$$P(S_1 \cup S_2 \cup \cdots) = \sum_j P(S_j).$$
(A.107)

The familiar features 0 ≤ P(S) ≤ 1, P(∅) = 0, and P(S′) = 1 − P(S), where S′ is the complement of S, are immediate consequences of the axioms. If Ω is a discrete (countable) set, then one writes P(x) = P({x}), where {x} is the set consisting of the single element x. If Ω is a continuous (uncountable) set, then it is customary to introduce a probability density p(x) so that

$$P(S) = \int_S dx\, p(x),$$
(A.108)

where dx is the natural volume element on Ω.

If $\Omega = \mathbb{R}^n$, the probability density is a function of n variables: $p(x_1, x_2, \dots, x_n)$. The marginal distribution of $x_j$ is then defined as

$$p_j(x_j) = \int dx_1 \cdots dx_{j-1}\, dx_{j+1} \cdots dx_n\, p(x_1, x_2, \dots, x_n).$$
(A.109)

The joint probability for two sets S and T is P (S ∩T); this is the probability that an event in S is also in T. This is more often expressed with the notation

$$P(S, T) = P(S \cap T),$$
(A.110)

which is used in the text. The conditional probability for S given T is

$$P(S|T) = \frac{P(S, T)}{P(T)} = \frac{P(S \cap T)}{P(T)};$$
(A.111)

this is the probability that x ∈ S, given that x ∈ T.

The compound probability rule is just eqn (A.111) rewritten as

$$P(S, T) = P(S|T)\, P(T).$$
(A.112)

This can be generalized to joint probabilities for more than two outcomes by applying it several times, e.g.

$$P(S, T, R) = P(S|T, R)\, P(T, R) = P(S|T, R)\, P(T|R)\, P(R).$$
(A.113)

Dividing both sides by P (R) yields the useful rule

$$P(S, T|R) = P(S|T, R)\, P(T|R).$$
(A.114)

Two sets of events S and T are said to be independent or statistically independent if the joint probability is the product of the individual probabilities:

$$P(S, T) = P(S)\, P(T).$$
(A.115)
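Rules (A.111)-(A.114) can be spot-checked on a discrete event space. A sketch (Python/NumPy; the random 2 × 2 × 2 joint distribution and the chosen outcome are arbitrary):

```python
import numpy as np

# Check eqn (A.114) on a random joint distribution over three binary events.
rng = np.random.default_rng(4)
p = rng.random((2, 2, 2))
p /= p.sum()                     # joint distribution P(S, T, R)

P_R = p.sum(axis=(0, 1))         # marginal P(R)
P_TR = p.sum(axis=0)             # joint P(T, R)

s, t, r = 1, 0, 1                # an arbitrary outcome
P_ST_given_R = p[s, t, r] / P_R[r]
P_S_given_TR = p[s, t, r] / P_TR[t, r]
P_T_given_R = P_TR[t, r] / P_R[r]
assert np.isclose(P_ST_given_R, P_S_given_TR * P_T_given_R)
```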

A.8.2 Random variables

A random variable X is a function X(x) defined on the event space Ω. The function can take on values in Ω or in some other set. For example, if Ω = ℝ, then X(x) could be a complex number or an integer. The average value of a random variable is

$$\langle X \rangle = \int dx\, p(x)\, X(x).$$
(A.116)

If the function X does take on values in Ω, and is one-one, i.e. $X(x_1) = X(x_2)$ implies $x_1 = x_2$, then the distinction between X(x) and x is often ignored.