The Equilibrium Theory of Inhomogeneous Polymers

Glenn Fredrickson

Print publication date: 2005

Print ISBN-13: 9780198567295

Published to Oxford Scholarship Online: September 2007

DOI: 10.1093/acprof:oso/9780198567295.001.0001



APPENDIX D COMPLEX LANGEVIN THEORY

Source:
The Equilibrium Theory of Inhomogeneous Polymers
Publisher:
Oxford University Press

The complex Langevin (CL) simulation method described in Section 6.4 is a versatile tool for bypassing the sign problem that arises in sampling field theories with non-positive definite weights, i.e. theories with a complex Hamiltonian H[w]. In this appendix we discuss the theoretical basis for the method.

The complex Langevin technique was devised independently by Parisi (1983) and Klauder (1983) for evaluating averages such as

(D.1)
$$\langle G(x) \rangle = \frac{\int dx \, G(x) \exp[-H(x)]}{\int dx \, \exp[-H(x)]}$$
where the integration path is along the real axis for the variable x, but the Hamiltonian H(x) is a complex (not strictly real) function of x. We shall begin by discussing the case where x is a scalar, but then generalize to the more important situation where x is replaced by an M-vector so that the integrals in eqn (D.1) are M-dimensional integrals taken along the real axis.
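
As a concrete illustration of eqn (D.1), the following sketch evaluates the two integrals by simple quadrature for a scalar x. The quadratic Hamiltonian H(x) = ix + x^2/2, the integration limits, and the function name complex_average are assumptions made for this example only. For this H the weight exp[−H(x)] is a Gaussian multiplied by the phase factor exp(−ix), so the averages are complex even though x is real.

```python
import cmath


def H(x):
    """Illustrative complex Hamiltonian (an assumption of this sketch)."""
    return 1j * x + 0.5 * x * x


def complex_average(G, H, x_min=-12.0, x_max=12.0, n=4801):
    """Evaluate eqn (D.1) with the trapezoidal rule along the real axis."""
    dx = (x_max - x_min) / (n - 1)
    num = 0j
    den = 0j
    for k in range(n):
        x = x_min + k * dx
        w = cmath.exp(-H(x))          # complex, non-positive weight
        if k == 0 or k == n - 1:
            w *= 0.5                  # trapezoidal end-point weights
        num += G(x) * w
        den += w
    return num / den


mean_x = complex_average(lambda x: x, H)       # analytic value: -i
mean_x2 = complex_average(lambda x: x * x, H)  # analytic value: 0
```

Completing the square in the exponent gives <x> = −i and <x^2> = 0 for this H, which the quadrature should reproduce closely. Direct quadrature of this kind is practical only for very small numbers of variables, which is why stochastic sampling is needed in the M-dimensional case.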

A convenient way to rewrite eqn (D.1) is in the form

(D.2)
$$\langle G(x) \rangle = \int dx \, G(x) P_c(x)$$
where P_c(x) is a so-called “complex probability weight” defined by
(D.3)
$$P_c(x) = \frac{\exp[-H(x)]}{\int dx \, \exp[-H(x)]}$$
and it is understood that the path of integration is the real axis. In spite of its name, P_c(x) is not a true probability density because, for complex H(x), it is in general complex valued rather than real and non-negative. This also implies that eqn (D.2) cannot be directly tackled by Monte Carlo importance sampling of P_c(x) (Landau and Binder, 2000) (see note 119). The basic idea behind the CL technique is to assume that one can find a real, non-negative probability density P(x, y) so that eqn (D.2) can be reexpressed as
(D.4)
$$\langle G(x) \rangle = \int dx \int dy \, G(x + iy) P(x, y)$$
Equation (D.4) amounts to the assumption that the line integral in eqn (D.2) along the real axis can be exactly rewritten as an area integral over the entire complex plane of z = x + iy. If such a probability density P(x, y) exists, so that eqns (D.2) and (D.4) are equivalent, then eqn (D.4) can be approximately evaluated with the importance sampling formula
(D.5)
$$\langle G(x) \rangle \approx \frac{1}{N_C} \sum_{j=1}^{N_C} G(z_j)$$
where z_j = x_j + iy_j for j = 1, 2, 3, …, N_C are a set of random points in the complex plane selected from the distribution P(x, y). The sign problem discussed in the context of eqn (6.129) in Section 6.3 would thereby be avoided, because no complex phase factor appears in eqn (D.5) multiplying the observable G to be averaged.

For such a strategy to be realized, we require two things:

  • proof that eqn (D.2) can be rewritten in the form of eqn (D.4) and that P(x, y) exists for any physically reasonable H(x)

  • a numerical scheme for importance sampling of the function P(x, y)

With regard to the first point, direct comparison of the right-hand sides of the two equations indicates that they are equivalent if a P(x, y) can be found such that

(D.6)
$$\begin{aligned} P_c(x) &= \int dy \, P(x - iy, y) \\ &= \int dx' \int dy' \, \delta(x - x' - iy') \, P(x', y') \end{aligned}$$
where we have assumed that G(x), P_c(x), and P(x, y) are analytic functions of their x arguments. The second line of this expression will prove especially useful in the following. Necessary and sufficient conditions for the existence of P(x, y) have recently been identified (Salcedo, 1997; Weingarten, 2002). These conditions lead us to expect that most, if not all, physically realistic polymer field theory models will possess a real, non-negative distribution P satisfying eqn (D.6).

The complex Langevin (CL) scheme is a stochastic dynamics that, if convergent to a steady state, provides a method for sampling the distribution P(x, y) and verifying that it exists. The method amounts to writing a Langevin equation analogous to eqn (6.144), but generalizing it to trajectories z(t) = x(t) + iy(t) in the complex plane according to (Parisi, 1983; Klauder, 1983)

(D.7)
$$\begin{aligned} \frac{d}{dt} x(t) &= -\lambda \, \mathrm{Re}\left[ \frac{dH}{dz(t)} \right] + \eta(t) \\ \frac{d}{dt} y(t) &= -\lambda \, \mathrm{Im}\left[ \frac{dH}{dz(t)} \right] \end{aligned}$$
In these equations, Re and Im denote the operations of taking the real and imaginary parts of a complex function and dH(z)/dz is the complex derivative for an analytic Hamiltonian H(z) (Ahlfors, 1979) (see note 120). The random force η(t) is a real, Gaussian white noise defined by (van Kampen, 1981; Kloeden and Platen, 1992)
(D.8)
$$\langle \eta(t) \rangle = 0, \qquad \langle \eta(t) \eta(t') \rangle = 2 \lambda \, \delta(t - t')$$
The “kinetic coefficient” λ appearing in eqns (D.7) and (D.8) must be real and positive, although its value is arbitrary and can be absorbed into the time variable. Here we keep it explicit for reasons that will become apparent.
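
A minimal sketch of how eqns (D.7) and (D.8) might be integrated numerically is given below. It uses a forward Euler (Euler–Maruyama) step, in which the white noise integrated over a timestep dt is replaced by a Gaussian random number of variance 2λ·dt, as implied by eqn (D.8). The test Hamiltonian H(z) = iz + z^2/2, the step size, the run length, and the function names are all assumptions of this example, not choices made in the text.

```python
import math
import random


def dH_dz(z):
    """Complex derivative of the assumed Hamiltonian H(z) = i z + z**2 / 2."""
    return 1j + z


def complex_langevin(dH_dz, n_steps=200_000, n_equil=2_000,
                     dt=0.01, lam=1.0, seed=1):
    """Euler-Maruyama integration of eqns (D.7)-(D.8); returns <z>, <z^2>."""
    rng = random.Random(seed)
    sigma = math.sqrt(2.0 * lam * dt)  # std. dev. of the noise integrated over dt
    x, y = 0.0, 0.0
    sum_z = 0j
    sum_z2 = 0j
    for step in range(n_equil + n_steps):
        g = dH_dz(complex(x, y))
        # The random force acts only on the real component, as in eqn (D.7).
        x_new = x - lam * dt * g.real + rng.gauss(0.0, sigma)
        y_new = y - lam * dt * g.imag
        x, y = x_new, y_new
        if step >= n_equil:            # discard the initial transient
            z = complex(x, y)
            sum_z += z
            sum_z2 += z * z
    return sum_z / n_steps, sum_z2 / n_steps


mean_z, mean_z2 = complex_langevin(dH_dz)
```

For this quadratic H the exact averages along the real axis are <x> = −i and <x^2> = 0, so mean_z and mean_z2 should approach those values up to statistical error and a small bias of order dt.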

There are several notable features of the CL eqns (D.7). The first is the asymmetry with respect to the addition of the random force – the force is added only to the equation for the real component x(t) of the complex trajectory z(t). This asymmetry is necessary to preserve the broken symmetry of the original model in the complex x−y plane; namely, the fact that the integral in eqn (D.2) is taken along the real axis. The noise covariance in eqn (D.8) is consistent with the usual fluctuation-dissipation theorem for Brownian dynamics (van Kampen, 1981; McQuarrie, 1976), which states that the noise strength should be twice the dissipative coefficient λ appearing in front of the force terms in a Langevin equation. Another important feature of eqns (D.7) is that with the random force η(t) removed, the equations constitute a relaxational dynamics towards a saddle point z* of the model satisfying

(D.9)
$$\left. \frac{dH(z)}{dz} \right|_{z = z^*} = 0$$
Indeed, without the random force, the CL eqns (D.7) reduce to the relaxation scheme eqn (5.106) presented in Chapter 5 for the numerical computation of saddle points.
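
The noise-free limit can be sketched in a few lines. With η(t) removed, the two equations in (D.7) combine into dz/dt = −λ dH/dz, which the code below integrates with a forward Euler step until the force vanishes. The Hamiltonian H(z) = iz + z^2/2 (saddle point z* = −i), the starting point, and the name relax_to_saddle are assumptions of this example; for other Hamiltonians convergence of such a simple iteration is not guaranteed.

```python
def relax_to_saddle(dH_dz, z0, lam=1.0, dt=0.1, tol=1e-12, max_steps=10_000):
    """Integrate the noise-free form of eqn (D.7) until |dH/dz| < tol."""
    z = complex(z0)
    for step in range(max_steps):
        g = dH_dz(z)
        if abs(g) < tol:
            return z, step
        # dx/dt = -lam * Re[dH/dz],  dy/dt = -lam * Im[dH/dz]
        z = complex(z.real - lam * dt * g.real, z.imag - lam * dt * g.imag)
    return z, max_steps


# Assumed test case: H(z) = i z + z**2 / 2, so dH/dz = i + z and z* = -i.
z_star, n_iter = relax_to_saddle(lambda z: 1j + z, z0=2.0 + 0.5j)
```

Starting from a point off the real axis, the trajectory moves both x and y until eqn (D.9) is satisfied.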

The physical content of eqns (D.7) should now be clear. In the absence of the noise, the CL equations evolve deterministically towards a nearby saddle point. However, with the random force present, the second of the two equations drives the stochastic sampling path to a value of y that is approximately consistent with the local constant phase condition (see note 121)

(D.10)
$$\mathrm{Im} \, \frac{dH}{dz} = \frac{\partial}{\partial x} H_I(x, y) = 0$$
The second equation in (D.7) thus attempts to maintain the dynamic trajectory z(t) on a locally constant phase path by adjusting the imaginary component y(t). In contrast, the first equation stochastically drives the trajectory along the path through the action of the random force on the real component x(t). As a result, if the Langevin dynamics converge to a stationary distribution P(x, y) in the complex plane, we expect P(x, y) to have maximum intensity centered around a constant phase ascent path passing through one or more saddle points of the model. The beauty of the technique is that it is fully adaptive – namely, the dominant saddle point and constant phase path need not be determined in advance of running a CL simulation!

Our next task is to use eqns (D.7) to derive a Fokker–Planck equation (van Kampen, 1981) for the time-dependent probability distribution P(x, y, t) implied by the CL stochastic dynamics. The steady state solution of this equation, if it exists, is the real probability density P(x, y). Integrating both sides of eqns (D.7) from t to t + Δt leads to

(D.11)
$$\begin{aligned} \Delta x &\equiv x(t + \Delta t) - x(t) = \lambda \int_t^{t + \Delta t} ds \, F_R(x(s), y(s)) + \mu \\ \Delta y &\equiv y(t + \Delta t) - y(t) = \lambda \int_t^{t + \Delta t} ds \, F_I(x(s), y(s)) \end{aligned}$$
where F(z) ≡ −dH(z)/dz is the complex force. The quantity $\mu \equiv \int_t^{t + \Delta t} ds \, \eta(s)$ is a new Gaussian random force acting over the timestep with mean and variance that follow immediately from eqn (D.8):
(D.12)
$$\langle \mu \rangle = 0, \qquad \langle \mu^2 \rangle = \int_t^{t + \Delta t} ds \int_t^{t + \Delta t} ds' \, \langle \eta(s) \eta(s') \rangle = 2 \lambda \Delta t$$
It is important to note that μ is characteristically O((Δt)^{1/2}). Assuming continuity of dH/dz, eqns (D.11) can be approximated by
(D.13)
$$\Delta x = \lambda \Delta t \, F_R + \mu, \qquad \Delta y = \lambda \Delta t \, F_I$$
with errors that are O((Δt)^2). Using these equations, it is straightforward to show that the first two moments of the random variables Δx and Δy, averaged over all realizations of the Gaussian force μ, are given to O(Δt) by
(D.14)
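$$\begin{aligned} \langle \Delta x \rangle &= \lambda \Delta t \, F_R, & \langle \Delta y \rangle &= \lambda \Delta t \, F_I \\ \langle (\Delta x)^2 \rangle &= 2 \lambda \Delta t, & \langle (\Delta y)^2 \rangle &= \langle \Delta x \Delta y \rangle = 0 \end{aligned}$$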
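
The moments in eqn (D.14) can be checked by sampling the single-step rule (D.13) many times from a fixed starting point. The sketch below does this for an assumed Hamiltonian H(z) = iz + z^2/2, for which F = −(i + z); the starting point, timestep, sample count, and function name are likewise illustrative assumptions.

```python
import math
import random


def sample_step_moments(x0, y0, dt=0.01, lam=1.0, n_samples=200_000, seed=7):
    """Sample eqn (D.13) repeatedly from the fixed starting point (x0, y0)."""
    rng = random.Random(seed)
    force = -(1j + complex(x0, y0))    # F = -dH/dz for H(z) = i z + z**2 / 2
    sigma = math.sqrt(2.0 * lam * dt)  # std. dev. of mu, from eqn (D.12)
    sum_dx = 0.0
    sum_dx2 = 0.0
    for _ in range(n_samples):
        dx = lam * dt * force.real + rng.gauss(0.0, sigma)
        sum_dx += dx
        sum_dx2 += dx * dx
    dy = lam * dt * force.imag         # no noise acts on y: dy is deterministic
    return sum_dx / n_samples, sum_dx2 / n_samples, dy


dt, lam = 0.01, 1.0
mean_dx, mean_dx2, dy = sample_step_moments(x0=2.0, y0=0.5, dt=dt, lam=lam)
predicted_mean_dx = lam * dt * (-2.0)  # lam * dt * F_R, with F_R = -x0
predicted_mean_dx2 = 2.0 * lam * dt    # leading order in dt
```

The sampled second moment exceeds 2λΔt by (λΔt F_R)^2, which is O((Δt)^2) and is therefore dropped in eqn (D.14). Because Δy carries no noise, (Δy)^2 and ΔxΔy contribute only at O((Δt)^2) or vanish on averaging.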

These results can now be used to derive a Fokker–Planck equation for the probability density P(x, y, t). The starting point is a Chapman–Kolmogorov (CK) equation strictly analogous to eqn (2.58) for the continuous Gaussian chain. Defining a two-component state vector according to $\mathbf{x} = (x, y)^T$, the CK equation can be written

(D.15)
$$P(\mathbf{x}, t + \Delta t) = \int d(\Delta \mathbf{x}) \, \Phi(\Delta \mathbf{x}; \mathbf{x} - \Delta \mathbf{x}) \, P(\mathbf{x} - \Delta \mathbf{x}, t)$$
where $\Phi(\Delta \mathbf{x}; \mathbf{x})$ is the transition probability density for a displacement $\Delta \mathbf{x}$ in the complex plane, starting at the point $\mathbf{x}$, over a time interval of Δt. This function is normalized so that $\int d(\Delta \mathbf{x}) \, \Phi = 1$ and its first two moments are summarized by eqn (D.14). Following the procedure outlined in Section 2.4, eqn (D.15) can be converted to a Fokker–Planck equation by expanding the left-hand side in powers of Δt to O(Δt) and expanding the right-hand side in powers of $\Delta \mathbf{x}$ to $O((\Delta \mathbf{x})^2) = O(\Delta t)$. This leads to
(D.16)
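$$\Delta t \, \frac{\partial}{\partial t} P(\mathbf{x}, t) = -\frac{\partial}{\partial \mathbf{x}} \cdot \left[ \langle \Delta \mathbf{x} \rangle P(\mathbf{x}, t) \right] + \frac{1}{2!} \frac{\partial}{\partial \mathbf{x}} \frac{\partial}{\partial \mathbf{x}} : \left[ \langle \Delta \mathbf{x} \Delta \mathbf{x} \rangle P(\mathbf{x}, t) \right] + O((\Delta t)^2)$$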
Finally, substituting eqn (D.14) for the moments of Φ produces the desired Fokker–Planck equation for the CL process
(D.17)
$$\frac{\partial}{\partial t} P(x, y, t) = -\lambda \frac{\partial}{\partial x} \left[ F_R(x, y) P(x, y, t) \right] - \lambda \frac{\partial}{\partial y} \left[ F_I(x, y) P(x, y, t) \right] + \lambda \frac{\partial^2}{\partial x^2} P(x, y, t)$$
In spite of its linearity, this Fokker–Planck equation apparently has no closed form solution for an arbitrary force F(z), even in the steady state limit where P(x, y, t) → P(x, y).

Our final task is to prove that if a steady state solution P(x, y) of the above equation exists, then averages computed with this solution using eqn (D.4) are equivalent to averages computed with eqn (D.2) using the complex weight P_c(x) (Schoenmaker, 1987; Lee, 1994). This can be shown by combining eqns (D.6) and (D.17). Specifically, applying the operation ∫ dx′ dy′ δ(x − x′ − iy′) to both sides of the Fokker–Planck equation written for P(x′, y′, t) leads to

(D.18)
$$\frac{\partial}{\partial t} P_c(x, t) = T_1(x, t) + T_2(x, t) + T_3(x, t)$$
where P_c(x, t) ≡ ∫ dy P(x − iy, y, t) and T_j(x, t) is the function obtained by applying the indicated operation to the jth term on the right-hand side of eqn (D.17). T_1 can be manipulated as follows:
(D.19)
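$$\begin{aligned} T_1(x, t) &= \lambda \int dx' \int dy' \, \delta(x - x' - iy') \frac{\partial}{\partial x'} \left[ \mathrm{Re}\left( \frac{dH(x' + iy')}{d(x' + iy')} \right) P(x', y', t) \right] \\ &= \lambda \int dy \, \frac{\partial}{\partial x} \left[ \mathrm{Re}\left( \frac{dH(x)}{dx} \right) P(x - iy, y, t) \right] \\ &= \lambda \frac{\partial}{\partial x} \left[ \mathrm{Re}\left( \frac{dH(x)}{dx} \right) P_c(x, t) \right] \end{aligned}$$
Similarly, the second term can be written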
(D.20)
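$$\begin{aligned} T_2(x, t) &= \lambda \int dx' \int dy' \, \delta(x - x' - iy') \frac{\partial}{\partial y'} \left[ \mathrm{Im}\left( \frac{dH(x' + iy')}{d(x' + iy')} \right) P(x', y', t) \right] \\ &= i \lambda \int dx' \int dy' \, \delta(x - x' - iy') \frac{\partial}{\partial x'} \left[ \mathrm{Im}\left( \frac{dH(x' + iy')}{d(x' + iy')} \right) P(x', y', t) \right] \\ &= i \lambda \frac{\partial}{\partial x} \left[ \mathrm{Im}\left( \frac{dH(x)}{dx} \right) P_c(x, t) \right] \end{aligned}$$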
Finally, the last term is
(D.21)
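$$\begin{aligned} T_3(x, t) &= \lambda \int dx' \int dy' \, \delta(x - x' - iy') \frac{\partial^2}{\partial (x')^2} P(x', y', t) \\ &= \lambda \int dy \, \frac{\partial^2}{\partial x^2} P(x - iy, y, t) \\ &= \lambda \frac{\partial^2}{\partial x^2} P_c(x, t) \end{aligned}$$
Combining these results, we see that the function P_c(x, t) satisfies the following complex Fokker–Planck (FP) equation: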
(D.22)
$$\frac{\partial}{\partial t} P_c(x, t) = \frac{\partial}{\partial x} \lambda \left[ \frac{\partial}{\partial x} + \frac{dH(x)}{dx} \right] P_c(x, t)$$
This equation has a complex steady state solution P_c(x) ∝ exp[−H(x)] corresponding to eqn (D.3), but also a second “spurious” steady state (Lee, 1994)
(D.23)
$$P_{\mathrm{spur}}(x) \propto \exp[-H(x)] \int^x dy \, \exp[H(y)]$$
This spurious solution is usually not relevant, because it leads to expectation values that are incompatible with the most common choices of boundary conditions, e.g. periodic. Thus, if the real FP eqn (D.17) converges to a steady state P(x, y), then the associated function P_c(x) = ∫ dy P(x − iy, y) corresponds to the desired complex probability given by eqn (D.3). When this condition is met, a “time average” computed with eqn (D.5) along a complex Langevin trajectory will converge to the ensemble average (D.2) in the limit of N_C → ∞.
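
The two steady states can be distinguished numerically by writing eqn (D.22) as a continuity equation, ∂P_c/∂t = −∂J/∂x, with flux J = −λ[∂/∂x + dH/dx]P_c. A steady state requires only that J be constant in x. Direct differentiation suggests that exp[−H(x)] carries zero flux, whereas the spurious solution (D.23) carries a constant non-zero flux. The sketch below checks this by finite differences for an assumed Hamiltonian H(x) = ix + x^2/2; the lower integration limit of 0 in the spurious solution, the sample points, and the function names are assumptions of this example.

```python
import cmath


def H(x):
    """Assumed Hamiltonian for this sketch: H(x) = i x + x**2 / 2."""
    return 1j * x + 0.5 * x * x


def dH_dx(x):
    return 1j + x


def P_regular(x):
    """Unnormalized steady state exp[-H(x)], cf. eqn (D.3)."""
    return cmath.exp(-H(x))


def P_spurious(x, n=2000):
    """exp[-H(x)] times the integral of exp[H(y)] from 0 to x (trapezoidal)."""
    dy = x / n
    total = 0.5 * (cmath.exp(H(0.0)) + cmath.exp(H(x)))
    for k in range(1, n):
        total += cmath.exp(H(k * dy))
    return cmath.exp(-H(x)) * total * dy


def flux(P, x, lam=1.0, h=1e-4):
    """J(x) = -lam * [dP/dx + (dH/dx) P], using a central difference for dP/dx."""
    dP = (P(x + h) - P(x - h)) / (2.0 * h)
    return -lam * (dP + dH_dx(x) * P(x))


points = [-1.0, 0.3, 1.5]
flux_regular = [flux(P_regular, x) for x in points]    # expected near 0
flux_spurious = [flux(P_spurious, x) for x in points]  # expected near -lam
```

With the normalization used here the spurious flux equals −λ at every sample point, so both functions are stationary, but only the first has vanishing flux.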

Unfortunately, an analytical proof that eqn (D.17) has a steady state solution is not in hand. Nevertheless, there is a simple test to ensure that a numerical CL simulation is working properly (Gausterer and Lee, 1993; Lee, 1994). If the expectation values for analytic observables G(x) become time independent over the course of a CL simulation, then it can be proven that these values are correct and in agreement with eqn (D.2). In practice, we have not encountered convergence problems in CL simulations for any of the models considered in Chapter 4.
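
One crude way to monitor whether expectation values have become time independent is to compare averages of an analytic observable over consecutive blocks of the trajectory. The sketch below does this for G(z) = z with the standard CL scheme applied to an assumed Hamiltonian H(z) = iz + z^2/2. The block length, starting point, and function name are illustrative assumptions, and agreement between blocks is only a heuristic diagnostic, not the proof referred to above.

```python
import math
import random


def cl_block_means(n_blocks=5, block_len=40_000, dt=0.01, lam=1.0, seed=3):
    """Run standard CL for H(z) = i z + z**2 / 2 and return the mean of
    G(z) = z over consecutive blocks of the trajectory."""
    rng = random.Random(seed)
    sigma = math.sqrt(2.0 * lam * dt)
    x, y = 3.0, 2.0  # start deliberately far from the expected steady state
    means = []
    for block in range(n_blocks):
        acc = 0j
        for step in range(block_len):
            g = 1j + complex(x, y)  # dH/dz at the current point
            x, y = (x - lam * dt * g.real + rng.gauss(0.0, sigma),
                    y - lam * dt * g.imag)
            acc += complex(x, y)
        means.append(acc / block_len)
    return means


block_means = cl_block_means()
# Largest difference between any two block averages.
spread = max(abs(a - b) for a in block_means for b in block_means)
```

If the block averages agree within their statistical scatter, and here cluster around the exact value −i, the run shows no sign of drift; a systematic trend from block to block would indicate that the trajectory has not yet reached a steady state.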

The complex Langevin equations given in (D.7) constitute the “standard” CL approach. However, there are several extensions of the formalism that are potentially useful in conducting field-theoretic simulations. The first of these is a generalization to include a noise source acting on both the real and imaginary parts of the field (see note 122):

(D.24)
$$\begin{aligned} \frac{d}{dt} x(t) &= -\lambda \, \mathrm{Re}\left[ \frac{dH}{dz(t)} \right] + \eta_R(t) \\ \frac{d}{dt} y(t) &= -\lambda \, \mathrm{Im}\left[ \frac{dH}{dz(t)} \right] + \eta_I(t) \end{aligned}$$
The random forces η_R(t) and η_I(t) can be taken to be real, Gaussian white noise processes with vanishing mean values, ⟨η_R(t)⟩ = ⟨η_I(t)⟩ = 0. The covariance of the noise can be further generalized from eqn (D.8) to
(D.25)
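$$\begin{aligned} \langle \eta_R(t) \eta_R(t') \rangle &= 2 (\lambda + \varepsilon) \, \delta(t - t') \\ \langle \eta_I(t) \eta_I(t') \rangle &= 2 \varepsilon \, \delta(t - t') \\ \langle \eta_R(t) \eta_I(t') \rangle &= 0 \end{aligned}$$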
where λ > 0 and ε ≥ 0 are real parameters that determine the relative strengths of the two noise components. Equations (D.24) and (D.25) evidently reduce to the “standard” CL eqns (D.7)–(D.8) in the special case of ε = 0. Since the parameter λ can be absorbed into the time scale of the stochastic process, ε is effectively a free parameter that can be adjusted to optimize the performance of a CL simulation.
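
The generalized scheme differs from the standard one only in the noise amplitudes, as the following Euler–Maruyama sketch illustrates. It integrates eqns (D.24) and (D.25) for an assumed Hamiltonian H(z) = iz + z^2/2 and compares ε = 0 with ε = 0.5. The parameter values, run length, and function name are assumptions of this example.

```python
import math
import random


def generalized_cl(eps, n_steps=400_000, n_equil=2_000,
                   dt=0.01, lam=1.0, seed=11):
    """Euler-Maruyama integration of eqns (D.24)-(D.25) for the assumed
    Hamiltonian H(z) = i z + z**2 / 2; returns <z> and <z^2>."""
    rng = random.Random(seed)
    sigma_r = math.sqrt(2.0 * (lam + eps) * dt)  # noise on the real part
    sigma_i = math.sqrt(2.0 * eps * dt)          # noise on the imaginary part
    x, y = 0.0, 0.0
    sum_z = 0j
    sum_z2 = 0j
    for step in range(n_equil + n_steps):
        g = 1j + complex(x, y)  # dH/dz at the current point
        x, y = (x - lam * dt * g.real + rng.gauss(0.0, sigma_r),
                y - lam * dt * g.imag + rng.gauss(0.0, sigma_i))
        if step >= n_equil:     # discard the initial transient
            z = complex(x, y)
            sum_z += z
            sum_z2 += z * z
    return sum_z / n_steps, sum_z2 / n_steps


# eps = 0 recovers the standard scheme; eps > 0 adds noise to y as well.
results = {eps: generalized_cl(eps) for eps in (0.0, 0.5)}
```

Both runs should give averages close to the exact values <x> = −i and <x^2> = 0 for this H, with somewhat larger statistical scatter for ε > 0 because the total noise strength is larger.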

The role of ε can be established by deriving the Fokker–Planck equation corresponding to these generalized CL equations. By repeating the steps leading to eqn (D.17) it is straightforward to show that eqns (D.24)–(D.25) are consistent with the Fokker–Planck equation

(D.26)
$$\frac{\partial}{\partial t} P(x, y, t) = -\lambda \frac{\partial}{\partial x} \left[ F_R(x, y) P(x, y, t) \right] - \lambda \frac{\partial}{\partial y} \left[ F_I(x, y) P(x, y, t) \right] + (\lambda + \varepsilon) \frac{\partial^2}{\partial x^2} P(x, y, t) + \varepsilon \frac{\partial^2}{\partial y^2} P(x, y, t)$$
Moreover, it can be shown that if this generalized Fokker–Planck equation has a steady state P(x, y), then the associated function P_c(x) = ∫ dy P(x − iy, y) reduces to eqn (D.3). Thus, the generalized CL equations are a suitable alternative to the standard CL approach. For non-zero ε, however, eqn (D.26) shows that the generalized scheme contains an extra dispersive term ε ∂²P/∂y² that will tend to smooth the steady state distribution in the variable y.

To illustrate the role of finite ε, we return to the “toy” model of eqn (5.7) for the case of p = 1, i.e. H(x) = ix + x^2/2. The steady state solution of eqn (D.26) for this simple quadratic Hamiltonian is

(D.27)
$$P(x, y) \propto \exp\left( -\frac{\lambda x^2}{2 (\lambda + \varepsilon)} - \frac{\lambda (y + 1)^2}{2 \varepsilon} \right)$$
apart from a normalization constant. The distribution function has a Gaussian ridge of maximum probability centered on the line y = −1 in the complex plane that passes through the saddle point, z* = −i. This line is, of course, the constant-phase ascent path for the model. The width of the ridge normal to this line is O(ε^{1/2}), while the decay in probability density away from the saddle point along the line y = −1 is much slower for ε ≪ λ. For ε → 0^+, the conventional CL theory is recovered and eqn (D.27) reduces to the singular distribution
(D.28)
$$P(x, y) \propto \delta(y + 1) \exp(-x^2 / 2)$$
At least for this simple model, it is clear that the generalized CL theory with ε > 0 produces a smoother steady state distribution function that is centered on the constant-phase ascent path.
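
The consistency of eqn (D.27) with the projection formula (D.6) can be checked numerically. The sketch below continues P(x, y) analytically in its first argument, integrates P(x − iy, y) over y by the trapezoidal rule, and divides by exp[−H(x)]; if eqn (D.6) holds, the ratio should be the same complex constant at every x. The values λ = 1 and ε = 0.5, the integration window, and the function names are assumptions of this example.

```python
import cmath


def H(x):
    """Toy Hamiltonian H(x) = i x + x**2 / 2."""
    return 1j * x + 0.5 * x * x


def P_joint(x, y, lam=1.0, eps=0.5):
    """Unnormalized eqn (D.27); x may be complex, y is real."""
    return cmath.exp(-lam * x * x / (2.0 * (lam + eps))
                     - lam * (y + 1.0) ** 2 / (2.0 * eps))


def P_c_from_joint(x, lam=1.0, eps=0.5, y_min=-14.0, y_max=12.0, n=5201):
    """Trapezoidal estimate of the y-integral of P(x - i y, y), cf. eqn (D.6)."""
    dy = (y_max - y_min) / (n - 1)
    total = 0j
    for k in range(n):
        y = y_min + k * dy
        w = 0.5 if k in (0, n - 1) else 1.0  # trapezoidal end-point weights
        total += w * P_joint(x - 1j * y, y, lam, eps)
    return total * dy


xs = [-1.0, 0.0, 0.5, 2.0]
# Each ratio should equal the same x-independent normalization constant.
ratios = [P_c_from_joint(x) / cmath.exp(-H(x)) for x in xs]
```

Carrying out the Gaussian integral over y by hand gives a ratio of (2πε(λ + ε)/λ)^{1/2} exp(1/2) for λ = 1, independent of x, which is what the numerical ratios should reproduce.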

The CL theory can be immediately extended to the multi-dimensional case of model Hamiltonians $H(\mathbf{x})$, where $\mathbf{x} = (x_1, x_2, \ldots, x_M)^T$ is a real M-vector and H is complex valued. In particular, the generalized CL eqns (D.24)–(D.25) can be written in the multi-variate case as

(D.29)
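$$\begin{aligned} \frac{d}{dt} \mathbf{x}(t) &= -\lambda \, \mathrm{Re}\left[ \frac{\partial H}{\partial \mathbf{z}(t)} \right] + \boldsymbol{\eta}_R(t) \\ \frac{d}{dt} \mathbf{y}(t) &= -\lambda \, \mathrm{Im}\left[ \frac{\partial H}{\partial \mathbf{z}(t)} \right] + \boldsymbol{\eta}_I(t) \end{aligned}$$
where $\mathbf{z} = \mathbf{x} + i\mathbf{y}$ is a complex M-vector. The Gaussian random forces $\boldsymbol{\eta}_R(t)$ and $\boldsymbol{\eta}_I(t)$ are M-vectors with vanishing mean and covariance matrices given by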
(D.30)
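$$\begin{aligned} \langle \boldsymbol{\eta}_R(t) \boldsymbol{\eta}_R(t') \rangle &= 2 (\lambda + \varepsilon) \, \mathbf{1} \, \delta(t - t') \\ \langle \boldsymbol{\eta}_I(t) \boldsymbol{\eta}_I(t') \rangle &= 2 \varepsilon \, \mathbf{1} \, \delta(t - t') \\ \langle \boldsymbol{\eta}_R(t) \boldsymbol{\eta}_I(t') \rangle &= \mathbf{0} \end{aligned}$$
where $\mathbf{1}$ is the M × M unit tensor. By setting ε = 0, the above scheme reduces to the standard multi-variate CL theory.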
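
A sketch of the multi-variate scheme (D.29)–(D.30) follows. The model is an assumption chosen only because its exact averages are known: M variables on a periodic chain with H(z) = Σ_k [i z_k + z_k^2/2 + (κ/2)(z_{k+1} − z_k)^2], for which every ⟨x_k⟩ equals −i. The values of M, κ, ε, the step size, and the function names are likewise illustrative.

```python
import math
import random


def grad_H(z, kappa=0.5):
    """Gradient of the assumed model
    H(z) = sum_k [i z_k + z_k**2 / 2 + (kappa / 2) (z_{k+1} - z_k)**2]
    with periodic indices."""
    M = len(z)
    return [1j + z[k] + kappa * (2.0 * z[k] - z[(k + 1) % M] - z[k - 1])
            for k in range(M)]


def multivariate_cl(M=4, eps=0.25, n_steps=100_000, n_equil=2_000,
                    dt=0.01, lam=1.0, seed=5):
    """Euler-Maruyama integration of eqns (D.29)-(D.30); returns the
    trajectory average of z_k, further averaged over the M components."""
    rng = random.Random(seed)
    sigma_r = math.sqrt(2.0 * (lam + eps) * dt)
    sigma_i = math.sqrt(2.0 * eps * dt)
    x = [0.0] * M
    y = [0.0] * M
    sum_z = 0j
    for step in range(n_equil + n_steps):
        g = grad_H([complex(a, b) for a, b in zip(x, y)])
        # Independent noise on each component: covariance proportional to 1.
        x = [x[k] - lam * dt * g[k].real + rng.gauss(0.0, sigma_r)
             for k in range(M)]
        y = [y[k] - lam * dt * g[k].imag + rng.gauss(0.0, sigma_i)
             for k in range(M)]
        if step >= n_equil:
            sum_z += complex(sum(x), sum(y)) / M
    return sum_z / n_steps


mean_z = multivariate_cl()
```

Because the coupling term vanishes for a uniform field, the saddle point of this model is z_k = −i for all k, and mean_z should approach −i up to statistical error.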

Finally, in the multi-dimensional case, the parameters λ and ε can be replaced by a positive definite, M × M “kinetic coefficient” matrix $\boldsymbol{\lambda}$ and a positive semidefinite, M × M “noise” matrix $\boldsymbol{\varepsilon}$. This generalization amounts to the scheme

(D.31)
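$$\begin{aligned} \frac{d}{dt} \mathbf{x}(t) &= -\boldsymbol{\lambda} \, \mathrm{Re}\left[ \frac{\partial H}{\partial \mathbf{z}(t)} \right] + \boldsymbol{\eta}_R(t) \\ \frac{d}{dt} \mathbf{y}(t) &= -\boldsymbol{\lambda} \, \mathrm{Im}\left[ \frac{\partial H}{\partial \mathbf{z}(t)} \right] + \boldsymbol{\eta}_I(t) \end{aligned}$$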
with
(D.32)
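$$\begin{aligned} \langle \boldsymbol{\eta}_R(t) \boldsymbol{\eta}_R(t') \rangle &= 2 (\boldsymbol{\lambda} + \boldsymbol{\varepsilon}) \, \delta(t - t') \\ \langle \boldsymbol{\eta}_I(t) \boldsymbol{\eta}_I(t') \rangle &= 2 \boldsymbol{\varepsilon} \, \delta(t - t') \\ \langle \boldsymbol{\eta}_R(t) \boldsymbol{\eta}_I(t') \rangle &= \mathbf{0} \end{aligned}$$
Again it can be proven that if the above CL equations converge to a steady state $P(\mathbf{x}, \mathbf{y})$, the steady state solution is consistent with the proper complex probability weight $P_c(\mathbf{x}) \propto \exp[-H(\mathbf{x})]$. By adjusting the form of the matrices $\boldsymbol{\lambda}$ and $\boldsymbol{\varepsilon}$ it may be possible to achieve better performance in numerical simulations than the simplest choice of $\boldsymbol{\lambda} = \lambda \mathbf{1}$ and $\boldsymbol{\varepsilon} = \varepsilon \mathbf{1}$. A potentially important class of rank-M matrices consists of those that are translationally invariant on a d-dimensional collocation grid and can be diagonalized by a discrete Fourier transform. By employing such matrices, efficient pseudo-spectral numerical solutions of the CL equations can be achieved.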

Notes:

(119) Only stochastic methods of evaluating integrals such as eqn (D.2) are considered, because our main interest is in the discrete field theory case where the integral is M-dimensional with M ≫ 1.

(120) We shall assume throughout this appendix that H(z) is an analytic function of z.

(121) Subscripts R and I denote the real and imaginary parts, respectively, of a complex function.

(122) We are not aware of this generalization having been reported previously in the literature.