Macroeconomic Theory

Jean-Pascal Benassy

Print publication date: 2011

Print ISBN-13: 9780195387711

Published to Oxford Scholarship Online: April 2015

DOI: 10.1093/acprof:osobl/9780195387711.001.0001


A Mathematical Appendix

Macroeconomic Theory
Oxford University Press

A.1 Matrices

A.1.1 General Properties

A matrix is a collection of numbers arranged in a rectangular table. The typical entry is denoted a_{ij}, where i = 1, …, m is the row index and j = 1, …, n is the column index. A typical matrix looks as follows:

A = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & a_{ij} & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix}

Most often we use matrices where aij is a real number, but it can be a complex number as well.

A number of operations can be performed on matrices. They can be multiplied by a scalar:

B = \lambda A \iff b_{ij} = \lambda a_{ij}

If two matrices A and B have the same dimension (m, n), they can be added or subtracted:

C = A + B \iff c_{ij} = a_{ij} + b_{ij}

C = A - B \iff c_{ij} = a_{ij} - b_{ij}

The transpose of a matrix is the matrix obtained by exchanging rows and columns:

B = A^T \iff b_{ij} = a_{ji}

Two matrices A and B (in that order) can be multiplied if the number of columns of A is the same as the number of rows of B. Let us denote by \ell that common number. The product matrix is defined by:

C = AB \iff c_{ij} = \sum_{k=1}^{\ell} a_{ik} b_{kj}
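The product rule c_{ij} = \sum_k a_{ik} b_{kj} can be checked in a few lines of code (a minimal sketch, not part of the text; plain Python lists stand in for matrices):

```python
def matmul(A, B):
    """Multiply an (m x l) matrix A by an (l x n) matrix B,
    entry by entry: c_ij = sum_k a_ik * b_kj."""
    l = len(B)                       # the common dimension
    n = len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(l))
             for j in range(n)]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B))   # [[19, 22], [43, 50]]
```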

A.1.2 Square Matrices

We will work particularly with square matrices, for which m = n. For such a matrix we define the trace T(A) as the sum of the entries on the main diagonal:

T(A) = \sum_{i=1}^{n} a_{ii}

An important special case is the identity matrix, denoted I, which is composed of ones along the main diagonal, and zeros everywhere else.

A.1.3 Determinants

The determinant of a matrix is a central element for all the uses that follow. The determinant of a matrix A will be denoted |A| or D(A). For a matrix of dimension 1 the definition is particularly simple:

A = [a_{11}] \implies D(A) = |A| = a_{11}

To define the determinant of a matrix of dimension n > 1, we work recursively. Call A_{ij} the matrix obtained by deleting from A row i and column j:

A_{ij} = \begin{bmatrix} a_{1,1} & \cdots & a_{1,j-1} & a_{1,j+1} & \cdots & a_{1,n} \\ \vdots & & \vdots & \vdots & & \vdots \\ a_{i-1,1} & \cdots & a_{i-1,j-1} & a_{i-1,j+1} & \cdots & a_{i-1,n} \\ a_{i+1,1} & \cdots & a_{i+1,j-1} & a_{i+1,j+1} & \cdots & a_{i+1,n} \\ \vdots & & \vdots & \vdots & & \vdots \\ a_{n,1} & \cdots & a_{n,j-1} & a_{n,j+1} & \cdots & a_{n,n} \end{bmatrix}

Now the cofactor C_{ij} is deduced from A_{ij} through:

C_{ij} = (-1)^{i+j} D(A_{ij})

Finally the determinant |A| is obtained by "cofactor expansion". This expansion can be implemented in two ways: expansion along the ith row:

|A| = \sum_{j=1}^{n} a_{ij} C_{ij}

or expansion along the jth column:

|A| = \sum_{i=1}^{n} a_{ij} C_{ij}

It does not matter which row or which column is used. Take as an example the matrix of dimension 2:

A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}

Applying the above steps to this matrix we find:

D \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = a_{11} a_{22} - a_{12} a_{21}
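The recursive definition above translates almost word for word into code (a small sketch, not part of the text), expanding along the first row:

```python
def det(A):
    """Determinant by cofactor expansion along the first row:
    |A| = sum_j a_0j * (-1)^j * det(minor_0j)."""
    n = len(A)
    if n == 1:                          # dimension-1 base case
        return A[0][0]
    total = 0
    for j in range(n):
        # the minor: delete row 0 and column j
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += A[0][j] * (-1) ** j * det(minor)
    return total

print(det([[1, 2], [3, 4]]))   # 1*4 - 2*3 = -2
```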

A.1.4 Matrix Inversion

Consider a square matrix A. The matrix inverse of A, which we denote A^{-1}, is a matrix such that:

A A^{-1} = A^{-1} A = I

where I is the identity matrix. If the determinant |A| is different from 0, the inverse matrix exists, is unique, and is computed as follows:

A^{-1} = \frac{1}{|A|} C^T

The matrix C, or matrix of cofactors, is simply deduced from matrix A by replacing every entry a_{ij} by its cofactor C_{ij}, which was defined in (A.10):

C = \begin{bmatrix} C_{11} & \cdots & C_{1n} \\ \vdots & C_{ij} & \vdots \\ C_{n1} & \cdots & C_{nn} \end{bmatrix}

As an example, applying these formulas to the matrix (A.13) we find the inverse:

A^{-1} = \frac{1}{a_{11} a_{22} - a_{12} a_{21}} \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix}
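The 2x2 formula can be verified directly (a quick sketch, not part of the text; exact rationals avoid rounding noise):

```python
from fractions import Fraction

def inv2(A):
    """Inverse of a 2x2 matrix via the cofactor formula
    A^-1 = 1/(a11*a22 - a12*a21) * [[a22, -a12], [-a21, a11]]."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("singular matrix")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

A = [[1, 2], [3, 4]]
Ainv = inv2(A)
# A * A^-1 should be the identity matrix
prod = [[sum(A[i][k] * Ainv[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)]
print(prod)
```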

A.1.5 Eigenvalues and Eigenvectors

Consider a square matrix A. An eigenvalue λ and an associated eigenvector x ≠ 0 are defined by:

A x = \lambda x

In other words, the eigenvector x premultiplied by A is equal to a multiple of itself. This can be rewritten:

(A - \lambda I) x = 0

This equation has a nontrivial solution in x if and only if |A − λI| = 0 (otherwise x = 0 is the only solution). Taking again the example of matrix (A.13) this yields:

D \begin{pmatrix} a_{11} - \lambda & a_{12} \\ a_{21} & a_{22} - \lambda \end{pmatrix} = 0

or:

\Psi(\lambda) = (a_{11} - \lambda)(a_{22} - \lambda) - a_{12} a_{21} = 0

The polynomial Ψ(λ) = |A − λI| is called the characteristic polynomial. In the two-dimensional case, it can be rewritten:

\Psi(\lambda) = \lambda^2 - T(A)\lambda + D(A)

In the general case, the sum and product of the n eigenvalues satisfy:

\sum_{i=1}^{n} \lambda_i = T(A)

\prod_{i=1}^{n} \lambda_i = D(A)
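For a 2x2 matrix the eigenvalues are just the roots of the characteristic polynomial λ² − T(A)λ + D(A), so the trace and determinant identities can be checked numerically (a sketch, not part of the text):

```python
import cmath

def eig2(A):
    """Eigenvalues of a 2x2 matrix: roots of lambda^2 - T*lambda + D."""
    T = A[0][0] + A[1][1]                        # trace
    D = A[0][0] * A[1][1] - A[0][1] * A[1][0]    # determinant
    disc = cmath.sqrt(T * T - 4 * D)             # may be complex
    return (T + disc) / 2, (T - disc) / 2

A = [[0, -1], [1, 0]]        # a rotation: complex eigenvalues +/- i
l1, l2 = eig2(A)
print(l1 + l2)    # equals the trace, 0
print(l1 * l2)    # equals the determinant, 1
```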

A.2 Functions

A.2.1 Derivatives

We start with functions of a single argument f(x), where x is a scalar. We define the first and second derivatives:

f'(x) = \lim_{\delta \to 0} \frac{f(x + \delta) - f(x)}{\delta}

f''(x) = \lim_{\delta \to 0} \frac{f'(x + \delta) - f'(x)}{\delta}

We can similarly define the derivatives of higher order:

f^{(n+1)}(x) = \lim_{\delta \to 0} \frac{f^{(n)}(x + \delta) - f^{(n)}(x)}{\delta}

where the common usage is to denote:

f'(x) = f^{(1)}(x) \qquad f''(x) = f^{(2)}(x)

We now move to functions of more than one argument. We still denote the function as f(x), but now x = (x_1, …, x_i, …, x_n) is a vector. The partial derivative with respect to argument x_i is defined as:

\frac{\partial f}{\partial x_i}(x) = \lim_{\delta \to 0} \frac{f(x_1, \ldots, x_i + \delta, \ldots, x_n) - f(x_1, \ldots, x_i, \ldots, x_n)}{\delta}

We can also define second-order partial derivatives:

\frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial}{\partial x_i} \left( \frac{\partial f}{\partial x_j} \right)

Note that, for a twice continuously differentiable function, the order of differentiation does not matter:

\frac{\partial}{\partial x_i} \left( \frac{\partial f}{\partial x_j} \right) = \frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i} = \frac{\partial}{\partial x_j} \left( \frac{\partial f}{\partial x_i} \right)

A.2.2 Taylor Expansions

Consider a function f(x) that is continuously differentiable up to order N. Then there exists γ ∈ [0, δ] such that:

f(x + \delta) = f(x) + \sum_{n=1}^{N-1} \frac{\delta^n}{n!} f^{(n)}(x) + \frac{\delta^N}{N!} f^{(N)}(x + \gamma)

where n! = 1 × 2 × ⋯ × n.
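As a quick numerical illustration (a sketch, not part of the text), the truncated expansion of f(x) = e^x around x = 0 approximates f(δ) with an error of order δ^N, since every derivative of e^x at 0 equals 1:

```python
import math

def taylor_exp(delta, N):
    """Taylor expansion of e^x at x = 0 truncated at order N:
    sum of delta^n / n! for n = 0, ..., N-1 (all f^(n)(0) = 1)."""
    return sum(delta ** n / math.factorial(n) for n in range(N))

delta = 0.1
approx = taylor_exp(delta, 5)
print(abs(math.exp(delta) - approx))   # tiny error, roughly delta^5/5!
```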

A.2.3 Homogeneous and Homothetic Functions

A function F(x) = F(x_1, …, x_n) is homogeneous of degree k if:

F(\lambda x_1, \ldots, \lambda x_n) = \lambda^k F(x_1, \ldots, x_n) \qquad \forall \lambda > 0

If F(x_1, …, x_n) is homogeneous of degree k, then its partial derivatives ∂F/∂x_i are homogeneous of degree k − 1. Furthermore, the function F and its derivatives satisfy the following identity (Euler's identity):

\sum_{i=1}^{n} x_i \frac{\partial F}{\partial x_i} = k F(x_1, \ldots, x_n)

A function F(x) is homothetic if:

F(x) = F(y) \implies F(\lambda x) = F(\lambda y) \qquad \forall \lambda > 0

There is a relation between homothetic and homogeneous functions. Consider a homothetic function such that F(λx) is increasing in λ. Then there exist a homogeneous function g and an increasing function f such that:

F(x) = f[g(x)]

A.2.4 Elasticities and Log-linear Differentiation

The elasticity of a function of a single variable f(x) is equal to:

\varepsilon[f(x)] = \frac{d \operatorname{Log} f(x)}{d \operatorname{Log} x} = \frac{x}{f(x)} \frac{d f(x)}{d x}

The elasticity of a product of functions is the sum of their elasticities:

\varepsilon[f(x) g(x)] = \varepsilon[f(x)] + \varepsilon[g(x)]

The elasticity of a sum of functions is a weighted average of their elasticities:

\varepsilon[f(x) + g(x)] = \frac{f(x)}{f(x) + g(x)} \varepsilon[f(x)] + \frac{g(x)}{f(x) + g(x)} \varepsilon[g(x)]

We consider a function of several variables f(x) = f(x_1, …, x_n). The partial elasticity of f with respect to x_i is equal to:

\varepsilon_i[f(x)] = \frac{\partial \operatorname{Log} f(x)}{\partial \operatorname{Log} x_i} = \frac{x_i}{f(x)} \frac{\partial f(x)}{\partial x_i}

If a function is homogeneous of degree k, then its partial elasticities sum to k:

f(\lambda x_1, \ldots, \lambda x_n) = \lambda^k f(x_1, \ldots, x_n) \implies \sum_{i=1}^{n} \varepsilon_i[f(x)] = k
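The elasticity rules above can be checked by finite differences in logs (a numerical sketch, not part of the text): for f(x) = c·x^a the elasticity is a, and elasticities add up for products:

```python
import math

def elasticity(f, x, h=1e-6):
    """Numerical elasticity: centered difference of log f against log x."""
    return (math.log(f(x * (1 + h))) - math.log(f(x * (1 - h)))) / \
           (math.log(1 + h) - math.log(1 - h))

f = lambda x: 3.0 * x ** 2       # elasticity 2
g = lambda x: 0.5 * x ** 1.5     # elasticity 1.5
fg = lambda x: f(x) * g(x)       # elasticity 2 + 1.5 = 3.5

print(elasticity(f, 2.0))        # close to 2
print(elasticity(fg, 2.0))       # close to 3.5
```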

A.2.5 Elasticities of Substitution

Consider a function of several variables f(x) = f(x_1, …, x_n). The marginal rate of substitution between x_i and x_j is equal to f_i/f_j, where:

f_i = \frac{\partial f}{\partial x_i} \qquad f_j = \frac{\partial f}{\partial x_j}

Now the elasticity of substitution between x_i and x_j is equal to:

\sigma_{ij} = \frac{\partial \operatorname{Log}(x_i / x_j)}{\partial \operatorname{Log}(f_j / f_i)} \qquad \text{along the curve } f(x) = c

where c is a constant. The constant elasticity of substitution (CES) function that we already saw in appendix 1.1 indeed has a constant elasticity of substitution:

f(x_1, \ldots, x_n) = \left( \sum_{i=1}^{n} a_i x_i^{(\sigma - 1)/\sigma} \right)^{\sigma/(\sigma - 1)}

\sigma_{ij} = \sigma \qquad \forall i, j

A.2.6 Concavity and Convexity

A subset S of R^n is convex if:

x \in S, \; y \in S, \; \lambda \in [0, 1] \implies \lambda x + (1 - \lambda) y \in S

A function f(x) defined on a convex subset S of R^n is concave if for every λ ∈ [0, 1] and every x ∈ S and y ∈ S, we have:

f[\lambda x + (1 - \lambda) y] \geq \lambda f(x) + (1 - \lambda) f(y)

A function f(x) is strictly concave if for 0 < λ < 1 and every x ∈ S and y ∈ S with x ≠ y, we have:

f[\lambda x + (1 - \lambda) y] > \lambda f(x) + (1 - \lambda) f(y)

Similarly the function f(x) is convex if:

f[\lambda x + (1 - \lambda) y] \leq \lambda f(x) + (1 - \lambda) f(y)

and it is strictly convex if (for x ≠ y):

f[\lambda x + (1 - \lambda) y] < \lambda f(x) + (1 - \lambda) f(y)

If f is twice continuously differentiable, we can characterize concavity through second-order derivatives. In particular, if f is a function of a single variable, f is concave if:

f''(x) \leq 0

If f is a function of two variables, the two following conditions are sufficient:

f_{11} \leq 0

f_{11} f_{22} - (f_{12})^2 \geq 0

Note that (A.53) and (A.54) imply f_{22} ≤ 0, so these two conditions are more symmetrical than they look.

A.3 Static Optimization

A.3.1 First- and Second-order Conditions

Consider a function F(x) = F(x_1, …, x_n) defined over a set S ⊂ R^n. A point x* is a maximum of the function over S if:

F(x^*) \geq F(x) \qquad \forall x \in S

Traditionally the conditions for a maximum are separated into first-order conditions (which involve the first-order derivatives) and second-order conditions (which involve the second-order derivatives). The first-order conditions for an interior maximum are:

\frac{\partial F}{\partial x_i} = 0 \qquad i = 1, \ldots, n

In the case where the function F has a single argument, the second-order condition is F''(x*) < 0, so that sufficient conditions for an interior (local) maximum are:

F'(x^*) = 0 \qquad F''(x^*) < 0

A.3.2 Maximization Under Equality Constraints

Consider the following problem:

\text{Maximize } F(x) = F(x_1, \ldots, x_n) \quad \text{s.t.} \quad g_i(x) = g_i(x_1, \ldots, x_n) = 0 \qquad i = 1, \ldots, m

We define the Lagrangian as:

\mathcal{L}(x, \lambda) = F(x) + \sum_{i=1}^{m} \lambda_i g_i(x)

The parameter λ_i is called the Lagrange multiplier associated with the ith constraint. The first-order necessary conditions for a maximum are:

\frac{\partial \mathcal{L}}{\partial x_i} = 0 \quad i = 1, \ldots, n \qquad\qquad \frac{\partial \mathcal{L}}{\partial \lambda_i} = 0 \quad i = 1, \ldots, m

If \mathcal{L}(x, λ) is concave in x, then the solution to these equations is indeed a maximum.
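A tiny worked example (a sketch, not from the text): maximize F(x_1, x_2) = x_1 x_2 subject to x_1 + x_2 = 1. The first-order conditions of the Lagrangian x_1 x_2 + λ(1 − x_1 − x_2) give x_1 = x_2 = λ, hence x_1 = x_2 = 1/2, which brute force along the constraint confirms:

```python
def F(x1, x2):
    return x1 * x2

# search along the constraint x2 = 1 - x1 on a fine grid
best = max((F(x1, 1 - x1), x1) for x1 in
           (i / 1000 for i in range(1001)))
print(best)   # (0.25, 0.5): maximum 1/4 reached at x1 = 1/2
```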

A.3.3 An Envelope Theorem

We now assume that both the maximand and the constraints are functions of a vector of parameters a, so that the maximization program is:

\text{Maximize } F(x, a) \quad \text{s.t.} \quad g_i(x, a) = 0 \qquad i = 1, \ldots, m

The Lagrangian is now defined as:

\mathcal{L}(x, \lambda, a) = F(x, a) + \sum_{i=1}^{m} \lambda_i g_i(x, a)

Define the value function as:

V(a) = \max_x \{ F(x, a) \mid g_i(x, a) = 0 \quad i = 1, \ldots, m \}

Then the envelope theorem says:

\frac{\partial V(a)}{\partial a_i} = \frac{\partial \mathcal{L}(x, \lambda, a)}{\partial a_i} \qquad \forall i

where the value of x in (A.62) is a solution of the maximization program. As an example of application of this theorem, we give a very simple interpretation of the Lagrange multiplier when F(x, a) = F(x) and the ith constraint takes the particular form:

g_i(x) = a_i

Then direct application of the envelope theorem shows that:

\lambda_i = \frac{\partial V(a)}{\partial a_i}

The Lagrange multiplier is equal to the marginal contribution of a unit of a_i to the overall value function.
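The multiplier interpretation can be checked numerically (a sketch, not from the text): maximize x_1 x_2 subject to x_1 + x_2 = a. The optimum is x_1 = x_2 = a/2, the value function is V(a) = a²/4, and the multiplier λ = a/2 equals dV/da:

```python
def V(a):
    """Brute-force value function: max of x1*(a - x1) along the constraint."""
    return max(x1 * (a - x1) for x1 in (a * i / 10000 for i in range(10001)))

a, h = 2.0, 1e-3
dV_da = (V(a + h) - V(a - h)) / (2 * h)   # finite-difference dV/da
print(dV_da)    # close to lam = a/2 = 1.0
```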

A.3.4 Maximization Under Inequality Constraints

Consider the following problem:

\text{Maximize } F(x, a) \quad \text{s.t.} \quad g_i(x, a) \geq 0 \qquad i = 1, \ldots, m

where F and the g_i's are concave in x. We define the Lagrangian as:

\mathcal{L}(x, \lambda, a) = F(x, a) + \sum_{i=1}^{m} \lambda_i g_i(x, a)

where λ_i is again the Lagrange multiplier associated with the ith constraint. Sufficient conditions for a maximum are:

\frac{\partial \mathcal{L}(x^*, \lambda, a)}{\partial x_i} = 0 \qquad i = 1, \ldots, n

\lambda_i \geq 0, \quad \lambda_i = 0 \ \text{if} \ g_i(x^*, a) > 0 \qquad i = 1, \ldots, m

The value function is now defined as:

V(a) = \max_x \{ F(x, a) \mid g_i(x, a) \geq 0 \quad i = 1, \ldots, m \}

and the envelope theorem now says:

\frac{\partial V(a)}{\partial a_i} = \frac{\partial \mathcal{L}(x, \lambda, a)}{\partial a_i} \qquad \forall i

A.4 Dynamic Optimization

Consider the following dynamic maximization problem:

\text{Maximize } \int_0^T F(x_t, u_t, t)\,dt \quad \text{s.t.} \quad \dot{x}_t = g(x_t, u_t, t)

where x_t is called the state variable, u_t the control variable, \dot{x}_t = dx_t/dt, and the initial value x_0 is historically given. We define the Hamiltonian H_t as:

H_t(x_t, u_t, \lambda_t, t) = F(x_t, u_t, t) + \lambda_t g(x_t, u_t, t)

where λ_t is a multiplier similar to the Lagrange multiplier. Necessary conditions for a maximum are:

\frac{\partial H_t}{\partial u_t} = 0

\dot{x}_t = \frac{\partial H_t}{\partial \lambda_t}

\dot{\lambda}_t = -\frac{\partial H_t}{\partial x_t}

There can be other conditions depending on the constraints on the endpoints: (a) if x_T is free, then λ_T = 0; (b) if x_T ≥ \bar{x}_T, then λ_T ≥ 0; (c) if x_T is given, then λ_T is free.

A.4.1 Some Sufficient Conditions

If the Hamiltonian function Ht (xt, ut, λ‎t, t) is concave in (xt, ut), then the conditions (A.71) to (A.73) are sufficient for a maximum (Mangasarian, 1966).

If the Hamiltonian is not concave in (x_t, u_t), there is a weaker sufficient condition (Arrow and Kurz, 1970): let us denote as u*(x_t, λ_t, t) the value of the control variable that maximizes H_t(x_t, u_t, λ_t, t) for given (x_t, λ_t, t), and define the maximized Hamiltonian H_t^* as:

H_t^*(x_t, \lambda_t, t) = H_t[x_t, u^*(x_t, \lambda_t, t), \lambda_t, t]

Then a sufficient condition for the above conditions to yield a maximum is that the maximized Hamiltonian H_t^* be concave in x_t.

A.4.2 The Current Value Hamiltonian

In many applications the function F contains a discount factor and is written as:

F(x_t, u_t, t) = e^{-\rho t} f(x_t, u_t, t)

so that the maximization problem is:

\text{Maximize } \int_0^T e^{-\rho t} f(x_t, u_t, t)\,dt \quad \text{s.t.} \quad \dot{x}_t = g(x_t, u_t, t)

We now use a new multiplier variable μ_t = λ_t e^{ρt} and define the current value Hamiltonian H_t^c as:

H_t^c = f(x_t, u_t, t) + \mu_t g(x_t, u_t, t)

Then necessary conditions for optimality are:

\frac{\partial H_t^c}{\partial u_t} = 0

\dot{x}_t = \frac{\partial H_t^c}{\partial \mu_t}

\dot{\mu}_t = \rho \mu_t - \frac{\partial H_t^c}{\partial x_t}

A.4.3 Transversality Conditions

For infinite horizon optimization problems, such as those we encountered in the Ramsey model in chapter 7, a first set of optimality conditions usually consists of first-order conditions, for example, the Euler equations of the consumer. We encountered a second type of condition, the transversality condition. We give here, by way of example, a simple intuitive description of what this transversality condition means.1

To make the exposition particularly simple, we use the same framework as that of the Ramsey model with fixed incomes and in discrete time that we studied in chapter 4. We consider an infinitely lived agent who in period t receives an exogenous income Y_t and consumes C_t. She has an intertemporal utility:

\sum_{t=0}^{\infty} \beta^t U(C_t)

and is subject to the budget constraints:

D_{t+1} = R_{t+1} (D_t + Y_t - C_t)

where D_t is the amount of government debt (the only asset) the agent holds at the beginning of period t. She thus solves the following optimization problem:

\text{Maximize } \sum_{t=0}^{\infty} \beta^t U(C_t) \quad \text{s.t.} \quad D_{t+1} = R_{t+1}(D_t + Y_t - C_t)

The Lagrangian for this program is:

\sum_{t=0}^{\infty} \beta^t \left[ U(C_t) + \lambda_t \left( D_t + Y_t - C_t - \frac{D_{t+1}}{R_{t+1}} \right) \right]

The first-order conditions for C_t and D_t are, respectively:

U'(C_t) = \lambda_t

\lambda_t = \beta R_{t+1} \lambda_{t+1}

Combining (A.83) and (A.84) we obtain the traditional Euler equation:

U'(C_t) = \beta R_{t+1} U'(C_{t+1})

Now define the discount factors:

\Delta_t = \frac{1}{R_1 \cdots R_t} \qquad \Delta_0 = 1

so that:

\Delta_t = R_{t+1} \Delta_{t+1}

The budget equation (A.81) can be rewritten:

\Delta_{t+1} D_{t+1} = \Delta_t (D_t + Y_t - C_t)

Summing equations (A.88) from time 0 to time T, we obtain:

\Delta_{T+1} D_{T+1} + \sum_{t=0}^{T} \Delta_t C_t = D_0 + \sum_{t=0}^{T} \Delta_t Y_t

Now the transversality condition says that the limit, as time goes to infinity, of the discounted value of asset holdings must go to 0, that is:

\lim_{t \to \infty} \Delta_t D_t = 0

Indeed, if this limit were positive, there would be "purchasing power left at infinity": the agent could transfer this purchasing power to an earlier date and consume more at that date, so the initial situation would not be optimal. Conversely, if this limit were negative, the state would permanently finance this agent's consumption, which we rule out.

A.5 Dynamic Programming

The technique of dynamic programming (Bellman, 1957) is a specific optimization technique used to solve maximization problems of the following type:2

\text{Maximize } \sum_{t=1}^{T} \beta^t U_t(x_t, u_t) \quad \text{s.t.} \quad x_{t+1} = g_t(x_t, u_t)

The variable u_t is the control variable, which is chosen by the maximizing agent. The variable x_t is the state variable, which embeds all information necessary at date t. Both u_t and x_t can be vectors as well as scalars.

Note that we have described a deterministic, finite horizon problem. We study this case first, followed by the deterministic infinite horizon case and a stochastic problem.

A.5.1 Deterministic Finite Horizon

The method of dynamic programming is based on a value function, denoted V_t, which represents the maximal utility that can be expected from period t (included) onward, for a given initial value x_t:

V_t(x_t) = \max \sum_{s=t}^{T} \beta^{s-t} U_s(x_s, u_s) \quad \text{s.t.} \quad x_{s+1} = g_s(x_s, u_s) \quad s \geq t

This program is solved recursively, starting with the last period:

V_T(x_T) = \max_{u_T} U_T(x_T, u_T)

Now the value function in T − 1 is obtained by:

V_{T-1}(x_{T-1}) = \max_{u_{T-1}} \{ U_{T-1}(x_{T-1}, u_{T-1}) + \beta V_T[g_{T-1}(x_{T-1}, u_{T-1})] \}

and similarly for all previous periods:

V_t(x_t) = \max_{u_t} \{ U_t(x_t, u_t) + \beta V_{t+1}[g_t(x_t, u_t)] \}
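The backward recursion above can be sketched on a toy problem (not from the text): eat a cake of integer size over T periods, with per-period utility √u and transition x_{t+1} = x_t − u_t. Each V_t is computed from V_{t+1}:

```python
import math

T, beta, X = 3, 0.95, 10          # horizon, discount factor, cake size
V = [[0.0] * (X + 1) for _ in range(T + 2)]   # V[T+1][x] = 0 everywhere

# backward induction: t = T, T-1, ..., 1
for t in range(T, 0, -1):
    for x in range(X + 1):
        V[t][x] = max(math.sqrt(u) + beta * V[t + 1][x - u]
                      for u in range(x + 1))

print(V[1][X])   # maximal discounted utility of a size-10 cake
```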

A.5.2 Deterministic Infinite Horizon

This time we solve the problem:

\text{Maximize } \sum_{t=1}^{\infty} \beta^t U_t(x_t, u_t) \quad \text{s.t.} \quad x_{t+1} = g_t(x_t, u_t)

The difference is that because the horizon is infinite, there is no such thing as starting from the last period. One usually studies stationary problems such that the functions U and g are time independent. In that case, the value function is also time independent and satisfies the functional equation:

V(x) = \max_u \{ U(x, u) + \beta V[g(x, u)] \}
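The functional equation can be solved by value function iteration (a sketch, not from the text): with β < 1 the Bellman operator is a contraction, so repeated application on a grid converges to the fixed point V:

```python
import math

beta, X = 0.9, 20                  # discount factor, state grid 0..X
V = [0.0] * (X + 1)

# iterate V <- max_u { sqrt(u) + beta * V[x - u] } until convergence
for _ in range(500):
    V = [max(math.sqrt(u) + beta * V[x - u] for u in range(x + 1))
         for x in range(X + 1)]

print(V[X])   # converged value at the largest state
```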

A.5.3 A Stochastic Problem

We now assume that the law of transition is also a function of a stochastic variable ε_{t+1} (stochastic variables are defined precisely in sections A.7 and A.8, which the reader may want to read before this section):

x_{t+1} = g_t(x_t, u_t, \varepsilon_{t+1})

where the probability distribution of the variable ε_{t+1} depends on x_t and u_t. The dynamic equation giving the value function in the finite horizon case becomes:

V_t(x_t) = \max_{u_t} \{ U_t(x_t, u_t) + \beta E_t V_{t+1}[g_t(x_t, u_t, \varepsilon_{t+1})] \}

which replaces (A.93), and the functional equation for the infinite horizon case becomes:

V(x) = \max_u \{ U(x, u) + \beta E V[g(x, u, \varepsilon)] \}

which replaces (A.94).

A.6 Noncooperative Games

We are in a noncooperative game situation when economic agents interact, each independently choosing an action, which will be called his strategy. Assume there are n agents indexed by i = 1, …, n. Agent i has a strategy s_i. The utility function of agent i is a function of all strategies:

U_i(s_1, \ldots, s_n) = U_i(s_i, s_{-i})

where s_{-i} is the set of all strategies except that of agent i:

s_{-i} = \{ s_j \mid j \neq i \}

A.6.1 Nash Equilibrium

A pure strategy Nash equilibrium (Nash, 1953) is a set of strategies s_i^*, i = 1, …, n, such that:

U_i(s_i^*, s_{-i}^*) \geq U_i(s_i, s_{-i}^*) \qquad \forall s_i

We can express this in terms of the best response function ψ_i(s_{-i}):

\psi_i(s_{-i}) = \arg\max_{s_i} U_i(s_i, s_{-i})

A Nash equilibrium is a set of s_i^* such that each strategy is a best response to the other ones:

s_i^* \in \psi_i(s_{-i}^*)

If the utility functions are differentiable, the Nash equilibrium satisfies the following necessary conditions:

\frac{\partial U_i}{\partial s_i} = 0 \qquad i = 1, \ldots, n
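The best-response characterization suggests a simple computation (a sketch, not from the text): in a linear Cournot duopoly where firm i maximizes U_i = (1 − q_1 − q_2) q_i, the best response is q_i = (1 − q_j)/2, and iterating the two best responses converges to the Nash equilibrium q_1 = q_2 = 1/3:

```python
def best_response(qj):
    """Maximizer of (1 - qi - qj)*qi with respect to qi."""
    return (1 - qj) / 2

q1 = q2 = 0.0
for _ in range(100):
    q1, q2 = best_response(q2), best_response(q1)

print(q1, q2)   # both close to 1/3
```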

A.6.2 A Two-period Game

We consider a two-period, two-player game and see that depending on the timing of the play the solutions obtained can be quite different.

The players are denoted X and Y. Their strategies in periods 1 and 2 are, respectively, (x_1, x_2) for X and (y_1, y_2) for Y, and their utility functions are:

U_X(x_1, y_1, x_2, y_2)

U_Y(x_1, y_1, x_2, y_2)

A.6.3 Nash Equilibrium with Commitment

We first assume that each player decides in the first period her strategies for both the first and the second period. In other words, she can commit to a second-period strategy: we say there is commitment. The optimality conditions for players X and Y are, respectively:

\frac{\partial U_X}{\partial x_1} = 0 \qquad \frac{\partial U_X}{\partial x_2} = 0

\frac{\partial U_Y}{\partial y_1} = 0 \qquad \frac{\partial U_Y}{\partial y_2} = 0

A.6.4 Subgame Perfect Nash Equilibrium

In reality games are usually played sequentially. Even if players announce their second-period strategies in the first period, they are not bound by this announcement. In particular x2 and y2 are effectively decided on in the second period, when x1 and y1 have already been played and cannot be changed. As a result we obtain what is called a subgame perfect, or time consistent equilibrium (Selten, 1975; Kydland and Prescott, 1977).

To characterize this equilibrium, we place ourselves in the second period. At this stage x_1 and y_1 are given. We have a Nash equilibrium in x_2 and y_2 characterized by:

\frac{\partial U_X}{\partial x_2} = 0 \qquad \frac{\partial U_Y}{\partial y_2} = 0

In particular, conditions (A.108) yield the best response functions:

x_2 = X_2(x_1, y_1, y_2)

y_2 = Y_2(x_1, y_1, x_2)

Going back to the first period, each player knows that the values that will be played in the second period are given by equations (A.109) and (A.110), so that players X and Y maximize, respectively:

U_X[x_1, y_1, X_2(x_1, y_1, y_2), Y_2(x_1, y_1, x_2)]

U_Y[x_1, y_1, X_2(x_1, y_1, y_2), Y_2(x_1, y_1, x_2)]

Player X maximizes with respect to x_1, player Y with respect to y_1. The first-order conditions are, respectively, for players X and Y:

\frac{\partial U_X}{\partial x_1} + \frac{\partial U_X}{\partial x_2} \frac{\partial X_2}{\partial x_1} + \frac{\partial U_X}{\partial y_2} \frac{\partial Y_2}{\partial x_1} = 0

\frac{\partial U_Y}{\partial y_1} + \frac{\partial U_Y}{\partial x_2} \frac{\partial X_2}{\partial y_1} + \frac{\partial U_Y}{\partial y_2} \frac{\partial Y_2}{\partial y_1} = 0

We may first note that if ∂X_2/∂y_1 = 0 and ∂Y_2/∂x_1 = 0, then the solutions of the system (A.106, A.107) and of the system (A.108, A.113, A.114) are the same. But as soon as the second-period optimal strategies of a player depend on the other player's first-period strategy, the two outcomes will be different.

A.6.5 Bargaining: The Strategic Approach

We now describe the strategic approach to bargaining.3 Assume that agents X and Y bargain over the partition of a cake of size 1. Denote X's share of the cake by x, and Y's share by y = 1 − x. Their respective undiscounted utility functions in period t are:

U_X(x) \quad \text{and} \quad U_Y(y)

The traditional Nash cooperative solution (Nash, 1950) says that x is the solution of:

x = \arg\max \{ \gamma_X \operatorname{Log}[U_X(x)] + \gamma_Y \operatorname{Log}[U_Y(1 - x)] \}

where γ_X and γ_Y represent X's and Y's weights in the negotiation.

We now show that such a solution can actually be obtained as the outcome of a bargaining process in time where the two agents make alternating offers to each other (Rubinstein, 1982).

More precisely, in even periods X proposes to Y a partition x of the cake, with y = 1 − x. If Y accepts, this partition is implemented. If Y refuses, she makes in the next (odd) period a proposal for a partition, and so on. If an agreement is reached in period t, the two players receive, respectively, the following discounted utilities:

\delta_X^t U_X(x) \quad \text{and} \quad \delta_Y^t U_Y(1 - x)

where δ_X and δ_Y are, respectively, X's and Y's discount factors. We will now show that the dynamic game just described yields a solution that is very similar to that in formula (A.116).

Linear Utilities

We begin with the case of linear utility functions (Rubinstein, 1982):

U_X(x) = x \qquad U_Y(y) = y

We find the solution by determining the maximal and minimal shares of the cake that X can obtain. Let us start with the maximum, denoted x_M (table A.1).

If X can obtain at most x_M in period 2, then because of discounting she will be ready to accept δ_X x_M in period 1, which gives Y at least 1 − δ_X x_M. Now if Y must get at least 1 − δ_X x_M in period 1, then because of discounting she must be given at least δ_Y(1 − δ_X x_M) in period 0, which leaves X with at most 1 − δ_Y(1 − δ_X x_M) in period 0.

Table A.1 Bounds on alternating offers

period   offer made by   X obtains at most          Y obtains at least
0        X               1 − δ_Y(1 − δ_X x_M)       δ_Y(1 − δ_X x_M)
1        Y               δ_X x_M                    1 − δ_X x_M
2        X               x_M
We note that player X is in the same situation in period 0 and in period 2, so the outcomes for X must be the same, which yields:

x_M = 1 - \delta_Y (1 - \delta_X x_M)

so that:

x_M = \frac{1 - \delta_Y}{1 - \delta_X \delta_Y}

This was for the maximum attainable x_M. The same reasoning applies to the minimum x_m, and we find the same number. So the shares of the two players are uniquely defined and equal to:

x = x_M = x_m = \frac{1 - \delta_Y}{1 - \delta_X \delta_Y}

1 - x = \frac{\delta_Y (1 - \delta_X)}{1 - \delta_X \delta_Y}

We note that because of the discrete timing, the solution is asymmetrical: playing first confers an advantage on X. Note, indeed, that if δ_X = δ_Y = δ, we have:

x = \frac{1}{1 + \delta} > \frac{1}{2}

Now we can make the problem symmetrical by assuming that the time span between offers tends to 0. Calling Δ this time span, we take:

\delta_X = e^{-\rho_X \Delta} \qquad \delta_Y = e^{-\rho_Y \Delta}

Then we find in the limit, when Δ → 0:

x = \frac{\rho_Y}{\rho_X + \rho_Y} \qquad 1 - x = \frac{\rho_X}{\rho_X + \rho_Y}

This is now symmetrical and similar to the generalized Nash solution, with:

\gamma_X = \frac{1}{\rho_X} \qquad \gamma_Y = \frac{1}{\rho_Y}

We see that the implicit bargaining power of the two agents is inversely related to their impatience.
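This limit can be checked numerically (a sketch, not from the text): the Rubinstein share x = (1 − δ_Y)/(1 − δ_X δ_Y), with δ_i = e^{−ρ_i Δ}, approaches ρ_Y/(ρ_X + ρ_Y) as Δ shrinks:

```python
import math

def rubinstein_x(dX, dY):
    """First mover's share with discount factors dX, dY."""
    return (1 - dY) / (1 - dX * dY)

rhoX, rhoY = 0.05, 0.10            # Y is the more impatient player
for Delta in (1.0, 0.1, 0.001):
    dX, dY = math.exp(-rhoX * Delta), math.exp(-rhoY * Delta)
    print(Delta, rubinstein_x(dX, dY))

print(rhoY / (rhoX + rhoY))        # the limiting share, 2/3
```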

Concave Utilities

We now consider the more general case (Hoel, 1986) where the utilities U_X(x) and U_Y(y) are concave and such that U_i(0) = 0. Then consider the two numbers x and y defined by:

U_X(y) = \delta_X U_X(x)

U_Y(1 - x) = \delta_Y U_Y(1 - y)

A reasoning similar to the one made for linear utilities shows that the outcome is (x, 1 − x). Indeed, let us assume that x is the minimum that X can expect in period 2. Then in period 1, Y must at least propose a value y such that:

U_X(y) = \delta_X U_X(x)

Now in period 0, X must propose to Y a value 1 − x that gives her the same utility as 1 − y in period 1, that is:

U_Y(1 - x) = \delta_Y U_Y(1 - y)

Because U_X and U_Y are concave, the values x and y given by (A.127) and (A.128) are unique. To make the problem symmetrical, we take again:

\delta_X = e^{-\rho_X \Delta} \qquad \delta_Y = e^{-\rho_Y \Delta}

and go to the limit Δ → 0. The system (A.127), (A.128) then yields:

\lim_{\Delta \to 0} x = \arg\max \left\{ \frac{\operatorname{Log}[U_X(x)]}{\rho_X} + \frac{\operatorname{Log}[U_Y(1 - x)]}{\rho_Y} \right\}

which has the first-order condition:

\frac{1}{\rho_X} \frac{U_X'(x)}{U_X(x)} = \frac{1}{\rho_Y} \frac{U_Y'(1 - x)}{U_Y(1 - x)}

As an example, if:

U_X(x) = x^{\kappa_X} \qquad U_Y(x) = x^{\kappa_Y}

we obtain:

x = \frac{\kappa_X \rho_Y}{\kappa_X \rho_Y + \kappa_Y \rho_X} = \frac{\kappa_X / \rho_X}{(\kappa_X / \rho_X) + (\kappa_Y / \rho_Y)}

A.7 Stochastic Variables

A.7.1 Probabilities

We consider an experiment whose outcome is not known in advance and can take values in a set Ω‎. Ω‎, called the sample space, is the set of all possible outcomes. A subset A ⊂ Ω‎ is called an event. If the subset A is a point in Ω‎, it is called an elementary event.

For example, if the experiment consists in tossing a coin, there are two elementary events, heads and tails, and Ω‎ = {heads, tails}.

A probability function is a real-valued function P(A), defined for every subset A ⊂ Ω and satisfying the following axioms:

  • 0 ≤ P(A) ≤ 1

  • P(Ω) = 1

  • If A ∩ B = ∅, then P(A ∪ B) = P(A) + P(B)

Consider two events A and B. Events A and B are stochastically independent if:

P(A \cap B) = P(A) P(B)

The probability that A occurs, given that B has occurred, that is, the conditional probability P(A | B), is defined by:

P(A \mid B) = \frac{P(A \cap B)}{P(B)} \qquad \text{if } P(B) > 0

A.7.2 One-dimensional Random Variables

We start with a random variable X, which can take values x ∈ R, where R is the real line. If the potential values form a discrete set, we denote as f(x) the probability that X takes the value x. The function f(x) has the property that:

\sum_x f(x) = 1

The probability of an event A is:

P(X \in A) = \sum_{x \in A} f(x)

Similarly, if the random variable is continuous, the probability that X is in the interval [x, x + dx] is f(x) dx, with:

\int_R f(x)\,dx = 1

The function f(x) is called the probability density function. The probability of an event A is:

P(X \in A) = \int_A f(x)\,dx

One also defines the cumulative distribution function F(z):

F(z) = \operatorname{Prob}(X \leq z)

It is easily derived from the probability density function. In the discrete case:

F(z) = \sum_{x \leq z} f(x)

and in the continuous case:

F(z) = \int_{-\infty}^{z} f(x)\,dx

A.7.3 Moments of Random Variables

The mean, or expected value, of the random variable X is given by:

\mu = E(X) = \sum_x x f(x)

in the discrete case, and:

\mu = E(X) = \int_{-\infty}^{+\infty} x f(x)\,dx

in the continuous case. More generally, the expectation of a function h(X) of the random variable is:

E[h(X)] = \sum_x h(x) f(x)

in the discrete case, and:

E[h(X)] = \int_{-\infty}^{+\infty} h(x) f(x)\,dx

in the continuous case.

If the function h is convex, we have Jensen's inequality:

h[E(X)] \leq E[h(X)]

Some of the expectations defined in (A.147) and (A.148) are often used in economics. Notably, one defines the central moment of order k as:

\mu_k = E(X - \mu)^k

A particularly common moment is the central moment of order 2, called the variance:

\operatorname{Var}(X) = E(X - \mu)^2

One also defines the standard deviation as the square root of the variance:

\sigma(X) = [\operatorname{Var}(X)]^{1/2}
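These definitions are easy to compute directly (a sketch, not from the text): the mean, variance, and standard deviation of a fair six-sided die, with probability function f(x) = 1/6:

```python
from fractions import Fraction

outcomes = range(1, 7)
f = Fraction(1, 6)                                # f(x) = 1/6 for each face

mu = sum(x * f for x in outcomes)                 # E(X) = 7/2
var = sum((x - mu) ** 2 * f for x in outcomes)    # E(X - mu)^2 = 35/12
sigma = float(var) ** 0.5                         # standard deviation

print(mu, var, sigma)
```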

A.7.4 Two-dimensional Random Variables

We consider two random variables X and Y with a joint density (or probability) function f(x, y). The expectation of a function h is defined in the discrete case as:

E[h(X, Y)] = \sum_{x,y} h(x, y) f(x, y)

and in the continuous case as:

E[h(X, Y)] = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} h(x, y) f(x, y)\,dx\,dy

An important second-order moment is the covariance between the two variables:

\operatorname{Cov}(X, Y) = E\{ [X - E(X)][Y - E(Y)] \}

If Cov(X, Y) = 0, then X and Y are said to be uncorrelated. One defines the coefficient of correlation between X and Y:

\operatorname{Corr}(X, Y) = \frac{\operatorname{Cov}(X, Y)}{\sigma(X) \sigma(Y)}

One easily checks that:

-1 \leq \operatorname{Corr}(X, Y) \leq 1

We have the following useful relations between the second-order moments:

\operatorname{Var}(X + Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) + 2 \operatorname{Cov}(X, Y)

\operatorname{Var}(X - Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) - 2 \operatorname{Cov}(X, Y)

A.7.5 Some Classic Distributions

We briefly describe a few classic distributions for which we give the density function, mean, variance, and, when possible, the moments of order k.

The Normal Distribution

The normal distribution is defined for x ∈ (−∞, +∞) and has the density function:

f(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left[ -\frac{(x - \mu)^2}{2\sigma^2} \right]

The mean and variance are:

E(X) = \mu

\operatorname{Var}(X) = \sigma^2

The Log-normal Distribution

The log-normal distribution is defined for x ∈ (0, +∞); the variable Log x is normal, so x has the density function:

f(x) = \frac{1}{x \sigma \sqrt{2\pi}} \exp\left[ -\frac{(\operatorname{Log} x - \mu)^2}{2\sigma^2} \right]

The mean, variance, and moment of order k are:

E(X) = \exp\left( \mu + \frac{\sigma^2}{2} \right)

\operatorname{Var}(X) = \exp(2\mu) \left[ \exp(2\sigma^2) - \exp(\sigma^2) \right]

E(X^k) = \exp\left( k\mu + \frac{k^2 \sigma^2}{2} \right)

The Uniform Distribution

The uniform distribution is defined for x ∈ [A, B] and has the density function:

f(x) = \begin{cases} 1/(B - A) & A \leq x \leq B \\ 0 & \text{otherwise} \end{cases}

The mean and variance are:

E(X) = \frac{A + B}{2} \qquad \operatorname{Var}(X) = \frac{(B - A)^2}{12}

The Exponential Distribution

The exponential distribution has the density function:

f(x) = \begin{cases} \lambda \exp(-\lambda x) & x \geq 0 \\ 0 & x < 0 \end{cases}

and its mean and variance are:

E(X) = \frac{1}{\lambda} \qquad \operatorname{Var}(X) = \frac{1}{\lambda^2}

A.7.6 Multivariate Normal Laws and Conditional Probabilities

Assume that X and Y are jointly normal with means μ_X and μ_Y, variances σ_X^2 and σ_Y^2, and covariance σ_XY. Then the stochastic variable Y conditional on X is a normal variable with the following mean and variance:

E(Y \mid X) = \mu_Y + \frac{\sigma_{XY}}{\sigma_X^2} (X - \mu_X)

\sigma^2(Y \mid X) = \sigma_Y^2 \left( 1 - \frac{\sigma_{XY}^2}{\sigma_X^2 \sigma_Y^2} \right)

A.8 Time Series and Stochastic Processes

Call Z the set of integers and consider a set of random variables Xt indexed by tZ. The infinite vector X = {Xt, tZ} is called a discrete time series.

A.8.1 Stationary Processes

A process is second-order stationary (we henceforth call it stationary for brevity) if:

E X_t^2 < \infty \qquad \forall t \in Z

E X_t = E X_s \qquad \forall t, s \in Z

\operatorname{cov}(X_t, X_{t+h}) = \gamma(h) \qquad \forall t \in Z, \; \forall h \in Z

For a stationary process we define the autocorrelation function as:

\rho(h) = \frac{\gamma(h)}{\gamma(0)}

A.8.2 White Noises

A white noise of dimension 1 is a process (ε_t, t ∈ Z) where the ε_t's are centered and uncorrelated:

E(\varepsilon_t) = 0 \qquad E(\varepsilon_t \varepsilon_s) = 0 \quad t \neq s

We can also define a white noise of dimension n, where this time the ε_t's are vectors of the same dimension, with the same variance-covariance matrix Ω. They are also centered and uncorrelated in time:

E(\varepsilon_t) = 0 \qquad E(\varepsilon_t \varepsilon_s^T) = 0 \quad t \neq s

A.8.3 Expectations

Quite often in this book we use the expectation at some time t of a variable at a later date, say t + j. To make that precise, we first define the information set at date t ∈ Z as the set of all information available at date t, which we denote I_t. As an example of what I_t contains, if the agent observes a particular variable X_t, a usual assumption is that I_t contains the current and past values of X_t:

\{ X_t, X_{t-1}, \ldots \} \subset I_t

Now the expectation at t of Xt+j, which we denote Et (Xt+j) is simply defined as the mathematical expectation of Xt+j, conditional on the information contained in It.

To give a simple example of such an expectation, we consider the following process:

X_t = X_{t-1} + \varepsilon_t

where εt‎ is a white noise. Such a process is called a random walk. We can write Xt+j as:

X_{t+j} = X_t + \varepsilon_{t+1} + \dots + \varepsilon_{t+j}

Because the expected values, as of period t, of all the white noises ε_{t+1}, …, ε_{t+j} are equal to 0, we find:

E_t X_{t+j} = X_t

(p.531) Now a useful property of these expectations is the law of iterated expectations:4

E_t\left(E_{t+i} X_{t+j}\right) = E_t\left(X_{t+j}\right) \qquad 0 \leq i \leq j

A.8.4 Lag Operators

We call lag operator L the operator such that:

L(X_t) = X_{t-1}

L^j(X_t) = X_{t-j}

This operator is linear and invertible. Its inverse L−1 is defined by:

L^{-j}(X_t) = E_t X_{t+j}

Note that there are two “times” involved: the time t at which the expectation is taken, and the time t + j at which the variable occurs. When shifting time forward, we shift only the time at which the variable occurs, not the time at which the expectation is taken.

A very useful property of these lag operators is that we can make computations with them as if they were algebraic numbers, provided all expectations are taken as of the same period. We see an application of this shortly when computing solutions to a rational expectations dynamic equation.

A.8.5 Lag Polynomials

One can define polynomial series in the operator L. Consider a time series X = (X_t, t ∈ Z) and a sequence (a_i, i ∈ Z) such that:

\sum_{i=-\infty}^{+\infty} |a_i| < +\infty

Then the process defined by:

Y_t = \sum_{i=-\infty}^{+\infty} a_i X_{t-i}

(p.532) is stationary. We write:

A(L) = \sum_{i=-\infty}^{+\infty} a_i L^i

The process Yt can be rewritten as:

Y_t = A(L)\, X_t
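For a finite lag polynomial, applying A(L) to a series is just a one-sided convolution. A minimal sketch (the coefficients and data below are illustrative, and X_{t−i} is taken to be 0 before the start of the sample):

```python
import numpy as np

# Sketch: apply A(L) = a_0 + a_1 L + a_2 L^2 to a series,
# i.e. Y_t = sum_i a_i X_{t-i}.
a = np.array([1.0, 0.5, 0.25])          # a_0, a_1, a_2
x = np.arange(10, dtype=float)          # X_0, ..., X_9

# np.convolve(x, a, mode="full") gives sum_i a_i X_{t-i}; keeping the
# first len(x) entries aligns Y_t with X_t (zeros before the sample).
y = np.convolve(x, a, mode="full")[:len(x)]

# Check one entry by hand: Y_3 = X_3 + 0.5 X_2 + 0.25 X_1
print(y[3])   # 3 + 0.5*2 + 0.25*1 = 4.25
```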

A.8.6 Simple Inversion Formulas

On several occasions in this book we have to find the inverses of simple operators like 1 − λL or 1 − μL^{-1}, where the absolute values of λ and μ are smaller than 1. As we indicated, we can treat the lag operator like an algebraic number, so the inverse of 1 − λL is:

\sum_{i=0}^{+\infty} \lambda^i L^i

and that of 1 − μL^{-1}:

\sum_{i=0}^{+\infty} \mu^i L^{-i}
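The first expansion can be checked numerically. The sketch below (λ = 0.8 and the truncation order are illustrative choices) applies the truncated series Σ λ^i L^i to a series and verifies that applying 1 − λL recovers the original series, up to a truncation term of order λ^{N+1} that is numerically negligible:

```python
import numpy as np

# Sketch: verify that (1 - lam*L) undoes the truncated inverse
# y_t = sum_{i=0}^N lam^i x_{t-i}.
lam = 0.8
N = 200
rng = np.random.default_rng(3)
x = rng.normal(size=1000)

coeffs = lam ** np.arange(N + 1)
y = np.convolve(x, coeffs, mode="full")[:len(x)]

# (1 - lam*L) y_t = y_t - lam*y_{t-1}; recovered[k] corresponds to t = k+1.
recovered = y[1:] - lam * y[:-1]
# For t >= N+1 the only discrepancy is the truncation term lam^{N+1} x_{t-N-1}.
err = np.max(np.abs(recovered[N:] - x[N + 1:]))
print(err)   # essentially zero
```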

A.8.7 ARMA Processes

We say that the process X_t is autoregressive of order p, or AR(p), if it can be written as:

A(L)\, X_t = \varepsilon_t

where ε_t is a white noise and A(L) is a polynomial of order p:

A(L) = a_0 + a_1 L + \dots + a_p L^p

We say that the process X_t is a moving average of order q, or MA(q), if it can be written as:

X_t = B(L)\, \varepsilon_t

(p.533) where ε_t is a white noise and B(L) is a polynomial of order q:

B(L) = b_0 + b_1 L + \dots + b_q L^q

Finally, the process X_t is an autoregressive moving average of order (p, q), or ARMA(p, q), if it can be written as:

A(L)\, X_t = B(L)\, \varepsilon_t

where the polynomials A (L) and B (L) have been defined in equations (A.194) and (A.196).
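A short simulation sketch of an ARMA(1,1) process (the values of φ and θ and the sample size are illustrative; the closed-form first autocorrelation used as a benchmark is the standard ARMA(1,1) formula, which is not derived in the text):

```python
import numpy as np

# Sketch: simulate X_t = phi X_{t-1} + eps_t + theta eps_{t-1},
# i.e. (1 - phi L) X_t = (1 + theta L) eps_t.
rng = np.random.default_rng(4)
phi, theta = 0.7, 0.3
T = 200_000
eps = rng.normal(size=T)
x = np.zeros(T)
for t in range(1, T):
    x[t] = phi * x[t - 1] + eps[t] + theta * eps[t - 1]

# Standard ARMA(1,1) result:
# rho(1) = (1 + phi*theta)(phi + theta) / (1 + 2*phi*theta + theta^2)
xd = x - x.mean()
rho1 = np.mean(xd[:-1] * xd[1:]) / np.mean(xd * xd)
rho1_theory = (1 + phi * theta) * (phi + theta) / (1 + 2 * phi * theta + theta ** 2)
print(rho1, rho1_theory)   # both near 0.80
```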

A.8.8 Martingales

A process M = (M_t, t ∈ Z) is a martingale if:

E(M_t \mid I_t) = M_t \qquad \forall t

E(M_t \mid I_{t-1}) = M_{t-1} \qquad \forall t

For example, the random walk defined in equation (A.180) is a martingale.

A.9 Solutions to a Rational Expectations Dynamic Equation

We already studied (in chapter 3) the solutions to a simple dynamic equation involving the current and expected values of the endogenous variable. We now study a dynamic equation involving the lagged value as well. We want to solve the dynamic equation:

A x_t - B x_{t-1} - C\, E_t x_{t+1} = D z_t

where x_t is the endogenous variable, z_t an exogenous random variable, and:

A > 0 \qquad B > 0 \qquad C > 0 \qquad A > B + C

We present the two most commonly used techniques for solving such equations: lag operators and undetermined coefficients.

(p.534) A.9.1 Lag Operators

Using the lag operator defined in section A.8.4, we rewrite equation (A.210) as:

(A - BL - C L^{-1})\, x_t = D z_t

We want to factorize the term in parentheses in (A.202) as:

(A - BL - C L^{-1}) = \chi\,(1 - \lambda L)(1 - \mu L^{-1})

Identifying the terms one by one we find:

\lambda\chi = B \qquad \mu\chi = C

\chi(1 + \lambda\mu) = A

We thus have from (A.204):

\mu = \frac{\lambda C}{B} \qquad \chi = \frac{B}{\lambda}

and, inserting (A.206) into (A.205), we find that the autoregressive root λ solves the characteristic polynomial:

\Psi(\lambda) = C\lambda^2 - A\lambda + B = 0

We have:

\Psi(0) = B > 0

\Psi(1) = B + C - A < 0

There is thus one root λ‎ between 0 and 1. We must also verify that the other root μ‎ in (A.203) is smaller than 1, which boils down to λ‎ < B/C. This is indeed the case, because:

\Psi\left(\frac{B}{C}\right) = \frac{B}{C}\,(B + C - A) < 0

The solution is therefore, from (A.202), (A.203), and (A.204):

x_t = \frac{D z_t}{\chi(1 - \lambda L)(1 - \mu L^{-1})} = \frac{\lambda D z_t}{B(1 - \lambda L)(1 - \mu L^{-1})}

(p.535) which can be rewritten:

x_t = \lambda x_{t-1} + \frac{\lambda D}{B} \sum_{j=0}^{\infty} \mu^j E_t(z_{t+j})

with μ‎ defined in (A.206).
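The factorization can be computed numerically. A sketch (the values of A, B, C are illustrative and satisfy A > B + C > 0):

```python
import numpy as np

# Sketch: compute the roots lambda and mu of the factorization.
A, B, C = 3.0, 1.0, 1.0

# lambda solves Psi(lambda) = C lambda^2 - A lambda + B = 0; take the
# root in (0, 1), which exists because Psi(0) > 0 and Psi(1) < 0.
roots = np.roots([C, -A, B])
lam = roots[(roots > 0) & (roots < 1)][0]
mu = lam * C / B            # mu = lambda C / B
chi = B / lam               # chi = B / lambda

print(lam)                      # root in (0, 1)
print(mu)                       # smaller than 1
print(chi * (1 + lam * mu))     # equals A (consistency check)
```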

A.9.2 The Method of Undetermined Coefficients

The method of undetermined coefficients consists in conjecturing a solution with unknown coefficients and determining the value of these coefficients using the constraints imposed by rational expectations. Here we conjecture a solution of the form:

x_t = \lambda x_{t-1} + \sum_{j=0}^{\infty} \kappa_j E_t z_{t+j}

From that we deduce:

E_t x_{t+1} = \lambda x_t + \sum_{j=0}^{\infty} \kappa_j E_t z_{t+1+j} = \lambda x_t + \sum_{j=1}^{\infty} \kappa_{j-1} E_t z_{t+j} = \lambda^2 x_{t-1} + \lambda \sum_{j=0}^{\infty} \kappa_j E_t z_{t+j} + \sum_{j=1}^{\infty} \kappa_{j-1} E_t z_{t+j}

Inserting (A.213) and (A.214) into the initial formula (A.200), we obtain:

A\left(\lambda x_{t-1} + \sum_{j=0}^{\infty} \kappa_j E_t z_{t+j}\right) - B x_{t-1} - C\left(\lambda^2 x_{t-1} + \lambda \sum_{j=0}^{\infty} \kappa_j E_t z_{t+j} + \sum_{j=1}^{\infty} \kappa_{j-1} E_t z_{t+j}\right) = D z_t

Setting the coefficient of x_{t−1} to 0, we find the characteristic equation giving λ:

\Psi(\lambda) = C\lambda^2 - A\lambda + B = 0

Note that this is the same as (A.207). So there is one root λ such that:

0 < \lambda < 1
(p.536) Now setting the coefficient of z_t in (A.215) to 0 yields:

\kappa_0 = \frac{D}{A - \lambda C}

and using (A.216):

\kappa_0 = \frac{\lambda D}{B}

Finally, setting the coefficient of E_t z_{t+j} (for j ≥ 1) to 0 gives:

\kappa_j = \frac{C}{A - \lambda C}\, \kappa_{j-1} = \eta\, \kappa_{j-1}

We may note, again using (A.216), that η = μ, where μ < 1 was defined in (A.206). Finally we find:

x_t = \lambda x_{t-1} + \frac{\lambda D}{B} \sum_{j=0}^{\infty} \mu^j E_t(z_{t+j})

This is the same expression as (A.212).

A.10 Dynamic Systems

A.10.1 Generalities

Consider an n-dimensional vector x_t. A discrete time dynamic system is characterized by a function F that depicts how this vector evolves over time:

x_{t+1} = F(x_t)

A steady state x*, or equilibrium, of this dynamic system is a fixed point of the above mapping:

x^* = F(x^*)

Consider the eigenvalues of the linearization of F around x*. The equilibrium is called:

  1. A sink if all eigenvalues are of modulus smaller than 1.

  2. A source if all eigenvalues are of modulus greater than 1. (p.537)

  3. A saddle in the other cases.

Figure A.1: A sink

The three cases are represented in figures A.1, A.2, and A.3 for a two-dimensional system.

A.10.2 A Two-dimensional Linear System

Consider the following linear discrete time dynamic system:

\begin{bmatrix} y_{t+1} \\ z_{t+1} \end{bmatrix} = A \begin{bmatrix} y_t \\ z_t \end{bmatrix}

where the matrix A is defined as:

A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}

and where the variables yt and zt have been redefined as deviations from their stationary values y* and z*, so that there is no constant term.

We are looking for λ‎’s and associated values of yt and zt such that:

\begin{bmatrix} y_{t+1} \\ z_{t+1} \end{bmatrix} = A \begin{bmatrix} y_t \\ z_t \end{bmatrix} = \lambda \begin{bmatrix} y_t \\ z_t \end{bmatrix}
Figure A.2: A source

Figure A.3: A saddle

(p.539) where λ‎ is a scalar. We know from section A.1 that λ‎’s must be such that the determinant of the matrix A − λ‎I is equal to 0, that is:

\begin{vmatrix} a - \lambda & b \\ c & d - \lambda \end{vmatrix} = 0

so λ‎ is given by the following equation:

\Psi(\lambda) = \lambda^2 - (a + d)\lambda + ad - bc = 0

Ψ(λ) is the characteristic polynomial, and its roots are called the eigenvalues. We recognize that ad − bc is the determinant D of the matrix A. The term a + d is called the trace and denoted T. The characteristic polynomial is rewritten:

\Psi(\lambda) = \lambda^2 - \lambda T + D = 0

This characteristic polynomial is represented in figure A.4. We now indicate for which combinations of T and D one obtains a sink, a saddle, or a source.

Let us start with the case of a sink. For that case to obtain, the two roots λ_1 and λ_2 must be smaller than 1 in absolute value. Looking at figure A.4, we see that there are three conditions:

(a) \Psi(1) > 0

(b) \Psi(-1) > 0

(c) \lambda_1 \lambda_2 < 1
Figure A.4: The characteristic polynomial


In terms of T and D this yields, respectively:

(a) 1 - T + D > 0

(b) 1 + T + D > 0

(c) D < 1

Using similar graphs, one sees that for a saddle there are two possible combinations:

1 - T + D > 0 \quad \text{and} \quad 1 + T + D < 0


1 - T + D < 0 \quad \text{and} \quad 1 + T + D > 0

Finally, for a source there are also two possible combinations:

1 - T + D > 0 \quad \text{and} \quad 1 + T + D > 0 \quad \text{and} \quad D > 1


1 - T + D < 0 \quad \text{and} \quad 1 + T + D < 0 \quad \text{and} \quad D < 1

All the foregoing combinations are represented in figure A.5.

Finally we must separate the cases where the roots are real from those where they are complex. The roots are real if the discriminant of equation (A.229) is positive, that is, if:

4D < T^2

which is also represented in figure A.5. Below the parabola D = T^2/4, the roots are real; above the parabola, the roots are complex.
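The classification of the equilibrium by eigenvalue moduli can be sketched in code (the example matrices below are illustrative):

```python
import numpy as np

# Sketch: classify the fixed point of a 2x2 linear system from the
# moduli of the eigenvalues of A.
def classify(A):
    moduli = np.abs(np.linalg.eigvals(A))
    if np.all(moduli < 1):
        return "sink"
    if np.all(moduli > 1):
        return "source"
    return "saddle"

A_sink = np.array([[0.5, 0.1], [0.2, 0.4]])    # eigenvalues 0.6 and 0.3
A_saddle = np.array([[1.5, 0.0], [0.0, 0.5]])  # eigenvalues 1.5 and 0.5
print(classify(A_sink))     # sink
print(classify(A_saddle))   # saddle
```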

A.11 Determinacy

An important problem in macrodynamics is that of determinacy, that is, the uniqueness or multiplicity of admissible dynamic trajectories.

At this stage there is no really satisfactory theory of what happens when a system is indeterminate. There seems to be agreement, however, that such a situation may be associated with some instability in economic (p.541) variables (because one may possibly jump from one trajectory to another), and one usually looks for policies that will make the system determinate. We want to indicate a few simple conditions for a dynamic system to have a determinate solution.

Figure A.5: Roots and dynamics

A.11.1 Global and Local Determinacy

There are actually, implicitly or explicitly, two different determinacy criteria in the literature: global and local determinacy.

There is global determinacy if there is only one admissible trajectory. That criterion is easy to state but sometimes difficult to prove, so many authors use a "local" criterion. One says that a dynamic system displays local determinacy around an equilibrium if there is only one admissible trajectory in a neighborhood of that equilibrium. That criterion will be easier to handle because, as we shall see, it can be assessed by looking only at the eigenvalues of the dynamic system.

A.11.2 Predetermined and Nonpredetermined Variables

Another fundamental distinction for the determinacy issue is that between predetermined and nonpredetermined variables. Predetermined variables are, as their name says, fully determined at time t, for example, because they are functions of past variables. As an example, a typical predetermined (p.542) variable is fixed capital, because the traditional equation of evolution of capital is:

K_t = (1 - \delta) K_{t-1} + I_{t-1}

A nonpredetermined variable, on the contrary, can take any value at time t. In particular it can “jump” to any value. For example, prices are usually thought of as nonpredetermined variables, at least in a Walrasian environment.

A.11.3 The Dynamic System

Let us call Y_t the n-dimensional vector of predetermined variables and Z_t the m-dimensional vector of nonpredetermined variables. The dynamic system is written:

\begin{bmatrix} Y_{t+1} \\ Z_{t+1} \end{bmatrix} = A \begin{bmatrix} Y_t \\ Z_t \end{bmatrix} + \Omega_t

where A is a square matrix of dimension (m + n) and Ω_t is a vector of exogenous variables. The "initial conditions" consist of the value of the predetermined variables at time 0, Y_0.

We first study two particular cases, and then give a more general condition, all for local determinacy.

A.11.4 Determinacy: Predetermined Variables

We start with the case where there is only one predetermined variable:

Y_{t+1} = a Y_t + b

We see in figure A.6 that if a < 1 there is one admissible trajectory5 starting from some given Y_0, whereas if a > 1 trajectories are divergent and there is no admissible trajectory. (p.543)

Figure A.6: Determinacy: a predetermined variable

A.11.5 Determinacy: Nonpredetermined Variables

We now consider the case of a single nonpredetermined variable:

Z_{t+1} = a Z_t + b

Although the trajectories (figure A.7) look fairly similar to those in figure A.6, the interpretation is almost opposite, because in the case of nonpredetermined variables there are no such things as initial conditions.

In the case where a < 1, there is an infinity of admissible trajectories, which all converge toward the long-run equilibrium Z^*. There is thus indeterminacy. On the contrary, if a > 1, all trajectories such that Z_0 ≠ Z^* are divergent. Only the stationary trajectory such that Z_t = Z^* for all t is admissible. The dynamic system is thus determinate.

A.11.6 A General Condition for Local Determinacy

Extending the foregoing intuitions to a system with n predetermined variables and m nonpredetermined variables, we have the following result (Blanchard and Kahn, 1980).

Local determinacy: If the number of eigenvalues of A outside the unit circle is equal to the number of nonpredetermined variables, there exists a unique admissible trajectory, and the system is locally determinate.

In this book we have most often encountered two particular cases: (a) there is only one nonpredetermined variable, in which case the (only) (p.544) eigenvalue must be greater than 1 for determinacy; and (b) there is one predetermined variable and one nonpredetermined variable, in which case one eigenvalue must be greater than 1 (in absolute value) and the other one smaller than 1.

Figure A.7: Determinacy: a nonpredetermined variable
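The Blanchard–Kahn counting condition can be sketched as follows (the matrix and the partition into one predetermined and one nonpredetermined variable are illustrative):

```python
import numpy as np

# Sketch: the Blanchard-Kahn counting condition.  n_jump is the number
# of nonpredetermined ("jump") variables in the system.
def blanchard_kahn_determinate(A, n_jump):
    """True iff #(eigenvalues of A outside the unit circle) == n_jump."""
    outside = np.sum(np.abs(np.linalg.eigvals(A)) > 1)
    return bool(outside == n_jump)

A = np.array([[0.9, 0.1],
              [0.0, 1.2]])   # eigenvalues 0.9 and 1.2
print(blanchard_kahn_determinate(A, n_jump=1))   # True
```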

A.12 Some Useful Calculations

We derive a few formulas that have been used without proof in chapters 12 and 13.

A.12.1 A Computation for Chapter 12

We want to compute:

E( ΘLogΘ )

where θ = Log Θ is a normal variable with mean μ_θ and variance σ_θ^2 such that:

E( Θ )=1

Condition (A.246) implies:

\mu_\theta + \frac{\sigma_\theta^2}{2} = 0

(p.545) Take θ‎ as the working variable. The expression in (A.245) is equal to:

\frac{1}{\sigma_\theta \sqrt{2\pi}} \int_{-\infty}^{+\infty} \theta\, e^{\theta} \exp\left[-\frac{(\theta - \mu_\theta)^2}{2\sigma_\theta^2}\right] d\theta

Let us denote:

z = \theta - \mu_\theta

The expression in (A.248) becomes:

\frac{e^{\mu_\theta}}{\sigma_\theta \sqrt{2\pi}} \int_{-\infty}^{+\infty} z \exp\left(z - \frac{z^2}{2\sigma_\theta^2}\right) dz + \frac{\mu_\theta\, e^{\mu_\theta}}{\sigma_\theta \sqrt{2\pi}} \int_{-\infty}^{+\infty} \exp\left(z - \frac{z^2}{2\sigma_\theta^2}\right) dz


We use the identity:

\exp\left(z - \frac{z^2}{2\sigma_\theta^2}\right) = \exp\left(\frac{\sigma_\theta^2}{2}\right) \exp\left[-\frac{(z - \sigma_\theta^2)^2}{2\sigma_\theta^2}\right]

Insert (A.251) into (A.250). The expression (A.250) becomes:

\exp\left(\mu_\theta + \frac{\sigma_\theta^2}{2}\right) \frac{1}{\sigma_\theta \sqrt{2\pi}} \int_{-\infty}^{+\infty} z \exp\left[-\frac{(z - \sigma_\theta^2)^2}{2\sigma_\theta^2}\right] dz + \mu_\theta \exp\left(\mu_\theta + \frac{\sigma_\theta^2}{2}\right) \frac{1}{\sigma_\theta \sqrt{2\pi}} \int_{-\infty}^{+\infty} \exp\left[-\frac{(z - \sigma_\theta^2)^2}{2\sigma_\theta^2}\right] dz

Formula (A.252) simplifies to:

(\mu_\theta + \sigma_\theta^2) \exp\left(\mu_\theta + \frac{\sigma_\theta^2}{2}\right)

so that, in view of (A.247):

E(\Theta \operatorname{Log} \Theta) = \frac{\sigma_\theta^2}{2}
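A Monte Carlo sketch confirms this result (the value of σ_θ and the number of draws are illustrative choices):

```python
import numpy as np

# Sketch: check E(Theta * log Theta) = sigma^2 / 2 when log Theta is
# normal with mean mu = -sigma^2/2, so that E(Theta) = 1.
rng = np.random.default_rng(5)
sigma = 0.4
mu = -sigma ** 2 / 2
theta = rng.normal(mu, sigma, size=5_000_000)
Theta = np.exp(theta)

print(Theta.mean())             # near 1
print((Theta * theta).mean())   # near sigma**2 / 2 = 0.08
```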

(p.546) A.12.2 A Computation for Chapter 13

We want to establish the formulas:

\sum_{j=0}^{\infty} j \lambda^j = \frac{\lambda}{(1 - \lambda)^2}

\sum_{j=t}^{\infty} j \lambda^j = \frac{t \lambda^t}{1 - \lambda} + \frac{\lambda^{t+1}}{(1 - \lambda)^2}

We first compute:

\Upsilon_t = \sum_{j=0}^{t} j \lambda^j

We make the change of variable j = 1 + i:

\Upsilon_t = \sum_{i=0}^{t-1} (1 + i) \lambda^{1+i} = \lambda \left[\sum_{i=0}^{t-1} (1 + i) \lambda^i\right] = \lambda \sum_{i=0}^{t-1} \lambda^i + \lambda \sum_{i=0}^{t-1} i \lambda^i = \lambda\, \frac{1 - \lambda^t}{1 - \lambda} + \lambda\, (\Upsilon_t - t \lambda^t)


so that:

\Upsilon_t = \frac{\lambda (1 - \lambda^t)}{(1 - \lambda)^2} - \frac{t \lambda^{t+1}}{1 - \lambda}

Taking the limit t → ∞ in (A.259) we find:

\Upsilon = \sum_{j=0}^{\infty} j \lambda^j = \frac{\lambda}{(1 - \lambda)^2}

which is formula (A.255). Finally we combine (A.257) and (A.260) to obtain:

\sum_{j=t}^{\infty} j \lambda^j = \Upsilon - \Upsilon_{t-1} = \frac{t \lambda^t}{1 - \lambda} + \frac{\lambda^{t+1}}{(1 - \lambda)^2}

which is formula (A.256).
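Both formulas are easy to verify numerically (the values of λ and t are illustrative; the infinite sums are truncated at a point where the remaining terms are negligible):

```python
import numpy as np

# Sketch: numeric check of the two series formulas.
lam, t = 0.6, 5
j = np.arange(0, 10_000)          # truncation; lam**10000 is negligible
full_sum = np.sum(j * lam ** j)               # sum_{j>=0} j lam^j
tail_sum = np.sum(j[t:] * lam ** j[t:])       # sum_{j>=t} j lam^j

print(full_sum, lam / (1 - lam) ** 2)   # both ~ 3.75
print(tail_sum, t * lam ** t / (1 - lam) + lam ** (t + 1) / (1 - lam) ** 2)
```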

(p.547) A.13 References

Although the above mathematical appendix should give sufficient mathematical material to read the book, some readers may want to pursue some particular directions further. The following is a (nonexhaustive) list of more advanced sources: Bellman (1957); Broze, Gourieroux, and Szafarz (1985, 1990); Chiang (1984, 1992); Chiang and Wainwright (2005); De la Fuente (2000); Dixit (1990); Gourieroux and Monfort (1995, 1996); Hirsch and Smale (1974); Intriligator (1971); Léonard and van Long (1992); Seierstad and Sydsaeter (1987); Stachurski (2009); and Sydsaeter, Strom, and Berck (2000).


(1.) More advanced treatments are found, among others, in Chiang (1992), Dixit (1990), Intriligator (1971), and Michel (1982, 1990).

(2.) Although discounting is not necessary for the finite horizon case, we introduce it from the start for the homogeneity of the exposition.

(4.) This is actually valid if I_t ⊂ I_{t+i} ⊂ I_{t+j}, that is, if the information set expands over time.

(5.) In this literature the usual definition of an admissible trajectory is a nonexplosive or nondivergent trajectory.