Likelihood-Based Inference in Cointegrated Vector Autoregressive Models

Søren Johansen

Print publication date: 1995

Print ISBN-13: 9780198774501

Published to Oxford Scholarship Online: November 2003

DOI: 10.1093/0198774508.001.0001


6 The Statistical Analysis of I(1) Models

Source: Likelihood-Based Inference in Cointegrated Vector Autoregressive Models
Author: Søren Johansen
Publisher: Oxford University Press
DOI: 10.1093/0198774508.003.0006

Abstract and Keywords

Contains the likelihood analysis of the I(1) models. The main result is the derivation of the method of reduced rank regression, due to Anderson. This solves the estimation problem for the unrestricted cointegration vectors, and hence the problem of deriving a test for cointegrating rank, the so-called trace test. The reduced rank algorithm is applied to a number of different models defined by restrictions on the deterministic terms.

Keywords: cointegrating rank, cointegrating vectors, deterministic terms, I(1) model, reduced rank regression, trace test

THIS chapter contains an analysis of the likelihood function of the I(1) models discussed in Chapter 5. The main result in section 6.1 is the derivation of the method of reduced rank regression, which solves the estimation problem for the unrestricted cointegration vectors and thereby the problem of deriving a test statistic for the hypothesis of cointegrating rank. The asymptotic distribution of this test statistic is discussed in Chapter 12, and the way it should be applied is discussed in Chapter 13.

It turns out that the method of reduced rank regression solves the estimation problem in a number of different models defined by various restrictions on the parameters. We give here the estimator of the unrestricted cointegrating vectors, and show in section 6.2 how it should be modified when restrictions are imposed on the deterministic terms. In Chapter 7 we discuss the modification needed for the estimation of cointegrating relations when they are subject to linear restrictions, and in Chapter 8 the modification needed when α is restricted.

6.1 Likelihood Analysis of H(r)

We define the reduced form error correction model as given by

$$\Delta X_t = \alpha\beta' X_{t-1} + \sum_{i=1}^{k-1} \Gamma_i \Delta X_{t-i} + \Phi D_t + \varepsilon_t, \qquad t = 1, \ldots, T, \tag{6.1}$$
where ε_t are independent N_p(0, Ω) and (α, β, Γ₁, ..., Γ_{k−1}, Φ, Ω) are freely varying parameters.

The advantage of this parametrization is in the interpretation of the coefficients: the effect of the levels is isolated in the matrix αβ′, and Γ₁, ..., Γ_{k−1} describe the short-term dynamics of the process. Sometimes the form

$$\Delta X_t = \sum_{i=1}^{k-1} \tilde\Gamma_i \Delta X_{t-i} + \alpha\beta' X_{t-k} + \Phi D_t + \varepsilon_t,$$

is given, where $\tilde\Gamma_i = \Pi_1 + \cdots + \Pi_i - I$. This reparametrization leads to the same statistical analysis.

In (6.1) we introduce the notation Z_{0t} = ΔX_t, Z_{1t} = X_{t−1}, and let Z_{2t} be the stacked variables ΔX_{t−1}, ..., ΔX_{t−k+1}, and D_t. We let Ψ be the matrix of parameters corresponding to Z_{2t}, that is, the matrix consisting of Γ₁, ..., Γ_{k−1}, and Φ. Thus Z_{2t} is a vector of dimension p(k − 1) + m and Ψ is a matrix of dimension p × (p(k − 1) + m).
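The stacking is a purely mechanical reshaping of the data. As a minimal Python sketch (not from the book; the names X for the T × p data matrix, D for the T × m deterministic terms, and the lag length k are assumptions of this illustration):

```python
import numpy as np

def build_z(X, D, k):
    """Stack the data as in (6.2): Z0[t] = dX_t, Z1[t] = X_{t-1},
    Z2[t] = (dX_{t-1}', ..., dX_{t-k+1}', D_t')'.  Rows keep
    t = k, ..., T-1 (0-based), giving T - k usable observations."""
    dX = np.diff(X, axis=0)                     # dX[j] = X[j+1] - X[j]
    Z0 = dX[k-1:]                               # Delta X_t
    Z1 = X[k-1:-1]                              # X_{t-1}
    lags = [dX[k-1-i:-i] for i in range(1, k)]  # Delta X_{t-i}, i = 1..k-1
    Z2 = np.hstack(lags + [D[k:]])              # append deterministic terms
    return Z0, Z1, Z2
```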

The model expressed in these variables becomes

$$Z_{0t} = \alpha\beta' Z_{1t} + \Psi Z_{2t} + \varepsilon_t, \qquad t = 1, \ldots, T. \tag{6.2}$$
This is clearly a non-linear regression model where the parameters Ψ are unrestricted and the coefficient matrix of the levels Z_{1t} is of reduced rank. The analysis of the likelihood function leads to the technique of reduced rank regression developed by Anderson (1951). We give the details here since the notation is needed for the asymptotic analysis. The log likelihood function is given, apart from a constant, by

$$\log L(\Psi, \alpha, \beta, \Omega) = -\tfrac{1}{2}T\log|\Omega| - \tfrac{1}{2}\sum_{t=1}^{T}(Z_{0t} - \alpha\beta' Z_{1t} - \Psi Z_{2t})'\,\Omega^{-1}(Z_{0t} - \alpha\beta' Z_{1t} - \Psi Z_{2t}).$$
The first-order conditions for estimating Ψ are given by

$$\sum_{t=1}^{T}(Z_{0t} - \alpha\beta' Z_{1t} - \hat\Psi Z_{2t})Z_{2t}' = 0. \tag{6.3}$$
We introduce the notation for the product moment matrices

$$M_{ij} = T^{-1}\sum_{t=1}^{T} Z_{it}Z_{jt}', \qquad i, j = 0, 1, 2, \tag{6.4}$$

and note that $M_{ij} = M_{ji}'$, i, j = 0, 1, 2.
We write (6.3) as

$$M_{02} = \alpha\beta' M_{12} + \hat\Psi M_{22},$$

such that

$$\hat\Psi(\alpha, \beta) = M_{02}M_{22}^{-1} - \alpha\beta' M_{12}M_{22}^{-1}. \tag{6.5}$$

This leads to the definition of the residuals

$$R_{0t} = Z_{0t} - M_{02}M_{22}^{-1}Z_{2t}, \tag{6.6}$$

$$R_{1t} = Z_{1t} - M_{12}M_{22}^{-1}Z_{2t}, \tag{6.7}$$

i.e. the residuals we would obtain by regressing ΔX_t and X_{t−1} on the lagged differences ΔX_{t−1}, ..., ΔX_{t−k+1}, and D_t, or equivalently Z_{0t} and Z_{1t} on Z_{2t}. The concentrated likelihood function is
$$\log L(\alpha, \beta, \Omega) = -\tfrac{1}{2}T\log|\Omega| - \tfrac{1}{2}\sum_{t=1}^{T}(R_{0t} - \alpha\beta' R_{1t})'\,\Omega^{-1}(R_{0t} - \alpha\beta' R_{1t}), \tag{6.8}$$
see also (A.29). Another way of writing this is as a regression equation in the residuals
$$R_{0t} = \alpha\beta' R_{1t} + \hat\varepsilon_t, \tag{6.9}$$
which would give the same likelihood as (6.8). Thus the parameters Ψ can be eliminated by regression and what remains in (6.9) is a reduced rank regression as investigated by Anderson (1951).

As a final piece of notation consider

$$S_{ij} = T^{-1}\sum_{t=1}^{T} R_{it}R_{jt}' = M_{ij} - M_{i2}M_{22}^{-1}M_{2j}, \qquad i, j = 0, 1. \tag{6.10}$$

For fixed β it is easy to estimate α and Ω by regressing R_{0t} on β′R_{1t}, which gives

$$\hat\alpha(\beta) = S_{01}\beta(\beta' S_{11}\beta)^{-1}, \tag{6.11}$$

$$\hat\Omega(\beta) = S_{00} - S_{01}\beta(\beta' S_{11}\beta)^{-1}\beta' S_{10} = S_{00} - \hat\alpha(\beta)(\beta' S_{11}\beta)\hat\alpha(\beta)', \tag{6.12}$$
and apart from the constant (2πe)^p, which disappears when forming ratios, we find

$$L_{\max}^{-2/T}(\hat\alpha(\beta), \beta, \hat\Omega(\beta)) = L_{\max}^{-2/T}(\beta) = |\hat\Omega(\beta)| = |S_{00} - S_{01}\beta(\beta' S_{11}\beta)^{-1}\beta' S_{10}|.$$
We next rewrite this expression, using the identity

$$\begin{vmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{vmatrix} = |\Sigma_{11}|\,|\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}| = |\Sigma_{22}|\,|\Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}|,$$
which is discussed in (A.26). Applying the identity to the matrix

$$\begin{vmatrix} S_{00} & S_{01}\beta \\ \beta' S_{10} & \beta' S_{11}\beta \end{vmatrix} = |S_{00}|\,|\beta'(S_{11} - S_{10}S_{00}^{-1}S_{01})\beta| = |\beta' S_{11}\beta|\,|S_{00} - S_{01}\beta(\beta' S_{11}\beta)^{-1}\beta' S_{10}|,$$
we find that

$$|S_{00} - S_{01}\beta(\beta' S_{11}\beta)^{-1}\beta' S_{10}| = |S_{00}|\,\frac{|\beta' S_{11}\beta - \beta' S_{10}S_{00}^{-1}S_{01}\beta|}{|\beta' S_{11}\beta|} = |S_{00}|\,\frac{|\beta'(S_{11} - S_{10}S_{00}^{-1}S_{01})\beta|}{|\beta' S_{11}\beta|}. \tag{6.13}$$
Thus the maximization of the likelihood function is equivalent to the minimization of the last factor in (6.13). This factor is minimized among all p × r matrices β by applying Lemma A.8, that is, by solving the eigenvalue problem

$$|\rho S_{11} - (S_{11} - S_{10}S_{00}^{-1}S_{01})| = 0,$$

or, for λ = 1 − ρ, the eigenvalue problem

$$|\lambda S_{11} - S_{10}S_{00}^{-1}S_{01}| = 0,$$

for eigenvalues λᵢ and eigenvectors vᵢ, such that

$$\lambda_i S_{11}v_i = S_{10}S_{00}^{-1}S_{01}v_i,$$

and vⱼ′S₁₁vᵢ = 1 if i = j and 0 otherwise, see Lemma A.8. Note that the eigenvectors diagonalize the matrix S₁₀S₀₀⁻¹S₀₁, since vⱼ′S₁₀S₀₀⁻¹S₀₁vᵢ = λᵢ if i = j and zero otherwise. Thus by simultaneously diagonalizing the matrices S₁₁ and S₁₀S₀₀⁻¹S₀₁ we can estimate the r-dimensional cointegrating space as the space spanned by the eigenvectors corresponding to the r largest eigenvalues. With this choice of β̂ we find from Lemma A.8 that
$$L_{\max}^{-2/T} = |S_{00}|\,\frac{|\hat\beta'(S_{11} - S_{10}S_{00}^{-1}S_{01})\hat\beta|}{|\hat\beta' S_{11}\hat\beta|} = |S_{00}|\prod_{i=1}^{r}(1 - \hat\lambda_i), \tag{6.14}$$

since by the choice of β̂ we have β̂′S₁₁β̂ = I, as well as β̂′S₁₀S₀₀⁻¹S₀₁β̂ = diag(λ̂₁, ..., λ̂_r).

For r = 0 we choose sp(β̂) = {0} and find Π = Π̂ = 0, and for r = p we can take sp(β̂) = R^p, and the estimate of Π is Π̂ = S₀₁S₁₁⁻¹. Note that we have solved all the models H(r), r = 0, ..., p, by the same eigenvalue calculation. The maximized likelihood is given for each r by (6.14), and by dividing the maximized likelihood function for r by the corresponding expression for r = p we get the likelihood ratio test

$$Q(H(r)\,|\,H(p))^{-2/T} = \frac{|S_{00}|\prod_{i=1}^{r}(1 - \hat\lambda_i)}{|S_{00}|\prod_{i=1}^{p}(1 - \hat\lambda_i)}.$$
The factor |S₀₀| cancels and we find the so-called trace statistic

$$-2\log Q(H(r)\,|\,H(p)) = -T\sum_{i=r+1}^{p}\log(1 - \hat\lambda_i).$$
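In code, the trace statistic for every r follows directly from the eigenvalue array of the previous sketch:

```python
def trace_stats(lam, T):
    """-2 log Q(H(r)|H(p)) for r = 0, ..., p-1."""
    return [-T * np.sum(np.log(1.0 - lam[r:])) for r in range(len(lam))]
```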
Under the hypothesis H(r) the estimates of β and α are related to the canonical variates between R_{0t} and R_{1t}, and the eigenvalues are the squared canonical correlations, see Appendix A or Anderson (1984). The estimate of β is given as the eigenvectors of (6.15), see below, corresponding to the r largest eigenvalues, that is, the choice of β̂ is the choice of the r linear combinations of X_{t−1} which have the largest squared partial correlations with the stationary process ΔX_t after correcting for lags and deterministic terms. We call such an analysis a reduced rank regression of ΔX_t on X_{t−1} corrected for (ΔX_{t−1}, ..., ΔX_{t−k+1}, D_t). The results are formulated in Theorem 6.1, where we also give the asymptotic distributions of the test statistics even though they will be derived in Chapter 12. Note that the estimate of β given here is the unrestricted estimator, which is relevant if we do not want to impose any restrictions.

THEOREM 6.1 Under the hypothesis

$$H(r): \Pi = \alpha\beta',$$
the maximum likelihood estimator of β is found by the following procedure: first solve the equation
$$|\lambda S_{11} - S_{10}S_{00}^{-1}S_{01}| = 0, \tag{6.15}$$

for the eigenvalues 1 ≥ λ̂₁ ≥ ⋯ ≥ λ̂_p ≥ 0 and eigenvectors V̂ = (v̂₁, ..., v̂_p), which we normalize by V̂′S₁₁V̂ = I. The cointegrating relations are estimated by
$$\hat\beta = (\hat v_1, \ldots, \hat v_r), \tag{6.16}$$
and the maximized likelihood function is found from

$$L_{\max}^{-2/T}(H(r)) = |S_{00}|\prod_{i=1}^{r}(1 - \hat\lambda_i). \tag{6.17}$$
The estimates of the other parameters are found by inserting β̂ into the above equations, i.e. by ordinary least squares for β = β̂.

The likelihood ratio test statistic Q(H(r)|H(p)) for H(r) in H(p) is found by comparing two expressions like (6.17). This gives the result

$$-2\log Q(H(r)\,|\,H(p)) = -T\sum_{i=r+1}^{p}\log(1 - \hat\lambda_i). \tag{6.18}$$
The likelihood ratio test statistic for testing H(r) in H(r + 1) is given by

$$-2\log Q(H(r)\,|\,H(r+1)) = -T\log(1 - \hat\lambda_{r+1}). \tag{6.19}$$
The asymptotic distribution of (6.18) depends on the deterministic terms present in the model, and is derived in Chapter 11. We assume here that rank(Π) = r.

If μ_t = 0 we find

$$-2\log Q(H(r)\,|\,H(p)) \xrightarrow{w} \operatorname{tr}\left\{\int_0^1(dB)F'\left[\int_0^1 FF'\,du\right]^{-1}\int_0^1 F(dB)'\right\}, \tag{6.20}$$

where F = B is a (p − r)-dimensional Brownian motion. The distribution is tabulated in Chapter 15, Table 15.1.

If μ_t = μ₀ and α⊥′μ₀ ≠ 0, the asymptotic distribution is given by (6.20) with F defined by

$$F_i(u) = B_i(u) - \int_0^1 B_i(s)\,ds, \quad i = 1, \ldots, p-r-1, \qquad F_{p-r}(u) = u - \tfrac{1}{2}. \tag{6.21}$$

The distribution is tabulated in Chapter 15, Table 15.3.

If μ_t = μ₀ + μ₁t and α⊥′μ₁ ≠ 0, then F is given by

$$F_i(u) = B_i(u) - a_i - b_i u, \quad i = 1, \ldots, p-r-1, \qquad F_{p-r}(u) = u^2 - a - bu, \tag{6.22}$$

where the random coefficients a, b, aᵢ, and bᵢ are found by regressing u², respectively Bᵢ, on a constant and a linear term. The distribution is tabulated in Chapter 15, Table 15.5.

Another way of formulating this basic estimation result is that we have performed a singular value decomposition of the unrestricted regression estimator Π̂ = S₀₁S₁₁⁻¹ with respect to the 'covariance matrices' S₀₀.₁ and S₁₁⁻¹, that is, of the matrix Π̃ = S₀₀.₁^{−1/2}Π̂S₁₁^{1/2}, since

$$|\rho I - \tilde\Pi\tilde\Pi'| = |\rho I - S_{00.1}^{-1/2}S_{01}S_{11}^{-1}S_{10}S_{00.1}^{-1/2}|,$$

which is zero if

$$|\rho S_{00.1} - S_{01}S_{11}^{-1}S_{10}| = |\rho S_{00} - (1 + \rho)S_{01}S_{11}^{-1}S_{10}| = 0,$$

which gives λ = ρ/(1 + ρ), where we have used the notation

$$S_{00.1} = S_{00} - S_{01}S_{11}^{-1}S_{10}.$$
The normalization β̂′S₁₁β̂ = I is convenient from a mathematical point of view, but may not be economically meaningful. It has the advantage that it can be made without assuming anything about which variables cointegrate, that is, without normalizing β.

Note that since α̂ = S₀₁β̂ we have

$$\hat\alpha' S_{00}^{-1}\hat\alpha = \hat\beta' S_{10}S_{00}^{-1}S_{01}\hat\beta = \operatorname{diag}(\hat\lambda_1, \ldots, \hat\lambda_r).$$

Thus the eigenvalues measure the size of the coefficients to the cointegrating relations, and the test statistics can be interpreted as measuring the 'length', in the metric given by S₀₀⁻¹, of the coefficients of the supposedly non-stationary components of X_t.

The calculation of the eigenvalues of equation (6.15) is performed as follows: first the matrix S₁₁ is diagonalized by solving the eigenvalue problem

$$|\rho I - S_{11}| = 0,$$

for eigenvalues ρ₁, ..., ρ_p and eigenvectors W = (w₁, ..., w_p), that is, we have the decomposition

$$S_{11} = W\operatorname{diag}(\rho_1, \ldots, \rho_p)W'.$$

Then we define $S_{11}^{-1/2} = W\operatorname{diag}(\rho_1^{-1/2}, \ldots, \rho_p^{-1/2})W'$ and solve the eigenvalue problem

$$|\lambda I - S_{11}^{-1/2}S_{10}S_{00}^{-1}S_{01}S_{11}^{-1/2}| = 0,$$

for eigenvalues λ̂₁, ..., λ̂_p and eigenvectors U = (u₁, ..., u_p). Finally we define the eigenvectors V = S₁₁^{−1/2}U. Thus we diagonalize the matrices S₁₁ and S₁₀S₀₀⁻¹S₀₁ simultaneously by the transformation V: the matrix S₁₁ is reduced to the identity and S₁₀S₀₀⁻¹S₀₁ to diag(λ̂₁, ..., λ̂_p). One could also simply find the eigenvalues of S₁₁⁻¹S₁₀S₀₀⁻¹S₀₁, but this matrix is not symmetric, hence a different numerical algorithm would have to be used.
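The two-step symmetrization just described can be written out directly; the following sketch mirrors the text (S₁₁ is assumed positive definite) and is an alternative to the generalized solver used earlier:

```python
def rrr_eig_explicit(S00, S01, S11):
    """Simultaneously diagonalize S11 and S10 S00^{-1} S01."""
    rho, W = np.linalg.eigh(S11)           # S11 = W diag(rho) W'
    S11_mh = W @ np.diag(rho**-0.5) @ W.T  # S11^{-1/2}
    A = S01.T @ np.linalg.solve(S00, S01)  # S10 S00^{-1} S01
    lam, U = np.linalg.eigh(S11_mh @ A @ S11_mh)
    V = S11_mh @ U                         # V' S11 V = I
    order = np.argsort(lam)[::-1]
    return lam[order], V[:, order]
```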

It is sometimes necessary to estimate the orthogonal complements of α and β. This can easily be done from the above results, since

$$\hat\beta_\perp = S_{11}(\hat v_{r+1}, \ldots, \hat v_p), \qquad \hat\alpha_\perp = S_{00}^{-1}S_{01}(\hat v_{r+1}, \ldots, \hat v_p)$$

satisfy the relations α̂⊥′α̂ = 0 and β̂⊥′β̂ = 0. These relations follow from the fact that the eigenvectors v₁, ..., v_p diagonalize both S₁₁ and S₁₀S₀₀⁻¹S₀₁.
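In terms of the eigenvector matrix from the sketches above this is immediate:

```python
def complements(S00, S01, S11, V, r):
    """alpha_perp and beta_perp from the last p - r eigenvectors."""
    tail = V[:, r:]                                # v_{r+1}, ..., v_p
    beta_perp = S11 @ tail                         # beta_perp' beta_hat = 0
    alpha_perp = np.linalg.solve(S00, S01 @ tail)  # alpha_perp' alpha_hat = 0
    return alpha_perp, beta_perp
```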

Note that the asymptotic distribution of the test statistic λ_max in (6.19) is not given here, but is left as Exercise 11.5 in Chapter 11. The properties of the test are posed as a problem in Chapter 12.

6.2 Models for the Deterministic Terms

In this section we analyse the hypotheses given by (5.13), ..., (5.17). The analysis of (5.13), (5.15), and (5.17) is given in section 6.1, where the analysis is for a general form of the deterministic term D_t. If we take ΦD_t = μ₀ + μ₁t we get the analysis of H(r); for ΦD_t = μ₀ we get the analysis of H₁(r); and finally H₂(r) is analysed with ΦD_t = 0. What remains is to discuss the models with a restriction on the deterministic terms: H*(r), where α⊥′μ₁ = 0, and H₁*(r), where μ₁ = 0 and α⊥′μ₀ = 0. The analysis is very similar to the one given in section 6.1 and the new models will not be treated in as much detail.

Consider first H*(r) given by (5.14), that is, Π = αβ′, ΦD_t = μ₀ + μ₁t, and α⊥′μ₁ = 0, so that μ₁ = αρ₁. We note the following relation

$$\alpha\beta' X_{t-1} + \mu_1 t = \alpha\beta' X_{t-1} + \alpha\rho_1 t = \alpha(\beta', \rho_1)(X_{t-1}', t)' = \alpha\beta^{*\prime}Z_{1t}^{*}, \tag{6.23}$$

where we define β* = (β′, ρ₁)′ and Z₀t* = Z₀t = ΔX_t, and let Z₂t* be the stacked variables ΔX_{t−1}, ..., ΔX_{t−k+1}, and 1, whereas Z₁t* = X_{t−1}* = (X_{t−1}′, t)′. Further we define Ψ* as the matrix {Γ₁, ..., Γ_{k−1}, μ₀}. The regression model then becomes

$$Z_{0t}^{*} = \alpha\beta^{*\prime}Z_{1t}^{*} + \Psi^{*}Z_{2t}^{*} + \varepsilon_t. \tag{6.24}$$

It is seen that by this reformulation we can estimate model H*(r) by reduced rank regression of Z₀t* on Z₁t* corrected for Z₂t*. This defines residuals R₀t*, R₁t*, and product moment matrices Sᵢⱼ*. Note in particular that S₁₁* is p₁ × p₁, whereas S₁₀* is p₁ × p and S₀₀* is p × p as before, with p₁ = p + 1.
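In terms of the earlier sketch the only change is in the stacking: t is appended to the levels and the constant is kept among the regressors that are partialled out. A hypothetical illustration:

```python
def build_z_star(X, k):
    """H*(r): levels augmented by t, constant kept in Z2."""
    T = X.shape[0]
    ones = np.ones((T, 1))
    Z0, Z1, Z2 = build_z(X, ones, k)   # constant as the only D_t term
    trend = np.arange(k, T, dtype=float).reshape(-1, 1)
    # any affine shift of the trend is absorbed by the constant in Z2
    return Z0, np.hstack([Z1, trend]), Z2
```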

Thus we solve the eigenvalue problem

$$|\lambda^{*}S_{11}^{*} - S_{10}^{*}S_{00}^{*-1}S_{01}^{*}| = 0,$$

for eigenvalues λ₁*, ..., λ_{p₁}*. Note that λ_{p₁}* = 0, since the matrix S₁₀*S₀₀*⁻¹S₀₁* is of dimension p₁ × p₁ but has rank p, such that |S₁₀*S₀₀*⁻¹S₀₁*| = 0. The likelihood ratio test of H*(r) in H(r) is calculated from

$$Q(H^{*}(r)\,|\,H(r))^{-2/T} = \frac{|S_{00}^{*}|\prod_{i=1}^{r}(1 - \lambda_i^{*})}{|S_{00}|\prod_{i=1}^{r}(1 - \hat\lambda_i)}.$$
The hypotheses H*(p) and H(p) are so close that the likelihood function attains the same maximal value, see p.161.

Hence we find for r = p that

$$|S_{00}^{*}|\prod_{i=1}^{p}(1 - \lambda_i^{*}) = |S_{00}|\prod_{i=1}^{p}(1 - \hat\lambda_i),$$

which shows that

$$-2\log Q(H^{*}(r)\,|\,H(r)) = T\log\frac{\prod_{i=r+1}^{p}(1 - \hat\lambda_i)}{\prod_{i=r+1}^{p}(1 - \lambda_i^{*})}.$$
The results are formulated in

THEOREM 6.2 Under the restrictions Π = αβ′, ΦD_t = μ₀ + μ₁t, and α⊥′μ₁ = 0, the cointegrating vectors are estimated by reduced rank regression of ΔX_t on (X_{t−1}′, t)′ corrected for lagged differences and the constant. The likelihood ratio test for the rank of Π is given by

$$-2\log Q(H^{*}(r)\,|\,H^{*}(p)) = -T\sum_{i=r+1}^{p}\log(1 - \lambda_i^{*}), \tag{6.25}$$

where λᵢ* solves the eigenvalue problem

$$|\lambda^{*}S_{11}^{*} - S_{10}^{*}S_{00}^{*-1}S_{01}^{*}| = 0, \tag{6.26}$$

for eigenvalues 1 ≥ λ₁* ≥ ⋯ ≥ λ_p* ≥ λ_{p₁}* = 0 and eigenvectors v₁*, ..., v_{p₁}*. The estimator for β* is given by β̂* = (v₁*, ..., v_r*). The likelihood ratio test of the restriction α⊥′μ₁ = 0 when there are r cointegrating vectors, that is, of H*(r) in H(r), is given by

$$-2\log Q(H^{*}(r)\,|\,H(r)) = T\sum_{i=r+1}^{p}\log\{(1 - \hat\lambda_i)/(1 - \lambda_i^{*})\}, \tag{6.27}$$

where λ̂ᵢ solves (6.15).

The asymptotic distribution of the likelihood ratio test statistic (6.25) is derived in Theorem 11.1, and is given by (6.20) with F defined by

$$F_i(u) = B_i(u) - \int_0^1 B_i(s)\,ds, \quad i = 1, \ldots, p-r, \qquad F_{p-r+1}(u) = u - \tfrac{1}{2}. \tag{6.28}$$

The distribution is tabulated by simulation in Chapter 15, Table 15.4. The asymptotic distribution of the likelihood ratio test statistic (6.27) is shown in Corollary 11.2 to be χ²(p − r).

In a completely analogous way we can estimate the model H₁*(r), where ΦD_t = μ₀ and α⊥′μ₀ = 0, so that μ₀ = αρ₀. In this case we note that

$$\alpha\beta' X_{t-1} + \alpha\rho_0 = \alpha(\beta', \rho_0)(X_{t-1}', 1)' = \alpha\beta^{*\prime}Z_{1t}^{*}. \tag{6.29}$$

For Z₂t* = (ΔX_{t−1}′, ..., ΔX_{t−k+1}′)′ we find the reduced rank regression (6.24) again, giving rise to new residuals and product moment matrices Sᵢⱼ*. We formulate the results in

THEOREM 6.3 Under the restrictions Π = αβ′, ΦD_t = μ₀, and α⊥′μ₀ = 0, the cointegrating vectors are estimated by reduced rank regression of ΔX_t on (X_{t−1}′, 1)′ corrected for lagged differences. The likelihood ratio test for the rank of Π, when α⊥′μ₀ = 0, is given by

$$-2\log Q(H_1^{*}(r)\,|\,H_1^{*}(p)) = -T\sum_{i=r+1}^{p}\log(1 - \lambda_i^{*}), \tag{6.30}$$

where λᵢ* solves the eigenvalue problem (6.26). The likelihood ratio test of the restriction α⊥′μ₀ = 0 when there are r cointegrating vectors, that is, of H₁*(r) in H₁(r), is given by

$$-2\log Q(H_1^{*}(r)\,|\,H_1(r)) = T\sum_{i=r+1}^{p}\log\{(1 - \hat\lambda_i)/(1 - \lambda_i^{*})\}, \tag{6.31}$$

where λ̂ᵢ solves (6.15).

The asymptotic distribution of the likelihood ratio test (6.30) is derived from Theorem 11.1 and is given by (6.20) with F defined by

$$F_i(u) = B_i(u), \quad i = 1, \ldots, p-r, \qquad F_{p-r+1}(u) = 1. \tag{6.32}$$

The distribution is tabulated by simulation in Chapter 15, Table 15.2.

The asymptotic distribution of the likelihood ratio test statistic (6.31) is shown in Corollary 11.2 to be χ²(p − r).

6.3 Determination of Cointegrating Rank

The problem of determining the cointegrating rank will be discussed in detail in Chapter 12, but we give here some rules for the application of the results in Theorems 6.1, 6.2, and 6.3.

Consider for simplicity first the situation where ΦD_t = 0, that is, there are no deterministic terms in the model. In this case the test statistic is given by (6.18), where the preliminary regression does not involve correction for any deterministic terms, since they are not present in the model. The limit distribution of the likelihood ratio test statistic is given by (6.20) with F = B, and is tabulated in Chapter 15, Table 15.1. The tables are then used as follows.

If r represents a priori knowledge, we simply calculate the test statistic Q_r = −2 log Q(H(r)|H(p)) and compare it with the relevant quantile in Table 15.1. Note that the tables give the asymptotic distribution only, and that the actual distribution depends not only on the finite value of T but also on all the short-term parameters as well as on α and β. If one wants to be absolutely sure that the quantiles reported are at all reasonable, one would have to supplement the comparison with the asymptotic tables by a simulation investigation. This will not be attempted here.

A small sample correction has been suggested by Reinsel and Ahn (1992). It consists of using the factor (T − kp) instead of the sample size T in the calculation of the test statistic for cointegrating rank. This idea has been investigated by Reimers (1992), and it seems that the approximation to the limit distribution is better with the corrected sample size. The theoretical justification for this result presents a very difficult mathematical problem, which it would be extremely useful to solve.

A common situation is that one has no or very little prior information about r, and in this case it seems more reasonable to estimate r from the data. This is done as follows. First compare Q₀ with its quantile c₀, say, from Table 15.1. If Q₀ ≤ c₀, then we let r̂ = 0; if Q₀ > c₀ we calculate Q₁ and compare it with c₁. If now Q₁ ≤ c₁ we define r̂ = 1, and if not we compare Q₂ with its quantile c₂, etc. This defines an estimator r̂ which takes on the values 0, 1, ..., p and which converges in probability to the true value in a sense discussed in Chapter 12.
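A sketch of this sequential procedure, assuming the trace statistics from the earlier snippet and an array of quantiles taken from the tables in Chapter 15:

```python
def select_rank(trace, quantiles):
    """First r with Q_r <= c_r; p if H(0), ..., H(p-1) are all rejected."""
    for r, (q, c) in enumerate(zip(trace, quantiles)):
        if q <= c:
            return r
    return len(trace)
```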

Next consider the case ΦD_t = μ₀, where μ₀ is allowed to vary freely. We see from Theorem 6.1 that the limit distribution depends on the assumption that α⊥′μ₀ ≠ 0.

Sometimes inspection of the graphs shows that the trend is present, and we proceed as above and calculate Q₀, ..., Q_{p−1} and compare them with the relevant quantiles from Table 15.3, since now the limit distribution is given by (6.20) with F defined by (6.21). We start by comparing Q₀ with its quantile and proceed to Q₁, etc. This gives the possibility of estimating the value of r.

If it is clear that there is no deterministic trend, it seems more reasonable to analyse the model H₁*(r) and calculate the relevant test statistic −2 log Q(H₁*(r)|H₁*(p)). That is, we take the consequence of the assumption that α⊥′μ₀ = 0 and analyse the model thus specified: we change the test statistic to reflect the hypothesis we are interested in, rather than changing the limit distribution of the previous statistic.

If we do not know whether there is a trend or not, we have to determine the presence of the trend as well as the cointegrating rank at the same time, since the tests are not similar, not even asymptotically; that is, the distribution and the limit distribution depend on which parameter point is considered under the null hypothesis. We then have a non-nested set of hypotheses, see Table 5.1:

$$\begin{array}{ccccc} H_1(0) & \subset \cdots \subset & H_1(r) & \subset \cdots \subset & H_1(p) \\ \cup & & \cup & & \cup \\ H_1^{*}(0) & \subset \cdots \subset & H_1^{*}(r) & \subset \cdots \subset & H_1^{*}(p) \end{array}$$

It holds that H₁(p) is almost the same hypothesis as H₁*(p), in the sense that −2 log Q(H₁*(r)|H₁(p)) = −2 log Q(H₁*(r)|H₁*(p)).

Thus we test all hypotheses against H 1(p). The simultaneous determination of trend and cointegrating rank is now performed as follows:

We calculate Q₀, ..., Q_{p−1}, Q₀*, ..., Q_{p−1}*. We accept rank r and the presence of a trend if H₁(0), ..., H₁(r − 1) are rejected and if also the models H₁*(0), ..., H₁*(r − 1) as well as H₁*(r) are rejected, but H₁(r) is accepted.

We accept cointegrating rank r and the absence of a trend if H 1 * ( r ) is accepted and H 1(0), . . . , H 1(r − 1) as well as H 1 * ( 0 ) , . . . , H 1 * ( r 1 ) are rejected. This solution represents a choice and reflects a priority in the ordering of the hypotheses.

If instead we assume no quadratic trend in the process but allow a linear trend in all directions, we can analyse the model H*(r). These models are nested, and the rank is determined as above by calculating −2 log Q(H*(r)|H*(p)) for r = 0, ..., p − 1 and comparing them with their quantiles from Table 15.4, starting with r = 0.

6.4 Exercises

6.1

Consider the model

$$\Delta X_t = \alpha\beta' X_{t-1} + \mu_0 + \mu_1 t + \varepsilon_t.$$
We define the parameters

$$\mu_0 = \alpha\rho_0 + \alpha_\perp\gamma_0, \qquad \mu_1 = \alpha\rho_1 + \alpha_\perp\gamma_1.$$
1. Show by Granger's representation theorem that X_t in general has a quadratic trend, and show how this model can be estimated by reduced rank regression.

2. Show that if α⊥′μ₁ = 0 then the quadratic trend disappears, but the process still has a linear trend given by

$$\tau_1 = \beta_\perp(\alpha_\perp'\beta_\perp)^{-1}\alpha_\perp'\alpha_\perp\left(\gamma_0 + (\alpha_\perp'\alpha_\perp)^{-1}\alpha_\perp'\beta(\beta'\beta)^{-1}\rho_1\right) - \beta(\beta'\beta)^{-1}\rho_1,$$

see Chapter 5, formula (5.20).

3. Show how one can estimate the parameters of the model by reduced rank regression under the constraint α⊥′μ₁ = 0.

4. What happens under the constraint α⊥′μ₀ = 0, with μ₁ unrestricted?

5. Under the restriction α⊥′μ₁ = 0, the hypothesis of trend stationarity of X_{1t}, say, can be formulated as the hypothesis that the unit vector (1, 0, ..., 0)′ is one of the cointegrating vectors. Discuss how the parameters can be estimated by reduced rank regression in this case.

6.2

A normalization or identification of the cointegrating vectors. Consider the model

$$\Delta X_t = \alpha\beta' X_{t-1} + \varepsilon_t.$$

The equation

$$|\lambda\Sigma_{\beta\beta} - \Sigma_{\beta 0}\Sigma_{00}^{-1}\Sigma_{0\beta}| = 0$$

has solutions λ₁ > ⋯ > λ_r and V = (v₁, ..., v_r) such that V′Σ_{ββ}V = I. Now define β̃ = βV, that is, β̃ᵢ = βvᵢ, and define α̃ = α(V′)⁻¹, such that α̃β̃′ = αβ′.
1. Show that

$$\operatorname{Var}(\tilde\beta' X_{t-1}) = I, \qquad \tilde\alpha = \Sigma_{0\tilde\beta} = \operatorname{Cov}(\Delta X_t, \tilde\beta' X_{t-1}), \qquad \tilde\alpha'\Sigma_{00}^{-1}\tilde\alpha = \operatorname{diag}(\lambda_1, \ldots, \lambda_r).$$

2. Show that similar relations hold for the estimated values of α and β.

6.3

An example of a model based on rational expectations can be formulated as

$$E_t(c_0' X_t + c_1' X_{t+1}) + c = 0, \tag{6.33}$$

for a process X_t which we assume is generated by the equation

$$\Delta X_t = \Pi X_{t-1} + \mu + \varepsilon_t, \qquad t = 1, \ldots, T. \tag{6.34}$$
Here X_t is p-dimensional, the matrix c is (q × 1), and c₀ and c₁ are known (p × q) matrices. As an example of this situation consider the variables (i_t^{au}, i_t^{us}, exch_t, p_t^{us}, p_t^{au}). The hypothesis of uncovered interest parity is formulated as

$$i_t^{us} - i_t^{au} = E_t\Delta exch_{t+1}, \tag{6.35}$$

and the hypothesis of equal expected real interest rates is formulated as

$$i_t^{us} - E_t\Delta p_{t+1}^{us} = i_t^{au} - E_t\Delta p_{t+1}^{au}. \tag{6.36}$$
1. Show that (6.35) and (6.36) taken together are a special case of (6.33) and find the matrices c, c₀, and c₁.

2. Show that in order for (6.33) to be consistent with (6.34) it must hold that

$$c_1'\Pi = -(c_0 + c_1)' \quad\text{and}\quad c_1'\mu + c = 0. \tag{6.37}$$

In the presence of cointegration these restrictions give more restrictions on the matrix Π, and it is these restrictions that we want to test in the following. We define

$$a = c_1, \qquad b = -(c_0 + c_1),$$

and assume that the matrix b has full rank.

3. Show that under the assumption that Π has reduced rank r (0 < r ≤ p) and the restrictions (6.37) hold, we have that a has full rank and that

$$(a, a_\perp)'\,\Pi\,(b, b_\perp) = \begin{bmatrix} b'b & 0 \\ \Theta & \xi\eta' \end{bmatrix},$$

for matrices ξ and η of dimension (p − q) × (r − q) and rank r − q, and a matrix Θ of dimension (p − q) × q. Find Π as a function of Θ, ξ, and η, and show that this expression implies that Π has reduced rank and satisfies the restrictions (6.37).

4. Find the dimension of the parameter space spanned by Π and μ under the restrictions (6.37), and find an expression for the cointegrating vectors β expressed in terms of η, and an expression for α.

5. Show, by multiplying the equations for X_t by a′ and a⊥′ respectively, that the estimators for η, ξ, Θ, and μ can be determined by reduced rank regression.

6. Find an expression for the likelihood ratio test for the hypothesis (6.37). The asymptotic distribution is χ². Find the degrees of freedom.

6.4

Discuss the estimation problem for the statistical models discussed in the exercises given in Chapter 5.

6.5

Consider the model

$$\Delta X_t = \alpha\beta' X_{t-1} + \Gamma_1\Delta X_{t-1} + \varepsilon_t. \tag{6.38}$$
Let H be a p × s matrix.

1. Show that the hypothesis

$$\Gamma_1 H_\perp = 0$$

can be formulated as Γ₁ = ξH′ for some ξ (p × s), and derive the likelihood ratio test for this hypothesis. The asymptotic distribution of the test statistic is χ². Determine the degrees of freedom.

2. Show that if β = Hϕ and Γ₁ = ξH′ then Y_t = H′X_t is an autoregressive process. Find the parameters and give the condition for the process Y_t to be an I(1) process.

3. Consider now the following special case

$$\Delta Y_t = \gamma\Delta Z_{t-1} + \varepsilon_{1t}, \tag{6.39}$$

$$\Delta Z_t = \alpha_2(Y_{t-1} + \beta_2 Z_{t-1}) + \varepsilon_{2t}. \tag{6.40}$$

Find the characteristic polynomial and its roots, and show that if

$$\beta_2 + \gamma \neq 0, \qquad -1 < \gamma\alpha_2, \qquad \gamma\alpha_2 + \beta_2\alpha_2 < 0, \qquad \alpha_2(\gamma - \beta_2) < 2,$$

then the process X_t = (Y_t, Z_t)′ is an I(1) process and Y_t + β₂Z_t is stationary.

4. For H = (1, 0)′ the hypothesis β = Hϕ reduces to β₂ = 0. Find the autoregressive representation of Y_t given by (6.39) and (6.40) under the assumption that β = Hϕ and Γ₁ = ξH′, and determine the properties of the process depending on the parameters.

The problem is inspired by the following situation. In an investigation of money m_t, income y_t, prices p_t, and interest rates i_{1t} and i_{2t}, both money and income are in nominal values. The variables m_t, y_t, and p_t are in logarithms. An analysis of the data shows that a model with r = 3 and k = 2 describes the data. We now want to investigate whether we could have analysed the variables in real terms, that is, the variables (m_t − p_t, y_t − p_t, i_{1t}, i_{2t}).

5. Determine in this case the matrix H and determine explicitly the condition that Γ₁ has to satisfy in order that the real variables are described by an AR(2) model.