The Statistical Analysis of I (1) Models
Abstract and Keywords
Contains the likelihood analysis of the I(1) models. The main result is the derivation of the method of reduced rank regression, due to Anderson. This solves the estimation problem for the unrestricted cointegrating vectors, and hence the problem of deriving a test for cointegrating rank, the so‐called trace test. The reduced rank algorithm is applied to a number of different models defined by restrictions on the deterministic terms.
Keywords: cointegrating rank, cointegrating vectors, deterministic terms, I(1) model, reduced rank regression, trace test
THIS chapter contains an analysis of the likelihood function of the I(1) models discussed in Chapter 5. The main result in section 6.1 is the derivation of the method of reduced rank regression which solves the estimation problem for the unrestricted cointegration vectors, and which solves the problem of deriving a test statistic for the hypothesis of cointegrating rank. The asymptotic distribution of this test statistic is discussed in Chapter 12, and the way it should be applied is discussed in Chapter 13.
It turns out that the method of reduced rank regression solves the estimation problem in a number of different models defined by various restrictions on the parameters. We give here the estimator of the unrestricted cointegrating vectors, and show in section 6.2 how it should be modified if restrictions are imposed on the deterministic terms. In Chapter 7 we discuss the modification needed when the cointegrating relations satisfy linear restrictions, and in Chapter 8 the modification needed when α is restricted.
6.1 Likelihood Analysis of H(r)
We define the reduced form error correction model as given by
The advantage of this parametrization is in the interpretation of the coefficients, where the effect of the levels is isolated in the matrix αβ′ and where Γ_{1}, . . . , Γ_{k−1} describe the short‐term dynamics of the process. Sometimes the form
In (6.1) we introduce the notation Z _{0t} = Δ X _{t}, Z _{1t} = X _{t−1} and let Z _{2t} be the stacked variables Δ X _{t−1}, . . . , Δ X _{t−k+1}, and D _{t}. We let Ψ be the matrix of parameters corresponding to Z _{2t}, that is, the matrix consisting of Γ_{1}, . . . , Γ_{k−1}, and Φ. Thus Z _{2t} is a vector of dimension p(k − 1) + m and Ψ is a matrix of dimension p × (p(k − 1) + m).
The model expressed in these variables becomes
This leads to the definition of the residuals
As a final piece of notation consider
For fixed β it is easy to estimate α and Ω by regressing R _{0t} on β′ R _{1t} and obtain
For r = 0 we choose $\mathrm{sp}(\hat{\beta}) = \{0\}$, and find $\hat{\Pi} = 0$, and for r = p we can take $\mathrm{sp}(\hat{\beta}) = R^{p}$, and the estimate of Π is $\hat{\Pi} = S_{01}S_{11}^{-1}$. Note that we have solved all the models H(r), r = 0, . . . , p, by the same eigenvalue calculation. The maximized likelihood is given for each r by (6.14), and by dividing the maximized likelihood function for r by the corresponding expression for r = p we get the likelihood ratio test
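As a numerical illustration of the procedure described above, the following sketch (the helper name `rrr_cointegration` is ours, not the book's) forms the residuals R_{0t}, R_{1t}, the product moment matrices S_{ij}, and solves the eigenvalue problem by first 'diagonalizing' S_{11} through its Cholesky factor, for a VAR(k) with no deterministic terms:

```python
import numpy as np

def rrr_cointegration(X, r, k=1):
    """Sketch of reduced rank regression for a p-dimensional VAR(k)
    without deterministic terms (hypothetical helper, not the book's code).
    X is an array of observed levels, one row per time point."""
    dX = np.diff(X, axis=0)
    Z0 = dX[k - 1:]                        # Z_0t = Delta X_t
    Z1 = X[k - 1:-1]                       # Z_1t = X_{t-1}
    T, p = Z0.shape
    if k > 1:
        # Z_2t stacks the lagged differences Delta X_{t-1}, ..., Delta X_{t-k+1}
        Z2 = np.hstack([dX[k - 1 - j:-j] for j in range(1, k)])
        # R_0t, R_1t: residuals of Z_0t and Z_1t regressed on Z_2t
        R0 = Z0 - Z2 @ np.linalg.lstsq(Z2, Z0, rcond=None)[0]
        R1 = Z1 - Z2 @ np.linalg.lstsq(Z2, Z1, rcond=None)[0]
    else:
        R0, R1 = Z0, Z1
    S00, S01, S11 = R0.T @ R0 / T, R0.T @ R1 / T, R1.T @ R1 / T
    # solve |lambda*S11 - S10 S00^{-1} S01| = 0 by reducing it to an
    # ordinary symmetric problem via the Cholesky factor K of S11
    K = np.linalg.cholesky(S11)
    A = S01.T @ np.linalg.solve(S00, S01)              # S10 S00^{-1} S01
    W = np.linalg.solve(K, np.linalg.solve(K, A).T).T  # K^{-1} A K^{-T}
    lams, U = np.linalg.eigh(W)
    lams, U = lams[::-1], U[:, ::-1]                   # decreasing order
    V = np.linalg.solve(K.T, U)                        # normalized: V' S11 V = I
    beta = V[:, :r]                                    # hat(beta): first r vectors
    alpha = S01 @ beta                                 # hat(alpha) = S01 hat(beta)
    # trace statistics -T sum_{i>r} log(1 - lambda_i), one for each rank
    trace = [-T * np.sum(np.log(1 - lams[j:])) for j in range(p)]
    return lams, beta, alpha, trace
```

The Cholesky reduction mirrors the remark later in the section that the eigenvalue calculation is best performed by first diagonalizing S_{11}.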
THEOREM 6.1 Under hypothesis
The likelihood ratio test statistic Q(H(r)|H(p)), for H(r) in H(p), is found by comparing two expressions like (6.17). This gives the result
If μ = 0 we find
If μ_{t} = μ_{0} and α_{⊥}′μ_{0} ≠ 0, the asymptotic distribution is given by (6.20) with F defined by
If μ_{t} = μ_{0} + μ_{1} t, and α_{⊥}′μ_{1} ≠ 0, then F is given by
Another way of formulating this basic estimation result is that we have performed a singular value decomposition of the unrestricted regression estimator $\hat{\Pi} = S_{01}S_{11}^{-1}$ with respect to its ‘covariance matrix’ $S_{00.1} \otimes S_{11}^{-1}$, that is, of the matrix $\tilde{\Pi} = S_{00.1}^{-\frac{1}{2}}\hat{\Pi}S_{11}^{\frac{1}{2}}$ since
Note that since $\hat{\alpha} = S_{01}\hat{\beta}$ we have
The calculation of the eigenvalues of equation (6.15) is performed as follows: first the matrix S _{11} is diagonalized by solving the eigenvalue problem
It is sometimes necessary to estimate the orthogonal complements of α and β. This can easily be done by the above results since
Note that the asymptotic distribution of the test statistic λ_max in (6.19) is not given here but left as Exercise 11.5 in Chapter 11. The properties of the test are posed as a problem in Chapter 12.
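The orthogonal complements of α and β mentioned above can also be computed generically; the sketch below uses a singular value decomposition (a standard numerical route, not the specific expressions of the text):

```python
import numpy as np

def orthogonal_complement(A):
    """Generic SVD-based sketch (not the book's explicit formula):
    for a p x r matrix A of full column rank, return a p x (p - r)
    matrix A_perp whose columns span the orthogonal complement of sp(A)."""
    p, r = A.shape
    U, _, _ = np.linalg.svd(A, full_matrices=True)
    # the last p - r left singular vectors are orthogonal to the columns of A
    return U[:, r:]
```

Applying this to the estimates $\hat{\alpha}$ and $\hat{\beta}$ gives bases for $\mathrm{sp}(\hat{\alpha}_{\perp})$ and $\mathrm{sp}(\hat{\beta}_{\perp})$, which are only identified up to a choice of basis.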
6.2 Models for the Deterministic Terms
In this section we analyse the hypotheses given by (5.13), . . . , (5.17). The analysis of (5.13), (5.15), and (5.17) is given in section 6.1, where the analysis is for a general form of the deterministic term D _{t}. If we take Φ D _{t} = μ_{0} + μ_{1} t we get the analysis of H(r), for Φ D _{t} = μ_{0} we get the analysis of H _{1}(r), and finally H _{2}(r) is analysed with Φ D _{t} = 0. What remains is to discuss the models with a restriction on the deterministic terms: H*(r), where α_{⊥}′μ_{1} = 0, and ${H}_{1}^{*}(r)$, where μ_{1} = 0 and α_{⊥}′μ_{0} = 0. The analysis is very similar to the one given in section 6.1 and the new models will not be treated in as much detail.
Consider first H*(r) given by (5.14), that is, Π = αβ′, Φ D _{t} = μ_{0} + μ_{1} t, and α_{⊥}′μ_{1} = 0. We note the following relation
Thus we solve the eigenvalue problem
Hence we find for r = p that
THEOREM 6.2 Under the restrictions Π = αβ′, Φ D _{t} = μ_{0} + μ_{1} t, and α_{⊥}′μ_{1} = 0 the cointegrating vectors are estimated by reduced rank regression of Δ X _{t} on (X _{t − 1}, t) corrected for lagged differences and the constant. The likelihood ratio test for the rank of Π is given by
The asymptotic distribution of the likelihood ratio test statistic (6.25) is derived in Theorem 11.1, and is given by (6.20) with F defined by
In a completely analogous way we can estimate the parameters in the model ${H}_{1}^{*}(r)$ where Φ D _{t} = μ_{0} and α_{⊥}′ μ_{0} = 0. In this case we note that
THEOREM 6.3 Under the restrictions Π = αβ′, Φ D _{t} = μ_{0}, and α_{⊥}′ μ_{0} = 0 the cointegrating vectors are estimated by reduced rank regression of Δ X _{t} on (X _{t − 1}, 1) corrected for lagged differences. The likelihood ratio test for the rank of Π, when α_{⊥}′ μ_{0} = 0, is given by
The asymptotic distribution of the likelihood ratio test (6.30) is derived from Theorem 11.1 and is given by (6.20) with F defined by
The asymptotic distribution of the likelihood ratio test statistic (6.31) is shown in Corollary 11.2 to be χ^{2}(p − r).
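Computationally, Theorems 6.2 and 6.3 change only the regressors: the levels Z_{1t} are augmented with the restricted deterministic term, which is then dropped from Z_{2t}. A sketch for ${H}_{1}^{*}(r)$, where X_{t−1} is augmented with a constant (hypothetical helper name, assuming k ≥ 2 so that lagged differences are present):

```python
import numpy as np

def rrr_restricted_constant(X, r, k=2):
    """Sketch for H1*(r): reduced rank regression of Delta X_t on
    (X_{t-1}', 1)' corrected for lagged differences only, with no
    separate constant among the unrestricted regressors. Assumes k >= 2."""
    dX = np.diff(X, axis=0)
    Z0 = dX[k - 1:]                                   # Delta X_t
    T, p = Z0.shape
    # augmented levels Z_1t* = (X_{t-1}', 1)' of dimension p + 1
    Z1 = np.column_stack([X[k - 1:-1], np.ones(T)])
    Z2 = np.hstack([dX[k - 1 - j:-j] for j in range(1, k)])  # lagged diffs
    R0 = Z0 - Z2 @ np.linalg.lstsq(Z2, Z0, rcond=None)[0]
    R1 = Z1 - Z2 @ np.linalg.lstsq(Z2, Z1, rcond=None)[0]
    S00, S01, S11 = R0.T @ R0 / T, R0.T @ R1 / T, R1.T @ R1 / T
    # same generalized eigenvalue problem, now of dimension p + 1
    K = np.linalg.cholesky(S11)
    A = S01.T @ np.linalg.solve(S00, S01)
    W = np.linalg.solve(K, np.linalg.solve(K, A).T).T  # K^{-1} A K^{-T}
    lams = np.linalg.eigvalsh(W)[::-1]                 # decreasing order
    trace = [-T * np.sum(np.log(1 - lams[j:])) for j in range(p + 1)]
    return lams, trace
```

The model H*(r) of Theorem 6.2 is handled the same way with (X_{t−1}′, t)′ as the augmented levels and the constant kept among the unrestricted regressors.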
6.3 Determination of Cointegrating Rank
The problem of determining the cointegrating rank will be discussed in detail in Chapter 12, but we give here some rules for the application of the results in Theorem 6.1, 6.2, and 6.3.
Consider for simplicity first the situation where Φ D _{t} = 0, that is, there are no deterministic terms in the model. In this case the test statistic is given by (6.18) where the preliminary regression does not involve correction for any deterministic terms, since they are not present in the model. The limit distribution of the likelihood ratio test statistic is given by (6.20) with F = B, and is tabulated in Chapter 15, Table 15.1. The tables are then used as follows.
If r represents a priori knowledge we simply calculate the test statistic Q _{r} = −2 log Q(H(r)|H(p)) and compare it with the relevant quantile in Table 15.1. Note that the tables give the asymptotic distribution only, and that the actual distribution depends not only on the finite value of T but also on all the short‐term parameters as well as on α and β. To be sure that the reported quantiles are reasonable in a given application, one would have to supplement the comparison with the asymptotic tables by a simulation investigation. This will not be attempted here.
A small sample correction has been suggested by Reinsel and Ahn (1992). It consists of using the factor (T − kp) instead of the sample size T in the calculation of the test statistic for cointegrating rank. This idea has been investigated by Reimers (1992), and it seems that the approximation to the limit distribution is better with the corrected sample size. The theoretical justification for this result presents a very difficult mathematical problem, which it would be extremely useful to solve.
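In code, the correction only changes the scale factor in front of the statistic; a minimal sketch:

```python
import math

def trace_statistic(eigvals, r, T, p, k, small_sample=False):
    """Trace statistic -T * sum_{i > r} log(1 - lambda_i); with
    small_sample=True the Reinsel-Ahn factor (T - k*p) replaces T."""
    factor = (T - k * p) if small_sample else T
    return -factor * sum(math.log(1.0 - lam) for lam in eigvals[r:])
```

For example, with p = 2, k = 2, and T = 100, the corrected statistic is simply (100 − 4)/100 = 0.96 times the uncorrected one.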
A common situation is that one has no or very little prior information about r, and in this case it seems more reasonable to estimate r from the data. This is done as follows. First compare Q _{0} with its quantile c _{0}, say, from Table 15.1. If Q _{0} < c _{0}, then we let $\hat{r} = 0$; if Q _{0} ≥ c _{0} we calculate Q _{1} and compare it with c _{1}. If now Q _{1} < c _{1} we define $\hat{r} = 1$, and if not we compare Q _{2} with its quantile c _{2}, etc. This defines an estimator $\hat{r}$ which takes on the values 0, 1, . . . , p and which converges in probability to the true value in a sense discussed in Chapter 12.
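The sequential procedure just described is a short loop; a sketch, taking the statistics and quantiles as given:

```python
def estimate_rank(trace_stats, quantiles):
    """Sequential procedure of the text: r_hat is the first r with
    Q_r below its quantile c_r; if all p hypotheses are rejected,
    r_hat = p (full rank)."""
    for r, (Q_r, c_r) in enumerate(zip(trace_stats, quantiles)):
        if Q_r < c_r:
            return r
    return len(trace_stats)
```

The inputs are Q_0, . . . , Q_{p−1} and the corresponding quantiles c_0, . . . , c_{p−1} from the relevant table.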
Next consider the case Φ D _{t} = μ_{0}, where μ_{0} is allowed to vary freely. We see from Theorem 6.1 that the limit distribution depends on the assumption that α_{⊥}′μ_{0} ≠ 0.
Sometimes inspection of the graphs shows that the trend is present and we proceed as above and calculate Q _{0}, . . . , Q _{p−1}, and compare them with the relevant quantiles from Table 15.3, since now the limit distribution is given by (6.21). We start comparing Q _{0} with its quantile and proceed to Q _{1}, etc. This gives the possibility of estimating the value of r.
If it is clear that there is no deterministic trend it seems more reasonable to analyse the model ${H}_{1}^{*}(r)$, and calculate the relevant test statistic $-2\log Q({H}_{1}^{*}(r)|{H}_{1}^{*}(p))$. That is, we take the consequence of the assumption that α_{⊥}′μ_{0} = 0 and change the test statistic to reflect the hypothesis we are interested in, rather than applying another limit distribution to the previous statistic.
If we are in the situation that we do not know whether there is a trend or not, we have to determine the presence of the trend as well as the cointegrating rank at the same time, since the tests are not similar, not even asymptotically; that is, the distribution and the limit distribution depend on which parameter point is considered under the null hypothesis. We then have a non‐nested set of hypotheses, see Table 5.1.
Thus we test all hypotheses against H _{1}(p). The simultaneous determination of trend and cointegrating rank is now performed as follows:
We calculate $Q_{0}, \ldots, Q_{p-1}, Q_{0}^{*}, \ldots, Q_{p-1}^{*}$. We accept rank r and the presence of a trend if H _{1}(0), . . . , H _{1}(r − 1) are rejected and if also the models ${H}_{1}^{*}(0), \ldots, {H}_{1}^{*}(r-1)$ as well as ${H}_{1}^{*}(r)$ are rejected, but H _{1}(r) is accepted.
We accept cointegrating rank r and the absence of a trend if ${H}_{1}^{*}(r)$ is accepted and H _{1}(0), . . . , H _{1}(r − 1) as well as ${H}_{1}^{*}(0), \ldots, {H}_{1}^{*}(r-1)$ are rejected. This solution represents a choice and reflects a priority in the ordering of the hypotheses.
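The two acceptance rules together amount to examining the hypotheses in the order ${H}_{1}^{*}(0)$, H_1(0), ${H}_{1}^{*}(1)$, H_1(1), . . . and stopping at the first acceptance; a sketch of that rule, taking precomputed accept/reject indicators as input (the helper name is ours):

```python
def joint_rank_and_trend(accept_H1, accept_H1star):
    """Simultaneous choice of rank and trend following the priority
    ordering of the text: for r = 0, 1, ..., report (r, no trend) if
    H1*(r) is accepted, else (r, trend) if H1(r) is accepted, else
    move on; if everything is rejected, keep the full rank p."""
    for r, (a1, a1s) in enumerate(zip(accept_H1, accept_H1star)):
        if a1s:
            return r, False          # rank r, no linear trend
        if a1:
            return r, True           # rank r, linear trend present
    return len(accept_H1), True      # r = p, trend left unrestricted
```

The inputs are boolean lists for H_1(0), . . . , H_1(p − 1) and ${H}_{1}^{*}(0), \ldots, {H}_{1}^{*}(p-1)$, each obtained by comparing the corresponding statistic with its quantile.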
If instead we assume no quadratic trend in the process but allow a linear trend in all directions, we can analyse model H*(r). These models are nested and the rank is determined as above by calculating −2 log Q(H*(r)|H*(p)) for r = 0, . . . , p − 1, and comparing them with their quantiles from Table 15.4, starting with r = 0.
6.4 Exercises
6.1
Consider the model

1. Show by Granger's representation theorem that X _{t} in general has a quadratic trend and show how this model can be estimated by reduced rank regression.

2. Show that if α_{⊥}′μ_{1} = 0 then the quadratic trend disappears, but the process still has a linear trend given by
$$\tau_{1} = \beta_{\perp}(\alpha_{\perp}'\beta_{\perp})^{-1}\alpha_{\perp}'\alpha_{\perp}\left(\gamma_{0} + (\alpha_{\perp}'\alpha_{\perp})^{-1}\alpha_{\perp}'\beta(\beta'\beta)^{-1}\rho_{1}\right) - \beta(\beta'\beta)^{-1}\rho_{1},$$
3. Show how one can estimate the parameters of the model by reduced rank regression under the constraint α_{⊥}′μ_{1} = 0.

4. What happens under the constraint α_{⊥}′μ_{0} = 0, and μ_{1} unrestricted?

5. Under the restriction α_{⊥}′μ_{1} = 0, the hypothesis of trend stationarity of X _{1t}, say, can be formulated as the hypothesis that the unit vector (1, 0, . . . , 0) is one of the cointegrating vectors. Discuss how the parameters can be estimated by reduced rank regression in this case.
6.2
A normalization or identification of the cointegrating vectors. Consider the model

1. Show that
$$\begin{aligned}
\mathrm{Var}(\tilde{\beta}'X_{t-1}) &= I,\\
\tilde{\alpha} = \Sigma_{0\tilde{\beta}} &= \mathrm{Cov}(\Delta X_{t}, \tilde{\beta}'X_{t-1}),\\
\tilde{\alpha}'\Sigma_{00}^{-1}\tilde{\alpha} &= \mathrm{diag}(\lambda_{1}, \ldots, \lambda_{r}).
\end{aligned}$$
2. Show that similar relations hold for the estimated values of α and β.
6.3
An example of a model based on rational expectations can be formulated as

1. Show that (6.35) and (6.36) taken together are a special case of (6.33) and find the matrices c, c _{0}, and c _{1}.

2. Show that in order for (6.33) to be consistent with (6.34) it must hold that
$$c_{1}'\Pi = (c_{0} + c_{1})', \quad \text{and} \quad c_{1}'\mu + c = 0, \tag{6.37}$$
and define
$$a = c_{1}, \qquad b = (c_{0} + c_{1}).$$
3. Show that under the assumption that Π has reduced rank r (0 < r < p) and the restrictions (6.37) hold, we have that a has full rank and that
$$(a, a_{\perp})'\Pi(b, b_{\perp}) = \begin{pmatrix} b'b & 0 \\ \Theta & \xi\eta' \end{pmatrix},$$
4. Find the dimension of the parameter space spanned by Π and μ under the restrictions (6.37), and find an expression for the cointegrating vectors β expressed in terms of η, and an expression for α.

5. Show by multiplying the equations for X _{t} by a and a _{⊥} respectively that the estimators for η, ξ, Θ, and μ can be determined by reduced rank regression.

6. Find an expression for the likelihood ratio test for the hypothesis (6.37). The asymptotic distribution is χ^{2}. Find the degrees of freedom.
6.4
Discuss the estimation problem for the statistical models discussed in the exercises given in Chapter 5.
6.5
Consider the model

1. Show that the hypothesis
$${\Gamma}_{1}{H}_{\perp}=0$$

2. Show that if β = Hϕ and Γ_{1} = ξ H′ then Y _{t} = H′X _{t} is an autoregressive process. Find the parameters and give the condition for the process Y _{t} to be an I(1) process.

3. Consider now the following special case
$$\Delta Y_{t} = \gamma\Delta Z_{t-1} + \epsilon_{1t}, \tag{6.39}$$
$$\Delta Z_{t} = \alpha_{2}(Y_{t-1} + \beta_{2}Z_{t-1}) + \epsilon_{2t}. \tag{6.40}$$
$$\begin{aligned}
\beta_{2} + \gamma &\ne 0,\\
-1 &< \gamma\alpha_{2},\\
\gamma\alpha_{2} + \beta_{2}\alpha_{2} &< 0,\\
\alpha_{2}(\gamma - \beta_{2}) &< 2.
\end{aligned}$$
4. For H = (1, 0)′ the hypothesis β = Hϕ reduces to β_{2} = 0. Find the autoregressive representation of Y _{t} given by (6.39) and (6.40) under the assumption that β = Hϕ and Γ_{1} = ξ H′ and determine the properties of the process depending on the parameters.
The problem is inspired by the following situation. In an investigation of money m _{t}, income y _{t}, prices p _{t}, and interest rates i _{1t} and i _{2t}, both money and income are in nominal values. The variables m _{t}, y _{t}, and p _{t} are in logarithms. An analysis of the data shows that a model with r = 3 and k = 2 describes the data. We now want to investigate if we could have analysed the variables in real terms, that is, the variables (m _{t} − p _{t}, y _{t} − p _{t}, i _{1t}, i _{2t}).

5. Determine in this case the matrix H and determine explicitly the condition that Γ_{1} has to satisfy in order that the real variables are described by an AR(2) model.