The Power of Cointegration Tests Versus Data Frequency and Time Spans.

Author: Zhou, Su

Su Zhou [*]

Using Monte Carlo methods, this study illustrates the potential benefits of using high frequency data series to conduct cointegration analysis. The study also provides an account of why the results are different from those reported by Hakkio and Rush (1991). The simulation results show that when the studies are restricted by relatively short time spans of 30 to 50 years, increasing data frequency may yield considerable power gain and less size distortion, especially when the cointegrating residual is not nearly nonstationary, and/or when the models with nonzero lag orders are required for testing cointegration. The study may help clarify some misconceptions and misinterpretations surrounding the role of data frequency and sample size in cointegration analysis.

  1. Introduction

    In the empirical literature of cointegration analysis, researchers often face the limitation of using relatively short time spans of data. In many cases, this is simply due to the absence of longer spans of data. In other cases, some equilibrium relationships have to be studied for certain time periods. For instance, when the models require a flexible exchange rate or a variable price of gold, studies have to be undertaken for the flexible exchange rate period, starting from the early 1970s, or for the period since 1968 when the price of gold was allowed to fluctuate. With the limits of relatively short time spans, many researchers chose to use relatively high frequency data to conduct the studies. Such attempts have been criticized in the literature. Hakkio and Rush (1991) argue that "the frequency of observation plays a very minor role" (p. 572) in exploring a cointegration relationship, because "cointegration is a long-run property and thus we often need long spans of data to properly test it" (p. 579). Hakkio and Rush's point is similar to the one made by Shiller and Perron (1985), that the length of the time series is far more important than the frequency of observation when testing for unit roots.

    While those who criticize the collection of high frequency data to deal with the short time span problem advocate the use of long spans of data to test properly for cointegration, their suggestion is sometimes misinterpreted as support for using a small number of annual observations. [1] For instance, Bahmani-Oskooee (1996, p. 481) borrows Hakkio and Rush's (1991, p. 572) "testing a long-run property of the data with 120 monthly observations is no different than testing it with ten annual observations" to defend his use of annual data by saying that using annual data spanning more than 30 years "is as good as using quarterly or monthly data over the same period." Taylor (1995, p. 112) claims that the deficiency of using less than 50 annual observations "should be compensated by the fact that the data set spans nearly half a century."

    I would like to point out that Hakkio and Rush's study has several limitations, and therefore it may not be appropriate to cite the conclusions of the study for the cases beyond its limitations. Their study only allows the cointegrating residual to be a pure first-order autoregressive (AR[1]) process and is limited to the single-equation method of cointegration tests. They show the results only for some extreme cases where the cointegrating residual for the monthly data is either very highly serially correlated (nearly nonstationary and thus all the cointegration tests would have very low test power regardless of the frequency of the data) or with a quite low coefficient of serial correlation (thus all the cointegration tests can easily reject the null of no cointegration regardless of the frequency of the data).

    The present paper is motivated by seeking the answers to the following questions: (i) Does the frequency of observation play a very minor role in exploring a cointegration relationship in the cases where the cointegrating residual is not nearly nonstationary? (ii) Can the validity of the conclusions of Hakkio and Rush (1991) based on a single-equation method be extended to other popular cointegration tests and to more realistic cases, where the models with higher lag orders are required when the cointegrating residual is generated with more noise than a pure first-order autoregressive process? (iii) While testing cointegration with 120 monthly observations could be no different than testing it with 10 annual observations as both cases are subject to very low test power, does this warrant that using annual data of 30 to 40 years is as good as using quarterly or monthly data over the same period? (iv) How serious would the problem of size distortion be for the use of a small number of annual observations?

    This paper examines the power of cointegration tests versus frequency of observation and time spans, as well as the small-sample size distortions of the tests, through Monte Carlo experiments. [2] The above questions are addressed by doing the following: (i) Instead of focusing only on extreme cases corresponding to cointegrating residuals with either very high or rather low serial correlation coefficients, this study also pays attention to moderately serially correlated cointegrating residuals. (ii) Both the cointegration tests in single equations, such as the Engle-Granger (1987) tests, and those in systems of equations, such as the Johansen (1988) tests and the Horvath-Watson (1995) tests, are examined. (iii) After generating the data at the monthly frequency, two sets of quarterly and annual data are produced. One set is obtained by taking the last observation in the period. As demonstrated by Hakkio and Rush (1991), the cointegrating residual of these end-of-quarter and end-of-year data remains an AR(1) process as long as the cointegrating residual at the monthly interval is generated as an AR(1) process. The other set is computed by averaging the 3 or 12 monthly observations that correspond to each quarter or each year. It can be shown that these average quarterly and annual data contain a first-order moving average (MA[1]) component. Therefore, models with higher lag orders are required for testing cointegration. This allows us to examine the effects of time span and data frequency on the power of cointegration tests with different lag orders as well as the impact of under- or overparameterization on the power and empirical sizes of the tests. (iv) The simulations are first conducted with a fixed time span of 30 annual observations, 120 quarterly observations, and 360 monthly observations to illustrate the influence of different sampling frequencies on the power of cointegration tests for cointegrating residuals with different degrees of serial correlation and for models with different lag orders. The test power and corresponding size distortions are then further analyzed for different combinations of time spans and data frequencies.

    The paper is organized as follows. The next section introduces the data-generating processes. Section 3 briefly describes the cointegration tests under examination. The design of the Monte Carlo experiments applied in the study and the simulation results are reported in section 4. The last section concludes.

  2. Data-Generating Processes

    Following Hakkio and Rush (1991), the study starts by generating the monthly data $X^M_t$ as a random walk without drift:

    $$X^M_t = X^M_{t-1} + \eta^M_t, \qquad \eta^M_t \sim N(0, 1).$$

    Monthly $Y^M_t$ is defined as

    $$Y^M_t = X^M_t + \epsilon^M_t,$$

    where $\epsilon^M_t$ is an AR(1) process,

    $$\epsilon^M_t = \rho\,\epsilon^M_{t-1} + e^M_t, \qquad e^M_t \sim N(0, \sigma_e^2).$$

    $X^M_t$ and $Y^M_t$ are cointegrated if $\rho < 1$, and are not cointegrated if $\rho = 1$.
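    As an illustration only, the monthly data-generating process above can be sketched in a few lines of Python using the standard library. The function names, the zero starting values, the seed, and the choice of 360 months with an AR coefficient of 0.9 are assumptions made for this sketch; they are not taken from the study's simulation design.

```python
import random


def simulate_monthly(n_months, rho, sigma_e=1.0, seed=None):
    """Simulate monthly X (driftless random walk) and Y = X + AR(1) residual.

    Starting values of zero are an assumption of this sketch.
    """
    rng = random.Random(seed)
    x, eps = 0.0, 0.0
    xs, ys, resid = [], [], []
    for _ in range(n_months):
        x += rng.gauss(0.0, 1.0)                   # X_t = X_{t-1} + eta_t, eta_t ~ N(0, 1)
        eps = rho * eps + rng.gauss(0.0, sigma_e)  # eps_t = rho * eps_{t-1} + e_t
        xs.append(x)
        ys.append(x + eps)                         # Y_t = X_t + eps_t
        resid.append(eps)
    return xs, ys, resid


def ar1_slope(series):
    """OLS slope (no intercept) from regressing series[t] on series[t-1]."""
    num = sum(a * b for a, b in zip(series[1:], series[:-1]))
    den = sum(b * b for b in series[:-1])
    return num / den


# 360 months corresponds to a 30-year span; rho = 0.9 is an illustrative value.
xs, ys, resid = simulate_monthly(n_months=360, rho=0.9, seed=1)
print("estimated AR(1) coefficient of Y - X:", round(ar1_slope(resid), 3))
```

    Setting `rho=1.0` in the same function produces the no-cointegration case, in which the difference between Y and X is itself a random walk.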

    The end-of-period quarterly and annual data are

    $$X^{\mathrm{end}}_t = X^M_{t,s}, \qquad Y^{\mathrm{end}}_t = Y^M_{t,s}, \tag{1}$$

    where $s = 3$ for quarterly data and $s = 12$ for annual data, and the subscript $t,i$ denotes the $i$th month of period $t$, hence

    $$X^{\mathrm{end}}_t = X^{\mathrm{end}}_{t-1} + \eta^{\mathrm{end}}_t, \qquad \eta^{\mathrm{end}}_t = \eta^M_{t,1} + \eta^M_{t,2} + \cdots + \eta^M_{t,s}, \qquad E(\eta^{\mathrm{end}}_t, \eta^{\mathrm{end}}_{t-j}) = 0 \ \text{for } j \neq 0,$$

    $$Y^{\mathrm{end}}_t = X^{\mathrm{end}}_t + \epsilon^{\mathrm{end}}_t, \qquad \epsilon^{\mathrm{end}}_t = \rho^s \epsilon^{\mathrm{end}}_{t-1} + e^{\mathrm{end}}_t, \qquad e^{\mathrm{end}}_t = e^M_{t,s} + \rho\, e^M_{t,s-1} + \cdots + \rho^{s-1} e^M_{t,1}.$$

    Because $E(e^{\mathrm{end}}_t, e^{\mathrm{end}}_{t-j}) = 0$ for $j \neq 0$, $\epsilon^{\mathrm{end}}_t$ remains an AR(1) process.
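    The end-of-period result can be checked numerically. The sketch below simulates a long monthly AR(1) residual, keeps every third or twelfth observation, and compares the estimated AR(1) coefficient of the sampled series with the monthly coefficient raised to the power s. The helper names, the seed, the sample length, and the monthly coefficient of 0.95 are all assumptions chosen for illustration.

```python
import random


def ar1_path(n, rho, sigma_e=1.0, seed=None):
    """Simulate n observations of an AR(1) process started at zero (an assumption)."""
    rng = random.Random(seed)
    eps, path = 0.0, []
    for _ in range(n):
        eps = rho * eps + rng.gauss(0.0, sigma_e)
        path.append(eps)
    return path


def end_of_period(monthly, s):
    """Keep the last of every s monthly observations (s=3 quarterly, s=12 annual)."""
    return monthly[s - 1::s]


def ar1_slope(series):
    """OLS slope (no intercept) from regressing series[t] on series[t-1]."""
    num = sum(a * b for a, b in zip(series[1:], series[:-1]))
    den = sum(b * b for b in series[:-1])
    return num / den


rho = 0.95                                # illustrative monthly AR coefficient
monthly = ar1_path(120_000, rho, seed=7)  # long sample so the estimates are tight

for s in (3, 12):
    sampled = end_of_period(monthly, s)
    print(f"s={s}: rho**s = {rho ** s:.3f}, estimated = {ar1_slope(sampled):.3f}")
```

    A very long sample is used here only to make the comparison visible; with the 30 to 50 year spans discussed in the paper, the estimates would be far noisier.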

    The average quarterly and annual data are

    $$X^{\mathrm{av}}_t = \frac{1}{s}\sum_{i=1}^{s} X^M_{t,i}, \qquad Y^{\mathrm{av}}_t = \frac{1}{s}\sum_{i=1}^{s} Y^M_{t,i},$$

    and

    $$X^{\mathrm{av}}_t = X^{\mathrm{av}}_{t-1} + \eta^{\mathrm{av}}_t,$$

    $$\eta^{\mathrm{av}}_t = \frac{1}{s}\sum_{i=1}^{s}\left(X^M_{t,i} - X^M_{t-1,i}\right) = \frac{1}{s}\left\{\left[\sum_{i=1}^{s}(s+1-i)\,\eta^M_{t,i}\right] + \left[\sum_{i=2}^{s}(i-1)\,\eta^M_{t-1,i}\right]\right\},$$

    $$Y^{\mathrm{av}}_t = X^{\mathrm{av}}_t + \epsilon^{\mathrm{av}}_t, \qquad \epsilon^{\mathrm{av}}_t = \frac{1}{s}\sum_{i=1}^{s}\epsilon^M_{t,i}, \tag{2}$$

    which give

    $$\epsilon^{\mathrm{av}}_t = \rho^s \epsilon^{\mathrm{av}}_{t-1} + e^{\mathrm{av}}_t, \qquad e^{\mathrm{av}}_t = \frac{1}{s}\left\{\sum_{i=1}^{s}\left[e^M_{t,i}\left(\sum_{j=0}^{s-i}\rho^j\right)\right] + \sum_{i=2}^{s}\left[e^M_{t-1,i}\left(\sum_{j=s-i+1}^{s-1}\rho^j\right)\right]\right\}.$$

    It can be easily shown that $E(\eta^{\mathrm{av}}_t, \eta^{\mathrm{av}}_{t-1}) \neq 0$ and $E(e^{\mathrm{av}}_t, e^{\mathrm{av}}_{t-1}) \neq 0$, yet $E(\eta^{\mathrm{av}}_t, \eta^{\mathrm{av}}_{t-j}) = 0$ and $E(e^{\mathrm{av}}_t$...
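    The nonzero first-order covariance introduced by averaging can be illustrated with a small sketch. It reads the weights on the monthly shocks directly from the expression for the innovation of the averaged random walk given above, computes the lag-1 autocorrelation those weights imply, and compares it with the sample autocorrelation of a simulated averaged random walk. The function names, the seed, and the sample length are assumptions of this sketch, and only the innovation of the averaged X series is examined, not the averaged cointegrating residual.

```python
import random


def period_average(monthly, s):
    """Average each block of s consecutive monthly observations."""
    n = len(monthly) // s
    return [sum(monthly[k * s:(k + 1) * s]) / s for k in range(n)]


def eta_av_weights(s):
    """Weights on monthly shocks in the averaged-series innovation.

    Returns (current, previous): weights on the shocks of month i = 1..s in
    period t and in period t-1, taken from the expression in the text.
    """
    current = [(s + 1 - i) / s for i in range(1, s + 1)]
    previous = [(i - 1) / s for i in range(1, s + 1)]  # zero weight when i = 1
    return current, previous


def implied_lag1_autocorr(s):
    """Lag-1 autocorrelation implied by the weights, for i.i.d. N(0, 1) shocks."""
    current, previous = eta_av_weights(s)
    variance = sum(w * w for w in current) + sum(w * w for w in previous)
    # Consecutive innovations overlap only through the shocks of the earlier period.
    covariance = sum(p * c for p, c in zip(previous, current))
    return covariance / variance


def sample_lag1_autocorr(series):
    """Sample lag-1 autocorrelation of a series."""
    mean = sum(series) / len(series)
    dev = [v - mean for v in series]
    num = sum(a * b for a, b in zip(dev[1:], dev[:-1]))
    return num / sum(d * d for d in dev)


rng = random.Random(3)
x, walk = 0.0, []
for _ in range(240_000):  # long monthly random walk, for a tight estimate
    x += rng.gauss(0.0, 1.0)
    walk.append(x)

for s in (3, 12):
    avg = period_average(walk, s)
    diffs = [b - a for a, b in zip(avg, avg[1:])]
    print(f"s={s}: implied = {implied_lag1_autocorr(s):.3f}, "
          f"simulated = {sample_lag1_autocorr(diffs):.3f}")
```

    Under these assumptions the weights imply a lag-1 autocorrelation of 4/19 (about 0.21) for quarterly averages and 143/578 (about 0.25) for annual averages, which is the kind of serial dependence that a model with additional lags has to absorb.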
