Methods for backcasting, nowcasting and forecasting using factor‐MIDAS: With an application to Korean GDP

Published date: 01 April 2018
Authors: Norman R. Swanson, Hyun Hak Kim
DOI: http://doi.org/10.1002/for.2499
Received: 3 January 2017 Revised: 21 September 2017 Accepted: 24 September 2017
RESEARCH ARTICLE
Hyun Hak Kim1, Norman R. Swanson2
1 Department of Economics, Kookmin
University, Seoul, Korea
2 Department of Economics, Rutgers
University, New Brunswick, NJ, USA
Correspondence
Hyun Hak Kim, Department of
Economics, Kookmin University, 77
Jeongneung-Ro, Seongbuk-Gu, Seoul
02707, Korea.
Email: hyunhak.kim@kookmin.ac.kr
Abstract
We utilize mixed-frequency factor-MIDAS models for the purpose of carrying
out backcasting, nowcasting, and forecasting experiments using real-time data.
We also introduce a new real-time Korean GDP dataset, which is the focus of
our experiments. The methodology that we utilize involves first estimating com-
mon latent factors (i.e., diffusion indices) from 190 monthly macroeconomic
and financial series using various estimation strategies. These factors are then
included, along with standard variables measured at multiple different
frequencies, in various factor-MIDAS prediction models. Our key empirical
findings are as follows. (i) When using real-time data, factor-MIDAS
prediction models outperform various linear benchmark models.
Interestingly, the “MSFE-best” MIDAS
models contain no autoregressive (AR) lag terms when backcasting and now-
casting. AR terms only begin to play a role in “true” forecasting contexts. (ii)
Models that utilize only one or two factors are “MSFE-best” at all forecasting
horizons, but not at any backcasting and nowcasting horizons. In these latter
contexts, much more heavily parametrized models with many factors are pre-
ferred. (iii) Real-time data are crucial for forecasting Korean gross domestic
product, and the use of “first available” versus “most recent” data “strongly”
affects model selection and performance. (iv) Recursively estimated models are
almost always “MSFE-best,” and models estimated using autoregressive
interpolation dominate those estimated using other interpolation methods. (v) Factors
estimated using recursive principal component estimation methods have more
predictive content than those estimated using a variety of other (more sophis-
ticated) approaches. This result is particularly prevalent for our “MSFE-best”
factor-MIDAS models, across virtually all forecast horizons, estimation schemes,
and data vintages that are analyzed.
KEYWORDS
backcasting, forecasting, factor model, MIDAS, nowcasting
Journal of Forecasting. 2018;37:281–302. wileyonlinelibrary.com/journal/for Copyright © 2017 John Wiley & Sons, Ltd.
1 INTRODUCTION
In this paper, we utilize a combination of real-time data
and mixed-frequency modeling methods together with
a variety of principal component analyses, in order to
provide new evidence on the usefulness of these techniques
for forecasting. More specifically, we introduce
a new real-time Korean gross domestic product (GDP)
dataset, which is used together with a large monthly
dataset including 190 variables,1 to backcast, nowcast,
and forecast Korean GDP. Our prediction models combine
the mixed data sampling (MIDAS) framework of Ghysels,
Santa-Clara, and Valkanov (2004), which allows for the
incorporation of variables of differing frequencies, with the
diffusion index framework of Stock and Watson (2002).
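To make the diffusion index idea concrete, the following is a minimal sketch of extracting a single common factor from a panel of monthly series as its first principal component. It is an illustration under stated assumptions, not the estimation code used in this paper: the function names `standardize` and `first_factor` are hypothetical, only one factor is extracted, and a simple power iteration stands in for a full eigendecomposition.

```python
import math
from typing import List


def standardize(panel: List[List[float]]) -> List[List[float]]:
    """Demean each column and scale it to unit variance (rows are months)."""
    n_obs, n_var = len(panel), len(panel[0])
    out = [[0.0] * n_var for _ in range(n_obs)]
    for j in range(n_var):
        col = [row[j] for row in panel]
        mean = sum(col) / n_obs
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / n_obs)
        if sd == 0.0:
            raise ValueError("constant series cannot be standardized")
        for t in range(n_obs):
            out[t][j] = (col[t] - mean) / sd
    return out


def first_factor(panel: List[List[float]], n_iter: int = 200) -> List[float]:
    """Estimate one common factor as the first principal component."""
    x = standardize(panel)
    n_obs, n_var = len(x), len(x[0])
    # Sample covariance matrix of the standardized series (n_var x n_var).
    cov = [[sum(x[t][i] * x[t][j] for t in range(n_obs)) / n_obs
            for j in range(n_var)] for i in range(n_var)]
    # Power iteration: repeated multiplication by cov converges to the
    # eigenvector with the largest eigenvalue, provided the starting
    # vector is not orthogonal to it (assumed here).
    v = [1.0 / math.sqrt(n_var)] * n_var
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(n_var)) for i in range(n_var)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    # The factor estimate is the projection of each month's data on v.
    return [sum(x[t][j] * v[j] for j in range(n_var)) for t in range(n_obs)]
```

The sign of the estimated factor is not identified, so it should be compared with other series only up to a sign flip. Re-running `first_factor` on an expanding sample each period would give a recursive variant of this estimator.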
The difference between backcasting, nowcasting, and
forecasting can be explained as follows. Suppose that the
objective is to predict GDP for 2016:Q2, using a simple
autoregressive model of order one, say. In a conventional
setting where real-time data are not available, it is assumed
that information up to 2016:Q1 is available at the time the
prediction is made, so that
$$\mathrm{GDP}_{2016:Q2} = \hat{\alpha} + \hat{\beta}\,\mathrm{GDP}_{2016:Q1},$$
where $\hat{\alpha}$ and $\hat{\beta}$ are parameters estimated using maximum
likelihood based on recursive or rolling data windows. In
a real-time context, however, this prediction is not feasible.
Namely, if the prediction is to be made in April or
even May of 2016, then $\mathrm{GDP}_{2016:Q1}$ is not yet available, even
in preliminary release. This issue leads to the convention
of defining three different types of predictions, includ-
ing backcasts (predicting past observations, which are not
yet available in real time), nowcasts (predicting concur-
rent observations), and forecasts (see Giannone, Reichlin,
and Small (2008) for a comprehensive discussion of back-
casting and nowcasting). One advantage of carefully ana-
lyzing the data structure used in the formulation of
prediction models is that we are able to simulate real-time
decision-making processes. In addition to Giannone et al.
(2008), the reader is referred to Girardi et al. (2017) for an
overview of this literature, within the context of nowcast-
ing euro area GDP in pseudo real time using dimension
reduction techniques.
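The three prediction types can be told apart mechanically once the prediction date and the latest published quarter are known. The sketch below is only an illustration of that distinction: the function names are hypothetical, and the latest released quarter is supplied by the caller rather than derived from any actual release calendar.

```python
from datetime import date
from typing import Tuple


def quarter_index(year: int, quarter: int) -> int:
    """Map (year, quarter) to a single integer so quarters can be compared."""
    return year * 4 + (quarter - 1)


def classify_prediction(target: Tuple[int, int], made_on: date,
                        last_released: Tuple[int, int]) -> str:
    """Label a prediction of the target quarter.

    target        -- (year, quarter) being predicted
    made_on       -- calendar date on which the prediction is made
    last_released -- latest (year, quarter) with published GDP as of made_on
                     (an assumed input, not taken from a real calendar)
    """
    t = quarter_index(*target)
    current = quarter_index(made_on.year, (made_on.month - 1) // 3 + 1)
    released = quarter_index(*last_released)
    if t <= released:
        raise ValueError("target quarter has already been released")
    if t < current:
        return "backcast"   # quarter is over, but its GDP is not yet published
    if t == current:
        return "nowcast"    # quarter containing the prediction date
    return "forecast"       # quarter that has not started yet
```

For a prediction made in May 2016 with 2015:Q4 as the latest published quarter, `classify_prediction((2016, 1), date(2016, 5, 15), (2015, 4))` returns `"backcast"`, the same call with target `(2016, 2)` returns `"nowcast"`, and target `(2016, 3)` returns `"forecast"`.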
There are several approaches to forecasting lower fre-
quency variables using higher frequency variables. The
first approach involves use of the so-called “bridge” model,
which aggregates higher frequency variables with lower
frequency variables, such as GDP. This aggregation is
called a “bridge,” and this method is commonly used
by central banks, since implementation and interpreta-
tion are straightforward (see, e.g., Golinelli & Parigi, 2005;
Rünstler & Sédillot, 2003; Zheng & Rossiter, 2006). Indeed,
this approach offers a very convenient solution for fil-
tering, or aggregating, variables characterized by differ-
ent frequencies. However, aggregation may lead to the
loss of useful information. This issue has led to the
recent development of alternative mixed-frequency modeling
approaches. One important approach, which is mentioned
above, is called MIDAS. This approach involves
the use of a regression framework that directly includes
variables sampled at different frequencies. Broadly speaking,
MIDAS regression offers a parsimonious means by
which lags of explanatory variables of differing frequencies
can be utilized; and its use for macroeconomic forecasting
is succinctly elucidated by Clements and Galvao
(2008). Additional recent papers in this area of forecasting
include Kuzin, Marcellino, and Schumacher (2011), who
predict euro area GDP, Ferrara and Marsilli (2013), who
predict French GDP, and Pettenuzzo, Timmermann, and
Valkanov (2014), who discuss Bayesian implementation of
MIDAS. One interesting feature of MIDAS is that the technique
readily allows for the inclusion of diffusion indices.
For discussion of the combination of factor and MIDAS
approaches, see Marcellino and Schumacher (2010) and
Section 5 of this paper. For an interesting application to the
prediction of German GDP, see Schumacher (2007). The
reader is additionally referred to Giannone et al. (2008),
Bańbura, Giannone, and Reichlin (2012), and Bańbura and
Modugno (2014) for interesting discussions on the use of
mixed-frequency modeling for nowcasting, and to Kuzin,
Marcellino, and Schumacher (2013), Aastveit, Gerdrup,
Jore, and Thorsrud (2014), and Mazzi, Mitchell, and
Montana (2014) for a discussion on forecast combination
in the current context.
1 This large monthly Korean macroeconomic dataset, which resembles
the well-known US Stock and Watson (2002) dataset, is introduced in Kim
(2017).
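The contrast between bridge-style aggregation and MIDAS weighting can be sketched in a few lines. The example below is illustrative only: the function names are hypothetical, simple quarterly averaging is assumed for the bridge step, and the exponential Almon lag polynomial is used as the weighting scheme because it is a common choice in the MIDAS literature, not because this passage specifies it.

```python
import math
from typing import List


def bridge_aggregate(monthly: List[float]) -> List[float]:
    """Average each block of three monthly values into one quarterly value."""
    if len(monthly) % 3 != 0:
        raise ValueError("expected a whole number of quarters")
    return [sum(monthly[i:i + 3]) / 3.0 for i in range(0, len(monthly), 3)]


def exp_almon_weights(theta1: float, theta2: float, n_lags: int) -> List[float]:
    """Exponential Almon lag weights, normalised to sum to one."""
    raw = [math.exp(theta1 * k + theta2 * k * k) for k in range(1, n_lags + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def midas_term(monthly: List[float], weights: List[float]) -> float:
    """Weighted sum of the most recent monthly values (first weight = newest)."""
    if len(monthly) < len(weights):
        raise ValueError("not enough monthly observations")
    newest_first = monthly[::-1][:len(weights)]
    return sum(w * x for w, x in zip(weights, newest_first))
```

With `theta1 = theta2 = 0` the weights are flat, so a three-lag `midas_term` coincides with the latest bridge average. A negative `theta2` makes the weights decline with the lag, so recent months count for more. In both cases only two parameters govern the whole lag profile, which is the sense in which MIDAS is parsimonious.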
Our empirical findings can be summarized as fol-
lows. First and foremost, real-time data make a differ-
ence. The utilization of real-time data in a recursive
estimation framework, coupled with MIDAS, leads to
the “MSFE-best” predictions in our experiments. Sec-
ond, models that utilize only one or two factors are
“MSFE-best” at all forecasting horizons, but not at any
backcasting and nowcasting horizons. In these latter con-
texts, much more heavily parametrized models with many
factors are preferred. In particular, while one or two factors
are selected in around half of these cases, five or six
factors are also selected around half of the time. Third, the
variable being predicted makes a difference. For Korean
GDP, the use of “first available” versus “most recent”
data “strongly” affects model selection and performance.
One reason for this is that “first available” data are never
revised, and can thus in many cases be viewed as “noisy”
versions of later releases of observations for the same cal-
endar date. This is particularly true if rationality holds (see,
e.g., Swanson & van Dijk, 2006). Fourth, recursively esti-
mated models are almost always “MSFE-best,” and mod-
els estimated using autoregressive interpolation dominate
those estimated using other interpolation methods. Fifth,
factors constructed using recursive principal component
estimation methods have more predictive content than
those estimated using a variety of other (more sophisti-
cated) approaches. This result is particularly prevalent for
our “MSFE-best” factor-MIDAS models, across virtually
