Forecasting under model uncertainty: Non-homogeneous hidden Markov models with Pólya-Gamma data augmentation

RESEARCH ARTICLE

Constandina Koki (1), Loukia Meligkotsidou (2), Ioannis Vrontos (1)

(1) Department of Statistics, Athens University of Economics and Business, Athens, Greece
(2) Faculty of Mathematics, National and Kapodistrian University of Athens, Panepistimioupolis, Athens, Greece

Correspondence: Constandina Koki, Department of Statistics, Athens University of Economics and Business, 76 Patission Street, Athens, Greece. Email: kokiconst@aueb.gr

Received: 4 June 2019; Revised: 10 December 2019; Accepted: 13 December 2019; Published: 01 July 2020
DOI: 10.1002/for.2645
Journal of Forecasting. 2020;39:580-598. © 2019 John Wiley & Sons, Ltd.
Abstract
We consider finite state-space non-homogeneous hidden Markov models for forecasting univariate time series. Given a set of predictors, the time series are modeled via predictive regressions with state-dependent coefficients and time-varying transition probabilities that depend on the predictors via a logistic/multinomial function. In a hidden Markov setting, inference for logistic regression coefficients becomes complicated and in some cases impossible due to convergence issues. In this paper, we aim to address this problem by utilizing the recently proposed Pólya-Gamma latent variable scheme. Also, we allow for model uncertainty regarding the predictors that affect the series both linearly (in the mean) and non-linearly (in the transition matrix). Predictor selection and inference on the model parameters are based on an automatic Markov chain Monte Carlo scheme with reversible jump steps. Hence the proposed methodology can be used as a black box for predicting time series. Using simulation experiments, we illustrate the performance of our algorithm in various setups, in terms of mixing properties, model selection, and predictive ability. An empirical study on realized volatility data shows that our methodology gives improved forecasts compared to benchmark models.
KEYWORDS
non-homogeneous hidden Markov models, model selection, forecasting, Pólya-Gamma data augmentation, realized volatility
1 INTRODUCTION
Discrete-time finite state-space homogeneous hidden Markov models (HHMMs) have been extensively studied and used to model stochastic processes that consist of an observed process and a latent (hidden) sequence of states which is assumed to affect the observation sequence (see, e.g., Billio et al., 1999; Cappé et al., 2005). Bayesian inference, using Markov chain Monte Carlo (MCMC) techniques, has enhanced the applicability of HHMMs and has led to the construction of more complex model specifications, including non-homogeneous hidden Markov models (NHHMMs). Initially, Diebold et al. (1994) studied 2-state Gaussian NHHMMs, in which the time-varying transition probabilities were modeled via logistic functions. Their approach was based on the expectation-maximization (EM) algorithm. Filardo and Gordon (1998) adopted a Bayesian perspective to overcome technical and computational issues of classical approaches. Since then, various Bayesian methods have been proposed in the literature. For example, Spezia (2006) modeled time-varying transition probabilities via a logistic function depending on exogenous variables and performed model selection based on the Bayes factor. In the same spirit, Meligkotsidou and Dellaportas (2011) considered an m-state (m ≥ 2) NHHMM and assumed that the elements of the transition matrix are linked to exogenous variables through a multinomial logistic link, whereas the observed process, conditional on the unobserved process, follows an autoregressive model of order p. They accommodated and exploited model uncertainty within their Bayesian model, by allowing covariate selection only on the transition matrix, to improve the predictive ability of NHHMMs on economic data series.
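To fix notation, a generic multinomial logistic parameterization of this type, written here only for illustration and not necessarily matching the exact specification of Meligkotsidou and Dellaportas (2011), is

\[
P\left(Z_t = j \mid Z_{t-1} = i,\, x_{t-1}\right) \;=\; \frac{\exp\left(x_{t-1}'\beta_{ij}\right)}{\sum_{k=1}^{m}\exp\left(x_{t-1}'\beta_{ik}\right)}, \qquad i, j = 1, \ldots, m,
\]

where Z_t denotes the hidden state at time t, x_{t-1} is the vector of exogenous variables, and the β_{ij} are state-specific logistic regression coefficients, with one reference category per row (e.g., β_{i1} = 0) fixed for identifiability.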
Based on experimental evidence, the algorithm of Meligkotsidou and Dellaportas (2011; M&D) faces convergence issues when there exists model uncertainty, due to the data augmentation scheme of Holmes and Held (2006). Polson et al. (2013) confirm the efficiency issues in the Holmes and Held scheme and propose a Pólya-Gamma data augmentation strategy that significantly improves over various benchmarks (e.g., Frühwirth-Schnatter & Frühwirth, 2010; Fussl et al., 2013; O'Brien & Dunson, 2004). Furthermore, the recent work of Holsclaw et al. (2017) confirms that using Pólya-Gamma data augmentation to parametrize the transition probabilities of NHHMMs results in an algorithm that mixes well and provides adequate estimates of the model parameters.
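To make the augmentation concrete, the following sketch shows one Gibbs sweep for a plain binary logistic regression under the Pólya-Gamma scheme. This is a minimal illustration in our own notation, not the authors' implementation: the Gaussian prior (mean b0, precision B0_inv) is an arbitrary choice, and draw_polya_gamma is only a truncated-series approximation to the exact PG(1, c) sampler of Polson et al. (2013).

```python
import numpy as np

rng = np.random.default_rng(0)

def draw_polya_gamma(c, trunc=200):
    # Approximate PG(1, c) draws via the truncated infinite-sum representation
    # of Polson et al. (2013); exact samplers use an accept-reject scheme instead.
    c = np.atleast_1d(c).astype(float)
    k = np.arange(1, trunc + 1)
    g = rng.exponential(1.0, size=(c.shape[0], trunc))           # g_k ~ Exp(1)
    denom = (k - 0.5) ** 2 + (c[:, None] / (2.0 * np.pi)) ** 2
    return (g / denom).sum(axis=1) / (2.0 * np.pi ** 2)

def pg_gibbs_step(beta, X, y, b0, B0_inv):
    # One sweep of the two-block Gibbs sampler for binary logistic regression:
    # omega_i | beta ~ PG(1, x_i'beta), then beta | omega is Gaussian.
    omega = draw_polya_gamma(X @ beta)
    kappa = y - 0.5
    V = np.linalg.inv(X.T @ (omega[:, None] * X) + B0_inv)
    m = V @ (X.T @ kappa + B0_inv @ b0)
    return rng.multivariate_normal(m, V)

# Illustrative usage on simulated data
n, p = 500, 3
X = rng.normal(size=(n, p))
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ np.array([1.0, -0.5, 0.25]))))
beta, b0, B0_inv = np.zeros(p), np.zeros(p), np.eye(p) / 100.0   # vague Gaussian prior
draws = []
for _ in range(2000):
    beta = pg_gibbs_step(beta, X, y, b0, B0_inv)
    draws.append(beta)
```

In the NHHMM context, conditionally Gaussian updates of this form are applied to the logistic regression coefficients of the transition probabilities, given the sampled state sequence.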
Motivated by this, we revisit the work of Meligkotsidou and Dellaportas (2011) by employing the recent methodological advances on the Pólya-Gamma data augmentation scheme of Polson et al. (2013). We consider NHHMMs in which the time series are modeled via different predictive regression models for each state, whereas the transition probabilities are modeled via logistic regressions. Given an available set of predictors, we allow for model uncertainty regarding the predictors that affect the series both linearly (directly in the mean regressions) and non-linearly (in the transition probability matrix).

The resulting model is a non-homogeneous Pólya-Gamma hidden Markov model, which we will denote by NHPG. Bayesian inference is performed via an MCMC scheme that overcomes difficulties and convergence issues inherent in existing MCMC algorithms. To this end, we exploit the missing data representation of hidden Markov models and construct an MCMC algorithm based on data augmentation, consisting of several steps. First, we sample the latent sequence of states via the scaled forward-backward algorithm of Scott (2002), which is a modification of the forward-backward algorithm of Baum et al. (1970), who used it to implement the classical EM algorithm. We then use a logistic regression representation of the transition probabilities and simulate the parameters of the mean predictive regression model for each state via Gibbs sampling steps. Finally, we incorporate variable selection within our MCMC scheme by using the reversible jump (RJ) algorithm of Green (1995) and Hastie and Green (2011).
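As an illustration of the state-sampling step, the following sketch implements scaled forward filtering followed by backward sampling for a generic finite-state model with time-varying transition matrices. The function and argument names are ours; the sketch is meant only to convey the structure of this step, not the exact algorithm used in the paper.

```python
import numpy as np

def sample_states(log_lik, trans, init, rng):
    # Draw the hidden state path by scaled forward filtering, backward sampling.
    #   log_lik : (T, m) log-density of y_t under each of the m states
    #   trans   : (T, m, m) transition matrices; trans[t][i, j] = P(Z_t = j | Z_{t-1} = i)
    #   init    : (m,) initial state distribution
    T, m = log_lik.shape
    lik = np.exp(log_lik - log_lik.max(axis=1, keepdims=True))   # rescale for stability

    # Forward pass: filtered probabilities P(Z_t = . | y_{1:t}), renormalized each step
    filt = np.empty((T, m))
    f = init * lik[0]
    filt[0] = f / f.sum()
    for t in range(1, T):
        f = (filt[t - 1] @ trans[t]) * lik[t]
        filt[t] = f / f.sum()

    # Backward pass: draw Z_T from the last filtered distribution, then
    # Z_t | Z_{t+1} with probability proportional to filt[t, i] * trans[t+1][i, Z_{t+1}]
    z = np.empty(T, dtype=int)
    z[-1] = rng.choice(m, p=filt[-1])
    for t in range(T - 2, -1, -1):
        w = filt[t] * trans[t + 1][:, z[t + 1]]
        z[t] = rng.choice(m, p=w / w.sum())
    return z
```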
Different approaches have been used in the literature to cope with the model selection problem. One approach to variable selection is the use of information criteria, such as Akaike's information criterion (AIC; Akaike et al., 1973), the Bayesian information criterion (BIC) of Schwarz (1978), the deviance information criterion (DIC; Spiegelhalter et al., 2002), or the widely applicable Bayesian information criterion (WBIC; Watanabe, 2013). A comparison of variable selection methods is presented in O'Hara and Sillanpää (2009), while Dellaportas et al. (2002) study variable selection methods in the context of model choice. Holsclaw et al. (2017) consider an NHHMM similar to ours for modeling multivariate meteorological time series data. In that paper, the transition probabilities are modeled via multinomial logistic regressions affected by a specific set of exogenous variables, and the authors use the BIC for choosing the best model among a prespecified class of models. We extend this work by considering the problems of statistical inference and variable selection jointly, in a purely Bayesian setting. The proposed model is flexible, since we do not decide a priori which covariates affect the observed or the unobserved process. Instead, we have a common pool of covariates {X} and, within the MCMC algorithm, we gauge which covariates are included in the subset X^(1) affecting the mean predictive equation of the observed process, and which covariates are included in the subset X^(2) affecting the time-varying transition probabilities.
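Schematically, and with Gaussian errors assumed here purely for illustration (the formal model is defined later in the paper), the NHPG specification combines a state-dependent mean equation with the logistic transition structure given above:

\[
y_{t+1} \mid \left(Z_{t+1} = j,\, x_t^{(1)}\right) \;\sim\; \mathcal{N}\!\left(x_t^{(1)\prime}\theta_j,\; \sigma_j^2\right),
\qquad
P\left(Z_{t+1} = j \mid Z_t = i,\, x_t^{(2)}\right) \;\propto\; \exp\!\left(x_t^{(2)\prime}\beta_{ij}\right),
\]

so that X^(1) enters the observed process linearly through the state-specific coefficients θ_j, while X^(2) enters non-linearly through the transition probabilities; the reversible jump moves decide which elements of the common pool {X} belong to each subset.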
Our probabilistic approach is based on the calculation of the posterior distribution of different NHPGs. Posterior probabilities can be used either for selecting the most probable model (i.e., making inference using the model with the highest posterior probability) or for Bayesian model averaging (i.e., producing inferences averaged over different NHPGs). Barbieri and Berger (2004) argue that the optimal predictive model is not necessarily the model with the highest posterior probability. Specifically, they show that the optimal predictive model for linear regression models is the median probability model, that is, the model consisting of those covariates whose overall posterior inclusion probabilities are greater than or equal to 0.5. We calculate both the posterior probabilities of the models and the probabilities of inclusion.
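As a simple post-processing illustration, given stored draws of binary inclusion indicators from the reversible jump sampler (here a hypothetical array gamma_draws), the posterior inclusion probabilities, the median probability model, and the posterior model probabilities can be computed along the following lines.

```python
import numpy as np

def median_probability_model(gamma_draws):
    # gamma_draws: (n_iter, n_covariates) array of 0/1 inclusion indicators
    # stored at each MCMC iteration for one part of the model.
    inclusion_prob = gamma_draws.mean(axis=0)        # posterior inclusion probabilities
    selected = np.where(inclusion_prob >= 0.5)[0]    # Barbieri and Berger (2004) rule
    return inclusion_prob, selected

def posterior_model_probabilities(gamma_draws):
    # Relative frequency of each visited covariate configuration.
    configs, counts = np.unique(gamma_draws, axis=0, return_counts=True)
    return configs, counts / counts.sum()
```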
We use our model for predicting realized volatility. Accurate forecasting of future volatility is important for asset allocation, portfolio construction, and risk management (see Gospodinov et al., 2006). A review of the realized volatility literature can be found in McAleer and Medeiros (2008). The relationship between the volatility