On the directional predictability of equity premium using machine learning techniques

Published date: 01 April 2020
DOI: http://doi.org/10.1002/for.2632
Received: 28 November 2018 Revised: 22 October 2019 Accepted: 28 October 2019
RESEARCH ARTICLE
Jonathan Iworiso, Spyridon Vrontos
Department of Mathematical Sciences,
University of Essex, Colchester, UK
Correspondence
Spyridon Vrontos, Department of
Mathematical Sciences, University of
Essex, Wivenhoe Park, Colchester CO4
3SQ, UK.
Email: svrontos@essex.ac.uk
Abstract
This paper applies a plethora of machine learning techniques to forecast the
direction of the US equity premium. Our techniques include benchmark binary
probit models, classification and regression trees, along with penalized binary
probit models. Our empirical analysis reveals that the sophisticated machine
learning techniques significantly outperformed the benchmark binary probit
forecasting models, both statistically and economically. Overall, the discriminant
analysis classifiers are ranked first among all the models tested. Specifically,
the high-dimensional discriminant analysis classifier ranks first in terms of sta-
tistical performance, while the quadratic discriminant analysis classifier ranks
first in economic performance. The penalized likelihood binary probit models
(least absolute shrinkage and selection operator, ridge, elastic net) also outper-
formed the benchmark binary probit models, providing significant alternatives
to portfolio managers.
KEYWORDS
binary probit, CART, directional predictability, forecasting, penalized binary probit, recursive window
1 INTRODUCTION
Stock market participants aim at maximizing returns on
portfolio investments at minimal risk. Consequently, forecasting
stock market returns has received considerable
attention in recent years. The majority of papers have
focused on the forecast accuracy of competing models
and examined whether there is evidence of predictabil-
ity, which can lead to economic gains. However, devising
successful trading strategies is contingent on the direc-
tional accuracy of the underlying models. The literature
on directional predictability is sparse, and the empirical
findings offer limited support. For example, the findings of
Chevapatrakul (2013), Christoffersen and Diebold (2006),
and Nyberg and Pönkä (2016) provide weak evidence of
directional stock market predictability. Although the predictive
power of the models employed so far is shown to be weak in
statistical terms, the models seem to provide economic value.
Thus the main challenge lies in the development of a suitable
directional predictive model involving the relevant financial
and economic variables.
Some benchmark econometric models used in previous studies
have been shown to be weak in terms of predictive performance.
The out-of-sample estimation and forecasting techniques used
by Nyberg (2011) and Pönkä (2016) provide statistically
significant evidence of the directional predictability of
stock market returns, but the predictive power of the models
is shown to be relatively weak. Hence there is a need to
introduce sophisticated machine learning techniques, as
proposed in this paper, to improve the predictive performance
of the models.
Journal of Forecasting. 2020;39:449–469. wileyonlinelibrary.com/journal/for © 2019 John Wiley & Sons, Ltd.
This paper focuses on the application of sophisticated
machine learning techniques to binary probit and
classification models to forecast the direction of US
excess stock market returns. The machine learning techniques
employed include classification and regression
trees (CART), such as bagging, boosting and discriminant
analysis classifiers, Bayesian classifiers, neural networks
(NNET), and regularization techniques, such as ridge,
least absolute shrinkage and selection operator (LASSO),
and elastic net. To compare our findings with the previ-
ous literature, we also include four variants of the bench-
mark binary probit models, namely the static, stepwise
static, dynamic, and stepwise dynamic models. The CART
forecasting models are applied to explore all covariates
as ensembles in order to learn from the data, train the
classification model, recognize patterns, classify instances,
and forecast future binary outcomes. With respect to
penalized binary probit models, we should note that the
shrinkage penalty vector norms introduce some bias into
the coefficient estimates but can reduce the forecast
errors and improve predictive performance via the
so-called bias–variance tradeoff. Thus the proposal
of CART and penalized predictive models in this paper
aims at yielding superior statistical predictive performance
and economic significance compared to the benchmark
econometric models typically employed in the literature
to date.
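To make the penalized binary probit idea concrete, the following is a minimal sketch of one common way to write the objective: a negative probit log-likelihood plus an elastic net penalty. The notation is an illustrative assumption and is not taken from the methodology section of this paper: y_t is the indicator of a positive excess return, x_{t-1} is the vector of lagged predictors, \Phi is the standard normal distribution function, and \lambda and \alpha are tuning parameters.

```latex
\[
(\hat{\beta}_0, \hat{\beta}) = \arg\min_{\beta_0,\,\beta}\;
  -\frac{1}{T}\sum_{t=1}^{T}\Big[\, y_t \log \Phi\big(\beta_0 + x_{t-1}'\beta\big)
  + (1 - y_t)\log\big(1 - \Phi(\beta_0 + x_{t-1}'\beta)\big) \Big]
  + \lambda\Big[\alpha \lVert \beta \rVert_1
  + \tfrac{1-\alpha}{2}\, \lVert \beta \rVert_2^2 \Big]
\]
```

Under this parameterization, \alpha = 1 corresponds to the LASSO penalty, \alpha = 0 to the ridge penalty, and intermediate values to the elastic net. Setting \lambda = 0 recovers the unpenalized probit model, while larger values of \lambda shrink the coefficients more strongly toward zero.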
The remaining structure of the paper is laid out as fol-
lows: Section 2 discusses the relevant literature; Section 3
describes the research methodology; Section 4 presents the
data and the empirical findings; and Section 5 concludes
the paper.
2 LITERATURE REVIEW
A notable quest in modern financial econometric litera-
ture is the application of suitable techniques to predict the
sign of stock market returns. A review of relevant empirical
literature has revealed that the use of econometric mod-
els for the directional predictability of excess stock returns
is known to produce weak predictive power, poor statisti-
cal goodness of fit, and low predictive accuracies, among
others (see Chevapatrakul, 2013; Leitch & Tanner, 1991;
Leung, Daouk, & Chen, 2000; Nyberg, 2011; Pesaran &
Timmermann, 1995; Pönkä, 2016), even though the empirical
results seem to provide economic significance.
Previous studies of directional predictability by
Anatolyev and Gospodinov (2010) and Hong and Chung
(2003) employed a logistic regression model to predict
the sign of US stock market returns using relevant
financial variables as the key predictors. Their results
provide evidence of predictability, but the overall
predictive power is relatively weak compared to a rule of
thumb. In an attempt to determine market timing and asset
allocation decisions between stocks and risk-free assets,
some researchers considered the role of conditional mean
and volatility while predicting the sign of asset returns.
Christoffersen and Diebold (2006) have opined that the
direction of asset returns is predictable, as volatility depen-
dence produces sign dependence, so long as expected
returns are nonzero. This notion seems to be true, as other
existing papers have also provided significant statistical
evidence of the sign predictability of the US stock market
returns and economic recession status by application of
static, dynamic, autodynamic, and error correction mod-
els, both in-sample and out-of-sample (Kauppi & Saikko-
nen, 2008; Nyberg, 2011, 2013; Nyberg & Pönkä, 2016).
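The link between volatility dependence and sign dependence noted above can be illustrated with a short derivation. This is a sketch under a simplifying assumption that is not stated in this paper: returns follow a conditional location-scale model with constant mean \mu, conditional volatility \sigma_{t+1|t}, and an i.i.d. standardized innovation \varepsilon_{t+1} with distribution function F.

```latex
\[
R_{t+1} = \mu + \sigma_{t+1|t}\,\varepsilon_{t+1}, \qquad \varepsilon_{t+1} \sim F
\]
\[
\Pr\big(R_{t+1} > 0 \mid \Omega_t\big)
  = \Pr\Big(\varepsilon_{t+1} > -\frac{\mu}{\sigma_{t+1|t}}\Big)
  = 1 - F\Big(-\frac{\mu}{\sigma_{t+1|t}}\Big)
\]
```

When \mu \neq 0, the probability of a positive return varies over time with the volatility forecast \sigma_{t+1|t}, so the sign is predictable even though the conditional mean is constant. When \mu = 0, the probability collapses to the constant 1 - F(0) and the volatility dynamics carry no information about the sign.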
The static and dynamic probit models proposed
by Nyberg (2011) to predict the direction of monthly
US excess stock returns recursively appear to have
outperformed the autoregressive moving average
with exogenous inputs (ARMAX) models and the vector
autoregressive-generalized autoregressive conditional
heteroskedasticity (VAR-GARCH) models, among others, used by
previous researchers. The idea was based on the approach
used by Kauppi and Saikkonen (2008) and Estrella and
Mishkin (1998) to obtain US economic recession forecasts
using variables such as the US term spread and lagged
stock returns, among others.
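The difference between a static and a dynamic probit forecast can be sketched in a few lines of Python. The specification below is a commonly used form, assumed here for illustration: the static index is omega + x'beta and the dynamic index adds delta times the lagged sign indicator. All function names, coefficient values, and predictor values are hypothetical and are not estimates from any of the cited papers.

```python
import math


def norm_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def static_probit_prob(x_lagged, omega, beta):
    """P(y_t = 1) with linear index omega + x_{t-1}' beta."""
    index = omega + sum(b * x for b, x in zip(beta, x_lagged))
    return norm_cdf(index)


def dynamic_probit_prob(x_lagged, y_lagged, omega, beta, delta):
    """P(y_t = 1) with index omega + delta * y_{t-1} + x_{t-1}' beta."""
    index = omega + delta * y_lagged
    index += sum(b * x for b, x in zip(beta, x_lagged))
    return norm_cdf(index)


# Hypothetical lagged predictors (e.g. a term spread and a lagged return)
x_prev = [0.5, -1.2]

p_static = static_probit_prob(x_prev, omega=0.1, beta=[0.3, 0.2])
p_dynamic = dynamic_probit_prob(
    x_prev, y_lagged=1, omega=0.1, beta=[0.3, 0.2], delta=0.4
)

# A simple directional forecast: predict "up" when the probability exceeds 0.5
signal_static = int(p_static > 0.5)
signal_dynamic = int(p_dynamic > 0.5)
```

In a recursive out-of-sample exercise, the coefficients would be re-estimated each period on all data available up to that date before the next probability forecast is formed. The fixed coefficients here only illustrate how the lagged indicator shifts the forecast probability.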
However, according to the Nyberg (2011) paper, Estrel-
la's statistical goodness-of-fit values in the various probit
models are very low in all cases. The positive values of
the Sharpe ratios signified that investors are likely to have
positive returns on portfolio investments. The percentage
of correct matches, used as a statistical performance
evaluation measure in the existing papers, is relatively
low; hence the need to employ more advanced models that
can yield a better degree of accuracy with the smallest
prediction error.
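The two evaluation measures mentioned above, the percentage of correct matches and the Sharpe ratio, can be sketched as follows. The trading rule is an assumption made for illustration: the investor holds stocks when an "up" month is forecast and the risk-free asset otherwise, so the strategy's excess return is zero in months spent out of the market. The function names and the data are hypothetical.

```python
import math


def hit_ratio(actual_signs, predicted_signs):
    """Share of periods in which the predicted direction matches the outcome."""
    matches = sum(1 for a, p in zip(actual_signs, predicted_signs) if a == p)
    return matches / len(actual_signs)


def timing_excess_returns(excess_returns, predicted_signs):
    """Excess return of the assumed switching rule: stocks if 1, risk-free if 0."""
    return [r if s == 1 else 0.0 for r, s in zip(excess_returns, predicted_signs)]


def sharpe_ratio(excess_returns):
    """Mean excess return divided by its sample standard deviation."""
    n = len(excess_returns)
    mean = sum(excess_returns) / n
    variance = sum((r - mean) ** 2 for r in excess_returns) / (n - 1)
    return mean / math.sqrt(variance)


# Hypothetical monthly excess returns and direction forecasts
excess = [0.02, -0.01, 0.03, -0.02, 0.01]
actual = [1 if r > 0 else 0 for r in excess]
predicted = [1, 0, 1, 1, 1]

accuracy = hit_ratio(actual, predicted)
strategy = timing_excess_returns(excess, predicted)
strategy_sharpe = sharpe_ratio(strategy)
```

The sketch ignores transaction costs and does not annualize the Sharpe ratio. It also assumes the return series has nonzero variance, so a constant series would need separate handling.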
The underlying challenges associated with the use
of financial and economic variables to predict stock
market returns have prompted researchers to introduce
sophisticated statistical or machine learning algorithms to
improve the predictive task and the overall performance
evaluation of the resulting models under consideration.
It is noticeable from the empirical literature that statis-
tical learning techniques, which include random forest,
linear discriminant analysis (LDA), k-nearest neighbor,
tree-based classification, recursive partitioning, bagging
and boosting, logistic regression, support vector machine
(SVM), ridge regression, LASSO, least angle regression and
elastic nets, are useful for the analysis of financial econo-
metric time series (Chen, 2016; Hajek, Olej, & Myskova,
2014; Hsu, Hung, & Chang, 2008; Inoue & Kilian, 2008;
Kim & Swanson, 2014; Li & Chen, 2014; Lin & McClean,
2001; Pahwa, Khalfay, Soni, & Vora, 2017; Park & Sakaori,
2013; Roy, Mittal, Basu, & Abraham, 2015; Sermpinis,
Tsoukas, & Zhang, 2018; Shen, Wang, & Ma, 2014; Stock
& Watson, 2012; Swanson & White, 1997; Zhou, Lu, &
Fujita, 2015). Khaidem, Saha, and Dey (2016) used the