Does a lot help a lot? Forecasting stock returns with pooling strategies in a data‐rich environment

Author: Fabian Baetje
DOI: http://doi.org/10.1002/for.2475
Published date: 01 January 2018
RESEARCH ARTICLE
Fabian Baetje
Department of Economics, Leibniz
Universität Hannover, Hannover, Germany
Correspondence
Fabian Baetje, Department of Economics,
Leibniz Universität Hannover,
Königsworther Platz 1, D-30167 Hannover,
Germany
Email: baetje@gif.uni-hannover.de
Abstract
A variety of recent studies take a skeptical view of the predictability of stock
returns. Empirical evidence shows that most prediction models suffer from a loss
of information, model uncertainty, and structural instability because they rely on
low-dimensional information sets. In this study, we evaluate the predictive ability of
various recently refined forecasting strategies, which handle these issues by incorporating
information from many potential predictor variables simultaneously. We investigate
whether forecasting strategies that (i) combine information and (ii) combine
individual forecasts are useful to predict US stock returns, that is, the market excess return
and the size, value, and momentum premia. Our results show that methods combining
information have remarkable in-sample predictive ability. However, their out-of-sample
performance suffers from highly volatile forecast errors. Forecast combinations
face a better bias-efficiency trade-off, yielding a consistently superior forecast
performance for the market excess return and the size premium even after the 1970s.
KEYWORDS
factor models, forecast combination, model uncertainty, principal components, return predictability
1 | INTRODUCTION
The classical capital asset pricing model is often expanded by
additionally considering returns to a size portfolio (SMB), to
a value portfolio (HML), and to a momentum portfolio
(MOM) as risk factors (Carhart, 1997; Fama & French,
1993). The resulting four risk factors can usually be seen as
risk premia compensating investors for holding risky assets.
Due to the important role of these risk factors and their
related premia, the question arises: Are these risk premia
predictable by macroeconomic determinants?
We focus on the macroeconomic aspect because one may
argue that the riskiness of firms is ultimately determined to a great
extent by the firms' exposure to macroeconomic
risks. This line of research has a long history in equity
premium prediction and has produced established indicators,
such as the short-term interest rate, the credit and the term
spread (see, e.g., Ang & Bekaert, 2007; Fama & French,
1989; Keim & Stambaugh, 1986), the inflation rate (e.g.,
Campbell & Vuolteenaho, 2004; Fama & Schwert, 1977;
Nelson, 1976), stock market volatility (investigated by Guo,
2006), or the consumption-wealth ratio provided by Lettau
and Ludvigson (2001), to name just a few. Most of the empirical
studies (e.g., Campbell & Shiller, 1988; Cochrane, 2008;
Lewellen, 2004) use valuation ratios such as the dividend
yield, the price-earnings ratio, or the book-to-market ratio,
which should serve as proxies for expected business conditions,
as mentioned by Campbell and Diebold (2009).
Received: 5 October 2015; Revised: 30 November 2016; Accepted: 25 March 2017
Journal of Forecasting. 2018;37:37–63. Copyright © 2017 John Wiley & Sons, Ltd. wileyonlinelibrary.com/journal/for
Although stock return predictability was accepted for a
long time, Goyal and Welch (2008), Timmermann (2008),
Griffin, Ji, and Martin (2003), and Griffin and Lemmon
(2002), among others, show that commonly used predictors
perform poorly in an out-of-sample (OOS) setting, especially
during more recent decades, which casts doubt on the empirical
linkage between macroeconomic fundamentals and stock
returns. More recently, Rapach and Zhou (2013) provide a
survey of further approaches attempting to improve the forecast
performance. Campbell and Thompson (2008), Ferreira
and Santa-Clara (2011), and Pettenuzzo, Timmermann, and
Valkanov (2014) show that the application of economic constraints
generally increases the overall forecast performance
of commonly used macroeconomic variables. However,
improvements are predominantly located in the distant past
(see Li & Tsiakas, 2016; Pettenuzzo et al., 2014). Further
forecasting advances use pooling strategies
such as principal component analysis (proposed by Stock &
Watson, 2002b) or forecast combination methods (proposed
by Rapach, Strauss, & Zhou, 2010). Although Rapach and
Zhou (2013) report evidence in favor of principal component
predictive regressions, the same set of predictors performs
considerably worse than the historical average benchmark
when a more recent data sample is considered (see, e.g., Neely,
Rapach, Tu, & Zhou, 2014). This behavior also appears to
hold for alternative pooling strategies (Baetje &
Menkhoff, 2016). Overall, reported findings confirm results
highlighted by Goyal and Welch (2008) that commonly used
predictors perform unstably over time.
This finding might be a direct consequence of model
uncertainty and/or structural instability surrounding a constantly
evolving data-generating process for stock returns
(Pesaran & Timmermann, 1995). As previously mentioned,
many approaches concerning return predictability only consider
a small number of a priori selected indicators as potential
predictor variables, which naturally limits the amount of
available information. Nowadays, numerous variables are
available for model specification, leaving unanswered the
question of which variables are the most relevant ones for
stock return predictability.
Apart from the problem of searching for the most informative
variables, structural instability might lead to changes in the
best model specification (Rapach & Zhou, 2013). According to
Timmermann (2008), Paye and Timmermann (2006), and
Rapach and Wohar (2006), among others, forecasting models
exhibit parameter instability, with the consequence that the
outperformance of forecasting models is restricted
to short-lived periods (see also Goyal & Welch, 2008). Thus,
if the forecasting ability of individual variables varies over
time, it is unclear which variable should be considered in
predictive regression models (i.e., model uncertainty).
This study contributes to the existing literature on stock
return predictability in three ways. First and foremost, we
relate to studies that predict stock market risk premia with
macroeconomic information by exploiting a large number
of potential predictor variables. Empirical studies commonly
make use of a strong prior on predictability by examining the
forecast performance of individual predictors (e.g., Goyal &
Welch, 2008) or forecasting strategies based on a small set
of variables (e.g., Rapach & Zhou, 2013; Rapach et al.,
2010). However, reported evidence shows that the relation
between the macroeconomic situation and the stock market
is difficult to grasp by relying on low-dimensional
information sets (Timmermann, 2008). By expanding the
set of predictors, we are able to determine whether the weak
evidence of predictability can be linked to a loss of information
that arises when a small set of information is used to capture
time-varying dynamics of stock market risk premia. In general, we
follow a variety of recent studies (Stock & Watson, 2002b;
Ludvigson & Ng, 2007, 2009; among others) showing that
the consideration of a large number of predictors might
improve the forecast performance. In this context, our dataset
of potentially meaningful predictor variables consists of 124
US macroeconomic and financial time series and is related
to the datasets used in other studies, such as Ludvigson and Ng (2009).
Second, our research contributes to concerns regarding
model uncertainty and instability issues surrounding the
data-generating process for stock returns. We address the
complexity of the relation between stock market risk premia
and macroeconomic fundamentals by applying various
forecasting strategies. These are able to consider many potentially
important predictor variables simultaneously and,
therefore, are able to limit the impact of model uncertainty
(see Rapach & Zhou, 2013). Additionally, if the degree of
potential model instability is not too dependent across series,
pooling strategies might further enhance the overall forecast
performance (Rapach et al., 2010; Stock & Watson, 2002a).
Explicitly, this study evaluates the predictive performance
of forecasting strategies combining information and strategies
that pool individual forecasts (Huang & Lee, 2010;
Rapach & Zhou, 2013). For this purpose, we first apply principal
component predictive regressions considering a small
number of latent common components as potential predictor
variables (Cakmakli & van Dijk, 2016; Ludvigson & Ng,
2007; Neely et al., 2014). Furthermore, we follow Kelly
and Pruitt (2013, 2015) and make use of a three-pass regression
filter, which estimates target-relevant factors by explicitly
taking into account the relationship between the estimated
factors and the target-relevant variables (i.e., stock returns).
Regarding the second set of forecasting strategies, we follow
Rapach et al. (2010) by using forecast combination strategies.
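As a rough illustration of the two families of pooling strategies described above, the following Python sketch contrasts a principal component predictive regression (combining information) with an equal-weighted combination of single-predictor regression forecasts (combining forecasts). It is a minimal sketch under stated assumptions and not the estimation procedure of this study: the simulated data, the function names `ols_forecast`, `pc_forecast`, and `combination_forecast`, the choice of three factors, and the equal weights are all hypothetical choices made for illustration.

```python
import numpy as np


def ols_forecast(x_train, y_train, x_new):
    """Fit y = a + x'b by least squares and return the fitted value at x_new."""
    X = np.column_stack([np.ones(len(x_train)), x_train])
    coef, *_ = np.linalg.lstsq(X, y_train, rcond=None)
    return float(np.concatenate([[1.0], np.atleast_1d(x_new)]) @ coef)


def pc_forecast(predictors, returns, n_factors=3):
    """Combine information: regress r_{t+1} on the first principal components of x_t."""
    z = (predictors - predictors.mean(axis=0)) / predictors.std(axis=0)
    # Principal components from the SVD of the standardized predictor panel.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    factors = z @ vt[:n_factors].T
    # Factors at t are paired with returns at t+1; the last row gives the forecast.
    return ols_forecast(factors[:-1], returns[1:], factors[-1])


def combination_forecast(predictors, returns):
    """Combine forecasts: equal-weighted mean of one-predictor regression forecasts."""
    individual = [
        ols_forecast(predictors[:-1, j], returns[1:], predictors[-1, j])
        for j in range(predictors.shape[1])
    ]
    return float(np.mean(individual))


# Simulated example (assumption): 120 months, 10 predictors, weak predictability.
rng = np.random.default_rng(0)
x = rng.standard_normal((120, 10))
r = np.zeros(120)
r[1:] = 0.2 * x[:-1, 0] + 0.5 * rng.standard_normal(119)

print("PC regression forecast:", pc_forecast(x, r))
print("Combination forecast:  ", combination_forecast(x, r))
```

In this sketch the principal component approach compresses the predictor panel before a single regression is run, whereas the combination approach runs one small regression per predictor and averages the resulting forecasts.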
Third, we provide a comprehensive overview of the predictability
of various risk premia. Therefore, we extend the
set of stock market risk premia and provide further insights
into whether forecasting strategies are also beneficial for predicting
the size, value, and momentum risk premia. To the best
of our knowledge, other studies either focus on a single risk
premium (in particular, on the market excess return) or on a
single approach. We conduct forecast regressions for
short-run (1–3 months) and long-run (12–24 months) forecast
horizons. In addition to an in-sample analysis, we evaluate
the forecast performance in an OOS setting, starting in
February 1975. Thus, the evaluation of forecast accuracy starts
directly at the end of the 1970s recession, which is cited
as the peak of the previously reported OOS predictive ability
of commonly used variables (see Goyal & Welch, 2008).
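To make the OOS comparison against the historical average benchmark concrete, the sketch below computes an out-of-sample R-squared of the form 1 - MSE(model) / MSE(benchmark), a statistic widely used in this literature. The text above does not state which evaluation statistic the study uses, so this is an assumption for illustration only; the function names, the simulated returns, and the placeholder zero forecast are hypothetical.

```python
import numpy as np


def expanding_mean_forecasts(returns, start):
    """Historical average benchmark: forecast r_t by the mean of r_0, ..., r_{t-1}."""
    return np.array([returns[:t].mean() for t in range(start, len(returns))])


def oos_r2(actual, model_forecasts, benchmark_forecasts):
    """Out-of-sample R^2: positive values mean the model beats the benchmark in MSE."""
    mse_model = np.mean((actual - model_forecasts) ** 2)
    mse_bench = np.mean((actual - benchmark_forecasts) ** 2)
    return 1.0 - mse_model / mse_bench


# Simulated monthly returns (assumption); evaluation starts after 120 observations.
rng = np.random.default_rng(0)
returns = 0.005 + 0.04 * rng.standard_normal(240)
start = 120

benchmark = expanding_mean_forecasts(returns, start)
model = np.zeros(len(returns) - start)  # placeholder "model": always forecast zero

print("OOS R^2 of the placeholder model:", oos_r2(returns[start:], model, benchmark))
```

Any of the pooling forecasts could be substituted for the placeholder `model` array, provided each forecast uses only data available before the month being predicted.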