Published: 1 March 2020
DOI: http://doi.org/10.1002/for.2629
Journal of Forecasting. 2020;39:334–351
RESEARCH ARTICLE

Evaluation of current research on stock return predictability

Erhard Reschenhofer | Manveer Kaur Mangat | Christian Zwatz | Sándor Guzmics

Department of Statistics and Operations Research, University of Vienna, Vienna, Austria

Correspondence
Erhard Reschenhofer, Department of Statistics and Operations Research, University of Vienna, Oskar-Morgenstern-Platz 1, 1090 Vienna, Austria.
Email: erhard.reschenhofer@univie.ac.at
Abstract
The results of recent replication studies suggest that false positive findings are a big problem in empirical finance. We contribute to this debate by reviewing a sample of articles dealing with the short-term directional forecasting of the prices of stocks, commodities, and currencies. Screening all relevant articles published in 2016 by one of the 96 journals covered by the Social Sciences Citation Index in the category "Business, Finance," we select only those studies that use easily accessible data of daily or higher frequency. We examine each study in detail, from the selection of the dataset to the interpretation of the results. We also include empirical analyses to illustrate the shortcomings of certain approaches. There are three main findings from our review. First, the number of selected papers is very low, which is surprising even when the strict selection criteria are taken into account. Second, there are hardly any relevant studies that use high-frequency data, despite the hype about financial big data and machine learning. Third, the economic significance of the findings (for example, their usefulness for trading purposes) is questionable. In general, apparently good forecasting performance does not translate into profitability once realistic transaction costs and the effect of data snooping are taken into account. Other typical problems include unsuitable benchmarks, short evaluation periods, and non-operational trading strategies.

KEYWORDS
replication studies, trading strategies, fractal dynamics, periodicities, technical analysis, machine learning
1 | INTRODUCTION
In a much-noticed study (Open Science Collaboration, 2015), 100 replications of studies published in three renowned psychology journals were conducted in order to evaluate reproducibility. Overall, only 36% of the replications yielded significant results (at the 5% level), compared to 97% of the original studies that reported significant results. Possible explanations for this discrepancy range from the tendency of journal editors and reviewers to favor articles with statistically significant findings (publication bias) to data snooping and questionable research practices. Of course, the problem of false positive findings is not confined to fields such as
psychology and medicine. It may even be more serious in scientific fields where it is not possible to replicate studies. For example, in empirical economics and empirical finance, the data are obtained by observing economic and financial variables over time. Clearly, there is only one set of historical data available. We cannot simply generate a completely new history to determine the reliability of earlier results. There are only two options. First, we can reanalyze the original dataset by scrutinizing the selection of the dataset, the methods used for analyzing the data, the assumptions on which these methods are based, the measures used for the evaluation of the results, and the interpretation of the results. Second, we can use a related dataset that differs from the original dataset in the specification of the variables, the regions, or the observation period. In the latter case, it must be determined whether any discrepancies between the original investigation and the subsequent investigation point to something more serious than unimportant differences in the specifications.
In an extensive investigation of stock return predictability, McLean and Pontiff (2016) reanalyzed 79 relevant studies. They compared the returns of strategies based on 97 different predictors from these 79 studies for three different time periods: (i) the original study's sample period; (ii) the period after the original sample but before publication; and (iii) the post-publication period. Not surprisingly, it turned out that returns were highest in the first period and lowest in the third period. The finding that the returns in period (i) were higher than those in period (ii) was interpreted as an indication of possible data snooping, and the finding that the returns in period (ii) were higher than those in period (iii) was interpreted as an indication that academic research may destroy return predictability. However, these findings must be put in perspective. First, the interpretation of the results is not straightforward. Can we really hold an article published in 2006, which analyzed the performance of a certain strategy between 1963 and 2001, responsible for any discrepancies between the periods 1963–2001, 2002–2005, and 2007–2015? It is well known that the characteristics of financial data change over time. For example, the first-order autocorrelation of stock returns was weakly negative during the Great Depression, strongly positive from the 1940s to the 1970s, close to zero in the 1980s and 1990s, and turned negative again in the new millennium. These changes are visualized in Figure 1a for the Dow Jones Industrial Average (DJIA) using a method that is based on ratios of successive returns and is robust both against extreme values and clusters of high volatility (Reschenhofer, 2017a, 2017b, 2019). Clearly, any trading strategy might be affected by nonstationarities of this type and not just by the appearance of some academic article. From this point of view, the periods (i), (ii), and (iii) should be as short as possible.
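To make the idea of tracking such time variation concrete, the following sketch computes a plain rolling-window first-order autocorrelation of daily log returns. It is deliberately simple: it does not reproduce the robust ratio-based estimator of Reschenhofer (2017a, 2017b, 2019) used for Figure 1a, and the file name, column names, and window length are assumptions.

# Rolling first-order autocorrelation of daily log returns (plain estimate).
# NOTE: illustrative sketch only; Reschenhofer (2017a, 2017b, 2019) uses a
# robust estimator based on ratios of successive returns, which is not
# reproduced here. File name and column names are hypothetical.
import numpy as np
import pandas as pd

def rolling_acf1(returns: pd.Series, window: int = 1250) -> pd.Series:
    """Ordinary first-order autocorrelation over a rolling window."""
    return returns.rolling(window).corr(returns.shift(1))

# Hypothetical input: CSV with columns 'Date' and 'Close' for the DJIA.
prices = pd.read_csv("djia_daily.csv", parse_dates=["Date"], index_col="Date")["Close"]
log_returns = np.log(prices).diff().dropna()
acf1 = rolling_acf1(log_returns, window=1250)  # window of roughly 5 trading years
print(acf1.dropna().tail())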
On the other hand, many predictors are only available monthly, quarterly, annually, or even sporadically: for example, book values, earnings, dividends, executives' stock option exercises, analyst recommendations, credit ratings, as well as consumer sentiment, inflation, and other macroeconomic variables. Obviously, any results obtained over a 3-year period would be highly unreliable in the case of annual data. Moreover, even in the case of daily data, a period of a few years might not be sufficient to evaluate the success of a trading strategy. The classic strategy of buying/selling when the current price rises above/falls below the 200-day moving average (MA) is only successful at times when stocks face major corrections but persistently loses money at other times (see Figure 1b). Overall, the profitability of this strategy depends on whether major corrections occur every 5, 10, or 15 years. A fair evaluation based only on a few years will therefore not be possible. Figure 1b shows the performance over almost 90 years. Since the early 1980s, this strategy has clearly been outperformed by the buy-and-hold strategy. Indeed, the difference between the cumulative returns of the MA strategy and the buy-and-hold strategy reached its postwar maximum in 1983 and declined steadily thereafter, which implies that the returns of the buy-and-hold strategy were generally higher in the last decades than those of the MA strategy. Also, the better performance of the MA strategy earlier
FIGURE 1 (a) Changes in the first-order autocorrelation of daily (log) returns on the DJIA. (b) Comparison of the cumulative returns of the S&P 500 (black) and the cumulative returns of a simple trading strategy based on 200-day moving averages (green, no transaction costs; red, 10 basis points)
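To make the trading rule behind Figure 1b concrete, here is a minimal backtest sketch of the 200-day MA rule against buy-and-hold, with an optional proportional cost charged on position changes (10 basis points corresponds to the red line). It assumes a long/flat implementation, next-day execution, and a hypothetical CSV input; the exact implementation used for the figure may differ.

# Minimal sketch: 200-day moving-average rule vs. buy-and-hold.
# Assumptions (not taken from the paper): long/flat positions, signals executed
# at the next day's close, proportional costs, CSV with 'Date' and 'Close' columns.
import numpy as np
import pandas as pd

def ma_strategy_returns(prices: pd.Series, window: int = 200,
                        cost_bp: float = 0.0) -> pd.Series:
    """Daily log returns of a long/flat moving-average strategy."""
    log_ret = np.log(prices).diff()
    ma = prices.rolling(window).mean()
    position = (prices > ma).astype(float).shift(1)  # yesterday's signal: long (1) or flat (0)
    cost = (cost_bp / 1e4) * position.diff().abs()   # cost charged when the position changes
    return (position * log_ret - cost).fillna(0.0)

prices = pd.read_csv("sp500_daily.csv", parse_dates=["Date"], index_col="Date")["Close"]
buy_hold = np.log(prices).diff().fillna(0.0).cumsum()
ma_gross = ma_strategy_returns(prices).cumsum()               # no transaction costs
ma_net = ma_strategy_returns(prices, cost_bp=10.0).cumsum()   # 10 basis points per change
print(pd.DataFrame({"buy_hold": buy_hold, "ma_gross": ma_gross, "ma_net": ma_net}).tail())

Because costs are incurred only when the position flips, the gap between the gross and net curves widens with the number of trades, which is precisely why realistic transaction costs matter for the economic evaluation of such rules.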