Estimating the Out-of-Sample Predictive Ability of Trading Rules: A Robust Bootstrap Approach

Journal of Forecasting, J. Forecast. 35, 347–372 (2016)
Publication date: 01 July 2016
Published online 25 November 2015 in Wiley Online Library (wileyonlinelibrary.com)
DOI: 10.1002/for.2380
JULIEN HAMBUCKERS¹,² AND CÉDRIC HEUCHENNE¹,³
¹ University of Liège, UER Operations, Liège, Belgium
² Belgian National Fund for Scientific Research (F.R.S.-F.N.R.S.), Belgium
³ Université Catholique de Louvain (Louvain-la-Neuve), Institute of Statistics, Belgium
Correspondence to: Julien Hambuckers, University of Liège, UER Operations, Liège, Belgium. E-mail: jhambuckers@ulg.ac.be
ABSTRACT
In this paper, we provide a novel way to estimate the out-of-sample predictive ability of a trading rule. Usually, this
ability is estimated using a sample-splitting scheme, true out-of-sample data being rarely available. We argue that this
method makes poor use of the available data and creates data-mining possibilities. Instead, we introduce an alternative
.632 bootstrap approach. This method enables building in-sample and out-of-sample bootstrap datasets that do not
overlap but exhibit the same time dependencies. We show in a simulation study that this technique drastically reduces
the mean squared error of the estimated predictive ability. We illustrate our methodology on IBM, MSFT and DJIA
stock prices, where we compare 11 trading rule specifications. For the considered datasets, two different filter rule
specifications have the highest out-of-sample mean excess returns. However, none of the tested rules can beat a simple
buy-and-hold strategy when trading at a daily frequency. Copyright © 2015 John Wiley & Sons, Ltd.
KEY WORDS trading rules; bootstrap; bootstrap .632; out-of-sample; predictive ability; parameter uncertainty
INTRODUCTION
Whether technical trading rules can consistently generate profits is a question that has been investigated by
researchers for a variety of reasons. In particular, the answer can be used to study whether a market is efficient. Indeed,
profitable technical trading rules would be in opposition to the market efficiency hypothesis, which states that all avail-
able information must be contained in the price of a security (Bajgrowicz and Scaillet, 2012). Besides, the profitability
of trading rules may also be used to detect a time-varying risk premium (Kho, 1996). Moreover, as a high propor-
tion of practitioners rely on technical trading rules to trade, it may be interesting to know whether these investment
strategies have any economic value (see Park and Irwin, 2007, for a review).
However, answering this question poses several econometric challenges. Until now, the literature has focused
mainly on multiple testing (also called data-mining) issues (see Lo and MacKinlay, 1990; White, 2000; Sullivan
et al., 1999; Hansen, 2005; Romano and Wolf, 2005; and Hsu et al., 2010, for more discussion of this question),
while neglecting the issue of computing an adequate estimator of the out-of-sample predictive ability of a trading
rule. By out-of-sample predictive ability, we mean the ability of a rule, with its parameters determined ex ante on
in-sample data, to generate buy-and-sell signals that correctly predict the future ups and downs of an asset price.
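To fix ideas, the sketch below shows one way such buy-and-sell signals can be generated; the moving-average crossover rule and its window lengths are purely illustrative choices on our part, not the specifications studied in this paper.

```python
import numpy as np

def ma_crossover_signals(prices, short=5, long=20):
    """Illustrative moving-average crossover rule: signal +1 (buy) when
    the short-window mean of past prices exceeds the long-window mean,
    -1 (sell) otherwise. The window lengths are hypothetical parameters
    that would be chosen ex ante on in-sample data."""
    prices = np.asarray(prices, dtype=float)
    signals = np.zeros(len(prices), dtype=int)
    for t in range(long, len(prices)):
        short_ma = prices[t - short:t].mean()  # uses only past data
        long_ma = prices[t - long:t].mean()
        signals[t] = 1 if short_ma > long_ma else -1
    return signals
```

Out-of-sample predictive ability then measures how often such signals, with the windows fixed in advance, anticipate price moves in data not used to choose them.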
Recently, Fang et al. (2014) stressed the need for fresh out-of-sample data, which are considered to offer the strongest
safeguard against possible statistical bias. Kuang et al. (2014) and Sullivan et al. (1999) point out that an out-of-
sample analysis is often an effective way of detecting a data-snooping bias. In practice, however, fresh data are not
always available and a true out-of-sample analysis is rarely possible. Therefore, researchers use various sample-
splitting approaches. Kuang et al. (2014) and Bajgrowicz and Scaillet (2012) study the out-of-sample profits obtained
with strategies based on ex ante selected trading rule parametrizations. Kuang et al. (2014) select all rules apparently
profitable on a subsample and then compute the out-of-sample performance on a second subsample. In Bajgrowicz
and Scaillet (2012) the authors build monthly portfolios of rules using data of a single month and compute the out-
of-sample performance over the following month. Bajgrowicz and Scaillet (2012) call this task persistence analysis.
Both studies conclude that, despite the superior in-sample performance of some rules (without taking transaction
costs into account), these results cannot be reproduced on out-of-sample data. According to Park and Irwin (2007), this
is also the methodology followed by Lukac et al. (1988). Allen and Karjalainen (1999) compute the out-of-sample
performance of trading rules obtained via genetic algorithms in a similar way (4 years of data to build the rules, 2
years to select the best ones and the remaining data to assess the performance).
There are several drawbacks to this splitting technique. First, the results depend heavily on the cut-off point
and are very volatile, because the estimation is based on a single realization of a stochastic process. Second, it
makes suboptimal use of the available data: by setting aside some data for the purpose of validation, we do not use all the
information at hand to select ex ante the best parametrizations of the rules (as an investor would likely do in real
life), and therefore we usually decrease the quality of this selection. Note also that some authors (Bajgrowicz and
Scaillet, 2012; Kuang et al., 2014) do not assume parameter uncertainty and consider all parameters as a priori fixed:
they assume that each parametrization is a rule in itself, and their selection of the best rules (or of the profitable ones)
is made across all rule specifications. Oddly, to the best of our knowledge and despite the apparent weaknesses of
this technique, no study in the field of technical trading rules has addressed these issues. This
is surprising, as one could expect sophisticated investors to search for trading rules that correctly forecast the future, not
the past.
In this work, we introduce a methodology that avoids these weaknesses. We propose a way to improve the basic
sample splitting technique by adapting advanced cross-validation and bootstrap techniques to the time series context.
Our goal here is to obtain a better measure of the out-of-sample predictive ability of a trading rule. This idea is
briefly suggested in White (2000), where the author notes that 'cross-validation represents a more sophisticated
use of hold-out data. It is plausible that our methods may support testing that the best cross-validated model is no
better than a benchmark'. Cross-validation techniques are quite common in neural network modeling and
classification problems, but they have attracted little attention from researchers in the area of technical trading. Moreover,
few adaptations to the time series context currently exist.
Among the best-known techniques in the regression context, one method is to split the sample not into two parts
but into k parts (Efron and Tibshirani, 1993). We then estimate the parameters on k − 1 parts and compute the
out-of-sample prediction error (or the value of the score function) on the remaining part. This operation is performed for
each of the k parts, before averaging to obtain the final estimator of the prediction error. This technique is called
k-fold cross-validation. The roll-over month-by-month approach followed by Bajgrowicz and Scaillet (2012) and
Taylor (2014) can be linked to this idea. Now, if k is equal to the size of the sample, we perform a leave-one-out
cross-validation.
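As a minimal sketch of this idea (a generic k-fold estimator, not the paper's procedure; `fit` and `score` are hypothetical user-supplied functions, and the data are assumed to sit in a NumPy array):

```python
import numpy as np

def kfold_prediction_error(data, fit, score, k=5):
    """Generic k-fold cross-validation: estimate parameters on k - 1
    folds, evaluate the score function on the held-out fold, and
    average over the k folds. Note that this ignores the time ordering
    of the data, which is exactly the limitation discussed in the text."""
    folds = np.array_split(np.arange(len(data)), k)
    errors = []
    for i in range(k):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        params = fit(data[train_idx])                  # estimate on k - 1 parts
        errors.append(score(data[folds[i]], params))   # validate on the held-out part
    return float(np.mean(errors))
```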
A second method relies on the bootstrap (Efron and Tibshirani, 1993). With a resampling procedure,
we build a statistical world replicating the properties of the true world, in which we are able to estimate B sets of
parameters. Then, the initial sample is used as a validation (out-of-sample) set. It is as if we had at hand multiple
realizations of the same stochastic process for estimation purposes and the opportunity to validate these results on
the whole population. The disadvantage of this method is the large overlap between training sets and validation sets,
which causes a bias. To solve this issue, the .632 and .632+ bootstrap techniques (Efron, 1983; Efron and Tibshirani,
1993, 1997) can be used to compute estimators based on validation sets that do not overlap with the training sets.
Overall, these techniques must be seen as generalizations of the sample splitting technique.
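For intuition, here is a minimal sketch of the classical .632 estimator in the i.i.d. setting of Efron (1983), before any time series adaptation; `fit` and `score` are again hypothetical placeholders:

```python
import numpy as np

def bootstrap_632_error(data, fit, score, B=200, seed=0):
    """Classical .632 bootstrap (Efron, 1983) for i.i.d. data:
    err_632 = 0.368 * apparent error + 0.632 * out-of-bag error,
    where the out-of-bag error is computed on the observations that
    were never drawn into a given resample."""
    rng = np.random.default_rng(seed)
    n = len(data)
    apparent = score(data, fit(data))          # fit and evaluate on the same data
    oob_errors = []
    for _ in range(B):
        idx = rng.integers(0, n, size=n)       # resample with replacement
        oob = np.setdiff1d(np.arange(n), idx)  # roughly 36.8% of observations
        if oob.size:
            oob_errors.append(score(data[oob], fit(data[idx])))
    return 0.368 * apparent + 0.632 * float(np.mean(oob_errors))
```

The out-of-bag part is what removes the overlap between training and validation sets; the fixed weights correct the remaining pessimism of the out-of-bag error alone.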
In the next section we present an adaptation to the time series context of the .632 bootstrap technique. Our proce-
dure is based on the idea that, to correctly assess the predictive ability of a trading rule, we need a large number of
the possible outcomes of the underlying stochastic process. Using the .632 resampling technique, we are able to build
a large number of both training sets (in-sample data) and validation sets (out-of-sample data). Also, we can
simultaneously ensure that these datasets do not overlap while keeping intact the intrinsic time dependencies of the original data.
In the bootstrap world, we fit the rules on the training sets and then use the bootstrap validation sets to compute an
estimator of the out-of-sample predictive ability of the rules. The bootstrap training samples are generated as in a
regular bootstrap procedure for time series data: by blocks, or using a nonparametric resampling of the residuals, which
are then used to recursively rebuild new data with the estimated model. The bootstrap validation samples, for their part, are
drawn using the residuals (or blocks of data) not used in the training samples. Indeed, an elementary calculation tells
us that, if we draw a sample of size n with replacement from an initial set of n observations, a single observation
has roughly a probability ((n − 1)/n)^n of not being selected in the resample. Since (1 − 1/n)^n converges to
e^{−1} ≈ 0.368 as n grows, on average around one third of the data stays unused in each resample. We use these unselected observations to create bootstrap
validation samples that do not overlap with the training samples. These validation samples can be used to assess the
out-of-sample performance, avoiding an overfitting bias. When a good stationary time series model can be found,
the residual-based bootstrap is preferred to the block bootstrap of Politis and Romano (1994). Indeed, the residual-based
bootstrap has been shown to produce very good results when the hypotheses of the underlying model are met.
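A minimal sketch of this residual-based construction, assuming for illustration an AR(1) model for returns (the model choice and function names are ours, not the paper's exact specification):

```python
import numpy as np

def ar1_residual_bootstrap(returns, B=200, seed=1):
    """Residual-based bootstrap for time series, sketched with an AR(1)
    model. For each replication we resample residuals with replacement,
    rebuild a training series recursively with the estimated model, and
    keep the *unused* residuals to form a non-overlapping validation
    set, in the spirit of the .632-type approach described in the text."""
    rng = np.random.default_rng(seed)
    r = np.asarray(returns, dtype=float)
    n = len(r) - 1
    # Fit AR(1) by least squares: r_t = a + b * r_{t-1} + e_t
    X = np.column_stack([np.ones(n), r[:-1]])
    a, b = np.linalg.lstsq(X, r[1:], rcond=None)[0]
    resid = r[1:] - X @ np.array([a, b])
    samples = []
    for _ in range(B):
        idx = rng.integers(0, n, size=n)       # resampled residual indices
        oob = np.setdiff1d(np.arange(n), idx)  # residuals never drawn
        train = np.empty(n + 1)
        train[0] = r[0]
        for t in range(1, n + 1):              # rebuild the series recursively
            train[t] = a + b * train[t - 1] + resid[idx[t - 1]]
        samples.append((train, resid[oob]))    # training series + OOB residuals
    return samples
```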
Also, it is interesting to note that our approach could easily be extended to all data frequencies (especially very high
frequencies). However, our approach differs slightly from the traditional one. Here, we estimate the out-of-sample
predictive ability of a trading rule with its best parametrization determined ex ante on in-sample data. In Brock et al.
(1992), Allen and Karjalainen (1999), Bajgrowicz and Scaillet (2012) and Kuang et al. (2014), the authors are
interested in computing the out-of-sample performance of the parameter–trading rule combinations that perform best
in-sample. In other words, our perspective takes into account the parameter uncertainty around a trading rule, whereas
the traditional approach assumes no parameter uncertainty. As an example, imagine that we consider two rules (let
us say cross-over moving average and support and resistance rules), each with 10 possible parametrizations. We have
20 parameter–trading rule combinations. Whereas Brock et al. (1992), Sullivan et al. (1999) and Bajgrowicz and
Scaillet (2012) try to find the combination(s) that generate the highest mean excess return (or those that generate
a profit), we aim at finding, among the two rules, the one that can generate the highest out-of-sample mean excess
return (see the section 'Resampling procedure' below for more comments regarding this perspective).
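To make the distinction concrete, the following sketch selects, for one rule, the best parametrization on in-sample data and then evaluates that rule out-of-sample; the array layout and helper names are hypothetical:

```python
import numpy as np

def best_parametrization(rule_returns_insample):
    """Given an (n_params, T) array of in-sample excess returns, one row
    per candidate parametrization of a single rule, pick the
    parametrization with the highest in-sample mean excess return
    (the ex ante selection step)."""
    return int(np.argmax(rule_returns_insample.mean(axis=1)))

def rule_oos_performance(rule_returns_insample, rule_returns_oos):
    """Out-of-sample mean excess return of a *rule*: its parametrization
    is chosen ex ante in-sample, then evaluated on out-of-sample data.
    This treats the parameter choice as uncertain, unlike the
    traditional approach, which scores every fixed combination."""
    k = best_parametrization(rule_returns_insample)
    return rule_returns_oos[k].mean()
```

In the traditional perspective, by contrast, one would report the out-of-sample mean of every one of the 20 fixed combinations separately.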