DOI: 10.1111/fire.12218
ORIGINAL ARTICLE
Trade signing in fast markets
Allen Carrion1    Madhuparna Kolay2
1Department of Finance, Insurance, and Real Estate, University of Memphis, Memphis, Tennessee
2Pamplin School of Business, University of Portland, Portland, Oregon
Correspondence
Allen Carrion, Fogelman College Administration Building, Room 402, University of Memphis, Memphis, TN 38152.
Email: a.carrion@memphis.edu
Abstract
This study assesses the accuracy of trade signing algorithms in fast trading environments using NASDAQ and NYSE trade and quote data. Using data that contain true trade signs, we show that the Lee and Ready algorithm outperforms the tick rule and classifies trades at least as well as in earlier studies from slower trading environments, even in subsamples where the market is particularly fast. We conclude that trade signing remains viable in fast markets, and that the use of quote data continues to increase trade classification accuracy.
KEYWORDS
high-frequency trading, Lee and Ready algorithm, market microstructure, trade classification
JEL CLASSIFICATIONS
C18, G10
1 INTRODUCTION
In many research applications, it is important to distinguish whether stock trades are initiated by the buyer or the seller. Recent studies use this determination to examine the trading strategies of high-frequency traders (HFTs) (Benos, Brugler, Hjalmarsson, & Zikes, 2017), order flows around the 2010 Flash Crash (McInish, Upson, & Wood, 2014), the impact of make–take fees on market quality and trading behavior (Battalio, Corwin, & Jennings, 2016), anomaly trading costs (Novy-Marx & Velikov, 2016), the asset pricing implications of the noninformational components of order flow or inventory effects (Chung & Huh, 2016; Kang & Lee, 2014), and resiliency, defined as the time required for temporary price impacts to decay (Bessembinder, Carrion, Tuttle, & Venkataraman, 2016). Microstructure data do not often indicate which side initiates trades, so this must be inferred by the researcher with a trade signing algorithm. When both trade and quote data are available, the predominant approach in the literature has been the Lee and Ready algorithm (Lee & Ready, 1991), which we describe in more detail below.1
1 As of June 1, 2019, Google Scholar lists 3,156 papers citing Lee and Ready (1991).
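For readers unfamiliar with it, the core logic of the LR algorithm and of the tick test can be sketched in a few lines. The Python below is only an illustrative sketch, not the implementation used in this study; it assumes each trade has already been matched to its prevailing bid and ask quote.

```python
def tick_rule(prices):
    """Tick test: +1 (buy) on an uptick, -1 (sell) on a downtick,
    the previous sign on a zero tick, and 0 until a price change occurs."""
    signs, prev, last_sign = [], None, 0
    for p in prices:
        if prev is None or p == prev:
            sign = last_sign                  # zero tick: inherit the last sign
        else:
            sign = 1 if p > prev else -1      # uptick / downtick
        signs.append(sign)
        prev, last_sign = p, sign
    return signs

def lee_ready(prices, bids, asks):
    """Quote rule with a tick-test fallback at the quote midpoint.
    `bids`/`asks` are assumed to be the prevailing quote for each trade."""
    fallback = tick_rule(prices)
    signs = []
    for p, b, a, t in zip(prices, bids, asks, fallback):
        mid = (b + a) / 2.0
        if p > mid:
            signs.append(1)                   # above the midpoint: buyer-initiated
        elif p < mid:
            signs.append(-1)                  # below the midpoint: seller-initiated
        else:
            signs.append(t)                   # at the midpoint: use the tick test
    return signs
```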
It is well-documented that speeds have increased dramatically in modern equity markets.2 This has led some to question the continued reliability of the Lee and Ready algorithm (LR algorithm hereafter). According to Easley, López de Prado, and O'Hara (2012, p. 1466; ELO [2012] hereafter):
In a high-frequency setting, trade classification is much more difficult: order splitting is the norm, cancellations of quotes and orders are rampant, and the sheer volume of trades is overwhelming. Because the BBO changes many times between trades, many contracts exchanged at the same price in fact occurred against the bid and the offer. In this high-frequency world, applying standard algorithms over individual transactions is problematic.
In short, ELO (2012) question both the continued use of quotes to sign trades and the practice of individual trade signing altogether. They propose a new trade classification technique (bulk volume classification [BVC]) that operates on aggregated "bars" instead of individual trades.3 While ELO (2012) specifically study futures markets, the concerns they raise apply to the equity markets as well, and have been met with varied responses from researchers. Many recent studies simply ignore these criticisms and proceed to sign trades using conventional techniques. Some studies continue to sign trades using the LR algorithm but exclude data from fast markets in robustness tests (Chung & Huh, 2016) or choose their sample periods to avoid fast market issues entirely (Abad & Pascual, 2015). Other papers test, use, or extend techniques suggested by ELO (2012) and the companion papers, at least implicitly accepting the possibility that the LR algorithm and other individual trade signing algorithms should be replaced (Chakrabarty et al., 2015; Panayides, Shohfi, & Smith, 2019; Poppe, Moos, & Schiereck, 2016).
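To make the contrast with trade-by-trade signing concrete, BVC assigns each bar's volume probabilistically to buyers and sellers based on the bar's standardized price change. The sketch below illustrates that idea under a normal-CDF assumption (ELO also consider a Student-t distribution); the bar construction and variable names are hypothetical and are not the implementation of ELO (2012) or of this study.

```python
import numpy as np
from scipy.stats import norm

def bulk_volume_classify(bar_prices, bar_volumes):
    """Split each bar's volume into buy/sell components from the
    standardized close-to-close price change (BVC-style sketch)."""
    prices = np.asarray(bar_prices, dtype=float)
    vols = np.asarray(bar_volumes, dtype=float)[1:]   # first bar has no price change
    dp = np.diff(prices)                              # bar-to-bar price changes
    sigma = dp.std(ddof=1)                            # volatility of price changes
    buy_frac = norm.cdf(dp / sigma)                   # fraction attributed to buyers
    buy_vol = buy_frac * vols
    return buy_vol, vols - buy_vol                    # buy volume, sell volume per bar
```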
In this study, we assess the accuracy of the LR algorithm and the tick test in a recent sample of NASDAQ trades and quotes that identifies true trade signs. This data set is particularly well-suited for this analysis because it also identifies HFT participation, which is a potential source of trade classification inaccuracy, and also contains millisecond (ms) timestamps. We investigate the following research questions: Does the LR algorithm still accurately sign individual trades? Does it outperform the simple tick rule, which can be implemented without quotation data? Is millisecond data necessary for the algorithm to perform well? In cases where only second-level resolution data are available, does the interpolation method proposed by Holden and Jacobsen (2014) offer an improvement over simply matching trades with the last quote in the prior second?
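The last question concerns only how trades are matched to quotes when timestamps are truncated to the second. As a point of reference, the simpler benchmark of matching each trade to the last quote of the prior second might look like the sketch below; the pandas column names are illustrative assumptions, and the Holden and Jacobsen (2014) procedure additionally interpolates within-second timestamps before matching.

```python
import pandas as pd

def match_prior_second_quote(trades: pd.DataFrame, quotes: pd.DataFrame) -> pd.DataFrame:
    """Attach to each trade the last quote time-stamped in the prior second.
    Both frames are assumed to carry an integer 'sec' timestamp column;
    quotes also carry 'bid' and 'ask'. Column names are illustrative."""
    last_quotes = (quotes.sort_values('sec')
                         .groupby('sec', as_index=False)
                         .last())                      # last quote within each second
    out = trades.copy()
    out['match_sec'] = out['sec'] - 1                  # look back one full second
    return out.merge(last_quotes, left_on='match_sec', right_on='sec',
                     how='left', suffixes=('', '_q'))
```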
Our main results are as follows. In the NASDAQ HFT database, the LR algorithm outperforms the tick rule when 1-s timestamps are used, correctly classifying 86.88% of the trades in our sample compared to 78.62% for the tick rule. When millisecond data are employed, the accuracy improves to 93.57%, which is considerably better than the accuracy rate reported in earlier studies conducted in slower markets. Consistent with the conjectures in ELO (2012), the 1-s version tends to be less accurate in fast market conditions. But, even for these trades its performance remains comparable with that reported in prior studies and superior to that of the tick test. We find that the Holden and Jacobsen (2014) interpolation procedure underperforms the basic LR algorithm in 1-s data and does not consistently outperform the tick rule.
We also conduct robustness tests using the NYSE trade and quote (TAQ) data set, which is widely used by researchers. The NYSE TAQ data set may be sequenced less accurately than the NASDAQ HFT data set because it combines trades and quotes from multiple trading venues. Although the TAQ data do not contain true trade signs, we employ a matching procedure to utilize the true trade signs provided by NASDAQ to conduct this analysis. We find that the advantage of millisecond timestamps disappears in the TAQ data. LR classification accuracy is actually slightly higher using TAQ data with 1-s timestamps than with millisecond timestamps. Both procedures classify trades with a high level of accuracy (85.65% and 84.84%, respectively), but neither approaches the extremely high accuracy rate
2 Hasbrouck and Saar (2013) report that HFTs can cancel or submit orders in reaction to market events within 2–3 ms, and Carrion (2013) reports that HFTs participate in 68.3% of the dollar trading volume. Also see Angel, Harris, and Spatt (2011), O'Hara (2015), and Hasbrouck (2018), among many others.
3 BVC is more fully developed and validated in Easley, López de Prado, and O'Hara (2016). BVC is also tested in Chakrabarty, Pascual, and Shkilko (2015) and Andersen and Bondarenko (2014, 2015).
