An ensemble of LSTM neural networks for high-frequency stock market classification

DOI: 10.1002/for.2585
Received: 21 September 2018 | Accepted: 12 February 2019
Published: 01 September 2019

RESEARCH ARTICLE

Ioannis Tsiamas, Svetlana Borovkova
School of Business and Economics, Vrije
Universiteit Amsterdam, Amsterdam, The
Netherlands
Correspondence
Svetlana Borovkova, School of Business
and Economics, Vrije Universiteit
Amsterdam, De Boelelaan 1105, 1081 HV
Amsterdam, The Netherlands.
Email: s.a.borovkova@vu.nl
Abstract
We propose an ensemble of long short-term memory (LSTM) neural networks
for intraday stock predictions, using a large variety of technical analysis indi-
cators as network inputs. The proposed ensemble operates in an online way,
weighting the individual models proportionally to their recent performance,
which allows us to deal with possible nonstationarities in an innovative way.
The performance of the models is measured by area under the curve of the
receiver operating characteristic. We evaluate the predictive power of our model
on several US large-cap stocks and benchmark it against lasso and ridge logistic
classifiers. The proposed model is found to perform better than the benchmark
models or equally weighted ensembles.
KEYWORDS
deep learning, ensemble models, high-frequency trading, LSTM neural networks
1 INTRODUCTION
The long-lasting debate on predictability of financial mar-
kets has led to volumes of research on this subject, but
no consensus has been reached. With the emergence and
development of efficient machine learning algorithms and
powerful computers, this debate has been reinvigorated in
the last few years.
The original, and probably most important, theory
related to that debate is the efficient market hypothesis
(EMH; Fama, 1970), whose core idea is that all available
information is already incorporated into market prices and
thus the prices reflect assets' true values. In other words,
no individual can profit by making predictions of future
prices, since future information is not yet available. In that
sense, asset prices are not predictable.
Criticisms of the EMH are plentiful. One of the key
assumptions of EMH—the rationality of agents operating
in the markets—is often challenged. Behavioral finance
scholars argue that there are times when even the collec-
tive actions of people (and certainly individual decisions)
are irrational.
In practice, a well-established field of technical
analysis—studying price patterns and inferring future
price developments from these patterns—is a direct
challenge to the EMH. With the increased availability
of high-frequency trade data and the development of
machine learning algorithms that can handle such large
amounts of data, technical analysis is currently undergo-
ing a revival: Daily patterns are replaced by intraday ones,
and algorithms, not humans, now learn price patterns and
make forecasts on the basis of them.
This is also the focus of our paper. We compile a large set
of “features”—that is, technical analysis indicators (on the
basis of intraday trading data)—and feed them into recurrent neural networks. We use so-called deep learning,
where not only contemporary but also previous patterns
and prices are fed into the networks; this is achieved by
using so-called long short-term memory (LSTM) networks. Contrary to most research on this subject, we
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
© 2019 The Authors. Journal of Forecasting published by John Wiley & Sons, Ltd.
wileyonlinelibrary.com/journal/for | Journal of Forecasting. 2019;38:600–619.
train not one but an ensemble of neural networks, and for
forecasting they are weighted in real time, according to the
recent forecasting performance (so the network that pro-
duced best forecasts recently is weighted most heavily).
In this way, we can flexibly deal with possible nonstationarities in the data.
We apply our methodology to 22 large cap US stocks. We
use 1 year of raw trade data, which were cleaned and aggregated into 5-minute intervals, amounting to roughly 19,000
observations per stock. Furthermore, for every stock we
also include information about its primary competitor,
thus establishing a universe of 44 stocks (for example, if
we want to forecast the direction of GM stock, we also use
features of Ford stock). We forecast price direction for 22
stocks, but use price features for all 44. This is done to
maximally utilize the available information and to obtain
robust forecasts.
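The trade-to-bar aggregation described above can be sketched in pandas; the column names, the synthetic data, and the choice of OHLCV aggregates are illustrative assumptions, not the authors' exact pipeline:

```python
import numpy as np
import pandas as pd

# Synthetic raw trades with irregular intraday timestamps (column names
# "price" and "volume" are assumptions, not the authors' actual schema).
rng = np.random.default_rng(0)
seconds = np.sort(rng.integers(0, 23_400, 5_000))  # 6.5 trading hours
idx = pd.Timestamp("2018-01-02 09:30") + pd.to_timedelta(seconds, unit="s")
trades = pd.DataFrame(
    {"price": 100 + rng.standard_normal(5_000).cumsum() * 0.01,
     "volume": rng.integers(1, 500, 5_000)},
    index=idx,
)

# Aggregate irregular trades into regular 5-minute OHLCV bars.
bars = trades["price"].resample("5min").ohlc()
bars["volume"] = trades["volume"].resample("5min").sum()
```

Resampling also fills trade-free intervals with NaN bars, which would need forward-filling or dropping before feature construction.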
We are interested in price direction forecasts, so at every
moment each stock is labeled as “Buy” or “Sell,” accord-
ing to the price direction. By cross-sectional aggregation,
we additionally create eight sector data sets. Our feature
engineering is done by constructing a large number of
technical indicators, on different time frames and on stock
as well as sector level.
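A minimal sketch of the direction labeling and the indicator-style feature construction, assuming hypothetical windows of 6, 12, and 24 five-minute bars (the paper's actual indicator set is far larger):

```python
import numpy as np
import pandas as pd

def make_features(close: pd.Series) -> pd.DataFrame:
    """A toy technical-indicator set on several time frames.
    (Illustrative only; the paper uses a much larger set of indicators.)"""
    feats = pd.DataFrame(index=close.index)
    for w in (6, 12, 24):  # windows in 5-minute bars -- assumed values
        feats[f"ret_{w}"] = close.pct_change(w)                      # momentum
        feats[f"sma_dev_{w}"] = close / close.rolling(w).mean() - 1  # SMA deviation
        feats[f"vol_{w}"] = close.pct_change().rolling(w).std()      # realized volatility
    return feats

def make_labels(close: pd.Series) -> pd.Series:
    """1 ('Buy') if the next bar closes higher, else 0 ('Sell')."""
    return (close.shift(-1) > close).astype(int)

close = pd.Series(100 + np.random.default_rng(1).standard_normal(500).cumsum())
X = make_features(close).dropna()
y = make_labels(close)
```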
The first month of our data set was used solely for feature engineering. For the other 11 months, we operate in a rolling-window fashion: 1 month of data is used for training the networks, 1 week for validating their performance, and predictions are made for the following
week. This amounts to 21 training–validation–testing peri-
ods per stock.
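The rolling scheme could be generated as follows; the bar counts (78 five-minute bars per day, 21 trading days for the training month, 5 days each for validation and test) are assumptions used purely for illustration:

```python
# Rolling train/validation/test windows over a sequence of bars.
# Sizes are assumptions: ~78 five-minute bars per day, 21 trading days of
# training, 5 days each of validation and test.
MONTH, WEEK = 21 * 78, 5 * 78

def rolling_splits(n_obs, train=MONTH, val=WEEK, test=WEEK):
    """Yield (train, val, test) index ranges, stepping forward one test week."""
    start = 0
    while start + train + val + test <= n_obs:
        yield (range(start, start + train),
               range(start + train, start + train + val),
               range(start + train + val, start + train + val + test))
        start += test

# Roughly 19,000 bars per stock, as in the data set described above.
splits = list(rolling_splits(19_000))
```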
For every period and for each stock, we trained 12
stacked LSTM networks. The predictions of each model
were evaluated by the area under the curve (AUC) score
of the receiver operating characteristic (ROC; Kent, 1989),
which we explain below. The ensemble predictions for
each testing period were obtained by the weighted com-
bination of the 12 trained models. The weights assigned
to each model were proportional to their AUC score on
the past week of predictions. Finally, the overall perfor-
mance of our predictive framework is measured by the
average AUC score of the ensembles for all 21 testing
periods.
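The AUC-proportional weighting can be sketched as below; normalizing the raw AUC scores to sum to one is our assumption, as the exact normalization is not spelled out here:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def ensemble_predict(y_val, val_preds, test_preds):
    """Combine model predictions, weighting each model by its AUC on the
    most recent validation week (proportional weighting; the exact
    normalization is our assumption)."""
    aucs = np.array([roc_auc_score(y_val, p) for p in val_preds])
    weights = aucs / aucs.sum()  # normalize so the weights sum to 1
    return np.average(np.vstack(test_preds), axis=0, weights=weights)

# Synthetic stand-ins for three trained models' scores (not real model output).
rng = np.random.default_rng(2)
y_val = rng.integers(0, 2, 200)
val_preds = [y_val + rng.normal(0, s, 200) for s in (0.5, 1.0, 2.0)]  # noisier = worse
test_preds = [rng.random(100) for _ in range(3)]
p = ensemble_predict(y_val, val_preds, test_preds)
```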
All data processing is done in Python 3, using the pack-
ages NumPy and pandas. Training LSTM networks is done
in TensorFlow, while the lasso and ridge logistic regres-
sion models were trained using the scikit-learn package.
No special hardware was employed: we used a PC with a
two-core 2.3 GHz CPU and 8 GB RAM.
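As an illustration of the tooling, a stacked LSTM classifier of the general kind described can be defined in TensorFlow/Keras in a few lines; the layer sizes, sequence length, and feature count here are arbitrary assumptions, not the paper's configuration:

```python
import tensorflow as tf

# Arbitrary shapes for illustration: sequences of 24 five-minute bars with
# 40 features each; two stacked LSTM layers feeding a sigmoid output.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(24, 40)),
    tf.keras.layers.LSTM(64, return_sequences=True),  # pass sequences onward
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1, activation="sigmoid"),   # P("Buy")
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC()])
```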
The rest of the paper is structured as follows. The
next section briefly describes related works on machine
learning methods for predictive modeling in financial
markets. Section 3 describes data collection and fea-
ture engineering. Section 4 describes the methods used
and Section 5 presents and discusses the results. The
last section is dedicated to conclusions and future
research.
2 LITERATURE
Most of the research on machine learning and deep learn-
ing applications for financial time series predictions is
quite recent. Some early works include that of Baes-
taens, Van Den Bergh, and Vaudrey (1995) and Refenes,
Zapranis, and Francis (1994), who used simple artifi-
cial neural network (ANN) architectures and compared
their performance to logistic regression (LR) models.
Huang, Nakamori, and Wang (2005) found that sup-
port vector machines (SVMs) can achieve better results
than traditional statistical methods, while Pai and Lin
(2005) proposed a hybrid autoregressive integrated moving
average–SVM method for price forecasting.
Later contributions include Hegazy, Soliman, and
Salam (2013), who used particle swarm optimization[1]
to fine-tune the hyperparameters of an SVM regres-
sor, achieving significantly smaller mean squared
errors (MSEs) on several US stocks. Nelson, Pereira,
and de Oliveira (2017) trained LSTM networks on
15-minute-interval observations, for several BOVESPA
(São Paulo Stock Exchange) stocks, and reported accuracy
metrics of 53–55% for next-period price direction forecasts. Fischer and Krauss (2017) conducted a large-scale
research project using daily S&P 500 data from 1992 to
2015. They used LSTMs, random forests, deep networks,
and logistic regression, and found that a trading strategy
based on the predictions of LSTM was the most profitable.
Qu and Zhang (2016), assuming that high-frequency
returns periodically trigger momentum and reversal,
designed a new SVM kernel method[2] for forecasting high-frequency market directions and applied their
method to the Chinese CSI 300 index. Their results were
significantly better compared to the radial basis function
(RBF)[3] kernel and the sigmoid kernel.
One of the most remarkable contributions to deep learn-
ing for stock price prediction is that by Bao, Yue, and Rao
(2017). They proposed a predictive framework composed
of three parts. First, they apply a wavelet transformation
(WT)[4] to the financial data set (prices, technical indicators,
[1] A nature-inspired algorithm that uses a collection of candidate solutions, together with their positions and velocities, to find optimal solutions (Kennedy & Eberhart, 1995).
[2] Kernel functions enable higher dimensional operations without operating in the actual higher dimensional space.
[3] The RBF kernel maps inputs to a higher dimensional space using the Euclidean distance.
[4] A signal-processing technique that transforms a series from the time domain to the frequency domain.
