Performance assessment of ensemble learning systems in financial data classification

Published date: 01 January 2020
Authors: Salim Lahmiri, Stelios Bekiros, Anastasia Giakoumelou, Frank Bezzina
DOI: http://doi.org/10.1002/isaf.1460
RESEARCH ARTICLE
Salim Lahmiri¹ | Stelios Bekiros²,³ | Anastasia Giakoumelou⁴ | Frank Bezzina⁵
¹ Department of Supply Chain & Business Technology Management, John Molson School of Business, Concordia University, Montreal, Canada
² Department of Economics, European University Institute, Florence, Italy
³ Rimini Centre for Economic Analysis, Wilfrid Laurier University, Waterloo, Canada
⁴ Department of Law and Management, School of Economics, University of Rome Tor Vergata, Rome, Italy
⁵ Faculty of Economics, Management, and Accountancy, University of Malta, Msida, Malta
Correspondence
Salim Lahmiri, Department of Supply Chain & Business Technology Management,
John Molson School of Business, Concordia University, Montreal, Canada.
Email: salim.lahmiri@concordia.ca
Summary
Financial data classification plays an important role in the investment and banking
industries, where it is used to control default risk, improve cash flow, and select the
best customers. Ensemble learning and classification systems, in which outputs from
different classification systems are combined, are increasingly applied to classify
financial data. The objective of this research is to assess the relative performance of
existing state-of-the-art ensemble learning and classification systems with
applications to corporate bankruptcy prediction and credit scoring. The considered
ensemble systems include AdaBoost, LogitBoost, RUSBoost, subspace, and bagging
ensemble systems. The experiments use three datasets: one composed of
quantitative attributes, one of qualitative data, and one combining
both quantitative and qualitative attributes. By using the ten-fold cross-validation
method, the experimental results show that AdaBoost is effective in terms of low
classification error, limited complexity, and short data processing time. In
addition, the experimental results show that ensemble classification systems outperform
existing models that were recently validated on the same databases. Therefore,
ensemble classification systems can be employed to increase the reliability and
consistency of financial data classification tasks.
KEYWORDS
bankruptcy prediction, credit scoring, ensemble classifiers, ensemble learning, financial data
classification
1 | INTRODUCTION
An ensemble learning and classification system is composed of multiple
classifiers or sub-systems. The goal is to improve accuracy (reduce
prediction error) by combining responses produced by these multiple
classifiers into a single response or output. In this framework, since
the final output is computed by combining outputs from different
classifiers, the final ensemble system decision generally is wrong only
when most of the sub-systems make the same error.
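
The combination step described above can be illustrated with simple majority voting. The following is a minimal sketch, not the combination rule of any specific system assessed in this article: the function names and the toy predictions are hypothetical, and the labels (1 = default, 0 = no default) are assumed for illustration only. Each of the three base classifiers makes one error, on a different sample, so the combined decision is correct on every sample.

```python
from collections import Counter


def majority_vote(votes):
    """Return the label chosen by the most base classifiers for one sample."""
    # most_common(1) yields the (label, count) pair with the highest count.
    return Counter(votes).most_common(1)[0][0]


def ensemble_predict(per_classifier_outputs):
    """Combine one prediction list per classifier into one ensemble prediction list."""
    # zip(*...) regroups the outputs so that each tuple holds all votes for one sample.
    return [majority_vote(votes) for votes in zip(*per_classifier_outputs)]


# Hypothetical outputs of three base classifiers on four samples.
true_labels = [1, 0, 1, 0]
clf_a = [1, 0, 1, 1]  # wrong on the fourth sample
clf_b = [1, 1, 1, 0]  # wrong on the second sample
clf_c = [0, 0, 1, 0]  # wrong on the first sample

combined = ensemble_predict([clf_a, clf_b, clf_c])
print(combined)  # [1, 0, 1, 0]
print(combined == true_labels)  # True
```

Because the individual errors fall on different samples, each one is outvoted. The ensemble would be wrong on a sample only if at least two of the three classifiers erred on it.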
In general, an ensemble learning and classification system is
homogeneous when it is composed of classifiers trained with the
same learning algorithm, whereas it is heterogeneous when it is
composed of classifiers trained with different learning algorithms.
For instance, bagging and boosting algorithms are commonly used
to build homogeneous ensembles, in which base classifiers are built from
various samples or parameters. In contrast, the stacking algorithm
is generally employed to generate heterogeneous ensemble systems.
Because of their good performance, ensemble learning and
classification systems have been successfully employed in various
applications, including customer churn prediction (Xiao, Jiang, He, &
Teng, 2016), person authentication (Pisani, Lorena, & de Carvalho,
2018), visual categorization (Zhu, Shao, & Fang, 2016), microblog
summarization (Dutta et al., 2018), sentiment classification (Xia,
Zong, Hu, & Cambria, 2013), energy price time series prediction
(Lahmiri, 2017a), and stock market forecasting (Lahmiri, 2018;
Lahmiri & Boukadoum, 2015).
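
To make the homogeneous case concrete, the sketch below builds a small bagging ensemble in which every base classifier is trained with the same learning algorithm on a different bootstrap sample. It is an illustrative assumption, not the implementation used in this study: the one-feature threshold classifier ("stump"), the function names, and the toy data are all hypothetical. A heterogeneous ensemble would instead train its members with different learning algorithms.

```python
import random
from collections import Counter


def train_stump(samples):
    """Fit a one-feature threshold classifier that predicts 1 when x >= threshold."""
    best_threshold, best_errors = None, None
    for threshold, _ in samples:
        # Count training samples that this candidate threshold misclassifies.
        errors = sum((x >= threshold) != bool(y) for x, y in samples)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def bagging_fit(samples, n_estimators, rng):
    """Train the same learner on bootstrap resamples (a homogeneous ensemble)."""
    stumps = []
    for _ in range(n_estimators):
        # Bootstrap: draw len(samples) items with replacement.
        bootstrap = [rng.choice(samples) for _ in samples]
        stumps.append(train_stump(bootstrap))
    return stumps


def bagging_predict(stumps, x):
    """Combine the stump decisions for one input by majority vote."""
    votes = Counter(int(x >= threshold) for threshold in stumps)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (feature value, label), with 1 = default and 0 = no default.
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0),
        (0.6, 1), (0.7, 1), (0.8, 1), (0.9, 1)]

rng = random.Random(0)  # fixed seed so the resampling is repeatable
ensemble = bagging_fit(data, n_estimators=11, rng=rng)

# Every learned threshold is one of the training values, so 0.05 falls below
# all of them and 0.95 falls above all of them.
print(bagging_predict(ensemble, 0.05))  # 0
print(bagging_predict(ensemble, 0.95))  # 1
```

An odd number of base classifiers is used so that a binary majority vote cannot end in a tie.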
Received: 10 May 2019 Revised: 18 October 2019 Accepted: 18 October 2019
Intell Sys Acc Fin Mgmt. 2020;27:3-9. wileyonlinelibrary.com/journal/isaf © 2020 John Wiley & Sons, Ltd.
