Features selection, data mining and financial risk classification: a comparative study
Author | Salim Lahmiri |
Published date | 01 October 2016 |
DOI | http://doi.org/10.1002/isaf.1395 |
RESEARCH ARTICLE
Features selection, data mining and financial risk classification: a
comparative study
Salim Lahmiri
ESCA School of Management, Casablanca,
Morocco
Correspondence
Salim Lahmiri, ESCA School of Management,
Casablanca, Morocco.
Email: slahmiri@esca.ma
Summary
The aim of this paper is to compare, in terms of accuracy, sensitivity and specificity statistics,
several predictive models that combine features selection techniques with data mining classifiers
in the context of credit risk assessment. The t‐statistic, Bhattacharyya statistic, the area
between the receiver operating characteristic, Wilcoxon statistic, relative entropy, and genetic
algorithms were used for the features selection task. The selected features were used to train
the support vector machine (SVM) classifier, backpropagation neural network, radial basis function
neural network, linear discriminant analysis and naive Bayes classifier. Results from three
datasets using a 10‐fold cross‐validation technique showed that the SVM provides the best accuracy
under all features selection techniques adopted in the study for all three datasets. Therefore,
the SVM is an attractive classifier to be used in real applications for bankruptcy
prediction in corporate finance and financial risk management in financial institutions. In addition,
we found that our best results are superior to those reported in earlier studies on the same datasets.
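As a rough illustration of the evaluation protocol summarized above, the following sketch computes accuracy, sensitivity and specificity from predictions pooled over a k‐fold cross‐validation. It is a minimal sketch under stated assumptions, not the study's implementation: all function names are hypothetical, label 1 is assumed to denote the positive (e.g. bankrupt) class, pooling the confusion counts across folds is one possible convention, and a simple nearest‐centroid rule stands in for the classifiers compared in the paper (it is not an SVM).

```python
import random
from statistics import mean


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN, assuming label 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def scores(tp, tn, fp, fn):
    """Return (accuracy, sensitivity, specificity) from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # true positive rate
    specificity = tn / (tn + fp) if tn + fp else 0.0  # true negative rate
    return accuracy, sensitivity, specificity


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and split them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_centroids(X, y):
    """Stand-in classifier (assumption): one mean vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict_centroid(centroids, x):
    """Assign x to the class whose centroid is closest (squared distance)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def cross_validate(X, y, k=10, seed=0):
    """Pool confusion counts over k held-out folds, then compute the scores."""
    totals = [0, 0, 0, 0]
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit_centroids([X[i] for i in train_idx],
                              [y[i] for i in train_idx])
        preds = [predict_centroid(model, X[i]) for i in test_idx]
        counts = confusion_counts([y[i] for i in test_idx], preds)
        totals = [a + b for a, b in zip(totals, counts)]
    return scores(*totals)
```

Calling `cross_validate(X, y, k=10)` on a list of feature vectors `X` and 0/1 labels `y` returns the three statistics as a tuple. Replacing `fit_centroids` and `predict_centroid` with another model's fit and predict routines leaves the rest of the procedure unchanged.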
KEYWORDS
classification, credit risk, data mining, features selection
1 | INTRODUCTION
In corporate finance, financial risk prediction is important for the
decision‐making and profitability of financial institutions, and it helps
both financial institutions and investors achieve better risk control (Cochran et al.,
2006; Sanz and Ayca, 2006; Pindado et al., 2008; Abdou and Pointon,
2011; Platt and Platt, 2012; Savona and Vezzoli, 2012; Çelik, 2013;
Bastos and Pindado, 2013; Evans and Borders, 2014; Mendes et al.,
2014; Ciampi, 2015). Several predictive models based on data mining
techniques have therefore been developed to accurately classify bankrupt
and non‐bankrupt companies, and the topic has received much attention
in the corporate finance and risk management literature. Indeed, several
data mining techniques have been used for the prediction of financial
risk, including the AdaBoost algorithm (Alfaro et al., 2008), fuzzy
support vector machine (SVM) (Chaudhuri and De, 2011), SVM (Song
et al., 2010; Horta and Camanho, 2013), backpropagation neural network
(BPNN) (Trinkle and Baldwin, 2007; Peat and Jones, 2012; Lee
and Choi, 2013), radial basis function neural networks (RBFNNs)
(Divsalar et al., 2011), decision tree algorithms (Delen et al., 2013),
ensemble of SVMs (Sun and Li, 2012), linear programming (Divsalar
et al., 2011), if–then rules (Davalos et al., 2014), genetic algorithms
(GAs) (Gordini, 2014), ensemble systems (Sun, 2012; Figini et al.,
2016), case‐based reasoning system (Li et al., 2013), probabilistic neural
network (Pendharkar, 2011) and fuzzy neural approach (Quek et al.,
2009).
Recently, several studies have adopted a particular feature selection
technique to improve classifier accuracy in predicting financial
risk. The goal of feature selection is to identify the most informative
features used as patterns in the classification task and to remove
redundant ones. The selected features are thus expected to
improve classification/prediction results and to help reduce the
processing time of the classifier. For instance, these studies used the t‐statistic
(Ravisankar et al., 2011), principal component analysis (Chen, 2012;
Lin, 2012), partial least squares (Yang et al., 2011; Serrano‐Cinca and
Gutiérrez‐Nieto, 2013), stepwise discriminant analysis (Li and Sun,
2011), information gain (Wang et al., 2014), GAs (Oreski and Oreski,
2014), decision trees (Cho et al., 2010), classification tree method
(Brezigar‐Masten and Masten, 2012) and logistic regression (Lahmiri
and Gagnon, 2015) to perform the features selection task.
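To make the idea of identifying the most informative features concrete, the sketch below ranks features by the absolute two‐sample t‐statistic between the two classes and keeps the top k. It is only an illustrative sketch: the function names are hypothetical, the unequal‐variance (Welch) form of the statistic is an assumption, and the cited studies may compute or threshold the statistic differently. Note also that a univariate ranking of this kind scores each feature on its own, so it does not by itself remove redundant (mutually correlated) features.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Absolute two-sample t-statistic with unequal variances (assumed form)."""
    diff = abs(mean(a) - mean(b))
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        # Both groups are constant: perfectly separated or identical.
        return float("inf") if diff > 0 else 0.0
    return diff / se


def rank_features(X, y):
    """Return feature indices ordered by decreasing |t| between classes 1 and 0."""
    scored = []
    for j, column in enumerate(zip(*X)):
        pos = [v for v, label in zip(column, y) if label == 1]
        neg = [v for v, label in zip(column, y) if label == 0]
        scored.append((welch_t(pos, neg), j))
    # Sort by score (descending); ties are broken by the original index.
    return [j for _, j in sorted(scored, key=lambda s: (-s[0], s[1]))]


def select_top_k(X, y, k):
    """Keep the k best-ranked features; return their indices and the reduced data."""
    keep = sorted(rank_features(X, y)[:k])
    return keep, [[row[j] for j in keep] for row in X]
```

The reduced matrix returned by `select_top_k` can then be passed to any classifier. To avoid an optimistic bias, the ranking would normally be computed on the training portion of each cross‐validation fold only.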
However, it is difficult to select an adequate combination of features
selection technique and classifier for a given dataset, which is
an important task in bankruptcy prediction. Indeed, the performance
of the classifier depends on the features selection techniques and
Intell Sys Acc Fin Mgmt 2016; 23: 265–275. Copyright © 2016 John Wiley & Sons, Ltd. wileyonlinelibrary.com/journal/isaf