Incorporating textual and management factors into financial distress prediction: A comparative study of machine learning methods

Author: Xiaobo Tang, Wenxuan Shi, Shixuan Li, Mingliang Tan
DOI: http://doi.org/10.1002/for.2661
Published date: 01 August 2020
RESEARCH ARTICLE
Xiaobo Tang | Shixuan Li | Mingliang Tan | Wenxuan Shi
School of Information Management,
Wuhan University, Wuhan, China
Correspondence
Shixuan Li, School of Information
Management, Wuhan University,
No. 299 Bayi Road, Wuhan, Hubei, China.
Email: shixuan.li@hotmail.com
Funding information
National Natural Science Foundation of
China, Grant/Award Number: 71673209
Abstract
Financial distress prediction (FDP) is widely regarded as a promising
approach to reducing financial losses. While financial information supplies
the traditional factors used in FDP, nonfinancial factors have also been
examined in recent studies. In light of this, the purpose of this study is to
explore the integrated factors and multiple models that can improve the
predictive performance of FDP models. This study proposes an FDP framework to
reveal the financial distress features of listed Chinese companies, incorporating
financial, management, and textual factors, and evaluating the prediction
performance of multiple models over different time spans. To develop this
framework, this study employs a wrapper-based feature selection method to
extract valuable features, and then constructs multiple single classifiers,
ensemble classifiers, and deep learning models to predict financial
distress. The experimental results indicate that management and textual factors,
especially textual ones, can supplement traditional financial factors in FDP.
This study also finds that integrated factors collected 4 years prior to the
predicted benchmark year enable more accurate prediction, and that the
ensemble classifiers and deep learning models developed achieve satisfactory
FDP performance. This study makes a novel contribution by expanding the
predictive factors of financial distress and providing new findings with
important implications for early warning signals of financial risk.
KEYWORDS
deep learning, financial distress prediction, machine learning, management factors, textual
factors
1 | INTRODUCTION
Financial distress prediction (FDP) has attracted a high
degree of attention in both academic and industrial fields
over the last few decades as an effective tool for risk
management (Geng, Bose, & Chen, 2015; Wanke, Barros, &
Faria, 2015). Such predictions can provide early
warning signals of financial risks for relevant agents in
the economy, which is important for stakeholders, listed
companies, and even the development of the economy
itself (Farooq & Qamar, 2019). Based on such signals, and
prior to a crisis, investors can adjust their investment
strategies, and firms and governments
can develop remedial measures, thereby
avoiding financial losses to a certain extent (Wang,
Chen, & Chu, 2018). For these reasons, numerous studies
have focused on the prediction of financial distress;
specifically, exploring predictive factors and prediction
methods has been a key undertaking in this body of
research.
Received: 18 September 2019; Accepted: 12 January 2020
Journal of Forecasting. 2020;39:769-787. wileyonlinelibrary.com/journal/for © 2020 John Wiley & Sons, Ltd.
The factors involved in FDP include financial
and nonfinancial factors. Early studies in this field
mainly examined financial factors, including profitability,
solvency, operational capabilities, and finance structure
(Hua, Wang, Xu, Zhang, & Liang, 2007; Sun & Li, 2008).
Although financial factors do contribute significantly to
FDP, they only cover quantitative information, which
cannot comprehensively represent the status of companies.
In light of this, some recent studies have
highlighted the limited predictive capacity of financial
variables, and have attempted to use certain nonfinancial
factors in predicting financial distress
(Cecchini, Aytug, Koehler, & Pathak, 2010; Hajek, Olej, &
Myskova, 2014). Specifically, using textual analysis
methods such as word frequency statistics and sentiment
analysis, qualitative nonfinancial predictive factors can
be extracted from unstructured texts, such as audit
reports and annual reports; these textual factors can then
supplement the traditional financial factors used in FDP
(du Jardin, 2016; Wang et al., 2018).
Prediction methods for financial distress can typically
be divided into two categories: statistical methods
and machine learning methods. Early research focused
on statistical methods, which mainly include linear
discriminant analysis (LDA), multiple discriminant analysis
(MDA), factor analysis (FA), and probabilistic modeling
(Chen, Ribeiro, & Chen, 2016; Kumar & Ravi, 2007;
Suntraruk, 2010). While these statistical methods play a
significant role in studies on FDP, their validity and
applicability become limited when the restrictive
assumptions that these models depend on are not satisfied
(Chen et al., 2016; Wang et al., 2018). Therefore, studies
in recent years have focused on using machine learning
methods for the purposes of FDP, such as the support
vector machine (SVM), decision tree (DT), and artificial
neural networks (ANN; Zhou et al., 2014; Geng et al.,
2015; Olson, Delen, & Meng, 2012). In addition to
these single classifiers, studies have also constructed FDP
models based on ensemble classifiers (Liang, Tsai, Dai, &
Eberle, 2018; Wang, Hao, Ma, & Jiang, 2011). Moreover,
deep learning models have also received attention in the
field of FDP (Alexandropoulos, Aridas, Kotsiantis, &
Vrahatis, 2019).
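To make the distinction between a single classifier and an ensemble classifier concrete, the following is a minimal sketch and not the models used in this study. It assumes synthetic two-feature data, uses a decision stump (a one-split decision tree) as the single classifier, and uses bagging with majority voting as one illustrative ensemble scheme. All function names (`fit_stump`, `fit_bagging`, `accuracy`) are hypothetical.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Fit a decision stump: the single feature/threshold rule with fewest training errors."""
    best = None  # (errors, feature index, threshold, label predicted above threshold)
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for above in (0, 1):
                preds = [above if row[j] > thr else 1 - above for row in X]
                errors = sum(p != t for p, t in zip(preds, y))
                if best is None or errors < best[0]:
                    best = (errors, j, thr, above)
    _, j, thr, above = best
    return lambda row: above if row[j] > thr else 1 - above


def fit_bagging(X, y, n_estimators=15, seed=0):
    """Train stumps on bootstrap resamples and combine them by majority vote."""
    rng = random.Random(seed)
    n = len(X)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))

    def predict(row):
        votes = Counter(stump(row) for stump in stumps)
        return votes.most_common(1)[0][0]

    return predict


def accuracy(model, X, y):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(model(row) == t for row, t in zip(X, y)) / len(y)


# Synthetic data (an assumption for illustration): label 1 stands in for "distressed".
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(200)]
y = [1 if a + b > 0 else 0 for a, b in X]
X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]

single = fit_stump(X_train, y_train)
ensemble = fit_bagging(X_train, y_train)
single_acc = accuracy(single, X_test, y_test)
ensemble_acc = accuracy(ensemble, X_test, y_test)
print(f"single stump accuracy: {single_acc:.2f}")
print(f"bagged stumps accuracy: {ensemble_acc:.2f}")
```

Whether the ensemble outperforms the single classifier depends on the data and the base learner; the sketch only shows how the two kinds of model differ in construction.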
While previous studies have clearly made contributions
to the area of FDP, the predictive factors
and prediction models arguably still leave room for
improvement. With regard to predictive factors,
this study combines traditional financial
factors with management and textual factors; with regard to
prediction models, this study constructs single classifiers,
ensemble classifiers, and deep learning models to predict
financial distress. To the best of our knowledge, this is
the first study to utilize three types of factors and both
traditional machine learning and deep learning models
to predict financial distress in listed Chinese companies.
In so doing, this study attempts to answer the following
research questions:
RQ1. Which are the key influencing factors in FDP of
listed Chinese companies?
RQ2. Which models perform better than others in FDP of
listed Chinese companies?
RQ3. How early can signs of financial distress be
predicted in listed Chinese companies?
To address these research questions, this
study utilizes data from 3, 4, and 5 years ahead of the
predicted benchmark year, revealing changes in the key
predictive factors across years, and constructing
multiple machine learning models to predict financial
distress. It is hoped that the results of this research can
provide a template for early warning mechanisms for
relevant economic agents, so that they can make the
corresponding efforts to avoid financial losses.
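The time-span design described above can be sketched as follows: each company's label in benchmark year t is paired with its features from year t - 3, t - 4, or t - 5. This is a minimal illustration under assumed inputs, not the study's data pipeline. The toy feature values, the company names, and the helper `build_lagged_dataset` are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical toy records: (company, fiscal year) -> feature vector.
FEATURES: Dict[Tuple[str, int], List[float]] = {
    ("A", 2012): [0.10, 1.8],
    ("A", 2013): [0.07, 1.6],
    ("A", 2014): [0.02, 1.1],
    ("B", 2012): [0.15, 2.2],
    ("B", 2013): [0.16, 2.3],
    ("B", 2014): [0.14, 2.1],
}

# Hypothetical labels: company -> (benchmark year t, 1 if distressed in t else 0).
LABELS: Dict[str, Tuple[int, int]] = {"A": (2017, 1), "B": (2017, 0)}


def build_lagged_dataset(features, labels, lag):
    """Pair each company's label in year t with its features from year t - lag."""
    X, y = [], []
    for company, (benchmark_year, distressed) in sorted(labels.items()):
        key = (company, benchmark_year - lag)
        if key in features:  # skip companies with no data for that year
            X.append(features[key])
            y.append(distressed)
    return X, y


# One dataset per time span; each would be used to train and evaluate models separately.
datasets = {lag: build_lagged_dataset(FEATURES, LABELS, lag) for lag in (3, 4, 5)}
for lag, (X, y) in datasets.items():
    print(f"t-{lag}: X={X} y={y}")
```

Building one dataset per lag keeps the label fixed while varying only the age of the features, so performance differences between lags can be attributed to how early the signal appears.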
The main contributions of this study are as follows.
First, it provides a novel FDP method that integrates
multiple predictive factors, time spans, and classifier
models. Second, it shows that management
and textual features can play a significant role
in FDP for listed Chinese companies; specifically, the
study finds that these features achieve optimum
performance 4 years prior to the predicted benchmark
year, and reveals the superiority of ensemble classifiers
and deep learning models. Finally, the methods described
in this paper may be seen as potential approaches that
can be applied in companies or at the level of
economic policy to avoid financial losses.
The remainder of this paper is organized as follows.
Section 2 outlines the relevant studies on the prediction
of financial distress. Section 3 describes the study
framework and explains the analysis approaches in detail,
followed by a presentation of the main research results in
Section 4. Following this, Section 5 provides a discussion
of the empirical results and answers to the research
questions. The final section concludes this study and provides
directions for future research.
2 | RELATED STUDIES
This section describes the different factors and models
used in studies of FDP, as summarized in Table 1,