Natural Language Processing in Accounting, Auditing and Finance: A Synthesis of the Literature with a Roadmap for Future Research

Authors: Margaret R. Garnsey, Mark E. Hughes, Ingrid E. Fisher
Published date: 01 July 2016
DOI: http://doi.org/10.1002/isaf.1386
INGRID E. FISHERᵃ*, MARGARET R. GARNSEYᵇ AND MARK E. HUGHESᵃ
ᵃ The University at Albany, State University of New York, Albany, NY, USA
ᵇ Siena College, Loudonville, NY, USA
SUMMARY
Natural language processing (NLP) is a part of the artificial intelligence domain focused on communication between humans and computers. NLP attempts to address the inherent problem that while human communications are often ambiguous and imprecise, computers require unambiguous and precise messages to enable understanding. The accounting, auditing and finance domains frequently put forth textual documents intended to communicate a wide variety of messages, including, but not limited to, corporate financial performance, management's assessment of current and future firm performance, analysts' assessments of firm performance, domain standards and regulations as well as evidence of compliance with relevant standards and regulations. NLP applications have been used to mine these documents to obtain insights, make inferences and create additional methodologies and artefacts to advance knowledge in accounting, auditing and finance. This paper synthesizes the extant literature in NLP in accounting, auditing and finance to establish the state of current knowledge and to identify paths for future research. Copyright © 2016 John Wiley & Sons, Ltd.
Keywords: computational linguistics; natural language processing; text mining
1. INTRODUCTION
1.1. Background and Motivation
Natural language processing (NLP) focuses on the study and facilitation of communication between people and computers. One of the central problems in artificial intelligence (AI) is that of communication. AI is typically defined as intelligence that is demonstrated by software or machinery that imitates the workings of the human mind (Tung, Quek, & Cheng, 2004). It is an inherently interdisciplinary field of study that draws from domains as diverse as psychology, computer science, linguistics and neuroscience. NLP, more specifically, encompasses a range of computational techniques for analysing and representing texts at one or more levels of linguistic analysis to enable human-like language processing for a range of particular tasks or applications.
Although NLP has been described in many ways, the working definition underlying this discussion follows the notion that NLP is "a theoretically motivated range of computational techniques for analyzing and representing naturally occurring texts at one or more levels of linguistic analysis for the purpose of achieving human-like language processing for a range of tasks or applications" (Liddy, 2001). This implies, as demonstrated throughout this paper, that there are "multiple methods or techniques from which to choose to accomplish a particular type of language analysis" (Liddy, 2001). We reference a variety of analytical methods that incorporate NLP data, at various "levels of linguistic analysis" (Liddy, 2001). However, since "NLP is considered a discipline within Artificial Intelligence (AI)", we focus most heavily on the AI-related applications of NLP (Liddy, 2001). Despite the plethora of analytical methods that are associated with and use data produced by NLP, it comprises a unique set of "computational techniques" that should not be confused or conflated with the many analytical tools referenced in the text that follows.
* Correspondence to: Ingrid E. Fisher, The University at Albany, State University of New York, Albany, NY, USA. E-mail: ifisher@albany.edu
INTELLIGENT SYSTEMS IN ACCOUNTING, FINANCE AND MANAGEMENT
Intell. Sys. Acc. Fin. Mgmt. 23, 157–214 (2016)
Published online 1 March 2016 in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/isaf.1386
Fifty years ago Goldberg (1965) stated: "it is scarcely an exaggeration to say that the problem of communication is the axial problem in accounting". Accounting practice is replete with written documents intended to communicate such messages as, but not limited to: current accounting standards, past and expected future corporate performance (along multiple dimensions), policies and practices embodied by financial statements and the results of financial statement audits. The growth in digital and social media usage by businesses has further increased the volume of unstructured text documents. Since the dissemination of these documents and messages is now largely automated by and through computers, we believe that NLP research and applications have considerable potential to enhance communication in the areas of accounting, auditing and finance. Our paper is motivated by the desire to determine the state of the extant literature in NLP in accounting, auditing and finance so as to inform and guide future research efforts.
1.2. Objectives
This paper examines the NLP literature in accounting, auditing and finance. Our objectives are: (1) to synthesize the literature and present the lessons to be learned from prior work, (2) to identify unanswered questions that present ripe opportunities for future research and (3) to identify the constraints that are likely to present challenges to future research. We focus our literature review on combinations of NLP with AI. Machine learning (ML) is often considered to be a subfield of AI. However, the large number of studies employing ML, as opposed to other AI methodologies, drove us to acknowledge the ML literature separately, where appropriate.
While we identify the various methods that accounting, audit and finance-related researchers have used in association with NLP, we do not seek to create an artificial classification of NLP-related studies. Rather, we acknowledge the existence of groups of studies that have applied similar NLP analyses, or that sought to accomplish similar tasks by employing NLP, within the fields of accounting, audit and finance. We also note the diverse analytical methods that have been employed in addition to, or in combination with, NLP. Most importantly, we seek to highlight the variety of applications of NLP that have developed and the ever-increasing uses of NLP-generated data.
The remainder of this paper is organized as follows. Section 2 describes the methodology used to identify the relevant literature and provides an overview of the evolution of NLP literature. A synthesis of the NLP literature follows. We discuss NLP in accounting, audit and finance, focusing on two major uses of NLP: classification (Section 3) and prediction (Section 4). Frequently observed applications and data sources are highlighted, followed by a review of readability studies in accounting, audit and finance. Section 5 identifies future research opportunities as well as likely obstacles, and Section 6 provides concluding observations.
2. METHODOLOGY
2.1. Literature Selection
We began our search for relevant literature by scanning the bibliographies of four literature reviews that surveyed NLP developments in the accounting, audit, and finance domains (Fisher et al., 2010; Li, 2010c; Kearney & Liu, 2014; Nassirtoussi et al., 2014). We expanded our exploration by performing keyword searches of databases, including the Association for Computing Machinery (ACM) Digital Library, ProQuest, and the linked databases accessible via EBSCO Host. The primary search terms used included: text analysis, text mining, natural language processing (NLP), ML, artificial neural networks (ANNs), support vector machines (SVMs), expert systems (ESs), artificial intelligence (AI), latent semantic analysis (LSA), content analysis and computerized content analysis. Finally, the bibliographies of retrieved papers were manually scanned to identify additional sources.
2.2. Literature Assessment
An analysis of the diverse bibliographic sources included in this study demonstrates the ongoing interest in NLP. We assessed a total of 266 monographs, comprising 192 journal articles, 38 conference papers, 25 working papers, 1 book chapter and 10 doctoral dissertations, as shown in Table I. Since our primary focus is NLP combined with ML and/or AI, we excluded 86 studies that addressed manual text analysis, a precursor to the computer-facilitated NLP often employed today, and 81 involving basic text mining. Excluding the four literature reviews noted above, 95 studies featured NLP combined with and/or augmented by ML and/or AI. The studies featuring co-occurrences of NLP, ML and/or AI come from three primary domains: accounting, audit and finance, represented by 20, 15 and 60 items respectively, as shown in Table II. Since NLP literature is replete with protracted terms, in the interest of brevity we use complete designations, followed by acronyms, only for initial references. Thereafter, acronyms are used. To aid the reader, a glossary of all acronyms and corresponding terms is provided in Appendix A.
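The screening arithmetic described above can be verified mechanically. The following is a minimal sketch and not part of the original study: the variable names are illustrative assumptions, and the figures are copied from the text and from Table I. It checks that the counts by bibliographic source sum to the 266 works assessed, and that removing the 86 manual text analysis studies, the 81 basic text mining studies and the four literature reviews leaves the 95 studies that are split 20, 15 and 60 across accounting, audit and finance.

```python
# Counts by bibliographic source, copied from Table I.
# Tuple order: (accounting, audit, finance). Names are illustrative only.
table_i = {
    "journal articles": (72, 26, 94),
    "conference papers": (11, 3, 24),
    "working papers": (2, 2, 21),
    "book chapters": (1, 0, 0),
    "doctoral dissertations": (3, 1, 6),
}

# Column totals per domain and the grand total of works assessed.
domain_totals = tuple(sum(column) for column in zip(*table_i.values()))
total_assessed = sum(domain_totals)

# Exclusions stated in the text.
manual_text_analysis = 86
basic_text_mining = 81
literature_reviews = 4

retained = (
    total_assessed
    - manual_text_analysis
    - basic_text_mining
    - literature_reviews
)

# Domain breakdown of the retained studies, as reported in the text.
retained_by_domain = {"accounting": 20, "audit": 15, "finance": 60}

print(domain_totals)                     # (89, 32, 145)
print(total_assessed)                    # 266
print(retained)                          # 95
print(sum(retained_by_domain.values()))  # 95
```

The two routes to 95 agree, so the exclusion counts and the domain breakdown reported in the text are mutually consistent.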
2.3. Brief Overview of the Evolution of Natural Language Processing Research
Assessing the literature from a chronological perspective, it is clear that NLP research has evolved over time, as illustrated in Figure 1. Researchers' original focus was on manual text analysis, a non-computerized approach used to tap the informational value of linguistic patterns in text-based data. The earliest research we found employed manual content analysis in order to assess the readability of corporate reports. It found that the general level of reading was difficult, the human interest value dull,
Table I. Literature categorized by bibliographic source

Retrieved literature      Accounting  Audit  Finance  Total
Journal articles                  72     26       94    192
Conference papers                 11      3       24     38
Working papers                     2      2       21     25
Book chapters                      1      -        -      1
Doctoral dissertations             3      1        6     10
Total articles retrieved          89     32      145    266