Do Sentiments Matter in Fraud Detection? Estimating Semantic Orientation of Annual Reports

Published date: 01 July 2016
DOI: http://doi.org/10.1002/isaf.1392
Authors: Sunita Goel, Ozlem Uzuner
SUNITA GOEL (a,*) AND OZLEM UZUNER (b)
(a) Siena College, Accounting & Law, Loudonville, NY, USA
(b) State University of New York at Albany, Department of Information Studies, Albany, NY, USA
SUMMARY
We present a novel approach for analysing the qualitative content of annual reports. Using natural language
processing techniques we determine if sentiment expressed in the text matters in fraud detection. We focus on
the Management Discussion and Analysis (MD&A) section of annual reports because of the nonfactual content
present in this section, unlike other components of the annual reports. We measure the sentiment expressed in
the text on the dimensions of polarity, subjectivity, and intensity and investigate in depth whether truthful and
fraudulent MD&As differ in terms of sentiment polarity, sentiment subjectivity and sentiment intensity. Our results
show that fraudulent MD&As on average contain three times more positive sentiment and four times more negative
sentiment compared with truthful MD&As. This suggests that use of both positive and negative sentiment is more
pronounced in fraudulent MD&As. We further find that, compared with truthful MD&As, fraudulent MD&As
contain a greater proportion of subjective content than objective content. This suggests that the use of subjectivity
clues such as presence of too many adjectives and adverbs could be an indicator of fraud. Clear cases of fraud show
a higher intensity of sentiment exhibited by more use of adverbs in the 'adverb modifying adjective' pattern. Based
on the results of this study, frequent use of intensifiers, particularly in this pattern, could be another indicator of
fraud. Moreover, the dimensions of subjectivity and intensity help in accurately classifying borderline examples
of MD&As (that are equal in sentiment polarity) into fraudulent and truthful categories. When taken together,
these findings suggest that fraudulent MD&As in contrast to truthful MD&As contain higher sentiment content.
Copyright © 2016 John Wiley & Sons, Ltd.
Keywords: fraud detection; fraud sentiment classification model; textual analysis; natural language processing;
Management Discussion and Analysis
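As a rough illustration of the three dimensions named in the summary (polarity, subjectivity and intensity), the following minimal Python sketch counts simple lexical proxies for each one. It is not the authors' feature extraction: the word lists, the function names `tokenize` and `sentiment_profile`, and the use of fixed lexicons in place of a part-of-speech tagger are all assumptions made only for this example.

```python
import re

# Hypothetical toy lexicons, for illustration only; they are NOT the
# feature sets used in the paper.
POSITIVE = {"strong", "improved", "successful", "favorable", "growth"}
NEGATIVE = {"weak", "decline", "loss", "adverse", "difficult"}
ADJECTIVES = {"strong", "successful", "favorable", "weak", "adverse", "difficult"}
INTENSIFIERS = {"very", "extremely", "highly", "significantly"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def sentiment_profile(text):
    """Return simple per-document proxies for the three dimensions."""
    tokens = tokenize(text)

    # Polarity proxy: counts of positive and negative lexicon words.
    positive = sum(t in POSITIVE for t in tokens)
    negative = sum(t in NEGATIVE for t in tokens)

    # Subjectivity proxy: share of tokens that are sentiment words,
    # listed adjectives, or intensifying adverbs.
    subjective = sum(
        t in POSITIVE or t in NEGATIVE or t in ADJECTIVES or t in INTENSIFIERS
        for t in tokens
    )

    # Intensity proxy: an intensifying adverb immediately followed by
    # a listed adjective (the "adverb modifying adjective" pattern).
    intensified = sum(
        a in INTENSIFIERS and b in ADJECTIVES
        for a, b in zip(tokens, tokens[1:])
    )

    return {
        "tokens": len(tokens),
        "positive": positive,
        "negative": negative,
        "subjective_share": subjective / len(tokens) if tokens else 0.0,
        "intensified_adjectives": intensified,
    }


if __name__ == "__main__":
    sample = ("Results were extremely strong despite a very difficult "
              "market and a small loss.")
    print(sentiment_profile(sample))
```

Comparing such profiles between two groups of documents would give one crude way to ask whether the groups differ on each dimension. A realistic pipeline would presumably rely on a proper sentiment lexicon and a part-of-speech tagger rather than these hand-written lists.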
1. INTRODUCTION
Academic research on fraud detection suggests that detecting fraud is a complex problem and no one
set of predictors will be always successful in fraud detection. This may be partly due to the fact that
once the fraud indicators are publicly known, companies can find ways to outsmart them and find other
creative ways to conceal fraud. Further research can focus on finding novel fraud indicators and innovative
techniques of detecting fraud. Recently, there seems to be an increasing interest in analysing
company disclosures using textual analysis, and some of these studies have produced compelling
insights into how linguistic structure of text can be exploited to find indicators of fraud (Goel et al.,
2010; Humpherys et al., 2011; Goel & Gangolly, 2012; Skillicorn & Purda, 2012; Purda & Skillicorn,
2014). However, this area is still nascent, and a number of areas remain underexplored, particularly the
role of sentiment- and emotion-related constructs in fraud detection. With the current study, we aim to
fill this void and provide evidence to support a new explanation that further advances our understanding
of fraud and extends linguistic approaches to fraud detection by examining the role of sentiment-related
linguistic nuances such as polarity, subjectivity and intensity.
* Correspondence to: Sunita Goel, Accounting & Law, Siena College, 515 Loudon Road, Loudonville, NY 12211, USA. E-mail: sgoel@siena.edu
INTELLIGENT SYSTEMS IN ACCOUNTING, FINANCE AND MANAGEMENT
Intell. Sys. Acc. Fin. Mgmt. 23, 215–239 (2016)
Published online 24 May 2016 in Wiley Online Library (wileyonlinelibrary.com) DOI: 10.1002/isaf.1392
Mining text for sentiment (also known as affect or emotion) and understanding how it can affect
readers' thinking in some way is important because sentiment can convey incorrect information to
others when necessary (Ekman & Friesen, 1982). In the current study, by analysing sentiment-laden
words in company disclosures, we explore what role sentiment constructs play in fraud detection and
how the distribution of sentiment features varies between fraudulent and truthful disclosures. We believe
that annual reports and other forms of corporate disclosures are particularly interesting to investigate
for fraud indicators for several reasons. First, it is easier to disguise true facts through this
mode of communication as it gives a great sense of visual anonymity to the writer, providing an opportunity
to misrepresent facts and mask the cues that might reveal fraud. Second, owing to lack of a direct
relationship with the users of corporate disclosures, the writer can possibly disassociate from the feelings
of guilt and remorse when concealing facts through this mode of communication. Third, with the
rise of digitized information and data processing power, it has become possible to investigate for fraud
patterns and cues from large text sets, which was not possible earlier.
Specifically, we examine the Management Discussion and Analysis (MD&A) section of annual reports
as this section predominantly expresses opinions and attitudes of management in a text. Even though
annual reports contain both factual and nonfactual information, these reports are often perceived as conveying
factual information to external and internal users. We focused on MD&A because of the nonfactual
content present in this section, unlike the other components of annual reports. Although both factual and
nonfactual information presented in annual reports is used for decision-making, users of these reports
often perceive nonfactual information with the same or greater interest. When examining MD&As, we
seek to recognize sentiments that are conveyed by text and how these sentiments differ between truthful
and fraudulent MD&As. In spite of burgeoning growth in sentiment analysis, very little empirical evidence
exists on the role of sentiment in fraud detection. With the current study, we aim to fill this void and contribute
to the emerging research in textual analysis. Recognition of sentiments conveyed by text can also
provide useful insight into how management personates itself when it is committing fraud. As the degree
of complexity in accounting standards and disclosure requirements continues to grow for financial
reporting, we believe our findings may be useful in developing richer tools for (1) predicting the likelihood
of fraud in corporate reports and (2) identifying companies that are at high risk of committing fraud. Consequently,
our research should be of interest to auditors, fraud examiners, standard setters, regulators,
policy-makers and other users including financial analysts, investors and lending institutions.
The remainder of this paper is structured as follows. Section 2 provides a review of relevant literature,
concentrating on the studies using textual analysis for fraud detection. Section 3 discusses our
sample and the methodology employed in this study. Section 4 describes sentiment feature sets and
reports on the experimental results. Finally, Section 5 presents the concluding observations, including
the limitations of this study and possible future work directions.
2. LITERATURE REVIEW
The majority of the work on text analysis using natural language processing (NLP) techniques in
accounting has focused on extracting factual information. Recently, a few studies in accounting have
looked at extracting nonfactual information, such as opinions, sentiments and emotions (Li, 2006; Das
& Chen, 2007; Tetlock, 2007; Hájek & Olej, 2013; Hájek et al., 2013). Approaches to the extraction of