What Are You Saying? Using topic to Detect Financial Misreporting

NERISSA C. BROWN, RICHARD M. CROWLEY, AND W. BROOKE ELLIOTT

Journal of Accounting Research
Vol. 58, No. 1, March 2020
DOI: 10.1111/1475-679X.12294
Printed in U.S.A.
Received 12 September 2016; accepted 25 October 2019
ABSTRACT

We use a machine learning technique to assess whether the thematic content of financial statement disclosures (labeled topic) is incrementally informative in predicting intentional misreporting. Using a Bayesian topic modeling algorithm, we determine and empirically quantify the topic content of a large collection of 10-K narratives spanning 1994 to 2012. We find that the algorithm produces a valid set of semantically meaningful topics that predict financial misreporting, based on samples of Securities and Exchange Commission (SEC) enforcement actions (Accounting and Auditing Enforcement Releases [AAERs]) and irregularities identified from financial restatements and 10-K filing amendments. Our out-of-sample tests indicate that topic significantly improves the detection of financial misreporting by as much as 59% when added to models based on commonly used financial and textual style variables. Furthermore, models that incorporate topic significantly outperform traditional models when detecting serious revenue recognition and core expense errors. Taken together, our results suggest that the topics discussed in annual report filings and the attention devoted to each topic are useful signals in detecting financial misreporting.

University of Illinois at Urbana-Champaign; Singapore Management University. Accepted by Phil Berger. We thank an anonymous reviewer, Andrew Bauer, Matt Cobabe, Amanda Convery, Robert Davidson, Paul Demeré, Lucile Faurel, Shawn Gordon, Jing He, Shiva Rajgopal, Kristina Rennekamp, Kecia Williams Smith, Gang Wang, and workshop participants at Baruch College (City University of New York), Carnegie Mellon University, Columbia University, Hong Kong University of Science and Technology, Nagoya University, University of Illinois, U.S. Securities and Exchange Commission (Division of Economic and Risk Analysis), Virginia Tech, the 2015 AAA FARS Mid-year Meeting, the 2015 AAA Annual Meeting, the 2015 Conference on Convergence of Financial and Managerial Accounting Research, the 2016 Conference on Investor Protection, Corporate Governance, and Fraud Prevention, and the 2016 Conference on Financial Economics and Accounting for helpful comments. We also thank Xiao Yu for insightful comments on methodology and coding, Brian Gale for helpful assistance with Amazon Mechanical Turk, and Stephanie Grant, Chunlei Liu, Jill Santore, and Jingpeng Zhu for excellent research assistance. We thank Derryck Coleman and Olga Usvyatsky (formerly) of Audit Analytics for assistance with the restatement data and text search scripts used in this study. Brown gratefully acknowledges financial support from the PricewaterhouseCoopers LLP Faculty Fellowship. Elliott gratefully acknowledges financial support from the Ernst & Young Distinguished Professorship. An online appendix to this paper can be downloaded at http://research.chicagobooth.edu/arc/journal-of-accounting-research/online-supplements.

© University of Chicago on behalf of the Accounting Research Center, 2019
JEL codes: C80; K22; K42; M40; M41; M48
Keywords: topic modeling; disclosure; latent Dirichlet allocation; financial
misreporting
1. Introduction
This study investigates whether a novel text-based measure of the thematic
content of financial statement disclosures (labeled as topic) is useful for
detecting financial misreporting.¹ Detection models have long focused on
quantitative financial statement and stock market variables as predictive fac-
tors (Beneish [1997], Brazel, Jones, and Zimbelman [2009], Dechow et al.
[2011], Bao et al. [2020]). One drawback of this approach is that financial
misreporting can go undetected for multiple periods, because misreport-
ing firms often manipulate performance metrics and accounting transac-
tions to blend in better with their peers or the firm’s own past performance
(Lewis [2013]). To address this weakness, recent studies analyze the textual
and linguistic features of management disclosures, finding that summary
measures of these features serve as useful warnings of misreporting (see,
e.g., Hobson, Mayew, and Venkatachalam [2012], Larcker and Zakolyukina
[2012], Purda and Skillicorn [2015]).
Despite the usefulness of communication style in revealing misreport-
ing, the literature debates whether textual and linguistic features ade-
quately capture managers’ deliberate attempts to obfuscate or manipu-
late financial information (Bloomfield [2008], Bushee, Gow, and Taylor
[2018]). Further, as Loughran and McDonald [2016] highlight, commonly
used textual measures do not reflect the context or meaning of manage-
ment disclosures, thereby limiting the inferences that can be drawn. We
tackle these issues by introducing a machine learning tool that simultane-
ously detects and quantifies the thematic content (topic) of annual report
¹ We use the terms misreporting and misrepresentation interchangeably to refer to deliberate
violations of financial accounting standards and noncompliance with regulatory financial re-
porting rules. We refrain from using the term fraud because, in a legal sense, violations of
or noncompliance with financial reporting standards and rules are considered fraudulent
only if market participants rely on the misreported or misrepresented information to their
detriment.
narratives. This approach departs from prior text-based research by focus-
ing on what is being disclosed by management rather than how. Using this
unique measure, we evaluate the disclosure topics associated with misre-
porting and how these topics evolve. More importantly, we investigate the
incremental predictive power of topic in detecting misreporting out of sam-
ple, relative to a collection of financial and textual style measures.
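
To make the out-of-sample design concrete, the sketch below illustrates one way to measure the incremental lift from topic features over a baseline of financial and textual style features. It is a stylized illustration, not the specification estimated in this paper: the feature matrices and misreporting labels are simulated placeholders, and the classifier, splitting rule, and evaluation metric are assumptions made only for exposition.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(0)
    n = 1000                                  # simulated firm-year observations
    X_financial = rng.normal(size=(n, 5))     # placeholder financial predictors
    X_style = rng.normal(size=(n, 3))         # placeholder textual style features
    X_topic = rng.dirichlet(np.ones(10), n)   # placeholder per-filing topic proportions
    y = (rng.random(n) < 0.1).astype(int)     # 1 = misreporting firm-year (rare event)

    # Hold out the last 30% of observations; the paper's tests train on earlier
    # years and predict later ones, which this simple split only approximates.
    cut = int(0.7 * n)

    def oos_auc(X):
        model = LogisticRegression(max_iter=1000).fit(X[:cut], y[:cut])
        return roc_auc_score(y[cut:], model.predict_proba(X[cut:])[:, 1])

    baseline = np.hstack([X_financial, X_style])
    augmented = np.hstack([baseline, X_topic])
    print(f"baseline AUC:   {oos_auc(baseline):.3f}")
    print(f"with topic AUC: {oos_auc(augmented):.3f}")

Holding the baseline fixed and appending the topic proportions isolates the incremental predictive value of disclosure content, which is the comparison underlying the detection improvements reported above.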
Our focus on the thematic content of financial statement filings draws on
the management disclosure and communications literatures. These bod-
ies of research suggest that the flexible nature of disclosure content allows
for a broader set of dimensions along which annual report narratives can
be used to identify financial misreporting, compared to quantitative finan-
cial metrics and summary measures of textual features (Hoberg and Lewis
[2017]). These literatures also argue that textual features, such as tone and
word usage, are difficult to classify as deceptive, because disclosure narra-
tives can be influenced by individuals’ expectations and motivations, even
when the intent is to communicate objectively and truthfully (Douglas and
Sutton [2003]). In that sense, the content of the disclosure and the atten-
tion devoted to each topic may better predict misreporting than how the
narrative is fashioned. We therefore examine whether the topic content of
financial statement disclosures is incrementally informative in assessing the
likelihood of misreporting, beyond textual style features. We also analyze
the ability of topic to detect misreporting, relative to quantitative financial
variables, given that these measures are typically backward-looking and less
efficient in predicting misreporting, compared to language-based measures
(Cecchini et al. [2010a], Goel and Gangolly [2012], Larcker and Zakolyuk-
ina [2012], Purda and Skillicorn [2015]).
We generate our topic measure by employing a Bayesian topic model-
ing algorithm developed by Blei, Ng, and Jordan [2003], termed Latent
Dirichlet Allocation (LDA). Similar to factor or cluster analysis, LDA is an
unsupervised and unstructured probabilistic model that “learns” or discov-
ers the latent thematic structure of words within a corpus of documents.2
The algorithm (and other variants) is widely used in practice by Internet
search engines to guide keyword selection and improve correlations be-
tween search terms and web content (Fishkin [2014]). A unique advantage
of LDA is that it does not require predetermined word dictionaries or topic
categories and instead relies on the fact that words frequently appearing
together tend to be semantically related. This process reduces researcher
bias, as foreknowledge of document content does not affect the topic clas-
sifications.³ Furthermore, the algorithm can classify the content of large
² LDA is a “bag of words” algorithm that uses the distribution of words across documents to
classify and quantify themes without the need for predefined or researcher-determined word
lists or topic categories.
³ Although LDA is unsupervised and does not rely on human input to identify topics, human judgment is necessary to interpret and label the topics inferred from the algorithm. This is because the LDA output for a given topic consists only of word clusters and word probabilities.
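
For readers unfamiliar with the mechanics, the sketch below runs the bag-of-words pipeline described above on an invented four-sentence corpus, using scikit-learn's LatentDirichletAllocation in place of our estimation code; the corpus, the two-topic setting, and all parameter choices are illustrative assumptions only.

    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.feature_extraction.text import CountVectorizer

    docs = [
        "revenue recognized upon shipment of products to distributors",
        "goodwill impairment charges related to the acquisition",
        "revenue from long term contracts recognized over time",
        "impairment of intangible assets and goodwill writedowns",
    ]

    # Bag of words: the model sees only word counts, never word order.
    vectorizer = CountVectorizer(stop_words="english")
    counts = vectorizer.fit_transform(docs)

    lda = LatentDirichletAllocation(n_components=2, random_state=0)
    doc_topic = lda.fit_transform(counts)   # per-document topic proportions

    # Each fitted topic is just a weighting over words; a human must read
    # the top words to interpret and label the topic (see footnote 3).
    words = vectorizer.get_feature_names_out()
    for k, weights in enumerate(lda.components_):
        top = [words[i] for i in weights.argsort()[::-1][:5]]
        print(f"topic {k}: {top}")

    print(doc_topic.round(2))               # attention devoted to each topic

With only four toy sentences the fitted topics are noisy, but on a large 10-K corpus the same machinery recovers semantically coherent themes, and the per-document proportions are quantities of the kind our topic measure is built from.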
