Textual Analysis in Accounting and Finance: A Survey

Published date	01 September 2016
DOI	http://doi.org/10.1111/1475-679X.12123
Author	TIM LOUGHRAN,BILL MCDONALD
Journal of Accounting Research

Vol. 54 No. 4 September 2016

Printed in U.S.A.

TIM LOUGHRAN∗ AND BILL MCDONALD∗

Received 20 January 2015; accepted 15 March 2016

ABSTRACT

Relative to quantitative methods traditionally used in accounting and finance, textual analysis is substantially less precise. Thus, understanding the art is of equal importance to understanding the science. In this survey, we describe the nuances of the method and, as users of textual analysis, some of the tripwires in implementation. We also review the contemporary textual analysis literature and highlight areas of future research.

JEL codes: D82; D83; G14; G18; G30; M40; M41

Keywords: textual analysis; sentiment analysis; bag of words; readability; word lists; Zipf’s law; cosine similarity; Naïve Bayes

1. Introduction

Textual analysis, in some form, resides across many disciplines under various aliases, including computational linguistics, natural (or statistical) language processing, information retrieval, content analysis, or stylometrics. The notion of parsing text for patterns has a long history. In the 1300s, friars of the Dominican Order produced concordances of the Latin Vulgate (Biblical translations) to provide indexes of common phrases (Catholic Encyclopedia [1908]). In 1901, T.C. Mendenhall used textual analysis to examine whether some works attributed to Shakespeare might have been written by Bacon (see Williams [1975]). During the world wars, the method was increasingly adapted to political speech, where carefully scripted rhetorical choices were interpreted as signals of diplomatic trends (e.g., Burke [1939]). In the sixties, the systematic analysis of text increased in popularity with Mosteller and Wallace’s [1964] purported resolution of authorship for the Federalist Papers. In the past few decades, the release of a large annotated corpus from the Wall Street Journal (WSJ) led to significant increases in the accuracy of statistical parsing (see Marcus, Santorini, and Marcinkiewicz [1993]).

∗Mendoza College of Business, University of Notre Dame.

Accepted by Christian Leuz. We thank Brad Badertscher, Peter Easton, Diego Garcia, two anonymous referees, and seminar participants at Columbia Business School’s News and Finance Conference for helpful comments.

More recently, with the exponential increase in computing power over the past half century and the increased focus on textual methods driven by the requirements of Internet search engines, the application of this technique has permeated most disciplines in one way or another. In accounting and finance, the online availability of news articles, earnings conference calls, Securities and Exchange Commission (SEC) filings, and text from social media provides ample fodder for applying the technology.

Can we tease out sentiment from mandated company disclosures and contextualize quantitative data in ways that might predict future valuation components? Can we computationally read news articles and trade before humans can read and assimilate the information? If Twitter’s tweets provide the pulse of information, can we monitor these messages in real time to gain an informational edge? Do textual artifacts provide an additional attribute that predicts bankruptcies? Are there subtle cues in managements’ earnings conference calls that computers can discern better than analysts? More broadly, can we examine textual artifacts to measure the quantity and quality of information in a collection of text, including both the intended message and, importantly, any unintended revelations? These are all interesting questions potentially answered by the technology of textual analysis.

Textual analysis is an emerging area in accounting and finance and, as a result, the corresponding taxonomies are still somewhat imprecise. Textual analysis can be considered as a subset of what is sometimes labeled qualitative analysis, with textual analysis most frequently falling into the categories of either targeted phrases, sentiment analysis, topic modeling, or measures of document similarity. Readability is another aspect of textual analysis, which is differentiated from some of the prior methods in that it attempts to measure the ability of the reader to decipher the intended message, whereas the other methods typically focus on computationally extracting meaning from a collection of text. Other examples of the more general topic of qualitative analysis would include Coval and Shumway [2001], who consider the information conveyed by noise levels in the Treasury Bond Futures trading pit at the Chicago Board of Trade, or Mayew and Venkatachalam [2012], who examine the audio from earnings conference calls to determine managerial affective states.

Following the pioneering papers by Frazier, Ingram, and Tennyson [1984], Antweiler and Frank [2004], Das and Chen [2007], Tetlock [2007], and Li [2008], accounting and finance researchers have actively examined the impact of qualitative information on equity valuations. The words selected by managers to describe their operations and the language used by media to report on firms and markets have been shown to be correlated with future stock returns, earnings, and even future fraudulent activities of management. Clearly, stock market investors incorporate more than just quantitative data in their valuations, but as the accounting and finance disciplines embrace this new technology, we must proceed carefully to assure that what we purport to measure is in fact so.

The burgeoning literature in textual analysis is already summarized well in other papers, although the increasing popularity of the method quickly dates any attempt to distill research on the topic. Li [2010a], in a survey of the literature, provides details on earlier manual-based examples of textual analysis, discusses the modern literature by topical area (e.g., information content, earnings quality, market efficiency), and itemizes a prescient list of potential research topics. His conclusions echo a theme of this paper; that is, the literature needs to be less centered on finding ways to apply off-the-shelf textual methods borrowed from highly evolved technologies in computational linguistics and instead be more motivated by hypotheses “closely tied to economic theories” (Li [2010a, p. 158]).

Kearney and Liu [2014] provide a more recent survey of methods and literature with a focus on textual sentiment. Their table 3 provides a useful annotated bibliography of most sentiment-related papers published prior to 2013. Das’s [2014] monograph, in addition to reviewing the academic literature, provides an excellent user’s guide for someone just approaching the subject, including code snippets for some of the basic tools used in textual analysis.

In what follows, we will fold a more selective and focused survey of the accounting, finance, and economics literature on textual analysis into a description of some of its methods. We add value beyond simply offering an updated literature review by also underscoring the methodological tripwires for those approaching this relatively new technique. Qualitative data require the additional step of translating text into quantitative measures, which are then used as inputs into either traditional or text-based methods. We emphasize the importance of exposition and transparency in this transformation process because this is where much of the imprecision of textual analysis is introduced. More generally, we emphasize the importance of replicability in the less-structured methods used in textual analysis. Regarding the topic of readability, we underscore the importance of carefully specifying what is meant by the concept in the context of business documents, where the traditional hallmarks of readability (polysyllabic words and long sentences) are rarely distinguishing characteristics in the interpretation of financial text.
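
As a minimal sketch of the step of translating text into a quantitative measure (an illustration only, not the procedure of any paper cited here), the following Python fragment computes the share of tokens in a passage that appear in a word list. The names tokenize, word_list_share, and NEGATIVE_WORDS, the three-word list, and the sample sentence are all assumptions introduced for this example.

```python
import re
from collections import Counter

# Hypothetical mini word list, for illustration only (not a published dictionary).
NEGATIVE_WORDS = {"loss", "decline", "impairment"}


def tokenize(text):
    """Lowercase the text and split it into runs of the letters a-z."""
    return re.findall(r"[a-z]+", text.lower())


def word_list_share(text, word_list):
    """Return the fraction of tokens found in word_list (0.0 for empty text)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    counts = Counter(tokens)  # bag of words: word order is discarded here
    hits = sum(counts[word] for word in word_list)
    return hits / len(tokens)


# Assumed sample passage; the second sentence negates a listed word.
sample = "The firm reported a loss. Revenue did not decline."
share = word_list_share(sample, NEGATIVE_WORDS)
print(f"{share:.3f}")
```

In the sample, 2 of the 9 tokens match the list, yet "did not decline" is still counted as negative because the count ignores word order and negation. Choices such as the tokenization rule and the contents of the list change the resulting number, which is one reason the transformation needs to be documented.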

The remainder of our survey is organized as follows. In section 2, before examining those methods intended to extract meaning from text collections, we consider the broader topic of information content and document
