Predictive Coding: Taking the Devil Out of the Details

Author: L. Casey Auttonberry
Position: J.D./D.C.L., 2014, Paul M. Hebert Law Center, Louisiana State University
Pages: 613-648
INTRODUCTION
Discovery has changed, and electronically stored information
(ESI) was the catalyst.1 Though “[e]-discovery matters are no longer
the novel issues that they once were,”2 technology is constantly
changing.3 It was estimated that in 2009 there were 988 exabytes of
data in existence, an amount that would stretch from the Sun to
Pluto and back in paper form.4 Massive amounts of ESI have
become a huge problem in litigation.5 Organizations are retaining
more information than ever,6 and lawsuits among these
organizations sometimes require lawyers to review more than 100
million documents.7 Trying to find relevant information in so much
Copyright 2014, by L. CASEY AUTTONBERRY.
1. See George L. Paul & Jason R. Baron, Information Inflation: Can the
Legal System Adapt?, 13 RICH. J.L. & TECH. 10 (2007) (noting that information
has changed and that change has affected litigation).
2. Johnson v. Big Lots Stores, Inc., 253 F.R.D. 381, 395 (E.D. La. 2008).
Electronic discovery (e-discovery) is “[t]he process of identifying, collecting,
processing, analyzing, and reviewing ESI for legal proceedings . . . .”
RECOMMIND, INC., PREDICTIVE CODING FOR DUMMIES 3–4 (Recommind Spec.
Ed., 2013), available at http://media.wiley.com/assets/7072/74/9781118522301
_custom.pdf.
3. Technology can also change the way litigation is handled. For instance,
the “miracle of photographic reproduction” markedly reduced the burden of
transporting discovery materials. The Sedona Conference, The Sedona Conference
Commentary on Proportionality in Electronic Discovery, 11 SEDONA CONF. J.
289, 292 (2010). The Sedona Conference is a non-profit research and educational
organization that seeks “the reasoned and just advancement of law” by creating a
“think-tank” setting for leaders of the legal community to discuss current issues in
the legal practice. THE SEDONA CONF., https://thesedonaconference.org/aboutus
(last visited Oct. 27, 2013).
4. Bennett B. Borden et al., Four Years Later: How the 2006 Amendments to
the Federal Rules Have Reshaped the E-discovery Landscape and Are Revitalizing
the Civil Justice System, 17 RICH. J.L. & TECH. 10, 14 (2011) (citing Jason R.
Baron & Ralph C. Losey, e-Discovery: Did You Know?, YOUTUBE (Feb. 11,
2010), http://www.youtube.com/watch?v=bWbJWcsPp1M&feature=player_em
bedded). To put the term “exabyte” in perspective, one exabyte is equal to “a
billion billion bytes.” DICTIONARY.COM, http://dictionary.reference.com/browse
/exabyte?s=t (last visited Oct. 27, 2013).
5. Paul & Baron, supra note 1, at 1–2.
6. Id. at 1 n.2 (citing GEORGE L. PAUL & BRUCE H. NEARON, THE
DISCOVERY REVOLUTION: E-DISCOVERY AMENDMENTS TO THE FEDERAL RULES
OF CIVIL PROCEDURE 4–5 (2d ed. 2006) (stating that companies are retaining
thousands of times more information now than a few decades ago)).
7. The Sedona Conference, The Case for Cooperation, 10 SEDONA CONF. J.
339, 356 (2009) [hereinafter Case for Cooperation].
614 LOUISIANA LAW REVIEW [Vol. 74
ESI during e-discovery can be grueling for lawyers and expensive
for clients.8
In the past, lawyers could conduct effective discovery using only
manual review.9 Now, with the increased amount of information
retained by parties to litigation, using only manual review in e-
discovery is not a realistic option.10 One way lawyers have dealt
with ESI is by using keyword searches, which have become the
norm in e-discovery because they allow lawyers to more easily
navigate through electronic information.11 However, even with
keyword searches, the amount of ESI can still sometimes be
unmanageable.12
Fortunately, there are some strategies that litigants can use to
make searching through ESI more manageable. One strategy
involves new technological tools in e-discovery.13 One of these
tools, called “predictive coding,”14 could “fundamentally change”
8. See Paul & Baron, supra note 1, at 1–2 (describing how the massive
amount of discoverable information has “stressed the legal system” and made
litigation “prohibitively expensive”).
9. See William W. Belt et al., Technology-Assisted Document Review: Is It
Defensible?, 18 RICH. J.L. & TECH. 10, 2 (2012) (discussing how discovery materials
are now sent on hard drives instead of in boxes). Manual review requires humans to
read through documents one at a time and classify them as relevant or irrelevant to
the document request. Maura R. Grossman & Gordon V. Cormack, Technology-
Assisted Review in E-Discovery Can Be More Effective and More Efficient Than
Exhaustive Manual Review, 17 RICH. J.L. & TECH. 11, 2 (2011).
10. See Andrew Peck, Search, Forward: Will Manual Document Review and
Keyword Searches be Replaced by Computer-Assisted Coding?, L. TECH. NEWS
(Oct. 2011), http://www.recommind.com/sites/default/files/LTN_Search_Forward
_Peck_Recommind.pdf (“[T]he volume of electronically stored information . . . has
largely eliminated manual review as the sole method of document review . . . .”); see
also Paul & Baron, supra note 1, at 3 (noting that “[l]itigators can no longer depend
on manual review alone”).
11. The Sedona Conference, The Sedona Conference Best Practices
Commentary on the Use of Search and Information Retrieval Methods in E-
Discovery, 8 SEDONA CONF. J. 189, 200 (2007) [hereinafter Sedona Conference
Best Practices]. In a keyword search, the human searcher inputs “words into a
computer which then retrieves documents within the collection containing the
same words.” This method is also known as “Boolean searching.” MATTHEW D.
NELSON, ESQ., PREDICTIVE CODING FOR DUMMIES 9 (Symantec Spec. Ed., 2012),
available at http://media.wiley.com/assets/7056/00/9781118482377_custom.pdf.
Keyword searches allow more advanced searches using multiple word
combinations and root word derivatives. Id.
12. See Jason R. Baron, Law in the Age of Exabytes: Some Further Thoughts
on ‘Information Inflation’ and Current Issues in E-Discovery Search, 17 RICH.
J.L. & TECH. 9, 10 (2011).
13. See Paul & Baron, supra note 1, at 26.
14. Melissa Whittingham et al., Predictive Coding: E-Discovery Game
Changer?, EDDE J. 11 (2011), available at http://www.cov.com/files/Publication
/9f38beae-2753-481d-b638-55f86c46931f/Presentation/PublicationAttachment/
discovery in litigation involving large amounts of ESI.15 Predictive
coding is a “machine-learning technology” that, with a relatively
small amount of human input, teaches a computer to “predict”
document classification.16 The coding tool uses a man-made
“definition” to make “rules” for classifying documents17 and then
organizes the documents within a larger document collection based on
how well they match the man-made definition and rules.18 The end
result is that lawyers manually review a much smaller set of
documents.19 Predictive coding therefore effectively “alleviat[es] the
need to review whole masses of records in order to find the relevant
few.”20 Most importantly, predictive coding is estimated to reduce e-
discovery costs as much as 45% to 71% while maintaining search
quality.21 Studies suggest that technology-assisted review is no less
accurate than human review.22
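
For readers who want a concrete picture of the process just described, the following is a minimal sketch in Python, offered only as an illustration and not as a description of any vendor's tool. It assumes a toy word-weighting classifier (log-odds with add-one smoothing, in the style of naive Bayes). The function names, the seed set, and the sample documents are all hypothetical. A human codes a small seed set, the program derives word weights from it, and the remaining collection is ordered by how closely each document matches.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase each word and strip simple surrounding punctuation.
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return [w for w in words if w]


def train(seed_set):
    # seed_set: (text, is_relevant) pairs coded by a human reviewer.
    counts = {True: Counter(), False: Counter()}
    for text, is_relevant in seed_set:
        counts[is_relevant].update(tokenize(text))
    vocab = set(counts[True]) | set(counts[False])
    totals = {label: sum(c.values()) for label, c in counts.items()}
    weights = {}
    for word in vocab:
        # Add-one smoothing keeps unseen words from producing log(0).
        p_rel = (counts[True][word] + 1) / (totals[True] + len(vocab))
        p_irr = (counts[False][word] + 1) / (totals[False] + len(vocab))
        weights[word] = math.log(p_rel / p_irr)
    return weights


def score(weights, text):
    # Words never seen in the seed set contribute nothing.
    return sum(weights.get(w, 0.0) for w in tokenize(text))


def rank(weights, collection):
    # Highest predicted relevance first.
    return sorted(collection, key=lambda doc: score(weights, doc), reverse=True)


# Hypothetical seed set coded by a reviewer (True means relevant).
seed_set = [
    ("pricing agreement with competitor about widgets", True),
    ("meeting to discuss widget pricing agreement", True),
    ("lunch menu for the office party", False),
    ("office party parking reminder", False),
]

# Hypothetical uncoded documents from the larger collection.
collection = [
    "reminder: office lunch on friday",
    "draft pricing agreement attached",
    "parking garage closed",
]

weights = train(seed_set)
ranked = rank(weights, collection)
for doc in ranked:
    print(round(score(weights, doc), 2), doc)
```

In this sketch the reviewer would read from the top of the ranked list and could stop once the scores suggest the remaining documents are unlikely to be relevant. Real tools add the sampling and quality-assurance steps that the sources cited here describe.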
6e933c53-08a3-4f05-8962-587348107592/Predictive%20Coding%20-%20E-Dis
covery%20Game%20Changer.pdf. Predictive coding is also known as “automated
document review, automated document classification, automatic categorization,
predictive categorization, and predictive ranking.” Id.
15. Scott Vernick, Predictive Coding: Three Things You Need to Know About
This Year’s Biggest Legal Tech Trend, HUFF POST TECH. BLOG (Aug. 15, 2012,
6:36 PM), http://www.huffingtonpost.com/scott-vernick/three-things-you-need-to-
_b_1773959.html.
16. NELSON, supra note 11, at 7. Predictive coding can also be described as
“technology-assisted review,” which is a search process in which humans use
technology to find responsive documents in a large data collection. Grossman &
Cormack, supra note 9, at 2.
17. Chuck Rothman, What is this Predictive Coding Thing Anyway?,
EDISCOVERYJOURNAL.COM (Mar. 14, 2012, 8:00 AM), http://ediscoveryjournal
.com/2012/03/what-is-this-predictive-coding-thing-anyway/. These “definitions” are
called “classifiers.” Id. Humans review a small set of documents and determine their
relevance to the case’s facts to formulate the definition for the predictive coding tool.
Ari Kaplan & Joe Looby, Advice from Counsel: Can Predictive Coding Deliver on
Its Promise?, FTI CONSULTING TECHN. 1 (2012), available at http://www
.ftitechnology.com/doc/White-Papers/whitepaper-2012-Predictive-Coding-Survey
.pdf [hereinafter Advice from Counsel]. The person actually conducting the coding
process may vary depending on the situation. See infra Part II.
18. Rothman, supra note 17. Several other steps are necessary for the
predictive coding tool to find documents effectively. For example, the searcher
uses an “iterative approach” in the process, which incorporates “document
sampling and quality assurance” checks. Advice from Counsel, supra note 17, at 1.
These steps are discussed further infra Parts II–IV.
19. See Grossman & Cormack, supra note 9, at 2.
20. Rothman, supra note 17.
21. EDISCOVERY INSTITUTE SURVEY ON PREDICTIVE CODING 3 (2010),
available at http://www.discovia.com/wp-content/uploads/2012/07/2010_EDI
_PredictiveCodingSurvey.pdf [hereinafter EDI Survey].
22. See, e.g., Herbert L. Roitblat et al., Document Categorization in Legal
Electronic Discovery: Computer Classification vs. Manual Review, J. AM. SOCY
