Look before you leap into predictive coding: an argument for a cautious approach to utilizing predictive coding.

Author: Vaccaro, Charles

  I. INTRODUCTION
  II. BACKGROUND
    A. Discovery in General
    B. Electronic Discovery
    C. Predictive Coding and Its Methodology
    D. Noteworthy Cases Grappling with Predictive Coding
      1. Da Silva Moore v. Publicis Groupe SA
      2. EORHB, Inc. v. HOA Holdings LLC
      3. Global Aerospace v. Landow Aviation
  III. THE IMPORTANCE OF PREDICTIVE CODING AND E-DISCOVERY
  IV. AN ARGUMENT FOR A CAUTIOUS APPROACH TO THE IMPLEMENTATION OF PREDICTIVE CODING IN JUDICIAL PROCEEDINGS
    A. Hindrance to Cost Saving Capabilities
    B. Attorneys' Justified Hesitation
    C. Case Law Provides Little Guidance
    D. De Facto Cooperation Requirement and Associated Problems
    E. Limitations to Predictive Coding
  V. FACTORS THAT WOULD ENABLE ONE TO MAXIMIZE THE VALUE OF PREDICTIVE CODING TECHNOLOGY
  VI. CONCLUSION

  I. INTRODUCTION

    Document review is a dreaded process that causes attorneys to cringe. With the rise in the creation and storage of electronic documents, document review during discovery has become particularly onerous and expensive. (1) Some assert that predictive coding, an electronic discovery device, has arrived as a knight in shining armor to rescue the legal community from this increasingly burdensome and costly process. (2) Predictive coding is a device whereby computer software is "trained" to determine which documents are relevant and non-privileged in a large document population. (3) The software creates complex algorithms based on its training and uses them to automatically code documents. (4) The software replaces much, though not all, of the human review traditionally needed in the document review process. The purported benefit of predictive coding is that, if used properly, it enables attorneys to manually review significantly fewer documents without sacrificing accuracy, consequently reducing discovery costs. (5)

    The use of predictive coding is currently one of the hottest issues in e-discovery, and some assert that it could fundamentally change the way e-discovery is conducted. (6) There is a stark contrast between those who advocate for the widespread adoption of this e-discovery device and those who are very skeptical of its potential positive impact on the e-discovery landscape. The predictive coding debate is particularly relevant today because, due to advancements in technology, individuals and businesses alike have generated an explosion of electronically-stored information. This means, in a litigation context, that thousands, even millions, of electronic documents can be produced at the outset of a lawsuit. This high volume of documents significantly raises discovery costs since attorneys now have more information to review than ever before. (7) As noted, many assert that predictive coding could be the answer to these outrageous e-discovery expenses. However, before an attorney makes the leap onto the predictive coding bandwagon, he should consider that predictive coding is still a young technology and that many overlooked pitfalls associated with its use could offset any potential economic benefit.

    This Note will argue that attorneys should exercise extreme caution before adopting predictive coding as their e-discovery tool, and it will do so by shedding light on the often neglected drawbacks associated with the technology's use. An exhaustive cost-benefit-risk analysis should be conducted before one invests significant time and money into predictive coding. This is important because parties must choose the e-discovery method best suited to their current cause of action and use it properly. Notwithstanding the drawbacks, there are circumstances in which predictive coding can be used to maximize its potential benefits. This Note will also set forth factors that parties should look for to determine whether to use predictive coding.

    In sum, Part II of this Note will: A) provide background information on discovery in general, B) describe what e-discovery is and the most frequently implemented methods, C) introduce what predictive coding is and its methodology, and D) outline noteworthy case law grappling with the use of predictive coding. Part III will emphasize the importance of this e-discovery issue. Part IV presents the argument for a cautious approach to the implementation of predictive coding in judicial proceedings. Finally, Part V proposes factors that should exist for an attorney to maximize the value of predictive coding if he ultimately must use the software.


  II. BACKGROUND

    A. Discovery in General

      To fully appreciate and understand predictive coding, it is first important to have a basic understanding of discovery in general. In the life cycle of a civil lawsuit, discovery occurs during the pretrial stage, which is the intermediate stage between the pleadings (8) and trial. (9) Discovery is conducted by parties to uncover information that will help them determine the strength of their case. (10) At the conclusion of discovery, parties should know whether their best course of action is to settle, file for summary judgment, or proceed to trial. (11) For this reason, the importance of discovery cannot be overstated.

      To be obtainable from a party during discovery, information must be relevant and non-privileged. (12) Information is relevant if that information "tends to prove or disprove something the governing substantive law says matters." (13) Privileged information is information the law generally does not allow a party to gain access to because of its source, such as information protected by the attorney-client privilege (although that privilege can be waived). (14)

      Accordingly, relevance and non-privilege are referred to as the two main gateways to receiving discoverable materials. (15) The main discovery devices that parties implement are depositions, interrogatories, document requests, medical examinations, and requests for admissions. (16) Predictive coding and other electronic discovery methods deal with the parties' requests for production of electronic documents. (17)

    B. Electronic Discovery

      Electronic discovery, also known as e-discovery, refers to the process of collecting, reviewing, and exchanging information in electronic format (frequently referenced as electronically-stored information or "ESI"). (18) ESI includes (but is not limited to) "emails, documents, presentations, databases, voicemail, audio and video files, social media, and web sites." (19) Advancements in technology have led to an eruption of ESI created and hoarded by corporations and individuals, and litigators and in-house counsel within legal departments face a real challenge in curbing the growing cost of discovery. (20) "The world now holds twice as many bytes of data as there are liters of water in all its oceans[.]" (21) In 2009, there were an estimated 247 billion email messages sent, and this number is expected to more than double in 2013. (22) It is estimated that corporate workers, on average, send and receive in excess of 110 e-mail messages per day. (23) Strikingly, ninety percent of the data currently in the world has been produced in the last two years alone. (24) Hence, thousands (even millions) of electronic documents can be produced during discovery, and such production would require counsel to bill hundreds upon hundreds of hours just to sift through those documents. (25)

      Currently, there are many strategies that attorneys and legal departments use to reduce e-discovery expenses. (26) These strategies include: 1) retaining contract attorneys to manually review and code each document, 2) "limiting the number of custodians and data sources processed[,]" and 3) using technological methods like keyword searches and concept searches to reduce the number of potentially responsive documents to a manageable level. (27) It is important to understand both keyword searching and concept searching, as both are popular e-discovery methods and are alternatives to using predictive coding. (28)

      Keyword searching (also known as phrase-based searching) is the most frequently used textual search method in e-discovery. (29) Under this method, keywords and connectors are inputted into a computer program (30) that searches through an information database or document population for occurrences of the words in the search query. (31) This method operates like commonly used internet search engines. Litigators use keyword searches to review vast databases of documents, often emails. The typical purpose for using keyword searches is simply to reduce the document population to a practicable number so that manual review of the documents can be conducted in a reasonable amount of time and at a reasonable expense. (32)
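
      The mechanics of a keyword search with connectors can be sketched in a few lines of code. The following is a minimal illustration only, not a description of any commercial e-discovery product: the function names, the sample emails, and the choice of search terms are all hypothetical, and the connectors are limited to a simple AND/OR.

```python
import re


def tokenize(text):
    """Lowercase a document and return the set of words it contains."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def keyword_search(documents, all_of=(), any_of=()):
    """Return the IDs of documents that contain every term in all_of (AND)
    and, when any_of is supplied, at least one term in any_of (OR)."""
    hits = []
    for doc_id, text in documents.items():
        words = tokenize(text)
        has_all = all(term.lower() in words for term in all_of)
        has_any = not any_of or any(term.lower() in words for term in any_of)
        if has_all and has_any:
            hits.append(doc_id)
    return hits


# Hypothetical document population (illustrative text only).
documents = {
    "email-1": "Please review the merger agreement before Friday.",
    "email-2": "The acquisition contract is attached for signature.",
    "email-3": "Lunch on Friday? The agreement on pizza is unanimous.",
}

# Query: "agreement" AND ("merger" OR "friday")
hits = keyword_search(documents, all_of=["agreement"], any_of=["merger", "friday"])
print(hits)  # ['email-1', 'email-3']
```

      In this toy population the query returns the first and third emails. The second email, which discusses the same kind of transaction in different words, is not returned, while the third email is returned even though it concerns lunch plans.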

      Although keyword searches can be useful, they have many disadvantages. First, keyword searches are only as good as the search terms and connectors used. (33) The software will only look for the specific words that were inputted, so if the documents contain wording an attorney failed to anticipate, the search will not bear any fruit. Second, keyword-based searches are often over-inclusive (finding a large portion of irrelevant documents), and they can produce a large proportion of false positives ("documents that are identified as potentially relevant that really are not"). (34) Last, for the reasons stated above, "keyword searches are also not very effective or accurate in finding relevant ESI." (35) Extensive manual review is often still necessary, and this increases the time and money involved in finding relevant documents. (36)
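
      The two failure modes described above, false positives and missed documents, can be quantified once a human reviewer has coded the same documents. The sketch below is a hedged illustration that uses the standard information-retrieval measures of precision (the share of search hits that are truly relevant) and recall (the share of truly relevant documents that the search found). The function name and the document IDs are hypothetical and are not drawn from any case or study.

```python
def review_search_results(hits, relevant):
    """Compare search hits against documents a human coded as relevant."""
    hits, relevant = set(hits), set(relevant)
    true_positives = hits & relevant   # hits that really are relevant
    false_positives = hits - relevant  # hits that really are not relevant
    missed = relevant - hits           # relevant documents the search never found
    precision = len(true_positives) / len(hits) if hits else 0.0
    recall = len(true_positives) / len(relevant) if relevant else 0.0
    return {
        "false_positives": sorted(false_positives),
        "missed": sorted(missed),
        "precision": precision,
        "recall": recall,
    }


# Hypothetical example: four documents hit the search terms,
# but a reviewer coded only doc1 and doc5 as relevant.
result = review_search_results(
    hits=["doc1", "doc2", "doc3", "doc4"],
    relevant=["doc1", "doc5"],
)
print(result["precision"])  # 0.25
print(result["recall"])     # 0.5
```

      In this made-up example, three of the four hits are false positives that someone must still review manually, and one relevant document was never retrieved at all.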

      Concept searching, another popular e-discovery device, is an automated search method that goes further than keyword searching. (37) As opposed to simply focusing on the exact word used, like a keyword search does, concept searching focuses on the underlying meaning, subject matter, and rhetoric behind the word or words...
