Thinking outside the box: use predictive coding as a RIM tool.

Author: Smith, Doug
Position: RIM FUNDAMENTALS - Records and information management

Predictive coding is causing quite a stir, receiving increased attention in the past year due in large part to two cases being litigated in New York and Illinois. (See the sidebar "Predictive Coding in the News.") But what is predictive coding, and is it really a new technology?

Sharon D. Nelson, Esq., on her "Ride the Lightning" blog, offers a comprehensive definition of predictive coding that helps make it clear that it is not new. She says predictive coding is a "combination of technologies and processes in which decisions pertaining to the responsiveness of records gathered or preserved for potential production purposes ... are made by having reviewers examine a subset of the collection and having the decisions on those documents propagated to the rest of the collection without reviewers examining each record."

E-discovery systems have been using processes like this for many years. Predictive coding was developed because the proliferation of electronically stored information (ESI) being created and stored made it extremely difficult and expensive to locate relevant information that needed to be preserved and produced for litigation and investigations. So, while there have been new developments in the technology, predictive coding is not a new process.

Explaining Predictive Coding

Most predictive coding processes operate in one of two fundamental ways: either through sampling or through observation.

Sampling is done by computer software, which randomly selects a subset of electronic records from all those available, presents it to a human coder for review, monitors the coder's decisions, notes the characteristics of the records that are coded (e.g., date, recipients, author, subject, and keywords), and then uses these recorded decisions to predict the value of the remaining documents in the collection.
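The sampling steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sample_and_predict`, the representation of each record as a set of keywords, and the simple keyword-count scoring are all hypothetical stand-ins, not the method used by any particular e-discovery product.

```python
import random
from collections import Counter


def sample_and_predict(records, human_coder, sample_size, seed=0):
    """Code a random sample by hand, then predict the rest.

    records: dict mapping a record id to a set of keywords (assumed format).
    human_coder: callable that returns True when a record is responsive.
    """
    rng = random.Random(seed)
    # Step 1: randomly select a subset of the collection.
    sample_ids = rng.sample(sorted(records), sample_size)

    # Step 2: present the sample to the human coder and note which
    # characteristics appear in responsive and non-responsive records.
    responsive = Counter()
    not_responsive = Counter()
    decisions = {}
    for record_id in sample_ids:
        decision = human_coder(records[record_id])
        decisions[record_id] = decision
        (responsive if decision else not_responsive).update(records[record_id])

    # Step 3: use the recorded decisions to predict the remaining records.
    # A positive score means the keywords were seen more often in
    # responsive samples than in non-responsive ones.
    predictions = {}
    for record_id, keywords in records.items():
        if record_id in decisions:
            continue
        score = sum(responsive[k] - not_responsive[k] for k in keywords)
        predictions[record_id] = score > 0
    return decisions, predictions
```

A production system would track many more characteristics (date, recipients, author, subject) and use a proper statistical model, but the flow is the same: sample, record the human decisions, and propagate them.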

In the observation process, the coding system monitors human coders' actual decisions as they review records, begins to predict how a record will be coded before presenting it for coding, and then compares the predicted coding to the actual coding. Eventually, the system's predictive coding will reach the level of accuracy that was predetermined to be sufficient. At this point, the system can be used to predict the coding decisions automatically.
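The observation process can be sketched in the same way. Again, this is a minimal illustration rather than a real product's algorithm: the name `observe_until_accurate`, the keyword-count scoring, and the idea of measuring accuracy over a sliding window of recent decisions are assumptions made for the example.

```python
from collections import Counter, deque


def observe_until_accurate(records, human_coder, target_accuracy=0.9, window=5):
    """Watch a human coder until predictions are accurate enough,
    then code the remaining records automatically.

    records: dict mapping a record id to a set of keywords (assumed format).
    Returns a dict of record id -> (source, decision), where source is
    "human" or "predicted".
    """
    responsive = Counter()
    not_responsive = Counter()
    recent = deque(maxlen=window)  # True where a prediction matched the human

    def predict(keywords):
        return sum(responsive[k] - not_responsive[k] for k in keywords) > 0

    coded = {}
    pending = list(records.items())
    for index, (record_id, keywords) in enumerate(pending):
        # Once the recent window is full and accurate enough, stop asking
        # the human and predict everything that is left.
        if len(recent) == window and sum(recent) / window >= target_accuracy:
            for other_id, other_keywords in pending[index:]:
                coded[other_id] = ("predicted", predict(other_keywords))
            break
        guess = predict(keywords)        # predict before the human sees it
        actual = human_coder(keywords)   # the human's actual decision
        recent.append(guess == actual)   # compare predicted to actual coding
        (responsive if actual else not_responsive).update(keywords)
        coded[record_id] = ("human", actual)
    return coded
```

The `target_accuracy` parameter plays the role of the predetermined accuracy level mentioned above; if that level is never reached, every record in this sketch is simply coded by the human.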

Neither sampling nor the observation process relies on the computer to know anything. Each uses human decisions as a calibrating mechanism to learn about the coding details, and each could be used by an...
