With video streaming and recording technology now integrated into nearly all facets of public and private interaction, it is not unreasonable to believe that most human activity will soon be captured as video data. In addition to the many public and private video cameras that constantly monitor traffic intersections, storefronts, and neighborhoods, many law enforcement personnel now use body-worn cameras (BWCs) to generate video data during criminal encounters, providing objective evidence that once resided only in the memory of witnesses, perpetrators, and officers. The tremendous number of man-hours required to evaluate video data can disincentivize the use of BWCs while impeding the prosecutor's ability to bring charges and litigate efficiently. (1) Given the tendency towards "paralysis by analysis" (2) when confronting large amounts of data, a modern solution is to use third-party software to classify and sort BWC video files into manageable and coherent "clips", which serves to: 1) improve the recall, precision, and speed of evidentiary review compared to traditional manual review; 2) reserve manual review hours for high-value "clips" with a high probability of containing material information; and 3) reduce the prosecutorial risk of committing Brady or ethical violations.
METHODOLOGIES FOR DATA ASSESSMENT USING ALGORITHMS
Any algorithm that would replace the majority of manual video review would need a superior methodology for identifying and presenting "clips" with a high probability of containing: 1) a "smoking gun" that implicates the alleged perpetrator(s); and 2) material evidence "favorable to the defense, either because it is exculpatory or... impeaching". (3) The efficacy of extracting relevant information from large data sets is typically measured by "Recall (percentage of relevant documents retrieved) and Precision (percentage of retrieved documents [that are relevant])". (4) It is the second category of evidence, that which is favorable to the defense, which presents a unique dilemma for prosecutors, who have a constitutional duty to avoid a Brady violation by "[making] available... all existing material or information that tends to mitigate or negate the defendant's guilt," along with an additional ethical duty to "make timely disclosure" of such information. (5,6)
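The two measures named above can be made concrete with a short worked example. The following is a minimal sketch in Python, not a description of any particular review software: the function names, the clip identifiers, and the counts are all hypothetical and chosen only to illustrate the arithmetic.

```python
def recall(relevant: set, retrieved: set) -> float:
    """Share of all relevant items that the search actually retrieved."""
    if not relevant:
        return 0.0
    return len(relevant & retrieved) / len(relevant)


def precision(relevant: set, retrieved: set) -> float:
    """Share of retrieved items that are in fact relevant."""
    if not retrieved:
        return 0.0
    return len(relevant & retrieved) / len(retrieved)


# Hypothetical case file: clips are identified by integer IDs.
# Assume 50 clips are truly relevant (IDs 0-49) and a search
# returns 40 clips (IDs 20-59), 30 of which are relevant.
relevant_clips = set(range(0, 50))
retrieved_clips = set(range(20, 60))

print(f"Recall:    {recall(relevant_clips, retrieved_clips):.0%}")
print(f"Precision: {precision(relevant_clips, retrieved_clips):.0%}")
```

In this hypothetical, the search finds 30 of the 50 relevant clips (Recall of 60%), and 30 of the 40 clips it returns are relevant (Precision of 75%). The 20 relevant clips that were never retrieved are the ones that matter most for disclosure purposes, since a reviewer who looks only at the search results would never see them.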
To illustrate lawyers' tendency to overestimate the efficacy of their own database queries, a 1985 study demonstrated that lawyers using paralegals to query a database of full-text electronic documents retrieved only 20% of the relevant documents (Recall), even though the lawyer's goal and belief was that his paralegals had retrieved 75% of the documents material to the search criterion. (7) Furthermore, the search Precision was 79%, meaning that 21 of every 100 documents retrieved by the search were ultimately deemed irrelevant. The belief that humans can extract relevant information from a large data set more effectively than a machine is a fallacy that persists to this day. (8) The lawyer in this study had a good faith, albeit grossly mistaken, belief that his team had extracted 75% of all documents relevant to the query, the minimum...