The big data jury.

Author: Ferguson, Andrew Guthrie
Position: Abstract through II. Big Data and Bright Data C. Bright Data and Jury Selection 1. Court Systems and Bright Data c. Bright Data and Jury Yield, p. 935-971


Big data technologies now exist to create algorithmically perfect jury pools matching the demographic realities of a community. Big data technologies also exist to provide litigants a wealth of personal information about potential jurors. The question remains whether these technological innovations benefit the jury system. This Article addresses the disruptive impact of big data on jury selection and the dilemma it presents to courts, lawyers, and citizens.


Jury selection requires personal information about citizens. Courts need to know whom to summon. Litigants need to know whom to select. Personal identifying data is central to providing a representative and fair jury. Yet, courts and litigants know very little about individuals called to serve on juries. This institutional ignorance is purposeful, puzzling, and soon to be challenged by ever-expanding "big data" technologies which are currently collecting billions of bits of personal data on American citizens. (1)

This Article addresses the dilemma that big data poses to jury selection. The Article examines the law, practice, and theoretical questions that arise when courts and litigants apply new technologies of data collection to jury selection. Evolving big data information systems have the potential to create perfectly representative jury venires and even generate personalized dossiers on individual jurors. Yet, such informational precision presents real challenges to the existing jury system, offering promises of efficiency and accuracy at the expense of privacy and legitimacy.

The basic problem is one of information. Today, court systems interested in summoning a "fair-cross-section" (2) of citizens know only basic identifying data about citizens--gender, race, age, employment, and limited geographic characteristics. (3) Jurors, as citizens, are considered equal, so long as they meet the statutory requirements of service. (4) In an effort to avoid historically rooted discriminatory practices, courts have limited the data collected about jurors and randomized the selection process. (5) This egalitarian and purposely myopic selection process, while an improvement over past exclusionary practices, has not solved the problem of unrepresentative juries. (6) Constitutional fair cross-section litigation regularly exposes unrepresentative jury pools, and courts have responded with a confused and contradictory body of case law about what constitutes an impartial jury venire. (7)

Litigants equally know very little about individual jurors, although, in contrast to the court, they wish to know as much as possible to predict who might be favorable to their claims. (8) In high profile cases, lawyers hire jury consultants to divine the inclinations and attitudes of potential jurors. (9) In most cases, however, lawyers rely on hunches, stereotypes, and sometimes impermissible judgments based on race, ethnicity, or gender to find perceived partisans for their cause. (10) New methods of online investigation, involving both "Googling" potential jurors and studying social media connections, have generated additional ways to reveal information about potential jurors. (11) Such information derives from sources external to the court system, requiring additional financial expenditures, and is accessible only to those who can afford it. (12)

The result: a jury selection system that is both limiting and unequal. Courts, purposely limited to basic identifying data, randomly select jury pools that do not reflect demographic realities in society. (13) Litigants, practically limited to basic identifying data, choose jurors informed by the rough proxies of race, ethnicity, or gender, resulting in the discriminatory use of peremptory challenges. (14) Only those litigants with the financial means to investigate individual jurors can go beyond rough stereotypes to find out detailed personal information about potential jurors for their case. (15) Limited data collection thus impacts both selection of the jury venire and the actual jury panel in negative ways.

The rise of "big data" has the potential to upend the current informational limitations of jury selection. (16) Big data companies collect, and have collected, public and quasi-public information about most Americans' consumer, criminal, financial, health, political, and reading interests. (17) Google knows you have the flu before the doctor does. (18) Target knows you are pregnant before your friends do. (19) Amazon will soon send you items that you want before you actually order them. (20) According to a Federal Trade Commission Report, commercial data brokers possess as many as 3,000 data points on every American consumer, segmented into household categories such as "Affluent Baby Boomer," "Bible Lifestyle," "Leans Left," or "African American Professional." (21) If tasked to do so by court administrators, these companies could produce a fair cross-section of individuals that not only represent the racial and gender demographics of a jurisdiction, but also a fair cross-section, as measured by class, age, geography, political affiliation, and even consumer interests or hobbies. (22)

This Article examines the dilemma that "big data" presents to those responsible for summoning and selecting jurors. Commercial providers possess, and government databases contain, better, more targeted, but very personal data in easily accessible formats. (23) I call this information, focused on individuals or groups, "bright data"--"bright" because it is smart (precise and targeted) and because it is illuminating (revealing preferences and patterns). Courts could use this data to select the larger jury venire, and litigants could use this data to select the particular jury panel. For court administrators, the availability of additional information provides the potential for increased jury diversity, beyond the rough categories of race, gender, and geography. For litigants, the available information could provide a wealth of insights once only available from expensive jury consultants. Big data could democratize access to information about jurors, leading to more diverse juries and jury venires, and potentially less discriminatory jury selection practices.

At the same time, court usage of big data technology carries real risks. Traditional jury roles and values, including the continued legitimacy of the jury system itself, are at stake. (24) Big data threatens to disrupt, improve, and re-imagine jury selection, just as it will affect other areas of our lives. Increased big data collection of personal information involves an invasion of privacy that, if embraced by the court system, could result in significant backlash against jury service. Issues of juror privacy continue to increase as new technologies allow lawyers and the public to learn about the nameless, faceless citizens called to service. (25) Potential jurors might find the idea of the court having access to this personal data (even if otherwise publicly accessible) so unpalatable as to undermine cooperation with the existing jury system. (26) In addition, the idea of the juror constituting a representative "data point," rather than serving as a representative of the larger community, runs counter to the traditional ideal of the juror as the community conscience. (27) This affirmative targeting of classes of jurors also presents thorny constitutional issues, as considerations of race, gender, or ethnicity could run into equal protection problems. (28) Equalizing the availability of big data information about jurors, and making it a part of the jury selection system, raises practical, theoretical, and constitutional dilemmas which must be addressed.

Part I of this Article looks at the longstanding problems of diversity and discrimination in the jury selection process, with a focus on how the existing system limits the available information about jurors. (29) Currently, the jury selection system relies on very basic, almost uninformed data--here termed "dim data." Courts use this purposely shadowed and opaque dim data in an effort to avoid past discriminatory practices and equal protection scrutiny. (30) This approach necessarily limits the information available to courts and the parties. With the exception of Sixth Amendment fair cross-section challenges and Fourteenth Amendment equal protection challenges, courts choose to operate largely in the dark when it comes to who serves on juries. (31) Further, even within fair cross-section challenges, courts have disagreed on the appropriate analytical approach to remedy the defect of unrepresentative jury venires. (32)

Compounding the issue, there exists the problem of discrimination in the peremptory challenge stage. (33) While formal racial discrimination ended with Batson v. Kentucky, (34) and formal gender discrimination ended with J.E.B. v. Alabama, (35) scholars and practitioners acknowledge that race and gender considerations still influence the use of peremptory strikes. (36) These discriminatory practices result, in part, from the lack of information provided to lawyers about jurors. (37) Simply put, one reason why lawyers strike jurors based on the proxies of racial or gender stereotypes is because no better information is available. Consequently, despite great advances in creating larger, more diverse jury pools, concerns about racial or gender diversity within actual jury panels remain. (38)

Part II looks at the potential of "bright data" to respond to the problems of diversity and discrimination by providing more detailed information about potential jurors. This personal information already exists, is easily accessible, and could improve the diversity of jury venires and jury panels. Big data knows the demographic makeup of a community better than the court system does. (39) Big data companies regularly target zip codes, neighborhoods, households and even individuals based on numerous data points. (40)...
