EQUAL PROTECTION UNDER ALGORITHMS: A NEW STATISTICAL AND LEGAL FRAMEWORK
Crystal S. Yang
In this Article, we provide a new statistical and legal framework to understand the legality and fairness of predictive algorithms under the Equal Protection Clause. We begin by reviewing the main legal concerns regarding the use of protected characteristics such as race and the correlates of protected characteristics such as criminal history. The use of race and nonrace correlates in predictive algorithms generates direct and proxy effects of race, respectively, that can lead to racial disparities that many view as unwarranted and discriminatory. These effects have led to the mainstream legal consensus that the use of race and nonrace correlates in predictive algorithms is both problematic and potentially unconstitutional under the Equal Protection Clause. This mainstream position is also reflected in practice, with all commonly used predictive algorithms excluding race and many excluding nonrace correlates such as employment and education.
Next, we challenge the mainstream legal position that the use of a protected characteristic always violates the Equal Protection Clause. We develop a statistical framework that formalizes exactly how the direct and proxy effects of race can lead to algorithmic predictions that disadvantage minorities relative to nonminorities. While an overly formalistic solution requires exclusion of race and all potential nonrace correlates, we show that this type of algorithm is unlikely to work in practice because nearly all algorithmic inputs are correlated with race. We then show that there are two simple statistical solutions that can eliminate the direct and proxy effects of race, and which are implementable even when all inputs are correlated with race. We argue that our proposed algorithms uphold the principles of the equal protection doctrine because they ensure that individuals are not treated differently on the basis of membership in a protected class, in stark contrast to commonly used algorithms that unfairly disadvantage minorities despite the exclusion of race.
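To make the discussion of direct and proxy effects concrete, the following stylized sketch uses a linear model of risk prediction. The notation here, a race indicator R_i, nonrace inputs X_i, and a linear functional form, is an illustrative assumption on our part rather than the Article's exact specification, which is developed in Parts III and IV. A benchmark algorithm that includes race predicts
\[
\hat{y}_i \;=\; \hat{\alpha} R_i + X_i \hat{\beta},
\]
where the term \(\hat{\alpha} R_i\) generates the direct effect of race and the correlation between \(X_i\) and \(R_i\) generates the proxy effect, which survives even if \(R_i\) is dropped from the model. On one plausible reading of the two proposed solutions' names, a colorblinding-inputs algorithm would purge each input of its racial component and predict from the residuals,
\[
\tilde{X}_i \;=\; X_i - \mathbb{E}[X_i \mid R_i], \qquad \hat{y}^{\,CB}_i \;=\; \tilde{X}_i \hat{\beta},
\]
while a minorities-as-whites algorithm would score every individual using the input-outcome relationship estimated among white individuals,
\[
\hat{y}^{\,MW}_i \;=\; X_i \hat{\beta}_{\text{white}}.
\]
These expressions are meant only to convey intuition for how the proxy effect might be removed; the Article's actual estimators, and the argument for their legality under the Equal Protection Clause, are developed in Part IV.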
We conclude by empirically testing our proposed algorithms in the context of the New York City pretrial system. We show that nearly all commonly used algorithms violate certain principles underlying the Equal Protection Clause by including variables that are correlated with race, generating substantial proxy effects that unfairly disadvantage Black individuals relative to white individuals. Both of our proposed algorithms substantially reduce the number of Black defendants detained compared to commonly used algorithms by eliminating these proxy effects. These findings suggest a fundamental rethinking of the equal protection doctrine as it applies to predictive algorithms and the folly of relying on commonly used algorithms.
TABLE OF CONTENTS

INTRODUCTION
I. PREDICTIVE ALGORITHMS AND THE EQUAL PROTECTION CLAUSE
   A. Direct Effects of Protected Characteristics
   B. Proxy Effects of Protected Characteristics
   C. Trade-Off Between Fairness and Accuracy
II. PREDICTIVE ALGORITHMS IN THE CRIMINAL JUSTICE SYSTEM
   A. Survey of Predictive Algorithms in the Criminal Justice System
   B. Summary of Predictive Algorithms in the Criminal Justice System
III. A STATISTICAL FRAMEWORK FOR PREDICTIVE ALGORITHMS
   A. Categorizing Algorithmic Inputs
   B. Benchmark Statistical Model
   C. The Direct and Proxy Effects of Algorithmic Inputs
IV. FORMALISTIC AND STATISTICAL SOLUTIONS TO ENSURING RACE NEUTRALITY
   A. Formalistic Solution: The Excluding-Inputs Algorithm
   B. Our First Solution: The Colorblinding-Inputs Algorithm
   C. Our Second Solution: The Minorities-as-Whites Algorithm
   D. Legality of Our Two Statistical Solutions
   E. Racial Disparities Under Our Two Statistical Solutions
V. EMPIRICAL TESTS OF OUR PROPOSED STATISTICAL SOLUTIONS
   A. The New York City Pretrial System
   B. Data Description
   C. Proxy Effects in Commonly Used Algorithms
   D. Comparison of Different Predictive Algorithms
VI. EXTENSIONS
   A. Additional Protected Characteristics
   B. More Complicated Algorithms
   C. Other Contexts
CONCLUSION

INTRODUCTION
There has been a dramatic increase in the use of predictive algorithms in recent years. Predictive algorithms typically use individual characteristics to predict future outcomes, guiding important decisions in nearly every facet of life. In the credit market, for example, these algorithms use characteristics such as an individual's credit and payment history to predict the risk of default, often summarized as a single "credit score." (1) These credit scores are used in almost all consumer-lending decisions, including both approval and pricing decisions for credit cards, private student loans, auto loans, and home mortgages. (2) Credit scores are also widely used in nonlending decisions, such as rental decisions for apartments. (3) In the labor market, predictive algorithms use characteristics such as an individual's past work experience and education to predict productivity or tenure, with employers using these predictions to make hiring, retention, and promotion decisions. (4) In the criminal justice system--the focus of our Article--predictive algorithms use characteristics such as an individual's criminal history and age to predict the risk of future criminal behavior, with these "risk assessments" used to inform pretrial-release conditions, sentencing decisions, and the dispatch of police patrols. (5)
The increasing use of these algorithms has contributed to an active debate on whether commonly used predictive algorithms intentionally or unintentionally discriminate against certain groups, in particular racial minorities and other protected classes. In theory, predictive algorithms have the potential to reduce discrimination by relying on statistically "fair" associations between algorithmic inputs and the outcome of interest. (6) Yet, critics argue that the algorithmic inputs are themselves biased, resulting in violations of the equal protection doctrine and antidiscrimination law. (7) For example, many scholars have raised questions about the growing use of predictive algorithms in making hiring and retention decisions, often arguing that Title VII of the Civil Rights Act of 1964, the primary law prohibiting employment discrimination on the basis of protected characteristics such as race, sex, religion, and national origin, proscribes the use of any such characteristics. (8) In addition, scholars have argued that using even seemingly neutral traits in these algorithms can end up "indirectly determin[ing] individuals' membership in protected classes" and subsequently harm class members if these traits are correlated with protected characteristics. (9) Reflecting these concerns, recent policy proposals regarding algorithms have sought to prohibit the use of protected characteristics, either directly or through proxies. For example, in 2019, the Department of Housing and Urban Development issued a proposal that allows landlords to use a predictive algorithm to screen tenants but prohibits the use of inputs that are deemed to be "substitutes or close proxies" for protected characteristics. (10)
The debate about whether commonly used predictive algorithms discriminate against minorities has been particularly heated in the criminal justice system, where risk-assessment tools are increasingly utilized. (11) Critics of algorithmic risk assessments have argued that use of demographic characteristics such as race or gender in predictive algorithms "amounts to overt discrimination based on demographics and socioeconomic status" and note that use of these characteristics "can be expected to contribute to the concentration of the criminal justice system's punitive impact among those who already disproportionately bear its brunt, including people of color." (12) There are also concerns that seemingly neutral algorithmic inputs such as employment and education may nonetheless result in unwarranted racial disparities because they may serve as proxies for race. (13)
These concerns are echoed in statements made by prominent public officials, including former Attorney General Eric Holder, who argue that "[b]y basing sentencing decisions on static factors and immutable characteristics--like the defendant's education level, socioeconomic background, or neighborhood--they may exacerbate unwarranted and unjust disparities that are already far too common in our criminal justice system and in our society." (14) Even commonly used algorithmic inputs such as current charge and prior criminal history, which many argue are both relevant and legally permissible, (15) may generate unwarranted disparities. For example, an individual's prior criminal history can be driven, at least in part, by racial biases in policing, not just past criminal behavior. In this scenario, using prior arrests as an algorithmic input can result in past discrimination being "baked in" to the algorithm. (16)
In this Article, we provide a new statistical and legal framework to understand the legality and fairness of using protected characteristics in predictive algorithms under the Equal Protection Clause. The framework we develop sheds new light on the main legal and policy debates regarding which individual characteristics should be included in predictive algorithms, particularly those characteristics related to race. The framework is general in nature and applies to any legal setting involving the use of predictive algorithms, but we focus our theoretical and empirical examples on a context where algorithms are increasingly ubiquitous and consequential: the decision of whether defendants awaiting trial should be detained or released back into the community prior to case disposition.
The Article proceeds in six parts. In Part I, we provide an overview of the legal and policy concerns surrounding the use of protected characteristics to make predictions about individuals in the criminal justice system. Protected characteristics are defined as those that can trigger heightened...