MUCH OF THE controversy over algorithmic decision making is concerned with fairness. Generally speaking, most of us regard decisions as fair when they're free from favoritism, self-interest, bias, or deception and when they conform to established standards or rules. It turns out, however, that defining algorithmic fairness is not always simple.
That challenge garnered national headlines in 2016 when ProPublica published a study claiming racial bias in COMPAS, a recidivism risk assessment system used by some courts to evaluate the likelihood that a criminal defendant will reoffend. The journalism nonprofit reported that COMPAS was twice as likely to mistakenly flag black defendants as being at a high risk of committing future crimes (false positives) and twice as likely to incorrectly label white defendants as being at a low risk of the same (false negatives).
Because the system is sometimes used to determine whether or not an inmate is paroled, many black defendants who would not have been re-arrested remain in jail, while many white defendants who will be re-arrested are let go. This is the very definition of disparate impact, or discrimination in which a facially neutral practice has an unjustified adverse impact on members of a protected class. Under that standard, ProPublica declared the outcome unfair.
The COMPAS software's developers countered with data showing that black and white defendants with the same COMPAS scores had almost exactly the same recidivism propensities. For example, their algorithm correctly predicted that 60 percent of white defendants and 60 percent of black defendants with COMPAS scores of seven or higher on a 10-point scale would reoffend during the next two years (predictive parity). The developers argued that the COMPAS results were therefore fair because the scores mean the same thing regardless of whether a defendant is black or white. The trouble is that, because there is a difference in the recidivism base rate between blacks and whites, a system with predictive parity will necessarily produce racially disparate rates of false positives and negatives.
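The arithmetic behind that conflict can be sketched with a toy example. The counts below are invented for illustration and are not COMPAS data; the `Group` class, its method names, and the labels "group A" and "group B" are likewise hypothetical. Both groups are built so that a high-risk flag is right 60 percent of the time and a low-risk label is wrong 30 percent of the time, but group A reoffends at a 50 percent base rate and group B at 40 percent.

```python
from dataclasses import dataclass


@dataclass
class Group:
    """Confusion-matrix counts for one hypothetical group of defendants."""
    true_pos: int   # flagged high risk, went on to reoffend
    false_pos: int  # flagged high risk, did not reoffend
    false_neg: int  # labeled low risk, went on to reoffend
    true_neg: int   # labeled low risk, did not reoffend

    def base_rate(self) -> float:
        """Share of the whole group that reoffends."""
        total = self.true_pos + self.false_pos + self.false_neg + self.true_neg
        return (self.true_pos + self.false_neg) / total

    def precision(self) -> float:
        """Share of high-risk flags that were correct (what predictive parity compares)."""
        return self.true_pos / (self.true_pos + self.false_pos)

    def false_positive_rate(self) -> float:
        """Share of non-reoffenders wrongly flagged as high risk."""
        return self.false_pos / (self.false_pos + self.true_neg)

    def false_negative_rate(self) -> float:
        """Share of reoffenders wrongly labeled low risk."""
        return self.false_neg / (self.false_neg + self.true_pos)


# Invented counts, 3,000 people per group. These are NOT COMPAS figures.
groups = {
    "group A": Group(true_pos=1200, false_pos=800, false_neg=300, true_neg=700),
    "group B": Group(true_pos=600, false_pos=400, false_neg=600, true_neg=1400),
}

for name, g in groups.items():
    print(f"{name}: base rate {g.base_rate():.0%}, "
          f"flags correct {g.precision():.0%}, "
          f"false positives {g.false_positive_rate():.0%}, "
          f"false negatives {g.false_negative_rate():.0%}")
```

In this made-up data the score "means the same thing" in both groups, yet group A's false positive rate is about 53 percent against about 22 percent for group B, while group B's false negative rate is 50 percent against 20 percent for group A. Equalizing those error rates would require the flags to be right at different rates in the two groups.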
The controversy over COMPAS highlights the tension between notions of individual fairness and group fairness, which can be impossible to reconcile. In fact, the Princeton computer scientist Arvind Narayanan has catalogued 21 different definitions of fairness, not all of which can be satisfied at once.
In 2015, Eric Loomis, a man who had been convicted of eluding...