Reuniting 'is' and 'ought' in empirical legal scholarship.

Author: Fischman, Joshua B.
Position: IV. Measuring the Rule of Law: Studies of Interjudge Disparity through Conclusion, with footnotes, p. 146-168
  IV. MEASURING THE RULE OF LAW: STUDIES OF INTERJUDGE DISPARITY

    A central feature of the rule of law is that the application of legal force is governed by publicized rules rather than "the predilections of the individual decisionmaker." (150) A large body of empirical research has sought to measure the degree to which systems of adjudication deviate from this ideal. Such studies have typically documented statistical disparities among judges--differences in their rates of reaching various types of dispositions--and concluded that the rule of law is violated. Such claims are typically followed by calls for legal or institutional reform.

    The earliest example of such a study may be an annual report published by the criminal magistrates of New York City in 1914, (151) which provided detailed figures depicting the magistrates' conviction rates for various offenses. The magistrates reported large interjudge disparities in cases involving public intoxication, vagrancy, disorderly conduct, and peddling without a license, but more modest disparities in cases involving cruelty to animals and violations of the motor vehicle laws. (152)

    The magistrates were not trying to advance any grand theories about law or adjudication; they merely hoped that publication of the statistics would help the magistrates "recognize [their] own personal peculiarities" and "correct any that cannot be justified in light of the records of [their] associates." (153) But their reports provided inspiration to legal realists such as Jerome Frank (154) and to political scientists such as Charles Grove Haines, (155) who viewed the results as confirmation that adjudication was inevitably idiosyncratic.

    Since the publication of the magistrates' report in 1914, numerous studies have documented significant interjudge disparities in cases involving criminal law, (156) social security disability claims, (157) and asylum adjudication. (158) The original magistrates' report had modest normative goals, but many of these later studies advocated bold reforms. Disparity studies provided much of the impetus for the enactment of the U.S. Sentencing Guidelines (159) and the disability grid for social security disability claims. (160) More recently, scholars have been debating proposed reforms to address disparities in asylum adjudication. (161) Yet despite the large number of disparity studies that have been conducted and the prominence of the policy claims that have been advanced, there has been surprisingly little discussion about how observable disparities relate to normatively significant concepts.

    1. The Normative Implications of Disparity

      In the century since the magistrates released their annual report, the methodology of disparity studies has barely changed. The studies count judges' decisions and report rates at which they reach various types of outcomes. Whenever disparities are found, the authors conclude that some reform is needed. Yet only a few of these studies have acknowledged that these statistical disparities by themselves do not have intrinsic normative significance. As one study of social security disability adjudication observed:

      Two judges with different [rates of reversing disability determinations] are probably behaving differently. But the reverse is not necessarily true: there is no reason to exclude the possibility that two judges with 50 percent [rates] are also behaving differently. Indeed, the likelihood is great that the existing statistics mask an indeterminate additional amount of nonuniformity in the judge-to-judge handling of [social security] claims. (162)

      Thus, although large disparities among judges are problematic, small disparities do not necessarily indicate that a system of adjudication is functioning well. If two social security judges were deciding cases using coin flips, there would be no disparity, since both would reverse agency determinations 50% of the time. This means that any existing disparities in grant rates could be eliminated by ordering all judges to flip coins. The absurdity of such a proposal demonstrates that eliminating statistical disparity is not itself a worthy goal. Statistical disparity is only of interest insofar as it can shed light on other values.

      To understand the normative implications of these studies, it is necessary to articulate the values at stake and to explain how they relate to the measurable statistics. Some of the prior scholarship has made efforts to identify the relevant values, such as consistency, correctness, determinacy, fairness, predictability, non-arbitrariness, and the rule of law. (163) But there has been little effort to explain how these values can be measured using available data. In fact, the relationships between these values and measurable statistics can be quite complex.

    2. Consistency, Predictability, and Comparative Justice

      Statistical disparities are of interest in part because they provide evidence of interjudge inconsistency--meaning that some cases would have been decided differently if they had been assigned to different judges. Indeed, many discussions of interjudge disparity focus on inconsistency as a normative concept. (164) Inconsistency has normative significance for two distinct reasons. The first is that it diminishes the predictability of adjudication. The rule of law requires that people have notice regarding how the law will be applied so that they can conform to its requirements and plan their affairs accordingly. (165) Notice will necessarily be inadequate to the extent that the application of the law depends upon which judge is deciding each case. (166)

      Inconsistency among judges also implicates comparative justice. (167) Some legal rights may be comparative, in the sense that "a person's due is determinable only by reference to his relations to other persons." (168) In the sentencing context, for example, moral or legal principles may determine that two offenders are equally culpable and should therefore receive the same sentence, even if those principles do not uniquely determine what that sentence should be. If two such offenders receive different sentences only because they were sentenced by different judges, such a result would constitute a violation of comparative justice.

      Interjudge inconsistency, however, only captures one aspect of comparative justice. If two judges fail to treat like cases alike in precisely the same way--perhaps by exhibiting the same degree of racial bias--then they could be perfectly consistent with each other yet still violate comparative justice. Nevertheless, inconsistency provides some evidence of comparative injustice when the cases under examination present common legal or factual patterns.

      Interjudge inconsistency appears to have an intuitive relationship with observable data. If two social security judges are granting benefits to claimants at very different rates, then they probably are treating the claimants inconsistently. Yet the relationship between measurable statistical disparity and inconsistency is far more complex than the disparity studies have acknowledged. (169) The difference between the judges' grant rates only determines lower and upper bounds for inconsistency, but cannot identify the precise level. Suppose, for example, that Judge A grants benefits to 30% of claimants and Judge B grants benefits to 40% of claimants. If these two judges saw a comparable mix of cases, then it follows that they would have reached different results in at least 10% of the cases. There is no reason, however, to presume that they would have disagreed exactly 10% of the time. In fact, they could have disagreed as much as 70% of the time if they would have granted benefits to entirely different sets of claimants.
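      The arithmetic behind these bounds can be checked with a short sketch. It assumes, as the example does, that both judges see a comparable mix of cases and that their grant rates are known exactly. The function name `inconsistency_bounds` and its parameters are illustrative only and do not come from any of the disparity studies.

```python
def inconsistency_bounds(rate_a: float, rate_b: float) -> tuple:
    """Bounds on the share of cases two judges would decide differently.

    Assumes both judges draw from a comparable mix of cases and that
    each grant rate is known exactly (both assumptions of the example).
    """
    if not (0.0 <= rate_a <= 1.0 and 0.0 <= rate_b <= 1.0):
        raise ValueError("grant rates must lie between 0 and 1")
    # Least disagreement: the stricter judge's grants are a subset
    # of the more lenient judge's grants.
    lower = abs(rate_a - rate_b)
    # Most disagreement: the two sets of grants overlap as little as
    # possible (not at all when the rates sum to 1 or less).
    upper = min(rate_a + rate_b, 2.0 - rate_a - rate_b)
    return lower, upper


lower, upper = inconsistency_bounds(0.30, 0.40)
print(f"disagreement between {lower:.0%} and {upper:.0%}")
```

      For Judge A at 30% and Judge B at 40%, the sketch reproduces the 10% and 70% figures given above. The second term in the upper bound matters only when the two grant rates sum to more than 100%, in which case some overlap in grants is unavoidable.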

      Without more information, it is impossible to know whether the rate of inconsistency between Judges A and B is closer to 10% or 70%. Measuring inconsistency requires not only the judges' grant rates, but also the degree to which their decisions are correlated. There are no data that can provide estimates of correlation, however, because the two judges are never observed deciding the same case. Conceivably, one could administer surveys to Judge A and Judge B and compare their reactions to identical fact patterns, which could then be used to compute the correlation between their decisions. A few studies did administer such surveys in the 1970s and early 1980s, (170) but to my knowledge, no recent disparity study has sought to measure the correlation of judges' decisions using surveys.
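      To illustrate why the correlation is the missing ingredient, the following sketch computes the rate of inconsistency once a correlation is supplied. It assumes that "correlation" means the ordinary Pearson correlation between the two judges' binary grant/deny decisions on the same cases; that reading, and the function name `disagreement_rate`, are assumptions made for illustration rather than anything specified in the studies discussed.

```python
import math


def disagreement_rate(rate_a: float, rate_b: float, correlation: float) -> float:
    """Share of cases two judges would decide differently.

    Assumes `correlation` is the Pearson correlation between the judges'
    binary decisions on identical cases (an illustrative assumption).
    """
    spread = math.sqrt(rate_a * (1 - rate_a) * rate_b * (1 - rate_b))
    # Share of cases in which both judges would grant benefits.
    both_grant = rate_a * rate_b + correlation * spread
    # Not every correlation is attainable for a given pair of grant rates.
    least = max(0.0, rate_a + rate_b - 1.0)
    most = min(rate_a, rate_b)
    if not (least - 1e-12 <= both_grant <= most + 1e-12):
        raise ValueError("correlation not attainable for these grant rates")
    # Disagreement = A grants and B denies, plus B grants and A denies.
    return rate_a + rate_b - 2.0 * both_grant


# Judges A and B from the example, deciding independently of each other.
print(round(disagreement_rate(0.30, 0.40, 0.0), 2))
# Two judges flipping independent coins: no disparity, yet frequent disagreement.
print(round(disagreement_rate(0.50, 0.50, 0.0), 2))
```

      Under these assumptions, independent decisions by Judges A and B imply disagreement in 46% of cases, well inside the 10% to 70% range, while two coin-flipping judges with identical 50% rates would disagree half the time. The same grant rates are therefore compatible with very different levels of inconsistency, depending on a quantity that the case data alone cannot reveal.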

      This simple example involved only two judges and assumed that the judges' grant rates were known exactly. When there are more than two judges, the relationship between grant rates and inconsistency becomes far more complex. (171) Further statistical problems arise when judges' grant rates are not known precisely but must be inferred from judges' decisions in actual cases. (172)

    3. Determinacy and Correctness

      The concept of interjudge consistency was defined without reference to the content of law or any substantive conception of justice. This makes inconsistency easier to conceptualize but also limits its utility as a normative metric. Inconsistency may be most important in settings where predictability is paramount and correctness is a secondary concern. As Justice Brandeis wrote, it is sometimes "more important that the applicable rule of law be settled than that it be settled right." (173) But in many settings, assessing whether a system of adjudication satisfies the requirements of law and justice may be more important than whether it provides consistent results. (174)

      Any attempt to measure whether decisions are correct or just will typically require addressing concepts that are not objectively measurable, at least whenever the meaning of law or the requirements of justice are disputed. Nevertheless, it is possible to make limited objective claims about correctness on the basis of empirical data. Consider, for...
