Mapping the Iceberg: The Impact of Data Sources on the Study of District Courts

Published date: 01 September 2020
DOI: http://doi.org/10.1111/jels.12264
Authors: Margo Schlanger, Pauline T. Kim, Christina L. Boyd
Journal of Empirical Legal Studies
Volume 17, Issue 3, 466–492, September 2020
Christina L. Boyd, Pauline T. Kim*, and Margo Schlanger
Three decades ago, Siegelman and Donohue aptly characterized research about courts and litigation
that relied only on published opinions as “studying the iceberg from its tip.” They
implored researchers to view published district court opinions “with greater sensitivity to the
ways in which such cases are unrepresentative of all cases.” The dynamic, multistage nature of
trial court litigation makes a focus solely on published opinions particularly ill-suited to the study
of federal district courts. Expanded electronic access to court documents now allows more precise
analysis of the ways in which published cases are unrepresentative and what differences that
makes for conclusions about the work of district courts. Heeding Siegelman and Donohue’s
admonition, this study seeks to map the iceberg, exploring the extent to which the visible part
misrepresents what lies below the surface. Using a supplemented version of the Kim, Schlanger,
and Martin EEOC Litigation Project data, this article examines the varying extent to which cases
and judicial activity are visible in the several data sources commonly used by district court
researchers. More specifically, we analyze how the work of federal district courts looks different
depending on whether research relies on published opinions, on opinions available on Westlaw
or Lexis (both “published” and “unpublished”), or on more comprehensive data available on
PACER (Public Access to Court Electronic Records). Our results reveal vast variation in visibility
of cases and motions, depending on the data source used. We also demonstrate that these
differences in case and motion visibility can affect the results of empirical analyses relating to,
for example, the success rates of litigants and whether the party of the appointing president
affects judicial behavior. Our findings mean that utilizing docket sheets, now available electronically,
to gather data will often be required to draw accurate conclusions about the nature of district
court litigation and the behavior of district court judges.
*Address correspondence to Pauline Kim, Daniel Noyes Kirby Professor of Law, Washington University School of
Law, Campus Box 1120, One Brookings Drive, St. Louis, MO 63130; email: kim@wustl.edu. Boyd is Associate Professor,
Department of Political Science, University of Georgia; Schlanger is the Wade H. and Dores M. McCree Collegiate
Professor of Law, University of Michigan. We are grateful for the generous support of the William W. Cook
Endowment of the University of Michigan, the National Science Foundation (SES-0718831), and to the following
research assistants for their work on this project: Adam Rutkowski, Emma Brunner, Vander Copeland, Estefania
Edens, Katherine Feagin, Young Jeon, Savannah Lawson, Caitlyn Kinard and Jordan McGill.
I. Introduction
Federal district courts are incredibly important actors in the implementation of federal
law. Today, these courts receive well over 350,000 new civil and criminal cases per year,
compared to just 50,000 matters in the federal courts of appeals and fewer than 80 merits
cases at the U.S. Supreme Court annually (Administrative Office of the U.S. Courts 2018;
Spaeth et al. 2019). Because of this unequal distribution of cases across the judicial hierarchy,
federal district judges, who constitute over three-quarters of authorized federal
Article III judgeships (Judicial Business 2018), have been referred to as “the workhorses
of the federal judiciary” (Abraham 1998). For most federal cases, the pursuit of justice
not only begins in the district court but ends there, too (Carp & Wheeler 1972:361).
But while the importance of district courts can hardly be disputed, empirical
scholars of district courts face high hurdles studying them in an appropriate and comprehensive
way. As Hoffman et al. argue, “empirical work about [federal] trial courts is more
expensive, more time-consuming, and more uncertain than one might imagine”
(2007:727). And as Levin notes, “studying the district courts in a systematic way is
difficult—more difficult than studying federal appellate courts and far more difficult than
studying the Supreme Court” (2008:981). What makes good empirical district court
research so difficult? Much of the difficulty stems from the dynamic, multistaged nature
of litigation at the trial court level. Kim et al. explain:
[A] district judge may rule in a single case on multiple occasions and on different types of questions,
only a few of which could be dispositive but all of which affect the case’s progress and
ultimate outcome. Moreover, because many of the judge’s actions are taken in response to
motions by the parties, there is no determinate sequence in which pretrial litigation events
occur. Rather, how a case proceeds depends on the choices made by the parties—what motions
are filed by whom and how discovery unfolds (2009:85).
The bulk of the prior literature has identified district court cases for study through
the opinions published, in print, in official sources like the Federal Supplement or Federal
Rules Decisions (e.g., Rowland et al. 1984; Rowland & Carp 1996; Schultz & Petterson 1992;
Segal 2000; Walker & Barrow 1985; Winkler 2006). More recent district court empirical
scholarship often also includes “unpublished” opinions that are available through legal
databases like Westlaw and Lexis (e.g., Banks & Tauber 2014; Sen 2015).¹
Does studying published and unpublished opinions available in commercial legal
databases accurately capture the work of the federal district courts? We have multiple reasons
to suspect that the answer is no. One study (Schlanger & Lieberman 2006:163–64)
found 28,000 district court opinions in 2004 that were accessible on Westlaw, even
though over 10 times that many cases were terminated in the district courts during the
¹ For years, some district court scholars have used the Administrative Office of the U.S. Courts and Federal Judicial
Center’s Integrated Data Base (see Eisenberg 2004; Eisenberg & Schlanger 2005). These data provide case-level
information on every federal district court case dating back to 1970 that includes “names of the parties, the subject
matter category and the jurisdictional basis of the case, the case’s origin in the district as original or removed or
transferred, the amount demanded, the dates of filing and termination in the district court or the court of appeals,
the procedural stage of the case at termination, the procedural method of disposition, and, if the court entered
judgment or reached decision, who prevailed” (Clermont et al. 2003). The FJC has recently made these data far
more accessible by posting them in close to real time and with caption and docket number information at
https://www.fjc.gov/research/idb.