RESEARCH ARTICLE
FORECASTING CRIMINAL BEHAVIOR

Statistical Procedures for Forecasting Criminal Behavior:
A Comparative Assessment

Richard A. Berk
Justin Bleich
University of Pennsylvania

Criminology & Public Policy, Volume 12, Issue 3 (August 2013)
DOI: 10.1111/1745-9133.12047
© 2013 American Society of Criminology

Thanks go to Bill Rhodes and three anonymous reviewers for many helpful comments on this article. Direct
correspondence to Richard A. Berk, Statistics Department, The Wharton School, University of Pennsylvania,
Philadelphia, PA 19104 (e-mail: berkr@wharton.upenn.edu). Figures 1–4 are available in color in the online
version of the article.

Forecasts of recidivism have been widely used in the United States to inform parole
decisions since the 1920s (Borden, 1928; Burgess, 1928). Of late, such forecasts
are being proposed for a much wider range of criminal justice decisions. One im-
portant example is recent calls for predictions of “future dangerousness” to help shape
sentencing (Casey, Warren, and Elek, 2011; Pew Center on the States, 2011). The
recommendations build on related risk-assessment tools already operational in many
jurisdictions, some mandated by legislation (Hyatt, Chanenson, and Bergstrom, 2011;
Kleiman, Ostrom, and Cheesman, 2007; Oregon Youth Authority, 2011; Skeem and
Monahan, 2011; Turner,
Hess, and Jannetta, 2009). In Pennsylvania, for instance, a key section of a recent statute
reads as follows:
42 Pa.C.S.A. § 2154.7. Adoption of risk assessment instrument.
(a) General rule. – The commission shall adopt a sentence risk assessment instrument for
the sentencing court to use to help determine the appropriate sentence within the limits
established by law for defendants who plead guilty or nolo contendere to, or who were
found guilty of, felonies and misdemeanors. The risk assessment instrument may be
used as an aide in evaluating the relative risk that an offender will reoffend and be a
threat to public safety.
(b) Sentencing guidelines. – The risk assessment instrument may be incorporated into
the sentencing guidelines under section 2154 (relating to adoption of guidelines for
sentencing).
(c) Pre-sentencing investigation report. – Subject to the provisions of the Pennsylvania
Rules of Criminal Procedure, the sentencing court may use the risk assessment instru-
ment to determine whether a more thorough assessment is necessary and to order a
pre-sentence investigation report.
(d) Alternative sentencing. – Subject to the eligibility requirements of each program, the
risk assessment instrument may be an aide to help determine appropriate candidates
for alternative sentencing, including the recidivism risk reduction incentive, State and
county intermediate punishment programs and State motivational boot camps.
(e) Definition. – As used in this section, the term risk assessment instrument means an
empirically based worksheet which uses factors that are relevant in predicting recidivism.
With such widespread enthusiasm and very high stakes, one might assume forecasting
accuracy has been properly evaluated and determined to be good. In fact, competent eval-
uations can be difficult to find for a wide variety of criminal justice decisions. Some of the
problems have a long history (Ohlin and Duncan, 1949; Ohlin and Lawrence, 1952; Reiss,
1951). For example, it is relatively rare for evaluations to be based on “test data” that were
not used to construct the forecasting procedures. The danger is grossly overoptimistic as-
sessments. More recent commentaries have documented several other problems, sometimes
including no evaluation at all (Berk, 2012; Farrington and Tarling, 2003; Gottfredson and
Moriarty, 2006).
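To make the test-data point concrete, consider the minimal sketch below. It is our illustration, not an analysis from any of the studies cited above: the data are simulated, and the names are hypothetical stand-ins for a recidivism data set. The forecasting procedure is fit only to the training cases, and accuracy is then computed twice, once on those same training cases and once on cases held out from model construction. The gap between the two is the overoptimism that evaluations without test data conceal.

```python
# A minimal sketch of an out-of-sample evaluation (simulated data; illustrative only).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Simulated stand-in for a recidivism data set: X holds predictors, y is 1 if the
# individual reoffends and 0 otherwise (class imbalance is typical of such data).
X, y = make_classification(n_samples=5000, n_features=20, n_informative=8,
                           weights=[0.7, 0.3], random_state=42)

# Thirty percent of the cases play no role whatsoever in constructing the forecasts.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

forecaster = RandomForestClassifier(n_estimators=500, random_state=42).fit(X_train, y_train)

print("Accuracy on the construction (training) data:",
      accuracy_score(y_train, forecaster.predict(X_train)))
print("Accuracy on the held-out test data:          ",
      accuracy_score(y_test, forecaster.predict(X_test)))
```

On simulated data such as these, accuracy computed on the construction data is typically near perfect, whereas accuracy on the held-out cases is substantially lower; only the latter speaks to how the procedure would perform when forecasts are actually needed.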
The need for thorough and thoughtful evaluations has become even more impor-
tant over the past decade because in addition to calls for a more routine use of crime
forecasts, new forecasting tools from computer science and statistics have been developed.
Often supported by formal proofs, simulations, and comparative applications across many
different data sets, these tools promise improved accuracy in principle (Breiman, 1996,
2001a; Breiman, Friedman, Olshen, and Stone, 1984; Chipman, George, and McCulloch,
2010; Friedman, 2002; Vapnik, 1998).¹ For example, Breiman (2001a) provided a formal
treatment of random forests and its comparative performance across 20 different data sets.
Several instructive criminal justice applications are in print as well (Berk, 2012).
Yet, several recent articles have claimed that for criminal justice applications, the new
tools perform no better than the old tools (Liu, Yang, Ramsay, Li, and Coid, 2011; Tollenaar
and van der Heijden, 2013; Yang, Liu, and Coid, 2010). Logistic regression (Berkson, 1951)
is a favorite conventional approach. The conclusion seems to be "why bother?" For criminal
justice forecasting applications, the new procedures are mostly hype:

    The conclusion is that using selected modern statistical, data mining and
    machine learning models provides no real advantage over logistic regression and
    LDA. If variables are suitably transformed and included in the model, there
    seems to be no additional predictive performance by searching for intricate
    interactions and/or non-linear relationships. (Tollenaar and van der Heijden,
    2013: 582)²

1. Very accessible treatments can be found in several textbooks (Berk, 2008; Bishop, 2006; Hastie,
   Tibshirani, and Friedman, 2009).
2. "LDA" stands for linear discriminant analysis.

How can the proofs, simulations, and many applications provided by statisticians and
computer scientists be so wrong? How can it be that statistical procedures being rapidly
adopted by private firms such as Google and Microsoft and by government agencies such
as the Department of Homeland Security and the Federal Bureau of Investigation are no
better than regression methods that have been readily available for more than 50 years?
Why would the kinds of new analysis procedures being developed for analyzing a variety of
data sets with hundreds of thousands of cases (Dumbill, 2013; National Research Council,
2013: Ch. 7) not be especially effective for a criminal justice data set of similar size?
A careful reading of the technical literature and recent criminal justice applications
suggests that there can be a substantial disconnect between that technical literature and
the applications favored by many criminal justice researchers. Statisticians and computer
scientists sometimes do not distinguish between forecasting performance in principle and
forecasting performance in practice. Criminal justice researchers too often proceed as if
the new procedures are just minor revisions of the generalized linear model. In fact, the
conceptual framework and actual procedures can be very different and require a substantial
change in data analysis craft lore. Without a proper appreciation of how the new methods
differ from the old, there can be serious operational and interpretative mistakes.
In this article, we try to improve the scientific discourse by providing an accessible
discussion of some especially visible, modern forecasting tools that can usefully inform
criminal justice decision making. Machine learning is used as the primary illustration. The
discussion is an introduction to material addressed far more deeply in Criminal Justice
Forecasts of Risk: A Machine Learning Approach (Berk, 2012). We also try to provide honest,
“apples-to-apples” performance comparisons between the newer forecasting methods and
more traditional approaches.
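As a concrete, if stylized, template for what we mean by an apples-to-apples comparison, the sketch below (again ours, with simulated data rather than the data analyzed in this article) fits logistic regression and random forests to the same training cases and judges both by the same criterion on the same held-out cases. The simulated outcome depends on an interaction and a nonlinearity that the logistic regression, as specified, is not told about, which is precisely the kind of structure the newer procedures are designed to find on their own.

```python
# A minimal sketch of an "apples-to-apples" comparison (simulated data; illustrative only).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 6))
# The true response surface includes an interaction and a nonlinearity that a
# logistic regression with only main effects cannot represent.
logits = 1.5 * X[:, 0] * X[:, 1] - np.abs(X[:, 2]) + 0.5 * X[:, 3]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logits)))

# Identical training data, identical test data, identical performance criterion.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

competitors = [("logistic regression", LogisticRegression(max_iter=1000)),
               ("random forest", RandomForestClassifier(n_estimators=500, random_state=0))]

for name, model in competitors:
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"{name}: AUC on held-out test data = {auc:.3f}")
```

Whether comparable gains appear in real criminal justice data, where the true response surface is unknown, is exactly the empirical question taken up in the comparisons reported later in this article.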
For some readers, it may be useful to make clear what this article is not about. As one
would expect, there have been jurisprudential concerns about “actuarial methods” dating
from at least the time when sentencing guidelines first became popular (Feeley and Simon,
1994; Messinger and Berk, 1987), and more recent discussions about the role of race have
introduced an important overlay (Berk, 2009; Harcourt, 2007). The issues are difficult
and real, but they are not addressed in this article. Our concerns are more immediate.
Forecasts of future dangerousness are being developed and used. Real decisions are being
made affecting real people. At the very least, those decisions should be informed by the
2. “LDA” stands for linear discriminant analysis.