Algorithmic approach to forecasting rare violent events

DOI: http://doi.org/10.1111/1745-9133.12476
Published date: 01 February 2020
Authors: Richard A. Berk, Susan B. Sorenson
SPECIAL ISSUE ARTICLE
COUNTERING MASS VIOLENCE IN THE UNITED STATES
Algorithmic approach to forecasting rare violent events
An illustration based in intimate partner violence perpetration
Richard A. Berk, Susan B. Sorenson
University of Pennsylvania
Correspondence
Richard A. Berk, Department of Criminology,
McNeil Hall, University of Pennsylvania,
Philadelphia, PA 19104.
Email: berkr@sas.upenn.edu
Thoughtful comments and suggestions were
provided by colleagues Aaron Chalfin, John
MacDonald, Greg Ridgeway, Michael Kearns,
Aaron Roth, and two anonymous reviewers.
Research Summary: Mass violence, almost no matter
how defined, is (thankfully) rare. Rare events are difficult
to study in a systematic manner. Standard statistical
procedures can fail badly, and usefully accurate forecasts
of rare events often are little more than an aspiration. We
offer an unconventional approach for the statistical analysis
of rare events illustrated by an extensive case study. We
report research aimed at learning about the attributes of
very-high-risk intimate partner violence (IPV) perpetrators
and the circumstances associated with their IPV incidents
reported to the police. “Very high risk” is defined as
having a high probability of committing a repeat IPV
assault in which the victim is injured. Such individuals
represent a very small fraction of all IPV perpetrators;
these acts of violence reported to the police are rare. To
learn about them nevertheless, we sequentially apply in
a novel fashion three algorithms to data collected from a
large metropolitan police department: stochastic gradient
boosting, a genetic algorithm inspired by natural selection,
and agglomerative clustering. We try to characterize not
just perpetrators who on balance are predicted to reoffend
but also who are very likely to reoffend in a manner that
leads to victim injuries. Important lessons for forecasts of
mass violence are presented.
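The middle step of the sequence described above, a genetic algorithm applied to a fitted risk model, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function `predicted_risk` is a hypothetical stand-in for a fitted stochastic gradient boosting model, and the three features, their weights, the population size, and the mutation rate are all invented for the example. The idea shown is only that selection, crossover, and mutation can push a synthetic population toward cases with very high predicted risk.

```python
import math
import random

# Hypothetical stand-in for a fitted risk model. In the article this role is
# played by stochastic gradient boosting; the features and weights below are
# invented purely for illustration.
def predicted_risk(case):
    prior_incidents, weapon_present, age = case
    z = -6.0 + 0.8 * prior_incidents + 1.5 * weapon_present - 0.03 * (age - 18)
    return 1.0 / (1.0 + math.exp(-z))

def random_case(rng):
    # One synthetic case: (prior incidents 0-10, weapon 0/1, age 18-80).
    return (rng.randint(0, 10), rng.randint(0, 1), rng.randint(18, 80))

def crossover(a, b, rng):
    # Uniform crossover: each feature is copied from one parent at random.
    return tuple(rng.choice(pair) for pair in zip(a, b))

def mutate(case, rng, rate=0.2):
    # Perturb each feature with probability `rate`, staying within bounds.
    prior, weapon, age = case
    if rng.random() < rate:
        prior = min(10, max(0, prior + rng.choice((-1, 1))))
    if rng.random() < rate:
        weapon = 1 - weapon
    if rng.random() < rate:
        age = min(80, max(18, age + rng.randint(-5, 5)))
    return (prior, weapon, age)

def evolve(pop_size=200, generations=30, seed=0):
    rng = random.Random(seed)
    population = [random_case(rng) for _ in range(pop_size)]
    start_mean = sum(map(predicted_risk, population)) / pop_size
    for _ in range(generations):
        # Selection: the half with the highest predicted risk survives.
        population.sort(key=predicted_risk, reverse=True)
        parents = population[: pop_size // 2]
        # Reproduction: refill the population with mutated offspring.
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    end_mean = sum(map(predicted_risk, population)) / pop_size
    return population, start_mean, end_mean

population, start_mean, end_mean = evolve()
print(f"mean predicted risk: {start_mean:.3f} -> {end_mean:.3f}")
```

The evolved `population` is a synthetic set of high-risk cases. In the sequence the summary describes, agglomerative clustering would then be applied to such a population to look for distinct profiles; that step is not shown here.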
Policy Implications: If one intends to forecast mass violence,
it is probably important to consider approaches less
dependent on statistical procedures common in criminology.
Given that one needs to “fatten” the right tail of the rare
events distribution, a combination of supervised machine
learning and genetic algorithms may be a useful approach.
One can then study a synthetic population of rare events
almost as if they were an empirical population of rare
events. Variants on this strategy are increasingly common
in machine learning and causal inference. Our overall goal
is to unearth predictors that forecast well. In the absence of
sufficiently accurate forecasts, scarce resources to help prevent
mass violence cannot be allocated where they are most
needed.
KEYWORDS
forecasting, genetic algorithms, intimate partner violence, machine learning,
mass violence, synthetic populations
Criminology & Public Policy. 2020;19:213–233. wileyonlinelibrary.com/journal/capp © 2019 American Society of Criminology
Forecasts of risk are routinely made in a wide variety of situations. What is the probability that a
hurricane will strike the Gulf Coast in a particular hurricane season? What is the probability that a
given high school student will be accepted by his or her college of choice? What is the probability
that a particular business firm will declare bankruptcy? Coupled with each probability is the expected
cost should the event of concern occur. For the bankruptcy example, repayment of debt at 10 cents on
the dollar means a loss of 90 cents for every dollar invested. Risk formally is defined as the costs of a
particular event multiplied by the probability that the event will occur.
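The definition above can be made concrete with the bankruptcy example. The short sketch below simply multiplies a cost by a probability. The loss of 90 cents per dollar comes from the text; the 5% bankruptcy probability and the function name `expected_loss` are assumptions introduced only for illustration.

```python
def expected_loss(probability: float, cost: float) -> float:
    """Risk as defined in the text: the cost of an event times its probability."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return probability * cost

# From the text: repayment at 10 cents on the dollar means a loss of
# 90 cents for every dollar invested.
loss_per_dollar = 1.00 - 0.10

# Hypothetical probability of bankruptcy, chosen only for illustration.
p_bankruptcy = 0.05

risk_per_dollar = expected_loss(p_bankruptcy, loss_per_dollar)
print(f"risk per dollar invested: {risk_per_dollar:.3f}")
```

Under these assumed numbers, the risk is 4.5 cents per dollar invested. A rarer event with a much larger cost can carry the same or greater risk, which is why both terms matter for rare but severe outcomes.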
Forecasts of risk can be useful if they lead to actions that are better informed. For undesirable
outcomes, one hopes that prevention strategies can be implemented or that plans for remedial action
after the fact can be made. This has long been well understood by criminal justice decision makers
in the United States. Indeed, risk assessments have been used to inform criminal justice decisions
since the 1920s (Burgess, 1928). One might wonder, therefore, whether forecasts of risk might be
instructive for contemporary incidents of mass violence. Without good forecasts, scarce prevention
and remedial resources easily can be misallocated. One cannot, for example, place armed guards at
every church, mosque, or synagogue. Likewise, one cannot have grief counsellors at every business
establishment. Currently, those resources are distributed in a haphazard manner. Might legitimate risk
assessments help?
For almost any reasonable definition of mass violence, constructing sufficiently accurate forecasts
is a daunting undertaking. This holds whether one is trying to forecast the likely perpetrator, location,
or timing of an event. One obstacle is that mass violence is heterogeneous. It can include school
shootings, homicides committed by disgruntled employees, brutal hate crimes, systematic execution
of witnesses at a crime scene, fatal assaults by perpetrators of intimate partner violence (IPV), and
other mass violence in which the motives are obscure (e.g., the October 2017 Las Vegas music
festival mass shooting in which 58 people were killed and 851 were injured). Although understanding
mass violence in general is an admirable aspiration, in the medium term, different forms of mass
violence might be productively examined separately. Useful forecasts probably will require different
