Replication and Research Integrity in Criminology: Introduction to the Special Issue

Jukka Savolainen and Matthew VanEseltine
University of Michigan, Ann Arbor, MI, USA
Imagine growing up in a family where every statement you make is called into ques-
tion by the other family members. Perhaps such families exist, but they would likely be
considered “dysfunctional.” Consider a mother who loves all the children in the neigh-
borhood equally, taking no special interest in her own child. We would regard this as a
remarkable deviation from normal parenting behavior. Mothers and fathers are
expected to favor their own children. Unlike the family, science is a social institution
where this kind of behavior is not only tolerated but encouraged. The sociologist
Robert K. Merton is well known in criminology as the foundational theorist of the anomie/strain theoretical tradition, but he was probably even more influential in the field of science studies (Cole, 2004). Merton's (1942) four norms of science (Table 1) are still considered the canonical characterization of the scientific ethos, a value system meant to ensure the integrity of scientific research.

Table 1. Merton's Four Norms of Science.

1. Communism: Common ownership of scientific goods, open sharing of data and knowledge; transparency as opposed to secrecy.
2. Universalism: The same standards of evidence and validity apply to all participants in the scientific discourse, regardless of social status or personal attributes.
3. Disinterestedness: Participants should not favor one outcome over another in the process of conducting scientific research; impartiality.
4. Organized skepticism: Collective critical scrutiny of every aspect of the research process, both before and after the publication of results.
How well are these norms followed in the everyday practice of social science in
general and criminology in particular? Organized skepticism is the norm most relevant
to replication research, the theme of this special issue.1 Many aspects of scientific
research are purposefully organized in ways that make it difficult for a claim to be
granted the status of a scientific fact. Norms regarding p values in quantitative research
set a high bar for acceptable knowledge claims. If a home owner is 93% confident that
her roof will survive another winter without leaks, she is likely to delay its replace-
ment. However, in social science, this level of confidence does not meet the conventional standard (p < .05) for rejecting the null hypothesis about the population parameter. Anonymous peer review is another example of organized
skepticism. Editors send manuscripts out for review to selected participants in the
field, whose task is to submit every aspect of the study to intense scrutiny. Wouldn’t it
be nice to have something akin to the peer review process when fielding competing
estimates from roofing contractors?2
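To make the arithmetic of this convention concrete, consider the following minimal sketch (our illustration; the numbers are hypothetical and chosen only to mirror the homeowner analogy). It computes the two-sided p value implied by a test statistic corresponding to roughly 93% confidence and checks it against the conventional bar:

```python
from scipy import stats

# Minimal illustration with hypothetical numbers: a coefficient whose test
# statistic implies roughly 93% confidence that the effect is nonzero.
z = 1.81                               # hypothetical z statistic
p_value = 2 * (1 - stats.norm.cdf(z))  # two-sided p value

print(f"p = {p_value:.3f}")  # ~0.070, i.e., about 93% confidence
print("Meets the conventional p < .05 standard?", p_value < 0.05)  # False
```

A result in which we are "93% sure" still falls short of the conventional threshold.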
Credible pursuit of organized skepticism depends on adherence to the other three
norms of science, especially communism, which calls for transparency in the research
process, including open access to the data and tools used in the production of knowledge.
In a peer-review situation, the referee is presented with a manuscript that describes how
the study was carried out and what the results were. In the course of the review, the
referee evaluates the appropriateness of the methodological decisions, such as the assump-
tions of the statistical model or the way in which the measures were constructed. However,
the space allotted to methodology typically cannot accommodate a comprehensive discussion of
every decision made over the course of the study. Without direct access to the data and the
code used to run the analysis, it is difficult for reviewers to verify the extent to which a
manuscript is an accurate representation of the underlying research effort.
Science is a human activity; it would be naive to assume a complete absence of bad
actors. Unfortunately, the history of science has plenty of examples of individuals
willing to misrepresent their findings to advance their careers (e.g., Bechtel & Pearson,
1985; Stapel, 2014). A prominent recent case involved an experimental study of politi-
cal persuasion that purported to show that voters who were initially opposed to gay marriage were more likely to change their minds after personal contact with openly
gay canvassers. Failing to replicate the study with their own data, a team of scholars
turned to the original dataset archived in the OpenICPSR repository (LaCour, 2016)
and found a number of irregularities that raised suspicions about the authenticity of the
reported findings (LaCour & Green, 2014). After the junior author (LaCour) respon-
sible for the data collection failed to present the raw data to the senior author (Green),
the latter requested a retraction, which the journal granted.
Although such incidents of outright data fabrication are likely rare, it is
difficult to estimate the degree of dishonesty prevailing in scientific research. Given
the pressure to publish results that are “statistically significant,” there are empirically
grounded reasons to believe that some scholars engage in what is known as p-hacking,
that is, an effort to "dredge" the data until the coefficient of interest passes the 5% threshold of statistical significance. As evidence of questionable research practices, scholars have noted the remarkable clustering of p values in published articles just slightly below the conventional .05 standard (Masicampo & Lalande, 2012; Nieuwenhuis, 2016). This observation has been taken as an indication of systemic p-hacking in the social and behavioral sciences (but see Lakens, 2015, for an opposing perspective).
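The mechanics of this concern are easy to demonstrate in a few lines of code. The simulation below is purely our illustration (every parameter is hypothetical, and it implements generic specification searching rather than any procedure from the studies cited above): a researcher who tries several unrelated predictors on pure noise and keeps the best result will clear the .05 bar far more often than 5% of the time.

```python
import numpy as np
from scipy import stats

# Illustrative simulation: "dredging" null data across many specifications
# inflates the false-positive rate well beyond the nominal 5%.
rng = np.random.default_rng(42)

n_studies = 2000  # simulated studies, each with no true effect
n_obs = 100       # observations per study
n_specs = 10      # hypothetical number of specifications tried per study

false_positives = 0
for _ in range(n_studies):
    y = rng.normal(size=n_obs)  # outcome containing no real signal
    # Try several unrelated predictors, keeping the smallest p value,
    # to mimic a search through the "garden of forking paths."
    best_p = min(
        stats.pearsonr(rng.normal(size=n_obs), y)[1] for _ in range(n_specs)
    )
    if best_p < 0.05:
        false_positives += 1

# With 10 tries per null study, roughly 1 - 0.95**10 (about 40%)
# of "studies" produce at least one publishable p < .05.
print("Nominal false-positive rate: 0.05")
print(f"False-positive rate after dredging: {false_positives / n_studies:.2f}")
```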
Even if science were a society of saints, there would still be good reasons to promote
increased transparency and openness. First, humans are fallible. Mistakes happen, as
illustrated by the highly influential paper by two Harvard economists that was widely cited in support of fiscal austerity as the right policy for promoting economic growth during a recession (Reinhart & Rogoff, 2010). Once a team of researchers at the
University of Massachusetts gained access to the raw data, one of the issues they dis-
covered was a simple but consequential coding error in the Excel spreadsheet
(Herndon, Ash, & Pollin, 2014). Second, due to “researcher degrees of freedom”
(Simmons, Nelson, & Simonsohn, 2011), any given dataset can be analyzed in many
different ways, starting with the handling of missing data and ending with sensitivity
checks. To perform an empirical study, researchers have to make a series of decisions
to go one way instead of the other in a “garden of forking paths” (Gelman & Loken,
2013). As a matter of principle, organized skepticism means leaving no stone unturned,
but, in the practical context of a single study, this is not a realistic possibility. Articles
have page limitations; projects have time limitations. For this reason alone, it is impor-
tant to support efforts to replicate and verify published studies.
In sum, replication is important because the standard peer review process alone
does not guarantee the integrity of the results reported in the literature. Although rep-
lication research has always been an accepted feature of social science, an emerging
consensus suggests that this line of critical inquiry has been neglected to the detriment
of the credibility of the published output (Freese & Peterson, 2017; McNeeley &
Warner, 2015). A recent review of replication research in criminology found that only
0.45% of articles in the Web of Science database were replication studies (Pridemore,
Makel, & Plucker, 2018). A watershed moment in the discourse on replication research
occurred in 2015 when the journal Science published results from a study that attempted
to reproduce results from 100 studies published in leading journals of psychology
(Open Science Collaboration, 2015). Across every definition of replication success, a clear majority (60% or more) of the studies failed the test. For example, virtually all (97%) of the original studies had reported statistically significant findings (p < .05), but only 36% of the replications met this bar. Although some have questioned the conclusions derived from this report (Gilbert, King, Pettigrew, & Wilson, 2016), the Open
Science Collaboration has helped energize an interdisciplinary movement to increase
research integrity in the behavioral and social sciences.3
The open science movement promotes not only replication research but also
improvements in all aspects of the research process to increase the reproducibility of
results. One such emerging practice is research preregistration, which requires schol-
ars to describe in advance what their research questions are and how they are going to
pursue them. Ideally, a detailed description of the undertaking is recorded and pub-
lished before the first empirical step of the study begins. This description includes a
precise statement of the hypotheses, the data collection protocol, all the measurement
decisions, and the computer code used to run the analyses. This approach is meant to
ensure that no questionable research practices will be introduced in the research pro-
cess in the event that the empirical world fails to cooperate with theoretical expecta-
tions. Research preregistration is an excellent safeguard for observing the third norm
of science, disinterestedness, as under this practice researchers are required to publicly
commit to their hypotheses and methodology prior to knowing the outcome of the
study. Research preregistration also supports organized skepticism, as it will be far
easier to replicate a study that is associated with clear and specific instructions.
A number of behavioral and social science journals have begun to print open sci-
ence “badges” to indicate published articles’ adherence to preregistered design, data
sharing, and/or sharing of other research materials (such as code). The American
Economic Review, the highest ranked journal in economics, endorses a policy that
requires published articles to make their data “readily available to any researcher for
purposes of replication.” The American Journal of Political Science (AJPS), one of the
top two journals in the field, subscribes to an exceptionally rigorous replication and
verification standard. Any article accepted for publication in AJPS must first go
through a verification process during which an independent research team contracted
by the journal will repeat the analyses as reported in the manuscript. If this process
cannot reproduce the results, the paper will be rejected. A successful verification typi-
cally takes several iterations involving additional communication with the authors
(Jacoby, 2017).
Criminology and sociology—the general social science discipline most closely tied
to criminology—lag behind psychology, economics, and political science in participa-
tion in the open science movement (Freese & Peterson, 2017; Pridemore et al., 2018).
To our knowledge, there are no journals in criminology that require data sharing as a
condition for publication, and we are not aware of any preregistered criminological
publications thus far. There has been no systematic attempt in criminology or criminal
justice to evaluate the replicability of studies published in the leading journals. The
relative neglect of replication research in criminology is somewhat surprising given
that, as described below, opportunities for organizing something akin to a reproduc-
ibility project are nothing short of excellent in our field.
Since 1978, the National Archive of Criminal Justice Data (NACJD) has dissemi-
nated carefully curated datasets for secondary analysis. The National Institute of Justice
(NIJ), the primary federal sponsor of research on crime and justice in the United States,
has been a pioneer in data sharing. Since the late 1970s, the NIJ has required its grant-
ees to deposit their data at NACJD to be documented and made discoverable for the
research community (Garner, 1981). As of April 1, 2018, the NACJD archive features
1,102 datasets from NIJ investigator-initiated studies. The NACJD bibliography of
data-related literature has captured 16,397 publications based on those datasets.4 For
example, a quick search of the NACJD database shows that an article by Augustyn and
McGloin (2018), which appears in the current (February 2018) issue of the journal
Criminology, used data from the Research on Pathways to Desistance study (ICPSR
32282). Any scholar interested in investigating the reproducibility of these results has
the opportunity to gain access to the data by following a simple application process.
The same can be said for thousands of other publications.
The impulse for this special issue of the Journal of Contemporary Criminal Justice
came from the realization not only that it is important to investigate the reproducibility
and replicability of criminological research but also that such an undertaking would be
relatively easy to accomplish given the resources available at NACJD. In November
2016, we announced a call for papers soliciting submissions for a special issue dedi-
cated to replication research in criminology and criminal justice. The announcement
was accompanied by instructions on how to use the NACJD bibliography to identify
studies to replicate and gain access to the relevant data. Although using datasets dis-
seminated by NACJD was not a requirement for this special issue, we were pleased to
see that many submissions did so.
Freese and Peterson (2017) classify replication research into four categories based
on whether the data and the methods used in the replication are similar versus different
from the original study. As shown in Table 2, the articles included in this issue cover
each type of replication, and some of them exemplify multiple types. The most thor-
ough contribution in this regard is the article by Theodore Lentz (2018), which revisits
a recent Criminology article by Jeffrey Brantingham (2016) titled “Crime Diversity.”
Lentz first uses the same publicly available data from Los Angeles as the original
study. He examines these data using the same procedures as Brantingham but also
considers alternative methodological decisions. Finally, Lentz repeats both sets of pro-
cedures using data from a different ecological context, the city of St. Louis. Keeping
methodology constant, he finds consistent support, in both datasets, for the patterns
reported in the original article. However, the findings are sensitive to decisions about sampling and the areal unit of analysis.

Table 2. Forms of Replication in Social Science.

Same data, same procedure (Verifiability): Lentz (2018); Maxwell, Garner, and Skogan (2018).
Same data, different procedure (Robustness): Lentz (2018); Myers, Lloyd, Turanovic, and Pratt (2018); Stamatel and Romans (2018).
Different data, same procedure (Repeatability): Christ, Schwartz, Stoltenberg, Brauer, and Savolainen (2018); Lentz (2018).
Different data, different procedure (Generalization): Lentz (2018); Stamatel and Romans (2018).
Source. Adapted from Freese and Peterson (2017).
Three of the five articles appearing in this issue used data archived at NACJD. Each of these is a replication of what can be described as a classic study. Maxwell, Garner, and Skogan (2018) tackled one of the most influential contributions to the contemporary criminological canon, the seminal study of collective efficacy by Sampson, Raudenbush, and Earls (1997). Published in the journal Science, this study had been cited 9,795 times according to a Google Scholar search performed on April 2, 2018. This citation count is astronomical in criminology. By comparison, Cohen
and Felson’s (1979) game-changing article on routine activities theory has been cited
“only” 7,785 times using the same metric, despite being almost two decades older.
Given the status of the original article, we were relieved to see that Maxwell et al.’s
sophisticated reanalysis managed to reproduce the original results with remarkable
accuracy. We suspect that with better documentation of the methodological decisions made in the original study, the level of accuracy would have been even higher. As noted, the experience of the AJPS verification policy shows that it is rare for an independent team to reproduce results on the first attempt, even with access to documentation far more detailed than what was available to Maxwell et al. Given the circumstances, this level of success lends a great deal of confidence to the integrity of Sampson et al.'s (1997) influential article.
“The Cycle of Violence” by Cathy Spatz Widom (1989) was the focus of replica-
tion in the article by Myers, Lloyd, Turanovic, and Pratt (2018). This is another highly
cited criminological classic published in the journal Science. Unlike Maxwell et al.
(2018), these authors go beyond verifiability, asking whether Widom’s results are
robust across various alternative model specifications and other methodological deci-
sions. The third article using data from the NACJD archive is Stamatel and Romans’s
(2018) effort to replicate Archer and Gartner’s (1976) cross-national study of postwar
homicide using updated methodology and data that extend to more recent contexts. An
interesting theme that emerges from this article is the potential for replication research
to stimulate theoretical progress.
In light of this small and unrepresentative selection of replication studies, the integ-
rity of criminological research looks respectable. All the studies that used the same
data as the original study were able to reproduce the basic patterns with reasonable
accuracy. This general conclusion extends to all the submissions we received, not just
to those published here. In the cases where reproduction fell short, the main challenge
was insufficient clarity about the procedures followed in the original study. It is well
understood that most Methods sections of journal articles do not contain enough infor-
mation for replication purposes. To improve the situation, we recommend that the
journals in our field take heed of the open science movement and embrace such prac-
tices as research preregistration and the sharing of research materials (data and code).
The only article in this issue that clearly failed to replicate the original results was
based on data different from those used in the original study. Christ, Schwartz,
Stoltenberg, Brauer, and Savolainen (2018) examined the repeatability of the first molec-
ular genetics study published in the journal Criminology (Wells et al., 2017). As the
authors note, this “failure” does not necessarily suggest issues with integrity, but instead
raises questions about the repeatability and generalizability of the findings. This problem
is widely recognized in the largely underpowered research on candidate gene-by-environ-
ment (cG×E) effects. Based on the experience from a single replication effort, Christ et al.
(2018) offer useful general suggestions for biosocial criminology moving forward.
This special issue is the first of its kind in criminology. Featuring only five arti-
cles, it represents a small step toward a more comprehensive assessment of the
replicability of criminological findings. We hope that this issue, along with two
recent articles on the topic (McNeeley & Warner, 2015; Pridemore et al., 2018),
helps advance replication research in our field and strengthens the commitment of
the criminological research community to the norms of science. Research integrity
does not happen by default. To make your hypotheses genuinely testable, you must
be willing to accept the possibility that the empirical evidence will contradict your
expectations (Popper, 1945). Adherence to this principle calls for transparency and
specificity about the entire research workflow, starting from the statement of the
hypothesis and ending with the reporting of the empirical results. It is usually not
that difficult to shield your beliefs from falsification, but such a disposition deviates
from the norms of science. Criminologists should study deviance but avoid engaging in it in their own research.
Acknowledgment
We thank Karoliina Suonpää, Torbjørn Skardhamar, and Chris Maxwell for their helpful
comments.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship,
and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of
this article.
Notes
1. For clarity, following Freese and Peterson (2017), we use the term replication to describe
any research focused on revisiting or reexamining a prior study. This definition includes
efforts to “reproduce” a prior study using the same data source and procedure, as well as
replications that use a different data source or methodology. Later we make a distinction
between four kinds of replication research.
2. Consumer Reports (2018) is a nonprofit organization that publishes ratings on products,
ranging from car seats to home insurance. It sends anonymous "secret shoppers" to retail
stores to purchase the products under review. This approach to testing consumer products
shares parallels with the peer review process.
3. We encourage readers with limited familiarity with the open science movement to visit
the website of the Center for Open Science (https://cos.io/) to learn more about efforts to
increase openness, integrity, and reproducibility of research.
4. In addition to NIJ, NACJD is sponsored by the Bureau of Justice Statistics and the Office of Juvenile Justice and Delinquency Prevention (OJJDP). Each of these agencies archives its data assets at NACJD. Recently, OJJDP adopted the same policy as NIJ, requiring all
its grantees to share their data via NACJD. The total number of datasets disseminated by
NACJD was 2,741 as of April 1, 2018.
References
Archer, D., & Gartner, R. (1976). Violent acts and violent times: A comparative approach to
postwar homicide. American Sociological Review, 41, 937-963.
Augustyn, M. B., & McGloin, J. M. (2018). Revisiting juvenile waiver: Integrating the incapaci-
tation experience. Criminology, 56, 154-190.
Bechtel, H. K., Jr., & Pearson, W., Jr. (1985). Deviant scientists and scientific deviance. Deviant
Behavior, 6, 237-252.
Brantingham, P. J. (2016). Crime diversity. Criminology, 54, 553-586.
Christ, C. C., Schwartz, J. A., Stoltenberg, S. F., Brauer, J. R., & Savolainen, J. (2018). The
effect of MAOA and stress sensitivity on crime and delinquency: A replication study.
Journal of Contemporary Criminal Justice.
Cohen, L. E., & Felson, M. (1979). Social change and crime rate trends: A routine activity
approach. American Sociological Review, 44, 588-608.
Cole, S. (2004). Merton’s contribution to the sociology of science. Social Studies of Science,
34, 829-844.
Consumer Reports. (2018). Research & testing. Retrieved from https://www.consumerreports.
org/cro/about-us/what-we-do/research-and-testing/index.htm
Freese, J., & Peterson, D. (2017). Replication in social science. Annual Review of Sociology,
43, 147-165.
Garner, J. H. (1981). National Institute of Justice: Access and secondary analysis. In R. F.
Boruch, P. M. Wortman, & D. S. Cordray (Eds.), Reanalyzing program evaluation (pp. 43-
49). San Francisco, CA: Jossey-Bass.
Gelman, A., & Loken, E. (2013). The garden of forking paths: Why multiple comparisons can
be a problem, even when there is no “fishing expedition” or “p-hacking” and the research
hypothesis was posited ahead of time. Department of Statistics, Columbia University.
Retrieved from https://www.stat.columbia.edu/~gelman/research/unpublished/p_hacking.
pdf. Accessed May 7, 2018.
Gilbert, D. T., King, G., Pettigrew, S., & Wilson, T. D. (2016). Comment on “Estimating the
reproducibility of psychological science.” Science, 351, 1037.
Herndon, T., Ash, M., & Pollin, R. (2014). Does high public debt consistently stifle economic
growth? A critique of Reinhart and Rogoff. Cambridge Journal of Economics, 38, 257-279.
Jacoby, W. G. (2017, June). The replication and verification policy at the American Journal of
Political Science. Lecture presented at the 2017 ICPSR Summer Program in Quantitative
Methods of Social Research, Ann Arbor, MI.
LaCour, M. J. (2016). Political persuasion and attitude change study: The Los Angeles lon-
gitudinal field experiments, 2013-2014. Ann Arbor, MI: Inter-University Consortium for
Political and Social Research [distributor]. doi:10.3886/E100037V8
LaCour, M. J., & Green, D. P. (2014). When contact changes minds: An experiment on trans-
mission of support for gay equality. Science, 346, 1366-1369.
Lakens, D. (2015). What p-hacking really looks like: A comment on Masicampo and LaLande
(2012). The Quarterly Journal of Experimental Psychology, 68, 829-832.
Lentz, T. S. (2018). Crime diversity: Verified and replicated. Journal of Contemporary Criminal
Justice.
Masicampo, E. J., & Lalande, D. R. (2012). A peculiar prevalence of p values just below .05.
Quarterly Journal of Experimental Psychology, 65, 2271-2279.
Maxwell, C. D., Garner, J. H., & Skogan, W. G. (2018). Collective efficacy and violence in
Chicago neighborhoods: A reproduction. Journal of Contemporary Criminal Justice.
McNeeley, S., & Warner, J. J. (2015). Replication in criminology: A necessary practice.
European Journal of Criminology, 12, 581-597.
Merton, R. K. (1942). A note on science and democracy. Journal of Legal and Political
Sociology, 1, 115-126.
Myers, W., Lloyd, K., Turanovic, J. J., & Pratt, T. C. (2018). Revisiting a criminological classic:
The cycle of violence. Journal of Contemporary Criminal Justice.
Nieuwenhuis, J. (2016). Publication bias in the neighborhood effects literature. Geoforum, 70,
89-92.
Open Science Collaboration. (2015). Estimating the reproducibility of psychological science.
Science, 349, aac4716.
Popper, K. R. (1945). The open society and its enemies. London, England: Routledge.
Pridemore, W. A., Makel, M. C., & Plucker, J. A. (2018). Replication in criminology and the
social sciences. Annual Review of Criminology, 1, 19-38.
Reinhart, C. M., & Rogoff, K. S. (2010). Growth in a time of debt. American Economic Review,
100, 573-578.
Sampson, R. J., Raudenbush, S. W., & Earls, F. (1997). Neighborhoods and violent crime: A
multilevel study of collective efficacy. Science, 277, 918-924.
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed
flexibility in data collection and analysis allows presenting anything as significant.
Psychological Science, 22, 1359-1366.
Stamatel, J. P., & Romans, S. (2018). The effects of wars on post-war homicide rates: A rep-
lication and extension of Archer and Gartner’s classic study. Journal of Contemporary
Criminal Justice.
Stapel, D. (2014). Faking science: A true story of academic fraud (N. J. L. Brown, Trans.). Retrieved
from https://errorstatistics.files.wordpress.com/2014/12/fakingscience-20141214.pdf
Wells, J., Armstrong, T., Boisvert, D., Lewis, R., Gangitano, D., & Hughes-Stamm, S. (2017).
Stress, genes, and generalizability across gender: Effects of MAOA and stress sensitivity
on crime and delinquency. Criminology, 55, 548-574.
Widom, C. S. (1989). The cycle of violence. Science, 244, 160-166.
Author Biographies
Jukka Savolainen is a research professor at the Institute for Social Research, University of
Michigan, where his primary responsibility is to serve as the director of the National Archive of
Criminal Justice Data (NACJD). Most of his research is focused on etiological investigations of
crime, violence, and victimization.
Matthew VanEseltine is a sociologist and criminologist in the Institute for Social Research at
the University of Michigan. He is currently the research manager of the Criminal Justice
Administrative Records System (CJARS), a data infrastructure pilot project to support the next
generation of criminal justice research. His research interests are centered on life-course crimi-
nology and criminological theory.
