Sources of Variability in Estimates of Predictive Validity

Authors: Jill Rettinger, Rob Rowe, J. Stephen Wormith, James Bonta, Albert Brews, Lina Guzzo, D. A. Andrews
DOI: 10.1177/0093854811401990
Published date: 01 May 2011
Subject matter: Articles
CRIMINAL JUSTICE AND BEHAVIOR, Vol. 38 No. 5, May 2011 413-432
© 2011 International Association for Correctional and Forensic Psychology
AUTHORS’ NOTE: James Bonta, J. Stephen Wormith, and the estate of D. A. Andrews receive royalties on
sales of the Level of Service instruments cited in this article. D. A. Andrews’s estate also receives license fees
for use of the CPAI-2000. Elements of this article have been included in various public presentations by the
authors in the past 3 years. Correspondence concerning this article should be addressed to Stephen Wormith
at s.wormith@usask.ca.
SOURCES OF VARIABILITY IN ESTIMATES OF PREDICTIVE VALIDITY
A Specification With Level of Service General Risk and Need
D. A. ANDREWS
Carleton University, Ottawa, Ontario, Canada
JAMES BONTA
Public Safety Canada, Ottawa, Ontario, Canada
J. STEPHEN WORMITH
University of Saskatchewan, Saskatoon, Canada
LINA GUZZO
Community Safety and Correctional Services, North Bay, Ontario, Canada
ALBERT BREWS
Correctional Service of Canada, Ottawa, Ontario
JILL RETTINGER
University Partnership Centre, Georgian College, Barrie, Ontario, Canada
ROB ROWE
Family Court Clinic, Kingston, Ontario, Canada
Level of Service (LS) is one of the most widely used general risk and need assessment tools in criminal justice agencies
across North America. However, there is significant interstudy variability in the magnitude of the validity estimates. This
study was conducted to examine possible sources of this variability. The predictive validity of LS risk and need increased
with length of follow-up period and with investigator allegiance to LS. The combination of these two variables reveals
consistent increases in mean predictive validity estimates from modest (in the .20s) through large (in the mid .30s) to very large
(in the .40s) in samples of both male and female offenders. We hypothesized that the “allegiance effect” reflects the integrity
of LS implementation and support provided by the agency for risk assessment. This is akin to the difference between
“demonstration projects” and “practical” rehabilitation programming in the offender treatment research. Moreover, controls
for Canadian versus non-Canadian evaluations reduced the effect of allegiance and length of follow-up to nonsignificant
levels. Possible explanations for these findings include the degree of integrity in conducting risk and need assessments, the
accuracy of recidivism as the criterion measure, and generalizability across international boundaries.
Keywords: variability in validity estimates; female offenders; risk and need factors; recidivism
The Level of Service (LS) assessment instruments are perhaps the most widely used
assessments of risk and need in North American corrections and elsewhere in the world
(Andrews, Bonta, & Wormith, 2010). The applied value of the instruments reflects their
strong ties with the risk-need-responsivity (RNR) model of correctional assessment and
crime prevention and with general personality and cognitive social learning perspectives on
human behavior, including criminal behavior (Andrews & Bonta, 2010). The instruments
incorporate assessment of the “Big Four” risk and need factors (antisocial behavioral
history, antisocial personality pattern, antisocial cognitions, and antisocial associates) along with
substance abuse and levels of rewards and satisfactions in the social domains of
family-marital, school-work, and leisure-recreation that together represent the “Central Eight” risk
and need factors. Recent versions of the Level of Service/Case Management Inventory
(LS/CMI; Andrews, Bonta, & Wormith, 2004; Youth LS/CMI [YLS/CMI]; Hoge &
Andrews, 2002) focus directly on the Central Eight factors, although their scores may be
derived from the Level of Service Inventory–Revised (LSI-R; Andrews & Bonta, 1995).
The number of studies of the predictive validity of LS general risk and need has increased
considerably in the past 10 years. For example, Smith, Cullen, and Latessa (2009) reviewed
27 reports on the predictive validity of LSI-R with female offenders. Eighty-one percent
of the reports were published in the new millennium. The increase is evident not only in
regard to the number of primary studies but also in regard to the number of meta-analytic
summaries now available, thus allowing one to investigate some of the outstanding issues
about the LS in a systematic empirical manner. One such issue is the variability in validity
estimates that is found across individual studies and even across meta-analyses.
The principal objective of the current report is to contribute to an evidence-based
appreciation of sources of intersample variability in the magnitude of predictive validity estimates.
It is important to understand the sources of variability in predictive validity estimates for
both theoretical and practical reasons. Risk and need scores of offenders contribute to the
discretionary levels of supervision assigned, the level of rehabilitative services provided,
and the individualized intermediate targets of change established in programming. These
decisions have a significant impact on the level of restrictiveness experienced by offenders,
the availability of treatment services, the costs of offender processing, and the effectiveness
of crime prevention activities. Moreover, understanding these sources of variability in
predictive validity will allow one to implement quality assurance mechanisms to maximize the
predictive validity of the instrument in field settings.
SOURCES OF VARIATION IN VALIDATION STUDIES
There are numerous reasons validity estimates of a specific instrument may vary across
studies. Sources of variability fall into the following categories: the quality or reliability of
the measure as administered by practitioners in the field; the climate and culture of the agency
with respect to risk assessment; the accuracy with which the outcome measure, recidivism, is
determined; and other methodological issues, such as the heterogeneity of offenders to whom
the scales are applied. Moreover, those who have an allegiance to an instrument are likely
to ensure that the above-noted factors are in place in such a manner as to maximize validity
(in large part by minimizing error variance of the predictor and outcome variables).
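
The link between error variance and validity can be illustrated with the classical attenuation formula from test theory, in which the expected observed correlation equals the error-free correlation multiplied by the square root of the product of the predictor and outcome reliabilities. The sketch below is not taken from the article and is not the authors' analysis. The function name, the error-free correlation of .40, and the reliability values are all hypothetical, and treating recidivism records as having a simple reliability coefficient is a simplifying assumption.

```python
import math


def attenuated_validity(true_r, predictor_reliability, outcome_reliability):
    """Expected observed correlation under the classical attenuation formula.

    observed_r = true_r * sqrt(predictor_reliability * outcome_reliability)

    All names and inputs are illustrative assumptions, not values from the article.
    """
    for name, value in (
        ("predictor_reliability", predictor_reliability),
        ("outcome_reliability", outcome_reliability),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1, got {value}")
    if not -1.0 <= true_r <= 1.0:
        raise ValueError(f"true_r must lie between -1 and 1, got {true_r}")
    return true_r * math.sqrt(predictor_reliability * outcome_reliability)


# Hypothetical scenarios: the same error-free validity measured with
# carefully administered assessments and complete recidivism records,
# versus noisier assessments and less complete records.
TRUE_R = 0.40  # assumed error-free correlation, for illustration only
high_integrity = attenuated_validity(TRUE_R, 0.90, 0.80)
low_integrity = attenuated_validity(TRUE_R, 0.60, 0.50)

print(f"high-integrity setting: r = {high_integrity:.2f}")
print(f"low-integrity setting:  r = {low_integrity:.2f}")
```

Under these assumed values, the same underlying relationship yields a noticeably smaller observed estimate in the noisier setting, which is one way measurement quality alone could produce interstudy variability.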
