From noise to knowledge: Improving evidentiary standards for program efficacy to better inform public policy and management decisions
Published: 01 September 2023
DOI: https://doi.org/10.1111/puar.13688
RESEARCH ARTICLE
Kathryn E. Newcomer¹ | Jeremy L. Hall² | Sanjay K. Pandey¹ | Travis Reginal¹ | Ben White¹

¹Trachtenberg School of Public Policy and Public Administration, The George Washington University, Washington, DC, USA
²DPAC 448R, School of Public Administration, University of Central Florida, Orlando, Florida, USA
Correspondence: Kathryn E. Newcomer, Trachtenberg School of Public Policy and Public Administration, The George Washington University, 805 21st St NW, Suite 601, Washington, DC 20052, USA. Email: newcomer@gwu.edu
Abstract
Current approaches employed by U.S.-based clearinghouses to rate the efficacy of interventions to address social problems typically do not result in sufficient information to help practitioners. Current standards of evidence employed across the United States apply a positivist notion of validity with quantitative research criteria that discourage answering important how and why questions, explicitly privilege quantitative/RCT evidence, offer few contextual insights, and rarely discuss disparities in outcomes across participants differing by race, gender, and ethnicity. We offer a set of standards of evidence to assess qualitative and mixed methods studies, as well as RCTs and quasi-experimental designs, and we probe the extent to which studies address context and equity. To demonstrate the usefulness of our standards, we applied them to all intervention studies rated as the highest quality by the What Works Clearinghouse (WWC), sponsored by the U.S. Department of Education, from 2017 to 2021.
Evidence for practice
• Registries and clearinghouses that rate the rigor of evaluations assessing intervention efficacy should revise the standards of evidence they employ to include and assess the quality of evaluations that rely on multiple methods, and should require more from evaluators about the role of context and about disparities in outcomes across study participants differing by race, gender, and ethnicity.
• Registries and clearinghouses should be transparent about the currently narrow and restrictive criteria that they employ when vetting and rating interventions.
• Policymakers and managers should be judicious in their application of so-called evidence-based interventions by reflecting on the importance of context and the possibility of differential impacts across groups.
INTRODUCTION
Many policy advocates across the world have joined the race to promote the use of stronger evidence to inform decision-making in the public arena. Some government agencies and nonprofit sponsors are key players in the evidence-based policy and practice movement, as they disseminate vetted interventions through websites and online portals called clearinghouses or registries and provide information on the interventions’ impact on specific outcomes, in large part imitating the work of the UK-based Cochrane Collaboration in the medical arena (https://www.cochrane.org).
The stated objectives of the clearinghouses in the
United States are to offer vetted, thus “evidence-based,”
interventions to be adopted by governments and practitioners. The clearinghouses typically provide a searchable database of vetted programs and practices, organized so that scholars and practitioners can search for programs relevant to their problems, and they highlight the most rigorous studies to bring their evidence to larger audiences, especially practitioners; the What Works Clearinghouse website sponsored by the U.S. Department of Education is one example (see https://ies.ed.gov/ncee/wwc/).
Some of the clearinghouses follow the work of British-based clearinghouses, such as the Campbell Collaboration (see https://campbellcollaboration.org/), and review and synthesize existing research on a topic to provide recommendations on what works (and to what extent) through systematic reviews. The mission of the CLEAR website hosted by the U.S. Department of Labor is typical of most: “CLEAR’s mission is to make research on labor topics more accessible to practitioners, policy makers, researchers, and the public more broadly so that it can inform their decisions about labor policies and programs. CLEAR identifies and summarizes many types of research, including descriptive statistical studies and outcome analyses, implementation, and causal impact studies. For causal impact studies, CLEAR assesses the strength of the design and methodology in studies that look at the effectiveness of particular policies and programs” (https://clear.dol.gov/). The CLEAR website is the newest in the United States, and is more inclusive, but it still emphasizes causal studies. For the most part, these efforts are well-intentioned, and they do contribute knowledge that could inform and shape policy and management decisions in the age of “evidence-based everything” (Hall, 2021).¹
To assess evidence quality in the studies they review, clearinghouses each employ their own criteria and weighting schemes to rate included programs or studies. The websites then post studies that meet those criteria. For example, the U.S. Department of Health and Human Services states: “To meet HHS’ criteria for an ‘evidence-based early childhood home visiting service delivery model,’ models must meet at least one of the following criteria: (1) At least one high- or moderate-rated impact study of the model finds favorable (statistically significant) impacts in two or more of the eight outcome domains, or (2) At least two high- or moderate-rated impact studies of the model (using non-overlapping analytic study samples) find one or more favorable (statistically significant) impacts in the same domain” (https://homvee.acf.hhs.gov/about-us/hhs-criteria).
It is rare for any U.S.-based clearinghouse to post interventions based on qualitative data or mixed methods studies. The priority given to quantitative methods is likely a result of the evidence movement’s inclination toward a positivist orientation to research; in other words, the favored studies employ empirical testing, hyperrationality, rigor, and comparable statistics measuring effect size and goodness of model fit (for a discussion of the importance of symbolism versus substance with respect to evidence-based practices, see Hall and Battaglio (2018)).
Some clearinghouses include only studies that used RCTs, while others assign studies to ordinal categories based on the designs used. The Clearinghouse for Labor Evaluation and Research (CLEAR), for example, classifies studies as Low, Moderate, or High based on the design used, and only RCTs and interrupted time series designs garner the top category (see https://clear.dol.gov/). Despite inconsistencies in clearinghouse rating approaches (Means et al., 2015), quantitative standards of evidence remain entrenched (see Jennings & Hall, 2012; Zheng et al., 2022).
The current approaches clearinghouses employ to rate the efficacy of interventions to address social problems typically do not provide sufficient information to help practitioners and policy makers address complex problems, especially those exacerbated by institutionalized racism (Newcomer et al., 2022). More than a decade ago, Khagram and Thomas envisioned that the year 2020 would see the advent of a “platinum” standard of evidence, one that would combine RCT evidence with contextual details. However, 2020 is now in the rearview mirror, and clearinghouses persist in using an overly reductionist normative framework that is no longer appropriate or sufficient to support the complex nature of networked, collaborative, coproduced, intergovernmental, and intersectoral governance, especially not in an era where equity and social justice are dominant among public values.
The social, economic, and political challenges facing modern public service providers are dynamic, heterogeneous, and often impact different groups in different ways. As Ray Pawson and other “realist evaluators” have been highlighting for years (Pawson, 2013), an overly reductionist lens that focuses on simple interventions often ignores: the root causes and multiple symptoms of the problems that an intervention addresses; the differential impact of interventions in different contexts and for different subgroups (e.g., race, ethnicity, gender, social class); variation among the preferences and goal structures of the implementing agencies; and the complex nature of implementation networks due to our intergovernmental system of grants, contracts, and cooperative agreements to deliver public actions, as well as joined-up or whole-of-government efforts to address wicked problems. Program implementation is seldom uniform, context matters, and implementation mechanisms and features matter.
In sum, the standards of evidence currently employed by clearinghouses across the United States discourage researchers from answering important how and why questions, explicitly privilege quantitative/RCT evidence over any other research design, offer few contextual insights, and rarely discuss disparities in outcomes across participants differing by race, gender, and ethnicity. When the standards of evidence that are employed to rate