Testing, Accountability, and School Improvement

Author: Susanna Loeb, Erika Byun
Published date: 01 May 2019
DOI: http://doi.org/10.1177/0002716219839929
Subject Matter: Assessments for System Accountability
The Annals of The American Academy
The provision of public schooling in the United States has primarily been the
states’ responsibility, but states generally lack the capacity to manage day-to-day
school operations. Thus, states delegate responsibility to districts while
maintaining some oversight. Forms of oversight include regulations and political
and market-based accountability. However, these can only do so much in holding
schools accountable for providing high-quality schooling. Administrative
accountability based on student outcomes and school process measures presents an
alternative to complement other accountability mechanisms. Standardized measures
of performance used for administrative accountability can better align curriculum
with state standards, improve quality, and signal the skills that society wishes
for students to build. However, they can be counterproductive if they are not
reliable, valid, or comprehensive. We suggest in this article that no measure is
perfect and that the usefulness of test-based accountability depends on whether
the measures enhance educational opportunities and reflect shared goals with
reliability, validity, and comprehensiveness.
Keywords: standardized testing; student outcomes; accountability
By SUSANNA LOEB and ERIKA BYUN
Susanna Loeb is the director of the Annenberg Institute and a professor of
international and public affairs and education at Brown University.
Erika Byun is a Stanford University Institute for Economic Policy Research (SIEPR)
predoctoral research fellow, focusing on the economics of education.
Correspondence: loeb@brown.edu
ANNALS, AAPSS, 683, May 2019

Although “accountability” has become a contentious term in contemporary debates
about the quality of education, it has a hallowed place in American history. At
least from the early nineteenth century, during the period of “common school”
reforms, questions about the quality of instruction and the unequal allocation of
educational resources were fueled by the evolving American principle of holding
public officials accountable to the citizens (Tyack 1974). As one tool for giving
parents and the public information on how their schools were performing (an
example of accountability), standardized testing emerged in the 1830s (Vinovskis,
this volume; U.S. Congress Office of Technology Assessment 1992; Kaestle 2013).
The phrase “testing and accountability,” therefore, has long and deep roots in
U.S. educational history (e.g., McDonnell 2004; Feuer 2012).
Despite the long-standing presence of testing and accountability in the
American public school system, discussions of the uses and purposes of testing
remain fraught. In this article, we provide a framework for evaluating the current
use of standardized measures of performance for providing oversight over
schools and explore prospects for the future of testing as a tool for educational
policy. We begin by describing models of accountability—regulatory, political,
market, and administrative systems—that have evolved in many areas including,
but not limited to, education. We then turn to the specific challenges of applying
those models to the goal of enabling coherent oversight of a schooling system that
is, by design, diffuse and fragmented. Though the provision of schooling in the
United States has primarily been the responsibility of states, states generally lack
the capacity to manage day-to-day school operations and delegate responsibility
to districts—close to 14,000 nationwide—while maintaining some oversight. The
complexities of structure and governance of schooling pose challenges to the
design and implementation of coherent and effective accountability systems.
Our review of evidence—from research on accountability generally and
from decades of trials of test-based accountability specifically—reaffirms a
familiar finding, namely, that all measurement systems have imperfections.
Consequently, we suggest that the criterion for judging the relative merits of
testing as a source of accountability data should be whether, on balance, the
benefits outweigh the costs and risks. Our “bottom line” is to favor continued
development of reliable, valid, and comprehensive measures for test-based
accountability, provided that they advance educational progress and preempt
potential negative consequences.
Oversight and Accountability in American Schools
Throughout its history, U.S. education has been decentralized (Kaestle 1983;
Feuer 2006). In the seventeenth and eighteenth centuries, local communities
created their own schools based on their priorities and values. Today, states
remain largely responsible for providing public education and make operational
decisions including those related to class size, content standards, teacher
licensure, and graduation requirements. Federal sources contribute only
approximately 8 percent of elementary and secondary schools’ budgets under the
current finance model.1 Recent passage of the Every Student Succeeds Act
(ESSA) reinforced the tradition of a delimited federal role by shifting
considerable authority for standards and accountability back to the states (see
also Hanushek, this volume; Vinovskis, this volume).

States vary in their capacity, and perhaps their desire, to manage daily operations
of schools. Most have consequently delegated a portion, and usually a large
portion, of the responsibility for running schools to local educational agencies
(LEAs), also known as districts, which have the advantage of being closer to
schools and thus having a better understanding of local contextual factors. Still,
despite this assumed advantage, unchecked local control can lead to substantial
variation across schools and create or exacerbate disparities in educational
opportunities for students within (and between) states. Districts and schools
differ in their student populations, community goals, and track records of tried
and successful implementations (O’Day 2002; O’Day and Smith 2016; Spillane, Reiser,
and Gomez 2006).
To reduce such disparities, states exercise some form of oversight over school
districts, often by relying on standardized measures of school performance
designed to provide relevant information for assessing the extent to which local
actors (teachers, principals, leaders, and so forth) are progressing toward state
goals. These measures may capture school processes, such as observations from
school inspectors; or they may focus on student outcomes, such as standardized
tests. However, no measure is perfect. Measures may not accurately reflect true
performance; that is, they may be unreliable. They may not measure performance
over a domain of true interest; that is, they may not be valid for the goals
of the state. They may not cover all valued domains; that is, they may not be
comprehensive. The question, then, is whether, given the shortcomings of
particular measures, the available metrics yield worthwhile information for
decision-making and oversight, and whether additional or alternative measures
would be beneficial.
Regulation, Politics, and Markets
Not all forms of oversight require standardized measures of performance. One
approach that states have taken to monitor schools is setting regulations that
define legal requirements and resource allocations ex ante. Regulations are
primarily preemptive, intended to prevent unwanted risks and consequences. For
example, regulatory policy can provide strict guidelines for processes such as
the hiring and firing of school employees and the appropriate use of funds and
can also determine standards for school inputs such as the maximum class size and
the necessary credentials for teachers and administrators.
These regulatory forms of monitoring and oversight tend to set a minimum on
quality. They are not designed to provide the information needed to better
inform future decisions and do not encourage improvement in the quality of the
system beyond the floor. Furthermore, such regulations may have unintended
consequences; for example, teachers may gravitate toward low-cost, low-quality
certification programs to fulfill certification requirements. Some regulations
may also be unnecessarily expensive and inefficient for achieving society’s
educational goals. Studies of the effects of various regulations on student
achievement and educational attainment have found both positive and negative
impacts (Hanushek 1997; Hanushek, this volume; Jepsen and Rivkin 2009).
While regulations can usefully set floors on quality, other forms of oversight
and accountability aim to improve quality above a floor. These approaches can
use political processes, market forces, or administrative data to encourage schools
to reach defined goals. In political accountability, elected officials are the actors
making decisions about schools on behalf of their electorate. For example, the
election of school board members for a particular school district by local
residents creates an infrastructure for running schools. In...
