Assessing Complex Patterns of Student Resources and Behavior in the Large Scale

Published: 1 May 2019
DOI: 10.1177/0002716219844963
Subject matter: Toward the Future: Theories of Knowing and Implications for Assessment
Large-scale assessments still focus on those aspects of students’ competence that can be evaluated using paper-and-pencil tests (or computer-administered versions thereof). Performance tests are considered costly due to administration and scoring, and, more importantly, they are limited in reliability and validity. In this article, we demonstrate how a sociocognitive perspective provides an understanding of these issues and how, based on this understanding, an argument-based approach to assessment design, interpretation, and use can help to develop comprehensive, yet reliable and valid, performance-based assessments of student competence. More specifically, we describe the development of a computer-administered, simulation-based assessment that can reliably and validly assess students’ competence to plan, perform, and analyze physics experiments at a large scale. Data from multiple validation studies support the potential of adopting a sociocognitive perspective and assessments based on an argument-based approach to design, interpretation, and use. We conclude by discussing the potential of simulations and automated scoring methods for reliable and valid performance-based assessments of student competence.

Keywords: competence; experimentation; sociocognitive perspective; performance assessment; large scale
By Knut Neumann, Horst Schecker, and Heike Theyßen

Knut Neumann is a professor of physics education at the IPN Kiel. His research focuses on the assessment of student competence in science.

Horst Schecker is a professor of physics education at the University of Bremen. His research focuses on modeling and improving students’ competencies in physics.

Heike Theyßen is a professor of physics education at the University Duisburg-Essen. Her research focuses on experimental skills and knowledge development in physics.

Correspondence: neumann@ipn.uni-kiel.de

The increasing need for citizens sufficiently literate in science has led to a growing call for educational reform around the world (e.g., Bybee and Fuchs 2006). For too long, science education has focused on teaching students knowledge, skills, and abilities in isolation from each other (Schmidt, McKnight, and Raizen 1997). To adequately prepare students for a life in a world shaped by scientific and technological advancements, students must develop the competence to engage in the practices of science and engineering (e.g., National Research Council [NRC] 2012; for an overview, see Waddington, Nentwig, and Schanze 2007). These practices include identifying questions about phenomena, conducting investigations to examine phenomena, and constructing models to explain phenomena, as well as arguing over competing explanations of the same phenomenon. One key method of investigation is experimentation (Emden and Sumfleth 2016, 29). Therefore, supporting students in developing competence in experimentation, that is, to plan and perform experiments as well as to analyze and interpret the data collected through these experiments, addresses the major aim of science education—to ensure a scientifically literate citizenry (NRC 2012).
To help implement educational reform that can support students in developing competence in experimentation, assessments are needed that can provide information on student learning—for purposes of classroom learning (i.e., to help teachers plan instruction) but also for accountability purposes (e.g., to evaluate curricula or educational programs; NRC 2014). However, whereas classroom assessments cover the full range of students’ competence in planning, performing, and analyzing experiments (e.g., Nawrath, Maiseyenka, and Schecker 2011), assessments for accountability or monitoring purposes (e.g., the National Assessment of Educational Progress) have mostly focused on the planning of experiments or the analysis of the collected data, if competence in experimentation is assessed at all (e.g., Grigg, Lauko, and Brockway 2006; see also Quellmalz et al. 2007). This is because assessing the performance of experiments comes with a range of difficulties when implemented at a large scale (e.g., Baxter et al. 1992). Paper-and-pencil tests, favored for use in large-scale studies for their efficiency with respect to administration and scoring (Clarke-Midura and Dede 2010), have been found to underrepresent competence in performing experiments (e.g., Shavelson, Ruiz-Primo, and Wiley 1999; Stecher et al. 2000). Performance tests, on the other hand, are costlier to develop and score (Stecher and Klein 1997); and, more importantly, student scores have been found to depend on multiple factors, including the specific selection of tasks (e.g., Shavelson, Baxter, and Gao 1993), the time the tasks are administered (e.g., Shavelson, Ruiz-Primo, and Wiley 1999), and the content knowledge required to address the tasks (e.g., Gut-Glanzmann 2012). As a result, there is a paucity of assessments that can provide reliable and valid information on students’ competence in experimentation.
In this article, we describe our efforts to develop a performance assessment that can reliably and validly assess students’ competence in planning, performing, and analyzing physics experiments for use on a large scale. We begin by analyzing previous efforts from a sociocognitive perspective to obtain insights into the problems underlying these efforts. Based on this analysis, we identify implications for developing the needed assessments. We then detail the steps we have taken to ensure that the assessment we constructed will provide reliable and valid information, and we draw on evidence from multiple validation studies undertaken throughout this process to create an argument for how the assessment we developed yields information about students’ competence in experimentation. We conclude by discussing the implications for the assessment of other aspects of student competence in science that have also been neglected because performance assessments have been deemed too inefficient for use on a large scale. In doing so, we briefly address the future of assessment and assessment development, highlighting the role of psychometric and technological developments for performance assessments.

NOTE: This work has been supported by a grant from the German Federal Ministry of Education and Research (FKZ 01LSA005).
A Sociocognitive Perspective on Assessing Experimentation Competence
Competence in the practices of science and engineering, although based on a profound knowledge of science, also requires a range of skills and abilities. Competence in any domain is in fact characterized by the capacity to integrate different knowledge, skills, and abilities (KSAs) required to identify and solve problems typical of the domain across a wide(r) range of contexts (Weinert 2001). Accordingly, the competence to plan, perform, and analyze experiments in physics requires a wide range of different KSAs, among them disciplinary knowledge of physics concepts and principles (e.g., knowledge about the factors that may influence measurement of the current through a resistor as a function of the voltage applied to it), the skills to manipulate experimental equipment (e.g., the skill to adjust the voltage or to use an ammeter to measure the current), and the ability to represent the obtained data graphically (e.g., the ability to create a diagram plotting the voltage applied against the current measured for the resistor). Taken together, these KSAs, or more generally the competence to plan and perform experiments and to analyze and interpret the obtained data, are termed “experimental (or experimentation) competence” (Gut-Glanzmann 2012).
Most efforts to model experimentation competence are based on Klahr and Dunbar’s (1988) scientific discovery as dual search (SDDS) model. This model describes experimentation in two spaces: the hypothesis space and the experiment space. The hypothesis space includes potential hypotheses in all variations, valid or not. The experiment space consists of the experiments suitable for testing the hypotheses. The actual process of scientific investigation through experimentation then involves (multiple iterations of) three consecutive steps: (1) the search for a potentially applicable hypothesis in the hypothesis space; (2) the testing of the identified hypothesis through the selection and implementation of an experiment from the experiment space; and (3) the analysis of the evidence obtained from the experiment in light of the hypothesis and the decision whether the hypothesis can be retained or needs to be rejected—that is, essentially, the planning, performance, and analysis of experiments (Emden and Sumfleth 2016).
More differentiated models identify more steps. Nawrath, Maiseyenka, and Schecker (2011) propose one of the most differentiated models, with seven steps: (1) developing a guiding question, (2) formulating a hypothesis, (3) planning the...
