The Accuracy and Value of Voter Validation in National Surveys: Insights from Longitudinal and Cross-Sectional Studies

Published 1 June 2021
Political Research Quarterly, 2021, Vol. 74(2), 332–347
© 2020 University of Utah
DOI: 10.1177/1065912920903432
In recent decades, there has been growing concern about
the accuracy of self-reported survey responses about elec-
tion participation. Previous studies have found that par-
ticipants misreport1 voting by double-digits (Ansolabehere
and Hersh 2012; Berent, Krosnick, and Lupia 2016;
Clausen 1968; Enamorado and Imai 2018b; Traugott and
Katosh 1979) and blame this survey error on a variety of
factors, including nonresponse bias among survey target
samples, social desirability, survey mode, panel attrition,
and errors in government records or the matching process
(Berent, Krosnick, and Lupia 2011; Burden 2000; DeBell
et al. 2018; Jackman and Spahn 2019). Misreporting
leads to inaccurate survey estimates of turnout rates, with
these estimates significantly exceeding the turnout rate
reported by official election results (Ansolabehere and
Hersh 2012). A recent analysis of the 2016 American
National Election Study (ANES) and Cooperative
Congressional Election Study (CCES) studies found the
self-reported turnout rate to be at least 15 percentage
points higher than the official turnout rate (Enamorado
and Imai 2018b). This observation is not limited to U.S.
election studies, as one meta-analysis observed an aver-
age 13 percent gap in reported versus official turnout in
130 studies across forty-three countries (Selb and
Munzert 2013).
For the first time in the validation literature, we use
election survey data from a thirty-year national longitudi-
nal study to estimate and analyze vote overreporting in the
2016 presidential election. The availability of three
decades of longitudinal information about residence loca-
tion, education, marital status, and political engagement
provides an important framework for examining self-reported voting activity.

Jon D. Miller,1 Jason Kalmbach,2 Logan T. Woods,1 and Claire Cepuran1

1University of Michigan, Ann Arbor, USA
2University of Wisconsin–Oshkosh, USA

Corresponding Author: Jon D. Miller, International Center for the Advancement of Scientific Literacy, Institute for Social Research, University of Michigan, 426 Thompson Street, Ann Arbor, MI 48106, USA.

Abstract
Ansolabehere and Hersh and others have examined the reported voting behavior of survey respondents using a variety of validation methods, including matching with national voter files provided by outside vendors. This analysis provides the first examination of a thirty-year national longitudinal study and compares the insights obtained from this longitudinal analysis to two 2016 national cross-sectional studies of voting behavior using structural equation modeling. We find that respondents in the longitudinal study overreport at lower rates than respondents in our 2016 samples; that traditional predictors of overreporting, such as political interest, engagement, and partisanship, predict overreporting among respondents in both our longitudinal and 2016 short-term panel studies; and that our longitudinal data include novel predictors of overreporting, such as parent socialization factors. We conclude with a discussion of the phenomenon of overreporting in surveys and of how survey accuracy becomes increasingly important for both the public and policymakers in an era of decreasing trust in institutions and expertise.

Keywords: voter validation, vote overreporting, 2016 presidential election, longitudinal, structural equation model, ideological

The insights into overreporting in our national longitudinal sample reflect the three decades—from middle school into their mid-40s—of
socialization and life course experience that is unparal-
leled in the voting validation literature. We compare our
findings from the longitudinal analysis to two national
cross-sectional surveys of adults (age 18 and older) con-
ducted by NORC AmeriSpeak to examine the level of
misreporting and to explore the reasons for misreporting.
Primary Data Sets Used
The Longitudinal Study of American Life
The longitudinal study used for this analysis is the
Longitudinal Study of American Life (LSAL). The study
was funded by the National Science Foundation (NSF)2
in 1985 to assess the scientific knowledge and character-
istics of America’s youth. Originally called the
Longitudinal Study of American Youth (LSAY), the study
was launched in the fall of 1987 using a sample of sev-
enth- and tenth-grade public school students. Both
cohorts were selected through a multistage probability
sample3 and are a national representation of public school
students in those grades in 1987. In 2013, NSF support
for the LSAY was not renewed; however, new support
was secured from the National Institute on Aging (NIA)
to continue the study,4 and the study was renamed the
Longitudinal Study of American Life. Of the 5,945 stu-
dents in the initial 1987 sample, approximately 5,100 are
still eligible to participate. Approximately 4,100 have
completed two or more surveys5 since 2007.
This analysis uses survey responses from 3,150 young
adults who completed the LSAL questionnaire in 2016
and provided information about their participation in the
2016 presidential election, focusing primarily on the
approximately 800 individuals who did not vote. These
young adults were forty-two to forty-six years of age in
2016 and represent the middle of the Generation X age6
distribution. The data from the LSAY/LSAL are depos-
ited in the Inter-university Consortium for Political and
Social Research (ICPSR) and are available for secondary
analysis. To date, more than forty dissertations and 200
refereed articles have been written by secondary analysts
using LSAL data. An extended
description of the LSAY/LSAL study is included in the
Supplemental Materials (SM hereafter).
Two National Cross-Sectional Studies
To provide a reference point to the literature, which is
built primarily on cross-sectional surveys and short-term
panel studies, we use two panel studies carried out by
NORC AmeriSpeak7 in 2016 and 2017 that included elec-
tion questions using the same wording as the 2016 LSAL
survey. The 2016 survey was a two-wave panel study,
with a national probability sample of adults aged 18 and
older, conducted in February and November of 2016.
Election questions were included in the November wave
of the 2016 panel (N = 2,270) and in the February wave
of the 2017 panel (N = 2,925), the latter of which used a
new national cross-sectional sample.
Although there was some sample erosion8 in the 2016
AmeriSpeak panel, the February 2017 wave was the first
wave of a new panel and was therefore not subject to erosion.
Separate weights were computed for each cycle to adjust
for demographic differences due to attrition. A compari-
son of the two AmeriSpeak survey demographics is
included in the SM. The proportion of misreports did not
differ significantly at the .05 level between the two sam-
ples. The November 2016 and February 2017 waves were
combined for this analysis (N = 5,033).
The Use of Catalist for Vote Validation
Numerous cross-sectional and panel studies have vali-
dated the electoral participation of survey participants
using third-party vendors (Ansolabehere and Hersh 2012;
Berent, Krosnick, and Lupia 2011). For example, the
2016 ANES and 2016 CCES have provided voter valida-
tion from third party vendors (ANES 2017; Ansolabehere,
Schaffner, and Luks 2017; Enamorado, Fifield, and Imai
2018; Enamorado and Imai 2018a). Catalist, our vendor,
matches each survey participant’s name, address, and
related demographic and geographical information with a
file of voter records as well as other individual-level data
Catalist possesses.9 In this analysis, we match Catalist
voting data with both the LSAL and the AmeriSpeak sur-
veys to validate voter turnout and registration. Like
Ansolabehere and Hersh (2012), our matching proce-
dures included an iterative process between the project
team and Catalist to ensure the most accurate match pos-
sible (see SM for more details).10
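Catalist's proprietary matching procedure is not public, but the linkage step described above can be illustrated in miniature: compare normalized name and address fields between a survey roster and a voter file. The field names and the exact-match rule below are assumptions made for the sketch only; real vendors use probabilistic matching on many more fields.

```python
def normalize(value: str) -> str:
    """Crude normalization: case-fold and collapse internal whitespace."""
    return " ".join(value.lower().split())


def match_to_voter_file(respondents, voter_file):
    """Return {respondent_id: voter_record or None} by exact match
    on normalized (name, address) pairs. A toy stand-in for the
    iterative vendor matching described in the text."""
    index = {
        (normalize(rec["name"]), normalize(rec["address"])): rec
        for rec in voter_file
    }
    return {
        resp["id"]: index.get(
            (normalize(resp["name"]), normalize(resp["address"]))
        )
        for resp in respondents
    }


# Hypothetical records: the survey self-report and the voter file
# spell the same person differently, but normalization aligns them.
respondents = [{"id": 1, "name": "Jane Doe", "address": "12 Oak St"}]
voter_file = [{"name": "JANE DOE", "address": "12 Oak  St", "voted_2016": True}]
print(match_to_voter_file(respondents, voter_file))
```

In practice, exact matching on two fields would miss moved or re-registered respondents, which is why the article's matching involved an iterative back-and-forth with the vendor.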
The Magnitude of Misreporting
Our first objective is to examine the magnitude of misre-
porting in the LSAL and AmeriSpeak samples. We define
a validated voter as an individual who reported that he or
she voted in the 2016 election and for whom Catalist was
able to find a public record verifying that vote.
Overreporters consist of respondents who reported that
they voted in 2016 but no public record was found by
Catalist to support that claim. Individuals who reported
that they did not vote in 2016, but Catalist found a public
record that they cast a vote in the 2016 election were classified as underreporters. Respondents who reported that they did not vote in 2016 and for whom there is no public record that they cast a vote are coded as validated nonvoters.
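The four-cell classification just described crosses the self-report against the Catalist record search, and can be sketched as a simple function; the argument names are illustrative, not the variable names used in the LSAL or AmeriSpeak data sets.

```python
def classify_respondent(self_reported_vote: bool, record_found: bool) -> str:
    """Cross the 2016 self-report with the voter-file record search,
    following the definitions given in the text."""
    if self_reported_vote and record_found:
        return "validated voter"
    if self_reported_vote and not record_found:
        return "overreporter"
    if not self_reported_vote and record_found:
        return "underreporter"
    return "validated nonvoter"


# Example: a respondent who claimed to vote but has no matching record.
print(classify_respondent(True, False))  # overreporter
```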
