Handling Missing Values in Longitudinal Panel Data With Multiple Imputation

Author: Rebekah Young, David R. Johnson
Date: 01 February 2015
DOI: http://doi.org/10.1111/jomf.12144
Published date: 01 February 2015
R Y University of Washington
D R. J The Pennsylvania State University
This article offers an applied review of key issues
and methods for the analysis of longitudinal
panel data in the presence of missing values. The
authors consider the unique challenges associ-
ated with attrition (survey dropout), incomplete
repeated measures, and unknown observations
of time. Using simulated data based on 4 waves
of the Marital Instability Over the Life Course
Study (n = 2,034), they applied a fixed effect
regression model and an event-history analysis
with time-varying covariates. They then compared
results for analyses with nonimputed missing
data and with imputed data both in long
and in wide structures. Imputation produced
improved estimates in the event-history analysis
but only modest improvements in the estimates
and standard errors of the fixed effects analysis.
Factors responsible for differences in the value
of imputation are examined, and recommenda-
tions for handling missing values in panel data
are presented.
Department of Biostatistics, Collaborative Health Studies
Coordinating Center, University of Washington, Box
354922, Seattle, WA 98195.
Department of Sociology, The Pennsylvania State
University, 211 Oswald Tower, University Park, PA 16802
(drj10@psu.edu).
Key Words: event history analysis, fixed effects, longitudinal
data, missing data, multiple imputation, panel data.

The use of longitudinal panel (prospective)
survey data is common in the area of family
research. From 2010 to 2014, approximately
287 quantitative and qualitative research arti-
cles (excluding theory development, research
reviews, comments, rejoinders, and method-
ological innovation articles) were published in
the Journal of Marriage and Family (JMF). Of
these, 176 (61%) analyzed longitudinal data.
Data on the same individuals or families at
multiple points in time provide for stronger
inferences about change processes and allow
for more control of unmeasured differences
between individuals that can bias study findings
(Johnson, 1995, 2005). What tempers
these advantages is the large amount of miss-
ing data found in many longitudinal studies.
Nearly all of the JMF articles explicitly men-
tioned the presence of missing values and study
dropout—suggestive of the widespread concern
with missing data in panel studies.
Few guidelines for the analysis of longitudi-
nal panel data in the presence of missing values
are accessible to family researchers. Moreover,
no clear appraisals of the consequences of
different ways of handling missing data are
readily offered. Existing guidelines tend to be
directed toward statisticians or focus on types of
longitudinal data rarely found in the family lit-
erature, such as randomized clinical trials (e.g.,
Daniels & Hogan, 2008; Enders, 2011; Hedeker
& Gibbons, 2006; National Research Council,
2010) or data sets with few cases but many
waves, such as cross-national time-series stud-
ies (e.g., Honaker & King, 2010). Methods for
handling missing values have been addressed
in the family literature (e.g., Acock, 2005;
Journal of Marriage and Family 77 (February 2015): 277–294 277
Johnson & Young, 2011; Young & Johnson,
2013), but these resources focus primarily on
cross-sectional data. Although much of what we
know about the approaches to handling missing
values in cross-sectional situations applies to
longitudinal panel data, panel data have char-
acteristics that complicate the application of
techniques such as multiple imputation (MI).
Such complications, along with a lack of acces-
sible guides to help address these issues, may
be contributing to the limited use of modern
methods like maximum likelihood (ML) or MI
among the many studies in the area of family
that use longitudinal data (Jelicic, Phelps, &
Lerner, 2009).
In this article, we review standard approaches
to handling missing data in longitudinal panel
studies, apply several techniques in a simulation
study based on an empirical family
research problem using a multiwave panel data
set, and assess how different strategies have
consequences for the research findings. Our
focus is on missing values in panel data sets
with large numbers of respondents but small
numbers of survey waves administered at fixed
intervals, typical conditions for data sets found
in much family research. We examine MI
strategies for handling missing data with fixed effect, pooled time-series
models and event-history (Cox proportional
hazard) models. Our review of the
methods used in 176 JMF articles suggests that
the most common models for analyzing longitudinal
data were event history (19%), fixed
effects (18%, or 19% including change scores),
and mixed effect or multilevel (17%, or 22%
including growth curve), followed by linear
regression (16%), logistic regression (15%), and
structural equation models (10%, or 15% includ-
ing growth curve and latent class analysis). Less
common methods for analyzing longitudinal
data included multinomial regression (5%) and
qualitative analysis (2%). (Note that percentages
sum to more than 100% because many articles
used more than one method.)
Background
Longitudinal panel studies have several fea-
tures that complicate the techniques commonly
applied when handling missing data. Unlike
cross-sectional data sets, longitudinal data
sets have both within-wave and whole-wave
missingness. Longitudinal data analysis meth-
ods require a particular data structure (long
vs. wide) that creates issues when handling
missing data (Lloyd, Obradovic, Carpiano, &
Motti-Stefanidi, 2013). Other complications
of missing values in longitudinal data include
repeated measures; time-to-event models; non-
random study dropout; and statistical procedures
that routinely handle some, but not all, sources
of missing data.
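
The long versus wide distinction can be illustrated with a small sketch. The example below is not taken from the study: the respondent IDs, the variable name `happy`, the `_w1`-style column suffixes, and the helper `wide_to_long` are all hypothetical, and plain Python is used only to keep the sketch self-contained. It shows that a missing repeated measure is a single empty cell in the wide structure but becomes a separate respondent-wave row in the long structure.

```python
# Hypothetical wide-format panel: one row per respondent, one column per
# variable-wave combination. None marks a missing value.
wide = [
    {"id": 1, "happy_w1": 3, "happy_w2": 2, "happy_w3": None},
    {"id": 2, "happy_w1": 1, "happy_w2": None, "happy_w3": None},
]


def wide_to_long(rows, stub, waves):
    """Reshape wide rows into long format: one row per respondent-wave."""
    long_rows = []
    for row in rows:
        for wave in waves:
            long_rows.append(
                {"id": row["id"], "wave": wave, stub: row[f"{stub}_w{wave}"]}
            )
    return long_rows


long_rows = wide_to_long(wide, "happy", waves=[1, 2, 3])
for record in long_rows:
    print(record)
```

In the wide rows, each respondent's observed waves sit alongside the missing ones, whereas in the long rows each respondent-wave stands alone, which is one reason the choice of structure may matter when an imputation model is specified.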
Two Sources of Missing Values
Missing values in panel data can occur in variables
within a wave and when a full wave of data
is missing for a respondent. Within-wave miss-
ing values result from typical item nonresponse
that is found in any cross-sectional study. These
missing values occur when a valid response is
not recorded for a survey question either because
the participant chose not to answer the question
or an interviewer failed to record the answer.
Item nonresponse occurs most frequently for
sensitive questions (e.g., regarding income or
sexual behavior) and questions that are difficult
to answer (e.g., recalling a date; De Leeuw, Hox,
& Huisman, 2003). Within-wave missing data in
panel studies can also occur when questions are
included in only some study waves.
Whole-wave missingness occurs when
respondents do not participate in all data col-
lection time points. In a four-wave panel study,
for example, a respondent may participate only
in the rst two waves before dropping out of
the study. This produces missing data on all
variables in the two subsequent waves. The
result is a substantial amount of missing data for
the time period covered by the wave, although
time-invariant characteristics (e.g., date of birth)
may be carried over from an earlier wave. When
respondents are missing entire waves of data,
too little information is available in the wave
to inform the data analysis, and information
on time-varying change is lost because of the
missing waves.
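
The two sources of missing values can be told apart mechanically in a long-format file. The sketch below is an illustration under assumptions, not the authors' procedure: the records, the variable names `income` and `happy`, and the function `classify_missingness` are hypothetical. It treats a respondent-wave with no record (or with every variable missing) as whole-wave missingness, and a record with only some variables missing as within-wave missingness.

```python
# Hypothetical long-format records for a four-wave panel. Respondent 1 takes
# part in every wave but skips one item; respondent 2 drops out after wave 2.
records = [
    {"id": 1, "wave": 1, "income": 40, "happy": 3},
    {"id": 1, "wave": 2, "income": None, "happy": 2},  # item nonresponse
    {"id": 1, "wave": 3, "income": 44, "happy": 2},
    {"id": 1, "wave": 4, "income": 45, "happy": 3},
    {"id": 2, "wave": 1, "income": 55, "happy": 1},
    {"id": 2, "wave": 2, "income": 58, "happy": 1},
]


def classify_missingness(records, ids, waves, variables):
    """Label each respondent-wave as complete, within-wave, or whole-wave."""
    observed = {(r["id"], r["wave"]): r for r in records}
    status = {}
    for pid in ids:
        for wave in waves:
            row = observed.get((pid, wave))
            if row is None or all(row[v] is None for v in variables):
                status[(pid, wave)] = "whole-wave"
            elif any(row[v] is None for v in variables):
                status[(pid, wave)] = "within-wave"
            else:
                status[(pid, wave)] = "complete"
    return status


status = classify_missingness(
    records, ids=[1, 2], waves=[1, 2, 3, 4], variables=["income", "happy"]
)
for key in sorted(status):
    print(key, status[key])
```

A tabulation of this kind is a reasonable first step before imputation, because it shows how much of the missing information comes from item nonresponse and how much from skipped or dropped waves.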
Attrition in longitudinal panel studies (or
study dropout) has received much attention
in the literature, and several strategies for
statistically evaluating and adjusting for the
consequences of attrition have been developed
(Groves, Dillman, Elting, & Little, 2002; Little,
1995). The attrition literature focuses heavily
on the potential for biased statistical estimates
that could result from overlooking attrition. In
medical clinical trials, for instance, the dropouts
from the trial may have been persons for whom
