Using paradata to collect better survey data: Evidence from a household survey in Tanzania

Author: Henry Cust, Johanna Choumert-Nkolo, Callum Taylor
DOI: http://doi.org/10.1111/rode.12583
Date: 01 May 2019
Published date: 01 May 2019
wileyonlinelibrary.com/journal/rode Rev Dev Econ. 2019;23:598–618.
© 2019 John Wiley & Sons Ltd
REGULAR ARTICLE
JohannaChoumert-Nkolo
|
HenryCust
|
CallumTaylor
Economic Development Initiatives (EDI)
Limited, High Wycombe, United Kingdom
Correspondence
Johanna Choumert-Nkolo, Economic
Development Initiatives (EDI) Limited,
38 Crendon Street, High Wycombe,
Buckinghamshire HP13 6LA, United
Kingdom.
Email: j.choumert.nkolo@surveybe.com
Funding information
This survey benefited from the financial
support of the UONGOZI Institute.
Abstract
Data are a key component in the design, implementation,
and evaluation of economic and social policies. Monitoring
data quality is an essential part of any serious, large-scale
data collection process. The purpose of this article is to show
how paradata should be used before, during, and after data
collection to monitor and improve data quality. To do this
we use timestamps, global positioning system (GPS) coordinates,
and other paradata collected from an 800-household
survey conducted in Tanzania in 2016. We demonstrate how
key paradata can be used during each phase of a research
project to identify and prevent issues in the data and the
methods used to collect them. Our results corroborate the importance
of collecting and analyzing paradata to monitor fieldwork
and ensure data quality for micro data collection in
developing countries. Based on these findings we also make
recommendations as to how researchers can make better use
of paradata in the future to manage and improve data quality.
We argue for an expansion in the understanding and use of
varied paradata among researchers, and a greater focus on their
use for improving data quality.
KEYWORDS
data quality, face-to-face interview, paradata, timestamp, GPS,
interviewer
1 | INTRODUCTION
Data quality is a public good. In recent years there has been a sharp rise in the availability of high-
quality data relating to development economics. This has helped foster the growing importance of data
in the design, implementation, and evaluation of development programs and policies. This increasing
use and importance of data to inform policy decisions requires that the data underlying those decisions
are of high quality. Data quality is thus the focus of much attention within the field of development
economics (Jerven, 2016; Jerven & Johnston, 2015; Tasciotti & Wagner, 2017). Generally, however,
there has been relatively little research examining the quality of data and the methods used to collect them.
As pointed out by Jerven and Johnston (2015), “much academic work on Africa regularly uses flawed
data, but not all researchers demonstrate awareness of the flaws.”
Recent developments in the techniques and methods used during data collection have helped in the
struggle for high-quality data. These include the increasingly widespread use of electronic surveys and
innovative research designs in the field of impact evaluation, among others (for example, randomized
control trials and field experiments). Such improvements to research methods can only contribute
positively to decision-making by helping to ensure that decisions are based on data acquired using the
most rigorous and accurate methods. Here there is still much room for improvement, particularly in
developing-country contexts, to ensure that decisions are based on accurate and reliable data.
Issues such as measurement errors, nonresponse bias, coverage bias, and sampling errors are key
for researchers, and have been studied in detail in the literature (e.g., Caeyers, Chalmers, & De Weerdt,
2012; Grosh & Glewwe, 2000; Landry & Shen, 2005; United Nations, 2008). Yet, despite their potential
as a powerful tool for improving data quality, “paradata” have so far been widely underused
and there are very few studies highlighting their uses. The concept of paradata belongs to a longer list
of data types that can be collected and used by researchers doing fieldwork. According to Nicolaas
(2011), survey data include questionnaire data, metadata, paradata, and auxiliary data. Questionnaire
data are the respondents’ answers; metadata include sample design and questionnaire coding instructions;
auxiliary data include external data such as census data or other administrative data; and
paradata include data about the data collection process, such as timestamps to capture the length of
interviews or specific modules of the questionnaire, global positioning system (GPS) coordinates to
track where interviews take place, and interviewers’ characteristics to investigate interviewer trends.
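As a purely illustrative sketch of how timestamp paradata might capture interview length, the Python snippet below computes the duration of each interview and flags those that are much shorter than the median. The records, field names, function names, and the 50% cutoff are all hypothetical assumptions made for this example; they are not taken from the survey or from any particular CAPI software.

```python
from datetime import datetime
from statistics import median

# Hypothetical paradata records (invented for illustration, not survey data):
# one row per interview with ISO-formatted start and end timestamps.
records = [
    {"interview": "H001", "interviewer": "A", "start": "2016-05-02T09:00:00", "end": "2016-05-02T10:05:00"},
    {"interview": "H002", "interviewer": "A", "start": "2016-05-02T11:00:00", "end": "2016-05-02T12:00:00"},
    {"interview": "H003", "interviewer": "B", "start": "2016-05-02T09:30:00", "end": "2016-05-02T09:50:00"},
    {"interview": "H004", "interviewer": "B", "start": "2016-05-02T13:00:00", "end": "2016-05-02T14:10:00"},
]


def duration_minutes(record):
    """Interview length in minutes, derived from the start and end timestamps."""
    start = datetime.fromisoformat(record["start"])
    end = datetime.fromisoformat(record["end"])
    return (end - start).total_seconds() / 60


def flag_short_interviews(records, fraction=0.5):
    """Return IDs of interviews shorter than `fraction` of the median duration.

    The 0.5 default is an arbitrary illustrative threshold, not a recommendation.
    """
    durations = {r["interview"]: duration_minutes(r) for r in records}
    cutoff = fraction * median(durations.values())
    return [interview for interview, minutes in durations.items() if minutes < cutoff]


print(flag_short_interviews(records))
```

With these invented records, only the 20-minute interview falls below half the median duration. In practice, any such threshold would need to reflect the questionnaire design and the field context.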
In this paper, we focus on face-to-face surveys, which are still the dominant form of interview in
developing countries, although there is an increasing use of mobile phone surveys with growing mobile
phone penetration rates (Demombynes, Gubbins, & Romeo, 2013). In the last decade there has been
a surge in the use of electronic surveys for face-to-face interviews. This can largely be explained by
the increasing awareness of the need to collect data of the highest quality, the availability of cheaper
and more efficient ultramobile PCs and tablets, the availability of several computer-assisted-personal-interview
(CAPI) software programs, and by the significant savings in time and costs of data collection
when using CAPI (see Banks & Laurie, 2000; Caeyers et al., 2012; Carletto, Jolliffe, & Banerjee,
2015; King et al., 2013; Leeuw, 2008; Leisher, 2014; MacDonald et al., 2016; Rosero-Bixby,
Hidalgo-Céspedes, Antich-Montero, & Seligson, 2005). Using CAPI technology allows researchers to access
data almost instantly and provides data of better quality compared with traditional paper-based
surveys (Pen-And-Paper Interviewing, PAPI) (Caeyers et al., 2012).
When researchers collect primary data, they mainly focus on the survey questionnaire data, that is,
the actual responses given by the individuals interviewed. Researchers often complement these data
with auxiliary data, such as administrative data or census data. Survey paradata and metadata, which
are less known to development economists, are an invaluable source of information given the implications
of poor-quality data on the results of research and thus on decision-making.
The collection and use of paradata is not widespread when compared with the overall amount of
data collected. In this paper, we present an introduction to paradata and demonstrate how they can be
used: (i) during fieldwork preparation (e.g., piloting) to manage time and resources more effectively;
(ii) during fieldwork to monitor data quality on a day-to-day basis; and (iii) after fieldwork to evaluate
