Thirsty in an Ocean of Data? Pitfalls and Practical Strategies When Partnering With Industry on Big Data Supply Chain Research

Authors: Rod Franklin, A. Michael Knemeyer, Kevin B. Smyth, Keely L. Croxton
DOI: http://doi.org/10.1111/jbl.12187
Published date: 01 September 2018
Kevin B. Smyth¹, Keely L. Croxton¹, Rod Franklin², and A. Michael Knemeyer¹
¹The Ohio State University
²Kühne Logistics University
Increased volume, velocity, and variety of data provide new opportunities for businesses to take advantage of data science techniques,
predictive analytics, and big data. However, firms are struggling to make use of their disjointed and unintegrated data streams. Despite this,
academics with the analytic tools and training to pursue such research often face difficulty gaining access to corporate data. We explore the
divergent goals of practitioners and academics and how the gap that exists between the communities can be overcome to derive mutual value
from big data. We describe a practical roadmap for collaboration between academics and practitioners pursuing big data research. Then we
detail a case example of how, by following this roadmap, researchers can provide insight to a firm on a specific supply chain problem while
developing a replicable template for effective analysis of big data. In our case study, we demonstrate the value of effectively pairing
management theory with big data exploration, describe unique challenges involved in big data research, and develop a novel and replicable hierarchical
regression-based process for analyzing big data.
Keywords: big data; data science; practitioner engagement; governance form
INTRODUCTION
Many of today's supply chain managers feel like the ancient
mariner in the famous poem by Samuel Taylor Coleridge. But
rather than being surrounded by undrinkable water, supply chain
managers are more likely to be heard saying "Data, data,
everywhere, nor any information to use." An estimated 1,000 exabytes
(10^21 bytes) of data are generated each year (Yin and Kaynak
2015), and U.S. firms store on average more than 200 terabytes,
with a 40% expected annual compounded growth in that figure
(McKinsey Global Institute 2011). Often referred to as "big data"
(McAfee and Brynjolfsson 2012), this rapid expansion of data
creation and storage represents a potential source of competitive
advantage (Waller and Fawcett 2013b) and has businesses
scrambling to extract value from it (Sanders 2016). Although many
definitions of big data are in common use, we specify big data
as an increase in volume, variety, and velocity of information
such that traditional analytic methods are stretched beyond their
limits (Megahed and Jones-Farmer 2015). This definition
captures the multidimensionality of the concept, as "bigness" of data
is relative to the constantly expanding capacity to store and
process information (Guarnieri 2016), as well as to the intended
utility (McKinsey Global Institute 2011).
Despite its massive potential, firms struggle to derive meaningful
value from big data. Wieland et al. (2016, 207) note that
companies are finding themselves in a situation of "big data, but small
math," and exploitation of this growing data resource requires more
than accumulation and storage. A complementary issue is that
businesses are careful of the level of access granted to sensitive internal
data, which drives trepidation in enlisting the help of outside experts
who may be a valuable resource for translating this data into
actionable information. This is especially true if the value of
collaboration is unclear to decision makers within the firm. Despite
growing interest and multiple calls by editors to pursue practically
and theoretically relevant big data research (Waller and Fawcett
2013a,b), academics with the analytic tools and training to pursue
such research often face difficulty gaining access to corporate data.
This manifestation of the gap between research and
practitioner communities (Hutt and Walker 2015) stems from logical,
temporal, and incentive-based differences in how the two parties
approach problems in their work. Management practitioners are
concerned with urgent practical problems, expect results quickly,
and are more often rewarded based on short-term measurable
goals (Hutt 2008), whereas academics have tended to prefer
research on more comprehensive (although oftentimes less urgent)
underlying causes for observed phenomena. This methodical
approach inevitably delays results, as research entails an
examination not only of the practical phenomena but also of the
existing body of knowledge pertinent to the subject and all possible
contributing factors. Researchers are often rewarded more for this
"epistemological truth" than for the managerial relevance of their
work (Bartunek and Rynes 2014).
Successfully bridging the research-practice gap requires
practitioners and academics to find common ground. Van de Ven
(2007), Avenier and Cajaiba (2012), and Hutt and Walker (2015)
all propose an iterative two-way dialog between the parties to
forge mutually beneficial research questions. This dialog, often
initiated by and facilitated through an ongoing corporate research
partnership program such as those that exist at many
research-oriented business schools (Hutt 2008), aims to identify recurring
challenges and opportunities of practical relevance that may also
contribute to an increased understanding of more generalizable
management phenomena. Researchers in applied fields seeking to
gain access to "big" data sets to advance epistemological
understanding of business phenomena must engage in this dialog to
identify and balance the obligations they hold to both practice
and the academy (Cotteleer and Wan 2016).
Corresponding author:
Kevin B. Smyth, Department of Marketing and Logistics, Fisher
College of Business, The Ohio State University, 2100 Neil Ave,
Columbus, OH 43210, USA; E-mail: smyth.43@osu.edu
Journal of Business Logistics, 2018, 39(3): 203-219. doi: 10.1111/jbl.12187
© 2018 Council of Supply Chain Management Professionals
With this perspective in mind, the goals of this study are as
follows: (1) to describe a practical roadmap for collaboration
between researchers and practitioners pursuing big data research
and (2) to detail a case example of how, by following this
roadmap, researchers can provide insight to a firm on a specific
supply chain problem while developing a replicable template for
effective analysis of big data. The case example highlights our
experience working on a big data study to find the sources of
replenishment forecast deviation and bias in the quick service
restaurant industry. Key contributions of the current article are to
propose a process of conducting research that is of mutual value
to practitioners and researchers, demonstrate the value of pairing
a priori theorizing with big data exploration as proposed in
Waller and Fawcett (2013a), describe unique challenges involved in
big data research, and develop a novel and replicable hierarchical
regression-based process for analyzing big data.
The remainder of the study is organized as follows: We describe
the origin and context of our big data case study, elucidate unique
technical challenges in collection, manipulation, and traditional
methods of exploration of big data, propose an alternative novel
hierarchical regression-based approach to explore big data, and
finally present a process for future practitioner-academic research
collaborations derived from our case study experience.
RELATIONSHIP DEVELOPMENT
Before describing the proposed facilitating process for big data
research engagement of academics and business practitioners, we
outline our specific experience. The process is organized into
subprocesses of relationship identification, research motivation,
project management, and findings validation. It should be noted
that the description of the findings validation subprocess will
occur after we describe the proposed facilitating process.
Relationship identification
The origins of this study stem from an ongoing relationship with
members of a practitioner-academic research group at a large
Midwestern university. Through the group, the primary
fourth-party logistics (4PL) provider for a major international quick
service restaurant enlisted our assistance in helping address multiple
supply chain issues their major restaurant customer was
experiencing. The fact that the 4PL service provider felt comfortable in
approaching the team for assistance was based on their
experience with academic researchers through their participation in the
practitioner-academic research group and other academic
relationships that their organization had developed. The level of
experience a firm has working with academics critically impacts
how researchers generate initial interest and foster a strong
ongoing relationship for collaborative research.
Research motivation
Following the guidance of Hutt and Walker (2015), our research
team engaged in a bilateral collaborative dialog to translate the
practitioner's challenges into mutually beneficial research
questions. One of the issues identified was that the major
restaurant firm was experiencing significant order deviation by
individual restaurant outlets in their centrally developed replenishment
forecasts, causing costly inefficiencies and exaggerated responses
in multiple levels of the firm's supply chain.
After this initial problem identification, we visited the
headquarters of the 4PL provider to understand the work the firm
conducted for the restaurant chain. Through interviews with
managers and analysts across multiple divisions in the 4PL, as well
as personnel from one of the five third-party logistics (3PL)
companies that provide distribution services for the restaurant chain,
we began to understand the connections between various
interests that affect the focal issue of replenishment forecast
deviation.
The restaurant firm in our research operates almost 15,000
retail outlets domestically, of which more than 80% are
contracted to franchise companies. The firm utilizes the
aforementioned 4PL to centrally develop sales forecasts and plan
replenishment for all restaurant outlets. All involved firms (the
restaurant firm, its 4PL provider, its 3PL providers, and the
franchise companies) utilize a management information system
(MIS) curated by the 4PL, thus providing a centralized source
for big data across the involved companies. Each entity's
relevant MIS data are visible to at least the adjacent link in the
supply chain. Figure 1 illustrates the relevant data flows within and
between firms.
Driven by data or by theory?
These data that permit exploration of causes of replenishment
forecast deviation clearly represent big data, because for almost
15,000 distributed outlets and 8,100 stock units, there are more
than 120 million potential daily transactions to evaluate. With
daily frequency, these data also had a velocity consistent with
incipient definitions of big data (McAfee and Brynjolfsson 2012;
Kitchin and McArdle 2016). While the variety of information
forms may not span to unstructured data, drawing from multiple
databases for a useful sample and mixing numeric and
non-numeric data are common features of big data analysis (Megahed
and Jones-Farmer 2015).
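As a rough check on the scale just described, the number of potential daily outlet-by-item records can be worked out from the two round figures given in the text. The sketch below is illustrative only. It treats "almost 15,000" outlets as exactly 15,000, so the result is an upper bound, and the function and variable names are our own assumptions rather than anything taken from the firm's systems.

```python
# Illustrative only: round figures quoted in the text, not the firm's actual data.
OUTLETS = 15_000      # "almost 15,000" outlets, so this is an upper bound
STOCK_UNITS = 8_100   # stock units, as stated in the text


def potential_daily_records(outlets: int, stock_units: int) -> int:
    """Count the outlet-by-stock-unit combinations that could be observed each day."""
    return outlets * stock_units


daily = potential_daily_records(OUTLETS, STOCK_UNITS)
yearly = daily * 365  # assumes one observation per combination per day

print(f"{daily:,} potential outlet-item records per day")
print(f"{yearly:,} potential records per year")
```

With these inputs the daily product is 121.5 million, which is consistent with the "more than 120 million" figure; the annual total rests on the added assumption of one record per combination per day.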
The generation of useful knowledge from this ocean of data is
increasingly achieved through a synthesis of data science,¹
predictive analytics,² and big data, referred to as DPB (Waller and
Fawcett 2013b). While use of DPB has expanded rapidly in
industry and in practitioner literature, it is still a matter of intense
debate for academics. Several generic approaches exist for big
data mining or exploratory pattern recognition that can identify
correlative relationships and aid immensely in prediction (Hand
et al. 2001; Han et al. 2011; Kuhn and Johnson 2013). However,
they lack the means to explain the causal mechanisms of their
predictions and appear to challenge the paradigm of a priori
theorizing.
¹Data science is defined here as the study of the generalizable
extraction of knowledge from data (Dhar 2013).
²We define predictive analytics as a broad class of statistical or
analytic techniques used to develop predictions of otherwise
unknown future events or behavior (Nyce 2007).