Sample selection bias and Heckman models in strategic management research

DOI: http://doi.org/10.1002/smj.2475
Published date: 01 December 2016
Strategic Management Journal
Strat. Mgmt. J., 37: 2639–2657 (2016)
Published online EarlyView 9 February 2016 in Wiley Online Library (wileyonlinelibrary.com) DOI: 10.1002/smj.2475
Received 29 January 2015; Final revision received 5 October 2015
SAMPLE SELECTION BIAS AND HECKMAN MODELS
IN STRATEGIC MANAGEMENT RESEARCH
S. TREVIS CERTO, JOHN R. BUSENBARK, HYUN-SOO WOO,
and MATTHEW SEMADENI*
Department of Management, W. P. Carey School of Business, Arizona State
University, Tempe, Arizona, U.S.A.
Research summary: The use of Heckman models by strategy scholars to resolve sample selection
bias has increased by more than 700 percent over the last decade, yet significant inconsistencies
exist in how they have applied and interpreted these models. In view of these differences, we explore
the drivers of sample selection bias and review how Heckman models alleviate it. We demonstrate
three important findings for scholars seeking to use Heckman models: First, the independent
variable of interest must be a significant predictor in the first stage of a model for sample selection
bias to exist. Second, the significance of lambda alone does not indicate sample selection bias.
Finally, Heckman models account for sample-induced endogeneity, but are not effective when other
sources of endogeneity are present.
Managerial summary: When nonrandom samples are used to test statistical relationships, sample
selection bias can lead researchers to flawed conclusions that can, in turn, negatively impact
managerial decision-making. We examine the use of Heckman models, which were designed to
resolve sample selection bias, in strategic management research and highlight conditions when
sample selection bias is present as well as when it is not. We also distinguish sample selection
bias, a form of omitted variable (OV) bias, from more traditional OV bias, emphasizing that it
is possible for models to have sample selection bias, traditional OV bias, or both. Accurately
identifying the type(s) of OV bias present is essential to effectively correcting it. We close with
several recommendations to improve practice surrounding the use of Heckman models. Copyright
© 2015 John Wiley & Sons, Ltd.
Keywords: sample selection bias; Heckman models; endogeneity; research methods
*Correspondence to: Matthew Semadeni, Office BA 323H,
PO Box 874006, Tempe, AZ 85287, U.S.A. E-mail:
semadeni@asu.edu
Selection bias is not well understood by practitioners.
(Kennedy, 2006: 286)
INTRODUCTION
Empirical studies in strategy research rely on samples
of observations that represent fractions of
underlying populations. Biases may arise when
researchers use samples instead of populations to
test hypotheses. In particular, sample selection bias
may occur when values of a study’s dependent
variable are missing as a result of another process
(Greene, 2011; Sartori, 2003). The central objec-
tive of this article is to explore the drivers of sample
selection bias and review how different analytical
tools can correct it.
Scholars routinely describe the intuition of sam-
ple selection bias as requiring a two-stage approach
(e.g., Wooldridge, 2010). Determining whether or
not an observation in an overall population appears
in its final representative sample is the first stage,
and modeling the relation between the hypothesized
dependent and independent variables in the final
sample is the second stage. When an omitted variable
(i.e., an unmeasured variable not included in a
model) creates a correlation between the error terms
in these two stages, traditional techniques such as
ordinary least squares (OLS) regression may report
biased coefficient estimates. To resolve this potential
bias, Heckman (1976) introduced the Heckman
model, a two-step process for data analysis.¹
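For readers who prefer symbols, the two stages described above can be sketched in standard textbook notation. The symbols below (z, γ, u, x, β, ε, ρ, σ_ε) are assumptions introduced here for illustration and do not appear in the text above; the sketch further assumes jointly normal errors with Var(u) = 1 and ρ = Corr(u, ε).

```latex
\begin{align}
s_i &= \mathbf{1}\!\left[\, z_i'\gamma + u_i > 0 \,\right]
  && \text{(first stage: selection into the final sample)} \\
y_i &= x_i'\beta + \varepsilon_i, \quad \text{observed only if } s_i = 1
  && \text{(second stage: equation of interest)} \\
\mathbb{E}\!\left[\, y_i \mid x_i, z_i, s_i = 1 \,\right]
  &= x_i'\beta + \rho\,\sigma_\varepsilon\,\lambda(z_i'\gamma),
  \qquad \lambda(a) = \frac{\phi(a)}{\Phi(a)}
\end{align}
```

Under these assumptions, the last term vanishes when ρ = 0. When ρ is nonzero, a regression of y on x alone leaves the term in λ(·), the inverse Mills ratio, in the error, which is the sense in which sample selection behaves like an omitted variable.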
To better understand how strategy scholars
approach potential sample selection bias, we
reviewed 63 articles appearing in the Strategic
Management Journal (SMJ) between 2005 and
2014 that utilized Heckman models. In recent
years, strategy scholars have employed Heckman
models to study areas including upper echelons
and board membership (e.g., Quigley and Hambrick,
2012), diversification and M&A activity
(e.g., Kim, Hoskisson, and Lee, 2015), executive
compensation (e.g., Chen, 2015), capital market
activity (e.g., Arikan and Capron, 2010), and com-
petition and factor markets (e.g., Ndofor, Sirmon,
and He, 2011).
Despite the significant growth in the use of
Heckman models in strategy research (more than
700 percent over the last decade), we noted inconsistencies
in how strategy scholars implemented and
reported the results they derived. We also found that
scholars often justified the use of Heckman models
on the basis of concerns about endogeneity from
a source other than sample selection. This is per-
haps to be expected, however, as some econometrics
textbooks list sample selection as a potential cause
of endogeneity (e.g., Kennedy, 2006). This prac-
tice may cause some researchers to (mistakenly)
equate the effects of sample selection bias with the
effects of other sources of endogeneity. These dis-
crepancies suggest that, as strategy scholars, we
need a more rigorous understanding of (1) how sam-
ple selection bias varies across study conditions, (2)
when sample selection bias affects statistical results,
(3) how to apply Heckman models, and (4) how to
account for effects resulting from sample selection
bias versus other sources of endogeneity.
Accordingly, the first objective of this article is
to explain how sample selection bias varies across
study conditions. We explain that sample selection
bias is the result of a special case of endogeneity,
which we label sample-induced endogeneity. This
special case occurs when omitted variables create a
correlation between the error term in the selection
equation (i.e., the first stage of a study’s statistical
model) and the error term in the equation of interest
(i.e., the second stage). We explain that the magnitude
and direction of the bias depend on two factors:
(1) whether the true relationship between the
independent and dependent variable is positive or
negative, and (2) whether the correlation between
the error terms in the two stages is positive or
negative.
¹ By “the Heckman model,” we are referring to the Heckman
two-stage model. There is an alternative Heckman model known
as the Heckman full information maximum likelihood (FIML)
model, but the two-stage model is far more common. Of the 63
reviewed articles published in the SMJ between 2005 and 2014,
only one article used the Heckman FIML model.
To address our first objective, we create four
figures (Figure 1a–d) to demonstrate that, in some
cases, sample selection bias can lead researchers to
find significant relationships that do not exist, or in
other cases, it can lead researchers to fail to find
significant relationships that do exist. Then, we report
on the three studies in which we used simulations
to address our other three objectives. Study 1 examines
the conditions necessary to understand when
sample selection processes bias results. We examine
two factors related to the sample selection process:
(1) the strength of the correlation between the
error terms from first- and second-stage equations,
and (2) the extent to which the independent variable
of interest is related to the probability of an
observation’s entering the final sample. Our literature
review revealed a great deal of confusion regarding
the role of lambda. Our simulations illustrate that
a significant lambda does not always denote sample
selection bias. Specifically, our results indicate
that traditional techniques (e.g., OLS) remain unbiased
when the independent variable from the second
stage is not also a significant predictor in the first
stage.
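The two factors examined in Study 1 can be illustrated with a small simulation. The sketch below is not the authors' simulation design; the function name `selected_sample_slope`, the parameter values (a true slope of 0.5, an error correlation of 0.8, 100,000 draws), and the selection rule are all assumptions chosen for illustration. It fits OLS only on observations that pass the selection rule and compares the estimated slope with the true one.

```python
import random

def selected_sample_slope(gamma_x, gamma_z, rho, beta=0.5, n=100_000, seed=0):
    """OLS slope of y on x, estimated only on observations that are selected.

    Selection (first stage):  s = 1 if gamma_x*x + gamma_z*z + u > 0
    Outcome (second stage):   y = beta*x + e, with corr(u, e) = rho
    All parameter values are hypothetical.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x, z, u, v = (rng.gauss(0, 1) for _ in range(4))
        e = rho * u + (1 - rho ** 2) ** 0.5 * v    # e is N(0, 1) with corr(u, e) = rho
        if gamma_x * x + gamma_z * z + u > 0:       # observation enters the final sample
            xs.append(x)
            ys.append(beta * x + e)
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    return sxy / sxx                                # simple-regression slope

TRUE_BETA = 0.5
# x predicts selection and the error terms are correlated
slope_x_selects = selected_sample_slope(gamma_x=1.0, gamma_z=0.0, rho=0.8)
# the error terms are correlated, but x does not predict selection
slope_x_absent = selected_sample_slope(gamma_x=0.0, gamma_z=1.0, rho=0.8)
# x predicts selection, but the error terms are uncorrelated
slope_no_corr = selected_sample_slope(gamma_x=1.0, gamma_z=0.0, rho=0.0)

for label, slope in [("x selects, rho=0.8", slope_x_selects),
                     ("x absent,  rho=0.8", slope_x_absent),
                     ("x selects, rho=0.0", slope_no_corr)]:
    print(f"{label}: slope={slope:.3f}, bias={slope - TRUE_BETA:+.3f}")
```

Under these assumptions, only the first scenario should show a slope noticeably different from the true value; in the other two, the OLS slope stays close to 0.5 even though the sample is nonrandom.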
Study 2 uses simulations to clarify how
researchers should implement Heckman models.
In this study, our findings illustrate that lambda
can be insignificant even when sample selection
bias exists, if model specifications are improper.
This finding is important as our review uncovered
numerous strategy studies interpreting an insignificant
lambda as evidence of no sample selection
bias. Taken together, the simulations illustrate
the complex role of lambda in explaining sample
selection bias and help researchers understand how
to use Heckman models.
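To make the mechanics of the two-step procedure concrete, the following is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not the authors' code or any statistical package's implementation: the helper names (`solve`, `ols`, `probit`, `inverse_mills`), the simulated data, the parameter values, and the use of `z` as a variable that affects selection but not the outcome are all hypothetical. Step 1 fits a probit model of selection on the full sample; step 2 adds the resulting inverse Mills ratio (lambda) as a regressor in OLS on the selected sample.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def solve(A, b):
    """Solve the small linear system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(X, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

def probit(X, s, iters=15):
    """Probit coefficients by Fisher scoring with a fixed number of iterations."""
    k = len(X[0])
    g = [0.0] * k
    for _ in range(iters):
        score = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for r, si in zip(X, s):
            idx = sum(gi * ri for gi, ri in zip(g, r))
            p = min(max(STD_NORMAL.cdf(idx), 1e-10), 1 - 1e-10)
            d = STD_NORMAL.pdf(idx)
            w_score, w_info = d * (si - p) / (p * (1 - p)), d * d / (p * (1 - p))
            for i in range(k):
                score[i] += w_score * r[i]
                for j in range(k):
                    info[i][j] += w_info * r[i] * r[j]
        g = [gi + step for gi, step in zip(g, solve(info, score))]
    return g

def inverse_mills(a):
    """lambda(a) = pdf(a) / cdf(a) for the standard normal distribution."""
    return STD_NORMAL.pdf(a) / max(STD_NORMAL.cdf(a), 1e-300)

# Simulated data (hypothetical values): y = 1 + 0.5*x + e is observed only when
# x + z + u > 0, and corr(u, e) = 0.8.
rng = random.Random(1)
rho, n = 0.8, 10_000
X_all, s, X_sel, y_sel = [], [], [], []
for _ in range(n):
    x, z, u, v = (rng.gauss(0, 1) for _ in range(4))
    e = rho * u + (1 - rho ** 2) ** 0.5 * v
    selected = x + z + u > 0
    X_all.append([1.0, x, z])
    s.append(1.0 if selected else 0.0)
    if selected:
        X_sel.append([1.0, x, z])
        y_sel.append(1.0 + 0.5 * x + e)

# Step 1: probit of the selection indicator on (1, x, z), full sample.
gamma_hat = probit(X_all, s)
# Naive OLS of y on (1, x), selected sample only.
naive_b1 = ols([[1.0, r[1]] for r in X_sel], y_sel)[1]
# Step 2: OLS of y on (1, x, lambda), selected sample only.
X_step2 = [[1.0, r[1], inverse_mills(sum(g * c for g, c in zip(gamma_hat, r)))]
           for r in X_sel]
_, heckman_b1, lambda_coef = ols(X_step2, y_sel)

print(f"naive slope={naive_b1:.3f}, two-step slope={heckman_b1:.3f}, "
      f"lambda coefficient={lambda_coef:.3f}")
```

In this setup the naive slope should fall well below the true value of 0.5, while the two-step slope should be close to it. The sketch reports point estimates only; the plain OLS standard errors from the second step are not valid without a correction, which is one reason to rely on an established statistical package in applied work.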
Finally, Study 3 addresses how researchers
should approach different sources of endogeneity
when employing Heckman models. We distin-
guish between sample-induced and other forms of
endogeneity. Our simulations reveal that Heckman