Effects of winsorization: The cases of forecasting non‐GAAP and GAAP earnings

Author: Ruby Brownen‐Trinh
Date: 01 January 2019
DOI: http://doi.org/10.1111/jbfa.12365
Published date: 01 January 2019
Department of Accounting and Finance, School
of Economics, Finance and Management,
University of Bristol, Bristol, UK
Correspondence
Ruby Brownen-Trinh, Department of Accounting
and Finance, School of Economics, Finance and
Management, University of Bristol, The Priory
Road Complex, Priory Road, Clifton, Bristol BS8
1TU, UK.
Email: Ruby.Brownen-Trinh@bristol.ac.uk
JEL Classification: G10, G11, G17, M41
Abstract
This study examines how the winsorization procedure affects the
performance of regression-based earnings forecasting models. I find
that the impact is multifaceted and depends principally on three
factors: the level of data errors in the tails, the characteristics of firms
affected by the process, and the use of scaling. For a non-GAAP
earnings yield specification, where data input errors exist, winsorization
changes the information set in a non-systematic way and helps to
improve the performance of regression-based forecasts, especially
when the least squares estimator is employed. However, for a
non-GAAP earnings per share specification, with fewer data input errors
found in the tails of the distribution, winsorization has a particularly
strong effect on very large companies, lowering the economic value
of earnings predictions. I observe similar results for corresponding
GAAP earnings specifications. Robust estimators, such as least
absolute deviation, high breakdown-point and Theil-Sen, appear to be a
more effective solution than winsorization. Their earnings forecasts
consistently yield significant positive abnormal returns across
non-GAAP and GAAP earnings specifications.
KEYWORDS
earnings forecasts, influential observations, robust regression,
scaling, stock returns, winsorization
J Bus Fin Acc. 2019;46:105–135. wileyonlinelibrary.com/journal/jbfa © 2018 John Wiley & Sons Ltd
1 INTRODUCTION
Sets of observations which have been de-tailed by over-vigorous use of a rule for rejecting outliers are
inappropriate, since they are not samples.
Tukey (1960)
While econometric studies warn against the use of a winsorization process that replaces sample values above or below
a given percentile of the sample distribution with the values at the respective percentiles, the majority of
empirical accounting studies employ this process (Leone, Minutti-Meza, & Wasley, 2017). Extreme observations/outliers in
cross-sectional data used in these studies lead to biased coefficient estimates and heteroscedastic regression errors
(Barth & Kallapur, 1996), and winsorization appears to be a simple and convenient solution. In cases where outliers
occur due to shocks or data entry errors, winsorization helps to remove the effect of these observations (Leone et al.,
2017). However, when extreme values are just reflections of the cross-sectional variation in firm characteristics such as
firm size or profitability, this procedure risks systematically altering the data and any economic inferences for a subset
of firms that may form important parts of investment strategies. Hence, it risks affecting the efficiency and usefulness
of coefficient estimates.
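As a minimal sketch of the procedure described above, the following Python snippet replaces values below and above chosen percentiles with the values at those percentiles. The function name `winsorize`, the 1%/99% cut-offs, the linear-interpolation percentile convention, and the toy earnings figures are all illustrative assumptions and are not taken from the study.

```python
import statistics

def winsorize(values, lower_pct=1, upper_pct=99):
    """Clip values to the [lower_pct, upper_pct] percentile range.

    Percentiles use the 'inclusive' linear-interpolation convention of
    statistics.quantiles; other conventions give slightly different bounds.
    """
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    lo = cuts[lower_pct - 1]   # value at the lower percentile
    hi = cuts[upper_pct - 1]   # value at the upper percentile
    return [min(max(v, lo), hi) for v in values]

# Hypothetical sample: 98 ordinary firms, one data entry error (-5000)
# and one genuinely very large firm (40000).
earnings = [float(x) for x in range(1, 99)] + [-5000.0, 40000.0]
winsorized = winsorize(earnings)

print("data entry error :", earnings[98], "->", winsorized[98])
print("large firm       :", earnings[99], "->", winsorized[99])
```

In this toy sample the procedure cannot tell the two extreme observations apart: the erroneous value and the genuine large-firm value are both pulled towards the centre of the distribution, which is the concern raised in the paragraph above.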
This study seeks to shed light on these issues by addressing three questions. First, what is the extent and impact of
data input errors in regression-based forecasts of earnings? Second, what is the impact of the winsorization process on
the statistical performance of regression estimators? Third, how does winsorization affect the investment usefulness
of earnings predictions?
I use the earnings forecast setting for several reasons. First, earnings forecasts are a key determinant of equity
value (Ohlson, 1995; Ohlson & Juettner-Nauroth, 2005) and as such are important to investors in portfolio
formation (Frankel & Lee, 1998; Hou, van Dijk, & Zhang, 2012). Although most investors rely on financial analysts’ forecasts
(Brown, Hagerman, Griffin, & Zmijewski, 1987), many studies find that these are frequently biased (see, e.g., Bradshaw,
Richardson, & Sloan, 2001; Dichev & Tang, 2009; Frankel & Lee, 1998). Therefore, a great deal of research has been
devoted to the development of bias-free regression-based forecasts. These forecasting models frequently rely on
winsorization to reduce the effect of observations with extreme values (e.g., Harris & Wang, 2013; Hou et al., 2012; So,
2013). Hence, while claiming to outperform the forecasts of financial analysts in terms of accuracy, their results may
be limited to a specific sample and potential distortions by the winsorization process are largely ignored. Second, the
use of earnings forecasts in the pricing of stocks and in portfolio formation (see, e.g., Black, Christensen, Ciesielski,
& Whipple, 2018; Bradshaw & Sloan, 2002; Bradshaw, Christensen, Gee, & Whipple, 2018) allows me to look beyond
conclusions offered by existing studies on winsorization, such as Leone et al. (2017), by also considering the impact of
winsorization on economic values.
To begin, I examine the authenticity of archival Generally Accepted Accounting Principles (GAAP) and non-GAAP
earnings data.¹ I manually check earnings figures in 10-K reports and find that the total GAAP earnings data
downloaded from Compustat are highly reliable. Meanwhile, for non-GAAP earnings per share downloaded from the I/B/E/S
database, the veracity of 49% of the data in the tails of the distribution is questionable, a proportion more than double
that for the corresponding GAAP earnings per share. Here, winsorization might serve different purposes and have
different effects. For non-GAAP earnings that claim to consist of recurring items, winsorization might help remove data
input errors, while for GAAP earnings, it might help to remove non-recurring items.
To provide some insights into the nature of the data in the tails, I carry out a further investigation of the companies
whose GAAP and non-GAAP earnings are likely to be replaced by winsorized values in a cross-sectional regression for
both unscaled and scaled earnings (namely, total earnings, earnings per share and earnings yield). I find that, for the
total earnings and for the earnings per share specifications, the upper tails of earnings distributions consist of genuine
earnings figures of many important corporations, such as General Motors, Berkshire Hathaway, General Electric, Exxon
Mobil Corporation, and IBM Corporation, all of which play a major role in capital market investment due to their
prominence in typical portfolios. In these cases, replacing reported accounting figures with winsorized values that are more
“acceptable” introduces statistical bias and potentially misleading information about large and economically important
companies. Earnings forecasts, therefore, may have less economic value even if they appear to have low forecast errors.
However, scaling by market capitalization changes the distribution of earnings. Here, winsorization of companies in the
tails appears to be non-systematic and the impact of winsorization on economic values is less likely to be serious.
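The point that scaling changes which companies sit in the tails can be illustrated with a small sketch. The firm names, earnings, and market capitalizations below are invented for illustration only, and the helper `top_k` is a hypothetical name. The snippet shows only that the set of firms exposed to winsorization depends on the deflator, not what that set looks like in real data.

```python
# Hypothetical firms: (name, total earnings in $m, market capitalization in $m)
firms = [
    ("MegaCorp", 20000.0, 400000.0),
    ("LargeCo", 12000.0, 100000.0),
    ("MidCo", 120.0, 3000.0),
    ("SmallCo", 40.0, 200.0),
    ("TinyCo", 1.2, 20.0),
]

def top_k(records, key, k=2):
    """Names of the k records with the largest value of `key`."""
    ranked = sorted(records, key=key, reverse=True)
    return [record[0] for record in ranked[:k]]

# Upper tail by unscaled (total) earnings: dominated by the largest firms.
by_total = top_k(firms, key=lambda r: r[1])

# Upper tail by earnings yield (earnings scaled by market capitalization):
# membership is no longer determined by firm size alone.
by_yield = top_k(firms, key=lambda r: r[1] / r[2])

print("upper tail, total earnings:", by_total)
print("upper tail, earnings yield:", by_yield)
```

With unscaled earnings the upper tail is made up of the two largest firms, whereas after scaling it mixes a small and a large firm, so a percentile-based replacement would touch a different set of companies.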
I employ both GAAP and non-GAAP earnings forecasts to formally examine the effect of winsorization, with
particular focus on the non-GAAP measure because of its availability, importance and relevance to investors and other
stakeholders (see, e.g., Bentley, Christensen, Gee, & Whipple, 2018; Black et al., 2018; Bradshaw & Sloan, 2002; Brown
& Sivakumar, 2003; Hoogervorst, 2016; Wieland, Dawkins, & Dugan, 2013). I use both unscaled and scaled earnings
¹ Non-GAAP (“Street”) earnings numbers are the figures announced by corporations in their press releases and tracked by analyst estimate clearinghouse
services (Bradshaw & Sloan, 2002). They contain only the continuing component of GAAP earnings (Brown, Call, Clement, & Sharp, 2015).
