Received: 12 January 2017 | Revised: 7 April 2017 | Accepted: 27 April 2017
DOI: 10.1002/for.2479
RESEARCH ARTICLE
On assessing the relative performance of default predictions
Walter Krämer
Department of Statistics, TU Dortmund University, Dortmund, Germany
Correspondence
Walter Krämer, Department of Statistics, TU
Dortmund University, 44221 Dortmund,
Germany.
Email: walterk@statistik.tu-dortmund.de
Abstract
We compare the accuracy of default predictions as, for instance, produced by professional rating agencies. We extend previous results on partial orderings to nonidentical sets of obligors and show that the calibration requirement virtually rules out the possibility of some partial orderings and that the partial ordering based on the ROC curve is most easily achieved in practice. As an example, we show for more than 5,000 firms rated by Moody's and S&P that these ratings cannot be ranked according to their grade distributions given default or nondefault, but that Moody's dominates S&P with respect to the ROC criterion and the Gini curve.
KEYWORDS
default predictions, rating systems, partial orderings
1 INTRODUCTION
Credit ratings have been the subject of quite some interest in econometrics and finance recently. The Basel III accord, for instance, obliges banks to attach numerical default probabilities to all outstanding loans, so it is natural to ask how the accuracy of different default forecasters can best be compared. To the extent that letter grades given by rating agencies can likewise be viewed as predicted default probabilities, this issue also touches upon the relative performance of leading suppliers such as Moody's and S&P.
The empirical and political importance of assessing the relative performance of rating agencies is also stressed in the recent European Securities and Markets Authority (ESMA) guidelines on how the quality of ratings should be measured: “In demonstrating the discriminatory power of a methodology, ESMA typically expects a CRA to use the cumulative accuracy profile (CAP) or the receiver operating characteristic (ROC) curve” (ESMA, 2016, p. 8). Implicitly, therefore,
ESMA calls for a ranking of competing agencies. Overall,
there are about two dozen such firms in this business in
Europe today, so the need to judge their relative prediction
qualities is obvious.
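To make the ROC criterion concrete: the area under the ROC curve equals the Mann–Whitney probability that a randomly drawn defaulter receives a higher risk score than a randomly drawn survivor. The following is a minimal Python sketch of that computation, assuming letter grades have already been mapped to numeric risk scores; the function name and the toy data are purely illustrative and not taken from the article.

import numpy as np

def roc_auc(defaulted, score):
    """Area under the ROC curve, computed as the Mann-Whitney
    probability that a randomly drawn defaulter has a higher risk
    score than a randomly drawn survivor (ties count one half)."""
    defaulted = np.asarray(defaulted, dtype=bool)
    score = np.asarray(score, dtype=float)
    d = score[defaulted]        # risk scores of defaulting obligors
    s = score[~defaulted]       # risk scores of surviving obligors
    wins = (d[:, None] > s[None, :]).sum()
    ties = (d[:, None] == s[None, :]).sum()
    return (wins + 0.5 * ties) / (d.size * s.size)

# Toy data: 1 = obligor defaulted; higher score = riskier grade
defaulted = [1, 1, 0, 0, 0, 0]
agency_a = [9, 7, 8, 4, 3, 2]   # hypothetical numeric grades
agency_b = [9, 8, 5, 4, 3, 2]
print(roc_auc(defaulted, agency_a))   # 0.875
print(roc_auc(defaulted, agency_b))   # 1.0

Note that the partial ordering discussed below is stronger than a comparison of these areas: it requires one ROC curve to lie everywhere above the other, not merely a larger area under the curve.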
The present note focuses on the predictive performance as such. Whereas there is a rich literature on how probability forecasts are best produced (see, e.g., Hwang & Chu, 2014, for a recent contribution in the present journal), much less is known about how competing probability forecasters can be ranked in terms of predictive accuracy, that is, about how the ESMA demands can best be met. This note shows in Section 2 that the concept of calibration (DeGroot & Fienberg, 1983), the requirement that stated default probabilities match the default frequencies actually realized (see the sketch below), is a rather tough requirement which prevents most calibrated probability forecasters from being unequivocally comparable in terms of conditional default or survival distributions. Section 3 allows for nonidentical sets of debtors and noncalibrated forecasts and shows that conventional partial orderings
do not easily carry over to this more general situation. Section
4 illustrates these results using default predictions made by
Moody's and S&P.
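In DeGroot and Fienberg's sense, a forecaster is calibrated if, among all obligors assigned default probability p, the realized default frequency equals p. The following minimal Python sketch checks this property on simulated data; the function name and the data are hypothetical, chosen only to illustrate the definition.

import numpy as np

def calibration_table(defaulted, p):
    """Compare each stated default probability with the empirical
    default frequency among the obligors that received it; for a
    calibrated forecaster the two coincide (up to sampling noise)."""
    defaulted = np.asarray(defaulted, dtype=float)
    p = np.asarray(p, dtype=float)
    for value in np.unique(p):
        group = defaulted[p == value]
        print(f"forecast {value:.2f}: n = {group.size:4d}, "
              f"realized default rate = {group.mean():.3f}")

# Toy example: defaults are drawn with exactly the stated
# probabilities, so this forecaster is calibrated by construction
rng = np.random.default_rng(1)
p = np.repeat([0.10, 0.50], 1000)
defaulted = rng.random(2000) < p
calibration_table(defaulted, p)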
2 PARTIAL ORDERINGS OF PROBABILITY FORECASTS
The relative performance of probability forecasters is usually
evaluated via their respective Brier scores:
$$S = \frac{1}{n}\sum_{i=1}^{n}\left(\theta_i - p_i\right)^2, \qquad (1)$$
or any other such scoring rule (see Winkler, 1996), where $n$ is the number of forecasts made, $\theta_i \in \{0,1\}$ denotes whether obligor $i$ has defaulted ($\theta_i = 1$) or not ($\theta_i = 0$), and $p_i$ is the predicted default probability.