Ever since the results of Arthur Schlesinger Sr.'s first survey of presidential experts were released in Life magazine (1948), the presidential ranking game has been a fixture of political journalism. At regular intervals, media titans and research institutes can now be expected to release new poll results updating purported evaluations of presidential greatness (for example, see C-SPAN 2009; Siena Research Institute 2010; Wall Street Journal 2005). However, despite their popularity with the public, it is probably fair to say that most political scientists feel that expert presidential ranking polls are simply "not very rigorous" (Pfiffner 2003, 23). Indeed, given the Justice Potter Stewart--"I know it when I see it"--subjective standard of evaluation often employed, even the experts who sometimes take part in the polls usually assume that they do not tell us much.

This study investigates this premise--with as much rigor and openness as the source material permits. It thus helps to fill the gap that exists between the public's almost insatiable interest in the presidential ranking game and the comparatively little attention the subject has received in scholarly journals. (1) In doing so, it provides critical analysis of presidential ranking surveys, investigating suspected problems associated with their subjectivity, lack of control for context, and evaluator bias.

This investigation leads to this study's first finding: regression analysis that both overcomes critics' concerns about the predictability of rating scores and provides fresh insight into the factors that structure presidential rating scores--consistently placing George Washington, Abraham Lincoln, and Franklin D. Roosevelt at the head of the class while consigning those like James Buchanan and Warren G. Harding to the back of the line. Results of multivariate analysis demonstrate just how easy it is to predict rating scores. This ease is demonstrated in two ways: first, by showing how accurately and consistently this study's primary model predicts past rating scores; second, by using the model to show what effect getting reelected and being seen by experts as taking advantage of the opportunity to reorder the political regime would be predicted to have on President Barack Obama's future rating score. Furthermore, the results of analysis demonstrate the significance of new measures, two operationalizing the latest theory extending Stephen Skowronek's path-breaking "political time" thesis (1993, 2011; see also Nichols and Myers 2010), and one controlling for cultural-level evaluative bias. Together these findings challenge past conclusions about the need for chief executives to possess "brilliance" (Simonton 2006).

This study's second finding demonstrates how context matters in structuring rating scores. This confirms that success in the ratings game is not mainly a function of personality or character traits. Indeed, this study additionally reveals the extent to which expert evaluators reward presidents who succeed in taking advantage of the contextual opportunity to reorder an enervated political regime. It also shows that experts punish those presidents who lead their political regime into enervated conditions, as we might expect, as well as those who, new theory suggests, fail to take advantage of the context to reorder. And because presidents are rewarded for performance within context in the ranking game, this suggests that they need to possess a measure of the contextual awareness that George Edwards (2009), and others, recommend, as well as a Machiavellian ability to alter one's "mode of procedure [to] accord with the needs of the times" (Machiavelli 1980, 121).

This study's third finding demonstrates that experts in every poll (and apparently of every political stripe) rewarded "progressive" presidents, gauged to be above average in their pursuit of "equal justice for all," with higher ratings. This does not show that experts are simply biased, but rather suggests that evaluation now takes place in a cultural milieu that favors presidents dedicated to equal justice. Consequences of this development are explored. In the end, while no claim is made that the popular expert surveys used in this study provide a true measure of presidential greatness, it is argued that the expert ranking polls--which may help the public define what it looks for in a president--may tell us more than critics admit.

Presidential Ranking Polls--Problems with Subjectivity, Context, and Bias

The ever-popular practice of evaluating leadership through comparative study has roots that go back two millennia, when Greek and Roman writers like Satyrus, Suetonius, and Plutarch first utilized biographical character studies to take stock of prominent leaders and advance theories of statecraft. Yet, it was not until 1948 and the behavioral revolution in the social sciences that Arthur Schlesinger Sr. applied a survey instrument to this task. He asked 55 experts (mostly historians) to judge each president on his "performance in office" by placing them in one of five categories: Great, Near Great, Average, Below Average, or Failure. (2) He later repeated the well-liked exercise in 1962, publishing the results of his survey in the New York Times Magazine. A host of others have tweaked the method while following the tradition of surveying experts to produce presidential rankings (C-SPAN 2009; Murray and Blessing 1988; Siena Research Institute 2010; Wall Street Journal 2005).

Despite its long history, evaluating leadership has always been problematic. Back in the first century AD, Cornelius Nepos admitted as much by apologetically opening his Lives of Eminent Commanders with an acknowledgement that many would judge his type of biographical analysis as "trifling" (1886). Today, three critiques challenging presidential ranking polls predominate in the existing literature (Pfiffner 2003), while a fourth (less discussed) problem also deserves brief mention. The first critique centers on the seemingly subjective evaluation standards used in some surveys. These are seen to allow each scholar to use their own criteria to rate presidents, and they have given some observers the impression that the presidential ranking game is "one without any real rules" (Dean 2001, 1). The second pertains to context and the still divisive issue of knowing how to fairly compare presidents who face differing historical opportunities and problems (Bailey 1966; DiClerico 1979). Third, expert polls are possibly biased. Here the usually cited threat is the predominance of Democratic partisan preferences within most expert survey samples (Felzenberg 1997, 2009; Lindgren and Calabresi 2000). There is also the possibility, which is not well addressed in the literature, that expert presidential ranking polls are not fully independent of each other. (3) However, while it is important to note that this concern has not yet been fully considered, my own analysis suggests that it is also fair to conclude that this issue does not threaten the results of this study.

One obvious way to respond to the critique that presidential ranking surveys are subjective is to set up measuring rods "specifying meaningful criteria to be used in rating presidents" (Faber and Faber 1997, 4). This is the strategy that C-SPAN, the Siena Research Institute (or Siena), and others have followed. Their ranking polls ask experts to score presidents on multiple equally weighted dimensions. (4) However, my own factor analysis of the results of the C-SPAN 2009 and Siena 2010 surveys reveals that the vast majority of their measures collapse onto one or two dimensions. In the case of the C-SPAN 2009 poll, 9 of its 10 measures scale on one dimension with a Cronbach's alpha of .976 and an eigenvalue of 8.13, (5) while Siena's 20 measures have a Cronbach's alpha of .981 and scale on two dimensions with eigenvalues of 15.15 and 1.34. (6) This means that while polls that employ multiple dimensions attempt to use many different criteria to rate presidents, the various measures they employ actually tend to be part of one or two underlying "greatness" dimensions.

Expert respondents to the C-SPAN 2009 and Siena Research Institute 2010 polls are therefore clearly not evaluating each president independently on every one of the dimensions. While it is impossible to know exactly why experts fail to make independent evaluations across all the dimensions, one can speculate that they may do so because even they lack the detailed knowledge that is required to accurately rate every president in such detail. Experts may simply work backwards from a general opinion of how presidents score overall and adjust scores on particularly salient measures when they deem it appropriate. (7) Although this does not prove that these surveys are hopelessly subjective, it does demonstrate that they fail to provide as many distinct criteria for evaluation as they purport to.
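
The scale statistics discussed above can be illustrated with a minimal sketch. It assumes a matrix of ratings with one row per president and one column per survey measure, and it uses synthetic data rather than the C-SPAN or Siena results. The function names and the simulated single "greatness" factor are hypothetical, introduced only for illustration, and the eigenvalues are taken from the item correlation matrix as a stand-in for a full factor analysis.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a 2-D array (rows = presidents, columns = measures)."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                          # number of measures
    item_vars = scores.var(axis=0, ddof=1)       # variance of each measure
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of the summed scale
    return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)

def correlation_eigenvalues(scores):
    """Eigenvalues of the item correlation matrix, largest first."""
    corr = np.corrcoef(np.asarray(scores, dtype=float), rowvar=False)
    return np.sort(np.linalg.eigvalsh(corr))[::-1]

# Synthetic illustration (NOT survey data): 40 hypothetical presidents rated on
# 10 measures that all reflect one underlying factor plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(40, 1))
ratings = latent + 0.3 * rng.normal(size=(40, 10))

alpha = cronbach_alpha(ratings)
eigenvalues = correlation_eigenvalues(ratings)
print(f"alpha = {alpha:.3f}")
print("eigenvalues:", np.round(eigenvalues, 2))
```

When the measures share a single underlying factor, as in this simulation, alpha approaches 1 and one eigenvalue absorbs most of the total (which always equals the number of measures). That is the pattern the text describes for the two polls.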

Another way to respond to the charge that there are no rules in the presidential ranking game is to use regression analysis to determine, post facto, what (if anything) structures rating scores. Despite the lack of meaningful measurement criteria, if a model accounting for a healthy amount of variance can be specified, and significant determinants can be found, then presidential rating scores can be predicted. This finding would provide evidence in support of a modest claim of internal validity and reliability within these polls.
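
The post facto regression strategy just described can be sketched as follows. This is only an illustration under stated assumptions: the two binary predictors (reelected, reordered the regime), the coefficients used to simulate the scores, and the helper names are all hypothetical, and the data are synthetic rather than drawn from any poll. It is not the model specified in this study.

```python
import numpy as np

def fit_ols(predictors, scores):
    """Ordinary least squares with an intercept; returns (coefficients, R^2)."""
    predictors = np.asarray(predictors, dtype=float)
    scores = np.asarray(scores, dtype=float)
    design = np.column_stack([np.ones(len(scores)), predictors])
    coefficients, *_ = np.linalg.lstsq(design, scores, rcond=None)
    fitted = design @ coefficients
    ss_residual = np.sum((scores - fitted) ** 2)
    ss_total = np.sum((scores - scores.mean()) ** 2)
    return coefficients, 1.0 - ss_residual / ss_total

def predict(coefficients, predictors):
    """Predicted rating scores for one or more rows of predictor values."""
    predictors = np.atleast_2d(np.asarray(predictors, dtype=float))
    design = np.column_stack([np.ones(len(predictors)), predictors])
    return design @ coefficients

# Synthetic illustration (NOT survey data): hypothetical binary predictors and
# made-up effect sizes, used only to generate example rating scores.
rng = np.random.default_rng(1)
n = 40
reelected = rng.integers(0, 2, size=n)
reordered = rng.integers(0, 2, size=n)
rating = 50 + 10 * reelected + 15 * reordered + rng.normal(scale=5, size=n)

coefficients, r_squared = fit_ols(np.column_stack([reelected, reordered]), rating)
predicted = predict(coefficients, [1, 1])  # a reelected president who reordered
print("coefficients:", np.round(coefficients, 2))
print(f"R^2 = {r_squared:.3f}")
```

A high R-squared in a fit of this kind would indicate that the scores are structured by the specified determinants, and the fitted coefficients can then be applied to a hypothetical president's values to obtain a predicted score.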

Construction of a predictive model is problematic because it forces the comparison of presidents serving in historical contexts that provide different opportunities and constraints. Indeed, some argue that "a man cannot possibly be judged a great President unless he holds office in great times" (Rossiter 1960, 138). Still others answer that "the most important determinants of presidential effectiveness come from the ability and personality of the president" (Skidmore 2004, 7). While...

