The study of voters and elections has taught us a lot about individuals' vote choices and election outcomes themselves. We know that voters behave in fairly understandable ways on election day (see, e.g., Alvarez 1997; Campbell 2000; Campbell et al. 1960; Gelman and King 1993; Johnston et al. 1992; Lazarsfeld, Berelson, and Gaudet 1944; Lewis-Beck 1988). We also know that the actual outcomes are fairly predictable (see, e.g., Campbell and Garand 2000). Of course, what we do know is imperfect. (1) Even to the extent we can predict what voters and electorates do at the very end, we know relatively little about how voter preferences evolve to that point. How does the outcome come into focus as the election campaign unfolds? Put differently, how does the campaign bring the fundamentals of the election to the voters?
Previous research suggests that preferences evolve in a fairly patterned and understandable way (Campbell 2000; Wlezien and Erikson 2002). This research focuses on the relationship between election results for a set of years and trial-heat poll readings at varying points in the election cycle, mostly for presidential elections in the United States. (2) It shows that the predictability of the outcome increases as the polling date approaches election day: the closer we are to the end of the race, the more the polls tell us about the ultimate outcome. Although this may not be surprising, it is important, for the basic pattern implies that electoral sentiment crystallizes over the course of election campaigns.
The previous research takes us only part of the way, however, because it does not explicitly address dynamics. This is quite understandable; after all, we lack anything approaching a daily time series of candidate preferences until only the most recent elections. In this context, the U.S. presidential race in 2000 offers a rare opportunity. The volume of available data for this election allows us to directly observe the dynamics of voter preferences for much of the election cycle. We cannot generalize from a single series of polls, but we can explore preferences at much greater depth than has been possible in the past.
The analysis in this article attempts to answer two specific questions. First, to what extent does the observable variation in poll results reflect real change in electoral preferences, as opposed to survey error? Second, to the extent poll results reflect real change in preferences, did the change last, or did it decay? Answers to these questions tell us a lot about the evolution of electoral sentiment during the 2000 presidential race. They also tell us something about the effects of the election campaign itself. Now, let us see what we can glean from the data.
For the 2000 election year itself, the pollingreport.com Web site contains some 524 national polls of the Bush-Gore(-Nader) division reported by different survey organizations. In each of the polls, respondents were asked how they would vote "if the election were held today," with slight differences in question wording. Where multiple results for different universes were reported for the same polling organizations and dates, data for the universe that best approximates the actual voting electorate are used, for example, a sample of likely voters over a sample of registered voters. Most important, all overlap in the polls (typically tracking polls) conducted by the same survey houses for the same reporting organizations is removed. For example, where a survey house operates a tracking poll and reports three-day moving averages, we use only poll results for every third day. This leaves 295 separate national polls. Wherever possible, respondents who were undecided but leaned toward one of the candidates were included in the tallies.
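The overlap-removal rule can be sketched as follows. The poll records, house names, and figures below are hypothetical, and the tuple layout is an assumption for illustration; the logic simply keeps, for each survey house, only releases whose field periods do not overlap, which for a three-day tracking poll amounts to keeping every third daily release.

```python
from datetime import date, timedelta

# Hypothetical records: (house, last field day, days in field, % Gore, % Bush).
# A tracking poll reporting three-day moving averages releases one result per
# day, but consecutive releases share two of their three field days.
polls = [
    ("HouseA", date(2000, 10, 20), 3, 45.0, 46.0),
    ("HouseA", date(2000, 10, 21), 3, 45.5, 45.5),  # overlaps the 10/20 release
    ("HouseA", date(2000, 10, 22), 3, 46.0, 45.0),  # overlaps the 10/21 release
    ("HouseA", date(2000, 10, 23), 3, 46.5, 44.5),  # no overlap with 10/20
    ("HouseB", date(2000, 10, 21), 2, 44.0, 47.0),
]

def drop_overlap(polls):
    """Keep, per house, only releases whose field periods do not overlap
    an already-kept release: every k-th release of a k-day tracking poll."""
    kept, last_end = [], {}
    for house, end, days, gore, bush in sorted(polls, key=lambda p: (p[0], p[1])):
        start = end - timedelta(days=days - 1)
        if house not in last_end or start > last_end[house]:
            kept.append((house, end, days, gore, bush))
            last_end[house] = end
    return kept

independent = drop_overlap(polls)
# Keeps HouseA's 10/20 and 10/23 releases plus HouseB's single poll.
```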
Figure 1 displays results for the complete set of polls. Specifically, it shows Gore's percentage share of the two-party vote (ignoring Nader and Buchanan) for each poll. Since most polls are conducted over multiple days, each poll is dated by the middle day of the period the survey is in the field. (3) The 295 polls allow readings for 173 separate days during 2000, 59 of which are after Labor Day, which permits virtually day-to-day monitoring of preferences during the general election campaign. It is important to note, however, that polls on successive days are not truly independent. Although they do not share respondents, they do share overlapping polling periods, so polls on neighboring days will capture much of the same information by definition. This is of consequence for our analysis of dynamics.
[FIGURE 1 OMITTED]
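The mid-date convention is simple to implement. A minimal sketch follows; the tie-breaking rule for an even number of field days (rounding down to the earlier middle day) is an assumption, as the article does not state one.

```python
from datetime import date, timedelta

def mid_date(first_day, last_day):
    """Date a poll by the middle day of its field period. For an even
    number of field days this rounds down to the earlier middle day
    (an assumed tie-breaking rule)."""
    return first_day + (last_day - first_day) // 2

# A poll in the field October 18-20, 2000, is dated October 19.
print(mid_date(date(2000, 10, 18), date(2000, 10, 20)))
```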
The data in Figure 1 indicate some patterned movement in the polls over time. For any given date, however, the poll results differ quite considerably. Some of the noise is mere sampling error; there are other sources of survey error as well, as we will see. The daily poll-of-polls in Figure 2 reveals a more distinct pattern. The observations in the figure represent Gore's share among all respondents, aggregated by the mid-date of the reported polling period. We see more clearly that Gore began the year well behind Bush and gained through the spring, after which his support settled at around 47 percent until the conventions. We then see the (fairly) predictable convention bounces, out of which Gore emerged in the lead heading into the autumn. Things were playing out as political science election forecasters might have expected, and much like 1988 (see Wlezien 2001): the sitting vice president was running, the economy and presidential approval were favorable, he was behind in the polls early in the year, and he then gained the lead after the party convention. There the parallel with 1988 ends: Gore's support declined fairly continuously until just before election day, when it rebounded sharply. The polls in the field at the very end of the campaign indicated a dead heat.
[FIGURE 2 OMITTED]
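The daily poll-of-polls can be sketched in code. The records below are hypothetical, and pooling "all respondents" sharing a mid-date is approximated by weighting each poll's percentages by its sample size, which yields the same result as pooling the underlying respondents.

```python
from collections import defaultdict

# Hypothetical polls: (mid-date, sample size, % Gore, % Bush).
polls = [
    ("2000-09-10", 800, 44.0, 42.0),
    ("2000-09-10", 1200, 46.0, 44.0),
    ("2000-09-11", 1000, 45.0, 45.0),
]

def poll_of_polls(polls):
    """Pool all respondents sharing a mid-date and return Gore's share of
    the two-party (Gore + Bush) vote, ignoring Nader and Buchanan."""
    gore_n = defaultdict(float)       # estimated Gore respondents per day
    two_party_n = defaultdict(float)  # estimated Gore + Bush respondents per day
    for day, n, gore, bush in polls:
        gore_n[day] += n * gore / 100.0
        two_party_n[day] += n * (gore + bush) / 100.0
    return {day: 100.0 * gore_n[day] / two_party_n[day] for day in gore_n}

daily = poll_of_polls(polls)
```

Note that this pooling weights larger polls more heavily, which is what aggregating by respondents (rather than averaging poll percentages) implies.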
Survey Error and the Polls
Trial-heat poll results represent a combination of true preferences and survey error. Survey error comes in many forms, the most basic of which is sampling error. All polls contain some degree of sampling error. Thus, even when the division of candidate preferences does not change, we will observe changes from poll to poll. This is well known. All survey results also contain design effects, the consequences of the departure in practice from simple random sampling that results from clustering, stratifying, and the like (Groves 1989). When studying election polls, the main source of design effects relates to the polling universe. It is not easy to determine who will vote on election day: when we draw our samples, all we can do is estimate the voting population. Survey organizations typically rely on likely voter screens. In addition, most organizations use some sort of weighting procedure, for example, weighting by a selected distribution of party identification or some other variable that tends to predict the election day vote. How organizations screen and weight has...