Arousing visuals are ubiquitous in television news because they attract attention and elicit arousal (A. Lang, Potter, & Grabe, 2003). Interestingly, however, visuals with strong emotional content can distract and interfere with news learning (Klijn, 2003), resulting in visuals that draw viewers' attention to the emotional part of the news story but cause subsequent recall error (Brosius, 1993).
Cognitive neurophysiologists have long been intrigued by the relationship between emotion and cognition. An earlier, simpler assumption was that human rationality and reasoning could be hijacked by the pirates of emotion (Cacioppo & Petty, 1981). It is now generally accepted that emotion plays a constructive role in the process of cognition and behavior (Clore & Ortony, 2000). For example, research has shown that emotion affects attention, perception, memory, reasoning, and decision making (Cacioppo & Gardner, 1999) as well as interpretation of media messages (Steinfatt & Roberts, 1983). One explanation is that the amygdala, an area in the brain, is activated. LeDoux (2000) showed that information from sensory input organs travels to this mesial structure of the temporal lobe first before a second signal is transmitted to the cortex and that the amygdala seems to regulate the affective component of cognitive processing. Greenfield (2000) suggested that the amygdala is not the control center of emotion; such a conceptualization would merely change the locus of the problem and suggest the idea of a brain within a brain. However, the amygdala is where emotions are processed first and then transmitted to another brain structure for development and conscious elaboration. It appears, as Greenfield suggested, that emotion and cognition are inseparable.
Despite its important role in cognition, emotion has received scant attention in the study of audiovisual redundancy, or how people integrate television information from both the audio and the visual tracks when they do or do not convey the same information. In fact, none of the existing redundancy studies controls for this variable. Unfortunately, the existing literature on redundancy presents diverse and somewhat contradictory results (Brosius, Donsbach, & Birk, 1996; A. Lang, 1995). Given evidence that people process emotional stimuli differently than nonemotional ones (LeDoux, 2000), it is possible that previous contradictory results may be due to the confound of emotional arousal. This study factors in this variable. In addition, most redundancy studies have focused on memory measures, with none attempting to unravel viewers' thought processes in response to conflicting audio and visual information. It is important to understand how viewers reconcile and evaluate incompatible information in the presence of emotional stimuli (Bradley & Lang, 2000). Only after we gain insight into this process can we begin to correlate it with memory data to construct a more complete picture of viewers' cognitive performance. This investigation is an initial step in this direction.
Paivio's (1986) dual coding theory offers a conceptual understanding of how humans process two streams of information: verbal and nonverbal. When the sensory systems detect these two stimuli, referential connections between them are made. The strength of the connection affects the degree of cognitive elaboration (Sadoski & Paivio, 2001). In this context, referential processing is arguably made easier when the audio track refers to the visual track, saving resources for higher order processing and deeper elaboration of the messages. Conversely, when redundancy is absent, efforts to make referential connections within the pool of limited capacity may divert the spreading of activation, resulting in less thorough processing (A. Lang, 1995). Evidence of this interference can also be found in the so-called McGurk effect, in which visual information provided by seeing a speaker's mouth move can change the perception of the sound stimulus (McGurk & MacDonald, 1976).
Early redundancy researchers defined redundancy as the complete "match" between messages presented in the audio and visual channels. This idea suggested that both channels in television messages need to carry exactly the same information to be completely redundant. However, because the audio and visual channels embody two distinctive forms of information, it is difficult if not impossible to achieve complete synchrony. This disparity has led researchers to advance another definition of audiovisual redundancy, namely, between-channel semantic relatedness (A. Lang, 1995). For example, if the audio refers to the aftermath of an earthquake, visuals showing debris of the disaster are considered semantically redundant with the audio. In a farming regulation story, however, shots of a Congressional debate on such issues are considered more semantically related than shots of cows munching hay because the story is about the enactment of regulations.
The vocabulary created within this body of research has included such definitions of audiovisual redundancy as the "increased dimensionality" of information, as audio was considered to be an added dimension of the visuals (Garner, 1962); "close relationship" (Drew & Grimes, 1987); "tightness of fit" (Grimes, 1991); linguistic and content-related "repetition" (Brosius, 1989); and "text-picture correspondence" (van der Molen, 2001). Most researchers defined redundancy as shared information between the audio and visual channels, such that there are facilitative rather than contradictory relationships between words and pictures (A. Lang, 1995; Reese, 1984). Conversely, nonredundant presentation denotes that information presented in the audio and visual channels is conflicting. That is, the visual cues do not match what is spoken in the audio channel.
The findings of audio-video redundancy studies generally suggest that moving video can enhance learning if it is complementary to the auditory channel (A. Lang, 1995; McDaniel, 1973; Reese, 1984). When the two channels are redundant, viewers are somewhat able to treat audiovisual presentation as a single source (Grimes, 1991), with the possibility that most attention is focused on the verbal message for crucial information (Drew & Grimes, 1987). When the verbal and visual channels conflict, however, viewers must deal with two streams of information competing for attention.
However, contradictory results have also been reported, in which redundancy did not facilitate learning and comprehension (Brosius, 1989; Son, Reese, & Davie, 1987). Grimes (1990) also found that the medium redundancy condition, rather than the nonredundant condition, yielded the worst video recognition scores. Some researchers argued that pictures provided semantic meanings more quickly than did their verbal counterparts (Pellegrino, Siegel, & Dhawan, 1975). Therefore, in the case of conflict, viewers would direct their attention to the visuals (Drew & Grimes, 1987). Apparently, not all visuals are equal, which further highlights the importance of factoring in the emotional content of visuals.
Visual cognition involves constant human monitoring of the environment by comparing the external world with internal knowledge (Lester, 2000). If the visual scan reveals something emotionally arousing, curiosity will be triggered and more processing resources will be allocated to identify the phenomenon (Heilman, 2000).
In the psychology and communication literature, emotion has often been conceptualized using a dimensional approach, that is, by categorizing any specific felt emotion in terms of a common set of substrates. Two primary factors common to all emotional experience are valence polarization, or how positive or negative the experience is, and arousal level, or how much the emotional system is activated by the experience (Bradley, 1994; Mehrabian, 1980; Osgood, Suci, & Tannenbaum, 1957). Communication researchers also have found this dimensional approach to be effective when trying to understand the processing of emotional messages. This research includes studies focusing on television messages in general (A. Lang, Bolls, Potter, & Kawahara, 1999) and specifically television advertisements (Morris, 1995; Yoon, Bolls, & Muehling, 1999) and television news reports (Grabe, Zhou, Lang, & Bolls, 2000).
Although the presence of negative versus positive television footage has been the variable manipulation for many experiments in the area (Gunter, 1987; Newhagen & Reeves, 1992), there is some controversy about which of the dimensions, valence or arousal, is more important (Reeves & Nass, 1996). A growing literature shows, however, that arousal, whether caused by positive or negative emotion, may be the primary factor that drives both attention and subsequent memory. When examining attention and memory for still pictures shown to experimental participants for 6 s, P. J. Lang and colleagues (Bradley, Greenwald, Perry, & Lang, 1992; P. J. Lang, Greenwald, Bradley, & Hamm, 1993) have consistently found that when controlling the valence of the stimuli, experimental participants paid significantly greater attention to arousing messages as opposed to calm ones. Similarly, the later recall memory was better for the information contained in the arousing pictures, regardless of whether the pictures were happy or sad, than for the information in the calm pictures (Bradley et al., 1992; P. J. Lang et al., 1993).
Evidence from neuroscience and cognitive psychology further showed that emotional arousal can shape cognition without an individual's being aware of the process (Lewicki, 1986; Strongman, 1996). Significantly for the present research, emotion theory posits that human beings are evaluators. Unlike computers, people evaluate or appraise each stimulus that they encounter with respect to their personal relevance and significance. The idea of appraisal was advanced by Arnold (1970) who suggested...