As pointed out by Willard (1989), argument "is a kind of interaction" (p. 92). Visual and multimodal argumentation is possible because the presentation of an argument is a communicative act performed in interaction with others, and because neither an argument (understood as a cognitive instance of premises and conclusions) nor any communication of it on some specific occasion is tied to a specifically verbal form of representation. An argument can be offered through different manifestations (words, pictures, moving images). As long as the audience understands that the manifestation presented is a communicative act intended argumentatively, this audience will also understand that the manifestation offers an argument.
Obviously, a picture and a short caption, often used to convey multimodal argumentation, do not explicitly forward premises and conclusions in the same way as purely verbal arguments. So, if pictures are to prompt arguments in the audience, some sort of symbolic condensation must be present. By symbolic condensation I mean the condensing of different sensations, words, and ideas into one pictorial or multimodal representation, in a way that allows an audience to unfold these ideas and sensations in their reception of the rhetorical utterances.
TRANSCRIPTIONS AND THICK REPRESENTATIONS
The immediate and instantaneous character of the reception of many pictures does not mean that the communicated relationships are less complex than those we find in verbal expression. It might instead be said that their complexity is not limited by the linear structure and discursivity of verbal communication. Susanne K. Langer (1980) noted that "[a]n idea that contains too many minute yet closely related parts, too many relations within relations, cannot be 'projected' into discursive form; it is too subtle for speech" (p. 93). This projection, however, is possible with the non-discursive symbolism we find in some pictures because the primary function of the non-discursive is "conceptualizing the flux of sensations, and giving us concrete things in place of kaleidoscopic colors or noises," and this function "is itself an office that no language-born thought can replace" (Langer, 1980, p. 93).
"Conceptualizing" here, as I read it, does not necessarily mean putting into words. It can also be an understanding, an intuitive knowledge, reflected in "the pattern of physical reaction, impulse and instinct" (Langer, 1980, p. 98). There is something in presentational symbolism that cannot be translated (directly) into the discursive symbolism typically associated with verbal language: an epistemological understanding created by the flux of pictorial sensations, which denotative words alone will not provide. Understanding the idea in a work of art, Langer (1980) suggested, and understanding a picture, I may add, "is ... more like having a new experience than like entertaining a new proposition" (p. 263).
In Problems of Art, Langer (1957) displayed an essentialist position when claiming that there can be no hybrid works (pp. 81-82; cf. Mitchell, 1986, 2002), but the distinction between the presentational and the discursive does not make these forms of representation mutually exclusive: Presentational symbols may be proxy for discourse, and their content can be verbalized (Langer, 1980, p. 260). The presentational and the discursive are inseparably present in each other (Palczewski, 2002, pp. 5-6; cf. Mitchell, 1986; Willard, 1978, 1981). The verbal art of rhetoric, for instance, has always been imbued with visuality. Orators have always created images with their words and spoken with their bodies (Kjeldsen, 2003). The inseparability of the presentational and discursive is essential for the actuality and power of visual rhetoric and argumentation. When looking at a picture of a man shooting, for instance, we cannot but grasp it both conceptually and concretely: We simultaneously experience the phenomenological event of a specific man shooting and conceptualize this phenomenon in general as a man shooting.
In the same way, the rhetorical and argumentative value of symbolic condensation is that it allows for a simultaneous cueing and evoking of a wide range of emotions and trains of thought. This is possible because of the semiotic richness of pictures. Photographs, noted Roland Barthes (1977), have a feeling of "analogical plenitude" so great that verbal description is literally impossible (p. 18). There are so many details in a photograph that it would require a lengthy book to try to describe it, and still you would not succeed because to describe a picture is "not simply to be imprecise or incomplete, it is to change structures, to signify something different to what is shown" (Barthes, 1977, pp. 18-19). "The image, in its connotation," wrote Barthes, "is ... constituted by an architecture of signs drawn from a variable depth of lexicons" (p. 47). Or, to put it another way: The semiotic richness based on the many different semiotic resources working simultaneously in pictures makes symbolic condensation possible.
Think of something as simple and ritualised as a wedding photograph. In order for such an image to make sense, we must call upon our knowledge of certain photographic conventions and formal traits (e.g., perspective, composition, colour), of distance to objects, interpersonal proximity, body movement, gestures and facial expressions, clothing and flowers. This, in turn, activates knowledge of interpersonal behaviour, cultural norms, and values.
Borrowing from Umberto Eco's theory of semiotics, Robert Hariman and John Louis Lucaites (2007) in their book, No Caption Needed, referred to this multiplicity of simultaneous codings as transcriptions.
Because the camera records the decor of everyday life, the photographic image becomes capable of directing the attention across a field of cultural norms, artistic genres, political styles, ideographs, social types, interaction rituals, poses, gestures, and other signs as they intersect in any event. (p. 35)
Hariman and Lucaites referred here to iconic press photographs, but the theoretical point applies in general. We can have the same kind of multiple transcriptions in paintings, cartoons, and other pictures.
The rich representation of pictorial imagery is made possible by the multiplicity of codes, resources, or transcriptions working simultaneously. Barthes called this analogical plenitude because he was talking about photographs. Since I examine pictorial communication in general, I will use the term visual plenitude. It is visual plenitude, as described above, that enables thick description. The relation between these two terms is similar to the relation between the classical terms descriptio and evidentia (or the Greek equivalents ekphrasis and enargeia) as understood by some ancients (Quintilian, Institutio Oratoria VI.ii.31-32). In the same way that detailed descriptions may lead to evidentia, visual plenitude, the multiplicity of representations in a picture, may lead to a thick representation. Leaning on Clifford Geertz's (1973) term thick description, I thus argue that pictorial representation has the ability to perform a thick representation that, in an instant, can provide a full sense of an actual situation and an embedded narrative connected to certain lines of reasoning. Relying on Gilbert Ryle (2009), from whom the notion of thick description originates, Geertz distinguished between the thin description of what someone is doing (rapidly contracting his eyelids, for instance) and a thick description of the same phenomenon (winking, flirting, or parodying, for instance) (p. 6). Ethnography is thick description, Geertz (1973) wrote, because "the ethnographer is ... faced with ... a multiplicity of complex conceptual structures, many of them superimposed upon or knotted into one another, which are at once strange, irregular, and inexplicit, and which he [sic] must contrive somehow first to grasp and then to render" (p. 10).
This is very much the same task we face as viewers when looking at pictures. While Geertz considered it the work of the ethnographer to provide thick descriptions of cultural phenomena, I approach the notion of thick description in two ways. First, I view the picture itself as a site able to offer thick representations: Pictures can provide thick representations of the world because they can offer multiplicities of complex conceptual and aesthetic structures knotted into each other. Second, similarly to Geertz's view of the ethnographer, I see it as the role of the argumentation scholar to provide thick representations of visual and multimodal argumentation that involve more than discursive meaning. The work of the rhetorical argumentation critic is to account for the stratified hierarchy of meaningful structures and simultaneously to grasp and render the phenomenological and aesthetic qualities of the picture. A picture may communicate propositions, but rhetorically and argumentatively it is more than the propositions we may extract from it.
No one would identify a Beethoven quartet with its score, wrote Geertz (1973, p. 11). From a phenomenological perspective he is certainly right. In order to understand a piece of music, it is not enough to read the score; you have to hear the music. In order to explain the aesthetic and rhetorical appeals of a piece of music, it is not enough to just provide the bare facts, the eyelid contraction. This is equally valid for ethnography and for rhetorical research on pictures. When attempting to understand pictorial argumentation, it is not enough to extract and write down premises. What we too often write down when examining action, Geertz (1973) argued with a quote from Paul Ricoeur, is the "noema ['thought,' 'content,' 'gist'] of the speaking. It is the meaning of the speech event, not the event as event" (p. 19).
The notion of thick representation in visual argumentation helps us to establish...