Abstract: This paper aims to contribute to our understanding of multi-modal argumentation by examining the role of prosodic features in persuasive messages. Standard analyses of advertisements already assign a key role to visuals in understanding, reconstructing and assessing the argument. I present reconstructions of TV commercials that take into account verbal, visual and prosodic components. Because prosodic features are here especially relevant to reinforcing the argumentation, they should not be neglected in argumentation analysis.

Keywords: argumentation, multimodal discourse, nonverbal communication, prosodic features.

1. Introduction
Contemporary studies on argumentation broaden the scope of argumentation research beyond the verbal and include analyzing the role of images (Birdsell & Groarke 1996; Birdsell & Groarke 2007; Groarke, 1996; Groarke & Tindale 2013…), music (Branigan 1992), gesture (Gelang & Kjeldsen, 2010) and other nonverbal elements in argumentative discourse. The need to deal with more than merely verbal elements in the argumentation process is perhaps most obvious in view of technological developments that alter our means of communication (and argumentation), as well as the ever-present influence of the media and advertising industry in shaping public opinion, values, interests, and incitements to action. Groarke (1996, p. 10) points out perhaps the plainest reason to develop an account of visual arguments, which are in some cases crucial to persuading an audience: “Visual appeals are especially pervasive in everyday discourse, in which visual images propound a point of view in magazines, advertising, film, television, multi-media, and the World Wide Web”.

Multimodal research extends to modes of argument other than visuals, which may be equally persuasive and which arguers may use in everyday discourse either as a sole means of argumentation or simultaneously with other modes. Film or television commercials, for instance, combine verbal and visual modes but also music, framing, and prosodic features such as voice quality, intonation, etc. However, the multimodality of argumentation constitutes a challenge to argumentation analysis because decoding and analyzing non-verbal argument differs in important ways from more traditional, verbal argumentation analysis. Differences in analysis have resulted in a dispute among argumentation scholars on whether non-verbal elements, for instance images, could ever be considered as arguments (Fleming 1996; Blair 1996…). Over the last two decades or so it has become more accepted (though far from being accepted widely, or beyond doubt) that arguing without words is possible (Groarke 2002; Kjeldsen 2012; Lake & Pickering 1998). Gilbert (1994), who has given analyses of argumentation in everyday discourse, suggested another view on multi-modality in argumentation that includes logical, emotional, visceral and kisceral arguments. He states that these modes may sometimes merely ‘strengthen’ or ‘repeat’ each other, but also that kissing, touching, or feeling could be considered as argument provided it is being used to convince or persuade.

Gelang & Kjeldsen (2010) state that argumentation can occur in a host of different forms of expression, including speech, pictures and nonverbal behavior. Authors who investigate the role of nonverbal communication in argumentation, especially the use of gestures and facial expression, claim that nonverbal elements can function as arguments contributing to the ethos of the speaker (in their case, politicians), because “recipients of a message in a rhetorical situation create their perception of the speaker through a holistic perspective” (Gelang & Kjeldsen 2010, p. 567).

In summary, the analysis of argumentation in every rhetorical situation thus has to be multi-modal, because messages by which speakers intend to persuade audiences consist not only of a verbal part, but also feature nonverbal elements that can contribute to the strength of argument, or may even stand as arguments themselves. In this paper, we shall particularly deal with the ways in which non-verbal elements known as prosodic features may contribute to argumentation discourse.

2. Prosodic features and nonverbal communication
Prosodic features refer to both voice and speech cues of the speaker. They include features such as pitch, temporal structure, loudness and voice quality, emphasis and accentuation, but also (non)fluencies of the speaker. An extensive literature on nonverbal communication research has generally strengthened the view that such features have an important communicative role. For instance, Vroomen, Collier & Mozziconacci (1993, p. 577) write:

A speaker may indicate, through prosodic means, to which information the listener should pay particular attention (accentuation, emphasis), and he may provide cues about the syntactic organization of the utterance (phrasing). The communicative function of prosody is most readily associated with the expression of emotion and attitude.

Besides a correlation between prosodic features and emotions (Davitz, 1964; Scherer, 1993; Vroomen, Collier & Mozziconacci 1993; Neuman & Strack, 2000), prosodic features are connected to the perception of a speaker’s personality and credibility, in short his ethos (Kramer, 1977, 1978; Berry 1990, 1992; Kimble & Seidel, 1991; Zuckerman & Miyake, 1993; Hickson, Stacks & Moore, 2004; Zuckerman & Sinicropi, 2011). Past research has particularly confirmed that prosodic features (among other elements of nonverbal behavior) are associated with the persuasiveness of the speaker and with changes in attitude (Burgoon, Birk & Pfau, 1990; Knapp 2002). For instance, fluency, variations in pitch, higher intensity (i.e. louder speech) and faster tempo are connected with greater persuasiveness.

Although the connection between prosodic features and perceived qualities of a speaker is based mostly on stereotypes, numerous studies have suggested that such findings likely hold in real-world situations. For instance, Levin & Hall (1985) and Knight & Alpert (1985) support a connection between a person’s pathologies and his prosodic features. To give another example, clinically depressed people tend to exhibit a lower speech rate, owing in part to particularly long pauses in their speech. Acoustic measurements, moreover, confirm that patients can change their vocal characteristics after undergoing therapy (Ostwald, 1961). The presence of stereotypical vocal characteristics is consistent with extant research which shows both female and male speakers to regularly perceive themselves in fairly stereotypical ways (Kramer, 1977, 1978; Berry 1992; Knapp 2002).

Based on this as well as similar empirical research (e.g., Smith et al. 1975; Surawski & Ossof, 2006; Bartsch, 2009), one can conclude that a lower vocal pitch, a faster speech rate, and a relative absence of non-fluencies generally go along with higher ratings for a speaker’s competence and dominance. Zuckerman and Driver’s (1989) research on vocal attractiveness proposed that, similar to attractive faces, attractive voices may also elicit a more positive interpersonal impression. They found that professional judges, for instance, were able to agree on whether voices are attractive or not and that more attractive voices were associated with more favorable impressions of personality. As mentioned earlier, attractive voices are characterized by lower pitch and by the absence of nasality and extreme harshness. Subsequent work has largely replicated such results, showing that the effects of vocal attractiveness are comparable to those of physical attractiveness (e.g., Berry 1990, 1992; Zuckerman et al. 1990; Zuckerman & Hodgins 1993). Speakers with more attractive voices are thus more favorably perceived by others. These insights are, of course, regularly exploited in public sphere communication such as advertising, radio and television, business communication (telephone announcements, customer service), and politics, among others.

Here, nasality is a vocal characteristic considered to be particularly undesirable in public speaking. As Bloom, Zajac & Titus (1999, p. 279) state:

Highly nasal voices were rated as being lower in “status” (occupation, ambitious, intelligent, educated, influential), lower in social solidarity (friendly, sympathetic, likeable, trustworthy, helpful), and were negatively correlated with perceptions of persuasiveness.

Prosodic features have thus clearly been shown to be of importance for the assessment of a speaker’s personality and her persuasiveness, but also for the recognition of speakers’ emotional states. One of the early researchers in nonverbal communication, Davitz (1964, p. 13), found that “regardless of technique in experiment, all research confirms that emotional state of a person can be recognized on the basis of vocal nonverbal expression,” a claim supported by more recent studies (Scherer, 1993; Neuman & Strack, 2000). Scherer (1986) has even hypothesized about a universality of vocal expression of emotions, the most important cues for emotion recognition being variations in tempo and pitch such that, for instance, happiness goes along with high pitch (higher frequency), variability in frequency changes, higher intensity (loudness) and greater tempo – sadness being associated with the polar opposite. How might such insights be used in rhetoric and argumentation research?

3. Prosodic features and argumentation
Prosodic features are readily connected to a speaker’s ethos (credibility, trustworthiness, honesty, benevolence) which has since antiquity been central to the process of persuasion. The Aristotelian Rhetoric (1.2. 1356a, 1991, p. 38), for instance, states:

There is persuasion through the character whenever the speech is spoken in such a way as to make the speaker worthy of credence; for we believe fair-minded people to a greater extent and more quickly [than we do others] on all subjects in general and completely so in cases where there is not exact knowledge but room for doubt.

The credibility of the speaker is thus important whenever there is intent to persuade, and most importantly so for testimonial claims. As Govier (1993, p. 93) explains:

Testimonial claims are especially important for a variety of reasons. Human knowledge is utterly dependent upon our acceptance, much of the time, of what other people tell us. Only thus can we learn language and pass on knowledge from generation to generation; only thus have we access to times, places, and cultures we do not and cannot experience ourselves.

Although testimonial claims also feature in judicial or political discourse, advertising stands out in relying almost entirely on testimonies of those who have experienced a certain product or are involved in its development. Discussing the importance of the speaker’s credibility in testimonial claims, Govier distinguishes normative credibility, which depends on a person’s sincerity, honesty, and reliability, from rhetorical credibility, which depends on the impression a speaker gives: “the extent to which one is regarded as believable, and is believed, by others.” She (1993, p. 94) characterizes such rhetorical credibility in exemplary fashion when stating:

People who are white and male, who dress well, look professional, appear middle class or upper middle class, speak without an accent in a deep or low-toned voice, and seem unemotional, rational and articulate, tend in many contexts to have more rhetorical credibility than others. Often those who lack such qualities are, in effect, rhetorically disadvantaged.

On this view, the manner of speaking as well as performance in general (clothing, body movements, body space etc.) are epistemically irrelevant, but rhetorically relevant. But could prosodic features or nonverbal elements be argumentatively relevant in general?

Gelang & Kjeldsen (2010, pp. 567–571) have recently claimed that nonverbal communication performs an argumentative function, or purpose, by contributing to the speaker’s ethos. They provide examples drawn from the analysis of political discourse, where politicians are perceived in a certain manner on the basis of nonverbal signs, and suggest that, in some cases, such nonverbal behavior can be taken as a premise:

Moderate physical movement can in some circumstances be taken as a premise for the claim that a person is suitable as president; because it signals that the speaker is in control, where other people would be steered by their emotions.

We now pursue this idea, and wish to suggest that prosodic features can likewise be taken as a premise in specific argumentative situations. As will be illustrated with several examples of television commercials, prosodic features can, in certain cases, either contribute to the strength of an argument, or else function as a crucial part of it.

3.1 Prosodic features as contributors to the strength of an argument
Prosodic features generally make some additional, broadly situated contribution to what, in abstraction thereof, is some non-situated argument-content. For instance, a higher pitch and faster tempo of the verbal message may illustrate the speaker’s happiness; lower pitch and quiet, slow speech may indicate depression or sadness; a staccato rhythm may lead a speaker to be perceived as strict, bossy, dominant and representing an authority, etc. Prosodic features are frequently used in television commercials to stress certain selling-points, or to establish one.

3.1.1 Always liners
One example of this is provided by a TV commercial for feminine hygiene products,[i] namely a commercial for Always liners which, incidentally being in Polish, shows non-Polish speakers perfectly well that the verbal part of the message is irrelevant to grasping the claim and the reasons offered in support. As is common knowledge, women tend not to feel good during the menstruation period, to lack energy, be tired, and feel uncomfortable, sometimes even anxious. But, or so the commercial vividly suggests, when using the Always product, women may do what they please and nevertheless feel clean and comfortable – as shown through visuals – but also happy, enthusiastic, energetic, vibrant – as presented through prosodic features connected with happiness such as high pitch, high intonation endings, wide pitch ranges, and faster tempo. The chain of reasoning one might thus associate with this commercial is roughly this: Although menstruating, you feel good and vibrant when using Always liners. So, if you want as much, buy Always.

Besides pitch, intonation, tempo and pitch range, several other features can contribute to the strength of an argument. Word emphasis, rhythm and intensity (or loudness) can also be very important. Word emphasis often serves the purpose of identifying the most important word in a sentence, reveals new information, and generally differentiates parts of the speech according to communicative importance. A verbal message, for instance, can be presented in staccato rhythm (speech with pauses between words or even between syllables, characterized by tense articulation), which is typical of giving orders in a strict manner that indicates dominance and establishes authority, or in legato rhythm, with smooth transitions between syllables and lax articulation. Loudness and intensity may also serve a function, as louder speech is frequently perceived as more persuasive.

3.1.2 Depression
A rather good example of the use of these features is a commercial that advertises services for people who deal with depression.[ii] Its main intention is to raise awareness of depression, stating it to be a disease-like condition that can be cured if approached in the right way. The final claim is: If you suffer from depression, you need to get help. How do prosodic features contribute to this message? The female voice-over, reading the message, displays a specific voice quality (a whispery voice suggesting empathy, compassion, and gentleness) and intonation (asking questions and giving answers). Content-wise, the message points to personal insights on depression. For instance, “Did you know that you can also feel it physically?”

Word emphasis is crucial in revealing new information when stating: “you KNOW you can feel it emotionally” – thus suggesting this is common knowledge – “But did you know you can ALSO feel it physically?” The function of emphasis, here, is to point out that depression has more than one symptom: besides the widely known emotional consequences, the pain can also be physical. The ad continues: “There ARE treatments that work on both emotional and unpleasant physical symptoms,” emphasizing that there are ways to deal with this pain. An additional prosodic feature in this commercial is the speech pause, used in a stylistic function to stress the part of the message following the pause. For instance, “Where does depression hurt? (pause) EVERYWHERE. Who does depression hurt? EVERYONE.” By stressing the words “everywhere” and “everyone” the problem of depression receives emphasis; there is no need to explain it further. “Everywhere” here indicates that it is indeed a serious and complex condition for which a patient needs expert help. It is not a simple headache which can be cured with the right pill. And who does depression hurt?

By stressing “everyone” there is no need to explain that the whole family is suffering, that the patient’s children, spouse, friends and coworkers feel it too. Everyone is affected by someone’s depression. This effectively yields another reason why those suffering from depression should seek expert help, as they can help not only themselves but everyone around them.

3.1.3 Evian
Unlike the two previous examples, the third one, a commercial for Evian water,[iii] is based on the testimony of the product itself. The chain of reasoning is simple: if a product looks clean and healthy, and if it sounds clean and healthy, then it is healthy. The commercial combines the verbal mode, explaining where the water comes from (the cleanest water sources in untouched nature), the visual mode (scenes of mountain tops covered with snow), music (instrumental), but also the prosodic features typical of a female speaker with very attractive voice quality, a whispery phonation type, and slower tempo. Characterized by an enhanced pronunciation of the consonant [s], her speech resembles the sound of flowing water and wind.

3.1.4 Comparison
The argumentation in the commercial on depression is based on the simultaneous use of verbal and visual modes, while prosodic features, music and framing, so to speak, “strengthen” the argument. This is an example of the use of prosodic features where, were one to remove or somehow alter these, the argument-content would remain the same, but the argument would overall be a weaker one.

The argumentation in the Always example is based on the testimony of the product user stating something like: If you want to be like me or feel like me, use this product. Argumentation in the depression example is based on the argument from authority: a person who knows more gives advice. In addition, this person is empathic, gentle and truly wants to help (information conveyed by specific prosodic features). Working in a similar way, different modes of argument combine in supporting the claim that Evian water is clean and healthy, and therefore should be purchased. In all three commercials prosodic features work in combination with other modes of argument in a multimodal discourse, lending additional strength to the argument. An easy test to determine situations where prosodic features are crucial is to ask whether their absence, or modification, can change the argument-content. If this is the case, such features are in fact essential for the argument-content.

3.2 Prosodic features as an essential part of an argument
In certain situations prosodic features may function as more than just additional elements strengthening the argument; rather, they can be key for understanding the overall message, but also crucial parts of an argument. An example is provided by a Volkswagen television commercial.[iv]

Here, a specific lifestyle, or an attitude to life, is connected to a specific accent of a speaker. The main character speaks English with a recognizably Jamaican accent, stereotypically connected with a particular life-philosophy that values being relaxed, easygoing, carefree, and happy. Other people in this commercial, his colleagues, are depicted as being frustrated, in a bad mood, and frowning, while the protagonist spreads joy wherever he goes (in an elevator, by the coffee machine, at the meeting, etc.), constantly reminding others to look at the bright side of life. At one of the important moments in this commercial, his colleagues ask whether he isn’t in fact from Minnesota, something he confirms. So why does a white American from Minnesota speak his native language with a Jamaican accent? Answer: because he is happy, carefree, and easygoing. Why so? Because he drives a Volkswagen, or so the viewer learns when his moody co-workers, after having taken a drive in his Volkswagen car, return in a much better mood, smiling, and also speaking with a Jamaican accent. Jamaican English is here presented not only through vowel pronunciation, but also through its specific syntax. In this commercial, then, the manner of speaking is more important than the verbal message.

The argumentation in this commercial can be reconstructed, Toulmin-style, as follows:

Ground: A happy person in a firm speaks with a Jamaican accent (but is not from Jamaica).
Warrant: People with Jamaican accents are perceived as happy.
Claim: Volkswagen cars bring happiness to people.
Final claim: Buy a Volkswagen.

The second example, an Amnesty International commercial on violence against women, also makes use of accent and pronunciation as a crucial part of an argument.[v] It intends to raise awareness of both the perpetrators and the victims of violence, particularly by countering the stereotypical view according to which perpetrators are generally of low social status, lack education, and come from rural areas and, similarly, that female victims are weak, poor, uneducated, and unintelligent. Its main message is: Everybody can be a perpetrator, and everybody can become a victim. Do not judge people based on their appearance alone.

This message is predominantly communicated through prosodic features, while the commercial itself instantiates an argument from example, in turn based on the findings of sociolinguistic research on language attitudes showing people with some accents to be perceived as more sophisticated, educated, and as belonging to a higher social stratum. Both the male and the female speaker use Received Pronunciation (RP) British English, which is a strong signal of their socioeconomic position, at least for native British English audiences (see, e.g., Trudgill 1995; Coupland & Bishop, 2007; Andersson & Trudgill, 1990; Giles, Scholes & Young 1983). Although the most extensive research on language attitudes has been carried out for British English, similar findings for many different languages regularly demonstrate the importance not only of what has been said, but also of how it is said, e.g., Labov (1966, 1972), Lippie-Green (1997) for American English, Hawkings (1993) for French, Kontra (2003) for Hungarian, Pomerantz (2002) for Spanish, Bezoojien (2002) for Dutch, Kišiček (2012) for Croatian. Invariably, accent is connected with the perception of speakers’ status, occupation, intelligence, economic situation and prestige.

The commercial makes use of these insights in order to launch an argument, presenting what in effect is an “audition for the best perpetrator.” During the audition, however, the viewer cannot see the candidates, merely their fists. This body part then is a nonverbal metonymy. The audition is conducted by a woman, whom the audience can only hear speak, with all the vocal qualities representing her as an educated, strong, intelligent woman with authority and dominance. She even chuckles the moment that a perpetrator displays his aggressiveness by growling. Not intimidated, she does not take the obviously aggressive “candidates” seriously. This changes, however, when she faces the third candidate, who speaks in perfect RP English with an attractive voice quality. Initially, his tempo is reduced, showing him to be under control, calm but dominant; then his manner of speaking changes, and towards the end he is annoyed because the female speaker interrupted him. The prosodic features he now displays typically reveal aggressiveness: louder speech (yelling) and a changed modulation (staccato rhythm) that comes across as determined, dominant, and order-giving. The female speaker also changes features of her speech toward the end, as she begins to stutter and speaks quietly, being on the verge of tears. Whether this argument is strong or weak may perhaps be discussed, but prosodic features remain a crucial part of it. Were the specific accent removed or changed, the message would no longer be clear, nor would the claim be the same.

4. Conclusion
This paper has briefly discussed the importance of prosodic features in multimodal argumentative discourse. The term “prosodic features” covers all aspects of the manner of speech, including voice quality, accent and pronunciation (e.g., of vowels and consonants), tempo, rhythm, intensity, intonation, word emphasis, and (non)fluencies. Based on several examples of TV commercials, it was shown that not only what is being said, but also how it is said can contribute, positively as well as negatively, to the strength of an argument. Prosodic features, however, can sometimes take on an even more important role. Being more than mere contributing factors in these cases, they can be essential for successfully making an argument.

Although this paper deals with TV commercials, rather than real-life argumentative situations, one may tentatively conclude that one’s manner of speaking influences one’s persuasive abilities. Thus, features of speech can identify the speaker as being a certain type of human being: determined or weak, clever and educated or not, etc. These identifications, in turn, can be used as premises in specific situations.

21st century public discourse is multimodal, and there is a need to recognize more than a mere verbal, or propositional, mode of argument, something that currently challenges analysts who seek to identify different modes of argumentation. As van den Hoven & Yang (2013, p. 422) conclude:

The argumentative reconstruction of multimodal public discourse is a necessary element of advanced media-literacy in a world in which multimodality is the standard and a critical attitude of experts is desirable.

The argumentative reconstruction of multimodal public discourse should take prosodic features into account; the appeal to the ear, as it were, should not be disregarded, and its role in argumentative discourse should be properly analyzed.

