Abstract

Nigel Ward’s new book, Prosodic Patterns in English Conversation, provides a glimpse into what computational techniques, balanced with native speaker introspection and intuitions, can do for research into prosody. Ward’s approach is contrasted with more phonological, pitch-focused accounts of prosodic meaning, where particular melodies (e.g., a low rise) or sequences of underlying tones (e.g., a L* L-H% in Tones and Break Indices [ToBI] notation; Beckman & Elam 1993) carry meaning, and specifics of the phonetic implementation of those tunes or tones represent variations on those core meanings. Ward’s analysis is more surface-oriented, in that he attempts to find meaningful patterns directly from the acoustic signal, considering intensity, duration, and fundamental frequency (f0; one acoustic correlate of the psychoacoustic phenomenon of pitch) in describing what constitutes a meaningful prosodic pattern, or “construction.”
Following a short introduction, in chapters 2, 3, and 4, Ward gives detailed descriptions of both the form and functions of three prosodic patterns. The first is the “bookended narrow pitch” construction, a period of narrow f0 surrounded by areas of wider f0, used for a variety of functions (contrasting something, seeking agreement, complaining, etc.), which Ward unifies under the meaning of “consider this.” The second, the “minor third” construction, is a period of high, narrow-range f0 followed by a period of mid, narrow-range f0. This construction is used for routine situations which demand a response (e.g., knock-knock jokes, greetings; see also Ladd 1978). Chapter 4 outlines the functions of creaky voice, which Ward does not treat as a construction, due to its more paralinguistic nature, though it is used as a component of some other constructions.
Chapter 5 provides a brief history of the study of prosody. Chapters 6 and 7 describe two constructions which Ward says pose problems for traditional approaches to prosody: the “late peak” construction (a period of high intensity following a period of high f0), and a construction for expressing “positive assessment.” Both are later revisited using Ward’s model.
Chapter 8 introduces the concept of “superposition,” where constructions can be overlaid (e.g., creaky voice overlaid on a minor third, or a late peak on one of the “bookends” of the bookended narrow construction). Chapter 9 describes the methodology used in the rest of the book, Principal Component Analysis (PCA). PCA is a method of data reduction, whereby a large number of measurements are distilled into a smaller number of dimensions. Ward uses an extended analogy of physical measurements of humans here. A large amount of the variation in a range of measurements (height, weight, length of limbs, etc.) can be described by a single dimension of “age,” with other dimensions like “tall”/“short” explaining additional variation not accounted for by that first dimension. The PCA used here takes a large number of acoustic measurements and reduces them to a smaller number of dimensions, which inform Ward’s discovery of constructions. For example, a large amount of the variation in intensity and f0 in the measurements from two speakers, A and B, is covered by Ward’s first dimension, which is the “Has Turn” construction. If A is speaking, the first dimension will be negative (counterintuitively, representing high intensity and f0 from A); if B is speaking, the first dimension will be positive (representing low intensity and f0 from A).
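The body-measurement analogy can be made concrete with a short sketch. The following is not Ward’s code or data: it is a minimal illustration in Python with NumPy, in which the latent “age” and “build” factors, the four measurements, and all coefficients are invented for the example. It shows one dominant dimension absorbing most of the variation. It also shows that the sign of a PCA dimension is arbitrary, which may help explain why a negative value on Ward’s first dimension can correspond to high intensity and f0 from speaker A.

```python
import numpy as np

def pca(data, n_components):
    """Standardize the columns of `data`, then run PCA via SVD.

    Returns (components, scores, explained_variance_ratio).
    """
    centered = data - data.mean(axis=0)
    # Standardize so that measurements in different units are comparable.
    scaled = centered / centered.std(axis=0)
    _, singular_values, vt = np.linalg.svd(scaled, full_matrices=False)
    variance_ratio = singular_values**2 / np.sum(singular_values**2)
    components = vt[:n_components]      # one row per dimension
    scores = scaled @ components.T      # each person's value on each dimension
    return components, scores, variance_ratio[:n_components]

# Synthetic data: every number below is made up for illustration only.
rng = np.random.default_rng(0)
n_people = 500
age = rng.uniform(2, 18, size=n_people)    # hypothetical latent "age" factor
build = rng.normal(0, 1, size=n_people)    # hypothetical "tall"/"short" factor

height = 80 + 5.5 * age + 4.0 * build + rng.normal(0, 2, n_people)
weight = 10 + 3.0 * age + 3.0 * build + rng.normal(0, 2, n_people)
arm = 30 + 2.2 * age + 1.5 * build + rng.normal(0, 1, n_people)
leg = 40 + 2.8 * age + 2.0 * build + rng.normal(0, 1, n_people)
measurements = np.column_stack([height, weight, arm, leg])

components, scores, ratio = pca(measurements, n_components=2)

print("share of variance per dimension:", np.round(ratio, 3))
# The first dimension tracks "age", but its sign may come out flipped:
# only the absolute correlation is meaningful.
print("correlation of dimension 1 with age:",
      round(float(np.corrcoef(scores[:, 0], age)[0, 1]), 3))
```

In this toy setting the first dimension is interpretable only because the underlying “age” factor is known in advance. With acoustic data, that interpretive step is where Ward’s reliance on native speaker intuition comes in.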
In chapter 10, Ward revisits the constructions introduced previously, along with a few others, using PCA. The next three chapters walk through some of the additional dimensions uncovered by the PCA, namely, those related to, respectively, turn-taking, topic management, and stance. These include the “filler” construction, used for holding the floor (a period of lengthened, high intensity speech with narrow f0), and the “low pitch for explaining” construction (a period of low f0). A very short chapter 14 covers “The Rest of English Prosody,” and chapter 15 is a brief conclusion, calling for improvements on his model, and for speakers and learners of English to make use of the knowledge gained from the book.
The usefulness of Ward’s approach likely depends on the extent to which the reader agrees with Ward’s definitions and approaches to the three components of his title: “prosodic patterns,” “English,” and “conversation.”
For “conversation,” Ward is concerned with the function of prosody in dialogues: how people “explain things, make plans, cooperate on a task, or get to know each other” (3). The main analysis of the book is on a corpus of conversations between college students (Computer Science majors) recorded in El Paso, Texas. Ward raises legitimate issues with the use of carefully elicited laboratory speech in prosodic research: while these approaches come from a real need to have control over the segmental string, as well as the need to elicit particular speech acts, Ward expresses doubts about how well models built on this type of research (what he calls “monologue” research) fare in the real world, stating that this type of research “risks leading us to theories and frameworks that work for monologue only, precluding the accurate modeling of dialog phenomena” (67).
However, Ward overlooks some of the research grounded in phonological approaches which makes use of more naturalistic data, particularly the body of work looking at prosody and social meaning. McLemore’s (1991) work on the use of uptalk among sorority members is one example; more recently, Podesva (2011) and Holliday (2016) used data gathered from conversations between dyads or groups of participants. These data have been successfully analyzed using phonologically driven frameworks built on elicited lab speech. Ward’s focus, by contrast, is specifically on the role that prosody plays in the interactions within the conversations themselves; this sets his work apart from some of these other studies and represents a valuable contribution to the field.
By “English,” Ward mostly means American English, specifically the variety spoken by the speakers in his main corpus. Although more detailed data about the speakers in the corpus can be found in previous publications (Ward & Werner 2013; Ward & Gallardo 2015), most of this information, unfortunately, is not available in the text. Based on the cited works, and information given in the book itself, the corpus does skew slightly male, and college-aged, but does include both mono- and bilingual speakers (mostly in Spanish), who were all “native or highly proficient” (127) speakers of English. The relative homogeneity of the corpus raises the question of generalizability, but one benefit of a computational approach is that it should be a relatively simple matter to rerun the analysis on different corpora of other varieties of English, and see if the same or a similar set of dimensions is returned by the PCA. And, in fact, Ward alludes to some success doing so on a corpus of speech from Edinburgh. Here, the data-driven approach has an advantage over more phonological approaches, where extending a model to include other varieties is not a straightforward matter. For example, ToBI annotation systems are, unlike transcription systems like the International Phonetic Alphabet, more explicitly phonological in nature, and need to be tailored for each specific linguistic variety. This specificity means that there are ongoing discussions, for example, about whether, and to what extent, the Mainstream American English (MAE) ToBI system needs to be altered to describe the intonation system of African American Language (AAL) (see, e.g., Gooden 2009; Holliday & McLarty 2018), complicating the ability to directly compare MAE and AAL prosody.
The last of the three components, “prosodic patterns,” or in the book, constructions, is the one likely to be the most controversial. Ward’s approach to finding these constructions was rooted in an iterative, data-informed methodology, combining native speaker intuitions and observations with a PCA. The measurements for the PCA were taken using a moving window: for example, f0 was measured at a certain time point, along with f0 at six time points earlier, and five time points later; f0 would then be taken at the next time point, along with the other (now readjusted based on the new time point) eleven points. The measurements were taken disregarding things like utterance ends, turn-switches, stress status of the syllable, and even the segmental string. This approach has some benefits, as it allows for the identification of phenomena like the prosodic features of backchannel cues in terms of what both Speaker A and Speaker B are doing: we can “see” Speaker A dropping in pitch before Speaker B produces a backchannel with a narrow f0 range. But it means that what we have here is, ultimately, an aphonological approach to prosody.
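The moving-window sampling described above can be sketched as follows. This is an illustrative reconstruction in Python with NumPy, not Ward’s implementation: the function name `windowed_features`, the evenly spaced toy f0 values, and the side-by-side stacking of the two speakers’ windows are all assumptions made for the example. Only the window shape (six points earlier, the current point, five points later) comes from the description in the text.

```python
import numpy as np

def windowed_features(track, n_before=6, n_after=5):
    """Return one row per usable time point in `track`.

    Each row holds the value at that time point together with the
    `n_before` earlier values and the `n_after` later values, ordered
    from earliest to latest. Time points too close to either edge of
    the track to fill a whole window are skipped.
    """
    track = np.asarray(track, dtype=float)
    width = n_before + 1 + n_after
    n_rows = len(track) - width + 1
    if n_rows <= 0:
        raise ValueError("track is shorter than one window")
    starts = np.arange(n_rows)[:, None]   # first index of each window
    offsets = np.arange(width)[None, :]   # position within the window
    return track[starts + offsets]

# Toy f0 tracks for two speakers (made-up values, one per time point).
f0_speaker_a = np.arange(100.0, 120.0)    # 20 time points
f0_speaker_b = np.arange(200.0, 220.0)

features_a = windowed_features(f0_speaker_a)
features_b = windowed_features(f0_speaker_b)

# Hypothetical layout: both speakers' windows side by side, so that each
# row describes what A and B are doing around the same time point.
features_both = np.hstack([features_a, features_b])

print(features_a.shape)      # rows = usable time points, columns = 12
print(features_both.shape)   # same rows, 24 columns
```

Because consecutive rows differ only by a one-step shift, nothing in such a matrix marks utterance ends, turn switches, or syllable stress. That is the sense in which the measurements are taken without regard to linguistic structure.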
This approach means that Ward’s constructions end up crosscutting the descriptive categories of other systems, ranging from ones that are roughly analogous to specific ToBI or British School tunes (the “minor third” construction), to essentially any rise in pitch following a stressed syllable (the “late peak” construction), to things like turn changes. Ward says that he is only interested in presenting the “facts” of prosody in English conversation, but in setting up all of the previous as equal in status (all are “prosodic constructions,” and can be present to a greater or lesser degree), his approach is not quite the view from nowhere he wants it to be. A researcher who is interested in sorting out the linguistic, sociolinguistic, and paralinguistic functions of pitch, intensity, and duration would likely be frustrated by this analysis. What remains unexplored in his book is both the role and the nature of the phonological system (if there is one) underlying his patterns.
However, other prosodic researchers coming from more phonological schools, particularly those interested in intra- and interspeaker variation, have begun using similar techniques, including PCA, to take a closer look at specific nuclear contours (phrase-final melodies, which Ward somewhat disparagingly calls “cute creatures of the traditional little menagerie” [66]). One example is Lohfink, Katsika, and Arvaniti (2019). In this study, the authors performed a functional PCA to describe the f0 curves of a set of utterances, which both confirmed and challenged predictions from past intonational phonology work. The tools outlined by Ward can be used to explore prosodic phonology, following in the more general tradition of laboratory phonology.
There are a number of online supplemental materials, including recordings of all of the examples in the book, of which there are a great number (a plus for a book on prosody). The official Cambridge site is a bit clunky; the author’s own site also has the audio recordings and is much easier to use. There are also lesson plans for teaching specific constructions. Although Ward states that he wants the book to be accessible to non-specialists, including ESL teachers, the book is not well organized for use as a reference manual. The lesson plans, however, may be useful for this audience.
In the end, the book will be, most likely, of interest to a narrower audience: those interested in computational techniques in prosodic research. Even if one disagrees with how Ward uses these techniques and interprets the findings, the book invites the reader to try them out for themselves and to seriously consider how best to account for variation in prosody.
