Abstract
The fields of language production and verbal memory have relatively little contact. I argue that utterance planning for language production has substantial memory maintenance demands and that utterance planning provides the maintenance and ordering processes for short-term verbal memory tasks. There has already been some movement toward this view. I discuss benefits to pursuing these links more fully.
Behind every action, there’s a plan. This was one of Lashley’s (1951) seminal observations about motor behavior: that an internal plan controls the sequence of movements to be executed. Plans are a temporary “memory for what is to come” (Rosenbaum, Cohen, Jax, Weiss, & van der Wel, 2007, p. 528). This perspective sees temporary maintenance and ordering not as part of a dedicated temporary storage system but as a constellation of skills, shaped by long-term knowledge and deployed in the service of action and other goals (Crowder, 1993; Kolers & Roediger, 1984; Postle, 2006). Here, I consider action plans for the goal of speaking (utterance plans) and their role in verbal memory tasks such as serial recall of word lists. Although speaking in conversations and recalling in memory experiments initially seem quite different, and they are studied by distinct groups of researchers, I suggest that there is much to be gained in both memory and language-production research by considering the degree to which they share basic processes.
Utterance Planning in Speaking and Remembering
Utterance planning for speaking requires serial ordering to assemble words into sentences and to assemble subword units such as syllables and phonemes. These processes rely on long-term linguistic knowledge, which has sometimes led researchers to a view that ordering processes are largely “automatic,” guided by well-worn routines for sentence and word assembly (see Bock, 1982, for review). By contrast, researchers studying memory tasks have emphasized the tasks’ temporary aspect, as in Baddeley’s (1986) extremely influential working memory model, which places temporary verbal maintenance and ordering outside of the language system and in the phonological loop, a dedicated short-term storage component. With no ready path for long-term knowledge to influence maintenance and ordering (although other researchers have offered mechanisms to provide this link; e.g., Allen & Hulme, 2006), this dedicated-store perspective emphasizes the substantial attention and executive-control resources that are necessary to produce behavior (recall) in the absence of guidance from long-term knowledge (Shipstead, Lindsey, Marshall, & Engle, 2014). I will make two related arguments: first, that both everyday speaking and recall in a verbal memory experiment rely on both long-term memory and effortful attention for ordering and maintenance of what is to be recalled/spoken, and second, that utterance plans—the memory of what is to come in speaking—are also the arena of serial order and maintenance in verbal memory tasks, obviating the need for a dedicated short-term verbal store.
The degree to which these claims are controversial depends on the speaking and memory tasks under discussion. Consider the repetition of a coherent sentence, which occasionally happens in conversations when we repeat what someone just said. Sentence repetition is also a common immediate verbal memory task, especially for children. While there was once controversy about the temporary-memory versus long-term basis of task performance here, researchers now recognize that sentence repetition, whether in conversation or memory experiments, is handled by the language system (Klem et al., 2015; Lombardi & Potter, 1992): People comprehend the sentence and generate an utterance plan to reproduce it, guided by long-term knowledge of words and word orders. The central role of long-term knowledge and utterance planning in recall is also relatively uncontroversial at a much lower level of speaking and recall, the serial ordering of phonemes and syllables within words and nonwords. While Baddeley (1986) viewed phonological maintenance as the purview of the language-independent phonological loop, there is now overwhelming evidence that phonological ordering and maintenance in memory tasks is guided by the same long-term knowledge that shapes word production in everyday speaking (Acheson & MacDonald, 2009; Allen & Hulme, 2006; Jones, Macken, & Nicholls, 2004; Page, Madge, Cumming, & Norris, 2007). Indeed, Page et al. pointed to language production, specifically a “lexical-level utterance plan” (p. 61), as the source of phonological maintenance and ordering.
If utterance plans maintain whole sentences and subword units in memory tasks, where is the disconnect between temporary verbal memory and language-production research? It is in serial-recall tasks, where participants must recall unrelated words in the order presented. Here, the random word order of lists can seem wholly unlike knowledge-guided word ordering in sentences, leading many researchers to view list recall—the ordering of the words themselves—as supported by language-independent maintenance and ordering mechanisms (e.g., Page et al., 2007).
This position seems to me to be a curious state of affairs: Recall of whole sentences and within-word units is accomplished via utterance planning, but ordering of the words themselves is accomplished elsewhere. My colleagues and I have argued instead that utterance planning underlies maintenance and ordering for all verbal memory tasks, including the ordering of words in lists (Acheson & MacDonald, 2009). Figure 1 illustrates this view, showing simplified utterance plans for speaking a sentence and recalling a word list. These action plans can be considered activated portions of long-term memory under the focus of attention; this is Cowan’s (2005) description of working memory, but it also applies to speaking. The figure shows that the utterance plans begin differently but end with phonological encoding (maintenance and ordering) of words to be recalled and spoken.

Fig. 1. Simplified utterance plans for speaking (left) and serial recall (right). In the plan for speaking, the middle phase shows hierarchical sentence (S) order (NP = noun phrase, VP = verb phrase, PP = prepositional phrase). Word ordering in recall is shown as a flat structure for simplicity, but see the text for evidence for other structures in recall. The bottom portion of both plans shows phonological information; characters within each syllable (σ) are International Phonetic Alphabet codes for the phonemes in the words cats and sleeping. Long-term memory includes a semantic network (illustrated as individual words for simplicity), with some words activated or partially activated. Hierarchical procedural long-term knowledge for assembling sentences and words is shown at the bottom, and sequential knowledge is shown as x→y, reflecting transition-probability knowledge across many levels, including words and phonemes.
The crux of the issue is illustrated in the middle of the figure. Sentence planning is guided by long-term knowledge of word meanings, their co-occurrences, and the procedural knowledge of the grammatical constraints in the language. Attention is needed to overcome tendencies to reproduce past sequences and instead plan a message-appropriate utterance. List ordering is also accomplished via utterance planning, with even more need for attention to overcome long-term knowledge about word ordering in sentences, so as to produce the arbitrary order of a list. Those demands are exacerbated in memory studies in which the same words are used across multiple lists (Fischer-Baum & McCloskey, 2015), because the tendency to reuse recent utterance plans (often termed syntactic priming; see MacDonald, 2013, for review and nonlinguistic variants) must be overcome for each newly ordered list. Everyday speaking and list recall are therefore on a continuum of guidance from long-term knowledge and attention-demanding planning of novel sequences violating past habits. Support for this view comes from evidence that both speaking and recall are affected by long-term lexical information (Allen & Hulme, 2006) and syntactic priming/long-term serial-order information (Botvinick & Bylsma, 2005). This approach is also consistent with Gupta’s (2009) computational simulation of both phonological ordering (utterance planning) and word ordering in recall within the same computational architecture. In the next section, I review some other similarities and differences in recall and speaking that bear on the role of utterance planning.
Task Demands in Speaking and Recalling
A critical difference between speaking and recall is that speech typically begins with a conceptually coherent message, not a random list. A coherent message activates well-practiced abstract conceptual representations, such as animate entities doing an activity at a location, which activate procedural knowledge for assembling sentences and phrases, helping to hold the words in order (Dell, Oppenheim, & Kittredge, 2008). This conceptual knowledge–word order link is reminiscent of a lexical semantics–phonological ordering link termed the semantic-binding hypothesis (Patterson, Graham, & Hodges, 1994), the idea that lexical semantics supports phonological maintenance and ordering. There are parallel results from patients with language impairments at both the phonological and word levels: Disruption of lexical semantics increases phonological errors in recall and speaking (Patterson et al., 1994), and disruption of the message yields word-ordering errors (Dell & Chang, 2014).
The existence of external stimuli initiating a memory trial might seem an important difference from spontaneous speaking, but in fact speaking also may begin with external input (e.g., seeing cats on the sofa). External signals to begin recall are also not a major difference from speaking because speakers also may experience “delay intervals” while waiting their turn in conversation, continuing to comprehend/encode another’s speech while maintaining their own utterance plan. A much more important distinction is that serial recall aims to reproduce the list exactly as presented, but everyday speaking has flexibility in both word choice and word order. For example, in Figure 1, speakers could say either “sofa” or “couch,” and both “The cats are. . .” and “There are cats. . .” are viable word-order options. This additional flexibility is critical in comparing speaking and recall in the next section.
Memory Effects in Speaking and List Recall
Table 1 shows several results from immediate-serial-recall tasks that served as benchmarks for Botvinick and Plaut’s (2006) computational modeling of immediate serial recall, together with parallels in speaking.
Table 1. Serial Ordering Phenomena in Immediate Serial Recall and Speaking
Note: See Botvinick and Plaut (2006) for a discussion of phenomena in immediate serial recall, including a review of empirical studies reporting these effects.
The table first shows some effects of interword similarity. Phonological similarity disrupts both speaking and memory (Allen & Hulme, 2006), but parallels at the word-ordering level are much less appreciated, owing to the task differences in speaking and recall. Word-level similarity in serial recall includes positional similarity—words in adjacent list positions are more similar to one another, and they are more likely to exchange than more distant words (Botvinick & Plaut, 2006). In speaking, the conceptual message and procedural knowledge guide word order, such that similar words are ones that share conceptual-grammatical properties. These similarity gradients affect exchanges in speaking: Nouns are conceptually and grammatically similar to other nouns, and they are more likely to exchange with other nouns than with other word types (Dell et al., 2008). These parallel similarity results suggest that the same ordering mechanism is behind word ordering in each case but that it operates over different kinds of similarity, owing to different task demands for speaking and recall.
Similarity has other effects on speaking, and speakers’ additional flexibility may obscure related effects in speaking and recall. For example, the sentence “The woman the girl kissed is tall” has two semantically similar nouns (woman and girl) in close proximity, which increases disfluencies (Smith & Wheeldon, 2004). Even if speakers do not make an error, serial ordering can be affected by this semantic overlap, with people taking advantage of speaking’s flexibility to avoid a difficult sequence of similar words. My colleagues and I (Gennari, Mirković, & MacDonald, 2012) manipulated the similarity of pictured entities (e.g., woman and girl) in scenes that participants described. We found correlations between similarity and word order. First, the more similar the two entities, the more often speakers described the scenes with word orders that placed the entities far apart (“The woman who’s being kissed by the girl . . .”). Second, the higher the similarity, the more often speakers produced utterances that left out “by the girl,” simply saying “The woman who’s being kissed.” Thus, speakers are more likely to omit optional words under high-similarity conditions than under low-similarity conditions. Similarity-based interference effects also appear in sentence recall (V. S. Ferreira & Firato, 2002) and in speech corpora (Hsiao, Gao, & MacDonald, 2014). These results show that if we recognize the additional flexibility in speaking compared to recall, we can see similarity-based interference effects in both tasks.
Next in Table 1 are primacy effects, the superior recall of words in early list positions, typically attributed to stronger encoding of early words than later ones. Language-production research gives no reason to doubt this interpretation, but it offers an intriguing additional point—that serial order in speaking also varies with the strength of representations: More strongly represented words are ordered earlier in the utterance plan (Bock, 1982). Given this well-learned association between representation strength and serial order in speaking, it is possible that primacy effects in lists may be due in part to strong encoding being a cue for early position placement in list recall. Manipulations of encoding or attention during planning and speaking (e.g., Nozari & Dell, 2012) may shed light on these phenomena. Many computational accounts of immediate serial recall use activation-based ordering mechanisms (Botvinick & Plaut, 2006; Hurlstone, Hitch, & Baddeley, 2014), but despite widespread recognition of activation effects on word order in speaking (Bock, 1982), there are relatively few activation-based computational accounts of these phenomena (though see Chang, 2009).
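The link between representation strength and serial position can be made concrete with a toy sketch. It is not an implementation of any of the models cited above; it only assumes the simplest activation-based scheme, in which the most active item in a plan is produced first and is then suppressed so that it is not produced again. The function name and the activation values are hypothetical and chosen purely for illustration.

```python
def order_by_activation(activations):
    """Return items in production order under a toy activation-based scheme.

    Repeatedly select the most active remaining item, output it, and then
    remove it from the competition (a stand-in for post-output suppression).
    """
    remaining = dict(activations)  # copy so the caller's dict is untouched
    produced = []
    while remaining:
        winner = max(remaining, key=remaining.get)  # strongest item wins
        produced.append(winner)
        del remaining[winner]  # suppressed: cannot be selected again
    return produced


# Hypothetical activation values; a stronger representation means earlier output.
plan = {"cats": 0.9, "sleeping": 0.6, "sofa": 0.3}
print(order_by_activation(plan))  # ['cats', 'sleeping', 'sofa']
```

In this sketch, raising the activation of "sofa" above that of "cats" would move it to the front of the output, which is the sense in which strength of encoding could act as a cue to early position.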
The next rows in Table 1 reveal some task differences and gaps in the available data. Recency effects are typically thought to emerge from an ephemeral sensory store, which likely has a closer relationship to language comprehension than to utterance planning. Word exchanges are more common than repetitions in recall, but I know of no published comparisons of whole-word repetitions (anticipations and perseverations) versus whole-word exchanges in speech errors. Nonetheless, both fields contain discussions of the need to include inhibition mechanisms to prevent repetitions (Botvinick & Plaut, 2006; Dell et al., 2008). Indeed, Dell et al. argued that the computational mechanisms used to avoid repetitions also control ordering in the face of similarity-based interference. A comparison of computational approaches, across recall and speaking and across ordering phenomena in Table 1, should be highly informative.
The list-length effect on recall in the table also has parallels in speaking, although the coherent message helps keep speakers on track. Longer and/or more complex utterances do yield more errors (F. Ferreira & Swets, 2005), but relevant data are sparse. Most collections of speech errors, for example, do not report enough sentence context to investigate whether speech errors increase with broader planning demands, such as in long or complex sentences.
The items in Table 1 suggest that recall and everyday speaking not only vary along a continuum of attention versus long-term memory influences but also fall on a related continuum of representing adjacency between words versus more abstract positional information. At first glance, there does not seem to be much of a continuum: The very name serial recall emphasizes sequential representations, whereas language researchers view long-term syntactic knowledge as underlying a hierarchical plan, not a simple chain of words. However, serial recall is only partially guided by adjacency information (Fischer-Baum & McCloskey, 2015), and even recurrent (highly sequential) recall models capture more abstract positional information (Botvinick & Plaut, 2006). Meanwhile, hierarchical plans for speaking necessarily also represent sequential information, and there is increasing evidence for substantial effects of long-term serial-order knowledge (e.g., word-transition probabilities) on speaking (e.g., Bell, Brenier, Gregory, Girand, & Jurafsky, 2009). A greater appreciation is needed that both speaking and recall share both sequential and hierarchical representations, and that both draw on long-term learning and on attention to overcome this learning when needed.
Wider Implications
Domain generality
An important debate in studies of cognition concerns the domain generality of processes: To what extent are abilities like attention or cognitive control specific to certain tasks versus available across cognitive domains? Claims for an utterance-planning basis of ordering and maintenance for both speaking and memory tasks might appear to be a strongly domain-specific (language-based) approach to serial order, but that conclusion may be premature. I have previously noted (MacDonald, 2013) that word-ordering processes in utterance planning have parallels in nonlinguistic action planning, leaving open the possibility that domain-general action-planning mechanisms operate on domain-specific linguistic representations both for speaking and in memory tasks. Evidence that nonlinguistic motor activities interfere with both memory (Kozlov, Hughes, & Jones, 2012) and speaking (Boiteau, Malone, Peters, & Almor, 2014) lends support to this idea. Recent research in other areas argues against categorizing processes as strictly domain general or domain specific (e.g., Behrmann & Plaut, 2013), and a mix of language-specific and more general action-based ordering (or attention) processes may be useful in accounting for correlations between verbal and nonverbal memory tasks (e.g., Shipstead et al., 2014). If so, computational accounts would be essential to identify a balance between more domain-specific and more domain-general components.
Verbal working memory assessments and individual differences
Verbal working memory tests abound in studies of typical and atypical child development, young adults, cognitive aging, and patients with brain injury. Some of this research continues to view temporary memory as a dedicated entity that is “used” in skills such as language comprehension, decades after Crowder (1993) derided this approach as “antiquated . . . and downright quaint” (p. 143). The research reviewed here piles on more evidence against a dedicated temporary store: Long-term linguistic representations, such as a word’s frequency and grammatical and semantic properties, affect both speaking and memory performance (and comprehension/encoding; MacDonald & Christiansen, 2002). Language skill and long-term linguistic representations vary with experience, including experience with word orders. It might seem implausible that individual differences in production (utterance-planning) experience could vary enough to yield individual differences in temporary verbal memory tasks, but in fact there are strong interconnections between different types of language experience that could affect utterance planning. Variation in amount of reading, for example, affects utterance planning and spoken production in both children and adults (Montag & MacDonald, 2015). More generally, results such as these suggest that individual differences in tasks purported to assess temporary memory and attention actually have substantial components of language experience and long-term memory. Indeed, a number of tasks that used to be described as verbal working memory assessments are now viewed as tests of language experience and skill (Edwards, Beckman, & Munson, 2004; Klem et al., 2015; MacDonald & Christiansen, 2002).
Language comprehension
This article has discussed maintenance and ordering during utterance planning, but there is also a literature on the maintenance and ordering demands for language comprehension, with a similar emphasis on long-term experience in temporary memory tasks (MacDonald & Christiansen, 2002). There are currently interesting developments in language-comprehension research that may offer new perspectives on the comprehension–production–serial order dynamic. Several researchers have suggested that during comprehension, people engage in covert utterance planning to help them predict what others are going to say, because generating partial predictions helps speed the comprehension process (see Dell & Chang, 2014). The predictions aren’t necessarily conscious or so detailed as to specify exact words, but the point for our purposes is that comprehenders may be using the serial ordering of utterance planning not only to plan their own speech but also to generate predictions and a temporary memory for what is to come . . . out of someone else’s mouth. If so, the production–maintenance–serial order linkages described here may extend well beyond the planning of speaking and recall.
Acknowledgements
Thanks to Mark Seidenberg, Gary Dell, and two anonymous reviewers for helpful comments on an earlier draft of this manuscript.
Declaration of Conflicting Interests
The author declared no conflicts of interest with respect to the authorship or the publication of this article.
Funding
This work was supported by grants from the National Science Foundation (BCS 1123788) and the Wisconsin Alumni Research Fund.
