Abstract
Research shows that cross-linguistically, subject–verb agreement with complex noun phrases (e.g., The label on the bottles) is influenced by notional number and the presence of homophony in case, gender, or number morphology. Less well understood is whether notional number and morphophonology interact during speech production, and whether the relative impact of these two factors is influenced by working memory capacity. Using an auditory sentence completion task, we investigated the impact of notional number and morphophonology on agreement with complex subject noun phrases in Dutch. Results revealed main effects of notional number and morphophonology. Critically, there was also an interaction between morphophonology and notional number because participants showed greater notional effects when the determiners were homophonous and morphophonologically ambiguous. Furthermore, participants with higher working memory scores made fewer agreement errors when the subject noun phrase contained homophonous determiners, and this effect was greater when the subject noun phrase was notionally singular. These findings support the hypothesis that cue-based retrieval plays a role in agreement production, and suggest that the ability to correctly assign subject–verb agreement, especially in the presence of homophonous determiners, is modulated by working memory capacity.
Introduction
Fluent speakers of a language often feel like language proceeds effortlessly. In fact, the very term “fluent” suggests that speakers are able to put together sentences without intrusive disfluencies or repairs. However, in spoken language, words are produced incrementally—with speakers articulating only one word at a time—even though the conceptual planning of speech is much less linear. The question of how speakers manage to build structures and create grammatical strings has been a long-standing one in psycholinguistic research (Bock & Cutting, 1992). For example, research has focused on which aspects of sentence production are largely automatic, and which depend, at least in part, on working memory (Kemper, Herman, & Lian, 2003; MacDonald, 2016; Slevc, 2011).
The computation of subject–verb agreement is especially well-suited to address the role of working memory in sentence production, since there is a reliable dependency between the number marking on the subject head noun and the verb, which can be used to measure deviations in the syntactic production process. In sentences that contain multiple nouns, each noun is a possible agreement controller. This creates opportunities for speakers to “lose track” of the subject head noun, leading to agreement errors such as agreement attraction, which occurs when the verb of a sentence agrees with the grammatical features of a “local,” or non-subject, noun (Bock, Eberhard, Cutting, Meyer, & Schriefers, 2001). Under working memory models, constraints on working memory should have the greatest impact when a noun’s morphophonology, or the phonological realisation of morphological features, is ambiguous because such ambiguity should increase the chances of retrieving the incorrect subject head noun, resulting in more agreement attraction errors (Badecker & Kuminiak, 2007). Indeed, morphophonology has been shown to affect agreement in many languages, including Dutch (Hartsuiker, Schriefers, Bock, & Kikstra, 2003), French, Italian, and Spanish (Franck, Vigliocco, Antón-Méndez, Collina, & Frauenfelder, 2008), Russian (Lorimor, Bock, Zalkind, Sheyman, & Beard, 2008), and Slovak (Badecker & Kuminiak, 2007).
Conflict between notional number representations and grammatical number valuations is also a source of agreement variability, as noun phrases (NPs) that are grammatically singular—but that refer to more than one thing (e.g., “the label on the bottles”)—are more likely to take plural agreement than notionally singular NPs (Vigliocco, Hartsuiker, Jarema, & Kolk, 1996). While notional agreement is sometimes a grammatical option (Haskell & MacDonald, 2003), it still reflects agreement with something other than the grammatical number of the head noun. However, no models of agreement currently exist that simultaneously account for working memory and notional number effects on agreement, although, a priori, participants with greater working memory skills should have better success at keeping track of subject head nouns overall (Hartsuiker & Barkhuysen, 2006). Furthermore, no studies to date have investigated morphophonology, notional number, and working memory at the same time, even though understanding how working memory interacts with morphophonology and notional number in tandem provides more insight than studying each of these components separately. As long as we continue to study these factors in isolation, we run the risk of advancing models that only capture pieces of the larger mechanisms, rather than gaining insight into all of the mechanisms involved in agreement production. In this study, we use an auditory agreement production task in Dutch that manipulates notional number and morphophonology while measuring participants’ working memory abilities, to determine whether working memory affects agreement with all types of sentences equally, or whether there are differential effects of working memory that are modulated by morphophonological ambiguity or variations in notional number.
Morphophonological and notional number effects on agreement
Previous work has investigated morphophonology and notional number, and whether or not they interact with each other, but without including tests of working memory. Antón-Méndez and Hartsuiker (2010) investigated rates of agreement attraction in Dutch, depending on the morphophonology and whether the subject NP referred to one thing (single-token) or many things (multiple-token). Their morphophonological manipulation was based on the Dutch determiners de and het. The determiner de is ambiguous for number because it is used for both common-gender singular nouns and all plural nouns. In contrast, the other determiner, het, which is used with singular neuter nouns, is not ambiguous for number. Therefore, in the items with number ambiguity, the singular determiner on the head noun and the plural determiner on the local noun would have the same form (de). Sample items from Antón-Méndez and Hartsuiker are shown in 1a-d:
(1a) De buurt met de fietsroutes (de-de; ambiguous; single token) “The neighborhood with the bike routes”
(1b) Het dorp met de fietsroutes (het-de; unambiguous; single token) “The village with the bike routes”
(1c) De dop op de flessen (de-de; ambiguous; multiple token) “The cap on the bottles”
(1d) Het etiket op de flessen (het-de; unambiguous; multiple token) “The label on the bottles”
Antón-Méndez and Hartsuiker’s goal was to determine whether there was an interaction between morphophonology and notional number during agreement production. They argued that a significant interaction between these two variables would provide evidence for interactive, rather than serial, models of language production. However, they did not find an interaction between morphophonology and notional number, although they found main effects of each: there was more plural agreement with de-de items and with multiple-token items. Furthermore, in discussing their findings, they outlined the possibility that some types of interactions between morphophonology and notional number could be explained not only by interactivity but also by monitoring effects, such as those predicted by working memory models (Badecker & Kuminiak, 2007). Given the renewed interest in how working memory should affect agreement, especially with regard to morphophonological effects on cue-based retrieval (Lago, Shalom, Sigman, Lau, & Phillips, 2015; Lorimor, Jackson, & Foote, 2015), we wanted to test for effects of morphophonology and notional number on agreement, and to add an independent measure of working memory, to directly test working memory models of agreement production.
Working memory
If working memory does affect agreement, an important question is whether it affects all instances of agreement equally, or whether it exerts greater effects in the presence of notional number conflicts or morphophonological ambiguity. One early study on agreement attraction that examined the role of working memory was Bock and Cutting (1992), which measured how often participants agreed with the plural features on the non-subject nouns, depending on whether the noun was embedded within a phrasal modifier (2a) or a clausal modifier (2b):
(2a) The demo tape from the popular rock singers . . . (phrasal)
(2b) The demo tape that promoted the rock singers . . . (clausal)
They found that agreement attraction was more common with phrasal modifiers than with clausal modifiers, suggesting that syntactic structure played an important role in whether the features on a local noun would interfere in the normal subject–verb agreement processes and lead to attraction. To investigate the role of working memory, Bock and Cutting conducted speaking-span tasks and found a significant correlation between speaking-span scores and agreement attraction in only one of their three experiments. Therefore, while they could not conclude that working memory played no role in agreement attraction, Bock and Cutting argued that working memory does not explain the majority of agreement attraction errors in English and that agreement is more of an automatic process that takes place during syntactic structure building.
Further work has shown that working memory effects on agreement may appear when secondary tasks are added (Fayol, Largy, & Lemaire, 1994) and in special populations, like persons with aphasia (Slevc & Martin, 2016). For instance, Slevc and Martin (2016) found that persons with aphasia who also exhibited working memory deficits exhibited higher rates of agreement attraction than participants in a control group. In bilingual and monolingual children, Veenstra, Antoniou, Katsos, and Kissine (2017) similarly found significant relationships between participants’ rates of agreement attraction and scores on some working memory tasks.
Hartsuiker and Barkhuysen (2006) added a secondary task to increase memory load when investigating agreement attraction among nonimpaired college-aged adults, while also examining whether working memory effects were more evident when there was a conflict between notional and grammatical number. Using an auditory sentence completion task in Dutch, they manipulated working memory by giving half of the participants three words that they needed to recall after each trial, thereby increasing the overall memory load of the task. In a separate measure, they also measured participants’ speaking spans, using a task modelled on Daneman and Green (1986). Hartsuiker and Barkhuysen found an overall effect of notional number, in that participants produced more plural verb agreement with the multiple-token (notionally plural) items than with single-token items. They also found higher rates of agreement errors among the participants with low speaking spans, but only when those participants were under higher memory load. However, they found no interaction between memory load and notional number, suggesting that notional number agreement is not caused by a lack of working memory resources and that notional number agreement is, to some degree, separable from agreement attraction. Hartsuiker and Barkhuysen, however, did not manipulate morphophonology, and instead only used de-de items (determiners that are ambiguous for number information). Therefore, while their study provided important insights into the separability of notional effects from the role of working memory, questions remain about whether morphophonology interacts with working memory, or whether there might be a three-way interaction between working memory, morphophonology, and notional number.
Separable components of the agreement process
The separation of notional number agreement from agreement attraction, which is supported by the lack of an interaction between notional number and working memory load in Hartsuiker and Barkhuysen (2006), is codified in the Marking and Morphing model (Eberhard, Cutting, & Bock, 2005). Within Marking and Morphing, two sources of information (notional number information from the message and grammatical number information from the individual nouns and determiners) jointly set the probability of either singular or plural agreement during subject encoding. The grammatical number computation differentially weights elements in the subject NP, so that the number on the head noun has the greatest weight, but number information on determiners or on embedded nouns can also affect the probability of plural agreement, although to a lesser extent than the number information on the head noun. When agreement errors are primarily caused by grammatical number, rather than notional number, this is called agreement “attraction.”
Marking and Morphing includes in its grammatical number computation a mechanism to account for morphophonological effects that relies on the ambiguity of number cues. In Marking and Morphing, items like the Dutch determiner het, which are unambiguously singular, obtain a number value of −1. Items that are unambiguously plural obtain a number value of +1. Items that are ambiguous for number (like the Dutch determiner de) have a number value of zero. Therefore, any unambiguously singular items will lead to a higher probability of singular agreement and therefore less agreement attraction, while number-ambiguous items will not (see Antón-Méndez & Hartsuiker, 2010, for an implementation).
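The number values described above can be illustrated with a small sketch. The values for het (−1) and de (0) follow the description in the text, but the weights, the function name, and the simple weighted sum are assumptions made purely for illustration; they are not the parameter estimates or the equations of Marking and Morphing (Eberhard et al., 2005).

```python
# Number values as described in the text: unambiguously singular = -1,
# unambiguously plural = +1, ambiguous for number = 0.
DETERMINER_VALUE = {
    "het": -1,  # only used with singular neuter nouns
    "de": 0,    # common-gender singular or any plural, hence ambiguous
}

# Hypothetical weights (assumption): elements attached to the head noun
# count more than elements attached to the local noun.
HEAD_WEIGHT = 1.0
LOCAL_WEIGHT = 0.5


def determiner_contribution(head_det, local_det):
    """Toy weighted sum of determiner number values for a complex NP."""
    return (HEAD_WEIGHT * DETERMINER_VALUE[head_det]
            + LOCAL_WEIGHT * DETERMINER_VALUE[local_det])


het_de = determiner_contribution("het", "de")  # e.g., "Het etiket op de flessen"
de_de = determiner_contribution("de", "de")    # e.g., "De dop op de flessen"
print(het_de, de_de)
```

In this toy version, the het-de preamble receives a negative (singular-leaning) contribution from its determiners, whereas the de-de preamble receives none, which mirrors the qualitative prediction of less attraction with het-de items.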
A third component, beyond notional number agreement and agreement attraction, has also been proposed to account for certain types of agreement errors stemming from mis-retrieval of the subject head noun (Lorimor et al., 2015). The basic idea is similar to that of a working memory model of agreement (Badecker & Kuminiak, 2007), in which there is an additional process of subject retrieval at the point of verb production (e.g., cue-based retrieval) that might incorrectly locate the subject head noun when the form of the verb is being planned. This mechanism is similar to proposals for cue-based retrieval in agreement comprehension (Schlueter, Williams, & Lau, 2018), in which participants launch a search for the subject head noun upon encountering a verb, the success of which can be lessened by similarity-based interference (Villata, Tabor, & Franck, 2018). This retrieval component of agreement may be relevant in accounting for morphophonological effects in agreement, above and beyond those predicted by the Marking and Morphing model (Eberhard et al., 2005), and may also be a natural place to look for effects of working memory in agreement production.
Present study
Previous studies on morphophonology in Dutch have explained the effect of grammatical gender based on number ambiguity (Hartsuiker et al., 2003), which is also the explanation given by the Marking and Morphing model (Eberhard et al., 2005). An alternate consideration is that, when the determiners on the head and local noun are ambiguous for number, they also have the same form. When the determiners have the same form, retrieval of the subject head noun should be more difficult, due to increased similarity-based interference, and effects of working memory should be evident. Therefore, we examined agreement attraction in Dutch, using the stimuli from Antón-Méndez and Hartsuiker (2010). Our study differed from Antón-Méndez and Hartsuiker in four main ways. First, we included a larger number of participants to increase statistical power. Second, we used an auditory oral sentence completion task (like Hartsuiker & Barkhuysen, 2006) instead of a visual completion task. Third, because we were looking specifically at the role of working memory, we included items with scorable disfluencies in our analyses, whereas Antón-Méndez and Hartsuiker treated all disfluent responses as miscellaneous items. This is because we anticipated that working memory retrieval effects would be more common among participants with lower working memory spans, and that failures in working memory might lead to disfluencies, so if we excluded disfluent items, we would be eliminating an important set of data. Finally—and most importantly—we added a Dutch version of an operation span (OSpan) task (Turner & Engle, 1989) to obtain an objective measure of participants’ working memory capacity, which we included as a continuous variable in our statistical analysis. Our main goal was to determine whether working memory affects agreement in Dutch, and if so, whether there were interactions between working memory, notional number, and morphophonology.
In terms of morphophonological effects, Antón-Méndez and Hartsuiker (2010) and Hartsuiker et al. (2003) have already shown that agreement errors are more likely on de-de items, compared to het-de items. This effect can be explained in two ways. First, the number ambiguity on the de determiner could lead to more agreement attraction, as described above. Second, the fact that the determiners have the same form could make retrieval of the subject head noun more difficult, leading to less reliable agreement with the subject head noun. If we replicate the higher rate of agreement errors in de-de items, and also show that working memory differentially affects agreement with de-de items, this will provide crucial evidence that the reduction in agreement errors on het-de items is due to the presence of additional retrieval cues on the determiners, and not solely to the mechanisms for morphophonology described by Marking and Morphing. Furthermore, by investigating notional effects at the same time as morphophonology and working memory, we can gain a more complete picture of all of the processes involved in shaping agreement.
Method
Participants
In total, 55 Dutch native speakers in the Netherlands participated in the experiment. Due to technical difficulties, responses from one participant were not recorded. An additional three participants were excluded because they answered fewer than 65% of the math problems on the OSpan task correctly, suggesting they may have privileged remembering the target words over solving the math equations, rather than attending to both components of the task. All results are based on the remaining 51 participants (41 female, 10 male). The mean age of participants was 19.3 years (standard deviation [SD] = 1.7, range = 18-24 years).
Materials
The 80 experimental items were based on the sentence preambles from Antón-Méndez and Hartsuiker (2010). Minor changes were made to 16 items to avoid vocabulary overlap with other items in the task (see Supplementary Material B for a list of all experimental items). All items contained a singular head noun followed by a prepositional phrase containing a plural local noun. They varied according to whether the head noun was common gender, marked with the ambiguous determiner de, as in (1a) and (1c), or neuter gender, marked with the unambiguously singular determiner het, as in (1b) and (1d). In total, 40 items were classified as having a single referent (i.e., notionally and grammatically singular), as in (1a) and (1b), and 40 items were classified as having a distributive referent (i.e., notionally plural but grammatically singular), as in (1c) and (1d).
The 80 complex NPs were split into two experimental lists, such that participants saw 10 items in each of the four conditions. The 40 experimental items in each list were presented along with 112 filler items in a randomised order. Filler NPs included conjoined NPs (e.g., De krant en het tijdschrift “The newspaper and the magazine”), and simple singular and simple plural NPs (e.g., Het groene gordijn “the green curtain”). Across all items, an equal number of NPs were—prescriptively speaking—grammatically plural and grammatically singular. All experimental and filler NPs were recorded by a female Dutch native speaker.
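The list structure described above can be sketched as follows. The item labels are placeholders, and the rule used to assign items to lists (alternating within each condition) is an assumption; the text specifies only the resulting counts, not how the assignment was made.

```python
from collections import Counter
from itertools import product

# The four conditions: morphophonology crossed with notionality.
CONDITIONS = list(product(["de-de", "het-de"], ["single-token", "multiple-token"]))

# Placeholder item labels (hypothetical): 20 items per condition, 80 in total.
items = [(f"item_{c}_{i}", cond)
         for c, cond in enumerate(CONDITIONS)
         for i in range(20)]

# Assumed assignment rule: alternate items within each condition between lists.
lists = {1: [], 2: []}
for cond in CONDITIONS:
    cond_items = [item for item in items if item[1] == cond]
    for idx, item in enumerate(cond_items):
        lists[1 + idx % 2].append(item)

# Each list should contain 10 items in each of the four conditions.
per_list_counts = {k: Counter(cond for _, cond in v) for k, v in lists.items()}

# Each participant hears 40 experimental items plus 112 fillers.
trials_per_list = len(lists[1]) + 112
print(per_list_counts[1], trials_per_list)
```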
Procedure
Participants were tested individually in a quiet room on a computer using E-Prime v2.0 (Psychology Software Tools, 2012), and their responses were digitally recorded. Participants received both aural and written instructions. They listened to each preamble and were instructed to repeat the preamble exactly as they heard it, and to then complete the sentence by describing where the things are (Lorimor, 2007). Participants were instructed to use the copula verb in either the present or past tense (i.e., is/zijn “is/are”; was/waren “was/were”) and to respond as quickly and fluently as possible. Prior to the experiment, participants were given several example completions (e.g., in de stad “in the city,” op de maan “on the moon”) and one complete example. Participants received feedback if they repeatedly used a verb besides the copula or if they did not specify a location in their response, but they received no feedback on their repetition of the preamble or use of singular or plural verbs in their completions. Participants completed 10 practice items at the beginning of the task.
For each item, participants saw a fixation point for 500 ms and then heard the recorded preamble. Then an exclamation point appeared on the screen, prompting participants to repeat the preamble and complete the sentence. The experimenter advanced between trials manually with a mouse click. Sessions lasted 15-20 min.
After finishing the sentence completion task, participants filled out a language background questionnaire and completed an OSpan task (Engle, 2002; Turner & Engle, 1989). In this task, participants saw a simple math equation, along with an answer to the equation, on the computer screen, and they had to decide whether the answer provided was correct or not. For half of the items, the provided answer was correct, and for half of the items the provided answer was incorrect. This equation appeared for 3,750 ms or until the participant responded. Then a Dutch word appeared on the screen for 1,250 ms and participants were instructed to remember this target word. After a set number of equation–word pairs, the word RECALL appeared on the screen, and participants were prompted to type in as many of the target words as they could remember. The number of equation–word pairs in each set increased from two to six pairs as the experiment progressed, with three sets per level.
Scoring
Responses on the production task were transcribed and coded as singular, plural, or miscellaneous. Miscellaneous responses included instances where the participants did not correctly repeat the preamble or failed to complete the sentence. All singular and plural responses were additionally coded as fluent or disfluent. Fluent utterances consisted of responses in which participants produced the preamble and the verb without any hesitations, filled pauses or repetitions, and used a form of the copula (is, zijn, was, waren). Disfluent utterances consisted of responses in which participants hesitated or repeated a portion of the preamble or the verb, or used a verb other than the copula. In instances where participants changed the number marking on the verb upon repetition, we coded the verb as singular or plural based on the first complete verb that was produced. A second coder scored 5% of the data, with an interrater reliability of 97.6%. There were 326 (16.0%) plural responses (226 fluent, 100 disfluent), 1,602 (78.5%) singular responses (1,263 fluent, 339 disfluent), and 112 (5.5%) miscellaneous errors. All responses are presented in Table 1.
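As a consistency check on the response counts reported above, the following sketch recomputes the totals and percentages. Only the counts come from the text; the variable names are illustrative.

```python
# Response counts as reported in the Scoring section.
counts = {
    "plural": {"fluent": 226, "disfluent": 100},
    "singular": {"fluent": 1263, "disfluent": 339},
    "miscellaneous": {"total": 112},
}

# Total per response type and overall.
totals = {kind: sum(parts.values()) for kind, parts in counts.items()}
n_responses = sum(totals.values())

# Percentage of all responses, rounded to one decimal place.
percentages = {kind: round(100 * n / n_responses, 1) for kind, n in totals.items()}

print(totals, n_responses, percentages)
```

The overall total corresponds to 51 participants each responding to 40 experimental items.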
Table 1. Distribution of response type (singular vs. plural verbs) by scoring category and condition.
On the OSpan task, participants received one point for each word they correctly recalled on the task. Their score then represented the total number of correctly recalled words, as this is a more accurate means of reporting OSpan scores than the largest set-size for which participants recalled all of the target words (Conway et al., 2005).
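The difference between the two scoring methods can be sketched as follows. The example trials are invented, and the sketch assumes that recall is scored without regard to word order, which the text does not specify.

```python
# Task structure from the Procedure: set sizes 2-6, three sets per level.
SET_SIZES = [2, 3, 4, 5, 6]
SETS_PER_LEVEL = 3
MAX_SCORE = sum(size * SETS_PER_LEVEL for size in SET_SIZES)


def total_words_recalled(trials):
    """Scoring used here: one point per correctly recalled target word."""
    return sum(len(set(presented) & set(recalled)) for presented, recalled in trials)


def largest_perfect_set(trials):
    """Alternative scoring: largest set size with all target words recalled."""
    perfect = [len(presented) for presented, recalled in trials
               if set(presented) <= set(recalled)]
    return max(perfect, default=0)


# Hypothetical trials: (presented words, typed recall).
example_trials = [
    (["a", "b"], ["a", "b"]),
    (["c", "d", "e"], ["c", "e"]),
    (["f", "g", "h", "i"], ["f", "g", "h", "x"]),
]

print(MAX_SCORE, total_words_recalled(example_trials), largest_perfect_set(example_trials))
```

In this invented example, the total-words score credits the partially recalled sets, whereas the largest-perfect-set score reflects only the single two-word set recalled in full.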
Results
All 51 participants included in the results answered at least 65% of the math problems on the OSpan task correctly (M = 84.7%, SD = 8.4, range = 65.0%-98.3%). These participants recalled, on average, 49.8 words out of a maximum of 60 words on the OSpan task (SD = 5.8, range = 34-59).
Analyses were conducted using mixed-effect logistic regression models (Jaeger, 2008) with the lme4 package version 1.1-12 (Bates, Mächler, Bolker, & Walker, 2015) in R version 3.2.5 (R Development Core Team, 2016). Number marking on the verb (singular vs. plural) was the dependent variable. Notionality (single-token vs. multiple-token) and morphophonology (het-de vs. de-de) were entered as fixed effects into the model, contrast coded as −.5 and .5 (Davis, 2010). Given that previous research has shown that agreement errors can vary as a function of working memory (e.g., Hartsuiker & Barkhuysen, 2006; Veenstra et al., 2017), we entered participants’ OSpan scores as a fixed effect, centred at the sample mean. To control for any differences in agreement as a function of fluency, we also included whether a given utterance was fluent or disfluent (see Supplemental Materials for an analysis of fluent data only). As there were more fluent than disfluent items, this factor was effect coded and centred (fluent = −.38, disfluent = .62), such that any main effect reflects the average of the two factor levels (Davis, 2010). The final random effect structure was determined by starting with the maximum structure justified by the experimental design, which included random intercepts for subjects and items, correlated by-item random slopes for OSpan, and correlated by-participant random slopes for the main effects of notionality and morphophonology, and their interaction. The random slope for the interaction between notionality and morphophonology was removed due to non-convergence, and random slopes correlated above 0.95 were removed to avoid over-fitting. Finally, we used model comparison to determine the most parsimonious random effect structure (Matuschek, Kliegl, Vasishth, Baayen, & Bates, 2017), removing individual variables one at a time, based on which variables contributed the least amount of unique variance to the random effect structure.
The final random effect structure included random intercepts for participants and items and decorrelated by-item slopes for OSpan. The inclusion of additional random slopes did not significantly improve the model fit (all ps > .24). However, the pattern of significant fixed-effects was identical when the maximal random effect structure was used.
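The predictor coding described above can be illustrated with a short sketch. It covers only the ±.5 contrast coding and the mean-centring of OSpan, not the model fitting itself (which was done in R with lme4). The OSpan scores are invented, and the assignment of −.5 to single-token and het-de follows the order in which the levels are listed in the text, which is an assumption.

```python
from statistics import mean

# Contrast codes (assumed direction: first-listed level = -0.5).
NOTIONALITY = {"single-token": -0.5, "multiple-token": 0.5}
MORPHOPHONOLOGY = {"het-de": -0.5, "de-de": 0.5}


def center(values):
    """Centre a continuous predictor at its sample mean."""
    m = mean(values)
    return [v - m for v in values]


def code_trial(notionality, morphophonology, ospan_centered):
    """Return coded predictors for one trial, including interaction terms."""
    n = NOTIONALITY[notionality]
    m = MORPHOPHONOLOGY[morphophonology]
    return {
        "notionality": n,
        "morphophonology": m,
        "ospan": ospan_centered,
        "notionality_x_morph": n * m,
        "three_way": n * m * ospan_centered,
    }


# Hypothetical OSpan scores for five participants.
ospan_scores = [34, 45, 50, 52, 59]
ospan_centered = center(ospan_scores)

example = code_trial("multiple-token", "de-de", ospan_centered[2])
print(ospan_centered, example)
```

With this coding, each main effect is evaluated at the average of the other factor's levels and at the mean OSpan score, which is what allows the lower-order terms to be interpreted alongside the interactions.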
As seen in Table 2, there was a significant effect of fluency because participants produced more plural responses in disfluent than fluent utterances. There was a significant effect of notionality because participants produced more plural responses for multiple-token than single-token items. There was a significant effect of morphophonology because participants produced more plural responses with de-de items than het-de items. There was also a significant interaction between notionality and morphophonology because, as seen in Figure 1, there were significantly more plural responses for de-de items that were notionally plural, compared to the other three conditions.
Table 2. Summary of the mixed logit model for complex NPs.
Note. NP: noun phrase; df: degrees of freedom; SD: standard deviation.
Likelihood ratio tests for main effects are based on omitting the main effect and any interaction terms involving that main effect.
Likelihood ratio tests for two-way interactions are based on omitting the relevant two-way interaction term and the three-way interaction term.

Figure 1. Proportion of plural responses (error bars represent the bootstrapped by-participant 95% confidence intervals).
While there was no significant effect of OSpan, there was a significant two-way interaction between OSpan and morphophonology and a significant three-way interaction between notionality, morphophonology, and OSpan. As seen in Figure 2, participants produced fewer plural responses with het-de NPs, regardless of notionality (see also Figure 1). For multiple-token preambles containing de-de NPs, participants produced more plural responses regardless of OSpan score, but for single-token preambles containing de-de NPs, the proportion of plural responses decreased as OSpan scores increased.

Figure 2. Proportion of plural responses as a function of operation span score for (a) het-de items and (b) de-de items. The proportion of plural responses for multiple-token items is represented by circles and the solid regression line. The proportion of plural responses for single-token items is represented by crosses and the dotted regression line.
Discussion
As in Antón-Méndez and Hartsuiker (2010), speakers were sensitive to both notional number and morphophonology, as speakers used more plural agreement with notionally plural items and with de-de items. However, unlike Antón-Méndez and Hartsuiker, we found a significant interaction between notional number and morphophonology, as speakers showed greater notional effects with de-de NPs than with het-de NPs. In addition, we found an effect of fluency, with more agreement errors on disfluent items than on fluent items. Furthermore, while there was no main effect of working memory, as measured by OSpan, there was a significant interaction between OSpan and morphophonology, as participants with lower OSpan scores produced more agreement errors on the de-de items. This two-way interaction was qualified by a small but statistically significant three-way interaction between notionality, morphophonology, and working memory. As seen in Figure 2, while agreement errors were lower on het-de items overall (Panel A), differences based on OSpan were most visible on the notionally singular de-de items (Panel B).
The role of working memory on agreement production
These results support a model of agreement production that depends, at least to some extent, on working memory. The effects of working memory in this study were strongest in the de-de condition, when the determiners had the same form, and were ambiguous for number. The lower rate of plural agreement in the het-de condition suggests that the morphophonological cues provided by those distinct determiners reinforced accurate grammatical agreement by helping speakers correctly locate the subject head noun. This result is consistent with the Working Memory retrieval model (Badecker & Kuminiak, 2007), which predicts that morphophonological cues should reduce the load of working memory on agreement production. The fact that participants made more agreement errors on items with disfluencies provides additional evidence for the role of working memory in agreement production, as disfluencies may be more likely when participants are experiencing difficulties retrieving the subject head noun.
At the same time, working memory is a multifaceted, rather than a unitary, construct, and consists of distinct, yet related processes (Logie, 2011; Unsworth, Fukuda, Awh, & Vogel, 2014). Each of these processes (capacity, attentional control, and the ability to retrieve information from secondary memory) can contribute to differences in fluid intelligence and to individual differences in higher order cognitive tasks, like agreement production. Vandierendonck, Loncke, Hartsuiker, and Desmet (2018) provide insight into which aspects of working memory affect agreement production by showing a link between executive control abilities and agreement errors, using a gaze-contingent tone discrimination task. It is important to extend this line of research to consider, more precisely, how the mechanisms underlying working memory correlate with language production if we want to better understand the potential causal relationship between working memory and language production.
Furthermore, it is critical to replicate the current three-way interaction in future research, both in Dutch and in other languages, as interactions may be underpowered and thus susceptible to both Type I and Type II errors. The three-way interaction in this study, while statistically reliable according to conventional hypothesis testing criteria, was arguably small. To provide additional evidence about the reliability of the reported three-way interaction between notional number, morphophonology, and OSpan, we conducted the same analyses reported above with Bayesian mixed effects models using rstanarm (Stan Development Team, 2017). Following the procedures and benchmarks of Nicenboim and Vasishth (2016), we considered there to be strong evidence of an effect for any parameter whose 95% credible interval did not include zero. According to the same criteria, if the edge of the 95% credible interval overlaps with zero, there can be weak evidence for an effect if the probability of the effect is still relatively high. The Bayesian model showed the same general pattern of results as the models fit with lme4, and the 95% credible interval for the three-way interaction parameter was [0.00, 0.02]. Therefore, as the lower bound of this interval overlapped with zero, the Bayesian analysis suggests that there is weak evidence for this interaction. Taken together with the primary analysis, these results provide consistent evidence for the three-way interaction, suggesting that it is a small but reliable effect. Further research, particularly with larger sample sizes, is important because understanding the role of individual differences can lend important insights into the mechanisms behind language processing (Kidd, Donnelly, & Christiansen, 2018).
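The decision rule described above can be illustrated with a minimal Python sketch. It is not the authors' analysis code (the models were fit in R with rstanarm), and it rests on several assumptions: the function names are hypothetical, the interval is a simple equal-tailed interval computed from posterior draws, the cutoff of 0.9 standing in for a "relatively high" probability is invented for illustration, and the draws are made up rather than taken from the study.

```python
def credible_interval(draws, mass=0.95):
    """Equal-tailed credible interval from posterior draws (nearest-rank)."""
    ordered = sorted(draws)
    n = len(ordered)
    k = int((1.0 - mass) / 2.0 * (n - 1))  # draws trimmed from each tail
    return ordered[k], ordered[n - 1 - k]


def classify_evidence(draws, prob_threshold=0.9):
    """Label a parameter as 'strong', 'weak', or 'none'.

    prob_threshold is a hypothetical stand-in for "relatively high";
    the text does not give a numeric cutoff.
    """
    lower, upper = credible_interval(draws)
    if lower > 0 or upper < 0:
        return "strong"  # 95% interval excludes zero
    p_positive = sum(d > 0 for d in draws) / len(draws)
    p_negative = sum(d < 0 for d in draws) / len(draws)
    if max(p_positive, p_negative) >= prob_threshold:
        return "weak"  # interval reaches zero, but most mass lies on one side
    return "none"


# Made-up draws for illustration only (not the study's posterior).
excludes_zero = [0.001 * i for i in range(1, 1001)]
touches_zero = [-0.001] * 40 + [0.01] * 960
centered = [-1.0] * 500 + [1.0] * 500

for label, draws in [("excludes zero", excludes_zero),
                     ("touches zero", touches_zero),
                     ("centered on zero", centered)]:
    print(label, "->", classify_evidence(draws))
```

Under this sketch, an interval such as [0.00, 0.02] would fall into the "weak" category only if the share of draws on one side of zero met the assumed cutoff.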
The interaction between notional number and morphophonology
The interaction between notional number and morphophonology also lends important insight into the role of morphophonology in number agreement. The reduction in plural agreement on het-de items, compared to de-de items, is consistent with findings from previous work (Antón-Méndez & Hartsuiker, 2010; Hartsuiker et al., 2003). However, the impact of morphophonology on agreement in our study was modulated by notional number: notional effects were greater on the de-de items than on the het-de items. This pattern provides additional evidence that distinct determiners facilitate correct retrieval of the subject head noun, which can in turn lead to smaller notional effects among het-de items.
There are several reasons why we may have found an interaction between notionality and morphophonology when no such interaction was observed in Antón-Méndez and Hartsuiker (2010). First, we included disfluent utterances in our statistical analysis, whereas Antón-Méndez and Hartsuiker treated them as miscellaneous items; we were concerned that discarding disfluent utterances would eliminate many of the sentences in which speakers were having trouble retrieving the subject head noun. Including these items increased our statistical power. However, their inclusion did not change the pattern of effects, and a significant interaction between notionality and morphophonology emerged even in the analysis of fluent-only data (see Supplementary Material A). Second, we had more statistical power to detect an interaction because we tested more participants (51 participants vs. 36 participants in Antón-Méndez and Hartsuiker). Third, participants used more plural agreement overall in our study than in Antón-Méndez and Hartsuiker, which increased our ability to detect an interaction. This could have been due to differences in population or to differences in task, as we used an auditory sentence completion task, whereas Antón-Méndez and Hartsuiker used a visual completion paradigm. Regardless of the cause, the rate of plural agreement in Antón-Méndez and Hartsuiker was lower than in most studies on Dutch agreement (see Table 5 in Antón-Méndez and Hartsuiker), which would limit their ability to detect an interaction.
Cue-based retrieval in agreement production
The role of morphophonology in agreement, and the way it interacts with OSpan and notional number, provides important insights into the mechanisms that drive the production of subject–verb agreement. Like other studies on Dutch (Antón-Méndez & Hartsuiker, 2010; Hartsuiker et al., 2003), we found lower rates of agreement errors on het-de items, compared to de-de items. Within the Marking and Morphing model (Eberhard et al., 2005), this is explained by the unambiguously singular morphology on the determiner het, which lowers the grammatical number value for the whole NP. What Marking and Morphing cannot explain, however, is why individual differences in working memory should matter, or why working memory effects should be most evident on the de-de items, compared to the het-de items. In contrast, a cue-based retrieval account provides a straightforward explanation, both for why there are fewer agreement errors among het-de items overall and for why effects of OSpan should be most evident among de-de items: the distinct retrieval cues provided by the number-marked determiners would help speakers retrieve the correct subject noun at the point of producing a verb.
We propose, similar to Lorimor et al. (2015), that agreement can be affected by both number encoding processes and subject retrieval processes. During number encoding, agreement attraction errors and notional number agreement may be generated through a process like the one described in Marking and Morphing (Eberhard et al., 2005), in which grammatical number values (that may lead to attraction) are combined with notional number valuations to obtain a number specification that is encoded on the subject NP. In addition to these processes involved in number encoding, we propose that there is a later retrieval process during which speakers check the number of the verb against the number on the subject head noun that they are holding in content-addressable memory, and that this retrieval process can be facilitated by the presence of distinct morphophonological cues. Although there are reasons to treat production as separate from comprehension (Tanner, Nicol, & Brehm, 2014), this proposal is similar to recent proposals for how cue-based retrieval would work in agreement comprehension (Nicenboim, Engelmann, Suckow, & Vasishth, 2017; Schlueter et al., 2018). Furthermore, it is consistent with evidence that, even in comprehension, number information is processed both during NP encoding and at the point of encountering the verb (Vandierendonck et al., 2018).
A three-component agreement model that includes notional number, agreement attraction, and cue-based retrieval (as outlined in Lorimor et al., 2015) accounts for the present data as follows. During encoding, some agreement attraction errors were created by spreading activation from the grammatical features on the local noun and from other number features within the subject NP (as outlined in Eberhard et al., 2005), which explains why there are agreement errors in all four experimental conditions. In addition, notional number information on the subject NP explains the main effect of notional number and why participants were more likely to use plural verbs with multiple-token, compared to single-token items. At the point of planning the verb, an additional step of cue-based retrieval reduced the rate of plural agreement with the het-de items because there was less similarity-based interference on the nouns in the subject NP. In the de-de condition, while participants attempted to retrieve the subject head noun, the fact that the determiners on the head and local nouns had the same form made the process of cue-based retrieval more difficult. Therefore, while participants with higher OSpans were better able to maintain the subject head noun in content-addressable working memory and establish accurate subject–verb agreement, participants with lower OSpan scores were less accurate in their retrieval processes on the de-de items. It is possible that interactivity was an additional factor influencing our results (as discussed by Antón-Méndez and Hartsuiker, 2010), as feedback from the phonology on the de-de items increased the notional effects on the multiple-token items. However, our results can also be explained through cue-based retrieval, and the existence of interactivity in the language system does not preclude the possibility of an additional step of retrieval. Furthermore, interactivity cannot account for the fact that working memory affected agreement errors among the de-de items, but not the het-de items.
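The retrieval step in this account can be made concrete with a toy Python sketch. It is only an illustration of cue matching with similarity-based interference, not an implementation of any published model: the feature names (`role`, `determiner`), the baseline constant, and the use of a simple ratio rule to turn match counts into retrieval probabilities are all assumptions introduced here. The sketch also leaves out working memory capacity entirely.

```python
def retrieval_probabilities(candidates, cues, baseline=0.1):
    """Share of retrieval 'strength' for each candidate noun.

    Each candidate scores one point per matching retrieval cue, plus a small
    hypothetical baseline so that no candidate has zero strength. Scores are
    normalized with a ratio rule (an assumption of this sketch).
    """
    scores = {}
    for name, features in candidates.items():
        matches = sum(1 for cue, value in cues.items()
                      if features.get(cue) == value)
        scores[name] = matches + baseline
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}


# het-de item: head and local noun carry different determiners.
het_de = {
    "head_noun": {"role": "head", "determiner": "het"},
    "local_noun": {"role": "local", "determiner": "de"},
}
# de-de item: both nouns carry the same determiner form.
de_de = {
    "head_noun": {"role": "head", "determiner": "de"},
    "local_noun": {"role": "local", "determiner": "de"},
}

p_het_de = retrieval_probabilities(het_de, {"role": "head", "determiner": "het"})
p_de_de = retrieval_probabilities(de_de, {"role": "head", "determiner": "de"})

print("het-de, head noun:", round(p_het_de["head_noun"], 3))
print("de-de,  head noun:", round(p_de_de["head_noun"], 3))
```

In this toy setup the local noun in the de-de item partially matches the retrieval cues, so the head noun receives a smaller share of retrieval strength than in the het-de item, which mirrors the direction of the similarity-based interference described above.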
Conclusion
By manipulating morphophonology and notional number in Dutch, while at the same time collecting data on individual participants’ working memory, we provide evidence about how working memory affects agreement attraction in Dutch, even in the absence of a secondary task. In all, these findings support a model of agreement production that incorporates notional number, agreement attraction, and cue-based retrieval as three distinct—yet related—mechanisms. We also provide evidence for a limited role of working memory in the agreement production process. In doing so, we show that agreement is not fully automatic, and that working memory does play a role in ensuring correct agreement; however, at the same time, we show that features of the language itself, such as distinct morphophonological cues, play an important role in facilitating correct subject–verb agreement.
Supplemental Material
Supplemental material, QJE-STD_17-322.R3-Supplementary_Material_A for The interaction of notional number and morphophonology in subject–verb agreement: A role for working memory by Heidi Lorimor, Carrie N Jackson and Janet G van Hell in Quarterly Journal of Experimental Psychology
Supplemental material, QJE-STD_17-322.R3-Supplementary_Material_B for The interaction of notional number and morphophonology in subject–verb agreement: A role for working memory by Heidi Lorimor, Carrie N Jackson and Janet G van Hell in Quarterly Journal of Experimental Psychology
Footnotes
Acknowledgements
Portions of this work were presented at the 31st Annual CUNY Sentence Processing Conference. The authors would like to thank Maartje Dona and Anouk Raeven for their assistance with data collection. They also thank the reviewers for their helpful comments and suggestions.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
This project was partially supported by NSF grants BCS-1349110 and DUE 1561660.
