Abstract
This study investigates how syntactic complexity affects speaking fluency in the first (L1) and second language (L2). Participants (30 Dutch native speakers with an average to advanced level of English) performed two speaking experiments, one in Dutch (L1) and one in English (L2). Syntactic complexity was operationalized by eliciting active and passive sentences in an experimental setting. Comparing the effect of syntactic complexity on different measures of fluency sheds light on the underlying cognitive processes in on-line speech production. We found that syntactic complexity indeed elicits hesitations, both in the L1 and in the L2. Because producing a rather simple utterance such as an active sentence may already lead to processing difficulty in the L2, the effect of syntactic complexity was found to be larger for L1 speech. Finally, articulation rate was not affected by syntactic complexity, either in the L1 or in the L2.
I Introduction
In second language acquisition (SLA) research, scholars have strived to objectively describe learners’ progress. A common way to quantify progress in oral production has been to measure the level of complexity, accuracy, and fluency in learners’ second language (L2) speech (Housen and Kuiken, 2009). Much of the research into these three dimensions of L2 production has investigated how external factors (such as task characteristics) may influence linguistic output. Within the cognitive approach to second language acquisition, two hypotheses with respect to the effects of task characteristics have been put forward: the Cognition Hypothesis (e.g. Robinson, 2001) and the Limited Attentional Capacity Model (Skehan, 2003). Research that has tested these hypotheses has found that increasing task demands as a whole can lead to more complex utterances with fewer errors, but that these may come at the cost of fluency. These results are explained by positing that there are trade-offs between the three dimensions due to limited processing capacity (Skehan, 2003) or, contrastively, that cognitively more demanding tasks may lead learners to heighten their level of attention, which in turn leads them to produce more accurate and complex speech (Robinson, 2001). Either way, it is usually argued that linguistic complexity in the output comes at the cost of speaking fluency. However, to date there is no research investigating this claim directly. Moreover, it is unclear how fluency will be affected when speakers encounter difficulty in formulating speech at the morphosyntactic level. The current article therefore investigates how one specific type of morphosyntactic operation (namely producing passives versus actives) influences speaking fluency.
1 Linguistic complexity in SLA
Before we lay out what we predict for the production of passives and actives in L2 speech, we will briefly describe what we mean by linguistically complex and linguistically simple utterances in general, and explain how passive and active constructions fit into these descriptions. Bulté and Housen (2012: 22) define complexity as referring ‘to a property or quality of a phenomenon or entity in terms of (1) the number and the nature of the discrete components that the entity consists of, and (2) the number and the nature of the relationships between the constituent components’.
This is a definition of inherent or absolute complexity, as it can be established objectively by looking at linguistic phenomena. Another element of complexity is relative complexity. If a system or linguistic feature needs more cognitive effort to process, it is cognitively difficult, or relatively complex. Absolute complexity may coincide with relative complexity, but not necessarily (Rohdenburg, 1996).
Passive constructions are known to be cognitively difficult (relatively complex), as they are harder to process in comprehension (e.g. Berndt et al., 2004) and emerge later in language acquisition compared to actives (Diessel, 2004). The passive structure is also much less frequent in daily use than the active structure, which contributes to the cognitive difficulty. In Dutch written discourse, the proportion of active transitives is about 92% and, in English, about 88% (Cornelis, 1996). In an absolute sense, passives are also seen as more complex than actives. First, following generative linguistic reasoning, a passive sentence demands an additional computation compared to an active sentence (see Jaeggli, 1986). Syntactic theories disagree on how passives should be analysed precisely, but this additional computation is generally assumed. In Examples (1b) and (2b), it can be seen that the internal argument (‘De auto’ for Dutch and ‘The car’ for English) appears in subject position. In other words, the object is promoted to subject position, either through a lexical operation (Bresnan, 1982) or through movement (Jaeggli, 1986). Second, if the passives include additional syntactic markers such as the preposition ‘by’/‘door’ and auxiliaries, as in Examples (1b) and (2b), they are more complex in the absolute sense simply because they include more components than actives. In short, we can conclude that passives are more complex than actives, both in the relative and in the absolute sense.
(1a) The mechanic repairs the car.
(1b) The carᵢ is (being) repaired tᵢ (by the mechanic).
(2a) De monteur repareert de auto.
(2b) De autoᵢ wordt gerepareerd tᵢ (door de monteur).
2 Literature review
Usually, linguistic complexity in L2 research has been investigated as a property of the speaker’s system, and linguistic complexity is therefore studied as a dependent variable. In the current article, however, we are interested in the direct influence of linguistic complexity on utterance fluency. Previous studies that have likewise contrasted passives and actives as an independent variable have mostly investigated comprehension by native speakers. These studies have reported that comprehension of passives is slower than comprehension of actives (e.g. Berndt et al., 2004; Kharkwal and Stromswold, 2014) and that native speakers (particularly from lower educational levels) do not always understand passive sentences correctly (Ferreira, 2003; Street and Dąbrowska, 2010).
With respect to research into production, Ferreira (1994) elicited actives and passives using verbal input (e.g. the three words ‘manager’, ‘lay-offs’, ‘worried’) in order to investigate what types of nouns and verbs tend to elicit more passives than others. Ferreira also measured formulation times and reported that passives are formulated more slowly than actives. Engelhardt et al. (2010) measured how participants’ fluency was affected by producing passives versus actives, much as the current study does. They compared the speech production of native English speakers with and without attention deficit/hyperactivity disorder (ADHD), with the goal of investigating the role of inhibition in the production of disfluencies. They elicited actives and passives by presenting, on each trial, one animate object, one inanimate object, and a verb. The verb was either a participle, which elicited a passive (e.g. ‘ridden’), or not. The results showed that both groups of participants produced more disfluencies when a passive construction was called for. Research using syntactic priming (Segaert et al., 2011) has also found that formulating passives takes longer than formulating actives for Dutch native speakers.
3 The current study
In summary, scholars in the field of second language acquisition have theorized about potential trade-off effects between syntactic complexity and utterance fluency (e.g. Towell, 2012). Such trade-off effects have been shown indirectly by manipulating task complexity or task difficulty as a whole (e.g. Ellis and Yuan, 2005). The direct relation between syntactic complexity and utterance fluency, however, has thus far only been investigated in native speaker performance (Engelhardt et al., 2010). The current short research article compares the effect of syntactic complexity in first language (L1) and L2 speaking fluency, in answering the following research question:
Research question: What is the effect of producing active and passive sentences on L1 and L2 speaking fluency?
In a controlled experiment, we manipulate difficulty at one specific stage of speech production, namely morphosyntactic encoding in L1 (Dutch) and L2 (English) speech for the same speakers. Following Levelt (1999), De Bot (1992) and Kormos (2006), we assume that morphosyntactic encoding occurs after lemma selection and before phonetic encoding. For L1 speech, this stage is assumed to be fully automatized. Nevertheless, because more complex operations must take place and/or because more elements need to be formulated, we expect less fluent responses for passives than for actives. We therefore expect to replicate the findings by Engelhardt et al. (2010), but now for L1 Dutch speakers.
Because passive constructions are learned later and practiced less compared to actives (due to lower frequency of occurrence), we hypothesize that, even for L2 learners who have internalized the passive construction (correctly), producing a passive will be more difficult than producing an active. It is likely that the passive structure is not as proceduralized and automatized as the active structure. Further support for this hypothesis comes from Hinkel (2004), who reports that in written texts, the passive is underused by English L2 speakers relative to native English speakers. Therefore, we hypothesize that the effect of syntactic complexity will be larger in L2 speech compared to L1 speech.
The dependent variable in this article is ‘fluency’. Tavakoli and Skehan (2005) have pointed to the multifaceted nature of fluency and distinguish between three types of fluency: breakdown fluency, speed fluency, and repair fluency. In the current study we operationalize fluency as three measures: initial silent pause time (i.e. response time), articulation rate, and hesitation occurrence. In doing so, we tease apart a measure for speed fluency (namely articulation rate) from measures of breakdown fluency. The hesitations in our measure of hesitation occurrence may also include (covert) repairs; this measure is therefore a combination of repair fluency and breakdown fluency.
II Method
1 Participants
Thirty participants from the linguistic institute’s subject pool at Utrecht University, Netherlands were recruited. They were paid €5 for participation and participated with informed consent. Participants (mean age = 22, SD = 6.0, range from 18 to 48; 3 male; 27 female) were all Dutch native speakers and most (27) were students at Utrecht University. On the English LexTALE test (Lemhöfer and Broersma, 2012), they scored on average 71% correct (SD 14; range 44–100), which indicates that participants roughly had an average to advanced level of English. Early Dutch–English bilinguals and English major students were excluded from participation. Participants all reported having normal or corrected-to-normal eyesight and none reported having speech disorders, such as stuttering or dyslexia.
2 Materials
Twenty cartoon images were found using Google Images and then modified with the aid of Adobe Photoshop CS2003. All pictures presented an actor with at least one inanimate object (e.g. a mechanic next to a car). We supplemented all pictures with circled words to their left (e.g. ‘the mechanic’ and ‘the car’). To the left of these pictures with words, we also added a root verb (e.g. ‘to repair’). Each resulting slide was created both in English and in Dutch. In selecting the materials, we ensured that the resulting sentences would be equally natural in both languages. Four versions of each slide were created. Two versions were used as (active and passive) targets. To match passive and active targets maximally, the active and passive targets were identical, except that an arrow pointed either to the circled word with the actor (e.g. ‘the mechanic’) or to the circled word with the object (‘the car’). Figure 1 shows an example of a passive target. The remaining two versions were used as filler trials that elicited either two consecutive sentences (e.g. ‘The mechanic holds the tool. He repairs the car’) or a single sentence with an embedded phrase (e.g. ‘The mechanic who holds the tool repairs the car’).

Figure 1. Example of a passive target.
In order to make each sentence as natural as possible, the printed names appeared with the most likely article, either definite or indefinite (i.e. for English ‘the’, ‘a’, or ‘an’, and for Dutch ‘het’, ‘de’, and ‘een’). Again, care was taken to choose definite and indefinite articles that would result in natural-sounding sentences in both Dutch and English. Appendix 1 presents all the word stimuli for the passive and active conditions.
3 Procedure
Two pseudo-randomized lists were created: one for English (k = 40) and one for Dutch (k = 40), with the passive target of a picture in one list and the active target of that same picture in the other list. Also, care was taken that no more than two consecutive trials were either passive or active. The presentation of the audio and visual stimuli and the recording of participants’ responses were controlled by ZEP software (Veenker, 2013). Prior to the actual experiment, participants were familiarized with the preferred responses and the procedure in several ways. First, participants received a booklet with printed instructions. In this booklet, a picture was shown in all four versions and the instructions explained how participants should respond in each case. Second, participants were shown the same four examples as in the printed instructions again, on a computer screen, and after a few seconds, for each example, the preferred sentence was shown on the screen underneath the picture. Third, eight practice trials followed the instructions. The examples in the instructions and practice trials did not appear as actual test items. After the practice phase, participants proceeded with the test phase, comprising 40 slides to which they had to respond. In total, participants thus produced 40 sentences in Dutch (in Experiment 1) and 40 sentences in English (in Experiment 2). Of these 40 sentences in each language, 10 were actives, 10 were passives, and 20 were fillers.
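The list construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used in ZEP: the function names, the picture labels, the seed, and the rejection-sampling approach are all hypothetical, and the ordering constraint is read here as "no more than two consecutive trials with the same target condition", with fillers left unconstrained.

```python
import random

def swap(condition):
    return "passive" if condition == "active" else "active"

def assign_conditions(pictures, rng):
    """Give half the pictures the active target in list A; list B gets the opposite."""
    shuffled = list(pictures)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    list_a = {pic: ("active" if i < half else "passive") for i, pic in enumerate(shuffled)}
    list_b = {pic: swap(cond) for pic, cond in list_a.items()}
    return list_a, list_b

def longest_target_run(conditions):
    """Length of the longest run of consecutive trials sharing one target condition."""
    longest = current = 0
    previous = None
    for cond in conditions:
        if cond in ("active", "passive"):
            current = current + 1 if cond == previous else 1
        else:
            current = 0  # a filler interrupts any run of targets
        longest = max(longest, current)
        previous = cond
    return longest

def order_trials(targets, n_fillers, rng, max_run=2):
    """Reshuffle targets and fillers until no target condition occurs more than max_run times in a row."""
    trials = [(pic, cond) for pic, cond in targets.items()]
    trials += [(f"filler_{i + 1}", "filler") for i in range(n_fillers)]
    while True:
        rng.shuffle(trials)
        if longest_target_run(cond for _, cond in trials) <= max_run:
            return list(trials)

rng = random.Random(2013)  # hypothetical fixed seed, for reproducibility of the sketch
pictures = [f"picture_{i + 1}" for i in range(20)]  # hypothetical labels for the 20 pictures
# Which language receives which list is an arbitrary choice in this sketch.
dutch_targets, english_targets = assign_conditions(pictures, rng)
dutch_list = order_trials(dutch_targets, n_fillers=20, rng=rng)
english_list = order_trials(english_targets, n_fillers=20, rng=rng)
```

Rejection sampling is used here only because it is short and easy to check; with 10 actives, 10 passives, and 20 fillers, an acceptable order is typically found within a few shuffles.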
The English and Dutch experiments were run on two consecutive days. The order of the experiments was counterbalanced: half of the participants first performed the English experiment; half of them first performed the Dutch experiment. The same procedure was followed in both experiments (including the familiarization and practice phases) except that the English experiment was followed by a vocabulary test (the LexTALE test; see Lemhöfer and Broersma, 2012). In this unpressurized lexical decision task, participants decided, for 60 strings of letters (40 words and 20 pseudowords, taken from Meara, 1996), whether these strings were actual English words or whether they were made up. The scores from the LexTALE test are calculated by averaging the percentages correct for the two item types (words and nonwords).
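The scoring rule described above (the mean of the percentage correct on words and the percentage correct on nonwords) can be written out as a short Python sketch. The function name and the example counts are hypothetical; only the item numbers (40 words, 20 nonwords) and the averaging rule come from the text.

```python
def lextale_score(words_correct, nonwords_correct, n_words=40, n_nonwords=20):
    """Average of the percentage correct on words and on nonwords."""
    if not (0 <= words_correct <= n_words and 0 <= nonwords_correct <= n_nonwords):
        raise ValueError("correct counts must lie between 0 and the number of items")
    pct_words = 100.0 * words_correct / n_words
    pct_nonwords = 100.0 * nonwords_correct / n_nonwords
    return (pct_words + pct_nonwords) / 2.0

# Hypothetical participant: 36 of 40 words and 12 of 20 nonwords judged correctly.
print(lextale_score(36, 12))  # 75.0
```

Because the two item types are averaged after conversion to percentages, the 20 nonwords weigh as heavily as the 40 words. The hypothetical participant above scores 75, whereas the raw proportion correct over all 60 items would be 80%.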
4 Scoring and annotation
We annotated all 1,200 active and passive responses (30 participants, 2 experiments, 20 sentences): we marked whether or not the utterance was grammatically correct, measured initial response time, and marked all hesitations. Below, we report on the effects of Condition (Passive or Active) and Language (L1 or L2) on three dependent variables: Response time, Syllable duration, and Hesitation occurrence. The response times were calculated as the time (in milliseconds) between the appearance of the slide on the screen and the start of the response by the participant. Syllable duration, the inverse of articulation rate, was only measured for those responses that did not contain any hesitations and was calculated by dividing, for each response, total utterance time by the number of syllables. Hesitation occurrence, finally, is a binary variable (either 1 or 0) that indicates whether or not the participant used any kind of hesitation, anywhere during the response. Hesitations could be silent pauses, filled pauses, repetitions, and repairs. The lower silent pause threshold was set at 250 ms, to exclude short so-called micropauses (Riggenbach, 1991), which are irrelevant for measures of L2 fluency (De Jong and Bosker, 2013).
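As an illustration of how the three dependent variables follow from the annotations, the Python sketch below computes them for a single response. The `Response` structure, its field names, and the example values are hypothetical and do not reflect the authors' annotation format; the sketch also assumes that a silent pause of exactly 250 ms counts as a hesitation.

```python
from dataclasses import dataclass, field
from typing import List, Optional

PAUSE_THRESHOLD_MS = 250  # shorter silences are treated as micropauses and ignored

@dataclass
class Response:
    slide_onset_ms: float    # moment the slide appeared on the screen
    speech_onset_ms: float   # moment the participant started speaking
    speech_offset_ms: float  # moment the participant stopped speaking
    n_syllables: int
    silent_pauses_ms: List[float] = field(default_factory=list)  # silences within the utterance
    filled_pauses: int = 0
    repetitions: int = 0
    repairs: int = 0

def response_time_ms(r: Response) -> float:
    """Time between slide appearance and the start of the response."""
    return r.speech_onset_ms - r.slide_onset_ms

def hesitation_occurrence(r: Response) -> int:
    """1 if the response contains any hesitation, else 0."""
    long_pause = any(p >= PAUSE_THRESHOLD_MS for p in r.silent_pauses_ms)
    return int(long_pause or r.filled_pauses > 0 or r.repetitions > 0 or r.repairs > 0)

def syllable_duration_ms(r: Response) -> Optional[float]:
    """Utterance time divided by syllable count; None for responses with a hesitation."""
    if hesitation_occurrence(r):
        return None
    return (r.speech_offset_ms - r.speech_onset_ms) / r.n_syllables

# Hypothetical annotated responses.
fluent = Response(slide_onset_ms=0, speech_onset_ms=900, speech_offset_ms=2700, n_syllables=9)
hesitant = Response(0, 1200, 4200, 11, silent_pauses_ms=[120, 400])

print(response_time_ms(fluent), syllable_duration_ms(fluent), hesitation_occurrence(fluent))
print(response_time_ms(hesitant), syllable_duration_ms(hesitant), hesitation_occurrence(hesitant))
```

Returning `None` for responses that contain a hesitation mirrors the restriction that syllable duration was measured only for hesitation-free responses, so that speed fluency is not contaminated by breakdown fluency.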
III Results
For all data analyses, we excluded ungrammatical responses (5%) and, after inspecting the distribution of response times, we excluded response times longer than 4 seconds (additional 1%). With these exclusions, there were many missing data for one participant, and we therefore excluded data from this participant entirely. For the 29 remaining participants, these criteria led to excluding 5.3% of L2 data and 3.3% of L1 data. Table 1 shows the descriptive statistics of the three dependent variables of all remaining data, aggregated over Language (L1 versus L2) and Condition (Active versus Passive).
Table 1. Means (and standard deviations) of response time and syllable duration, and number of responses with a hesitation (and percentage) across conditions.
Note. L1 is first language (Dutch) and L2 is second language (English).
For these three dependent variables, we used (generalized) Linear Mixed Models. The two durational variables response time and syllable duration were log-transformed in order to reach normal distributions. We included random effects for speakers (n = 29) and sentences (k = 40), and tested for a fixed effect of order of presentation before we added the fixed effects Condition and Language. Additionally, we also tested for an effect of utterance length (number of syllables), an effect which is by definition strongly related to Condition (the passives were always longer than the actives). Models were compared by likelihood ratio tests and coefficients were estimated using the full Maximum Likelihood criterion. Finally, we also tested models including random slopes. Because this did not lead to any different interpretation of our results, we report the simpler models without random slopes. To predict hesitation occurrence, we tested generalized linear mixed models, using the Laplace approximation.
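The model comparison step can be illustrated with a minimal Python sketch of a likelihood ratio test between two nested models that differ by a single parameter (for example, a model with and without the fixed effect Condition). The function name and the log-likelihood values are hypothetical and are not results from this study; fitting the mixed models themselves is outside the scope of the sketch.

```python
import math

def likelihood_ratio_test_df1(loglik_simple, loglik_complex):
    """Likelihood ratio test for nested models differing by one parameter (df = 1)."""
    statistic = 2.0 * (loglik_complex - loglik_simple)
    statistic = max(statistic, 0.0)  # guard against tiny negative values from rounding
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

# Hypothetical log-likelihoods of a model without and with one extra fixed effect.
stat, p = likelihood_ratio_test_df1(loglik_simple=-512.4, loglik_complex=-509.1)
print(f"chi-square(1) = {stat:.2f}, p = {p:.4f}")
```

Such a comparison of fixed effects is only meaningful when both models are fitted with full maximum likelihood rather than restricted maximum likelihood, which matches the estimation criterion reported above.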
Table 2 shows the results of the resulting best fitting linear models for (log) Response times (first three columns) and (log) Syllable Durations (last three columns). Table 3 shows the generalized mixed model for Hesitation occurrence. For (log) Response time, we found a significant main effect of Language, indicating that speakers paused longer before the start of their utterance in their L2 compared to their L1. We additionally found an effect of number of syllables: the longer the upcoming utterance would be, the longer the Response time. When we controlled for this variable, there was no additional effect of Condition. Finally, no interaction was found: neither between number of syllables and Language, nor between Condition and Language, indicating that the effects found were of the same magnitude in the L1 and in the L2.
Table 2. Results of linear mixed models predicting (log) response times and (log) syllable durations.
Note. L1 is first language (Dutch) and L2 is second language (English).
Table 3. Results of the generalized mixed model predicting hesitation occurrence.
Turning to the model predicting Syllable duration, similar to the model for Response time, we found a significant effect of number of syllables: the larger the number of syllables in the utterance, the shorter the mean syllable duration tended to be. When we controlled for this variable, there was no additional effect for Condition. Note that in a model without controlling for number of syllables (not shown in Table 2), the effect of Condition did turn out to be significant, with passives being spoken faster than actives. The main effect of Language, finally, showed that speakers spoke faster in their L1 than in their L2.
For Hesitation occurrence (for the final generalized mixed model, see Table 3), there was no effect of number of syllables, nor was there an effect of Order of Presentation of the items within the experiments. We did find an effect for Language: speakers were more likely to hesitate in their L2 compared to their L1; in addition, the main effect of Condition showed that speakers were more likely to hesitate in passives compared to actives. The significant interaction showed that this effect of Condition was larger in the L1 than in the L2.
IV Discussion and conclusions
Towell (2012: 67) advocates ‘dialogue between SLA scholars with their focus on mental representations, processes, and mechanisms of various kinds and those interested in the performance outcomes of complexity, accuracy and fluency’. The current study does precisely that, by manipulating difficulty at one stage of speech production (namely morphosyntactic encoding), and measuring its effect on fluency performance. We manipulated complexity by eliciting either passive or active sentences in an experimental setting.
Corroborating Engelhardt et al. (2010), we found that in L1 speech passives brought about more hesitations than actives. There was no effect of syntactic complexity on either response time or on articulation rate, when utterance length (number of syllables) was controlled for. It must be noted that one of the aspects in which passives and actives differ is precisely in terms of utterance length. Passives are more complex in an absolute sense than actives because they contain more elements. For response time, the effect of utterance length can therefore be equated with an effect of complexity. For articulation rate, on the other hand, we found that the longer utterances (the passives) were produced faster (hence more fluently) than the shorter utterances. Such more fluent production is not likely to be caused by the complexity of the utterances, but is in line with Quené (2008) who reports that longer utterances tend to have faster articulation rates. When we controlled for utterance length in the current study, there was no additional effect of syntactic complexity on articulation rate.
In summary, for L1 speech we found an effect of syntactic complexity on hesitation occurrence and an effect of syntactic complexity (or utterance length) on initial response time. We had hypothesized that for L2 speech the effects of complexity on fluency could be stronger compared to L1 speech, because for L2 speech constructing passives involves processes that are not yet as proceduralized as in the L1. However, this hypothesis was not borne out by the data. On the contrary, we found that the effect of complexity on hesitation occurrence was stronger in the L1 than in the L2.
Note that this stronger effect of complexity in L1 speech can be explained by the fact that, in L2 speech, participants hesitated much more in the active sentences than they did in the L1. Hence, the difference in number of hesitations between actives and passives became smaller. In other words, both active and passive production in the L2 led to comparable processing difficulties, in turn leading to hesitations in both conditions. In the L1, however, it was mainly while constructing the more complex passives that speakers used a hesitation.
We can conclude that hesitations are telling of underlying difficulty in speech production. In L1 speech, syntactic complexity therefore leads to more hesitations. In the L2, on the other hand, because producing active sentences also leads to processing difficulties, the effect of syntactic complexity on disfluencies is less strong. Segalowitz (2010) indicates stages of speech production that are vulnerable to processing difficulties and therefore denotes these as ‘fluency vulnerability points’ in speech production. The current study manipulated one specific stage of speech production, namely (morpho)syntactic encoding, and found that indeed, more disfluencies are likely to occur, but that articulation rate is not affected by difficulty. Apparently, these two separate measures for fluency are affected by separate aspects in speech planning. Because speakers plan their speech incrementally (Kempen and Hoenkamp, 1987), speakers start articulation before they have finished planning their utterance entirely. When speaking in the L1, this means that speakers run into trouble slightly more often when uttering a passive (the less preferred structure), which may lead to hesitating mid-utterance. When speaking in the L2, speakers have higher chances of running into trouble anyway, leading to hesitations mid-utterance for active sentences as well.
With this short research note we wish to show that by carrying out research in which one stage of speech production is manipulated in terms of difficulty or complexity, we will ultimately be able to make hesitations in speech telling of difficulty in underlying processes, and tease apart such effects for L1 and L2 speech production. Of course, the current short report is only the start of such research, limited to intermediate to advanced speakers of an L2 that is typologically rather close to the L1, and limited to one specific operationalization of syntactic complexity.
Appendix
English and Dutch stimuli (words only).
| No. | English verb | English agent | English patient | Dutch verb | Dutch agent | Dutch patient |
|---|---|---|---|---|---|---|
| 1 | to ask | the student | the question | vragen | de student | de vraag |
| 2 | to ruin | the chef | the food | verpesten | de kok | het eten |
| 3 | to paint | the painter | a model | schilderen | de schilder | een model |
| 4 | to serve | the waiter | a drink | serveren | de ober | een drankje |
| 5 | to explain | the professor | the theory | uitleggen | de professor | de theorie |
| 6 | to repair | the mechanic | the car | repareren | de monteur | de auto |
| 7 | to play | the singer | the piano | spelen | de zangeres | piano |
| 8 | to prepare | the chef | the dinner | bereiden | de kok | het eten |
| 9 | to empty | the cleaner | the bin | legen | de schoonmaker | de vuilnisbak |
| 10 | to bake | the baker | a cake | bakken | de bakker | een taart |
| 11 | to hit | the golfer | the ball | slaan | de golfer | de bal |
| 12 | to present | the man | the theory | presenteren | de man | de theorie |
| 13 | to eat | the boy | the cereal | eten | de jongen | de cornflakes |
| 14 | to drive | the policeman | a car | besturen | de agent | een auto |
| 15 | to write | the doctor | a letter | schrijven | de dokter | een brief |
| 16 | to sweep | the woman | the floor | vegen | de vrouw | de vloer |
| 17 | to make | the director | a movie | maken | de regisseur | een film |
| 18 | to give | the banker | the money | geven | de bankier | het geld |
| 19 | to cut | the barber | the hair | knippen | de kapper | het haar |
| 20 | to break | the boy | the window | breken | de jongen | de ruit |
Declaration of Conflicting Interest
The authors declare that there is no conflict of interest.
Funding
This work was supported by research grant “Oral fluency: production and perception” from Pearson Language Tests awarded to NH De Jong.
