Abstract
Previous studies show that infants store functional morphemes for inferring syntactic categories of adjacent words, and they generally perform better with nouns than with verbs. In this study, we tested whether toddlers can exploit phrasal groupings for syntactic categorization in the face of noisy co-occurrence patterns. Using a visual fixation procedure, we examined whether Mandarin-learning 19-month-olds can categorize word X to the left of functional morpheme a in a prosody-neutral 3-word sequence X-a-Y, where a structurally selects X (X and Y being unfamiliar words). Infants at 19 months were familiarized either with X-ye-Y (‘even XN YV’) or with X-le-Y (‘have XV-ed YN’). While le features a more mixed distribution than ye, 19-month-olds succeeded with both ye and le by preferring grammatical new contexts of X over ungrammatical ones, consistent with the hypothesis that phrasal groupings ([Xa. . .]) support syntactic categorization. Our findings provide initial evidence for infants’ ability to capture functional morphemes for backward syntactic categorization.
Introduction
Understanding how infants assign words to syntactic categories like nouns and verbs is crucial for theories of language acquisition. It bears on how infants interpret the combinatorics of their mother tongue. As a pertinent account, the prosody-functor bootstrapping hypothesis (Shi, 2014) highlights how the finite set of perceptually and distributionally distinct functional elements allows infants to segment speech and perform abstract syntactic analyses, thereby assigning syntactic categories to neighboring novel words (for earlier versions of the model, see Christophe et al., 1997, 2008; Morgan et al., 1996). The present study asks whether Mandarin-learning infants can infer the syntactic categories of words preceding functional morphemes and whether such backward syntactic categorization is feasible with functional morphemes featuring mixed co-occurrence patterns.
Although children’s early production data point to the lack or optional use of function words and inflectional morphology before 3 years of age (Brown, 1973; Wexler, 1994), studies on infant speech perception reveal their sensitivity to functional elements as a distinct category from the onset of acquisition. It has been found that 1- to 3-day-old newborns can distinguish function words from content words (Shi et al., 1999). After 6 months of age, they are able to track and represent some function words for speech segmentation (e.g. Höhle & Weissenborn, 2003; Shi & Lepage, 2008; Shi et al., 2006), and by around 11 months, they have become sensitive to many functional elements in their native language (e.g. Hallé et al., 2008; Marquis & Shi, 2012; Shi et al., 2006). From about age 1, infants start to detect sentences with misplaced functional morphemes (Shady, 1996; Soderstrom et al., 2007) and track discontinuous dependencies between two functional morphemes (Höhle et al., 2006; Santelmann & Jusczyk, 1998; Van Heugten & Shi, 2010).
In view of children’s early sensitivity to functional elements, the prosody-functor bootstrapping hypothesis suggests that these functional morphemes can be exploited by infants to assign syntactic categories to neighboring novel words (cf. Shi, 2005). As functional morphemes occur frequently at the edges of prosodic units (Maratsos & Chalkley, 1980) and are characterized by short syllable duration, null coda, and weak prosody (Morgan et al., 1996; Shi et al., 1998), infants can zoom in on phrase structures and derive abstract representations for parsing tasks (cf. Cann, 2000). In addition, the hypothesis assumes that the fundamental functional-contentive syntactic distinction guarantees that functional elements can be stripped off from input and stored in long-term memory so that in unfamiliar contexts, infants can extract and categorize ‘novel’ content words adjacent to functional elements.
Potential cues to syntactic categorization
Regarding syntactic categorization, infants can capitalize on two major distribution patterns of functional morphemes, namely, frames and bigrams (for discussion of the two types of distributional information, see Mintz et al., 2014). Frequent frames specify the syntactic category of the target (X) with joint cues from left and right functional morphemes in full utterances, that is, . . . aXb . . . (Mintz, 2003). In contrast, bigrams offer one particular functional morpheme as the only structural cue infants can rely on to infer the syntactic category of an adjacent target either to its right (i.e. aX), as a case of forward categorization, or to its left (i.e. Xa), as a case of backward categorization.
Previous research has documented infants’ use of bigrams in categorizing nonsense words. Using a variation of the Headturn Preference Procedure, Höhle et al. (2004) found that German-learning 16-month-olds only succeeded in categorizing novel nouns with ‘determiner + noun’ sequences (e.g. ein pronk ‘a pronk’) but not novel verbs in ‘pronoun + verb’ sequences (e.g. sie pronk ‘she pronks’). Specifically, only infants familiarized with ‘determiner + X’ sequences showed differentiation between test passages presenting the novel word X as a noun (e.g. Das kleine Kind vergaß den Pronk dort ‘The little child forgot the pronk there’) and as a verb (e.g. Meistens pronk er auf der großen Lichtung ‘Most of the time he pronk[ed] in the big clearing’). Using a similar task that presented novel words only in bigram contexts controlled for prosodic cues, Shi and Melançon (2010) directly investigated whether distributional cues furnished by the functional morphemes alone conduce to infants’ syntactic categorization and found that French-learning 14-month-olds likewise categorized nouns but not verbs in their study. That is, only ‘determiner + noun’ sequences (e.g. des miges ‘some miges’ and ton mige ‘your mige’ for the noun group) but not ‘pronoun + verb’ sequences (e.g. je mige ‘I mige’ and il mige ‘he miges’ for the verb group) during familiarization led infants to discriminate between noun contexts (e.g. le mige ‘the mige’) and verb contexts (e.g. tu miges ‘you mige’) during test.
It seems that infants are generally better at inferring nouns than verbs with functional morphemes in bigrams, which might be attributable to the co-occurrence patterns of functional elements with nouns and verbs in child-directed speech (Mintz et al., 2002). The co-occurrence relation between a subject pronoun and a verb, for instance, appears to be weaker than that between a determiner and a noun in the input (Höhle et al., 2004). According to Höhle et al. (2004), in German child-directed input, the subject pronoun sie is followed by a verb only 31% of the time, whereas the determiner ein is followed by a noun 71% of the time.
High local co-occurrence, however, is not a guarantee for successful syntactic categorization. In French, for instance, pronouns constitute the most frequent kind of subject in caregiver speech and are even used dominantly in sentences containing a subject noun phrase, a proper noun subject (e.g. Les pattes, elles vont en bas ‘The legs, they go down’; Sarah, elle sait pas ‘Sarah, she doesn’t know’), or a stressed pronoun (e.g. Moi, je mange ‘Me, I eat’; Tu dis rien, toi ‘You say nothing, you’) by means of dislocation and subject doubling (Legendre et al., 2010; Shi et al., 2020). In fact, the probability of a determiner predicting a following noun is comparable to that of a pronoun predicting a following verb in analysis of large-scale input corpora, which is inconsistent with the poorer performance of verb categorization in French-learning 14-month-olds (see the discussion in Shi, In Press).
Conversely, there are indications that infants can categorize verbs using functional morphemes that have much lower predictive power of a verb than pronouns do. Hicks (2006) showed that English-learning 14- to 18-month-olds categorized novel words as verbs after being familiarized with ‘auxiliary + verb’ sequences (e.g. can pell, will dak). During test, they discriminated between novel words following non-familiarized auxiliaries (e.g. will pell) and those following determiners (e.g. her dak). This contrasts markedly with German- and French-learning infants’ lack of categorization with ‘pronoun + verb’ sequences (Höhle et al., 2004; Shi & Melançon, 2010). Based on a CHILDES corpus analysis, we found that auxiliaries are followed by verbs only 12% of the time. In addition, we investigated the distribution of will and observed a different pattern: will predominantly precedes a noun but not a verb (as in yes-no questions like Will mommy put them in the basket), whereas its contracted form ’ll reliably precedes a verb (as in I’ll read to you later). This implies that in generalizing combinatorial rules, infants do not merely track the specific form in co-occurrence patterns. It also points to the possibility that infants may represent bigrams as a syntactic unit (i.e. [AuxP willAux [VP XV]] where the functional element selects the novel word), not as a co-occurrence pattern disrespecting constituency (i.e. will XN).
We propose phrase boundary as a pivotal factor for predicting infants’ syntactic categorization. That is, infants may represent word sequences as clusters of phrases and categorize adjacent words that fall within the phrase marked by the functional morpheme. Along this line, a determiner and a noun form a cohesive phrasal unit favorable for syntactic categorization (e.g. [DP theDet [NP XN]]), whereas a pronoun itself forms a complete noun phrase disconnected from its following verb (e.g. [NP shePron][VP XV]). This also explains the case of ‘auxiliary + verb’, where the auxiliary entails its following verb as an integral part of its structure (e.g. [AuxP willAux [VP XV]]). Further experiments are needed to investigate whether phrase boundary supports syntactic categorization regardless of co-occurrence uncertainty.
Current study: syntactic categorization within phrase boundaries
In this study we aim to further understand infants’ syntactic categorization. To do this, we tested whether Mandarin-learning 19-month-olds can infer the syntactic properties of unfamiliar words preceding monosyllabic functional morphemes that do not occur at the edge of utterances. Infants from 6–8 months of age begin to track and represent functional items based on their high frequency of occurrence and to use them for categorization from age 1, yet it is unclear whether backward syntactic categorization is possible as their parsing ability matures. For Mandarin-learning infants, one representative study by Zhang et al. (2015) demonstrates that 12-month-olds can use preceding function words in bigrams to classify words (wo-de X ‘my X’, zhe-ge X ‘this-classifier X’ presenting X as a noun, and wo ye X ‘I also X’, ni bie X ‘you don’t X’ presenting X as a verb), while another study by Zhang et al. (2014) provides evidence for Mandarin-learning 11- to 14-month-olds’ use of frequent frames like zai _ shang (at _ on) and ge _ shi (general classifier _ be) for noun categorization and those like jiu _ le (just _ aspect-marker) and mei _ guo (not _ experiential aspect-marker) for verb categorization. As these studies either looked at forward syntactic categorization with bigrams or at syntactic categorization with frames, it is difficult to infer whether backward categorization alone plays a role in infants’ inference of syntactic categories.
We thus ask whether functional elements following nouns and verbs alone might assist infants in syntactic categorization. Such an approach to categorizing words is theoretically promising, given the abundance of inflectional functional morphemes in human languages. Furthermore, we attempt to probe whether phrase boundaries may assist infants in finding informative structures amid co-occurrence patterns. To answer these two questions, we created 3-word sequences X-a-Y that enabled both backward categorization (X-a) and forward categorization (a-Y). The functional morpheme a in the middle was either the focus marker ye, predicting X to be a noun, or the aspect marker le, predicting X to be a verb.
The two functional morphemes ye and le are suitable for two reasons. First, both morphemes are common in children’s input and structurally require a preceding lexical element, which might motivate learners to track their distributions for backward categorization. Functionally close to even, the noun predictor ye is a free morpheme carrying a neutral tone in continuous speech. It is semi-cliticized to a preceding focus element, typically a noun phrase referring to either the subject as in (1a) or the fronted object as in (1b) (e.g. Constant & Gu, 2010; Hole, 2004).
(1) a. [meimei]F ye chi mugua
younger-sister FOC eat papaya
‘Even YOUNGER SISTER eats papaya (least likely among people who eat papaya).’
b. meimei [mugua]F ye chi
younger-sister papaya FOC eat
‘Younger sister even eats PAPAYA (least likely among the fruits she eats).’
The verb predictor aspect marker le (or V-le) is a bound morpheme indicating realization or perfectivity (e.g. Chao, 1968; Li & Thompson, 1981). It carries a neutral tone and requires a subsequent complement, either an object or a duration phrase (Lu, 1980; Zhu, 1982).
(2) a. women kan-le san-ben shu
we read-PERF three-CL book
‘We (have) read three books.’
b. women shui-le san-ge xiaoshi
we sleep-PERF three-CL hour
‘We (have) slept for three hours.’
Second, these two functional morphemes differ in the consistency of their distributional environments, which allows us to test whether phrase boundary can rescue syntactic categorization given significant noise in co-occurrence patterns. In previous artificial grammar studies, 1-year-old infants were able to learn distributional patterns when the input contained certain proportions of noise. Specifically, infants in Koulaguina and Shi (2013, 2019) learned movement patterns when the training input contained 100% or 80% rule-consistent exemplars, but not when rule exemplars were reduced to 50%. Gomez and Lakusta (2004) showed that infants were able to categorize novel words into form classes when the co-occurrence pattern between these words and the adjacent functors was fully consistent in the training input and when the input was mixed with 17% violations; however, no learning was observed when the input contained 33% violations. These findings suggest that infants are sensitive to co-occurrence probabilities. In fact, even at 8 months of age infants can compute transitional probabilities between syllables and use the information to determine whether the elements form a cohesive unit or not (Saffran et al., 1996). In light of this literature, we quantified the co-occurrence context of ye and le in Mandarin by calculating their bigram probabilities in Tong’s corpus (Deng & Yip, 2018). Indeed, ye is consistently preceded by a noun (95.7% by token frequency, 90% by type frequency), whereas words preceding le can be verbs (47.6% by token frequency, 42.6% by type frequency), nouns (16.7% by token, 23.5% by type), or other categories such as adjectives (35.7% by token, 33.9% by type). Therefore, toddlers should be expected to perform better in the categorization task with ye than with le, if their extraction of regularities is driven solely or largely by statistical coherence in local co-occurrence.
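The bigram analysis described above can be illustrated with a minimal Python sketch. It is not the procedure applied to Tong’s corpus: the function name `preceding_category_distribution`, the category tags, and the toy utterances are hypothetical and serve only to show how token-based and type-based proportions can be derived for the word immediately preceding a functional morpheme.

```python
from collections import Counter


def preceding_category_distribution(tagged_utterances, morpheme):
    """Return (token_shares, type_shares) for the categories of the words
    that immediately precede `morpheme`.

    tagged_utterances: list of utterances, each a list of (word, category)
    pairs. The tagging scheme is an assumption of this sketch.
    """
    token_counts = Counter()   # category -> number of preceding tokens
    type_sets = {}             # category -> set of distinct preceding words
    for utterance in tagged_utterances:
        # Walk over adjacent pairs (previous word, current word).
        for (prev_word, prev_cat), (word, _) in zip(utterance, utterance[1:]):
            if word == morpheme:
                token_counts[prev_cat] += 1
                type_sets.setdefault(prev_cat, set()).add(prev_word)

    total_tokens = sum(token_counts.values())
    if total_tokens == 0:
        return {}, {}
    total_types = sum(len(words) for words in type_sets.values())
    token_shares = {c: n / total_tokens for c, n in token_counts.items()}
    type_shares = {c: len(w) / total_types for c, w in type_sets.items()}
    return token_shares, type_shares


# Hypothetical toy input, not taken from any corpus.
toy_corpus = [
    [("women", "N"), ("kan", "V"), ("le", "ASP"), ("shu", "N")],
    [("women", "N"), ("shui", "V"), ("le", "ASP"), ("xiaoshi", "N")],
    [("ta", "N"), ("kan", "V"), ("le", "ASP"), ("shu", "N")],
    [("hua", "N"), ("hong", "ADJ"), ("le", "ASP")],
]
token_shares, type_shares = preceding_category_distribution(toy_corpus, "le")
```

In this toy input the verb kan precedes le twice, so verbs account for three of the four preceding tokens but only two of the three preceding types, which illustrates why the token and type percentages reported for the same morpheme can differ.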
If infants exploit phrase boundary in parsing, however, both ye and le should lead to successful categorization of X, as either of them structurally entails its preceding word X without crossing phrase boundaries. That is, if infants are sensitive to the clustering of words and morphemes within a particular phrase, namely, knowing that these functional morphemes share a phrase with their preceding word X (i.e. [FocP XN yeFoc. . .] and [AspP XV leAsp. . .]), they should categorize X retrospectively upon identifying the corresponding phrase structure.
Method
Participants
Our sample consisted of 24 monolingual, normally developing Mandarin-learning infants (13 boys, 11 girls; M age: 19 months 18 days; range: 18 months 18 days to 20 months 25 days). These infants were randomly assigned to one of the two experimental conditions: the ye condition or the le condition (ye: 7 boys, 5 girls; M age: 19 months 14 days; range: 18 months 18 days to 20 months 25 days; le: 6 boys, 6 girls; M age: 19 months 22 days; range: 19 months to 20 months 9 days).
Another 13 infants participated but their data were eliminated from the final analysis due to fussiness (3), failure to stay on task (i.e. no recovery of looking time to the post-trial relative to the last test trial; 8), ceiling performance (1), and researcher misoperation (1).
Stimuli
We created X-a-Y utterances for the familiarization phase. As shown in Table 1, there were four different utterances for the ye condition (i.e. XN-ye-YV ‘even X Y’), and four different utterances for the le condition (i.e. XV-le-YN ‘have X-ed Y’). Each of the utterances was recorded three times. Therefore, we had a total of 12 tokens (i.e. 4×3 = 12) of X-ye-Y sequences, and 12 tokens of X-le-Y sequences. The targets for categorization (i.e. X) were two monosyllables, shai1 and man2, which are phonologically unbiased toward either nouns or verbs (shai1 ‘sieve; to sift’; man2 ‘eel; to conceal’), while the filler words (i.e. Y) were disyllabic words used in Zhang et al. (2015), including tongji ‘calculation; to calculate’ and jianyan ‘examination; to examine’. These words were chosen because, first, they are unfamiliar abstract words that do not exist in the input (e.g. not found in Tong’s corpus; Deng & Yip, 2018), and second, they are ambicategorial between nouns and verbs, hence compatible with either type of contexts. The test phase involved two utterances of shi zhe-ge XN (‘be this-CL X’) for noun contexts, and two utterances of dou keyi XV (‘all may X’) for verb contexts, as shown in Table 1. Each utterance was recorded four times, thus yielding a total of eight tokens of shi zhe-ge XN utterances (i.e. 2 × 4 = 8), and eight tokens of dou keyi XV utterances.
Table 1. Familiarization conditions and test items in the experiment.
The recording was made in an acoustic chamber (sampling frequency 44.1 kHz, bit depth 24 bits). A native Mandarin female speaker produced the utterances using infant-directed speech. The exemplars of the familiarization and the test utterances varied in their intonation patterns, which prevented toddlers from deriving patterned structures out of an invariant sentential prosody and compensated for the monotony of repetitious stimuli. In addition, the functional morphemes ye versus le had comparable prosodic features, as did the target words as nouns versus as verbs in the familiarization utterances (see the acoustic measures in Table 2 in Appendix 1). The target words as nouns versus verbs in test utterances were also acoustically comparable (see Table 3 in Appendix 1). These controls allowed us to specifically test whether utterance-medial functional morphemes per se enable toddlers to categorize the preceding word.
An animation of a rabbit chasing a running carrot on a rotating moon, together with a piece of instrumental music, was used as the attention-getter. A lip-synched puppet was created as the visual stimuli for the familiarization phase and the test phase. Besides, a green meadow accompanied by light music served as the stimuli for the post-test trial.
Design
Toddlers were assigned randomly to one of the two conditions (12 infants for each condition), as is shown in Table 1.
As Table 1 indicates, toddlers in the ye condition were familiarized with four different utterances of X-ye-Y supporting the target X as a noun, while those in the le condition were familiarized with four utterances of X-le-Y supporting the target X as a verb. The interstimulus interval (ISI) was 1200 ms. The familiarization phase ended when a toddler’s total looking time (excluding the duration of lookaways) reached 30 seconds.
Subsequently, toddlers in either familiarization condition entered the test phase presenting ten test trials that alternated in context types (i.e. five trials for each context type), that is, shi zhege shaiN and shi zhege manN for noun contexts, and dou keyi shaiV and dou keyi manV for verb contexts. Within each test trial, the two utterances were presented repeatedly in a random order. Note that the grammaticality of the same test trial was reversed for toddlers assigned to different familiarization conditions (i.e. trials grammatical for the ye condition were ungrammatical for the le condition). As the contexts of target words during the test either matched or mismatched their syntactic category during familiarization, half of the trials would be grammatical and the other half ungrammatical. The grammaticality of the first test trial was counterbalanced across toddlers within each condition. The maximum length of each test trial was 17.6 seconds, with an ISI of 1000 ms.
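The test design can be summarized in a short Python sketch. It is only an illustration under stated assumptions: the function name `build_test_sequence` is hypothetical, and strict alternation of the two context types across the ten trials is our reading of the description rather than a documented detail of the experimental software.

```python
def build_test_sequence(condition, first_grammatical, n_trials=10):
    """Return a list of (context_type, is_grammatical) tuples.

    condition: "ye" (X familiarized as a noun) or "le" (X familiarized as a verb).
    first_grammatical: whether the first test trial is grammatical for this
    toddler (counterbalanced across toddlers within each condition).
    """
    if condition not in ("ye", "le"):
        raise ValueError("condition must be 'ye' or 'le'")
    # Noun contexts match the ye familiarization; verb contexts match le.
    grammatical_context = "noun" if condition == "ye" else "verb"
    other_context = "verb" if grammatical_context == "noun" else "noun"
    first = grammatical_context if first_grammatical else other_context
    second = other_context if first_grammatical else grammatical_context

    sequence = []
    for i in range(n_trials):
        context = first if i % 2 == 0 else second  # alternate context types
        sequence.append((context, context == grammatical_context))
    return sequence


# The same physical trial order is grammatical-first for one condition
# and ungrammatical-first for the other.
seq_le = build_test_sequence("le", first_grammatical=True)
seq_ye = build_test_sequence("ye", first_grammatical=False)
```

Because grammaticality is defined relative to the familiarization condition, the two example sequences contain identical context types in identical order while carrying opposite grammaticality labels on every trial.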
The contexts used in the familiarization and test phases did not overlap, as in previous studies (Höhle et al., 2004; Shi & Melançon, 2010). It is also worth noting that our target words were in an utterance-initial position during familiarization but in an utterance-final position during the test. The phrase structures in the familiarization and test phases were distinct. This allows us to see whether toddlers have categorized the targets and generalized the categories to varied syntactic positions.
Procedure
The experiment was conducted using the Visual Fixation Procedure (Cooper & Aslin, 1990), a modified version of the Headturn Preference Procedure (Kemler Nelson et al., 1995). Each toddler was guided by a lab assistant into a dimly lit acoustic chamber and seated on the parent’s lap in front of a monitor, with loudspeakers on the two sides playing the auditory stimuli. Prior to testing, the parent was instructed to put on headphones through which masking music was played to prevent him or her from influencing the toddler. The experiment was run by a program designed to present trials in a prearranged order and to record and calculate looking times of the toddler simultaneously. All trials were infant-controlled in the sense that they were initiated by the toddler’s gaze at the central screen and terminated once he or she looked away from the screen for 2 seconds or more.
At the beginning, an attention-getter was presented so that the toddler fixated on the center of the screen. The familiarization trial then began and terminated only after a predetermined total looking time of 30 seconds was reached (for the familiarization criterion, see Shi & Melançon, 2010). If the toddler looked away for more than 2 seconds in the middle of the familiarization, the attention-getter would appear, but as he or she looked back to the screen, the familiarization utterances would resume. For the subsequent test trials, an attention-getter was inserted between any two trials to bring the toddler’s fixation back to the screen. Each test trial would terminate if the toddler looked away for at least 2 seconds or if the maximum trial length (17.6 seconds) was reached. Finally, a post-test trial ensued to verify the toddler’s full participation in the preceding test trial. The fixation data were deemed valid for analysis only if looking time during the post-test trial increased relative to the last test trial.
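The infant-controlled termination rule for test trials and the post-test validity criterion can be sketched as follows. This is a simplified illustration, not the program used in the experiment: the function names, the representation of gaze as (duration, is_looking) segments, and the assumption that brief lookaways count toward the trial clock are all hypothetical.

```python
def trial_looking_time(segments, lookaway_limit=2.0, max_length=17.6):
    """Accumulate looking time (seconds) for one test trial.

    segments: ordered list of (duration_in_seconds, is_looking) pairs.
    The trial ends at the first lookaway lasting at least `lookaway_limit`
    seconds or once `max_length` seconds have elapsed. Lookaway time is
    excluded from the returned looking time.
    """
    elapsed = 0.0
    looking = 0.0
    for duration, is_looking in segments:
        remaining = max_length - elapsed
        if remaining <= 0:
            break  # maximum trial length reached
        if is_looking:
            span = min(duration, remaining)
            looking += span
            elapsed += span
        else:
            if duration >= lookaway_limit:
                break  # lookaway long enough to terminate the trial
            elapsed += min(duration, remaining)
    return looking


def session_is_valid(last_test_looking, post_test_looking):
    """Data are retained only if looking time recovers on the post-test trial."""
    return post_test_looking > last_test_looking


# Hypothetical gaze record: two looks separated by a brief lookaway, then a
# lookaway of 2.5 seconds that ends the trial before the final look.
example_segments = [(5.0, True), (1.0, False), (4.0, True),
                    (2.5, False), (3.0, True)]
example_time = trial_looking_time(example_segments)
```

In the example record only the first two looks are counted, because the 2.5-second lookaway meets the termination criterion before the last look begins.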
Predictions
If toddlers used functional morphemes to categorize target words during the familiarization phase, they were expected to discriminate between grammatical and ungrammatical test trials; otherwise, their average looking times to grammatical and ungrammatical test trials should not differ significantly. Specifically, if their certainty of categorization follows strictly from local co-occurrence, they should show better categorization (i.e. significant looking time difference for grammatical versus ungrammatical test trials) in the ye condition than in the le condition. If they exploit phrase boundary in constraining their analysis, they should succeed in both conditions with comparable grammaticality discrimination.
Results
Looking times during the two types of test trials were analyzed, with the results shown in Figure 1.

Figure 1. Mean Looking Times (and Standard Errors) During the Two Test Trial Types (Grammatical Versus Ungrammatical) for the ye and the le Familiarization Conditions (ye condition: p = .014; le condition: p = .029).
We calculated each toddler’s average looking time per trial for grammatical trials and for ungrammatical trials, respectively. The data were then analyzed in a 2 × 2 mixed ANOVA, with Grammaticality as the within-subject factor (grammatical vs ungrammatical) and Familiarization Condition (ye vs le) as the between-subject factor. Our results showed that there was no main effect of Familiarization Condition, F(1, 22) = 2.919, p = .102, but a significant effect of Grammaticality, F(1, 22) = 13.428, p = .001, η² = .055. Furthermore, the interaction between Grammaticality and Familiarization Condition was not significant, F(1, 22) = .342, p = .565.
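For a design with two equally sized groups and two within-subject levels, the F ratios for the within-subject factor and for the interaction can both be derived from each toddler’s difference score (grammatical minus ungrammatical). The Python sketch below illustrates this relationship. It is not the authors’ analysis script, the function name `mixed_anova_2x2` is hypothetical, and the input values are invented for illustration only.

```python
from statistics import mean


def mixed_anova_2x2(group_a, group_b):
    """Within-factor and interaction F ratios for a 2 x 2 mixed design.

    group_a, group_b: lists of (grammatical, ungrammatical) mean looking
    times, one pair per toddler. This sketch assumes equal group sizes.
    Returns (f_within, f_interaction, df_error); both F ratios have
    (1, df_error) degrees of freedom.
    """
    if len(group_a) != len(group_b):
        raise ValueError("this sketch assumes equal group sizes")
    diffs_a = [g - u for g, u in group_a]
    diffs_b = [g - u for g, u in group_b]
    n = len(diffs_a)
    total_n = 2 * n
    df_error = total_n - 2

    mean_a, mean_b = mean(diffs_a), mean(diffs_b)
    # Pooled within-group variance of the difference scores (error term).
    ss_error = (sum((d - mean_a) ** 2 for d in diffs_a)
                + sum((d - mean_b) ** 2 for d in diffs_b))
    pooled_var = ss_error / df_error

    # Within-subject main effect: is the grand mean difference zero?
    grand_mean = (mean_a + mean_b) / 2
    f_within = total_n * grand_mean ** 2 / pooled_var
    # Interaction: do the two groups differ in their mean difference?
    f_interaction = (mean_a - mean_b) ** 2 / (pooled_var * 2 / n)
    return f_within, f_interaction, df_error


# Invented looking times (seconds), three toddlers per group.
toy_ye = [(11.0, 10.0), (12.0, 10.0), (13.0, 10.0)]
toy_le = [(9.0, 9.0), (10.0, 9.0), (11.0, 9.0)]
f_within, f_interaction, df_error = mixed_anova_2x2(toy_ye, toy_le)
```

With 12 toddlers per condition the error term has 22 degrees of freedom, which matches the F(1, 22) statistics reported above. The interaction F is the square of an independent-samples t-test on the difference scores, so a non-significant interaction means that the size of the grammaticality preference did not differ reliably between the two conditions.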
The significant effect of Grammaticality and the lack of Grammaticality × Condition interaction indicate that toddlers familiarized with either condition showed comparable patterns of response: a preference for the grammatical trials over the ungrammatical trials. The looking time difference between grammatical and ungrammatical trials did not differ significantly between the two conditions. Toddlers in the ye condition looked significantly longer to the grammatical trials (M = 12.547 seconds, SD = 2.431 seconds) than to the ungrammatical trials (M = 11.349 seconds, SD = 2.612 seconds), paired t(11) = 2.913, p = .014, Cohen’s d = .841, two-tailed. The same looking preference for grammatical trials (M = 10.852 seconds, SD = 3.175 seconds) over ungrammatical trials (M = 9.199 seconds, SD = 3.346 seconds) was found for the le condition, paired t(11) = 2.503, p = .029, Cohen’s d = .722, two-tailed.
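The reported effect sizes are numerically consistent with a common convention for paired designs, in which Cohen’s d equals the paired t statistic divided by the square root of the number of pairs. The following Python sketch checks this. That the authors used this particular formula is an inference from the reported numbers, not something stated in the text, and the function name is hypothetical.

```python
import math


def cohens_d_from_paired_t(t_value, n_pairs):
    """Cohen's d for a paired design, computed as t / sqrt(n)."""
    return t_value / math.sqrt(n_pairs)


# Reported paired t statistics, 12 toddlers per condition.
d_ye = cohens_d_from_paired_t(2.913, 12)
d_le = cohens_d_from_paired_t(2.503, 12)
```

Both values fall within about .001 of the reported effect sizes (.841 and .722). The small residual is expected because the calculation starts from t values that were already rounded to three decimals.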
In sum, toddlers in both conditions showed a distinction for the two types of test trials – they preferred grammatical trials over ungrammatical ones. These findings confirm the prediction by phrase boundary that toddlers use ye and le equally well to categorize their preceding words. Contrary to the prediction by co-occurrence, the less reliable distributions of verbs with le (relative to nouns with ye) in the input did not hinder verb categorization in 19-month-olds.
Discussion
Using the visual fixation procedure, the present study demonstrates that toddlers can exploit a functional morpheme as the only cue to infer the syntactic category of its preceding word. Mandarin-learning 19-month-olds succeeded in using the focus marker ye to categorize a preceding noun and the aspect marker le to categorize a preceding verb. Specifically, those familiarized with X-ye-Y sequences preferred to listen to test trials presenting X as nouns over those presenting X as verbs, whereas those familiarized with X-le-Y sequences listened significantly longer to trials presenting X as verbs than to those presenting X as nouns. Instead of favoring specific contexts during the test, toddlers in both conditions showed a preference for the corresponding grammatical trials (new verb contexts following the le familiarization and new noun contexts following the ye familiarization). We discuss in this section how these findings can be interpreted in relation to foregoing studies.
Our study supports the prosody-functor bootstrapping hypothesis and shows that 19-month-olds can exploit functional morphemes for backward syntactic categorization. As human language allows functional elements to be attached to content words either to their left (e.g. function words like the in English or verb plural prefixes like /z-/ in French such as in /il z-ariv/ ‘they arrive’) or to their right (e.g. particles like up or suffixes like -ing), it is crucial for infants to track both forward and backward regularities for assigning syntactic categories. Despite the potential challenge of processing utterance-medial functional morphemes (cf. Newport, 1990; Sundara, 2018) and the atypical occurrence of head-final phrases (e.g. [[X]a]) in Mandarin, toddlers in this study succeeded in backward syntactic categorization, long before they show comprehension of these functional morphemes in processing experiments (Yang et al., 2018; Zhou et al., 2014).
Nineteen-month-olds’ recognition of specific functional morphemes and their preceding syntactic contexts has significant implications for toddlers’ differentiation of structures. In our experiment, two groups of Mandarin-learning toddlers assigned to the ye condition versus the le condition showed opposite preferences for the same test materials, which can only be explained by toddlers’ prompt parsing of the contrasting structures featuring ye and le, as illustrated below.
(3) a. XN yeFoc YV
[FocP shaiN
b. XV leAsp YN
[AspP [shaiV-
In (3a), the focus marker ye incorporates its preceding target word X (e.g. noun shai) as part of the Focus Phrase or FocP (cf. Shyu, 1995), whereas in (3b), the target word X (e.g. verb shai) is local to the phrase headed by the aspect marker le bearing aspectual information of the event (Huang et al., 2009). Despite the structural nuances induced by ye and le, two acoustically unremarkable medial functional morphemes, 19-month-olds can identify the functional morphemes and retrieve their corresponding syntactic environments without semantic cues. In view of the efficiency of their parsing, it is plausible that infants can abstract away subtler syntactic details around or even prior to 19 months of age.
This study also extends our knowledge of whether a variable distributional context might prevent infants from generalizing the right rules about phrase structures. As natural languages do not always present clean data for local distributional patterns, with many syntactic phenomena featuring long-distance dependencies (e.g. syntactic islands), infants must juggle statistical noise and structural mapping (for similar discussions, see Chomsky, 1965; Pinker, 1987). For our study, if infants only exploit structure-independent sequences of elements for syntactic categorization, they are expected to hit a wall in the le condition, given that le is locally compatible with words of multiple syntactic categories that obscure regularity extraction. However, if infants represent structure-dependent grammatical constructs like phrasal units (e.g. [AspP X leAsp . . .]), they will stand a chance to resolve local uncertainties from early on and categorize adjacent words through top-down inference. Our findings indicate that infants performed equally well in parsing with ye and le, despite their enormous difference in local distributional certainty. In this light, 19-month-olds most likely exercise some degree of linguistic intuition that compensates for the distributional uncertainty with local structures. That is, this intuition is available to infants for constraining structural analyses and teasing apart nuanced structures.
Using phrase boundary for constraining analyses and resolving co-occurrence dilemmas elucidates findings on infants’ syntactic categorization. That is, infants tend to categorize lexical elements inherent to the phrase featuring the functional element. For instance, it explicates why frames like zaiV _ shangPrep ([VP V [PP
By investigating backward syntactic categorization, the study provides further insight into the debate on continuity in language acquisition. Specifically, when zooming in on infants’ perception of functional elements across sentential positions in the absence of meaning, we discover that they are remarkably adept at using functional elements for computing the combinatorics of neighboring elements consistent with phrase structures (see also Massicotte-Laforge & Shi, 2015, 2020), even when the target structures (i.e. XV-le-YN) are infrequent in the input (6 out of 126 occurrences of le in Tong’s corpus) and not yet present in early speech production (cf. Fan & Song, 2013; Peng, 2016; Zhang & Wang, 2009). These observations provide cross-linguistic evidence supporting the continuity hypothesis of language acquisition in the sense that they document infants’ perception and representation of functional categories from the onset of development, where infants bootstrap phrase structures and generalize rules for parsing larger units before linking meanings to forms (Dye et al., 2019; Gleitman, 1990; Naigles, 2002). While functional categories form the core of formal syntax in motivating syntactic operations such as relocating lexical elements for checking inflectional and scope-discourse features (cf. Rizzi & Cinque, 2016), they also figure prominently in language processing for both adults (e.g. Brown et al., 1999) and infants (cf. Shi, 2014, in press). This convergence of theory and empirical research has major implications for how we assess the assumption that children’s representation of functional categories is part of their innate or gradually matured grammar (e.g. Borer & Rohrbacher, 1997; Poeppel & Wexler, 1993; Radford, 1995; for a discussion, see Guasti, 2002).
In light of findings on parsing at early developmental stages, infants’ knowledge of how functional elements relate to syntactic structures could be much more sophisticated than previously expected.
In our study, we used rare content words unknown to toddlers in order to directly test their use of functional morphemes and their early structural knowledge. What remains uncertain is whether 19-month-olds use some meanings of the functional morphemes in their structural analysis of larger utterances. Although assessing the semantic representations of functional morphemes at young ages is methodologically challenging, it is theoretically important to do so in future research.
To conclude, this study addresses the role of functional morphemes in toddlers’ syntactic categorization under the framework of the prosody-functor bootstrapping hypothesis. The results demonstrate that Mandarin-learning 19-month-olds can exploit an utterance-medial functional morpheme a in a 3-word sequence X-a-Y to categorize the preceding target word X. Their success in categorizing nouns with ye and verbs with le suggests that backward syntactic categorization works for both nouns and verbs. Moreover, 19-month-olds show evidence of overcoming the dilemma posed by mixed co-occurrence patterns, which supports the assumption that they constrain their analyses by categorizing words within the phrase boundary of functional morphemes. In general, our findings offer new insights into children’s early knowledge of functional elements and how they harness these structural ‘anchors’ for parsing without recourse to meaning.
Footnotes
Appendix 1
Test stimuli: mean acoustic values (and SDs) of the target words.
| Acoustic measure | Noun uses | Verb uses | Unpaired t-tests (two-tailed) |
|---|---|---|---|
| Total utterance duration (seconds) | 1.084 (0.095) | 1.032 (0.080) | t(14) = 1.198, p = .251 |
| TW duration (seconds) | 0.414 (0.029) | 0.436 (0.034) | t(14) = –1.386, p = .187 |
| TW vowel duration (seconds) | 0.289 (0.026) | 0.293 (0.038) | t(14) = –.221, p = .828 |
| TW vowel mean pitch (Hz) | 200.593 (21.151) | 219.161 (28.876) | t(14) = –1.467, p = .164 |
| TW vowel mean intensity (dB) | 82.935 (1.130) | 82.602 (1.200) | t(14) = .572, p = .576 |
TW = target word.
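As an illustrative check, the t statistics in the table can be approximately recomputed from the reported means and SDs. The sketch below is not the authors' analysis script. It assumes a pooled-variance (Student's) unpaired t-test and 8 items per condition, a group size inferred from the reported 14 degrees of freedom rather than stated in the table. The function name `pooled_t` and the constant `N_PER_GROUP` are hypothetical.

```python
import math

# ASSUMPTION: 8 items per condition, inferred from df = 14 (n1 + n2 - 2);
# the table itself does not state the group sizes.
N_PER_GROUP = 8


def pooled_t(mean1, sd1, mean2, sd2, n1, n2):
    """Return (t, df) for an unpaired t-test with pooled variance,
    computed from summary statistics only."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# (noun mean, noun SD, verb mean, verb SD), copied from the table above.
rows = {
    "Total utterance duration (s)": (1.084, 0.095, 1.032, 0.080),
    "TW duration (s)": (0.414, 0.029, 0.436, 0.034),
    "TW vowel duration (s)": (0.289, 0.026, 0.293, 0.038),
    "TW vowel mean pitch (Hz)": (200.593, 21.151, 219.161, 28.876),
    "TW vowel mean intensity (dB)": (82.935, 1.130, 82.602, 1.200),
}

for measure, (m1, s1, m2, s2) in rows.items():
    t, df = pooled_t(m1, s1, m2, s2, N_PER_GROUP, N_PER_GROUP)
    print(f"{measure}: t({df}) = {t:.3f}")
```

Because the table reports means and SDs rounded to three decimals, the recomputed values can differ slightly from the published ones, most visibly for the duration measures.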
Acknowledgements
The authors thank Jingying Xu, Han Hu, Ziqi Wang, Miao Miao, Deming Shi, and Xirong Hu of the Language Acquisition Lab for help with data collection and Dr Zhongda Yuan for technical help, as well as all the parents and children for their participation. Thanks are also due to two anonymous reviewers and editors for insightful and helpful comments.
Author contribution(s)
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The work presented in this paper was supported by the National Social Science Fund of China to Xiaolu Yang (21BYY019).
Ethical and consent statement
The parent of each infant gave informed consent prior to participation. The study was approved by the Ethics Committee of DFLL of Tsinghua University.
