Abstract
Aim and research question:
The aim of this study is to test Macswan’s ((1999). A minimalist approach to intrasentential code switching. New York, NY: Garland; (2000). The architecture of the bilingual language faculty: Evidence from intrasentential code-switching. Bilingualism: Language and Cognition, 3, 37–54; (2005). Codeswitching and generative grammar: A critique of the MLF model and some remarks on “modified minimalism”. Bilingualism: Language and Cognition, 8, 1–22.) PF Disjunction Theorem (PFDT), which was proposed based on Chomsky’s ((1995). The minimalist program. Cambridge, MA: MIT Press.) minimalist programme, to answer the following question: Is code-switching (CS) behaviour governed by CS-specific grammar or an innate mechanism that produces monolingual and bilingual utterances in our language faculty?
Methodology:
A quantitative approach was adopted to test the PFDT with the Southern Min/Mandarin CS data.
Data and analysis:
A total of 811 lexical items were extracted from 343 bilingual clauses in my Southern Min/Mandarin CS corpus. Almost no violation of this model (i.e., a word-internal switch) was found; the one exception was regarded as the informant’s slip of the tongue.
Findings/conclusions:
The results of this study confirm the prediction of the PFDT that phonological systems cannot be mixed within a word.
Originality:
Although the morphosyntactic structures and in some cases the pronunciations of morphemes are identical, tonal differences of these two languages still prohibit word-internal switches.
Significance/implications:
This study thus supports the PFDT and argues that CS behaviour is governed by a single innate mechanism that governs both monolingual and bilingual language production and that the so-called CS-specific grammar/mechanism is not necessary.
Introduction
Code-switching (CS), that is, the use of two or more languages in an utterance, has been studied from various perspectives. Some scholars (e.g., Myers-Scotton, 1993/1997, 2002; Poplack, 1980) believe that CS is governed by a CS-specific grammar or a grammar that mediates between the two grammatical systems of the participating languages (usually referred to as the “third” grammar). Some researchers (e.g., Disciullo, Muysken, & Singh, 1986) applied general syntactic constraints (e.g., the government theory) to examine CS utterances. Others (e.g., Santorini & Mahootian, 1995), however, argue that no explicit grammatical constraint is needed because all the grammatical information is encoded in the lexicon (i.e., the null theory). Each of these three approaches offers a distinctive perspective on the following question: Is CS governed by a CS-specific grammar/mechanism, a general syntactic constraint or the same mechanism that produces monolingual utterances? This question could be interpreted more broadly. For instance, is there a separate system in the brain that specifically addresses the production of bilingual utterances, or is there only one mechanism that addresses both monolingual and bilingual language production? If a separate system exists, what is the mechanism? The answers to these questions are still unclear because the previous studies all encountered either theoretical or empirical challenges (or both).
Macswan (1999, 2000, 2005) proposes the PF Disjunction Theorem (hereafter PFDT) to investigate CS phenomena. He argues that a single computational mechanism that is innate in the language faculty not only addresses monolingual utterances but also governs the production of bilingual utterances. Although this model has been criticized by other linguists (e.g., Jake, Myers-Scotton, & Gross, 2005), genuine empirical counter-examples to it have arguably not yet been found. Hence, the aim of this study is to test Macswan’s (1999, 2000, 2005) PFDT with the Southern Min/Mandarin CS data to answer the following question: Is CS behaviour governed by a CS-specific mechanism or a mechanism that is also responsible for producing monolingual utterances?
Review of the literature
The government constraint
Disciullo et al. (1986) applied a specific syntactic constraint to study CS and argued that the process of CS was constrained by government relations. According to Chomsky’s (1981) definition, the head (X0) governs its complement within the same maximal projection. For instance, in the sentence “I kill the man”, the V0 head “kill” governs the NP “the man”. They went on to argue that “if X has language index q and if it governs Y, then Y must have language index q” (Disciullo et al., 1986, p. 5). To illustrate how their model operates, consider the invented Mandarin/English CS examples in (1).
(1) I [ I read-Masp. that Mclass. book “I have read that book.”
Disciullo et al.’s model would permit an English-Mandarin CS utterance such as example (1) because the verb kan “read” is the head of the underlined verb phrase and governs its complement, namely the NP na shu “that book”. Hence, the verb and its complement NP are in the same language. However, their constraint encountered empirical challenges. Consider their example in (2).
(2) ha “has received the diploma” (Italian/French; Disciullo et al., 1986, p. 13)
In example (2), the Italian verb ricevuto “received” governs the NP “il diplôme”. According to the government constraint, the governed NP should also be Italian. However, a switch (i.e., the French noun “diplôme”) occurs within this governed NP and therefore violates the prediction of the government constraint. Although Disciullo et al. made several attempts to revise the government constraint, each revision was contradicted by empirical data; the constraint was finally abandoned by Muysken (2000).
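The logic of the government constraint can be made concrete with a minimal sketch. It assumes, as a simplification of Disciullo et al.’s formulation, that every word inside the governed phrase carries its own language index; the names `Node` and `index_mismatches` are hypothetical and introduced only for illustration. The first test case uses the forms kan, na and shu discussed for example (1); the second replaces the noun with an English word as an invented contrast case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    form: str
    lang: str  # the "language index" q of the element


def index_mismatches(governor: Node, governed: list[Node]) -> list[Node]:
    """Return the governed elements whose language index differs from
    that of the governor (simplified reading of the constraint)."""
    return [node for node in governed if node.lang != governor.lang]


# Verb and complement NP share one index, so nothing is flagged.
verb = Node("kan", "Mandarin")
same_lang_np = [Node("na", "Mandarin"), Node("shu", "Mandarin")]
print(index_mismatches(verb, same_lang_np))

# Invented contrast case: an English noun inside the governed NP.
mixed_np = [Node("na", "Mandarin"), Node("book", "English")]
print([node.form for node in index_mismatches(verb, mixed_np)])
```

Under this reading, any non-empty result marks a configuration that the constraint predicts should not occur.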
The functional head constraint
With regard to the relationship between heads and their complements, Belazi, Rubin, and Toribio (1994) proposed the Functional Head Constraint and argued that CS between Comp and IP, Infl and VP and Det and nominal projection does not occur. However, consider the counter-example in example (3).
(3) Ye uri vanemud mikone ke I’m stupid. (Comp + IP)
a way indicate does that
“He acts as if I’m stupid.”
(Farsi/English; Mahootian & Santorini, 1996, p. 465)
Example (3) is a clear violation because the switch occurs between the complementizer and the IP. Because of such empirical failures, Belazi et al. further proposed the Word-Grammar Integrity Corollary, which states that a word from language X with grammar Gx must obey grammar Gx. Consider their example in (4).
(4) J’ai une voiture mizyaena.
I have a car nice
(French/Tunisian Arabic; Belazi et al., 1994, p. 232)
They argued that adjectives occur after nouns in both French and Tunisian Arabic. Hence, a switch such as the adjective mizyaena “nice” in example (4) is permissible because it obeys the syntactic rules of these two languages. However, consider the counter-example in (5).
(5) Sorekara, his wife ni eater In addition to give-COND “In addition, if we give it to his wife.” (Japanese/English; Nishimura, 1986, p. 129)
In example (5), following Japanese OV word order, the switched English object “his wife” occurs before the verb. This empirical example clearly violates the VO word order of English and rejects the Word-Grammar Integrity Corollary.
The null theory
Instead of particular grammatical constraints, Santorini and Mahootian (1995, p. 5) stated that “the language of a syntactic head determines the position of its complements in codeswitching contexts just as in monolingual ones” and proposed a Tree Adjoining Grammar account, which postulates that there is no distinction between the lexicon and grammar. That is, when a speaker accesses a lexical item, he/she simultaneously accesses the minimal trees that encode the syntactic category, the projections of that category and the slots for the syntactic dependencies of this lexical item. For example, if a speaker accesses the English word “kiss”, a minimal tree is retrieved in which there is a V node and empty slots for two DPs (subject and object). Thus, if a CS utterance involves both a VO and an OV language, there will be four possible switches between a verb and its object. Examine their Farsi/English examples in (6).
(6) a. ate Ø sibhara
b. the apples xord
c. xord the apples
*d. sibhara ate
(Santorini & Mahootian, 1995, p. 9)
According to their analysis, the object is allowed to appear in both pre-verbal and post-verbal positions in Farsi; therefore (6b) and (6c) are possible switches. In English, the object can only appear after the verb. Hence, a switch resembling (6a) could occur, whereas (6d) is not possible. However, consider the Japanese/English example in (7).
(7) Are-o you have to learn ti.
That-ACC
“That, you have to learn.”
(Japanese/English; Nishimura, 1997, p. 124)
Example (7) violates Santorini and Mahootian’s model because the Japanese object are “that” occurs in the clause-initial position, which is not allowed in English. One may argue that the Japanese object is originally at the post-verbal position. Example (7) may be a result of movement; therefore, it should not be regarded as a real counter-example. However, their model predicts only the position of the object at the deep structure level and does not include the position after movement.
On the premise that the same grammatical mechanism that constrains monolingual utterances also governs CS, Chan (2003, p. 1) argues that “functional categories and lexical categories exhibit different behaviour in code-switching…” and specifically examines CS utterances from two perspectives – namely, word order and selection. According to his observation, the language of the lexical heads (nouns or verbs) does not necessarily determine the position of the complements, but the language of the functional heads (i.e., D, I, C) always does. Consider his examples in (8).
(8) a. I have to ttakē my hand.
wash
“I have to wash my hand.”
(English/Korean; Choi, 1991, p. 889; cited in Chan, 2003, p. 87)
b. e wo green dress ko.
He/she-PAST TONE wear D/ART
(Adãŋme-English; Nartey, 1982, p. 187; cited in Chan, 2003, p. 120)
According to his analysis, in (8a) the verb ttakē “wash” is in Korean (an OV language), but the complement “my hand”, following the English VO word order, occurs in the post-verbal position. This indicates that the lexical head (verb) does not determine the word order of the DP complement. However, in (8b), the functional head (i.e., the determiner ko) determines the word order of its English complement “green dress”, which therefore appears before the article according to Adãŋme grammar.
Regarding the relationship of selection, Chan argues that CS is possible between a functional head and its complement as long as the requirements of c-selection and s-selection set by the former are satisfied. In other words, the functional heads always determine the categories of the switched complements they select. The lexical heads, however, are not specified in terms of their c-selection properties but are specified for their s-selection properties. Namely, the meaning of the lexical head determines the types of complements it selects. Consider his examples in (9).
(9) je peux le dire had le truc hada [baš je commence à apprendre].
I can it say this the thing here that I begin to learn
“I can say this in order that I start to learn.”
(French-Moroccan Arabic; Bentahila & Davies, 1983, p. 323; cited in Chan, 2003, p. 150)
Although two different languages are present in the complementizer phrase “baš je commence à apprendre”, the functional head – namely, the Moroccan Arabic complementizer “baš” – still selects the finite clause “je commence à apprendre” as its complement (i.e., the right syntactic category). Consider the following examples.
(10) a. English verb + DP (D + NP)
b. Cantonese verb + DemP [(Dem (Num) + CL + NP)]
(Dem = demonstrative; Num = number; CL = classifier)
(11) gam2 nei5 jiu3 [V double] [DemP nei5 go3 oi3 sam1]
So you have-to double you CL benevolence
“So you have to double your benevolence.”
(Cantonese-English; Li, 1996, p. 170; cited in Chan, 2003, p. 197)
As (10a) and (10b) show, an English verb takes a DP as its complement, whereas a Cantonese verb takes a DemP as its complement. If c-selection still held for lexical heads, the English verb double would take a DP rather than a Cantonese DemP as its complement. However, example (11) shows that CS may occur between a lexical head (i.e., “double”) and its complement if the requirement of s-selection is satisfied. Examine the potential counter-example in (12).
(12) computer ho2 ji5 taau3 gwo3 keyboard tai4 gong1 jat
can through provide one CL “Computers can provide some feedback through the keyboard.” (English/Cantonese, Chan, 1992; cited in Chan, 2003, p. 174)
Cantonese classifiers are generally treated as functional categories, and they impose syntactic and semantic restrictions (e.g., plurality) on their complement nouns. In this case, the plural Cantonese classifier di1 should c-select an NP with a [plural] feature. However, in example (12), the uncountable English noun feedback appears after di1 and violates Chan’s argument regarding functional heads. Chan explains that the English noun “feedback” is unspecified for features such as [plural] or [count]; therefore, a CS utterance resembling example (12) occurs. However, whether his explanation is theoretically adequate is still open for discussion, and more empirical tests are needed.
Poplack’s (1980) CS model
In her study on English/Spanish CS data collected from a Puerto Rican community in the US, Poplack (1980) proposed two CS-specific grammatical constraints, namely the equivalence and free morpheme constraints, to predict the possible switch points and switched elements in a given CS utterance. Regarding the equivalence constraint, Poplack argues that CS tends to occur at points in the discourse where the juxtaposition of first language (L1) and second language (L2) elements does not violate the syntactic rules of either language. The free morpheme constraint states that “codes may be switched after any constituent in discourse provided that the constituent is not a bound morpheme” (Poplack, 1980, p. 595). That is to say, only a free morpheme, not a bound morpheme, can be switched. As Macswan (2014, p. 7) states, although Poplack “expressed a strong preference for avoiding CS-specific mechanisms to mediate between the two languages in contact, they nonetheless concluded that such a mechanism is necessary on empirical grounds”. Consider Nartey’s (1982) English/Adãŋme CS data in (13) and (14).
(13) a ŋε mĩ help-e
they copula me help (present progressive)
“They are helping me.” (Nartey, 1982, p. 185)
(14) e so green dress ko.
He/she(past) wear art
“S/he wore a green dress.” (Nartey, 1982, p. 187)
The switched Adãŋme present progressive marker “e” in example (13) is a bound morpheme. In example (14), following Adãŋme grammar, the article “ko” occurs in the sentence-final position, which is not permissible in English. Thus, Poplack’s model was empirically rejected.
The Matrix Language Frame model
Among the models that support the existence of a CS-specific mechanism, the Matrix Language Frame (MLF) model, which was proposed by Myers-Scotton (1993/1997, 2002), is probably the most influential. This model specifically examines intra-clausal CS utterances from a morphosyntactic perspective and postulates that one of the participating languages (i.e., the matrix language; hereafter the ML) is always more dominant in that it provides the morphosyntactic framework of a given bilingual clause (i.e., the basic unit of analysis). Two principles are proposed to identify the ML. The morpheme order principle states that the surface word order of the bilingual clause in question will be that of the ML. The system morpheme principle stipulates that the outsider late system morphemes (e.g., subject-verb agreement affixes or case markers) will be supplied only by the ML. The two principles are expected to point to the same language as the ML. Consider the Welsh/English bilingual clause in example (15).
(15) doedd hi’m yn mynd ar y motorways na’r dual carriageways.
Be.PAST.NEG PRO.F.3S-NEG PRT go on DET nor-DET
“She didn’t go on the motorways or the dual carriageways.”
(Welsh/English; Deuchar, 2006, p. 2004)
According to Deuchar’s (2006) analysis, the ML of the example in (15) is Welsh because of the Welsh word order. The outsider late system morpheme that marks the third-person singular is also provided by Welsh. The MLF model has enjoyed great empirical success and has unambiguously identified the MLs of CS data in many different language pairs, for example, German/English (Fuller & Lehnert, 2000) and Mandarin/English (Wei, 2001). However, criticisms regarding its theoretical and empirical inadequacy have been raised. For instance, Zabrodskaja (2009) states that the MLF model is problematic because it relies only on morphosyntactic criteria for the identification of the ML. In her study on Russian/Estonian CS, Zabrodskaja argues that because of the typological similarities and long-term language contact between Russian and Estonian, different degrees of morphological and phonological integration are found. For this reason, it was not always possible to unambiguously identify the ML by the two principles of the MLF model. In addition, Auer (2000, p. 131) argues that the MLF model may “run into problems in languages without (much) morphology … and when they are applied to language pairs with the same basic word order”. Moreover, Chan (2009, p. 185) argues that in addition to the two major principles, a number of subsidiary principles of the MLF model were proposed, which make this model “too sophisticated and uneconomical to be desirable as a model of bilingual competence”.
Earlier studies on the analysis of the Southern Min/Mandarin CS data
Wang (forthcoming) tested the MLF model with his Southern Min/Mandarin CS data (the same dataset tested in this study) to seek a universally applicable CS model. He found that Southern Min and Mandarin share most syntactic structures and that both languages have a very limited amount of inflectional morphology. Hence, neither the morpheme order principle nor the system morpheme principle is applicable to the Southern Min/Mandarin bilingual clauses. Although no example that violated the MLF model was found, Wang reported that the MLs of 95% (323 out of 340) of his CS data could not be unambiguously identified. To solve this problem, other possible criteria (e.g., identifying the language that provides aspect markers and classifiers as the ML) were also tested. However, he found that only 4.12% (14 out of 340) of his data had aspect markers and 10.29% (35 out of 340) had classifiers, which suggested that they were not reliable criteria to identify the ML. To solve the problem caused by the nature of Mandarin and Southern Min, Wang proposed a revised MLF model, which is shown in Table 1.
Table 1. Wang’s (forthcoming) revised version of the Matrix Language Frame model.
Wang revised the original MLF model by re-introducing an additional criterion (i.e., the morpheme counting principle) for the identification of the ML with reference to Myers-Scotton’s (2002, p. 61) argument that “the language that is the source of the grammatical frame often supplies more morphemes in a bilingual clause”. The revised model in Table 1 includes a two-step analysis, and the criteria to identify the ML are applied in a hierarchical order. That is, the morpheme order principle and the system morpheme principle in stage 1 are applied first. If these two principles cannot identify the ML, the morpheme counting principle in stage 2 is then applied. Although Wang reported that this revised MLF model unambiguously identified the MLs of 92.64% of his data, it remains theoretically inadequate. For instance, how can one prove that a process of morpheme calculation is conducted when people switch between different languages? Or, as Auer (2000) notes, is there any evidence to show that bilingual speakers perform a grammatical analysis to construct an ML before they produce bilingual utterances?
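The hierarchical ordering of the two stages can be sketched as follows. This is only an illustration under stated assumptions, not Wang’s procedure: the outcome of stage 1 is passed in as a ready-made value (`None` when neither principle applies), the function name `identify_ml` and the language labels are hypothetical, and treating a tie in the morpheme count as “ML not identifiable” is my own assumption.

```python
from collections import Counter
from typing import Optional


def identify_ml(stage1_ml: Optional[str],
                morpheme_langs: list[str]) -> Optional[str]:
    """Two-step identification of the matrix language (ML).

    stage1_ml: the language indicated by the morpheme order and system
        morpheme principles, or None if neither principle applies.
    morpheme_langs: one language label per morpheme in the clause.
    """
    # Stage 1: the two original MLF principles take priority.
    if stage1_ml is not None:
        return stage1_ml

    # Stage 2: morpheme counting; the language supplying more
    # morphemes is taken to be the ML.
    ranked = Counter(morpheme_langs).most_common()
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # assumption: a tie leaves the ML unidentified
    return ranked[0][0]


# Stage 1 is silent, so the count decides.
print(identify_ml(None, ["SM", "SM", "SM", "M"]))
```

The sketch also makes the theoretical worry visible: stage 2 is a purely arithmetical step, and nothing in it corresponds to an independently motivated grammatical operation.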
In their study on Welsh/English, Tsou/Mandarin and Southern Min/Mandarin CS (100 bilingual clauses from the same corpus tested in this study), Deuchar, Muysken, and Wang (2007) adopted Muysken’s (2000) model to categorize the CS patterns (i.e., insertion, alternation and congruent lexicalization) of their data. They argued that the criterion to define a switch was not clearly stated in Muysken’s (2000) model. They then adopted Myers-Scotton’s (2002) MLF model to determine the MLs of their data, and the linguistic elements that were not supplied by the MLs were categorized as switches. To adopt the MLF model as a means to identify a switch implied that the theoretical problem caused by the nature of Southern Min and Mandarin was encountered. They used a frequency-based criterion and also checked other functional elements (e.g., classifiers) to identify the MLs of Southern Min/Mandarin CS. However, as mentioned above, a frequency-based approach is too arbitrary and theoretically inadequate.
Theoretical framework: The PF Disjunction Theorem to code-switching
The minimalist programme (MP) proposed by Chomsky (1995) holds that there is an innate mechanism in the human language faculty that includes two important components, namely CHL and a lexicon. CHL is a computational system that is invariant across all languages, and the lexicon is the element that distinguishes one language from another. According to the MP, the I-language is invariant, and languages appear to differ because the morphological features encoded in the lexicon drive movement operations. Through the operation Select, the grammatical information contained in a given lexicon is sent through a process of numeration to re-assemble all the sub-sets of the lexicon to form a derivation. Moreover, through the Merge operation, all items from the numeration are taken to form different syntactic objects, which are then introduced to the Move operation to construct new structures. In short, these three operations are constrained by a process (i.e., feature checking) that ensures all the features of lexical items match at each stage (Herring, Deuchar, Parafita Couto, & Moro Quintanilla, 2010).
Many scholars (e.g., González-Vilbazo & López, 2011, 2012; Toribio & González-Vilbazo, 2014) adopted the MP and examined their CS data from different perspectives. However, their approaches may not provide a sufficient account of the Southern Min/Mandarin data. For instance, in their study on Esplugish (a CS variety used by a community of German/Spanish bilinguals in Barcelona), González-Vilbazo and López (2012, p. 47) argued that little v “determines three crucial grammatical properties of the selected VP; linearization, Focus/Background, and prosodic structure”. However, they stated that their arguments could only be tested in grammars of CS data in which the two participating languages were typologically different. Because Southern Min and Mandarin share most of their grammatical systems and are typologically very similar, their approach may not be applicable to the CS data involving these two languages.
With reference to the MP and its key argument that all syntactic variation is lexically encoded, Macswan (1999, 2000, 2005) proposes the PFDT for CS. He argues that CS cannot occur within a single lexical item or a single X0 because X0s are inputs to PF (Macswan, 2005). According to the MP, PF (phonetic form), namely the phonological elements that are actually uttered, contains all the rules and constraints. Hence, a PF element of lexicon A (PF1) contains certain grammatical rules, whereas a PF element of lexicon B (PF2) contains different grammatical rules. CS is a union of two lexicons, which contain only parts of the grammars of languages A and B and therefore cannot meet the requirements imposed by PF1 or PF2. Thus, the PFDT predicts that CS, or more precisely, a switch of phonological systems within a lexical item, is prohibited. Macswan’s model is listed below:
“(i) The PF component consists of rules/constraints which must be (partially) ordered/ranked with respect to each other, and these orders/rankings vary cross-linguistically.
(ii) Code switching entails the union of at least two (lexically encoded) grammars.
(iii) Ordering relations are not preserved under union.
(iv) Therefore, code switching within a PF component is not possible.” (Macswan, 2000, p. 45)
Consider his examples in (16).
(16) * a. Juan está Juan be/3Ss eat-DUR. “Juan is eating.” (Spanish/English; Macswan, 1999, p. 222) b. Juan está Juan be/3Ss park-DUR his car “Juan is parking his car.” (Macswan, 2005, p. 7) (3Ss = third-person singular subject agreement; DUR = durative aspect)
According to the PFDT, example (16a) does not occur because the underlined lexical item “eat-iendo” includes two phonological systems. The English stem “eat” has an English pronunciation, but the Spanish durative aspect suffix “-iendo” has a Spanish pronunciation. However, a switch like example (16b) is possible. Macswan argues that the underlined element “parqueó” contains only one phonological system (i.e., Spanish), for the English stem “park” is borrowed and is phonologically integrated into Spanish.
Criticisms against the PF disjunction theorem
Macswan’s (1999, 2000, 2005) PFDT has attracted a large amount of criticism. For instance, Macswan (2005) emphasizes the importance of grammatical judgement to evaluate whether a given CS utterance is legitimate or not in his study on German-English CS. German-English bilingual speakers were selected to judge the well-formedness of the collected German-English CS data. However, as Jake et al. (2005) argue, making a grammatical judgement may fail to consider the negative evaluations that some bilingual speakers have toward CS. Moreover, his attempt to judge the well-formedness of CS utterances by selected bilingual speakers may be too arbitrary. Whether these bilingual speakers are qualified to make such judgements is questionable. A more fundamental question is the following: Because CS involves the grammars of all participating languages, what are the criteria for determining the well-formedness of a CS utterance? Thus, Macswan’s model was criticized as having methodological flaws.
Morphological analysis of Mandarin and Southern Min words
Mandarin and other dialects of Chinese are often seen as isolating languages in which most of the morphemes are regarded as free morphemes that can themselves form lexical items. Although ancient Chinese was a language in which most words consisted of a single monosyllabic morpheme, modern Chinese is not like this (Goddard, 2005). There is a distinction between zi and ci in Chinese, but both are equivalent to the concept of a word in English. Zi refers to a single morpheme in spoken language or a character in written language (e.g., shen “god”, hua “flower”, etc.), and a ci usually contains two or more morphemes or characters (e.g., lao-ban “boss” or qi-che “car”). Recall that the aim of this study is to test the PFDT – which predicts that CS within a lexical item is prohibited – with the Southern Min/Mandarin CS data. It is therefore crucial to define what counts as a lexical item or word in Mandarin and Southern Min. Although the answer to this question is still open for discussion, this study will adopt the syntactic definition proposed by Packard (2000, p. 12) and define a word as “a syntactically free form, commonly designated as X0”, which is the smallest occupant of constituent slots that are affected by syntactic rules. Examine the examples below.
(17) god will help I-plur. “God will help us.” (18) a. boss drive car come company “The boss comes to the company by car.” *b. *c. -
In example (17), shen “god” (a free morpheme) is considered a word because it can occur as a syntactically independent element and fill the syntactic slot of the subject NP of the sentence. In example (18a), lao-ban “boss”, which contains two bound morphemes, is also regarded as a word because it also fills the syntactic slot of the subject NP. In examples (18b) and (18c), lao- “old” or -ban “boss” cannot act as syntactically independent elements; therefore, neither is a word.
Packard further proposes four basic word types in Mandarin – namely, compound words, bound root words, derived words and grammatical words. A compound word refers to a word that consists of two free morphemes. For instance, the two free morphemes huo “fire” and shan “mountain” form the compound word huo-shan “fire-mountain (volcano)”. A bound root word is a word that consists of a bound morpheme and a free morpheme or two bound morphemes. In a word like chu-ban “publish”, chu “emit” is a free morpheme, whereas ban “edition” is a bound morpheme. A derived word consists of a bound or free morpheme and a word-forming affix. According to Packard (2000, p. 70), word-forming affixes “may change the form class of terms to which they attach… and may attach to free words or bound roots”, and they strongly resemble derivational morphemes. Examples of word-forming affixes in Mandarin would be nominalizing (e.g., -tou) or verbalizing suffixes (e.g., -hua). For example, the word cha “insert” is a verb. If we add a nominalizing suffix -tou “head” to cha, the derived word cha-tou “plug” becomes a noun. The fourth type of word is called a grammatical word, which is a combination of a free morpheme and a grammatical affix (e.g., aspect marker). For instance, the grammatical word chi-le “have eaten” is a combination of a free morpheme chi “eat” and a perfective marker le (a bound morpheme). Furthermore, Packard (2000, p. 75) argues that “the combination ‘number-classifier’ is a word because it can occur in a noun syntactic slot”. Examine the Mandarin examples in (19).
(19) a. san-zhi ji
three-Mclass. chicken
*b. san ji
three chicken
*c. -zhi ji
Mclass. chicken
d. wo chi-le san-zhi
I eat-Masp. three-Mclass.
“I have eaten three (chickens).”
Examples (19a)–(19c) show that neither the numeral san nor the classifier zhi can occur independently; they should always co-occur. In example (19d), the number-classifier combination san-zhi serves the function of an object noun and replaces the noun ji “chicken” that it originally modified. Following Packard’s argument, any number-classifier construction in my Southern Min/Mandarin CS corpus will be analysed as a word.
Southern Min is regarded as a regional variety of Chinese. While the major differences between Mandarin and Southern Min are phonological and lexical, their morphological structure and formation of words are identical. Therefore, I argue that the discussion on Mandarin words mentioned above also pertains to the analysis on Southern Min words.
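Packard’s four word types, as summarized above, amount to a small decision table over the kinds of morphemes that make up a two-morpheme word. The sketch below is a simplified illustration rather than Packard’s own formalization: the labels `free`, `bound`, `word_forming_affix` and `grammatical_affix` and the function `classify_word` are hypothetical names, and number-classifier combinations are left out because they are treated as words on syntactic rather than morphological grounds.

```python
def classify_word(kinds: tuple[str, str]) -> str:
    """Map the morpheme kinds of a two-morpheme word to a word type.

    Each kind is one of: "free", "bound", "word_forming_affix",
    "grammatical_affix". The order of the two morphemes is ignored.
    """
    pair = tuple(sorted(kinds))
    if pair == ("free", "free"):
        return "compound word"
    if pair in {("bound", "free"), ("bound", "bound")}:
        return "bound root word"
    if pair in {("bound", "word_forming_affix"),
                ("free", "word_forming_affix")}:
        return "derived word"
    if pair == ("free", "grammatical_affix"):
        return "grammatical word"
    return "unclassified"


# The Mandarin examples discussed in this section.
examples = {
    "huo-shan": ("free", "free"),
    "chu-ban": ("free", "bound"),
    "cha-tou": ("free", "word_forming_affix"),
    "chi-le": ("free", "grammatical_affix"),
}
for form, kinds in examples.items():
    print(form, "->", classify_word(kinds))
```

Because the morphological structure of Southern Min words is taken to be identical to that of Mandarin words, the same table would apply to both languages.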
Methodology
Subjects and data
Spontaneous conversations produced by three groups of fluent speakers (30 participants in total) of Southern Min and Mandarin (i.e., Group 1: 10 university students; Group 2: 10 retired junior high school teachers; Group 3: 10 civil service workers) in different locations (such as offices, the university campus and cafés) in Tainan City, Taiwan, were recorded and transcribed. To avoid the observer’s paradox effect, all recordings were made by one participant from each group. In total, 2,458 clauses were collected from approximately 50 hours of recordings. Of the 2,458 clauses, 86.05% (2,115) were monolingual clauses and 13.95% (343) were bilingual clauses. As discussed earlier, Macswan (2005) emphasized the importance of grammatical judgement for the CS data, which was criticized because of its arbitrariness. To avoid this methodological problem, no such judgement was applied in this study.
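The proportions of monolingual and bilingual clauses follow directly from the clause counts reported above. The short calculation below simply recomputes them, rounding to two decimal places; the variable and function names are illustrative.

```python
TOTAL_CLAUSES = 2458
BILINGUAL_CLAUSES = 343
MONOLINGUAL_CLAUSES = TOTAL_CLAUSES - BILINGUAL_CLAUSES


def percentage(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage rounded to 2 places."""
    return round(100 * part / whole, 2)


print(MONOLINGUAL_CLAUSES, percentage(MONOLINGUAL_CLAUSES, TOTAL_CLAUSES))
print(BILINGUAL_CLAUSES, percentage(BILINGUAL_CLAUSES, TOTAL_CLAUSES))
```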
Applying the PFDT to the analysis of the Southern Min/Mandarin CS data
Because the PFDT prohibits CS within a lexical item, only Southern Min/Mandarin bilingual clauses are included in my CS corpus, and monolingual Mandarin and Southern Min clauses are excluded. A bilingual clause is defined as a clause that contains morphemes from more than one language, as shown in my Southern Min/Mandarin example in (20).
(20) guá bô jiang-jin.
I Sneg bonus
“I don’t have any bonus.”
(Mandarin = normal print; Southern Min = italics)
The PFDT implies that a lexical item is the basic unit of analysis. Hence, every word in the bilingual Southern Min/Mandarin clauses in my corpus was examined. This study adopts Packard’s (2000) proposals on Chinese word types, and only words that contain two or more morphemes were analysed, because no further morphological analysis can be performed on words that have only one morpheme. To illustrate how the PFDT was applied, examine my Southern Min/Mandarin examples in (21).
(21) a. lao-da lóng bô tshut-khì ooh?
eldest son all Sneg go-out Spart.-question
“(Her) oldest son never studied abroad?”
*b. lao-tuā lóng bô chu
old-big all Sneg go-out Spart.-question
“(Her) oldest son never goes out (studied abroad)?”
The Southern Min words lóng, bô and ooh in example (21a) will not be analysed because they contain only one morpheme. The Southern Min word tshut-khì
(22) xiang-cai sī tsit-ê.
coriander Scop. this-Sclass.
“This is the one with coriander flavour.”
(23) lín lao-gong bô lâi ooh?
your husband Sneg come Spart.-question
“Why didn’t your husband come?”
(24) i ū tsit-ê dong-zuo tsiok hô-tshiò.
he have one-Sclass. movement very funny
“He had a very funny movement.”
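The word-by-word procedure described in this section can be sketched in code. This is a minimal illustration under stated assumptions, not the tool used for the analysis: the types `Morpheme` and `Word`, the labels `"M"` and `"S"` and the function `classify` are hypothetical names introduced here, and each morpheme is assumed to have been tagged for language by hand during transcription. Monomorphemic words are skipped, words whose morphemes all come from one language are counted for that language, and words that mix languages are flagged as potential violations of the PFDT.

```python
from dataclasses import dataclass
from typing import List, Optional

MANDARIN = "M"      # hypothetical label for Mandarin morphemes
SOUTHERN_MIN = "S"  # hypothetical label for Southern Min morphemes


@dataclass
class Morpheme:
    form: str
    language: str  # assumed to be assigned by hand during transcription


@dataclass
class Word:
    morphemes: List[Morpheme]

    @property
    def form(self) -> str:
        # Hyphenated citation form, e.g. "jiang-jin"
        return "-".join(m.form for m in self.morphemes)


def classify(word: Word) -> Optional[str]:
    """Return None for a monomorphemic word (not analysed), the language
    label if all morphemes share one language, or "violation" if the
    word mixes languages (a word-internal switch)."""
    if len(word.morphemes) < 2:
        return None
    languages = {m.language for m in word.morphemes}
    if len(languages) == 1:
        return next(iter(languages))
    return "violation"


# Example (20): guá bô jiang-jin
clause_20 = [
    Word([Morpheme("guá", SOUTHERN_MIN)]),
    Word([Morpheme("bô", SOUTHERN_MIN)]),
    Word([Morpheme("jiang", MANDARIN), Morpheme("jin", MANDARIN)]),
]

# The number + classifier construction in example (25)
yi_thuann = Word([Morpheme("yi", MANDARIN), Morpheme("thuann", SOUTHERN_MIN)])

results_20 = [classify(w) for w in clause_20]
result_25 = classify(yi_thuann)

for word, label in zip(clause_20, results_20):
    print(word.form, label)
print(yi_thuann.form, result_25)
```

Under this sketch, only jiang-jin in (20) enters the count (as a Mandarin word), whereas a form such as yi-thuann is flagged for closer inspection.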
Results
Adopting the procedures of analysis discussed in the previous section, the PFDT was tested with 343 bilingual Southern Min/Mandarin clauses produced by three sample groups, that is, Groups 1, 2 and 3. Within the bilingual clauses, 811 lexical items were analysed. The results are shown in Table 2.
Table 2. The numbers and percentages of different word types collected from the sample groups.
In 114 bilingual Southern Min/Mandarin clauses collected from Group 1, 272 words were analysed, including 72 compound words (26.47%), 130 bound-root words (47.79%), 25 derived words (9.19%), six grammatical words (2.21%) and 39 number + classifier constructions (14.34%). From Group 2, 142 bilingual clauses were collected and 350 words were analysed, including 74 compound words (21.14%), 164 bound-root words (46.86%), 46 derived words (13.14%), eight grammatical words (2.29%) and 58 number + classifier constructions (16.57%). Finally, 87 bilingual clauses were collected from Group 3, from which 189 words were analysed, including 45 compound words (23.81%), 77 bound-root words (40.74%), 36 derived words (19.05%), seven grammatical words (3.70%) and 24 number + classifier constructions (12.70%). See Table 3.
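The word-type counts reported above can be cross-checked with a short script. This is only a sketch: the dictionary layout and the names `word_types`, `percentages`, `totals` and `shares` are assumptions made for illustration, while the counts themselves are those given in the text.

```python
# Word-type counts per group, as reported in the text.
word_types = {
    "Group 1": {"compound": 72, "bound-root": 130, "derived": 25,
                "grammatical": 6, "number + classifier": 39},
    "Group 2": {"compound": 74, "bound-root": 164, "derived": 46,
                "grammatical": 8, "number + classifier": 58},
    "Group 3": {"compound": 45, "bound-root": 77, "derived": 36,
                "grammatical": 7, "number + classifier": 24},
}


def percentages(counts):
    """Share of each word type within one group, in percent (2 decimals)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


totals = {group: sum(counts.values()) for group, counts in word_types.items()}
shares = {group: percentages(counts) for group, counts in word_types.items()}
grand_total = sum(totals.values())

for group in word_types:
    print(group, totals[group], shares[group])
print("All groups:", grand_total)
```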
Table 3. The numbers and percentages of Mandarin and Southern Min words and potential violations of the PF Disjunction Theorem.
Of the 272 words extracted from the bilingual clauses produced by Group 1, 165 were in Mandarin (60.66%), 107 were in Southern Min (39.34%) and no violation of the PFDT was found. Of the 350 words extracted from the Group 2 data, 238 were in Mandarin (68%), 112 were in Southern Min (32%) and no violation was found. Finally, of the 189 words collected from Group 3, 134 were in Mandarin (70.9%), 54 were in Southern Min (28.57%) and one potential violation (0.53%) was found. In other words, more than 99% of the data (810 of the 811 words) support the prediction of the PFDT that CS within a lexical item is prohibited.
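The overall rate of support follows directly from the per-group figures. The sketch below recomputes it; the variable names (`language_counts`, `support_rate` and so on) are illustrative assumptions, and the counts are taken from the paragraph above.

```python
# Words per language and potential violations, as reported in the text.
language_counts = {
    "Group 1": {"Mandarin": 165, "Southern Min": 107, "violation": 0},
    "Group 2": {"Mandarin": 238, "Southern Min": 112, "violation": 0},
    "Group 3": {"Mandarin": 134, "Southern Min": 54, "violation": 1},
}

total_words = sum(sum(c.values()) for c in language_counts.values())
violations = sum(c["violation"] for c in language_counts.values())
consistent = total_words - violations

# Percentage of analysed words that contain no word-internal switch.
support_rate = 100 * consistent / total_words

print(f"{consistent} of {total_words} words ({support_rate:.2f}%) "
      "are consistent with the PFDT")
```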
Discussion
Of the 811 words that were examined, only one potential counter-example was found, which is shown in (25).
(25) shang-ci wo-men jiu yi-thuann ren qu chi a.
last time I-plur. then one-Sclass. people go eat Mpart.-emphasis
“Last time, we (a group of people) went to (the restaurant) and ate (meals).”
At first glance, the number + classifier construction “yi-thuann” seems to contradict the PFDT because the number is in Mandarin, whereas the classifier is in Southern Min. This example leads us to reconsider the linguistic nature of Chinese classifiers. Consider Cheng’s (2011) Mandarin example in (26) and Cheng and Sybesma’s (1999) Cantonese example in (27).
(26) *(Zhe) Zhi gou hen keai.
this class. dog very cute
“This dog was very cute.” (Cheng, 2011, p. 66)
(27) Zek gau zungji sek juk.
class. dog like eat meat
“The dog likes to eat meat.” (Cheng & Sybesma, 1999, p. 511)
Without a preceding demonstrative such as zhe “this”, example (26) would be an ungrammatical sentence. Hence, in line with Li and Thompson (1981), Cheng (2011, p. 66) argues that “classifiers in Mandarin are enclitics (suffixes), a clitic that must follow its host”. Moreover, he argues that Cantonese classifiers are free morphemes because they do not need to co-occur with numbers or demonstratives, as shown by example (27). However, Cheng and Sybesma (1999) argue that in Mandarin, classifiers do not always need to co-occur with overt numerals or demonstratives. Examine their Mandarin example in (28) and my Southern Min example in (29).
(28) Wo xiang mai ben shu.
I would-like buy class. book
“I would like to buy a book.” (Mandarin; Cheng & Sybesma, 1999, p. 511)
(29) guá lâi lim pue liâng-ê.
I will drink Sclass. drink
“I am going to have some drinks.”
In example (28), the Mandarin classifier ben occurs directly after the verb mai “buy” without any preceding numeral or demonstrative. Similarly, in example (29), the Southern Min classifier pue occurs independently after the verb lim “drink” without attaching to any other linguistic element. Examples (28) and (29) seem to suggest that Mandarin and Southern Min classifiers may also have the features of free morphemes rather than clitics, which can probably explain why a switch occurs in the numeral + classifier construction “yi-thuann” in example (25). Because Cheng and Sybesma’s (1999) view of the nature of classifiers is still open to discussion, however, this single potential counter-example may also be interpreted differently. Table 2 shows that 121 words were categorized as number + classifier constructions, which means that this single example accounts for only approximately 0.83% (1 out of 121) of the words with the number + classifier construction and 0.12% (1 out of 811) of the entire corpus. Because of its infrequency, I argue that this potential counter-example can be treated as a slip of the tongue on the participant’s part and is not a real threat to the PFDT.
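The two proportions cited for the counter-example can be recomputed from the figures already given. In this sketch the variable names are illustrative assumptions; the per-group number + classifier counts (39, 58 and 24) and the corpus size (811) come from the Results section.

```python
# Number + classifier constructions per group, as reported in the Results.
number_classifier = {"Group 1": 39, "Group 2": 58, "Group 3": 24}

total_number_classifier = sum(number_classifier.values())
total_words = 811       # all words analysed in the bilingual clauses
counter_examples = 1    # the single form yi-thuann in example (25)

# Share of the counter-example within its construction and within the corpus.
share_of_construction = 100 * counter_examples / total_number_classifier
share_of_corpus = 100 * counter_examples / total_words

print(f"{share_of_construction:.2f}% of number + classifier constructions")
print(f"{share_of_corpus:.2f}% of all analysed words")
```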
Furthermore, according to Macswan (2000, p. 45), “CS below X0 is not permitted… because phonological system cannot be mixed”. He argues that the PFDT is in fact a theory about the relationship between the phonological components of a bilingual’s linguistic system rather than a rule of grammar. This argument is also supported by the Southern Min/Mandarin data. Consider example (30a) and its monolingual Southern Min and Mandarin versions in (30b) and (30c).
(30) a. ing-kai sī ū shou-guo cuo-zhe. (Southern Min/Mandarin)
must Scop. have receive-Masp.exp. failure
“(He) must have experienced some failures before.”
b. ing-kai sī ū siū-kue tann-kik. (Southern Min)
must Scop. have receive-Sasp.exp. failure
c. ying-gai shi you shou-guo cuo-zhe. (Mandarin)
must Mcop. have receive-Masp.exp. failure
As the monolingual Southern Min and Mandarin examples in (30b) and (30c) show, there is a one-to-one correspondence between their syntactic structures. Moreover, the morphological structure of a word in these two languages is also identical. For example, the Southern Min compound word ing-kai “must” in (30b) consists of a free morpheme ing “must” and another free morpheme kai “must”. Its counterpart, that is, the Mandarin word ying-gai “must” in (30c), also combines two free morphemes, that is, ying “must” and gai “must”. If the occurrence of CS depended only on morphosyntactic factors, we would expect more word-internal switches in my corpus, because they could occur almost freely without violating the grammar of either language. However, the results of this study show that almost no word-internal switches were found. This suggests that phonological factors may play a more important role here. Mandarin and Southern Min are both tone languages, but their tonal systems are different: the former has four tones and the latter has seven. In a tone language, a difference in tone changes the meaning of a word even when the segments are pronounced the same. For instance, the Mandarin word fá with the second tone means “punish”, while fǎ with the third tone means “hair”. The Southern Min ing “must” in (30b) and the Mandarin free morpheme ying “must” in (30c) are both pronounced like /iŋ/. However, ying has the first tone in Mandarin, and ing has the third tone in Southern Min. Also consider the examples in (31a)–(31c).
(31) a. i-ê pin-zhong bô-kâng. (Southern Min/Mandarin)
that-Sclass. type Sneg-same
“They are different types (of fruits).”
b. i-ê phín-tsíng bô-kâng. (Southern Min)
that-Sclass. type Sneg-same
c. na-ge pin-zhong bu-tong. (Mandarin)
that-Mclass. type Mneg-same
The pronunciation of the bound morpheme phín- “type” in the Southern Min derived word phín-tsíng in (31b) is the same as that of the bound morpheme pin- “type” in the Mandarin word pin-zhong “type” in (31c). Both are pronounced like /pin/. While pin- has the Mandarin third tone, phín- has the Southern Min second tone. I argue that such tonal differences prohibit word-internal switches, such as ying-kai or ing-gai in (30a) and phín-zhong or pin-tsíng in (31a), which can be regarded as supporting evidence for the PFDT’s claim that phonological systems cannot be mixed within a lexical item.
Following Chomsky’s (1995) MP, Stabler and Macswan (2014, p. 259) argue that “while syntax can freely compose lexical elements from various languages, the units of morphophonology, those elements that are part of the head structure, ‘below X0’ in the syntax, cannot be broken up and mixed”. This argument is supported by the results of this study because almost no violation of the PFDT was found in my Southern Min/Mandarin CS corpus. As mentioned earlier, the syntactic constraints and CS-specific grammars/mechanisms proposed in the previous literature were all rejected because of theoretical or empirical flaws. Furthermore, as discussed earlier, other influential CS theories, such as Muysken’s (2000) typological model and Myers-Scotton’s (2002) MLF model, were tested with the same set of Southern Min/Mandarin CS data used in this study, and both models had theoretical problems. The PFDT, however, achieves theoretical adequacy and empirical success and provides the best account of the Southern Min/Mandarin CS data. Thus, this study supports the PFDT, under which a single mechanism governs both monolingual and bilingual language production in our brain, and the so-called CS-specific grammar or mechanism is not necessary.
Conclusion
The aim of this study was to test Macswan’s (1999, 2000, 2005) PFDT with Southern Min/Mandarin CS data. This model, which specifies that CS and the mixing of phonological systems are prohibited within a lexical item, was tested with 811 lexical items extracted from 343 bilingual Southern Min/Mandarin clauses. The results of the analysis show that almost all the lexical items tested (810 out of 811) support the main argument of Macswan’s model, with only one exception, which was regarded as the informant’s slip of the tongue rather than a real counter-example because of its infrequency (0.12% of the items examined).
Macswan’s (1999, 2000, 2005) PFDT was proposed with reference to Chomsky’s (1995) MP, a model that describes how the components of the language faculty and a single innate mechanism operate in the process of language production. The PFDT has been successfully applied to the Southern Min/Mandarin CS data. Thus, the results of this study support the argument that monolingual and bilingual utterances are governed by the same language production mechanism rather than by any overt grammatical constraint or CS-specific grammar or mechanism.
List of glosses
exp. = experiential
Masp. = Mandarin aspect marker
Mclass. = Mandarin classifier
Mcop. = Mandarin copula
Mneg = Mandarin negation marker
Mpart. = Mandarin sentence final particle
plur. = plural marker
Sasp. = Southern Min aspect marker
Sclass. = Southern Min classifier
Scop. = Southern Min copula
Sneg = Southern Min negation marker
Spart. = Southern Min sentence final particle
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
