Abstract
Saussure proposed the division langue/parole and argued that language can be studied as a formal system. Fifty years later Chomsky declared competence the core interest of linguistics. Although for years Generative second language acquisition (GenSLA) has adopted this view, a number of recent publications poke holes in the competence bubble. Westergaard’s article is among those that push the boundaries of Generative Grammar (GG). In our commentary, we propose that Westergaard’s Micro-cue Model (McM) and the Linguistic Proximity Model (LPM) may actually be closer to a Usage-Based Approach (UbA) to language development than to the original spirit of GG, and that Westergaard’s sound, evidence-based proposals face some drag by being presented under the aegis of GG. Specifically, the assumption that all learning derives from general cognitive processes (hence, no essential difference between L1, L2, and Ln), the use of cues that are emergent and acquired piecemeal, and the idea that language development proceeds from the specific to the general are all hallmarks of the UbA. We believe Westergaard’s contribution is important and timely and should encourage a better appreciation of the work being done in other domains as well as an understanding of how the different approaches complement each other.
Keywords
The keynote article by Marit Westergaard (2021) extends the Micro-cue Model (McM) to multilingualism. The Linguistic Proximity Model (LPM), first outlined in Westergaard et al. (2017), is the key piece that allows her incursion into the multilingual domain. Overall, the current proposal constitutes a well-designed model and an important contribution to a field that keeps struggling to find a unified theoretical front. Westergaard’s research is couched within Generative Grammar (GG) and consequently assumes at least a Universal Grammar (UG), principles, primitives such as categories or features, and rules/constraints. Nonetheless, some of her positions are at odds with the classic tenets of GG. In this commentary, we will critically review some of these points of conflict and argue that her sound and well-motivated proposals make more sense for a Usage-based Approach (UbA) and face some drag by being presented under the aegis of GG. Throughout this commentary, our use of terms such as ‘classic’, ‘original’, or ‘core’ GG is intentional, in an attempt to emphasize how the original postulates of GG have been stretched in recent decades in the direction of usage. Westergaard’s proposal belongs to a general trend in generative second language acquisition that started with the advent of minimalism and helped to broaden the scope of classic GG to include aspects that were originally conceived as part of performance and left to alternative approaches. Part of this general move may seem questionable to those who would prefer to see generative research focused exclusively on matters of competence and to those who for years have been studying variation from a sociolinguistic or usage-based approach. Our comments come from a UbA perspective and our basic observation is that the McM and LPM may actually be closer to the UbA than to the original spirit of GG.
The first point that stands out against the original tenets of GG is the focus on fine details and variation. Westergaard proposes the McM to account for the observation that children are attuned to variation and make fine distinctions from a very early age. Although the minimalist program allows for the interpretation that some variation is the result of problems at the interfaces, it is no secret that details and variation were not the original concern of GG.
One important component missing from Westergaard’s generative toolbox is the parameter. The focus on detail and variation makes a classical Principles and Parameters (P&P) account of acquisition incompatible with her data. Instead of proposing modifications to P&P, Westergaard chooses to follow Fodor’s (1998) and Lightfoot’s (1999) lead and go with cues instead of parameters. From our perspective, this is a good choice, but it is a fact that the concept of ‘cue’ doesn’t enjoy the kind of acceptance in formal approaches that it has under a UbA. P&P addressed the argument of the poverty of the stimulus and provided a simple view of input triggers switching innate binary options provided by UG. As argued in criticism by authors such as Boeckx (2011), Haspelmath (2008) or Newmeyer (2005), the approach predicts the existence of kinds of errors and overgeneralizations that are not attested, and also the absence of actually attested possibilities. P&P responded to criticisms coming from inadequacies in dealing with cross-linguistic typology and variation by introducing microparameters. The main problem with proliferating parameters is that their combinatorial explosion quickly expands into billions of options (with n independent binary parameters there are 2^n possible settings, so 30 parameters already yield more than a billion). To address this problem, parameter hierarchies have been introduced in which microparameters combine into larger parameters forming a hierarchy. Parameter setting is argued to start at higher levels, moving to microparameters when dealing with local properties. Roberts (2019) distinguishes between macro-, meso-, micro-, and nanoparameters, and posits that smaller parameters may not be part of UG. With this four-way typology of parameters (some emergent), much of the criticism that Westergaard levels against P&P loses some of its force and, with it, the motivation for cues. Given the potential of the hierarchical parametric approach to account for local variation, Westergaard should provide a stronger motivation for rejecting parameters and adopting cues instead.
Another apparent point of friction with classic GG is the claim that all learning takes place incrementally. As much as we agree with this view, the fact is that this parsimonious learning is in conflict with original generative claims about the incredible speed of acquisition that can only be explained by positing a specialized Language Acquisition Device (LAD). Assuming that adult learners have only partial or no access to UG avoids the problem, but Westergaard assumes full and continued access to UG and no essential differences between L1 and L2/L3.
Indeed, the article’s main goal is to extend the McM to multilingual development based on the assumption that L1/L2/L3 acquisition are not fundamentally different. Again, this is a challenge for classical GG. Bley-Vroman (1990) was among the first to claim, with the Fundamental Difference Hypothesis, that there are too many differences between L1 and L2 acquisition to claim that the LAD is responsible for both, and that L2/L3 acquisition uses general cognitive processing. We happen to assume that all learning is the result of general cognitive mechanisms, but the proposal that there are no fundamental differences was not part of the original GG perspective.
Once in the domain of multilingualism, Crosslinguistic Influence (CLI) becomes a prominent issue. We couldn’t agree more with the Full Transfer Potential (FTP), but perhaps a bit of historical contextualization could be useful. Interest in CLI started with Lado’s contrastive analysis hypothesis (CAH; Lado, 1957). The CAH posited that differences between L1 and L2 were the source of difficulties, as features common to both languages readily ‘transfer’, while differences need to be learned. This was a sensible hypothesis but had a weak point: the assumption that differences were the only source of difficulties in acquiring a new language. Sure enough, this is precisely the weakness that successive proposals attacked, and the CAH was thrown out with the bathwater. When the field, later on, focused on CLI in multilingual settings, the question was again framed in terms of the supplier language.
Williams and Hammarberg (1998) posit that the language that scores highest on a number of factors becomes the supplier, but we know that the quest for the source language yielded inconsistent results. For Williams and Hammarberg, the L2 was the supplier language. Several articles that followed reported similar findings (Bohnacker, 2006; Hammarberg, 2001; Rothman and Cabrelli Amaro, 2010; Williams and Hammarberg, 2009). Bardel and Falk (2007) proposed the L2 Status Factor Hypothesis to account for this observed primacy of the L2. However, other studies suggested the L1 was the only, or at least the predominant, supplier (Bouvy, 2000; Hermas, 2010; Leung, 2005; Na Ranong and Leung, 2009).
At this point, it becomes pertinent to ask why we have this recurrent a priori preference for the singular provider option. Obviously, if there is just one source, the task at hand is simpler and, after all, it is good practice to try simpler options first. However, the options we try first rarely give final answers, and science actually advances by adding degrees of complexity and refinement to initial questions/answers. Besides simplicity, we need to keep in mind that singularity is a big cognitive attractor. Throughout history humanity has searched for the source of happiness or longevity, or the theory of everything. It is not a coincidence that ‘the’ is the most common word in the English language, and not by a small margin. In this specific case, an important factor may be our terminological choices. As Lakoff has extensively shown, we tend to assign meaning through frames and metaphors (Lakoff and Johnson, 2003). The term ‘transfer’ is a bad metaphor because it conjures two spatially separated sources connected by a path. Applied to multilingualism, this leads us to assume two systems – one the source and the other the recipient – with elements moving from one to the other. Schachter (1983) already argued against the conception of transfer as a process or movement. Rothman et al. (2019) propose using ‘transfer’ for the wholesale copying of a language and ‘cross-language effects’ for the ensuing piecemeal restructuring, and in this sense the use of the term ‘transfer’ is appropriate (whether this wholesale copy is necessary, or not, is another matter). The point we want to make here, though, is that the term ‘transfer’ leads us to assume a single source. Finally, the preference for a singular source may also be due to the theoretical baggage we bring to the issue. After all, the messy ‘soup’ of performance factors that comes with the acceptance of multiple sources is not an appealing option for a minimalist program that is concerned with economy, perfection, and competence.
Westergaard rejects the idea of a single source, assuming that all previously acquired languages can have an influence on the new target language (TL). On that point her position is compatible with the Cumulative Enhancement Model (CEM; Flynn et al., 2004), the Typological Primacy Model (TPM; Rothman, 2010, 2011, 2013, 2015) and the Scalpel Model (Slabakova, 2017). The conclusion that any of the previous languages, independently of order of acquisition, can be the source of influence raises the question of determining under which circumstances one of them actually becomes the source. The TPM’s proposal is that all languages are candidates but ultimately just one of them will be selected. Westergaard’s position is substantially different from the TPM, closer to the CEM, and germane to the Scalpel Model. It is based on the premise that any construction, from any source, may be selected for L3 production (the Full Transfer Potential). If anything can transfer from any source, a set of criteria to account for when and how CLI happens needs to be in place. The LPM is an attempt to address this issue. In this model, CLI occurs when learners use one of the previous languages to parse L3 input. The idea is that when a given cue in the TL input matches one of the existing cues in previous languages, the cue becomes activated. This conception of CLI opens the door to at least two sorts of questions. On the one hand, we would need to know more about the mechanism responsible for the matching of cues. How does the learner settle on a given cue? How is its similarity computed? How is the competition between possible cues handled? In what order are they considered? On the other hand, the LPM requires a separation between the parser and the grammar that is questionable. Even conceding that parser and grammar can be separated, we would need to know what mechanism is responsible for parsing in one language and then applying the right restructuring to another language. Normally, parsing new input with the L2 is assumed to lead to restructuring in the L2, but surely that is not the intended result. The model also assumes a transition from early stages in which surface typological similarity is predominant toward stages in which more abstract structural properties are at play. This seems a sensible assumption, but the model should be able to predict how and when this transition takes place.
The article closes with a short section devoted to other factors, in which it is acknowledged that other variables such as relative proficiency, psychotypology, the interlocutor or the context of the speech act, recency and priming, frequency, input quality, complexity of each construction, age, motivation, memory, etc. may also affect CLI. It is comforting to see some of these other factors mentioned, but we object to having them relegated to a mere footnote at the end of the article. In our view, this is again an effect of the theoretical lenses distorting what is more, or less, relevant.
The McM and the LPM are conceived and proposed as part of GG, but seem more compatible with the UbA. The assumption that all learning derives from general cognitive processes (hence no essential difference between L1 and L2), the use of cues that are emergent and acquired piecemeal, and the idea that language development proceeds from the specific to the general are all hallmarks of the UbA that feel somewhat constrained under a generative mantle. In our opinion, Westergaard’s attempt to make GG compatible with a wider set of data is very commendable. Her proposals are well-motivated (data-driven) and her argumentation is solid. Although she does a good job of walking a fine line between approaches, her positions are likely to reopen old discussions between hard-core generativists and proponents of expanding the theory. In our view, the boundaries between competence and performance are more nuanced than originally conceived, and there is a case to be made for exploring the intersection from each perspective. In addition, we need a better understanding of how the different approaches complement each other and a better appreciation of the work being done in other domains. The work of Westergaard explores this intersection and is a great contribution to the kind of understanding and appreciation that will make future advances in the field possible.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
