Abstract
Sign language learners with a spoken language background face the challenge of acquiring a second language in a different modality. In the course of this endeavor, one of the modality-specific phenomena they encounter is the use of classifier predicates, also known as depicting signs. Classifier predicates contain a meaningful hand configuration that refers to an entity, denoting a salient characteristic of this entity (Zwitserlood, 2003). The use of a classifier predicate allows the signer to indicate the location, motion and orientation of a referent. If two classifier predicates are used simultaneously, the signer can represent the spatial arrangement of both referents (Schembri, Jones and Burnham, 2001). This visual representation is new for learners with a spoken language background. Since there is a paucity of literature on second language (L2) sign language acquisition, there is no empirical evidence on the developmental stages that L2 learners go through in acquiring the devices to produce such visual representations. In this study, we followed 14 novel learners of Sign Language of the Netherlands (NGT) over a period of two years. The learners were asked to produce sign language descriptions of prompts containing various objects (e.g. cars, bicycles, trucks, human beings and animals) that could be depicted by a classifier predicate. Analyses show that after a year of instruction, the majority of learners are capable of producing scene descriptions featuring two classifier predicates to denote the spatial layout of the objects. The first classifier predicates appear in the data at an early stage, suggesting that the strategy of denoting an object with a meaningful handshape representing the object is not difficult to learn. Furthermore, the data show that learners initially struggle with the orientation of objects and handshape selection. 
This study is the first to systematically elicit classifier predicates from novel learners for an extended period of time. The results have important implications for the field of sign language pedagogy and teaching.
Keywords
I Introduction
Worldwide, thousands of individuals learn a sign language as a second or additional language (L2/Ln). Sign languages are expressed in the visual-spatial modality, which allows the signer to exploit potentials that spoken languages do not offer. As a result, sign language learners encounter grammatical phenomena that are unattested in their (spoken) first language (L1), such as the use of space to express grammatical relations, the use of space to depict the spatial layout of a scene, and the grammatical use of facial expressions. To date, there is surprisingly little research available regarding the characteristics and challenges of L2 acquisition of sign languages. As a result, practitioners in the field ‘often revert to their own understanding of what language is, how to teach it, how learners learn, and how to assess learners’ language knowledge and skills’ (Rosen, 2020, p. 17). In order to develop the field, evidence that demonstrates the development of L2 acquisition of sign languages, in particular milestones in this process, is much needed.
In this article, we describe a longitudinal study into the acquisition of Entity classifier predicates, a typical modality-specific phenomenon that is unfamiliar to L2 learners of a signed language (henceforth M2L2 learners, i.e. learners of a second language or L2 in a second modality or M2). The study provides insight into the stages learners go through, the difficulties they encounter, and typical learner behavior they display.
II Theoretical background
1 Classifiers and classifier constructions
Almost all sign languages studied to date employ linguistic elements that are referred to as ‘classifier predicates’ (Zwitserlood, 2012) or ‘depicting signs’ (Liddell, 2003). By using these elements, the signer can depict how an object is positioned or moving in space or how an object is handled or manipulated. The handshape signals that the referent has certain salient characteristics, such as size and shape, or that the referent belongs to a class of semantically related items (Cormier et al., 2012). Classifier predicates are highly productive; i.e. their meaning is not stable and conventionalized, but rather compositional and determined by the context.
Before we move on, we wish to comment on the terminology used. Above, we introduced two of several terms that have been proposed to refer to the elements under investigation. These terms reflect different views regarding the linguistic analysis of this phenomenon.¹ As we aim to describe the acquisition process for the phenomenon rather than its grammatical status, we do not take a stance regarding this admittedly interesting controversy, but rather adopt the term ‘classifier’, as it is commonly used among teachers.
a Whole Entity classifiers and two-handed classifier constructions
Whole Entity classifiers, one of the classifier types that have been distinguished in the literature (Schembri, 2003),² directly represent (part of) a static or moving referent. They combine with verb stems that denote the motion or location of a referent in space. The handshape unit (the classifier) signals that the entity it refers to belongs to a class of objects that are semantically related (e.g. class of vehicles) or share a property (e.g. class of ‘long, thin objects’). In Figure 1, the signer produces two classifier predicates simultaneously, i.e. he produces a two-handed classifier construction. Both handshapes signal that the referent belongs to a certain class (vehicles and ‘long, thin objects’, respectively), and they combine with a stem (

Figure 1. Two-handed classifier construction: view from front and from above.
b Figure and ground
In the context of such complex spatial expressions, the notions of Ground and Figure are important. The Ground object is stationary (being located at rest or fixed) and serves as reference point, whereas the Figure object is moving (or could move) in relation to the Ground object (Talmy, 1975). The Ground object is usually the bigger or backgrounded entity, while the Figure object is usually the smaller entity or the entity that is the focus of attention (Zwitserlood, 2012). In the construction depicted in Figure 1, the Figure (the vehicle) moves in relation to the Ground (the ‘long, thin object’). Özyürek, Zwitserlood and Perniss (2010) describe a ‘canonical structure of locative expressions’ found in most sign languages studied to date (but not in the language they analysed, Turkish Sign Language): the Ground object is introduced first and held in space, while subsequently, the Figure object is introduced.
c Conventions regarding parts of the hands
In addition to the conventions regarding Figure and Ground, learners have to acquire conventions regarding front and back, namely: (1) in classifier predicates representing standing humans, the front of the finger (usually) represents the front of the human being; (2) in classifier predicates representing vehicles, the fingertips (usually) represent the front of the vehicle; and (3) in case of four-wheeled vehicles, the palm of the hand represents the bottom of the vehicle. However, when the exact orientation of the referent is considered irrelevant, these conventions can be violated (e.g. Wallin, 1990), or the signer may choose to leave the orientation of the classifier predicate unspecified (Zwitserlood, 2003).
d Variability in choice of classifier
Learners have to learn which handshape unit should be selected to represent a particular referent. Occasionally, however, there may be variability in the choice of a classifier (Zwitserlood, 2003). In Sign Language of the Netherlands (NGT), for example, a standing person can be depicted using a [symbol]-handshape (the index finger represents the person as a whole) or a [symbol]-handshape (the fingers represent the legs), depending on the characteristics the signer wants to focus on. A second source of variability is ease of articulation. In NGT, vehicles (cars, trucks, bikes) are depicted with a [symbol]-handshape. However, in some configurations, this would require awkward bending of the wrist or elbow. In such contexts, NGT-signers use a phonetic variant, the [symbol]-handshape, which is easier to articulate (Van der Kooij, 2002; Zwitserlood, 2003).
e Classifier-like constructions in gestural behavior
Earlier we wrote that M2L2 learners learn a new language in a new modality. However, given the multi-modal nature of languages, M2L2 learners do have a repertoire of gestures at their disposal. There is a body of literature demonstrating that there are similarities between some gestures and signs. Ortega, Schiefner and Özyürek (2019) term such gestures that overlap in form with signs ‘manual cognates’. With regard to the phenomenon under investigation, a number of studies have shown that sign-naive individuals, when asked to describe objects in a motion or static event, use their hands to represent these objects. These ‘hand-as-object gestures’ resemble the classifier predicates used by signers (Brentari et al., 2012; Janke & Marshall, 2017; Schembri et al., 2005; Singleton, Morford & Goldin-Meadow, 1993). However, sign-naive gesturers employ a broad array of handshapes and lack consistency, whereas signers employ classifier handshapes from a limited and conventionalized set, which they use consistently (Brentari et al., 2012; Janke & Marshall, 2017; Schembri et al., 2005). These findings suggest that M2L2 learners could draw on their gestural repertoire to scaffold their learning, i.e. use their existing (gestural) knowledge to build new knowledge upon.
2 Alternative devices
Besides the use of classifier constructions, there are alternative devices to encode the spatial relationships between referents. As these devices will be part of our analysis, we briefly introduce them here. On the one hand, a signer may choose to use lexical expressions such as spatial prepositions (e.g.
3 Acquisition of Entity classifiers
A growing body of studies has analysed the acquisition of classifiers in L1 signers. The picture that emerges from these studies is that children are able to produce and comprehend classifier constructions at a young age (Schick, 2006; Slobin et al., 2003). However, their production is error-prone, and it takes several years to master the system completely. The prolonged developmental time course, with full mastery at around 9 years of age (Baker, Van den Bogaerde & Woll, 2008), is attributed to the complexity of classifier constructions. Reported errors are:
Substitution of the classifier handshape (De Beuzeville, 2006; Supalla, 1982);
Omission of meaning components (e.g. manner of movement; Newport & Meier, 1985);
Sequential production of complex movement patterns (e.g. a straight upwards movement followed by an arc instead of an upward arc movement; Newport & Supalla, 1980);
Failure to introduce referents (Slobin et al., 2003; Tang, Sze & Lam, 2007);
Omission of the Ground object (e.g. Slobin et al., 2003; Sümer, 2015; Supalla, 1982; Tang et al., 2007);
Failure to produce Figure and Ground simultaneously, instead expressing both objects sequentially (Supalla, 1982; Tang et al., 2007);
Signing outside the signing space (De Beuzeville, 2006).
Morgan (2002), De Beuzeville (2006), and Tang et al. (2007) report children employing avoidance strategies such as production of lexical descriptions instead of classifier predicates, role shift, or use of the whole body as stand-in for an animate referent (‘whole-body language’). Some children employ Entity classifiers in particular contexts, while deleting or modifying the same construction in a more complex environment (Kantor, 1980).
What we know about the acquisition of classifiers in M2L2 signers is largely based on a few recent studies. Marshall and Morgan (2015) investigated how learners of British Sign Language (BSL), who had been learning BSL for 1–3 years, depicted the spatial location of a variety of non-moving objects. The authors report that the learners were aware of the need to use classifier predicates to represent objects, but had difficulties in choosing the correct classifier handshape. Handshape errors comprised omissions and substitutions. The location feature, on the other hand, did not cause much difficulty. Ferrara and Nilsson (2017) analysed how learners of Norwegian Sign Language (NSL) produced elaborate descriptions of an environment. They found that the NSL learners experienced difficulties in producing orientation and location features, but less with handshape selection. When producing two-handed classifier constructions, they struggled with the coordination of both hands in relation to each other, and they misjudged the space needed. The learners often resorted to the production of lexical signs instead of classifier constructions, or they used signs marked for location where a classifier would be expected.
We can conclude from these studies that learners, both L1 and M2L2, find it difficult to master the system of classifier constructions. This is likely due to the fact that classifier constructions form a complex system characterized by various linguistic conventions that have to be learned.
4 Research questions
As mentioned above, there is a paucity of empirical data on the M2L2 acquisition of signed languages to inform the practice of teaching. The present article contributes to filling this gap by describing the acquisition of classifier handshapes denoting a variety of entities in two-handed classifier constructions in novel M2L2 learners. In particular, we attempt to answer the following research questions:
Are there developmental stages in novel M2L2 learners of NGT regarding the different Entity classifier handshapes that denote different classes of entities?
Are there typical patterns or errors that characterize the learner productions?
Is there evidence for transfer of gestural knowledge in the acquisition of NGT Entity classifiers?
In the remainder of this article, we provide a qualitative and quantitative analysis of the acquisition process regarding two-handed classifier constructions in M2L2 learners, and we discuss the implications of these findings for the teaching practice.
III Methodology
1 Participants
In this study, we elicited two-handed classifier constructions from 14 hearing learners of NGT enrolled in a four-year undergraduate program offered by the Institute for Sign, Language & Deaf Studies (ISLDS), hosted by Hogeschool Utrecht, University of Applied Sciences (HU, UUAS). The institute trains students for the professions of sign language interpreter, sign language teacher, or speech-to-text captionist (STT captionist). Students can enroll in these programs without previous knowledge of NGT. During the first and second year, eight NGT courses are offered, with a total study load of 55 European Credits (ECs) for teachers and interpreters and 30 ECs for STT captionists (for an overview of the NGT curriculum, see Appendix 1). During the first course, classifier constructions are not explicitly taught, but occur frequently in the input (teaching materials and teacher input). During the second course, teaching materials explicitly target classifier constructions. However, in both courses, little explicit rule explanation is provided.
At the beginning of their first year of the program, 14 learners (all female, mean age 23 years) were recruited for participation in the study (for details, see Table 1). They were followed over a period of two years, with the exception of two participants, who quit the program after year 1. In addition, we assessed how a baseline group of deaf L1 NGT-users (n = 4) and a group of ISLDS sign language teachers (two L1 signers, two M2L2 signers) performed on the same task. Tables 2 and 3 provide information about these two groups.
Table 1. Background information for M2L2 participants.
Notes. STT = speech-to-text. * Data on previous knowledge were self-reported. Participant 12 had a deaf friend; participants 13 and 14 had followed a beginner course.
Table 2. Background information for L1 signers.
Table 3. Background information for teachers.
2 Materials
The present study is part of a longitudinal study investigating M2L2 acquisition of a variety of grammatical features of NGT. A series of six tests was constructed, each consisting of 30 prompts and 5 distractors. However, not all of the 180 prompts are relevant for the present study. Three of the six tests (tests 1, 3, 5) included 22 and the other three (tests 2, 4, 6) 13 prompts featuring two or more entities that could be mapped out using a two-handed classifier construction (‘classifier-prompts’). That is, a total of 105 prompts targeted the production of a classifier construction. The remaining prompts served to elicit other NGT structures that will not be discussed here (see Boers-Visker, 2020; Boers-Visker & Pfau, 2020).
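The prompt counts given above can be checked with a few lines of Python. This is only an arithmetic sketch: the numbers come from the text, while the variable names are illustrative and not part of the study materials.

```python
# Arithmetic check of the prompt counts described in the text.
# All names are illustrative; only the numbers come from the article.
N_TESTS = 6
PROMPTS_PER_TEST = 30

# Classifier-prompts per test: 22 in tests 1, 3, 5 and 13 in tests 2, 4, 6.
classifier_prompts = {1: 22, 2: 13, 3: 22, 4: 13, 5: 22, 6: 13}

total_prompts = N_TESTS * PROMPTS_PER_TEST
total_classifier_prompts = sum(classifier_prompts.values())
other_prompts = total_prompts - total_classifier_prompts

print(total_prompts, total_classifier_prompts, other_prompts)
```

The difference between the two totals corresponds to the prompts that elicited other NGT structures.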
The ‘classifier-prompts’ represented different combinations of objects from the following categories: upright humans (standing or moving), sitting humans, vehicles (cars, trucks and bicycles; standing or moving), and animals (standing). The prompts were designed in a way that would allow us to identify whether certain features (or combinations thereof) appear earlier in the learners’ productions than others and whether certain construction types are more error-prone than others.⁴
The six tests included comparable prompts, that is, for each target-construction, six (or three) photos, drawings or video clips were created or found on the internet. The same entities appeared across the elicitation materials but in different configurations and sometimes from a different angle (for an example, see Appendix 2). In each session, the order of prompts was randomized. During the test construction phase, we collected data from adult L1 NGT-users and from a sample of the target population (first year ISLDS-students, cohort 2015–2016) to ensure the appropriateness of the stimuli and tasks. Subsequently, some problematic prompts were adapted or removed.
In sum, the final test set included six tests, consisting of 30 prompts and 5 foils each. A total of 22 prompts targeted the production of two-handed classifier constructions. Some of these stimuli (n = 13) were included in all tests; the remaining 9 prompts appeared only in tests 1, 3 and 5. For each type of prompt, six different, though comparable, photos or video clips were assembled. Participants were tested 15 times, meaning that the six tests were repeated after the first cycle.
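The repetition of the six tests over 15 sessions can be sketched as a simple cyclic mapping. Note that this mapping is an assumption: the text only states that the tests were repeated after the first cycle, not which test was used in which session. The assumption is at least consistent with the Results section, which reports 22 prompts for session 11. The function names are hypothetical.

```python
def assigned_test(session: int, n_tests: int = 6) -> int:
    """Return the test number (1..n_tests) ASSUMED for a 1-based session number."""
    if session < 1:
        raise ValueError("session numbers start at 1")
    # Assumption: tests are administered in order and restart after test 6.
    return (session - 1) % n_tests + 1


def classifier_prompt_count(session: int) -> int:
    """Classifier-prompts in a session: 22 for tests 1, 3, 5 and 13 for tests 2, 4, 6."""
    return 22 if assigned_test(session) % 2 == 1 else 13


# Assumed schedule for the 15 sessions: session -> (test, classifier-prompts).
schedule = {s: (assigned_test(s), classifier_prompt_count(s)) for s in range(1, 16)}
for session, (test, n_prompts) in schedule.items():
    print(f"session {session:2d}: test {test}, {n_prompts} classifier-prompts")
```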
3 Procedure
a Procedure M2L2 participants
The experiment involved 15 sessions, preceded by a short baseline session (pre-test).⁵ The M2L2 participants were instructed in spoken Dutch to describe in NGT the short video clips, photos, and drawings presented to them on a laptop. During the first year, twelve 15-minute sessions were scheduled at two- or three-week intervals; the remaining three sessions were recorded at 10-week intervals during year 2. The sessions took place in a quiet, well-lit (class)room at UUAS.⁶ The responses were filmed with a video camera located in front of the participant. The test was self-paced, and the participants were allowed to view the video clips several times. Furthermore, they were allowed to skip prompts that they were unsure how to represent in NGT. During the sessions, the author or her assistant (both hearing and fluent M2L2 signers) was present in the room.
b Procedure benchmark (L1 participants and teachers)
The L1 participants were filmed on one occasion, at home, at work, or at UUAS. In two cases, a deaf colleague of the author was present in the room; in two cases, the hearing author was present in an adjacent room. The six sets of stimuli were recorded in one session of one hour or in two half-hour sessions. Instructions were offered in NGT. The task itself was identical to the task the M2L2 participants performed.
The teachers were filmed at UUAS. Instructions and examples were provided in NGT by the author, who subsequently left the room. Like the L1 signers, the teachers signed the six sets of stimuli in one or two sessions. Both the M2L2 participants and the L1 participants were unaware of the exact purpose of the study.
4 Transcription and coding
All data were transcribed in ELAN, a software package developed by the Max Planck Institute for Psycholinguistics in Nijmegen (Crasborn & Sloetjes, 2008). All manual activity produced with the dominant and/or the non-dominant hand was annotated with a Dutch gloss. In the pilot phase, the author transcribed a subsample of the data, followed by a revision of the code book. Subsequently, a second coder, a research assistant, was trained. To identify any inconsistencies, part of the data (6 sessions, 4% of the dataset) was transcribed separately by both annotators. The two annotators were consistent in their transcriptions, with a satisfactory agreement rate of 87–93% (mean 91%).
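The text reports an agreement rate but does not state how it was computed. One common option is simple percent agreement over aligned annotations, sketched below. Both the function and the example glosses are hypothetical and are not taken from the study data.

```python
from typing import Sequence


def percent_agreement(coder_a: Sequence[str], coder_b: Sequence[str]) -> float:
    """Share (in %) of aligned annotation slots on which both coders agree."""
    if len(coder_a) != len(coder_b):
        raise ValueError("annotations must be aligned one-to-one")
    if not coder_a:
        raise ValueError("no annotations to compare")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)


# Hypothetical glosses for ten aligned annotation slots (not study data).
annotator_1 = ["AUTO", "CL-vehicle", "FIETS", "CL-vehicle", "MAN",
               "CL-person", "BOOM", "CL-long-thin", "VRACHTWAGEN", "CL-vehicle"]
annotator_2 = ["AUTO", "CL-vehicle", "FIETS", "CL-vehicle", "MAN",
               "CL-person", "BOOM", "CL-long-thin", "VRACHTWAGEN", "gesture"]

agreement = percent_agreement(annotator_1, annotator_2)
print(agreement)
```

Percent agreement does not correct for chance agreement; a chance-corrected measure would be an alternative if the annotation categories are few.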
In a subsequent stage, the total data set, comprising 2,798 M2L2 responses and 880 L1/teacher responses, was coded by the author for (1) the presence of classifier predicates and their formational features; (2) the coordination of both hands, in case a two-handed classifier construction was produced; and (3) the use of alternative devices. Responses were categorized according to the categories set out in Table 4.
Table 4. Overview of codes for categorizing responses.
Notes. The use of alternative spatial devices or lexical expressions including a preposition are acceptable alternatives in NGT (see Section II.2). * Appeared in the data of the learners only.
When one or more classifier predicates were produced, the formational features of the individual classifiers (handshape, location, orientation, movement) were analysed and coded, and in case of a two-handed construction, the location and orientation of the hands in relation to each other. Whenever one or more parameters did not meet the specifications of the target item or the referent was unclear, additional codes (Table 5) were added. In case a production was ambiguous (i.e. the production could be a gestural production), an extra code was added. For the reader’s convenience, an overview of the expected Entity classifiers for the different entities, drawn from the benchmark-data, is provided in Figure 2.
Table 5. Overview of codes for categorizing substitution or underspecification errors.

Figure 2. Overview of expected Entity classifiers for the entities featured in the prompts.
As in the transcription process, we ran a pilot trial to revise and elaborate the coding scheme. Moreover, we recorded examples of idiosyncratic signing and typical learner productions (e.g. overgeneralizations, omissions, substitutions) in an extensive logbook. The productions of the L1 signers and the teachers served as a benchmark during the coding process. In case of uncertainty with regard to the appropriateness/well-formedness of a construction produced by a M2L2 signer, at least two L1 informants were consulted.
IV Results
1 Benchmark-data
We will first discuss the descriptions produced by the teachers and L1 signers (henceforth, ‘benchmark participants’). In 75–100% (mean 93%) of the trials, the benchmark participants produced one or two classifier predicates (see Figure 3). The percentage of two-handed classifier constructions (either simultaneous or sequential) ranged from 53% to 100% (mean 84%). Signers N4 and N5 produced a relatively high number of alternative spatial devices or lexical expressions.

Figure 3. Productions of the benchmark participants (all prompts).
It is important to note that in this study, all benchmark participants produced at least some sequential classifier constructions. This is unexpected given the often-held assumption that the classifiers referring to the two entities should be produced simultaneously (‘canonical structure of locative expressions’, see Section II.1.b). Obviously, this has consequences for the analysis of the M2L2 productions. We performed an item-analysis on the benchmark-data to identify the responses each prompt (n = 24/48) induced (Figure 4).

Figure 4. Analysis of descriptions per prompt (n = 24 or 48) produced by benchmark participants.
The graph above reveals that some prompts (e.g. 3, 8, 19 and 20; for an overview of the prompts, see Appendix 3) induced a relatively high number of lexical expressions, instead of the targeted classifier constructions. Furthermore, we observe that a relatively high number of the responses to prompts 4, 5 and 14 were produced sequentially. This can probably be explained by the fact that these three prompts contained more than two objects, that is, there were more entities than articulators. The benchmark participants solved this problem by either dropping one of the two objects that had been introduced first, and then using this hand to sign the third object (resulting in a sequence of two simultaneous constructions, see Figure 5b), or by dropping both objects and sequentially signing the third object (Figure 5c).⁷

Figure 5. Examples of options to depict a prompt involving three objects (5a) by using either (1) a sequence of two simultaneous constructions (i.e. one hand remains in space) (5b); or (2) a simultaneous construction followed by a non-simultaneous construction (5c).
Furthermore, we noticed that prompts 1, 10 and 17 scored particularly high on simultaneous constructions. Prompt 10 (featuring a car and a truck, both static) was produced with a simultaneous classifier construction in all cases, while prompt 17 (two cars colliding) induced a simultaneous construction in all cases but one (98%; 100% for tests 1–4 and 6). Prompt 1 (two standing persons) was produced by means of a simultaneous construction in all instances in which a classifier construction was used (in other cases, an alternative spatial device was produced). These three prompts provide an opportunity to explore the differences between the signing of the L1 signers and teachers on the one hand, and the M2L2 signers on the other (see Section IV.3.f).
2 M2L2 data: Developmental stages
The M2L2 data were analysed per participant, per prompt, and per session. Due to limited space, we can only present a representative selection of graphs.
a Distribution of strategies over time
The graphs in Appendix 4 show the M2L2 descriptions during year 1. A surprising observation is that 12 out of 14 M2L2 participants (henceforth: participants) produced some descriptions featuring one or two classifier predicates after two weeks of instruction. Yet, the instruction offered during this period did not target classifier predicates. It must be noted that in the graphs in Appendix 4, erroneous and correct productions are not separated. We will return to this below.
A second finding is expected: all participants except one produced locative gestures during the first year. After the first semester, the use of gestures decreased, and after session 9 (i.e. after 22 weeks of instruction), gestures were no longer produced. The decline of gestural behavior coincided with an increase in the production of classifier predicates at the start of the second semester (session 7). Whereas the percentage of (simultaneous or sequential) two-handed classifier constructions ranged between 0 and 58% (mean 28%, SD 20) during session 5, we observe an increase to an average of 47% (SD 22, range 14–82%) four weeks of instruction later, during session 7. Towards the end of the first year, during session 11 (22 prompts, 13 participants), the participants produced an average of 77% of two-handed classifier constructions (SD 31, range 36–100%).⁸ These numbers approach the percentages observed for the benchmark participants, who produced a mean of 83% (SD 21, range 45–100%) of two-handed classifier constructions for the prompts tested in session 11.
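Summary figures of the kind reported above (mean, SD, range per session) can be derived from per-participant percentages as in the following sketch. The input values are invented for illustration and are not study data. The text does not state whether a sample or population standard deviation was used; the sketch assumes the sample SD.

```python
import statistics
from typing import Dict, Sequence


def summarize(percentages: Sequence[float]) -> Dict[str, float]:
    """Return mean, sample SD, minimum and maximum of per-participant percentages."""
    return {
        "mean": statistics.mean(percentages),
        "sd": statistics.stdev(percentages),  # assumption: sample SD (n - 1)
        "min": min(percentages),
        "max": max(percentages),
    }


# Hypothetical percentages of two-handed classifier constructions
# for four participants in one session (not study data).
example = [20.0, 40.0, 60.0, 80.0]
summary = summarize(example)
print(summary)
```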
b First appearance of classifier predicates
As pointed out in the previous paragraph, the M2L2 participants used classifier predicates at an early stage and without having received explicit instruction. Table 6 shows the onset of the (correct) production of classifier predicates referencing the targeted entities. The numbers refer to the session during which the participant produced at least one appropriate classifier predicate for a particular group of entities (e.g. car, standing person, etc.) to place the object in space.⁹
Table 6. Overview of the first session each participant produced a correct Entity classifier predicate for the different classes of entities (shaded cells indicate per participant the first category/categories of entities that was/were represented using a classifier predicate).
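The onset measure described above (the first session with at least one correct classifier predicate per entity class) can be sketched as follows. The record format and the example records are hypothetical; they only illustrate the logic and do not reproduce the study's coding files.

```python
from typing import Dict, Iterable, Tuple

# One coded response: (session number, entity class, classifier produced correctly?)
Record = Tuple[int, str, bool]


def first_correct_session(records: Iterable[Record]) -> Dict[str, int]:
    """Map each entity class to the earliest session with a correct classifier predicate."""
    onset: Dict[str, int] = {}
    for session, entity_class, correct in records:
        if correct and (entity_class not in onset or session < onset[entity_class]):
            onset[entity_class] = session
    return onset


# Hypothetical coded responses for one participant (not study data).
records = [
    (1, "car", False),
    (2, "car", True),
    (2, "bicycle", True),
    (4, "car", True),
    (5, "truck", False),
    (7, "truck", True),
    (9, "animal", False),
]
onsets = first_correct_session(records)
print(onsets)
```

Entity classes that never receive a correct classifier predicate are simply absent from the result, which corresponds to the ‘not at all’ cases.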
When we examine the first correct appearances, we notice that for most of the participants, the first classifier predicates produced correctly denote bicycles and cars. Interestingly, the majority of the participants produced a classifier predicate for a truck only at a later stage. This is surprising considering the fact that trucks, like bicycles and cars, are members of the ‘vehicle-family’. The classifiers for a sitting person and for animals were produced relatively late (or not at all). Notably, the classifier handshapes for vehicles and standing people represent the whole object, while the classifier handshapes for sitting people and animals denote a part of the body (i.e. bent legs of a person, legs of the animal). We will return to this in Section V. The acquisition rates are clearly visible in the graphs in Appendix 5, showing the M2L2 productions for (a selection of) the prompts.
3 M2L2 data: Characteristics of the learner-output
Our previous discussion demonstrated that some classifier predicates were produced at an early stage, while others appeared much later. However, none of the participants showed a consistent pattern during these early sessions. That is, some objects were depicted with a classifier predicate while other similar objects were not, and the participants used both conventionalized classifier handshapes and self-invented ‘classifier-like constructions’ within one session and even within one trial. Moreover, they produced different orientations for the same objects within one trial.
A quantitative analysis of the errors, or ‘learner characteristics’, is presented in Figure 6. Note that these errors only concern the classifier predicates produced by the learners; for a distribution of correctly and erroneously produced classifiers as well as non-classifier productions, see Appendix 6. The errors we identified included orientation errors (OR), handshape errors (HS), mirroring the scene, and failure to mention the referent/failure to identify referents clearly (see Section III.4, Table 5). In the following, we describe, by means of examples, the error types displayed in Figure 6 (Sections IV.3.a and IV.3.b) and other learner characteristics (Sections IV.3.c–f).

Figure 6. Distribution of errors in the classifier predicates produced by the M2L2 participants (the total number of produced errors is indicated between brackets beneath each bar).
a Orientation of the hand
A recurrent error in the first sessions was the failure to discriminate between the classifier for a car/truck and the classifier for a bicycle (error type OR: confusing orientation (

Figure 7. Failure to distinguish between bicycle and car by means of hand orientation.
Other, less frequent, errors involved violations of the conventions regarding the top/bottom and front/backside of objects (error types OR: confusing bottom and top, and OR: confusing back and front), resulting in descriptions in which objects appeared to be placed upside down or moving backwards (Figure 8).

Figure 8. Failure to encode correct orientation of objects by means of fingertip/palm orientation.
During the first sessions, productions were often characterized by uncertainty, hesitation, and self-correction. With regard to the orientation of the hand(s), we identified multiple examples of participants signing a response, looking at their hands, and slightly modifying the orientation of one of the hands to optimize the depiction. Furthermore, we noticed that some learners, while signing a construction in which one object is positioned on top of another, realized while signing the Figure object that they had omitted the Ground object, and subsequently ‘shuffled’ the Ground under the Figure object (see rightmost picture in Figure 9).

Figure 9. Example of shuffling the Ground object under the Figure (rightmost still).
A notable difference between some M2L2 participants and the benchmark participants is the off-target phonology displayed by some learners in responses involving a car or truck seen from the front. To represent a car or truck in this position, NGT-signers use the [handshape]-classifier instead of a [handshape]-classifier (see Section II.1.d). The use of the phonetic variant [handshape] enables the signer to articulate the classifier without awkwardly bending the wrist or arm. However, some learners consistently selected the [handshape]-classifier, which forced them to twist their hands and bodies to display the correct configuration (see Figure 10).

Figure 10. Failure to use the phonetic variant ([handshape]) to represent vehicles, leading to scene descriptions that are physically difficult to articulate.
b Handshape
With regard to the choice of handshape, we identified two types of errors: selection of the wrong handshape (error type HS: non-existing classifier handshape) or selection of a handshape belonging to another class of referents (error type HS: handshape refers to other object; e.g. selecting the handshape for a bike to depict an animal). Examples are shown in Figure 11. It is remarkable that the learner-solutions for depicting the sitting person and the animal involved handshapes that represent the whole object, while the conventionalized handshapes represent parts of the body (legs) (see the two bottom rows in Figure 11). This suggests an initial bias towards selecting classifier handshapes to represent the whole object. Notably, the learner-solutions were not idiosyncratic: we noticed different learners coming up with the same solutions to represent an object, e.g. a flat handshape to represent a standing person or a bent finger to represent a sitting person.

Figure 11. Erroneous handshape selections displayed by M2L2 participants.
So far, we have discussed the errors regarding the formational features handshape and orientation, as displayed in Figure 6. Other errors shown in Figure 6 are mirroring the scene and failure to indicate the referent or to identify the referent clearly. Both errors frequently occur in the M2L2 data.
In addition to the error-analysis displayed in Figure 6, we investigated characteristics regarding movement, scene-depiction, and the use of alternative devices. In the remainder of this section, we will discuss these findings.
c Movement
The M2L2 participants regularly omitted movement in their descriptions. However, they did not differ from the benchmark participants in this respect. Both groups of participants tended to focus on the location of the objects and – apparently – considered the movement less important. Prior to the onset of classifier constructions, some M2L2 participants denoted the movement of an entity by tracing the path with an index finger or by modifying the lexical verbs
d Use of lexical expressions and other alternative devices
Not surprisingly, the participants produced lexical expressions, such as
e Planning scenes
Similar to Ferrara and Nilsson (2017), we found examples of M2L2 learners experiencing problems in building up a scene. Examples include (1) choosing the wrong hand to depict the first object (e.g. using the left hand to depict the object on the right), resulting in a switch of hand(s) during the depiction; (2) misjudging the distance between the hands, resulting in a depiction of two objects nearly touching each other; (3) placing an object too high in space in relation to the other object (exemplified in Figure 12A, where the prompt depicted two cars on the same horizontal plane colliding); and (4) misjudging the size of the signing space (Figure 12B, where the participant runs out of space and literally ‘bumps into her own body’). In Figure 12C, an M2L2 participant tries to resolve the problem that her own left arm (depicting a car) is blocking the description by letting the right hand (depicting a walking person) ‘jump’ over the wrist.

Figure 12. Examples of issues regarding planning the description: (a) participant places the left hand too high in space in relation to the right hand; (b) participant misjudges the space needed and finds her own body blocking the depiction; (c) participant’s left arm is blocking the depiction.
f Simultaneity
In Section II.3, we discussed that studies on the L1 acquisition of classifier constructions found that children often omit the Ground object or fail to produce the Ground and Figure object simultaneously, signing a sequential construction instead. Classifying such productions as deviant is based on the assumption that constructions featuring a Figure and a Ground are signed simultaneously by default. However, our benchmark participants demonstrated multiple examples of sequential constructions (see Section IV.1; this observation calls into question whether sequential constructions should generally be considered deviant in NGT). In order to investigate whether the findings reported in the L1 literature also apply to M2L2 learners, we specifically assessed the M2L2 responses to prompts 10, 17, and 1, since the benchmark participants consistently produced simultaneous classifier constructions for these prompts (see Section IV.1, Figure 4). The data reveal that 9 out of 14 M2L2 participants produced at least one sequential construction for either prompt 10 (truck and car next to each other) or prompt 17 (two cars colliding). Prompt 1 (two standing persons facing each other) was signed sequentially by one participant in the first session. These responses provide evidence that this learner behavior, found in L1 acquisition, is sometimes also attested in L2 acquisition of NGT.
V Discussion
The aim of this study, the first systematic and longitudinal investigation into the M2L2 acquisition of classifier predicates and two-handed classifier constructions, was to gain a better understanding of the developmental stages that L2 learners of NGT pass through in their acquisition of classifier constructions (research question 1), to provide insights into typical learner characteristics (research question 2), and to come to a better understanding of the (non-)facilitative function of existing gestural knowledge (research question 3). Below we relate our findings to other studies and highlight novel findings.
1 Findings in relation to other studies into M2L2 acquisition of classifier predicates
Recapitulating the findings from Section IV, we observed that after a year of instruction, all M2L2 participants succeeded in producing two-handed classifier constructions in order to depict the targeted scenes. The majority of the M2L2 participants (11 out of 14) applied a two-handed classifier construction in 80% or more of the responses. This outcome, in combination with the observation that the first classifier predicates already appeared after a short period of (untargeted) instruction, might lead to the conclusion that classifier predicates are not very difficult to acquire. These findings contrast with previous results reported by Ferrara and Nilsson (2017) and Marshall and Morgan (2015), who claim that classifier predicates are difficult to acquire. The different outcomes can probably be attributed to differences in task type. Marshall and Morgan (2015) investigated a different set of objects, and Ferrara and Nilsson (2017) examined the use of classifier predicates in extended spatial descriptions, while our study consisted of prompts that elicited short (mono-clausal) descriptions.
Marshall and Morgan (2015) reported that the selection of the appropriate handshape caused difficulties, whereas the learners reported in Ferrara and Nilsson (2017) experienced difficulties in the production of orientation and location. Our study shows that the learners, when experiencing difficulties, struggle with both handshape and orientation. Movement, on the other hand, does not cause many problems.
Our data corroborate previous results obtained by Ferrara and Nilsson (2017) regarding difficulties the participants encountered in coordinating the hands to depict a scene. Our participants demonstrated similar difficulties, resulting in misplacement of classifier predicates or a need to switch hands during the depiction.
2 Findings in relation to L1 acquisition
Our data are in agreement with observations in the L1 literature regarding (1) handshape substitutions (De Beuzeville, 2006; Supalla, 1982), (2) errors and difficulties regarding the expression of Figure and Ground, (3) sequential realization of constructions that are expected to be expressed simultaneously (e.g. Supalla, 1982; Tang et al., 2007), (4) failure to specify referents (Slobin et al., 2003; Tang et al., 2007), (5) use of whole-body language (De Beuzeville, 2006; Tang et al., 2007), and (6) signing outside the signing space (De Beuzeville, 2006). We did not find evidence that M2L2 learners omitted meaning components or decomposed complex movement patterns. We assume that these particular characteristics are not present in the M2L2 data because M2L2 learners have more control over their body than L1 learners (Rosen, 2004).
3 Findings in relation to literature on gestures
In Section II.1.e, we discussed the resemblance between ‘hand-as-object gestures’ produced by sign-naive individuals, and Entity classifier predicates. Previous research suggests that learners might use these gestures as a substrate to build their knowledge upon (Janke & Marshall, 2017; Marshall & Morgan, 2015). The present study supports this assumption, since our participants produced classifier constructions that resemble hand-as-object gestures at a very early stage. We acknowledge that this early appearance in the data could be an artifact of the coding process, that is, a result of miscoding gestures as classifier productions. This, however, seems implausible, given the results of the baseline session conducted prior to the start of the program. Recall that 11 of the 14 participants had no prior knowledge of NGT, and as such, their productions during the baseline test can be considered gestures. Yet, only four participants produced a hand-as-object gesture to denote a (moving) car or bicycle during this pre-test, while, after two weeks of instruction, more than twice as many participants produced an Entity classifier for the same objects. If these early productions were all gestural, we would expect them to also surface in the baseline test of all these participants. It thus seems that some of the learners did not use hand-as-object gestures spontaneously, but picked up classifier constructions from the early input.
The present study has investigated how novel learners develop their skills over a longer period of time. One of the findings that emerged from the longitudinal data is that entities from the category ‘vehicles’, especially cars and bicycles, are represented by an Entity classifier earlier than other categories in most participants (see Table 6). Interestingly, the four participants who used a hand-as-object gesture in the above-mentioned baseline session exclusively used these gestures to denote cars and bicycles. None of the participants used a hand-as-object gesture to depict any of the other entities during the baseline test. One can speculate that two factors may be responsible for these related observations. First, there is a possibility that the [handshape]-classifier and the [handshape]-gesture used to denote a vehicle are ‘manual cognates’ (Ortega et al., 2019), while other entities, such as a sitting person, lack such a manual cognate. This would explain the higher prevalence of this specific element in both the baseline test and the early data. Secondly, the ‘late’ appearance of Entity classifiers for human entities in our dataset, as well as the absence of hand-as-object gestures for these entities in the baseline test, might be caused by the fact that learners are biased to represent a human being using their own body (i.e. a bias towards an ‘action strategy’, see Van Nispen, Van de Sandt-Koenderman & Krahmer, 2017). Our dataset suggests that learners indeed employ different strategies to represent concepts from different semantic domains, and that the ‘seeds’ of these strategies can be found in their gestural behavior.
Our study suggests that the commonalities between gestures and signs facilitate the learning process, that is, we are dealing with an instance of positive transfer. This is in line with the conclusion of Janke and Marshall (2017), who hypothesize that the challenge for learners is not the acquisition of classifiers as a phenomenon per se, but rather to ‘narrow down the set of handshapes that they have potentially available to them to the set of classifier handshapes that is grammatical in the sign language they are learning’ (p. 10). The latter implies that the learners’ gestural repertoire initially interferes with their learning to some extent. That is, besides positive transfer, there is also negative transfer of handshapes that are not part of the classifier inventory or of orientations that violate the conventions. Our findings indicate that, once classifiers appear, there is indeed some negative transfer from ‘non-conventionalized components’, and that the challenge lies in the acquisition of the appropriate classifier handshapes and conventions regarding the orientations of palm and/or fingertips (e.g. the orientation distinguishing the NGT classifier for a bicycle vs. a car), as well as learning the conventions regarding Figure and Ground.
4 Novel findings
In the previous paragraph, we set out that with regard to developmental stages (research question 1), our investigation shows that classifier predicates representing vehicles (bicycles, cars) appeared early, followed by classifiers for standing persons. Classifiers representing sitting persons and animals appeared much later. We speculated that the early appearance of classifiers within the ‘vehicle domain’ might be related to manual cognates in the gestural domain. This, however, does not explain the differences in acquisition rate regarding the other entities we elicited. So, what could explain the relatively early onset of classifiers representing standing people as compared to the classifiers for sitting people and animals? Here, we offer two potential explanations for this difference. First, learners might be more sensitive to classifiers representing certain semantic domains (e.g. vehicles and standing people) than to other classifiers, resulting in (non-)uptake from the input. This difference in sensitivity might be caused by features such as degree of abstractness (the classifiers for vehicles and standing people represent the objects as a whole and are more abstract than the classifiers for sitting people and animals, which represent a specific part of the entity, i.e. the legs), or frequency in the input. Second, learners might be biased to use alternative devices such as the lexeme
Regarding learner characteristics (research question 2), we noted learner behaviors that were clearly idiosyncratic (e.g. one learner tried to depict the spatial layout of two cars by using the classifier handshapes for standing people) as well as learner behaviors that appeared in the data of several learners. We noted, for example, several identical handshape substitutions across learners. Notably, these self-invented classifier handshapes (see Figure 11) are all existing NGT handshapes, and all of them represent attempts to depict an object as a whole. Another learner characteristic that we observed in several learners was the failure to use the [handshape]-classifier as an alternative for the [handshape]-classifier, resulting in physically difficult and off-target scene descriptions (see Section IV.3.a). At the moment, we can only speculate about the cause of this error. It might well be that the teachers have emphasized the relation between the rectangular shape of a car and the [handshape]-handshape (thus neglecting the [handshape]-handshape), causing some form of overgeneralization in the learners. A third typical learner characteristic was the combined use of different spatial devices (Section IV.3.d). This high degree of redundancy decreased during the second year of the study.
5 Limitations
We are aware that our research has limitations. A first limitation is the repetitive nature of the study. In order to compare the learners’ productions, we opted to use prompts involving the same referents, yet in different presentation formats, in each test. This format comes with the risk that learners, being confronted with the same construction repeatedly, use resources that enable them to sign a construction correctly during the next session (e.g. looking up how to sign the construction or asking their teacher), with an accelerated learning curve as a result. A second issue involves our decisions regarding coding. As discussed elsewhere, there are commonalities between the form of gestural productions on the one hand, and linguistic representations on the other. To prevent over-attribution of linguistic status to gestural productions, we were conservative in our coding. This, however, may have led to an underestimation of the learners’ performances. A third limitation is that the study was confined to the elicitation of short responses using visual prompts, which does not provide any information about the learners’ capacity to apply these structures in longer and more elaborate, natural conversations. Despite these limitations, we believe that the study provides valuable data on the acquisition of classifier predicates, and that the findings presented offer a good starting point for further research.
6 Future research
We hope that the current investigation will serve as a base for future studies on the M2L2 acquisition of Entity classifiers to confirm, complement or challenge our findings. Future work could focus on the production of these structures in elaborate, natural conversation. Our study shows that learners are capable of using these structures at an early stage, but to what extent are they capable of applying them in a communicative context? And how can teachers support learners in this process? Furthermore, it would be interesting to replicate the current study in other sign languages, especially sign languages which employ classifier handshapes that clearly differ from the gestural forms produced by non-signers (e.g. American Sign Language). This might shed light on the potential supportive role of gesticulations in the learning process.
VI Conclusions
In this longitudinal study, 14 novel learners of NGT were, over the course of two years, repeatedly presented with a task that was designed to elicit two-handed classifier constructions, a linguistic construction that is not present in the mother tongue of the learners. Our data demonstrate that, after a year of instruction, the production of classifier predicates representing objects that people encounter in their daily lives (cars, bicycles, trucks, persons, and animals) did not pose a significant challenge for the majority of the participants. In fact, already early on during the learning process, most learners demonstrated understanding that an object can be positioned in space by a handshape representing that object. The biggest challenge for them was to acquire the rules governing the (default) orientation and handshape, as well as the coordination of both hands in relation to each other in space. In particular, the classifier predicates denoting sitting persons and animals posed challenges, and appeared late as compared to the other classifiers. This implies that learners might benefit from explicit instruction directed at these particular classifier predicates. A second pedagogical implication is that, given the difficulties experienced by our participants regarding the coordination of the hands in space, instruction regarding the use of both hands in relation to each other might be beneficial.
Acknowledgements
We are indebted to the participants who volunteered their time and gave their permission to use screenshots of their performances in this publication. We are grateful for the assistance of Jamie Knecht, Karin Vinke, Christiaan Plug, Yfke van der Woude, Adde Woest, and Marijke Scheffener. For reasons of copyright, the prompts presented in this chapter were re-produced. We thank Jacques Visker and Dorieke van Luit for their help. The prompt at the top left of Figure 8 was produced by photographer Peter Stam and is used with permission. Figure 1 depicts deaf signer Tobias de Ronde and was produced by Annette Jansen. We gratefully acknowledge the valuable feedback on earlier versions of this article provided by Roland Pfau, Beppie van den Bogaerde, Rick de Graaff, and the anonymous reviewers.
Funding
The author received no financial support for the research, authorship, and/or publication of this article.
