Abstract
This study investigates whether gesture-enhanced recasts lead to better production of the English regular past tense. Fifty-nine low-intermediate ESL students at a US university took part in communicative activities in class, during which they received, depending on their assigned condition, no feedback, verbal recasts, or gesture-enhanced recasts, the last being a verbal recast accompanied by a point-back gesture indicating the non-target-like use or absence of the past tense. All learners also completed two assessments, a grammar test on the regular past tense and an oral production test designed to elicit the regular past tense, as a pre-test, an immediate post-test, and a delayed post-test a week later. A repeated-measures ANOVA on the test scores was then used to analyse linguistic development. The results showed no difference across the conditions in the grammar test, owing to a ceiling effect. In the oral production test, learners improved significantly from the pre-test to the post-test, but there were no differences across the conditions. This contradicts a previous finding that teachers’ pedagogical gestures during recasts better facilitated the development of locative prepositions. Further, this study discusses how the learning types (rule-based vs. item-based) involved in the two linguistic targets and the different gestures used in the two studies may affect the efficacy of recasts.
I Introduction
This study investigates the effectiveness of gesture-enhanced recasts for the acquisition of the regular past tense in English. Recasts, ‘the teacher’s reformulation[s] of all or part of a student’s utterance, minus the error’ (Lyster & Ranta, 1997, p. 46), are one of the most frequently used forms of corrective feedback (CF) in a language classroom. A recent meta-analysis by Brown (2016) showed that, across classroom CF studies, recasts accounted for 57% of all CF, the highest proportion of any CF type used in a language classroom. Although recasts are often treated as a single CF category, their saliency, which potentially affects second language (L2) learning, varies with features such as prosody and linguistic focus (e.g. Loewen & Philp, 2006; Sheen, 2006).
While the existing studies often focus on the verbal aspects of recasts, it is possible that other cues used during recasts contribute to L2 learning. A series of studies have reported the benefits of using gestures in educational contexts. Seeing the instructor’s gestures is reported to help children’s cognitive development, particularly with regard to understanding mathematical concepts (e.g. Cook, Yip, & Goldin-Meadow, 2010; Goldin-Meadow, Cook, & Mitchell, 2009; Goldin-Meadow & Sandhofer, 1999; Goldin-Meadow & Singer, 2003). Linguistically, seeing gestures helps second language (L2) comprehension (e.g. Dahl & Ludvigsen, 2014; Sueyoshi & Hardison, 2005) and L2 vocabulary learning (e.g. Kelly, McDevitt, & Esch, 2009; Macedonia et al., 2011; Macedonia & Klimesch, 2014; Tellier, 2008), and teachers’ gestures have pedagogical features such as metaphorically representing the concept of tense and aspect (e.g. Hudson, 2011; Zhao, 2007). Recently, Wang and Loewen (2016) reported how gestures and other non-verbal cues are used during CF. However, intervention studies examining the effectiveness of gestures used with verbal CF are extremely limited (for an exception, see Iizuka & Nakatsukasa, under review; Nakatsukasa, 2016). To contribute to the existing body of research on CF and on gestures, this study examined whether seeing metaphoric gestures indicating the past tense during verbal recasts can facilitate L2 acquisition of the regular past tense.
II Gesture and L2 learning
Gestures, hand movements that are directly tied to speech (McNeill, 1992), play an integral part in interaction and communication. In language classrooms, some classroom observational studies described how teachers used gestures to serve pedagogical functions – to teach vocabulary (e.g. Inceoglu, 2015; Lazaraton, 2004; Smotrova & Lantolf, 2013) and abstract concepts such as metaphors, verb tenses, and spatial relationships (e.g. Allen, 2000; Hudson, 2011; Lazaraton, 2004; Matsumoto & Dobs, 2017; Tellier, 2006; Wang, 2009; Zhao, 2007).
Similarly, intervention studies have reported that using gestures helps vocabulary teaching (e.g. Kelly et al., 2009; Macedonia et al., 2011; Macedonia & Klimesch, 2014; Morett, 2018; Tellier, 2008), the teaching of tones and intonations (e.g. Morett & Chang, 2015; Yuan, González-Fuente, Baills, & Prieto, 2019), but not segmental phonology (e.g. Hirata, Kelly, Huang, & Manansala, 2014; Kelly, Hirata, Manansala, & Huang, 2014), and finally L2 comprehension (e.g. Dahl & Ludvigsen, 2014; Kida, 2008; Sime, 2008; Sueyoshi & Hardison, 2005), which is a linguistic area closely related to the present study. Although the number of intervention studies regarding L2 comprehension is limited, the studies so far present a case that seeing gestures is helpful for L2 comprehension. Sueyoshi and Hardison (2005) investigated whether L2 learners comprehend a lecture better if they see the speaker’s gestures and facial expressions. Forty-two low–intermediate and advanced learners of English watched a video of an English lecture in one of the following three conditions: (1) an audiovisual lecture with facial expressions and gestures, (2) an audiovisual lecture with facial expressions but without gestures, and (3) an audio lecture without facial expressions or gestures. They compared the scores of a listening–comprehension test across the three conditions and found that facial cues were helpful for higher-proficiency speakers, whereas seeing both gestures and facial cues was helpful for lower-proficiency speakers. Kida (2008) also reported the facilitative role of gestures on L2 comprehension for lower-proficiency speakers. Dahl and Ludvigsen (2014) also reported that seeing a speaker’s gestures helped the comprehension of adolescent learners of English as a foreign language (EFL) more than that of native English speakers.
Collectively, the studies suggest that gestures help L2 comprehension, particularly for lower-proficiency speakers, as well as vocabulary teaching and some phonological aspects of L2 acquisition.
The efficacy of gestures for L2 grammar development has not been confirmed yet by intervention studies, to the best of the present author’s knowledge. If, as psychological studies have reported, seeing gestures triggers semantic processing (Holle & Gunter, 2007; Kelly, Kravitz, & Hopkins, 2004; Kelly, Ward, Creigh, & Bartolotti, 2007; Wu & Coulson, 2005), allowing the brain to decrease the need for semantic control and to use gesture as an additional source of information (Skipper, Goldin-Meadow, Nusbaum, & Small, 2007), it can be hypothesized that seeing meaningful gestures, the gestures that metaphorically represent the grammatical functions, helps learners process the information of the CF with less effort.
III Teacher gestures, corrective feedback, and learning
Studies generally agree that oral CF promotes L2 acquisition, particularly, according to meta-analyses, of grammatical and lexical features (e.g. Mackey & Goo, 2007, but see also Li, 2010; Lyster & Saito, 2010; Russell & Spada, 2006). Among various types of CF, language teachers often employ recasts, typically considered as a type of implicit feedback (e.g. Brown, 2016; Doughty, 1994; Havranek, 1999; Lyster & Ranta, 1997; Mackey, Gass, & McDonough, 2000, but see also Oliver, 1995). Language teachers employ recasts because they do not interrupt the flow of communication and the correction is done in a timely manner (Ellis & Sheen, 2006; Han, 2002; Leeman, 2003; Loewen & Philp, 2006). Theoretically, recasts are helpful because learners can notice the difference between their first language (L1) and target-like use and because recasts include positive evidence (Long, 2007). Yet, debate continues on whether noticing occurs after the learners have received recasts. Some studies presented the favorable impact of recasts on L2 learning (Doughty & Varela, 1998; Mackey & Philp, 1998; Nassaji, 2009; Oliver, 1995), while others showed limited effects (Havranek, 1999; Leeman, 2000; Lyster, 1998, 2001, 2004; Lyster & Ranta, 1997; Sheen, 2004, 2007; Slimani, 1992) or were inconclusive (Loewen & Nabei, 2007; Long, Inagaki, & Ortega, 1998; Yang & Lyster, 2010).
One of the reasons for these inconclusive findings seems to be that the way in which the recasts were provided may affect their explicitness (Loewen & Philp, 2006; Sheen, 2006). For example, Loewen and Philp investigated the relationship between the characteristics of recasts, uptake, and the accuracy of learners’ post-test scores. Their results showed that some specific aspects of recasting, including stress, declarative intonation, use of one correction as opposed to multiple corrections, and multiple feedback moves, resulted in higher rates of successful uptake. In addition, interrogative intonation, shortened length, and use of one change between the learners’ original utterance and a recast were associated with the accuracy of the test scores. Sheen (2006) also found that three characteristics of recasts (length, type of correction, and linguistic focus) were related to learner uptake, and identified six characteristics of recasts (mode, length, type of correction, linguistic focus, reduction, and number of corrections) associated with learners’ repair. All these factors have been associated with how ‘corrective’ recasts appear to learners and whether linguistic targets are salient enough for learners to notice. The studies pointed out several verbal features that affect the saliency and efficacy of recasts. If so, it is also necessary to investigate if nonverbal cues, such as gestures, also affect the saliency and efficacy of recasts.
A few recent studies examined how gestures are used during corrective feedback (Davies, 2006; Wang & Loewen, 2016) and some intervention studies examined if gestures affect L2 learning (Iizuka & Nakatsukasa, under review; Nakatsukasa, 2016). Wang and Loewen (2016) documented the non-verbal behavior of instructors in ESL classrooms while corrective feedback was being given. They observed about 65 hours of an ESL classroom and identified that more than 60% of corrective feedback was accompanied by various nonverbal behaviors. Specifically, they found that the more explicit form of feedback was accompanied by nonverbal behaviors. Similarly, Davies (2006) observed how frequently nonverbal cues were used during focus-on-form episodes (FFEs) and reported that 47% of FFEs were accompanied by nonverbal cues.
Although these studies did not assess whether learners’ perception of CF changes with or without non-verbal cues, it is possible that some non-verbal cues affect the impact of CF. First, non-verbal cues may add information (e.g. metalinguistic cues provided via gestures) to the verbal CF and make the CF more salient and noticeable than when it is provided only verbally. Additionally, the gestural studies have shown that seeing speakers’ gestures helps L2 comprehension. Based on these studies, it can be hypothesized that incorporating gestures makes CF more noticeable and comprehensible. If so, CF presented both gesturally and orally would have a greater impact on L2 development than CF given only orally.
The present author’s study (Nakatsukasa, 2016) appears to be the only intervention study to have examined the effectiveness of gestures when used with CF, specifically recasts, for the development of the English locative preposition. The results of that study indicated that learners who received recasts with pedagogical gestures that depicted the concepts of locative prepositions retained their linguistic development in delayed post-tests, whereas the development was diminished among learners who received verbal recasts only. Based on these findings, the author argued for the effectiveness of gestures incorporated into recasts, although it is premature to draw any firm conclusions based on one study. Specifically, the target items used in the study (Nakatsukasa, 2016) have similar characteristics to those of vocabulary learning. The learners were required to learn locative prepositions that entailed the most prototypical meanings. What is not yet known is whether a similar finding can be obtained for linguistic items that involve rule-based learning, such as the regular past-tense.
IV Corrective feedback and the regular past tense
The selected target structure of this study was the English regular past tense -ed, which can be difficult for learners but can be easily corrected via feedback. Previous studies showed that teachers actually use gestures while teaching the past tense (Hudson, 2011; Matsumoto & Dobs, 2017).
Although the regular past tense is introduced early in English language courses, students often have issues in using it in their production (e.g. Davies, 2006; Wang, 2009). Ellis, Loewen, and Erlam (2006) reported that learners are familiar with the concept of the past tense but their spontaneous production lags behind. This was also reported by the instructors of the classes where the current study’s data collection took place.
A few studies have specifically explored how learners can learn the past tense from corrective feedback. Yang and Lyster (2010) investigated how learners of English-as-a-foreign-language from China acquired English regular and irregular past tense verbs under three conditions: prompts, recasts, and no feedback. Students in the prompt condition improved more than those in the other two conditions in the acquisition of regular past-tense verbs. The significant impact of feedback type on the acquisition of regular past tense verbs was also reported in Ellis et al. (2006). They randomly assigned low–intermediate ESL students into one of three conditions (recasts, metalinguistic feedback, and no feedback) and compared the acquisition of regular past tense verbs between the three conditions. The researchers found that learners in the metalinguistic feedback group learned better than those in the recast group. However, it is still not known if the recasts can foster the acquisition of the regular past tense when the metalinguistic information is provided visually during recasts.
V Research questions
By bridging the literature on recasts and gestures, this study investigated the effectiveness of recasts when used with pedagogical gestures for the acquisition of the regular past tense. Specifically, the following questions were asked:
How effective are recasts for the acquisition of the regular past tense, as measured by a grammar test when recasts are provided only verbally versus with gestures?
How effective are recasts for the acquisition of the regular past tense, as measured by an oral production test when recasts are provided only verbally versus with gestures?
VI Methodology
1 Participants
A total of 70 participants who were enrolled in one of the ten low–intermediate ESL classes at a large state university in the United States participated in this study. Their L1s included Arabic (n = 21), Chinese (n = 44), Korean (n = 1), Japanese (n = 2), and Thai (n = 1). All the participants were new arrivals to the United States. It was the first semester for the majority of the students (n = 58) or the second semester for a few (n = 12). One participant had lived in the USA until the age of five but stayed in Korea afterwards. The participants had studied English for an average of 6.20 years (SD = 3.40) at the time of data collection. After receiving approval for data collection from the course instructor, the researcher randomly designated ten classrooms as either classrooms of gesture-enhanced recast conditions (GR) or classrooms of verbal recast conditions (VR). Each class had seven to 11 students. Initially, all 70 participants agreed to participate in the study. Four classrooms with a total of 37 students were assigned to the GR condition, another four classrooms with 22 students were assigned to the VR condition, and a further two classrooms with 11 students were assigned to the control group. Eligible participants were those who did not miss more than one post-test and who scored lower than 80% on the oral production test, the average score obtained by students in an advanced ESL class at the same institution, one level above that of the participants in the current study. Data from 11 students were therefore excluded from the study. As shown in Table 1, this left data from 27 students in the GR condition, 21 in the VR condition, and all 11 in the control group for the analysis. Table 1 illustrates the distribution of learners’ L1s, gender, and average lengths of studying English.
Gender, first language (L1s), and lengths of English study of participants included in the analysis.
Notes. C = control group. L1 = first language. VR = verbal recast conditions.
The control group was included in the study to evaluate the acquisition of the regular past tense without any kind of recast, on the assumption that learning may occur just by participating in a communicative task. It would have been ideal to have the same number of participants in each condition. However, the control group ended up being smaller than the two experimental groups because the two experimental conditions were prioritized for the statistical analyses on the grounds that they would yield more meaningful contrasts in answer to the research questions. Specifically, it was originally planned to assign three classrooms to the control condition. However, owing to the high attrition rate in the VR condition, it was necessary to reassign one class from control to VR. The author did not question the students about absences during the data collection, and therefore the reasons why classes A and B had a lower rate of participation are not known.
2 Target structure
For the present study, the researcher specifically selected the following verbs that take the regular past tense: cook, play, kiss, watch, talk, call, and wash. The verbs were selected because the participants were, according to interviews with the instructors, already familiar with them and thus would not have to struggle with lexical items during the study. Variation in pronunciation was not a focus of this study, but the feedback was provided with the appropriate pronunciation.
3 Materials
This intervention study included pre-tests, treatment sessions, immediate post-tests, and delayed post-tests. The detailed procedure is presented in the next section.
a Assessment instruments
The pre-test, post-test, and delayed post-test included an oral production test and a grammar test targeting the regular past tense. The oral production test was designed to assess learners’ use of the target structure in spontaneous speech, and the untimed grammar test was designed to assess their explicit knowledge.
Three versions of the past tense oral production test were created, each containing a set of seven pictures to elicit the selected regular past tense verbs. This test was composed of two slides, as shown in Figure 1. The participants were presented with seven pictures that described what happened to a character the previous weekend. The pictures were designed to elicit all the target verbs, cook, play, kiss, watch, talk, call, and wash. Participants had 30 seconds to review the story, after which they responded to the prompt, ‘Please tell me what happened to Julia [the name of the character] last week.’ This oral test was audio-recorded using a voice recorder. The ratio of the correct use of past tense in the obligatory context was used as the score. The three versions of picture description tests with past tense were randomized and used as a part of a pre-test, immediate post-test, and delayed post-test sequence.

Pictures used in the oral production test.
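The scoring rule for the oral production test can be illustrated with a short sketch. It assumes that each obligatory context in a recording has already been coded as target-like or not, and that the ratio was expressed as a percentage; the function name and the sample coding are hypothetical and are not taken from the study's materials.

```python
def oral_production_score(obligatory_contexts):
    """Return the percentage of obligatory contexts with a correct past-tense form.

    `obligatory_contexts` holds one boolean per obligatory context found in a
    learner's recording (True = target-like regular past tense).
    """
    if not obligatory_contexts:
        raise ValueError("no obligatory contexts were identified")
    correct = sum(obligatory_contexts)
    return 100.0 * correct / len(obligatory_contexts)


# Hypothetical learner: seven obligatory contexts, four supplied correctly.
coded = [True, False, True, True, False, False, True]
score = oral_production_score(coded)
print(round(score, 1))  # 57.1
```

Scoring against obligatory contexts, not against a fixed number of items, keeps the measure comparable across learners who produce different numbers of past-tense contexts.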
The past tense grammar test, consisting of 20 questions, was constructed to measure the participants’ understanding of regular past tense verbs. Three versions of the grammar test were created. There were five distractor sentences (e.g. containing articles and prepositions errors), ten ungrammatical sentences in which the past tense was misused, and five grammatical sentences. For each question, the participants judged if the sentence was grammatical. When they judged that the sentence was grammatical, they were asked to circle ‘correct’. When they answered ‘correct’ to the grammatical sentences, they were given .5 point. When they judged that it was ungrammatical, they were asked to circle ‘incorrect’ and write a correct sentence. The correction part was added to make sure that the judgment was made based on the target structure (for the ten target sentences) and not on other linguistic features. When they answered ‘incorrect’ to an ungrammatical sentence, they were given .5 point. An additional .5 point was given when their correction was right. The participants were allowed to spend as much time as needed to complete this test. The maximum possible score was 12.5. An example of an ungrammatical (past tense) sentence is shown below.
Example: Yesterday, Julia cooks Indian food for John (CORRECT / INCORRECT) If Incorrect, please correct the sentence.
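The scoring scheme for the grammar test can be written out as a small sketch. It assumes that the five distractor sentences were not scored, which is the reading consistent with the stated maximum of 12.5; the function names and the tuple layout are illustrative only.

```python
def score_item(is_grammatical, judged_correct, correction_ok=False):
    """Score one non-distractor item under the rubric described above."""
    if is_grammatical:
        # Grammatical sentence: 0.5 point for circling 'correct'.
        return 0.5 if judged_correct else 0.0
    if judged_correct:
        # Ungrammatical sentence wrongly accepted: no credit.
        return 0.0
    # Ungrammatical sentence: 0.5 for circling 'incorrect',
    # plus 0.5 when the learner's rewritten sentence is right.
    return 0.5 + (0.5 if correction_ok else 0.0)


def score_test(items):
    """Sum item scores; `items` holds (is_grammatical, judged_correct, correction_ok)."""
    return sum(score_item(*item) for item in items)


# A perfect paper: 5 grammatical items accepted, 10 ungrammatical items
# rejected and corrected. Distractors are left out (assumed unscored).
perfect = [(True, True, False)] * 5 + [(False, False, True)] * 10
max_score = score_test(perfect)
print(max_score)  # 12.5
```

Under this reading, a learner who spots every past-tense error but corrects none of them would score 7.5 out of 12.5.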
b Communicative activities
Two communicative activities were designed for this study to elicit the regular past tense in class.
c Picture sequencing activity
The first task was a picture-sequencing task. The participants were divided into pairs, and each pair received two pictures in a sequence illustrating part of a story about how one male character, Michael, met a female character, Erica, on the previous Friday. There were 18 pictures in total, and each pair received a set of two consecutive pictures (Pictures 1 and 2, Pictures 3 and 4, Pictures 5 and 6, and so forth) (see Figure 2). The task was designed to elicit the aforementioned regular past tense verbs. First, the researcher introduced some vocabulary words with which the learners might not have been familiar (e.g. outlet, barista, and chef). Then, the participants were asked to describe the two pictures in front of the class, following the prompt, ‘Please describe what happened to Michael and Erica last Friday,’ without showing the pictures to the rest of the classmates. This prompt was used consistently throughout the task, so that the participants could respond using the past tense. After everyone finished describing their own pictures, they were asked to arrange the nine pairs of pictures in the right order. The participants needed to negotiate the possible order of the pictures with one another until they came up with the correct order. The entire session was video-recorded. This activity lasted about 25 to 30 minutes. During the task, the participants received VR or GR when not using the past tense where it should have been used.

Picture sequencing activity.
d Information gap activity
The second task was an information-gap activity, also targeting the use of the regular past tense. First, the researcher described the context of the task: ‘There was a murder last Saturday and we need to identify the murderer.’ The entire class played the roles of suspects, with one participant playing the murderer. After reviewing some vocabulary words (e.g. murder, murderer, suspect, jail, and arrest), each participant was presented with a picture card that illustrated his or her alibi. However, one card contained the sentences, ‘You are the killer. Make up a story so that you will not be arrested.’ Each participant was asked to describe his or her alibi in front of the class, based on the card they received. The researcher asked some questions related to their alibi – such as, ‘How long did you stay at your friend’s place?’, ‘What teams were playing?’, ‘Who won the game?’ – to increase opportunities for oral production. On average, three related questions were posed to each participant. The participants needed to listen to one another and adjust their stories accordingly, so that their stories matched those told by the other characters. After telling their stories, the participants asked one another clarification questions until they found the murderer. This activity lasted about 20 minutes.
4 Procedure
Each round of data collection took about two weeks (see Figure 3). On the day of Session 1, the participants completed the background questionnaire, a regular past tense oral production test, and then a regular past tense grammar test. As mentioned earlier, there were three versions of each test to avoid a learning effect from repeating the exact materials. The pre-test results compared with one-way ANOVA revealed that there was no difference among the conditions with regard to the test results in the regular past tense grammar test F(2, 57) = 2.27, p = .11, or the regular past tense oral production test F(2, 57) = 1.07, p = .35.

Data collection procedure.
One to three days after the pre-test, the participants completed the treatment session that included the two communicative activities. The entire session was video-taped. When the participants did not use the regular past tense or used it incorrectly during the tasks, the researcher provided VR or GR, or provided no recasts, according to their assigned condition. To keep the quality of the recasts consistent, the researcher attempted not to stress or emphasize any of the words. An average of 15.25 (SD = .90) instances of recasts were provided during the 45–50-minute treatment session in the verbal recast condition, and 16.50 (SD = 2.30) instances in the gesture-enhanced recast condition. A t-test revealed that the frequencies in the two conditions were not significantly different: t(6) = 1.01, p = .35. All the feedback was directed to the learner who produced the non-target-like use of the regular past tense. The immediate post-test, which included the oral production test and grammar test in a different version from the pre-test, was administered a day after the treatment session. A subset of ten participants took part in a stimulated recall session within 24 hours instead of the immediate post-test. During the stimulated recall, learners were asked to watch every CF episode that happened in their own class. The video recordings were segmented by the researcher in advance. Each CF episode comprised the learner’s utterance, the CF, and the learner’s uptake when available. In addition, some non-CF episodes were included so that the learners would not think that they must always comment on corrections. CF episodes accounted for 75% of the stimuli and non-CF episodes for 25% in each stimulated recall session. The post-test was repeated a week after the treatment session as a delayed post-test.
Those who participated in the stimulated recall also completed a delayed post-test, but their data were not included in the analysis because completing a stimulated recall session potentially has a learning effect.
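The comparison of recast frequencies reported above can be checked from the summary statistics alone. The sketch below assumes an equal-variance independent-samples t-test with the classroom as the unit of analysis (four classrooms per recast condition, which matches the reported six degrees of freedom); the function name is hypothetical, and the p value is not recomputed here.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's independent-samples t statistic, assuming equal variances."""
    df = n1 + n2 - 2
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Reported means and SDs; n = 4 classrooms per condition is an assumption.
t, df = pooled_t(16.50, 2.30, 4, 15.25, 0.90, 4)
print(round(t, 2), df)  # 1.01 6
```

The result agrees with the reported t(6) = 1.01, which suggests that the comparison was made at the classroom level.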
5 Description of gestures and recasts
When the participants did not use the past tense in an obligatory context in the two tasks, the researcher consistently provided recasts with or without gestures immediately following the participants’ utterances, depending on the learners’ assigned conditions. Following Hudson’s (2011) description of teachers’ gestures that were used to teach the concept of the past tense, the ‘point-back’ gesture with the thumb was used to indicate the past in the GR condition (see Figure 4). From the researcher’s personal observations of the ESL classrooms, this point-back gesture is commonly used when teaching the past tense. For the VR condition, the researcher provided recasts only verbally, keeping her hands down at her sides to avoid gesturing. In addition, the researcher tried not to stress any part of the recast in either condition to keep consistency. The presence or absence of modified output could potentially affect the effectiveness of feedback (Egi, 2010). In all the instances, learners had the opportunity to modify their output; however, production of modified output was not enforced in the present study, to keep the flow of interaction and the saliency of feedback as equal as possible across conditions.

The point-back gesture used for gesture-enhanced recasts.
Feedback Example (VR)
And he wash the car.
Feedback Example (GR)
And he wash the car.
Oh,
6 Analysis of grammar and production tests
The research question first asked whether the learners used the regular past tense more readily when recasts were provided along with gestures. The grammar test scores and oral production test scores in the pre-test, immediate post-test, and delayed post-test were compared using repeated-measures ANOVA to identify whether any of the groups performed significantly differently from the others. Before interpreting the results, Mauchly’s sphericity test was used to verify whether the assumption of sphericity was violated. When it was violated, Greenhouse–Geisser adjusted scores were used. In addition, the effect size was calculated as Cohen’s d and interpreted against the benchmarks of Plonsky and Oswald (2014): small (d = .40), medium (d = .70), and large (d = 1.00). Finally, one-way ANOVAs were used as a post hoc test.
VII Results
1 Effect of VR and GR and grammatical knowledge
The first research question asked whether the regular past tense was acquired more efficiently following the two types of recasts. Table 2 shows the mean scores of the grammar test on the past tense; Figure 5 shows this visually. As may be seen, the scores obtained were relatively high at the time of pre-test for all the conditions and there were no obvious changes in the test scores in the immediate and delayed post-tests.
Descriptive statistics of grammar test.
Note. Maximum test score is 12.5.

Visual representation of mean grammar test scores and standard deviations.
Because Mauchly’s test of sphericity showed that the sphericity of the dataset was not assumed (p = .91), the adjusted data obtained from Greenhouse–Geisser was used for the interpretation. A repeated-measures ANOVA revealed that there was no significance in any contrast: F(1.99, 87.70) = .17, p = .87 for Time with a minimum effect size (d < .001), F(2, 44) = 1.29, p = .29 for Group with medium to large effect size (d = .50), and F(3.99, 87.71) = 3.08, p = .69 for Time X Group with small to medium effect size (d = .28). The results indicate that all the conditions remained the same from the pre-test to the delayed post-test and there were no differences between the conditions.
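The fractional degrees of freedom reported above follow from multiplying the uncorrected degrees of freedom by the Greenhouse–Geisser epsilon. The sketch below is a worked check under stated assumptions: three test times, three groups, a between-subjects error term of 44 (taken from the Group F ratio), and an epsilon of about .9966. The epsilon is not reported in the study; it is inferred here from 87.70 / 88, and the function name is hypothetical.

```python
def gg_adjusted_dfs(n_times, n_groups, df_between_error, epsilon):
    """Return Greenhouse-Geisser adjusted dfs for Time, Time x Group, and error."""
    df_time = n_times - 1
    df_interaction = (n_times - 1) * (n_groups - 1)
    df_error = (n_times - 1) * df_between_error
    # Every within-subject df is scaled by the same epsilon.
    return tuple(epsilon * df for df in (df_time, df_interaction, df_error))


# epsilon = 0.9966 is an inferred value, not one reported by the study.
df_time, df_interaction, df_error = gg_adjusted_dfs(3, 3, 44, 0.9966)
print(round(df_time, 2), round(df_interaction, 2), round(df_error, 2))  # 1.99 3.99 87.7
```

An epsilon this close to 1 means the correction changes the degrees of freedom only marginally, so the corrected and uncorrected tests would lead to the same conclusions.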
2 Effect of VR and GR on the development of oral production
The second analysis compared the learners’ oral production of the past tense following the two types of recasts. Table 3 shows the scores obtained from the past tense oral production test, and Figure 6 shows this visually. Figure 6 illustrates that there was an overall increase from the pre-test to the immediate post-test in all the conditions. Specifically, the participants in the control group and the VR condition had higher scores in the delayed post-test than the immediate post-test, whereas those in the GR condition did not. For the analysis, the scores from Greenhouse–Geisser were used because the sphericity of the data was not assumed, according to Mauchly’s test of sphericity (p = .42). A repeated-measures ANOVA revealed that there was a significant Time effect, F(1.93, 86.67) = 7.24, p < .001 with a medium effect size (d = .70) indicating overall improvement. A paired-sample t-test showed that there was a significant increase from the pre-test to the immediate post-test, t(58) = −4.08, p < .001, but not from the immediate to the delayed post-test t(57) = −.416, p = .68, indicating that learners overall improved from the pre-test to the immediate post-test but remained the same in the delayed post-test. However, no significant Group effect F(2, 45) = .07, p = .94, and no significant effect of Time X Group F(3.85, 86.67) = 1.344, p = .26 were found. Their effect sizes were small to medium for Group (d = .50) and small for Time X Group (d < .001).
Table 3. Descriptive statistics of oral production test.
Note. Maximum test score is 100.

Figure 6. Visual representation of mean production test scores and standard deviations.
3 Results from stimulated recall
Finally, it is worth mentioning that none of the stimulated recall comments was related to noticing the difference between learners’ interlanguage and the target-like use of the regular past tense or recognizing the corrective nature of recasts. The comments illustrated their engagement in the communicative tasks and their anxiety about speaking in front of the class. In total, about 30% of the comments were about the participants’ engagement in the game aspect of the tasks (e.g. I really wanted to guess right), 30% about not remembering anything specific (e.g. I don’t remember), 15% about their effort in comprehension (e.g. I paid attention to [the student’s name] because his character name was on my card), 5% about being anxious about speaking up in class (e.g. I was nervous because I needed to speak next), 5% about evaluating peers’ performance (e.g. I think his English is very good), and the remaining 10% were about other topics.
VIII Discussion
This study examined whether gestures, when used in addition to verbal recasts, can help ESL learners acquire the regular past tense. The researcher hypothesized that recasts combined with gestures would promote better learning, for the following reasons. First, gestural studies have collectively reported that seeing gestures is helpful for vocabulary learning (although no similar intervention studies have been published for grammar learning to date). If so, seeing gestures that metaphorically illustrate the regular past tense may also facilitate its acquisition. Second, according to Ellis et al. (2006), metalinguistic feedback is more helpful for teaching the regular past tense than recasts. If the gesture can carry the same linguistic information as metalinguistic feedback, GR should be more effective than VR. In this discussion, the interpretation of the results is presented first. The following two sections then consider why VR and GR did not lead to greater gains than the control condition in this study.
1 Interpretation of test scores and stimulated recall comments
The results of the first analysis revealed that all the conditions remained the same after receiving feedback in the grammar test. This is likely due to the high ceiling effect of the test scores. (The participants’ average pre-test score was 9.82 for the control group, 10.18 for the recast condition, and 10.07 for the gesture and recast condition, out of 12.50 possible points.) This also indicates that the participants already had a good understanding of the regular past tense as an element of their grammatical knowledge.
Unlike the grammar test scores, the pre-test scores for the production test were between 21.40 and 23.40 out of 100 across three conditions, meaning that the participants had some room for ‘growth’ after the treatment. In addition, it indicates that, although participants seem to have had explicit grammatical knowledge, they were still unable to use the regular past tense in spontaneous production. The statistical analysis showed that all the conditions, including the control, improved equally from the pre-test to immediate post-test. This indicates that the communicative activities used during the treatment had some effect on subsequent production tests.
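The contrast between the two tests can be made concrete with a small arithmetic sketch. The helper name `headroom` is hypothetical; the means and maximum scores are the ones reported in the text, and the VR and GR labels are assumed to correspond to the recast condition and the gesture and recast condition.

```python
def headroom(mean_score, max_score):
    """Return the share of the scale that remains above the group mean."""
    return 1.0 - mean_score / max_score


GRAMMAR_MAX = 12.50      # maximum grammar test score reported in the text
PRODUCTION_MAX = 100.0   # maximum oral production test score

# Grammar pre-test means as reported; group labels are assumed mappings.
grammar_pretest = {"control": 9.82, "VR": 10.18, "GR": 10.07}
# Lowest and highest production pre-test means across the three conditions.
production_pretest_range = (21.40, 23.40)

for group, mean in grammar_pretest.items():
    print(f"grammar, {group}: {headroom(mean, GRAMMAR_MAX):.1%} of the scale left")

for mean in production_pretest_range:
    print(f"production, mean {mean}: {headroom(mean, PRODUCTION_MAX):.1%} of the scale left")
```

On these figures, roughly a fifth of the grammar scale remained above the pre-test means, against more than three quarters of the production scale, which is consistent with a ceiling effect on the grammar test only.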
Interestingly, none of the comments obtained from the stimulated recall exhibited learners’ noticing. Combined with the results obtained from the pre-tests and post-tests, it is possible to argue that both types of recasts fostered learning even with a lack of noticing. However, this argument needs to be interpreted with caution because of a methodological issue addressed in the discussion of limitations.
2 Limited effects of verbal recasts on production test of the regular past tense
The results of the oral production test did not indicate any benefit of verbal recasts for the acquisition of the regular past tense. This is consistent with existing studies pointing out the difficulty of using recasts effectively to help in acquiring the regular past tense, such as those of Yang and Lyster (2010) and Ellis et al. (2006), which reported that recasts were not as effective as other feedback types such as prompts (in Yang & Lyster, 2010) and metalinguistic feedback (in Ellis et al., 2006). As the findings of Erlam and Loewen (2010) suggest, some structures may be better developed via corrective feedback than others. Erlam and Loewen compared the effectiveness of implicit and explicit feedback for French noun–adjective agreement. Fifty learners of French as a foreign language participated in the study. The learners engaged in four communicative tasks that were designed to elicit a target structure. Each session involved four to seven learners, and a researcher provided either implicit or explicit feedback. Their analysis of the learners’ pre-test and post-test scores showed that no significant difference existed between the two feedback modes. This finding indicates that some structures, including French noun–adjective agreement as reported in Erlam and Loewen and the regular past tense as reported in this study, are less likely to benefit from corrective feedback regardless of its explicitness or its inclusion of metalinguistic information. However, further studies are needed to determine what structures are least likely to be improved through corrective feedback.
It must be noted that some existing studies, such as Doughty and Varela (1998) and Han (2002), report contradictory findings regarding the effectiveness of recasts for the development of the regular past tense: both reported that recasts were helpful for fostering past tense acquisition. The differences in the results seem to be attributable to differences in the characteristics of the recasts and in the study contexts. As opposed to the verbal recasts used in this study, Doughty and Varela (1998) provided corrective recasts, which are a combination of repetition and recasting. As the name suggests, in corrective recasts the emphasis is on correction, which may have motivated learners’ noticing of target structures and improved their learning outcomes more than when traditional recasts were provided. Regarding Han (2002), the feedback was given in a lab setting, which has been reported to show a stronger effect than a classroom-based study, as shown in a meta-analysis by Li (2010). In addition, although the participants of the present study were expected to use the regular past tense consistently during communicative tasks, the participants in Han’s study used various tense types depending on the situation of the tasks, which may have required more cognitive effort, as they needed to choose the correct tense. These contextual differences may account for the greater impact of recasts in the previous two studies than in this study.
Finally, it may be postulated that other types of feedback might have worked better for regular past tense acquisition. In Ellis et al. (2006) and Yang and Lyster (2010), learners developed their oral production of regular past tense verbs when metalinguistic feedback without positive evidence or prompts was used instead of recasts. One commonality between these effective feedback types, based on these two studies, is the use of output-promoting feedback that encouraged the participants’ self-repair. As Egi (2010) suggested, participants of the present study may therefore have benefited more if they had been told to produce self-repair.
3 Limited effects of gesture-enhanced recast on oral production test of the regular past tense
To compensate for the lack of metalinguistic information in recasts, information that is reported to be effective for teaching the regular past tense (Ellis et al., 2006), this study added gestures that illustrate the source of the problem in learners’ original utterances, that is, the lack of past tense. The benefits of gestures during recasts were anticipated for the following reasons. (1) Studies show that seeing gestures helps L2 comprehension (e.g. Dahl & Ludvigsen, 2014; Kida, 2008; Sime, 2008; Sueyoshi & Hardison, 2005). If so, the participants should be able to understand the meaning or intention of recasts better when they are accompanied by gestures. (2) A collection of gestural studies has shown that seeing the gestures that describe the meaning of the L2 vocabulary helps vocabulary learning (e.g. Kelly et al., 2009; Macedonia et al., 2011; Macedonia & Klimesch, 2014; Morett, 2018; Tellier, 2008). Seeing teachers’ gestures that metaphorically explain the meaning of a grammatical function may therefore help with L2 grammar learning. (3) In the literature on feedback studies, researchers have pointed out that recasts have the drawback of lacking noticeability (e.g. Havranek, 1999; Leeman, 2000; Lyster, 1998, 2001, 2004; Lyster & Ranta, 1997; Sheen, 2004, 2007; Slimani, 1992). By nonverbally illustrating the source of error using gestures, the recasts’ corrective nature may become more recognizable, and the source of error may thus become more noticeable. However, the results showed that performance on the production test did not differ significantly among the three conditions. Recasts did not help the acquisition of the regular past tense, even when the gestures were added. In this section, I first speculate on why recasts with gestures were not effective in relation to the amount of time that participants had to process the recasts. Then, comparing the results from this study with the present author’s 2016 study (Nakatsukasa, 2016), I discuss the varying effects of gesture-enhanced recasts in relation to the linguistic targets and to the comprehensibility of different types of gestures.
First, it may be speculated that learners needed a longer time to process the feedback, even when metalinguistic information was provided gesturally. On average, it took about two to three seconds to provide recasts with or without gestures in this study, whereas the participants in Ellis et al. (2006) received slightly longer feedback and thus had more time to process it. Anecdotally, after the data collection was completed, the researcher asked an ESL teacher to act out an example of metalinguistic feedback from Ellis et al. (2006), and the duration of the feedback was found to be about four to five seconds, giving learners more time to process the information. The example of feedback used in Ellis et al. is shown below:
He kiss her.
Kiss – you need past tense.
He kissed.
(Ellis et al., 2006, p. 353)
Second, the results of this study contradict the findings of the present author’s previous study (Nakatsukasa, 2016). The first major difference is that the earlier study reported a significant advantage for the GR and VR groups over the control group in the acquisition of locative prepositions. This contradictory finding may have resulted from the different types of learning that are needed to acquire locative prepositions and the regular past tense. The linguistic target in Nakatsukasa (2016), the locative preposition, follows item-based learning. By contrast, the regular past tense of the present study may have been approached through rule-based learning. Previous studies illustrated that the acquisition of irregular past tense verbs tends to benefit more from corrective feedback than that of regular past tense verbs (e.g. Boom, 1998; Yang & Lyster, 2010). Yang and Lyster reported the same findings specifically in relation to recasts. They argued that regular past tense verb acquisition involved rule-based learning and irregular past tense verb acquisition involved item-based learning. Drawing on Skehan’s dual-mode system (Skehan, 1998), Yang and Lyster speculated that learners found accessing rule-based systems more difficult and were only able to do so following prompts in circumstances in which they were required to apply rules in actual production. By contrast, irregular past tense verbs were stored in an exemplar-based system, which does not require internal computation, resulting in easier retrieval than regular past tense verbs. This argument seems to account for the lack of benefit from recasts observed in the present study.
The next contradictory finding between the present study and the author’s previous study (Nakatsukasa, 2016) is that GR did not outperform VR long-term. The 2016 study, which found that seeing gestures is helpful for acquiring locative prepositions with clear one-to-one form-meaning mapping, is in line with Dual Coding theory, which argues that receiving information via multiple modalities is helpful particularly for concrete items (Clark & Paivio, 1991). By contrast, this study indicated that gestures did not provide additional help for regular past tense learning. This suggests that the English regular past tense lacks concreteness in comparison to locative prepositions and that VR is not an optimal form of instruction for the presumably less concrete linguistic features.
In addition to the concreteness of target structure, different types of gestures used in the 2016 study and the present study may have led to the contradictory findings. First, in the 2016 study, the four gestures illustrated the concept of locations (in, on, at, next to) as opposed to the one gesture used in this study. Using four different types of gestures may have allowed the participants to pay more attention to the gestures than in this study, where they saw only one type of gesture throughout the tasks. The second difference is that the gestures that illustrate locations may have been more comprehensible than the past tense gesture. In the previous study, gestures represented an image schema (e.g. in was represented by a container (left hand) and an object (right hand)), which may have allowed easier interpretation of the gestures. In this study, however, it may not have been as easy because the concept of the timeline was not introduced. If an illustration of a horizontal timeline, such as future (front) and past (back), had been presented in a form of explicit instruction before the communicative activities, the participants may have been better able to benefit from the gestures used in this study.
Nonetheless, this finding is interesting, because descriptive classroom studies have shown that teachers frequently incorporate gestures for teaching tense and aspect. Although no intervention studies have compared the effectiveness of instruction with and without gestures for tense and aspect, to the best of my knowledge, the results of this study suggest the limited effects of using gestures as a part of grammar instruction.
IX Conclusions
This article examined whether gesture-incorporated recasts help the acquisition of the regular past tense by beginning-level ESL learners. This topic is significant for the following reasons. First, an increasing number of second language acquisition (SLA) studies have incorporated gestures; however, most intervention studies examined learning of the pronunciation of vocabulary, not grammatical development. Second, although a significant amount of research on corrective feedback has been published, very few studies have incorporated gestures as a variable. It was hypothesized in the present study that learners would benefit more from gesture-incorporated recasts than from verbal-only recasts, following the author’s 2016 study (Nakatsukasa, 2016) and based on other descriptive studies. However, the hypothesis was not supported. The results showed that there was no significant difference between the two recast conditions. Furthermore, neither experimental condition differed significantly from the control group. The following possible reasons were proposed: (1) feedback types that explicitly present corrective nature and/or require self-repair may have been more suitable for the regular past tense than either type of recast, and (2) gesture-enhanced recasts may be suitable for teaching concrete linguistic items and those that require item-based learning, such as locative prepositions and vocabulary, as opposed to the regular past tense. Although a number of descriptive studies report that instructors use gestures for pedagogical purposes, it should not be assumed that seeing teachers’ gestures is universally helpful for L2 development. That is not to say, however, that seeing an instructor’s gestures cannot help grammar learning at all. More intervention studies are needed to examine the efficacy of gestures for learning various linguistic items. It is possible that gestures may be effective for linguistic items that undergo item-based learning (e.g. vocabulary and locative prepositions) and ineffective for rule-based items (e.g. the regular past tense). Further, a learner may need a longer time to fully understand the concept of the tense system.
This study is not without limitations. First, the recasts were provided only for the target structures during the communicative tasks. There were other linguistic errors that occurred during the interactions; however, they were not corrected. Providing recasts only to the specific structures may have enhanced the explicitness of the recasts more than the same recasts used in language classrooms, where, it may be assumed, the recasts would target a variety of structures. For better ecological validity, corrective feedback should be given that is not limited to the target structures.
The second limitation concerns the lack of a long-delayed post-test. The time frame used for delayed post-tests has varied greatly in existing studies. In the present study, the delayed post-test was administered seven to nine days after the completion of the sessions. However, as mentioned in Li (2010), the timing of the delayed post-test appears to affect assessment of the learning outcomes. Because of logistical difficulties, this study included only one delayed post-test, whereas an additional delayed post-test after a month might have yielded more information on the long-term effectiveness of feedback.
Third, the length of treatment was only 50 minutes. Although the author’s previous study (Nakatsukasa, 2016) found that this was enough for learners to acquire locative prepositions, learners may need a longer session to acquire the past tense, even when they know the rules explicitly. This also would have allowed learners to receive CF more frequently.
Fourth, the finding that the stimulated recall comments did not reflect learners’ noticing should be interpreted with caution, and the following methodological issue should be considered. Stimulated recall was conducted in English and not in the learners’ L1s. Given that they were not advanced speakers, the interview would ideally have been conducted in their L1. However, their wide range of L1s made it impossible to identify collaborators who could assist the stimulated recall in every relevant language.
Finally, learners’ perception of gestures was not assessed in this study because the gesture used in this study is very common based on the observational studies and the researcher’s personal observation of the ESL classrooms where the data was collected. However, because of their diverse backgrounds, some learners may have interpreted the meanings of gestures differently from what was intended. A brief interview with the participants would have been helpful. If a longitudinal study is to be conducted in the future, introducing the meaning of the gestures and using the relevant gestures constantly during class may be an ideal way to avoid this issue.
Despite these limitations, this study contributes to the field by being one of the few studies to examine the effects of gestures when used during corrective feedback. Specifically, the study showed that the effectiveness of gestures cannot be claimed across the board. More intervention studies involving various linguistic targets are needed to see what linguistic elements benefit from recasts with or without gestures and to examine whether the learning system is a crucial factor.
Pedagogically, the findings of this study suggest that gestures, when used with recasts, have a limited effect on linguistic items that undergo rule-based learning. However, prior to incorporating this finding in a teacher preparatory program, a systematic analysis of pedagogical gestures used in language classrooms and a series of intervention studies are necessary. Prospective language teachers could then be instructed, during teacher preparatory courses, in how to maximize the efficacy of corrective feedback.
Footnotes
Acknowledgements
I extend my gratitude to Drs Shawn Loewen, Susan Gass, Paula Winke, and Debra Hardison for their valuable comments on an earlier version of the manuscript. Thank you, Andrew Dennis and crisshasart, for the beautiful drawings. Thanks to Drs Jenifer Larson-Hall and Aaron Braver for their assistance with the statistical analysis and to the Texas Tech Women Faculty Writing Group for providing an opportunity to complete this manuscript. Lastly, I greatly appreciate the ESL teachers and students of Michigan State University who collaborated with me on this study.
Funding
This manuscript used data from my dissertation project, which was funded by a Language Learning Dissertation Grant.
