Abstract
Research on corrective feedback (CF), a central focus of second language acquisition (SLA), has increasingly examined how teachers employ CF in second language classrooms. Lyster and Ranta’s (1997) seminal study identified six types of CF that teachers use in response to students’ errors (recast, explicit correction, elicitation, clarification request, metalinguistic cue, and repetition) as well as target linguistic foci (lexical, phonological, and grammatical errors). These taxonomies have remained dominant in observational studies conducted in a growing range of second language teaching contexts. Several studies have acknowledged that contextual factors may influence how teachers provide CF (e.g. Mori, 2002; Sheen, 2004) with few generalizable conclusions. The present study brings together research in this area in the first comprehensive synthesis of classroom CF research seeking to aggregate proportions of CF types teachers provide, as well as their target linguistic foci. Findings reveal that recasts account for 57% of all CF while prompts comprise 30%, and grammar errors received the greatest proportion of CF (43%). The study further identifies a range of contextual and methodological factors (i.e. moderators) that may influence CF choices across teaching contexts, such as student proficiency, teacher experience, and second/foreign language context. A clearer picture of the patterns of CF that teachers provide and the variables that influence these choices serves to complement the growing body of research investigating the efficacy of CF in second language pedagogy.
I Introduction
Corrective feedback (CF) has received significant attention in second language acquisition (SLA) research. This line of research stems, pedagogically, from the shift towards communicative language teaching with a focus on form (e.g. Norris & Ortega, 2000) and, theoretically, from the long-standing interactionist tradition of SLA (e.g. Gass, Mackey & Pica, 1998; Plonsky & Gass, 2011). As this line of inquiry builds, numerous classroom observational studies have examined the priorities teachers tend to exhibit in their oral feedback choices across instructional contexts (e.g. Li, 2010; Sheen, 2004), and to what extent different CF types are noticed and incorporated by learners (Ellis, Basturkmen & Loewen, 2001; Lyster & Ranta, 1997). Both experimental and observational CF research examines the effects of feedback on second language (L2) development, measured by pre-posttests or the rate of uptake or repair in response to different CF types (e.g. recasts). Classroom observational research also investigates student affect in response to CF, noticing or awareness of CF, or how teacher beliefs influence the provision of CF, among other foci. In recent years a growing number of meta-analyses (18 to date; Plonsky & Brown, 2014) have synthesized various domains of CF research (e.g. oral, written, computer mediated) with general findings that lend substantial support to the efficacy of CF in L2 learning. This body of synthetic research has helped illuminate when and how particular types of CF may be most effective (Li, 2010; Lyster & Saito, 2010; Mackey & Goo, 2007; Miller & Pan, 2012; Russell & Spada, 2006). While past CF meta-analyses have all investigated effectiveness, the present study aims to supplement these findings by synthesizing descriptive observational research on L2 teachers’ provision of CF in terms of type, extent, and linguistic target. 
A clearer picture of how practitioners choose to provide CF across contexts can lend ecological classroom insight and generalizability to the domain of oral CF research.
II Patterns of CF type and target in L2 classroom research
Lyster and Ranta (1997, henceforth L&R) introduced a refined taxonomy for coding incidental CF in meaning-focused L2 classroom instruction that included six prominent types of oral CF teachers provide. They reported the following distribution in the classrooms they observed: recasts (55%), explicit correction (10%), elicitation (11%), clarification requests (11%), metalinguistic cues (8%), and repetition (5%). From a pedagogical viewpoint, L&R’s study raised awareness of the full range of feedback options that teachers have available in providing CF. Their study also had important implications for L2 theory and research. Specifically, rather than assuming comparable effects for all CF types, they called attention to findings that teachers tend to rely heavily on one form of CF (recasts) but that recasts were found to be least effective in terms of eliciting uptake and repair. This finding spurred interest in the relationship between these and other feedback moves found in classroom interaction (Ellis et al., 2001; Loewen, 2003; Lyster, 1998; Miller & Pan, 2012; Panova & Lyster, 2002; Sheen, 2004, 2006). L&R’s taxonomy continues to be widely adopted in coding CF type in observational reports, allowing for comparable results across studies regarding choices teachers make in reactive focus on form.
Researchers have also generally followed L&R’s classification of the targeted linguistic forms of CF as phonological, lexical, and grammatical. Some have proposed that teachers’ choice of CF type (and their differential effectiveness) may depend on the characteristics of the targeted linguistic forms (Lyster, 1998; Mackey, Gass, & McDonough, 2000; Mackey & Goo, 2007). In one of the first studies to investigate error type as a potential mediator of learner repair, Lyster (1998) found that prompts (that lead students to self-repair) were more likely than reformulations (that supply the correct forms, usually as a recast) to lead to repair of lexical and grammatical errors, whereas reformulations were more likely to lead to repair in phonological errors.
III Exploring the role of context in mediating CF choices
One of the first observational CF studies to examine contextual variables that influence CF was Sheen (2004), which synthesized findings from four observational studies in an effort to identify moderating variables that may influence the proportion of CF types teachers utilize across varying contexts. This study compared rates of CF among Korean EFL, French immersion (Lyster & Ranta, 1997), Canadian ESL (Panova & Lyster, 2002), and New Zealand ESL (Ellis et al., 2001) settings and investigated variables in instructional setting (e.g. EFL, ESL, immersion, syllabus type, class size) and learners (e.g. age, proficiency, L1, experience with L2 instruction) in interpreting teachers’ CF choices. The findings of Sheen’s study revealed that proportions of CF type vary considerably across contexts, as teachers in Canadian immersion and ESL settings provided lower rates of recasts than those in the New Zealand ESL and Korean EFL settings. Furthermore, different rates of repair across settings led Sheen to posit that the efficacy of learner repair may be greater in instructional contexts where the focus of CF is more salient, that is, where students are accustomed to focusing on form explicitly.
In other primary research, Mori (2002) compared rates of reformulation and prompt CF types (see the methods section for a detailed description of CF categorization) across elementary grade levels, finding that the proportion of reformulations increased and that of prompts decreased in higher grades. Also examining age, Oliver (2000) found that children were more likely to receive recasts, and adults negotiation strategies (confirmation checks and clarification requests). Ahangari and Amirzadeh (2011) examined the relationship between CF type and learner proficiency level, finding that at higher proficiencies, the proportion of recasts decreased and prompts increased. While these studies lend potential insight into the effects of contextual variables on teachers’ choices, the limited samples of teachers across individual studies cannot be assumed to be representative of their contexts.
IV The contribution of research synthesis in CF research
Russell and Spada (2006) suggest, ‘there is a need not only for a greater volume of studies on corrective feedback, but also studies that investigate similar variables in a consistent manner’ (p. 156). To help address these concerns, research synthesis has quickly gained prominence in applied linguistics as an effective systematic tool to review primary research (see also Norris & Ortega, 2006; Oswald & Plonsky, 2010). One of the benefits of meta-analytic research is the ability to retrospectively examine variables in research context: questions that would be less practical or meaningful to investigate in individual primary studies with limited samples (Plonsky, 2012). In this spirit, Lyster and Saito’s (2010) meta-analysis of the effectiveness of CF examined instructional setting and learner age as potential moderators, finding that younger learners benefited more than older learners, with larger effects for prompts than recasts. Li’s (2010) meta-analysis investigated additional variables finding:
larger effect sizes for CF during activities involving mechanical drills than for meaning-focused activities;
studies involving L2 English yielded larger effects than L2 French or Spanish;
foreign language (FL) contexts resulted in significantly larger effects for CF than second language (SL) contexts; and
CF was more effective from native than non-native speakers.
As synthetic research continues to explore the effectiveness of CF taking into consideration an increasingly broader scope of potentially moderating and mediating variables, questions remain in the domain that could benefit from a meta-analytic perspective.
Much of our current understanding regarding teachers’ tendencies in feedback choices can be traced to just a small set of influential studies (Ellis et al., 2001; Lyster & Ranta, 1997; Sheen, 2004). As primary data builds regarding teachers’ CF moves, there is a growing opportunity to determine the extent of generalizability of past findings to various populations. Accumulating data across observational studies can help reveal how well these sampled contexts (e.g. proficiency levels, learners’ ages, teachers’ experience) represent L2 classrooms in general in terms of the provisions of CF as well as when systematic differences across contexts might exist, a hallmark of meta-analysis (Plonsky, 2012).
To be clear, the present study differs from past meta-analyses on CF in that it does not investigate effectiveness. Effectiveness in observational research is often operationalized as uptake or repair (i.e. students’ immediate response to CF); however, many of the studies in this data set focus on other aspects of the provision of feedback (e.g. the impact of teacher beliefs or preferences about CF, the rate at which learners notice CF, the impact of CF on student affect). As more observational studies contribute uptake data, meta-analytic methods should play an important role in understanding how CF type influences uptake as well. It should also be noted that effect size refers not only to values that describe the magnitude of the effects of a treatment. In this study, effect size is defined more broadly as ‘a quantitative reflection of the magnitude of some phenomenon…of interest’ (Kelley & Preacher, 2012, p. 140), and refers to aggregate percentages of feedback types and targets provided by teachers in L2 classrooms. Another key feature in meta-analytic research, in addition to providing average effect sizes, is the ability to explore and explain variation among those effects as a function of moderating variables (Bangert-Drowns, 1995; Plonsky & Oswald, 2012). In order to provide greater generalizability of past findings and to complement findings related to effectiveness of CF relative to context (e.g. Li, 2010; Lyster & Saito, 2010), the following research questions were formulated:
What is the aggregate proportion of CF types teachers provide in L2 classrooms?
What is the proportion of target linguistic foci (lexis, grammar, phonology) across observational CF studies?
What contextual variables (student, teacher, instructional setting, or methodological) moderate teachers’ feedback choices most significantly, in terms of feedback type and linguistic structure targeted?
V Method
1 Data collection: Identifying primary studies
Studies in the present meta-analysis were limited to descriptive, observational classroom data that incorporates coding of CF types administered by teachers in natural classroom discourse. Experimental studies (i.e. studies looking to test the effectiveness of a particular treatment) were excluded, as feedback type is usually controlled rather than naturally occurring in such studies. Studies that included preemptive focus on form (Ellis et al., 2001) were excluded to focus exclusively on immediate CF moves teachers make in reaction to errors. Interaction studies that investigated feedback in dyadic interaction between students were also excluded, keeping the focus on teacher behavior while leading classroom activities. A final inclusion criterion was for the CF categorization scheme used by the researcher(s) to either overlap with or be transferrable to L&R’s dominant taxonomy (or in a way that could be transferrable to the prompt-reformulation dichotomy with recast data provided separately), in order to ensure that constructs aligned and that data could be grouped together for comparison.
The literature search was not limited to published research. An attempt was made to include unpublished ‘fugitive literature’ (Norris & Ortega, 2006), including any available doctoral and master’s dissertations as well as conference and working papers that could be retrieved, in order both to minimize publication bias (Oswald & Plonsky, 2010) and to maximize the robustness of the data set. The search for candidate studies ended in January 2013, and only studies written in English were included for the sake of practicality.
Following Plonsky and Brown’s (2014) findings on the most common electronic databases used in meta-analyses in applied linguistics, the Education Resources Information Center (ERIC), Linguistics and Language Behavior Abstracts (LLBA), PsycINFO, and ProQuest Dissertations and Theses databases were searched, as well as Google and Google Scholar. Search terms included ‘corrective feedback’, ‘classroom’, ‘focus on form’, ‘second language’, and ‘observational’. Each database was searched until the results began to replicate earlier findings or until more than 20 consecutive results bore little relation to the search focus. An ancestry search in the initial articles resulted in additional studies, and a forward citation search for Sheen’s (2004) study was conducted within ProQuest Web of Knowledge.
2 Coding for CF types
The investigation of CF types in observational studies has produced a variety of categorization schemes ranging from dozens of categories to several dichotomous views of CF, summarized in Table 1. Early studies such as Fanselow (1977) and Chaudron (1977) provided the groundwork for classroom discourse research on CF by introducing taxonomies that included 16 and 28 different types of CF, respectively. Later, L&R consolidated their overarching taxonomy to six CF types, which has proved the most common classification, incorporated in 81% of the sample. Their taxonomy distinguishes two types of CF. The first is negotiation of form, which calls for the learner to draw on their own knowledge to attempt a follow-up effort to produce a repaired utterance; the teacher sometimes supplies an embedded hint (metalinguistic cue or elicitation) and other times does not (clarification request or repetition). This type of negotiated feedback, which requires a response from the learner, either explicitly or implicitly, is also commonly referred to as prompt (Lyster & Saito, 2010), which will be the term used to report findings in the present study, but elsewhere as ‘output-prompting’ (Ellis, 2009), ‘initiations to self-correct’ (Samar & Shayestefar, 2012) and ‘elicit’ (Wang, 2009). In contrast, CF that supplies the correct form of the ill-formed utterance, such as recast and explicit correction, has been grouped together as reformulation (Panova & Lyster, 2002), also referred to as ‘input-providing’ (Ellis, 2009), or simply as ‘recasts’ since the occurrence of explicit correction is rare in many studies.
Table 1. Corrective feedback (CF) type taxonomies in CF literature.
Several researchers have also focused on the distinction between implicit and explicit CF (Ellis, 2009; Panova & Lyster, 2002), lending theoretical relevance regarding the importance of noticing in SLA. However, this categorization requires a more detailed analysis of CF types, particularly for recasts, which must be considered on a spectrum of salience depending on length, scope, and intonation among other variables (Sheen, 2006). Due to this limitation, the implicit/explicit dichotomy was not considered in this study. As Lyster and Saito (2010) convincingly argue, the dichotomy of prompt and reformulation (the latter consisting predominantly of recasts) may be most convenient for pedagogic considerations, as teachers seem better able to make the binary decision of supply or withhold in their online CF choices. With these considerations in mind, as well as the practical need to code study results systematically, CF types were coded according to L&R’s six individual categories, and prompts were also separated out alongside recasts, as recasts are the most commonly studied CF type throughout the literature. Data for both total raw frequency and percentages were gathered for each of these categories.
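As a rough illustration of this coding step, the following sketch collapses values reported under L&R’s six categories into the prompt/reformulation dichotomy. The function and variable names are hypothetical, and the example percentages are simply L&R’s figures as quoted earlier in this article; the sketch is not the coding instrument used in the study.

```python
# Hypothetical helper: collapse L&R's six CF types into the
# prompt / reformulation dichotomy described in the text.
CF_GROUPS = {
    "recast": "reformulation",
    "explicit correction": "reformulation",
    "elicitation": "prompt",
    "metalinguistic cue": "prompt",
    "clarification request": "prompt",
    "repetition": "prompt",
}


def collapse_cf_types(values):
    """Sum per-type counts or percentages into the two broad groups."""
    totals = {"prompt": 0, "reformulation": 0}
    for cf_type, value in values.items():
        totals[CF_GROUPS[cf_type]] += value
    return totals


# L&R's distribution as quoted in Section II (percentages).
lr_1997 = {
    "recast": 55,
    "explicit correction": 10,
    "elicitation": 11,
    "clarification request": 11,
    "metalinguistic cue": 8,
    "repetition": 5,
}
collapsed = collapse_cf_types(lr_1997)
print(collapsed)  # {'prompt': 35, 'reformulation': 65}
```

Keeping the mapping in a single dictionary makes it straightforward to re-map studies that use alternative labels (e.g. ‘output-prompting’) before aggregation.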
Categorizing data from a wide range of studies that focused on varied primary research questions proved challenging, which led to important exclusion and coding decisions. For instance, three studies that report observational data on CF types could not be included because their error categorization could not be transferred into the dominant categories used in the coding scheme (Chaudron, 1977; Kamiya, 2012; Musayeva, 1998). One additional study used L&R’s categories but failed to provide complete data and was also regrettably excluded (Vaezi, Zand-Vakili, & Kashani, 2011). Another critical decision involved the coding of data sets that combined CF data from multiple teachers. Wherever possible, data sets were coded for individual teachers separately in order to retain the unique identifying variables to analyse for moderation of effect sizes. At times, teacher data were combined in primary studies involving multiple teachers, sometimes for a single type of data within a study, such as target linguistic foci, while being separated out by teacher for other data, such as CF types. In order to maintain accurate teacher-level data, a decision was made in each of these instances to either code as ‘mixed’, which would not contribute to moderator findings, or preferably to code separately for variables in which two or more teachers shared similar characteristics, each falling into the same nominal coding category. For instance, when CF data were combined for two teachers in a single study but each teacher fit into the same ‘limited experience’ category (3–6 years), they were coded together and the data were averaged and treated as a single set.
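The decision rule for combined teacher data can be sketched as follows. This is only an illustration under stated assumptions: the function name and the category label ‘extensive’ are hypothetical, and the simple arithmetic mean stands in for whatever averaging was applied in practice.

```python
def code_combined_teachers(categories, proportions):
    """Return (category, averaged proportion) for teachers reported together.

    If every teacher falls into the same nominal category, that category
    is kept; otherwise the set is labelled 'mixed' so that it does not
    contribute to moderator findings for this variable.
    """
    unique = set(categories)
    category = categories[0] if len(unique) == 1 else "mixed"
    averaged = sum(proportions) / len(proportions)
    return category, averaged


# Two teachers who share the 'limited' experience category (hypothetical data).
same = code_combined_teachers(["limited", "limited"], [0.50, 0.60])
# Two teachers in different categories (hypothetical data).
mixed = code_combined_teachers(["limited", "extensive"], [0.50, 0.60])
print(same[0], mixed[0])  # limited mixed
```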
3 Coding for study variables
Once studies that met the inclusion criteria were identified, each was surveyed according to the coding scheme in Table 2 to measure the presence of different design features as well as contextual variables. The coding scheme was designed by first consulting some of the common variables considered in educational and applied linguistics meta-analytic research (Cooper, 2010; Oswald & Plonsky, 2010), such as study identification and methodological features. Additional variables were identified after analysing the methodological and substantive features common to the sample (e.g. ‘teachers’ knowledge of research focus’), with emphasis on contextual variables that may influence CF provision (classroom setting, student, and teacher variables), as discussed in the literature review. Table 2 specifies the coding categories, which largely follow the authors’ descriptions. Proficiency, a problematic and often idiosyncratic variable in L2 research, was coded by inference when no description was provided: students in their first year of language study were coded as beginners, and students with reported TOEFL scores over 550 as advanced. For classroom instruction type, another high-inference variable, ‘decontextualized grammar teaching’ was operationalized as classroom time devoted dominantly to explicit grammar lessons or activities involving mechanistic drills, as described by the authors. Another meta-analyst independently coded a subset of 20% of the studies, resulting in 89.2% inter-rater reliability.
Table 2. Data coded from primary studies.
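The text does not state which reliability statistic underlies the 89.2% figure. Assuming it is simple percent agreement between the two coders, the calculation can be sketched as below; the function name and the example codes are hypothetical.

```python
def percent_agreement(coder_a, coder_b):
    """Share of coding decisions on which two coders agree (0 to 1)."""
    if len(coder_a) != len(coder_b):
        raise ValueError("Both coders must rate the same items.")
    if not coder_a:
        raise ValueError("At least one coded item is required.")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


# Hypothetical codes for one variable (SL vs. FL context) across five studies.
coder_a = ["SL", "FL", "FL", "SL", "FL"]
coder_b = ["SL", "FL", "SL", "SL", "FL"]
agreement = percent_agreement(coder_a, coder_b)
print(f"{agreement:.0%}")  # 80%
```

Percent agreement does not correct for chance agreement; a chance-corrected index would require a different calculation.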
VI Analysis and results
After coding the studies, descriptive statistics were calculated, including means, standard deviations, standard errors, and 95% confidence intervals (CI). As noted earlier, effect size in this study refers to aggregate percentages of (1) each CF type and (2) each target linguistic focus, rather than an effect of a treatment, typically reported as Cohen’s d (Kelley & Preacher, 2012). Although it is customary to weight effect sizes by sample size or some other indicator of precision (e.g. inverse variance), effect sizes were not weighted because most percentages are based on a sample of one teacher. Finally, SPSS was used to analyse relationships between potential moderator variables and each of the effect sizes separately (Cooper, 2010; Plonsky & Oswald, 2012). Due to space considerations and the size of this data set, results for moderator analysis could not be reported across all variables, but rather reporting is limited to where differences occurred. If readers are interested, I would be happy to provide SPSS outputs for any other variables of interest via email.
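The unweighted descriptive statistics described here can be sketched in a few lines. The analysis itself was run in SPSS; the Python below is only an illustration, the function names and input values are hypothetical, and the use of a t-based interval is an assumption, although it appears consistent with the intervals reported in the results.

```python
import math
import statistics

# Two-tailed critical t values for a 95% interval, keyed by degrees of
# freedom (standard t-table values; extend the table as needed).
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776,
             5: 2.571, 6: 2.447, 7: 2.365, 8: 2.306}


def ci_from_summary(mean, sd, k):
    """t-based 95% CI for an unweighted mean of k data sets (assumed method)."""
    margin = T_CRIT_95[k - 1] * sd / math.sqrt(k)
    return mean - margin, mean + margin


def summarize(proportions):
    """Unweighted mean, sample SD, SE, and 95% CI across data sets."""
    k = len(proportions)
    mean = statistics.mean(proportions)
    sd = statistics.stdev(proportions)  # sample SD (k - 1 denominator)
    return {
        "k": k,
        "mean": mean,
        "sd": sd,
        "se": sd / math.sqrt(k),
        "ci95": ci_from_summary(mean, sd, k),
    }


# Hypothetical recast proportions for four data sets (one teacher each).
recast_shares = [0.40, 0.50, 0.60, 0.70]
summary = summarize(recast_shares)
print(summary)
```

Applied to the summary values reported for advanced-proficiency students in the moderator analysis (k = 8, M = 0.65, SD = 0.16), `ci_from_summary` gives roughly 0.52 to 0.78, close to the reported interval; the small difference is presumably due to rounding of the reported mean and SD.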
The search resulted in a total of 28 studies: 16 from peer-reviewed journals, eight doctoral or master’s dissertations, and four conference or working papers. The studies include 52 separate data sets comprising 85 teachers across 11 countries and seven target languages. A total of 7,188 CF moves were tallied in over 466 reported classroom hours observed (10 of the 52 data sets did not report observation time).
1 Proportion of feedback types
Responding to Research Question 1, a total of 42 data sets included data following L&R’s taxonomy, while 49 sets supplied data to code for prompts and reformulations/recasts. Findings presented in Table 3 echo results from popular primary studies in that reformulations (66%, with recasts comprising 57%) outweigh prompts (30%), and that following recasts, the other feedback types are significantly less frequent with no statistically significant difference among them. Note that contents of the ‘other’ category varied across studies but generally comprised feedback on L1 use, or in some studies, multiple CF moves for a single error. Evidently, L2 teachers tend to supply reformulations about twice as often as they elicit reformulated responses from their students.
Table 3. Average percentages of corrective feedback (CF) types.
Notes. a = number of samples/teachers contributing data. Totals for means do not equal 100% because means derive from a varying number of data sets, as CF types were not reported uniformly across all data sets.
2 Target linguistic foci
In response to Research Question 2, findings reveal that grammatical errors received the greatest load of total CF (43%), followed by lexical errors (28%), and phonological errors (22%) (see Table 4). Of the 21 sets that included these data, 11 sets matched this order of distribution. Although the confidence intervals for lexical errors and phonological errors overlap, those for grammar and the other two categories do not, indicating that the greater frequency of feedback moves resulting from grammar errors is statistically significant at the .05 level.
Table 4. Average percentages for target linguistic foci of corrective feedback (CF).
Note. Total mean does not equal 100% as a few studies included an ‘other’ category defined differently across studies; always less than 10% of the total within individual studies.
3 Moderator analysis of feedback type and linguistic foci
Research Question 3 examines the extent to which contextual variables may moderate CF patterns. Moderator analysis is often the most interesting phase of a meta-analysis in that questions not addressed in the primary studies can often be answered. Results of the moderator analysis revealed several statistically significant differences in terms of proportions of CF provided (with non-overlapping 95% confidence intervals), as well as cases that may suggest noteworthy patterns as potential moderators but require further investigation as this line of research builds. With the limited number of data sets that represent each category, interpretations of the data should be cautious.
a Student variables
While little variation from the aggregate percentages was found in CF type regarding students’ L1, findings suggest student proficiency may influence teachers’ CF choices. Although 95% confidence intervals overlap, teachers supplied more recasts to advanced proficiency students (k = 8, M = 65%, SD = 0.16, 95% CI [0.51–0.78]) compared to beginners (k = 9, M = 51%, SD = 0.19, 95% CI [0.36–0.66]).
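Because the moderator analysis treats non-overlapping 95% confidence intervals as the criterion for a statistically significant difference, the comparison above can be checked with a short sketch. The function name is hypothetical; the first pair of intervals is taken from the values reported above, and the second pair is invented purely to show the non-overlapping case.

```python
def intervals_overlap(ci_a, ci_b):
    """True when two (lower, upper) confidence intervals share any value."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]


# Recast proportions by proficiency, as reported above.
advanced = (0.51, 0.78)
beginner = (0.36, 0.66)
print(intervals_overlap(advanced, beginner))  # True

# Hypothetical intervals illustrating a non-overlapping comparison.
group_x = (0.10, 0.20)
group_y = (0.30, 0.40)
print(intervals_overlap(group_x, group_y))  # False
```

Overlap therefore marks the proficiency difference as suggestive rather than statistically significant under this criterion.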
b Classroom setting variables
Teachers’ CF choices in relation to target language, immersion setting, or class size revealed little difference from the aggregate. However, level of education, second and foreign language context, and meaning-focused vs. form-focused instruction revealed notable patterns. Concerning level of education, adults received a significantly greater proportion of recasts than high school students (95% confidence intervals do not overlap) but, surprisingly, elementary-level students received a similar rate of recasts/prompts as adults (see Table 5). In addition, younger learners received a significantly greater proportion of CF targeted at lexis (M = 36%, SD = 0.02, 95% CI [0.32–0.42]) compared to adults (M = 23%, SD = 0.11, 95% CI [0.16–0.29]), while adults received a greater proportion of CF on pronunciation (M = 27%, SD = 0.18, 95% CI [0.16–0.37]) compared to elementary students (M = 10%, SD = 0.05, 95% CI [–0.03–0.23]).
Table 5. Corrective feedback (CF) type based on level of education (age).
Analysis of SL and FL teaching context revealed that teachers in SL contexts targeted significantly more phonological errors (k = 15, M = 36%, SD = 0.17, 95% CI [0.20–0.52]) than teachers in FL contexts (k = 36, M = 15%, SD = 0.09, 95% CI [0.10–0.20]). Lexical errors as target were more consistent between contexts, while grammar was the target of CF more often in FL (M = 46%, SD = 0.19, 95% CI [0.35–0.57]) than SL contexts (M = 36%, SD = 0.12, 95% CI [0.25–0.48]) with overlapping confidence intervals.
Although only two data sets were coded as decontextualized grammar teaching, teachers in these settings offered a greater proportion of prompts (M = 46%, SD = 0.14, 95% CI [–0.79–1.71]) than teachers in communicative language teaching classrooms (M = 32%, k = 22, SD = 0.22, 95% CI [0.22–0.42]).
c Teacher variables
While no patterns were found in CF provision among native, non-native, and bilingual teachers, teaching experience and education/training appear to moderate CF choices. Results suggest that more teaching experience may be related to less attention to phonological errors and possibly greater concern for lexical errors (Table 6). These findings, however, should be interpreted cautiously in view of the limited number of studies that coded for teacher experience (see Figures 1–3).
Table 6. Target of corrective feedback (CF) based on teacher experience.

Figure 1. Teacher experience as moderator of corrective feedback (CF) linguistic target: Grammar target.

Figure 2. Teacher experience as moderator of corrective feedback (CF) linguistic target: Lexis target.

Figure 3. Teacher experience as moderator of corrective feedback (CF) linguistic target: Phonology target.
While teachers in only two data sets were coded ‘without education in L2 teaching’, data suggest that teachers with more training may provide a greater proportion of prompts, placing greater demand for repair on their students (Table 7, Figure 4).
Table 7. Corrective feedback (CF) type based on teacher education.

Figure 4. Teacher education as moderator on corrective feedback (CF) type.
4 Methodological considerations
An important role of meta-analytic research lies in its ability to assess the methodological rigor and transparency of primary research in a given domain and offer insight for future methodological consideration with the aim of improving study quality and generalizability (Oswald & Plonsky, 2010). Although the research purposes varied across studies in this data set and therefore naturally include disparate methodological features, future observational CF research would benefit from reporting on study variables more comprehensively and consistently. Table 8 summarizes the frequency of studies that incorporated different methodological features, revealing that teacher backgrounds were often not reported, that teacher data were often mixed, and that very few studies reported an estimate of rater reliability.
Table 8. Methodological features as reported in the sample of primary studies.
In addition to tracking features of methodological design, the ability to assess methods as moderating variables is another important contribution of meta-analysis that can provide perspective on variations in methods and how these design choices affect outcomes. A notable finding in this regard reveals that when researchers informed teachers being observed of the specific research focus (CF), the teachers provided nearly 40% fewer recasts (k = 7, M = 34%, SD = 0.25, 95% CI [0.11–0.58]) in favor of prompts, compared to teachers who had no knowledge of the research focus (k = 15, M = 56%, SD = 0.18, 95% CI [0.46–0.67]). In contrast, when teachers were aware that the research focus was on interaction (rather than CF specifically), the outcomes were more typical of data from studies in which teachers had no knowledge of the research focus at all (Figure 5).

Figure 5. Effect of teachers’ knowledge of researchers’ focus.
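The ‘nearly 40% fewer recasts’ figure is a relative reduction, not a difference in percentage points. A minimal sketch of the arithmetic, using the two means reported above (the function name is hypothetical):

```python
def relative_reduction(baseline, observed):
    """Proportional drop from baseline to observed (0 to 1)."""
    return (baseline - observed) / baseline


uninformed = 0.56  # mean recast share, teachers unaware of the research focus
informed = 0.34    # mean recast share, teachers told the focus was CF
drop = relative_reduction(uninformed, informed)
points = (uninformed - informed) * 100
print(f"{drop:.1%} relative, {points:.0f} percentage points")
# 39.3% relative, 22 percentage points
```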
VII Discussion and conclusions
1 Implications of findings to CF research and practice
Results of this meta-analysis offer greater generalizability in patterns of classroom CF that can supplement research developments regarding effectiveness across CF type and target. In addressing Research Question 1 (What is the aggregate proportion of CF types?), findings indicate that L&R’s findings from 18.3 hours of classroom data appear relatively generalizable across contexts, learners, and so forth. Specifically, the average proportions for each CF type found in the aggregate data are remarkably close to L&R’s findings, with nearly matching proportions for recasts (55%) and metalinguistic cues (8%), and within only a few percentage points for explicit correction, elicitation, and clarification. Surprisingly, their data match the aggregate results more closely than any of the subsequent primary data sets does individually. This overlap is encouraging considering the spotlight on L&R’s results for nearly two decades. Looking across samples, we can conclude that L2 teachers do supply a greater proportion of recasts than prompts (with non-overlapping 95% confidence intervals).
As some research suggests greater general effectiveness of prompts over recasts in L2 development (e.g. Lyster & Saito, 2010), the finding that prompts comprise only 30% of the teachers’ CF suggests that their use of CF does not align closely with what is now known about CF effectiveness. As has been noted across the literature, recasts may be relied on to avoid disrupting communicative focus (on meaning) in classroom discourse. Recasts may also simply come more naturally for teachers (i.e. requiring less online cognitive effort) particularly for novice teachers, as the findings in this study suggest. The seemingly disparate findings of observational and experimental research support the need for more conclusive evidence regarding the relative effectiveness of prompts and reformulations and how these CF types operate under different conditions. Echoing L&R’s call for more variety in CF choices, and considering that more experienced teachers may supply more prompts, a case appears to be developing for teachers to generally expand on their provision of prompts, although specifically when and where more prompts would be beneficial must be further explored through moderator analysis of the effectiveness of CF in meta-analyses.
Shifting attention to the types of errors that teachers target with their feedback, the greater emphasis on grammatical errors found here falls in line with prior research (Lyster, 1998; Mackey et al., 2000). However, Lyster et al. (2013) point out that learners may be more perceptive of CF targeted at lexis and phonology, as these errors could more likely inhibit comprehensibility and therefore may prove more salient. Likewise, Mackey and Goo’s (2007) meta-analysis found larger effects from CF for lexical than for grammatical development. Further, results from this synthesis do not conflict with Lyster’s (1998) findings that learners’ lexical errors received the highest rate of feedback in terms of the overall proportion of errors made (80% of students’ lexical errors received feedback compared to 70% of phonological and only 56% of grammatical errors). In contrast, the current study focuses on the proportion of teachers’ total moves, revealing that the distribution of teachers’ CF is more likely to target grammar than other error types. This is likely because learners produce a greater overall quantity of grammatical errors, although total errors produced by learners are rarely recorded in these studies. Rather, attention is usually focused on those errors that received feedback. However, considering that more teaching experience may result in more CF directed towards vocabulary, a concerted effort by teachers to direct more CF towards vocabulary may prove beneficial in some contexts, although further research comparing amenability of morphosyntactic, lexical, and phonological errors is needed.
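The distinction drawn above between the rate of feedback per error type and the share of teachers' total CF moves can be illustrated with a short calculation. The sketch below is illustrative only: the feedback rates are those cited from Lyster (1998), but the error counts are hypothetical values invented for the example (they are not drawn from any study in the sample), and the function name is likewise an assumption.

```python
# Feedback rates per error type as cited above from Lyster (1998):
# the proportion of learners' errors of each type that received CF.
FEEDBACK_RATE = {"lexical": 0.80, "phonological": 0.70, "grammatical": 0.56}

# HYPOTHETICAL error counts, invented purely for illustration.
# The only assumption is that grammatical errors are the most frequent.
HYPOTHETICAL_ERRORS = {"lexical": 100, "phonological": 100, "grammatical": 300}


def share_of_cf_moves(errors, rates):
    """Return each error type's share of all CF moves.

    CF moves per type = number of errors of that type * feedback rate;
    the share is that count divided by the total number of CF moves.
    """
    moves = {kind: errors[kind] * rates[kind] for kind in errors}
    total = sum(moves.values())
    return {kind: count / total for kind, count in moves.items()}


shares = share_of_cf_moves(HYPOTHETICAL_ERRORS, FEEDBACK_RATE)
for kind, share in shares.items():
    print(f"{kind}: {share:.1%} of CF moves")
```

Under these assumed counts, grammatical errors attract just over half of all CF moves even though they have the lowest per-error feedback rate, which is how the two sets of findings can coexist.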
With respect to variables that potentially moderate teachers’ choice in CF type and target, as mentioned above, interpretations of findings should be cautious considering potential sources of variability, such as sampling variability, particularly with limited data (Cooper, 2010). With this limitation in mind, results suggest relationships between several variables that offer a deeper perspective into how CF is provided, which can improve interpretations of findings across the domain. For instance, regarding learner proficiency as a potential moderator, findings that suggest higher proficiency students receive a greater proportion of recasts do not support Ahangari and Amirzadeh’s (2011) results in which more proficient learners received a higher proportion of prompts. This may be explained by teachers’ trust that higher proficiency students are more capable than lower proficiency students of recognizing and utilizing recasts.
In regard to educational context, Lyster and Saito’s (2010) meta-analysis found that younger learners tend to benefit more from CF, with stronger delayed effect sizes, particularly for prompts. As a result, they posited that CF ‘engages implicit learning mechanisms that are more characteristic of younger than older learners’ (p. 293). Findings in the present study reveal that high school students received significantly fewer recasts and a higher proportion of prompts than adults across the sample; however, elementary students received a similar proportion of recasts/prompts to adults. If younger learners benefit more from prompts, as Lyster and Saito’s (2010) findings suggest, it may be beneficial for teachers at the elementary level, in particular, to make a more concerted effort at supplying prompts. On the other hand, higher rates of recasts for adults may be partially explained by greater concern for their pronunciation, as adults receive a significantly higher proportion of CF for phonological errors, in which recasts are likely to be used for modeling. In comparing SL and FL settings, the finding that teachers target significantly more phonological errors in SL contexts makes intuitive sense, as pronunciation would likely be a greater concern for students residing in the target language context, whereas less concern for pronunciation would be expected in homogeneous FL classrooms.
Knowledge of the patterns in teachers’ CF relative to context can also be valuable in considering students’ preferences. Results of research on student preferences for CF vary considerably according to learners’ backgrounds, previous and current language learning experiences, proficiency levels, and so forth (Brown, 2009; Loewen et al., 2009; Schulz, 2001). The awareness of patterns that teachers tend to be susceptible to in providing CF in particular contexts could complement student survey data to better accommodate student needs. For example, if needs analysis data from students in an adult course revealed greater preference for prompts, awareness that adults tend to receive more recasts could help a teacher focus effort on providing the type of CF that she may naturally be less inclined to provide.
2 Future directions for CF research
To move this line of research forward and continue to inform classroom practice, a more substantial number of observational studies are needed that employ more detailed coding of feedback sequences, contextual variables, and error types. Several studies have adopted more refined coding of error sequences, for example, accounting for length of exchanges (Havranek, 2002; Loewen & Philp, 2006; Margolis, 2007). Sheen (2006) elaborated on recast type, creating six categories (mode, scope, reduction, length, number, and type of changes), which could assist in more detailed coding of CF. More studies like Lee (2013) are needed that include learner repair data (rather than limited to uptake, which can be operationalized generally as the presence of learners’ response to CF, whether correct or not) for each CF type, and even more ambitious would be to code for error type in relation to CF type (e.g. Sheen, 2006, found that recasts on articles were not salient enough to be noticed).
Results of the methodological features analysis suggest areas in need of more detailed reporting to guide future observational CF research. A lack of information about teacher background, such as teaching experience and training, has limited the moderator findings in this meta-analysis. More consistent inclusion of teacher background data could offer insight into the differences between highly trained teachers’ tendencies relative to novice teachers. Do more highly trained teachers provide a higher proportion of prompts and focus more heavily on lexical errors, as the limited sample sizes from this study suggest? When possible, more consistent separation of data for individual teachers would help in future meta-analytic efforts to investigate teacher variables across studies. Another potentially interesting variable, as an anonymous reviewer noted, pertains to the distinction between immersion and intensive language programs as moderators of CF distribution (Lyster & Mori, 2006; Sheen, 2004). However, many studies failed to report details of the classroom context, such as immersion, intensive, or elective, and therefore samples across such categories were too small to compare. In addition, more consistent reporting of estimates of rater-reliability measures would help improve methodological rigor.
Another purpose of meta-analysis is to identify areas in need of further study (Oswald & Plonsky, 2010). Results of this meta-analysis suggest the CF domain would benefit in particular from more studies that report on observation of decontextualized grammar teaching to provide comparison with classroom activities with communicative focus. Only two studies explicitly stated that classroom instruction was overtly and dominantly form-focused, wherein a higher proportion of prompts occurred than in communicative-based approaches. With more studies of this kind, Lyster and Mori’s (2006) Counterbalance Hypothesis could be investigated, which posits that CF that counterbalances a classroom’s predominantly communicative orientation is more effective than CF matching a classroom’s orientation (i.e. recasts may be less noticed in communicative classrooms). If prompts indeed occur at a higher rate within decontextualized grammar instruction, it may help explain Li’s (2010) findings of larger effect sizes for CF during activities involving mechanical drills than for communicative-focused activities. There is a clear need for more evidence to better understand the influence of instructional type on effectiveness and provision of CF.
Other variables that have received scant investigation, particularly in relation to effectiveness, include measures of motivation and purposes for learning, learners’ metalinguistic knowledge, and student preferences, as well as more detailed analysis of learners’ opportunities for modified output (whether response to CF is encouraged or discouraged; Mackey & Goo, 2007). In general, to help establish clear patterns across studies of classroom CF there is a need for a greater volume of studies that isolate and address in a consistent, systematic manner the myriad variables that could influence the provision or the effectiveness of CF. As research accumulates that offers wider perspective on when and how CF is effective in L2 teaching, a more nuanced understanding of teachers’ feedback practices in relation to particular linguistic features and teaching contexts will lend pedagogical and theoretical insight into the role of CF in L2 classrooms.
Acknowledgements
This study would not have come about without Luke Plonsky’s perspective in identifying a ripe area for meta-analysis and his continued support and inspiration throughout the process.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
