Computer-Mediated Corrective Feedback to Improve L2 Writing Skills: A Meta-Analysis

Abstract

Written corrective feedback for improving L2 writing skills has been a debated issue for more than two decades. The aims of this meta-analysis are to (1) provide a quantitative measure of the effect of computer-generated written feedback for improving L2 writing skills and (2) verify how moderators (i.e., adopted technology, task types, and learners’ language proficiency) influence the effectiveness of corrective feedback provided by computer technology for developing L2 students’ writing fluency and accuracy. A comprehensive search was performed to collect the population of computer-mediated corrective feedback (CMCF) studies. The effect sizes were calculated for 14 primary studies with L2 participants (N = 1220). The findings indicate a large overall effect of CMCF (d = 1.21). A medium overall effect was found in using automated writing evaluation (AWE) technology for writing skills, whereas a large effect size was determined in using non-AWE technology. The results further indicate a large overall effect in using CMCF for both writing fluency and accuracy. As for the proficiency-level moderator, the results indicate a large overall effect in using CMCF among beginners and intermediate learners, whereas the overall effect is small among advanced learners. Limitations and recommendations for future studies are also raised in this study.

Keywords

computer-mediated corrective feedback, writing accuracy, writing fluency, second language, meta-analysis

Introduction

Written corrective feedback (WCF) has been the central issue of second language acquisition (SLA) studies. WCF is mainly prompted by its affordances of allowing learners to observe their language faults and help rectify these errors in their successive revised drafts, thereby leading to L2 writing development (e.g., Karim & Nassaji, 2020; Zhai & Ma, 2021; Zhang & Zhang, 2018; Xu & Zhang, 2021). Although the WCF of teachers is generally accurate and positively helps improve students’ writing ability, teachers cannot provide WCF for huge numbers of learners in the classroom as instantaneously as computers; even delivering delayed feedback is time consuming and requires effort among teachers. Aiming to solve the aforementioned issue, many technological devices have been launched to speed up the teachers’ role in providing corrective feedback of learners’ writing. Modern technology has provided immediate synchronous and asynchronous corrective feedback to language learners, which can benefit L2 writing accuracy and development (Shintani & Aubrey, 2016). In the current work, two skills of second language writing are examined from a meta-analytic viewpoint: (1) writing accuracy to ensure that students can master grammar, punctuation, and spelling and (2) writing fluency to deal with the students’ mastery of coherence, content organization, and appropriate use of vocabulary.

One of the potential gains of modern technological advancement is the creation of automated writing evaluation (AWE). AWE is built with natural language processing (NLP), artificial intelligence (AI), or latent semantic analysis to provide L2 learners with advanced corrective feedback that goes beyond polishing language accuracy, further helping them to improve their macro-skills, such as coherence, organization, and language content, which usually only human teachers can address (Hockly, 2019; Li et al., 2016; Stevenson & Phakiti, 2014; Zhai & Ma, 2021; Zhang & Zhang, 2018). These cutting-edge technologies provide immediate feedback to L2 writers, enabling them to trace the progress of their writing by flagging collocational errors, linking students to language corpuses to identify correctly used expressions, and integrating peers’ and teachers’ feedback with the AWE feedback (Zhang & Zhang, 2018). AWE has the potential to analyze students’ errors and provide meaningful feedback not only on low-level form but also on content development and rhetorical improvement (Cotos, 2014). This potential of AWE to aid L2 content development has been accomplished through the rapid development of technological devices in the modern age, enabling the program “to work by comparing a written text to a large database of writing of the same genre, written in answer to a specific prompt or rubric” (Hockly, 2019, p. 82). Examples of well-known AWE programs are Criterion, MY Access!, Pigai, and e-rater®, which have been investigated in a myriad of studies examining whether they have a positive impact on learners’ writing output (e.g., the comprehensive review of Stevenson and Phakiti (2014)). These programs not only aim to provide WCF to learners on their writing production, but they also have the potential to provide automated scoring.
However, computer-generated feedback is not only restricted to AWE, as countless types of technology have been empirically investigated to aid writing output. Notable examples include Microsoft Word (AbuSeileek, 2013), Google Docs (Ebadi & Rahimi, 2019; Shintani & Aubrey, 2016), and Annotator (Yeh & Lo, 2009).

Technology and Corrective Feedback

Computer-mediated corrective feedback (CMCF), which refers to the textual input provided by software installed on a computer to correct learners’ writing errors, either directly or indirectly, has long been a debated issue among researchers in the field of SLA since the first call of Truscott (1996). The main argument is that grammatical error correction, which should be considered the ultimate goal of writing, entails not only correcting grammar but also developing L2 writing, a task that a machine cannot always perform. Truscott’s call was reinforced by the findings of many empirical studies that demonstrated the absence of significant effects (Polio & Fleck, 1998; Ware, 2014). However, advocates of technology-enhanced L2 writing argue that WCF can aid L2 writing development (Chandler, 2003; Cheng, 2019; Huang & Renandya, 2020; Wang et al., 2013). A plausible explanation of the conflicting results can be ascribed to the different research designs, such as small sample sizes, a limited number of instructional sessions, different L2 competence of learners, and the absence of a control group that can show how students who receive WCF would perform in L2 writing competence against those who received none (Guénette, 2007; Karim & Nassaji, 2020; Storch, 2010). Besides, many other factors, such as the proficiency level of learners and the task type, could have moderated the study results. This controversy in L2 writing research has prompted the current meta-analysis to investigate the mean scores of published studies. In this manner, the average effect size of experimental and/or quasi-experimental designs can be determined with respect to manipulating CMCF and non-CMCF (the latter refers to L2 learners receiving teacher feedback and/or peer feedback, or no feedback).
Examining how different variables can moderate L2 writing development resulting from CMCF intervention is also an urgent matter. Thus, this study can be of great significance to L2 pedagogues aiming to examine the feasibility of using corrective feedback generated by new types of technology as a way of easily providing learners with feedback to improve their L2 writing outputs. In particular, this meta-analysis examines the overall effect of CMCF on improving L2 writing skills as moderated by other variables, such as task type (whether corrective feedback addresses writing accuracy or writing fluency), adopted technology (AWE and non-AWE), and L2 proficiency level, which is a gap that has yet to be addressed in the computer-mediated technology and L2 writing literature.

Supporting Second Language Acquisition Theories for Corrective Feedback

CMCF is guided by many theoretical frameworks of SLA, such as the noticing hypothesis of Schmidt (1990), interactional hypothesis of Long (1996), and monitor theory of Krashen (1981). According to Schmidt, negative feedback provided by a teacher or a computer can help L2 learners identify their errors and understand the gap in their second or foreign language, thus enhancing their interlanguage development. Similarly, interactionist theory argues that the negotiation of meaning resulting from feedback creates opportunities for learners to attend to oral or written linguistic inputs by noticing their errors, which can be overcome in subsequent learning sessions. According to Heift and Hegelheimer (2017), CMCF is grounded in interactionist theory as the focus is “on learner-computer interactions by emphasizing computer reactions and responses to learner output, error detection, and error-specific feedback and by drawing the learners’ attention to a gap between their interlanguage and the target language through salient modified language input” (p. 54). As for monitor theory, Krashen (1981) argues that WCF helps to develop implicit knowledge by identifying errors and consciously converting the implicit knowledge into explicit knowledge that can be automated by learners when they are repeatedly exposed to the same learning module (Zhang, 2021).

L2 researchers have also examined whether different corrective feedback provided by a computer can narrow a learner’s linguistic gap by showing their errors and help improve their linguistic competence (e.g., AbuSeileek & Abualsha’r, 2014; Cheng, 2019; Lai, 2010). Previous scholars have also tested whether interactions via AWE can improve learners’ writing by having them attend to the feedback and incorporate it into their revisions (Hockly, 2019; Stevenson & Phakiti, 2014). Despite the controversy among L2 researchers as to whether WCF is harmful or beneficial, numerous studies have verified its positive effects (e.g., Karim & Nassaji, 2020; Sarré et al., 2019; Zhang, 2021). A possible justification is that corrective feedback augments the learners’ awareness of their errors and helps them to address these errors in the subsequent writing drafts or develop a new version of the essay (Karim & Nassaji, 2020; Zhang, 2021). Corrective feedback provides an ideal opportunity for learners to identify their faults in L2 writing and understand the gap between what they want to write and their actual writing via the feedback provided by a computer, either in the form of focused or unfocused feedback, to attain learning outcomes (Swain, 2004). Other studies have found that CMCF is beneficial to improving L2 grammatical competencies (micro-skills), teachers’ feedback can help to address learners’ writing macro-skills, and combining AWE feedback with teachers’ feedback fosters L2 writing development (Link et al., 2020; Mohsen & Alshahrani, 2019).

Moreover, the research has extended beyond the investigation of the efficacy of WCF, whether provided by teachers or by technology, to examine its potential to induce L2 writing accuracy and development. The studies have explored the efficacy of direct and indirect WCFs on L2 writing development. Direct WCF refers to the explicit knowledge about the errors and the immediate correction performed by a teacher, a peer, and/or a computer by providing error location and giving the correct answer (i.e., recast), whereas indirect feedback refers to notifying learners that an error has been made (Sarré et al., 2019; Van Beuningen, 2010; Zhang, 2021). Indirect CF can be classified into metalinguistic feedback (highlights the grammatical rule and provides examples) or indirect location of errors by indicating the occurrence of errors and the number of errors using codes or symbols, such as the asterisk (Lee, 2017). Other researchers, such as Lee (2017) and Zhang (2021), have classified WCF into focused feedback (correcting single types of learners’ errors), mid-focused feedback (correcting multiple types of learners’ errors), or unfocused feedback (correcting comprehensive errors committed by learners). The current meta-analysis examines the efficacy of WCF, regardless of its type, delivered as CMCF to aid students’ writing accuracy and development, using different moderators that can impact L2 writing improvement.

Proficiency level is one of the moderators used in L2 research to investigate what types of learners can benefit from the WCF provided by the teacher or computer (Li et al., 2016; Ranalli, 2018; Saricaoglu, 2019; Xu & Zhang, 2021). Computer-generated WCF does not consider the proficiency level factor, the learners’ previous educational experience, the L2 cultural background of the participants, and the familiarity with L2 (Ranalli, 2018). In the literature, only a few studies have attempted to bridge this gap, some of them investigating how learners with different proficiency levels can process the computer-generated WCFs (Bitchener & Ferris, 2012; Li et al., 2016; Ranalli, 2018; Saricaoglu, 2019; Xu & Zhang, 2021). Furthermore, the research indicates that advanced learners tend to benefit more considerably from content feedback because this can help improve their revised drafts, as they are mainly concerned with improving their writing fluency more than accuracy (Xu & Zhang, 2021). However, beginning students tended to use WCF to improve their writing accuracy, such as grammar, spelling, and punctuation, and they were reported to be well-motivated to engage through WCFs by submitting several revised drafts (Xu & Zhang, 2021). Li et al. (2016) revealed that low-level L2 students perceive WCF as useful in polishing grammatical and mechanical errors, whereas high-level learners have expressed low perceived usefulness and reported that their WCFs were formulaic and vague, as they found the feedback related to language accuracy was out of context of their actual needs (i.e., their needs were beyond L2 form correction).
The research on learners’ cognitive processes in L2 writing has found that beginning students have suffered from an inability to write fluently due to the lack of linguistic resources, encountered great difficulty in writing, and reported being extremely preoccupied with polishing mechanical errors; by contrast, experienced writers tend to focus on improving their L2 writing contents (Barkaoui, 2016; Mohsen, 2021; Révész et al., 2019). Indeed, more studies are needed to robustly answer the questions that may be raised by instructors and pedagogues: Who would benefit from CMCF? How can CMCF address the needs of L2 learners with different proficiency levels?

Previous Meta-Analysis Studies

Numerous meta-analyses and reviews of literature were conducted to explore the effect of corrective feedback on language learning. Kang and Han (2015) analyzed 21 studies to explore if WCF has helped to improve the grammatical accuracy of second language writing. Many moderators, such as L2 proficiency level and writing genre, were investigated to determine if they could moderate the general effects on language accuracy. The results showed that WCF can lead to greater grammatical accuracy in L2 writing. Moreover, findings showed that the higher the proficiency level, the greater the benefit from the WCF; in other words, advanced learners benefitted from WCF, whereas beginning students were unable to leverage WCF. Wisniewski et al. (2020) meta-analyzed 435 empirical studies to explore the effects of feedback on student learning. The overall results based on the random effects model showed a medium effect (d = 0.48) of the feedback on student learning. However, the significant heterogeneity in their data showed that feedback cannot be simply understood as a single consistent form of treatment. The aforementioned meta-analyses dealt with corrective feedback in general (Wisniewski et al., 2020) or grammatical accuracy (Kang & Han, 2015). In summary, the previous meta-analyses focused on the corrective feedback generated by teachers and peers to improve students’ writing accuracy. However, to the best of my knowledge, no single meta-analysis study has investigated the efficacy of corrective feedback generated by computer-mediated technology to aid L2 writing fluency and accuracy. Therefore, the current study attempts to analyze the impact of CMCF from a new perspective, including how effectively CMCF can improve L2 writing accuracy (form) and fluency (content). The current work also explores what type of technology (AWE vs. non-AWE) can aid L2 writing accuracy and fluency.
Furthermore, other moderator variables (proficiency level, task type, and adopted technology) have been examined to determine the possible impact on L2 writing competence as aided by CMCF. In particular, the current study attempts to address the following main research questions:

1. What is the overall effect of CMCF as examined by the target studies on L2 writing improvement?

2. To what extent do the following factors affect the effectiveness of CMCF: (a) adopted technology in the treatments, (b) task type, and (c) L2 learners’ proficiency level?

Methods

Design

The author followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 guidelines (as reported by Page et al., 2021) in the current meta-analysis.

Literature Search

Several steps have been undertaken to identify the preliminary studies that tackle the target scope of the current meta-analysis:

(a) The author consulted three databases covering the studies on SLA. These databases are Scopus, the Education Resources Information Center (ERIC), and Clarivate Web of Science (Social Sciences Citation Index). These databases permit researchers to retrieve full records of the studies, such as titles, keywords, and abstracts. Many keywords were entered into these well-known databases, such as Computer-Corrective feedback, Computer-mediated feedback, Criterion, Automated evaluation program, computer-generated feedback, My access, e-rater, Writing Roadmap, Write to Learn and Summary Street error-correction task, automated essay evaluation, automated essay scoring, writing evaluation technology, L2 writing, and second language writing.

(b) The author searched in applied linguistics journals, computer-assisted language learning (CALL) journals, and educational technology journals, as proposed by Smith and Lafford (2009). See Appendix A for the details.

(c) A manual search was conducted in Google Scholar to check whether my search was comprehensive and whether some studies were missed by the target journals and the consulted databases. Google Scholar is a generic database that covers journals that are not indexed in ERIC, Scopus, or Web of Science (Vitta & Al-Hoorie, 2020).

(d) I consulted references and bibliographies from articles on systematic reviews and meta-analysis studies (e.g., Bahari, 2021; Kang & Han, 2015; Stevenson & Phakiti, 2014; Strobl et al., 2019).

Inclusion and Exclusion Criteria of Eligible Studies

The author utilized a set of criteria for the study inclusion in the meta-analysis to ensure comprehensiveness of the search and guarantee that almost all of the retrieved studies could meet the criteria. The applied criteria included the following:

(a) The target study investigates the effect of any WCF provided by computer technology during an L2 writing task.

(b) The independent variable should be any type of WCF provided by computer technology to aid L2 writing, either for L2 grammatical accuracy or writing fluency or both.

(c) A study must hold a control group that either received peer or teacher’s corrective feedback or did not receive any feedback.

(d) The target study should be reported in English.

(e) A study should contain an experimental group in which a type of WCF was manipulated and compared with a control group, such as a group with peers and/or teachers’ WCF or zero feedback.

(f) If the article abstract does not contain sufficient information about the study design, then the full article should be consulted.

The articles were excluded from the meta-analysis pool when one of the following criteria was found:

(a) A study examined CMCF, but no control group was maintained. Some studies reported a case study group or contained experimental studies for different types of WCFs provided by computer.

(b) Means and SDs were not reported. Some studies reported only frequencies and percentiles.

The literature search, which was not restricted by timespan, was concluded in December 2020. The search outcomes resulted in 1128 reports from all of the research outlets that I have checked. Only 14 studies that recruited L2 participants (N = 1220) and could meet the inclusion criteria were selected. These studies are marked with an asterisk (*) in the reference list. Figure 1 illustrates the literature search process.

Figure 1.

Flow diagram of study identification and selection.

Coding Study Reports

Moderators

Three categories were developed for the coding of each study. The coding relied on how outcomes were measured.

Adopted Technology

Two categories were identified for this moderator. If the study used AWE, which here refers to a type of technology that is designed with AI, then it was coded as “AWE.” If the study used any other technology, then it was coded as “non-AWE.”

Task Type

Two categories were identified for the task type moderator. Writing fluency refers to WCF that tackles the writing content, such as coherence, structural organization, and lexical appropriateness. Writing accuracy refers to form, such as spelling, grammatical issues, and punctuation.

Learners’ TL Proficiency

Learners’ target language proficiency level was used either as an independent variable or as a covariate in the studies that passed the inclusion criteria. Learners’ target language proficiency level was coded as one of the following three levels: beginners, intermediate, and advanced. The code was determined on the basis of the participants’ background information, as provided in the primary studies. The original labels used by the researchers to classify the participants into different levels were retained, and no inferences were made about this feature.

Two coders were involved in coding the studies. Disagreements between the coders were resolved by reaching a consensus. The inter-rater reliability was .90.
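The text does not name the statistic behind the reported reliability value. As one possibility, chance-corrected agreement between two coders can be computed with Cohen's kappa. The following Python sketch assumes kappa; the function name `cohens_kappa` and the ten codes are hypothetical and are not the study's coding data.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Chance-corrected agreement between two coders rating the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same non-empty set of items")
    n = len(coder_a)
    # Proportion of items on which the two coders assigned the same code.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's marginal frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical category
    return (observed - expected) / (1 - expected)


# Hypothetical codes for ten effect sizes (illustration only).
coder_1 = ["AWE", "non-AWE", "non-AWE", "AWE", "non-AWE",
           "non-AWE", "AWE", "non-AWE", "non-AWE", "non-AWE"]
coder_2 = ["AWE", "non-AWE", "non-AWE", "AWE", "non-AWE",
           "AWE", "AWE", "non-AWE", "non-AWE", "non-AWE"]
kappa = cohens_kappa(coder_1, coder_2)
print(f"kappa = {kappa:.2f}")
```

Simple percent agreement is the `observed` value inside the function; kappa is lower because it discounts the agreement that two coders would reach by chance.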

Interpretation of Effect Size

The data were analyzed using Comprehensive Meta-Analysis, version 3. The Q statistic, which tests whether the observed effect sizes vary more than would be expected from sampling error alone, was used to determine the heterogeneity among the sampled studies. In addition, the moderator analyses allowed for the determination of how different factors may affect the effectiveness of CMCF. Moreover, the I-square statistic was used to quantify the proportion of the variation that is due to heterogeneity among the studies rather than to chance (Higgins et al., 2003). A low I-square value indicates little heterogeneity, whereas an increasing I-square value indicates greater heterogeneity. A value of 25% represents a low I-square statistic, 50% represents a medium I-square statistic, and 75% represents a high I-square statistic. In the analysis, 95% confidence intervals (CIs) were used to test the statistical trustworthiness of the individual and averaged effect sizes. If a CI includes zero, then the true effect size may be zero, and the calculated effect size may be due to chance and is thus not trustworthy. The random effects model was selected in this meta-analysis because it entails a relatively strong conceptual motivation. A fixed effect model assumes that the study effects are homogeneous, that is, that the samples share a single population effect size. By contrast, the random effects model directly estimates heterogeneity as a variance estimate (Oswald & Plonsky, 2010).
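As a rough illustration of these quantities, the sketch below computes Cohen's d from group means and SDs, pools several effect sizes under a random effects model, and reports Q and I-square. It assumes the DerSimonian-Laird estimator of between-study variance, which is a common default; the text does not confirm which estimator the software applied here. The function names and the three input studies are hypothetical.

```python
import math


def cohens_d(mean_exp, sd_exp, n_exp, mean_ctl, sd_ctl, n_ctl):
    """Standardized mean difference and its approximate sampling variance."""
    pooled_sd = math.sqrt(
        ((n_exp - 1) * sd_exp ** 2 + (n_ctl - 1) * sd_ctl ** 2)
        / (n_exp + n_ctl - 2)
    )
    d = (mean_exp - mean_ctl) / pooled_sd
    var = (n_exp + n_ctl) / (n_exp * n_ctl) + d ** 2 / (2 * (n_exp + n_ctl))
    return d, var


def random_effects(effects, variances):
    """DerSimonian-Laird pooling with Q, df, and I-square (in percent)."""
    w = [1 / v for v in variances]  # fixed-effect (inverse-variance) weights
    fixed = sum(wi * di for wi, di in zip(w, effects)) / sum(w)
    q = sum(wi * (di - fixed) ** 2 for wi, di in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_re = [1 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * di for wi, di in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "estimate": pooled,
        "se": se,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "q": q,
        "df": df,
        "i2": i2,
        "tau2": tau2,
    }


# Hypothetical studies: (mean, SD, n) for experimental, then control group.
studies = [
    (78, 10, 30, 70, 10, 30),
    (65, 12, 25, 63, 12, 25),
    (82, 8, 40, 70, 8, 40),
]
pairs = [cohens_d(*s) for s in studies]
result = random_effects([d for d, _ in pairs], [v for _, v in pairs])
print(result)
```

When the effects are homogeneous, the between-study variance is estimated as zero and the pooled estimate coincides with the fixed effect estimate; otherwise the random effects weights are more even across studies and the CI is wider.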

Publication Bias

A funnel plot was created (Figure 2) in this meta-analysis to ascertain whether availability bias was present (i.e., whether the retrieved studies entail significant results). In a funnel plot, studies with large sample sizes, given their small sampling error and high precision values, appear towards the top of the graph and tend to cluster near the mean effect size. The studies with small sample sizes have greater sampling error and lower precision; thus, they tend to appear towards the bottom of the graph. If no availability bias is found, then the studies will be symmetrically distributed around the mean. If availability bias is present, the small studies will be concentrated on the right side of the mean. The funnel plot of this meta-analysis shows the following patterns: First, the larger sample studies (those with higher precision values) are generally evenly distributed around the mean and appear towards the upper part of the funnel. Second, some effect sizes can be observed at the bottom of the plot, and they are evenly distributed around the mean and appear towards the bottom of the funnel. This trend in the current meta-analysis indicates a normal distribution of the mean values. The funnel plot also shows some dots on the right side of the aggregated mean values because few studies have large effect sizes. However, in general, the funnel plot presents a symmetrical distribution around the mean, which indicates the absence of publication bias.

Figure 2.

Availability bias: Funnel plot of precision by standard difference in mean values.
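Visual inspection of a funnel plot can be complemented by a regression-based asymmetry check. The sketch below follows the idea behind Egger's regression test: the standardized effect (d divided by its standard error) is regressed on precision (the inverse of the standard error), and an intercept far from zero suggests asymmetry. This test was not reported in the present study and is shown only as an illustration; the function name and the input values are hypothetical, and the significance test of the intercept is omitted.

```python
def egger_intercept(effects, std_errors):
    """Ordinary least squares fit of (d / SE) on (1 / SE).

    Returns (intercept, slope). The slope estimates the underlying effect;
    the intercept captures small-study asymmetry.
    """
    x = [1.0 / se for se in std_errors]  # precision
    y = [d / se for d, se in zip(effects, std_errors)]  # standardized effect
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Hypothetical symmetric case: every study estimates the same effect.
symmetric = egger_intercept([0.5, 0.5, 0.5, 0.5], [0.1, 0.2, 0.3, 0.4])

# Hypothetical asymmetric case: less precise studies report larger effects.
asymmetric = egger_intercept([0.2, 0.4, 0.8, 1.2], [0.1, 0.2, 0.3, 0.4])

print(symmetric, asymmetric)
```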

Results

A total of 22 effect sizes from 14 studies (highlighted with an asterisk in the list of references) were analyzed, including the effect size, the standard error, and the 95% CI of each effect size. These studies involved 1220 participants.

Overall Effect of Computer-Mediated Corrective Feedback for Improving L2 Writing Skills

The results obtained from the preliminary analysis of the 14 studies in this meta-analysis are shown in Table 1 and Figure 3. The first research question was used to address the overall effect of CMCF for improving L2 writing skills.

Table 1.

Overall effect of CMCF for improving L2 writing skills.

k*	Point Estimate	Standard Error	95% CI Lower Limit	95% CI Upper Limit	z-value	p-value (2-tailed)	Q-value	df (Q)	p-value (Q)	I-Squared (%)	Mean d
22	.843	.179	.492	1.194	4.707	< .001	158.61	21	< .001	86.76	1.21

*k = number of aggregated effect sizes.

Figure 3.

Overall effect of CMCF for improving L2 writing skills.

The overall weighted mean effect size (d = 1.21) represents a large effect based on Cohen’s (1988) scale. Figure 3 shows the results of the overall effect size estimates and the effect sizes for each study, including research information, effect size estimate, standard error, 95% CI, Z-value, p-value, and forest plot.

Table 1 shows the results of the random effects model. The mean effect size was large (d = 1.21), and the random effects point estimate was 0.843 with a 95% CI of [0.492, 1.194], which excludes zero and favors the experimental groups over the control groups. This trend indicates that CMCF has a large effect on the improvement of L2 writing skills. The statistically significant Q-test result (Q = 158.61, df = 21, p < .001) indicates significant heterogeneity among the effect sizes of the 14 primary studies. As for the heterogeneity index, an I² of 86.76% indicates high variability among the studies, suggesting a need for moderator analyses.
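The figures in Table 1 can be cross-checked with a few lines of arithmetic. The sketch below takes the reported point estimate, standard error, Q-value, and df as inputs and recomputes the 95% CI, the z-value, and I-square. It assumes a normal-approximation CI (estimate ± 1.96 × SE) and the usual definition I² = (Q − df) / Q; small discrepancies are expected because the inputs are already rounded.

```python
# Values as reported in Table 1.
estimate, se = 0.843, 0.179
q, df = 158.61, 21

ci_lower = estimate - 1.96 * se   # lower bound of the 95% CI
ci_upper = estimate + 1.96 * se   # upper bound of the 95% CI
z = estimate / se                 # test of the null hypothesis (effect = 0)
i_squared = (q - df) / q * 100    # share of variation due to heterogeneity

print(f"95% CI = [{ci_lower:.3f}, {ci_upper:.3f}]")
print(f"z = {z:.3f}")
print(f"I-squared = {i_squared:.2f}%")
```

The recomputed CI and I-square agree with the table to the reported precision, and the z-value agrees to within rounding of the inputs. The interval is centered on the point estimate of .843, not on the mean d of 1.21.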

The moderator analysis was conducted to determine which factors moderate the effectiveness of CMCF on writing. The Q-test (Lipsey & Wilson, 2001) was performed to detect the statistical significance of the differences in effect size estimates between subgroups.
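A subgroup comparison of this kind can be sketched as follows. The code assumes the usual between-groups Q statistic, computed from each subgroup's pooled estimate and standard error with inverse-variance weights. The function name and the two subgroup values are hypothetical, and the p-value shortcut via `math.erfc` holds only when df = 1 (two subgroups).

```python
import math


def q_between(estimates, std_errors):
    """Between-groups Q: weighted squared deviations of subgroup estimates
    from their inverse-variance weighted grand mean. Returns (Q, df)."""
    w = [1 / se ** 2 for se in std_errors]
    grand = sum(wi * e for wi, e in zip(w, estimates)) / sum(w)
    q = sum(wi * (e - grand) ** 2 for wi, e in zip(w, estimates))
    return q, len(estimates) - 1


# Hypothetical pooled estimates and standard errors for two subgroups.
q_b, df_b = q_between([0.40, 0.90], [0.20, 0.15])

# For df = 1, the chi-square upper tail equals erfc(sqrt(Q / 2)).
p_value = math.erfc(math.sqrt(q_b / 2))

print(f"Q_between = {q_b:.2f}, df = {df_b}, p = {p_value:.4f}")
```

With more than two subgroups, the p-value would need a chi-square distribution with the corresponding df, for example from a statistics library.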

Three moderators were evaluated with respect to their relationship with the effectiveness of CMCF to improve L2 writing skills. Table 2 presents the results of the moderator analyses for the contextual variables. The CIs of many subgroups in this meta-analysis seldom overlapped, indicating statistically significant differences between their effects.

Table 2.

Moderator analyses across contexts.

Moderators	Categories	k	95% CI Lower Limit	95% CI Upper Limit	z-value	p-value	Q-value	p-value (Q)	I-Squared (%)	Mean (d)
Technology adopted	AWE	6^a	-.204	.812	1.173	.241	30.52	< .001	83.62	0.58
Technology adopted	Non-AWE	16^a	.641	1.601	4.57	< .001	122.4	< .001	87.74	1.44
Task type	Writing fluency	11	.487	1.561	3.73	< .001	95.53	< .001	89.53	1.25
Task type	Writing accuracy	11	.168	1.094	2.673	.008	56.32	< .001	82.24	1.17
L2 Learners’ proficiency	Beginner	3	.108	1.902	2.197	.028	15.82	< .001	87.36	1.03
L2 Learners’ proficiency	Intermediate	11	.753	2.148	4.076	< .001	107.31	< .001	90.68	1.80
L2 Learners’ proficiency	Advanced	5	.179	.502	3.799	< .001	3.57	.466	0.00	0.27
L2 Learners’ proficiency	NA	3	-.879	1.364	.424	.671	18.36	< .001	89.11	0.77

^aTitles of these studies are reported in Appendix B.

The first moderator was the type of technology used in evaluating writing skills. The technology types used in the primary studies were categorized into two types: AWE and non-AWE. The results indicate that the use of non-AWE interventions produced substantially larger effects than the ones that used AWE. As shown in Table 2, the effect size is significantly larger (d = 1.44) for non-AWE treatments compared with AWE treatments (d = .58).

The second moderator was the type of tasks in writing skills. The task types were categorized into two types in this study: writing fluency (writing improvement) and writing accuracy (grammatical competence). The results indicate that the use of CMCF to improve L2 writing skills has large effect sizes for both types of tasks. As shown in Table 2, the effect size is large for both writing fluency (d = 1.25) and writing accuracy (d = 1.17).

The third moderator was learners’ target language proficiency level. The proficiency levels were categorized into three groups in this study: beginner, intermediate, and advanced. As shown in Table 2, the effect sizes for beginner and intermediate learners are large (d = 1.03 and 1.80, respectively), whereas the effect size is small (d = .27) for advanced learners. In addition, the CIs were positive in the three categories, suggesting that the language learners who used computer feedback to improve their writing skills performed better than those who did not use the technology.

Discussion

This meta-analysis study explores the overall effect of computer-generated WCF to show the learners’ errors as a way of improving their L2 writing in their subsequent revised drafts. This research also seeks to understand if moderators, such as task type, learners’ proficiency level, and type of adopted technology, can moderate the general effect of CMCF on L2 writing outcomes. The results of this meta-analysis demonstrate an overall large effect of CMCF over the traditional WCF, indicating that L2 learners are supported when using CMCF to aid their writing development. The findings of this meta-analysis are consistent with the meta-analysis of Kang and Han (2015), revealing that the WCF’s overall effect on L2 grammatical accuracy is moderate to large. However, the magnitude of effect sizes in this study is different from those reported in similar meta-analyses on WCF. For example, Wisniewski et al. (2020) reported a medium effect size (d = 0.48) for the feedback on student learning. The findings of the current study demonstrate the large effect size of CMCF over the traditional corrective feedback for developing L2 writing accuracy and fluency. A plausible interpretation for these positive findings is that CMCF helps students notice the feedback provided by a computer, identify their errors (content and form), and consequently avoid these errors in their subsequent writing drafts (Karim & Nassaji, 2020; Li et al., 2016). These findings also accord with interactionist theory, which states that learners treat CMCF as a scaffold through which they can interact with the feedback provided by a computer and achieve a negotiation of meaning until they can ensure that their subsequent writing attempts are correct, thereby leading to L2 writing automaticity (Ellis, 2009; Heift & Hegelheimer, 2017; Long, 1996).
Learners who encounter salient language errors can attend to the language input, and the learning becomes automatized as they attend to these language errors in their subsequent learning modules (Krashen, 1981). The positive findings of CMF found in this meta-analysis study align with the other findings in the literature, demonstrating the positive impact of CMCF to enhance L2 writing improvement by identifying the students’ weaknesses in different aspects of L2 writing, helping them to address these errors in their subsequent revisions or drafts (Karim & Nassaji, 2020; Sarré et al., 2019; Zhang, 2021).

The second research question addresses the effects of the three moderator variables on the use of CMCF to improve writing. In terms of the technology used in the treatments, the findings suggest that both types of CMCF yield significantly better outcomes than traditional feedback. However, the non-AWE intervention significantly outperforms the AWE intervention for improving L2 writing fluency and accuracy. This result may be attributed to the composition of the study pool: the AWE studies examined L2 writing development, whereas the non-AWE studies investigated L2 writing accuracy, except for one that focused on L2 writing fluency. Another reason is the small number of aggregated effect sizes (k = 6) for the AWE studies analyzed in this meta-analysis; the small number may have skewed the results. By contrast, the number of aggregated effect sizes for non-AWE studies is high (k = 16), which may partly account for the different effect sizes.
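The point about the number of aggregated effect sizes can be illustrated with a minimal sketch of inverse-variance pooling. The code assumes a simple fixed-effect model, which may not be the model used in this meta-analysis, and it uses hypothetical, identical effect sizes and sampling variances so that only k differs between the two subgroups. The function names are hypothetical.

```python
import math


def pool_fixed_effect(effects, variances, z=1.96):
    """Inverse-variance weighted mean effect and its approximate 95% CI."""
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * d for w, d in zip(weights, effects)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    return pooled, pooled - z * se, pooled + z * se


def ci_width(k, d=0.8, variance=0.09):
    """CI width for k hypothetical effect sizes that share one value and variance."""
    _, low, high = pool_fixed_effect([d] * k, [variance] * k)
    return high - low


if __name__ == "__main__":
    # k values taken from the text; d and variance are assumed for illustration.
    print("k = 6 :", round(ci_width(6), 3))
    print("k = 16:", round(ci_width(16), 3))
```

Under these assumptions, the interval for k = 6 is wider than that for k = 16 by a factor of the square root of 16/6, so a subgroup estimate based on few effect sizes is less precise and more easily shifted by a single study.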

As for task type as a moderator, the results indicate that CMCF improves L2 grammatical competence (accuracy) significantly more than L2 writing fluency. Micro-level errors can be handled accurately by a computer, as it is easy for technology to identify these types of errors and give indirect or direct grammatical, orthographic, and punctuation feedback. However, using CMCF to handle content errors seems to be difficult, whereas human teachers can identify these kinds of errors accurately. Although rapid technological advancements enhanced by AI and NLP can simulate the work of human teachers, these tools seem only to assist human teachers, not to replace their corrective feedback.

Advanced learners tend to benefit less from receiving CMCF, as demonstrated in the current study. By contrast, beginning and intermediate students show great progress in their learning performance as a result of attending to CMCF. A possible reason is that beginning and intermediate learners lack automaticity in L2 competence and therefore focus on the micro-level corrective feedback, such as on grammar, spelling, and punctuation, that computers provide (Barkaoui, 2016; Révész et al., 2019). In contrast to Kang and Han’s (2015) meta-analysis, the current study found that advanced learners find CMCF less useful, whereas beginning-level students are typically engaged with computer-generated feedback. This difference can be ascribed to the focus of Kang and Han (2015), who examined the overall effect on language accuracy, which matches the concerns of low-level students because they lack writing automaticity. Nonetheless, the current meta-analysis is in line with the findings of Xu and Zhang (2021), who showed that beginning learners tend to be highly engaged with CMCF interaction, are highly interested in addressing the errors flagged by computers, and pay close attention to improving their micro-level revisions in successive drafts. By contrast, advanced learners may not show interest in addressing the low-level corrective feedback suggested by computers, as they tend to be more occupied with high-level corrective feedback to improve the content and discourse of their writing (Xu & Zhang, 2021). This result matches the findings of Révész et al. (2019) and Mohsen (2021), who reported that experienced writers have mastered writing accuracy and may overlook CMCF concerns related to mechanics because their automaticity in writing form is higher than that of their counterparts. As a result, their working memory resources are free to focus on higher-level cognitive processes, such as generating and organizing ideas, maintaining cohesion and coherence, and keeping the ideas flowing from one section to another (Barkaoui, 2016; Mohsen & Qassem, 2021).

Conclusion

Written corrective feedback has been a debatable issue among scholars for nearly two decades, and this meta-analysis contributes to the literature by summarizing the quantitative findings on whether CMCF can enhance L2 writing accuracy and fluency. Previous meta-analyses focused on how corrective feedback generated by teachers or peers can aid L2 writing in terms of language accuracy. The relevant technology was first incorporated in L2 learning to aid language accuracy, as it is easy for designers to set algorithms that show language form errors and help learners identify their grammatical, orthographic, and punctuation errors. Therefore, the majority of the software programs were constructed to improve language accuracy; as a result, many studies have examined the potential of the technology to aid L2 learners’ grammatical competence. New advancements in modern technology also have the potential to aid language fluency to a certain extent. Technological advancements that draw on AI and NLP have helped designers determine how language content can be improved by developing new programs to address the aforementioned gap. The current meta-analysis found that CMCF has a large effect on L2 writing accuracy and fluency. As expected, the effect of CMCF on language accuracy, which computers can readily detect, is significantly higher than its effect on language fluency. The type of learner exposed to CMCF can determine the type of task that CMCF aids. The current findings suggest that advanced learners can benefit more from feedback for improving language fluency, whereas beginning and intermediate learners utilize the corrective feedback related to language accuracy because they are more concerned with polishing their errors. As for the efficiency of adopted technology as a moderator, the results suggest that studies that used AWE obtained a medium overall effect on writing scores. By contrast, for studies that used non-AWE technology (corrective feedback for the micro-skill level of L2 writing), the effect size was large. The difference indicates that even if AWE can simulate the error detection of human teachers at the macro-skill level, the tool cannot entirely replace human teachers.

Pedagogical Implications

In light of the current findings, several pedagogical implications can be highlighted. First, the teacher’s role is crucial in drawing attention to students’ macro-skill errors, as technology cannot fully detect all of the students’ errors. Therefore, teachers’ intervention is necessary in addition to the feedback provided by technology (Mohsen & Alshahrani, 2019). Unlike advanced learners, who show less interest in CMCF interaction, beginning learners are more interested in attending to WCF as a way of improving their language accuracy, particularly by addressing CMCF in their revised drafts. These scenarios require instructors to consider individual differences when incorporating AWE or non-AWE tools into their students’ curriculum. Second, teachers should monitor students’ learning processes during L2 writing and elicit the difficulties they face when attending to computer-generated feedback. As of this writing, scholars have yet to understand how much effort is involved in a writing task aided by a computer and to what extent learners can interact with and address CMCF.

Limitations and Suggestions for Future Studies

This study has certain limitations that future researchers interested in conducting meta-analyses may address. First, the number of target studies was small, which limits the contribution that the results can make to the relevant domain. The low number can be attributed to the narrow scope of studies examined in the current work. Second, the interaction between the three moderators (adopted technology, task type, and language proficiency) was not examined. In addition, the inferences pertaining to task type and learners’ proficiency level were drawn within the examination of the impacts of AWE and non-AWE technology on writing skills. Third, the settings in which the participants were learning the target language (e.g., as a foreign or second language) were not addressed in the current study. This aspect is crucial in determining the difficulty of the language and how CMCF can mediate the participants’ writing development. Fourth, the gender of the participants was not explored; whether male and female learners are affected differently by CMCF was not examined as a moderator. Finally, genre is another issue that future studies can tackle to show which types of writing prompts can be aided by CMCF.

Acknowledgments

The author would like to thank the editor and three anonymous reviewers for their valuable comments during the peer review stage of this article. The author’s sincere appreciation goes to Dr Hassan Mahdi for his insightful views on the statistical analysis. The author thanks the Deanship of Scientific Research at Najran University for funding this study through a grant research code (NU/-/SEHRC/10/941).

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research is supported by Najran University (NU/-/SEHRC/10/941).

ORCID iD

Mohammed Ali Mohsen

List of Journals Proposed by Smith and Lafford (2009)

(a) Applied Linguistics Journals

Journal Title	URL
Applied Linguistics	https://academic.oup.com/applij
Canadian Modern Language Review	https://www.utpjournals.press/loi/cmlr
Foreign Language Annals	https://onlinelibrary.wiley.com/journal/19449720
International Review of Applied Linguistics in Language Teaching	https://www.degruyter.com/journal/key/iral/html?lang=en
Journal of Second Language Writing	https://www.journals.elsevier.com/journal-of-second-language-writing
Language Learning	https://onlinelibrary.wiley.com/journal/14679922
Modern Language Journal	https://onlinelibrary.wiley.com/journal/15404781
Studies in Second Language Acquisition	https://www.cambridge.org/core/journals/studies-in-second-language-acquisition
System	https://www.journals.elsevier.com/system
TESOL Quarterly	https://onlinelibrary.wiley.com/journal/15457249

(b) CALL Journals

Journal Title	URL
CALICO	https://journals.equinoxpub.com/CALICO
CALL-EJ	http://callej.org/
Computer-Assisted Language Learning	https://www.tandfonline.com/toc/ncal20/current
JALT Journal	https://jalt-publications.org/jj
Language Learning and Technology	https://www.lltjournal.org/
ReCALL	https://www.cambridge.org/core/journals/recall

Journal Title	URL
British Journal of Educational Technology	https://bera-journals.onlinelibrary.wiley.com/journal/14678535
Computers and Education	https://www.journals.elsevier.com/computers-and-education
Educational Technology Research and Development	https://www.springer.com/journal/11423
Journal of Computer Assisted Learning	https://onlinelibrary.wiley.com/journal/13652729
Journal of Educational Computing Research	https://journals.sagepub.com/home/jec
Journal of Research on Computing in Education	https://www.tandfonline.com/toc/ujrt19/28/3

AWE and Non-AWE Studies

S. No	AWE Studies	S. No	Non-AWE
1	Lai (2010)	1	AbuSeileek (2013)
2	Cheng (2019) a	2	AbuSeileek (2013) a
3	Cheng (2019) b	3	AbuSeileek (2013) b
4	Huang and Renandya (2020)	4	Gao and Ma (2020) a
5	Tang and Rich (2017)	5	Gao and Ma (2020) b
6	Wang et al. (2013)	6	Gao and Ma (2020) c
		7	Gao and Ma (2019) a
		8	Gao and Ma (2019) b
		9	Gao and Ma (2019) c
		10	Shintani and Aubrey (2016)
		11	Shintani and Aubrey (2016)
		12	Al-Olimat and AbuSeileek (2015)
		13	Hosseini (2012)
		14	Yeh and Lo (2009)
		15	Sauro (2009) a
		16	Sauro (2009) b

References

AbuSeileek, A. F. (2013). Using track changes and word processor to provide corrective feedback to learners in writing. Journal of Computer Assisted Learning, 29(4), 319–333. https://doi.org/10.1111/jcal.12004

AbuSeileek & Abualsha’r (2014). Using peer computer-mediated corrective feedback to support EFL learners’ writing. Language Learning & Technology, 18(1), 76–95. http://llt.msu.edu/issues/february2014/abuseileekabualshar.pdf

Al-Olimat, S. I., & AbuSeileek, A. F. (2015). Using computer-mediated corrective feedback modes in developing students’ writing performance. Teaching English with Technology, 15(3), 3–30.

Bahari (2021). Computer‐mediated feedback for L2 learners: Challenges versus affordances. Journal of Computer Assisted Learning, 37(1), 24–38. https://doi.org/10.1111/jcal.12481

Barkaoui (2016). What and when second‐language learners revise when responding to timed writing tasks on the computer: The roles of task type, second language proficiency, and keyboarding skills. The Modern Language Journal, 100(1), 320–340. https://doi.org/10.1111/modl.12316

Bitchener & Ferris, D. R. (2012). Written corrective feedback in second language acquisition and writing. Routledge.

Chandler (2003). The efficacy of various kinds of error feedback for improvement in the accuracy and fluency of L2 student writing. Journal of Second Language Writing, 12(3), 267–296. https://doi.org/10.1016/s1060-3743(03)00038-9

Cheng (2019). Exploring the effects of automated tracking of student responses to teacher feedback in draft revision: Evidence from an undergraduate EFL writing course. Interactive Learning Environments. https://doi.org/10.1080/10494820.2019.1655769

Cohen (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Lawrence Erlbaum Associates.

Cotos (2014). Conceptualizing genre-based AWE for L2 research writing. In Cotos (Ed.), Genre-based automated writing evaluation for L2 research writing (pp. 65–95). Palgrave Macmillan. https://doi.org/10.1057/9781137333377_4

Ebadi & Rahimi (2019). Mediating EFL learners’ academic writing skills in online dynamic assessment using Google docs. Computer Assisted Language Learning, 32(5–6), 527–555. https://doi.org/10.1080/09588221.2018.1527362

Ellis (2009). Corrective feedback and teacher development. L2 Journal, 1(1), 3–18. https://doi.org/10.5070/l2.v1i1.9054

Gao & Ma (2019). The effect of two forms of computer-automated metalinguistic corrective feedback. Language Learning & Technology, 23(2), 65–83. https://doi.org/10.125/44683

Gao & Ma (2020). Instructor feedback on free writing and automated corrective feedback in drills: Intensity and efficacy. Language Teaching Research. https://doi.org/10.1177/1362168820915337

Guénette (2007). Is feedback pedagogically correct?: Research design issues in studies of feedback on writing. Journal of Second Language Writing, 16(1), 40–53. https://doi.org/10.1016/j.jslw.2007.01.001

Heift & Hegelheimer (2017). Computer-assisted corrective feedback and language learning. In Nassaji & Kartchava (Eds.), Corrective feedback in second language teaching and learning: Research, theory, applications, implications (pp. 129–140). Routledge. https://doi.org/10.4324/9781315621432-5

Higgins, J. P., Thompson, S. G., Deeks, J. J., & Altman, D. G. (2003). Measuring inconsistency in meta-analyses. British Medical Journal, 327(7414), 557–560. https://doi.org/10.1136/bmj.327.7414.557

Hockly (2019). Automated writing evaluation. ELT Journal, 73(1), 82–88. https://doi.org/10.1093/elt/ccy044

Hosseini, S. B. (2012). Asynchronous computer-mediated corrective feedback and the correct use of prepositions: Is it really effective? Turkish Online Journal of Distance Education, 13(4), 95–111.

Huang & Renandya, W. A. (2018). Exploring the integration of automated feedback among lower-proficiency EFL learners. Innovation in Language Learning and Teaching, 14(1), 15–26. https://doi.org/10.1080/17501229.2018.1471083

Kang & Han (2015). The efficacy of written corrective feedback in improving L2 written accuracy: A meta‐analysis. The Modern Language Journal, 99(1), 1–18. https://doi.org/10.1111/modl.12189

Karim & Nassaji (2020). The revision and transfer effects of direct and indirect comprehensive corrective feedback on ESL students’ writing. Language Teaching Research, 24(4), 519–539. https://doi.org/10.1177/1362168818802469

Krashen (1981). Second language acquisition and second language learning. Pergamon Press.

Lai, Y.-h. (2010). Which do students prefer to evaluate their essays: Peers or computer program. British Journal of Educational Technology, 41(3), 432–454. https://doi.org/10.1111/j.1467-8535.2009.00959.x

Lee (2017). Classroom writing assessment and feedback in L2 school contexts. Springer.

Link, Mehrzad, & Rahimi (2020). Impact of automated writing evaluation on teacher feedback, student revision, and writing improvement. Computer Assisted Language Learning. https://doi.org/10.1080/09588221.2020.1743323

Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. SAGE Publications.

Li, Zhu, & Ellis (2016). The effects of the timing of corrective feedback on the acquisition of a new linguistic structure. The Modern Language Journal, 100(1), 276–295. https://doi.org/10.1111/modl.12315

Long, M. H. (1996). The role of the linguistic environment in second language acquisition. In Ritchie & T. K. Bhatia (Eds.), Handbook of Second Language Acquisition (pp. 413–468). Academic Press. https://doi.org/10.1016/b978-012589042-7/50015-3

Mohsen, M. A. (2021). L1 versus L2 writing processes: What insight can we obtain from a keystroke logging program? Language Teaching Research. https://doi.org/10.1177/13621688211041292

Mohsen, M. A., & Alshahrani (2019). The effectiveness of using a hybrid mode of automated writing evaluation system on EFL students’ writing. Teaching English with Technology, 19(1), 118–131.

Mohsen & Qassem (2021). Analyses of L2 learners’ text writing strategy: Process-oriented perspective. Journal of Psycholinguistic Research, 49(3), 435–451. https://doi.org/10.1007/s10936-020-09693-9

Oswald, F. L., & Plonsky (2010). Meta-analysis in second language research: Choices and challenges. Annual Review of Applied Linguistics, 30, 85–110. https://doi.org/10.1017/s0267190510000115

Page, M. J., McKenzie, J. E., Bossuyt, P. M., Boutron, Hoffmann, T. C., Mulrow, C. D., ... Moher (2021). The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ: British Medical Journal. https://doi.org/10.1136/bmj.n71

Polio, Fleck, & Leder (1998). “If I only had more time:” ESL learners’ changes in linguistic accuracy on essay revisions. Journal of Second Language Writing, 7(1), 43–68. https://doi.org/10.1016/s1060-3743(98)90005-4

Ranalli (2018). Automated written corrective feedback: How well can students make use of it? Computer Assisted Language Learning, 31(7), 653–674. https://doi.org/10.1080/09588221.2018.1428994

Révész, Michel, & Lee (2019). Exploring second language writers’ pausing and revision behaviors: A mixed-methods study. Studies in Second Language Acquisition, 41(3), 605–631.

Saricaoglu (2019). The impact of automated feedback on L2 learners’ written causal explanations. ReCALL, 31(2), 189–203. https://doi.org/10.1017/s095834401800006x

Sarré, Grosbois, & Brudermann (2019). Fostering accuracy in L2 writing: Impact of different types of corrective feedback in an experimental blended learning EFL course. Computer Assisted Language Learning. https://doi.org/10.1080/09588221.2019.1635164

Sauro (2009). Computer-mediated corrective feedback and the development of L2 grammar. Language Learning & Technology, 13(1), 96–120. http://llt.msu.edu/vol13num1/sauro.pdf

Schmidt, R. W. (1990). The role of consciousness in second language learning. Applied Linguistics, 11(2), 129–158. https://doi.org/10.1093/applin/11.2.129

Shintani & Aubrey (2016). The effectiveness of synchronous and asynchronous written corrective feedback on grammatical accuracy in a computer‐mediated environment. The Modern Language Journal, 100(1), 296–319. https://doi.org/10.1111/modl.12317

Smith & Lafford, B. A. (2009). The evaluation of scholarly activity in computer‐assisted language learning. Modern Language Journal, 93(1), 868–883. https://doi.org/10.1111/j.1540-4781.2009.00978.x

Stevenson & Phakiti (2014). The effects of computer-generated feedback on the quality of writing. Assessing Writing, 19, 51–65. https://doi.org/10.1016/j.asw.2013.11.007

Storch (2010). Critical feedback on written corrective feedback research. International Journal of English Studies, 10(2), 29–46. https://doi.org/10.6018/ijes/2010/2/119181

Strobl, Ailhaud, Benetos, Devitt, Kruse, Proske, & Rapp (2019). Digital support for academic writing: A review of technologies and pedagogies. Computers & Education, 131, 33–48. https://doi.org/10.1016/j.compedu.2018.12.005

Swain (2004). Verbal protocols: What does it mean for research to use speaking as a data collection tool? In Chaloub-Deville, Chapelle, & Duff (Eds.), Inference and generalizability in applied linguistics: Multiple research perspectives. John Benjamins.

Tang & Rich, C. S. (2017). Automated writing evaluation in an EFL setting: Lessons from China. JALT CALL Journal, 13(2), 117–146. https://doi.org/10.29140/jaltcall.v13n2.215

Truscott (1996). The case against grammar correction in L2 writing classes. Language Learning, 46(2), 327–369. https://doi.org/10.1111/j.1467-1770.1996.tb01238.x

Van Beuningen (2010). Corrective feedback in L2 writing: Theoretical perspectives, empirical insights, and future directions. International Journal of English Studies, 10(2), 1–27. https://doi.org/10.6018/ijes/2010/2/119171

Vitta, J. P., & Al-Hoorie, A. H. (2020). The flipped classroom in second language learning: A meta-analysis. Language Teaching Research. https://doi.org/10.1177/1362168820981403

Wang, Y.-J., Shang, H.-F., & Briody (2013). Exploring the impact of using automated writing evaluation in English as a foreign language university students’ writing. Computer Assisted Language Learning, 26(3), 234–257. https://doi.org/10.1080/09588221.2012.655300

Ware (2014). Feedback for adolescent writers in the English classroom: Exploring pen-and-paper, electronic, and automated options. Writing & Pedagogy, 6(2), 223–249. https://doi.org/10.1558/wap.v6i2.223

Wisniewski, Zierer, & Hattie (2019). The power of feedback revisited: A meta-analysis of educational feedback research. Frontiers in Psychology, 10, 3087. https://doi.org/10.3389/fpsyg.2019.03087

Xu & Zhang (2021). Understanding AWE feedback and English writing of learners with different proficiency levels in an EFL classroom: A sociocultural perspective. The Asia-Pacific Education Researcher. https://doi.org/10.1007/s40299-021-00577-7

Yeh, S.-W., & Lo, J.-J. (2009). Using online annotations to support error correction and corrective feedback. Computers & Education, 52(4), 882–892. https://doi.org/10.1016/j.compedu.2008.12.014

Zhai & Ma (2021). Automated writing evaluation (AWE) feedback: A systematic investigation of college students’ acceptance. Computer Assisted Language Learning. https://doi.org/10.1080/09588221.2021.1897019

Zhang (2021). The effect of highly focused versus mid-focused written corrective feedback on EFL learners’ explicit and implicit. System, 99, 102493. https://doi.org/10.1016/j.system.2021.102493

Zhang & Zhang (2018). Automated writing evaluation system: Tapping its potential for learner engagement. IEEE Engineering Management Review, 46(3), 29–33. https://doi.org/10.1109/emr.2018.2866150