The Combined Impact of ChatGPT and Teacher Feedback on the Syntactic Complexity of EFL Learners’ Writing

Abstract

This study explores the use of artificial intelligence (AI) in language learning, focusing on its ability to enhance English as a Foreign Language (EFL) writing. Specifically, it examines the effect of integrating ChatGPT feedback with teacher feedback on the syntactic complexity of Saudi EFL learners’ writing. A quasi-experimental design was employed, involving two intact groups of undergraduate students (N = 35) enrolled in an academic writing course. The 9-week intervention provided the experimental group with integrated ChatGPT and teacher feedback, while the control group received only teacher feedback. Pre- and post-test essays were analyzed using the Tool for the Automatic Analysis of Syntactic Sophistication and Complexity (TAASSC), covering 177 indices across global, clausal, and phrasal levels. Results showed that the combined feedback condition did not produce a reliable advantage over teacher feedback alone at the global or clausal levels of syntactic complexity. At the phrasal level, a limited set of noun phrase–related indices revealed post-test differences between groups, suggesting localized, feature-specific development rather than broad syntactic restructuring within a short instructional period. However, these differences did not remain statistically significant after false discovery rate (FDR) correction and are therefore interpreted as exploratory rather than confirmatory. The findings are discussed in relation to previous research on AI-mediated feedback and foreign language (FL) writing development, and pedagogical as well as research implications for the integration of AI tools in EFL writing instruction are outlined.

Keywords

ChatGPT feedback, teacher feedback, syntactic complexity, large-grained indices, fine-grained indices

Introduction

Technology-enhanced language learning has exerted a significant influence on the development of foreign language (FL) learners’ proficiency and performance (Fredrick & Craven, 2025). Within this paradigm, the integration of artificial intelligence (AI) tools has markedly reshaped FL writing pedagogy and assessment practices. Among these tools, ChatGPT has attracted growing interest due to its ability to provide immediate, personalized feedback that can improve grammar, vocabulary, sentence clarity, and overall writing effectiveness (Deng & Lin, 2022; Han & Li, 2024; Kim & Chon, 2025; Oh & Hsieh, 2025; Shen et al., 2023; Song & Song, 2023; Zhai, 2022). However, concerns have been raised regarding its inconsistency, potential to mislead students, and risks of overreliance (Lingard, 2023). Consequently, educators increasingly advocate for the use of ChatGPT as a supplementary resource rather than a substitute for the teacher in FL writing pedagogy (Escalante et al., 2023; Kim & Chon, 2025; Wang et al., 2024).

The development of FL writing has been assessed using several measures, such as lexical richness, cohesion, and sentence variety, among which syntactic complexity (SC) stands out as a strong indicator of writing ability. SC is defined as the degree of diversity, elaboration, and sophistication in the grammatical structures employed in language production (Ortega, 2015). Although there is a growing need to examine SC as a multidimensional construct, encompassing both fine-grained and large-grained indices, research in this area remains limited. Most FL studies analyze only one or two complexity indices, often relying on a narrow selection of commonly used measures, such as average unit length and subordination ratios (Bulté & Housen, 2012; Norris & Ortega, 2009). This reductionist approach is also apparent in research focusing on specific SC measures, which tends to emphasize one or two indices of complexity at the clause-linking or sentence level while overlooking complexity at other syntactic levels, such as the phrasal or clausal level (Bulté & Housen, 2012).

To address this gap, this study investigates the impact of integrating ChatGPT feedback with instructor feedback on the SC of EFL learners’ writing at three levels: global, clausal, and phrasal, using both large- and fine-grained indices. The study aims to clarify the role of combined feedback in syntactic development in FL writing by employing a multidimensional approach and comprehensive computational tools (Kyle, 2016; Lu, 2010). In doing so, it seeks to inform more effective pedagogical approaches for integrating ChatGPT feedback into EFL writing classes. Accordingly, the study addresses the following research questions:

RQ1. Does combining ChatGPT feedback with teacher feedback influence EFL learners’ global (large-grained) SC compared to teacher feedback alone?

RQ2. Does combining ChatGPT feedback with teacher feedback influence EFL learners’ clausal SC compared to teacher feedback alone?

RQ3. Does combining ChatGPT feedback with teacher feedback influence EFL learners’ phrasal SC compared to teacher feedback alone?

Literature Review

Syntactic Complexity and Writing Quality

SC constitutes a fundamental aspect of language production, reflecting the diversity and sophistication of grammatical structures employed to express meaning and achieve communicative objectives (Ortega, 2015; Zheng & Barrot, 2024). Situated within the broader domain of linguistic complexity, SC has been extensively acknowledged as a significant predictor of both writing development and quality (e.g., Hao et al., 2024; Lu, 2010; Ortega, 2003; Zhang & Lu, 2022), as well as language proficiency (e.g., Y. Li et al., 2022; Lu & Ai, 2015). A variety of indices have been used to quantify SC across various levels (Biber et al., 2011; Kyle & Crossley, 2018; Lu, 2010; Wolfe-Quintero et al., 1998; Zhang & Lu, 2022).

Previous research predominantly employed the mean length of clauses (MLC), sentences (MLS), and T-units (MLTU) to evaluate SC (Ortega, 2003). Expanding on this foundation, Lu (2010) advanced the field by integrating 11 additional large-grained indices of SC derived from Wolfe-Quintero et al.’s (1998) and Ortega’s (2003) comprehensive syntheses of FL writing research. These 14 measures were subsequently categorized into five dimensions based on the specific syntactic characteristics they represent: (a) length of production (e.g., MLTU), (b) subordination (e.g., clauses per T-unit), (c) coordination (e.g., coordinate phrases per clause), (d) sentence complexity (clauses per sentence), and (e) phrasal elaboration (e.g., verb phrases per T-unit). These indices can be systematically analyzed through computational tools designed to automatically annotate learners’ texts for syntactic features, with Lu’s (2010) L2 Syntactic Complexity Analyzer (L2SCA) being among the most widely employed programs in this domain.
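To make the ratio-based nature of these large-grained indices concrete, the following minimal sketch computes several of them from raw unit counts. The counts are hypothetical and purely illustrative; in practice they would come from a parser-based tool such as L2SCA, and the class and function names are assumptions introduced here rather than part of any existing tool.

```python
from dataclasses import dataclass


@dataclass
class UnitCounts:
    """Hypothetical per-essay counts of production units (illustrative only)."""
    words: int
    sentences: int
    t_units: int
    clauses: int
    dependent_clauses: int
    coordinate_phrases: int
    verb_phrases: int


def large_grained_indices(c: UnitCounts) -> dict:
    """Compute a subset of the large-grained indices as simple ratios of counts."""
    return {
        "MLS": c.words / c.sentences,              # mean length of sentence
        "MLTU": c.words / c.t_units,               # mean length of T-unit
        "MLC": c.words / c.clauses,                # mean length of clause
        "C/S": c.clauses / c.sentences,            # sentence complexity
        "C/T": c.clauses / c.t_units,              # subordination
        "DC/C": c.dependent_clauses / c.clauses,   # subordination
        "CP/C": c.coordinate_phrases / c.clauses,  # coordination
        "VP/T": c.verb_phrases / c.t_units,        # phrasal elaboration
    }


# Invented counts for a single 300-word essay.
essay = UnitCounts(words=300, sentences=24, t_units=25, clauses=40,
                   dependent_clauses=12, coordinate_phrases=8, verb_phrases=50)
indices = large_grained_indices(essay)
for name, value in indices.items():
    print(f"{name}: {value:.2f}")
```

Because every index is a ratio of two counts, a longer T-unit (higher MLTU) can arise from more clauses, longer phrases, or both, which is the ambiguity that motivates the fine-grained indices discussed in this section.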

Numerous studies have investigated the extent to which large-grained SC indices function as indicators of FL proficiency and writing quality. For example, utilizing a corpus of 1,198 argumentative essays, H.-J. Yoon (2017) analyzed seven large-grained indices through the L2SCA and identified significant proficiency-related differences in MLT, MLC, MLS, noun-phrase complexity (CN/C), and phrasal coordination (CP/C). Similarly, W. Yang et al. (2015) reported that length-based measures, particularly MLS and MLT, exhibited significant correlations with holistic writing scores. Among the various large-grained SC measures, MLTU has consistently been the most frequently applied metric and is recognized as one of the strongest predictors of writing quality. For instance, Ortega (2003) identified MLTU as the sole index common to all six longitudinal FL writing studies reviewed, and Johnson’s (2017) meta-analysis revealed MLTU as one of the two most frequently reported metrics in task-based FL writing research.

Despite the demonstrated significant correlations between large-grained indices and FL writing quality, their validity as comprehensive representations of syntactic constructs has been increasingly questioned. Scholars contend that measures such as MLTU, while informative regarding unit length, do not specify the structural elements (e.g., clauses, phrases, or modifiers) that contribute to this length. Consequently, these indices offer limited insights into the developmental trajectories of learner syntax, for example, whether writers are transitioning from reliance on clausal subordination to greater use of phrasal elaboration (Biber et al., 2011; Kyle & Crossley, 2018). In response, researchers have advocated the adoption of fine-grained indices that capture discrete grammatical configurations, particularly those reflecting clausal subordination (e.g., adverbial, complement, and relative clauses) and nominal modification (e.g., possessive constructions, compound nouns, and adjectival modifiers). These indices can be systematically measured using the Tool for the Automatic Analysis of Syntactic Sophistication and Complexity (TAASSC; Kyle, 2016).

Empirical research indicates that fine-grained indices are stronger predictors of FL writing quality than large-grained measures. For example, Zhang and Lu (2022) investigated the comparative predictive efficacy of these indices concerning writing quality. Their findings demonstrated that the fine-grained measures surpassed large-grained indices across both genres examined. Specifically, fine-grained indices accounted for 31.9% of the variance in quality ratings for application letters, whereas global measures explained only 20.2%. Similarly, for argumentative essays, fine-grained indices explained 30.6% of the variance, compared with 15.7% for the large-grained indices. Among the most salient predictors (r > .200) were indices associated with prepositional complexity (e.g., prepositions per clause and prepositions per object of the preposition), noun phrase elaboration (e.g., dependents per nominal and dependents per nominal subject), and modifications through adverbials and adjectives (e.g., adverbial modifiers per clause and adjectival modifiers per nominal). Other studies have also highlighted the predictive significance of fine-grained phrasal indices. For example, Qian (2022) analyzed a corpus comprising 120 essays authored by Chinese college FL learners and found that fine-grained phrasal complexity indices, rather than clausal or large-grained measures, were the most reliable predictors of overall writing performance.

Recently, SC has been widely conceptualized as a multidimensional construct (Norris & Ortega, 2009), and methodological approaches to its assessment in FL writing research have progressively evolved to incorporate multiple analytical levels, including global, clausal, and phrasal dimensions (Jiang et al., 2019). Large- and fine-grained indices fulfil distinct yet complementary roles. Large-grained measures are valued for their practicality and their capacity to abstract from learner- and context-specific variability, which makes them broadly applicable across diverse writing contexts (Zhang & Lu, 2022). Conversely, fine-grained indices enable a more nuanced examination of the syntactic structures that manifest at various developmental stages. By pinpointing specific structural patterns, these measures enhance analytical transparency and provide a more comprehensive understanding of how syntactic variation contributes to differences in FL writing quality. Therefore, this study combined large- and fine-grained measures to leverage the advantages of each approach, thereby providing a holistic and detailed analysis of SC in FL writing.

Syntactic Complexity and Feedback

Within the corrective feedback (CF) domain, only a limited number of studies have investigated whether, and in what ways, feedback influences the development of SC in learners’ written production. However, these studies’ findings have been inconclusive. Some studies have demonstrated that students who receive CF exhibit greater SC development. For example, Van Beuningen et al. (2012) reported overall development in SC, while Fazilatfar et al. (2014) revealed that learners exhibited significantly higher MLS scores and an increased dependent clause ratio (DC/C). Similarly, W. Li et al. (2020) observed positive effects on measures of subordination and coordination. Conversely, other studies have documented the adverse effects of CF on SC. For instance, Hartshorn and Evans (2015) conducted a longitudinal 30-week investigation and found that although accuracy improved, SC declined. Likewise, Eckstein and Bell (2021) reported a significant reduction in SC among students receiving CF compared to their peers in the control group. These findings suggest that CF may redirect students’ attention toward accuracy, potentially discouraging the use of more complex syntactic structures. A further line of research has indicated that CF may exert minimal or no impact on SC. For example, Thi et al. (2023) concluded that mere exposure to CF is insufficient to enhance students’ writing complexity.

Recent empirical investigations into the impact of automated and AI-generated feedback on SC in FL writing have yielded a mixed picture. For example, Thi et al. (2023) compared teacher feedback, Grammarly, and a combination of both over the course of a semester, finding no significant development in SC; moreover, there was some evidence that learners simplified their writing while prioritizing accuracy. In contrast, Hou (2024) reported that automated essay scoring (iWrite) facilitated development in global measures such as MLTU, although certain finer-grained indices remained unaffected, and verb phrase use even declined. Similarly, Fan (2023) found no significant benefits of adding automated feedback from Grammarly to teacher feedback for SC, whereas Bagheri Nevisi and Arab (2023) noted that learners receiving computer-generated feedback through Ginger outperformed their peers on SC measures, suggesting that specific automated tools may promote greater variation in sentence structure. Notably, Deygers et al. (2025) indicated that, although the use of ChatGPT had a significant positive effect on two SC indices, particularly MLTU and MLC, these effects were limited in scope and unsustained, as development diminished once students ceased using ChatGPT.

The existing literature indicates that CF, whether provided by teachers or AI-based tools, has yielded mixed and sometimes inconclusive findings regarding the development of SC in FL writing. These inconsistencies suggest that feedback effectiveness may depend less on the mere presence of feedback and more on how learners’ attention is directed to linguistic form during meaning-focused writing and revision (Thi et al., 2023). To account for this variation, the present study adopts a focus-on-form framework (Long, 1991), which posits that language development is facilitated when learners’ attention is selectively and temporarily drawn to linguistic features as they emerge in communicative activities. Within this perspective, feedback serves as a pedagogical mechanism that promotes noticing of form–meaning mismatches and supports restructuring during revision.

Building on the noticing hypothesis, this study argues that different sources of feedback may guide learners’ attention to different, yet complementary, aspects of syntactic form. Teacher feedback tends to be selective and pedagogically focused, often targeting higher-level or discourse-relevant structures and providing explicit explanations aligned with instructional goals (Han & Li, 2024). Such feedback is particularly suited to directing learners’ attention to global and clausal-level SC, including sentence structure, subordination, and cohesion. In contrast, AI-generated feedback is characterized by its immediacy, consistency, and high degree of personalization. AI feedback can repeatedly and systematically highlight localized grammatical and structural issues, allowing learners to notice patterns of form–function mappings across their texts (Guo & Wang, 2024). This type of feedback is especially effective in directing attention to phrasal-level features that may otherwise remain unattended during meaning-oriented writing.

Despite these theoretically complementary affordances, prior research has largely examined AI feedback as a standalone intervention, often comparing it with traditional teacher feedback rather than investigating their combined effects (e.g., Deygers et al., 2025; Hou, 2024). Moreover, previous studies have operationalized SC using diverse and sometimes limited indices, with some focusing exclusively on large-grained measures (e.g., Thi et al., 2023) and others selectively examining specific indices (e.g., Bagheri Nevisi & Arab, 2023). Consequently, it remains unclear how AI feedback, when integrated with teacher feedback, influences SC across multiple levels of linguistic analysis.

To address this gap, the present study investigates the impact of ChatGPT as a complementary feedback tool on SC across global, clausal, and phrasal levels, employing both large- and fine-grained indices. By grounding the integration of teacher and AI feedback in a focus-on-form framework, this study provides a theoretically principled and empirically comprehensive account of how combined feedback may support multidimensional syntactic development in FL writing, thereby contributing to the growing literature on AI-assisted language learning.

Methodology

Research Design

This study utilized a quasi-experimental design with two groups: an intervention group (experimental) and a comparison group (control). This design was chosen to examine the impact of the combined feedback on the SC of EFL learners’ writing within a real-world classroom environment where random assignment was impractical. The study was conducted in three phases: (a) a pre-test administered to both groups to establish participants’ baseline writing performance, (b) a 9-week intervention period, and (c) a post-test to evaluate potential learning development.

Context and Participants

The study involved 35 female Saudi EFL students in two intact classes, which were designated as the experimental group (n = 19) and the control group (n = 16). A sensitivity analysis for an independent-samples t-test (two-tailed, α = .05) indicated that the present sample size provided approximately 80% power to detect large between-group effects (d ≈ 0.98), whereas statistical power was substantially lower for small-to-moderate effects. Accordingly, non-significant findings should be interpreted with caution, as they may reflect limited power to detect modest instructional effects.
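The logic of this sensitivity analysis can be sketched with a normal approximation, using only the Python standard library. This is an illustrative approximation rather than the procedure used in the study: the function name is hypothetical, and an exact calculation based on the noncentral t distribution is expected to give a slightly larger minimum detectable effect, in line with the reported d ≈ 0.98.

```python
from math import sqrt
from statistics import NormalDist


def min_detectable_d(n1, n2, alpha=0.05, power=0.80):
    """Approximate smallest Cohen's d detectable by a two-tailed
    independent-samples test (normal approximation, not the exact t-based value)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-tailed test
    z_power = z.inv_cdf(power)          # quantile corresponding to the target power
    return (z_alpha + z_power) * sqrt(1 / n1 + 1 / n2)


# Group sizes from the present study: experimental n = 19, control n = 16.
d_min = min_detectable_d(19, 16)
print(f"Approximate minimum detectable d: {d_min:.2f}")
```

Under this approximation the minimum detectable effect is roughly 0.95, which illustrates why only large between-group differences could be detected reliably with two classes of this size.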

Participants were first-year college students who had completed two semesters, including language skills courses and Writing I. Prior to university enrollment, they had received at least 8 years of English instruction from public schools, spanning elementary, intermediate, and secondary levels. According to departmental placement assessments, their English proficiency ranged from elementary to low-intermediate levels, corresponding to A2-B1 on the CEFR framework. Participants’ ages ranged from 18 to 21 years. Informed consent was obtained from all participants for the use of their performance and classwork data for research purposes. To comply with the ethical standards concerning anonymity and confidentiality, the names of the university and participants have been omitted in this study.

The study was conducted within the English Writing II course, a compulsory module for third-semester English majors at a university in Saudi Arabia. The course’s primary aim is to develop students’ academic writing skills across various genres, emphasizing rhetorical, lexical, and grammatical components. The course spans nine weeks, comprising three hours of teaching per week, divided into a 2-hr session and a 1-hr session. Throughout the course, students completed three major essay assignments, with multiple drafts.

The course instructor also served as the first author and provided teacher feedback to both groups. While this dual role is not uncommon in classroom-based research, it may introduce potential researcher bias or expectancy effects. To mitigate these concerns, several safeguards were implemented. First, outcome measurement relied on automated, tool-based indices (TAASSC) and standardized pre–post writing tasks administered under identical conditions for both groups, reducing the influence of subjective judgment. Second, both groups followed the same curriculum, materials, and instructional schedule, with feedback procedures guided by consistent commenting priorities aligned with course outcomes across sections. The only systematic difference between the groups was the inclusion of ChatGPT feedback in the experimental condition, while both groups received teacher feedback to ensure instructional equity and consistency.

Data Collection

Data were collected in three main stages: a pre-test, a 9-week intervention, and a post-test. At the outset of the course, prior to the intervention, participants from both the experimental and control groups completed a pre-test writing task. This task required composing a 300-word narrative essay in a controlled classroom setting within 60 min, a time allocation deemed sufficient for the task. The objective was to evaluate the participants’ baseline writing skills and to establish an initial point of comparison between the two groups. Following the pre-test, the first researcher, who was also the instructor, facilitated a two-hour orientation workshop aimed at training students in the experimental group on the use of ChatGPT for feedback purposes and exploring its potential applications. While some students had prior experience utilizing ChatGPT for feedback, others were introduced to it for the first time. This orientation ensured that all participants in the experimental group were adequately prepared to employ ChatGPT during the intervention stage.

The intervention period spanned nine weeks, during which both groups completed weekly essay writing assignments integrated into their coursework. The instructor implemented a process writing methodology to teach the course. Each essay genre was taught over a 3-week period, resulting in three instructional cycles of nine hours of instruction each. Each cycle concentrated on a specific essay genre and adhered to a consistent instructional process. Microsoft Teams served as the platform for assignment submission, feedback delivery, and lesson dissemination.

Each cycle commenced with prewriting activities in the first week, involving brainstorming and outlining thoughts. This was followed by a 50-min in-class writing session during which students created their first drafts. In the second week, students in the experimental group received a carefully designed prompt to solicit feedback from ChatGPT, ensuring that the feedback was constructive without rewriting the text. In addition to ChatGPT feedback, the instructor also provided feedback to the experimental group. Conversely, the control group received feedback exclusively from the instructor. During the third week, students revised their drafts based on the feedback and subsequently submitted their final versions.

Building on prior research on AI-mediated feedback in FL writing (e.g., Koltovskaia et al., 2024; Yeung, 2025), the present study conceptualized ChatGPT and teacher feedback as serving complementary but differentiated functions within the writing process. Previous research suggests that AI-based feedback is particularly effective when used to provide systematic, immediate feedback on linguistic form, whereas teacher feedback typically focuses on higher-level, pedagogically salient concerns such as sentence structure, coherence, and discourse organization. In line with this distinction, ChatGPT feedback in the present study was deliberately constrained through standardized prompting to address localized grammatical and structural features, while teacher feedback targeted global and clausal-level aspects aligned with instructional goals. Feedback was sequenced such that ChatGPT feedback was received prior to teacher feedback, and students were explicitly instructed to prioritize teacher feedback in cases of discrepancy. This design ensured a clear division of labor between the two feedback sources and minimized potential conflict in students’ revision decisions.

Following the intervention, all participants undertook a post-test, which involved writing a 300-word narrative essay under conditions identical to those of the pre-test (60 min, controlled environment). The purpose was to assess development in students’ writing skills upon completion of the intervention and to facilitate a comparative analysis of the experimental and control groups’ performance. To ensure the validity of development measures, the post-test employed a different essay topic from the pre-test; both topics were carefully constructed to be of equivalent difficulty, thereby ensuring that performance changes reflected genuine development rather than task familiarity. The pre- and post-tests were face-validated by three subject matter experts, whose feedback was incorporated prior to the test administration.

Data Analysis

Statistical Analysis

In this study, pre- and post-test written tasks were analyzed using the TAASSC (Kyle, 2016). To ensure the reliability of the analysis, all spelling errors within the text were corrected prior to processing. The TAASSC provides a comprehensive set of indices comprising 31 fine-grained clausal complexity measures, 132 fine-grained phrasal complexity measures, and a re-implementation of the 14 large-grained SC indices originally developed for the L2SCA (Lu, 2010). Consequently, a total of 177 distinct SC values were generated for each writing sample.

Because TAASSC yields a large number of SC indices, the analyses involved multiple statistical tests. To reduce inflated Type I error, p-values were adjusted using the Benjamini–Hochberg false discovery rate (FDR) procedure within each family (global, clausal, and phrasal). The tables report unadjusted p-values; FDR-adjusted q-values are provided in Supplementary Tables S1–S6, and findings are interpreted as robust only when they remain significant after correction.
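For readers unfamiliar with the procedure, the following minimal sketch shows how Benjamini–Hochberg q-values can be obtained for one family of tests. The p-values are hypothetical and do not correspond to any family in this study, and the function name is an assumption introduced for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Hypothetical family of six unadjusted p-values (illustrative only).
family = [0.01, 0.02, 0.03, 0.20, 0.50, 0.80]
q = benjamini_hochberg(family)
print([round(x, 2) for x in q])  # [0.06, 0.06, 0.06, 0.3, 0.6, 0.8]
```

In this invented family, three tests are nominally significant at .05, yet none has a q-value below .05. This mirrors the situation in which unadjusted differences are treated as exploratory rather than confirmatory.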

In addition to statistical significance, effect sizes (Cohen’s d) were computed for between-group comparisons to quantify the magnitude of differences. Where appropriate, 95% confidence intervals were reported to convey estimation uncertainty, allowing interpretation beyond p-values alone.
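As an illustration of how such effect sizes can be derived from summary statistics, the sketch below computes Cohen's d with a pooled standard deviation, together with an approximate 95% confidence interval, for the post-test MLS values reported in Table 1. The pooled-SD formulation and the large-sample standard error are assumptions made for this example; the study does not specify which variant was used, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d using the pooled standard deviation of two independent groups."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (m1 - m2) / sqrt(pooled_var)


def approx_ci(d, n1, n2, level=0.95):
    """Approximate confidence interval for d (large-sample standard error)."""
    se = sqrt((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2)))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return d - z * se, d + z * se


# Post-test MLS from Table 1: experimental (n = 19, M = 13.24, SD = 2.27),
# control (n = 16, M = 13.09, SD = 2.65).
d = cohens_d(13.24, 2.27, 19, 13.09, 2.65, 16)
low, high = approx_ci(d, 19, 16)
print(f"d = {d:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

For MLS the estimate is close to zero (about 0.06) and the interval spans both negative and positive values, which is consistent with the non-significant t value reported for that index.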

The computational methodology underlying TAASSC is elaborated in detail by Kyle (2016). In line with Lu (2010), the calculation of the 14 large-grained indices began with the generation of a constituency representation for each sentence using the Stanford Parser (Zhang & Lu, 2022). Structural units, including T-units, dependent clauses, and complex nominals, were identified and quantified through Tregex queries, with these counts serving as the basis of the indices.

In contrast, the fine-grained clausal and phrasal measures were derived from dependency parsing: each sentence was processed using the Stanford Neural Network Dependency Parser, after which pertinent linguistic units and dependency relations were extracted using a Python XML parser.

Index Selection

All 177 indices generated by TAASSC were retained at the computation stage. For inferential testing, indices were screened for distributional plausibility (|skewness| ≤ 2 and |kurtosis| ≤ 2) to support parametric comparisons. Between-group differences were then tested within each analytic family aligned with the research questions: global, clausal, and phrasal. To address multiplicity, p-values were corrected using the Benjamini–Hochberg false discovery rate procedure within each family, and results were interpreted as robust only when they remained significant after correction.
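The screening step can be sketched as follows. The example assumes moment-based (population) estimators and excess kurtosis, since the study does not state which estimators were applied; the index values and function names are hypothetical.

```python
def skewness(values):
    """Moment-based sample skewness (assumes the values are not all identical)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


def excess_kurtosis(values):
    """Moment-based excess kurtosis (0 for a normal distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m4 / m2 ** 2 - 3


def passes_screen(values, limit=2.0):
    """Keep an index only if both |skewness| and |kurtosis| are within the limit."""
    return abs(skewness(values)) <= limit and abs(excess_kurtosis(values)) <= limit


# Hypothetical index values across learners (illustrative only).
roughly_symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
one_extreme_value = [0.0] * 9 + [10.0]

print(passes_screen(roughly_symmetric))  # True
print(passes_screen(one_extreme_value))  # False
```

An index dominated by zeros with one extreme value, as in the second list, fails the screen. This is the kind of sparse structure-specific index for which a parametric comparison would be least defensible.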

Results

Across the three analytic levels, global, clausal, and phrasal, the results showed a largely consistent pattern: the experimental group, who received combined feedback, did not demonstrate reliable advantages over the control group, who received teacher feedback only, on global or clausal indices. At the phrasal level, only a limited subset of noun-phrase–related indices showed post-test differences. Full index-level outputs are provided in Supplementary Tables S1–S6.

At the Global Level

At the global level, the experimental and control groups were comparable at pre-test across the large-grained indices. Post-test comparisons similarly indicated no consistent between-group differences on global measures, suggesting that combined feedback did not produce measurable changes in overall global SC indicators over the intervention period.

Table 1 summarizes the post-test between-group results at the global and clausal levels; full index-level outputs are provided in Supplementary Tables S1–S6.

Table 1.

Post-test Between-group Comparisons on Global (Large-Grained) and Selected Clausal (Fine-Grained) SC Indices.

Level	Index	Experimental (n, M, SD)	Control (n, M, SD)	t	df	p
Global	MLS	19, 13.24, 2.27	16, 13.09, 2.65	0.18	33	.86
	MLT	19, 11.43, 1.70	16, 11.58, 2.32	−0.23	33	.82
	MLC	19, 7.70, 0.98	16, 7.80, 1.20	−0.28	33	.78
	C_S	19, 1.73, 0.27	16, 1.69, 0.30	0.43	33	.67
	VP_T	19, 1.80, 0.28	16, 1.85, 0.35	−0.51	33	.61
	C_T	19, 1.49, 0.17	16, 1.48, 0.17	0.08	33	.94
	DC_C	19, 0.29, 0.09	16, 0.29, 0.07	−0.11	33	.92
	DC_T	19, 0.44, 0.18	16, 0.44, 0.15	0.03	33	.98
	T_S	19, 1.16, 0.12	16, 1.13, 0.11	0.72	33	.48
	CT_T	19, 0.34, 0.12	16, 0.41, 0.12	−1.66	33	.11
	CP_T	19, 0.35, 0.15	16, 0.29, 0.22	0.84	33	.41
	CP_C	19, 0.24, 0.11	16, 0.20, 0.14	0.90	33	.38
	CN_T	19, 0.88, 0.22	16, 0.93, 0.42	−0.51	33	.61
	CN_C	19, 0.59, 0.14	16, 0.63, 0.27	−0.56	33	.58
Clausal	cl_av_deps	19, 2.63, 0.14	16, 2.60, 0.23	0.46	33	.65
	cl_ndeps_std_dev	19, 1.16, 0.18	16, 1.14, 0.16	0.29	33	.77
	neg_per_cl	16, 0.06, 0.03	9, 0.06, 0.03	0.01	23	.99
	prep_per_cl	19, 0.28, 0.09	16, 0.31, 0.11	−0.75	33	.46
	xcomp_per_cl	18, 0.08, 0.04	15, 0.09, 0.05	−0.43	31	.67
	nsubj_per_cl	19, 0.80, 0.09	16, 0.82, 0.11	−0.54	33	.59
	advmod_per_cl	19, 0.25, 0.09	16, 0.23, 0.10	0.54	33	.60
	modal_per_cl	18, 0.08, 0.04	15, 0.09, 0.05	−0.62	31	.54

At the Clausal Level

At the clausal level, pre-test results indicated comparability between groups across the analyzed indices. Post-test comparisons did not show a stable between-group advantage attributable to combined feedback; in other words, clause-level restructuring (e.g., subordination- and clause-function–related complexity) was not reliably affected within the study timeframe. Selected clausal indicators are summarized in Table 1, and the complete clausal outputs are reported in Supplementary Table S4.

At the Phrasal Level

At the phrasal level, the two groups again started from comparable baselines at pre-test. In the post-test, most indices did not differ between groups; however, a limited number of noun-phrase–focused measures showed differences, suggesting localized changes in nominal elaboration rather than broad phrasal restructuring. Key phrasal indices and noun-phrase structural measures are reported in Table 2; complete phrasal outputs are provided in Supplementary Table S6.

Table 2.

Post-test Between-Group Comparisons on Key Phrasal Indices, Including Noun-Phrase Structural Measures With Nominal (Uncorrected) Between-Group Differences.

Section	Index	Experimental (n, M, SD)	Control (n, M, SD)	t	df	p
Core phrasal (averages)	av_nominal_deps	19, 0.74, 0.14	16, 0.78, 0.20	−0.78	33	.44
	av_nsubj_deps	19, 0.30, 0.13	16, 0.29, 0.14	0.33	33	.74
	av_dobj_deps	19, 1.13, 0.27	16, 1.29, 0.42	−1.40	33	.17
	av_pobj_deps	19, 1.02, 0.29	16, 1.05, 0.24	−0.33	33	.74
	av_ncomp_deps	18, 2.23, 0.64	14, 2.51, 1.01	−0.93	30	.36
Core phrasal (NN averages)	av_nominal_deps_NN	19, 1.15, 0.19	16, 1.23, 0.27	−0.98	33	.33
	av_nsubj_deps_NN	19, 0.89, 0.27	16, 0.97, 0.46	−0.63	33	.54
	av_dobj_deps_NN	19, 1.33, 0.34	16, 1.45, 0.42	−0.92	33	.36
	av_pobj_deps_NN	19, 1.03, 0.29	16, 1.04, 0.26	−0.08	33	.94
	av_ncomp_deps_NN	17, 2.24, 0.61	12, 2.74, 0.86	−1.85	27	.08
Dispersion	nominal_deps_stdev	19, 0.97, 0.11	16, 1.06, 0.24	−1.46	33	.15
	nsubj_stdev	19, 0.61, 0.16	16, 0.68, 0.29	−0.82	33	.42
	dobj_stdev	19, 0.98, 0.20	16, 1.04, 0.37	−0.57	33	.58
	pobj_stdev	19, 0.91, 0.21	16, 0.88, 0.21	0.40	33	.69
NP-structural (nominally significant, uncorrected)	poss_dobj_deps_struct	7, 0.12, 0.03	6, 0.17, 0.04	−2.84**	11	.02
	conj_and_dobj_deps_struct	17, 0.22, 0.05	16, 0.25, 0.03	−2.69**	31	.01
	nn_dobj_deps_struct	17, 0.07, 0.02	16, 0.09, 0.02	−2.90**	31	.01
	poss_dobj_deps_NN_struct	7, 0.12, 0.03	6, 0.17, 0.04	−2.84**	11	.02
	nn_dobj_deps_NN_struct	7, 0.13, 0.03	6, 0.24, 0.09	−3.04**	11	.01

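Note. ** indicates nominal significance at the .05 level, uncorrected for multiple comparisons; none of these differences remained significant after FDR correction.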
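The t and df columns in Table 2 can be approximately recomputed from the reported n, M, and SD values. The sketch below assumes a pooled-variance (Student's) independent-samples t-test. That assumption is inferred here from the reported df equalling n1 + n2 − 2 and is not stated in this section; the function name `pooled_t` is hypothetical. Because M and SD are rounded to two decimals, recomputed values only approximate the printed ones, and for indices with very small values the rounding can shift t noticeably.

```python
import math


def pooled_t(n1, m1, sd1, n2, m2, sd2):
    """Student's independent-samples t from summary statistics.

    Assumes equal variances (pooled), so df = n1 + n2 - 2.
    Returns the t statistic and its degrees of freedom.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


# av_ncomp_deps_NN row of Table 2: experimental vs. control (n, M, SD)
t, df = pooled_t(17, 2.24, 0.61, 12, 2.74, 0.86)
print(f"t({df}) = {t:.2f}")
```

For the av_ncomp_deps_NN row this yields df = 27 and a t value within rounding distance of the reported −1.85.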

Overall, the post-test results demonstrated no statistically significant differences between the two groups after controlling for multiple comparisons within the phrasal family. A small set of noun phrase–focused indices reached nominal significance at the unadjusted level (p < .05); however, these effects did not remain significant after FDR correction. Accordingly, the phrasal findings are interpreted as exploratory and suggestive rather than confirmatory.
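The effect of an FDR adjustment on nominally significant results can be illustrated with a short sketch. It assumes the Benjamini-Hochberg procedure, a common FDR method that this section does not explicitly name, and it treats the 19 rounded p-values printed in Table 2 as the family purely for illustration; the actual phrasal family contains many more indices. The function name `bh_adjust` is hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Illustrative family only: the rounded p-values from Table 2, in row order.
table2_p = [.44, .74, .17, .74, .36, .33, .54, .36, .94, .08,
            .15, .42, .58, .69, .02, .01, .01, .02, .01]
adj = bh_adjust(table2_p)
print(f"smallest adjusted p = {min(adj):.3f}")
print("any adjusted p < .05:", any(a < 0.05 for a in adj))
```

Even with only these 19 rounded values, the smallest adjusted p-value stays above .05. A larger family, as in the full phrasal analysis, would generally make the correction stricter still.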

Discussion

This study’s primary objective was to investigate whether integrating ChatGPT feedback with teacher feedback would result in enhanced SC among EFL students, compared to teacher feedback alone. The findings indicated that students receiving combined feedback did not significantly outperform those receiving solely teacher feedback at the global, clausal, or phrasal levels of SC. Our findings on the ineffectiveness of feedback on SC are consistent with those reported by Thi et al. (2023), who demonstrated that writing complexity remains unaffected by feedback, irrespective of whether it is delivered by teachers, automated systems, or a combination thereof. This outcome also supports the conclusions of Fan (2023), who found that integration of automated written feedback with teacher feedback did not enhance students’ SC. Similarly, Xu and Zhang (2021) reported that, unlike accuracy and fluency, learners’ SC exhibited no significant development following automated CF. However, the results of this study diverge from those of Deygers et al. (2025) and Bagheri Nevisi and Arab (2023), who observed that learners receiving automated CF attained higher levels of SC compared to their peers. Additionally, the findings are inconsistent with those of Hou (2024), who documented substantial development in global measures of SC as a consequence of automated feedback. These discrepancies may be attributed to differences in the role and affordances of the AI tools employed and their integration within instructional design. In studies such as Deygers et al. (2025) and Bagheri Nevisi and Arab (2023), automated feedback functioned as a primary or relatively unconstrained source of revision support, allowing extensive reformulation and syntactic expansion. Similarly, Hou (2024) employed an automated essay scoring system that provided holistic feedback, potentially encouraging global increases in SC. 
In contrast, the present study positioned ChatGPT as a complementary and constrained feedback tool, with teacher feedback explicitly prioritized. Moreover, the standardized prompting limited ChatGPT’s rewriting and emphasized localized, form-focused revision, which may help explain the absence of significant SC development.

Previous research has shown that CF may redirect learners’ attention toward accuracy, potentially discouraging the use of more complex syntactic structures, as students adopt simpler constructions to minimize the risk of error (Eckstein & Bell, 2021; Hartshorn & Evans, 2015; Truscott, 2007). Eckstein and Bell (2021), for example, argued that FL writers may deliberately employ linguistically simplified structures when accuracy is emphasized, while Hartshorn and Evans (2015) similarly noted that careful monitoring for errors can inhibit SC as learners favor safer, more controlled forms. This tendency may be particularly pronounced in high-stakes instructional contexts such as the Saudi EFL setting examined in the present study (Al-Seghayer, 2022). In this context, students are evaluated on each written assignment, and performance directly contributes to their final course grades. Such assessment practices, combined with a low tolerance for grammatical errors, may encourage learners to prioritize accuracy and error avoidance over syntactic elaboration. As a result, students may prefer to produce simpler but more accurate sentences rather than attempt more complex structures that could jeopardize their scores.

Most importantly, the findings of the present study align with a growing body of research indicating that, although CF does not necessarily promote the development of SC, it also does not lead to structural simplification in learners’ writing (Thi et al., 2023; Xu & Zhang, 2021). In the present study, no significant development in SC was observed; however, there was also no evidence that feedback resulted in less complex syntactic production. This distinction is particularly important in light of earlier studies reporting declines in SC alongside development in accuracy following CF (e.g., Eckstein & Bell, 2021; Hartshorn & Evans, 2015). Instead, the findings are more consistent with research suggesting that written CF may exert a largely neutral effect on SC, neither enhancing nor suppressing it (Thi & Nikolov, 2023).

The absence of significant effects on SC in the present study may be attributed to a combination of intervention-related, learner-related, and task-related factors. First, the 9-week intervention may have been insufficient to elicit measurable development in syntactic structures, particularly at the clausal and global levels. Previous studies reporting significant development in SC have typically involved longer instructional periods or more intensive exposure to automated feedback (e.g., Bagheri Nevisi & Arab, 2023). Second, learner proficiency likely played an important role. Although participants were English majors, most fell within the A2–B1 proficiency range, which may have constrained their ability to reliably produce and control syntactically complex sentences under timed writing conditions. Research on SC development suggests that more advanced learners are better positioned to deploy a wider range of syntactic resources, whereas lower-proficiency learners tend to rely on simpler structures that place fewer demands on linguistic control. For example, Kyle and Crossley (2018) showed that higher-proficiency FL writers produced more syntactically elaborated language at both large- and fine-grained levels. In contrast, lower-proficiency learners may lack the resources to manage such elaboration under time pressure and therefore favor simpler sentence constructions, which may help explain the absence of significant SC development observed in the present study. An additional explanation relates to task genre. Previous research (e.g., H. J. Yoon & Polio, 2017) indicates that narrative writing typically relies on chronological sequencing and event-based progression, which may limit opportunities for syntactic elaboration. Accordingly, the use of narrative tasks in the present study may have constrained learners’ production of syntactically complex structures, regardless of feedback type.

Despite these results, the study contributed important findings to the practice of FL writing pedagogy and research. The findings suggest that integrating ChatGPT with teacher feedback is instructionally safe with respect to SC, as it neither enhanced nor diminished learners’ syntactic sophistication. Although the combined feedback did not lead to measurable development in SC, it also did not discourage the use of complex structures, indicating that ChatGPT may be employed as a teacher-in-the-loop support tool for localized revision and accuracy-focused feedback without negative structural consequences. However, the absence of complexity development also suggests that feedback alone may be insufficient to promote syntactic development. To foster such development, teachers may need to pair AI-assisted feedback with explicit metalinguistic explanation, particularly for lower-proficiency learners (Almutlaq & Alsaleh, 2025). Metalinguistic explanation enhances learners’ understanding and uptake of both teacher and AI feedback, potentially improving overall writing quality. For FL writing researchers, the findings indicate that the effectiveness of feedback may be shaped by a range of interacting factors related to both learners and tasks. Variables such as learner proficiency, task genre, and instructional duration appear to mediate how feedback is processed and applied, suggesting that feedback effects cannot be fully understood in isolation. Consequently, future research should adopt more context-sensitive and design-aware approaches, systematically examining how learner characteristics and task features condition the impact of feedback. Such work may help clarify the conditions under which feedback, whether human, AI-based, or combined, supports different dimensions of writing development.

Conclusion

This study examined whether integrating ChatGPT feedback with teacher feedback influences EFL learners’ SC at global, clausal, and phrasal levels. The results did not indicate a reliable advantage of combined feedback over teacher feedback alone within a 9-week instructional window, suggesting that such integration may not be sufficient to promote short-term changes in SC.

Several limitations of the present study arise from methodological considerations. First, the classroom-based sample was relatively small, which limited statistical power to detect modest effects and thus raised the risk of Type II errors, while the large number of indices examined raised the risk of Type I errors. Second, the sample consisted exclusively of female students drawn from a single higher education institution in Saudi Arabia, thereby constraining the generalizability of the findings across genders, institutional contexts, and cultural settings. Third, the relatively short duration of the instructional intervention restricts conclusions regarding longer-term development in SC.
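The power concern raised by the small sample can be made concrete with a rough calculation. The sketch below uses a normal approximation to the power of a two-sided two-sample comparison with the group sizes reported in the tables (n = 19 and n = 16). The effect sizes (d = 0.2, 0.5, 0.8) are conventional benchmarks chosen for illustration, not values estimated in this study, and the function name `approx_power` is hypothetical. The approximation slightly overstates the power of an actual t-test at these sample sizes.

```python
from statistics import NormalDist


def approx_power(d, n1, n2, alpha=0.05):
    """Normal-approximation power of a two-sided two-sample test.

    d is the standardized mean difference (Cohen's d) under the alternative.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative.
    ncp = d * (n1 * n2 / (n1 + n2)) ** 0.5
    # Probability of landing in either rejection region.
    return z.cdf(ncp - z_crit) + z.cdf(-ncp - z_crit)


for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: approximate power = {approx_power(d, 19, 16):.2f}")
```

Under these assumptions, a medium effect (d = 0.5) would be detected in only about three out of ten such studies, which is consistent with reading the null findings cautiously.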

Further research is recommended to validate and extend the present findings. Longitudinal studies with larger and more diverse samples across multiple institutions and demographic groups are needed to enhance generalizability and to examine the long-term effects of AI-assisted feedback on SC. Future research should also employ extended instructional periods, incorporate appropriate control or comparison groups, and apply statistical corrections for multiple comparisons to strengthen the robustness of findings.

Supplemental Material

Supplemental material, sj-docx-1-sgo-10.1177_21582440261453413 for The Combined Impact of ChatGPT and Teacher Feedback on the Syntactic Complexity of EFL Learners’ Writing by Eman Alkhalifah and Sana Almutlaq in SAGE Open

Footnotes

Acknowledgements

The authors would like to thank Imam Mohammad Ibn Saud Islamic University (IMSIU) for supporting and funding this project.

ORCID iD

Sana Almutlaq

Ethical Considerations

This study received ethical approval from the College of Languages and Translation at Imam Mohammad Ibn Saud Islamic University (IMSIU). Informed consent was obtained from all participants through signed consent forms after they were provided with a clear and comprehensive explanation of the study’s objectives, procedures, and content. The research design was carefully developed to minimize any potential risk or harm to participants by ensuring anonymity, voluntary participation, and the unequivocal right to withdraw from the study at any time without penalty or adverse consequences. The researchers maintained a strict commitment to the confidentiality and privacy of all participants’ data and personal information, restricting its use exclusively to scientific research purposes, preventing disclosure to any parties beyond the scope of the study, and ensuring secure storage in accordance with ethical and data-protection standards. Furthermore, the anticipated benefits of the study to the academic community and to society at large were carefully evaluated and determined to outweigh any minimal potential risks, as the findings are expected to contribute to knowledge advancement, inform policy development, and support the improvement of practices related to the focus of the study.

Consent to Participate

All participants provided written informed consent before participation.

Author Contributions

Eman Alkhalifah: Conceptualization, Resources, Methodology, Writing, Reviewing, and Editing.

Sana Almutlaq: Conceptualization, Resources, Methodology, Data Analysis, Writing, Reviewing, and Editing.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported and funded by the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University (IMSIU; grant number IMSIU-DDRSP2602).

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data Availability Statement

The data set for this study is available upon request.

Supplemental Material

Supplemental material for this article is available online.

References

Almutlaq, S. A., & Alsaleh, Z. M. (2025). Does metalinguistic explanation of teacher and ChatGPT feedback improve EFL learners’ writing quality and engagement? Findings from an intervention study. Assessment & Evaluation in Higher Education. https://doi.org/10.1080/02602938.2025.2586124

Al-Seghayer (2022). EFL writing instruction in Saudi Arabia: Challenges and directions. English Language Teaching, 15(2), 1–14.

Bagheri Nevisi & Arab (2023). Computer-generated vs. direct written corrective feedback and Iranian EFL students’ syntactic accuracy and complexity. Teaching Language Skills, 42(2), 111–148. https://doi.org/10.22099/tesl.2023.46955.3177

Biber, Gray, & Poonpon (2011). Should we use characteristics of conversation to measure grammatical complexity in L2 writing development? TESOL Quarterly, 45(1), 5–35. https://doi.org/10.5054/tq.2011.244483

Bulté & Housen (2012). Defining and operationalising L2 complexity. In Dimensions of L2 performance and proficiency: Complexity, accuracy and fluency in SLA (Vol. 32, p. 21). John Benjamins Publishing Company. https://doi.org/10.1075/lllt.32.02bul

Deng & Lin (2022). The benefits and challenges of ChatGPT: An overview. Frontiers in Computing and Intelligent Systems, 2(2), 81–83. https://doi.org/10.54097/fcis.v2i2.4465

Deygers, Buelens, Chan, Schildt, Van Parys, & Vanbuel (2025). The impact of automated writing evaluation on writing gains. ELT Journal, 79(3), 352–362. https://doi.org/10.1093/elt/ccaf020

Eckstein & Bell (2021). Dynamic written corrective feedback in first-year composition: Accuracy and lexical and syntactic complexity. RELC Journal, 54(3), 630–647. https://doi.org/10.1177/00336882211061624

Escalante, Pack, & Barrett (2023). AI-generated feedback on writing: Insights into efficacy and ENL student preference. International Journal of Educational Technology in Higher Education, 20(1), 57. https://doi.org/10.1186/s41239-023-00425-2

Fan (2023). Exploring the effects of automated written corrective feedback on EFL students’ writing quality: A mixed-methods study. SAGE Open, 13(2), 1–17. https://doi.org/10.1177/21582440231181296

Fazilatfar, A. M., Fallah, Hamavandi, & Rostamian (2014). The effect of unfocused written corrective feedback on syntactic and lexical complexity of L2 writing. Procedia – Social and Behavioral Sciences, 98, 482–488. https://doi.org/10.1016/j.sbspro.2014.03.443

Fredrick, D. R., & Craven (2025). Lexical diversity, syntactic complexity, and readability: A corpus-based analysis of ChatGPT and L2 student essays. Frontiers in Education, 10, Article 1616935. https://doi.org/10.3389/feduc.2025.1616935

Guo & Wang (2024). To resist it or to embrace it? Examining ChatGPT’s potential to support teacher feedback in EFL writing. Education and Information Technologies, 29(7), 8435–8463. https://doi.org/10.1007/s10639-023-12146-0

Han & Li (2024). Exploring ChatGPT-supported teacher feedback in the EFL context. System, 126, Article 103502. https://doi.org/10.1016/j.system.2024.103502

Hao, Wang, Bin, Yang, & Liu (2024). How syntactic complexity indices predict Chinese L2 writing quality: An analysis of unified dependency syntactically-annotated corpus. Assessing Writing, 61, Article 100847. https://doi.org/10.1016/j.asw.2024.100847

Hartshorn, K. J., & Evans, N. W. (2015). The effects of dynamic written corrective feedback: A 30-week study. Journal of Response to Writing, 1(2), 6–34.

Hou (2024). The effects of automated essay scoring on the syntactic complexity of intermediate EFL learners’ writing. Foreign Languages and Cultures, 9(1), 133–143. https://doi.org/10.19967/j.cnki.flc.2024.01.013

Jiang & Liu (2019). Syntactic complexity development in the writings of EFL learners: Insights from a dependency syntactically-annotated corpus. Journal of Second Language Writing, 46, Article 100666. https://doi.org/10.1016/j.jslw.2019.100666

Johnson, M. D. (2017). Cognitive task complexity and L2 written syntactic complexity, lexical complexity, accuracy, and fluency: A research synthesis and meta-analysis. Journal of Second Language Writing, 37, 13–38. https://doi.org/10.1016/j.jslw.2017.06.00

Kim & Chon (2025). The impact of self-revision, machine translation, and ChatGPT on L2 writing: Raters’ assessments, linguistic complexity, and error correction. Assessing Writing, 65, Article 100950. https://doi.org/10.1016/j.asw.2025.100950

Koltovskaia, Rahmati, & Saeli (2024). Graduate students’ use of ChatGPT for academic text revision: Behavioral, cognitive, and affective engagement. Journal of Second Language Writing, 65, Article 101130. https://doi.org/10.1016/j.jslw.2024.101130

Kyle (2016). Measuring syntactic development in L2 writing: Fine-grained indices of syntactic complexity and usage-based indices of syntactic sophistication [Doctoral dissertation]. Georgia State University.

Kyle & Crossley, S. A. (2018). Measuring syntactic complexity in L2 writing using fine-grained clausal and phrasal indices. The Modern Language Journal, 102(2), 333–349. https://doi.org/10.1111/modl.12468

Liu (2020). Syntactic complexity development in college students’ essay writing based on AWE. In Frederiksen, K.-M., Larsen, Bradley, & Thouesny (Eds.), CALL for widening participation: Short papers from EUROCALL 2020 (pp. 190–194). Research-publishing.net. https://doi.org/10.14705/rpnet2020.48.1187

Nikitina & Riget, P. N. (2022). Development of syntactic complexity in Chinese university students’ L2 argumentative writing. Journal of English for Academic Purposes, 56, Article 101099. https://doi.org/10.1016/j.jeap.2022.101099

Lingard (2023). Writing with ChatGPT: An illustration of its capacity, limitations & implications for academic writers. Perspectives on Medical Education, 12(1), 261–270. https://doi.org/10.5334/pme.1072

Long (1991). Focus on form: A design feature in language teaching methodology. In de Bot, Ginsberg, & Kramsch (Eds.), Foreign language research in cross-cultural perspective (pp. 39–52). John Benjamins.

Lu (2010). Automatic analysis of syntactic complexity in second language writing. International Journal of Corpus Linguistics, 15(4), 474–496. https://doi.org/10.1075/ijcl.15.4.02lu

Lu & Ai (2015). Syntactic complexity in college-level English writing: Differences among writers with diverse L1 backgrounds. Journal of Second Language Writing, 29, 16–27. https://doi.org/10.1016/j.jslw.2015.06.003

Norris, J. M., & Ortega (2009). Towards an organic approach to investigating CAF in instructed SLA: The case of complexity. Applied Linguistics, 30(4), 555–578. https://doi.org/10.1093/applin/amp044

Oh, H. J., & Hsieh, S. F. (2025). An analysis of ChatGPT-generated feedback on argumentative essays in Korean EFL classrooms. Modern English Education, 26, 166–181. https://doi.org/10.18095/meeso.2025.26.1.166

Ortega (2003). Syntactic complexity measures and their relationship to L2 proficiency: A research synthesis of college-level L2 writing. Applied Linguistics, 24(4), 492–518. https://doi.org/10.1093/applin/24.4.492

Ortega (2015). Syntactic complexity in L2 writing: Progress and expansion. Journal of Second Language Writing, 29, 82–94. https://doi.org/10.1016/j.jslw.2015.06.008

Qian (2022). A comparison of measurements of syntactic complexity in L2 writing: Large-grained indices and fine-grained clausal and phrasal indices. Cross-Cultural Communication (CCC), 18(4), 78–87. https://doi.org/10.3968/12889

Shen, Heacock, Elias, Hentel, K. D., Reig, Shih, & Moy (2023). ChatGPT and other large language models are double-edged swords [Editorial]. Radiology, 307(2), Article e230163. https://doi.org/10.1148/radiol.230163

Song & Song (2023). Enhancing academic writing skills and motivation: Assessing the efficacy of ChatGPT in AI-assisted language learning for EFL students. Frontiers in Psychology, 14, Article 1260843. https://doi.org/10.3389/fpsyg.2023.1260843

Thi, N. K., & Nikolov (2023). Effects of teacher, automated, and combined feedback on syntactic complexity in EFL students’ writing. Asian-Pacific Journal of Second and Foreign Language Education, 8(1), 6. https://doi.org/10.1186/s40862-022-00182-1

Thi, N. K., D. V., & Nikolov (2023). Investigating syntactic complexity and language-related error patterns in EFL students’ writing: Corpus-based and epistemic network analyses. Language Learning in Higher Education, 13(1), 127–151. https://doi.org/10.1515/cercles-2023-2009

Truscott (2007). The effect of error correction on learners’ ability to write accurately. Journal of Second Language Writing, 16(4), 255–272. https://doi.org/10.1016/j.jslw.2007.06.003

Van Beuningen, C. G., De Jong, N. H., & Kuiken (2012). Evidence on the effectiveness of comprehensive error correction in second language writing. Language Learning, 62(1), 1–41. https://doi.org/10.1111/j.1467-9922.2011.00674.x

Wang, Zaid, Y. H., & Pan (2024). Opportunities and challenges of using ChatGPT as a teaching assistant in English language teaching: A systematic literature review [Symposium]. In Proceedings of the 2024 International Symposium on Artificial Intelligence for Education (ISAIE 24) (pp. 375–382). Association for Computing Machinery. https://doi.org/10.1145/3700297.3700362

Wolfe-Quintero, Inagaki, & Kim (1998). Second language development in writing: Measures of fluency, accuracy, and complexity. University of Hawaii, Second Language Teaching & Curriculum Center.

Xu & Zhang (2021). Understanding AWE feedback and English writing of learners with different proficiency levels in an EFL classroom: A sociocultural perspective. The Asia-Pacific Education Researcher, 31(4), 357–367. https://doi.org/10.1007/s40299-021-00577-7

Yang & Weigle, S. C. (2015). Different topics, different discourse: Relationships among writing topic, measures of syntactic complexity, and judgments of writing quality. Journal of Second Language Writing, 28, 53–67. https://doi.org/10.1016/j.jslw.2015.02.002

Yeung (2025). University students engagement with generative AI-supported automated writing evaluation (AWE) feedback. Journal of Second Language Writing, 68, Article 101203. https://doi.org/10.1016/j.jslw.2025.101203

Yoon, H. J., & Polio (2017). The linguistic development of students of English as a second language in two written genres. TESOL Quarterly, 51(2), 275–301. https://doi.org/10.1002/tesq.296

Yoon, H.-J. (2017). Linguistic complexity in L2 writing revisited: Issues of topic, proficiency, and construct multidimensionality. System, 66, 130–141. https://doi.org/10.1016/j.system.2017.03.007

Zhai (2022). ChatGPT user experience: Implications for education (SSRN Scholarly Paper No. 4312418). SSRN Electronic Journal. https://doi.org/10.2139/ssrn.4312418

Zhang (2022). Revisiting the predictive power of traditional vs. fine-grained syntactic complexity indices for L2 writing quality: The case of application letters and argumentative essays. Journal of Second Language Writing, 60, Article 100756. https://doi.org/10.1016/j.jslw.2022.100756

Zheng & Barrot, J. S. (2024). Syntactic complexity in second language (L2) writing: Comparing students’ narrative and argumentative essays. System, 123, Article 103342. https://doi.org/10.1016/j.system.2024.103342
