An Empirical Study on the Artificial Intelligence Writing Evaluation System in China CET

Abstract

Artificial intelligence writing evaluation systems are widely used in college English writing instruction in China. They provide teachers and English learners with online automated composition evaluation, so that teachers' workload is reduced, teachers can learn directly about their students' English writing level, and students' English writing improves. Juku automated writing evaluation (AWE) is one of the most widely used systems among colleges and universities in China. This empirical study examines the use of Juku AWE in college English teaching. Based on an experiment with 114 students from 2 classes at Xi'an University, and on questionnaires and interviews with 30 teachers and 200 students who use Juku AWE, the author finds that: (1) using AWE effectively helps students with their English writing; (2) both teachers and students have a positive attitude toward AWE in terms of immediate and clear feedback, time saving, and arousing interest in English writing; and (3) AWE still needs improvement, as it cannot properly evaluate text structure, content logic, and coherence. Both teachers and students should therefore treat AWE scores objectively.

Introduction

In college English teaching in China, English writing, an important vehicle for academic and social interaction, remains the most difficult obstacle to overcome. In the writing process, feedback plays an important role because it helps the writer assess how effectively the writing conveys the intended message.¹ The most common and effective type of feedback is teacher feedback. Unfortunately, this places an enormous load on college English teachers in China, as colleges expand their enrollments and the number of students increases sharply. As a result, giving feedback on students' English writing has become a heavy burden for college English teachers. On the one hand, teacher feedback is insufficient in both quality and quantity. On the other hand, feedback delayed is feedback denied: delayed feedback misses the best moment for students to apply it to their writing and generally becomes useless.² When feedback is delayed too long, students may have little memory of what they wrote and thus cannot revise and improve their essays according to their teachers' feedback.

In this situation, artificial intelligence writing evaluation systems, which are designed to produce immediate computer-based scores for submitted essays along with diagnostic feedback, have been widely adopted by English teachers as an alternative assessment and feedback tool in many middle schools and colleges in China. The most widely used automated writing system is the Juku correcting network, known as Pigai.org. The present study takes Juku as an example to examine whether it effectively helps English teachers and students in the teaching and learning of English writing, how they use automated writing evaluation (AWE), and what attitudes they hold toward it.

Literature Review

AWE, also called computer-assisted writing assessment, computerized essay scoring, computer essay grading, or machine scoring of essays, is defined as “the ability of computer technology to evaluate and score written essays.”³

AWE systems have been under development since the mid-1960s, when a national network of U.S. universities, known as the College Board, supported the development of Project Essay Grade to help score thousands of high school student essays.⁴ Page, who was involved in the development, noted that the original enthusiasm for the results was tempered by practical considerations: in the 1960s, computer technology was not advanced or accessible enough for the system to be used on a larger scale.

In the 1980s, when microcomputers were introduced and interest in Project Essay Grade was renewed, a second product, the Writer's Workbench, was brought to market.⁵ The Writer's Workbench not only scored essays but also gave writers feedback on the quality of their writing. Although the technology of the time operated on a narrow definition of quality, recognizing only misspelled or misused words and identifying long or short sentences, the Writer's Workbench pointed the field in an important direction: a focus on feedback.

After 10 years of development, in 1998 Vantage Learning finally released the scoring engine IntelliMetric. It was known as the first holistic essay-scoring tool based on artificial intelligence (AI).^6,7 Using a blend of AI, natural language processing, and statistical technologies, IntelliMetric is a type of learning engine that internalizes the “pooled wisdom” of expert human raters.⁶ One of the best attributes of IntelliMetric™ is its capability of evaluating essay responses in multiple languages, including English, Spanish, Hebrew, Bahasa, Dutch, French, Portuguese, German, Italian, Arabic, and Japanese.⁶

Since then, with significant advances in computer technology, the potential of AWE has been widely explored in research informed by perspectives from teaching pedagogy, educational measurement, and cognitive science. In other words, what is considered most beneficial for students, models that reflect students' thought processes, psychometric evaluations of reliability and validity, and considerations about operational systems and their functionality have all contributed to the development and implementation of AWE systems. WritetoLearn™, Criterion, and My Access! are the three most popular AWE systems on the market; they were developed from the Intelligent Essay Assessor, E-rater, and IntelliMetric automated scoring engines, respectively. Many colleges and universities, high schools, and language testing organizations use AWE systems to grade student essays.^4,6,8–11

In China, a variety of AWE systems have also emerged in recent years, such as the EFL Essay Evaluator (EEE) developed by Liang Maocheng,^12,13 Writing Roadmap developed by CTB/McGraw-Hill, the Juku AWE system developed by Beijing University of Posts and Telecommunications, which is part of the English Writing Intelligent Tutoring Systems (EWITS) based on corpora and cloud computing, the Bingo English AWE system, E-rater™ (Electronic Essay Rater), and the ETS (Educational Testing Service) Criterion Online Writing Evaluation Service. The Juku correcting network, Writing Roadmap, and the Bingo English AWE system are the main ones used in China, with Juku the most popular among college English teachers and students.

Studies on AWE system

Studies on AWE systems mainly address the effectiveness of AWE, the differences between using AWE and the traditional approach, comparisons between AWE systems, and teachers' and students' attitudes toward AWE.

Elliot and Mikulas¹⁴ conducted a study among 709 11th-grade students using the AWE system MY Access! They claimed that students who used MY Access! improved more than students who did not. Warschauer and Ware¹⁵ pointed out that this study was methodologically weak because it lacked control groups and random assignment in both the pilot and the follow-up study. Warschauer and Grimes¹⁶ studied MY Access! in four secondary schools and found that both teachers and students had positive attitudes toward it in terms of increasing students' motivation, promoting autonomous student activity, and saving teachers' time. Grimes and Warschauer investigated teachers' and students' attitudes toward the use of MY Access! in middle school settings and found that, despite the negative views of many teachers and students about the reliability of the system, MY Access! provided benefits in classroom management and student motivation.¹⁷

Shermis et al. conducted a study using random assignment and found no significant differences between the experimental group and the control group.¹⁸ Attali¹⁹ gave a detailed account of how Criterion was used by 6th–12th graders throughout the United States during the 2002–2003 school year and, in particular, what kinds of changes occurred through revision of essays. Attali's study does not differentiate students by grade level, school, or language background, although the software easily allows these types of analysis to be carried out.

Chodorow et al.²⁰ reported two experiments that evaluated two systems, Criterion and ESL Assistant, for identifying and correcting writing errors, including errors in articles and prepositions. Studies on AWE attach most importance to whether AWE feedback can improve students' writing proficiency, how AWE feedback affects students' revisions, and how students react to the feedback they receive.

In China, with the growth of large-scale tests of various types and the increasing number of students, teachers bear a heavy burden in grading students' test papers, and the reliability of the scores is doubtful when teachers score subjectively. Automated grading systems are thus in great need. An early study on AWE was conducted by Liang Maocheng,²¹ who studied the application of AWE to the English compositions of Chinese students, while Li Yanan studied its use in Chinese language tests.²² Cao Yiwei and Yang Chen studied Chinese composition scoring by means of latent semantic analysis.²³

AWE systems have been widely used and discussed in language teaching and learning abroad, but in China their use started late: only a few studies have explored students' perceptions of AWE, and more studies are needed to compare AWE feedback with other sources of feedback.

Studies on feedback on students' English writing

Feedback plays an important part in encouraging and strengthening students' English writing. It was introduced to the language acquisition field as comments or information learners receive either from teachers or from other learners on the success of a learning task.²⁴ It has been a concern of various researchers for centuries.^25–30

Many studies explore different aspects of feedback in English writing, including its sources,^31–33 modes,^34–36 and focuses.^37,38 Some try to demonstrate the effectiveness of teacher feedback,^1,39–45 some explore peer feedback,^45–47 and some compare the effects of feedback on the writing process and on performance.^48–55 The subjects in these studies range from native learners to ESL and EFL learners.^56–61 The languages studied include L1 (native language), L2 (second language), and FL (foreign language).

In recent years in China, with the application of AWE systems such as Juku Pigai.org and the Bingo intelligence review system, some scholars have concentrated on the reliability and validity of these systems, and others on students' attitudes toward AWE. Studies have been conducted among high school students,⁶² English majors,^63–65 and non-English majors.^66–71 Few contrastive studies have examined the effectiveness of AWE feedback and teacher feedback, so their effects need further study.

Research Design

Research questions

Is AWE an effective platform to help students with their English writing?

What attitudes do teachers and students hold toward AWE?

Is there a significant difference between traditional feedback and the feedback given by the computer?

Subjects

One hundred fourteen engineering majors from two classes at Xi'an University in Shaanxi Province were chosen as participants for this study. All are sophomores, with 63 from class 1 and 51 from class 2. One class was randomly designated the control class, taught in the traditional way, and the other the experimental class, which received Juku AWE feedback. It should be made clear that the students were told nothing about the experiment, in order to protect the validity of the study.

Means and instruments

The study combines quantitative and qualitative methods. The quantitative method is the analysis of the students' test scores before and after the experiment. The qualitative study includes questionnaires and interviews for both teachers and students. SPSS is used to analyze the data from the pretest and post-test, and the independent-samples t test is used to see whether there is a significant difference between the two classes.
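The independent-samples t test named above can be sketched as follows. This is a minimal illustration of the pooled-variance form (equal variances assumed), not the SPSS procedure used in the study; the function name `independent_t` and the score lists are hypothetical and are not data from the experiment.

```python
import math
import statistics


def independent_t(sample_a, sample_b):
    """Pooled-variance independent-samples t test (equal variances assumed).

    Returns the t statistic and the degrees of freedom.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)

    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / se_diff, df


# Hypothetical essay scores for illustration only, NOT data from the study.
ec_scores = [13, 14, 12, 15, 13, 14]
cc_scores = [11, 10, 12, 11, 10, 12]

t_stat, df = independent_t(ec_scores, cc_scores)
print(f"t = {t_stat:.3f}, df = {df}")
```

The resulting t value is then compared with the critical value of Student's t distribution for the given degrees of freedom, which is the step a statistics package reports as the two-tailed significance.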

Steps in the research

Step 1. Pretest

At the beginning of the semester, both the experimental class and the control class (CC) were asked to write a short essay of about 150 words on a given topic within 30 minutes under the supervision of two teachers. The teachers scored all the papers and entered the scores into the computer for further analysis.

Step 2. Registration

First, students of the experimental class (EC) were asked to register on the Juku AWE platform. Then the teacher distributed the writing task and asked them to submit their compositions online as required. Students could revise and resubmit their compositions until they were satisfied with them and with the scores given by the computer. After that, the teacher's feedback was added to the online feedback. Finally, the teacher summarized the mistakes that students made most commonly in their compositions and kept the summary online for students to discuss. After the students had submitted more than 10 compositions, the teacher could set up “My Website,” from which new writing tasks could be assigned, well-written compositions recommended, and files related to writing uploaded. Students had the right to read and browse these materials.

Step 3. Experiment

Both the experimental class and the CC had the same writing instruction content and the same writing tasks.

The experiment lasted one semester. AWE-based teaching was applied in the EC. The teacher assigned writing tasks on AWE, and the students submitted their compositions online within a limit of 1 or 2 weeks. Students could revise their compositions without limit until they were satisfied. The teacher then read the students' compositions online and added feedback at an appropriate time. Finally, the teacher gave general comments on all the compositions, summarized the students' common mistakes, and discussed them in class.

The traditional way of teaching was applied in the control class. The teacher gave writing instructions, and the students wrote their compositions independently after class. The teacher then graded the compositions and gave feedback and suggestions to the students. Because of limits on the teacher's time and energy, the control class was assigned two fewer compositions.

Step 4. Post-test

At the end of the semester, after the experiment, both the experimental class and the control class were again asked to write a short essay of about 150 words on a given topic under the supervision of two teachers. The teachers scored all the papers and recorded the data in the computer for further analysis.

Step 5. Questionnaires and interview

The author designed questionnaires for both teachers and students to learn about the effect of using AWE in writing, teaching, and learning, their attitudes toward AWE, and related matters. Thirty teachers and 200 students at the same university who had used Juku AWE answered the questionnaires. The author also sought to identify problems with the system itself and practical problems in teaching and learning.

Two hundred questionnaires were distributed to students; 198 were collected, and 2 were invalid. Thirty questionnaires were distributed to teachers, and all 30 were collected.

To learn more about the opinions of students and teachers, the author interviewed some teachers and students to gain an overview of their thinking.

Data Analysis and Discussion

The analysis of pretest scores of the students

The pretest was conducted before the experiment to gauge the proficiency of the two classes. Table 1 shows that the mean score of EC is 9.7936, only slightly higher than that of CC (9.7751); the disparity of 0.0185 is not significant. Moreover, the standard deviation in EC (0.98734) is almost the same as in CC (0.99786). These two statistics suggest that the average writing proficiency of the two classes is at almost the same level, and the independent samples T-test (Table 2) confirms this.

Table 1.

Statistics of pretest in experimental class and control class

	Class	No.	Mean	SD	SE mean
Pretest score	EC	63	9.7936	0.98734	0.12527
Pretest score	CC	51	9.7751	0.99786	0.13960

CC, control class; EC, experimental class; SD, standard deviation; SE, standard error.

Table 2.

Independent samples T-test

Pretest	Levene's test for equality of variances		T-test for equality of means
	F	Significance	t	df	Significance (two-tailed)	Mean difference	SE difference	95% CI lower	95% CI upper
Equal variances assumed	0.097	0.755	0.260	112	0.785	0.04455	0.18668	−0.32111	0.41654
Equal variances not assumed			0.260	106.641	0.796	0.04455	0.18689	−0.32174	0.41805

df, degree of freedom; 95% CI, 95% confidence interval of the difference.

According to Table 2, the significance of Levene's test for equality of variances is 0.755 (>0.05), which indicates that the variances of the pretest scores of the two classes do not differ significantly. Furthermore, the mean difference is merely 0.04455, and the two-tailed significance is 0.785 (>0.05), which indicates that the mean scores of EC and CC do not differ significantly. In addition, the 95% confidence interval of the difference, from −0.32111 to 0.41654, includes 0, which again indicates no significant difference between the two classes. In conclusion, the students in EC and CC had nearly the same writing level before the study, which supports the validity and reliability of the experiment from the outset.

The analysis of the post-test scores of the students

The post-test was conducted at the end of the experiment. The scores were entered into SPSS 19.0, and the results are presented in Tables 3 and 4.

Table 3.

Statistics of post-test in experimental class and control class

	Class	N	Mean	SD	SE Mean
Post-test score	EC	63	13.5556	1.10591	0.13344
Post-test score	CC	51	10.9804	1.02937	0.14414

Table 4.

Independent samples T-test

Post-test	Levene's test for equality of variances		T-test for equality of means
	F	Significance	t	df	Significance (two-tailed)	Mean difference	SE difference	95% CI lower	95% CI upper
Equal variances assumed	0.329	0.567	7.995	112	0.000	1.57516	0.19702	1.18478	1.96554
Equal variances not assumed			8.019	108.287	0.000	1.57516	0.19643	1.18582	1.96451

Table 3 shows that the mean score in EC is 13.5556, about 2.5752 points higher than in CC (10.9804), a clear difference between the two classes. Compared with the pretest, the mean score of the EC rose from 9.7936 to 13.5556, a gain of 3.7620, while the mean score of the CC rose only from 9.7751 to 10.9804. The standard deviation in EC (1.10591) is slightly higher than in CC (1.02937). Together these figures indicate that students in the experimental class, who received both Juku feedback and teacher feedback, made greater progress in English writing than students in the CC, who received only teacher feedback through the traditional way of teaching.

Table 4 shows that the two-tailed significance (0.000) is below the 0.05 level, which means that the mean scores of EC and CC differ significantly. Moreover, the 95% confidence interval of the difference, from 1.18478 to 1.96554, does not contain 0, which further confirms that EC and CC differ significantly in writing competence after the experiment.
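The relation between the t value and the confidence interval in Table 4 can be illustrated with a short calculation. The sketch below reads the two values in the "equal variances assumed" row, 1.57516 and 0.19702, as the mean difference and its standard error (following the layout of Table 2); the critical value `t_crit` is an assumed approximation of the two-tailed 5% point of Student's t for 112 degrees of freedom and is not stated in the paper.

```python
# Values as reported in Table 4 (equal variances assumed).
mean_diff = 1.57516  # read here as the mean difference between EC and CC
se_diff = 0.19702    # read here as the standard error of that difference
df = 112             # degrees of freedom: 63 + 51 - 2

# ASSUMPTION: approximate two-tailed 5% critical value of Student's t
# for df = 112; this number does not appear in the paper.
t_crit = 1.9814

# t statistic: the difference expressed in units of its standard error.
t_stat = mean_diff / se_diff

# 95% confidence interval: difference plus or minus t_crit standard errors.
ci_lower = mean_diff - t_crit * se_diff
ci_upper = mean_diff + t_crit * se_diff

print(f"t = {t_stat:.3f}")
print(f"95% CI = ({ci_lower:.5f}, {ci_upper:.5f})")
```

Under these assumptions the calculation agrees, to the precision printed, with the t value and the interval bounds reported in Table 4. Because the whole interval lies above 0, the difference is significant at the 0.05 level.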

The analysis of the data from the questionnaires and the interview

The questionnaire and interviews for students mainly cover four aspects: their interest in using AWE, the helpfulness of AWE, their frequency of using AWE, and their preference for feedback given by the computer, by the teacher, or by both. The students answered the questionnaire, and the author later interviewed dozens of students in person to learn their real thoughts.

Table 5 shows that most of the students (81.5%) use AWE three to five times a month, while only 10% use it once or twice and 8.5% more than five times. This indicates that students are interested in using AWE and use it roughly once a week.

Table 5.

Students' frequency of using automated writing evaluation in a month

Frequency (times)	0	1–2	3–5	>5
No. (%)	0	20 (10)	163 (81.5)	17 (8.5)

As for the helpfulness of AWE, 65% of the students (Table 6) think the AWE system is very helpful for their English writing. Every time they submit a composition, they immediately receive feedback with correction suggestions. Gradually they become aware of their mistakes and try to avoid them next time.

Table 6.

Students' perception of the helpfulness of using automated writing evaluation

Item	Very helpful	Helpful	Neutral	Not very helpful	Never
No. (%)	130 (65)	40 (20)	10 (5)	0	0

Table 7 shows that students rate the effectiveness of Juku AWE fairly positively in terms of improvement in grammar, spelling and collocation, organization, revision, and content. In particular, half of the students agree that their grammar has improved (20% strongly agree and 30% agree), while only 10% report no improvement in grammar. The author also learned from the investigation that students thought the Juku AWE system helped only a little with the organization and content of their essays. Ninety percent of the students interviewed told the author that their learning efficiency improved when teachers used AWE. This suggests that, without the teacher's supervision and with delayed assessment, most students have little awareness of autonomous learning.

Table 7.

Students' attitude toward the effectiveness of Juku automated writing evaluation

Choice item	Strongly agree, N (%)	Agree, N (%)	Neutral, N (%)	Disagree, N (%)	Strongly disagree, N (%)
Improvement in grammar	40 (20)	60 (30)	80 (40)	15 (7.5)	5 (2.5)
Improvement in spelling and collocation	24 (12)	52 (26)	84 (42)	29 (14.5)	11 (5.5)
Improvement in organization	28 (14)	56 (28)	71 (35.5)	30 (15)	15 (7.5)
Improvement in revision and writing	28 (14)	55 (27.5)	75 (37.5)	25 (12.5)	17 (8.5)
Improvement in content	25 (12.5)	60 (30)	71 (35.5)	30 (15)	14 (7)
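
The combined "strongly agree" plus "agree" share for each item in Table 7 can be recomputed from the response counts. The short sketch below only restates the table arithmetic; the dictionary `table7` and the helper `agreement_share` are illustrative names, not part of the study's instruments.

```python
# Response counts from Table 7, ordered as:
# strongly agree, agree, neutral, disagree, strongly disagree.
table7 = {
    "grammar": (40, 60, 80, 15, 5),
    "spelling and collocation": (24, 52, 84, 29, 11),
    "organization": (28, 56, 71, 30, 15),
    "revision and writing": (28, 55, 75, 25, 17),
    "content": (25, 60, 71, 30, 14),
}


def agreement_share(counts):
    """Percentage of respondents choosing 'strongly agree' or 'agree'."""
    total = sum(counts)
    return 100 * (counts[0] + counts[1]) / total


shares = {item: agreement_share(counts) for item, counts in table7.items()}
for item, share in shares.items():
    print(f"{item}: {share:.1f}% agree or strongly agree")
```

For grammar this gives 50%, matching the "half of the students" reported in the text; the other four items fall between 38% and 42.5%.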

Among the three feedback options, 60% of the students (Table 8) prefer to have both AWE feedback and teacher feedback. In the interviews they explained that AWE feedback was not accurate enough, so they still wanted feedback from their teachers to know how well they were doing in English writing and how much progress they had made. All the students interviewed said that, although they could get feedback from AWE, they still needed the teacher's feedback because it had an emotional dimension. This was beyond the author's expectation.

Table 8.

Students' preference of the feedback

Item	AWE feedback, N (%)	Teacher's feedback, N (%)	Both AWE and teacher's feedback, N (%)
	22 (11)	58 (29)	120 (60)

AWE, automated writing evaluation.

Table 9 shows that the students are largely satisfied with AWE: leaving aside “neutral,” an average of 25.1% chose “strongly agree” and an average of 39.8% chose “agree” across the five items. Satisfaction is highest for the evaluation of sentences and for the synthetic analysis, where “strongly agree” and “agree” together account for 91.5% and 80% of the students' choices, respectively. The interviews support this: almost all the students interviewed told the author that they paid close attention to the sentence evaluation because sentence-level errors were the most frequent mistakes in their writing. The author also finds in the investigation that the students have little knowledge of the functions of AWE, and that some students have difficulty using AWE because they have no access to a computer.

Table 9.

Students' satisfaction with evaluation of automated writing evaluation

Choice item	Strongly agree, N (%)	Agree, N (%)	Neutral, N (%)	Disagree, N (%)	Strongly disagree, N (%)
Spelling and collocation	30 (15)	67 (33.5)	86 (43)	17 (8.5)	0
Sentence	85 (42.5)	98 (49)	14 (7)	3 (1.5)	0
Grammar	55 (27.5)	89 (44.5)	53 (26.5)	3 (1.5)	0
Structure and organization	20 (10)	45 (22.5)	40 (20)	50 (25)	45 (22.5)
Synthetic analysis	61 (30.5)	99 (49.5)	30 (15)	15 (7.5)	5 (2.5)

The teachers' questionnaire and interviews cover their interest in using AWE, the helpfulness of AWE, their frequency of using AWE, the effect of using AWE, and their preferences in giving feedback to students.

In this study, Table 10 shows that teachers have considerable interest in using AWE, as no one disapproves of using it. Teachers interviewed also said that using AWE reduced their load of evaluating students' essays and saved them time to some extent. They could use AWE on their own initiative, found that it met the needs of their work, and found it easy to use: it required little technical skill and did not increase their workload. The digital evaluation clearly improves their working efficiency. However, teachers differ in their attitudes toward the use of AWE: 53.3% approve of it, while 46.7% remain neutral or doubtful, and 23.3% are neutral about its helpfulness.

Table 10.

Teacher's perception of automated writing evaluation

Choice items	Strongly agree, N (%)	Agree, N (%)	Neutral, N (%)
Interest in AWE	11 (36.7)	6 (20)	13 (43.3)
Helpfulness of AWE	14 (46.7)	9 (30)	7 (23.3)
Approval of using AWE	10 (33.3)	6 (20)	14 (46.7)

Like the students, most of the teachers (76.7%) use AWE three to five times a month, and 10% use it more than five times (Table 11). They integrate this digital grading system into their daily work: 73.3% use it for scoring, 83.3% for giving feedback on test papers, 56.7% for quality analysis, 33.3% for obtaining test sources, 23.3% for quality tests, and 83.3% for interaction with students (Table 12). The figures show that information technology is well integrated into English teachers' classroom teaching.

Table 11.

Frequency of teachers' use of automated writing evaluation in a month

Frequency (times)	0	1–2	3–5	>5
No. = 30 (%)	0	4 (13.3)	23 (76.7)	3 (10)

Table 12.

Teachers' use of automated writing evaluation

Choice item	N = 30 (%)
Scoring students' essay	22 (73.3)
Giving feedback of test paper	25 (83.3)
Doing quality analysis	17 (56.7)
Getting test source	10 (33.3)
Doing quality test	7 (23.3)
Interaction with students	23 (83.3)

It is interesting to find in Table 13 that 30% of the teachers prefer AWE feedback and another 30% prefer teacher feedback, while 40% prefer both. The interviews explain this further and confirm the earlier figures showing that some teachers are skeptical of the system. Although half of the teachers think that AWE improves teaching quality, quite a few hold that it is of little help to students, which points to the lack of comparative studies on the use of AWE.

Table 13.

Teachers' preference of giving students' feedback

Item	AWE feedback, N (%)	Teacher's feedback, N (%)	Both AWE and teacher's feedback, N (%)
	9 (30)	9 (30)	12 (40)

Conclusion

The freshmen are young and curious about everything in college. When they were introduced to the Pigai.org system, they were very excited and eager to try submitting their compositions online. Many students enjoyed seeing their progress with each submission, and many submitted their compositions again and again. Pigai.org inspired the students' enthusiasm for writing and increased the amount of writing practice they did. With so much writing practice and the system's assessment, the students' English writing improves greatly. Thus, the correcting network can effectively help students improve their English writing. Compared with traditional teacher marking and feedback, it is immediate, clear, and time-saving.

Teachers believe that the AWE system is a convenient way to prevent students from plagiarizing one another or copying samples found elsewhere on the Internet. Moreover, teachers can add their own feedback to the system's feedback and can use their personal websites to upload learning materials and assign tasks, strengthening their interaction with students after class.

Of course, the intelligent correction system for English composition still has many aspects to improve. It is only a tool to assist teaching, and teachers' guidance should be combined with online learning. In addition, the AWE system can comment only on grammar errors and basic word collocations; it cannot meet the requirements for evaluating text structure, content logic, and coherence. The writing scores should therefore be treated objectively. The intelligent correction system not only improves teachers' working efficiency and gives them more time for classroom teaching and research but, more importantly, creates opportunities for students to write as much as possible and improve their English writing ability. Finally, as the conditions for using the Internet and computers change, and as the ways teachers score compositions and students write them change with time, college English teachers should look for good ways to mark essays and give effective feedback, so that teachers are freed from the heavy work of marking while students receive effective feedback and make rapid progress in English composition.

Acknowledgments

The thesis is a product of the study of English Writing open course online supported by Shaanxi Provincial Department of Education (program no.: JSMK1723) and of the Educational Project of Shaanxi Provincial “13th Five-Year Plan”: Research and Practice on Cultivating Innovative Foreign Language Talents in Shaanxi Universities Under the Context of “One Belt and One Road” Strategy (Project No: SGH17H231).

Author Disclosure Statement

No competing financial interests exist.

References

1. Arndt. Response to writing: Using feedback to inform the writing process. In: Brook, Walters (Eds.): Teaching composition around the pacific rim: Politics and pedagogy. Clevedon: Multilingual Matters, 1993. pp. 90–116.

2. Yang. Immediate feedback. Electr Teach Foreign Lang. 1981:14–17.

3. Shermis, Burstein. Automated essay scoring: A cross-disciplinary perspective. Mahwah, NJ: Lawrence Erlbaum Associates, Inc. 2003. pp. 43–54.

4. Page EB. Project essay grade: PEG. In: Shermis, Burstein (Eds.): Automated Essay Scoring: A Cross-Disciplinary Perspective. Mahwah, NJ: Lawrence Erlbaum Associates, Inc., 2003. pp. 43–54.

5. MacDonald, Frase, Gingrich, Keenan. The Writer's workbench: Computer aids for text analysis. IEEE Trans Commun. 1982; 30:105–110.

6. Elliot. IntelliMetric: From here to validity. In: Shermis, Burstein (Eds.): Automated Essay Scoring: A Cross Disciplinary Perspective. Mahwah, New Jersey: Lawrence Erlbaum Associates, Inc., 2003. pp. 71–86.

7. Shermis, Barrera. Exit assessments: Evaluating writing ability through Automated Essay Scoring. Paper presented at the Annual Meeting of the American Educational Research Association, New Orleans, LA, 2002.

8. Burstein. The E-rater Scoring Engine: automated essay scoring with natural language processing. In: Shermis, Burstein (Eds.): Automated Scoring: A Cross-Disciplinary Perspective. Hillsdale, NJ: Lawrence Erlbaum Associate, Inc., 2003. pp. 113–121.

9. Landauer, Laham, Foltz. Automated scoring and automation of essays with the intelligent essay assessor. In: Shermis, Burstein (Eds.): Automated Scoring: A Cross-Disciplinary Perspective. Hillsdale, NJ: Lawrence Erlbaum Associate, Inc., 2003. pp. 87–112.

10. Larkey, Croft. A text categorization approach to automated essay scoring. In: Shermis, Burstein (Eds.): Automated Scoring: A Cross-Disciplinary Perspective. Hillsdale, NJ: Lawrence Erlbaum Associate, Inc., 2003. pp. 55–70.

11. Dikli. Automated essay scoring. Turkish Online J Dist Educ. 2006; 7:735–738.

12. Liang MC. Construction of automatic score model for Chinese students' English writing. Nanjing: Doctoral Thesis of Nanking University, 2005.

13. Liang MC. Development of automatic score system for English writing in mass examinations. Beijing: Higher Education Press, 2011.

14. Elliot, Mikulas. The impact of MY Access! use on student writing performance: A technology overview and four studies. Paper Presented at the Annual Meeting of the American Educational Research Association, San Diego, CA, 2004.

15.

Warshauer

, Ware

. Automated writing evaluation: Defining the classroom research agenda. Lang Teach Res. 2006; 10:1–24.

16.

Warshauer

, Grimes

Automated writing assessment in the classroom. Pedagogies, 2008:22–36.

17.

Grimes

, Warschauor

. Utility in a fallible tool: A multi-site case study of automated writing evaluation. J Technol Lang Assess. 2010; 8:1–43.

18.

Shermis

, Burstein

, Bliss

The impact of automated essay scoring on highstakes writing assessments. Paper Presented at the Annual Meeting of the National Council on Measurement in Education, San Diego, 2004.

19.

Attali

Exploring the feedback and revision features of criterion. Paper Presented at the Annual Meeting of the National Council on Measurement in Education, San Diego, CA, 2004.

20.

Chodorow

, Gamon

, Tetreault

. The utility of article and preposition error correction systems for English language learners: Feedback and assessment. Lang Test. 2010; 27:419–436.

21.

Cao

, Yang

. Automated Chinese essay scoring with latent semantic analysis. Exam Res. 2007; 3:63–71.

22.

Automated essay scoring for testing Chinese as a second language. Beijing: PhD Thesis of Beijing Language University, 2006.

23.

Liang

MC.

Constructing a model for the computer-assisted scoring of Chinese EFL learners argumentative essays. Nanjing: PhD Thesis of Nanjing University, 2005.

24.

Richards

et al. Longman Dictionary of language teaching and applied linguistics. Beijing: Foreign Language Teaching and Research Press, 2000.

25.

Hyland

Second language writing. 8th ed. Cambridge: Cambridge University Press, 2010.

26.

Brookhart

SM.

How to give effective feedback to your students. Alexandria, VA: ASCD. 2008.

27.

Kroll

, ed. Exploring the dynamics of second language writing. Cambridge, UK, Cambridge University Press, 2003.

28.

Ferris

, Hedgcock

. Teaching ESL composition. Purpose, process, and practice. New York: Routledge. 2004.

29.

Reid

JM.

Understanding learning style in the second language classroom. Englewood Cliffs, NJ: Prentice Hall, 1998.

30.

Leki

. The preferences of ESL students for error correction in college-level writing classes. Foreign Lang Ann. 1991:203–217.

31.

Kulhavy

, Yekovich

, Dyer

. Feedback and contest review in programmed instruction. Contemp Educ Psychol. 1979:91–98.

32.

Hendrickson

JM.

Error correction in foreign language teaching: Recent theory, research, and practice. In: Croft

(Ed.): Readings on English as a second language (2nd ed.). Cambridge, MA: Winthrop Publishers. 1980.

33.

Omaggio Hadley

Teaching language in context. Boston: Heinle & Heinle. 1993.

34.

Williams

Undergraduate second language writers in the writing center. J Basic Writ. 2002; 21:73–91.

35.

Williams

Tutoring and revision: Second language writers in the writing center. J Second Lang Writ. 2004; 13:173–201.

36.

Goldstein

, Conrad

. Student input and negotiation of meaning in ESL writing conferences. TESOL Quart. 1990; 24:443–460.

37.

Hedgcock

, Lefkowitz

. Collaborative oral/aural revision in foreign language writing. J Second Lang Writ. 1992; 1:255–276.

38.

Kepner

. An experiment in the relationship of types of written feedback to the development of second-language writing skills. Modern Lang J. 1991:303–313.

39.

Chaudron

Evaluating writing: Effects of feedback on revision. RELC J. 1984; 15:1–14.

40.

Mitan

The peer review process: Harnessing students' communicative power. In: Johnson

, Roen

(Eds.): Richness in writing: Empowering ESL students. New York: Longman, 1989. pp. 207–219.

41.

Keh

. Feedback in the writing process: A model and methods for implementation. ELT J. 1990:294–304.

42.

Allison

, Ng

Developing text revision abilities Hong Kong: Institute of language in Education. 1992. pp. 106–130.

43.

Wang

Feedback and English writing. Shandong: Shandong University Press, 2007.

44.

Berg

EC.

Preparing ESL students for peer response. TESOL J. 1999; 8:20–25.

45.

Wang

. Could students master the techniques of mutual correction?. J Foreign Lang Teach Abroad. 2004:54–56.

46.

Hyland

ESL Writers and feedback: Giving more autonomy to students. Langu Teach Res. 2000; 4:33–54.

47.

McGroarty

, Zhu

. Triangulation in classroom research: A study of peer revisions. Lang Learn, 1997; 47:1–43.

48.

Connor

, Asenavage

. Peer response groups in ESL writing classes: How much impact on revision?. J Second Lang Writ. 1994; 3:257–276.

49.

Dheram

PK.

Feedback as a two-bullock cart: A case study of teaching writing. ELT J. 1995; 49:166.

50.

Hall

Managing the complexity of revising across languages. TESOL Quart. 1990; 24:43–60.

51.

Hyland

The impact of teacher-written feedback on individual writers. J Second Lang Writ. 1998; 7:255–286.

52.

Ferris

, Roberts

. Error feedback in L2 writing classes: How explicit does it need to be?. J Second Lang Writ. 2001; 10:161–184.

53.

Yates

, Kenkel

. Responding to sentence-level errors in writing. J Second Lang Writ. 2002; 11:29–47.

54.

Chandler

The efficacy of various kinds of error correction for improvement in the accuracy and fluency of L2 student writing. J Second Lang Writ. 12, 2003, pp. 267–296.

55.

Truscott

. The Effect of Error Correction on Learner's ability to write accurately. J Second Lang Writ. 2007:255–272.

56.

Anandam

AK.

Computer-based feedback on writing. Comput Read Lang Arts. 1983; 1:30–34.

57.

Zamel

. Responding to student writing. TESOL Quart. 1985:79–102.

58.

Chen

. Assessing and scoring of college English writing practice. Foreign Lang World. 1994:43–46.

59.

Ferris

Teaching ESL composition students to become independent self-editors. TESOL J. 1995; 4:18–22.

60.

Ferris

The influence of teachers commentary on students revision. TESOL Quart. 1997; 315–339.

61.

. The effect of feedback in English writing—Research on writing of english argumentative. J Foreign Lang Teach Abroad. 2004:47–53.

62.

Wang

A comparative study of peer feedback and teacher feedback on English writing in rural middle schools in west part of China. J Nanchang Educ Coll. 2016; 122–125.

63.

Wei

An empirical study on the influence of feedback mode on college students' English writing. J Tianjin Foreign Stud Univ. 2015; 22:43–50.

64.

A comparative study of feedback in English writing based on dynamic evaluation theory]. Foreign Lang World. 2015; 168:59–67.

65.

Liu

YH.

A study on the effect of feedback from writing team mates and the teacher of English majors. Foreign Lang World. 2015; 166:48–55.

66.

Cai

. A comparative study of online peer feedback and teacher feedback for college students in China. Foreign Lang World. 2011; 143, 65–72.

67.

Zhao

. On students' views on peer feedback and teacher feedback in writing. English Teacher. 2012:17–21.

68.

Dou

A comparative study of teacher feedback and peer feedback in English writing teaching. J Jiangsu Inst Educ. 2013; 29:126–129.

69.

, Sun

, Wang

. A comparative study of teacher feedback and peer feedback in English writing teaching. J Jiamusi Educ Inst. 2014; 138:354–357.

70.

Song

, Liu

. Study on the impact of the effect of feedback on college students' writing: Rational combination of teacher feedback and peer feedback. J Shaanxi Educ. 2015; 35–37.