Abstract
This paper investigates the use of lexical bundles in Chinese students’ academic writing across different levels of studies at an English medium university. Frequency-based bundles were retrieved from a corpus of student academic texts written at four points in time between Year 1 and Year 4, and the structures and functions of the bundles were analysed to reveal the developmental patterns in bundle use. The analysis shows that, overall, students used more types of bundles when they progressed to higher levels of study, but there were differences in the use of preferred structural forms. The analysis also suggests a developmental order of bundle use in L2 students’ academic writing: clausal and nominal bundles appeared to be acquired prior to prepositional bundles. In terms of functional distributions, discourse organizing bundles were much more prevalent than stance and referential expressions in student writing at all levels of studies, particularly in final year project dissertations. The findings indicate a linkage between levels of academic studies and the patterns of bundle use in student writing, as well as the effects of EAP instruction on the learning of lexical bundles by Chinese student writers at an English medium university in China.
Introduction
A typical feature of the current internationalization of higher education in China is the adoption of English as the Medium of Instruction (EMI), through which students need to grasp subject contents and develop disciplinary competence for the purposes of academic and professional communication in international contexts. Learning to write academic texts in English is thus becoming an increasingly important issue for research on second language (L2) writing as well as curriculum design of English for Academic Purposes. Approaches to writing in the disciplines, which recognize writing as a tool for learning both subject contents and disciplinary conventions (Hyland, 2013), pose new challenges to Chinese university students whose perceptions of learning English are characteristically instrumental, and are often associated with developing English language fluency. Recent research on phraseology in applied linguistics has shown that effective use of recurrent multi-word expressions or lexical bundles is central to the building of written and spoken academic discourse (e.g. Biber et al., 2004; Hyland, 2008; Wray, 2008). However, there are few studies that examine developmental patterns of such formulaic language in English academic writing by second language learners in an EMI context. This research investigates the use of lexical bundles in Chinese undergraduate academic writing at an English medium university in mainland China, with the aim of unveiling the patterns of phraseological development in English academic writing over the course of students’ degree programmes, from lower to higher levels.
Literature Review
A large body of research has been conducted on formulaic language, which is referred to in the literature as phrasal expressions (Martinez and Schmitt, 2012), formulaic sequences (Schmitt, 2004; Wray, 2008), recurrent word combinations (De Cock, 1998), or lexical bundles (Biber et al., 1999; Biber and Barbieri, 2007; Biber et al., 2004; Chen and Baker, 2010; Cortes, 2004; Hyland, 2008). A defining characteristic of lexical bundles is that they are the most frequently occurring sequences of words that can be examined in the corpus of written or spoken language. Their occurrence is pervasive, their identification is frequency-driven, and they can be retrieved automatically. Another characteristic of lexical bundles, which distinguishes them semantically from idiomatic expressions, is that bundles (e.g. if you look at, one of the most) are transparent in meaning from the individual words, revealing the aspect of the degree of formulaicity rather than idiomaticity in language use. A further characteristic is that despite being formally regular, lexical bundles (e.g. it should be noted, as a result of) are often structurally incomplete (Biber et al., 1999). Biber and Barbieri (2007) noted that while most lexical bundles act as linkage for two structural units, usually bridging two clauses in speech or two phrases in writing, only a small portion of bundles in both speech and writing represents complete structural units.
An important line of research on lexical bundles was to explore variations in the use across academic registers in the university context. Biber, Conrad and Cortes (2004) described the lexical bundles in classroom teaching and textbooks, following a three-category functional framework of stance expressions, discourse organizers, and referential expressions. The analysis showed distinctive patterns of lexical bundles in classroom teaching in comparison to textbooks, academic writing and conversation. This research was further extended to the investigation of the use of lexical bundles in a wider range of spoken and written university registers (Biber and Barbieri, 2007), and it was found that lexical bundles in spoken university registers are fundamentally different from those in written registers. In a study on the distributional patterns of lexical bundles in the discourse structure of university class sessions, Csomay (2013) revealed a strong relationship between the lexical bundle functions and the communicative functions of three macro-phases of the structure of classroom discourse.
Another line of research has been concerned with the investigation of disciplinary variations in the use of lexical bundles (e.g. Durrant, 2015; Hyland, 2008). Hyland (2008) explored the forms, structures and functions of four-word bundles in a corpus of research articles and Doctoral and Master’s dissertations in four disciplines, and showed the centrality of bundles in creating academic discourse as well as in differentiating written texts by discipline. A recent study by Durrant (2015) described disciplinary variation in university students’ writing that emerged from the identification of bundles that are distinctive of hard (science/technology) and soft (humanities/social sciences) disciplines.
Recent research has also explored the use of lexical bundles in English academic writing by second language learners, compared with native speakers’ writing and published academic texts (Ädel and Erman, 2012; Chen and Baker, 2010, 2014; Staples et al., 2013; Wei and Li, 2011). Staples et al. (2013) examined the use of lexical bundles in written responses across three proficiency levels in the TOEFL iBT, which was a controlled writing environment. The study found that lower-level test takers used more bundles overall, but also more bundles from the test prompts. In terms of function, stance and discourse organizing bundles were used similarly across proficiency levels, but very few referential bundles were used by L2 writers regardless of proficiency levels. Chen and Baker (2010) investigated the structures and functions of lexical bundles in L1 and L2 academic writing, drawing on a corpus of published academic texts and two corpora of student academic writing (one L1, the other L2 Chinese student writing). The widest range of lexical bundles was identified in published academic writing and the smallest range in L2 Chinese student writing. In addition, certain bundles (e.g. all over the world) which were rarely used by native academics were overused by L2 Chinese student writers, whereas some expressions that are typical in academic prose were underused. In another study on Chinese L2 writing, Chen and Baker (2014: 28) found that lexical bundles in lower-level writing share more features with conversation, as they are ‘more verb-heavy’, ‘more personally involved’ and ‘rely more on colloquial quantifiers’, whereas more proficient L2 writing shows a more impersonal tone with greater use of nominal elements in lexical bundles. Wei and Li (2011) investigated the use of four-word lexical bundles in the Doctoral dissertations by advanced Chinese EFL learners, compared with published journal articles by professional writers.
Ädel and Erman (2012) also explored English lexical bundles in advanced L2 learner writing in comparison with native-speaker writing. Current corpus-based research on university student writing has suggested that a good command of lexical bundles is an important attribute of L2 students’ academic writing competence, providing evidence that lexical bundles are the important building blocks of effective academic discourse (Biber et al., 2004; Hyland, 2008).
Despite the abundance of studies on lexical bundles in academic writing, there is still little research exploring L2 learner development in the use of bundles in English academic writing across different levels of studies. This line of research is of great significance to the teaching of English academic writing to L2 learners, as it has the potential to reveal how novice L2 writers learn to use formulaic language as a resource for composing their texts to demonstrate their understanding of academic conventions and disciplinary knowledge. Although research so far has suggested a developmental sequence for some aspects of formulaic language use by L2 writers (Chen and Baker, 2010, 2014; Staples et al., 2013), it is imperative to deepen our understanding of the patterns of development from apprentice to more mature writing in a common, self-regulated L2 academic writing context. Drawing upon a corpus of student written assignments, the current study aims to fill the gap by investigating the use of English lexical bundles in Chinese undergraduate academic writing across different levels of studies at a Sino-UK partnership university, where English is used as the medium of instruction and writing as the main means of assessing student academic attainments. The main focus of the study is the comparison of the features of lexical bundle use in Chinese student writing upon their entry to the English medium university and at the completion of their degree studies at the university.
The Corpus Material
The data for this study consist of a corpus of Chinese undergraduate written assignments across different years of their studies at a Sino-UK English medium university. The university implemented an intensive English for Academic Purposes course in Year 1 and a discipline-related English for Specific Purposes course in Year 2, with the aim of developing students’ academic literacy and preparing them for the study of subject courses that were taught in English from the second year. The collected texts were written at four different points during the students’ four years of study. These four points were:
First English essays written for the EAP class at the beginning of Year 1
Final coursework submitted for the EAP class at the end of Year 1
Final coursework written for the ESP class at the end of Year 2
Final Year Project dissertations submitted for the completion of studies in Year 4
The EAP course in Year 1 focused on developing generic academic English writing skills while the goal of the ESP course in Year 2 was to familiarize students with the different kinds of writing genres they were likely to encounter in their disciplinary studies. There was no requirement in their writing of subject knowledge, though writing topics were broadly relevant to their studies of disciplinary areas. The Year 4 dissertations were gathered from students studying joint programmes in BA English and international business/finance, and their topics were subject-specific, mainly relating to English language and international business. A total of 777 texts were collected from students of different years at the EMI university. As the focus of the study was on the patterns of bundle use in L2 academic writing at the different university levels, we did not take into account some other potential variables such as gender and proficiency differences in constructing the corpus. Table 1 gives an overview of the data used for building the corpus of students’ academic writing.
Table 1. Composition of Chinese Undergraduate Academic Writing Corpus.
Lexical Bundle Identification
A number of operational issues were considered in generating a frequency-driven list of lexical bundles from the corpus. Though high frequency is a defining feature of lexical bundles, decisions on the frequency cut-off used for bundle identification have varied among researchers. For written corpora, the setting of a minimum frequency ranges from 40 times per million words (e.g. Biber and Barbieri, 2007), to 25 times (e.g. Chen and Baker, 2010), and further down to 20 times (e.g. Hyland, 2008). In addition, distribution thresholds of the occurrence of a bundle in a minimum of 10% of texts, or 3–5 texts, depending on the size of corpora (e.g. Hyland, 2008; Chen and Baker, 2010), are often required to avoid idiosyncrasies by individual writers. Chen and Baker (2014: 6) used a ‘dynamic’ frequency cut-off threshold to accommodate variations in the sizes of three sub-corpora, in order to generate an ‘optimum’ number of bundles in each sub-corpus whose size was less than 10,000 words. In the present study, the four sub-corpora differed considerably in size, as well as in the number of texts and the average text length. Though these factors influenced the building of the corpus, the nature of the written assignments required at different academic levels determined the length of texts and resulted in the variations in the size and text number of the sub-corpora. Chinese students are not asked to write a 1000-word English essay upon their entry into an EMI university right after graduating from high school. On the other hand, a final year dissertation is a substantial piece of work for completing their degree studies, and so has to meet the normal requirements on text length. With these variations in mind, the study set a frequency cut-off at 30 times per million words across the sub-corpora regardless of their differences in size.
An additional restriction on distribution was imposed to adjust for variations in text number as well as average text length of the sub-corpora. In their studies of bundle use in Chinese L2 writing, Chen and Baker (2010, 2014) adopted a cut-off point of the occurrence of a bundle in at least three texts in each sub-corpus to accommodate variations in the sizes of the sub-corpora that were explored. For the purpose of comparability, this study set a distribution threshold of a raw occurrence in at least 5% of texts, which means a minimum of 11 texts of Year 1 first essays, 12 texts of Years 1 and 2 final coursework, and three texts of Year 4 FYP dissertations. These decisions, though somewhat arbitrary, were made after repeated experiments with the data, and were necessary for generating an optimal number of bundles with sufficient representativeness across the sub-corpora.
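The two thresholds described above can be made concrete with a small sketch. It converts a normalized cut-off (per million words) into a raw count for a sub-corpus of a given size, and a percentage-of-texts cut-off into a minimum number of texts. The function names, the sub-corpus sizes in the usage note, and the choice to round up are illustrative assumptions; the paper does not state how fractional thresholds were rounded.

```python
def ceil_div(a, b):
    """Integer ceiling division (avoids floating-point rounding)."""
    return -(-a // b)


def min_raw_frequency(corpus_tokens, per_million=30):
    """Raw occurrences needed to reach `per_million` in a corpus of
    `corpus_tokens` running words (rounding up is an assumption)."""
    return ceil_div(corpus_tokens * per_million, 1_000_000)


def min_text_range(n_texts, percent=5):
    """Minimum number of different texts a bundle must occur in to
    reach `percent` of the sub-corpus (rounding up is an assumption)."""
    return ceil_div(n_texts * percent, 100)
```

For a hypothetical sub-corpus of 150,000 words and 220 texts, `min_raw_frequency(150_000)` gives 5 occurrences and `min_text_range(220)` gives 11 texts. These sizes are invented for illustration and are not the figures of the corpus described here.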
Most of the recent studies on lexical bundles focus on four-word sequences because they, as Hyland noted, ‘are far more common than 5-word strings and offer a clearer range of structures and functions than 3-word bundles’ (2008: 8). This suggests that four-word sequences are particularly revealing about academic discourse, though some pedagogy-oriented phraseology research also includes other multiword items retrieved from larger corpora, for example, in the ‘Academic Formulas List’ generated by Simpson-Vlach and Ellis (2010), and the ‘Phrasal Expression List’ by Martinez and Schmitt (2012). In line with current studies on L2 students’ academic writing (e.g. Ädel and Erman, 2012; Chen and Baker, 2010; Staples et al., 2013), the present study focused on four-word lexical bundles for the purpose of comparability of lexical bundle use in Chinese undergraduate academic writing. WordSmith Tools 5 (Scott, 2007) was used to generate a bundle list from each of the four sub-corpora, following the frequency and dispersion criteria described above. The retrieved bundles were checked manually to exclude bundles that were related to specific writing topics (e.g. aims of academic study). Overlapping bundles were also examined in context via concordance analysis. For example, after checking the concordance lines, it was found that many instances of ‘there is no doubt’ and ‘is no doubt that’ derive from the same sequence of ‘there is no doubt that’. Such instances were merged as one bundle type and noted in token frequencies so as to guard against inflated numbers of bundle types and frequency counts.
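The retrieval steps just described can be sketched in a few lines. This is not a reproduction of how WordSmith Tools works; it is a minimal illustration, under simplifying assumptions, of counting four-word sequences, applying a frequency and a text-range threshold, and flagging overlapping pairs for manual checking. The tokenizer, the function names and the toy texts in the usage note are all hypothetical, and the sketch lets sequences run across sentence boundaries.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case word tokens; punctuation is simply dropped."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)?", text.lower())


def ngrams(tokens, n=4):
    """All contiguous n-word sequences in a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_bundles(texts, n=4, per_million=30, min_texts=3):
    """Return {bundle: raw frequency} for sequences that meet both the
    normalized frequency cut-off and the minimum text range."""
    freq = Counter()        # total occurrences of each sequence
    text_range = Counter()  # number of texts containing each sequence
    total_tokens = 0
    for text in texts:
        tokens = tokenize(text)
        total_tokens += len(tokens)
        grams = ngrams(tokens, n)
        freq.update(grams)
        text_range.update(set(grams))
    # Raw-count equivalent of the per-million cut-off, rounded up.
    min_freq = -(-total_tokens * per_million // 1_000_000)
    return {g: c for g, c in freq.items()
            if c >= min_freq and text_range[g] >= min_texts}


def overlap_candidates(bundles):
    """Pairs (a, b, merged) where b continues a by one word, e.g.
    'there is no doubt' + 'is no doubt that'. These are candidates
    only; the concordance lines still need to be checked by hand."""
    pairs = []
    for a in bundles:
        for b in bundles:
            if a != b and a.split()[1:] == b.split()[:-1]:
                pairs.append((a, b, a + " " + b.split()[-1]))
    return pairs
```

Run on three toy texts that each begin with "There is no doubt that", `extract_bundles` keeps only the two sequences found in all three texts, and `overlap_candidates` flags them as deriving from one five-word sequence. Whether two flagged bundles should actually be merged remains a judgment made from the concordance lines, as in the procedure above.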
Results
Frequencies and Structures of Lexical Bundles
Lexical bundles which met the criteria of a minimum frequency of 30 times per million words and the related distribution restrictions were identified, and then compared across the sub-corpora. Table 2 shows a general comparison of the occurrence of bundle types at the different year levels. Overall, more different English lexical bundles were identified in Year 4 FYP dissertations than in students’ written assignments at other levels, though there was a fluctuation at the in-between levels. On the other hand, the average token frequency of occurrence of the identified bundles in each sub-corpus (normalized per million words) decreased from Year 1 to Year 4. The results reveal a narrower range of lexical bundles used by lower level students compared with Year 4 students, as well as a declining pattern of average token frequency of lexical bundle types as students progressed to higher levels of study.
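The two measures compared here, the number of bundle types and the average normalized token frequency per type, can be illustrated as follows. The function names and the counts in the usage note are hypothetical and serve only to show the arithmetic; they are not data from this study.

```python
def per_million(raw_count, corpus_tokens):
    """Normalize a raw frequency to occurrences per million words."""
    return raw_count * 1_000_000 / corpus_tokens


def summarise(bundle_counts, corpus_tokens):
    """Return (number of bundle types, mean normalized token frequency
    per type) for one sub-corpus. `bundle_counts` maps each bundle to
    its raw frequency in that sub-corpus."""
    n_types = len(bundle_counts)
    normalised = [per_million(c, corpus_tokens)
                  for c in bundle_counts.values()]
    mean_freq = sum(normalised) / n_types if n_types else 0.0
    return n_types, mean_freq
```

With two invented bundles occurring 10 and 30 times in a 500,000-word sub-corpus, `summarise` returns 2 types and a mean of 40.0 occurrences per million words. Normalizing in this way is what allows sub-corpora of very different sizes to be compared.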
Table 2. Occurrence of Bundle Types across Sub-corpora.
When the bundle types retrieved from the sub-corpora were compared, nine bundles were found to have occurred in all student academic writing across the years. Table 3 shows these nine shared bundles with the normalized token frequency of occurrences, which were found to be much higher in Year 1 first essays than in Year 4 FYP dissertations. However, two structurally complete units (i.e. at the same time, on the other hand) show a similar pattern of use across the different years. Previous research has shown that these two bundles are among the most frequent four-word bundles in academic writing (Ädel and Erman, 2012; Hyland, 2008), and that L2 student writers tended to overuse them compared with native and expert academic writers (Chen and Baker, 2010). The findings of this study further suggest that L2 student writers’ tendency to overuse these two bundles seems immune to the development of their academic literacy. In addition, structurally incomplete units, especially clausal chunks such as ‘is one of the’, ‘it is important to’, ‘it is necessary to’, and ‘an important role in’, were used less often in Year 4 than in Year 1.
Table 3. Shared Bundles across Different Years with Normalized Frequency per Million Words.
Apart from the frequency differences, there are some qualitative distinctions in the use of these shared bundles across the sub-corpora. The chunk ‘is one of the’ was used across the years, but in different ways. In Y1 first essays, the bundle often co-occurred with a superlative adjective functioning as the premodification of a noun phrase, which constitutes the main part of the propositional content of the clause, whereas in Y4 FYPs, the bundle was used to introduce a complex noun phrase followed by an embedded that/which clause as the post-modification. Year 4 students tended to use the bundle to construct more complex sentences that link related ideas together.
1. Education
2. It is clear that the strong culture of each company
In Year 1 first essays, the text linking bundle ‘on the other hand’ tended to co-occur with ‘on the one hand’ as pair bundles to signal the contrast of ideas in a more explicit way, as well as to construe the adjacency and locality of the discourse. In Y4 FYPs, the bundle was used on its own more often as an independent discourse organizer, for the purpose of elaborating contrastive discourse beyond the immediate co-text.
3.
4. Thus, translation should respect intentionality, emotional expression and values of the original writer, which is basic principle of communication.
This study adopted the framework of structural classification that was initially developed by Biber et al. (1999) and has been widely used in recent research on lexical bundles in academic writing (e.g. Chen and Baker, 2010, 2014; Cortes, 2004). Three broad categories were identified in the sub-corpora: ‘NP-based chunks’, ‘PP-based chunks’, and ‘VP-based chunks’. Table 4 shows the structural distribution of bundle types in the sub-corpora. As can be seen, the proportion of NP-based bundles is much larger in Year 1 first essays (32.1%) than in the writings of later years. Close scrutiny indicates that in Year 1 first essays, there are varied nominal constituents, some with of (e.g. the development of society) and others without a post-modifier (e.g. the most important thing), whereas the prevalent combination of NP-based bundles in other sub-corpora falls into the sequence of the + n + of the (e.g. the purpose of the, the basis of the). This suggests that the NP-based bundles were used for discussing the relation of abstract ideas in students’ higher level academic writing, but at lower level, functioned as a resource for expressing the propositional contents often related to their general knowledge.
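The recurring frame the + n + of the mentioned above lends itself to a simple mechanical check. The sketch below is an illustrative aid only, not the procedure used in this study, where structural classification followed Biber et al. (1999). The function name is hypothetical, and the pattern matches word shapes without verifying that the slot is actually filled by a noun.

```python
import re

# Four-word frame: "the" + one word + "of the".
FRAME = re.compile(r"^the ([a-z]+) of the$")


def frame_slot(bundle):
    """Return the word filling the variable slot of 'the + n + of the',
    or None if the bundle does not fit the frame."""
    match = FRAME.match(bundle.lower().strip())
    return match.group(1) if match else None
```

Applied to the examples in the text, `frame_slot("the purpose of the")` returns `"purpose"`, while `frame_slot("the most important thing")` returns `None`. Collecting the slot fillers for each sub-corpus would be one way to list the nouns that occur in this frame.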
Table 4. Structural Distribution of Bundle Types across the Sub-corpora.
An interesting difference in the use of lexical bundles lies in the proportions of PP-based bundles across the years. Prepositional expressions were used substantially less by Year 1 students when they started to learn academic writing, and the most common ones are fixed expressions (e.g. at the same time, on the other hand) that have complete internal structures. However, in Year 2 final coursework and Year 4 FYPs, PP-based bundles accounted for much higher proportions of bundle types, often with the sequences of in/on/with + n + of (e.g. on the basis of, in terms of the). Such bundles, though structurally incomplete, are highly productive in framing sentences in academic writing, and their frequent occurrence is an important indicator of students’ improving academic writing ability.
With regard to VP-based bundles, it is noteworthy that there are only small variations in terms of the overall proportions across the sub-corpora, all making up more than 50% of bundle types extracted from the data. However, a scrutiny of the sub-categories of VP-based bundles shows that three types of constructions, i.e. anticipatory it + VP/Adj. P (e.g. it is difficult to; it can be found), passive verb + PP fragment (e.g. can be seen as; can be divided into), and that-clause fragment (e.g. this might be that), are proportionally higher in Year 4 FYPs than in Year 1 first essays. In contrast, another two types of VP-based bundles, that is, VP with active verb (e.g. pay more attention to) and pronoun/NP + be/verb (e.g. that is to say, this study is to), were more widely used in Year 1 first essays than in Year 4 FYPs. Although all these chunks are built around the verb to form the syntactic core of the clause, verb constructions like ‘passive verb + PP fragment’ and ‘anticipatory it + VP/Adj. P’ are extensively used in academic prose (Biber et al., 1999: 997), and typically serve as the linguistic resources for reporting and evaluating information.
The findings on the frequent structural categories in each sub-corpus reveal interesting patterns of bundle use in student academic writing. First, the teaching of EAP in Year 1 in the English medium university had immediate effects on increasing the variety of bundle types in student academic writing. Second, a developmental order of lexical bundles seems to emerge in students’ academic writing. VP-based and NP-based bundles appear to be acquired by the students prior to PP-based bundles. Chen and Baker (2014: 17) showed a similar pattern of bundle types across three CEFR levels (B1, B2 and C1) of Chinese student writing. VP-based bundles were proportionally the biggest group in B1 level (the lowest level); NP-based bundle types had a higher proportion in B2, compared with the other two levels; and PP-based bundles had the widest range in C1 (the highest level). Whether this developmental order of lexical bundle use can be verified requires further studies on L2 academic writing in broader contexts.
Functions of Lexical Bundles
In analysing the functions of lexical bundles, a three-category framework, including referential expressions, stance expressions, and discourse organizers, has been adopted by researchers (Biber et al., 2004; Chen and Baker, 2010; Ädel and Erman, 2012). Referential expressions (e.g. the quality of the) make reference to physical or abstract entities, or to the text itself. Stance bundles (e.g. it is necessary to) convey the writer’s attitudes or evaluation of certainty that frame other propositions. Discourse organizers (e.g. as we all know) are used to structure prior and subsequent text. Table 5 shows the results of the functional distribution of the bundles across the sub-corpora. Overall, discourse organizers have the biggest proportion in student academic writing regardless of the year differences, amounting to nearly half of the bundles in Year 1 and over 60% in Year 4. Stance expressions, on the other hand, are the smallest functional category across the sub-corpora, with small variations across the levels.
Table 5. Functional Distribution of Lexical Bundles (Types).
Hyland (2008: 14) observed that some functional categories have a strong connection to certain types of structural patterns. NP-based bundles, for example, commonly carry the referential functions; PP-based bundles occur typically as discourse organizers; and anticipatory it units are used largely for expressing stance. In the present study, the choices of bundles within each functional category were found to be distinct across the years. With regard to discourse organizers, students in Year 1 used varied types of VP-based bundles, particularly ‘to-clause fragments’, to introduce or elaborate a topic in their writing.
5. In my opinion, one aim of academic study is
6.
As Chen and Baker (2010) observed, Chinese student writers used considerably more ‘to-clause fragments’, showing a preference for the frame ‘in order to + Verb’. However, this study also found that Year 4 students tended to rely on a smaller range of types of discourse organizers to establish or mark relations in their writing. Unlike Year 1 students, who preferred ‘to-clause fragments’, Year 4 students made dense use of PP-based bundles and passive verb constructions as discourse organizers, and a salient type is the chunk of ‘can + passive verb’ (e.g. can be seen that; can be found that; can be regarded as).
7. Corporate culture can also be considered
8.
Since PP-based bundles and passive constructions are typical discourse features of advanced academic writing (Hyland, 2008), the changing pattern in the use of discourse organizers from Year 1 to Year 4 writing indicates aspects of the trajectory of the academic writing development of Chinese student writers.
As for referential expressions, NP-based chunks were found to be the most common type across the years, and the two Year 1 sub-corpora had higher distributional proportions than the sub-corpora of Years 2 and 4. However, there are some qualitative differences in functions, especially between the sub-corpus of Year 1 first essays and the other sub-corpora. In Year 1 first essays, many referential expressions are nominal groups consisting of a general noun (e.g. thing, way) and a common evaluative adjective (e.g. good), which are concerned with the description of students’ subjective opinions about writing topics, and reflect the way that novice student writers constructed their experiential meaning in text at the initial phase of learning L2 academic writing.
9. At last, it is also
By contrast, referential expressions identified in the other three sub-corpora are more concerned with the procedure of reporting contents, identification and/or quantification of entities, and text reference. This referential function is largely realized in the structural unit of the ‘the + noun + of + the’ bundle, in which the occurrence of the noun covers a variety of content-specific academic-related lexis (e.g. purpose, analysis, basis, results).
10.
11. As what has been stated earlier, the model was selected as the framework for
According to Biber, Conrad and Cortes (2004), stance bundles convey two major kinds of meaning: epistemic and attitudinal. Whereas epistemic stance bundles concern the judgments on the status of knowledge in the following proposition, attitudinal stance bundles express a writer’s attitudes towards the actions or events. In this study, stance expressions have the lowest distributional proportion among the three functional categories, and there are small variations across the sub-corpora, ranging from 14.5% to 10.8%. Stance expressions in the sub-corpus of Year 1 first essays appear to convey a different type of meaning from those used in other sub-corpora. In Year 1 first essays, students tended to use bundles associated with ‘important’ or ‘necessary’ to express their attitudes towards the actions or events, and in particular, they tended to overuse the ‘important’ bundles to state obligations or evaluations (e.g. it is very important, is of great importance, is the most important).
12. Most of us in this university will go abroad for further study, so,
On the other hand, more epistemic stance bundles which express possibility or certainty are used in the other sub-corpora. Such epistemic bundles fall into the structural sequence of ‘it + is + Adjective + that’ (e.g. it is clear/likely/possible that), which manifests an explicit, objective assessment of the following proposition.
13. From the result of experiment,
Epistemic stance bundles were used as hedging devices for conveying the degree of confidence in making a statement. The control of such cautious language by the Chinese student writers, however, does not show much diversity, even in Year 4 student writing. An important feature of stance bundles identified in the corpus is that they are all impersonal bundles, regardless of the year levels, and there is not a single case of a stance bundle with a personal pronoun.
Discussion and Conclusion
The main purpose of the study has been to compare the use of lexical bundles in Chinese student academic writing between the entry point and the final year of their degree studies at an EMI institution. The results presented in the previous sections have confirmed a developmental pattern for some aspects of formulaic language use found in other research (e.g. Staples et al., 2013) and some distinctive features of bundle use in Chinese L2 student academic writing (e.g. Chen and Baker, 2010, 2014). The study shows that students relied heavily on the repeated use of a narrow range of lexical bundles in academic writing when they were at lower levels of study, but used more types of lexical bundles when they progressed to higher levels. The findings, therefore, suggest a relationship between levels of academic studies and the range of lexical bundles that L2 students are able to use in academic writing.
An interesting finding on the structural categories is the prominent use of VP-based bundles in the entire corpus and across the sub-corpora. This finding is consistent with Chen and Baker’s (2014) analysis of Chinese student writing at three proficiency levels, suggesting that clausal chunks are the most extensively used structures in their English academic writing. As clausal fragments are used more widely in spoken language than in academic prose (Biber et al., 2004), the students’ reliance on this structural type might be a distinctive feature of lexical bundle use in novice academic writing. NP-based bundles were nearly twice as common in Year 1 first essays as in Year 4 FYPs; on the other hand, as the students progressed from Year 1 to Year 4, the occurrence of PP-based bundles steadily increased. This may represent a developmental order of lexical bundles in L2 students’ academic writing: clausal bundles and nominal bundles were acquired prior to prepositional bundles by L2 student writers.
Regarding the functional distribution, the study found that discourse organizers formed the largest category, stance expressions the smallest, and referential expressions fell in between. The findings corroborate Chen and Baker (2010) on Chinese student writing but are different from the distribution of discourse functions in other research on non-native writing (e.g. Ädel and Erman, 2012). In addition, this study reveals a progressive trend in the use of discourse organizing types from Year 1 to Year 4 and an opposite decreasing pattern of referential expressions. L2 writers’ tendency to use more discourse organizers as they progress in academic writing proficiency appears to be in contrast to native academic writers who predominantly use referential expressions (cf. Ädel and Erman, 2012; Chen and Baker, 2010). A tentative explanation for this difference is that it may be influenced by the teaching of academic writing in the Year 1 EAP course, in which formulaic language used for linking ideas and organizing texts constituted an important part of the syllabus that was explicitly taught and intensively practiced in student writing.
Finally, an important finding of the study is that changes in the use of lexical bundles are much more conspicuous, both quantitatively and qualitatively, between Year 1 first essays and Year 1 final coursework, than between Year 1 final coursework and Year 4 FYPs. The students seemed to have learned a certain repertoire of recurrent word expressions from the intensive EAP course by the end of the first year of their studies. This change in bundle use, therefore, reflects the immediate effects of academic writing instruction in the Year 1 EAP course on the acquisition of formulaic language in student writing. On the other hand, the pattern of bundle use from the end of Year 1 to the final year suggests that, after an initial stage of bundle acquisition, there might be a plateau phenomenon in the development of L2 student writers’ formulaic language in their academic writing.
In closing, the findings of this study have pedagogical implications for teaching academic writing in EMI educational contexts. As important linguistic resources for constructing academic prose, lexical bundles generated from corpora of expert writing should be systematically integrated into the teaching of academic English to support students’ success in studying at EMI universities. Furthermore, we believe that the developmental patterns of bundle use found in student writing will also inform the design of academic writing courses for L2 students who are required to construct disciplinary knowledge for communicating in the academic discourse community.
Acknowledgements
I would like to thank Dr Bin Zou and Dr Wangheng Peng for sharing their collection of the text data, which comprised part of the corpus used in the present study.
Funding
This research received grants from the Jiangsu Province Philosophy and Social Sciences Funding Scheme and XJTLU Research Development Fund.
