Abstract
The present study intended to investigate whether test takers’ breadth and depth of vocabulary knowledge can contribute to their efficient use of lexical bonds while restoring damaged texts in reduced redundancy tests. Moreover, the moderating role of general language proficiency was investigated in this interaction. In so doing, Vocabulary Levels Test (VLT), Word Associates Test (WAT), and a series of C-tests with high and low lexical bonds were administered to two groups of 85 upper-intermediate and 50 lower-intermediate EFL learners. Results of multiple regression analyses indicated the following: (a) breadth and depth of vocabulary knowledge played dissimilar roles for test takers with different levels of language proficiency; (b) depth of vocabulary knowledge was a better predictor for high-bond texts; and (c) test takers with higher levels of language proficiency made more efficient use of lexical bonds as contextual cues. The findings point to the necessity of improving learners’ depth of vocabulary knowledge, especially at lower levels of language proficiency where vocabulary knowledge is mostly a matter of size rather than quality.
Keywords
Many scholars (e.g. Qian, 2002; Qian & Schedl, 2004; Read, 1993; Read, 2004; Read & Chapelle, 2001; Schmitt, 2008; Shiotsu & Weir, 2007; Zhang & Li, 2011) have emphasized the significance of vocabulary knowledge in the literature of ESL/EFL. As Schmitt (2008) points out, vocabulary knowledge is of high importance and “one thing that students, teachers, materials writers, and researchers can all agree upon is that learning vocabulary is an essential part of mastering a second language” (p. 329).
One of the main distinctions proposed for the construct of vocabulary knowledge is the dichotomy between breadth and depth of lexical knowledge (Henriksen, 1999; Read, 2000). The breadth of lexical knowledge refers to the size or number of words that language learners know at a particular time (Nation, 2001). That is, the breadth of vocabulary knowledge is an “estimate of how many words testees have in their lexicon” (Schmitt, 1999, p. 191). Depth of vocabulary knowledge, on the other hand, refers to the quality of word knowledge or how well a word is learned (Read, 1993, 2000). Schmitt (2008) argued that to know a word “a learner must also know a great deal about each item in order to use it well” (p. 333). This means that for a word to be considered “learned”, a learner should not only acquire the form-meaning link of a lexical unit (breadth of lexical knowledge), but also needs to be aware of its network (depth of lexical knowledge).
In a discussion of the relationship between breadth and depth, Henriksen (1999) puts forward the view that having “an understanding of the relations among the items is a pre-requisite for a more precise understanding of each individual item” (p. 313). In other words, a learner should not only acquire the meaning of a lexical unit, but also needs to be aware of its network. The network surrounding a lexical unit encompasses two types of relations: syntagmatic and paradigmatic (Schoonen & Verhallen, 2008). Syntagmatic relations are the linear relations a word may have with other words in the same sentence (e.g. desk-study, desk-book, and desk-pen). In contrast, paradigmatic relations are hierarchical and characterized by class inclusion (e.g. bird-animal, bird-sparrow).
Many studies have been conducted so far to probe the role of breadth and depth of word knowledge, as reader variables, in text-dependent tests such as reading comprehension (e.g. Nassaji, 2004, 2006; Qian, 2002; Shiotsu & Weir, 2007; Zhang & Anual, 2008). However, as argued by Bachman (2002), test takers’ performance can be influenced by three main sets of factors: (1) “characteristics inherent in the task itself”; (2) “attributes of test-takers”; and (3) “interactions between test-takers and task characteristics” (p. 469). The first two factors have been extensively investigated, whereas very few studies have examined the role played by the interactions between the reader and textual variables in test takers’ performance on text-dependent tests. This study made an effort to investigate the interaction between test takers’ breadth and depth of vocabulary knowledge, as reader variables, and their use of lexical bonds, as contextual cues, in their performance on the C-test, which is a context-dependent vocabulary test (Read & Chapelle, 2001). As Schmitt (2014) points out, the role of vocabulary size and depth has been largely investigated in reading comprehension, so “it is an interesting, but unexplored, question whether the two would equally predict other kinds of language use” (p. 939).
One of the textual features, which is capable of affecting test takers’ performance in text-dependent tests, is lexical bond. It is closely related to lexical cohesion, which itself has been identified as one of the two types of cohesion (Halliday & Hasan, 1976). Lexical bond, as introduced and defined by Hoey (1991), is made up of a few lexical links, which are grouped by means of semantically related words between sentences. Hoey (1991) believed that two or three lexical links can form one lexical bond between a pair of sentences.
The C-test and lexical cohesion
The C-test was developed by Raatz and Klein-Braley (Raatz & Klein-Braley, 1981) and, like the cloze test, it was meant to operationalize the reduced redundancy principle (Babaii & Jalali Moghaddam, 2006). In this type of test, the language ability of examinees is measured when the linguistic message is deformed by some noise or other kinds of interference (Babaii & Ansary, 2001). In the C-test, the texts are damaged by deleting parts of some words, which should then be restored by the test takers with the rationale that language is naturally redundant. The C-test employs the “rule-of-two”, “which involves deleting the second half of every other word beginning from the second word of the second sentence” (Jafarpur, 1999, p. 86).
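The rule-of-two described above can be illustrated with a minimal sketch. It is not the procedure used by any of the cited authors; the sentence splitter, the handling of punctuation, the choice to leave one-letter and non-alphabetic tokens intact, the convention of deleting the larger part of odd-length words, and the fixed-length blank are all simplifying assumptions made for illustration.

```python
import re

GAP = "____"  # fixed-length blank, so the gap gives no clue about word length


def make_c_test(text, max_gaps=25):
    """Simplified rule-of-two: from the second word of the second sentence on,
    delete the second half of every other word. Returns (gapped_text, answers)."""
    # Naive sentence split after ., ! or ? (an assumption, not a real tokenizer).
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    out = [sentences[0]]            # the first sentence is left intact
    answers = []
    position = 0                    # word counter starting at sentence two
    for sentence in sentences[1:]:
        tokens = []
        for token in sentence.split():
            position += 1
            core = token.rstrip(".,;:!?")       # set trailing punctuation aside
            tail = token[len(core):]
            eligible = len(core) > 1 and core.isalpha()
            if position % 2 == 0 and eligible and len(answers) < max_gaps:
                keep = len(core) // 2           # odd lengths lose the larger part
                tokens.append(core[:keep] + GAP + tail)
                answers.append(core)
            else:
                tokens.append(token)
        out.append(" ".join(tokens))
    return " ".join(out), answers
```

For example, `make_c_test("The sun rose. Birds began singing loudly today.")` leaves the first sentence untouched and mutilates "began" and "loudly" in the second.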
It has been argued that the C-test is a valid measure of language proficiency for some purposes (Eckes & Grotjahn, 2006; Lee-Ellis, 2009). It is integrative and relies on authentic texts (Coleman, 1994). It has also been identified as an economical (Eckes & Grotjahn, 2006) and a practical (Lee-Ellis, 2009) measure. The C-test principle has been applied to many languages including Hebrew (Cohen, Segal, & Bar-Siman-Tov, 1984), French (Grotjahn & Stemmer, 1985), Japanese (Roos, 1994), Turkish (Daller, Treffers-Daller, Ünaldı-Ceylan, & Yıldız, 2002), Korean (Lee-Ellis, 2009), and Persian (Baghaei, 2014). To add to the advantages of the C-test, Sigott (2004) claimed that research on C-tests in diverse contexts and languages has consistently provided evidence for the good internal consistency of this measure.
C-tests can tap both micro-level (Stemmer, 1991) and macro-level processing (Babaii & Ansary, 2001; Babaii & Jalali Moghaddam, 2006; Hastings, 2002; Sigott, 2004) on the part of the test takers and the nature of this processing could be related to the textual features of the texts used in such types of tests as well as the proficiency level of the test takers. Babaii and Jalali Moghaddam (2006), for instance, found that syntactically complex and abstract texts result in more macro-level processing. Even Stemmer (1991), who contended that high-level processing plays a minor role in C-test taking, reported that higher scorers employed more macro-level processing in the difficult passages. As for the role of the proficiency level, Klein-Braley (1994) made an effort to study the strategies employed by C-test takers and found that the low-proficient subjects made minimal use of context in completing the C-test items. In a large-scale study, Sigott (2004) collected data from over 700 participants using 23 C-test passages to investigate the processing strategies used by C-test takers. He found that test takers who scored higher on the Oxford Quick Placement Test (OQPT) were more successful in solving the decontextualized items than the less proficient ones. In a more recent study, Babaii and Fatahi-Majd (2014) attempted to find the differences between the low- and high-proficient C-test takers regarding the nature of errors they made in restoring the mutilated words. Results of the verbal protocol analysis showed that while the high-proficient learners appeared to benefit more from textual and contextual clues, the low-proficient learners resorted to more local cues. To put it simply, the proficiency level of the test takers has been found to affect the type of processing in which they engage (i.e. micro or macro level).
A number of factors have proved to be influential in determining the item difficulty of C-tests. Klein-Braley (1985), for instance, reported that type–token ratio and average sentence length could predict text difficulty and hence the difficulty of C-test items. In addition, Dörnyei and Katona (1992) found that sentence length and syntax were determinants of C-test takers’ performance. The results of their study showed that mutilated structure or function words were easier to reconstruct than content words. Sigott and Köberl (1996) also examined the effect of deletion patterns other than the rule-of-two, including deleting two thirds of every second word (C33), only keeping the first letter of every second word (CFL), and deleting the first half of every second word (X25). The results showed that CFL resulted in a 30%, C33 in a 20%, and X25 in a 15% increase in item difficulty. Readability indices, which are based on word and sentence length, have also been used to determine the difficulty of texts used in C-tests, but they have been criticized for not taking into account a number of linguistic and discourse features such as cohesion relations (McNamara, Graesser, & Louwerse, 2012). Babaii and Fatahi-Majd (2014, p. 273) further provided evidence for the invalidity of readability indices and pointed out the significance of “monitoring the textual properties of the texts selected for constructing the C-test”.
As Read and Chapelle (2001) mention in their categorization of vocabulary tests, the C-test is a context-dependent test that requires knowledge of the context, involving the crucial aspect of text organization and making use of contextual clues. Text organization is one example of textual features achieved through discourse elements of cohesion and coherence (Halliday & Hasan, 1976), which are devices for sticking together and making sense of a text, respectively (Morris & Hirst, 1991).
In their seminal work, Halliday and Hasan (1976) categorized cohesion into two main types: grammatical and lexical. Grammatical cohesion includes devices such as reference, substitution, ellipsis, and conjunction, while lexical cohesion is divided into reiteration and collocation. Reiteration refers to the repetition of the same word, which may be an exact repetition or a synonymy. Collocation on the other hand, refers to the co-occurrence of lexical items. The term lexical cohesion itself can be broken down into other terms such as lexical links and chains. The term “chain” was first introduced by Halliday and Hasan (1976) to denote a relation where an element is related to another element, which in turn relates to another element, and so on. Morris and Hirst (1991) argue that “lexical cohesion occurs not simply between pairs of words but over a succession of a number of nearby related words spanning a topical unit of the text” (p. 22). They use the term “lexical chains” to refer to the sequences of related words created by means of semantic relations between them. As stated by Silber and McCoy (2002), lexical chains are grouped by means of words that are semantically related such as synonyms, hypernyms/hyponyms, and superordinates/subordinates.
As previously found by Babaii and Ansary (2001) and Babaii and Jalali Moghaddam (2006), lexical chains are among the key top-down cues utilized by the test takers while taking the C-test. This evidence could justify the use of the C-test as an appropriate elicitation instrument for exploring the role of lexical cohesion in macro-level text processing. Similarity chains or lexical chains thus constitute the network of words that could be associated with a specific concept and/or context of use and “help an examinee to anticipate one specific member of a possible network when he/she comes across the other members” (Babaii & Ansary, 2001, p. 216). Lexical links have significant roles to play in text comprehension and coherence development since the meaning of a text should be ascribed not only to the total meaning of the single sentences, but also to the interconnection that exists between them (Crossley & McNamara, 2009).
The relationship between lexical bonds, depth of vocabulary knowledge, and the C-test
Hoey (1991) mentions that text cohesion does not result from only the links that exist between words; semantic relationships between sentences can also form text cohesion. Hoey calls the cohesive relations between sentences “lexical bonds” and believes that a few lexical links between two sentences can form one lexical bond. A visual representation of the concept of lexical bond is provided in Figure 1. The left-hand diagram shows a lexical link between two lexical units, which is a short-distance relationship. The right-hand diagram, on the other hand, shows a long-distance relation between lexical units which conceptualizes the lexical bond (i.e. two or more lexical links).

Figure 1. Visual representation of lexical bond.
As mentioned above, lexical links or chains are made by semantic relations between words such as synonyms, hypernyms/hyponyms, and superordinates/subordinates (Silber & McCoy, 2002). This is akin to the definition of depth of vocabulary knowledge as the knowledge of syntagmatic, paradigmatic, and analytic relationships between words (Read, 1993). Paradigmatic and syntagmatic relations are built by means of semantic relations between words such as synonymy, hyponymy, meronymy, and collocation. It has been found that using these associations contributes to lexical inferencing and strategy use (Nassaji, 2004, 2006; Qian, 2004). Consequently, this study assumes that depth of vocabulary knowledge, as a reader variable, can contribute to the test takers’ use of lexical chains and bonds as they read the text. The rationale lies in the common source of these variables: both involve syntagmatic and paradigmatic relations. To put it briefly, the ability of learners to make lexical inferences and use contextual clues depends on their vocabulary knowledge, since without knowing a certain number of words, contextual clues are of no use (Kaivanpanah & Alavi, 2008).
Hastings (2002) points out that “C-tests tap the ability to integrate contextual information with a range of language competencies, including those involved in semantic, syntactic, morphological, lexical and orthographic processing” (p. 53). Therefore, lexical bonds, as contextual cues, can be exploited to enhance performance in a text-dependent test such as the C-test. However, some test takers may not use the lexical bonds as one of the key cues to fill in the mutilated words of a C-test. In other words, they may differ in how they use the lexical bonds within the text. As found by Ozuru, Dempsey, and McNamara (2009), not all test takers can make use of text cohesion to perform better in a text-based test such as reading comprehension. Sigott (2004) also pointed out that the text-level processing in C-tests varies from one participant to another since reader and text variables interact with each other, and textual features like decontextualization are approached dissimilarly by C-test takers from different proficiency levels. The present study thus aims at finding the moderating role of overall language proficiency, as measured through a placement test, in the interaction between test takers’ breadth and depth of vocabulary knowledge and their use of lexical bonds while taking the C-test. This is in line with the recent literature, which calls “for further research on the interactions between test takers’ characteristics, text features, and processing mechanisms in C-testing”, described as the “under-researched aspects of C-test processing” (Babaii & Fatahi-Majd, 2014, p. 274).
The significance of lexical relations and contextual cues also extends to the use of lexical inferencing on the part of the test takers. Lexical inferencing “involves making informed guesses as to the meaning of an utterance in light of all available linguistic cues in combination with the learner’s general knowledge of the world, her awareness of context and her relevant linguistic knowledge” (Haastrup, 1991, p. 40). Success in lexical inferencing depends on a number of factors among which is the degree of textual information provided for a single utterance in the surrounding context (Dubin & Olshtain, 1993). One of the sources of textual information, which contributes to the inferencing success, is hypothesized to be the lexical cohesion and bonds which connect parts of the text together. This lexical cohesion can be formed by means of syntagmatic and paradigmatic relations between lexical units of a text.
The current study, therefore, tries to examine the role of lexical bonds, as discourse features, and their interaction with aspects of vocabulary knowledge, as reader variables, in the test takers’ performance on the C-test with low and high lexical bonds. This study examines both breadth and depth of vocabulary knowledge since “to truly understand an L2 learner’s lexical proficiency, we must move past simple analyses of lexical features and begin to examine the L2 learner’s lexical knowledge of syntagmatic and paradigmatic properties, sense relations, and complex lexical association models” (Crossley & McNamara, 2009, p. 120). Considering the significance of breadth and depth of vocabulary knowledge, as reader variables, and the role of lexical bonds, as text variables, in EFL learners’ C-test performance, this study attempts to answer the following research questions:
Do breadth and depth of vocabulary knowledge contribute to the EFL learners’ total C-test performance and its two sub-tests with high and low lexical bonds?
Do scores of breadth of vocabulary knowledge at different word frequency levels contribute to the EFL learners’ total C-test performance and its two sub-tests with high and low lexical bonds?
Methodology
These questions were investigated through the use of an ex post facto design that included breadth and depth of vocabulary knowledge, as the independent variables, and the performance of EFL students on the C-test as well as its high-bond and low-bond sub-tests, as the dependent variables. The proficiency level, which was measured by the Oxford Quick Placement Test (OQPT), was hypothesized to affect the relationship between the independent and dependent variables; therefore, separate models were run for the higher and the lower proficiency groups. The overall information is presented in Table 1.
Table 1. Information on the design, groups, variables, and measures of the study.
Participants
The participants were 135 Iranian undergraduate students majoring in English translation studies and English literature. The selected participants were males and females ranging from 18 to 24 years of age and were native speakers of Persian. They were classified into lower-intermediate (n = 50) and upper-intermediate (n = 85) levels based on the results obtained from the Oxford Quick Placement Test (OQPT, 2004). Furthermore, 32 EFL instructors and MA students, who had a good command of English, were recruited for piloting the C-test and its two sub-tests (i.e. with low and high lexical bonds). Table 2 shows the brief profiles of the participants.
Table 2. Profile of the participants.
Note: ET = English Translation; EL = English Literature; TEFL = Teaching English as a Foreign Language.
Instruments
Oxford Quick Placement Test (OQPT, 2004)
The OQPT is claimed to be a test of overall language proficiency, and was therefore administered to determine the language proficiency level of the participants. The participants’ scores on OQPT were used for classifying them into lower-intermediate and upper-intermediate levels. This test consists of 60 items developed by Oxford University Press and University of Cambridge Local Examinations Syndicate. Validation research on this test was carried out by the developers in 20 countries with more than 6000 students from which its reliability was reported as 0.90 (Geranpayeh, 2003). The present study found that Cronbach’s alpha reliability measured 0.85 for this test.
Word Associates Test (WAT)
The WAT was developed by Read (1993) to measure the receptive aspect of depth of vocabulary knowledge. Its design takes into account the syntagmatic, paradigmatic, and analytic relationships between two lexical units. The test has been revised through additional validation research since it was first developed (e.g. Ishii & Schmitt, 2009; Laufer, Elder, Hill, & Congdon, 2004; Schmitt, Ching, & Garras, 2011), and it has been found to produce reliable scores (e.g. Nassaji, 2006; Qian, 2002).
Vocabulary Levels Test (VLT)
Nation (1983) designed the VLT to measure the test takers’ vocabulary size by examining their knowledge of content words at four word-frequency levels (2000, 3000, 5000, and 10,000) (Read, 2000). Since the first VLT, many other studies have conducted validation research, examined its reliability, or even revised it (e.g. Ishii & Schmitt, 2009; Laufer et al., 2004; Schmitt, Schmitt, & Clapham, 2001; Xing & Fulcher, 2007). Schmitt et al. (2001) developed two modified versions of the VLT and found Cronbach’s alpha reliability measured 0.92 for all the word-frequency levels of this measure. The current study used the second version of the VLT developed and revised by Schmitt et al. (2001) and obtained a Cronbach’s alpha reliability measurement of 0.89 for the test scores, which included all the four word-frequency levels of the test.
Although the Vocabulary Size Test (VST), developed by Nation and Beglar (2007) and later investigated by Beglar (2010), has been argued to be a more comprehensive measure of vocabulary size than the VLT, the present study used the latter for a couple of reasons. First, the VST uses rather long and complex options for the target words, which make the test a measure of reading skill and syntactic knowledge as well as vocabulary (Elgort, 2013; Nguyen & Nation, 2011). Second, the four-option, multiple-choice items, as used in the VST, are subject to a guessing effect (Gyllstad, Vilkaite, & Schmitt, 2015), which may lead to the overestimation of test scores over and above the VLT, which uses a six-option matching format (Stewart, 2014; Stewart & White, 2011). Furthermore, the VLT is the most widely used measure of vocabulary size (Webb & Sasao, 2013) and many studies have used it to measure the vocabulary size of their participants (e.g. Alavi & Akbarian, 2012; Baba, 2009; Qian, 2002; Shiotsu & Weir, 2007; Stenius Stæhr, 2009; Zhang & Anual, 2008).
ADELEX ANALYSER (ADA)
The ADA is an online tool for measuring textual difficulty. This tool has been developed by the members of the ADELEX (Assessing and Developing Lexis through the Internet) team at the University of Granada and enables the user to analyze the lexical difficulty of written texts. The ADA examines the lexical profile of English texts based on seven frequency levels or bands drawn from the British National Corpus, Bank of English, and Longman Corpus Network databases. This online tool makes use of multiple factors for estimating the lexical difficulty of texts such as lexical density and lexical frequency. The main measures of the ADA are frequency, density, and lexical profile (Moreno Jaén, 2006). This tool is currently available online at www.ugr.es/~inped/ada/perfil.php?ada=bj8vg4kv0phng1id058n80vag3&lng=english.
Hoey’s (1991) lexical analysis
Hoey’s approach to lexical analysis reveals how lexical patterns that mirror text organization can be determined through the study of cohesion. His approach identifies lexical ties as being the means of providing cohesive connections between sentences. These sentences are said to be linked where they involve lexical ties formed by means of lexical semantic relations between lexical units such as repetition, synonymy, superordinate, subordinate, meronymy (tree – trunk), hyponymy (tree – oak), co-hyponymy (oak – pine), co-meronymy (trunk – branch), and antonymy (awake – asleep). Sentences that “contain an above-average number of links are termed bonded sentences” (Todd, Khongput, & Darasawang, 2007, p. 13). Consequently, calculating the number of lexical bonds, which are formed by means of reiteration, underlies Hoey’s (1991) lexical analysis. Reiteration includes simple or complex lexical repetition (bear – bear; economist – economy, respectively), simple or complex paraphrase (volume – book; sickness – doctor, respectively), simple or complex synonymy (sedating – drugging; sedating – tranquillized, respectively), simple or complex antonymy (violent – calm; complex – clarify, respectively), superordinate repetition (snake – animal), and hyponymic repetition (animal – bear). As Hoey (1991) proposes, there should be at least three lexical links between a pair of sentences to consider them as bonded sentences.
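The counting logic behind bonded sentences can be sketched in a few lines. The sketch below is a deliberate simplification and not Hoey’s (1991) full analysis: it treats only simple lexical repetition (identical lower-cased content words) as a link, whereas the taxonomy above also covers paraphrase, synonymy, antonymy, and superordinate and hyponymic repetition, which require human judgement or a lexical database. The stopword list and the function names are hypothetical.

```python
from itertools import combinations

# A tiny illustrative stopword list (an assumption, not part of Hoey's method).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "is", "was", "it", "that"}


def content_words(sentence):
    """Lower-cased word types of a sentence, without punctuation and stopwords."""
    words = (w.strip(".,;:!?\"'()").lower() for w in sentence.split())
    return {w for w in words if w and w not in STOPWORDS}


def find_bonds(sentences, min_links=3):
    """Return (i, j, links) for every sentence pair sharing at least
    `min_links` repeated content words, i.e. every bonded pair."""
    bonds = []
    for (i, s1), (j, s2) in combinations(enumerate(sentences), 2):
        links = content_words(s1) & content_words(s2)
        if len(links) >= min_links:
            bonds.append((i, j, sorted(links)))
    return bonds
```

Because every pair of sentences is compared, including non-adjacent ones, the sketch captures the long-distance character of bonds; the number of bonds in a text is simply `len(find_bonds(sentences))`.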
The C-test
In order to construct the C-test that could serve the purpose of the study, careful screening of texts in terms of lexical cohesion and lexical bonds was required. Numerous authentic texts were examined to locate the ones with the desired characteristics. The texts were taken from novels, short stories, and other descriptive essays with the rationale that this genre should include more lexical chains. Hoey’s (1991) taxonomy was adopted for counting the lexical links and bonds within the texts and classifying them into texts with low and high lexical bonds. As for the criterion of lexical bonds, texts with five lexical bonds and below were considered as low-bond and texts with 30 lexical bonds and above were considered as high-bond. Four low-bond texts and four high-bond texts were required for the pilot phase of this study.
Two PhD candidates in applied linguistics were trained as expert raters for this purpose. They were graduate students in the Department of English Language and Literature at a large state university in Iran. The raters were first trained using a number of authentic descriptive and narrative texts. After the training stage, the raters were given 30 short texts, some of which were to be used for the pilot phase of the study. They counted the number of lexical bonds based on Hoey’s (1991) taxonomy, as mentioned above (see Appendix 2 for a sample analysis of a high-bond text). A Pearson correlation was computed to assess the inter-rater reliability of their measurement. The correlation between the two raters was r = .935 (p < .001).
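For readers unfamiliar with the statistic, the inter-rater index reported above is the ordinary Pearson product–moment correlation between the two raters’ bond counts across texts. A minimal sketch follows; the bond counts in the usage note are invented for illustration and are not the study’s data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one rater shows no variance")
    return sxy / sqrt(sxx * syy)
```

With hypothetical counts such as `rater_a = [3, 12, 31, 35, 5]` and `rater_b = [4, 10, 33, 34, 6]`, `pearson_r(rater_a, rater_b)` would be close to 1, indicating that the two raters rank the texts almost identically.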
According to Zhang and Anual (2008), if a text contains an above-average number of difficult or low-frequency words that are beyond the vocabulary knowledge of the test takers, comprehension will decline even if the text is highly cohesive. Consequently, the current study monitored the lexical difficulty level of the texts to select the ones that would allow the test takers to make use of the lexical cohesion and bonds of the texts. This was done by means of the ADELEX ANALYSER (ADA) tool, which determines the lexical difficulty of texts based on word frequency lists. Based on this computational tool, if 95% of the words used in a text are among the first 5000 words, the lexical difficulty level of the text is average. This is rooted in Laufer’s (1997) assertion that knowing at least 95% of words is essential to achieve a general level of reading comprehension. The 30 short texts that had been assessed by the two expert raters were entered in the ADA tool for frequency analysis. Accordingly, eight texts (four with high lexical bonds and four with low lexical bonds), which had average lexical difficulty, were selected for the piloting phase of this study.
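The 95% criterion amounts to a lexical coverage check, which can be sketched as follows. This is not the ADA’s algorithm, which is not described in enough detail here to reproduce; the tokenization is naive, and `frequent_words` stands in for a hypothetical list of the 5000 most frequent words that would have to be supplied from a corpus.

```python
def lexical_coverage(text, frequent_words):
    """Proportion of running words (tokens) in `text` found in `frequent_words`."""
    tokens = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        raise ValueError("text contains no words")
    covered = sum(1 for t in tokens if t in frequent_words)
    return covered / len(tokens)


def has_average_difficulty(text, frequent_words, threshold=0.95):
    """True if at least `threshold` of the tokens come from the frequency list."""
    return lexical_coverage(text, frequent_words) >= threshold
```

Coverage is computed over tokens rather than types, so a frequent word counts every time it occurs; whether the ADA counts tokens or types is an assumption of this sketch.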
When a C-test is constructed, more texts should be selected than are needed for the final form of the test (Klein-Braley, 1997). Following the guidelines provided by Klein-Braley (1997), a C-test consisting of eight passages was developed according to the rule-of-two. The clues regarding the number of missing letters were also eliminated to trigger more macro-level processing on the part of the test takers (Babaii & Jalali Moghaddam, 2006). The texts were arranged from easy to difficult in the C-test and each text had 25 mutilations. Four texts were singled out as the final texts of the C-test after the primary C-test was administered to a group of 32 EFL instructors who had a good command of English. The characteristics of each text (sub-test) used in the piloting phase of the study are presented in Table 3.
Table 3. Characteristics of the C-test and its sub-tests (early version).
Note: LB = lexical bonds; KR-21 = Kuder-Richardson reliability estimates; IF = item facility indices; K = number of items; * = selected for the final C-test.
Klein-Braley (1997) warns against the overestimation of reliability by KR-21 and recommends the use of super-items. Therefore, the sub-tests with reliability indices less than 0.50 were eliminated. Since the results revealed that, except for two sub-tests, the individual texts produced scores with high reliability values (Table 3), item characteristics indices served as the criterion for choosing the four sub-tests for the final version. As for the sub-tests with high lexical bonds, “The light and the dark of it” (KR-21 = 0.72, IF = 0.68) and “Description of our house” (KR-21 = 0.70, IF = 0.58) were singled out for having higher reliability estimates and average item facility indices. As for the sub-tests with low lexical bonds, “The mark on the wall” (KR-21 = 0.72, IF = 0.41) and “Liberation from ignorance, from sorrow” (KR-21 = 0.89, IF = 0.58) were selected (see Appendix 1 for the four sub-tests of the final C-test).
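The two indices used for text selection can be computed from the test takers’ total scores on a text alone. The sketch below uses the standard KR-21 formula, KR-21 = (k / (k − 1)) × (1 − M(k − M) / (k × s²)), where k is the number of items, M the mean total score, and s² the variance of total scores, and takes the item facility of a text as the mean proportion of items answered correctly. Using the population variance is an assumption, as the variance estimator used in the study is not stated.

```python
from statistics import mean, pvariance


def kr21(total_scores, k):
    """Kuder-Richardson formula 21 for a test of k dichotomous items."""
    m = mean(total_scores)
    variance = pvariance(total_scores)   # population variance (an assumption)
    if variance == 0:
        raise ValueError("KR-21 is undefined when all total scores are equal")
    return (k / (k - 1)) * (1 - (m * (k - m)) / (k * variance))


def item_facility(total_scores, k):
    """Mean proportion of the k items answered correctly."""
    return mean(total_scores) / k
```

For a 25-item text with the invented total scores `[5, 10, 15, 20, 25]`, the mean is 15 and the variance 50, so KR-21 is (25/24) × (1 − 150/1250), roughly 0.92, and the item facility is 0.60.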
After the final version of the C-test (k = 100) was administered for the main phase of the study with 135 participants, its Pearson product–moment correlation coefficient with the OQPT was calculated as r = 0.78. The reliability of the final version of the C-test was measured by using the four texts as individual items in the Cronbach’s alpha formula (Babaii & Jalali Moghaddam, 2006). The results indicated that the final version of the C-test was moderately reliable (α = .73).
Procedure
Several steps were taken to collect the data required for this study. First, the OQPT was administered to determine the students’ level of proficiency and form the two groups. Next, the VLT and the WAT were used to measure the participants’ breadth and depth of vocabulary knowledge, respectively. The VLT was given first, requiring the examinees to choose the word that matched each definition; they had 30 minutes to complete this test. The WAT was then administered, also in a 30-minute period. Finally, the C-test with its two sub-tests (i.e. low and high lexical bonds) was administered to the participants. They were told to reconstruct the texts and write the words they found suitable for the context during a 30-minute session. Multiple regression analyses were then conducted to find the contribution of the test takers’ breadth and depth of vocabulary knowledge as well as their scores on each word frequency level of the VLT to their performance on the C-test and its two sub-tests with high and low lexical bonds. The statistical analyses were conducted using SPSS version 20.
Results
Overall, the results of multiple regressions indicated that breadth and depth of vocabulary knowledge played different roles for lower- and upper-intermediate test takers. While neither the vocabulary size nor the vocabulary depth could predict the less proficient students’ performance on the C-test and its two sub-tests, vocabulary depth was the predictor variable for the more proficient examinees’ scores on the C-test and its sub-tests with low and high lexical bonds. The results thus showed that depth of vocabulary knowledge contributed to the use of lexical bonds, as contextual cues, for test takers with a higher language proficiency. Different word frequency levels also predicted the two groups’ scores on the C-test and its two sub-tests.
Contribution of breadth and depth of vocabulary knowledge
The first research question of this study concerned the contribution of breadth and depth of vocabulary knowledge to EFL learners’ total C-test performance and its two sub-tests with high and low lexical bonds. The lower-intermediate group of this study consisted of 50 students. Table 4 shows the descriptive statistics for all the variables, that is, the scores on the WAT, VLT, each word frequency level of the VLT, the C-test, low-bond, and high-bond sub-tests of the C-test. To examine the contribution of breadth and depth of vocabulary knowledge to this group’s performance on the C-test and its two sub-tests with low and high lexical bonds, multiple regression analyses were conducted. Since multiple comparisons were made using the same C-test, a Bonferroni correction was applied and thus the results are reported at a .0166 level of significance to control the Type I error rate.
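The adjusted significance level can be reproduced with a short calculation. The sketch below assumes a familywise alpha of .05 divided across three comparisons (the total C-test and its two sub-tests). The text does not state these two values explicitly, but they are consistent with the reported .0166. The function name `bonferroni_alpha` is illustrative.

```python
def bonferroni_alpha(familywise_alpha, n_comparisons):
    """Per-comparison significance level under a Bonferroni correction."""
    return familywise_alpha / n_comparisons


# Assumed values: familywise alpha of .05 and three dependent variables
# (total C-test, low-bond sub-test, high-bond sub-test).
adjusted = bonferroni_alpha(0.05, 3)
print(f"{adjusted:.4f}")


def is_significant(p_value, threshold=adjusted):
    """True when a p value falls below the corrected threshold."""
    return p_value < threshold
```

The exact quotient is .01666..., so the .0166 used in the text is this value truncated to four decimal places.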
Descriptive statistics for all the tests and sub-tests (lower-intermediate group; n = 50).
Note: MPS = Maximum possible score; SD = standard deviation; α = Cronbach’s alpha.
Depth and breadth of vocabulary knowledge could not be considered as predictor variables for overall C-test performance, as shown in Table 5. The results of multiple regression, using the enter method, indicated that the whole model could account for 8% of the variance in the total C-test performance, which was not statistically significant (R² = .081, F = 2.07, p > .016). Depth (β = .046, t = .313, p > .016) and breadth of vocabulary (β = .266, t = 1.793, p > .016) could not significantly predict the total C-test performance. As for the low-bond sub-test of the C-test, breadth and depth of vocabulary knowledge (β = .171, t = 1.128, p > .016; β = .067, t = .445, p > .016, respectively) could not predict the performance of the lower-intermediate students. Using the enter method, both variables together could explain only 4% of the variance (R² = .041, F = 1.016, p > .016). Similarly, multiple regression indicated that neither breadth (β = .299, t = 2.032, p > .016) nor depth of vocabulary knowledge (β = .011, t = .077, p > .016) could significantly explain the performance on the high-bond sub-test of the C-test. The whole model could explain only 9% of the variance (R² = .092, F = 2.383, p > .016).
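The F statistics reported for the lower-intermediate group can be approximately recovered from the corresponding R-squared values. The sketch below uses the standard relation between the two and assumes that all 50 participants contributed to each model with two predictors (breadth and depth). The function name `f_from_r2` is illustrative, and small discrepancies are expected because the published R-squared values are rounded.

```python
def f_from_r2(r2, n, k):
    """F statistic for a regression with k predictors and n cases,
    computed from the coefficient of determination."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


# (reported R-squared, reported F) for the lower-intermediate group.
reported = {
    "total C-test": (0.081, 2.07),
    "low-bond sub-test": (0.041, 1.016),
    "high-bond sub-test": (0.092, 2.383),
}

for name, (r2, f_reported) in reported.items():
    # Assumption: n = 50 with no missing cases, k = 2 predictors.
    f_computed = f_from_r2(r2, n=50, k=2)
    print(f"{name}: reported F = {f_reported}, recomputed F = {f_computed:.3f}")
```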
Multiple regression analysis for the breadth and depth of vocabulary in EFL C-test performance and its two sub-tests (lower-intermediate group).
Predictors: (Constant), breadth, depth.
The upper-intermediate group of this study consisted of 85 EFL students. The same statistical procedures were conducted to answer the first research question for this group of students. The descriptive profile of the participants’ achievement in every test and sub-test is provided in Table 6.
Descriptive statistics for all the tests and sub-tests (upper-intermediate group; n = 85).
Note: MPS = Maximum possible score; SD = standard deviation; α = Cronbach’s alpha.
As Table 7 displays, the first model, in which only depth of vocabulary knowledge was entered using the stepwise method, could explain about 13% of the variance in C-test performance (R² = .128, F = 12.193, p < .01). Depth of vocabulary knowledge could significantly predict the total C-test performance (β = .358, t = 3.492, p < .01), while breadth of vocabulary knowledge was excluded from this model. Similarly, a single model for performance on the low-bond sub-test of the C-test significantly explained about 9% of the variance (R² = .088, F = 8.006, p < .01). Depth of vocabulary significantly predicted the low-bond C-test performance as the only predictor (β = .297, t = 2.829, p < .01).
Multiple regression analysis for the breadth and depth of vocabulary in EFL C-test performance and its sub-tests (upper-intermediate group).
Significant at p = .01.
Predictors: (Constant), depth.
As for the performance on the high-bond sub-test of the C-test, the only predictor that could explain the variation was depth of vocabulary knowledge, entered in model 1. It explained about 11% of the variation in the dependent variable, which was significant (R² = .109, F = 10.185, p < .01). In addition, depth of vocabulary, as a single variable, significantly predicted the performance on the high-bond sub-test of the C-test (β = .331, t = 3.191, p < .01).
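Because each of the upper-intermediate models retains a single predictor, the reported statistics can be cross-checked against two identities that hold for simple regression: R-squared equals the squared standardized coefficient, and F equals the squared t value. The sketch below applies these checks to the figures given in the text, taking the high-bond R-squared to be .109. The dictionary layout, function name, and tolerances are illustrative choices.

```python
# Reported values for the upper-intermediate group (depth as sole predictor).
models = {
    "total C-test": {"beta": 0.358, "t": 3.492, "r2": 0.128, "f": 12.193},
    "low-bond sub-test": {"beta": 0.297, "t": 2.829, "r2": 0.088, "f": 8.006},
    "high-bond sub-test": {"beta": 0.331, "t": 3.191, "r2": 0.109, "f": 10.185},
}


def is_consistent(m, tol_r2=0.002, tol_f=0.01):
    """Check the single-predictor identities R^2 = beta^2 and F = t^2,
    allowing for rounding in the published figures."""
    r2_ok = abs(m["beta"] ** 2 - m["r2"]) <= tol_r2
    f_ok = abs(m["t"] ** 2 - m["f"]) <= tol_f
    return r2_ok and f_ok


for name, m in models.items():
    print(f"{name}: beta^2 = {m['beta'] ** 2:.3f}, t^2 = {m['t'] ** 2:.3f}, "
          f"consistent = {is_consistent(m)}")
```

These identities apply only to models with one predictor, so they do not extend to the two-predictor models fitted with the enter method.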
Word frequency levels and C-test performance
Using the results of the VLT, we explored the role of vocabulary size in the lower-intermediate students’ C-test performance. To do this, multiple regression analyses were conducted to find how the students’ scores at each word frequency level could contribute to the prediction of their performance on the C-test and its two sub-tests. Bonferroni correction was again applied, so the results are reported at the .0166 level of significance.
Using the stepwise method, a significant model emerged for performance on the C-test as well as its high-bond sub-test (p < .01). As the results in Table 8 show, the only predictor variable entered in the model that was capable of accounting for the variance was the 5000-word frequency level. The model could explain nearly 19% of the variance in the C-test (R² = .188, F = 11.107, p < .01). The 5000-word frequency level scores significantly predicted the total C-test performance (β = .433, t = 3.333, p < .01). As for the low-bond sub-test, the model in which all the word frequency levels were entered explained 11% of the variance, which was not significant (R² = .110, F = 1.396, p > .016). In particular, the 2000-word (β = .102, t = .603, p > .016), the 3000-word (β = −.179, t = −.815, p > .016), the 5000-word (β = .406, t = 2.190, p > .016), and the 10,000-word frequency levels (β = −.092, t = −.632, p > .016) could not significantly predict performance on the low-bond C-test. For the high-bond C-test, the model could significantly explain about 22% of the variance (R² = .219, F = 13.481, p < .01). The scores of the students at the 5000-word frequency level thus significantly predicted the high-bond C-test performance (β = .468, t = 3.672, p < .01).
Multiple regression analysis for VLT levels on the total C-test performance and its two sub-tests (lower-intermediate group).
Significant at p = .01.
Predictors: (Constant), level 5000.
Predictors: (Constant), level 2000, level 3000, level 5000, level 10,000.
To examine the contribution of scores at each level of the VLT to the upper-intermediate students’ C-test performance, multiple regression analyses were again conducted, with performance on the C-test and its two sub-tests with low and high lexical bonds as the dependent variables.
The results, as shown in Table 9, indicated that the scores of the students at the 10,000-word frequency level predicted their total C-test performance. The model, in which only the 10,000-word frequency level was entered as the predictor variable, could explain about 10% of the variation (R² = .098, F = 9.056, p < .01). The 10,000-word frequency level could significantly predict the dependent variable (β = .314, t = 3.009, p < .01). For the sub-tests of the C-test, the only predictor variable was again the 10,000-word frequency level. The models could explain 8% of the variance in low-bond C-test performance (R² = .080, F = 7.176, p < .01) and about 8% of the variance in high-bond C-test performance (R² = .076, F = 6.823, p < .016). In particular, the 10,000-word frequency level significantly predicted the performance on the low-bond (β = .282, t = 2.679, p < .01) and the high-bond (β = .276, t = 2.612, p < .016) sub-tests of the C-test.
Multiple regression analysis for VLT levels on the total C-test performance and its two sub-tests (upper-intermediate group).
Significant at p = .0166.
Significant at p = .01.
Predictors: (Constant), level 10,000.
Discussion
The present study investigated the predictive power of breadth and depth of vocabulary knowledge for the test takers’ performance on the C-test. It also examined how these aspects of the test takers’ knowledge interacted with textual features such as degree of cohesion as measured through lexical bonds. Assuming different C-test performance processes depending on students’ proficiency levels, the analyses were carried out for two groups of test takers at different proficiency levels.
The analysis of the lower-intermediate group’s performance indicated that neither breadth nor depth of vocabulary knowledge was capable of predicting performance on the C-test and its two sub-tests. In contrast, depth of vocabulary knowledge acted as the predictor variable in the C-test performance and its sub-tests with low and high lexical bonds for the upper-intermediate group. Within this high-proficiency group, depth of vocabulary knowledge explained more variance in performance on the high-bond sub-test than on the low-bond sub-test (R² = .109 vs. .088).
The results lend support to previous studies on reader variables and processing strategies employed in C-test taking. It has been found, for instance, that contextual guessing, that is, using the surrounding information available in the text, is a strategy whose frequency of use is correlated with language proficiency level (Hastings, 2002; Klein-Braley, 1994; Sigott, 2004). The results of Klein-Braley (1994) indicated that low-proficiency C-test takers mostly employ strategies that are based on a minimal use of the context. Hastings (2002, cited in Babaii & Fatahi-Majd, 2014) also found that the completion of C-tests is affected by local syntactic cues as well as the global context, with high-proficiency test takers tending to attend more to the surrounding context than to the local context. Furthermore, Sigott (2004) conducted a large empirical study to investigate the C-test taking strategies employed by students from different proficiency levels and showed that test takers who got higher scores on the OQPT performed better in solving the decontextualized items than their less proficient counterparts. The findings of this study also showed that the upper-intermediate students were more successful than the lower-intermediate ones in using lexical bonds as contextual cues.
Previous research on C-test taking strategies has shown that using contextual cues, such as similarity chains, is an instance of macro-level processing (Babaii & Ansary, 2001). The use of lexical bonds, a discourse feature that extends beyond the word and sentence level and involves processing the lexical links between pairs of sentences, is also an example of high-level or macro-level processing. The present study showed that C-test taking involves this type of processing and provided counterevidence to Stemmer (1991), who argued that high-level processing is peripheral to C-test taking. The findings, however, corroborate other studies which emphasized that the C-test is an integrative test capable of tapping both micro-level and macro-level processing on the part of the test takers (Babaii & Ansary, 2001; Babaii & Jalali Moghaddam, 2006; Hastings, 2002; Sigott, 2004).
There is conflicting evidence in the literature as to whether more proficient students can benefit more from the context than those with lower language proficiency (Frantzen, 2003). Shiotsu and Weir (2007), among others, suggested that the proficiency level of the students should be identified so that we can find where on the proficiency continuum the significance of the variables is most evident. The results of this study indicated that the upper-intermediate students succeeded in using their semantic knowledge and depth of vocabulary knowledge to make use of the contextual cues of the high-bond C-test. It was found that depth of vocabulary knowledge was a predictor variable, especially in the performance on the high-bond C-test, for the upper-intermediate students. This is consistent with Wolter’s (2002) contention that “we would expect learners of higher proficiency to have more highly developed semantic networks in the L2 mental lexicon” (p. 316). Thus, we can infer that the lower-intermediate students lacked the required semantic networks in the L2 to make use of the contextual cues. The results also support Haastrup (1991), who pointed out that L2 learners must have a basic proficiency level as the starting point to be able to use the contextual cues provided in the text. Laufer (1997) also mentioned that when learners have not reached this threshold, the contextual cues are of no use. Walters (2006) likewise found that learners at different proficiency levels seem to gain from different approaches to lexical inferencing: lower-level learners seemed to benefit from general inferencing procedures, whereas higher-level learners benefited more from instruction in the recognition and interpretation of contextual cues.
The findings further support Kaivanpanah and Alavi (2008), who stated that “advanced L2 learners tend to infer more since they know enough words and on the basis of the sufficient and clear context created by the known words they feel they can infer the meaning of unknown words” (p. 178). They explained that low-proficiency learners are less able to use the contextual cues than their high-proficiency counterparts. In short, as learners gain proficiency, they develop their semantic network and become able to draw on it when using the contextual cues of the text, such as lexical cohesion and lexical bonds.
The results showed that depth of vocabulary knowledge was associated with the use of lexical bonds. In other words, the semantic knowledge of words associated with depth of vocabulary could help the learners locate the lexical bonds and use lexical cohesion as a clue to restore the mutilated words. As found by Nassaji (2004, 2006), participants with more depth of vocabulary knowledge could succeed more in lexical inferencing and depth “made a significant contribution to inferential success” (p. 387). It was mentioned earlier that lexical inferencing is a meaning-construction process that could be considerably affected by the nature of the learners’ semantic system. This semantic system is related to depth of vocabulary knowledge, which involves the syntagmatic and paradigmatic relations between words and thus taps a deeper knowledge of words.
With regard to the role of each of the VLT levels in the C-test performance and its two sub-tests with low and high lexical bonds, the results showed that more proficient test takers possessed a larger vocabulary size than the less proficient ones. In this study, the 10,000-word frequency level acted as the predictor variable for the upper-intermediate group, whereas for the lower-intermediate group, the 5000-word frequency level was entered into the model. This can be explained by the discriminatory power of the VLT for determining the learners’ language proficiency level. In other words, in line with the argument set forth by Laufer and Nation (1999), the VLT is a valid measure of vocabulary development and can be used for placement purposes. Zhang and Anual (2008) found a significant level of correlation between vocabulary and reading at the 2000-word and 3000-word frequency levels. The present study, by contrast, found a significant relationship between the lower-frequency word levels (5000 and 10,000) and performance on the C-test and its sub-tests with low and high lexical bonds.
Conclusion
This study demonstrated that EFL students’ breadth and depth of vocabulary knowledge played dissimilar roles in their text-processing ability at low and high language proficiency levels. The independent variables interacted differently with the total C-test performance and its sub-tests with low and high lexical bonds for the two proficiency levels. The results have implications for teachers, materials writers, test developers, and researchers. The findings lend support to the necessity of instructed vocabulary learning, especially at lower levels of language proficiency where vocabulary knowledge is mostly a matter of size rather than quality. Such instruction may be planned to develop the learners’ knowledge of word associations and lexical inferencing ability (Nassaji, 2006). For materials writers, the findings advocate the adoption of a broader view of vocabulary knowledge (Brown, 2011). The inclusion of activities tapping depth of vocabulary knowledge and lexical networks is deemed necessary. Developers of text-dependent tests should consider using texts with more lexical cohesion and bonds to trigger more macro-level processing on the part of the test takers and to encourage them to look for sentential and cohesive clues (Babaii & Ansary, 2001). The findings may also be used as evidence that lexical bonds, as discourse features of a text, should be considered in determining the lexical difficulty of texts in text-dependent tests like reading comprehension and the C-test. Different types of C-tests (e.g. with low and high lexical bonds) can also be developed to examine different aspects of language proficiency (Daller & Grotjahn, 1999), such as the learners’ depth of word knowledge and lexical inferencing skill.
A number of limitations, however, may have jeopardized the generalizability of the findings. First, the data were collected from a convenience sample of 135 Iranian EFL students studying at the undergraduate level. Future research, therefore, could be done in various contexts and using a larger group of participants to confirm the findings of this study. In addition, the proficiency levels addressed in the present study were limited to lower-intermediate and upper-intermediate levels. The difference between these two levels may not be as distinct as the difference between elementary and advanced levels. Consequently, future studies may address and compare more levels to find out how breadth and depth of vocabulary knowledge of students from other proficiency levels interact with the use of lexical bonds, as contextual cues. Finally, previous studies have shown that factors such as average sentence length and type–token ratio (Klein-Braley, 1985), syntactic complexity (Dörnyei & Katona, 1992), and deletion patterns (Sigott & Köberl, 1996) can determine the difficulty level of the C-test items. The current study did not take into account the effect of these factors and instead controlled the lexical difficulty of the texts by means of an online tool, as explained. Future research may thus control the effect of the aforementioned variables or replicate the same study with other deletion patterns.
Appendix 1
Appendix 2
Sample analysis of a high-bond text.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
