Abstract
Technology now reaches every part of daily life. Education has been affected by these technological improvements, and the use of technological devices in teaching has become inevitable. Language teaching is no exception, and Mobile-Assisted Language Learning has gained importance in recent years. In light of these developments, the aim of this study is to understand the effect of annotation use on the vocabulary recall and retention levels of students of English as a foreign language. A software application called Vocastyle was developed for this study. It included annotations that helped students learn and remember new vocabulary items. The participants were 122 students of a state high school that makes use of multimedia learning in a major city in Turkey. To understand whether the use of annotations caused any difference in students’ levels of vocabulary recall and retention, a pretest, a posttest, and a delayed posttest were administered. Quantitative data were analyzed with descriptive statistics, chi-square, multivariate analysis of variance, and Kruskal–Wallis and Mann–Whitney U tests. The results implied that learners who used multimedia annotations recalled and retained vocabulary better than learners who used paper-based annotations and learners who received no treatment at all.
Introduction
With globalization and advancing technology, the student profile has changed in line with new needs. In addition, the amount of information that students are exposed to is increasing rapidly, as is the number of distinct ways of obtaining it. Education systems should therefore be improved not only to meet today’s needs but also to anticipate the possible future needs of students. Traditional methods of teaching and learning should be updated toward lifelong and unlimited education, as they fail to meet what modern society needs. Given the rapid changes in information and technology, it is inevitable that education is affected. With the emergence of lifelong learning and the need to incorporate technology into education systems and learning environments, the instructional materials, methods, and techniques used in such environments have been replaced by more contemporary, technology-aided versions. Technology has by now been adopted in almost all fields of education, and language learning and teaching is among the most popular of them.
There have been many studies on the use of technology in the language learning process, especially in vocabulary learning. Subsequently, researchers began to look for an alternative to computer-assisted language learning. As a result of this research, the concept of Mobile-Assisted Language Learning (MALL) emerged. Because mobile devices offer connectivity, social interactivity, context sensitivity, portability, and individuality, which personal computers may not (Klopfer, Sheldon, Perry, & Chen, 2012), they have transformed the way we learn and expanded our horizons by making learning portable, real-time, and cooperative (Kukulska-Hulme, 2009; Wong & Looi, 2011). According to Laufer and Hulstijn (2001), “Search is the attempt to find the meaning of an unknown L2 (foreign language) word or trying to find the L2 word form expressing a concept (e.g., trying to find the L2 translation of an L1 (first language) word) by consulting a dictionary or another authority (e.g., a teacher)” (p. 14).
There have been a number of theories on MALL. This study relates to two of them: Dual Coding Theory (DCT; Paivio, 1990) and the Generative Theory of Multimedia Learning (Mayer, 2001). The former postulates a pictorial–verbal system for knowledge construction, in which a verbal system deals directly with language and a nonverbal (pictorial) system deals with nonlinguistic objects, elements, and events. The latter puts forward that information, both verbal and visual, is accessed consecutively in short-term memory. Working memory then comes into play as the place where the information is processed with both verbal and visual representations into a holistic form, leading to a more complete understanding of the information.
Among the many components of language learning, such as learning styles, vocabulary and reading have also attracted much attention from researchers recently. Vocabulary learning is an indispensable part of learning a new language (Nation, 2001). Much vocabulary might be learnt incidentally through reading (Nagy, 1997). Tassana-ngam (2004) states that vocabulary is quite influential on reading skill. It might particularly facilitate second language learners’ comprehension of a written text when the learners’ vocabulary knowledge is below the threshold minimum of approximately 3,000 words. Reading “large quantities of materials that is within learners’ linguistic competence” (Grabe & Stoller, 2002, p. 259) facilitates vocabulary learning by providing chances for inferring word meaning in context (Krashen, 2003).
Nevertheless, retaining and retrieving vocabulary is as important as learning it. Many learners find it hard to remember the words they have studied before, yet learning occurs only when a learner is able to recall a previously studied vocabulary item. Thornbury (2002) states that remembering what has been studied is the key point for a vocabulary item to be learnt. The main question, then, is what should be done to attain better vocabulary recall and retention. As a response to this problem, Craik and Lockhart (1972) suggested the depth of processing theory, which claims that successful retention of a word relies heavily on how deeply it is processed at sensory level. Accordingly, more permanent memory traces require deeper processing. To attain better recall of words from long-term memory, it is necessary to use the given information in a sentence or in a context so that the definition is noticed. Craik and Tulving (1975) claim that good retention depends on attention to the word’s meaning. Craik and Lockhart also state that storing information in long-term memory does not depend on how long it is kept in short-term memory but on how deeply it is processed. Further support for recall and retention of vocabulary comes from Laufer and Hulstijn’s (2001) “task induced involvement load model.” This model builds on the depth of processing model and was later applied to the second language context. According to this model, “Involvement is perceived as a motivator-cognitive construct which can explain and predict learners’ success in the retention of hitherto unfamiliar words” (Laufer & Hulstijn, 2001, p. 14). The notion of involvement has three factors: need, search, and evaluation.
The motivational construct of need concerns one’s will to achieve, whether the task is imposed by external agents or comes from inner sources. External imposition may refer to cases in which the learner is asked by the instructor to use a vocabulary item in a sentence. Laufer and Hulstijn (2001) call this a moderate need. Nation (2001, p. 71) puts it as follows: “Need is moderate if the task requires the target vocabulary.” The other two components of involvement, search and evaluation, are both cognitive.
While the theory and the abovementioned model account for better vocabulary recall and retention outcomes, Laufer (2006) claims that “learners do not necessarily notice unfamiliar words in the input” (p. 152), so explicit and discrete learning might be profitable for increasing the depth of word knowledge, expanding vocabulary size, and facilitating enriched use of lexis. Thus, the use of annotations might be an alternative strategy to make the input more explicit. Annotations are regarded as practical in second language reading and vocabulary learning because words or phrases that are beyond the learners’ actual competence may be provided through annotations (Widdowson, 1984). Multimedia glosses with annotations have gained attention as a form of input enhancement for vocabulary learning and reading comprehension, as computers have made them technologically feasible, the integration of technology into teaching has accelerated, and Computer-Assisted Language Learning (CALL) has been implemented both in and outside formal classroom environments.
All of the developments explained earlier have made annotations important for the vocabulary learning process. Wolfe (2002) explained that annotations have important advantages for and positive effects on vocabulary learning. The advantages of annotations are that they develop understanding of source material, provide quotes for later review, enable critical thinking, comprehension, and commenting, and record intermediate and unselfconscious reactions to the text. The effects of annotations, in turn, are that they improve recall of emphasized items, affect perception of specific arguments, and decrease tendencies to summarize unnecessarily.
In light of the abovementioned explanations, this study aims to investigate the relationship between high school students’ annotation preferences and their levels of vocabulary recall and retention. The research question and subquestions of this study are as follows:
1. Is there a significant difference among the three groups (with multimedia annotation, with paper-based annotation, and with no annotation) in terms of vocabulary learning and retention? If so, what particular types of annotations affect vocabulary achievement of target words?
1a. Is there a significant difference among the three groups (high, mid, low annotation users) in terms of immediate vocabulary recall in the Mobile-Assisted Vocabulary Learning environment?
1b. Is there a significant difference among the three groups (high, mid, low annotation users) in terms of vocabulary retention in the Mobile-Assisted Vocabulary Learning environment?
Methodology
Research Design
A 1 × 4 × 2 factorial design was used to explore the effects of hypermedia annotation types and different learning styles on the Mobile-Assisted Vocabulary Learning and retention levels of English learners through reading texts. The first factor, number of annotation uses, has one level; the second factor, type of annotation, has four levels (1. Text, 2. Pictures, 3. Audio, 4. Video); and the third factor, learner styles, has two levels (1. Visual and 2. Auditory).
Participants
The participants of the study were five 10th-grade classes (122 students in total) of a state high school in Kocaeli, Turkey, in which the Fatih Project is applied. Among these five classes, Classes A and B were taught by Teacher X, Classes C and D were taught by Teacher Y, and the last class, Class E, was taught by Teacher Z. One class each from Teachers X and Y was assigned as a treatment group (A and C), and Classes B and D were assigned as control groups. While the former groups used the multimedia Vocastyle app, the latter (B and D) were supplied with vocabulary journals as compensation, in the form of paper-based annotation vocabulary practice. Finally, Class E was assigned as the pure control group, which received no treatment at all. Each class had around 25 students, and the students shared similar educational backgrounds.
In this high school, 49 students made up the two classes in which the software developed by the researcher to enhance the vocabulary learning process was used. The software is called “Vocastyle,” and it aims to support students’ vocabulary learning. Vocastyle included reading texts from the coursebook Yes You Can A.2.2., which is used in state schools in Turkey and is prepared and provided by the Ministry of Education. The reading texts in Vocastyle included annotations for each target vocabulary item.
Materials
Reading texts
The reading texts were chosen directly from the participants’ official coursebook to supplement their reading habits with multimedia annotation support. They were mostly about science and youth. The list of reading texts and a sample are given in Figures 2 and 3. As the application was designed to supplement the English coursebook “Yes You Can A.2.2,” which is in use in the 10th grade of state schools in which multimedia learning is prevalent, it aims to support students’ vocabulary learning. Accordingly, all target words from the reading texts were selected from that coursebook.
The layout of the Vocastyle.
The list of reading texts in a unit in the Vocastyle.
The page for a reading text.


Data collection tools
Data were collected through two tools: the Vocastyle log files and Reid’s Perceptual Learning Style Preference Questionnaire (PLSPQ). Vocastyle provided a log file that recorded the number of annotations used, which fed into the data analysis, and kept track of each student’s preferred annotation types. Data collection started at the beginning of the spring semester of the 2015 to 2016 academic year and was completed at the end of the same semester. Data for this study came from the following sources.
(a) Vocastyle app and log files:
For this study, an Android application (Vocastyle) was developed by the researcher. The software was designed to supplement the English coursebook “Yes You Can” A.2.2, which is in use in the 10th grade of state schools in which multimedia learning is prevalent, and it aims to support students’ vocabulary learning. Accordingly, the target words were all selected from the reading texts of that coursebook by joint decision with the English teachers at the school. These target words had key roles in forming the meaning of the reading texts and were therefore provided with annotations in the Vocastyle app. When learners came across an unfamiliar vocabulary item in a hypermedia reading text in Vocastyle, a number of different annotations (text, audio, graphic, and video) were available. These annotations were checked twice by the teachers at the school and by an expert from the multimedia learning field to avoid misunderstandings. The final versions of the annotations were shown to participants after a joint decision by the teachers and the expert mentioned. Once participants clicked on a target word, options appeared from which they selected a type of annotation. They were then able to view their preferred annotation types as many times as they wished. The software kept track of each student’s preferred annotation types, and the application provided a log file giving the number of annotations used, which was used in the data analysis.
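The per-student tracking described above can be illustrated with a minimal sketch. The log format used here (one comma-separated line per annotation view, holding a student ID, the target word, and the annotation type) is an assumption made for illustration only; the actual Vocastyle log layout is not specified in this paper, and the function names are hypothetical.

```python
from collections import Counter, defaultdict

# The four annotation types offered by the app, as listed in the text.
ANNOTATION_TYPES = ("text", "audio", "graphic", "video")


def tally_annotation_use(log_lines):
    """Count annotation views per student and per annotation type.

    Assumed (hypothetical) line format: "student_id,target_word,annotation_type".
    """
    counts = defaultdict(Counter)
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        student_id, _word, annotation_type = line.split(",")
        if annotation_type not in ANNOTATION_TYPES:
            raise ValueError(f"unknown annotation type: {annotation_type!r}")
        counts[student_id][annotation_type] += 1
    return counts


def total_use(counts, student_id):
    """Total number of annotation views recorded for one student."""
    return sum(counts[student_id].values())


# Made-up log lines for illustration; these are not data from the study.
sample_log = [
    "s01,advisor,text",
    "s01,advisor,video",
    "s01,claim,video",
    "s02,in addition,audio",
]
usage = tally_annotation_use(sample_log)
```

Counts of this kind, per type and in total, are what the later analyses of annotation use would take as input.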
Vocastyle was designed as a plain and easy-to-use program. As illustrated in Figure 1, the interface of the program is uncomplicated, which makes it user friendly. Once students opened Vocastyle, they were presented with this interface, where they could choose the unit they wanted to study.
When students clicked on the unit, the next page included the reading texts in that unit. Students could easily choose the reading text they wanted to study. The page including the list of reading texts can be seen in Figure 2.
When students clicked on the name of the reading text they wanted to study, they were presented with the full reading text, with the target vocabulary written in blue and underlined, as shown in Figure 3.
The reading text in Figure 3 was “Anita’s Letter of Complaint.” The target vocabulary items for this reading text were “claim,” “in addition,” and “advisor.” When students clicked on the vocabulary item they wanted to study, the software provided the student with the annotations which can be seen in Figure 4.
The interface of annotations for the target vocabulary item.
After students saw this page, they chose the annotation they wanted to study. For text annotation, the Vocastyle app provided word class of the vocabulary item, its English equivalent, example sentences, and synonyms. The page of text annotation for the word “advisor” can be seen in Figure 5, as an example.
The page of text annotation for “advisor.”
The audio annotation for each word included the pronunciation of the word. When students clicked on the audio annotation, they were shown a black page and listened to the audio. The page can be seen in Figure 6.
The page of audio annotation for “advisor.”
The graphic annotation for each word included pictures related to the word. When students clicked on the graphic annotation, they were shown the pictures. The page of graphic annotation for the word “advisor” can be seen in Figure 7.
The page of graphic annotation for “advisor.”
The video annotation for each word included a video related to the word. When students clicked on this annotation, they could watch the video as many times as they wanted. The page of video annotation for the word “advisor” can be seen in Figure 8.
The page of video annotation for “advisor.”
An example of the log file mentioned earlier, which tracked the preferences of each student during the 1-month treatment process, can be seen in Figure 9.
Log file.
(b) Reid’s Perceptual Learning Style Preference Questionnaire

Reid’s PLSPQ originally includes six subdimensions: auditory (learning by listening to audios, tapes, and people), visual (learning by reading and studying charts, graphics, and diagrams), kinesthetic (learning by physical participation), tactile (hands-on learning, e.g., by doing lab experiments or building models), group (learning by studying with other learners in a group), and individual (studying in isolation). This study focuses specifically on vocabulary learning in a multimedia learning environment with tablets, and using a tablet only requires seeing and listening to the target words and their related annotations, as well as touching the screen to pick the preferred learning styles, so only two types of perceptual learning styles were used in the application. Therefore, the findings obtained from the visual and aural parts of the questionnaire were used in the data analysis. After two subdimensions were eliminated, the scale included 30 items. It is a 5-point Likert scale ranging from 5 = strongly agree to 1 = strongly disagree. As mentioned earlier, the study concentrates on vocabulary learning, so the questionnaire was adapted with a specific focus on vocabulary learning; for example, the item “I learn better when I see things” was adapted to “I learn vocabulary better when I see it on the board.”
Reid’s Perceptual Learning Style Questionnaire was piloted among students of another state school. As mentioned earlier, it was adapted to the vocabulary component. Students were asked what they understood when they read the items, and slight modifications, such as changes in word choice, were made based on their comments. The scale was translated into Turkish by an expert and then back-translated into English by another expert, and a final version was prepared by a native speaker of English, who checked whether any meaning was lost during the translation processes. The findings from this questionnaire were expected to give detailed insight into each learner’s preferred annotation type and its effect on short-term and long-term retention of words.
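To make the use of the visual and auditory subscales concrete, the following sketch scores 5-point Likert responses and reports the subscale with the higher mean. Both the item identifiers and the decision rule (higher subscale mean wins, equal means count as no clear preference) are assumptions made for illustration; the paper does not state how learners were classified, and this is not Reid’s official scoring procedure.

```python
from statistics import mean

# Hypothetical item-to-subscale mapping; the real item numbers are not given here.
SUBSCALE_ITEMS = {
    "visual": ("v1", "v2", "v3"),
    "auditory": ("a1", "a2", "a3"),
}


def subscale_means(responses):
    """Average the 1-5 Likert responses within each subscale."""
    for item, value in responses.items():
        if value not in (1, 2, 3, 4, 5):
            raise ValueError(f"{item}: response {value!r} is outside the 1-5 scale")
    return {
        style: mean(responses[item] for item in items)
        for style, items in SUBSCALE_ITEMS.items()
    }


def preferred_style(responses):
    """Return the subscale with the higher mean (assumed decision rule)."""
    means = subscale_means(responses)
    if means["visual"] == means["auditory"]:
        return "no clear preference"
    return max(means, key=means.get)


# Made-up responses for one learner (5 = strongly agree, 1 = strongly disagree).
example = {"v1": 5, "v2": 4, "v3": 4, "a1": 2, "a2": 3, "a3": 3}
style = preferred_style(example)
```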
(c) Vocabulary tests
Vocabulary tests are useful instruments in measuring one’s knowledge of vocabulary. Vocabulary achievement of target words has been regarded as the sum of intercorrelated subknowledges like knowledge of spoken and written form, knowledge of morphology, semantics, collocations, connotations, and associations, and the use of social and other factors (Nation, 1990, 2001; Richards, 1976; Ringbom, 1987).
Vocabulary tests are diverse, and researchers differ in their preference for a specific subknowledge and in their interest in vocabulary size or depth. In this study, we opted for vocabulary size tests, although we acknowledge the importance of vocabulary depth, because size tests have been given importance in predicting achievement in literacy (reading and writing) and general language proficiency as well as academic success (Laufer, 1997; Saville-Troike, 1984). Size tests can be used practically as an instrument that gives researchers the vocabulary size of participants before the treatment and displays growth after the treatment.
In this study, the researcher developed a vocabulary test as a measurement instrument for the analysis, especially of short-term and long-term vocabulary recall and retention. First, target words were determined in coordination with the English teachers at the school. Initially, the number of target words was 49. Then a vocabulary knowledge scale (Paribakht & Wesche, 1997) was adapted and administered to participants, and a word was eliminated as familiar even if only one student knew it. In the end, 38 words were left, forming the final version of the vocabulary test. The items and alternatives were tested several times to eliminate ineffective distractors. An expert from the field was consulted in forming the final version, and in light of his recommendations, two alternatives were substituted with their equivalents. As the test was administered in pretest, posttest, and delayed posttest conditions, the alternatives in the items were shuffled to prevent familiarity. All groups took the tests at the same time under the same conditions.
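The shuffling of alternatives across the three administrations can be sketched as follows. The item content, the dictionary layout, and the use of a separate random seed per test form are all assumptions made for illustration; the paper does not describe how the shuffling was actually carried out.

```python
import random


def shuffle_alternatives(item, rng):
    """Return a copy of a multiple-choice item with its alternatives reordered.

    The correct answer is tracked by value, so the answer key stays valid
    after the alternatives have been shuffled.
    """
    alternatives = list(item["alternatives"])
    rng.shuffle(alternatives)
    return {
        "stem": item["stem"],
        "alternatives": alternatives,
        "answer": item["answer"],
        "answer_index": alternatives.index(item["answer"]),
    }


def build_test_form(items, seed):
    """Build one test form; a different seed gives an independently shuffled form."""
    rng = random.Random(seed)
    return [shuffle_alternatives(item, rng) for item in items]


# A made-up item for illustration; it is not taken from the actual test.
items = [
    {
        "stem": "advisor",
        "alternatives": [
            "a person who gives advice",
            "a letter of complaint",
            "a kind of shop",
            "a school subject",
        ],
        "answer": "a person who gives advice",
    },
]

pretest = build_test_form(items, seed=1)
posttest = build_test_form(items, seed=2)
delayed_posttest = build_test_form(items, seed=3)
```

Seeding each form separately keeps every form reproducible, which would make it possible to regenerate the same answer key when scoring.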
(d) Vocabulary journals
As stated earlier, vocabulary journals were supplied to Classes B and D as a compensation strategy, in the form of paper-based annotation vocabulary practice. The students in these classes were given the target words traditionally, without being exposed to the Vocastyle app. They only received vocabulary journals in which they were asked to write the definitions of the target words in their own words, synonyms/antonyms, and a picture or a memory aid. They were expected to complete the journal between being introduced to the target words and the following class. An example of a vocabulary journal made by students can be seen in Figure 10.
An example vocabulary journal.
Results
The data gathered were analyzed with SPSS 18.0. The descriptive statistics, multivariate analysis of variance (MANOVA), Kolmogorov–Smirnov normality tests, Kruskal–Wallis, Mann–Whitney U tests, and post hoc tests were used for analysis.
Descriptive Statistics of Pretest, Posttest, and Delayed Posttest Vocabulary Achievement Mean Scores of Each Group (N = 135).
Note. SD = standard deviation.
As is clear in Table 1, the posttest scores of the multimedia annotation group, the paper-based annotation group, and the no-annotation group are all higher than their pretest scores. Although posttest scores are higher than pretest scores in each group, this progress appears greater in the multimedia annotation group than in the paper-based annotation and no-annotation groups. Also, the posttest mean score of multimedia annotation group is
MANOVA Test Results of Vocabulary Recall and Retention Levels in Terms of Groups.
Note. df = degrees of freedom; MANOVA = multivariate analysis of variance.
Tukey HSD Test Results for Vocabulary Recall Level of Students in Multimedia Annotation Group and Paper-Based Annotation Group.
Note. HSD = Honest Significant Difference; SD = standard deviation.
Tukey HSD Test Results for Vocabulary Recall Level of Students in Multimedia Annotation Group and No-Annotation Group.
Note. HSD = Honest Significant Difference; SD = standard deviation.
Tukey HSD Test Results for Vocabulary Retention Level of Students in Multimedia Annotation Group and Paper-Based Annotation Group.
Note. HSD = Honest Significant Difference; SD = standard deviation.
Tukey HSD Test Results for Vocabulary Retention Level of Students in Multimedia Annotation Group and No-Annotation Group.
Note. HSD = Honest Significant Difference; SD = standard deviation.
According to the results presented in Tables 5 and 6, significant differences were found between multimedia annotation group (
These results suggest that the learners who used annotations more performed better, while learners who used annotations less performed worse, in the vocabulary achievement delayed posttest. In other words, learners who used more annotations have higher recall and retention levels compared with those who used fewer annotations.
As this study focuses mainly on hypermedia annotations and their effect on learners’ vocabulary achievement scores, the findings are discussed with specific reference to the multimedia annotation group, namely, their annotation preferences and their posttest and delayed posttest vocabulary achievement scores. Therefore, there are two more subresearch questions. While Research Question 1a examines whether annotation use has an effect on vocabulary recall, Research Question 1b examines whether annotation use has an effect on vocabulary retention.
Results of Normality Tests of Posttest and Delayed Posttest of Multimedia Annotation Group.
Table 7 shows that the multimedia annotation group posttest scores (p = 0.014 < 0.05) and delayed posttest scores (p = 0.049 < 0.05) were not normally distributed. Thus, nonparametric analyses were used. For Research Questions 1a and 1b, Kruskal–Wallis test was used to examine whether there is a significant difference among three groups in terms of their vocabulary achievement posttest scores and delayed posttest scores according to their annotation use. Then, Mann–Whitney U test was used to test possible significant differences among groups.
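As an illustration of the nonparametric comparison named above, the following is a minimal pure-Python sketch of the Kruskal–Wallis H statistic with tie correction. It is not the SPSS procedure used in the study. The p-value relies on the chi-square approximation and is computed in closed form only for three groups (df = 2), where the upper-tail probability equals exp(-H/2). The scores in the example are made up and are not data from the study.

```python
import math
from itertools import chain


def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks are 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        tie_sizes.append(end - start + 1)
        start = end + 1
    return ranks, tie_sizes


def kruskal_wallis(*groups):
    """Return (H, df, p) for three independent groups, with tie correction."""
    if len(groups) != 3:
        raise ValueError("this sketch handles exactly three groups")
    pooled = list(chain.from_iterable(groups))
    n_total = len(pooled)
    ranks, tie_sizes = average_ranks(pooled)
    weighted = 0.0
    offset = 0
    for group in groups:
        rank_sum = sum(ranks[offset:offset + len(group)])
        weighted += rank_sum ** 2 / len(group)
        offset += len(group)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    tie_correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n_total ** 3 - n_total)
    if tie_correction == 0:
        raise ValueError("all observations are identical")
    h /= tie_correction
    df = len(groups) - 1
    return h, df, math.exp(-h / 2)  # chi-square upper tail, valid for df = 2


# Made-up posttest scores for low, mid, and high annotation users.
low, mid, high = [12, 15, 14, 10], [18, 20, 17, 19], [25, 27, 24, 30]
h_stat, df, p_value = kruskal_wallis(low, mid, high)
```

A significant H only indicates that at least one group differs; pairwise follow-up comparisons are still needed to locate the difference.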
Descriptive Statistics for Annotation Use and Immediate Vocabulary Recall Levels of Multimedia Annotation Group Students (N = 49).
Note. SD = standard deviation.
Descriptive Statistics and Kruskal–Wallis Test Results of the Effect of Total Multimedia Annotation Use on Vocabulary Recall Levels of Multimedia Annotation Group Students.
Note. df = degrees of freedom; SD = standard deviation.
According to the Kruskal–Wallis test results, total annotation use caused significant difference on students’ vocabulary learning (χ2 = 19.962, p = 0.000 < 0.05). Mann–Whitney U test was applied to the variable of annotation use total number groups. According to the results of Mann–Whitney U tests, there is a significant difference between vocabulary recall levels of students with low-level (
Descriptive Statistics and Kruskal–Wallis Test Results About the Effect of Use of Annotation “Text” on Vocabulary Learning.
Note. df = degrees of freedom; SD = standard deviation.
Descriptive Statistics and Kruskal–Wallis Test Results About the Effect of Use of Annotation “Audio” on Vocabulary Learning.
Note. df = degrees of freedom; SD = standard deviation.
Descriptive Statistics and Kruskal–Wallis Test Results About the Effect of Use of Annotation “Graphic” on Vocabulary Learning.
Note. df = degrees of freedom; SD = standard deviation.
Descriptive Statistics and Kruskal–Wallis Test Results About the Effect of Use of Annotation “Video” on Vocabulary Learning.
Note. df = degrees of freedom; SD = standard deviation.
According to the Kruskal–Wallis test results, use of “video” annotation caused significant difference on students’ vocabulary learning. Mann–Whitney U test was conducted to see which groups differed significantly. The results show that there is a significant difference between vocabulary recall levels of students with low-level (
With these Mann–Whitney U test results, the analysis for this research question was completed. Before continuing with Research Question 1b, the delayed posttest scores of the multimedia annotation group were checked for normality. Because n > 30, the Kolmogorov–Smirnov test was applied. The delayed posttest scores (p = 0.049 < 0.05) were found not to be normally distributed. Thus, nonparametric analysis was used for this research question. The Kruskal–Wallis test was used to examine whether there is a significant difference among the three groups in terms of their vocabulary achievement posttest scores and delayed posttest scores according to their annotation use. Then, the Mann–Whitney U test was used to test possible significant differences among groups.
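The pairwise follow-up step can be illustrated with a short sketch of the Mann–Whitney U statistic for two independent samples, using mean ranks for ties. Only the U values are computed here; no p-value is derived, and the scores are made up for illustration rather than taken from the study.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples.

    U_a counts the (a, b) pairs in which the value from sample_a exceeds the
    value from sample_b, with each tied pair counted as one half.
    """
    pooled = list(sample_a) + list(sample_b)
    order = sorted(range(len(pooled)), key=lambda i: pooled[i])
    ranks = [0.0] * len(pooled)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and pooled[order[end + 1]] == pooled[order[start]]:
            end += 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = (start + end) / 2 + 1  # mean rank for tied values
        start = end + 1
    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Made-up delayed posttest scores for low and high annotation users.
low_users = [11, 13, 12, 15]
high_users = [22, 19, 25, 21]
u_low, u_high = mann_whitney_u(low_users, high_users)
```

With three usage groups, this comparison would be run once per pair (low vs. mid, low vs. high, mid vs. high).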
Descriptive Statistics for Annotation Use and Vocabulary Retention Levels of Multimedia Annotation Group Students (N = 49).
Note. SD = standard deviation.
Descriptive Statistics and Kruskal–Wallis Test Results of Effects of Total Multimedia Annotations Use on Vocabulary Retention Levels of Multimedia Annotation Group Students.
Note. df = degrees of freedom; SD = standard deviation.
According to the Kruskal–Wallis test results, total multimedia annotation use caused significant difference on students’ vocabulary retention levels (χ2 = 19.936, p = 0.000 < 0.05). These results suggest that the learners who used multimedia annotations more performed better than the learners who used multimedia annotations less in vocabulary achievement delayed posttest. Separate Mann–Whitney U tests were applied to see the effect of annotation use on vocabulary retention among groups. According to the Mann–Whitney U tests’ results, there is a significant difference between vocabulary retention levels of students with low-level (
Descriptive Statistics and Kruskal–Wallis Test Results of Effects of Text Annotation Use on Vocabulary Retention Levels of Multimedia Annotation Group Students.
Note. df = degrees of freedom; SD = standard deviation.
Descriptive Statistics and Kruskal–Wallis Test Results of Effects of Audio Annotation Use on Vocabulary Retention Levels of Multimedia Annotation Group Students.
Note. df = degrees of freedom; SD = standard deviation.
Descriptive Statistics and Kruskal–Wallis Test Results of Effects of Graphic Annotation Use on Vocabulary Retention Levels of Multimedia Annotation Group Students.
Note. df = degrees of freedom; SD = standard deviation.
Descriptive Statistics and Kruskal–Wallis Test Results of Effects of Video Annotation Use on Vocabulary Retention Levels of Multimedia Annotation Group Students.
Note. df = degrees of freedom; SD = standard deviation.
According to the Kruskal–Wallis test results, video annotation use caused significant difference on students’ vocabulary retention levels (χ2 = 26.315, p = 0.000 < 0.05). These results suggest that the learners who used video annotations more performed better than the learners who used annotations less in vocabulary achievement delayed posttest. Mann–Whitney U tests were applied to see the effect of video annotation use on vocabulary retention among groups. The results showed that there is a significant difference between vocabulary retention levels of students with low-level (
Discussion and Conclusion
This study aimed to examine whether there is a significant difference among three groups (with multimedia annotation, with paper-based annotation, and with no annotation) in terms of vocabulary learning and retention, and if so, what particular types of annotations affect vocabulary achievement of target words.
Findings obtained from the vocabulary achievement tests indicated that, in the pretest condition, the mean score of the multimedia annotation group was the lowest of all groups. The mean score of the no-annotation group was slightly higher than that of the paper-based annotation group. In the posttest condition, the mean score of the multimedia annotation group increased drastically, that of the paper-based annotation group increased to some extent, and that of the no-annotation group displayed a slight increase compared with the pretest condition. Finally, in the delayed posttest condition, scores decreased to varying degrees in all groups, but the highest mean score still belonged to the multimedia annotation group. The findings revealed significant differences among the mean difference scores of students in each group, revealing the effect of Vocastyle on both vocabulary recall and retention. These findings might be interpreted as showing that multimedia annotations, and therefore the Vocastyle software, affected the recall and retention levels of students in the multimedia annotation group. The findings also indicate significant differences between the multimedia annotation group and the paper-based annotation group, and between the multimedia annotation group and the no-annotation group, in terms of both vocabulary recall and vocabulary retention levels. These findings suggest that learners who used multimedia annotations recalled more and retained more than learners who received paper-based annotations and learners who received no annotation. The effect of paper-based annotations should not be ignored, as there were mean differences in both the posttest and delayed posttest conditions between the paper-based annotation group and the no-annotation group in favor of the paper-based annotation group. Although the differences in both conditions were not statistically significant, the progress of those students should not be underestimated.
Still, it should be noted that, during the treatment, all three groups were exposed to the traditional learning environment in addition to the purposeful treatment given to the multimedia annotation and no-annotation groups. Part of the progress, regardless of group and treatment, might therefore be explained by the traditional learning environment, such as coursebooks, workbooks, and additional worksheets delivered by class teachers. Another possible interpretation is that students in the multimedia annotation and no-annotation groups might have put additional effort into their studies because they knew they were being observed, so some of the learners, though not all, might have affected the results slightly. Therefore, some of the progress of learners in the multimedia annotation and no-annotation groups might be explained by the Hawthorne effect. Yet, the findings still seem clear about the effect of Vocastyle software, as the mean difference scores and the significant differences between the multimedia annotation and paper-based annotation groups, and between the multimedia annotation and no-annotation groups, are clear enough to support this claim.
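The mean difference scores discussed above can be illustrated with a minimal sketch. It assumes that recall is measured as posttest minus pretest and retention as delayed posttest minus pretest; this operationalization, the function names, and all scores below are hypothetical and are not taken from the study's data or analysis.

```python
from statistics import mean

# Hypothetical test scores for illustration only; these are NOT the study's data.
scores = {
    "multimedia":  {"pre": [4, 5, 3], "post": [12, 14, 10], "delayed": [10, 12, 8]},
    "paper_based": {"pre": [5, 6, 4], "post": [8, 9, 7],    "delayed": [7, 8, 6]},
    "none":        {"pre": [6, 6, 5], "post": [7, 7, 6],    "delayed": [6, 6, 5]},
}


def mean_gain(before, after):
    """Mean of per-student difference scores (after minus before)."""
    return mean(a - b for a, b in zip(after, before))


def summarize(groups):
    """Return assumed recall and retention gains for every group."""
    return {
        name: {
            # Assumption: recall gain = posttest - pretest.
            "recall": mean_gain(tests["pre"], tests["post"]),
            # Assumption: retention gain = delayed posttest - pretest.
            "retention": mean_gain(tests["pre"], tests["delayed"]),
        }
        for name, tests in groups.items()
    }


summary = summarize(scores)
for group, gains in summary.items():
    print(group, gains)
```

Descriptive gains of this kind only show the direction of the differences; whether they are statistically significant still has to be established with the inferential tests reported in the study.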
The abovementioned findings are in line with DCT (Paivio, 1990) and the Generative Theory of Multimedia Learning (Mayer, 2001). While DCT (Paivio, 1990) postulates a pictorial–verbal system for knowledge construction, in which a verbal system deals directly with language and a nonverbal (pictorial) system deals with nonlinguistic objects, elements, and events, Mayer’s (2001) Generative Theory of Multimedia Learning puts forward that information, both verbal and visual, is accessed consecutively in short-term memory. Working memory then comes into play as the place where the information is processed with both verbal and visual representations to form a holistic whole, leading to a more complete understanding of the information. Generally, when language learners are offered both verbal and visual input via multimedia, they select and arrange helpful information into different models. Thus, relationships can be established to construct a meaningful mental structure. In fact, linguistic elements, specifically words, offer discrete and linear information in verbal models, whereas pictures offer holistic and nonlinear information in other models. Therefore, learners can achieve better comprehension when they incorporate knowledge structures into the related models (Ariew, 2006). The abovementioned findings are also congruent with the findings of the following studies. Chen and Chang (2011) explored the moderating effect of L2 English proficiency on presentation mode and found no moderating effect, as students with access to dual mode performed better than students who had access only to audio across proficiency levels. In another study, Xu (2010) examined the effect of L1, L2, and L1 + L2 annotations on L2 vocabulary learning and found that L1 annotations were more effective in enhancing L2 vocabulary learning than L2 and L1 + L2 annotations.
Along the same lines, the study by Hulstijn, Hollander, and Greidanus (1996) lent support to the effectiveness of L1 annotations in enhancing L2 vocabulary learning.
On the other hand, the findings of this study contradict Cognitive Load Theory, which argues that cognitive capacity in working memory is restricted, so that learning will be obstructed if a learning task requires too much capacity. Similarly, Biçer and Akdemir (2015) examined the influence of using multiple content forms in web-based environments on English vocabulary learning. Their findings contradicted the Cognitive Theory of Multimedia Learning (Mayer, 2001), which holds that more than one channel (dual mode) can be provided at the same time without any rise in cognitive load, and were in line with Cognitive Load Theory, which claims that increasing cognitive load may lead to worse learning performance, especially in cognitively less able students.
Following the above discussion of the three groups in terms of annotation use and its effect on learners' vocabulary recall and retention levels, the following section deals with the two subresearch questions of the first research question, which investigate (a) the effect of annotation use on immediate vocabulary recall in a Mobile-Assisted Vocabulary Learning environment and (b) the effect of annotation use on long-term vocabulary retention in a Mobile-Assisted Vocabulary Learning environment. The discussion here includes only the learners in the multimedia annotation group.
The first subresearch question investigated the effect of annotation use on immediate vocabulary recall in a Mobile-Assisted Vocabulary Learning environment. The data regarding annotation use were divided into three groups: high annotation users, mid annotation users, and low annotation users. The findings indicated that high-level annotation users scored higher on the vocabulary achievement posttest than mid-level and low-level users, and there were significant differences between students who used annotations at low and mid levels, at mid and high levels, and at low and high levels. It would be reasonable to infer that mean scores increase with higher use of annotations. In other words, learners with high-level annotation use learned better than those who used annotations less often, and this difference is observable across the different annotation use levels.
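The grouping into low, mid, and high annotation users described above can be sketched as follows. The study does not report how the cut points were chosen, so the equal-size (tertile) split on click counts used here is an assumption, and the function names and records are hypothetical illustrations rather than the study's data.

```python
from statistics import mean

# Hypothetical (click_count, posttest_score) pairs; NOT the study's data.
records = [
    (2, 5), (30, 14), (12, 9),
    (4, 6), (28, 15), (15, 10),
    (3, 4), (25, 13), (14, 11),
]


def split_by_usage(pairs):
    """Sort learners by click count and cut them into three equal-size groups.

    Assumption: a tertile split; the original cut points are not reported.
    """
    ordered = sorted(pairs, key=lambda pair: pair[0])
    n = len(ordered)
    first_cut, second_cut = n // 3, 2 * n // 3
    return {
        "low": ordered[:first_cut],
        "mid": ordered[first_cut:second_cut],
        "high": ordered[second_cut:],
    }


def group_means(groups):
    """Mean posttest score for each usage level."""
    return {
        level: mean(score for _, score in members)
        for level, members in groups.items()
    }


usage_groups = split_by_usage(records)
means = group_means(usage_groups)
print(means)
```

Group means like these only describe the trend; the differences between usage levels would still need to be tested, for example with the Kruskal–Wallis and Mann–Whitney U tests named in the abstract.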
Findings regarding the isolated effects of multimedia annotations on vocabulary learning revealed that text annotation did not seem to make a significant difference to students’ vocabulary learning, which was not surprising when the students’ remarks given earlier are taken into consideration.
On the other hand, the other types of annotations (audio, graphic, and video) were found to have a significant effect on vocabulary recall. Findings revealed significant differences between students who used audio annotations at high and low levels. In other words, the learners who used audio annotations at a high level had higher achievement scores than mid-level and low-level audio annotation users.
Another finding indicated that there were significant differences between students who used graphic annotations at low and mid levels, at low and high levels, and at mid and high levels. Put differently, high graphic annotation users had higher achievement scores than mid and low graphic annotation users. Achievement scores decrease with less use of graphic annotations.
The last finding of this part showed that there were significant differences between students who used video annotations at low and mid levels, and at low and high levels. It can be inferred that high video annotation users have higher achievement scores than mid and low video annotation users, and that achievement scores decline with less use of video annotations.
To sum up, the abovementioned findings revealed that multimedia annotation use has a significant effect on learners' vocabulary recall levels. Among the annotation types, this effect seems significant for video, graphic, and audio annotations. Although text annotations were reported to be helpful, their effect did not seem significant.
The abovementioned findings are consistent with a number of studies and views in the literature. Nation (2001) remarked that the use of annotations has a number of advantages. First, difficult and presumably authentic texts can be presented with no simplification or adaptation. Second, the reading process is not interrupted, and annotations save more time than dictionary use. Third, learners are supplied with accurate meanings, which prevents them from guessing incorrectly. Last, learning might be encouraged through greater focus on the annotated words. Another supporting view is that hypermedia annotations can employ different types of media that are not available in traditional ones. To clarify, while hypermedia annotations offer various kinds of media such as text, audio, video, animations, or images to present visual, aural, or verbal information, traditional annotations can employ only pictorial and textual aids to help the reader’s understanding (Chun & Plass, 1996). Traditional annotations might be provided either within the text in the form of marginal annotations or as a list at the end as glossaries; in contrast, hypermedia annotations are provided within the text in multiple media forms. Therefore, learners can read passages faster with the aid of both print and hypermedia annotations. In terms of studies, Wu (2015) designed a Basic4Android smartphone application (Word Learning-CET6) and explored its impact as a tool for helping English as a foreign language (EFL) students learn vocabulary. The findings of the study revealed that the participants using the app significantly outscored their counterparts in the control group in terms of new vocabulary gain scores. Moreover, Lomicka (1998) carried out a pilot study to explore the influence of multimedia annotations on reading comprehension. The study provides empirical evidence supporting the practicality of multimedia annotation.
Finally, the findings of Yeh and Wang’s (2003) research also pointed to the significance of hypertext annotation use in EFL and vocabulary learning.
The second subresearch question investigated the effect of multimedia annotation use on vocabulary retention in a Mobile-Assisted Vocabulary Learning environment. The data regarding multimedia annotation use were divided into three groups: high annotation users, mid annotation users, and low annotation users. The findings indicated that high-level multimedia annotation users scored higher on the delayed vocabulary achievement posttest than mid-level and low-level multimedia annotation users. It would be reasonable to infer that mean scores increase with greater multimedia annotation use. Put differently, the vocabulary retention level of learners who used multimedia annotations at a high level is higher than that of learners who used them at mid and low levels.
Findings regarding the isolated effects of multimedia annotations on vocabulary retention demonstrated that text annotation did not seem to make a significant difference to students’ vocabulary retention levels. As with vocabulary learning, this was not surprising when the students’ remarks given earlier are taken into consideration.
On the other hand, the other types of multimedia annotations (audio, graphic, and video) were found to have a significant effect on vocabulary retention. Findings indicated a significant difference between students who used audio annotations at low and high levels. Put differently, learners who used audio annotations at a high level had higher retention levels than those who used fewer audio annotations.
Another finding showed that there were significant differences between students who used graphic annotations at low and mid levels, at low and high levels, and at mid and high levels. In other words, learners who used graphic annotations at a high level had higher retention levels than those who used fewer graphic annotations.
The final findings related to the vocabulary retention levels of EFL learners indicated that there were significant differences between students who used video annotations at low and mid levels, and at low and high levels. Put differently, learners who used video annotations at a high level had higher retention levels than those who used fewer video annotations.
To sum up, the abovementioned findings revealed that annotation use has an effect on learners' vocabulary retention levels, and this effect seems significant for video, graphic, and audio annotations. Although text annotations were reported to be helpful, their effect did not seem significant.
The aforementioned findings are consistent with a number of studies and views in the literature. Martínez-Lage (1997) claims that computer-aided annotations are more effective than traditional ones for gaining a better overall comprehension of the text, as diverse multimedia annotations such as sounds, images, and cultural and geographical references might be used. Texts with hypermedia annotations help learners to take a more global approach to the text. Enabling learners to access information immediately, with no interruption to reading, is one of the advantages of hypermedia annotations. They also present information in multiple formats that are more understandable and faster to manage for language learners. In terms of studies, Ko (2012) investigated the effect of L1, L2, and no annotations on vocabulary learning. Ninety university students in Korea were randomly assigned to three groups and were asked to read texts for a reading comprehension test. Then, they took an unexpected multiple-choice vocabulary test, which was repeated 4 weeks later. Data analysis revealed that on the immediate vocabulary test, the multimedia annotation groups outperformed the no-annotation group; however, there was no significant difference between the L1 and L2 annotation groups. Similar results were obtained in the delayed posttest. The participants showed keen interest in having access to annotations. Interestingly, they favored L2 over L1 annotations. Yoshii (2006) examined the effect of L1 and L2 annotations on L2 vocabulary learning in a multimedia context. Yoshii’s study revealed no significant differences between the L1 and L2 annotations, suggesting that both could be equally effective for L2 vocabulary learning. Taylor (2006) conducted a meta-analysis of experiments on the effects of L1 annotations on second language reading comprehension.
He concluded that learners provided with computer-based L1 annotations comprehended significantly more than learners who were provided with paper-based L1 annotation aids.
In light of the abovementioned conclusions and discussion, this study has a number of limitations. First, the number of participants who took part in the study is limited. Thus, the generalizability of the statistical findings here is questionable. In particular, the post hoc grouping of the students resulted in low cell sizes, which makes those results questionable. Second, the treatment lasted only 4 weeks. A longer treatment period could have yielded different findings. Third, the delayed posttest was conducted 1 month after the treatment. A delayed posttest conducted at a later stage could have yielded different findings. Fourth, this study concentrated on only two perceptual learning styles, visual and auditory. Therefore, the findings cannot be generalized to all learning styles. In addition, the treatment was limited to 10th-grade EFL learners. Treatment of different grades could have yielded different findings. Moreover, multimedia annotations were prepared for only two units of a selected coursebook used in state schools. A wider scope, with more units covered and more coursebooks included, could have provided deeper insight. Another limitation is that only click frequency, in other words how many times an annotation was accessed, was taken into consideration. Time spent on annotations, as well as access time, could have given the study a wider perspective.
Footnotes
Declaration of Conflicting Interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author received no financial support for the research, authorship, and/or publication of this article.
