Abstract
By studying the use of a virtual learning environment, many have focused on automatically logged web data in order to detect factors that enhance students’ use of the virtual learning environment and that may impact their productive and efficient learning via this means. Following in their footsteps, this research aims to examine data (activity logs) obtained from students while they are logged into the virtual learning environment in order to detect frequencies and priorities of students’ choice of activities in a virtual learning environment. The activity logs are used to measure students’ effectiveness of learning and to determine whether students’ activity logs, within courses supported by a virtual learning environment as part of a blended learning approach, correlate with their final marks and the students’ perceptions of using the virtual learning environment. Observed activities involved course view, assignment view, resource view, forum view, assignment upload and project upload, each examined against the students’ final mark. Data log features of a virtual learning environment and an instrument used to gather data on the students’ perceptions of using the virtual learning environment were used. Results show that there are positive correlations between students’ logs of particular activities and their final mark.
Understanding how students use a virtual learning environment (VLE)
In an online-learning environment students can review lecture notes, exercise notes, assignments, quizzes, questionnaires and other learning materials and prepare themselves for revision tests and examinations or develop projects on a certain subject. Most educators today use a blended learning approach, that is, one that combines in-class activity with the support of a learning management system (LMS) as a supplement to the traditional classroom-based approach. The benefits of using a VLE can be seen in positive practices from various studies. Formation of smaller groups who engage with each other via a VLE can make the individual feel less embarrassed and encourage them to share ideas. Tutors can provide guidance in initiating online discussion and supply initial arguments to motivate students to become actively involved in the VLE (Ashby and Broughan, 2002). Integrating information from the online discussion into module delivery, dividing students into teams and asking them to supply group opinion to the forum, and providing positive and constructive feedback through lecturers can enhance students’ use of online discussion forums (Dale and Lane, 2004). It is possible to determine skills that are fundamental in efficient online dialogue or online communication: expressing oneself fluently, receiving and understanding the real meaning of the given message, bounding the messages, concentrated continuing on others’ thoughts and improving on others’ thoughts, making enquiries for better understanding and positive wondering (Toth, 2010). VLEs are widely used by teachers, who upload learning materials to the VLE, which in turn allows both for the sharing of learning materials and for discussion among learners. VLEs also allow for the storing of learning resources online, connecting the resources and creating tests. They also allow the teacher to set up online discussions and monitor their students’ participation or otherwise in these.
VLEs also provide the facility to create blogs, review teaching material, use email and other such activities (Black et al., 2008), and allow for learning outside the classroom, thus offering flexibility.
One of the advantages of a VLE is that it allows us, educators, to see, literally, the activities that students carry out to help in their learning, which are not normally visible to us in the ‘traditional’ classroom. That this activity is online and visible to the teacher outside the classroom makes a VLE useful in that it allows the teacher to track student activity and to interpret those activities with a view to better understanding their students’ learning and to better supporting them in that task. Students’ approaches and activities that they undertake online can be observed from various points of view. The teacher can log in and see, for example, course view, assignment view, resource view, forum view and discussion view, and can also see what assignments, projects or other work their students have uploaded and when they did so, or what their students have not uploaded by a certain time or date. We, as educators, play a part in the use (or otherwise) of a VLE, which thus impacts both the activities undertaken and also our students’ engagement and learning. Student activity in a VLE is influenced by teacher activity; that is, if the teacher engages fully and appropriately with their learners via the VLE, the rewards for the learners will be greater. However, rather than spending most of their time and effort engaging with their learners via a VLE, that is, focusing on the pedagogic, it could be argued that teachers devote as much if not more of their time and effort dealing with the administrative ‘backroom’ aspects of a VLE, or so it seems (Preidys and Sakalauskas, 2010).
Our own engagement with a VLE and a better understanding of students’ behaviour while using a VLE could be useful when designing our teaching activities and materials, as understanding their behaviour allows us to adapt what we do or provide in order to take account of our learners’ preferences. If we better understand their online behaviour, we could design more appropriate activities and materials either prior to the start of the course or while the course is running, that is, respond to feedback on a need basis. It can be argued that a VLE allows educators to design individualized learning materials which, if effectively done, may better assist learners in their learning and, subsequently, help them in improving their performance on assessment tasks. Given the widespread use of VLEs in higher education and elsewhere, research in this area is becoming ever more important, since how well we design and use a VLE impacts the nature and quality of our students’ learning (Daukilas et al., 2008). There are many different approaches for assessing the perceived quality of a VLE. Many look at the quality of the design of the learning material as well as the quality of communication that learners, participants, undertake in the process of e-learning, although that participation also extends to teachers and to the educational institution more widely (Weaver et al., 2008).
Learners, themselves, have their own perceptions of using their VLE. An online survey instrument is a common method used to gather students’ perceptions, and the results inform educationalists of the efficiency and effectiveness of the tactics and strategies they are using in their VLE (Clayton, 2007). That is, how effective a VLE is judged to be varies according to whose views are sought. According to the students themselves, a VLE supports their learning, whereas teachers report that although a VLE might be useful to their students’ learning, they do not believe that it supports their teaching activities (McGill and Hobbs, 2008). In terms of supporting learners and their learning, maximizing accessibility, adaptability and clarity of communication in a VLE should support e-learners in different categories, which can be grouped into three, namely, cognitive, affective and information management (Seok et al., 2006). The cognitive category involves the accessibility dimension, which includes indicators that emphasize attributes of online instruction that enhance access to courses and to features within online instruction. The affective category encompasses providing an environment that gives support to students through social strategies and the teaching aspect. Information management concerns keeping students updated and cognizant regarding instruction, and being generally student friendly in the process (Seok et al., 2006).
In terms of how engagement with, and use of, a VLE impacts their subsequent performance, students who have a lower rate of online engagement have lower final marks, whether that access/use is on campus or at home. Although students are more likely to use a VLE while on campus rather than at home, the place where they use the VLE (home or university campus) does not impact their performance (Chanchary and Haque, 2007). Relationships between students’ VLE access behaviour, study habits and overall performance were observed in a web-based learning environment in which students were required to view lecture notes and additional documents, practice sheets, sample quizzes and so on, and to prepare themselves for tests and daily/weekly quizzes or to develop projects on given topics. Success factors include appropriate motivation, student opportunities to collaborate, a variety of delivery methods, user-oriented technology and teachers actively participating in the online environment. Incorporating time-dependent media and self-check tests as well as maximizing the syllabus units in 14–15 screen pages has been shown to provide more productive and efficient learning in a VLE (Izsó and Toth, 2008). Components of a VLE offer various opportunities for enhancement of student-centred learning and these impact students’ final academic results, that is, their marks (Pislaru and Mishra, 2009). Students’ learning style(s) also affect how well and in what ways they use a VLE, and thus their academic results. Learning styles shape the use of a VLE because students plan and use the VLE in a way which is consistent with their preferred learning style (Heaton-Shrestha et al., 2007). A detailed analysis has shown that a VLE could, if suitably designed, accommodate a variety of learning styles and approaches, including active and reflective styles, and approaches to learning and studying (Entwistle, 2003).
Students who learn online have a more independent style, whereas the on-campus student is more dependent (Heaton et al., 2007). Among those students who spend two or more hours per week on pre- and post-processing of the lectures, there is evidence that ‘heavy’ VLE-users perform better than non-users in the final examination and that the ‘heavy’ users’ performance in the VLE is the best predictor of the marks in the final examination (Stricker et al., 2011).
As for tracking and monitoring student activity in a VLE, log files in the VLE allow educators to collect and subsequently review statistical data such as how students approach and use different course materials, their approaches to the forum and usage, how long they view various elements and at what times, and so on (Zorilla et al., 2005). When using the VLE, records of the user remain in a system that is accessible at any time. VLEs span from systems for managing training or educational records to software for distributing courses over the Internet or offering items for online collaboration. Having the data stored and available in a VLE provides an opportunity for the use of methods of data mining in order to examine them (Preidys and Sakalauskas, 2010). Within the e-learning field, data mining can be used to explore, visualize and analyse the data with the aim of identifying useful patterns or evaluating web activity in order to obtain students’ learning behaviour or feedback that teachers can use when designing instruction and delivery. Data mining includes tasks and methods concerning statistics, visualization, clustering, classification, association rule mining, sequential pattern mining, text mining and so on. The application of data mining in e-learning is similar to any other data mining application area. The use of data mining has been employed to analyse students’ behaviour in an online-learning environment (Monk, 2005). By using Bayesian networks in the analysis of such systems, which model different aspects of a user’s behaviour while the user works with this system, it is possible to detect the learning styles of students from the activities that they perform in a VLE (Preidys and Sakalauskas, 2010). A Bayesian network, belief network or directed acyclic graphical model is a probabilistic graphical model, which represents a set of random variables and their conditional dependencies via a directed acyclic graph. 
In detail, the nodes in the Bayesian network represent the different student behaviours that determine a given learning style, and the arcs represent the relationships between the learning style and the factors determining it. By analysing students’ log files, the data used to create the Bayesian model are obtained (Preidys and Sakalauskas, 2010).
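As a brief illustration of the definition above, the standard factorization that a Bayesian network encodes can be written as follows. The symbols are introduced here for illustration only and do not come from the cited studies: X_1, ..., X_n are the random variables, pa(X_i) denotes the parents of X_i in the graph, S stands for a learning style and B_1, ..., B_m for observed behaviours. The second line assumes the simplest possible structure, with arcs running only from S to each B_j; the actual network structure used in the cited work is not specified here.

```latex
% Joint distribution encoded by a directed acyclic graph
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{pa}(X_i)\bigr)

% Illustrative special case (assumed structure): style S is the only parent of each behaviour B_j
P(S \mid B_1, \dots, B_m) \propto P(S) \prod_{j=1}^{m} P(B_j \mid S)
```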
Although there is hardly a university that does not use a VLE to support the learning of its students, and the use of a VLE is accepted as ‘a good thing’, there is still much to learn about what activity is taking place, and how much impact it has on students’ performance as measured by marks. This study examines the data (activity logs) obtained from students’ log-ins to a VLE in order to detect frequencies and priorities of students’ choice of activities in a VLE, as there is much that we still do not know about the variables, such as how students do or do not use features like the discussion board, and how they look for resources and related content, that is, their actual activity when using a VLE. This study, therefore, looks more closely at students’ choice of activities in a VLE and their active participation (or not) within a VLE in order to determine which activities they perform because it is argued that if we know more about their activity, we can better understand which activities are more closely related to subsequent performance. The aim of this study is to detect the relationship between the observed variables of course view, resource view, forum view discussion, forum view forum, assignment view, project upload, assignment upload and the variable students’ final mark. The objective is to find out the possible impact of a particular student activity in a VLE on the final mark in the course.
Methodology
Participants
The subjects were undergraduate students in the first and the second year of study, all of whom had just started to use the VLE. All students who filled in the online questionnaire were female undergraduates of the Faculty of Teacher Education (see Questionnaire in Appendix A for details requested from the participants). In total, 85% of them were first year students aged 18 years and the rest were second year students, aged 19 years. The questionnaire was filled in by 111 students out of 224. All students had a computer at home and Internet access. Internet access is always available for 58% of students and mostly available for 42% of students, which means that there is sufficient homogeneity between subjects. The majority of students, 75%, use a computer for researching on the Internet on a daily basis and others use it several times a week or month. Only 16% of students use the Internet for learning purposes on a daily basis. In total, 45% of students do so several times a week, 25% do so monthly and 12% rarely. It can be concluded that the majority of students use a computer and the Internet for learning on at least a weekly basis.
Downloading learning content from the Internet daily is the case for 22% of students, and 38% do so several times a week. In total, 29% do so monthly, and only 12% of students do so rarely. Almost two-thirds of students rarely or never use the computer for playing games, but almost 78% of students regularly participate in online forums and chats outside of the VLE, that is, for social, non-course-related activities. Only 50% of students had had experience of using, or were aware of, a VLE in their previous education, while 46% of students had rarely or never used a VLE. On average, students started to use computers for the first time when they were 10 years old.
Apparatus
Student feedback and system log analyses were used to examine and measure VLE usage, while for data analysis, Statistica 8 and Weka software were used. Weka is open source software that provides a collection of machine learning and data mining algorithms for data preprocessing, classification, regression, clustering, association rules and visualization. Moodle was used as a VLE in the observed courses. Moodle is a free web application that educators can use to create effective online-learning sites.
Procedure
Data were collected from access system logs and an evaluation questionnaire, while observations were made during two semesters. Data collection started at the beginning of the first term and lasted until the end of the second term. Students and administration were familiar with the research and tracking logs. The research respected confidentiality and the data were collected so as to ensure anonymity. The evaluation questionnaire asked for students’ perception of the VLE and their readiness to use it. The questionnaire consisted of a set of questions concerning computer usage in relation to e-learning, such as technology availability, Internet accessibility, familiarity with different learning software and its usage, frequency of using such software and students’ preferred ways of learning when offered a choice. The evaluation questionnaire was designed, administered and provided online through the LMS Moodle in June 2010. Results from the questionnaire were downloaded into an Excel spreadsheet and subsequently imported into Statistica 8 and Weka for analysis.
Using data mining in a VLE consists of four steps (Romero et al., 2008). The four steps in the general data mining process as they apply to data mining in e-learning (Izsó and Toth, 2008) are as follows and are used in this study:
Collecting data
Students’ usage data and interaction information are stored in the database of the Moodle system.
Preprocessing the data
Data were cleaned and transformed into an appropriate format for mining. In this case, the data were transformed into comma-separated values (CSV) format.
Using data mining
Data mining algorithms are applied to build and execute the model that discovers and summarizes the knowledge of interest for the user (teacher, student, administrator, etc.). For this purpose we have used a free data mining tool, namely, Weka.
Interpreting, evaluating and making use of the results
Results or the model are interpreted and used by the teacher for further actions. Retrieved information can be used to make decisions about students’ activities and the Moodle activities of the course in order to improve the students’ learning.
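The collecting and preprocessing steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the procedure actually used in the study: the field names `student` and `action`, the action labels and the sample log entries are all hypothetical, and a real Moodle log export will have a different layout.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical action labels, modelled on the activities named in this study.
ACTIONS = ["course view", "assignment view", "resource view",
           "forum view", "assignment upload", "project upload"]

def count_activities(log_rows):
    """Return {student_id: Counter(action -> number of log entries)}."""
    counts = defaultdict(Counter)
    for row in log_rows:
        action = row["action"].strip().lower()   # cleaning: normalise the label
        if action in ACTIONS:                     # cleaning: drop actions not under study
            counts[row["student"]][action] += 1
    return counts

def to_csv(counts):
    """Build CSV text with one line per student and one column per activity."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["student"] + ACTIONS)
    for student in sorted(counts):
        writer.writerow([student] + [counts[student][a] for a in ACTIONS])
    return buffer.getvalue()

# Made-up log entries, for illustration only.
sample_log = [
    {"student": "s01", "action": "course view"},
    {"student": "s01", "action": "Course View "},
    {"student": "s01", "action": "forum view"},
    {"student": "s02", "action": "resource view"},
    {"student": "s02", "action": "login"},
]
csv_text = to_csv(count_activities(sample_log))
print(csv_text)
```

The resulting table of per-student counts is the kind of input that a tool such as Weka or a statistics package can then read for the mining step.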
The data measured the total number of accesses to the system, the number of accesses to the learning resources and the use of particular features such as discussions and forums. Observed variables can be categorized into three major groups as follows: (1) information about students (number, first name and last name), (2) students’ marks and (3) activities during their learning period. Activities are grouped in content review (course view, assignment view, resource view, forum view and discussion within forum view) and uploading content on Moodle (activities of assignment and project upload).
Research was conducted in the Faculty of Teacher Education in Osijek, Croatia, by examining system logs of four computer courses during one academic year and by gathering the data from the 224 students who completed the evaluation questionnaire. There were 39,432 entries in the web log file. When the log data were downloaded they were sorted by students’ final grades in their courses using Microsoft Excel 2007 and imported into Statistica 8 and Weka. Frequency analysis was applied to data logs providing a cumulative number of system logs for each student. Spearman correlations were conducted in order to see the relationship between the students’ final grades in the course and the observed system logs of activities of the same course.
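For readers unfamiliar with the Spearman correlation, the following sketch shows how it can be computed as the Pearson correlation of rank-transformed data, with tied values given the mean of their ranks. It is an illustration in plain Python rather than the Statistica procedure used in the study, and the function names and sample values are invented for the example.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1   # positions are 0-based, ranks 1-based
        for i in order[start:end + 1]:
            ranks[i] = mean_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Made-up data: number of course views and final mark for five students.
course_views = [12, 30, 7, 45, 30]
final_marks = [2, 4, 3, 5, 4]
rho = spearman(course_views, final_marks)
print(round(rho, 4))
```

Because it works on ranks, the coefficient only assumes a monotonic relationship between activity counts and marks, which suits ordinal data such as final grades.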
Results
Analysing the relationship between the variables course view and mark, the probability (p-value) is 0.000005, which is statistically significant at the 0.05 level of significance. Taking into account that Spearman’s correlation coefficient is positive (R = 0.299), it can be concluded that the variables course view and mark are positively correlated. Observing the variables resource view and mark, it is noted that, at the 0.05 level of significance, the correlation between them is statistically significant. Because Spearman’s correlation coefficient is positive (R = 0.14), we can conclude that the variables resource view and mark are positively correlated.
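As a rough plausibility check on the reported probability, one common approximation for testing a Spearman coefficient uses a t statistic with n − 2 degrees of freedom. The source does not state which test Statistica applied or the exact sample size behind each coefficient, so both the use of this approximation and n = 224 are assumptions here.

```latex
t = R \sqrt{\frac{n - 2}{1 - R^{2}}}
  = 0.299 \sqrt{\frac{222}{1 - 0.299^{2}}}
  \approx 4.67
```

Under these assumptions, a t value of this size with 222 degrees of freedom corresponds to a two-sided p-value of the order of 10^-6, which is in line with the probability reported for course view and mark.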
Observing the variables forum view discussion and mark at the 0.05 level of significance, available data do not support a decision to reject a correlation between these two variables. Analysing the variables forum view and mark at the 0.05 level of significance, it is noted that the correlation between these two variables is statistically significant. Since Spearman’s correlation coefficient is positive (R = 0.2253), we can conclude that the variables forum view and mark are positively correlated. Correlations between the variables were computed, but a linear model was not fitted, given that the basic assumptions of the linear model were not satisfied.
In general, variables of system logs of certain activities with the highest correlation with the variable mark are assignment view, course view, forum view and resource view.
The results presented in Table 2 indicate that the greatest average numbers of course views, assignment views, resource views and forum views belong to students with a very good and/or excellent final mark. These marks equate to A and B grades. There is a larger percentage of students achieving A or B grades than is normally the case, which can be explained by the fact that this is typical of elementary-level computer courses that students choose to take. A detailed correlation between observed variables can be seen in Table 1. The average values of student activities can be seen in Table 2.
Table 1. Correlation between observed variables
Table 2. Average values of students’ activities
In the cluster analysis, the aim was to group students into different clusters depending on the activities performed in Moodle and their final marks. The results of a clustering method are subdivisions of objects into separate groups of similar objects, which can represent users, events, pages, activities and other features in a VLE. By applying the Weka tool, the data were grouped into two clusters by means of the simple k-means clustering algorithm. The simple k-means algorithm clusters objects, based on their attributes, into k partitions. After creating the clusters, Weka assigns the training instances to clusters according to the cluster representations and reports the percentage of instances that fall in each cluster. In this case, clustering by simple k-means places 49% (110 cases) in cluster 0 and 51% (114 cases) in cluster 1. The number of iterations is 5 and the within-cluster sum of squared errors is 47.5457. Cluster 0 contains the students with a higher average mark (mean = 4.9182), and their observed activity counts are higher for each activity item, even though there is no statistically significant difference between the clusters.
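The clustering idea can be illustrated with a short, self-contained k-means sketch. It is not a reimplementation of Weka's SimpleKMeans: the choice of the first k points as starting centroids is a simplifying assumption made so that the example is deterministic, no attribute scaling is applied, and the (course views, final mark) pairs are invented.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def k_means(points, k, max_iterations=100):
    """Plain k-means; returns (assignment, centroids, within-cluster SSE)."""
    # Assumption: the first k points serve as the initial centroids.
    centroids = [list(p) for p in points[:k]]
    assignment = None
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:   # converged: no point changed cluster
            break
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    sse = sum(squared_distance(p, centroids[a]) for p, a in zip(points, assignment))
    return assignment, centroids, sse

# Made-up (course views, final mark) pairs for six students.
students = [(10, 2), (50, 5), (12, 3), (48, 4), (9, 2), (52, 5)]
assignment, centroids, sse = k_means(students, k=2)
print(assignment, round(sse, 4))
```

In this toy data the low-activity students with lower marks fall into one cluster and the high-activity students with higher marks into the other, and the returned sum of squared errors is the same kind of within-cluster quantity that the clustering output above reports.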
Discussion and conclusion
In the research described, we have focused on detecting the impact of access to certain activities within a VLE. This was tracked through activity logs of the VLE and on the students’ final mark on the courses used in this study. By detecting a list of priorities of student activity within the VLE based on system logs, the behaviours of students in a VLE are visible to us, educators, and therefore, it is argued that methods of teaching can be adapted and content improved by adapting those teaching materials. This could result in better helping students in their learning and thus improving their subsequent performance.
Results of the simple k-means cluster algorithm application
The data from system logs regarding student activity were gathered from the open source VLE, Moodle, in order to examine frequencies of using the activity logs and to determine their relation to student success, with ‘success’ being defined as their final mark by using methods of classifying the data obtained from these logs and by applying correlation tests. The results shed light on student success and the efficacy of their learning, given the relationship between students’ final marks and their activities in a VLE. The secondary goal was to determine whether student activity logs align with the students’ perception of using a VLE. By applying cluster methods, results can be extracted and presented using two groups of learners’ activities, which have not previously been discussed, and a description of learners’ activity in a VLE is given along with the relation of this to student success. As can be seen, student activity falls into each cluster and is distributed according to the final mark.
Students’ activity logs in a VLE are correlated with their final mark. Correlations are for the most part positive and suggest that the higher the number of activity logs in a VLE the higher the final mark will be. Of the activities used, the greatest influences on the students’ final marks are assignment view, course view, forum view and resource view. That is, the highest correlation exists between these variables and the other variable, the final mark. We can therefore conclude that the activities carried out in a VLE influence students’ learning outcomes and their learning effectiveness. It is also noteworthy that the lowest number of activity logs refers to the activities concerning forum view and discussion view. This suggests that we need to encourage learners to participate and engage more fully in a VLE in order to maximize communication within a VLE. It is perhaps surprising that students do not engage as fully as we might expect in a VLE given that the majority of students in this study, and likely elsewhere, given the technology-savvy students in higher education today, participate daily in online communication for other purposes, outside a VLE. If we are to better support our learners via a VLE, then we need to find ways to encourage and motivate them to engage and communicate more often, as results from this study demonstrate that doing so impacts their learning performance.
It is acknowledged that the limitations of this research are the number of students and the number of courses, that the courses observed are limited to one particular subject matter, namely, computer courses, and the age and gender of the participants (the majority of students in the Faculty of Teacher Education are women). A further limitation is that there was no control group, that is, a group of students who did not use a VLE. Nonetheless, system logs are an important source of information in understanding the patterns of students’ online usage for learning purposes, and so there is much more that we can learn from investigating, filtering and sorting activity logs via the methods described in this research. Possible future research in this area could comprise a broader range of factors (describing student success), a larger sample of students, a greater number and variety of courses, students from different disciplines and ages, and further studies with regard to the gender of participants. Future work could also focus on detecting other success factors such as student motivation and collaboration, or how actively (or not) the teacher participates in a VLE, as it might be the case that the greater the engagement and participation of the teacher, the greater the engagement and participation of students within a VLE. Observing the system logs of the VLE may help us, educators, to identify the most important activities that students choose and thus to adapt and, we hope, to improve our teaching style accordingly. Furthermore, it can encourage educators to pay more attention to what is and is not happening in our virtual classrooms and to encourage suitable participation and engagement of our learners within these.
By observing the activities and behaviour of our learners, further studies in this area could provide us with insights into our own teaching practices and how and in what ways we may need to adapt these in order to better support our learners via a VLE.
Appendix A
Acknowledgements
We thank both the anonymous referees and the Editor for the comments, support and constructive feedback.
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
