Abstract
Blended learning is the latest and inevitable trend in the development of education. Although blended learning research is on the rise, few studies have examined the learning behaviour of college students in blended learning environments. This study aimed to investigate the learning behaviours of students in the field of computer science and examine these behaviours using data mining algorithms, taking the teaching practice of the Digital Signal Processing course as a case study. A total of 18 behavioural indicators were extracted and divided into three categories: basic learning behaviours, self-regulated learning behaviours, and extended learning behaviours. Data analysis of the behavioural indicators yielded the following conclusions: (1) Students did not have the habit of watching course playback and were less receptive to multiple online learning platforms; (2) Students’ midterm performance and duration of livestream watching directly affected their basic learning behaviours, with all indicators of self-regulated and extended learning behaviours showing significant correlations; (3) The clustering of learning behaviours yielded four different learner patterns, which calls for personalised teaching strategies; (4) The random forest algorithm achieved an accuracy of 95.4% in predicting the performance of the four types of learners.
Introduction
The development of information and network technologies has brought a fundamental change in the lives and mindsets of students. Teaching informatization, a key concept of modern education, is guiding the teaching reforms. The blended learning mode, which combines real classrooms and online teaching, has become a research hotspot in the field of education at home and abroad [1]. Especially against the background of the COVID-19 pandemic in 2020, numerous online teaching platforms offering a large number of courses on a wide range of topics have been introduced in universities, greatly facilitating the application of blended teaching [2]. Blended learning has become the most popular mode of instruction adopted by educational institutions due to its perceived effectiveness in providing flexible, timely, and continuous learning [3].
The home-based online learning mode lacks a good classroom learning atmosphere and effective teacher supervision, and puts a higher demand on student autonomy. Consequently, how students use their time widens the gap in learning [4]. Compared with traditional teaching, blended learning is student-centred and pays more attention to individual differences [5]. In addition to the basic learning ability of students, blended teaching reflects more self-regulated learning and innovation ability, with students’ performance being an important indicator to measure their learning behaviour in the blended teaching mode [6].
Analysis of student learning behaviour under blended teaching mode
Learning behaviour analysis has been a concern in blended learning [7]. The blended teaching mode is divided into three types based on the learning style: offline-oriented, online-oriented, and online/offline-coordinated [8], and promotes assessment of students’ learning status primarily through the analysis of classroom learning behaviours. Traditional teaching follows the classroom observation method to record students’ active and passive verbal and motor behaviours, such as answering questions, looking at the board, and making eye contact [9]. Machine learning methods were used to examine and evaluate offline classroom teaching behaviour to indicate overall institution performance. A deep learning-student attention recognition framework (DL-SARF) for offline classroom assessment was developed to analyse professional classroom behaviour [10]. Thus, the traditional offline-based teaching model mainly focuses on students’ apparent behaviours in the classroom and their classroom performance for a limited period, reflecting the basic learning ability of students to acquire textbook knowledge, and does not analyse independent and innovative learning behaviour mechanisms. Moreover, the implementation of offline instruction is vulnerable to epidemic conditions, making concrete implementation difficult.
The online learning mode is not restricted by space and has played an important role in the continuation of teaching when offline classes had been suspended during the COVID-19 pandemic. Previous studies have examined online learning behaviours of students using data from online teaching platforms in universities to analyse differences in learning processes [11]. Online learning platforms are sources of rich log data, allowing observation and analysis of students’ self-regulated learning and learning behaviours outside the classroom, thus compensating for the lack of classroom learning behaviour analysis. Sohee clustered students’ online learning behaviours and examined differences in attendance, assignment completion, discussion participation, and perceived learning outcome, and analysed the behavioural, emotional, and cognitive aspects of student engagement [12]. Song examined the design, content management, task construction, and assessment features of an online course under the Blackboard system, including the creation and grading of interactive rubrics, association of rubrics with content, and incorporation of digital learning objects [13]. Huang compared the distribution of scores and learning achievement differences in resource learning behaviours, problem-solving behaviours, and social interaction behaviours during online learning [14]. Wang used image emotion recognition to monitor students’ online learning behaviour to improve teaching effectiveness [15].
The online-offline mode takes into account online and classroom learning behaviours [16]. Xu et al. analysed the impact of general online learning behaviour on student performance by implementing a student-centred teaching method based on the flipped classroom and the small private online course (SPOC) [17]. Lin analysed the effectiveness of group awareness and peer assistance as external scaffolds in training self-regulated learning behaviour, enhancing opportunities for self-reflection, and stimulating and encouraging learners [18]. Liu explored students’ inclination towards the course and their preference for teachers to enhance students’ knowledge and integrative skills during the blended teaching process in three stages: before, during, and after class [19]. Additionally, a closed-loop feedback framework of teaching and learning in the online-offline mode has been established to give full play to the role of students in teaching [20]. Thus, the online-offline teaching mode broadens the scope of student learning behaviours and can measure the behavioural performance of students from multiple aspects and perspectives.
Studies on blended teaching primarily extract students’ learning behaviour indicators based on the needs of classroom teaching or the length of the course, followed by analysis and mining of data and providing feedback and recommendations to improve teaching quality. In recent years, research and teaching have been given equal importance in academia, with a notable trend of promoting teaching through research [21]. Enhancing college students’ ability to conduct scientific research can effectively promote student learning and improve the quality of talent training. Course practice has revealed that many students already have a sense of scientific innovation and they actively participate in scientific research. The thinking quality of students trained in the scientific research process directly affects the effectiveness of classroom teaching and the quality of talent output. Therefore, in addition to basic classroom learning behaviours, research learning behaviours are also important indicators of blended learning. This study further divides research learning behaviours into self-regulated and extended learning behaviours. Thus, this study explores students’ learning behaviour performance in terms of basic learning behaviours, self-regulated learning behaviours, and extended learning behaviours, followed by a prediction analysis of learning performance based on the learning behaviours. Overall, the findings of this study contribute to improving learning behaviours and enhancing teaching effectiveness.
Research design
Research environment
The data analysed comprised the blended teaching practice data of the Digital Signal Processing course at X College in Hebei Province. Data for three spring semesters from 2020 to 2022 were collected: spring semester of 2020 (1/3 online
The data related to students’ basic, self-regulated, and extended learning behaviours were analysed and a multi-level, multi-characteristic behavioural performance data chain was developed. In addition, statistical analysis of the learning behaviour data was conducted to investigate the effect of the clustering of learning behaviours and examine the prediction accuracy, analysis benefit, and performance improvement of different machine learning classification models.
Data acquisition and pre-processing
Data related to basic, self-regulated, and extended learning behaviours were collected from 173 students, including 145 male and 28 female students, in the electronic information engineering program.
Basic learning behaviours included students’ classroom learning behaviours during face-to-face classroom instruction. The data were mainly obtained from the course resources and the online learning platforms (XueXiTong and DingTalk) hosting the course, and included seven indicators such as class attendance, number of answers, group reports, and platform access. The data for self-regulated and extended learning behaviours were collected using 5-point Likert-type questionnaires with five response options: “very often”, “more often”, “moderately”, “less”, and “never”. The questionnaires were distributed during the last week of the course. Self-regulated learning behaviour refers to the learning behaviour of the learner when completing a learning task independently in a specific environment, and includes items related to “willingness, skill, and self-direction” in the Learning and Study Strategies Inventory (LASSI) [22]. The five indicators for self-regulated learning included finding information on one’s own, searching for information, reading books, and attending lectures. Extended learning behaviour refers to the active efforts and learning activities of students outside of classroom learning, and included six indicators such as applying for major innovation and entrepreneurship projects, participating in disciplinary competitions, and publishing academic papers. Table 1 lists the learning behaviour indicators, their codes, and specific descriptions.
Student learning behaviours and specific descriptions
Since the data were collected from the “Chaoxing Fanya Online Course Platform (XueXiTong)” and “DingTalk” logs, there were problems such as inconsistent standards, missing data, and data imbalance. To resolve these issues, the “hours, minutes, and seconds” format of students’ livestream and playback time was unified as a numerical variable (in minutes) for subsequent data analysis. In addition, the large number of null values in the playback data were set to 0. To address data imbalance (for example, female students made up less than 20% of all students [23]), the numbers of male and female students were normalised to improve the comparability of data when investigating the effect of gender. Furthermore, because the magnitudes of the indicators vary greatly, the numerical attributes were standardised (Z-Score) so that differences in the value ranges of the indicators would not interfere with the classification prediction.
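The pre-processing steps described above can be sketched as follows. This is only a minimal illustration and not the code used in this study: the function names `duration_to_minutes` and `z_scores` and the sample log values are hypothetical, and the platform logs are assumed to store durations as `HH:MM:SS` strings with missing playback entries exported as `None` or empty strings.

```python
import statistics

def duration_to_minutes(value):
    """Convert an 'HH:MM:SS' string to minutes; missing values become 0."""
    if value is None or value == "":
        return 0.0
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 60 + minutes + seconds / 60

def z_scores(values):
    """Standardise values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # A constant indicator carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

# Hypothetical playback log entries (not data from the study).
playback_log = ["00:04:30", None, "01:03:00", ""]
playback_minutes = [duration_to_minutes(v) for v in playback_log]
playback_z = z_scores(playback_minutes)
```

Whether the population or the sample standard deviation was used for the Z-Score is not stated in the text; the population form is assumed here.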
Students’ basic learning behaviours, self-regulated learning behaviours, extended learning behaviours, and underlying information data were used as input to observe variable characteristics and clean the data. Subsequently, the behavioural features were visualised. Statistical tools and machine learning algorithms were used for correlation analysis, variable filtering, feature clustering, and classification prediction to guide the teaching process and inform instructional improvement. The analysis process is shown in Fig. 1.
Analysis process of learning behavioural data.
Statistical analysis of student attendance
Student attendance was analysed using the XueXiTong sign-in rate and DingTalk livestream watch duration (double sign-in). We analysed the time spent by the students watching the livestream and the playback. For livestream watching, the lower quartile, median, upper quartile, and maximum values were 3526, 4142, 4481, and 4739 minutes, respectively. For playback watching, the corresponding values were 0, 4, 63, and 535 minutes, respectively. Clearly, few students watched the playback after class.
Students were further divided into four categories to analyse their attendance on the DingTalk platform: students who watched both livestream and playback (A), students who watched only livestream but not playback (B), students who watched only playback but not livestream (C), and students who watched neither livestream nor playback (D). The visualization for each category is shown in Fig. 2. Most of the students (68.35%) “only watched livestream but not playback”. No students fell into category C, indicating that students hardly watched the playback. This may be due to a lack of teacher supervision during playback or because students are not in the habit of consolidating learning after class. Therefore, teachers should pay more attention to live online teaching and make full use of live class time to improve students’ learning efficiency.
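The four-way split described above amounts to a simple rule on the two watch durations. The sketch below is illustrative only: the function names and the sample records are hypothetical, and "watched" is assumed to mean a duration greater than zero minutes, since the text does not state the threshold that was applied.

```python
from collections import Counter

def viewing_category(livestream_minutes, playback_minutes):
    """Assign a student to category A, B, C, or D from the two watch durations."""
    watched_live = livestream_minutes > 0       # assumed threshold
    watched_playback = playback_minutes > 0     # assumed threshold
    if watched_live and watched_playback:
        return "A"  # livestream and playback
    if watched_live:
        return "B"  # livestream only
    if watched_playback:
        return "C"  # playback only
    return "D"      # neither

def category_shares(records):
    """Return the share of students in each category."""
    counts = Counter(viewing_category(live, playback) for live, playback in records)
    return {category: counts.get(category, 0) / len(records) for category in "ABCD"}

# Hypothetical (livestream, playback) durations in minutes.
records = [(4142, 63), (4481, 0), (3526, 0), (0, 0)]
shares = category_shares(records)
```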
The viewing behaviour of four types of students on the DingTalk platform.
The single-platform (DingTalk) sign-in and dual-platform sign-in were compared to analyse students’ receptivity to the learning platforms and their simultaneous sign-in activities (Fig. 3). The perfect attendance rate of students on dual platforms was 77.22%. These were the students who were less likely to be late, leave early, miss classes, or forget to sign in during online courses. The single-platform (DingTalk) sign-in rate of students was higher than the dual-platform sign-in rate by 7.59%, indicating that students are more accustomed to signing in to a single platform. Therefore, a limited number of platforms should be employed in the blended teaching mode, and course construction should be concentrated on one platform.
The data on basic learning behaviours were obtained from classroom teaching records and backstage-exported data. The data on self-regulated and extended learning behaviours were obtained from the questionnaires. The collected data were then subjected to Spearman correlation analysis.
The Spearman correlation coefficient indicates the direction and strength of the monotonic association between two input variables X and Y [24]. If Y tends to increase when X increases, the Spearman correlation coefficient is positive; if Y tends to decrease when X increases, it is negative. The absolute value of the coefficient increases as X and Y get closer to a perfectly monotonic relationship, and it equals 1 when the relationship is perfectly monotonic.
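For illustration, the Spearman coefficient can be computed as the Pearson correlation of the ranks of X and Y, with tied values sharing their average rank. The sketch below uses only the Python standard library; the function names are hypothetical, and it does not reproduce the statistical software or the significance tests used in this study.

```python
from math import sqrt
from statistics import fmean

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    mx, my = fmean(x), fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only the ranks enter the calculation, any strictly increasing relationship between X and Y (not only a linear one) gives a coefficient of 1, and any strictly decreasing relationship gives -1.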
Basic learning behaviour analysis
The correlation coefficient matrix for the seven indicators of basic learning behaviours is shown in Table 2.
Correlation coefficient matrix of basic learning behaviours
Note: **indicates a significant correlation at the 0.01 level and * indicates a significant correlation at the 0.05 level.
Attendance ratio of DingTalk/dual platforms.
The matrix analysis yielded the following findings:
(1) The correlation coefficients of classroom performance and duration of watching livestream with classroom sign-in rate were 0.28 and 0.25, respectively, both significant at the 0.01 level.
(2) The correlation coefficient between lab report completion and classroom performance was 0.52, significant at the 0.01 level; the correlation coefficients of midterm performance and duration of watching livestream with classroom performance were 0.16 and 0.14, respectively, both significant at the 0.05 level. This shows that students with good classroom performance also perform better in lab report completion and midterm assessment.
(3) The correlation coefficient between midterm performance and lab report completion was 0.29, significant at the 0.01 level. The durations of watching the livestream and playback were significantly correlated with lab report completion at the 0.05 level. This indicates that lab report completion has a significant correlation with midterm assessment and online class attendance among students.
(4) The correlation coefficients of midterm performance and duration of watching livestream with final performance were 0.28 and 0.21, respectively, both significant at the 0.01 level. Classroom performance, lab report completion, and duration of watching playback were significantly correlated with final performance at the 0.05 level. This shows that there is a significant correlation between final performance and classroom performance, lab report submission, midterm assessment, and online class attendance, which is consistent with the general learning pattern.
The overall performance of the two learning behaviours
There are a total of 11 indicators of self-regulated and extended learning behaviours. The scoring rate of each indicator was calculated separately and ranked from highest to lowest (Table 3). Overall, self-regulated learning behaviours were rated significantly better than extended learning behaviours. The top three rankings were for self-regulated learning behaviours, and the bottom three were for extended learning behaviours. This shows that students at the undergraduate level have a strong sense of self-regulated learning, but their extended learning outside the classroom needs to be strengthened.
Self-regulated and extended learning behaviour performance (ranked from highest to lowest)
Correlation coefficient matrix of self-regulated and extended learning behaviours
Note: *** indicates a significant correlation at the 0.01 level, ** at the 0.05 level, and * at the 0.1 level.
In terms of high-scoring behaviour (scoring rate
Regarding medium-scoring behaviour (scoring rate: 0.1–0.5), students were more active in acquiring cutting-edge knowledge by attending lectures and reading. Thus, undergraduate students should be involved in projects and should participate in competitions during their school years to enhance learning and improve their engineering practice skills.
As for low-scoring behaviour (scoring rate
Correlation analysis of self-regulated and extended learning behaviours
Correlation analysis was conducted on the 11 indicators of self-regulated and extended learning behaviours. The correlations among these indicators were significantly higher than those for self-regulated learning behaviour. As shown in Table 4, the pairwise correlation coefficients among all five indicators of self-regulated learning behaviour were above 0.5, each significant at the 0.01 level. Taken together with Table 3, this suggests that students who exhibited one self-regulated learning behaviour tended to exhibit the others as well.
As for extended learning behaviours, there was a significant correlation at the 0.01 level between four indicators: applying for innovative and entrepreneurial projects, participating in disciplinary competitions, doing research with classroom teachers, and applying for software copyrights. Despite the low scores for these four indicators, there was a significant correlation between each indicator. Table 4 shows that the 11 indicators are significantly correlated at the 0.05 level. This shows that students who perform well in self-regulated learning behaviours also perform better in extended learning behaviours.
The data were analysed using the K-means clustering method, wherein learners with similar learning behaviours were grouped into the same cluster to identify potential learning patterns [25]. Because the learning behaviour variables were measured on different scales, their values differed greatly in magnitude. Therefore, all variables were standardised to Z-scores before clustering.
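The clustering step can be illustrated with a compact version of the standard K-means procedure (Lloyd's algorithm), which alternates between assigning each learner to the nearest cluster centre and recomputing each centre as the mean of its members. This is a sketch under stated assumptions rather than the implementation used in this study: the function names are hypothetical, the input is assumed to be already standardised, and the initial centres are drawn at random from the data because the initialisation scheme is not reported.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, iterations=100, seed=0):
    """Cluster points (tuples of floats) into k groups; returns (labels, centres)."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # assumed initialisation: k random data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centre.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centres[c]))
            for p in points
        ]
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centres.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centres.append(centres[c])  # keep the centre of an empty cluster
        if new_centres == centres:
            break  # converged: assignments can no longer change
        centres = new_centres
    return labels, centres
```

K-means converges only to a local optimum, so in practice the procedure is usually repeated from several random initialisations and the best result retained.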
The clustering results for different K values (number of clusters) were evaluated using the Davies-Bouldin Index (DBI). Smaller values of DBI imply smaller intra-class distances and larger inter-class distances, and thus more reasonable clustering results. The K values ranged from 3 to 9 in this study. The values of DBI corresponding to different K values are shown in Table 5.
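As an illustration of how the DBI rewards compact, well-separated clusters, the sketch below follows the standard definition: for each cluster, take the worst-case ratio of the summed within-cluster scatters to the distance between the two centroids, and then average these ratios over all clusters. The function name and the toy data are hypothetical, and Euclidean distance is assumed.

```python
from math import dist

def davies_bouldin_index(points, labels):
    """Davies-Bouldin Index for a labelled set of points (lower is better).

    Requires at least two clusters with distinct centroids.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    centroids = {
        label: tuple(sum(dim) / len(members) for dim in zip(*members))
        for label, members in clusters.items()
    }
    # Scatter: mean distance of a cluster's members to its centroid.
    scatter = {
        label: sum(dist(p, centroids[label]) for p in members) / len(members)
        for label, members in clusters.items()
    }
    total = 0.0
    for i in clusters:
        total += max(
            (scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)

# Toy example: the same four points under a good and a poor partition.
toy_points = [(0, 0), (0, 2), (10, 0), (10, 2)]
good = davies_bouldin_index(toy_points, [0, 0, 1, 1])
poor = davies_bouldin_index(toy_points, [0, 1, 0, 1])
```

In the toy example, the partition that groups nearby points yields a much smaller index than the one that mixes distant points, which is the property used to compare candidate K values.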
Different K values and their corresponding DBI
Classification prediction results of different classifiers
Broken line graph of the three learning behaviour clustering centres.
The lowest DBI value was obtained for
Active extracurricular learners are learners with average or low performance in classroom learning behaviours but very good performance in extracurricular learning behaviours. This group of learners has poor performance in basic learning behaviours, but their performance in self-regulated and extended learning behaviours is much higher than that of other students (scientific research students).
Comprehensive developmental learners are students who perform well (close to the mean level) in both in-class and extracurricular learning behaviours. This group of learners has basic learning behaviours that are close to the mean and self-regulated and extended learning behaviours that exceed the mean. In general, these students excel in all areas and are the elite students in their classes, majors, and colleges.
Active in-class learners are students who have good in-class learning behaviours but poor extracurricular learning behaviours. The basic learning behaviours of these learners are good, but their performance of self-regulated and extended learning behaviours is lower than the mean level. In addition, learners in this category focus mainly on textbook knowledge and have weak practical skills. Therefore, the cultivation of independent and creative abilities should be strengthened.
Negative learners are students who perform poorly in both in-class and extracurricular learning behaviours (values of B1–I6 are very low). This group of students needs more focused attention, including active guidance in the classroom, interest point tapping, and early intervention.
The analysis of various types of learners can help teachers adopt learner-specific intervention strategies to achieve personalised teaching and learning.
As mentioned above, basic, self-regulated, and extended learning behaviours were clustered into four categories of learners: active extracurricular learners, comprehensive developmental learners, active in-class learners, and negative learners. Six different machine learning algorithms, namely logistic regression, K-nearest neighbour, decision tree, support vector machine, random forest, and XGBoost, were used to predict the learner categories. In addition, 70% of the sample data were randomly selected as the training set and the remaining 30% were used as the test set. The classification results are shown in Table 6. The results were evaluated using four metrics commonly used in classification evaluation: overall accuracy, precision, mean F1, and mean recall [26].
The random forest algorithm achieved the best classification results with an overall accuracy of 0.954, precision value of 0.872, mean F1 value of 0.887, and recall value of 0.917. Thus, it can effectively predict student learning behaviour and provides a reference for implementing personalised teaching.
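The evaluation protocol described above (a random 70/30 split followed by accuracy, precision, recall, and F1) can be sketched as follows. This is an illustration only: the function names are hypothetical, the classifiers themselves are not reproduced, and the per-class metrics are assumed to be macro-averaged (an unweighted mean over the learner categories), because the averaging scheme is not stated.

```python
import random

def train_test_split(samples, train_fraction=0.7, seed=0):
    """Shuffle the samples and split them into training and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def macro_metrics(y_true, y_pred):
    """Overall accuracy plus macro-averaged precision, recall, and F1."""
    classes = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / len(classes),
        "recall": sum(recalls) / len(classes),
        "f1": sum(f1s) / len(classes),
    }

# With 173 students, a 70/30 split gives 121 training and 52 test samples.
train_ids, test_ids = train_test_split(range(173))
```

Macro-averaging weights every learner category equally regardless of its size, which is one reason why overall accuracy can exceed the averaged precision and recall when the categories are unbalanced.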
Conclusion
This study collected data related to students’ blended learning behaviours in the dual-platform Digital Signal Processing course during the COVID-19 pandemic from 2020 to 2022 and analysed students’ performance in basic learning, self-regulated learning, and extended learning based on 18 indicators of learning behaviours. The findings of this study are as follows: (1) At the beginning of the pandemic, teachers tended to use a multi-platform approach due to the imperfect online platforms. However, with the gradual normalization of online teaching, students’ acceptance of multiple platforms declined. In addition, poor platform interaction made students reluctant to watch the course playback; (2) The basic learning behaviours affecting students were midterm performance and viewing live courses. In addition, all indicators of self-regulated and extended learning behaviours had significant correlations; (3) The four different types of learners obtained by clustering of learning behaviours had distinctive characteristics; therefore, attention should be paid to the needs of students to balance the different types of learning behaviours; (4) Of the six machine learning algorithms used to predict the four types of learners, the random forest algorithm performed best, with an accuracy of 95.4%, which can guide teaching strategy.
This study makes the following recommendations to improve the effectiveness of blended teaching: (1) Teaching should be concentrated on a single learning platform. In addition, teachers should improve course resources, optimise live teaching content, and enhance interactivity to increase students’ interest and participation; (2) Students should engage in extracurricular learning activities to improve their self-regulated learning ability, which will facilitate their basic learning behaviours. In addition, maintaining an optimum balance between the different types of learning behaviours can improve students’ learning efficiency; (3) Personalised teaching strategies can be adopted to provide individualised learning recommendations for different learners based on their behavioural characteristics and performance predictions.
Acknowledgments
The authors acknowledge the support of Innovation and Entrepreneurship Training Program for College Students (Grant no.: 202210092005), Education and Teaching Reform Research Project of Hebei North University (Grant no.: JG202212), the second batch of school-level high-quality online open construction course “Fundamentals of Big Data Analysis” in 2020 of Hebei North University, China Information Association Education Branch “13th Five-Year Plan” (Grant no.: ZXXJ2020013), and Higher Education Science Research Planning Project of Hebei Higher Education Society (Grant no.: GJXH2019-005).
