Abstract
Estimating the flow state of students in a course makes it possible to evaluate their emotional state and the challenges they are facing. In e-learning platforms, evaluating the flow state is a complex task because it depends on the ability to extract the parameters that best reflect the activity and effort of students. In this scope, the current study proposes a method based on flow theory that aims to provide information about the students' flow state in a course taught in an e-learning environment. First, the interaction of students with an e-learning platform that comprises classical e-learning pages and a timeline tool is analyzed using activity heatmaps and deep neural networks. Then, by also taking their grades into account, the flow state of students is calculated. The resulting data are validated with a statistical analysis that also utilizes student surveys. In order to ensure that this method is applicable to various profiles, students from different faculties participated in this study. In a period where education is rapidly adapting to online lectures and e-learning platforms, the estimation of students' flow state in e-learning environments can provide useful feedback and data to students and educators.
Nowadays, e-learning systems provide an educational environment for students and for those interested in improving in their areas and staying informed about current developments in particular fields. These systems are used by a large number of people simultaneously, without any time, place or past limitations. An emerging challenge arising from providing education to different student profiles is the design of a system that can interpret the activity patterns of its users in order to adapt its content to user behavior and needs. Ideally, an e-learning system should keep track of the patterns that its users generate when interacting with it and, in an automated manner, offer suggestions to the student and/or the instructor for achieving the learning outcomes. Hence, student modelling is especially important for improving the educational impact of e-learning systems.
According to Self (1990), a student model should be able to analyze the performance of students, identify their prior and acquired knowledge, and describe their personality characteristics. Therefore, understanding the learning state of a student in e-learning systems is critical in student modelling. Moreover, this state should be in accordance with the student's behavior. The concept of flow, introduced by Csikszentmihalyi (1975), is one of the major theories that have contributed to this field. Flow theory suggests that an individual is in a learning state if there is a balance between the difficulty of an activity and the personal skill level.
A review of the usage of Csikszentmihalyi's flow theory in computerized education systems has been published by dos Santos et al. (2018). According to their findings, almost all of the studies that use flow theory analyze an educational game or present game-based techniques rather than designing a computer-based learning activity. In order to measure the flow state, these studies use surveys (Hou, 2015; Katuk & Ryu, 2011; Liu et al., 2009; Van Schaik et al., 2012), user data logs (Kiili, 2005; San Pedro et al., 2013), interviews (Faiola et al., 2013; Kiili, 2005) and face video recordings (D’Mello & Graesser, 2012). Their main goal is to investigate the correlation between the subjective flow measures and the components of these measurements, or to utilize flow-based rules to alter the content or the difficulty of the proposed activities. A recent study attempts to detect students’ flow state through electroencephalography (EEG) (Wu et al., 2020). Although the results are promising, the approach requires EEG equipment for each student. In another study using flow theory, Georgiou and Demiris (2017) proposed an adaptive user modelling approach for a car-race simulator. They utilized machine learning algorithms to extract features from the user’s physiological data and game-related actions and calculated a single value which they called “Exploration”. The performance of the subjects in the game was defined as their “Experience”. Finally, they based the validation of their variables on the reports obtained from the users. However, the defined behavioral parameters cannot be transferred to another domain since they are defined specifically for the case of a racing car simulator.
In summary, although flow theory is able to explain the learning state of students, the literature lacks objective and transferable measurements. Previous studies focus on educational games and game-based activities, and their measurements either cannot be transferred or depend on subjective reports. To the best of our knowledge, an objective measurement for calculating the flow state, as it is described in flow theory, does not exist for e-learning systems. One of the major difficulties is to define an objective and valid method that can express the way students decide to face their challenges. In other words, the difficulty is to define a method that can capture the way they interact with an e-learning platform and extract meaning from it.
Overall, the research questions of this study are the following:
Is it possible to capture the interaction of students in an e-learning environment in an automated way and use it for evaluating their flow state in a course?
To what extent can this method be generalized and applied to different student profiles and e-learning environments?
To tackle these questions, this paper proposes a system that is based on deep learning techniques and activity heatmaps. It comprises an e-learning platform that is accompanied by an interactive tool, a timeline. The system works as follows: during an e-learning course, a mechanism observes student activity and collects the interaction patterns of students. After the course, quizzes based on the content of lectures serve to collect student grades. These patterns and the corresponding grades are analyzed in order to extract the flow state for each student. In this system, the proposed deep learning model serves to reduce the dimensions of the activity patterns and has by far the best performance when compared with other known dimension reduction methods.
Figure 1 presents a general overview and some details of this study. As described above, a class of students will follow a course with the help of an e-learning platform that has two parts: a timeline tool and classical e-learning pages. Then the behavior of each student will be analyzed using activity heatmaps that capture their interaction with the platform and deep neural networks in order to generate an Activity value, which represents the activity of a student. After the course, a quiz will provide a Performance value, which denotes the acquired skills of a student. Finally, the Activity and Performance values will generate the flow state with the help of a flow state diagram. In this way, students can get information about their flow state, and an instructor, by observing the flow states of all students, can estimate whether the majority of the students are in flow.

General overview of this study.
Theoretical Background
This section presents the concept of flow theory, activity heatmaps, and deep learning techniques used in this work.
Flow Theory
Csikszentmihalyi (1975) introduced the concept of flow: a feeling experienced during an activity in which an individual is completely engaged and has high levels of focus, enjoyment, and fulfilment. The activities that produce this feeling are identified as self-motivating, and they present a balance between the difficulty of the challenge and the personal skill of the individual. Flow theory suggests that the individual experiences anxiety if the personal skill is lower than the presented challenge, and that boredom sets in if the personal skill is higher than the difficulty of the challenge. Figure 2 presents the flow state diagram as proposed by Csikszentmihalyi. The flow state diagram defines the anxiety, flow, and boredom regions according to the difficulty of a challenge and the skills of an individual.

The flow state diagram proposed by Csikszentmihalyi.
Heatmaps
A heatmap is a graphical representation of data where the values of a matrix are represented by colors. In heatmaps, when data are inserted in a cell, the neighboring data are modified by propagation mechanisms which affect their values. The colors of a heatmap depend on the values in each cell in the matrix. The most common color schemes are the rainbow color map and the grayscale color map. In the rainbow color map, large values are represented with a red color, middle values with a green color, and small values with a blue color. All other intermediate values are the intermediate colors of the above three colors. A white color indicates that there is no data on this cell. An example heatmap which uses a rainbow color map is presented in Figure 3.

A heatmap using rainbow color map.
Deep Learning Techniques
In recent years, neural networks have become more powerful and have provided solutions to problems that could not be solved before, thus introducing a new era in Artificial Intelligence. The term “Deep Learning” encompasses a new generation of neural networks, even though the fundamental ideas behind them are much older.
Autoencoders
Before the deep learning era, the training of neural networks with many layers was problematic and restricted by factors like the number of resources, the size of data needed to achieve a satisfactory result, and the vanishing gradient problem. A solution to this issue was given by autoencoders which were originally introduced by Rumelhart et al. (1985). Essentially, autoencoders are a dimension reduction method in which the abstract representations of a set of data are obtained by the reconstruction of data through a set of fully connected networks. The goal is to reproduce an output that will be exactly the same as the input. A network consists of three layers: the input layer, the output layer, and the hidden layer. In autoencoders, each network is trained individually in an unsupervised manner. At the end of each training, the hidden layer represents the input data which usually has fewer attributes. As a result, this process reduces the dimensions of the input data. Figure 4 presents the architecture for an autoencoder with a single fully connected network.

An autoencoder architecture.
Convolutional Neural Networks
Convolutional Neural Networks (CNNs) are among the most successful neural networks of recent years. Their success lies in their ability to find local patterns. In object recognition, they have even been reported to achieve better accuracy than human perception. LeCun et al. (1988) proposed CNNs that use convolutional operations, which are known to be capable of extracting patterns from data. They defined three layers: a convolutional layer, a pooling layer, and a classification layer. In the convolutional layer, the input data are convolved with a series of filters. The shape and the number of filters are set empirically based on the data and the expected output. The output of the convolutional layer is transferred to the pooling layer, where data are resampled by using a maximum or average function. After each convolutional and pooling layer, the output becomes smaller. In the end, after a certain number of layers, an abstract representation of the input data is created. The classification layer is used only if the goal is a classification task. This layer can be a fully connected neural network. Figure 5 presents a CNN architecture with a fully connected classification layer at the end.

A Convolutional Neural Network architecture.
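The shrinking of the output after a convolutional and a pooling layer can be illustrated with a minimal sketch in plain Python. The toy image, the 3×3 filter, and the function names are assumptions made for this illustration and are not taken from the study. As in most deep learning libraries, the filter is applied without flipping.

```python
def convolve2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1, no flipping)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Toy 6x6 input with a vertical edge between columns 2 and 3.
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
# Hypothetical 3x3 filter that responds to left-to-right intensity increases.
kernel = [[-1, 0, 1]] * 3

features = convolve2d_valid(image, kernel)  # 6x6 input becomes a 4x4 feature map
pooled = max_pool(features)                 # 4x4 feature map becomes 2x2
```

In this toy case the filter responds only where its window spans the edge, and pooling keeps that response while halving each dimension.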
Convolutional Autoencoders
CNNs are trained in a supervised manner and therefore need a labelled dataset. Consequently, they cannot be used with unlabeled data. Masci et al. (2011) introduced convolutional autoencoders (CAEs), a method that performs dimension reduction without a supervised training procedure. The goal is to regenerate the input image as the output image through a set of convolutional and pooling layers (encoder layers) and up-sampling layers (decoder layers). The encoder layers decrease the size of the data, while the decoder layers increase it and attempt to reproduce the original input image. Figure 6 presents a convolutional autoencoder architecture. The ‘Features’ layer between the encoder and decoder parts is the abstract representation of the input image.

A convolutional autoencoder architecture.
Research Method
In order to define a flow state model from behavioral data, two concepts are introduced and adapted from the original flow state model: Activity and Performance. The research method of this work is shown in Figure 7 and the current section follows this description. The participants are students from different faculties, in order to ensure diversity in student profiles. Two different courses are taught in a controlled environment via an e-learning platform that also comprises a timeline tool. A course finishes with a quiz. Then, for each student, a single Activity value (student’s interaction) is calculated with a method that includes deep neural networks, and a single Performance value (student’s acquired skills) is extracted from the quiz results. In order to verify the validity of the above values, students fill in surveys that provide information about their emotional state. Finally, by using the ‘Activity’ and ‘Performance’ values, each student is represented in a flow state diagram. When all students are displayed in the diagram, the flow state of the course can be observed and the course can be evaluated based on students' flow.

A diagrammatic representation of the research method.
Activity and Performance
In this study, Activity is defined as the measurement of a student’s interaction in a learning environment. In case of a difficult lecture, a thorough search of critical information is required and a student needs to intensively interact with the course content in order to acquire the necessary knowledge. On the other hand, in a lecture with easy content, the intended knowledge can be accessed with less effort. Hence, the students do not have to present a high activity. If a student needs to be highly active in interacting with the course content, this student would perceive the course as difficult. Based on these assumptions, this study proposes to measure the Activity of students through their interaction in order to reflect the level of challenge or difficulty.
Performance is defined as the evaluation of a student’s knowledge. The examination is the formal test of an individual’s knowledge or skill and the grade obtained in an exam indicates the individual’s performance. In general, a highly skilled student has the ability to acquire most of the required information from the content of a course and as a result to obtain high grades in exams. If the student is not skilled enough, the student has difficulty obtaining the necessary knowledge and it is reflected in the exams with a poor performance. Hence, this study proposes to measure the skills of students through their exam results thus corresponding to their Performance.
If a balance between the Activity and the Performance exists, students have the sense of accomplishment which induces the feeling of Flow. In case of imbalance, students experience a feeling of Anxiety or Boredom. According to Csikszentmihalyi, Anxiety occurs when the difficulty of a challenge is high and the individual is less skilled. If students obtain low grades despite an intensive activity, the disappointment created by this outcome induces anxiety. The flow theory suggests that Boredom sets in when the challenge is easy and the student is highly skilled. In the case of this study, the students would lose interest and start to get bored if they obtain high grades with a low activity. Figure 8 illustrates the relations between the above concepts.

Graphical representation of the relations between the concepts of this study and Flow Theory.
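The mapping from Activity and Performance to the three regions can be sketched as follows. This is only an illustration under stated assumptions: both values are assumed to be scaled to [0, 1], and the `band` parameter (the half-width of the flow region around the diagonal) is hypothetical, since the study reads the regions from the flow state diagram rather than from a fixed threshold.

```python
def flow_state(activity, performance, band=0.2):
    """Classify a student from Activity (challenge) and Performance (skill).

    Both inputs are assumed to lie in [0, 1]. `band` is a hypothetical
    half-width of the flow region around the Activity = Performance diagonal.
    """
    gap = activity - performance
    if gap > band:
        return "anxiety"   # intensive activity but low grades
    if gap < -band:
        return "boredom"   # high grades with little activity
    return "flow"          # activity and performance are balanced


for a, p in [(0.9, 0.3), (0.2, 0.9), (0.5, 0.6)]:
    print(a, p, flow_state(a, p))
```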
Participants
The participants were students from two different departments of different faculties, in order to guarantee that the results would not be biased by a specific student profile. In each department, the students followed one e-learning course.
In the first course, the participants were students from the Department of Computer Engineering. There were 52 students, ranging from 19 to 21 years old and 11 of them were female students. The participants of the second course were students from the Department of Political Science and International Relations. There were 35 students, ranging from 19 to 21 years old and 27 of them were female students. All of the students who attended the lectures were in their bachelor’s degree studies and their participation was voluntary.
E-Learning Platform
The e-learning platform comprises a web page with a timeline tool to accompany classical e-learning web pages that contain materials such as electronic documents and videos.
The timeline tool consists of two parts: a timeline and a map. The timeline allows students to travel in time through the use of a timeline structure and better understand chronological events and the map helps students to see the evolution of events in a particular place where the events will appear gradually, according to the evolution of time. The map and the timeline work together: the map is updated according to changes made in the timeline in order to visualize the items of the timeline in spatial domain. Information about the items is given with the help of pop-ups: all the items in both map and timeline structure can activate corresponding pop-ups that contain detailed information about the clicked item.
The timeline tool is very important for extracting the interactions of a student with the e-learning platform. As it is rich in content and structure, by manipulating the timeline bar, by opening pop-ups, by exploring the map, students generate behavioral patterns that are based on the mouse activity and the locations they visit.
The e-learning web pages consist of four types of lecture materials: documents, videos, links, and quizzes. The instructors can upload electronic documents about their lectures, provide video links from online video sharing platforms, prepare online quizzes, and create links to the timeline tool. Figure 9 demonstrates the timeline tool and the e-learning web pages interfaces prepared for the first course.

Presentation of a course in both interfaces of the e-learning platform: (A) Timeline Tool and (B) E-learning Web Pages.
Courses
In this study, two courses were taught: one in the Department of Computer Engineering and another in the Department of Political Science and International Relations. For every course, only one lecture was conducted in a controlled environment and was followed by a quiz and a survey. Every lecture had a total duration of about two hours. The content of the two courses was of average difficulty. Two hours were enough in order to extract sufficient information about the students’ activity as the data collected are mouse clicks and locations visited by mouse movements. Concerning mouse activity, for the course held in the Department of Computer Engineering, an average of 149.52 (SD = 89.78) mouse clicks and an average of 2082.29 (SD = 850.24) visited locations was measured. For the course held in the Department of Political Science and International Relations, an average of 83.97 (SD = 45.42) mouse clicks and an average of 2278.59 (SD = 1117.92) visited locations was also reported.
Material Organization
In order to ensure that students will explore both interfaces, the course content was divided between the e-learning web pages and the timeline tool as follows: three small chapters in the form of electronic documents and a timeline which includes information for a number of related prominent people, correspondingly. These important people were placed in the timeline tool map according to their lifelong activity. For each of them, information about their life and work was available with the help of pop-up windows, as displayed in Figure 9.
Courses Details
The first course, entitled “History of People in Computation”, was taught to the students of the Department of Computer Engineering. Its content included a timeline entitled “History of People in Computation” and three small chapters entitled “History of Computing”, “History of Artificial Intelligence” and “History of Deep Learning”.
The second one entitled “Regional Studies” was followed by the students of the Department of Political Science and International Relations. Figure 10 shows a picture of the lecture that was given for this course. Its content also included a timeline and three small chapters: “Iran-Iraq Wars”, “Arab-Israeli Wars”, and “American-Arab Wars”.

The lecture of the “Regional Studies” course.
Data Collection
The data are collected with the following methods: interaction data taken from the e-learning platform, quiz results after the lectures, and the student surveys.
E-Learning Platform Interaction Data
The e-learning platform is equipped with a web analytics tool that provides raw statistics for web-based applications such as mouse click coordinates, the locations visited with the mouse, access times, total page view duration, active page view duration, etc. These measurements are obtained from the timeline tool and the e-learning web pages separately. The measurements calculated from the raw statistics are presented in Table 1.
The Definitions of the Statistical Measurements.
Quizzes
At the end of the lecture, the students were asked to answer ten multiple-choice questions in a closed-book exam. Five questions were drawn from the timeline tool content and five from the e-learning web pages content. The quizzes were prepared by the instructors at an average level of difficulty, and the questions reflect only the information that can be acquired from the content of the online courses.
Surveys
After the end of the course, students filled in a survey that provided information about their emotional states during the lectures. The survey is adapted from the Flow State Scale proposed by Jackson and Marsh (1996), which is one of the most widely used flow state identification techniques.
Data Process
In this section, we present the methods for calculating the Activity, the Performance, and also the validation of Activity and Performance values using the surveys’ results.
Calculation of Activity
Activity is based on the interaction of the user with the system, such as the number of clicks, mouse moves, and the time spent exploring the system. Based on this interaction, a number of features can be chosen directly or calculated indirectly using different methods, as will be discussed in this section.
Let,
An activity vector of a student in
Based on the work of Georgiou and Demiris (2017), for each course, we defined an Expert student. The Expert model represents a student who is fully engaged and reflects the best performance in terms of grades and engagement. It is defined as follows:
the minimum total amount of time spent in the platform after logging in, while having the maximum total amount of time performing actions in the platform and performing necessary operations with minimum mouse actions.
As time is one of the constraints for the expert, it is very rare that more than one student will have the same time values. Nevertheless, if this event occurs, one of these students is chosen randomly. The algorithm for finding the Expert student in a class is based on the most effective time and is presented in Figure 11. It should also be noted that an Expert student is chosen according to the selection algorithm separately for each course.

The algorithm to find the Expert.
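One possible reading of the Expert definition is sketched below. The lexicographic ordering of the three criteria, the `StudentLog` record, and its field names are assumptions made for this illustration; the algorithm of Figure 11 may combine the criteria differently.

```python
import random
from dataclasses import dataclass


@dataclass
class StudentLog:
    name: str
    total_time: float    # time on the platform after logging in
    active_time: float   # time spent performing actions
    mouse_actions: int   # mouse clicks plus mouse movements


def find_expert(logs, rng=random):
    """Pick the Expert with a lexicographic ranking (an assumption of this sketch)."""
    def key(student):
        # Smaller is better for total time and mouse actions;
        # larger is better for active time, hence the minus sign.
        return (student.total_time, -student.active_time, student.mouse_actions)

    best = min(key(s) for s in logs)
    candidates = [s for s in logs if key(s) == best]
    # A random pick only matters in the rare case of an exact tie.
    return rng.choice(candidates)


logs = [
    StudentLog("a", 100, 80, 50),
    StudentLog("b", 90, 70, 60),
    StudentLog("c", 90, 85, 40),
]
expert = find_expert(logs)
```

The function would be called once per course, since an Expert is selected separately for each class.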
The Expert student expresses the optimal attention and performance in a particular course given to a particular class of students. As a result, the comparison of students with the Expert gives a metric about the attention of the students and by consequence the quality of their activity. In order to estimate this resemblance, the similarity between the vector of a student and the vector of the expert is calculated using the dot product of the above vectors. Finally, Activity is calculated from Equation 1:
Choosing the features of the activity vector is a challenging task because we have to select features that are capable of best-representing student’s interaction with the e-learning platform. There are many alternatives, like selecting directly the interaction data only (time and mouse interactions), or further processing the interaction data with statistical or deep learning methods. All of them are presented in the following section.
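The comparison of a student with the Expert through the dot product can be sketched as follows. Dividing by the vector norms (which turns the dot product into a cosine similarity) is an assumption of this sketch and is not necessarily the form of Equation 1.

```python
import math


def activity(student_vec, expert_vec):
    """Dot product of the two vectors, normalized by their lengths.

    The normalization is an assumption of this sketch; it keeps the
    result in [-1, 1] regardless of the scale of the features.
    """
    dot = sum(s * e for s, e in zip(student_vec, expert_vec))
    norm = (math.sqrt(sum(s * s for s in student_vec))
            * math.sqrt(sum(e * e for e in expert_vec)))
    return dot / norm if norm else 0.0


print(activity([0.5, 0.25, 2.5], [0.9, 0.1, 1.5]))
```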
Calculation of Student’s Activity Vector
The feature vector of a student can be defined in one of the following ways:
Interaction data only, as described in the “Calculation of Activity” section
Activity heatmaps generated from the interaction data and processed with convolutional autoencoders (deep neural networks)
Other dimension reduction methods which use the above activity heatmaps
For each course we generated a vector with interaction data only, a vector obtained from activity heatmaps using convolutional autoencoders, and 22 activity vectors from eight different dimension reduction methods using 22 configurations in total. The vector generation procedures for each method are applied for each course individually. These methods are presented analytically in the following sections.
Interaction Data Only
In the measurement calculated from the interaction data only, a vector of a student
These features are normalized values of the interaction data described in Table 1: the time spent actively (ts), the number of mouse clicks (mc), and the mouse movement paths (mm). They are normalized using the total time spent (tt).
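A minimal sketch of this normalization is given below. The function name and the three-feature layout are assumptions; in particular, the study collects the measurements separately for the timeline tool and the e-learning web pages, so the actual vector may hold one such triple per interface.

```python
def interaction_vector(ts, mc, mm, tt):
    """Normalize active time, mouse clicks, and mouse movement paths by total time.

    The layout [ts/tt, mc/tt, mm/tt] is an assumption of this sketch.
    """
    if tt <= 0:
        raise ValueError("total time spent must be positive")
    return [ts / tt, mc / tt, mm / tt]


# Hypothetical student: 60 active minutes out of 120, 30 clicks, 300 movement paths.
vector = interaction_vector(ts=60, mc=30, mm=300, tt=120)
```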
Activity Heatmaps and Convolutional Autoencoders
The location information of mouse clicks and moves is not taken into account in the previous approach and it is a factor worthy of being examined. To conduct this investigation, for each student two heatmaps are generated using the mouse movement and mouse click locations, one from the timeline tool and another from the e-learning web pages. In order to generate a heatmap, we create an image corresponding to the size of the interface and we collect all the locations visited or clicked with the mouse when a student is working inside the aforementioned pages. When the student is not working on these pages, data are not collected for the heatmap generation process. Then, we associate a particular color to each pixel according to the number of visits. In order to obtain the final activity heatmap, a propagation algorithm distributes pixel intensity values to their neighbor pixels. This procedure ensures that the values in the heatmap will have a smooth transition among pixels. Figure 3 presents an example of a heatmap generated from the interactions of a student.
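The heatmap generation steps can be sketched in plain Python as follows. The propagation step used here is a simple 3×3 averaging with zero padding, which is an assumption of this sketch; the propagation algorithm of the study is not specified in this section, and the mapping of values to colors is omitted.

```python
def spread(grid):
    """Distribute each value over its 3x3 neighborhood (zero-padded box blur)."""
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total = 0.0
            for rr in range(max(0, r - 1), min(h, r + 2)):
                for cc in range(max(0, c - 1), min(w, c + 2)):
                    total += grid[rr][cc]
            out[r][c] = total / 9.0
    return out


def activity_heatmap(points, width, height, passes=1):
    """Count visits per pixel, then smooth the counts `passes` times."""
    grid = [[0.0] * width for _ in range(height)]
    for x, y in points:
        # Locations outside the interface are ignored.
        if 0 <= x < width and 0 <= y < height:
            grid[y][x] += 1.0
    for _ in range(passes):
        grid = spread(grid)
    return grid


# Two visits to the same pixel of a hypothetical 5x5 interface.
heat = activity_heatmap([(2, 2), (2, 2)], width=5, height=5)
```

After one pass, the two visits are shared equally among the visited pixel and its eight neighbors, which gives the smooth transition between pixels described above.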
After generating an activity heatmap, a feature extraction method is needed to find the critical behavioral patterns and extract the interaction information from these images. For this task, we propose to use convolutional autoencoders. In recent years, convolutional autoencoders have shown significant progress in feature extraction and dimension reduction by finding local patterns in multi-dimensional data. Since the heatmaps contain mouse movement patterns in color image format, it is worthwhile to take advantage of the power of convolutional autoencoders and examine whether student interaction patterns can be successfully extracted.
The proposed model consists of an encoder and a decoder. The encoder has three convolutional layers that have filters with a size of 3
The features of size 3

Proposed convolutional autoencoder model architecture.
In summary, two activity heatmaps are generated, one from the total mouse activities of the timeline tool and another from the total mouse activities of the e-learning web pages. Each heatmap is entered into a separate convolutional autoencoder and each neural network generates 48 features. Finally, all features are concatenated, producing a single feature vector that corresponds to the student’s activity vector

Generation of a student’s activity vector
Other Dimension Reduction Methods
To assess whether there is another method that can extract features that represent a student’s interaction with the e-learning platform, we examined some of the most important dimension reduction methods: Principal Component Analysis (PCA) (Pearson, 1901), Factor Analysis (Garrett, 1953), Independent Component Analysis (Comon, 1994), Isometric Feature Mapping (Tenenbaum et al., 2000), Multidimensional Scaling (Kruskal, 1964), Spectral Embedding (Shi & Malik, 2000), t-distributed Stochastic Neighbor Embedding (t-SNE) (Maaten & Hinton, 2008), and Gaussian Random Projection (Bingham & Mannila, 2001). Table 2 shows the number of selected components for these methods. These parameters are chosen empirically based on the amount of data for each lecture and the method requirements.
Dimension Reduction Methods Used for Extracting Student Interaction Features.
For each dimension reduction method, two or three configurations with different numbers of components were tested. In PCA, the first configuration uses 20 components because, according to the cumulative explained variance method, they express more than 95% of the study data. A second configuration was tested with 32 components, which express around 97% of the data according to the same method. For this reason, a third configuration was not required. In t-SNE, only two configurations were tested, with 2 and 3 components respectively, since this method aims to reduce the dimensions for data visualization (only 2 or 3 dimensions are needed). For all other methods, three configurations were tested with a minimum, a near-average, and a maximum number of components in order to investigate different dimension ranges.
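The cumulative explained variance criterion used for PCA can be sketched as follows. The function name and the example variances are hypothetical; in practice the per-component variances would come from a PCA fitted on the heatmap data.

```python
from itertools import accumulate


def components_for_variance(variances, target=0.95):
    """Return the smallest number of components whose cumulative share
    of the total variance reaches `target`."""
    ordered = sorted(variances, reverse=True)
    total = sum(ordered)
    for k, cumulative in enumerate(accumulate(ordered), start=1):
        if cumulative / total >= target:
            return k
    return len(ordered)


# Hypothetical per-component variances: the first three explain 90% of the total.
k = components_for_variance([5, 3, 1, 0.5, 0.5], target=0.85)
```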
Calculation of Performance
Performance is a straightforward measure and it is calculated directly from the quiz results as a singular value. It shows if a student managed to explore and understand the content of the course. The singular Performance value for each student is calculated by collecting the number of correct answers the student gave and normalizing this value with the number of questions in the quiz.
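As a minimal sketch of this normalization (the function name and the default of ten questions, taken from the quiz description, are choices made for this illustration):

```python
def performance(correct_answers, total_questions=10):
    """Fraction of quiz questions answered correctly, in [0, 1]."""
    if not 0 <= correct_answers <= total_questions:
        raise ValueError("correct answers must be between 0 and the number of questions")
    return correct_answers / total_questions


score = performance(7)  # 7 correct answers out of 10
```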
Validation of Activity and Performance
In order to verify the reliability of the Activity and Performance values a statistical analysis that takes into account the results of the student surveys should be conducted. The analysis method depends on the results of the normality test for each variable. This analysis is performed on SPSS (IBM Corp., Released 2013) and it is based on the Shapiro-Wilk test for normality (Shapiro & Wilk, 1965), Pearson correlation (Pearson, 1895) for normally distributed pairs, and Spearman’s rank correlation (Spearman, 1904) for pairs that do not have normal distributions. This analysis has three goals:
Examine if there is a significant correlation between the survey results and the Performance, as extracted from the grades of the students in the quizzes.
Examine if there is a significant correlation between the survey results and the students' Activity values.
Verify that there is no significant correlation between Activity and Performance, since high activity should not imply high performance and vice versa.
If these three goals are achieved that would mean that the Activity and Performance values are not linearly related and are coherent with the interaction and performance of the students in the course.
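The choice between the two correlation coefficients can be sketched in plain Python as follows. The study performs this analysis in SPSS; the sketch below only computes the coefficients, takes the outcome of the normality test as an input flag, and does not compute p-values. All function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def correlate(x, y, both_normal):
    """Pearson for normally distributed pairs, Spearman otherwise."""
    return pearson(x, y) if both_normal else spearman(x, y)
```

For example, `correlate(survey_scores, activity_values, both_normal=False)` would return Spearman's coefficient for a pair in which at least one variable deviates from normality (the two argument names are placeholders for the collected data).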
Results
In this section, we present the results of this study. As will be shown, for both courses the Activity value calculated from the activity heatmaps and the convolutional autoencoder was the most successful method for extracting representative interaction features of student activity. Survey results and Performance also present a significant correlation. Moreover, Activity and Performance are not significantly correlated with each other. Finally, for each course the survey results and student flow state diagrams are presented.
Pairwise Correlation Analysis of Course 1: History of People in Computation
A pairwise correlation analysis is performed for the variables of the first course. The results are presented in Table 3. The analysis used data from 50 students instead of 52, since one of the students was chosen to be the expert and another was omitted from the analysis because that student did not complete the survey. In the quiz results, the minimum grade is 4 and the maximum is 10 (μ = 6.92, SD = 1.39).
Results of Pairwise Correlation Analysis in Lecture 1: History of People in Computation.
a These variables present normal distributions according to the Shapiro-Wilk normality test.
b These variables significantly deviate from a normal distribution.
c This field includes the correlation analysis results of all the dimension reduction methods that are not named in the table. There are 21 different variables where p > .102.
*Pearson correlation analysis.
**Spearman’s rank correlation analysis.
According to Table 3, we observe that:
There is a significant correlation between the survey results and Performance. Consequently, the grades can be used for extracting the Performance value.
The Activity value calculated with heatmaps and convolutional autoencoders has a significant correlation with survey results. As a result, this method can acquire features representing the interaction of a student.
The Activity value does not correlate with Performance, thus showing that a student can present high activity without a corresponding high performance and vice versa.
The Activity calculated from the t-distributed Stochastic Neighbor Embedding results also showed a significant correlation with survey results, but its r values were lower than those of the proposed method. The other methods did not show a significant correlation with the survey results.
Pairwise Correlation Analysis of Course 2: Regional Studies
For the second course, a pairwise correlation analysis similar to that of the first course is performed. The results are presented in Table 4. The analysis was performed on 29 students instead of 35, since one of the students was chosen to be the expert and five of the students were omitted from the analysis because they did not complete the surveys. In the quiz results, the minimum grade is 4 and the maximum is 10 (μ = 7.03, SD = 1.62).
Results of Pairwise Correlation Analysis in Lecture 2: Regional Studies.
a These variables present normal distributions according to the Shapiro-Wilk normality test.
b These variables significantly deviate from a normal distribution.
c This field includes the correlation analysis results of all the dimension reduction methods that are not named in the table. There are 19 different variables where p > .106.
*Pearson correlation analysis.
**Spearman’s rank correlation analysis.
From Table 4, we can observe that the results are similar to those of the first course:
Survey results correlate with Performance.
The Activity value calculated with heatmaps and convolutional autoencoders correlates with survey results.
The Activity value does not correlate with Performance.
The Activity values calculated from the factor analysis, isometric feature mapping, and Gaussian random projection methods with two components also showed a significant correlation with survey results, but their r values were lower than those of the method using activity heatmaps and convolutional autoencoders. The other methods and the other component numbers did not show a significant correlation with survey results.
Flow State Diagrams of the Two Courses
After calculating the Activity and Performance for each student, each course can be represented with a flow state diagram. Figure 14 presents the flow diagrams for the first and second courses. In this study, the exact positions of the flow region borders that separate Anxiety, Flow, and Boredom are based on the study of Lynch and Ghergulescu (2017), in which the region locations were defined after an analysis including 7614 students and 25945 lessons. The flow diagrams show that in both courses the majority of the students are in flow. This result suggests that the difficulty of the course content was generally appropriate for most students. We can suppose that making the course content harder would move some of the students to the Anxiety region, while making it much easier would cause students to lose interest and shift them towards the Boredom region.

Flow state diagrams for (A) Course 1: History of People in Computation and (B) Course 2: Regional Studies.
Student Surveys
The results of the student surveys showed that in both courses most students were in a positive sentimental state. The main factor that contributed to this result is the average difficulty of the course. Based on their answers, we can expect that most of them should be placed in the flow channel. Table 5 presents the student survey results.
Student Survey Results From the Courses on a Likert Scale.
Discussion
In this section, we discuss the research questions, the findings, and the limitations of this study.
Research Questions
Is It Possible to Capture the Interaction of Students in an E-Learning Environment in an Automated Way and Use it for Evaluating Their Flow State in a Course?
One of the challenges of this work was to extract the correct parameters related to the interaction of students during a course. The correlation analysis verified that the proposed method, which uses convolutional autoencoders with activity heatmaps, presents statistically significant correlations with survey results and higher r-values than all other methods. Hence, according to the correlation analysis, the concept of Activity, which depends on the interaction patterns, can be combined with the concept of Performance to estimate the flow state of students.
To What Extent Can This Method Be Generalized and Applied to Different Student Profiles and E-Learning Environments?
The current method was applied to two different student profiles and presented similar results despite the fact that the content of the courses was different. During this study, we could not detect a specific reason why the method would be unsuitable for students from a particular discipline. Concerning the generalization of the method, we expect that it can be applied to other e-learning environments, provided that their interface ensures sufficient mouse activity, thus allowing the extraction of behavioral patterns. In order to ensure that the method is applicable to other e-learning environments, we excluded the use of any external equipment other than a mouse for extracting behavioral data.
Summary of Findings
Deep Learning Methods Are Able to Extract Student Behavioral Patterns Based on Mouse Interactions Whereas Other Methods Performed Poorly in Terms of p-Values and r-Values
The results showed that the proposed method using activity heatmaps and convolutional autoencoders yielded statistically significant correlations and higher r-values than all other methods in both courses. Capturing the location information of mouse interactions in the activity heatmaps may be the main factor behind the success of this method. The ability of convolutional autoencoders to extract topological patterns from heatmaps is a determining factor, since student interactions rely highly on the content and the location of the items presented in the e-learning platform. Usually, deep learning techniques require a much larger amount of training data than was used in this study. Here, the convolutional autoencoders are used as a feature extraction method and the extracted features are acceptable, even though the amount of data collected from the experiment lectures is relatively small.
In the Context of an E-Learning Course, Activity and Performance Can Be Defined as Uncorrelated Concepts
The results of the pairwise correlation analysis between the Activity and the Performance values showed that there is no significant correlation between them. This finding is important in order to define a flow state diagram where Activity and Performance are uncorrelated concepts. In other words, a student can present high activity without a corresponding high performance and vice versa.
The Flow State Diagram Provides Feedback That Corresponds to Survey Results
In the flow diagrams of the two courses, we observed that most of the students were in flow. This result is consistent with the results of the student surveys and was verified by statistical analysis. It can also be verified empirically by simple observation of the student surveys.
Limitations
The current e-learning environment, apart from the classic e-learning web pages, comprises a timeline combined with a map, which helps to ensure sufficient mouse activity. This method would not be suitable for e-learning platforms where the mouse activity is low, because behavioral patterns might not be correctly extracted. However, to the best of our knowledge, most e-learning environments require sufficient mouse-based interaction in order to explore their content. It should also be noted that mouse movements give no information about other cognitive indicators such as eye position or brain activity.
Conclusion
This study proposed a new method for evaluating the flow state of students in a course taught in an e-learning environment. This was possible because the method successfully measured the interaction of students with an e-learning platform using activity heatmaps and deep neural networks. This method, which depends on time, mouse interactions, and grades, is not specific to a particular course or field, can be generalized to other e-learning courses and MOOC platforms, and can serve as a basis for an intelligent tutoring system. Further experiments need to be done in order to verify the applicability of this system to various types of e-learning platforms. The proposed evaluation method could also be used to generate personalized feedback for students. In a future study, we could investigate the effects of personalized feedback on learning by providing consecutive feedback based on objective measurements after each lecture. In an era where education is adapting rapidly to online lectures and e-learning environments, providing automated and valid feedback to students and educators becomes a significant element of education, and the current study aims to contribute in this direction.
Footnotes
Acknowledgments
We would like to thank Muzaffer Özgüleş, Micaela Antonucci, Aneta Poniszewska-Marańda, and all the team members who participated in the development of the Timeline Travel platform.
Ethical Approval
Participation was voluntary and the data was collected anonymously. Appropriate permissions and ethical approval for the participation were requested and approved. This study was also approved by the Yeditepe University Ethical Committee.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study is partially supported by the Erasmus+ Programme of the European Union [2017-1-TR01-KA203-046818] and awarded with the “Good Practice” prize.
