An IRT-based approach to assess the learning gain of a virtual reality lab students’ experience

Abstract

Current trends in education toward distance and personalized learning call for further investigation of the educational benefits of Virtual Reality (VR) software in laboratory science courses. In this study, we investigated the teaching effectiveness of an innovative VR-oriented method versus a more traditional pedagogical approach by measuring the Pre-Post change, commonly called Learning Gain (LG). An Item Response Theory model, the Rasch Model (RM), was used to assess the LG as the difference between the students’ ability before and after the educational treatment. The participants ( $N=54$ ), undergraduate students enrolled in the Department of Primary Education at the University of Patras, Greece, were divided into two groups and followed two different scenarios to be educated on the topic of microscopy. Our findings provided evidence in favor of using simulations as a supplementary tool in the learning procedure. According to the LG analysis, the students who interacted with the VR software showed a larger change in ability than their fellow students, who followed a more classic teaching methodology and obtained no LG between the Pre-Test and the Post-Test.

Keywords

Learning gain, graduate education/research, laboratory biology courses, virtual reality, Rasch model, Onlabs

1. Introduction

Virtual Reality has drawn much attention in recent decades. Many studies agree that it can successfully enhance and support conventional and open, undergraduate and postgraduate education [1]. VR-based educational tools may offer a highly interactive, self-paced, cost-effective and safe learning experience that overcomes the limitations of laboratory facilities, insufficient technician support, class size and low funding for traditional educational experiments [2, 3]. Makransky and Lilleholt [4] mentioned that many business analyses and reports predicted that VR would be the biggest future computing platform of all time, as it could revolutionize the entertainment, gaming and education industries [5, 6]. In Biology, new Information and Communication Technology (ICT) applications such as virtual labs contribute to the teaching methods so that educators can overcome the educational problems that arise from the complexity of these courses. Data gathered through the participation of students in educational scenarios are an invaluable asset to researchers, who can use analytics techniques to draw conclusions and identify hidden patterns and trends [7, 8, 9, 10, 11].

Over the past decades, several comparative research studies have attempted to find out whether the use of physical manipulatives is more effective for students’ learning than the use of virtual manipulatives [12, 13, 14]. According to Paxinou et al. [15] and Makransky et al. [16], VR technology has proved to be a promising supplementary tool to the traditional teaching methods in laboratory biology courses. The authors mentioned that students who were trained on their lab exercises via a VR application obtained statistically significantly higher test scores, were more certain of the knowledge gained and exhibited a greater ability in conducting the microscopy experiment in the physical lab than their fellow students who only attended more traditional teaching methods [17]. Many supporters of VR technology believe that this alternative educational approach facilitates learning because the human brain perceives and assimilates a three-dimensional (3D) computer-graphics representation more easily than simple text [18]. Review papers describe the research methodologies used in the area of adaptive systems such as 3D virtual learning environments [19]. Many studies show that simulations can be a very promising and affordable tool for learning and instruction [20, 21, 22], especially for users who are not familiar with information technologies [23]. Virtual laboratories have overall positive effects on students’ cognitive load, skills development and motivation [4]. Virtual learning simulations provide students and trainees with cost-effective teaching methods that enhance both cognitive and non-cognitive outcomes [24]. According to Trundle and Bell [25], VR laboratories are useful educational tools as they highlight significant information and remove unnecessary details, thereby making the educational process more effective. In addition, like all modern ICT educational applications, virtual applications have general features that can support constructive learning [26], and they are very effective in dynamically engaging learners in the learning process [27]. Simulation-based learning environments have great potential for improving students’ knowledge of scientific subjects and their experimental skills [28, 29, 30, 31, 32, 33, 34, 35].

The new generation of students, growing up in a widespread technological environment, has improved its learning capacity through visual and tactile modalities [36] and embraces such technological innovations. Digital-age students process information fundamentally differently from their predecessors [37] and expect educational institutions to embed some innovations in the science curriculum. The question of whether the use of an educational application of cutting-edge technology, such as VR, contributes to a better conceptual understanding of science has preoccupied many researchers [38, 39, 40].

One way to evaluate a new educational intervention is to assess the changes in students’ performance using a Pre- and Post-Test design. There are two main concerns when designing a Pre-Post assessment test: (a) to come up with appropriate and targeted questions (items) and (b) to apply the most suitable grading model so as to obtain accurate and useful scores [41]. Accurate scores are those that can be trusted: if the same test is given to the same student more than once, the student is expected to demonstrate the same performance every time. Useful scores are those that can lead teachers to take strategic decisions regarding the teaching procedure. Based on useful scores, teachers could, for example, add extra teaching hours to the curriculum for those students who had difficulty understanding a newly introduced scientific subject and therefore received a low score in the assessment test.

After designing a targeted test, the next issue is to choose an appropriate grading model to assess the students’ test performance, and the following important questions must be taken into consideration: Do all students who obtain the same score in a test have the same ability? Does a student’s ability depend each time upon the difficulty of the given test? Are all the items in the test equally difficult and, if not, is there an objective way to decide upon their level of difficulty?

In many studies, the change in students’ performance, commonly called Learning Gain (LG), is assessed based on the raw Pre- and Post-Test scores [42, 43, 44]. However, raw scores measure students’ ability only by the number of items answered correctly and are independent of item difficulty, so basing the measurements on raw scores is often problematic. With this limitation in mind, in this study we use an alternative evaluation technique to estimate the magnitude of the Pre-Post change. This technique is not based simply on the number of items correctly answered, but uses a probabilistic Item Response Theory (IRT) model, the Rasch Model (RM), developed by the Danish statistician Georg Rasch [45].

Although there is a vast literature on students’ Pre- and Post-implementation performance, the assessment of the learning gain has not been the subject of biology education research studies. Furthermore, as far as we know, there is no published study on the measurement of students’ learning gain after following a teaching scenario in the context of a laboratory biology course. In this research, the sample comprised an entire class of 54 fourth-year undergraduate students of the Department of Primary Education at the University of Patras in Greece, who were enrolled in the Computers and Education course. The two basic aims of this study were: (a) to use the probabilistic RM to assess the students’ learning gain after being educated on the subject of microscopy, and (b) to decide whether a teaching scenario that includes interactive engagement methods, such as the use of a desktop-based VR biology lab, helps students reach a higher ability in their microscopy-oriented written tests than more traditional didactic methods.

2. The assessment framework

2.1 Measuring the learning gain with an IRT model

In the context of an educational intervention, an obvious way to calculate the LG for an individual student ( $\textit{LG}_{s}$ ) is to give the same test before (Pre-Test) and after (Post-Test) the intervention, and then simply calculate the difference between the raw scores ( $S$ ) obtained in his/her Pre-Test and Post-Test ( $S_{\textit{Pre-Test}}$ and $S_{\textit{Post-Test}}$ , respectively), as in Eq. (1).

$\displaystyle\textit{LG}_{s}=S_{\textit{Post-Test}}-S_{\textit{Pre-Test}}$ (1)

The main disadvantage of calculating the LG through the above equation is that the observed raw scores are independent of the item difficulty. For example, a student may do well on a test because he/she is well prepared, because the items in the test are easy, or because both are true. So, if we use only classical test theory for grading the tests, we may not measure students’ abilities fairly. Additionally, assessing a student’s ability from the observed raw scores has the following prominent limitations: (a) the characterization of an item as easy or difficult depends upon the examiner’s subjective criteria, and (b) the definition of reliability is established through the concept of parallel tests, which is difficult to achieve in practice, as individuals can never act in exactly the same way on a second trial due to factors such as the development of new skills or changes in motivation or even stress [46].

An alternative to gain calculations based on differences in raw scores would be to use a probabilistic IRT model to estimate students’ performance. A number of papers have described the advantages of the IRT models: continuous, interval-level scoring, item-level parameters that facilitate the development of valid measures, precise scoring and reliability estimates, and valid comparisons of respondents who took more, fewer or different items [47, 48, 49]. IRT attempts to model a student’s ability and the probability of answering a test item correctly [50, 51]. With an IRT model, students’ abilities can be estimated independent of the specific items they take. Furthermore its parameters are sample-independent [52] and as a result the item parameters can be estimated independent of the population of examinees [53].

According to item response theory, each item can be described by up to three parameters: (a) the difficulty, (b) the discrimination power and (c) the guessing parameter. The simplest IRT model is the RM, a one-parameter logistic (1PL) model.
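For orientation, the three item parameters can be combined in a single response function. The expression below is the standard three-parameter logistic (3PL) form; the symbols $a_{j}$ (discrimination) and $c_{j}$ (guessing) are notation introduced here purely for illustration and are not used elsewhere in this paper, while $\theta$ and $\beta_{j}$ denote the student’s ability and the difficulty of item $j$ .

$\displaystyle P\left(X_{j}=1|\theta\right)=c_{j}+\left(1-c_{j}\right)\frac{1}{1+e^{-a_{j}\left(\theta-\beta_{j}\right)}}$

Setting $a_{j}=1$ and $c_{j}=0$ for every item leaves the difficulty $\beta_{j}$ as the only item parameter, which is the one-parameter case.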

2.2 The Rasch model

The RM, unlike other IRT models, takes into account only the difficulty parameter of an item in order to assess the student’s ability [54]. Bond and Fox [55] claim that tests and questionnaires should produce data that fit the RM, as this model sets out the criteria for successful measurement. The RM is a dichotomous model since, when it is applied, every incorrect response in a test is scored 0 and every correct one is scored 1. The RM uses the following function to estimate the probability that a student answers item $j$ correctly:

$\displaystyle P\left(X_{j}=1|\theta,\beta_{j}\right)=\frac{1}{1+e^{-\left(\theta-\beta_{j}\right)}}$ (2)

The parameter $\beta_{j}$ is the difficulty of item $j$ and $\theta$ is the student’s ability to give a correct response to an item of difficulty $\beta_{j}$ [56]. The difficulty parameter $\beta_{j}$ and the proficiency variable $\theta$ are estimated along a common logit (log-odds) scale. Values fall between $-\infty$ and $+\infty$ , with lower values indicating easier items or lower abilities. The main advantage of using the RM in the measurement of the LG is this common scale: whatever change in a student’s ability may occur between the Pre-Test and the Post-Test administration, it can be directly connected to items answered correctly in the Post-Test but not in the Pre-Test [44]. From Eq. (2) it follows that if the student’s ability $\theta$ equals the difficulty $\beta_{j}$ of an item, there is a 0.5 probability of a correct response to item $j$ .
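The behaviour of Eq. (2) can be checked numerically with a few lines of Python. This is a minimal sketch, not part of the original analysis (which was carried out in R); the function name `rasch_probability` and the logit values are hypothetical and chosen only for illustration.

```python
import math

def rasch_probability(theta: float, beta: float) -> float:
    """Probability of a correct response under the Rasch model, Eq. (2)."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))

# Illustrative values in logits (assumptions, not taken from the study's data).
p_equal = rasch_probability(0.0, 0.0)    # ability equals item difficulty
p_easier = rasch_probability(1.0, 0.0)   # item is easier than the student's ability
p_harder = rasch_probability(-1.0, 0.0)  # item is harder than the student's ability
print(round(p_equal, 3), round(p_easier, 3), round(p_harder, 3))
```

Only the difference $\theta-\beta_{j}$ enters the function, so shifting ability and difficulty by the same amount leaves the probability unchanged.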

For each item in a test, an Item Characteristic Curve (ICC) can be drawn. The ICC shows the probability of a correct response as a function of the student’s ability. Figure 1 shows three ICCs, for item A, item B and item C of a test.

Figure 1.

Item characteristic curves for three items in a test.

The student’s ability is shown on the horizontal axis, while the corresponding probability of giving a correct response to the item is shown on the vertical axis. According to Fig. 1, the probability of a student responding correctly to an item with difficulty lower than that student’s ability is greater than 0.5, while the probability of responding correctly to an item with difficulty greater than the student’s ability is less than 0.5. Under the RM, the theoretical item characteristic curves for a set of items in a test are all parallel: they all have the same shape and differ only by a location shift. This property is known as equal discrimination. That is, each item provides the same discriminating power in separating individuals by their level on the latent trait (the person’s ability). The Item A curve corresponds to the easiest of the three items in the test, whereas the Item C curve corresponds to the most difficult one.
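The equal-discrimination property can be read directly from Eq. (2) by rewriting it on the log-odds scale. The rearrangement below is standard; $P_{j}(\theta)$ is shorthand introduced here for the probability in Eq. (2), and A and B stand for any two items.

$\displaystyle\ln\frac{P_{j}(\theta)}{1-P_{j}(\theta)}=\theta-\beta_{j}$

$\displaystyle\ln\frac{P_{A}(\theta)}{1-P_{A}(\theta)}-\ln\frac{P_{B}(\theta)}{1-P_{B}(\theta)}=\beta_{B}-\beta_{A}$

The difference does not depend on $\theta$ , so on the logit scale the curves are parallel straight lines with unit slope, and on the probability scale each ICC is the same curve shifted horizontally by the difference in item difficulty.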

In this study the LG is calculated, as presented by Wallace and Bailey [53], by using the estimates of the student’s ability according to the RM:

$\displaystyle\textit{LG}_{s}=\theta_{\textit{Post-Test}}-\theta_{\textit{Pre-Test}}$ (3)

where $\textit{LG}_{s}$ is the LG of each individual student, $\theta_{\textit{Pre-Test}}$ is the measured student’s ability in the Pre-Test and $\theta_{\textit{Post-Test}}$ is his/her ability in the Post-Test.
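To make Eq. (3) concrete, the sketch below estimates one student’s ability before and after instruction while holding the item difficulties fixed, and then takes the difference. It is an illustration under stated assumptions, not the procedure implemented in the TAM package: the ability is obtained by a plain Newton-Raphson maximum-likelihood search, and the item difficulties and response vectors are hypothetical.

```python
import math

def estimate_ability(responses, difficulties, tol=1e-8, max_iter=100):
    """Maximum-likelihood ability (in logits) for one student, item difficulties held fixed."""
    raw = sum(responses)
    if raw == 0 or raw == len(responses):
        # The likelihood has no finite maximum for zero or perfect scores.
        raise ValueError("ML estimate is not finite for zero or perfect scores")
    theta = 0.0
    for _ in range(max_iter):
        probs = [1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties]
        gradient = raw - sum(probs)                      # observed minus expected score
        information = sum(p * (1.0 - p) for p in probs)  # test information at theta
        step = gradient / information                    # Newton-Raphson update
        theta += step
        if abs(step) < tol:
            break
    return theta

# Hypothetical item difficulties (logits) and one student's 0/1 responses.
difficulties = [-1.5, -0.5, 0.0, 0.5, 1.5]
pre_responses = [1, 1, 0, 0, 0]
post_responses = [1, 1, 1, 1, 0]

theta_pre = estimate_ability(pre_responses, difficulties)
theta_post = estimate_ability(post_responses, difficulties)
learning_gain = theta_post - theta_pre  # Eq. (3)
print(round(theta_pre, 3), round(theta_post, 3), round(learning_gain, 3))
```

Because the same difficulties are used for both estimates, the resulting gain reflects a change in the student’s responses only, expressed in logits rather than in raw points.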

Figure 2.

Screenshots from the Instruction Mode of Onlabs.

3. The deployment methodology

3.1 The participants

Our sample comprised an entire class of 54 fourth-year undergraduate students of the Department of Primary Education at the University of Patras in Greece, who were enrolled in the Computers and Education course. In general, students who attend this course primarily practice computer use, are informed about educational software developed for primary and secondary education, and are educated on technology-assisted teaching and learning. In the Greek educational system, students who graduate from this department are usually employed as teachers in Primary Education and therefore face the challenge of training their young students to perform basic experiments in the school labs, such as focusing on different specimens with an optical microscope [17]. From our point of view, this future prospect was the main motivation for these 54 novice science learners to agree to participate in this project. Additionally, as biology is a fast-developing field that increasingly networks with other disciplines (bioinformatics, bioeconomy, biomedical engineering, etc.), we expect that giving primary school teachers the opportunity to gain a basic scientific background in a cell biology lab through ICTs will provide them with expertise that they can later use with their own students.

This empirical study did not harm the participants or put them in a position of discomfort. The participants were volunteers, taking part of their own free will. They were given information on the purpose of the research, the methods being used and the possible outcomes. All the participants filled in the tests and the worksheets using a randomly selected number as an ID, which protected their anonymity. Furthermore, every student had the right to withdraw at any stage of the research process.

3.2 The VR biology lab

In this empirical study, the VR educational software Onlabs (https://sites.google.com/site/onlabseap/) was used to educate the students on the operation of the optical microscope, a central instrument in a biology lab. Onlabs is a desktop-based VR software developed by an interdisciplinary scientific team at the Hellenic Open University, an institution that specializes in distance education. Onlabs allows the user to interact with the virtual lab equipment simply by moving the mouse. The graphics of Onlabs make it an ideal educational tool, appropriate for the questions and evaluations of our study. As claimed in Mikropoulos et al. [57], the context of an educational virtual environment has to be closely related to its content, the didactic goals and the learning activities, so that the learning outcomes can be realized constructively. Onlabs satisfies the learning outcomes of the experimental exercise “Operating an Optical Microscope” by offering the user three modes: the Instruction Mode, the Evaluation Mode and the Experimentation Mode. In this study, the Instruction Mode and the Experimentation Mode are used.

3.2.1 The Instruction and the Experimentation Mode of Onlabs

In the Instruction Mode, the trainee performs the experiment under instructions. To decrease the intrinsic cognitive load of the experiment, the procedure is divided into numerous steps. For each step, a written instruction appears at the top of the screen, and a narrator also reads the instruction aloud (Fig. 2a). A globe button in the upper-left corner of the screen can be clicked for help (Fig. 2b). When a step is performed successfully, the narrator congratulates the user and the instruction for the next step appears on the screen.

The Experimentation Mode allows the user to explore the microscope without any instructions. Through this interaction the user has a first-person view of the instrument. He/she learns by trial and error, without worrying about causing damage or accidents by misusing an expensive and sensitive microscope.

3.3 The teaching scenarios

The 54 students were separated into two groups: (a) the T-Group, which attended a traditional teaching scenario for microscopy, and (b) the VR-Group, which was educated on microscopy through Onlabs. Figure 3 gives the experimental outline of this study, which lasted two hours in total and was carried out in a single research day.

Figure 3.

The scenario of the project.

According to Fig. 3, in the 1 ${}^{\text{st}}$ Phase (Introduction in microscopy) all students in the class attended a necessary face-to-face tutorial, as they all had little or no prior knowledge of microscopy. At the end of that tutorial, a Pre-Test was given to set a baseline for the students’ ability regarding the microscopy topic. The test consisted of 20 multiple choice items. The Pre-Test was graded right after the 1 ${}^{\text{st}}$ Phase, giving 0.5 points to every correctly answered item, in an effort to divide the students into two cognitively equivalent groups.

The average score and standard deviation were 5.63 $\pm$ 1.22 for the first group (named T-Group) and 5.74 $\pm$ 1.32 for the second group (named VR-Group). At this point of our research, the T and the VR-Group had exactly the same number of participants but, since some of the students withdrew during the 2 ${}^{\text{nd}}$ Phase, the data presented in this study correspond only to those students who participated in the research from beginning to end (that is, 30 for the T-Group and 24 for the VR-Group).

In the 2 ${}^{\text{nd}}$ Phase (Trained on microscopy), the T-Group followed the traditional educational method and passively attended a live demonstration of the complete microscopy procedure. During this tutorial, a set of detailed PowerPoint slides was used to present the basics of microscopy. An optical microscope was also present, and parts of the equipment were demonstrated to the class.

The VR-Group, on the other hand, entered the Computers and Educational Technology Lab to be trained in microscopy via Onlabs. At the start of the training, the tutor used the Experimentation Mode to perform a virtual microscopy experiment on a projector screen. After this demonstration, each student used a PC and, through the Instruction Mode of Onlabs, performed the microscopy experiment virtually without any further assistance from the tutor.

For each of the two groups, the 2 ${}^{\text{nd}}$ Phase lasted one hour. Immediately following the microscopy instruction (traditional for the T-Group and VR-oriented for the VR-Group), all 54 students filled in the Post-Test.

The Pre-Test and the Post-Test contained exactly the same 20 multiple choice items. To fit our data to the dichotomous RM, we represented each item as a binary variable, so that a value of 0 corresponds to an incorrect response and a value of 1 to a correct one. For a group of $y$ students, to whom we administered a test of $z$ items, a vector of $z$ binary variables was created to represent the responses given by each student. A CSV file was created containing $y$ vectors, each of size $z$ .
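The encoding step described above can be sketched as follows. This is an illustrative example only: the answer key, the students’ chosen options, the column names and the helper `score_responses` are hypothetical, and an in-memory buffer stands in for the CSV file.

```python
import csv
import io

def score_responses(answers, answer_key):
    """Convert one student's chosen options into a 0/1 vector (1 = correct)."""
    return [1 if given == correct else 0 for given, correct in zip(answers, answer_key)]

# Hypothetical answer key and raw answers for z = 4 items and y = 3 students.
answer_key = ["b", "d", "a", "c"]
raw_answers = [
    ["b", "d", "c", "c"],
    ["a", "d", "a", "b"],
    ["b", "d", "a", "c"],
]

# One binary vector of size z per student, y vectors in total.
response_matrix = [score_responses(row, answer_key) for row in raw_answers]

# Write the y-by-z matrix as CSV text, one row per student.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow([f"item{j + 1}" for j in range(len(answer_key))])
writer.writerows(response_matrix)
csv_text = buffer.getvalue()
print(csv_text)
```

Keeping one row per student and one column per item matches the layout of $y$ vectors of size $z$ described in the text.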

For the data analysis, the open-source statistical language R was used, and more specifically its TAM (Test Analysis Modules) package [58]. This package covers the RM, as it contains the tam function [59], which fits the RM to dichotomous data using the Maximum Likelihood Estimation (MLE) method [60, 61]. TAM is available from CRAN, the official R package repository (http://cran.r-project.org/web/packages/TAM/index.html).

The matched data sets (Pre-Test and Post-Test for each individual) were used to calculate the difficulty of the items, as well as the ability of each student. All Rasch analyses were carried out in R.

4. Results and discussion

As already mentioned, in this study the LG was calculated from the students’ abilities according to Eq. (3). Item difficulty data from the Pre-Test or the Post-Test were used to estimate these abilities, but when comparing students’ ability between the pre and post assessment, the analysis was anchored by using only the Pre-Test item difficulties. In this way, the change in students’ ability was not confounded with any change in the item difficulty values. The Pre-Test item difficulties according to the T-Group and the VR-Group responses are presented in Table 1.

Table 1

Pre-Test item difficulties for the T and VR-Group

(a)
Item N ${}^{\circ}$	Item difficulty (logit units), T-Group (Pre-Test)
2	$-2.64$ (the easiest item)
7	$-2.20$
1	$-1.39$
14	$-1.19$
9, 20	$-0.69$
19	$-0.55$
8	$-0.41$
4	$-0.13$
10, 17	0.00
3, 6, 13	0.13
11, 12, 16	0.55
5, 18	0.85
15	1.01 (the hardest item)

(b)
Item N ${}^{\circ}$	Item difficulty (logit units), VR-Group (Pre-Test)
14	$-3.23$ (the easiest item)
5, 7	$-2.02$
2	$-1.67$
19	$-1.39$
1, 4	$-0.72$
3, 9, 13	$-0.35$
6, 8, 11	$-0.17$
18, 20	0.53
10, 17	0.72
15	0.93
16	1.15
12	1.67 (the hardest item)

According to Table 1, the first observation is that the two groups found different items to be the easiest and the most difficult. For example, the RM analysis of the VR-Group data identified item N ${}^{\circ}$ 14 as the easiest one in the Pre-Test (it has the lowest logit value). In contrast, the analysis of the T-Group Pre-Test data indicated that three items (N ${}^{\circ}$ 2, 7 and 1) were easier than N ${}^{\circ}$ 14. However, the average item difficulty for the Pre-Test was essentially the same: $-0.35\pm 1.23$ and $-0.35\pm 1.00$ according to the T and VR-Group responses, respectively (Table 3). This is useful information, as it shows that, before any training on the microscopy experiment, the Pre-Test was of the same difficulty level for both groups.

Figure 4a and b present the item characteristic curves of the 20 items in the Pre-Test. As these figures show, no two ICCs cross over each other. Items that give such curves are ideal for separating students, based on their ability to give correct answers [53].

Figure 4.

Item characteristic curves for the (a) first ten and (b) last ten items in the Pre-Test.

Figure 5a and b present, through a Wright Map, a direct comparison on a logit scale between the distribution of the students’ ability and the item difficulty. The distribution of the students’ ability in the Pre-Test and the Post-Test is aligned with the distribution of the difficulty of the items in the Pre-Test. The left side of the map shows the distribution of the measured ability of the students, from most able at the top to least able at the bottom. The items on the right side of the map are distributed from the most difficult at the top to the least difficult at the bottom.

Figure 5.

Distribution of students’ ability compared to item difficulty for (a) T-Group and (b) VR-Group, in the Pre and Post-testing situation.

Table 2

The Pre-Test and Post-Test average students’ ability, the SD and the Rasch LG for the T and VR-Group, in logit units

	Average student ability $<\theta_{\textit{Pre-Test}}>$	Average student ability $<\theta_{\textit{Post-Test}}>$	Rasch learning gain $<\textit{LG}>=<\theta_{\textit{Post-Test}}>-<\theta_{\textit{Pre-Test}}>$
T-Group	$0.009\pm 0.614$	$-0.013\pm 0.716$	No Learning Gain
VR-Group	$-0.005\pm 0.716$	$0.003\pm 0.793$	0.008

Table 3

The Pre-Test and Post-Test average item difficulty for the T and VR-Group, in logit units

	Average item difficulty (Pre-Test)	Average item difficulty (Post-Test)	Average change in item difficulty across the educational treatment
T-Group	$-0.35\pm 1.23$	$-2.20\pm 3.49$	1.85
VR-Group	$-0.35\pm 1.00$	$-0.89\pm 1.07$	0.51

Looking at the student distributions in Fig. 5, it can be seen that the students’ ability shifts toward higher values after the educational intervention for the VR-Group, but not for the T-Group. To reinforce this observation, the results of the Rasch analysis for the LG are displayed in Table 2. This table presents the Pre-Test and Post-Test average students’ ability $<\theta>$ , the SD and the Rasch Learning Gain $<\textit{LG}>$ (where $<>$ indicates average values). As we decided to use the LG as the basis for evaluating the effect of the two educational methods, the results in Table 2 indicate that the students who used the VR software to be trained on microscopy showed a change in their ability and obtained an LG. In contrast, the students who passively attended a tutorial-oriented teaching method not only had no learning gain, but their ability in the Post-Test also decreased, which suggests that these students may have relied on lucky guesses when answering the multiple choice questions in the Pre-Test and, most importantly, in the Post-Test.
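As a worked check of the group-level gain, the averages reported in Table 2 can be inserted into the definition $<\textit{LG}>=<\theta_{\textit{Post-Test}}>-<\theta_{\textit{Pre-Test}}>$ . The arithmetic below uses only the values printed in that table.

$\displaystyle<\textit{LG}>_{\textit{VR-Group}}=0.003-(-0.005)=0.008$

$\displaystyle<\textit{LG}>_{\textit{T-Group}}=-0.013-0.009=-0.022$

The negative difference for the T-Group is what Table 2 reports as “No Learning Gain”.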

Table 3 presents the Pre-Test and Post-Test average difficulty of the items. The numerical results from the Rasch analysis indicated that the VR educational application helped the VR-Group to gain knowledge on the subject of microscopy, as the VR students found the test items easier (with smaller, more negative logit values) after the applied teaching scenario. The T-Group also found the test items easier when filling in the Post-Test, but this reduction in item difficulty was smaller for this group.

Taken together, the two tables above suggest that VR technology helped the biology instructor to communicate knowledge more effectively, giving students more confidence in their knowledge and a larger learning gain.

In this study, the main goal was to compare two teaching interventions based on the learning gain measured via the Rasch model. The fact that the students in the VR-Group used Onlabs to interact with the VR lab environment only once might not be the best way to evaluate the effectiveness of VR technology for learning and training. A more appropriate way of using the VR software could be to persuade the students to download the application at the beginning of a term and urge them to use it at home at their discretion. Therefore, a fairer way to assess learning gain could be to conduct a study over a longer period of time, such as a semester or an academic year. We also acknowledge that this study was carried out with a small sample of participants. Parameters such as time and cost, but mostly the students’ availability and willingness, determined this size. The research was an extracurricular activity, and it was therefore quite difficult to persuade more participants to be involved.

These results are indicative and therefore cannot be generalized. A scientifically sound strategy was followed in this study, and applying this strategy on a larger scale in order to validate it will be the subject of future communications.

5. Conclusions

A global shift towards independent learning has revealed a need to assess the contribution of virtual reality technology to the delivery of science courses. A Rasch-based analysis of the students’ ability facilitates a fair measurement of the Learning Gain, which is estimated as the change in the students’ ability between the Pre-Test and the Post-Test. This analysis was performed at the group level and identified a larger learning gain for the students who attended a VR-oriented teaching procedure than for their fellow students who participated in a more traditional teaching method and did not exhibit any learning gain. Our study provided indicative evidence that virtual laboratory simulations are very promising tools in laboratory education in terms of obtaining a higher ability and a larger learning gain. While physical labs undoubtedly play a critical role in lab learning, there is a need for educational institutions to become attuned to this new trend of communicating scientific knowledge through technological tools, and to design scenarios that consist, at least partially, of activities involving simulations and other similar innovations.

References

1. Paxinou, Zafeiropoulos, Sypsas, Kiourt, Kalles. Assessing the Impact of Virtualizing Physical Labs. In the Proceedings of the 27th EDEN Annual Conference, 2018, pp. 17-20.

2. Pantelidis. Virtual reality in the classroom. Educational Technology. 1993; 33(4): 23-27.

3. Karaseitanidis, Amditis, Patel, Sharples, Bekiaris, Bullinger, Tromp. Evaluation of virtual reality products and applications from individual, organizational and societal perspectives – The “VIEW” case study. Int J Human-Computer Studies. 2006; 64: 251-266.

4. Makransky, Lilleholt. A structural equation modeling investigation of the emotional value of immersive virtual reality in education. Educational Technology Research and Development. 2018; 66(5): 1141-1164.

5. Belini, Chen, Sugiyama, Shin, Alam, Takayama. Virtual & augmented reality: Understanding the race for the next computing platform. Retrieved from: https://www.goldmansachs.com/insights/pages/technology-driving-innovation-folder/virtual-and-augmented-reality/report.pdf, 2016.

6. Greenlight & RoadToVR. 2016 virtual reality industry report. Retrieved from: https://greenlightinsights.com/reports/virtual-reality-stats-2016/, 2016.

7. Lotsari, Verykios, Panagiotakopoulos, Kalles. A Learning Analytics Methodology for Student Profiling. In: Likas, Blekas, Kalles (eds). Artificial Intelligence: Methods and Applications. SETN 2014. Lecture Notes in Computer Science. 2014; 8445. Springer, Cham. doi: 10.1007/978-3-319-07064-3_24.

8. Kagklis, Karatrantou, Tantoula, Panagiotakopoulos, Verykios. A learning analytics methodology for detecting sentiment in student fora: a case study in distance education. European Journal of Open, Distance and e-Learning. 2016; 18. doi: 10.1515/eurodl-2015-0014.

9. Gkontzis, Kotsiantis, Kalles, Panagiotakopoulos, Verykios. Polarity, emotions and online activity of students and tutors as features in predicting grades. Intelligent Decision Technologies. 2020; 14(3): 409-436.

10.

Tsoni

Sakkopoulos

Panagiotakopoulos

Verykios

. On the equivalence between bimodal and unimodal students’ collaboration networks in Distance Learning. Journal of Intelligent Decision Technologies, (to appear), 2021.

11.

Gkontzis

Kotsiantis

Tsoni

Verykios

. An effective LA approach to predict student achievement. In the Proceedings of the 22nd Pan-Hellenic Conference on Informatics, 76-81. doi: 10.1145/3291533.3291551.

12.

Riess

Mischo

. Promoting systems thinking through biology lessons. International Journal of Science Education. 2010; 32(6): 705-725.

13.

Jimoyiannis

Komis

. Computer simulations in physics teaching and learning: a case study students’ understanding of trajectory motion. Computers & Education. 2001; 36(2): 183-204.

14.

Zacharia

Constantinou

. Comparing the influence of physical and virtual manipulatives in the context of the Physics by Inquiry curriculum: The case of undergraduate students’ conceptual understanding of heat and temperature. American Journal of Physics. 2008; 76(4): 425-430.

15.

Paxinou

Panagiotakopoulos

Karatrantou

Kalles

, & Sgourou

. Implementation and evaluation of a three-dimensional virtual reality biology lab versus conventional didactic practices in lab experimenting with the photonic microscope. Bioch Mol Biol Educ 2020a; 48(1): 21-27.

16.

Makransky

Bonde

Wulff

JSG

Wandall

Hood

Creed

Bache

Silahtaroglu

Norremolle

. Simulation based virtual learning environment in medical genetics counseling: an example of bridging the gap between theory and practice in medical education. BMC Medical Education. 2016; 16(98). Retreived from doi: 10.1186/s12909-016-0620-6.

17.

Paxinou

. Methods of Assessing the Students’ Performance upon Utilization of a Virtual Reality Educational Tool for Laboratory Biology Courses, (PhD Thesis). Hellenic Open University. Retrieved from https//apothesis.eap.gr/bitstream/repo/49901/1/PhD_Dissertation_Evgenia-Paxinou-2020.pdf, 2020.

18.

Eslinger

. The Encyclopedia of Virtual Environments-Education. Retrieved from http://www.hitl.washington.edu/projects/knowledge_base/virtual-worlds/EVE/, 1993.

19.

Scott

Soria

Campo

. Adaptive 3D Virtual Learning Environments-A Review of the Literature. IEEE Transactions on Learning Technologies. 2017; 10(3): 262-267.

20.

Kirriemuir

McFarlane

. Literature review in games and learning. Bristol: Futurelab, 2004.

21.

Vogel

Cannon-Bowers

Bowers

Muse

Wright

. Computer gaming and interactive simulations for learning: A meta-analysis. Journal of Educational Computing Research. 2006; 34(3): 229-243.

22.

Sypsas

Kalles

. Virtual Laboratories in Biology, Biotechnology and Chemistry education: A Literature Review. In the Proceedings of the 22nd Pan-Hellenic Conference on Informatics, 2018. doi: 10.1145/3291533.3291560.

23.

Garcia-Bonete

Jensen

Katona

. A practical guide to developing virtual and augmented reality exercises for teaching structural biology. Biochem Mol Biol Educ. 2019; 47(1): 16-24.

24.

Bonde

Makransky

Wandall

Larsen

Morsing

Jarmer

. Improving biotech education through gamified laboratory simulations. Nature Biotechnology. 2014; 32(7): 694-697.

25.

Trundle

Bell

. The use of a computer simulation to promote conceptual change: A quasi-experimental study. Computers and Education. 2010; 54(4): 1078-1088.

26.

Mayer

. Multimedia learning, (2nd edition). New York: Cambridge University Press, 2009.

27.

Makransky

Thisgaard

Gadegaard

. Virtual Simulations as Preparation for Lab Exercises: Assessing Learning of Key Laboratory Skills in Microbiology and Improvement of Essential Non-Cognitive Skills. PLOS ONE. Retrieved from: http//journals.plos.org/plosone/article?id=10.1371/journal.pone.0155895, 2016a.

28.

Smetana

Bell

. Computer simulations to support science instruction and learning: A critical review of the literature. International Journal of Science Education. 2012; 34(9): 1337-1370.

29.

Rutten

van Joolingen

van der Veen

. The learning effects of computer simulation in science education. Computers and Education. 2012; 58(1): 136-153.

30.

Brinson

. Learning outcome achievement in non-traditional (virtual and remote) versus traditional (hands-on) laboratories: A review of the empirical research. Computers & Education. 2015; 87: 218-237.

31.

Paxinou

Georgiou

Kakkos

Kalles

Galani

. Achieving educational goals in microscopy education by adopting Virtual Reality labs on top of face-to-face tutorials. Research in Science & Technological Education. 2020b. doi: 10.1080/02635143.2020.1790513.

32.

Paxinou

Kiourt

Sypsas

Zafeiropoulos

Sgourou

Kalles

. Cross Reality Technologies in Archaeometry Bridge Humanities with “Hard Science”. Applying Innovative Technologies in Heritage Science. Hershey, PA: IGI Global. 2020c. doi: 10.4018/978-1-7998-2871-6.ch008.

33.

Paxinou

Karatrantou

Kalles

Panagiotakopoulos

Sgourou

. 3D Virtual Reality Laboratory as a Supplementary Educational Preparation Tool for a Biology Course. European Journal of Open, Distance and E-learning. 2018; 21(2). Retrieved from http//www.eurodl.org/?p=current&sp=brief&article=777.

34.

Sypsas

Kiourt

Paxinou

Zafeiropoulos

Kalles

. The Educational Application of Virtual Laboratories in Archaeometry. International Journal of Computational Methods in Heritage Science. 2019; 3(1): 1-19. doi: 10.4018/IJCMHS..

35.

Tsoni

Samaras

Paxinou

Panagiotakopoulos

Verykios

. From Analytics to Cognition: Expanding the Reach of Data in Learning. In the Proceedings of the 11th International Conference on Computer Supported Education, 2019, pp. 458-465.

36.

Tapscott

. Growing up Digital: The Rise of the Net Generation. New York: McGraw-Hill, 1998.

37.

Prensky

. Digital Natives, Digital Immigrants, Part 1. On the Horizon. 2001; 9(5): 1-6.

38.

Garzon

GCV

Magrini

Galembeck

. Using augmented reality to teach and learn biochemistry. Biochem Mol Biol Educ. 2017; 45(5): 417-420.

39.

Gamo

. Assessing a Virtual Laboratory in Optics As a Complement to On-Site Teaching. IEEE Transactions on Education. 2018; 99: 1-8.

40.

Allen

Miao

Yao

Sha

Chen

. Exploration of an interactive “Virtual and Actual Combined” teaching mode in medical developmental biology. Biochem Mol Biol Educ. 2018; 46(6): 585-591.

41.

Tam

Jen

T-H

. Educational Measurement for Applied Researchers: Theory into Practice. Singapore: Spinger, 2016.

42.

Tornqvist

Vartia

. How Should Relative Change Be Measured? Am Statistician. 1985; 39(1): 43-46.

43.

Hake

. Interactive-engagement versus traditional methods: a six-thousand-student survey of mechanics test data for introductory physics courses. Am J Phys. 1998; 66(1): 64-74.

44.

Pentecost

Barbera

. Measuring Learning Gains in Chemical Education: A Comparison of two Methods. J. Chem.Educ. 2013; 90: 839-845.

45.

Rasch

. Probabilistic models for some intelligence and attainment test. Copenhagen: Danmarks Paedagogiske Institut, 1960.

46.

Hambleton

Swaminathan

Rogers

. Fundermentals of Item Response Theory. Newbury Park, CA: Sage Publications, 1991.

47.

Hambleton

Jones

. Comparison of classical test theory and item response theory and their applications to test development. Educational Measurement. 1993; 12(3): 38-47.

48.

Embretson

Reise

. Item Response Theory for Psychologists. Mahwah, NJ: Lawrence Erlbaum Associates, 2000.

49.

van der Linden

Hambleton

. Handbook of modern item response theory. New York: Springer, 2013.

50.

Richardson

. The relationship between the difficulty and the differential validity of a test. Psychometrica. 1936; 1(2): 33-49.

51.

Tucker

. Maximum validity of a test with equivalent items. Phycometrica. 1946; 11: 1-3.

52.

Whitely

Dawis

. The Nature of Objectivity with the Rasch Model. J Educ Meas. 1974; 11: 163-78.

53.

Wallace

Bailey

. Do Concept Inventories Actually Measure Anything? Astron Educ Rev. 2010, p. 9.

54.

. A Simple Guide to the Item Response Theory (IRT) and Rasch Modeling. Retrieved from file:///C:/Users/user/Desktop/Print/IRT-SOS.pdf, 2017.

55.

Bond

Fox

. Applying the Rasch Model: Fundamental Measurement in the Human Sciences, (3rd edition). London and New York: Routledge, 2001.

56.

Almond

Mislevy

Steinberg

Yan

Williamson

. Bayesian Networks in Educational Assessment. New York: Springer, 2015.

57.

Mikropoulos

Katsikis

Nikolou

Tsakalis

. Virtual environments in biology teaching. Journal of Biological Education. 2003; 37(4): 176-181.

58.

Kabacoff

. R in Action. Data analysis and graphics with R. Shelter Island N.Y.: Manning Publications CO, 2011.

59.

Robitzsch

Kiefer

. Test Analysis Modules-Package “TAM”. Computer Software. Retrieved from https://cran.r-project.org/web/packages/TAM/TAM.pdf, 2013.

60.

Fisher

. On an absolute criterion for fitting frequency curves. Messenger of Mathematics. 1912; 41: 155-160.

61.

Fisher

. Two New Properties of Mathematical Likelihood. Proceedings of the Royal Society of London-Series A. 1934; 44: pp. 285-307. Retrieved from https//errorstatistics.files.wordpress.com/2019/01/fisher-1934-likelihood-searchable.pdf.
