Abstract
Testing is a significant part of the teaching and learning process. An assessment test has to include test items that are tailored to the individual learning needs of the students in order to be more accurate and support learning in a more effective way. In this paper, a fuzzy-based mechanism is presented for automatic personalized assessment in an e-learning system for computer programming. Particularly, the selection of the most appropriate test items for each individual student is based on a variety of criteria: (i) the student’s knowledge level, (ii) the student’s prior knowledge of computer programming, (iii) the type of programming errors that the student is prone to make, and (iv) the difficulty level of the test items. Linguistic values are used to determine these criteria. Additionally, 45 fuzzy rules are used over these criteria, which imitate the way of thinking of human tutors with regard to deciding about the most appropriate test items that have to be included in an adaptive test. The presented mechanism was used under real conditions and evaluated by experts and students of the Department of Informatics of the University of Piraeus, Greece with very encouraging results. Specifically, both the participating students and experts found that the presented mechanism creates non-repetitive balanced tests that meet learners’ knowledge level and needs.
Introduction
Nowadays, new and advanced technologies make a variety of services available in people’s everyday lives. Education is a field in which technology has brought both changes and new challenges to the teaching and learning processes [1, 2, 3]. E-learning, educational software, mobile learning, and Intelligent Tutoring Systems are some of the technological advances employed in education. An important benefit of educational software and tutoring systems is that a learner can access a course wherever and whenever s/he is available to participate. Moreover, educational software and tutoring systems are capable of providing learning material and processes that are tailored to an individual learner’s needs, imitating the one-to-one teaching process. Adaptivity is crucial, since it makes the learning process significantly more effective [4, 5, 6]. Adaptivity is achieved by incorporating Artificial Intelligence (AI) techniques in educational software and tutoring systems [7, 8, 9, 10, 11]. AI enables the system to recognize the learners’ individual characteristics and needs and to provide a personalized teaching process [12, 13, 14, 15], which includes personalized lesson paths, personalized advice and feedback, and adaptive assessment tests.
In the relevant literature, adaptive assessment tests that are created automatically by a tutoring system or an educational application are called Computer Adaptive Tests (CAT) [16]. The aim of a CAT is to include assessment items (exercises, questions, suggested activities, quizzes) that are tailored to the learners’ knowledge level and abilities, in order to evaluate the learners’ obtained knowledge more accurately and more effectively [17, 18, 19]. CAT are usually shorter and more targeted tests [20]. As a result, they allow the students’ assessment to be performed accurately and in a shorter time. Moreover, they eliminate the boredom and disappointment caused by typical extensive assessment tests, which can include items that the students perceive as too easy or too difficult. CAT require a large pool of assessment items, including questions, exercises, and activities, that covers all the different characteristics, needs and abilities of the students.
The process of creating CAT has to incorporate “intelligence” so that the most appropriate assessment test is selected for each individual learner. Specifically, an AI technique that imitates a human tutor’s way of thinking has to be integrated in the process of selecting assessment items for CAT. Fuzzy logic is such a technique, since it uses linguistic values to describe data that are characterized by human subjectivity and employs sets of IF-THEN rules that resemble the human process of decision making [21, 22, 23, 24]. Furthermore, there are multiple criteria for choosing the most appropriate assessment items to be included in a CAT. They include the individual student’s knowledge level, background, learning misconceptions and needs. Determining these characteristics is not a straightforward task, because it relies on judgments that are subjective and contain uncertainty. Therefore, fuzzy logic is well suited to this data, as it constitutes a methodology for computing with words and can deal with uncertain and vague data [25, 26, 27, 28].
In this paper, a fuzzy-based mechanism is presented for automatic personalized assessment in an e-learning system for computer programming. In more detail, fuzzy sets are used to describe a variety of learners’ characteristics, such as their knowledge level, how well they master the learning material that corresponds to their knowledge level, their prior knowledge of computer programming, and how prone they are to making syntactical and/or logical errors. In addition, fuzzy sets are used to describe activity characteristics, such as their difficulty level and their relation to syntactical and/or logical issues. Each time a student interacts with the system and is called to complete an assessment test, the system triggers 45 fuzzy rules, which calculate the degree to which an activity meets the learner’s needs and decide which activities have to be included in the created assessment test. The presented fuzzy-based activity selection mechanism was incorporated into a web-based tutoring system for computer programming learning. The system was used under real conditions by undergraduate students of the Department of Informatics of the University of Piraeus, Greece. The automatically created adaptive tests were evaluated by both learners and experts in a computer programming teaching setting. The evaluation results are very positive and promising. Specifically, both the participating students and experts found that the presented mechanism creates tests whose activities are balanced in difficulty and complexity and tailored to the learners’ knowledge level and needs. Furthermore, the frequency with which a test and/or an activity recurs is quite low. Consequently, the generated tests are well accepted by both experts and students and contribute significantly to the enhancement of the learners’ knowledge.
Related work
Computerized Adaptive Tests (CAT) allow tutoring systems and educational applications to evaluate automatically the students’ knowledge acquisition and performance in a more accurate and targeted way. They support students during the learning process, contribute to their motivation and engagement and enhance the learning outcomes [29, 30]. Additionally, CAT can minimize the duration of an exam and reduce the subjectivity of the assessment [31]. In the relevant literature, there is an increased scientific interest in using CAT for learners’ assessment [16, 32, 33, 34, 35]. CAT is particularly significant for the assessment of students of computer science, since these students usually come from different backgrounds as far as computer programming is concerned and demonstrate a variety of skills [36, 37, 38, 39].
A sample of the pool of activities
There are tutoring systems that incorporate fuzzy logic in the process of selecting the most suitable assessment items (questions, exercises, activities etc.) that meet the knowledge level and the learning needs of a particular learner. Badaracco and Martínez [40] introduced a new item selection algorithm, based on a multi-criteria decision model. Specifically, instead of using statistical calibration, the algorithm uses fuzzy linguistic information to model and integrate the teachers’ knowledge. The aim of this method is to enhance the adaptation of testing to the students’ competence level. On the other hand, in [31], a fuzzy rule-based system is used to estimate the ability of students, and this information is subsequently used to select the next question. The system takes as input the level of difficulty and discrimination of the questions, the probability of students being able to answer correctly, and the students’ answer, and returns as output the estimated students’ ability. Lendyuk et al. [41] used fuzzy rules to dynamically adapt the complexity level of the questions in an adaptive test. Particularly, when the testing begins, the student receives a question block with a specific complexity. According to the student’s answer, the time s/he spent answering and the complexity level of the block, the system applies fuzzy rules to decide the complexity level of the next block. In the study of Chrysafiadi and Virvou [37], a fuzzy-based algorithm is presented that creates adaptive tests on the fly for teaching computer programming via a web-based educational environment. In particular, their algorithm takes into consideration the knowledge dependencies that exist among the various domain concepts of the learning material and decides which test items have to be included in the created test.
A thorough investigation of the related scientific literature shows that fuzzy logic, when used to select the most suitable assessment items from a pool, is applied to data that concern the students’ knowledge and competence level, the students’ ability to correctly answer or solve particular questions or activities, and the difficulty level and complexity of the assessment items. None of the existing approaches takes into consideration data that concern the types of students’ errors and misconceptions (i.e. syntactical or logical) and their previous knowledge of the domain concept. Therefore, the presented fuzzy-based mechanism for automated creation of adaptive tests considers not only the knowledge level of the student and the difficulty level of the activities, but also the learners’ deficiencies, error proneness and background. In this way, the system models and recognizes the students’ learning needs more accurately and selects assessment items for the created test that are better tailored to the characteristics and needs of each individual learner.
Students’ possible knowledge levels
The pool of activities
The tutoring system communicates with a database, which constitutes the pool of the available activities for learners’ assessment. For each activity, the following information is stored.
The description of the activity.
The correct answer.
The knowledge level to which it is attached, i.e. an integer value between 1 and 7, according to Table 2.
The level of difficulty. This is defined by the instructor who imports the activity in the system. It can take one of three linguistic values, namely “easy”, “medium”, “hard”.
The relation of the activity to syntactical issues, such as symbolism of operators, variable naming, command structures etc. This is also defined by the instructor who imports the activity in the system. It can take one of four linguistic values, namely “none”, “little”, “medium”, “high”.
The relation of the activity to logical issues, such as semantics of commands and execution flow of a program. This is also defined by the instructor who imports the activity in the system. It can take one of four linguistic values, namely “none”, “little”, “medium”, “high”.
Note that an activity can assess syntactical issues only, logical issues only, or both. Table 1 presents a sample of the pool of activities.
The criteria for choosing the most appropriate activities in the pool to be included in the test concern either learners’ characteristics or activity characteristics. Particularly, they are:
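The stored fields listed above can be pictured as a simple record. The following Python sketch is only an illustration under stated assumptions: the class name `Activity`, its field names, and the sample activity are hypothetical and are not taken from the system's actual database schema; only the permitted values follow the description above.

```python
from dataclasses import dataclass

# Permitted linguistic values, as listed in the description of the pool.
DIFFICULTY_VALUES = ("easy", "medium", "hard")
RELATION_VALUES = ("none", "little", "medium", "high")


@dataclass(frozen=True)
class Activity:
    """Hypothetical record for one activity in the pool (names are assumptions)."""
    description: str
    correct_answer: str
    knowledge_level: int   # integer between 1 and 7, as stated for the pool
    difficulty: str        # set by the instructor
    syntax_relation: str   # relation to syntactical issues
    logic_relation: str    # relation to logical issues

    def __post_init__(self):
        if not 1 <= self.knowledge_level <= 7:
            raise ValueError("knowledge_level must be between 1 and 7")
        if self.difficulty not in DIFFICULTY_VALUES:
            raise ValueError(f"difficulty must be one of {DIFFICULTY_VALUES}")
        for value in (self.syntax_relation, self.logic_relation):
            if value not in RELATION_VALUES:
                raise ValueError(f"relation must be one of {RELATION_VALUES}")


# Invented sample entry, for illustration only.
sample = Activity(
    description="Which operator tests two integers for equality in C?",
    correct_answer="==",
    knowledge_level=1,
    difficulty="easy",
    syntax_relation="high",
    logic_relation="none",
)
```

Validating the linguistic values at construction time mirrors the fact that the instructor chooses them from a fixed vocabulary when importing an activity.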
The current knowledge level (KL) of the learner: This is a crucial characteristic of a learner that helps the system to identify the learner’s learning needs. It represents those concepts of the teaching domain knowledge that the learner knows. It is a dynamic characteristic, the value of which can change during the learner’s interaction with the system. The knowledge level of a learner assumes values from 1 (novice) to 8 (expert), as in Table 2.
The knowledge adoption level (KAL) of the learner: This is a dynamic learner characteristic which represents how well the learner knows the learning material that corresponds to her/his current knowledge level. Its value is determined by the learner’s performance in the tests. It is described by the following fuzzy sets:
“Poor”, when the learner failed to correctly solve most of the test activities and her/his performance in the test is below 45/100.
“Moderate”, when the learner correctly solved a moderate number of the test activities and her/his performance in the test ranges from 30/100 to 70/100.
“High”, when the learner correctly solved most of the test activities and her/his performance in the test ranges from 60/100 to 90/100.
“Very high”, when the learner correctly solved almost all the test activities and her/his performance in the test exceeds 85/100. If the KAL of a learner is characterized as “Very high”, then s/he is considered to have reached the target knowledge and no further assessment at that level is performed.
Thresholds of the above fuzzy sets were defined by eight experts in computer programming, who had at least five years of experience in teaching computer programming and are, henceforth, referred to as “experts”. In Fig. 1, the trapezoidal membership functions for the fuzzy sets of KAL are depicted. Each fuzzy set is therefore described by a set of four numbers (a, b, c, d): the membership degree rises from 0 at a to 1 at b, remains 1 up to c, and falls back to 0 at d. Particularly, “Poor” is described by (0, 0, 30, 45), “Moderate” is described by (30, 45, 60, 70), “High” is described by (60, 70, 80, 90), and “Very high” is described by (85, 90, 100, 100).
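The four-number description of the KAL fuzzy sets can be turned into a membership computation as in the sketch below. The function names `trapezoid` and `fuzzify_kal` are assumptions introduced for illustration; only the four-number parameters come from the text above.

```python
def trapezoid(x, a, b, c, d):
    """Membership degree of x in the trapezoidal fuzzy set (a, b, c, d)."""
    if b <= x <= c:          # plateau, also covers shoulders where a == b or c == d
        return 1.0
    if x <= a or x >= d:     # outside the support
        return 0.0
    if x < b:                # rising edge
        return (x - a) / (b - a)
    return (d - x) / (d - c)  # falling edge


# Parameters of the KAL fuzzy sets as given in the text.
KAL_SETS = {
    "Poor": (0, 0, 30, 45),
    "Moderate": (30, 45, 60, 70),
    "High": (60, 70, 80, 90),
    "Very high": (85, 90, 100, 100),
}


def fuzzify_kal(score):
    """Map a test score (0-100) to its membership degree in every KAL set."""
    return {name: trapezoid(score, *params) for name, params in KAL_SETS.items()}


# A score of 65 lies in the overlap of "Moderate" and "High".
memberships = fuzzify_kal(65)
```

For a score of 65, the learner belongs to “Moderate” and “High” with degree 0.5 each, which illustrates how the overlapping thresholds let a score belong partly to two adjacent sets.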
Fuzzy sets of KAL and their membership functions.
The learner’s prior knowledge of computer programming (PPK): This concerns other programming languages that the learner may already know. During the first interaction of the learner with the system, s/he is asked to enter a number from 0 (not at all) to 100 (absolutely) that declares the degree to which s/he knows another programming language. For the description of PPK, the following three fuzzy sets are defined:
“Poor”, when the learner has zero or little knowledge of computer programming. Her/his degree of previous knowledge of computer programming is less than 50.
“Moderate”, when the learner has a moderate knowledge of computer programming. S/he knows the basic concepts of computer programming, like variable declaration, assignment statements, operators, input/output commands, and basic concepts concerning selection (if) and iteration structures. Her/his degree of previous knowledge of computer programming ranges from 40 to 70.
“High”, when the learner knows at least one other computer programming language very well. Her/his degree of previous knowledge of computer programming is more than 60.
Fuzzy sets of PPK and their membership functions.
Thresholds of the previous fuzzy sets were defined by the eight experts. In Fig. 2, the trapezoidal membership functions for the fuzzy sets of PPK are depicted. Each fuzzy set is therefore described by a set of four numbers. Particularly, “Poor” is described by (0, 0, 40, 50), “Moderate” is described by (40, 50, 60, 70), and “High” is described by (60, 80, 100, 100).
Fuzzy sets of TE and their membership functions.
The type of errors to which the learner is prone (TE): There are two categories of errors, namely syntactical and logical. Syntactical errors include misspelled command names (e.g. transposed letters), omission of data definitions, invalid command names, incorrect symbolism of operators etc. Logical errors are usually errors of design and occur when the learner has misconceptions about the program or about the semantics and operation of the commands. A learner makes a syntactical error when s/he has not carefully studied the learning material, whereas s/he makes a logical error when s/he has difficulty in understanding a command or a programming structure. The type of errors that a learner makes more often is derived from the results of the tests that the learner has completed. The system calculates the percentage of errors that the learner has made and counts how many of them were syntactical and how many were logical. For the description of the learner’s tendency to make errors of each category, the following three fuzzy sets are used:
“Low”, when less than 45% of the learner’s errors belong to the category.
“Medium”, when from 35% to 70% of the learner’s errors belong to the category.
“High”, when more than 65% of the learner’s errors belong to the category.
Fuzzy sets of AD and their membership functions.
Fuzzy sets of ARS and ALS and their membership functions.
Thresholds of the above fuzzy sets were defined by the eight experts. In Fig. 3, the trapezoidal membership functions for the fuzzy sets of TE are depicted. Each fuzzy set is therefore described by a set of four numbers. Particularly, “Low” is described by (0, 0, 20, 45), “Medium” is described by (35, 50, 60, 70), and “High” is described by (65, 80, 100, 100). We have to mention that the percentages of syntactical and logical errors are complementary. In other words, if the system recognizes that 42% of the learner’s errors in a test are syntactical, then the remaining 58% are logical. In this case, the “syntactical errors” category belongs to two adjacent fuzzy sets: “Low” with membership degree 0.12 and “Medium” with membership degree 0.47. The corresponding fuzzy set for the category “logical errors” is “Medium” with membership degree 1. Therefore, the variables “Frequency of Syntactical Errors” (FSE) and “Frequency of Logical Errors” (FLE) are used to describe the tendency of the learner to make syntactical and logical errors, respectively. They are described by the three fuzzy sets (Low, Medium, High) that were defined previously.
The activity difficulty level (AD): This indicates how difficult an activity is to solve. The activity difficulty is defined by the instructor who imports the activity into the system. For the description of the activity difficulty level, the following three fuzzy sets are used: “Easy”, “Normal”, “Hard”. The membership functions of these fuzzy sets are trapezoidal, as in Fig. 4. Each fuzzy set is therefore described by a set of four numbers. Particularly, “Easy” is described by (0, 0, 40, 50), “Normal” is described by (40, 50, 70, 80), and “Hard” is described by (70, 80, 100, 100). Thresholds of the above fuzzy sets were defined by the eight experts.
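The membership degrees quoted for the 42% / 58% example of syntactical and logical errors can be reproduced numerically. The sketch below is a check under stated assumptions: the helper `trapezoid` and the variable names are introduced here for illustration, while the set parameters are those given for TE.

```python
def trapezoid(x, a, b, c, d):
    """Membership degree of x in the trapezoidal fuzzy set (a, b, c, d)."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


# Parameters of the TE fuzzy sets as given in the text.
TE_SETS = {
    "Low": (0, 0, 20, 45),
    "Medium": (35, 50, 60, 70),
    "High": (65, 80, 100, 100),
}

fse = 42           # percentage of the learner's errors that are syntactical
fle = 100 - fse    # the two percentages are complementary, so 58

syntactical = {name: trapezoid(fse, *p) for name, p in TE_SETS.items()}
logical = {name: trapezoid(fle, *p) for name, p in TE_SETS.items()}

# Rounded to two decimals for comparison with the figures in the text.
syntactical_rounded = {name: round(mu, 2) for name, mu in syntactical.items()}
logical_rounded = {name: round(mu, 2) for name, mu in logical.items()}
```

Here “Low” evaluates to (45 − 42) / (45 − 20) = 0.12 and “Medium” to (42 − 35) / (50 − 35) ≈ 0.47 for FSE, while 58 falls on the plateau of “Medium” for FLE, giving a degree of 1.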
Fuzzy sets of activity’s relevance level and their membership functions.
The activity relation to syntactical issues (ARS): As mentioned in Subsection 3.1, this is defined by the instructor. For the description of ARS, we use four fuzzy sets, namely “None”, “Little”, “Medium”, and “High”. The membership functions of these fuzzy sets are trapezoidal, except for “None”, which is a single point (Fig. 5). Particularly, “None” is described by (0, 0, 0, 0), “Little” is described by (0, 0, 20, 45), “Medium” is described by (35, 50, 60, 70), and “High” is described by (65, 80, 100, 100). Thresholds of the above fuzzy sets were defined by the eight experts.
The activity relation to logical issues (ALS): As mentioned in Subsection 3.1, this is defined by the instructor. For the description of ALS, we use four fuzzy sets, namely “None”, “Little”, “Medium”, and “High”. The membership functions of these fuzzy sets are trapezoidal, except for “None”, which is a single point (Fig. 5). Particularly, “None” is described by (0, 0, 0, 0), “Little” is described by (0, 0, 20, 45), “Medium” is described by (35, 50, 60, 70), and “High” is described by (65, 80, 100, 100). Thresholds of the above fuzzy sets were defined by the eight experts.
Criteria of calculating the suitability of each activity
Table 3 summarizes the criteria for calculating the suitability of each activity and their description.
We notice that criteria KL, KAL, PPK, FLE and FSE concern the learner’s characteristics, while criteria AD, ARS and ALS concern activity characteristics. The system takes all of these criteria into consideration and, for each activity in the pool, decides its relevance to each individual learner’s needs. The relevance of an activity is described by four fuzzy sets, namely “Low”, “Medium”, “High”, “Very high”. The membership functions of these fuzzy sets are trapezoidal, as in Fig. 6. Particularly, “Low” is described by (0, 0, 30, 40), “Medium” is described by (30, 50, 60, 70), “High” is described by (65, 70, 80, 85), and “Very High” is described by (80, 90, 100, 100). Thresholds of the above fuzzy sets were defined by the eight experts.
In this section, the rules that are applied to the previously defined criteria to determine the relevance of an activity to the learner’s needs are presented. In more detail, the relevance of an activity to the learner’s needs is based on the following:
Whether the activity concerns the current knowledge level of the learner.
Whether the activity difficulty level is suitable for the knowledge adoption level and the prior knowledge level of the learner.
Whether the activity relation to syntactical and logical errors is appropriate for helping the learner to understand her/his misconceptions, as discovered from her/his tendency to make the corresponding type of errors.
Based on the above, 45 fuzzy rules were derived. Particularly, the eight experts in computer programming and the programming language C were asked to define the relevance of an activity for each different combination of values of the learner and activity characteristics. All eight experts had at least five years of experience in teaching the programming language C. The fuzzy rules were derived from the experts’ answers, and the experts’ long experience supports the validity of the rules. The rules are presented in Tables 4, 5 and 6.
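To illustrate how one such rule could be evaluated, the sketch below computes the firing strength of a single rule over KAL, PPK and AD. It rests on several assumptions that should be read as such: the rule shown is invented for illustration and is not one of the 45 rules of Tables 4, 5 and 6; the use of the minimum operator for the conjunction is a common convention in fuzzy rule-based systems and is not stated in the text; and the names `trapezoid`, `EXAMPLE_RULE` and `firing_strength` are hypothetical. Only the fuzzy-set parameters are those defined earlier.

```python
def trapezoid(x, a, b, c, d):
    """Membership degree of x in the trapezoidal fuzzy set (a, b, c, d)."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


# Fuzzy-set parameters as defined in the text.
SETS = {
    "KAL": {"Poor": (0, 0, 30, 45), "Moderate": (30, 45, 60, 70),
            "High": (60, 70, 80, 90), "Very high": (85, 90, 100, 100)},
    "PPK": {"Poor": (0, 0, 40, 50), "Moderate": (40, 50, 60, 70),
            "High": (60, 80, 100, 100)},
    "AD": {"Easy": (0, 0, 40, 50), "Normal": (40, 50, 70, 80),
           "Hard": (70, 80, 100, 100)},
}

# HYPOTHETICAL rule, invented for illustration only:
# IF KAL is Moderate AND PPK is Poor AND AD is Easy THEN relevance is High.
EXAMPLE_RULE = {
    "if": {"KAL": "Moderate", "PPK": "Poor", "AD": "Easy"},
    "then": "High",
}


def firing_strength(rule, inputs):
    """Degree to which the rule's antecedent holds (AND taken as min, an assumption)."""
    return min(
        trapezoid(inputs[variable], *SETS[variable][label])
        for variable, label in rule["if"].items()
    )


# Illustrative crisp inputs on the 0-100 scales used by the fuzzy sets.
inputs = {"KAL": 50, "PPK": 45, "AD": 45}
strength = firing_strength(EXAMPLE_RULE, inputs)
```

With these inputs the antecedent memberships are 1.0, 0.5 and 0.5, so the rule fires with strength 0.5 towards the consequent “High”. The knowledge level (KL) is left out of the sketch on the assumption that it acts as a filter on which activities are considered at all.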
Fuzzy rules concerning KAL, PPK and ADL
