A fuzzy-based mechanism for automatic personalized assessment in an e-learning system for computer programming

Abstract

Testing is a significant part of the teaching and learning process. An assessment test has to include test items that are tailored to the individual learning needs of the students in order to be more accurate and support learning in a more effective way. In this paper, a fuzzy-based mechanism is presented for automatic personalized assessment in an e-learning system for computer programming. Particularly, the selection of the most appropriate test items for each individual student is based on a variety of criteria: (i) the student’s knowledge level, (ii) the student’s prior knowledge of computer programming, (iii) the type of programming errors that the student is prone to make, and (iv) the difficulty level of the test items. Linguistic values are used to determine these criteria. Additionally, 45 fuzzy rules are used over these criteria, which imitate the way of thinking of human tutors with regard to deciding about the most appropriate test items that have to be included in an adaptive test. The presented mechanism was used under real conditions and evaluated by experts and students of the Department of Informatics of the University of Piraeus, Greece with very encouraging results. Specifically, both the participating students and experts found that the presented mechanism creates non-repetitive balanced tests that meet learners’ knowledge level and needs.

Keywords

Personalization, online assessment, e-learning, fuzzy rule-based system, fuzzy logic, computer programming

1. Introduction

Nowadays, new and advanced technologies make a variety of services available in people’s everyday lives. Education is a field in which technology has brought both changes and new challenges to the teaching and learning processes [1, 2, 3]. E-learning, educational software, mobile learning, and Intelligent Tutoring Systems are some of the technological advances employed in education. An important benefit of educational software and tutoring systems is that a learner can access a course from wherever and whenever s/he is available to participate. Moreover, educational software and tutoring systems are capable of providing learning material and processes that are tailored to an individual learner’s needs, imitating the one-to-one teaching process. Adaptivity is crucial, since it makes the learning process significantly more effective [4, 5, 6]. It is achieved by incorporating Artificial Intelligence (AI) techniques in educational software and tutoring systems [7, 8, 9, 10, 11]. AI enables the system to recognize the learners’ individual characteristics and needs and to provide a personalized teaching process [12, 13, 14, 15], which includes personalized lesson paths, personalized advice and feedback, and adaptive assessment tests.

In the relevant literature, adaptive assessment tests, which are created automatically by a tutoring system or an educational application, are called Computerized Adaptive Tests (CAT) [16]. The aim of a CAT is to include assessment items (exercises, questions, suggested activities, quizzes) that are tailored to the learners’ knowledge level and abilities in order to evaluate the learners’ obtained knowledge more accurately and more effectively [17, 18, 19]. CAT are usually shorter and more targeted tests [20]. As a result, they allow the students’ assessment to be performed accurately and in a shorter time. Moreover, they eliminate the boredom and disappointment caused by typical extensive assessment tests, which can include items that the students perceive as too easy or too difficult. CAT require a large pool of assessment items, including questions, exercises, and activities that cover all the possible different characteristics, needs and abilities of the students.

The process of creating CAT has to incorporate “intelligence” so that the most appropriate assessment test is selected for each individual learner. Specifically, an AI technique that imitates a human tutor’s way of thinking has to be integrated in the process of selecting assessment items for CAT. Fuzzy logic is such a technique, since it uses linguistic values to describe data that are characterized by human subjectivity and employs sets of IF-THEN rules that resemble the human process of decision making [21, 22, 23, 24]. Furthermore, the criteria for choosing the most appropriate assessment items for a CAT are multiple. They include the individual student’s knowledge level, background, learning misconceptions and needs. Determining these characteristics is not a straightforward task, because it relies on information that is subjective and uncertain. Therefore, fuzzy logic is an ideal approach for dealing with such data, as it constitutes a methodology for computing with words and can handle uncertain and vague data [25, 26, 27, 28].

In this paper, a fuzzy-based mechanism is presented for automatic personalized assessment in an e-learning system for computer programming. In more detail, fuzzy sets are used to describe a variety of learners’ characteristics, such as their knowledge level, how well they master the learning material that corresponds to their knowledge level, their prior knowledge of computer programming, and how prone they are to making syntactical and/or logical errors. In addition, fuzzy sets are used to describe activity characteristics, such as the difficulty level of an activity and its relation to syntactical and/or logical issues. Each time a student interacts with the system and is called to complete an assessment test, the system triggers 45 fuzzy rules, which calculate the degree to which each activity meets the learner’s needs, and decides which activities have to be included in the created assessment test. The presented fuzzy-based activity selection mechanism was incorporated into a web-based tutoring system for learning computer programming. The system was used under real conditions by undergraduate students of the Department of Informatics of the University of Piraeus, Greece. The adaptive tests that were automatically created were evaluated by both learners and experts in teaching computer programming. The evaluation results are very positive and promising. Specifically, both the participating students and experts found that the presented mechanism creates tests whose activities are balanced with respect to difficulty and complexity and tailored to the learners’ knowledge level and needs. Furthermore, the frequency of re-occurrence of a test and/or an activity is quite low. Consequently, the generated tests are well accepted by both experts and students and contribute significantly to the enhancement of the learners’ knowledge.

2. Related work

Computerized Adaptive Tests (CAT) allow tutoring systems and educational applications to evaluate automatically the students’ knowledge acquisition and performance in a more accurate and targeted way. They support students during the learning process, contribute to their motivation and engagement and enhance the learning outcomes [29, 30]. Additionally, CAT can minimize the duration of an exam and reduce the subjectivity of the assessment [31]. In the relevant literature, there is an increased scientific interest in using CAT for learners’ assessment [16, 32, 33, 34, 35]. CAT is particularly significant for the assessment of students of computer science, since these students usually come from different backgrounds as far as computer programming is concerned and demonstrate a variety of skills [36, 37, 38, 39].

Table 1
A sample of the pool of activities

Description	Answer	Knowledge level	Level of difficulty	Relation to syntactical issues	Relation to logical issues
The assignment command A = B changes the value of variable B. Right or wrong?	Wrong	1	Easy	None	High
How would the following expression be written in a C program? $\frac{5X-3Y}{A-B^{2}}$	(5*X - 3*Y)/(A - pow(B, 2))	1	Normal	High	None
The number of a colony of microorganisms doubles every 2 hours. Write a program that calculates the population of the colony after 2 days, under the assumption that the initial population is 1000.	#include <stdio.h> #include <stdlib.h> int main(int argc, char *argv[]) { int n = 1000, i; for (i = 2; i <= 48; i = i + 2) { n = n * 2; } printf("the final population is %d", n); return 0; }	3	Hard	High	High
The table A = [43, 25, 28, 14, 32] is given. What will be the order of the data after the second pass of bubble sort?	[14, 25, 43, 28, 32]	5	Hard	None	High
Fill the gaps to compute and display the sum of each line of a two-dimensional array A [4, 10].	main() { ___(i = __;____;___) { ______________ ___ (__ = __; j __ ; __) { ___ = ___ + ___ ; } } printf("The sum is __",__); return 0; }	6	Normal	Medium	High

There are tutoring systems that incorporate fuzzy logic in the process of selecting the most suitable assessment items (questions, exercises, activities etc.) that meet the knowledge level and the learning needs of a particular learner. Badaracco and Martínez [40] introduced a new item selection algorithm, based on a multi-criteria decision model. Specifically, instead of using statistical calibration, the algorithm uses fuzzy linguistic information to model and integrate the teachers’ knowledge. The aim of this method is to enhance the adaptation of testing to the students’ competence level. On the other hand, in [31], a fuzzy rule-based system is used to estimate the ability of students, and this information is subsequently used to select the next question. The system input consists of the level of difficulty and discrimination of the questions, the probability of students being able to answer correctly, and the students’ answer; the output is the estimated students’ ability. Lendyuk et al. [41] used fuzzy rules to dynamically adapt the complexity level of the questions in an adaptive test. Particularly, when the testing begins, the student receives a question block with a specific complexity. According to the student’s answer, the time s/he spent answering and the complexity level of the block, the system applies fuzzy rules to decide the complexity level of the next block. In the study of Chrysafiadi and Virvou [37], a fuzzy-based algorithm is presented that creates adaptive tests on the fly for teaching computer programming via a web-based educational environment. In particular, their algorithm takes into consideration the knowledge dependencies that exist among the various domain concepts of the learning material and decides which test items have to be included in the created test.

A thorough investigation of the related scientific literature shows that, when fuzzy logic is used to select the most suitable assessment items from a pool, it is applied to data that concern the students’ knowledge and competence level, the students’ ability to answer or solve particular questions or activities correctly, and the difficulty level and complexity of the assessment items. None of the existing approaches takes into consideration data that concern the types of students’ errors and misconceptions (i.e. syntactical or logical) and their previous knowledge of the domain. Therefore, the presented fuzzy-based mechanism for automated creation of adaptive tests considers not only the knowledge level of the student and the difficulty level of the activities, but also the learners’ deficiencies, error proneness and background. In this way, the system models and recognizes the students’ learning needs more accurately and selects assessment items that are better tailored to the characteristics and needs of each individual learner.

Table 2

Students’ possible knowledge levels

Current knowledge level		Corresponding learning material
Description	Value
Level 1: Novice	1	Constants, variables, assignment statement, operators, sequence structure
Level 2	2	Conditional Structure
Level 3	3	Iteration structure with known number of loops (for)
Level 4	4	Iteration structures with unknown number of loops (while and do … while)
Level 5	5	One-dimensional arrays
Level 6	6	Two-dimensional arrays
Level 7	7	Sub-programming
Level 8: Expert	8	No learning material

3. The fuzzy-based activity selection mechanism

3.1 The pool of activities

The tutoring system communicates with a database, which constitutes the pool of the available activities for learners’ assessment. For each activity, the following information is stored.

•
The description of the activity.
•
The correct answer.
•
The knowledge level to which it is attached, i.e. an integer value between 1 and 7, according to Table 2.
•
The level of difficulty. This is defined by the instructor who imports the activity in the system. It can take one of three linguistic values, namely “easy”, “normal”, “hard”.
•
The relation of the activity to syntactical issues, such as symbolism of operators, variable naming, command structures etc. This is also defined by the instructor who imports the activity in the system. It can take one of four linguistic values, namely “none”, “little”, “medium”, “high”.
•
The relation of the activity to logical issues, such as semantics of commands and execution flow of a program. This is also defined by the instructor who imports the activity in the system. It can take one of four linguistic values, namely “none”, “little”, “medium”, “high”.

Note that an activity can assess syntactical issues only, logical issues only, or both. Table 1 presents a sample of the pool of activities.

3.2 Criteria and their fuzzy description

The criteria for choosing the most appropriate activities in the pool, to be included in the test concern either learners’ characteristics or activity characteristics. Particularly, they are:

1.
The current knowledge level (KL) of the learner: This is a crucial characteristic of a learner that helps the system to identify the learner’s learning needs. It represents those concepts of the teaching domain knowledge that the learner knows. It is a dynamic characteristic, the value of which can change during the learner’s interaction with the system. The knowledge level of a learner assumes values from 1 (novice) to 8 (expert), as in Table 2.
2.
The knowledge adoption level (KAL) of the learner: This is a dynamic learner characteristic which represents how well the learner knows the learning material that corresponds to her/his current knowledge level. Its value is determined by the learner’s performance in the tests. It is described by the following fuzzy sets:

a.
“Poor”, when the learner failed to correctly solve most of the test activities and her/his performance in the test is below 45/100.
b.
“Moderate”, when the learner correctly solved a moderate number of the test activities and her/his performance in the test ranges from 30/100 to 70/100.
c.
“High” when the learner correctly solved most of the test activities and her/his performance in the test ranges from 60/100 to 90/100.
d.
“Very high”, when the learner correctly solved almost all the test activities and her/his performance in the test exceeds 85/100.

If the KAL of a learner is characterized as “Very high”, then s/he is considered to have reached the target knowledge, and no further assessment at the particular level takes place. The thresholds of the above fuzzy sets were defined by eight experts in computer programming, who had at least five years of experience in teaching computer programming and are, henceforth, referred to as “experts”. Fig. 1 depicts the trapezoidal membership functions for the fuzzy sets of KAL. Since the functions are trapezoidal, the partition of each fuzzy set is described by a set of four numbers. Particularly, “Poor” is described by (0, 0, 30, 45), “Moderate” is described by (30, 45, 60, 70), “High” is described by (60, 70, 80, 90), and “Very high” is described by (85, 90, 100, 100).

Figure 1.
Fuzzy sets of KAL and their membership functions.

3.
The learner’s prior knowledge of computer programming (PPK): This concerns other programming languages that the learner may already know. During the first interaction of the learner with the system, s/he is asked to enter a number from 0 (not at all) to 100 (absolutely) that declares the degree to which s/he knows another programming language. For the description of PPK, the following three fuzzy sets are defined:

a.
“Poor”, when the learner has zero or little knowledge of computer programming. Her/his degree of previous knowledge of computer programming is less than 50.
b.
“Moderate”, when the learner has a moderate knowledge of computer programming. S/he knows the basic concepts of computer programming, like variable declaration, assignment statement, operators, input/output commands, and basic concepts concerning if and iteration structures. Her/his degree of previous knowledge of computer programming ranges from 40 to 70.
c.
“High”, when the learner knows at least one other computer programming language very well. Her/his degree of previous knowledge of computer programming is more than 60.

Figure 2.
Fuzzy sets of PPK and their membership functions.

Thresholds of the previous fuzzy sets were defined by the eight experts. In Fig. 2, the trapezoidal membership functions for the fuzzy sets of PPK are depicted. Therefore, the partition of each fuzzy set is described by a set of four numbers. Particularly, “Poor” is described by (0, 0, 40, 50), “Moderate” is described by (40, 50, 60, 70), and “High” is described by (60, 80, 100, 100).

Figure 3.
Fuzzy sets of TE and their membership functions.

4.
The type of errors to which the learner is prone (TE): There are two categories of errors, namely syntactical and logical. Syntactical errors include transposed letters in command names, omitted data definitions, invalid command names, incorrect operator symbols etc. Logical errors are usually design errors and stem from misconceptions about the program and about the semantics and operation of the commands. A learner makes a syntactical error when s/he has not carefully studied the learning material, whereas s/he makes a logical error when s/he has difficulty understanding a command or a programming structure. The type of errors that a learner makes more often is derived from the results of the tests that the learner has completed. The system calculates the percentage of errors that the learner has made and determines how many of them were syntactical and how many were logical. For the description of the learner’s tendency to make errors of each category, the following three fuzzy sets are used:

a.
“Low”, when the learner makes less than 45% errors of the category.
b.
“Medium”, when the learner makes from 35% to 70% errors of the category.
c.
“High”, when the learner makes more than 65% errors of the category.

Figure 4.
Fuzzy sets of AD and their membership functions.

Figure 5.
Fuzzy sets of ARS and ALS and their membership functions.

Thresholds of the above fuzzy sets were defined by the eight experts. In Fig. 3, the trapezoidal membership functions for the fuzzy sets of TE are depicted. Therefore, the partition of each fuzzy set is described by a set of four numbers. Particularly, “Low” is described by (0, 0, 20, 45), “Medium” is described by (35, 50, 60, 70), and “High” is described by (65, 80, 100, 100).

Note that the percentages of syntactical and logical errors are complementary. For example, if 42% of the learner’s errors in a test are syntactical, then the remaining 58% are logical. In this case, the “syntactical errors” category belongs to two adjacent fuzzy sets: “Low” with membership degree 0.12 and “Medium” with membership degree 0.47. The “logical errors” category belongs to “Medium” with membership degree 1. The variables “Frequency of Syntactical Errors” (FSE) and “Frequency of Logical Errors” (FLE) are used to describe the learner’s tendency to make syntactical and logical errors, respectively. They are described by the three fuzzy sets (Low, Medium, High) that were defined previously.
5.
The activity difficulty level (AD): This indicates how difficult an activity is to solve. The activity difficulty is defined by the instructor who imports the activity into the system. For the description of the activity difficulty level, the following three fuzzy sets are used: “Easy”, “Normal”, “Hard”. The membership functions of these fuzzy sets are trapezoidal, as in Fig. 4. Therefore, each fuzzy set is described by a set of four numbers. Particularly, “Easy” is described by (0, 0, 40, 50), “Normal” is described by (40, 50, 70, 80), and “Hard” is described by (70, 80, 100, 100). Thresholds of the above fuzzy sets were defined by the eight experts.

Figure 6.
Fuzzy sets of activity’s relevance level and their membership functions.

6.
The activity relation to syntactical issues (ARS): As mentioned in Subsection 3.1, this is defined by the instructor. For the description of ARS, we use four fuzzy sets, namely “None”, “Little”, “Medium”, and “High”. The membership functions of these fuzzy sets are trapezoidal, except for “None”, which is a point (Fig. 5). Particularly, “None” is described by (0, 0, 0, 0), “Little” is described by (0, 0, 20, 45), “Medium” is described by (35, 50, 60, 70), and “High” is described by (65, 80, 100, 100). Thresholds of the above fuzzy sets were defined by the eight experts.

Table 3
Criteria of calculating the suitability of each activity

	Criterion	Description	Fuzzy sets
Learner’s characteristics	KL	The current knowledge level (KL) of the learner. It takes an integer value from 1 to 8.	No fuzzy description
	KAL	The knowledge adoption level of the learner. It represents how well the learner knows the learning material that corresponds to her/his current knowledge level.	Poor: (0, 0, 30, 45) Moderate: (30, 45, 60, 70) High: (60, 70, 80, 90) Very high: (85, 90, 100, 100)
	PPK	The learner’s prior knowledge of computer programming.	Poor: (0, 0, 40, 50) Moderate: (40, 50, 60, 70) High: (60, 80, 100, 100)
	FSE	The frequency with which the learner makes syntax errors.	Low: (0, 0, 20, 45) Medium: (35, 50, 60, 70) High: (65, 80, 100, 100)
	FLE	The frequency with which the learner makes logical errors.	Low: (0, 0, 20, 45) Medium: (35, 50, 60, 70) High: (65, 80, 100, 100)
Activity’s characteristics	AKL	The knowledge level that the activity concerns. It takes an integer value from 1 to 7.	No fuzzy description
	AD	The activity difficulty level	Easy: (0, 0, 40, 50) Normal: (40, 50, 70, 80) Hard: (70, 80, 100, 100)
	ARS	The activity relation to syntactical issues	None: (0, 0, 0, 0) Little: (0, 0, 20, 45) Medium: (35, 50, 60, 70) High: (65, 80, 100, 100)
	ALS	The activity relation to logical issues	None: (0, 0, 0, 0) Little: (0, 0, 20, 45) Medium: (35, 50, 60, 70) High: (65, 80, 100, 100)

7.
The activity relation to logical issues (ALS): As mentioned in Subsection 3.1, this is defined by the instructor. For the description of ALS, we use four fuzzy sets, namely “None”, “Little”, “Medium”, and “High”. The membership functions of these fuzzy sets are trapezoidal, except for “None”, which is a point (Fig. 5). Particularly, “None” is described by (0, 0, 0, 0), “Little” is described by (0, 0, 20, 45), “Medium” is described by (35, 50, 60, 70), and “High” is described by (65, 80, 100, 100). Thresholds of the above fuzzy sets were defined by the eight experts.

Table 3 summarizes the criteria for calculating the suitability of each activity and their description.

We notice that criteria KL, KAL, PPK, FLE and FSE concern the learner’s characteristics, while criteria AD, ARS and ALS concern activity characteristics. The system takes all of these criteria into consideration and, for each activity in the pool, decides its relevance to each individual learner’s needs. The relevance of an activity is described by four fuzzy sets, namely “Low”, “Medium”, “High”, “Very high”. The membership functions of these fuzzy sets are trapezoidal, as in Fig. 6. Particularly, “Low” is described by (0, 0, 30, 40), “Medium” is described by (30, 50, 60, 70), “High” is described by (65, 70, 80, 85), and “Very High” is described by (80, 90, 100, 100). Thresholds of the above fuzzy sets were defined by the eight experts.
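The four-number descriptions used throughout this subsection can be evaluated with a few lines of code. The following is a minimal Python sketch under stated assumptions, not the system’s actual implementation: the helper names `trapezoid` and `fuzzify` are hypothetical, and the set boundaries are copied from the TE description above. It reproduces the worked example of a learner whose errors are 42% syntactical and 58% logical.

```python
def trapezoid(x, a, b, c, d):
    """Membership degree of x in the trapezoidal fuzzy set (a, b, c, d)."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0                    # plateau (also covers a == b or c == d)
    if x < b:
        return (x - a) / (b - a)      # rising edge
    return (d - x) / (d - c)          # falling edge


# Fuzzy sets of TE (used for both FSE and FLE), as given in the text.
TE_SETS = {
    "Low": (0, 0, 20, 45),
    "Medium": (35, 50, 60, 70),
    "High": (65, 80, 100, 100),
}


def fuzzify(x, sets):
    """Map a crisp value to its membership degree in every fuzzy set."""
    return {name: round(trapezoid(x, *params), 2) for name, params in sets.items()}


print(fuzzify(42, TE_SETS))  # {'Low': 0.12, 'Medium': 0.47, 'High': 0.0}
print(fuzzify(58, TE_SETS))  # {'Low': 0.0, 'Medium': 1.0, 'High': 0.0}
```

The same two helpers apply unchanged to KAL, PPK, AD, ARS, ALS and the activity relevance once their four-number descriptions are substituted.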

3.3 The fuzzy rules

In this section, we present the rules that are applied to the previously defined criteria in order to determine the relevance of an activity to the learner’s needs. In more detail, the relevance of an activity to the learner’s needs depends on the following:

•
If the activity concerns the current knowledge level of the learner.
•
If the activity difficulty level is suitable for the knowledge adoption level and the prior knowledge level of the learner.
•
If the activity relation to syntactical and logical errors is appropriate for helping the learner to understand her/his misconceptions as discovered from her/his tendency to make a corresponding type of errors.

Considering the above, 45 fuzzy rules are derived. Particularly, 8 experts in computer programming and the programming language C were asked to define the relevance of an activity for each different combination of values of the learner and activity characteristics. All 8 experts had at least 5 years of experience in teaching the programming language C. The fuzzy rules were derived from the experts’ answers, and the experts’ long experience supports the validity of the rules. The rules are presented in Tables 4, 5 and 6.

Table 4
Fuzzy rules concerning KAL, PPK and AD

No. of rule	Knowledge adoption level (KAL)	Prior knowledge (PPK)	Activity’s difficulty level (AD)	Activity’s relevance (AR)
1	Poor	Poor	Easy	Very high
2	Poor	Poor	Normal	Medium
3	Poor	Poor	Hard	Low
4	Poor	Moderate	Easy	Very high
5	Poor	Moderate	Normal	High
6	Poor	Moderate	Hard	Low
7	Poor	High	Easy	Medium
8	Poor	High	Normal	High
9	Poor	High	Hard	Low
10	Moderate	Poor	Easy	High
11	Moderate	Poor	Normal	Very high
12	Moderate	Poor	Hard	Low
13	Moderate	Moderate	Easy	Medium
14	Moderate	Moderate	Normal	Very high
15	Moderate	Moderate	Hard	Low
16	Moderate	High	Easy	Low
17	Moderate	High	Normal	Very high
18	Moderate	High	Hard	Medium
19	High	Poor	Easy	Low
20	High	Poor	Normal	High
21	High	Poor	Hard	High
22	High	Moderate	Easy	Low
23	High	Moderate	Normal	Medium
24	High	Moderate	Hard	High
25	High	High	Easy	Low
26	High	High	Normal	Medium
27	High	High	Hard	Very high

Each time the system asks a learner to complete a test, the fuzzy-based mechanism (presented in more detail in the following subsection) is triggered and dynamically creates the test. The system checks the learner and activity characteristics, calculates the relevance of each activity to the learner’s needs and decides which activities will be included in the test. When more than one condition in a rule is connected with an AND operator, the MIN fuzzy operator is used [42]. As a result, a test is created that is tailored to the learner’s needs.
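To illustrate how the MIN operator combines the conditions of a rule, the following Python sketch evaluates a handful of the Table 4 rules. It is an illustration under stated assumptions: the names `RULES` and `fire_rules` are hypothetical, only six of the 27 rules are included, and the membership degrees in the usage example are chosen for illustration.

```python
# A subset of Table 4: (KAL, PPK, AD) -> activity relevance.
RULES = {
    ("Poor", "Poor", "Easy"): "Very high",        # rule 1
    ("Poor", "Poor", "Normal"): "Medium",         # rule 2
    ("Poor", "Poor", "Hard"): "Low",              # rule 3
    ("Moderate", "Poor", "Easy"): "High",         # rule 10
    ("Moderate", "Poor", "Normal"): "Very high",  # rule 11
    ("Moderate", "Poor", "Hard"): "Low",          # rule 12
}


def fire_rules(kal, ppk, ad):
    """Return (relevance label, firing strength) for every rule that fires.

    Each argument maps fuzzy-set names to membership degrees. The AND of the
    three conditions is computed with the MIN operator.
    """
    fired = []
    for (k, p, a), relevance in RULES.items():
        strength = min(kal.get(k, 0.0), ppk.get(p, 0.0), ad.get(a, 0.0))
        if strength > 0:
            fired.append((relevance, strength))
    return fired


# Illustrative learner: KAL partly "Poor" and partly "Moderate", PPK fully "Poor".
kal = {"Poor": 0.6, "Moderate": 0.4}
ppk = {"Poor": 1.0}
easy_activity = {"Easy": 1.0}

print(fire_rules(kal, ppk, easy_activity))
# [('Very high', 0.6), ('High', 0.4)]
```

For this learner and an easy activity, rules 1 and 10 fire, with strengths limited by the weakest condition in each rule.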

Table 5
Fuzzy rules concerning FSE and ARS

No. of rule	Frequency of syntax errors (FSE)	Activity’s relation to syntax errors (ARS)	Activity’s relevance (AR)
28	Low	Little or none	Very high
29	Low	Medium	Medium
30	Low	High	Low
31	Medium	Little or none	Low
32	Medium	Medium	Very high
33	Medium	High	High
34	High	Little or none	Low
35	High	Medium	Medium
36	High	High	Very high

Table 6
Fuzzy rules concerning FLE and ALS

No. of rule	Frequency of logical errors (FLE)	Activity’s relation to logical errors (ALS)	Activity’s relevance (AR)
37	Low	Little or none	Very high
38	Low	Medium	Medium
39	Low	High	Low
40	Medium	Little or none	Low
41	Medium	Medium	Very high
42	Medium	High	High
43	High	Little or none	Low
44	High	Medium	Medium
45	High	High	Very high

3.4 The algorithm for adaptive test creation

In this section, we present the algorithm that is executed for the selection of the appropriate activities in the pool and the creation of the test that is adapted to the learner’s needs.

1.
Check the learner’s characteristics.
2.
If KAL = “Very high” then

2.1
The learner is considered to have reached the target knowledge of the current level and needs no further assessment at it.
2.2
If KL = 8, then the learner has reached the target knowledge, the e-learning program has been completed and the algorithm ends. Otherwise, transfer the learner to the next knowledge level (KL = KL + 1).

3.
Choose, from the pool, the activities for which KL = AKL holds. Let A be the subset of these activities.
4.
For each activity in the subset A:

4.1
Check which fuzzy rules can be applied.
4.2
Apply the rules to calculate the activity relevance.
4.3
If more than one rule is applied, aggregate all the fuzzy rule results.
4.4
Use the centroid method [42] to defuzzify the aggregated result into a crisp value that represents the activity relevance.

5.
Sort the activities according to their relevance.
6.
Choose the first $n$ activities to include in the created test, where the number $n$ is defined by the instructor.
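Steps 3 to 6 can be sketched in Python as follows. This is a hedged illustration, not the authors’ implementation. It assumes that rule results are aggregated by taking, for each relevance set, the maximum firing strength and clipping the set at that height, and that the centroid is approximated on a discretized 0 to 100 axis; the paper does not spell out these details. The names `defuzzify`, `create_test` and `fire` are hypothetical, and the relevance sets are those given in Subsection 3.2.

```python
RELEVANCE_SETS = {
    "Low": (0, 0, 30, 40),
    "Medium": (30, 50, 60, 70),
    "High": (65, 70, 80, 85),
    "Very high": (80, 90, 100, 100),
}


def trapezoid(x, a, b, c, d):
    """Membership degree of x in the trapezoidal fuzzy set (a, b, c, d)."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


def defuzzify(fired, steps=1000):
    """Steps 4.3-4.4: max-aggregate clipped output sets, then take the centroid."""
    strength = {}
    for label, s in fired:                       # strongest firing per output set
        strength[label] = max(strength.get(label, 0.0), s)
    num = den = 0.0
    for i in range(steps + 1):
        x = 100.0 * i / steps
        mu = max((min(s, trapezoid(x, *RELEVANCE_SETS[label]))
                  for label, s in strength.items()), default=0.0)
        num += x * mu
        den += mu
    return num / den if den else 0.0


def create_test(learner_kl, activities, fire, n):
    """Steps 3, 5 and 6: filter by knowledge level, rank by relevance, keep n."""
    subset = [a for a in activities if a["akl"] == learner_kl]
    scored = [(defuzzify(fire(a)), a["id"]) for a in subset]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [activity_id for _, activity_id in scored[:n]]


# Usage with precomputed rule results (label, strength) per activity.
activities = [
    {"id": 4, "akl": 4,
     "fired": [("Low", 0.6), ("Low", 0.4), ("Low", 0.47), ("Low", 0.68)]},
    {"id": 12, "akl": 4,
     "fired": [("Very high", 0.6), ("High", 0.4), ("Medium", 0.47), ("Very high", 0.68)]},
    {"id": 99, "akl": 5, "fired": [("Very high", 1.0)]},  # other level, filtered out
]
print(create_test(4, activities, lambda a: a["fired"], n=2))  # [12, 4]
```

In a full implementation, `fire` would evaluate all 45 rules against the learner’s and the activity’s fuzzy values; here it simply returns precomputed results so that the ranking logic can be read in isolation.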

4. Example of use

Table 7
Criteria values for Mary, Gloria and Adam

Learner’s characteristics	Values for Mary	Values for Gloria	Values for Adam
KAL	(0.6, 0.4, 0, 0)	(0, 0, 1, 0)	(0, 0.8, 0.2, 0)
PPK	(1, 0, 0)	(0, 1, 0)	(0, 0.5, 0.5)
FSE	(0, 0, 0.47)	(1, 0, 0)	(0.08, 0.53, 0)
FLE	(0.68, 0, 0)	(0, 0, 1)	(0, 1, 0)

Table 8

Criteria values for activities of the 4th knowledge level

ID	AKL	AD	Relation to syntax issues	Relation to logical issues
1	4	Easy	None	High
2	4	Normal	High	None
3	4	Hard	High	High
4	4	Hard	None	High
5	4	Normal	Medium	High
6	4	Easy	Little	Medium
7	4	Easy	Little	High
8	4	Normal	Medium	Little
9	4	Hard	High	None
10	4	Normal	High	Medium
11	4	Normal	Medium	Medium
12	4	Easy	Medium	Little

In this section, we present examples of use of the previously described fuzzy-based activity selection mechanism. Let us consider three learners, namely Mary, Gloria and Adam, who belong to the same knowledge level, but have different characteristics and needs. Mary’s performance in the last test was 42/100, and 72% of her errors were syntactical (the other 28% being logical errors). Moreover, she declared that she had no prior knowledge of computer programming. Gloria’s performance in the last test was 78/100, and 6% of her errors were syntactical (the other 94% being logical errors). Also, she declared her prior knowledge of computer programming at 50%. Finally, Adam’s performance in the last test was 62/100, and 43% of his errors were syntactical (the other 57% being logical errors). Also, he declared his prior knowledge of computer programming at 75%. Taking these values into consideration, the fuzzy values of the criteria concerning the learners’ characteristics are calculated, as in Table 7.

Table 9

Estimated relevance level of each activity for Mary

Activity ID	Relevance	Rules	Defuzzification result
1	(0, 0, 0, 0.6)	1	44.87
	(0, 0, 0.4, 0)	10
	(0.47, 0, 0, 0)	34
	(0.68, 0, 0, 0)	39
2	(0, 0.6, 0, 0)	2	66.54
	(0, 0, 0, 0.4)	11
	(0, 0, 0, 0.47)	36
	(0, 0, 0, 0.68)	37
3	(0.6, 0, 0, 0)	3	36.55
	(0.4, 0, 0, 0)	12
	(0, 0, 0, 0.47)	36
	(0.68, 0, 0, 0)	39
4	(0.6, 0, 0, 0)	3	18.36
	(0.4, 0, 0, 0)	12
	(0.47, 0, 0, 0)	34
	(0.68, 0, 0, 0)	39
5	(0, 0.6, 0, 0)	2	40.96
	(0, 0, 0, 0.4)	11
	(0, 0.47, 0, 0)	35
	(0.68, 0, 0, 0)	39
6	(0, 0, 0, 0.6)	1	51.24
	(0, 0, 0.4, 0)	10
	(0.47, 0, 0, 0)	34
	(0, 0.68, 0, 0)	38
7	(0, 0, 0, 0.6)	1	44.87
	(0, 0, 0.4, 0)	10
	(0.47, 0, 0, 0)	34
	(0.68, 0, 0, 0)	39
8	(0, 0.6, 0, 0)	2	66.54
	(0, 0, 0, 0.4)	11
	(0, 0.47, 0, 0)	35
	(0, 0, 0, 0.68)	37
9	(0.6, 0, 0, 0)	3	43.16
	(0.4, 0, 0, 0)	12
	(0, 0, 0, 0.47)	36
	(0, 0, 0, 0.68)	37
10	(0, 0.6, 0, 0)	2	66.54
	(0, 0, 0, 0.4)	11
	(0, 0, 0, 0.47)	36
	(0, 0, 0, 0.68)	37
11	(0, 0.6, 0, 0)	2	66.54
	(0, 0, 0, 0.4)	11
	(0, 0.47, 0, 0)	35
	(0, 0, 0, 0.68)	37
12	(0, 0, 0, 0.6)	1	69.33
	(0, 0, 0.4, 0)	10
	(0, 0.47, 0, 0)	35
	(0, 0, 0, 0.68)	37

Therefore, for Mary, KAL belongs to the “poor” fuzzy set with a 0.6 degree of membership and to the “moderate” fuzzy set with a 0.4 degree of membership, PPK is absolutely “poor”, FSE is “high” with a 0.47 membership degree and FLE is “low” with a 0.68 membership degree. For Gloria, KAL is absolutely “high”, PPK is absolutely “moderate”, FSE is absolutely “low” and FLE is absolutely “high”. Finally, for Adam, KAL belongs to the “moderate” fuzzy set with a 0.8 membership degree and to the “high” fuzzy set with a 0.4 membership degree, PPK belongs to the “moderate” fuzzy set with a 0.5 membership degree and to the “high” fuzzy set with a 0.5 membership degree, FSE belongs to the “low” fuzzy set with a 0.08 membership degree and to the “medium” fuzzy set with a 0.53 membership degree, and FLE is absolutely “medium”.

From the pool, only the activities for which KAL equals 4 are chosen for all three learners. Let us assume that only 12 activities concern the 4th knowledge level. The characteristics of these 12 activities are presented in Table 8.

Taking into consideration the characteristics of learners and the twelve activities, the system selects the fuzzy rules that have to be triggered in order to estimate the relevance level of each activity to each individual learner’s needs (Tables 9, 10 and 11).

From Table 9, the sequence of activities that the system selects to include in the created test for Mary is: 12, 2, 8, 10, 11, 6, 1, 7, 9, 5, 3, 4.
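The ordering above can be checked with a short sketch that sorts the defuzzification results of Table 9 in descending order. The function names `rank_activities` and `select_test` are hypothetical, and breaking ties by ascending activity ID is an assumption; it happens to match the sequence stated for Mary.

```python
from typing import Dict, List

# Defuzzification results for Mary, copied from Table 9 (activity ID -> relevance).
MARY_RELEVANCE: Dict[int, float] = {
    1: 44.87, 2: 66.54, 3: 36.55, 4: 18.36, 5: 40.96, 6: 51.24,
    7: 44.87, 8: 66.54, 9: 43.16, 10: 66.54, 11: 66.54, 12: 69.33,
}


def rank_activities(relevance: Dict[int, float]) -> List[int]:
    """Return activity IDs from most to least relevant.

    Python's sort is stable even with reverse=True, so activities with equal
    relevance keep their insertion order (ascending ID here, an assumption).
    """
    return sorted(relevance, key=relevance.get, reverse=True)


def select_test(relevance: Dict[int, float], n: int) -> List[int]:
    """Choose the first n activities, where n is set by the instructor."""
    return rank_activities(relevance)[:n]


ranking = rank_activities(MARY_RELEVANCE)
five_item_test = select_test(MARY_RELEVANCE, 5)
```

With these values, `ranking` is 12, 2, 8, 10, 11, 6, 1, 7, 9, 5, 3, 4, and a five-item test for Mary would contain activities 12, 2, 8, 10 and 11.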

Table 10

Estimated relevance level of each activity for Gloria

Activity ID	Relevance	Rules	Defuzzification result
1	(1, 0, 0, 0)	22	40
	(0, 0, 0, 1)	28
	(0, 0, 0, 1)	45
2	(0, 1, 0, 0)	23	31.84
	(1, 0, 0, 0)	30
	(1, 0, 0, 0)	43
3	(0, 0, 1, 0)	24	47.63
	(1, 0, 0, 0)	30
	(0, 0, 0, 1)	45
4	(0, 0, 1, 0)	24	83.64
	(0, 0, 0, 1)	28
	(0, 0, 0, 1)	45
5	(0, 1, 0, 0)	23	67.08
	(0, 1, 0, 0)	29
	(0, 0, 0, 1)	45
6	(1, 0, 0, 0)	22	44.19
	(0, 0, 0, 1)	28
	(0, 1, 0, 0)	44
7	(1, 0, 0, 0)	22	40
	(0, 0, 0, 1)	28
	(0, 0, 0, 1)	45
8	(0, 1, 0, 0)	23	31.84
	(0, 1, 0, 0)	29
	(1, 0, 0, 0)	43
9	(0, 0, 1, 0)	24	34.83
	(1, 0, 0, 0)	30
	(1, 0, 0, 0)	43
10	(0, 1, 0, 0)	23	31.84
	(1, 0, 0, 0)	30
	(0, 1, 0, 0)	44
11	(0, 1, 0, 0)	23	52
	(0, 1, 0, 0)	29
	(0, 1, 0, 0)	44
12	(1, 0, 0, 0)	22	31.84
	(0, 1, 0, 0)	29
	(1, 0, 0, 0)	43

Table 11

Estimated relevance level of each activity for Adam

Activity ID	Relevance	Rules	Defuzzification result
1	(0, 0.5, 0, 0)	13	46.89
	(0.5, 0, 0, 0)	16
	(0.4, 0, 0, 0)	22
	(0.4, 0, 0, 0)	25
	(0, 0, 0, 0.08)	28
	(0.53, 0, 0, 0)	31
	(0, 0, 1, 0)	42
2	(0, 0, 0, 0.5)	14	41.49
	(0, 0, 0, 0.5)	17
	(0, 0.4, 0, 0)	23
	(0, 0.4, 0, 0)	26
	(0.08, 0, 0, 0)	30
	(0, 0, 0.53, 0)	33
	(1, 0, 0, 0)	40
3	(0.5, 0, 0, 0)	15	51.59
	(0, 0.5, 0, 0)	18
	(0, 0, 0.4, 0)	24
	(0, 0, 0, 0.4)	27
	(0.08, 0, 0, 0)	30
	(0, 0, 0.53, 0)	33
	(0, 0, 1, 0)	42
4	(0.5, 0, 0, 0)	15	50.94
	(0, 0.5, 0, 0)	18
	(0, 0, 0.4, 0)	24
	(0, 0, 0, 0.4)	27
	(0, 0, 0, 0.08)	28
	(0.53, 0, 0, 0)	31
	(0, 0, 1, 0)	42
5	(0, 0, 0, 0.5)	14	70.02
	(0, 0, 0, 0.5)	17
	(0, 0.4, 0, 0)	23
	(0, 0.4, 0, 0)	26
	(0, 0.08, 0, 0)	29
	(0, 0, 0, 0.53)	32
	(0, 0, 1, 0)	42
6	(0, 0.5, 0, 0)	13	51.53
	(0.5, 0, 0, 0)	16
	(0.4, 0, 0, 0)	22
	(0.4, 0, 0, 0)	25
	(0, 0, 0, 0.08)	28
	(0.53, 0, 0, 0)	31
	(0, 0, 0, 1)	41
7	(0, 0.5, 0, 0)	13	46.89
	(0.5, 0, 0, 0)	16
	(0.4, 0, 0, 0)	22
	(0.4, 0, 0, 0)	25
	(0, 0, 0, 0.08)	28
	(0.53, 0, 0, 0)	31
	(0, 0, 1, 0)	42
8	(0, 0, 0, 0.5)	14	37.44
	(0, 0, 0, 0.5)	17
	(0, 0.4, 0, 0)	23
	(0, 0.4, 0, 0)	26
	(0, 0.08, 0, 0)	29
	(0, 0, 0, 0.53)	32
	(1, 0, 0, 0)	40
9	(0.5, 0, 0, 0)	15	40.56
	(0, 0.5, 0, 0)	18
	(0, 0, 0.4, 0)	24
	(0, 0, 0, 0.4)	27
	(0.08, 0, 0, 0)	30
	(0, 0, 0.53, 0)	33
	(1, 0, 0, 0)	40
10	(0, 0, 0, 0.5)	14	72.17
	(0, 0, 0, 0.5)	17
	(0, 0.4, 0, 0)	23
	(0, 0.4, 0, 0)	26
	(0.08, 0, 0, 0)	30
	(0, 0, 0.53, 0)	33
	(0, 0, 0, 1)	41
11	(0, 0, 0, 0.5)	14	72.59
	(0, 0, 0, 0.5)	17
	(0, 0.4, 0, 0)	23
	(0, 0.4, 0, 0)	26
	(0, 0.08, 0, 0)	29
	(0, 0, 0, 0.53)	32
	(0, 0, 0, 1)	41
12	(0, 0.5, 0, 0)	13	37.9
	(0.5, 0, 0, 0)	16
	(0.4, 0, 0, 0)	22
	(0.4, 0, 0, 0)	25
	(0, 0.08, 0, 0)	29
	(0, 0, 0, 0.53)	32
	(1, 0, 0, 0)	40

From Table 10, the sequence of activities that the system selects to include in the created test for Gloria is: 4, 5, 11, 3, 6, 1, 7, 9, 2, 8, 10, 12.

From Table 11, the sequence of activities that the system selects to include in the created test for Adam is: 11, 10, 5, 3, 6, 4, 1, 7, 2, 9, 12, 8.

5. Evaluation

5.1 Method and participants

The presented fuzzy-based mechanism for the automatic creation of personalized tests was incorporated in an adaptive e-learning system for teaching computer programming. The system offers a complete course in computer programming, starting from variables and operators and ending with sub-programming. The system models the characteristics and learning needs of each individual learner and adapts the delivery of the learning material to them [43]. The system was evaluated by both experts in computer programming and students of computer programming. The evaluation was conducted through questionnaires whose questions mainly concerned the assessment tests.

The e-learning system was used by 65 undergraduate students of the Department of Informatics of the University of Piraeus, Greece. The students were taught computer programming and the language C in a class lasting 10 weeks. Then, they were asked to use the e-learning system for a period of 3 weeks to assess their knowledge of computer programming and complete their education in it. Of the 65 undergraduate students, 27 were female and 38 were male. Furthermore, 42 were 18 to 20 years old, 16 were 20 to 22 years old, and 7 were over 22 years old. All students had experience in using computers and navigating software. In addition, before using the e-learning system, all students attended a detailed demonstration of it and were given comprehensive user manuals as a guide for its use. Furthermore, continuous online assistance was available during the 3 weeks in which the students used the system. After this period of system usage, the questionnaire of Table 12 was presented to the students. The questionnaire included eleven closed-ended questions based on the Likert scale [44] with five possible responses ranging from “far below expectations” (1) to “far above expectations” (5).

Table 12
Questionnaire for students

	Question	Responses (1 – Far below expectations, 2 – Below expectations, 3 – Meets expectations, 4 – Above expectations, 5 – Far above expectations)
1	How much did you like the educational software?	1	2	3	4	5
2	How interesting was the educational software?	1	2	3	4	5
3	How clear was the aim of the educational software?	1	2	3	4	5
4	Does the educational software help you to enhance your knowledge on computer programming?	1	2	3	4	5
5	How tailored were the activities in tests to your knowledge level?	1	2	3	4	5
6	How tailored were the activities in tests to your learning needs?	1	2	3	4	5
7	How difficult were the activities in tests?	1	2	3	4	5
8	How easy were the activities in tests?	1	2	3	4	5
9	Did the activities in tests help you to recognize your misconceptions?	1	2	3	4	5
10	Did you meet the same activity several times in the tests?	1	2	3	4	5
11	Evaluate the effectiveness of the educational software concerning knowledge gain.	1	2	3	4	5

Furthermore, during the period of the e-learning system usage, eight experts in computer programming observed the students’ interactions with the system. The eight experts were members of the teaching staff or external instructors of the Department of Informatics of the University of Piraeus. All of them had at least five years of experience in teaching computer programming and the C programming language. Also, five of them had been engaged in educational software and e-learning research for at least ten years. A group of eight students was assigned to each of seven experts, and a group of nine students was assigned to the eighth expert. The experts were informed about the knowledge level and learning needs of the students in their groups and observed them during their use of the e-learning system. Therefore, the experts obtained good knowledge of their students’ progress and of the system’s reactions concerning the students’ assessment. After the period of system usage, the questionnaire of Table 13 was given to the experts. The questionnaire included twelve closed-ended questions based on the Likert scale [44] with five possible responses ranging from “far below expectations” (1) to “far above expectations” (5).

Table 13

Questionnaire for experts

	Question	Responses (1 – Far below expectations, 2 – Below expectations, 3 – Meets expectations, 4 – Above expectations, 5 – Far above expectations)
1	How much did you like the educational software?	1	2	3	4	5
2	How interesting was the educational software?	1	2	3	4	5
3	How clear was the aim of the educational software?	1	2	3	4	5
4	Evaluate the effectiveness of the educational software concerning knowledge gain.	1	2	3	4	5
5	How tailored were the activities in tests to the learners’ knowledge level?	1	2	3	4	5
6	How tailored were the activities in tests to the learners’ learning needs?	1	2	3	4	5
7	Do the activities in tests help the learners to recognize their misconceptions?	1	2	3	4	5
8	Is the generation of activities in tests random?	1	2	3	4	5
9	Evaluate the appropriateness of activities for each learner.	1	2	3	4	5
10	How often is the same test generated?	1	2	3	4	5
11	How often is a particular activity met by a particular learner?	1	2	3	4	5
12	Evaluate the generated tests in general.	1	2	3	4	5

5.2 Results and discussion

In this section, the evaluation results are presented and discussed. Concerning the system acceptance by the students, 44.63% liked the system “above expectations” and 33.84% liked it “far above expectations”. Furthermore, 38.48% of the participants considered the e-learning system interesting “far above expectations” and 40% considered it interesting “above expectations”. Consequently, the e-learning system for computer programming learning achieved great acceptance by the students. In addition, 40% and 46.15% of the learners found that the aim of the educational software was clear “far above” and “above” expectations, respectively. Also, 26.15% and 53.85% of the students noted that the e-learning system helped them enhance their knowledge of computer programming “far above” and “above” expectations, respectively, while 13.85% of the learners considered that the system met their expectations in this respect. As a result, the overall student satisfaction with the e-learning system was very high.

Table 14
The evaluation results concerning the students’ reactions and opinions

		Far below expectations	Below expectations	Meets expectations	Above expectations	Far above expectations	Result
Acceptance	Likeability	–	3.07%	18.46%	44.63%	33.84%	Very high
	Interest	1.53%	1.53%	18.46%	40%	38.48%	Very high
Satisfaction	Clarity of the aim	–	–	13.84%	40%	46.15%	Very high
	Knowledge enhancement	–	6.15%	13.85%	53.85%	26.15%	High
Adaptivity	Tailored activities to the knowledge level	–	3.07%	9.23%	33.85%	53.85%	Very high
	Tailored activities to the learning needs	–	4.61%	10.77%	52.31%	32.31%	Very high
	Misconception recognition	1.53%	4.61%	23.08%	47.69%	23.08%	High
Content & complexity	Difficulty	4.62%	26.15%	40%	20%	9.23%	Medium
	Easiness	3.07%	16.92%	61.54%	15.39%	3.07%	Medium
	Activity repeatability	6.15%	36.92%	44.62%	9.23%	3.07%	Low

Table 15

The evaluation results concerning the experts’ reactions and opinions

		Far below expectations	Below expectations	Meets expectations	Above expectations	Far above expectations	Result
Acceptance	Likeability	–	–	12.5%	62.5%	25%	Very high
	Interest	–	–	12.5%	50%	37.5%	Very high
Satisfaction	Clarity of the aim	–	–	–	37.5%	62.5%	Very high
	Knowledge gaining	–	–	12.5%	62.5%	25%	High
Adaptivity	Tailored activities to the knowledge level	–	–	12.5%	50%	37.5%	Very high
	Tailored activities to the learning needs	–	–	12.5%	62.5%	25%	High
	Misconception recognition	–	–	12.5%	87.5%	–	High
	Appropriateness	–	–	–	50%	50%	Very high
Content	Randomness	–	–	–	37.5%	62.5%	Very high
	Frequency of a test occurrence	62.5%	37.5%	–	–	–	Very low
	Activity repeatability	12.5%	50%	37.5%	–	–	Low

Concerning the adaptivity of the generated tests, the opinion of the students was also very positive. In particular, 53.85%, 33.85% and 9.23% of them considered that the generated tests were tailored to their knowledge level “far above expectations”, “above expectations” and “as expected”, respectively. The corresponding percentages for how well the generated tests were tailored to the students’ learning needs are 32.31%, 52.31% and 10.77%. In addition, 23.08% of the students believed that the tests helped them to recognize their misconceptions “far above expectations”, and 47.69% believed so “above expectations”. Finally, 23.08% of the students considered that the ability of the tests to help them recognize their misconceptions met their expectations. As a consequence, the students’ opinions and reactions provide a solid indication that the system-generated tests are adaptive to the students’ needs and allow them to experience a personalized assessment process.

Concerning the test content and difficulty, 40% of the students found the test activities as difficult as they expected, while 61.54% of the students found the test activities as easy as they expected. For 26.15% of the students, the difficulty of the activities was below their expectations, while for 20% it was above their expectations. Furthermore, for 16.92% of the students the activity easiness was below their expectations, while for 15.39% it was above their expectations. Therefore, the generated test difficulty can be characterized as balanced.

On the other hand, 44.62% of the students stated that they met the same activity in tests as many times as they expected, 36.92% met the same activity fewer times than they expected, 6.15% met it much fewer times than they expected, 9.23% met it more times than they expected and 3.08% met it much more times than they expected. Consequently, the generated tests were not repetitive; repetition can cause boredom to students and decrease the effectiveness of the assessment. The evaluation results concerning the students’ reactions and opinions are summarized in Table 14.
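Since 65 students answered the questionnaire, each reported percentage corresponds to a whole number of respondents. The following sketch recovers these counts for the “Likeability” row of Table 14 and derives a mean Likert score from them. The function names are hypothetical, and both the counts and the mean are arithmetic derived here from the published percentages, not figures reported by the study.

```python
from typing import List, Sequence

RESPONDENTS = 65  # number of participating students

# "Likeability" row of Table 14, from "far below" (1) to "far above" (5);
# a dash in the table is read as 0%.
LIKEABILITY_PERCENT: List[float] = [0.0, 3.07, 18.46, 44.63, 33.84]


def counts_from_percentages(percentages: Sequence[float], respondents: int) -> List[int]:
    """Round each percentage to the nearest whole number of respondents."""
    return [round(p * respondents / 100) for p in percentages]


def mean_likert(counts: Sequence[int]) -> float:
    """Average response on the 1-5 scale, weighted by respondent counts."""
    total = sum(counts)
    return sum(score * c for score, c in enumerate(counts, start=1)) / total


counts = counts_from_percentages(LIKEABILITY_PERCENT, RESPONDENTS)
average = mean_likert(counts)
```

The recovered counts are 0, 2, 12, 29 and 22, which sum to the 65 participants; a row whose counts do not add up to 65 would point to a rounding or transcription problem in the table.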

Concerning the system evaluation by the experts, the results are also very positive. In more detail, 2 of them liked the system “far above expectations”, 5 of them liked the system “above expectations”, and only 1 of them liked the system as s/he expected. Furthermore, 3 of them found the system interesting “far above expectations”, 4 of them found it interesting “above expectations”, and only 1 of them found it as interesting as s/he expected. Therefore, the system received great acceptance by the experts.

Also, 62.5% of the experts considered that the aim of the educational software was clear “far above expectations”. The other 37.5% considered that the aim of the educational software was clear “above expectations”. Concerning the system effectiveness in knowledge gaining, 62.5% of the experts evaluated it as “above expectations”, 25% evaluated it as “far above expectations” and only 12.5% evaluated it as “meets expectations”. Concerning the generated tests of the system, 62.5% of the experts evaluated them as “above expectations” and the remaining 37.5% evaluated them as “far above expectations”. Consequently, the experts were very satisfied with the e-learning system, its contribution to knowledge gaining and the system-generated tests.

Concerning the adaptivity of the generated tests, the experts’ opinion was also very positive. Particularly, 62.5% of the experts considered that the tests were “far above expectations” tailored to the learners’ knowledge level, and 37.5% considered that the tests were “far above expectations” tailored to the learners’ learning needs. On the other hand, 37.5% of the experts considered that the tests were “above expectations” tailored to the learners’ knowledge level, and 62.5% considered that the tests were “above expectations” tailored to the learners’ learning needs. Furthermore, 7 out of the 8 experts (87.5%) were convinced that the generated tests helped the learners to recognize their misconceptions “above expectations”. Therefore, they recognized that the tests were generated taking into account the students’ knowledge level and different learning needs.

Moreover, half of the experts evaluated the appropriateness of test activities for each student as “far above expectations” and the other half of the experts evaluated it as “above expectations”. Thus, the experts strongly believed that the fuzzy-based mechanism creates personalized assessment tests that contribute positively to the students’ learning process.

Concerning the tests’ content and frequency of occurrence, the experts’ evaluation results are very positive. Specifically, 62.5% of the experts believed that the generation of activities in tests is random “far above expectations”, and the same percentage considered that the frequency of occurrence of a test was “far below expectations”. Also, 37.5% of the experts believed that the generation of activities in tests is random “above expectations”, and the same percentage considered that the frequency of occurrence of a test was “below expectations”. Furthermore, the frequency with which a particular activity is included in tests for a particular learner was “below expectations” for 50% of the experts, “far below expectations” for 12.5% of the experts and met the expectations of 37.5% of the experts. As a consequence, the generated tests are different and adapted to the students’ learning profiles and needs. This makes the students’ assessment more effective. The overall evaluation results concerning the experts’ reactions and opinions are summarized in Table 15.

6. Conclusion

In this paper, a fuzzy-based mechanism was presented for automatic personalized assessment in an e-learning system for computer programming. The mechanism selects, for each individual learner, the most appropriate assessment items to be included in the created assessment test. The selection is based on the learner’s knowledge level, background knowledge, learning deficiencies and error proneness, and on activity characteristics such as difficulty level and relation to syntactical and/or logical issues. These data are described with linguistic terms through fuzzy sets. A set of 45 fuzzy rules was developed, which imitates the way a human tutor thinks when selecting assessment activities. These rules are applied over the defined fuzzy sets to calculate the degree of relevance of each assessment activity in the pool to a particular learner’s needs. In this way, the system succeeds in automatically creating adaptive tests that are tailored to each individual learner’s characteristics and needs.

The presented mechanism was used under real conditions by 65 undergraduate students of the Department of Informatics of the University of Piraeus, Greece. They evaluated the created adaptive tests through questionnaires. An additional eight experts in computer programming participated in the evaluation process. The evaluation results showed that the presented system received great acceptance by both students and experts. Furthermore, both the students and the experts were very satisfied with the created assessment tests. The students evaluated the generated test difficulty as balanced. Also, in the opinion of both groups, the generated tests do not often repeat the same activity. In addition, both groups consider that the activities in the assessment tests meet the learners’ knowledge level and learning needs. As a consequence, the automatically created assessment tests contribute to knowledge gaining and support the learning process in a very effective way.

In the future, we will perform a more thorough evaluation of the fuzzy-based mechanism, including participants of a variety of ages and from a variety of educational program backgrounds. In addition, we will apply the presented fuzzy-based mechanism for creating adaptive tests in systems that concern educational fields other than computer programming. Furthermore, we plan to include pedagogical and psychological theories in the process of selecting the most appropriate activities from the pool for the created adaptive test.

References

1. Cho, Kim. Production of Mobile English Language Teaching Application Based on Text Interface Using Deep Learning. Electronics. 2021; 10(15): 1809.
2. Sáiz-Manzanares, Marticorena-Sánchez, Ochoa-Orihuel. Using Advanced Learning Technologies with University Students: An Analysis with Machine Learning Techniques. Electronics. 2021; 10(21): 2620.
3. Alonso-Secades, López-Rivero A-J, Martín-Merino-Acera, Ruiz-García M-J, Arranz-García. Designing an Intelligent Virtual Educational System to Improve the Efficiency of Primary Education in Developing Countries. Electronics. 2022; 11(9): 1487.
4. O’Donnell, Lawless, Sharp, Wade. A review of personalised e-learning: Towards supporting learner diversity. International Journal of Distance Education Technologies. 2015; 13(1): 22-47.
5. Förster, Weiser, Maur. How feedback provided by voluntary electronic quizzes affects learning outcomes of university students in large classes. Computers & Education. 2018; 121: 100-114.
6. Hssina, Erritali. A personalized pedagogical objectives based on a genetic algorithm in an adaptive learning system. Procedia Computer Science. 2019; 151: 1152-1157.
7. Hwang, Sung, Chang, Huang. A fuzzy expert system-based adaptive learning approach to improving students’ learning performances by considering affective and cognitive factors. Computers and Education: Artificial Intelligence. 2020; 1: 100003.
8. Virvou, Alepis, Tsihrintzis, Jain. Machine learning paradigms. In: Machine Learning Paradigms. Springer, Cham; 2020. pp. 1-5.
9. AlShaikh, Hewahi. AI and machine learning techniques in the development of Intelligent Tutoring System: A review. In: 2021 International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies (3ICT). IEEE; 2021. pp. 403-410.
10. Tsihrintzis, Virvou, Hatzilygeroudis. Special Collection of Extended Selected Papers on “Novel Research Results Presented in The 12th International Conference on Information, Intelligence, Systems and Applications (IISA2021), 12–14 July 2021, Chania, Crete, Greece”. Intelligent Decision Technologies. 2021. Available from: https://easyconferences.eu/iisa2021/.
11. Chrysafiadi, Papadimitriou, Virvou. Cognitive-based adaptive scenarios in educational games using fuzzy reasoning. Knowledge-Based Systems. 2022; 250: 109111.
12. Chrysafiadi, Virvou. Student modeling approaches: A literature review for the last decade. Expert Systems with Applications. 2013; 40(11): 4715-4729.
13. Paladines, Ramírez. A systematic literature review of intelligent tutoring systems with dialogue in natural language. IEEE Access. 2020; 8: 164246-164267.
14. Mousavinasab, Zarifsanaiey, Niakan Kalhori, Rakhshan, Keikha, Ghazi Saeedi. Intelligent tutoring systems: a systematic review of characteristics, applications, and evaluation methods. Interactive Learning Environments. 2021; 29(1): 142-163.
15. Ouyang, Jiao. Artificial intelligence in education: The three paradigms. Computers and Education: Artificial Intelligence. 2021; 2: 100020.
16. Vie, Popineau, Tort, Marteau, Denos. A heuristic method for large-scale cognitive-diagnostic computerized adaptive testing. In: Proceedings of the Fourth (2017) ACM Conference on Learning @ Scale. 2017. pp. 323-326.
17. Melesko, Novickij. Computer adaptive testing using upper-confidence bound algorithm for formative assessment. Applied Sciences. 2019; 9(20).
18. Yijun, Yong, Maorong. Advances in Computerized Adaptive Testing. In: 2020 International Conference on Intelligent Computing and Human-Computer Interaction (ICHCI). IEEE; 2020. pp. 202-205.
19. Keskin, Gunay. A Survey On Computerized Adaptive Testing. In: 2021 Innovations in Intelligent Systems and Applications Conference (ASYU). IEEE; 2021. pp. 1-6.
20. Wang, Zhang. Design of an adaptive examination system based on artificial intelligence recognition model. Mechanical Systems and Signal Processing. 2020; 142: 106656.
21. Chrysafiadi, Virvou. Fuzzy logic for adaptive instruction in an e-learning environment for computer programming. IEEE Transactions on Fuzzy Systems. 2015; 23(1): 164-177.
22. Sweta, Lal. Optimized Fuzzy Rule-Based System to Measure Uncertainty in Human Decision Making System. In: Soft Computing: Theories and Applications. Singapore: Springer; 2020. pp. 799-811.
23. Iqbal, Zhao, Cheok. Estimation of Machining Sustainability Using Fuzzy Rule-Based System. Materials. 2021; 14(19): 5473.
24. Yang, Liu, Wang. An improved fuzzy rule-based system using evidential reasoning and subtractive clustering for environmental investment prediction. Fuzzy Sets and Systems. 2021; 421: 44-61.
25. Zadeh. Fuzzy logic = Computing with words. IEEE Transactions on Fuzzy Systems. 1996; 4(2): 103-111.
26. Eryılmaz, Adabashi. Development of an intelligent tutoring system using bayesian networks and fuzzy logic for a higher student academic performance. Applied Sciences. 2020; 10(19): 6638.
27. Makram, Mourad, Adnane, Karim. Adaptive tutoring system based on fuzzy logic. International Journal of Advanced Intelligence Paradigms. 2020; 16(2): 132-144.
28. Bhardwaj, Sharma. An advanced uncertainty measure using fuzzy soft sets: Application to decision-making problems. Big Data Mining and Analytics. 2021; 4(2): 94-103.
29. Shute, Rahimi. Review of computer-based assessment for learning in elementary and secondary education. Journal of Computer Assisted Learning. 2017; 33(1): 1-19.
30. Ross, Chase, Robbie, Oates, Absalom. Adaptive quizzes to increase motivation, engagement and learning outcomes in a first year accounting unit. International Journal of Educational Technology in Higher Education. 2018; 15(1): 1-14.
31. Ridwan, Wiranto, Dako RDR. Ability estimation in computerized adaptive test using Mamdani Fuzzy Inference System. In: IOP Conference Series: Materials Science and Engineering. IOP Publishing; 2020. 850(1): p. 012004.
32. Bernardi, Innamorati, Padovani, Romanelli, Saggino, Tommasi, Vittorini. On the design and development of an assessment system with adaptive capabilities. In: International Conference in Methodologies and Intelligent Systems for Technology Enhanced Learning. 2018. pp. 190-199.
33. Comas-Lopez, Hincz, Gámez, Yáñez-Mo, Sacha. Adaptive tests as a supporting tool for self-evaluation in theoretical and practical contents in Biochemistry. In: Proceedings of the Sixth International Conference on Technological Ecosystems for Enhancing Multiculturality. 2018. pp. 180-184.
34. Jatobá VMF, Farias, Freire, Ruela, Delgado. ALICAT: a customized approach to item selection process in computerized adaptive testing. Journal of the Brazilian Computer Society. 2020; 26(1): 1-13.
35. Kozmina, Lukyantsev, Musorina. Computer adaptive testing as an automated control of students’ level of preparedness taking into account their individual characteristics. In: 2020 V International Conference on Information Technologies in Engineering Education (Inforino). 2020. pp. 1-4.
36. Čisar, Pinter. Evaluation of knowledge in Object Oriented Programming course with computer adaptive tests. Computers & Education. 2016; 92: 142-160.
37. Chrysafiadi, Virvou. Create dynamically adaptive test on the fly using fuzzy logic. In: the 9th International Conference on Information, Intelligence, Systems and Applications. 2018. pp. 1-8.
38. Soltanpoor, Thevathayan, D’Souza. Adaptive remediation for novice programmers through personalized prescriptive quizzes. In: Proceedings of the 23rd Annual ACM Conference on Innovation and Technology in Computer Science Education. 2018. pp. 51-56.
39. Iancu. The usage of adaptive assessments in the computer science learning process: A case study for Java. In: Proc. IE Int. Conf. 2020, pp. 8-13.
40. Badaracco, Martínez. A fuzzy linguistic algorithm for adaptive test in Intelligent Tutoring System based on competences. Expert Systems with Applications. 2013; 40(8): 3073-3086.
41. Lendyuk, Sachenko, Rippa, Sapojnyk. Fuzzy rules for tests complexity changing for individual learning path construction. In: IEEE 8th International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS). 2015. pp. 945-948.
42. Klir, Yuan. Fuzzy sets and fuzzy logic. Vol. 4. New Jersey: Prentice Hall; 1995.
43. Chrysafiadi, Virvou. Dynamically personalized e-training in computer programming and the language C. IEEE Transactions on Education. 2013; 56(4): 385-392.
44. Schrum, Johnson, Ghuy, Gombolay. Four years in review: Statistical practices of likert scales in human-robot interaction studies. In: Companion of the 2020 ACM/IEEE International Conference on Human-Robot Interaction. 2020. pp. 43-52.
