Fuzzy linguistic descriptions for execution trace comprehension and their application in an introductory course in artificial intelligence

Abstract

Execution trace comprehension is an important topic in computer science, since it allows software engineers to gain a better understanding of system behaviour. However, traces are usually very large and hence difficult to interpret. In parallel, execution trace comprehension is a very important topic in algorithm courses, since it allows students to gain a better understanding of algorithm behaviour. Therefore, there is a need to investigate ways to help students (and teachers) find and understand the important information conveyed in a trace despite the trace being massive.

In this paper, we propose a new approach to execution trace comprehension based on fuzzy linguistic descriptions. A new methodology and a data-driven architecture based on linguistic modelling of complex phenomena are presented and explained. In particular, they are applied to automatically generate linguistic reports from execution traces generated during the execution of algorithms implemented by the students of an introductory course in artificial intelligence. To the best of our knowledge, this is the first time that linguistic modelling of complex phenomena has been applied to execution trace comprehension.

Throughout the article, it is shown how this kind of technology can be employed as a useful computer-assisted assessment tool that provides students and teachers with technical, immediate and personalised feedback about the algorithms that are being studied and implemented. At the same time, the linguistic descriptions serve two useful purposes: they are an indispensable pedagogical resource for improving comprehension of execution traces, and they play an important role in the process of measuring and evaluating the “believability” of the agents implemented.

To show and explore the possibilities of this new technology, a web platform has been designed and implemented by one of the authors, and it has been incorporated into the process of assessment of an introductory artificial intelligence course. Finally, an empirical evaluation to confirm our hypothesis was performed and a survey directed to the students was carried out to measure the quality of the learning-teaching process by using this methodology enriched with fuzzy linguistic descriptions.

Keywords

Computational intelligence; linguistic descriptions of data; linguistic modelling of complex phenomena; computer game bots; Turing test; computer-assisted assessment

1 Introduction

1.1 Computer games for motivating students

Keeping students motivated is an important goal in undergraduate education [1]. In particular, a challenge in teaching introductory artificial intelligence (AI) is to provide students with significant experiences and frequent feedback [2]. In this sense, computer games-based learning methodologies are being employed to provide students with motivating experiences.

Computer games have a long history of motivating the learning of computer science. In fact, integrating games into the computer science curriculum has been gaining acceptance in recent years, particularly when they are used to improve student engagement in introductory courses. It is important to note that computer games are not used here for teaching in the classic way (serious games); instead, students design and implement their own computer games and artificial intelligence agents.

Feedback and motivation are also indispensable components of an effective teaching and learning environment in education [3–6]. Additionally, personalisation offers possibilities to deliver feedback that is the most appropriate for the user’s expertise and cognitive abilities and that addresses their current mood and attentiveness [7]. However, providing students with personalised, immediate and motivating feedback is a complex task, and it is usually a standardised process (every student receives the same feedback, e.g., knowledge of the correct response) due to the large number of students [7]. In the teaching of writing skills, for example, truly immediate feedback is impractical [8].

Therefore, our idea is to provide students with two benefits: a learning methodology for artificial intelligence based on computer games and personalised feedback.

1.2 Challenges for incorporating games into an introductory AI subject

In 2011, a project-based learning methodology based on computer games was designed and implemented into the introductory artificial intelligence subject at the University of Bío-Bío.

The AI projects consist of developing a 2D computer game (see Fig. 1) based on the classic Pac-Man computer game with some important differences. The player navigates through a maze containing four rewards and three opponents. The goal of the game is to keep energy for as long as possible by collecting the rewards and escaping from the opponents. The three opponents roam the scene and chase the player. The player begins with 20 energy points. If the player comes into contact with an opponent, the player loses 5 energy points. If the player earns a reward, the player gains 10 extra energy points. The player also loses 1 energy point every five seconds. The game ends when all the energy has been used up.
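The energy rules above can be illustrated with a minimal sketch. The class and constant names are hypothetical and are not taken from the students' projects; the sketch also assumes that a play session is over once no energy is left.

```python
from dataclasses import dataclass

START_ENERGY = 20      # energy points at the start of a play session
OPPONENT_PENALTY = 5   # points lost on contact with an opponent
REWARD_BONUS = 10      # points gained when a reward is earned
DECAY_PERIOD_S = 5     # one point is lost every five seconds


@dataclass
class PlayerState:
    """Hypothetical bookkeeping of the player's energy and elapsed time."""
    energy: int = START_ENERGY
    elapsed_s: int = 0

    def tick(self, seconds: int = 1) -> None:
        """Advance the clock and apply the periodic one-point decay."""
        before = self.elapsed_s // DECAY_PERIOD_S
        self.elapsed_s += seconds
        after = self.elapsed_s // DECAY_PERIOD_S
        self.energy -= after - before

    def hit_opponent(self) -> None:
        self.energy -= OPPONENT_PENALTY

    def earn_reward(self) -> None:
        self.energy += REWARD_BONUS

    @property
    def alive(self) -> bool:
        # Assumption: the session ends once the energy reaches zero.
        return self.energy > 0
```

For instance, after 10 seconds, one contact with an opponent and one reward, a player holds 20 - 2 - 5 + 10 = 23 energy points.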

Fig. 1

Example of a computer game developed by a student at the University of Bío-Bío. The figure shows a complete scenario (bottom) and a 2D scenario with the entities explained in Section 3 (top).

The player and opponents are computer game bots designed and implemented by the students using the heuristic algorithms studied during the course. The student must design and implement adequate heuristic metrics to obtain believable agents. An important project requirement is that the bots must be implemented by using heuristic algorithms, while opponent bots must be programmed by using the breadth-first search algorithm.

The traditional assessment of this kind of project has been performed by visually checking whether the bots developed by the students have been correctly designed (reviewing the code developed by each student) and implemented (reviewing the code’s functionality). This process has important flaws.

It is a time-consuming task, mainly due to the excessive time required by the teacher to check the project’s functionality. This becomes a serious problem when the number of students is high and there is only one teacher, and it makes individual project-based learning difficult to apply.

It is a complex task, mainly due to the difficulty of evaluating many important details about the implementation, which are usually missed in an execution trace: quantity (memory occupied, iterations performed, data structure used, etc.) and quality (whether the artificial intelligence agent is good at capturing coins in the virtual world: is it fast, brave, intelligent?). In fact, performing a complete analysis of both aspects (quantity and quality) is difficult because the algorithms are executed in a short period of time and generate a considerable amount of data at the same time. Additionally, movements performed by the bots and human players can be very fast, and the spectator (in this case, the teacher) could miss many important details.

A possible solution is to analyse execution traces or employ code debugging so that the programs can be analysed, but, as has been explained in several works [9], even this alternative could result in a very complex task due to the large amount of generated data. For example, for an actual play session, each line of a log file is formed by almost 21 values, which are generated each second, so the total size of this file is approximately 800 lines (see Fig. 2 to understand the difficulty of interpreting this kind of file).

Fig. 2

An example of an execution trace. Each line shows the values of the variables. A description of the variables is also provided in the figure. Comprehending an execution trace can be a very difficult task.

1.3 How teachers and students can be helped by fuzzy linguistic descriptions in this process

In this scenario, fuzzy linguistic descriptions technology could play a key role in the computer game-based methodology since it would allow us to:

Interpret large amounts of numerical data quickly and accurately.

Provide students with immediate, personalised and accountable feedback.

Obtain a better understanding of the content. This implies a greater motivation for addressing the subject and a greater cohesion and effectiveness regarding the information that is conveyed to the students in the early stages of their learning.

Obtain a better understanding of the execution traces. This implies having a useful complement for the classic debuggers, which sometimes require expert knowledge.

In the literature, some works have been proposed to provide learners and/or teachers with resources based on fuzzy linguistic descriptions from data generated in the teaching-learning process. In the non-fuzzy area, works can be grouped into five main categories: statistical, natural language processing (NLP), information extraction (IE), clustering and integrated-approaches [10]. Several examples of successful applications can be found in the literature: automatic creation of summary assessments for intelligent tutoring systems [11], automatic generation of formative feedback in the university classroom for specific concept maps scaffold students’ writing [12], a framework to provide students with feedback on algebra homework in middle-school classrooms [13], automatic test-based assessment of programming [14], automatic assessment of free text answers using a modified BLEU algorithm [10], and feedback for serious computer games to provide learners with useful and immediate information about the player’s performance [15].

In the fuzzy logic area [16], the following works should be mentioned: linguistic summaries of graph datasets using ontologies [17], automatic textual reporting in learning analytics dashboards [18], feedback reports for students based on several performance factors [19], and reports describing the learner’s rating in a specific learning activity [20]. In [21] linguistic descriptions were used for improving the player experience in a computer game called YADY (your actions define you). There are remarkable differences with respect to the present work. In [21], the feedback aimed to improve the player experience; the current work focuses on providing users (students) with written, immediate and interpretable feedback that aims to support the teaching-learning process.

In this paper, a methodology and a data-driven software architecture are proposed to automatically generate personalised and technical feedback from the data generated during the execution of the algorithms implemented by the students. A combination of three computational techniques is proposed: bot behaviour analysis, computational perception networks and natural language generation based on templates. The idea is that each student receives immediate, technical and personalised feedback about the mistakes made during the development of the project, and learns how heuristic algorithms can be employed for programming computer game bots. Additionally, another important challenge for us is to assess the “believability” of the AI agents implemented by the students; hence, a similarity measure between linguistic descriptions for evaluating and comparing the behaviour of agents and human expert players is also proposed (see restricted equivalence functions in Section 2).

Additionally, this approach is very beneficial to the teachers since it allows them to:

Save time for evaluating other aspects of the projects, which implies a better understanding of the projects.

Enhance the traditional process of assessment providing students with personalised and technical feedback.

Support individual project-based learning to obtain a closer tracking of the projects and the opportunity to focus on the weak skills of the students and strengthen those skills.

It is important to note that our approach aims to support the traditional process of project evaluation and not to replace it. Additionally, our approach can be seen as a useful tool for improving the decision-making process in the classroom. As is mentioned in [22], “decision-making performs a vital role in our daily life”, and teachers and students are continuously confronted with challenging situations that demand decision making [23]. As a consequence, fuzzy data-driven decision-making approaches should be taken into account [24–26].

To show and explore the possibilities of this new technology, a web platform has been designed and implemented by one of the authors following the phases and steps established in the methodology detailed in Section 3. Finally, our framework is evaluated by using a survey directed to students and by comparing the results of the application of the proposed approach with the human expert assessment.

The structure of the paper is as follows. Section 2 introduces several general concepts regarding project-based learning in artificial intelligence and provides a very brief review of the state of the art of the different disciplines involved. Then, in Section 3, a methodology for incorporating linguistic descriptions of data into the AI projects is proposed. Section 4 details the software architecture for providing teachers and students with personalised and technical feedback. Afterwards, Section 6 explains the experimentation and evaluation carried out on the students’ projects by employing an adaptation of the Turing test. Finally, Section 7 provides future work and some concluding remarks.

2 Preliminary concepts

2.1 Automatic generation of reports in natural language and fuzzy linguistic descriptions technology

The automatic generation of reports in natural language is a sub-field of AI, which allows us to produce natural language (and/or graphs) as output on the basis of data input. There are two main methods (which are compatible with each other) for generating reports in natural language from datasets: natural language generation (NLG) and linguistic description discipline (LD).

The NLG field is focused on converting any kind of data into informative texts. NLG models and techniques have been applied for textual reporting in various domains, such as meteorological data [27, 28], care data [29], project management [30], and air quality [31]. An important and extensive survey of the state of the art in this discipline can be found in [32].

The LD discipline [33] combines several sub-areas and as is mentioned in [34]: “it is a young field, to achieve a general approach capable of building different types of linguistic descriptions for any kind of application domain is still an open challenge, although some steps have been made in this direction”.

In this sense, we can mention here the linguistic description of data (LDD) approach, which has been applied in several practical cases where data is the main input. LDD is a research area with similar objectives to NLG. However, as is mentioned in [35], “in the fuzzy logic and soft computing field, this task is performed by employing fuzzy logic machinery (mainly linguistic variables [36], inference fuzzy rules and fuzzy quantifiers)”. More formally, following [35], LDD is defined as the task of extracting knowledge in natural language sentences and combining it with useful and explainable graphs from some input data by producing an abstraction composed of linguistic terms and inference fuzzy rules. The LDD models and techniques have been applied for textual reporting in various domains, such as air quality index textual forecasts [37] and weather forecasts [38].

Additionally, the linguistic description of complex phenomena (LDCP) [39] paradigm aims to extract and represent knowledge by using natural language sentences as if they were produced by a human expert, describing the most relevant aspects of a phenomenon for certain users in specific contexts. The LDCP technique has been used in domains such as deforestation analysis [40], big data [41], advice for saving energy at home [42], self-tracking physical activity [43], cosmology [44, 45], and driving simulation environments [46]. The construction of sentences in LDCP is a process influenced by the computational perception concept. The algorithms employed in LDCP approaches generate all possible sentence combinations to create candidate descriptions from data, which have previously been transformed into variables and linguistic terms. Then, linguistic terms are summarised, and template descriptions are defined to automatically generate linguistic descriptions, which are put together with explainable graphs to provide users with more complete textual and visual information.

LDCP is based on the Computational Theory of Perceptions [47], which is grounded in the fact that human cognition relies on perceptions and on their remarkable capability to granulate information in order to perform physical and mental tasks without any traditional measurements and computation. The central concept of LDCP is the Computational Perception (CP). A CP is a pair (A, W) described as follows:

A = (u₁, …, u_n) is a vector of linguistic expressions (words or sentences in natural language) that represents the whole linguistic domain of the CP.

W = (w₁, …, w_n) is a vector of the validity degrees w_i ∈ [0, 1] of each u_i. Each w_i represents the suitability of u_i to describe the current perception of a specific aspect of the monitored phenomenon.

For example, suppose the following values:

u₁= “The current situation is dangerous”, w₁ = 0.8

u₂= “The current situation is risky”, w₂ = 0.2

u₃= “The current situation is easy”, w₃ = 0.0

u₄= “The current situation is safe”, w₄ = 0.0

We use Perception Mappings (PMs) to aggregate CPs. We distinguish two kinds of PMs, namely, First Order PMs (1PMs) and Second Order PMs (2PMs). We define a 1PM as a triple (Z, y, g), where Z is a special type of CP with a numerical value z, y is a 1CP, and g is a function such that W = g (z). The function g (z) can be implemented by using membership functions w_i = μ_i (z) associated with each component u_i of A, and therefore W = (μ₁ (z), μ₂ (z), …, μ_n (z)), where n is the number of elements in A. A 2PM is a tuple (U, y, g), where U is a vector of input CPs, y is the output CP and g is an aggregation function implemented by using a set of fuzzy rules (see several examples of PMs later on).
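As an illustration of the definitions above, the following minimal sketch builds a CP as a pair (A, W) and a 1PM whose function g is implemented with trapezoidal membership functions. The linguistic variable (player energy), its three expressions and all breakpoints are hypothetical choices made only for this example.

```python
from dataclasses import dataclass
from math import inf


@dataclass(frozen=True)
class CP:
    """Computational perception: linguistic expressions A and validity degrees W."""
    A: tuple
    W: tuple

    def best(self):
        """Return the expression with the highest validity degree."""
        return self.A[max(range(len(self.W)), key=self.W.__getitem__)]


def trapezoid(a, b, c, d):
    """Membership function: rises on (a, b), equals 1 on [b, c], falls on (c, d)."""
    def mu(z):
        if b <= z <= c:
            return 1.0
        if z <= a or z >= d:
            return 0.0
        if z < b:
            return (z - a) / (b - a)
        return (d - z) / (d - c)
    return mu


# Hypothetical linguistic variable for the player's energy (not from the paper).
ENERGY_TERMS = ("The energy is low", "The energy is medium", "The energy is high")
ENERGY_MUS = (
    trapezoid(0, 0, 5, 10),
    trapezoid(5, 10, 15, 20),
    trapezoid(15, 20, inf, inf),
)


def first_order_pm(z, terms=ENERGY_TERMS, mus=ENERGY_MUS):
    """1PM: maps the numerical value z to W = (mu_1(z), ..., mu_n(z))."""
    return CP(A=tuple(terms), W=tuple(mu(z) for mu in mus))


energy_cp = first_order_pm(8)   # W is approximately (0.4, 0.6, 0.0)
```

With these assumed breakpoints, an energy of 8 is described mostly as “medium” and partly as “low”, which is the kind of graded description a CP is meant to carry.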

2.2 Similarity measures and restricted equivalence functions

Measuring and evaluating the believability of the agents acting in a computer game is an important challenge in AI [48 –50]. A method based on the similarity between fuzzy linguistic descriptions is proposed here. The main idea consists in establishing a similarity measure between linguistic reports by computing a similarity degree between its components.

An essential component of fuzzy linguistic descriptions is the well-known linguistic vector, which is implemented by means of linguistic terms. At this level, similarity measures can be employed in order to measure the similarity between the corresponding components. We have selected the so-called restricted equivalence functions (REFs) because we showed in [51] that they work very well in practice. Of course, other similarity measures could be used to compare fuzzy linguistic descriptions (for example [52]).

Finally, bots and human players can be compared by means of their respective automatically generated reports, which provides us with a measure of believability. This will be studied in detail in Section 5.

A REF [53] is a function that establishes similarity between the elements of a domain. A REF can be formally defined as follows:

Definition 2.1. A REF, f, is a mapping [0, 1]² ⟶ [0, 1] that satisfies the following conditions:

(1) f (x, y) = f (y, x) for all x, y ∈ [0, 1]

(2) f (x, y) = 1 if and only if x = y

(3) f (x, y) = 0 if and only if (x = 1 and y = 0) or (x = 0 and y = 1)

(4) f (x, y) = f (c (x), c (y)) for all x, y ∈ [0, 1], c being a strong negation.

(5) For all x, y, z ∈ [0, 1], if x ≤ y ≤ z, then f (x, y) ≥ f (x, z) and f (y, z) ≥ f (x, z)

For example, g (x, y) =1 - |x - y| satisfies conditions (1)-(5) with c (x) =1 - x for all x ∈ [0, 1].
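The five conditions of Definition 2.1 can be checked numerically for this example. The sketch below is only a sanity check on a finite grid of [0, 1], not a proof; the function names and the grid resolution are arbitrary choices for the illustration.

```python
import itertools


def ref(x, y):
    """Example REF from the text: 1 - |x - y|."""
    return 1.0 - abs(x - y)


def negation(x):
    """Standard strong negation c(x) = 1 - x."""
    return 1.0 - x


def check_ref(f, c, steps=10, tol=1e-9):
    """Check conditions (1)-(5) of Definition 2.1 on a regular grid of [0, 1]."""
    grid = [i / steps for i in range(steps + 1)]
    for x, y in itertools.product(grid, repeat=2):
        # (1) symmetry
        assert abs(f(x, y) - f(y, x)) <= tol
        # (2) value 1 exactly when both arguments coincide
        assert (abs(f(x, y) - 1.0) <= tol) == (x == y)
        # (3) value 0 exactly for the pairs (0, 1) and (1, 0)
        assert (abs(f(x, y)) <= tol) == ({x, y} == {0.0, 1.0})
        # (4) invariance under the strong negation c
        assert abs(f(x, y) - f(c(x), c(y))) <= tol
    for x, y, z in itertools.product(grid, repeat=3):
        if x <= y <= z:
            # (5) monotonicity with respect to nested arguments
            assert f(x, y) >= f(x, z) - tol
            assert f(y, z) >= f(x, z) - tol
    return True
```

Calling `check_ref(ref, negation)` returns `True`, while a function such as `lambda x, y: x * y` fails the check because it violates condition (2).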

3 Methodology for incorporating linguistic descriptions of data in AI projects

As mentioned in the introduction, the assessment of computer game-based learning projects is a very complex task. Here, a methodology for supporting this process by using fuzzy linguistic descriptions is proposed in detail.

This methodology is based on providing teachers with a complete tutorial reconstruction and a guide for designing and implementing a data-to-text system capable of automatically generating linguistic reports from execution traces obtained by capturing highlighted variables involved in the computer game. This system can be easily incorporated into the learning-teaching process which employs computer game-based learning for support. As a result, a computer-assisted assessment tool based on fuzzy linguistic descriptions technology is obtained. The proposed methodology is formed by three phases: bot behaviour analysis, linguistic descriptions of data and evaluation.

3.1 Phase 1. Bots behaviour analysis

In this phase, a set of actions performed by the agents is analysed to establish a set of behaviour patterns. The first step is the selection of entities, attributes and their interactions to determine which variables must be considered for generating some kind of behaviour. In our case, four entities have been identified: agent, opponents, rewards and obstacles. An agent is defined by using four attributes: position X and Y in the scenario, energy E and time T employed to capture each reward. The rest of the entities (opponents, rewards and obstacles) are defined by using two attributes (position x and y).

The second step is the definition of the metrics from the entities, attributes and interactions selected in the previous step. The metrics provide the designer with useful information on the nature of the problem and the kind of behaviour that could be inferred. Two kinds of metrics are considered here: quantitative and qualitative. Quantitative provides us with information about the performance of the algorithms (memory occupied and iterations performed), and qualitative provides us with information about the behaviour of the heuristic algorithms in terms of “believability” (protection, distance, energy, time, reward). In particular, a set of metrics for analysing the behaviour of the heuristic algorithms are defined as follows:

Protection. Number of obstacles between the agent and the opponent_i, counted within the rectangular area defined by the positions of the agent and the opponent_i.

Distance. Distance between two entities E₁ and E₂.

Energy. Energy of the player at an instant in time during the play session.

Time. Time registered from the start to the end of the play session.

Reward. True or false if a reward was captured at this instant in time.

Iterations. Number of iterations performed for the execution of the heuristic algorithm (it is executed in each move).

Memory. Amount of memory required for the execution of the heuristic algorithm (it is executed in each move).
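Two of these metrics can be sketched in a few lines. The example assumes that positions are integer grid cells given as (x, y) tuples, that distance is Euclidean, and that protection counts the obstacle cells lying inside the axis-aligned rectangle spanned by the two positions; this is one possible reading of the definition above, and the function names are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two entities located at cells a and b."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def protection(agent, opponent, obstacles):
    """Count obstacle cells inside the rectangle spanned by agent and opponent.

    Assumption: the rectangle is axis-aligned and includes its border.
    """
    x_lo, x_hi = sorted((agent[0], opponent[0]))
    y_lo, y_hi = sorted((agent[1], opponent[1]))
    return sum(
        1
        for (x, y) in obstacles
        if x_lo <= x <= x_hi and y_lo <= y <= y_hi
    )


# Made-up scenario: two of the four obstacles lie between agent and opponent.
walls = {(2, 2), (3, 4), (6, 6), (0, 3)}
shield = protection((1, 1), (4, 5), walls)
```

In this made-up scenario `shield` equals 2, since only the obstacles at (2, 2) and (3, 4) fall inside the rectangle between (1, 1) and (4, 5).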

The third step is the definition of a computational procedure to capture numeric data. In our case, execution traces have been employed as the computational procedure for capturing and storing data. Trace recording, or tracing, is a commonly used technique in debugging and performance analysis. Specifically, trace recording implies the detection and storage of relevant events during run-time for later off-line analysis. We use a trace recording that stores the metrics defined in the previous step. The result is stored in a text file containing values for each metric defined in Fig. 2.

The fourth step is the determination of behaviour patterns, where a set of basic behaviour patterns can be established on the input data captured by using knowledge representation techniques. Patterns are determined by an expert. A behaviour pattern is associated with a set of actions; that is, when a set of actions ⟨A₁, A₂, …, A_n⟩ is produced, then a set of effects ⟨E₁, E₂, …, E_m⟩ is performed.

For example, movements performed by the player could be conditioned by the movements performed by the opponents, e.g., if an opponent is close to the player, then the player will move far away from the opponent, so the player and the opponent are related, which provides us with an interesting behaviour pattern. Here, the action is “to move close to the opponent” and the effect would be “to move far away from the opponent”. Note that several actions and effects could occur at the same instant of time; for example, an action could be given by “to move close to the reward” and the effect could be “to move closer to the reward”. The idea of a behaviour pattern will be formalised by using computational perceptions.

Note finally that patterns are related to the metrics defined in the previous step, and the metrics should be defined from them. For example, Attitude provides us with information about how the agent acts with respect to the reward; hence, the distance between the opponent and the reward must be evaluated. On the other hand, Situation provides us with information about how the agent acts with respect to the opponent; in this case, the energy and the protection must be evaluated. Additionally, Kind of move provides us with information about the result of a movement; here, the distances between the agent and the reward and between the agent and the opponents must be studied. Finally, Performance describes the cost of each movement; here, time, memory and iterations must be taken into account.

The result of this phase is a set of behaviour patterns derived from actions performed by entities in the virtual world. These behaviour patterns are designed and implemented by using a computational perception network (see Fig. 4).

3.2 Phase 2. Linguistic descriptions of data in a 2D virtual world for automatically generating behaviour profiles

The aim of this phase is to establish a cognitive computational model from previously identified behaviour patterns. Knowledge representation techniques can be employed here to generate linguistic descriptions using the following steps:

Selection of behaviour patterns to be studied. The behaviour patterns defined in the previous module are analysed and selected according to our particular interest in them. The idea is to determine which behaviour patterns are truly important to create the “behaviour profile” from the data selected in the previous steps. For example, a particular sequence of movements is not relevant for us, but the reason for which these movements were performed is truly important because it provides us with interesting cause-effect information, which allows us to employ if-then rules for modelling.

Modelling the selected behaviour patterns. The behaviour patterns selected previously can now be represented by using a computational cognitive model. Taxonomies, ontologies, linguistic terms, if-then rules or a combination of these can be used in this step. All these techniques are very useful because no details about the implementation must be given. For example, each rule could have the form A ← B₁, …, B_n where A is the consequent and B₁, …, B_n the antecedent. The situation of the player with respect to the opponent depends on the protection of the player, that is, the number of walls between the player and the opponent and the distance between both and the energy of the player. The energy can have a negative effect in a particular situation, e.g., risky and dangerous situations could be given when the energy is low. $Situation \leftarrow Protection, Distance, Energy$

Implementation of the computational cognitive model. The aim of this task is to implement the computational cognitive model solution for representing the behaviour model of our problem. Details about the implementation should be given. For example, two alternatives can be employed. On the one hand, an “ad hoc” implementation could be employed by using a classic programming language. Alternatively, a special package for automatically generating linguistic descriptions could be used, for example, employing the programming language R [54]. The choice will depend on the features of the application. In our case, a web platform was developed by using the PHP programming language, and hence, the first option was a good alternative for us.

Linguistic Descriptions Generator. A linguistic description generator based on LDCP is designed and implemented, providing us with textual messages from the execution of the algorithms. In this paradigm, linguistic descriptions are obtained from the source data by employing a combination of several fuzzy techniques, namely, linguistic variables, if-then rules and fuzzy quantifiers. In Section 4, a data-driven software architecture based on LDCP is explained in detail.

This phase provides us with a complete specification of the behaviour profile, which is a graphical and textual report describing the most relevant information about the behaviour of the computer game bots acting in the computer game.
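The rule Situation ← Protection, Distance, Energy used as an example in the modelling step can be sketched as a small fuzzy rule base. The rules, the linguistic terms of the three inputs, and the choice of minimum for conjunction and maximum for aggregation are assumptions made only for this illustration; the output expressions are the four situations used in the example of Section 2.1.

```python
# Output expressions of the Situation perception (from the example in Section 2.1).
LABELS = ("dangerous", "risky", "easy", "safe")

# Hypothetical rule base: (consequent, {input variable: linguistic term}).
RULES = [
    ("dangerous", {"protection": "low", "distance": "close", "energy": "low"}),
    ("risky", {"protection": "low", "distance": "close", "energy": "high"}),
    ("risky", {"protection": "high", "distance": "close", "energy": "low"}),
    ("easy", {"protection": "high", "distance": "close", "energy": "high"}),
    ("safe", {"distance": "far"}),
]


def second_order_pm(inputs, rules=RULES, labels=LABELS):
    """Aggregate input validity degrees into the validity degrees of Situation.

    Assumption: min models the fuzzy AND of the antecedent, and max combines
    the rules that share the same consequent.
    """
    degrees = dict.fromkeys(labels, 0.0)
    for label, antecedent in rules:
        firing = min(inputs[var][term] for var, term in antecedent.items())
        degrees[label] = max(degrees[label], firing)
    return degrees


# Made-up input perceptions for one instant of a play session.
situation = second_order_pm({
    "protection": {"low": 1.0, "high": 0.0},
    "distance": {"close": 1.0, "far": 0.0},
    "energy": {"low": 0.8, "high": 0.2},
})
```

For these made-up inputs the result is dangerous 0.8, risky 0.2, easy 0.0 and safe 0.0, so the sentence selected for the report would describe the situation as dangerous.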

3.3 Evaluation: A Turing test for computer game bots based on LDD and REFs

There is an important difference between the algorithms employed for programming bots and classic algorithms such as those that compute the optimal way to go from one place to another. While classic algorithms aim to simulate near-optimal intelligent behaviour, the gaming bot algorithms aim to provide us with interesting and fun opponents for human players, not optimal opponents [55].

A variation of the Turing test was proposed and designed in [56] to test the abilities of the computer game bots to impersonate human players. The idea is as follows: “Suppose we are playing an interactive video game with some entity. Could you tell, solely from the conduct of the game, whether the other entity was a human player or a bot? If not, then the bot is deemed to have passed the test”.

This kind of Turing test is redefined and adapted for our methodology by using LDCP and REFs. Our idea is to establish a formal and effective method of measuring the believability of the agents acting in the computer game. A comparison between the automatically generated profiles (bots and players) is performed by establishing a similarity measure between them.

For us, a heuristic algorithm is near-behavioural (that is, an “optimal” algorithm for our purposes) when the bots’ profiles are similar to human profiles (defined by the teacher as an evaluation pattern, i.e., requirements that must be met by the bots). This novel process of assessment for heuristic algorithms in computer games will be detailed in Section 5.
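The comparison between a bot profile and a human profile can be sketched as follows. The sketch assumes that a profile is a mapping from perception names to vectors of validity degrees, that components are compared with the REF 1 - |x - y| from Section 2.2, and that similarities are aggregated with an arithmetic mean; the aggregation and all names are assumptions for illustration, not the procedure of the paper.

```python
def ref(x, y):
    """REF from Section 2.2: 1 - |x - y|."""
    return 1.0 - abs(x - y)


def cp_similarity(w_bot, w_human, f=ref):
    """Mean REF value over the paired validity degrees of one perception."""
    if len(w_bot) != len(w_human):
        raise ValueError("validity vectors must have the same length")
    return sum(f(a, b) for a, b in zip(w_bot, w_human)) / len(w_bot)


def profile_similarity(bot, human):
    """Mean similarity over the perceptions present in both profiles."""
    shared = bot.keys() & human.keys()
    if not shared:
        raise ValueError("profiles share no perception")
    return sum(cp_similarity(bot[k], human[k]) for k in shared) / len(shared)


# Made-up profiles: validity degrees for two perceptions.
bot_profile = {"situation": (0.8, 0.2, 0.0, 0.0), "attitude": (1.0, 0.0)}
human_profile = {"situation": (0.6, 0.4, 0.0, 0.0), "attitude": (1.0, 0.0)}
believability = profile_similarity(bot_profile, human_profile)
```

A value close to 1 indicates that the bot's report resembles the human one; how such a value is thresholded to decide whether the test is passed is left open here.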

4 A data-driven software architecture based on linguistic modelling of complex phenomena

A data-driven software architecture based on LDCP is proposed here (see Fig. 3). This architecture aims to implement the phases and steps described in the previously detailed methodology. It is formed by four modules: tracing, computational perception network, behaviour profile report generation and evaluation. They are explained in detail in the following.

Fig. 3

A data-driven software architecture for trace comprehension based on LDCP.

4.1 Tracing module

The tracing module implements the functionality explained in phase 1, and it is incorporated into an event handler layer of the computer game architecture. When a movement is performed by the player or the opponents, an event is launched and a function is called that creates a new execution trace row. Each row is formed by the values of the metrics described in Section 3.1, namely: player position; opponent position (for each opponent); player energy; protection with respect to the opponent (for each opponent); distance between the player and the closest reward; distance between the opponent and the closest reward (for each opponent); time elapsed; whether the reward was captured in this movement (true or false); number of iterations; memory used; and which entity performed the movement (character ’J’ for the player, character ’A’ for the opponent).

At the end of each play session, these values are written to the output trace file. Note that the output of this module is a file containing the data captured during the execution of the algorithms implemented in the project (see Fig. 2).
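The event-driven recording described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the class name `TraceRecorder`, the column names in `FIELDS` and the CSV layout are hypothetical, since the text does not fix a file schema, and only a subset of the metrics is included to keep the sketch short.

```python
import csv
import io

# Hypothetical column names: a reduced subset of the metrics listed above.
FIELDS = ["player_x", "player_y", "energy", "time_elapsed",
          "reward_captured", "iterations", "memory", "mover"]


class TraceRecorder:
    """Collects one row per movement event and writes all rows at session end."""

    def __init__(self):
        self.rows = []

    def on_move(self, **metrics):
        # Assumed to be called by the game's event handler after each movement.
        missing = set(FIELDS) - set(metrics)
        if missing:
            raise ValueError(f"missing metrics: {sorted(missing)}")
        self.rows.append({name: metrics[name] for name in FIELDS})

    def dump(self, stream):
        # Write the header and every recorded row to a text stream.
        writer = csv.DictWriter(stream, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(self.rows)


recorder = TraceRecorder()
recorder.on_move(player_x=1, player_y=13, energy=17, time_elapsed=15995,
                 reward_captured=False, iterations=42, memory=924, mover="J")

buffer = io.StringIO()  # stands in for the output trace file
recorder.dump(buffer)
trace_text = buffer.getvalue()
```

Keeping the rows in memory and writing them once at the end mirrors the behaviour described in the text, where the values are written to the trace file when the play session finishes.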

4.2 Computational perception network module

A computational perception network is used to implement the functionality explained in phase 2. The computational perception network presented in [21] is extended here: additional variables must be considered, so the network must be enhanced, and the rules and templates must be updated to meet these new requirements.

Please note that the design of computational perceptions depends on the designer's criteria. Here, after intensive experimentation, we decided to keep the linguistic model simple, functional and easy to understand. A computational perception network is formed by two kinds of perceptions: first-order computational perceptions (called 1CP), which are the metrics perceived by the agents (distance, protection, energy, time, iterations, memory), and second-order computational perceptions (called 2CP), which are the mental objects built by using metrics in the world [39].

4.2.1 First-order computational perceptions

The construction of a computational perception network starts with the selection of the metrics and the creation of a set of linguistic variables from them. An important requirement here is that the problem domain must be known by the designer. We use 1CP to process the input data. Here the inputs are numerical data obtained from the variables and metrics defined in previous sections.

CP Distance. CP_Distance is the perception by which an agent perceives whether he/she is close to or far away from something. The distance depends on the number of cells between the different actors (see Fig. 4). Given two points A (x, y) and B (x′, y′), the distance between them is calculated by means of the following equation: $\sqrt{(x - x')^{2} + (y - y')^{2}}$. We define three different CPs of Distance: the distance between the player and the opponent (Dpa*); the distance between the player and the closest reward with respect to him/her (Dpr*); and the distance between the opponent and the closest reward with respect to the player (Da*r*). Since these parameters are real numbers, we define this 1CP as follows:

Fig. 4

Computational perception network: CP means Instant CP; ∑CP means play session CP.

Z = [0, N], being N the maximum distance in the game world.

A = (close, normal, far)

g: this function is built using three linguistic labels that are represented with trapezoidal membership functions close (0, 0, 4, 7) , normal (6, 9, 11, 14), far (13, 16, N, N)
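The definition of CP_Distance above can be illustrated with a short sketch. The trapezoid parameters are those given in the text; the value chosen for N, the helper names `trapezoid`, `euclidean` and `fuzzify`, and the sample positions are assumptions made only for illustration.

```python
import math


def trapezoid(x, a, b, c, d):
    """Membership degree of x in the trapezoidal fuzzy set (a, b, c, d)."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)  # rising edge
    return (d - x) / (d - c)      # falling edge


N = 50.0  # assumed maximum distance in the game world (not stated in the text)

DISTANCE_LABELS = {
    "close": (0, 0, 4, 7),
    "normal": (6, 9, 11, 14),
    "far": (13, 16, N, N),
}


def euclidean(p, q):
    """Euclidean distance between two grid positions (x, y)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def fuzzify(x, labels):
    """Map a crisp value to a degree for every linguistic label."""
    return {name: trapezoid(x, *params) for name, params in labels.items()}


# Sample positions for a player and an opponent.
d = euclidean((1, 13), (3, 16))
degrees = fuzzify(d, DISTANCE_LABELS)
```

With these parameters a distance of about 3.6 cells is fully "close", while a distance of 15.26 is "far" with degree (15.26 - 13) / (16 - 13), roughly 0.75.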

CP Protection. The protection degree is the number of walls between the player and the opponent (see Fig. 4). This is also a metric whose value is represented by a real number. Given two points A (x, y) and B (x′, y′), the positions of the player and the opponent, respectively, the number of obstacles between them can be calculated in two steps: i) calculating the rectangle formed by both points; ii) counting the number of obstacles in that rectangle. Since this parameter is a real number, we define this 1CP as follows:

Z = [0, N], being N the total number of walls in the game world.

A = (low, intermediate, high)

g: this function is built using three linguistic labels that are represented with trapezoidal membership functions low (0, 0, 0, 2), intermediate (1, 3, 3, 5) and high (4, 6, 380, 380)
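The two-step computation of the protection degree (build the rectangle, then count the obstacles inside it) could be sketched as below. Representing the walls as a set of (x, y) cells and treating the rectangle borders as inclusive are assumptions, as is the function name `protection`.

```python
def protection(player, opponent, walls):
    """Count wall cells inside the axis-aligned rectangle spanned by two positions.

    `walls` is assumed to be an iterable of (x, y) cells; borders are inclusive.
    """
    (x1, y1), (x2, y2) = player, opponent
    lo_x, hi_x = sorted((x1, x2))
    lo_y, hi_y = sorted((y1, y2))
    return sum(
        1
        for (wx, wy) in walls
        if lo_x <= wx <= hi_x and lo_y <= wy <= hi_y
    )


# Hypothetical wall layout: two walls lie between the actors, one lies elsewhere.
walls = {(2, 14), (2, 15), (5, 5)}
count = protection((1, 13), (3, 16), walls)
```

The resulting count is the crisp input that the trapezoidal labels of CP_Protection then turn into linguistic terms.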

CP Energy. The energy indicates the current energy of the player. This is a metric whose value is represented by a real number.

Z = [0, N], being N the total of energy of the player (100 in our case)

A = (low, intermediate, high)

g: this function is built using three linguistic labels that are represented with trapezoidal membership functions low (0, 0, 3, 6), intermediate (4, 7, 9, 12) and high (10, 13, 100, 100)

CP Iterations. This metric indicates the current number of iterations performed by the algorithm used for implementing the artificial intelligence of the agents. This is a metric whose value is represented by a real number.

Z = [0, N], being N the total number of iterations of the agent

A = (little, normal, large)

g: this function is built using three linguistic labels that are represented with trapezoidal membership functions little (0, 0, 18, 30), normal (18, 30, 42, 54) and large (42, 54, 104857600, 104857600)

CP Memory. The memory indicates the current memory required by the algorithm employed to implement the artificial intelligence of the agent. This is a metric whose value is represented by a real number.

Z = [0, N], being N the total of memory for the agent (104857600 in our case)

A = (low, normal, high)

g: this function is built using three linguistic labels that are represented with trapezoidal membership functions low (0, 0, 768, 1280), normal (768, 1280, 1792, 2304) and high (1792, 2304, 104857600, 104857600)

CP of Time. This CP measures the time spent capturing the rewards distributed in the game world (see Fig. 4). Given two points A (x, y) and B (x′, y′), the time is calculated by measuring the time needed to go from A to B. For example, suppose that the player is at position (0, 0) and the rewards are at positions (3, 4) and (8, 8). CP_Time measures the time that the user needed to go from (0, 0) to (3, 4) and then from (3, 4) to (8, 8). Since the parameter is a real number, we define this 1CP as follows:

Z = [0, N], being N a maximum fixed in seconds for capturing a reward.

A = (short, large)

g: this function is built using two linguistic labels that are represented with triangular membership functions short (0, 0, 7) , large (5, N, N)

Finally, a set of if-then rules is defined by aggregating these linguistic terms, which have an associated fuzzy set (see Example 1).

4.2.2 Second-order computational perceptions

CP situation. CP_Situation is the perception by which a player perceives whether he/she is safe with respect to the opponent. In this context, a player is safe when he/she is far away from opponents or protected by the walls. Additionally, easy, dangerous or risky situations are also considered, which depend on three factors: the protection (low, normal, high) with respect to the opponent ( ${CP}_{Protection}^{player, opponent}$ ), the distance (close, normal, far) to the opponent ( ${CP}_{Distance}^{player, opponent}$ ) and the energy (low, normal, high) that the bot has at this moment ( ${CP}_{Energy}^{player}$ ). The corresponding values for this CP are computed by using the following rules: { Risky ← Intermediate, Close, Normal; Dangerous ← Low, Close, Normal; Safe ← Intermediate, Normal, Normal; Easy ← Low, Normal, Normal; Dangerous ← Low, Normal, Low; Dangerous ← Normal, Close, Low; Dangerous ← Normal, Normal, Low; }

CP attitude. The CP_Attitude refers to the perception that a player has about the actions. Four attitudes can be defined for a computer game bot: Wise, Brave, Cautious, Passive. This depends on two factors: the distance between the bot and the closest reward (R*) ( ${CP}_{Distance}^{Opponent, R *}$ ), and the distance between the opponent and the closest reward ( ${CP}_{Distance}^{player, R *}$ ). The corresponding values for this CP are computed by using the following rules: { Wise ← Close, Normal; Brave ← Close, Close; Cautious ← Normal, Close; Passive ← Normal, Normal; }

CP movement. CP_Movement is the perception by which a player perceives the kind of movement performed at a particular moment in the play session. Four types of movements can be identified: Good, Bad, Scare, Kamikaze. These movements depend on three factors: the distance between the player and the closest reward ( ${CP}_{Distance}^{player, R *}$ ), the distance between the bot and the opponent ( ${CP}_{Distance}^{player, opponent}$ ), and the energy of the player ( ${CP}_{Energy}^{player}$ ). The corresponding values for this CP are computed by using the following rules: { Good ← Close, Normal, Normal; Good ← Close, Close, Low; Scare ← Normal, Normal, Normal; Kamikaze ← Close, Close, Normal; Bad ← Normal, Close, Normal; }

CP resources. The computational perception for the resources (CP_Resources) is the time and space required for correct execution of the heuristic algorithm involved. The use of resources can be: very efficient, efficient, inefficient and very inefficient. This depends on the time required (little, normal, large) and the use of memory (low, normal, high). The corresponding values for this CP are computed by using the following rules: { very_efficient ← little, low; efficient ← normal, normal; inefficient ← normal, high; very_inefficient ← large, high; }

CP ability. The computational perception of ability (CP_Ability) measures the degree of ability of the bots. It depends on the bot's attitude, the kind of movement performed and the time taken to capture the rewards. The corresponding values for this CP are computed by using the following rules: { expert ← wise, good, small; intermediate ← brave, good, normal; basic ← passive, bad, much; dummy ← passive, scare, much; }

CP skill. The skill of a computer game bot depends on its attitude, kind of movement performed and situations detected. The corresponding values for this CP are computed by using the following rules: { very_skilled ← wise, good, easy; skilled ← cautious, good, safe; improvable ← brave, bad, dangerous; improvable ← passive, bad, risky; }

4.3 Behaviour profile report generation module

A special kind of CP is employed here, called the play session CP and denoted by ΣCP. Each ΣCP can be formally defined by a vector of linguistic expressions ((a₁, w₁), …, (a_n, w_n)). Each ΣCP represents the whole linguistic domain. These kinds of CPs allow us to obtain the total number of times that each value (a₁, …, a_n) occurred during the execution.

These kinds of CPs provide us with a set of variables, their associated values and a degree α, which indicates the average for a particular value. For example, a value for CP situation could be “Safe” with 0.8 at an instant i and “Safe” with 0.7 at instant i + 1, and so on. Therefore, at the end of the execution, a_i (in the example, “Safe”) has occurred N times with N degrees β₁, …, β_N (of course, some of these degrees could be equal). Thus, the final degree is calculated as follows: α_i = (β₁ + … + β_N)/N. For example, the following summaries can be obtained from different ΣCP (see Fig. 4).
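The accumulation performed by a play session CP can be sketched as follows. The class name `SessionCP` and its methods are hypothetical; the sketch only keeps, for each linguistic value, the number of occurrences, the sum of its degrees and the resulting average α.

```python
from collections import defaultdict


class SessionCP:
    """Accumulates (value, degree) observations over a play session."""

    def __init__(self):
        self.total = defaultdict(float)  # sum of degrees per linguistic value
        self.count = defaultdict(int)    # number of occurrences per value

    def observe(self, value, degree):
        self.total[value] += degree
        self.count[value] += 1

    def average(self, value):
        """alpha = (beta_1 + ... + beta_N) / N, or 0.0 if the value never occurred."""
        n = self.count.get(value, 0)
        return self.total.get(value, 0.0) / n if n else 0.0


situation = SessionCP()
situation.observe("Safe", 0.8)  # instant i
situation.observe("Safe", 0.7)  # instant i + 1
alpha_safe = situation.average("Safe")
```

For the two observations used in the text, “Safe” with 0.8 and “Safe” with 0.7, the average degree is 0.75.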

The report is generated from a set of ΣCP. For each CP, a linguistic description is created as a function of the pairs (a_i, w_i) ∈ ΣCP. Percentages are calculated for each ΣCP. The percentage p_i is then transformed into a linguistic term of quantity as follows: few when p_i ∈ [0, 1/3]; several when p_i ∈ [1/3, 2/3]; and many when p_i ∈ [2/3, 1]. Next, we consider the following four cases:

① There exists a pair (a_i, p_i) ∈ ΣCP whose p_i is greater than 66 percent.

② There exists a pair (a_i, p_i) ∈ ΣCP whose p_i is greater than 33 percent.

③ There are two pairs (a₁, p₁), (a₂, p₂) ∈ ΣCP whose p_i are greater than 33 percent.

④ There exists no pair (a_i, p_i) ∈ ΣCP whose p_i is greater than 33 percent.
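The percentage computation, the quantity terms and the four cases above can be sketched as follows. The function names and the sample totals are hypothetical. Two assumptions are made explicit: the shared interval boundaries (1/3 and 2/3) are assigned to the lower term, and the sketch returns every case that holds rather than choosing one.

```python
def percentages(session_cp):
    """Normalise the accumulated totals of a play session CP so that they sum to 1."""
    total = sum(session_cp.values())
    if total == 0:
        return {value: 0.0 for value in session_cp}
    return {value: w / total for value, w in session_cp.items()}


def quantifier(p):
    """Map a percentage in [0, 1] to a linguistic term of quantity."""
    if p <= 1 / 3:
        return "few"      # boundary 1/3 assigned to "few" (assumption)
    if p <= 2 / 3:
        return "several"  # boundary 2/3 assigned to "several" (assumption)
    return "many"


def applicable_cases(pcts):
    """Return the set of case numbers (1 to 4) that hold for a percentage vector."""
    above_33 = [v for v, p in pcts.items() if p > 0.33]
    above_66 = [v for v, p in pcts.items() if p > 0.66]
    cases = set()
    if above_66:
        cases.add(1)
    if above_33:
        cases.add(2)
    if len(above_33) >= 2:
        cases.add(3)
    if not above_33:
        cases.add(4)
    return cases


# Illustrative totals for a situation CP (not taken from a real trace).
pcts = percentages({"safe": 70.0, "risky": 20.0, "dangerous": 10.0, "easy": 0.0})
cases = applicable_cases(pcts)
```

Here “safe” accounts for 70 percent of the session, so it is quantified as “many” and both the first and the second case hold.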

The system selects the most suitable linguistic expressions from among the available possibilities to describe the input data. Linguistic descriptions for each CP are stored in a list indexed ①, ②, ③, ④ as follows:

Descriptions for CP situation = {① “Definitely, [degree] situations were [value]”; ② “[degree] situations were [value]”; ③ “[degree₁] situations were [value₁], although [degree₂] situations were also [value₂]”; ④ “Diverse situations were detected during most of the play session”}

Descriptions for CP attitude = {① “During most of the play session, the bot showed [degree] attitudes [value]”; ② “The bot showed [degree] attitudes [value]”; ③ “The bot showed [degree₁] attitudes [value₁] but also showed [degree₂] attitudes [value₂]”; ④ “The bot does not show a particular attitude during the play session”}

Descriptions for CP movement = {① “Certainly, the [degree] of the movements performed by the bot was [value]”; ② “The bot proved to be capable of performing [degree] movements [value]”; ③ “The bot proved to be capable of performing [degree₁] movements [value₁] but also performed [degree₂] movements [value₂]”; ④ “The bot performed several movements indistinctly during the play session”}

Descriptions for CP ability = {① “Clearly, the bot displayed a/an [value] player [degree] times”; ② “The bot displayed a/an [value] player [degree] times”; ③ “The bot displayed a/an [value₁] player [degree₁] times, however [degree₂] times it acted as a/an [value₂]”; ④ “No kind of player has been identified”}

Descriptions for CP skill = {① “Certainly, the bot proved to be [value] [degree] times”; ② “The agent proved to be [value] [degree] times”; ③ “The agent proved to be [value₁] [degree₁] times, nevertheless [degree₂] times it proved to be [value₂]”; ④ “No kind of skill can be proved during the current play session”}

Several cases may hold simultaneously; in that situation, the priority order is (④, ③, ②, ①).

A complete example of the generation of behaviour profiles from an execution trace file is detailed in Example 1.

Example 1. Assume the unprocessed execution trace is the one described in Fig. 7. A random real line of an execution trace file is as follows: 1, 13, 3, 16, 4, 12, 2, 12, 3.60, 3.16, 1.41, 17, 5.0, 2.0, 1.0, 15.26, 17.08, 13.0, 13.89, 15995, false, 42, 924, J.

First, the data are processed as follows: the values (1, 13) correspond to the player position (see footnote 2) and are stored in the variable P; (3, 16), (4, 12) and (2, 12) are the positions of opponents 1, 2 and 3, respectively, and are stored in the variables (O₁, O₂, O₃). The values 3.60, 3.16 and 1.41 are the distances between the player and the opponents and are stored in the variables (Dist (P, O₁), Dist (P, O₂), Dist (P, O₃)). The value 17 is the energy of the player at this instant of time and is stored in the variable E. The same process is followed for the protection between the player and the opponents (Protect (P, O₁) = 5.0, Protect (P, O₂) = 2.0, Protect (P, O₃) = 1.0) and for the distances of the player and the opponents to the closest reward (Dist (P, R*) = 15.26; Dist (O₁, R*) = 17.08; Dist (O₂, R*) = 13.0; Dist (O₃, R*) = 13.89). Finally, Time = 15995, Iterations = 42 and Memory = 924.

Second, linguistic terms are created for each variable as explained in Section 1. Therefore, we have: Dist (P, O₁) = 3.60 is a close distance with 1.0; Dist (P, O₂) = 3.16 is a close distance with 1.0; Dist (P, O₃) = 1.41 is a close distance with 1.0; Protect (P, O₁) is a high protection with 0.5; Protect (P, O₂) is a normal protection with 0.5; Protect (P, O₃) is a low protection with 0.5; Dist (P, R*) = 15.26 is a far distance with 0.75; Dist (O₁, R*) = 17.08 is a far distance with 1.0; Dist (O₂, R*) is a normal distance with 0.33; Dist (O₃, R*) = 13.89 is a far distance with 0.29. Time = 15995 is a small time with 1.0; Iterations = 42 is a normal number of iterations with 1; and Memory = 924 is a low required memory with 0.69.

Then, each CP is instantiated with the values of the linguistic terms, and the if-then fuzzy rules are evaluated by using the average as the aggregation operator to compute the computational perceptions as follows (the closest opponent is represented by O*):

Attitude = (Cautious, 0.54) ← Dist(O*, R*) = (High, 0.29), Dist(P, R*) = (Normal, 0.33), Energy = (High, 1)

Situation = (Dangerous, 0.83) ← Protection = (Low, 0.5), Distance(P, O*) = (Close, 1), Energy = (High, 1)

Movement = (Bad, 0.77) ← Distance(P, R*) = (Normal, 0.33), Distance(P, O*) = (Close, 1), Energy = (High, 1)

Ability = (Dummy, 0.76) ← Attitude = (Cautious, 0.54), Situation = (Dangerous, 0.83), Movement = (Bad, 0.91)

Skill = (Improvable, 0.81) ← Attitude = (Cautious, 0.54), Movement = (Bad, 0.91), Time = (Small, 1)

Resources = (Efficient, 0.84) ← Memory = (Low, 0.69), Iteration = (Normal, 1)
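The average-based aggregation used in these rules can be reproduced with a few lines of code. This is a sketch only: the function name `fire_rule` and the tuple layout of the antecedents are assumptions, and the rule base itself is not modelled, only the computation of the output degree.

```python
from statistics import mean


def fire_rule(conclusion, antecedents):
    """Return (conclusion, degree), where degree is the mean of the antecedent degrees.

    Each antecedent is a (variable, label, degree) tuple.
    """
    return conclusion, mean(degree for _, _, degree in antecedents)


situation = fire_rule("Dangerous", [
    ("Protection", "Low", 0.5),
    ("Distance(P, O*)", "Close", 1.0),
    ("Energy", "High", 1.0),
])

attitude = fire_rule("Cautious", [
    ("Dist(O*, R*)", "High", 0.29),
    ("Dist(P, R*)", "Normal", 0.33),
    ("Energy", "High", 1.0),
])
```

The Situation rule yields (0.5 + 1 + 1) / 3 ≈ 0.83 and the Attitude rule yields (0.29 + 0.33 + 1) / 3 = 0.54, in agreement with the degrees listed above.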

Subsequently, the ΣCP are computed as explained in Section 1:

ΣCP_Attitude = {(wise, 17.53), (brave, 101.55), (cautious, 14.05), (passive, 10.78)}

ΣCP_Situation = {(risky, 24.44), (dangerous, 651.39), (safe, 32.26), (easy, 0)}

ΣCP_Movement = {(good, 24.21), (scared, 0), (kamikaze, 94.82), (bad, 48.01)}

ΣCP_Skill = {(skillful, 7.56), (little skilled, 0.72), (improvable, 122.2), (very improvable, 31)}

ΣCP_Ability = {(expert, 38.48), (intermediate, 0), (basic, 31.93), (dummy, 94.88)}

ΣCP_Resources = {(very efficient, 41.42), (efficient, 121.86), (inefficient, 0), (very inefficient, 15.33)}

Finally, each template shown in Section 4.3 is instantiated by using the generated ΣCP. An example of instantiation is shown in Fig. 5.

Fig. 5

Instantiation template for the execution trace of Example 1 and the similarity between behaviour profile reports: human player versus bots.

5 Experimentation and evaluation

A web application has been developed for providing students with a computer-assisted assessment tool. Computer-assisted assessment is a longstanding problem that has attracted interest from the research community since the sixties and has not yet been fully resolved [11]. The main aim is to study how the computer can help in the evaluation of students’ learning processes [57]. The literature has presented several advantages:

It provides educators with didactic advantages [6]; that is, it is very useful for conveying instruction and information as well as pleasure and entertainment.

It provides students with immediate information in a timely manner, and it is particularly useful when the number of students is high, and resources are scarce [7].

It is a quick way of providing feedback, and it reduces the teacher’s workload [8].

It can be personalised, which allows the process of assessment to be enhanced from both teacher’s and students’ points of view.

The web application has been incorporated into the process of assessment of an introductory AI course. Additionally, this portal has been incorporated into the teaching-learning process, which allows each student to consult the feedback any time he/she desires and compare different kinds of algorithms for programming computer game bots; the student can also establish his/her own plan for learning.

This section aims to explain in detail the use of the web platform for automatically generating human player and bot behaviour profiles from execution traces. A quick test of the application can be performed by downloading examples of traces at the following URL: http://youractionsdefineyou.com/assess/web/examples_traces

First, the user must access the URL: http://www.youractionsdefineyou.com/assess

The main window shows two options: log in and register. Registering a user consists of entering the email address, user name, full name, RUT (national identification in Chile) and a password. A confirmation email is sent to the user if the registration was successful. Logging in consists of entering the user name and the password. Second, a behaviour profile report can be obtained by selecting and loading an execution trace file; the behaviour profile report is then generated automatically. Additionally, the report can be exported to PDF.

As we mentioned, one of the most important objectives in AI is to create an agent that simulates human abilities. Here, the bot's behaviour profile is compared with the human expert player profile (see Fig. 7) by using a similarity measure based on REFs. The generation of a human player profile is performed in two steps: i) a human player plays several play sessions; ii) the teacher/evaluator defines a profile report from them. Note that an “expert human player” profile candidate is selected and associated with a requirement of the project. Of course, other human expert profiles could be considered, depending on the objectives and features of the project.

Attitude is mainly brave most of the time.

Situation is mainly safe most of the time.

Movements were mainly good most of the time.

The player is an expert.

The player is skilled.

The use of computational resources is efficient in time and space

The final grade (from 1 to 7) is computed by using the similarity between the human behaviour profile and the bot profile. The equation for calculating the final grade is as follows:

$FG = G_{Min} + S_{Attitude} + S_{Situation} + S_{Movement} + S_{Ability} + S_{Skill} + S_{Efficiency}$ (1)

G_Min: 1 point (all the students have 1 point as a minimum score - it is mandatory at the University of Bío-Bío)

$S_{Attitude} = S_{REF} (Σ {CP}_{Attitude}^{Human}, Σ {CP}_{Attitude}^{Bot})$ is the similarity between the human player and bot attitude.

$S_{Situation} = S_{REF} (Σ {CP}_{Situation}^{Human}, Σ {CP}_{Situation}^{Bot})$ is the similarity between the human player and bot situation.

$S_{Movement} = S_{REF} (Σ {CP}_{Movement}^{Human}, Σ {CP}_{Movement}^{Bot})$ is the similarity between the human player and bot movements.

$S_{Ability} = S_{REF} (Σ {CP}_{Ability}^{Human}, Σ {CP}_{Ability}^{Bot})$ is the similarity between the human player and bot ability.

$S_{Skill} = S_{REF} (Σ {CP}_{Skill}^{Human}, Σ {CP}_{Skill}^{Bot})$ is the similarity between the human player and bot skill.

$S_{Efficiency} = S_{REF} (Σ {CP}_{Efficiency}^{Human}, Σ {CP}_{Efficiency}^{Bot})$ is the similarity between the human player and bot efficiency.

where S_REF is a similarity measure between computational perceptions. The following definition formalises this measure.

Definition 5.1. Let ΣCP_i and ΣCP_j be two play session CPs whose percentage linguistic vectors are {(a₁, p₁), …, (a_n, p_n)} and {(b₁, q₁), …, (b_n, q_n)}, respectively. A similarity measure between ΣCP_i and ΣCP_j is defined as $S_{REF} (Σ {CP}_{i}, Σ {CP}_{j}) = \sum_{k = 1}^{n} REF (p_{k}, q_{k}) / n$, where REF (p_k, q_k) = 1 - |p_k - q_k|.
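Definition 5.1 and Eq. (1) can be sketched together as follows. The function names `ref`, `s_ref` and `final_grade` and the percentage vectors are hypothetical; the vectors are chosen only so that the arithmetic is easy to follow, and both profiles are assumed to use the same set of labels.

```python
def ref(p, q):
    """REF(p, q) = 1 - |p - q| for two percentages in [0, 1]."""
    return 1.0 - abs(p - q)


def s_ref(human, bot):
    """Mean of REF over the labels shared by two percentage vectors."""
    if human.keys() != bot.keys():
        raise ValueError("both profiles must use the same linguistic labels")
    return sum(ref(human[label], bot[label]) for label in human) / len(human)


def final_grade(similarities, g_min=1.0):
    """FG = G_Min plus the sum of the per-CP similarities, as in Eq. (1)."""
    return g_min + sum(similarities.values())


# Illustrative percentage vectors (not taken from a real trace).
human = {"wise": 0.5, "brave": 0.3, "cautious": 0.2, "passive": 0.0}
bot = {"wise": 0.4, "brave": 0.4, "cautious": 0.1, "passive": 0.1}
s_attitude = s_ref(human, bot)

# Assume, for illustration only, that all six similarities are equal.
grade = final_grade({name: s_attitude for name in
                     ["attitude", "situation", "movement",
                      "ability", "skill", "efficiency"]})
```

Each label differs by 0.1 in this example, so every REF value is 0.9 and the similarity is 0.9. Since each of the six similarities lies in [0, 1] and G_Min is 1, the final grade stays within the 1 to 7 scale; here it is 1 + 6 × 0.9 = 6.4.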

Example 2. Let $Σ {CP}_{Attitude}^{Human}$ and $Σ {CP}_{Attitude}^{Bot}$ be two summation computational perceptions for the human player and the computer game bot, respectively:

$Σ {CP}_{Attitude}^{Human}$ = {(wise, 122.35), (brave, 289), (cautious, 87.59), (passive, 8.75)}

$Σ {CP}_{Attitude}^{Bot}$ = {(wise, 17.53), (brave, 101.55), (cautious, 14.05), (passive, 10.78)}

Then, the linguistic vector percentages are calculated for each ΣCP by using their totals, ${Total}_{Σ {CP}_{Attitude}^{Human}}$ = 507.69 and ${Total}_{Σ {CP}_{Attitude}^{Bot}}$ = 143.61:

$Σ {CP}_{Attitude}^{Human}$ = {(wise, 0.240), (brave, 0.569), (cautious, 0.172), (passive, 0.017)}

$Σ {CP}_{Attitude}^{Bot}$ = {(wise, 0.122), (brave, 0.709), (cautious, 0.097), (passive, 0.075)}

Now, the similarity $S_{REF} (Σ {CP}_{Attitude}^{Human}, Σ {CP}_{Attitude}^{Bot})$ can be calculated:

REF (0.240, 0.122) = 1 - |0.240 - 0.122| = 0.882

REF (0.569, 0.709) = 1 - |0.569 - 0.709| = 0.860

REF (0.172, 0.097) = 1 - |0.172 - 0.097| = 0.925

REF (0.017, 0.075) = 1 - |0.017 - 0.075| = 0.942

Hence, $S_{REF} (Σ {CP}_{Attitude}^{Human}, Σ {CP}_{Attitude}^{Bot}) = \frac{3.609}{4} =$ 0.902. The rest of the similarities are computed in the same way. The final grade together with the linguistic reports generated for the human player and the bot designed by an anonymous student is shown in Fig. 7.

5.1 Supporting classic human expert assessment

The aim of this section is to show how fuzzy linguistic descriptions can be used to obtain more information about the algorithms implemented by the students. The classic human expert assessment for the computer game-based project at the University of Bío-Bío is based on the following guidelines. There are minimal requirements to pass the subject: 2D scenario, 4 rewards, 3 opponents, 1 player, opponents implemented with breadth-first algorithms, and all functionality. The remaining items are scored as follows: crash with stop (+1); crash without stop (-1); images (+3); life (+3); lose life (+3); show life (+3); stop play session correctly (+3); A* algorithm implementation (+3); additional tasks about the project performed in class (+2 per task); additional tasks about the project not performed in class (-2 per task); exceptional AI (+8).

In order to perform an empirical comparison, we use the scores obtained in a classical assessment by the teacher and those automatically generated by the software tool (see Fig. 6). A comparative table was created to show both resulting scores. The empirical study is simple and illustrative. We must also take into account the time employed in a classical assessment of the algorithm functionality.

Supposing that the teacher assesses each algorithm in five minutes, the teacher would need 50 minutes to evaluate 10 projects.

Supposing that the software tool assesses each algorithm in one second, only 10 seconds would be needed to evaluate 10 projects.

Fig. 6

Scores obtained by students in the project and the scores obtained automatically.

Fig. 7

Bar Graphs obtained from the answers for the survey directed to the students.

The above examples support our initial hypothesis. It is hard for a teacher to assess all the features involved in an implementation, and even when the teacher can do it, the task takes too much time. The use of linguistic descriptions allows us to automate this step, making the task easier for the expert and reducing the time required. The teacher could then spend more time reviewing other features of the project that would otherwise be skipped for lack of time.

Additionally, there are considerable differences between both scores. While in the first one the difference between the respective scores is large, the scores obtained by using our methodology are more evenly spaced. In a second, unofficial round of revision, the teacher could check that the scores obtained are closer to reality. However, a deeper analysis should be performed in future work, for example by implementing intelligent artificial tutors based on fuzzy linguistic descriptions. This is a great challenge for us, which starts with the present work.

5.2 Results of the implementation of the experiment or research impact

To measure the impact of teaching innovation, a survey directed to the students of the artificial intelligence class was carried out. Very interesting opinions from the students have been observed. A number of opinions were positive, which appears to indicate that a project-based methodology is very useful in allowing the students to achieve the competencies pertaining to the profile of a graduate. Other opinions were negative but constructive, which indicates that certain aspects of the class, namely, those related to the organisation and presentation of materials and assessments, should be improved. Others were negative but constructive with regards to the context of the class and pre-existing gaps in knowledge. Finally, a smaller number of negative, non-constructive opinions were registered, which criticised the teaching abilities of the instructor and the class contents. Measures have been taken in light of these findings, and a final methodology has been designed as a result. The results show that 38% of students very strongly agreed, 27% strongly agreed, and 10% agreed with the statement that “The development of the artificial intelligence project was motivating and exciting” whereas 13% slightly agreed, and 12% disagreed.

Positive results were also obtained regarding the statement “The project-based method greatly helped in understanding the theoretical concepts of the class”, with 81% of students agreeing, strongly agreeing or very strongly agreeing, while 10% remained neutral and 9% disagreed. Additionally, 75% of the students agreed, strongly agreed or very strongly agreed, 21% remained neutral and 4% disagreed with the statement “The project-based method greatly helped in understanding how to implement the research techniques reviewed in class”. For the next question, 79% of students agreed, strongly agreed or very strongly agreed, 15% remained neutral and 6% disagreed with the statement “Carrying out this project autonomously helped me improve my professional skills and further prepared me for future employment”. The results of this survey indicate that the methodology made it possible for the project-workshop to be motivating and exciting. However, it failed to establish a connection between theoretical and practical content. It has been observed that autonomous work allows students to acquire working and programming skills. The majority of the students see this as a challenge. In general, the students stated that this type of methodology helped them achieve the desired abilities and a better understanding of the studied algorithms.

6 Conclusions and future work

In this paper, a new approach for execution trace comprehension based on the fuzzy linguistic description paradigm has been presented. For this purpose, a methodology and a data-driven architecture have been established and explained in detail. A computational perception network and a behaviour profile report generation module have been defined and implemented on a web platform. This tool allows us to automatically generate interpretable and explainable reports in natural language about the execution traces of the algorithms designed and implemented by the students in their projects. An important and remarkable feature of our proposal is that it is capable of interpreting large amounts of numerical data quickly and accurately, providing students and teachers with immediate, personalised and accountable feedback that leads to a better understanding of the execution traces.

The preliminary results show that our method can be used as a pedagogical resource providing teachers with a useful tool for identifying information about the quality of the heuristic algorithm designed by the students, which improves the teaching and learning process. The project created by the students can be evaluated at any time from two points of view: quantity (performance of the algorithm -space and time-), and quality (kind of situations, movements, attitudes, abilities, skills). A survey directed to the students of the artificial intelligence class has been carried out in order to measure the impact of teaching innovation.

As future work, we would like to address several challenges: i) employment of our technology for improving the transparency, interpretability and comprehension of algorithms; ii) design and implementation of intelligent tutors based on fuzzy linguistic descriptions; iii) incorporation of our technology in other educational disciplines to obtain personalised feedback; iv) quality assessment of automatically generated datasets.

Acknowledgments

This work has been performed in collaboration with the research group SOMOS (SOftware-MOdelling-Science) funded by the University of Bío-Bío. This document is especially dedicated to our partner and friend, Prof. Pedro Rodríguez (RIP).

Footnotes

1. Some authors refer to the linguistic descriptions as LDD, understanding linguistic descriptions as a tool to describe human perceptions.

2. The notation (x, y) indicates the coordinate (x, y) in the 2D scenario.

References

1. Faessler, Hinterberger, Dahinden and Wyss, Evaluating student motivation in constructivistic, problem-based introductory computer science courses, in E-Learn: World Conference on E-Learning in Corporate, Government, Healthcare, and Higher Education, Association for the Advancement of Computing in Education (AACE), 2006, pp. 1178–1185.

2. Fink L.D., Creating significant learning experiences: An integrated approach to designing college courses. John Wiley & Sons, 2013.

3. Hattie and Timperley, The power of feedback, Review of Educational Research 77(1) (2007), 81–112.

4. Nicol D.J. and MacFarlane-Dick, Formative assessment and self-regulated learning: A model and seven principles of good feedback practice, Studies in Higher Education 31(2) (2006), 199–218.

5. Sarsar H.S., and F., Student and instructor responses to emotional motivational feedback messages in an online instructional environment, Turkish Online Journal of Educational Technology - TOJET 16(1) (2017), 115–127.

6. van der Kleij F.M., Eggen T.J., Timmers C.F. and Veldkamp B.P., Effects of feedback in a computer-based assessment for learning, Computers & Education 58(1) (2012), 263–272.

7. Hamilton I.R., Automating formative and summative feedback for individualised assignments, Campus-Wide Information Systems 26 (2009), 355–364.

8. Lavolette, The accuracy of computer-assisted feedback and students responses to it, 19 (2015).

9. Pirzadeh, Hamou-Lhadj and Shah, Exploiting text mining techniques in the analysis of execution traces, in Software Maintenance (ICSM), 2011 27th IEEE International Conference on, IEEE, 2011, pp. 223–232.

10. Noorbehbahani and Kardan A.A., The automatic assessment of free text answers using a modified BLEU algorithm, Computers & Education 56(2) (2011), 337–345.

11. Pérez, Gliozzo A.M., Strapparava, Alfonseca, Rodríguez and Magnini, Automatic assessment of students’ free-text answers underpinned by the combination of a bleu-inspired algorithm and latent semantic analysis, in Proceedings of the Eighteenth International Florida Artificial Intelligence Research Society Conference, Clearwater Beach, Florida, USA, 2005, pp. 358–363.

12. Lachner, Burkhart and Nückles, Formative computer-based feedback in the university classroom: Specific concept maps scaffold students’ writing, Computers in Human Behavior 72 (2017), 459–469.

13. Fyfe E.R., Providing feedback on computer-based algebra homework in middle-school classrooms, Computers in Human Behavior 63 (2016), 568–574.

14. Douce, Livingstone and Orwell, Automatic test-based assessment of programming: A review, Journal on Educational Resources in Computing (JERIC) 5(3) (2005), 4.

15. Burgos, Van Nimwegen, Van Oostendorp and Koper, Game-based learning and the role of feedback. A case study, 2007.

16.

Zadeh

L.A.

, Fuzzy sets, Information and Control 8(3) (1965), 338–353.

17.

Strobin

and Niewiadomski

, Linguistic summaries of graph datasets using ontologies: An application to semantic web, Journal of Intelligent & Fuzzy Systems 32(2) (2017), 1193–1202.

18.

Ramos-Soto

, Vázquez-Barreiros

, Bugarín

, Gewerc

and Barro

, Evaluation of a data-to-text system for verbalizing a learning analytics dashboard, Int J Intell Syst 32(2) (2017), 177–193.

19.

Gkatzia

, Hastie

H.F.

, Janarthanam

and Lemon

, Generating student feedback from time-series data using reinforcement learning, in ENLG 2013 - Proceedings of the 14th EuropeanWorkshop on Natural Language Generation, Sofia, Bulgaria, 2013, pp. 115–124.

20.

Sánchez-Torrubia

M.G.

, Torres-Blanc

and Triviño

, An approach to automatic learning assessment based on the computational theory of perceptions, Expert Syst Appl 39(15) (2012), 12177–12191.

21.

Rubio-Manzano

and Triviño

, Improving player experience in computer games by using players’ behavior analysis and linguistic descriptions, Int J Hum-Comput Stud 95 (2016), 27–38.

22.

Riaz

, Çağman

, Zareef

and Aslam

, N-soft topology and its applications to multi-criteria group decision making, Journal of Intelligent & Fuzzy Systems 36(6) (2019), 6521–6536.

23.

Aho

, Haverinen

H.-L.

, Juuso

, Laukka

S.J.

and Sutinen

, Teachers principles of decision-making and classroom management; a case study and a new observation method, Procedia-Social and Behavioral Sciences 9 (2010), 395–402.

24.

Riaz

, Davvaz

, Firdous

and Fakhar

, Novel concepts of soft rough set topology with applications, Journal of Intelligent & Fuzzy Systems 36(4) (2019), 3579–3590.

25.

Riaz

and Tehrim

S.T.

, Cubic bipolar fuzzy ordered weighted geometric aggregation operators and their application using internal and external cubic bipolar fuzzy data, Computational and Applied Mathematics 38(2) (2019), 87.

26.

Riaz

and Hashmi

M. R.

, Fuzzy parameterized fuzzy soft compact spaces with decision-making, Punjab Univ j Math 50 (2018), 131–145.

27.

Goldberg

, Driedger

and Kittredge

R.I.

, Using natural-language processing to produce weather forecasts, IEEE Expert 9(2) (1994), 45–53.

28.

Coch

, System demonstration interactive generation and knowledge administration in multimeteo, Natural Language Generation (1998).

29.

Portet

, Reiter

, Gatt

, Hunter

, Sripada

, Freer

and Sykes

, Automatic generation of textual summaries from neonatal intensive care data, Artif Intell 173(7-8) (2009), 789–816.

30.

White

and Caldwell

, EXEMPLARS: A practical, extensible framework for dynamic text generation, in Proceedings of the Ninth International Workshop on Natural Language Generation, INLG 1998, Niagara-on-the-Lake, Ontario, Canada, 1998, 1998.

31.

Busemann

and Horacek

, Generating air quality reports from environmental data, in Proceedings of the DFKIWorkshop on Natural Language Generation, 1997, pp. 15–21.

32.

Gatt

and Krahmer

, Survey of the state of the art in natural language generation: Core tasks, applications and evaluation, J Artif Intell Res 61 (2018), 65–170.

33.

Yager

R.R.

, Fuzzy summaries in database mining, in Artificial Intelligence for Applications, 1995 Proceedings, 11th Conference on, IEEE, 1995, pp. 265–269.

34.

Marín

and Sánchez

, Fuzzy sets and systems+ natural language generation: A step forward in the linguistic description of time series, Fuzzy Sets and Systems 285 (2016), 1–5.

35.

Ramos-Soto

, Bugarín

and Barro

, On the role of linguistic descriptions of data in the building of natural language generation systems, Fuzzy Sets and Systems 285 (2016), 31–51.

36.

Zadeh

L.A.

, The concept of a linguistic variable and its application to approximate reasoningi, Information Sciences 8(3) (1975), 199–249.

37.

Ramos-Soto

, Bugarín

, Barro

, Gallego

, Rodríguez

, Fraga

and Saunders

, Automatic generation of air quality index textual forecasts using a data-to-text approach, in Advances in Artificial Intelligence - 16th Conference of the Spanish Association for Artificial Intelligence, CAEPIA 2015, Albacete, Spain, Proceedings, 2015, pp. 164–174.

38.

Ramos-Soto

, Diz

A.J.B.

, Barro

and Taboada

, Linguistic descriptions for automatic generation of textual short-term weather forecasts on real prediction data, IEEE Trans Fuzzy Systems 23(1) (2015), 44–57.

39.

Trivino

and Sugeno

, Towards linguistic descriptions of phenomena, International Journal of Approximate Reasoning 54(1) (2013), 22–34.

40.

Conde-Clemente

, Alonso

J.M.

, Nunes

É.O.

, Sanchez

and Trivino

, New types of computational perceptions: Linguistic descriptions in deforestation analysis, Expert Systems with Applications 85 (2017), 46–60.

41.

Conde-Clemente

, Trivino

and Alonso

J.M.

, Generating automatic linguistic descriptions with big data, Information Sciences 380 (2017), 12–30.

42.

Conde-Clemente

, Alonso

J.M.

and Trivino

, Toward automatic generation of linguistic advice for saving energy at home, Soft Computing 22(2) (2018), 345–359.

43.

Sanchez-Valdes

and Triviño

, Linguistic and emotional feedback for self-tracking physical activity, Expert Syst Appl 42(24) (2015), 9574–9586.

44.

Sanchez-Valdes

, Alvarez-Alvarez

and Triviño

, Linguistic description about circular structures of the mars’ surface, Appl Soft Comput 13(12) (2013), 4738–4749.

45.

Arguelles

and Trivino

, I-struve: Automatic linguistic descriptions of visual double stars, Engineering Applications of Artificial Intelligence 26(9) (2013), 2083–2092.

46.

Eciolaza

, Pereira-FariñA

and Trivino

, Automatic linguistic reporting in driving simulation environments, Applied Soft Computing 13(9) (2013), 3956–3967.

47.

Zadeh

L.A.

, From computing with numbers to computing with words from manipulation of measurements to manipulation of perceptions, IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications 46(1) (1999), 105–119.

48.

Bernardi

M.L.

, Cimitile

, Martinelli

and Mercaldo

, An ensemble fuzzy logic approach to game bot detection through behavioural features, in 2018 IEEE International Conference on Fuzzy Systems, FUZZ-IEEE 2018, Rio de Janeiro, Brazil, 2018, pp. 1–9.

49.

Pacheco

, Tokarchuk

and Pérez-Liébana

, Studying believability assessment in racing games, in Proceedings of the 13th International Conference on the Foundations of Digital Games, FDG 2018, Malmö, Sweden, 2018, pp. 20:1–20:10.

50.

Wang

J.-Y.

, Classification of humans and bots in two typical two-player computer games, in 2018 3rd International Conference on Computer and Communication Systems (ICCCS), IEEE 2018, pp. 502–505.

51.

Rubio-Manzano

, Similarity Measure Between Linguistic Terms by Using Restricted Equivalence Functions and Its Application to Expert Systems, Springer International Publishing, Cham, 2019, pp. 97–102.

52.

Batyrshin

, Cross

, Kreinovich

and Rifqi

, Special issue on similarity, correlation and association measures dedicated to the memory of lotfi zadeh, Journal of Intelligent & Fuzzy Systems Preprint, 1–2.

53.

Bustince

, Barrenechea

and Pagola

, Restricted equivalence functions, Fuzzy Sets and Systems 157(17) (2006), 2333–2346.

54.

Alonso

J.M.

, Conde-Clemente

and Triviño

, Linguistic description of complex phenomena with the rldcp R package, in Proceedings of the 10th International Conference on Natural Language Generation, INLG 2017, Santiago de Compostela, Spain, 2017, pp. 243–244.

55.

Soni

and Hingston

, Bots trained to play like a human are more fun, in Proceedings of the International Joint Conference on Neural Networks, IJCNN 2008, part of the IEEE World Congress on Computational Intelligence, WCCI 2008, Hong Kong, China, 2008, pp. 363–369.

56.

Hingston

, A turing test for computer game bots, IEEE Trans Comput Intellig and AI in Games 1(3) (2009), 169–186.

57.

Man

D.P.

, Automatic evaluation of users short essays by using statistical and shallow natural language processing techniques advanced studies diploma work, 2004.