Abstract
In this research, we implement an intelligent quantitative model to assess a specific qualitative intelligence scale in children between 5 and 8 years old, based on augmented reality and the well-known WISC-IV test. The output of the model is a cognitive factor associated with the child’s level of analogical reasoning, and the subsequent analysis of this measure is intended to help the teacher discover problems related to the child’s ability to solve visual analogies. A quantitative approach to assessing analogical reasoning avoids ambiguous evaluations of qualitative results. Also, given that the assessment employs a visual WISC subtest, it constitutes a non-verbal evaluation. Finally, because the model is based on an intelligent approach, the assessment relies on the quantitative scores obtained rather than on an interpretation of the results, which helps keep the process impartial. The purpose of this work is to give evidence that a computer-aided adaptation of the WISC test, employing augmented reality and a Fuzzy Petri Net, can improve the teaching-learning process in children from 5 to 8 years old. A case study is analyzed in which both the paper-based and the augmented reality versions are applied to five children whose native language is Spanish. We show the feasibility and potential of implementing the test in a multimedia version to provide teachers with a more reliable resource for the diagnosis and treatment of possible learning deficiencies in the child, with regard to disambiguation, non-verbality, and impartiality.
Keywords
Introduction
Reasoning is a fundamental element in education, since it involves the use of logic to correlate previously acquired knowledge and infer new results. Young children (between 5 and 8 years old) are expected to have already acquired a certain level of basic reasoning abilities, in terms of verbal comprehension, mathematical reasoning, visual-spatial processing, and analogical reasoning, among others. Children who excel in demonstrating such capabilities are considered to be gifted and are usually motivated to acquire more abilities, given their fast learning pace. However, children who show evidence of specific learning disabilities and/or disorders must be treated so that they can have an adequate education. In both cases, the sooner the reasoning levels are measured, the better the results in terms of the efficiency of the children’s education and, most of all, their sense of well-being, since the timely identification of their strengths and weaknesses plays a fundamental role in determining the supporting measures needed to improve their quality of life as students.
Analogies are used in the explanation and understanding of new concepts [7]. They are useful means for people to learn about new situations, based on previous world understanding [8]. In the same sense, analogical reasoning is ubiquitous to human cognition. It is a fundamental qualification for learning and teaching mathematics, whereby children discover the main features of an unknown object by searching similarities to a familiar one.
Applying knowledge from one context to a different one is a difficult challenge for children (particularly between 5 and 8 years old); at early ages, children may need guidance when they attempt to make cross-domain comparisons to draw conclusions using analogies [9]. Even if the child’s cognitive and understanding capabilities are well developed, in some scenarios using and understanding analogies tends to be a hard and demanding task. It has even been shown that analogical reasoning is fundamental for language development, the key to scientific learning, and an indicator of creative and critical thinking [30]. It has also been shown that supporting the development of analogical reasoning helps children become more innovative and adaptive [31].
Education in the 21st century is quite different from what it used to be in previous ages. As mentioned in [1], lack of interest and motivation is a significant reason why students fail to complete their academic preparation. In the same sense, it is relevant to notice that motivational issues are usually related to situations that students face at an early age. On the one hand, both highly proficient children and children with deficient learning processes are at risk of being bullied, since they are perceived as different from their peers [33]. On the other hand, children are frequently faced with situations where their average performance is considered underachievement, especially if they are perceived as gifted (independently of their capabilities). As a result, they are pressured to perform better (in fact, when this pressure becomes pathogenic in the parents, it has been characterized as the Achievement by Proxy Distortion (ABPD) behavior [34], which is really dangerous for the adequate development of children). In summary, in the first case children suffer pressure from their peers, while in the second one social and/or parental pressure is more likely.
The traditional teaching technique, based on an instructor lecturing in front of a group while students struggle to maintain attention, has proven to be ineffective and obsolete with the new generations. In this sense, educational technologies are not supposed to replace teachers in a classroom; on the contrary, they are intended to be useful tools to improve learning and to facilitate the evaluation of the quality of the teaching-learning process. The challenge in the use of educational technologies is to have a clear quantitative measure of progress in the learning process. Despite a number of efforts to take advantage of educational technologies, the gap between teaching processes and technology keeps widening [2].
Concerning the students, the younger generations have been in contact with technology since early childhood, either for entertainment or out of necessity. Nevertheless, the potential of technology to produce a positive impact deserves to be observed in detail, both in the way students are educated and in the way they are motivated to learn, especially since it can influence how students are motivated to learn actively and more effectively [3]. According to [4], technology will give rise to innovative ways of learning and teaching. It is expected, however, that these new tools will complement the traditional ones [5], converging to attain more effective learning processes.
There are several traditional approaches intended to measure reasoning aptitudes in children, and one of the most widely used is the Wechsler Intelligence Scale for Children (WISC). In this sense, there is a need to determine, in a quantitative form, the qualitative aspects that the professional observes when a child answers the Matrix reasoning subtest of the fourth version of the Wechsler Intelligence Scale (WISC-IV), since measuring the ability to solve analogies determines whether new information is generated in the process or whether the child understands the problem faced. The work presented here is intended to address this need.
As previously mentioned, modern technologies and classic approaches to learning should not be seen as opposing alternatives; they should, in fact, be complementary. One of the main goals of this paper is to show that a technological adaptation of well-tested and standardized evaluations, such as the WISC, benefits young students. One such new technology, with a number of potential applications in education, is Augmented Reality. The fact that this technology aims to augment the perception of real-world objects using multimedia sensory stimuli (images, sounds, smells, etc.) makes it suitable for educational environments. The authors of [6] suggest a classification of Augmented Reality applications into five groups: 1) discovery-based learning; 2) object modeling; 3) Augmented Reality books; 4) ability training; and 5) Augmented Reality games.
This work is devised to consider the movements and the interaction in a mobile augmented reality application. The movements of virtual objects, such as translation, rotation, and scaling, will represent the input states of a Fuzzy Petri Net (FPN), related to child behavior. The output states represent a cognitive factor (visual-perceptual discrimination, visual-perceptual organization, visual processing, and spatial skills) associated with movements that are linked to the reasoning ability to solve visual analogies.
The main objective of this paper is to show that it is possible to implement an intelligent quantitative evaluation of a specific qualitative intelligence scale, applied to children between 5 and 8 years old; all of this, using an FPN and an augmented reality application.
We present a case study showing the viability of the multimedia application, in which disambiguation, non-verbality, and impartiality improved the assessment process.
This work is structured as follows. In section 2, we explain the reasons for the need of an intelligent model. Section 3 justifies the choice of the WISC-IV test. In section 4, the related work is discussed. Section 5 is devoted to present the proposed assessment model. In section 6, we show the results in terms of basic elements, the evaluation of analogical reasoning, and the analysis of a case study. Finally, section 7 presents the conclusions we drew from this research and future directions.
Why do we need an intelligent model for analogical reasoning assessment?
IQ tests are mainly designed to provide the person with a score which is supposed to reflect his/her intelligence level. However, other than providing this score, these tests have been controversial and questioned for their actual usefulness to improve performance and quality of life for the person to whom the test is applied. Even more, a score, useful only to rank the person, might lead to a bias that, at the same time, might lead to segregation or additional pressure for the person, depending on this score.
IQ tests have the potential of being useful for evaluating which aptitudes need to be exercised and which ones are within “normal” parameters or even exceptional. Some of the most popular tests are:
The WISC test, for children between 6 years, 0 months and 16 years, 11 months, has been widely used around the world [4, 42]. This test consists of ten subtests, with five additional subtests for cognitive and intellectual assessment; it was published in 2003 and adapted to a Spanish version in 2005.
The SB5 (Stanford-Binet-Fifth edition) test [43, 44] is another option. This version was published in 2003 and is applicable to persons as young as 2 years old, evaluating verbal and nonverbal aptitudes using five factor indexes.
The DAS-II (Differential Ability Scales-Second Edition) [39, 45] was developed for children between 2 years, 6 months and 17 years, 11 months. This test was published in 2007 and consists of 20 cognitive subtests, divided into two categories (early years battery and school-age battery).
The KABC-II (Kaufman Assessment Battery for Children-Second Edition) [39, 46], published in 2004, was designed to be applied to children between 2 years, 6 months and 12 years, 6 months. It consists of 18 subtests and measures cognitive and processing capabilities.
The KTEA-3 (Kaufman Test of Educational Achievement-Third Edition) [39, 47], published in 2014, is an academic skills test designed for people between 4 years, 0 months and 25 years, 11 months. This test consists of 19 subtests, applicable according to the educational level of the subject.
The PISA (Programme for International Student Assessment) 2018 test [48], whose most recent results were published in 2019 [52], is aimed at assessing the capabilities in reading, mathematics and science of 15-year-old students. The program itself was created to strengthen the abilities of these students, so this test is supportive for this purpose.
The TIMSS (Trends in International Mathematics and Science Study) [49, 50] is designed to assess the capabilities of students in mathematics and science. It was last applied in 2015 and included students in grades 4, 8 and 12. This assessment is similar to the PISA test.
For more information on the different IQ tests available, including the WISC, the authors of [35] provide an interesting and thorough review.
Nowadays, computer-based systems applied to educational areas consider different learning-teaching processes, which implement intelligent models. Just a few systems focus on aspects like reasoning, but they are generating relevant data for decision making [9, 23]. This fact has motivated the search for new formal models of knowledge representation, alternative solutions, cognitive processes, skills, and so on.
Educational games have been proposed to improve children’s knowledge and, at the same time, to evaluate how children play and interact with the game while unconsciously developing cognitive aptitudes.
In this work, we focus on determining how a child solves analogical reasoning problems using an augmented reality version of the Matrix reasoning test. Following our approach, and based on the neuropsychologist’s expertise, it is possible to transform the qualitative aspects that he/she notes into quantitative data that allow determining how well the child solves the test. This process allows detecting several issues and helps both parents and pedagogues by providing data to guide educational strategies. In this sense, this work only addresses analogical reasoning; nonetheless, in future work we intend to completely automate the WISC test, to show a more plausible interaction while also reducing application time.
Therefore, in this research, we implement an intelligent quantitative model to assess a specific qualitative intelligence scale in children from 5 to 8 years old, based on augmented reality and the well-known WISC-IV test.
The WISC-IV test
In the following, we list the main pros and cons of the WISC-IV test mentioned in the literature. The works presented in [36–40] analyze different versions of this test. We summarize these facts, which support the choice of this test over the other options, as we explain hereafter.
Among the most relevant advantages of the WISC test are:
The test provides seven different scores and has a strong conceptual background, aligned with recent studies in intelligence.
The test offers clear guidelines for the analysis of aptitudes in intraindividual terms.
The manual contains base rate data, helpful to separate and interpret results, and useful to discriminate between statistically significant and clinically significant results.
The evaluation is useful as a tool for identifying strengths and weaknesses in a subject, globally or with respect to certain aptitudes.
However, although widely used and considered to be a very reliable evaluation, the WISC test is not exempt from criticism. The most relevant weaknesses in the WISC-IV test are:
It is hard to quantify the clinical sense of the test results.
The test contains non-psychometric features.
This test cannot be considered a neuropsychological test by itself, since it does not assess several capabilities.
Test results are still subject to interpretation, which might lead to misdiagnosis.
The main criticism concerns how, or even whether, the detected patterns will be useful for instructional planning.
This version of the test is less sensitive to learning difficulties, since the information and arithmetic subtests were relegated to optional application.
The version also lacks the picture arrangement and the mazes subtests, which allowed for the evaluation of visual-spatial intelligence and were also helpful in the observation of perseverance and reaction to frustration.
Working memory is considered incompletely evaluated, since the WISC-IV test only includes auditory tests, lacking non-verbal and spatial memory tests.
The test requires a long administration time, which is exacerbated by the fact that many critics consider it a redundant test, with too much emphasis on general intelligence and with some subtests being too quantitative.
Cultural and linguistic differences in the assessed population play the most relevant role in the results. Evaluating bias is a must for any similar test. In this sense, verbal intelligence quotient (VIQ) and performance intelligence quotient (PIQ) information made it possible to observe non-verbal abilities in bilingual groups, which is no longer possible, given the lack of this information in this version.
The comprehension subtest, as well as the story subtest, were also helpful in the evaluation of social competence; given the changes in this version of the test, social skills are only measurable in verbal tests.
In general, not only the WISC test, but most IQ tests, are criticized because of their dependence on verbal, quantitative, and performance scales, which leads to the need of non-verbal tests.
The WISC test was originally developed to be a traditional paper-based test. It is, in fact, part of a complete test set, applicable to subjects in different age ranges [32]: 1) Wechsler Preschool and Primary Scale of Intelligence, for children between 2 years, 6 months and 7 years, 7 months; 2) Wechsler Intelligence Scale for Children (WISC-IV), for children between 6 years, 0 months and 16 years, 11 months; and 3) Wechsler Adult Intelligence Scale, for subjects between 16 years, 0 months and 90 years, 11 months. In recent versions, complementary software, such as the Scoring Assistant software and the Report Writer, was developed. The former is a tool that automatically converts raw scores to scaled scores and also provides certain analyses; the latter was intended to be a report and analysis tool. Recently, with the fifth version (WISC-V), the Q-interactive platform was incorporated, which allows for the application of subtests and more customized batteries, along with other functionalities. Nonetheless, these tools were designed to facilitate the application and evaluation of the test, mostly in its original form, but they do not correct the most relevant problems and weaknesses detected in the test, such as the difficulty of quantifying results, the need for non-verbal tests, or the presence of subjective interpretations.
For this work, the WISC-IV test was selected because it is an exhaustive and widely used method for the evaluation of the capabilities and cognitive abilities of the kids, showing and measuring interactions in a controlled environment. In addition, we consider that the presented intelligent augmented reality application will be of help to solve the issues related to ambiguous, subjective evaluations, given the qualitative aspects considered, along with the human interpretations of the traditional WISC-IV test. Also, the fact that we consider a non-verbal subtest, i.e., the Matrix reasoning subtest, helps to avoid the aforementioned issues in verbal tests.
We have mentioned that, in this research, we apply the WISC-IV test to children from 5 to 8 years old (for this age range, the expert confirmed there is no problem in applying the same test to all the children). Nonetheless, one question that might arise is what needs to be done to apply this intelligent assessment model to different age ranges. The model can be applied to other age ranges, but it must be adjusted, since variables such as time and movements are calibrated according to the maturity of analogical reasoning that the expert expects for each age. Hence, we can argue that the model could remain the same, but its ranges must be recalibrated.
As a general roadmap to modify our model to be applied to other age groups, we can consider the following steps:
1) Establish a relationship with an expert in the field of psychology or neuropsychology.
2) Select the correct version of the WISC-IV test according to the age and maturity to evaluate (as we mentioned in this section, the WISC test is applicable to subjects of different age ranges, having different versions depending on the age group).
3) The expert needs to give feedback on the test and to help focus on the parameters selected.
4) The expert selects a set of children with a complete psychological record to check their progress from a baseline. This is a difficult task because of the availability, confidentiality, and the facility to carry out several tests. Besides, it implies a commitment from the parents and children involved in the project.
5) Carry out the process for tuning the ranges of the membership functions used by the fuzzy model. This task is carried out with the help of the expert and contributes to validating that the model’s output is helpful in the evaluation of analogical reasoning.
With this research, we show an opportunity to reduce the solving time while obtaining quantitative data, without any direct intervention, on how many interactions the child carries out with the test application and how they are carried out.
Related work
Research related to the development and evaluation of analogical reasoning has ranged from the area of Psychology to the area of Computer Science. Research in the field of psychology [9–11] focuses on developing assessments that show improvements in analogical reasoning through the use of various training programs and technological tools. Studies that used technological tools, such as digital games [12], touchscreens [14, 16], and eye tracking [15], applied them mainly to demonstrate that, after their use, users improved their analogical reasoning abilities. An emerging technology that is gaining traction is Augmented Reality, thanks to the decreasing cost of equipment and the widespread use of mobile devices; this technology improves the perception of, knowledge of, and interaction with the real world [16]. The application of Augmented Reality in several types of research has focused on the training and/or rehabilitation of cognitive abilities [17, 25]. The results obtained served as evidence of the underlying representations in memory and the processes involved [21], as well as of the use of intuitive spatial abilities and motor and gestural actions [19]. On the other hand, the area of computing has focused on the development of analogical reasoning models using various techniques, such as Fuzzy Logic [22] and Neural Networks [23], based on the underlying processes identified in this kind of reasoning, to model a variety of psychological phenomena and acquire information on human cognition [24]. Another technique applied to the representation of knowledge, fuzzy reasoning, and learning is the theory of Petri nets [25]. The combination of Petri nets with Fuzzy Logic becomes an effective tool for the representation of uncertain knowledge about the state or behavior of a system [26–29].
Models of different kinds of Petri nets have been used to represent knowledge, reasoning mechanisms for behavior analysis, as well as modeling cognitive-affective states, with the objective of analyzing decision making of users; where their implementation has been able to capture the uncertainty of the system [29] or to explain the behavior of a complex system [27, 28].
The proposed system
In this section, the methodology followed for the design and development of the proposed system is described, where the response movements in children between 5 and 8 years old are studied. The goal is to analyze the moment they solve visual-perception analogical reasoning exercises and, then, to infer a level of performance of this task. In this way, the resulting data of the exercise carried out to obtain the performance of the cognitive factors is evaluated, and these factors are implicit in the analogical reasoning capabilities, based on visual perception. An essential stage of this work is to determine the set of exercises related to analogical reasoning, based on visual perception, mapping certain characteristics in the mobile application to solve the exercises. It is possible to evaluate the analogical reasoning capabilities based on the movements made by the child (in this work, we study children between 5 and 8 years old), through an augmented reality mobile application, by using an inference subsystem modeled with a Fuzzy Petri net, in order to improve the performance level of the target child (see Fig. 1).

Block diagram for the proposed system.
The methodology is based on three stages: 1) selection of analogical reasoning exercises based on visual perception, 2) structure of the augmented reality mobile application, and 3) assessment of the analogical reasoning capabilities based on visual perception.
The Wechsler Intelligence Scale is used in the first stage of this work. As we previously mentioned, we applied one part of the WISC-IV to children between 5 and 8 years old, since applying the complete test would take a complete week for each child, considering it is an individually administered IQ test. The perceptual reasoning index considers 4 subtests (one of them, optional), from which the one used is the matrix reasoning subtest.
In its official form, the WISC-IV is applied using printed books. The matrix reasoning subtest consists of presenting the child with partially filled grids in one of the printed books; the child has to select the item he/she considers the most appropriate to complete the matrix (an example of how this subtest works is shown in Fig. 2 (left)). This subtest considers several cognitive factors, such as the child’s capacity for perceptual organization, abstract reasoning, attention to detail, attention and concentration, visual processing, and so on. Also, it is based on images and linked to motivation and persistence, capacity to accomplish a goal, capacity for trial-and-error experiments, and visual acuity [51]. Four aspects are analyzed and implemented in the augmented reality mobile application: visual processing, organization based on visual perception, spatial capacity, and discrimination based on visual perception (these aspects are shown in Fig. 2 (right)).

Left: An example of how the traditional analogical reasoning subtest works. Right: Four aspects of analogical reasoning based on visual perception, analyzed and implemented in the application.
Most related works focus only on the digitization of the tests, which accelerates different processes and provides certain advantages; some examples are [9, 23]. However, they do not obtain qualitative data about how the test is solved by the student; hence, the emphasis is put more on improving the test solving experience, rather than on producing a more reliable assessment.
In the case of pedagogues, they use qualitative data from the processes executed when students solve tests or even the way they execute them. These data are subjective because they are obtained from observation, based on the pedagogue’s experience. Finally, the pedagogue analyzes both qualitative data and test responses to carry out the analysis and evaluation. In this work, an additional contribution is to obtain quantitative data from the interaction and correlation of ways of test solving and the same responses, generating information that indicates how a test is solved. The variables considered for this quantitative assessment were verified by an expert in neuropsychology (Dr. Claudia Mestizo), who collaborated in observing when a kid solved the test, taking some notes on the variables proposed in this paper and validating them. This contribution provides an intelligent quantitative model, whose information is based on an automated evaluation and feedback data for the student and for the teacher; the main goal is to present tools based on instructional planning, which can correct and avoid possible learning deficiencies.
In the second stage of this work, the mobile application implements an adaptation of the matrix reasoning subtest: supported by augmented reality, interactions between a number of possible on-screen answers (the items) and the physical application book are enabled. In the application, the child can execute the following movements: 1) translation, where the item is moved from one place to another; 2) scaling, where the size of the item is changed; and 3) rotation, where the item is turned around an axis or center (see Fig. 3).

Structure of the Augmented Reality Mobile Application.
In order to quantify this subtest, the number of movements the child executes to solve the exercises is counted. Also, the time the child takes to solve it is measured. Finally, the quality of the answer is assessed and measured. These three measurements constitute the aforementioned variables validated by the expert. The number of movements is captured by the touchscreen and measured by the fuzzy inference subsystem. The range of available movements is shown in Table 1. Moreover, the time taken by the child for the resolution of the exercise (in seconds) is registered and evaluated using a range of values presented in Table 2. Finally, each virtual model receives a score depending on the quality of the answer (see Table 3).
Range of allowed movements
Numeric values related to the variable Time
Numeric values related to the variable Score
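The three measurements described above (movement counts, solving time, and answer score) can be pictured as one record per exercise. The following is only a minimal sketch under stated assumptions: the names `Movement`, `ExerciseAttempt`, `record`, and `total_movements` are hypothetical and do not come from the application described in this work, and the actual ranges of Tables 1 to 3 are not reproduced here.

```python
from dataclasses import dataclass, field
from enum import Enum


class Movement(Enum):
    """The three movements the child can execute on an on-screen item."""
    TRANSLATION = "translation"
    ROTATION = "rotation"
    SCALING = "scaling"


@dataclass
class ExerciseAttempt:
    """Raw measurements for one exercise (hypothetical structure)."""
    # One counter per movement type, all starting at zero.
    movements: dict = field(default_factory=lambda: {m: 0 for m in Movement})
    elapsed_seconds: float = 0.0  # time taken to solve the exercise
    score: float = 0.0            # quality of the selected answer

    def record(self, movement: Movement) -> None:
        """Count one movement reported by the touchscreen."""
        self.movements[movement] += 1

    def total_movements(self) -> int:
        """Total number of movements executed during the exercise."""
        return sum(self.movements.values())


# Usage: two translations and one rotation before answering.
attempt = ExerciseAttempt()
attempt.record(Movement.TRANSLATION)
attempt.record(Movement.TRANSLATION)
attempt.record(Movement.ROTATION)
attempt.elapsed_seconds = 42.0
attempt.score = 1.0
```

Keeping one counter per movement type, rather than a single total, preserves the distinction between translation, rotation, and scaling that the fuzzy inference subsystem takes as separate inputs.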
To establish the ranges of each variable, we interviewed the expert to observe how she applied the test. This way, she helped us determine these ranges. Further, according to an initial test applied to the kids, we fine-tuned the intervals for each variable. Besides, it is necessary to remark that the expert also verified and agreed with the output model.
The involvement of the expert allowed us to include these variables, which, although not part of the standard test, are frequently considered when a child is being assessed with this test. For instance, when a child makes random movements looking for an answer and spends a long time, whether the question is answered correctly or not, this behavior implies that the child has some areas of opportunity to improve his/her reasoning. Thus, this work is focused on capturing similar behaviors.
It is worth mentioning that, to the best of our knowledge, there are no references on how to define these variables or their values. Consequently, in order to take into account the neuropsychologist’s expertise, we decided the process of determining these ranges should be the one described above.
Additionally, it must be observed that, as we mentioned in Section 3, the WISC-IV test itself has different versions depending on the age group and that this test undergoes a formal validation before its application to a different population from that for which it was designed. Hence, it would be expected that the process of determining and fine-tuning the variables needs to be repeated for each new population.
Finally, the third stage is the evaluation of the analogical reasoning capability, based on visual perception. It considers movements on the objects for providing analogical solutions. The movements are captured by the mobile application, and a Fuzzy Petri net models the behavior. For this net, a rule base is designed, where a condition is a net place, and a transition is an activity. On the one hand, the input variables are the movements made on each object, the score related to each answer, and the time required to finish an activity (see Table 4). On the other hand, the output (control) variables represent the performance level of the cognitive factors related to each exercise (see Table 5).
Description of the state variables
Description of the control variables
The membership functions for the linguistic labels require data normalization, which maps all values onto a common range and prevents variables with larger magnitudes from carrying more weight than those with smaller ones. The Min-Max technique was used (see Equation (1)), which preserves the relations among the original data on a given scale, where A is the original data, [C, D] is the predefined range (in this work, [0, 1]), and A′ is the resulting normalized data. As can be seen in Tables 4 and 5, the values used in Equation (1) are numeric, which makes computing A′ straightforward; hence its value is also numeric. If the context were different, i.e., if any of the variables were not numeric (for example, if A were a string variable), its content would first have to be translated to a numeric value before applying the normalization process suggested in this work.
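As an illustration, the following minimal Python sketch shows Min-Max rescaling to a target range [C, D], assuming Equation (1) takes the usual form A′ = (A − min A) / (max A − min A) · (D − C) + C. The function name `min_max_normalize` and the sample solving times are hypothetical and are not taken from the study.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale a list of numbers to [new_min, new_max] (Min-Max)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal, so the formula is undefined;
        # this sketch maps every value to new_min (an assumption).
        return [new_min for _ in values]
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


# Hypothetical solving times in seconds (illustrative only).
times = [12.0, 30.0, 48.0]
normalized = min_max_normalize(times)
print(normalized)
```

The smallest value maps to C and the largest to D, and the ordering and relative spacing of the original data are preserved.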
In this work, we use a triangular membership function (see Equation (2)) for the fuzzification of the input and output variables used in the model. The parameters a, b, and c (with a < b < c) determine the x coordinates of the three corners of the underlying triangle.
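A triangular membership function with corners a < b < c can be sketched as follows. This assumes the standard piecewise-linear definition (zero outside [a, c], rising linearly to 1 at b, then falling linearly); the function name and the parameter values in the usage lines are hypothetical, not those of Figs. 4 and 5.

```python
def triangular(x, a, b, c):
    """Degree of membership of x in a triangle with corners a < b < c."""
    if not a < b < c:
        raise ValueError("corners must satisfy a < b < c")
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)  # rising edge
    return (c - x) / (c - b)      # falling edge


# Hypothetical "medium" label on the normalized [0, 1] scale.
print(triangular(0.25, 0.0, 0.5, 1.0))
print(triangular(0.5, 0.0, 0.5, 1.0))
```

A normalized value can belong to two adjacent labels at once, each with a partial degree of membership, which is what the rule base later combines.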
Following Equation (2), the triangular membership function is used for all the linguistic labels: 1) translation, rotation and scaling (Fig. 4.a); 2) time (Fig. 4.b); and 3) score (Fig. 4.c).

Triangular membership functions for the labels: (a) translation, rotation and scaling; (b) time; and (c) score.
Equation (2) is also used for computing the membership function for the output variable defined in Fig. 5, which is used for discrimination based on visual perception, spatial capability, visual processing, and organization based on visual perception.

Triangular membership function for the output linguistic labels.
One or several output variables are reached, depending on the input variables, while the net tokens are based on the score of the selected answer and the time required for solving the exercise. The architecture of the Fuzzy Petri net is shown in Fig. 6.

Fuzzy Petri net for the inference subsystem.
The input places (translation, scaling, and rotation) take two tokens each (score and time), and they are fired depending on the number of movements. The output places (discrimination, visual processing, spatial capability, and organization) take one, two, or three tokens. For example, the organization based on visual perception is reached when the translation transition fires.
In this section, the basic elements of the mobile application, the evaluation of the analogical reasoning, and a case study are presented.
Basic elements of the mobile application
The mobile application was implemented using the Unity and Vuforia software tools, which enable the development of augmented reality applications. Different markers are identified in the images using the mobile device’s camera; these markers are necessary for the adequate deployment of the 3D object models. For example, in Fig. 7, 9 markers are identified simultaneously since there are 9 objects on the screen at the same time: 5 numbered objects at the bottom of the screen, which are the possible solutions for the exercise and can be manipulated, and 4 more objects at the top of the screen, which correspond to the analogical exercise, are ordered according to the matrix subtest, and indicate the position where the solution ought to be placed.

Example of markers and their respective 3D models in Unity for the matrix reasoning subtest.
In the augmented reality application, each possible answer has a score, according to the quality of the answer (see Table 3). Also, the number of movements indicates how difficult it was for the child to solve the exercise (see Table 6). Finally, the exercise has a conditioned solving time (see Table 2).
Numeric values related to the movements
If the object moved to the space marked with the question mark is correct, then a sound of applause is heard; otherwise, if it is incorrect, then a sound indicating the error is played. These sounds represent instant feedback for the child. Results are sent to a Web server, where data is stored and normalized for the analogical reasoning evaluation.
The evaluation of the analogical reasoning is executed through the Fuzzy inference subsystem. The membership functions shown in Figs. 4 and 5 are normalized. According to these functions, the combinations for the linguistic labels provide a rule base to evaluate the performance level of the cognitive factor. So, the rule base (see Table 7) defines the fuzzy rules, following the structure shown in Equation (3).
Rule base for the proposed system
The rule base takes into account only the combinations that generate three values in the output variable: low, medium, and high. The Fuzzy Petri net allows us to identify the input places that fire the transitions, and the Mamdani inference method is implemented, where “min” is the implication function, “max” is the aggregation function, and “centroid” is the defuzzification method.
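The following self-contained Python sketch illustrates Mamdani-style inference with “min” implication, “max” aggregation, and a sampled centroid defuzzification. It is not the system's rule base: the label parameters in `LABELS`, the three rules in `RULES`, and the use of “min” to combine the antecedents are all assumptions made for illustration, and the inputs are assumed to be already normalized to [0, 1].

```python
def triangular(x, a, b, c):
    """Triangular membership with corners a < b < c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


# Hypothetical label parameters on the normalized [0, 1] scale.
LABELS = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}

# Hypothetical rules: (movements, time, score) labels -> output label.
RULES = [
    (("low", "low", "high"), "high"),
    (("medium", "medium", "medium"), "medium"),
    (("high", "high", "low"), "low"),
]


def fuzzify(x):
    """Membership degree of x in every linguistic label."""
    return {name: triangular(x, *abc) for name, abc in LABELS.items()}


def mamdani(movements, time, score, steps=200):
    """Return the defuzzified output in [0, 1], or None if no rule fires."""
    inputs = [fuzzify(movements), fuzzify(time), fuzzify(score)]
    fired = []
    for antecedent, consequent in RULES:
        # Assumption: antecedents are combined with "min" (fuzzy AND).
        strength = min(mu[label] for mu, label in zip(inputs, antecedent))
        if strength > 0.0:
            fired.append((strength, consequent))
    if not fired:
        return None
    num = den = 0.0
    for i in range(steps + 1):
        y = i / steps
        # "min" clips each consequent; "max" aggregates the clipped sets.
        mu = max(min(s, triangular(y, *LABELS[lab])) for s, lab in fired)
        num += y * mu
        den += mu
    return num / den  # centroid of the aggregated output set


# Few movements, short time, high score (hypothetical normalized inputs).
print(mamdani(0.0, 0.0, 1.0))
```

In this sketch, an input combination that matches no rule returns `None`, mirroring the fact that only some label combinations are covered by a rule base.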
For the evaluation of the fuzzy-system behavior, the first exercise of the WISC-IV matrix subtest is used (see Fig. 8); it serves as a guideline for solving the analogical problem.

First exercise of the WISC-IV test.
A challenging issue in this research was the number of participants, which is a common issue in similar trials, as the literature shows [53, 54]. The selected children had previously been working with the expert, which made it possible to measure the performance of the proposed model reliably against a complete psychological record. Recruitment was also difficult because of availability, confidentiality, and the practicalities of carrying out several tests, all of which required a commitment from the parents and children involved in the project. In fact, 9 children were initially recruited for this work, but this number had decreased to five by the time the tests were applied.
The five tests were applied to different Spanish-speaking children using both the physical sheets and the mobile application. The fact that the native tongue of the children is Spanish is not expected to affect the results since the matrix reasoning subtest is a non-verbal evaluation. The results for the physical sheets are shown in Table 8, where it can be seen that only two children solved the test correctly (they pointed at the figure or mentioned its number). In this test, the children are not allowed to move the notebook with the exercises, but they can move their head or bring it closer, depending on the psychologist’s subjective criteria.
Results obtained from the test
The same five children used the augmented reality mobile application to solve the first exercise. For this version, the neuropsychologist recommended a different figure, using fish instead of butterflies (see Fig. 9).

Graphical results obtained using the application.
It is important to emphasize that, to evaluate the intelligent quantitative assessment model, we consider the appropriate comparison to be against the original paper-based test rather than against other applications, since, to the best of our knowledge, there is no other quantitative assessment application developed to improve the WISC tests.
In the test, each fish receives a different score according to the correctness of the answer in terms of its features, for example, the shape or the stripes of the fish. The results obtained using the mobile application are shown in Table 9. The same children answered correctly on both the physical sheet and the mobile application. These values are input to the Fuzzy Petri net and evaluated by the inference subsystem, generating a cognitive factor level deduced from the solving of the exercise.
Results obtained from the application
The final results obtained from the Fuzzy subsystem are shown in Table 10, where an unexpected behavior is reported for test number 3. Table 8 shows that the correct answer was chosen; however, the process required a considerable number of movements and a time outside the allowed range, so the child’s spatial capability is evaluated at a medium level. Moreover, the organization based on visual perception is at a low level, which is due to the number of translation movements. In test number 2, the child required a small number of movements and a time inside the range, and chose the correct answer, which yields cognitive factors at a high level of performance. Additionally, tests number 1 and 4 yield cognitive factors at a low level: although the number of movements is medium or low, the time is high, and the answers were incorrect.
Results obtained from the Fuzzy inference subsystem
Test number 5 yields cognitive factors at a medium level: the answers are incorrect but very close to the correct ones, the number of movements is in the intermediate range, and the time is barely inside the allowed range.
It is important to remember that the expert accompanied the whole process and guided relevant decisions, so the above results were validated by her.
In this work, we propose the analysis and design of a system based on augmented reality, a mobile application, and a Fuzzy inference subsystem to assess the performance levels of the cognitive factors implicit in visual-perception analogical reasoning. The proposed system uses a rule base and a Fuzzy Petri net. The analysis of a case study, in which 5 children solved both the traditional matrix reasoning subtest and its augmented reality implementation, showed that it is possible to implement an intelligent quantitative evaluation of a specific qualitative intelligence scale for analogical reasoning, applied to children between 5 and 8 years old. As future work, the obtained results should be sent through the application to a psychologist for a more detailed analysis of the cognitive levels, in order to identify cognitive or minor problems. Increasing the number of children tested is not a trivial task, and the proper strategy for doing so is being analyzed and developed as part of the future work as well. Although applying the model to different age groups is outside the scope of this work, we recognize that this is an important step, so it will also be part of the future work. In addition, the evaluation can be extended with other sets of exercises, such as any of the other 35 exercises in the WISC-IV test, to evaluate analogical reasoning in a different age group. Carrying out these changes would allow us to contrast results.
Acknowledgments
We would like to thank Dr. Claudia Mestizo, an expert in neuropsychology from the Therapeutic Center for Children with Autism Spectrum Disorder in Xalapa, Veracruz, Mexico, for her help in the definition of the variables and the validation of the results obtained.
