Abstract
Fuzzy logic is a branch of artificial intelligence that has been used extensively in developing fuzzy systems and models. These systems usually derive their predictions from an underlying mathematical model, in this case a linear regression model. Interval type 2 Gaussian fuzzy logic uses a Gaussian upper membership function and a Gaussian lower membership function, with a footprint of uncertainty between them. The solutions predicted by interval type 2 fuzzy systems depend on the training of the resulting linear regression model, and the training data are extracted from the expert knowledge stored in the systems’ knowledge bases. Variances in that expert knowledge therefore affect the overall accuracy of the linear regression predictive models, because they change the training data. This research establishes the extent to which variances in knowledge bases affect the predictive accuracy of these models, with a case study on knowledge bases used to predict learners’ knowledge level abilities. The calculated linear regression predictive models show that every variance in the knowledge base produces a change in the linear regression predictive model, with the intercept changing by a factor commensurate with the variances and their respective weights in the knowledge bases.
Introduction
Fuzzy logic is a branch of artificial intelligence that mimics human reasoning by providing an array of reasoning possibilities, and it is therefore used in developing intelligent systems and models. An interval type 2 Gaussian fuzzy set has a Gaussian upper membership function and a Gaussian lower membership function, with a footprint of uncertainty between them, such that any input cuts both the upper and the lower membership functions [4, 22].
For fuzzy systems and models to offer artificial intelligence solutions, general mathematical predictive models have to be trained so that the systems can make intelligent predictions. In this case, linear regression predictive models were used. To train a linear regression predictive model, a fuzzy system needs a set of inputs and their corresponding outputs. These outputs are estimated from the expert knowledge stored in the knowledge base. Thus, different knowledge bases will tend to give different output values for similar input values [12, 28].
References [1, 6] explained that linear regression mathematical models are trained on the inputs to the fuzzy systems and their corresponding outputs. Hence, the interval type 2 Gaussian fuzzy linear regression model referred to in this paper is a linear regression predictive model trained to make predictions for these Gaussian systems.
In this paper we examine how variances in knowledge bases of interval type 2 Gaussian fuzzy logic systems affect these systems’ output values, and their resultant linear regression predictive models when placed under similar conditions.
To conduct this research, three interval type 2 Gaussian fuzzy systems for predicting students’ knowledge level abilities were developed. The three systems were identical except for the expert knowledge stored in their knowledge bases, which is the source of the variances studied here. The three knowledge bases corresponded to type 1 test, type 2 test and type 3 test respectively, as suggested by [5, 14].
To predict a student’s knowledge level ability, the student first took an assessment test. The test score and the time taken to finish the test were then used as inputs to the fuzzy system, which predicted the student’s knowledge level ability. The assessment test score was expressed out of 100%, the time to complete the test was measured out of 20 minutes, and the knowledge level output was rated out of 100%.
Fifty (50) random inputs were run through each of the three interval type 2 Gaussian fuzzy systems, and the resulting knowledge level abilities were documented. The collected data were used for training and validating the linear regression predictive model for each test type at a ratio of 70:30 respectively, as advised by [13, 23]. The behavior of the linear regression models was then analyzed against the variances in the three knowledge bases.
Methodology
Mamdani interval type 2 Gaussian fuzzy systems for type 1 test, type 2 test, and type 3 test were developed using the Juzzy online fuzzy toolkit. To ensure that only the knowledge bases varied, the input and output membership functions of the three systems were identical, and only the expert knowledge stored in the knowledge bases differed, as suggested by [5, 21].
As advised by [18, 28], each interval type 2 Gaussian fuzzy logic system was developed using the following five modules:
Fuzzification module
In this module, we defined the linguistic variables and their terms, and then set the degrees of membership. Our linguistic variables and terms were as follows:
Input linguistic variables and linguistic terms
Test score = {low, fair, average, good, excellent}
Time = {short, average, long}
Output linguistic variables and linguistic terms
Knowledge level = {none, shallow, deep, expert}
Using Equations (1) and (2), Gaussian membership functions with varying variances were plotted as illustrated by [9, 25].
Using Equations (1) and (2), and with careful estimates of the midpoint and spread of each membership function, non-singleton Gaussian MFs were defined for the antecedents and consequents, each having an upper MF and a lower MF.

Fig. 1. Assessment test score membership function.

Fig. 2. Time taken on an assessment test membership function.

Fig. 3. Knowledge level membership function.
Figures 1, 2 and 3 show the membership functions of the assessment test score, the time taken to complete the assessment, and the knowledge level. All three test types used the same membership functions for consistency, so that only the knowledge bases varied. Assessment test score and time taken formed the input membership functions, while knowledge level formed the output membership function. Assessment test score ranges from 0 to 100 marks, time taken ranges from 0 to 20 minutes, and knowledge level ranges from 0 to 100%, as guided by [1].
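An interval type 2 Gaussian membership function of the kind described above can be sketched as follows. This is only an illustration: it assumes a fixed mean with an uncertain spread, which is one common construction and not necessarily the form of Equations (1) and (2), and the mean and spread values are hypothetical rather than the ones plotted in the figures.

```python
import math


def gaussian(x, mean, sigma):
    """Type 1 Gaussian membership grade of x."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


def it2_gaussian(x, mean, sigma_lower, sigma_upper):
    """Return the (lower, upper) membership grades of x.

    Assumes a fixed mean and an uncertain spread with
    sigma_lower < sigma_upper; the gap between the two curves
    is the footprint of uncertainty.
    """
    return gaussian(x, mean, sigma_lower), gaussian(x, mean, sigma_upper)


# Hypothetical "average" test-score term centred on 50 marks.
lower, upper = it2_gaussian(60.0, mean=50.0, sigma_lower=8.0, sigma_upper=12.0)
print(f"lower={lower:.3f}, upper={upper:.3f}")
```

Any input therefore yields a pair of grades, one from each curve, which is what allows an input to cut both the upper and the lower membership function.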
As informed by [16], the fuzzy system has two input linguistic variables (assessment test score and time taken) with 5 and 3 linguistic terms respectively. This gives 5 × 3 = 15 rules for every type of assessment test. Thus the three types of assessment test (type 1, type 2, and type 3) have a total of 45 rules, each of the form given in Equation (3).
Where x_i and x_l are inputs.
In developing the knowledge bases, expert knowledge was used to guide the relationship between the antecedents and the consequents of each rule using Equation (3). The expert knowledge was then summarized as 45 rules across the three test types’ knowledge bases in Table 1.
Table 1. Knowledge base for the various test types
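The rule count can be reproduced by enumerating every combination of input terms. The sketch below only lists the antecedent pairs; the consequent attached to each pair comes from the expert knowledge in the knowledge base and is not reproduced here.

```python
from itertools import product

test_score_terms = ["low", "fair", "average", "good", "excellent"]
time_terms = ["short", "average", "long"]
test_types = ["type 1", "type 2", "type 3"]

# One rule antecedent per (test score term, time term) pair.
antecedents = list(product(test_score_terms, time_terms))
rules_per_test = len(antecedents)
total_rules = rules_per_test * len(test_types)

print(rules_per_test, total_rules)  # 15 45
```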
Authors [8, 29] explained that the inference engine is where the rules stored in the knowledge base are fired to obtain the desired fuzzy outputs. The rule antecedents were joined using logical AND, implemented as the min t-norm, using Equations (4), (5) and (6).
Where * represents the min operation
Where * is a min t-norm
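As a minimal sketch of how a rule fires under the min t-norm, the function below combines the (lower, upper) membership grades of the two inputs into a firing interval. The function name and the grade values are hypothetical and serve only to illustrate the operation.

```python
def firing_interval(score_grades, time_grades):
    """Combine two (lower, upper) membership pairs with the min t-norm.

    Returns the (lower, upper) firing strength of one rule.
    """
    score_lower, score_upper = score_grades
    time_lower, time_upper = time_grades
    return min(score_lower, time_lower), min(score_upper, time_upper)


# Hypothetical grades for one test score term and one time term.
f_lower, f_upper = firing_interval((0.46, 0.71), (0.30, 0.55))
print(f_lower, f_upper)  # 0.3 0.55
```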
Following the advice of authors [9, 27], a Karnik–Mendel (KM) type reducer was used to reduce the IT2FS to a T1FS (type 1 fuzzy system). Equations (7) and (8) were used to achieve this.
Where
R1 (rule 1) provides [
R2 (rule 2) provides [
R3 (rule 3) provides [
R4 (rule 4) provides [
Centroid type reduction was used to perform defuzzification, that is, to obtain a crisp value from the fuzzy values. It involves finding the center between the yl (minimum) and yr (maximum) values. The centroid value was therefore the average of Equations (7) and (8), that is, (yl + yr)/2.
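The type reduction and defuzzification steps can be sketched as below. Note the substitution: instead of the iterative KM procedure, this sketch tries every switch point exhaustively, which reaches the same end points yl and yr for a small rule base. The firing strengths and consequent intervals are hypothetical, and the function names are illustrative.

```python
def weighted_average(weights, values):
    """Weighted mean of values; the weights must not all be zero."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def end_point(firing, points, leftmost):
    """Search every switch point for one end of the type-reduced interval.

    firing: list of (lower, upper) firing strengths, one per rule.
    points: consequent end points (all left or all right), one per rule.
    Assumes at least one rule has a positive firing strength.
    """
    order = sorted(range(len(points)), key=lambda i: points[i])
    candidates = []
    for k in range(len(order) + 1):
        weights = []
        for rank, i in enumerate(order):
            lower, upper = firing[i]
            # Left end: upper strengths on the smallest points pull the mean down.
            # Right end: upper strengths on the largest points pull it up.
            use_upper = (rank < k) if leftmost else (rank >= k)
            weights.append(upper if use_upper else lower)
        if sum(weights) > 0:
            candidates.append(weighted_average(weights, [points[i] for i in order]))
    return min(candidates) if leftmost else max(candidates)


def defuzzify(firing, centroids):
    """Return (yl, yr, crisp), where crisp is the midpoint of [yl, yr]."""
    yl = end_point(firing, [c[0] for c in centroids], leftmost=True)
    yr = end_point(firing, [c[1] for c in centroids], leftmost=False)
    return yl, yr, (yl + yr) / 2


# Two hypothetical fired rules with interval consequents.
firing = [(0.30, 0.55), (0.10, 0.40)]
centroids = [(20.0, 30.0), (60.0, 70.0)]
yl, yr, crisp = defuzzify(firing, centroids)
print(f"yl={yl:.2f}, yr={yr:.2f}, crisp={crisp:.2f}")
```

The exhaustive search costs more than KM as the number of rules grows, but with 15 rules per knowledge base the difference is negligible and the logic is easier to check by hand.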
Fuzzy logic experimental results and discussion
As guided by [4, 18], each of the three Gaussian fuzzy systems, one per knowledge base, was run with 50 random input values, and the resulting outputs were recorded. Of the 50 records collected for each test type, 35 were used for training the linear regression model and the remaining 15 for validating it.
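The 70:30 split of the 50 records can be sketched as follows. The text does not say how records were assigned to each set, so the seeded random shuffle here is an assumption, and the integer records are placeholders for the (test score, time, knowledge level) rows.

```python
import random


def split_records(records, train_fraction=0.7, seed=0):
    """Shuffle a copy of the records and split it into training and validation sets."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


records = list(range(50))  # placeholder rows
train, validation = split_records(records)
print(len(train), len(validation))  # 35 15
```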
Type 1 test knowledge level results
A three-dimensional (3D) surface view of knowledge level abilities for type 1 test was extracted as shown in Fig. 4.

Fig. 4. Type 1 test surface view.
Fifty (50) random trials on various acceptable assessment test scores and time taken to complete an assessment test were run on type 1 test Gaussian fuzzy, and their respective knowledge level abilities documented as indicated in Table 2.
Table 2. Experimental results for type 1 test
A three-dimensional (3D) surface view of knowledge level abilities for type 2 test was extracted as shown in Fig. 5.

Fig. 5. Type 2 test surface view.
From the results (surface view), we can deduce that the lower the assessment test score and the longer the time taken to complete an assessment test, the lower the student’s knowledge level. It can also be noted that, for similar test scores and times taken, type 2 test gave slightly higher knowledge levels than type 1 test. This implies that type 2 test was more technical or advanced than type 1 test, which is consistent with the expert knowledge stored in the knowledge bases.
Fifty (50) random trials on various acceptable assessment test scores and time taken to complete an assessment test were run on type 2 test Gaussian fuzzy, and their respective knowledge level abilities documented as indicated in Table 3.
Table 3. Experimental results for type 2 test
A three-dimensional (3D) surface view of knowledge level abilities for type 3 test was extracted as shown in Fig. 6.

Fig. 6. Type 3 test surface view.
From the results (surface view), we can deduce that the lower the assessment test score and the longer the time taken to complete an assessment test, the lower the student’s knowledge level. It can also be noted that, for similar test scores and times taken, type 3 test gave slightly higher knowledge levels than type 2 test and much higher knowledge levels than type 1 test. This implies that type 3 test was more technical or advanced than both type 2 and type 1 tests, which is consistent with the expert knowledge stored in the knowledge bases.
Fifty (50) random trials on various acceptable assessment test scores and time taken to complete an assessment test were run on type 3 test Gaussian fuzzy, and their respective knowledge level abilities documented as indicated in Table 4.
Table 4. Experimental results for type 3 test
References [3, 29] explained that linear regression predictive models can predict values that are either part of their training data or new. Because there is more than one independent variable (assessment test score and time taken) and one dependent variable (knowledge level ability), the multiple linear regression Equation (9), Y = a + b1X1 + b2X2 + ɛ, was used to calculate the linear regression predictive model for this interval type 2 Gaussian fuzzy (IT2GF) system. The resulting model predicts the dependent variable from the independent variables.
Where:
Y = dependent variable.
X = Independent variable(s).
a = Intercept.
b = Slope.
ɛ= Regression residual.
For two independent variables, the slopes b1 and b2 and the intercept a can be calculated using the formulas in Equations (10) to (12) respectively.
Where
References [2, 10] pointed out that training a regression model requires one set of simulation or experimental results for training and another set for validation. The experimental results in Tables 2, 3 and 4 were used for training and validating the multiple linear regression predictive models for test types 1, 2 and 3 respectively. Multiple linear regression was calculated for each test type using Microsoft Excel, to establish the relationships between the independent variables and the dependent variable.
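For readers who want to reproduce the fit outside a spreadsheet, a two-predictor least-squares fit can be sketched as below. It uses the usual closed-form expressions based on sums of deviations from the means, which may be written differently from Equations (10) to (12). The data are synthetic and generated from a known line so the recovered coefficients can be checked; they are not the values in Tables 2 to 4.

```python
def fit_two_predictor_regression(x1, x2, y):
    """Ordinary least squares for y = a + b1*x1 + b2*x2; returns (a, b1, b2)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Sums of squares and cross products of deviations from the means.
    s11 = sum((u - m1) ** 2 for u in x1)
    s22 = sum((v - m2) ** 2 for v in x2)
    s12 = sum((u - m1) * (v - m2) for u, v in zip(x1, x2))
    s1y = sum((u - m1) * (w - my) for u, w in zip(x1, y))
    s2y = sum((v - m2) * (w - my) for v, w in zip(x2, y))
    det = s11 * s22 - s12 ** 2  # zero only if x1 and x2 are perfectly collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    a = my - b1 * m1 - b2 * m2
    return a, b1, b2


# Synthetic check: scores out of 100, times out of 20 minutes.
scores = [10.0, 40.0, 55.0, 70.0, 90.0]
times = [5.0, 12.0, 8.0, 15.0, 3.0]
levels = [10.0 + 0.5 * s - 2.0 * t for s, t in zip(scores, times)]

a, b1, b2 = fit_two_predictor_regression(scores, times, levels)
print(f"a={a:.3f}, b1={b1:.3f}, b2={b2:.3f}")
```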
Type 1 test linear regression training and discussions
On performing multiple linear regression on type 1 test experimental results training data in Table 2, the following results and coefficients were obtained as summarized in Table 5.
Table 5. Type 1 test multiple linear regression results
The R square and Adjusted R square values are 83.77% and 82.75% respectively. This shows that a very high proportion of the variance in the dependent variable can be predicted from the independent variables in this multiple linear regression model. The regression statistics also show an acceptable standard error of 7.97.
A significance F of 2.3206E-13 is far below the accepted threshold of 0.05, implying a strong relationship between the independent variables and the dependent variable. Substituting the coefficients into Equation (9) gave the predictive multiple linear regression Equation (13) for type 1 test knowledge level ability. Each set of independent and dependent values has its own residual error (ɛ), though these were too small to interfere with the predictions.
The standard errors are small: 3.888 for the intercept, 0.049 for test score, and 0.235 for time. These yield acceptable t statistics (coefficient/standard error).
On performing multiple linear regression on type 2 test experimental results training data in Table 3, the following results and coefficients were obtained as summarized in Table 6.
Table 6. Type 2 test multiple linear regression results
The R square and Adjusted R square values are 91.72% and 91.20% respectively. This shows that a very high proportion of the variance in the dependent variable can be predicted from the independent variables in this multiple linear regression model. The regression statistics also show an acceptable standard error of 7.19.
A significance F of 4.89E-18 is far below the accepted threshold of 0.05, implying a strong relationship between the independent variables and the dependent variable. Substituting the coefficients into Equation (9) gave the multiple linear regression Equation (14) for predicting type 2 test knowledge level ability. Each set of independent and dependent values has its own residual error (ɛ), though these were too small to interfere with the prediction outcomes.
The standard errors are small: 2.772 for the intercept, 0.037 for test score, and 0.221 for time. These yield acceptable t statistics (coefficient/standard error).
On performing multiple linear regression on type 3 test experimental results training data in Table 4, the following results and coefficients were achieved as summarized in Table 7.
Table 7. Type 3 test multiple linear regression results
The R square and Adjusted R square values are 91.67% and 91.15% respectively. This shows that a very high proportion of the variance in the dependent variable can be predicted from the independent variables in this regression model. The regression statistics also show an acceptable standard error of 5.48.
A significance F of 5.338E-18 is far below the accepted threshold of 0.05, implying a strong relationship between the independent variables and the dependent variable.
Substituting the coefficients into Equation (9) gave the multiple linear regression Equation (15) for predicting type 3 test knowledge level ability. Each set of independent and dependent values has its own residual error (ɛ), though these were too small to interfere with the prediction outcomes.
The standard errors are small: 2.719 for the intercept, 0.033 for test score, and 0.201 for time. These yield acceptable t statistics (coefficient/standard error). The P values of the intercept and the independent variables are far below 0.001, implying that all the independent variables play a major role in determining the dependent variable.
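The reported regression statistics can be cross-checked with a few lines of arithmetic. The sketch below assumes the standard adjusted R square formula with n = 35 training records and p = 2 independent variables, and computes t statistics as coefficient divided by standard error, using the coefficients and standard errors reported in this paper. The function and variable names are illustrative.

```python
def adjusted_r_squared(r_squared, n_samples, n_predictors):
    """Standard adjustment of R square for the number of predictors."""
    return 1 - (1 - r_squared) * (n_samples - 1) / (n_samples - n_predictors - 1)


def t_statistic(coefficient, standard_error):
    """t statistic as defined in the text: coefficient / standard error."""
    return coefficient / standard_error


r_square = {"type 1": 0.8377, "type 2": 0.9172, "type 3": 0.9167}
for test_type, r2 in r_square.items():
    adjusted = adjusted_r_squared(r2, n_samples=35, n_predictors=2)
    print(f"{test_type}: adjusted R square = {adjusted * 100:.2f}%")

# (coefficient, standard error) pairs for the type 1 test model.
type_1_terms = {
    "intercept": (23.595, 3.888),
    "test score": (0.539, 0.049),
    "time": (-1.385, 0.235),
}
for name, (coefficient, standard_error) in type_1_terms.items():
    print(f"{name}: t = {t_statistic(coefficient, standard_error):.2f}")
```

Under these assumptions the adjusted values agree with the reported 91.20% and 91.15%, and come within rounding of the reported 82.75% for type 1 test.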
References [11, 17] advised that regression models need to be validated to ascertain whether they make reasonable predictions. The 50 sample experimental results for each test type were divided in a ratio of 70:30 for training and validation respectively. To validate our linear regression models, we used the validation data in Tables 2, 3 and 4.
Type 1 test validation experimental data results in Table 2 were used to validate the knowledge level ability for type 1 test multiple linear regression Equation (13). Type 2 test validation experimental data results in Table 3 were used to validate the knowledge level ability for type 2 test multiple linear regression Equation (14). Type 3 test validation experimental data results in Table 4 were used to validate the knowledge level ability for type 3 test multiple linear regression Equation (15).
Type 1 test linear regression model validation
Linear regression model Equation (13) was used to predict the knowledge level abilities for type 1 test using the validation data inputs in Table 2.
Comparisons between the predicted and the experimental knowledge levels for the type 1 test Gaussian fuzzy validation data are summarized in Table 8 and plotted in graph 1. In Table 8 and graph 1, the residuals (experimental values – predicted values) are within acceptable ranges, making the linear regression model Equation (13) valid. The model’s validity is also supported by the R square values in the regression and coefficient results in section 5.1.1.
Table 8. Type 1 test validation data results (showing comparison between experimental results, predicted results and the residual)
Linear regression model Equation (14) was used to predict the knowledge level abilities for type 2 test using the validation data inputs in Table 3.
Comparisons between the predicted and the experimental knowledge levels for the type 2 test Gaussian fuzzy validation data are summarized in Table 9 and plotted in graph 2. In Table 9 and graph 2, the residuals (experimental values – predicted values) are within acceptable ranges, making the linear regression model Equation (14) valid. The model’s validity is also supported by the R square values in the regression and coefficient results in section 5.2.1.
Table 9. Type 2 test validation data results (showing comparison between experimental results, predicted results and the residual)
Linear regression model Equation (15) was used to predict the knowledge level abilities for type 3 test using the validation data inputs in Table 4.
Comparisons between the predicted and the experimental knowledge levels for the type 3 test Gaussian fuzzy validation data are summarized in Table 10 and plotted in graph 3. In Table 10 and graph 3, the residuals (experimental values – predicted values) are within acceptable ranges, making the linear regression model Equation (15) valid. The model’s validity is also supported by the R square values in the regression and coefficient results in section 5.3.1.
Table 10. Type 3 test validation data results (showing comparison between experimental results, predicted results and the residual)
Graphs 1, 2 and 3 showed that the predicted knowledge levels were very close to the experimental knowledge levels for all the test types. This implied that the linear regression models were valid, as they gave knowledge level abilities that could be relied upon.
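The validation step amounts to predicting a knowledge level from each validation input and subtracting it from the experimental value. A minimal sketch follows, using the intercept and slope coefficients reported in this paper for the three models. The input pair (70 marks, 10 minutes) and the experimental value of 50.0 are hypothetical and are not rows from the validation tables.

```python
# (intercept, test score coefficient, time coefficient) per test type.
MODELS = {
    "type 1": (23.595, 0.539, -1.385),
    "type 2": (28.713, 0.656, -1.444),
    "type 3": (48.026, 0.534, -1.44),
}


def predict_knowledge_level(test_type, test_score, time_taken):
    """Evaluate the fitted linear model for one test type."""
    intercept, b_score, b_time = MODELS[test_type]
    return intercept + b_score * test_score + b_time * time_taken


def residual(experimental, predicted):
    """Residual as defined in the text: experimental value minus predicted value."""
    return experimental - predicted


predicted = predict_knowledge_level("type 1", test_score=70.0, time_taken=10.0)
print(f"predicted={predicted:.3f}, residual={residual(50.0, predicted):.3f}")
```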
Discussion
The knowledge base for type 1 test had the linear regression predictive model Equation (13), with the coefficients of intercept, assessment test score, and time being 23.595, 0.539 and −1.385 respectively. Thus, Knowledge level (type 1 test) = 23.595 + 0.539 × test score − 1.385 × time.
The knowledge base for type 2 test had the linear regression predictive model Equation (14), with the coefficients of intercept, assessment test score, and time being 28.713, 0.656 and −1.444 respectively. Thus, Knowledge level (type 2 test) = 28.713 + 0.656 × test score − 1.444 × time.
The knowledge base for type 3 test had the linear regression predictive model Equation (15), with the coefficients of intercept, assessment test score, and time being 48.026, 0.534 and −1.44 respectively. Thus, Knowledge level (type 3 test) = 48.026 + 0.534 × test score − 1.44 × time.
To compare the variances in the knowledge bases, we compared the following test types: type 1 test versus type 1 test, type 1 test versus type 2 test, and type 1 test versus type 3 test. We then captured the change in coefficients against the change in variances (where 0 shows no variance and 1 shows a variance), as illustrated in Table 11.
Table 11. Knowledge base variances for the various test types
Comparing the knowledge base of type 1 test with itself showed no variance, and thus the same linear regression model: Knowledge level (type 1 test) = 23.595 + 0.539 × test score − 1.385 × time.
When we compared the knowledge base of type 1 test with that of type 2 test, there were variances in 5 rules, meaning that 33% of the 15 rules varied. The variance changed the Gaussian fuzzy output values and hence the linear regression model, from Equation (13) to Knowledge level (type 2 test) = 28.713 + 0.656 × test score − 1.444 × time.
The 5 rules that varied shifted the linear regression model, increasing the intercept by 5.121, which is about a 21% increase. There was also a small increase in the test score coefficient. When the test score and time coefficients were held equal to those in Equation (13), the intercept increased by an average of about 6.12 (implying a 25.09% change in intercept). This implies that when a variance in the knowledge base is towards higher output values, the intercept increases by a ratio of roughly the same magnitude as the variance.
Lastly, when we compared the knowledge base of type 1 test with that of type 3 test, there were variances in 11 rules, meaning that 73% of the 15 rules varied. The variance changed the linear regression model to Knowledge level (type 3 test) = 48.026 + 0.534 × test score − 1.44 × time.
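The raw intercept changes can be checked directly from the three reported intercepts and the number of rules that varied. The sketch below computes only the unadjusted change relative to type 1 test; it does not reproduce the 25.09% figure, which the text derives after holding the slope coefficients fixed.

```python
RULES_PER_TEST = 15
BASE_INTERCEPT = 23.595  # type 1 test

# (number of rules that differ from type 1 test, reported intercept)
comparisons = {
    "type 2": (5, 28.713),
    "type 3": (11, 48.026),
}


def percent_change(new_value, base_value):
    """Relative change of new_value against base_value, in percent."""
    return (new_value - base_value) / base_value * 100


for test_type, (varied_rules, intercept) in comparisons.items():
    rules_varied = varied_rules / RULES_PER_TEST * 100
    change = percent_change(intercept, BASE_INTERCEPT)
    print(f"{test_type}: {rules_varied:.0f}% of rules varied, "
          f"intercept changed by {change:.2f}%")
```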
Expert knowledge stored in the knowledge bases of Gaussian fuzzy systems plays an important role in the calculation of the intelligent output of those systems. The research showed that variances in the knowledge base greatly affected the output values of the fuzzy systems, as indicated in Tables 2, 3 and 4. These output values were the ones used to train the linear regression models. As a result, when 33% of the rules in the knowledge base were varied, the linear regression predictive model intercept changed by 25.09%, and when 73% of the rules were varied, the intercept changed by 103.54%. This implies that in Gaussian fuzzy systems the intercepts of linear regression predictive models change exponentially rather than linearly. Hence the higher the variances in the expert knowledge stored in the knowledge base, the disproportionately higher the deviation from the acceptable prediction values. Thus the accuracy of the model depends on keeping the variance in the knowledge base low.
