Abstract
Taiwan is an endemic area for chronic hepatitis. Since the early 1980s, liver cancer has been the leading cause of cancer mortality in Taiwan. In addition, liver cirrhosis and chronic liver diseases rank sixth and seventh among the causes of death, respectively. Hepatitis is therefore a serious threat to public health and incurs substantial medical costs. This study develops a medical cost forecasting model for acute hepatitis patients in the emergency room. To account for the uncertainty and hesitation in human thinking, this study employs intuitionistic fuzzy logic (IFL), which considers membership, non-membership, and hesitation values simultaneously. The proposed model, an intuitionistic fuzzy neural network (IFNN), combines a Gaussian membership function with the Yager-generating function to enhance the performance of the fuzzy neural network (FNN). Furthermore, a back-propagation learning algorithm and a genetic algorithm (GA) are applied to optimize the parameters and weights of the proposed IFNN. The proposed IFNN is applied to ten benchmark datasets, including nonlinear control and prediction problems. The computational results show that the GA-IFNN outperforms conventional algorithms such as an artificial neural network (ANN), an FNN, and support vector regression (SVR). In the real-world problem, the proposed method can support physicians in planning medical resources and in making good decisions so that limited resources are used as efficiently as possible.
Introduction
Hepatitis is a medical condition defined by inflammation of the liver and characterized by the presence of inflammatory cells in the tissue of the organ. It may occur with limited or no symptoms, but it often leads to jaundice, poor appetite, and malaise. Hepatitis is acute when it lasts less than six months and chronic when it persists longer. Usually, a patient with hepatitis needs several treatments in hospital before recovering. During the treatment period, the hospital needs to prepare medicines and medical applications. However, each patient has a unique condition, which makes it difficult for the hospital to prepare the necessary medical resources. If the medical cost can be predicted in advance, the hospital can prepare the related medical resources efficiently, provide high-quality treatment, and avoid unnecessary waste. Therefore, developing a system that predicts the medical cost for patients has become a critical issue for hospitals.
On the other hand, data mining techniques have been applied to many practical applications, including healthcare. Among them, the artificial neural network (ANN) is one of the most popular and has obtained many promising results. An ANN is a system derived from neurophysiological models. In general, an ANN consists of a collection of simple, nonlinear computing elements whose inputs and outputs are connected together to form a network [1]. ANNs have been employed to solve medical problems [2–4]. To combine the merits of ANNs and fuzzy set theory, the fuzzy neural network (FNN), another data mining technique, has been proposed and successfully applied to many areas, such as control, identification, prediction, pattern recognition, and bioengineering. FNNs inherit their learning ability from neural networks and their inference technology from fuzzy systems, and are therefore able to solve such problems [5–16]. Fuzzy neural networks bring the low-level learning and computational power of neural networks into fuzzy systems and the high-level, human-like thinking and reasoning of fuzzy systems into neural networks.
Hepatitis threatens health not only in Taiwan but all over the world. As health care and its costs attract more and more attention, not only hospitals but also patients and insurance companies are concerned about this problem. It is therefore critical to develop a forecasting model for medical cost. In addition, FNNs have been widely used in various fields of research and can establish prediction models with good accuracy. Therefore, this study builds on the architecture of the fuzzy neural network and proposes an improved method that integrates the concept of intuitionistic fuzzy sets with the fuzzy neural network. The proposed method is then applied to establish the forecasting model for medical cost.
John Holland, from the University of Michigan, began his work on genetic algorithms (GAs) at the beginning of the 1960s. His first achievement was the publication of Adaptation in Natural and Artificial Systems [17]. He had two goals in mind: to improve the understanding of the natural adaptation process, and to design artificial systems with properties similar to those of natural systems [18]. The basic idea is as follows: the genetic pool of a given population potentially contains the solution, or a better solution, to a given adaptive problem. This solution is not “active” because the genetic combination on which it relies is split between several subjects. Only the association of different genomes can lead to the solution. Holland’s method is especially effective because it considers not only the role of mutation but also genetic recombination (crossover) [19]. The crossover of partial solutions greatly improves the capability of the algorithm to approach, and eventually find, the optimal solution. The essence of the GA in both theoretical and practical domains has been well demonstrated [20]. However, despite the distinct advantages of a GA for solving complicated, constrained, and multi-objective functions where other techniques may have failed, the full power of the GA in application is yet to be exploited [21, 22]. Kuo et al. [13] proposed a fuzzy neural network (FNN) that can process both fuzzy inputs and outputs, and employed the continuous GA (CGA) to enhance its performance. Kuo et al. [23] employed the growing self-organizing map (GSOM) algorithm and a CGA-based SOM to improve the performance of the SOM.
Based on the above, this study integrates intuitionistic fuzzy logic (IFL) into the FNN to develop the intuitionistic fuzzy neural network (IFNN) and employs the genetic algorithm (GA) to optimize the parameters and weights of the proposed IFNN. The resulting model is called the genetic algorithm-based IFNN (GA-IFNN). It is then applied to the medical cost forecasting problem.
The remainder of the paper is organized as follows. The intuitionistic fuzzy neural network (IFNN) and the genetic algorithm (GA) based learning algorithm, which is used to optimize the proposed IFNN, are described in Section 2. In Section 3, ten computational experiments on benchmark datasets demonstrate the performance of the proposed GA-IFNN. The forecasting model of the medical cost is developed and its test results are presented in Section 4. Finally, a brief conclusion is drawn in Section 5.
The intuitionistic fuzzy neural network
The intuitionistic fuzzy sets (IFSs)
The fuzzy set theory was proposed by Zadeh [24] and has been applied successfully in various fields [25, 26]. The theory states that the membership of an element in a fuzzy set is a single value between zero and one. However, it is not certain that an element’s degree of non-membership in a fuzzy set equals one minus its degree of membership, because some degree of uncertainty may exist. To account for this uncertainty, the concept of the IFS was introduced by Atanassov [27, 28], which adds an additional attribute parameter called non-membership [29]. Bustince and Burillo [30] showed that vague sets (VS) are a kind of IFS. In general, IFSs are a useful means of describing and dealing with vague and uncertain data, and they have received wide attention in recent years. Many studies have applied IFSs to solve complex problems such as data mining [31], decision-making [32–39], clustering [40, 41], forecasting [42], pattern recognition [43–45], and medical problems [46, 47]. IFSs were proposed as an extension of fuzzy sets. An IFS $A$ in a fixed set $E$ is an object of the form:

$$A = \{\langle x, \mu_A(x), \upsilon_A(x) \rangle \mid x \in E\} \tag{1}$$
where the functions $\mu_A : E \to [0, 1]$ and $\upsilon_A : E \to [0, 1]$ denote the degree of membership and the degree of non-membership of the element $x \in E$, respectively. When $\mu_A(x) + \upsilon_A(x) = 1$ for every $x \in E$, the fuzzy set with membership function $\mu_A(x)$ has the IFS expression:

$$A = \{\langle x, \mu_A(x), 1 - \mu_A(x) \rangle \mid x \in E\} \tag{2}$$
Furthermore, the degree of uncertainty must be considered for an IFS $A$ in $E$. The degree of hesitation of an element $x \in E$ in $A$ is defined as:

$$\pi_A(x) = 1 - \mu_A(x) - \upsilon_A(x) \tag{3}$$

where $\pi_A(x)$ is the degree of hesitation of $x$ to $A$, and $0 \leqslant \pi_A(x) \leqslant 1$ for all $x \in E$.
According to [28, 29], a complete description of an IFS should include the membership function, the non-membership function, and the hesitation degree. A key idea of IFSs is that the non-membership function is considered explicitly, from which the hesitation degree is obtained. To describe the IFSs completely, the Yager-generating function [30] is employed in this study. Its advantage is that, for each value of $\alpha \in (0, \infty)$, a particular fuzzy complement is well defined, from which the non-membership and hesitation degrees follow. Thus, the intuitionistic fuzzy complement with the Yager-generating function is:

$$N(\mu_A(x)) = \left(1 - \mu_A(x)^{\alpha}\right)^{1/\alpha} \tag{4}$$

Therefore, using Atanassov’s intuitionistic fuzzy complement with the Yager-generating function, the IFS becomes:

$$A = \left\{\left\langle x, \mu_A(x), \left(1 - \mu_A(x)^{\alpha}\right)^{1/\alpha} \right\rangle \mid x \in E\right\} \tag{5}$$

and the degree of hesitation is as follows:

$$\pi_A(x) = 1 - \mu_A(x) - \left(1 - \mu_A(x)^{\alpha}\right)^{1/\alpha} \tag{6}$$

where $\alpha > 0$.
After the membership, non-membership, and hesitation functions of the IFS are defined, the final membership degree is calculated as a linear combination of $\mu_A(x)$ and $\pi_A(x)$. With these functions defined, the intuitionistic fuzzy neural network (IFNN) is developed based on the concept of IFS. The model of the proposed IFNN and its learning algorithm are described in the following section.
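The membership, non-membership, and hesitation degrees described above can be illustrated with a short Python sketch. It assumes the commonly used Yager-class complement, $N(\mu) = (1 - \mu^{\alpha})^{1/\alpha}$; the function name `yager_ifs` is hypothetical and is not taken from the paper.

```python
from typing import Tuple


def yager_ifs(mu: float, alpha: float) -> Tuple[float, float, float]:
    """Return (membership, non-membership, hesitation) for one element.

    Assumption: the non-membership degree is the Yager-class complement
    N(mu) = (1 - mu**alpha) ** (1 / alpha).
    """
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must lie in [0, 1]")
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    nu = (1.0 - mu ** alpha) ** (1.0 / alpha)  # non-membership degree
    pi = 1.0 - mu - nu                          # hesitation degree
    return mu, nu, pi


if __name__ == "__main__":
    # alpha = 1 reduces to an ordinary fuzzy set: nu = 1 - mu, pi = 0.
    print(yager_ifs(0.5, 1.0))
    # With alpha < 1 the hesitation degree is positive.
    print(yager_ifs(0.5, 0.5))
```

Under this assumed form, the hesitation degree is non-negative only for $0 < \alpha \leqslant 1$, so a sketch like this would normally keep $\alpha$ in that range.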
An advantage of the fuzzy neural network is that it combines the strengths of fuzzy control and artificial neural networks and yields fuzzy IF-THEN rules. As in the fuzzy neural network, fuzzy IF-THEN rules are employed in the IFNN. The k-th rule is instantiated as:

IFNN structure.
An integration function f is associated with the fan-in of a unit and serves to combine information, activation, or evidence from other nodes. This function provides the net input for the node:
where the superscript shows the layer number.
The second action of each node is to output an activation value, act(f), as a function of its net input:
Next, the detailed computation of each layer is shown as follows:
Layer 1: The nodes in this layer only pass the input values on to Layer 2.
The link weight of layer 1
Layer 2: In the proposed IFNN, the Gaussian function is employed as the membership function, as shown in Equation (11),
where $m_{ij}$ is the center (mean) and $\sigma_{ij}$ is the width (variance) of the Gaussian function of the $j$th term of the $i$th input linguistic variable $x_i$.
According to Equation (6),
The link weight of layer 2
Layer 3: The fuzzy intersection operator, AND, is used to translate the degree of accommodation into the firing strength,
The link weight of layer 3
Layer 4: This structure uses the Mamdani inference model. According to this model, this layer performs the OR operation of fuzzy inference:
The link weight of layer 4
Layer 5: This is the final layer of the IFNN architecture; it performs the defuzzification process.
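The Layer 2 computation described above (a Gaussian membership degree followed by the hesitation degree) can be sketched as follows. This is only an illustration under stated assumptions: the Gaussian is taken as $\exp(-((x - m)/\sigma)^2)$, the non-membership degree uses the Yager-class complement $(1 - \mu^{\alpha})^{1/\alpha}$, and the function names are hypothetical. The exact form of Equation (11) and the way the two degrees are combined in the paper may differ.

```python
import math
from typing import Tuple


def gaussian_membership(x: float, m: float, sigma: float) -> float:
    """Membership degree of input x for a term with center m and width sigma.

    Assumed form: exp(-((x - m) / sigma) ** 2).
    """
    return math.exp(-((x - m) / sigma) ** 2)


def layer2_node(x: float, m: float, sigma: float, alpha: float) -> Tuple[float, float]:
    """Return (membership, hesitation) for one Layer 2 node.

    The non-membership degree is assumed to be the Yager-class complement
    of the membership degree with parameter alpha.
    """
    mu = gaussian_membership(x, m, sigma)
    nu = (1.0 - mu ** alpha) ** (1.0 / alpha)
    pi = 1.0 - mu - nu
    return mu, pi


if __name__ == "__main__":
    # An input exactly at the center has full membership and no hesitation.
    print(layer2_node(x=2.0, m=2.0, sigma=0.5, alpha=0.8))
    # An input one width away from the center.
    print(layer2_node(x=3.0, m=2.0, sigma=1.0, alpha=0.8))
```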
In addition to the back-propagation learning algorithm of the IFNN, this study integrates the GA with the IFNN to provide better initial parameters, including $m_{ij}$, $\sigma_{ij}$, and $\alpha_{ij}$ of the membership functions in Layer 2 and the weights of the IFNN. This helps to avoid local minima. The procedure of the GA in this study is as follows:
Step 1. Initialization
Set up the parameters, namely the crossover rate (CR) and the mutation rate (MR). Then, randomly generate a population of n chromosomes and set the number of generations and the fitness function. In this study, every gene in the chromosome is encoded as a continuous number between 0 and 1. The chromosome is represented in Fig. 2, where m is the number of nodes in Layer 2, n is the number of nodes in Layer 3, i is the number of input variables, and j denotes the membership function.

The representation of chromosome.
Step 2. Evaluate the chromosomes
Calculate the fitness function value for each chromosome using Equation (16):
Step 3. Selection
Tournament selection is used in the selection process [49, 50]. It involves running several “tournaments” among a few individuals chosen at random from the population. The winner of each tournament (the one with the best fitness) is selected for crossover. Selection pressure is easily adjusted by changing the tournament size: the larger the tournament, the smaller the chance that weak individuals are selected. The chromosomes are ranked and evaluated, and the number of chromosomes that undergo crossover is decided by CR. For example, if CR equals 0.9, 90% of the chromosomes in the population undergo crossover.
Step 4. Crossover
Randomly generate an integer K between 0 and the dimension of the chromosome, and use K as the crossover point. The parameters before K remain the same, while the parameters after K are exchanged between the two chromosomes. Let the parameters of the two chromosomes located at K be x(i) and y(i), respectively. The operations are as follows:
Randomly generate an integer M between 1 and 1000, then calculate the
Calculate the value of x′ (i) and y′ (i) through Equation (20):
The processes above update the parameter values between paired chromosomes, which generates a population with diversity.
Step 5. Mutation
The chromosomes that are not selected for crossover, which are those with worse fitness, undergo mutation. The mutation rate (MR) controls the number of genes on the chromosome that mutate. Each selected gene is replaced by a random number between 0 and 1.
Step 6. Evaluate new chromosomes
Eliminate the chromosomes with lower fitness function values and add the new chromosomes with higher fitness function values.
Step 7. If the stop criterion is satisfied, stop; otherwise, go back to Step 3.
In this study, there are fifty chromosomes in the population, and there are five hundred iterations. These settings are fixed in all experiments conducted in this study.
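The GA steps above can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: the fitness is assumed to be an error to minimize (`error_fn` is a hypothetical placeholder for Equation (16)), the crossover only swaps the genes after the cut point K and omits the blending at position K, and the tournament size of 3 is an assumed value.

```python
import random


def tournament(pop, errors, size, rng):
    """Return the lowest-error chromosome among `size` randomly drawn ones."""
    contenders = rng.sample(range(len(pop)), size)
    return pop[min(contenders, key=lambda i: errors[i])]


def one_point_crossover(a, b, rng):
    """Keep the genes before cut point K and swap the genes from K onward."""
    k = rng.randrange(1, len(a))
    return a[:k] + b[k:], b[:k] + a[k:]


def mutate(chrom, mr, rng):
    """Redraw each gene uniformly in [0, 1) with probability mr."""
    return [rng.random() if rng.random() < mr else g for g in chrom]


def run_ga(error_fn, dim, pop_size=50, generations=500, cr=0.9, mr=0.1,
           tournament_size=3, seed=0):
    """Minimize error_fn over chromosomes of `dim` genes in [0, 1)."""
    rng = random.Random(seed)
    # Step 1: random initial population of continuous genes.
    pop = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Step 2: evaluate every chromosome.
        errors = [error_fn(c) for c in pop]
        history.append(min(errors))
        # Steps 3-4: tournament winners are paired for crossover.
        n_cross = int(cr * pop_size) // 2 * 2  # even number of offspring
        children = []
        while len(children) < n_cross:
            p1 = tournament(pop, errors, tournament_size, rng)
            p2 = tournament(pop, errors, tournament_size, rng)
            children.extend(one_point_crossover(p1, p2, rng))
        # Step 5: the remaining (worst) chromosomes are mutated instead.
        worst_first = sorted(range(pop_size), key=lambda i: errors[i], reverse=True)
        children.extend(mutate(pop[i], mr, rng)
                        for i in worst_first[:pop_size - n_cross])
        # Step 6: keep the best pop_size of parents and offspring.
        pop = sorted(pop + children, key=error_fn)[:pop_size]
    best = pop[0]
    return best, error_fn(best), history
```

For example, `run_ga(lambda c: sum((g - 0.3) ** 2 for g in c), dim=4)` searches for a chromosome whose genes are all close to 0.3. Because Step 6 keeps the best chromosomes of each generation, the recorded best error never increases.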
To test the proposed IFNN, the code is programmed in Matlab. Ten benchmark datasets are used to verify the proposed model. This study compares the proposed IFNN with other algorithms, including an FNN, an SVR, and an ANN.
In addition to the FNN, this study uses an artificial neural network (ANN) and support vector regression (SVR) to construct forecasting models for comparison. The LIBSVM package developed by Chang and Lin [52] is used to construct the SVR model. The ANN, a feed-forward neural network with the back-propagation learning algorithm, is coded in the C++ programming language. The forecasting performance is evaluated using two measures, the mean square error (MSE) and the mean absolute difference (MAD), which are defined in Equations (21) and (22):

$$\mathrm{MSE} = \frac{1}{N} \sum_{t=1}^{N} \left(y_t - \hat{y}_t\right)^2 \tag{21}$$

and

$$\mathrm{MAD} = \frac{1}{N} \sum_{t=1}^{N} \left|y_t - \hat{y}_t\right| \tag{22}$$

where $y_t$ is the actual value, $\hat{y}_t$ is the forecast value, and $N$ is the number of samples.
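The two performance measures can be computed with a few lines of Python. The sketch below assumes the usual definitions, in which the MSE is the mean of the squared differences and the MAD is the mean of the absolute differences between actual and forecast values; the function names are illustrative.

```python
from typing import Sequence


def mse(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean square error between actual and forecast values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def mad(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute difference between actual and forecast values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


if __name__ == "__main__":
    y_true = [1.0, 2.0, 3.0]
    y_pred = [1.0, 2.0, 5.0]
    print(mse(y_true, y_pred), mad(y_true, y_pred))
```

Because the MSE squares each difference, it penalizes large individual errors more heavily than the MAD, which is one reason the two measures can rank algorithms differently.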
For the experiments, K-fold cross-validation is employed to confirm the robustness of the developed models. The dataset is divided into k subsets, and the procedure is repeated k times. Each time, one of the k subsets is used as the test set and the other k-1 subsets are put together to form the training set. The average error (e.g. MSE) across all k trials is then computed. In this study, 90% of the data are used as training samples and the remaining 10% as test samples. The goal is to determine whether the IFNN is significantly better than the other algorithms.
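The cross-validation procedure described above can be sketched as follows. The code is a minimal illustration, assuming that the samples are shuffled before being split into folds (the paper does not state this) and that `fit_and_score` is a hypothetical callback that trains a model on the training indices and returns its test error.

```python
import random
from typing import Callable, Iterator, List, Tuple


def k_fold_splits(n_samples: int, k: int, seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # assumption: shuffle before splitting
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validate(fit_and_score: Callable[[List[int], List[int]], float],
                   n_samples: int, k: int = 10) -> float:
    """Average the test error returned by fit_and_score over all k folds."""
    scores = [fit_and_score(train, test)
              for train, test in k_fold_splits(n_samples, k)]
    return sum(scores) / len(scores)
```

With k = 10, each fold holds out about 10% of the samples for testing and trains on the other 90%, which matches the split described in the text.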
To test the performance of the proposed GA-IFNN, this section presents the evaluation results on ten benchmark datasets.
The simulation cases
The sources of the ten benchmark datasets are summarized in Table 1. Datasets 1 to 7 are generated from functions, and Datasets 8 to 10 are real-world problems.
The benchmark datasets
This study uses two algorithms to optimize the parameters of the IFNN: the back-propagation learning algorithm and the GA. In the back-propagation learning algorithm, the learning rate (η) affects the learning efficiency significantly. The GA has two parameters, the crossover rate (CR) and the mutation rate (MR). To reduce the required number of simulations, the Taguchi method, which uses orthogonal parametric arrays, is employed [53]. The orthogonal arrays identify only the main effects and not the interactions between the parameters. The method efficiently screens a large number of parameters and identifies those that have more impact on the performance.
Five factors with three levels are used to design the parameters for the back-propagation learning algorithm. The factors are the learning rate of the mean ($\eta_m$), the learning rate of the standard deviation ($\eta_s$), the learning rate of the Yager parameter ($\eta_\alpha$), the learning rate of the weight ($\eta_w$), and the momentum ($\rho$). The levels of the parameters are as follows: $\eta_m, \eta_s \in \{0.0005, 0.001, 0.005\}$, $\eta_\alpha \in \{0.001, 0.005, 0.009\}$, and. Therefore, an $L_{27}(3^5)$ orthogonal array is used for the experiment. The generated orthogonal array has 27 combinations. The genetic algorithm has two parameters, CR and MR. For the simulations, three levels of each parameter are used, CR $\in \{0.7, 0.8, 0.9\}$ and MR $\in \{0.1, 0.2, 0.3\}$. Because there are only nine combinations of these parameters, all of them are tested in the experiments without the Taguchi method.
Each combination is tested ten times with five hundred iterations so that the optimal training parameters can be determined. The MSE is set as the objective. Because a smaller MSE is better, the smaller-the-better criterion is used in the calculation. The software package MINITAB is used to perform the Taguchi experiment. Table 2 shows the parameters of the back-propagation learning algorithm.
The parameters of back-propagation learning algorithm
Table 3 shows the parameters of GA-IFNN after the Taguchi experiments.
The parameters of GA-IFNN
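The parameter-selection setup above can be illustrated with a short Python sketch. It assumes the standard Taguchi smaller-the-better signal-to-noise ratio, $-10 \log_{10}\left(\frac{1}{n}\sum y_i^2\right)$, as the selection criterion; the function and variable names are illustrative, and the paper itself performs this analysis in MINITAB.

```python
import itertools
import math
from typing import Sequence


def sn_smaller_the_better(responses: Sequence[float]) -> float:
    """Taguchi smaller-the-better S/N ratio (in dB) for repeated responses.

    A larger S/N ratio indicates smaller, more consistent responses (e.g. MSE).
    """
    mean_square = sum(y * y for y in responses) / len(responses)
    return -10.0 * math.log10(mean_square)


# Full grid of the GA parameters: 3 CR levels x 3 MR levels = 9 combinations.
ga_grid = list(itertools.product((0.7, 0.8, 0.9), (0.1, 0.2, 0.3)))

# A full factorial design over five three-level factors would need 3**5 runs,
# whereas the orthogonal array used in the text needs only 27.
full_factorial_runs = 3 ** 5

if __name__ == "__main__":
    print(len(ga_grid), full_factorial_runs)
    # Hypothetical MSE values from repeated runs of one parameter combination.
    print(sn_smaller_the_better([0.1, 0.1]))
```

The comparison of 243 full-factorial runs with 27 orthogonal-array runs shows why the Taguchi method is used for the five back-propagation factors, while the nine GA combinations are small enough to test exhaustively.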
The simulation results (MSE) for the training data and testing data are shown in Tables 4 and 5, respectively. The MAD results are shown in Tables 6 and 7, and the computational times of all compared algorithms are shown in Table 8. The results are obtained by running 500 iterations for each algorithm. The proposed GA-based IFNN needs a longer computational time, but it offers better results.
Computational results (MSE) of training data
Computational results (MSE) of testing data
Computational results (MAD) of training data
Computational results (MAD) of testing data
The computational time of all algorithms
Unit: second.
To verify the performance of the proposed algorithm, ANOVA is employed to compare the GA-IFNN with the other models, and the results are shown in Tables 9 to 12. Furthermore, the non-parametric Wilcoxon signed-rank test [54] is employed. It calculates the differences between pairs and ranks the absolute differences after discarding pairs with a difference of zero. When several pairs have equal absolute differences, each of them is assigned the average of the ranks that they would otherwise have received. The null hypothesis is that the differences have a mean of 0. The test can therefore be applied to the mean results obtained by the algorithms on each dataset without any distributional assumptions about the results.
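The ranking procedure of the Wilcoxon signed-rank test described above can be sketched in Python as follows. The sketch computes only the positive and negative rank sums and does not compute the p-value; the function name is illustrative.

```python
from typing import Sequence, Tuple


def wilcoxon_signed_rank(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (W_plus, W_minus), the rank sums of positive and negative differences."""
    # Discard pairs whose difference is zero.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    # Sort positions by absolute difference, smallest first.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Find the run of tied absolute differences starting at position i.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        # Tied values share the average of the ranks they would have received.
        avg_rank = (i + j) / 2.0 + 1.0
        for t in range(i, j + 1):
            ranks[order[t]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus


if __name__ == "__main__":
    # Errors of two hypothetical algorithms on four datasets (illustrative values).
    print(wilcoxon_signed_rank([1.0, 2.0, 3.0, 4.0], [1.5, 1.0, 3.0, 2.0]))
```

The smaller of the two rank sums is usually taken as the test statistic and compared with its distribution under the null hypothesis to obtain the p-value.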
The ANOVA of training data (MSE)
The ANOVA of testing data (MSE)
The ANOVA of training data (MAD)
The ANOVA of testing data (MAD)
The MSE for the training data indicates that the GA-IFNN obtains better results than the other compared algorithms, except for Dataset 7. In Dataset 7, the MSE obtained by the IFNN is 0.000159, which outperforms the GA-IFNN. In Datasets 9 and 10, the ANN obtains a better MAD, but its MSE results are not better than those of the proposed algorithm.
A further analysis is conducted through a statistical test applied to the testing data. The Wilcoxon signed-rank test results indicate that the proposed GA-IFNN is significantly superior to the other algorithms tested in this study. Tables 13 to 22 present the results of the statistical test, including the p-values. The hypotheses of the statistical test are as follows:
The statistical results of GA-IFNN (MSE)
The statistical results of IFNN (MSE)
The statistical results of FNN (MSE)
The statistical results of ANN (MSE)
The statistical results of SVR (MSE)
The statistical results of GA-IFNN (MAD)
The statistical results of IFNN (MAD)
The statistical results of FNN (MAD)
The statistical results of ANN (MAD)
Tables 13 to 17 show the results for the MSE values, and Tables 18 to 22 show the results for the MAD values. According to the experimental results, two conclusions can be drawn. First, the GA is able to train the network efficiently. Second, since the GA-IFNN incorporates the concept of IFL, the hesitation degree considers the membership degree and non-membership degree simultaneously, which better defines the degree of uncertainty. Because the degree of uncertainty is reduced, the performance is enhanced.
The statistical results of SVR (MAD)
This section presents the application of the proposed algorithms to real data. First, the descriptive statistics of the clinical data are shown in subsection 4.1. The next subsection describes the input variables. Subsection 4.3 shows the computational results. Finally, the discussion is presented in subsection 4.4.
Data collection
This study collects real data on patients with acute hepatitis admitted to the Emergency Department (ED) of a well-known teaching hospital in Taipei, Taiwan, and uses them for medical resource cost forecasting. Table 23 shows the demographics of the data. Acute hepatitis lasts less than 6 months, while chronic hepatitis lasts longer than 6 months. Acute hepatitis has several possible causes, such as infectious viral hepatitis (hepatitis A, B, C, D, and E), other viral diseases (glandular fever and cytomegalovirus), severe bacterial infections, amoebic infections, medicines (acetaminophen and halothane), and toxins (alcohol and fungal toxins). The severity of illness in acute hepatitis ranges from asymptomatic to fulminant and fatal. Some patients are asymptomatic, with abnormalities noted only in laboratory studies, while other patients might have symptoms and signs such as nausea, vomiting, fatigue, weight loss, abdominal pain, jaundice, fever, splenomegaly, or ascites [55–58].
The descriptive statistics of the dataset
The findings of Yang et al. [59] indicated that the Child-Pugh score and abdominal ultrasound findings can be used to predict the medical resources needed by patients with acute hepatitis. The results of the correlation analysis between the medical cost and the variables are shown in Table 24. Six variables are significantly correlated with the medical cost. Therefore, in this study, six input variables, namely hepatic portal vein varicose (PV), gallbladder wall thickening (GB wall), splenomegaly, Child-Pugh index (CP), total bilirubin (T-bil), and prothrombin (PT), are used for training the FNN model. The medical cost is the output of the forecasting model. The notations of all variables are shown in Table 25, and the demographics of the medical cost are shown in Table 26.
Correlation between medical cost and variables
**p-value<0.01, *p-value<0.05.
The notation of variables
The demographic of the medical cost
To develop the forecasting model, five algorithms, namely GA-IFNN, IFNN, FNN, ANN, and SVR, are employed. Before the forecasting model is constructed, the parameters of the IFNN are selected with the Taguchi method. The levels of the parameters are as follows: $\eta_m, \eta_s \in \{0.0005, 0.001, 0.005\}$, $\eta_\alpha \in \{0.001, 0.005, 0.009\}$, and $\eta_w, \rho \in \{0.01, 0.5, 0.1\}$. In the GA, three levels of each parameter are used, CR $\in \{0.7, 0.8, 0.9\}$ and MR $\in \{0.1, 0.2, 0.3\}$. The selected combinations of the parameters are shown in Tables 27 and 28.
The parameters of IFNN in clinical data
The parameters of GA-IFNN in clinical data
Tables 29 and 30 show the results of MSE and MAD, respectively. For the IFNN, the average training MSE and the average test MSE are 0.006876 and 0.008752, respectively. For the GA-IFNN, the average training MSE and the average test MSE are 0.004787 and 0.006765, respectively. Therefore, in terms of the MSE criterion, the GA-IFNN provides a better forecast. For the MAD criterion, similar results are obtained. For the IFNN, the average training MAD and the average test MAD are 0.065687 and 0.078988, respectively.
Computational results (MSE) of clinical data
Computational results (MAD) of medical data
The convergence curves are shown in Fig. 3. The membership functions (i.e. the Gaussian functions) obtained after training the GA-IFNN model are shown in Fig. 4.

The convergence curves for the clinical data.

The membership functions in clinical data. (x-axis: value of features; y-axis: degree of membership function).
This study attempts to establish a medical cost forecasting model, and five forecasting techniques, namely the proposed GA-IFNN, IFNN, FNN, ANN, and SVR, are employed to develop the forecasting models. A limitation of the study is that clinical data are difficult to collect, so only 110 instances are available. The results indicate that the GA-IFNN has better performance in terms of the MSE and MAD values. If more data can be collected, the accuracy of the forecasting models may be improved. In addition, the forecasting model constructed by the proposed GA-IFNN not only provides accuracy but also provides fuzzy rules for users’ reference. The fuzzy rules of the GA-IFNN are shown in Table 31. Some of the fuzzy rules are illustrated as follows:
The fuzzy rules of GA-IFNN
Rule 1:
IF X1 is Low ∧ X2 is Low ∧ X3 is High ∧ X4 is Medium ∧ X5 is High ∧ X6 is High
THEN Y = 0.1342.
Rule 2:
IF X1 is High ∧ X2 is High ∧ X3 is Low ∧ X4 is Low ∧ X5 is Medium ∧ X6 is Low
THEN Y = 0.1473.
Rule 3:
IF X1 is Low ∧ X2 is High ∧ X3 is High ∧ X4 is Medium ∧ X5 is High ∧ X6 is Low
THEN Y = 0.2135.
Rule 1 states that when PV is low, GB wall is low, splenomegaly is high, CP is medium, T-bil is high, and PT is high, the estimated medical cost is around NT$22,340 according to the FNN model result. Thus, the hospital manager can use the rules to evaluate the medical cost needed for a patient and arrange the necessary medical resources.
Prediction is a very important issue in industry. However, few studies have addressed medical cost forecasting. For medical management, a hospital that can accurately predict healthcare costs can avoid unnecessary consumption and waste. Healthcare personnel can save time and thus provide better quality of care. This study proposed the GA-based IFNN to develop a medical cost forecasting model. The simulations on ten benchmark datasets reveal that the GA-IFNN performs better than the other compared methods, including the FNN, ANN, and SVR. Additionally, the GA-IFNN results, which take the form of fuzzy IF-THEN rules, can be easily interpreted. Medical doctors or hospital managers can apply these rules to explain the cost structure, so that the waste of medical resources can be reduced and operational efficiency enhanced.
In the future, other soft computing techniques could be integrated into the heuristics to provide better estimation. Pruning of the IF-THEN rules can also be considered. In addition, increasing the data size for model evaluation might increase the accuracy of the medical cost prediction. Since it is quite difficult to collect acute hepatitis cases, it might be feasible to combine acute hepatitis cases from different hospitals to extend the current study.
