Abstract
To address two problems, namely that the structure of a fuzzy neural network (FNN) is difficult to adjust automatically when no loss-function threshold is available, and that self-organizing algorithms adjust the number of neurons in the regularization layer before the FNN structure has stabilized, a structural design strategy for a self-organizing recursive FNN based on the Boston matrix (SORFNN-BOSTON) is proposed. Unlike other self-organizing algorithms, the proposed method does not require a loss-function threshold. In addition to the indicator of neuron importance used by most self-organizing algorithms, a change rate is introduced to represent how the parameters of the neural network change. The change rate is used to determine when the relevant parameters are stable, which further improves the reliability of the neuron adjustment process. In a simulation predicting the Mackey-Glass time series, the final number of neurons in the hidden layer and the testing error are 6 and 0.110, respectively. Comparisons with other self-organizing algorithms show that the testing error is reduced by at most 76.6% and at least 13.3%, which demonstrates the practicability of the method.
Introduction
Since fuzzy set theory was put forward, it has developed in many different directions [1, 2]. As a combination of the traditional fuzzy algorithm and the neural network, the fuzzy neural network (FNN) is also an extension of fuzzy set theory. An FNN combines the reasoning ability of fuzzy systems with the learning ability of neural networks. However, the structure of most FNNs is determined in advance, and it is troublesome to adjust the FNN structure for different situations.
Methods for adjusting the number of neurons in a neural network fall into two categories: growth and pruning. Growth means starting from a very small number of hidden-layer neurons and adding neurons during learning according to the input conditions. References [3–5] start with no hidden-layer neurons and add neurons when certain conditions are met. However, in these algorithms the number of neurons cannot be reduced once it has increased, so redundancy in the network cannot be resolved. Pruning mainly addresses the high computational cost and long computation time of deep models. References [6, 7] present a weight-based pruning method, which deletes neurons with small weights on the assumption that small weights are irrelevant. However, deleting neurons based on weight values alone may remove important parts of the network [8]. In references [9–11], a filter layer is inserted after each layer of the neural network, and the neurons to be deleted are determined by the parameter magnitudes of the filter layer. Reference [12] dynamically adjusts a neural network by deleting redundant neurons according to the density of the fuzzy rules. However, pruning algorithms can only remove redundancy; they cannot remedy insufficient network performance.
To enable neural networks to change the number of neurons under diverse conditions, references [13–15] adjust the network with genetic algorithms, so that the number of neurons can be increased or reduced automatically according to the input. Reference [13] proposed a self-organizing neural network that alters the number of neurons using genetic algorithms. Nevertheless, this method requires too much prior knowledge, which limits its practical application. Reference [14] proposed a method based on genetic algorithms, backpropagation, and recursive least squares estimation, which adjusts all parameters, including the number of fuzzy rules. However, it can realistically be used only offline. Reference [15] proposed a learning method based on a second-order genetic algorithm, which can estimate the fuzzy weights as well as the shape of the membership function. Nevertheless, this method requires many preliminary experiments to select the optimal percentage. References [16] and [17] use support vector machines to generate fuzzy rules automatically; the number of fuzzy rules is consistent with the number of support vector machines. In reference [18], the initial number of fuzzy rules is set equal to the number of support vector machines, and the algorithm proposed in that reference is then used to delete the unrelated fuzzy rules. However, the main problem with this method is that dead neurons may arise [19]. How to generate the optimal number of fuzzy rules for a fuzzy neural network remains an open question [20, 21]. Reference [22] presets a maximum error and an ideal error, and then sets a threshold from these two values to adjust the network structure. Reference [23] proposed a self-organizing neural network based on first-order sensitivity analysis: when the error between the actual and predicted results meets preset conditions, the qualifying neurons are deleted or added.
However, in both of these methods [22, 23], the error threshold is set by experience. If it is set inappropriately, the deletion or growth algorithm fails. Reference [24] proposed a self-organizing algorithm that alters the FNN structure according to the strength of the fuzzy rules, splitting neurons with high strength and pruning neurons with low strength. However, because the self-organizing algorithm adjusts the structure too frequently, the network structure is adjusted again before the new architecture has fully converged. As a result, the frequent structural changes make it hard to guarantee convergence of the network.
To address the above problems, this paper proposes a self-organizing recursive fuzzy neural network based on the Boston matrix (SORFNN-BOSTON). The proposed method is a structural design method that adjusts the network structure online. The neuron deletion and growth indicators are derived from parameters generated during the training of SORFNN-BOSTON, so no complicated preliminary experiments are needed. The neurons of the regularization layer are evaluated by two indicators: the relative occupancy, which measures the proportion of the neuron in the output, and the change rate, which measures how much the relevant parameters of the neuron are changing. A small change rate indicates that the network is tending toward stability. Therefore, the parameters of a neuron can be confirmed to be stable before judging whether the neuron can be deleted. In the structural change stage, the parameters of the neurons with large relative occupancy are adjusted to avoid the problem of dead neurons. Compared with other methods, adding the change rate avoids the false deletion of neurons whose relative occupancy fluctuates strongly during training, and keeps the neural network stable before and after a structural change. Simulations of Mackey-Glass time series prediction show that SORFNN-BOSTON has higher training and testing accuracy than other self-organizing algorithms.
SORFNN-BOSTON
An FNN is a static feedforward neural network, whereas a recursive fuzzy neural network (RFNN) adds a feedback layer to the feedforward connections. A self-feedback connection is added to the original regularization layer, which gives the FNN dynamic characteristics. The purpose of the self-feedback connection is to enhance the FNN's ability to describe nonlinear systems. Adding local or global feedback links to feedforward neural networks has been widely studied and applied in practice because it combines the simple structure of feedforward neural networks with the dynamic characteristics of recursive neural networks. Figure 1 shows the structure of SORFNN-BOSTON studied in this paper, which consists of the input layer, fuzzification layer, regularization layer, and output layer.

Figure 1. The structure of RFNN.
The input layer: This layer represents the input of the fuzzy neural network and contains m neurons. $x_i(t) = [x_{i1}(t), x_{i2}(t), \ldots, x_{im}(t)]$ is the input vector of the neural network at time t, where m is the dimension of the input vector, i = 1, 2, …, N, and N is the total number of input vectors.
The fuzzification layer: This layer contains n neurons and represents the fuzzification part of fuzzy reasoning. Its main function is to fuzzify each input with a membership function to obtain a membership degree in the interval [0, 1]. The product of the membership degrees of all inputs in a neuron is the firing strength of the fuzzy rule represented by that neuron. The membership function used in this paper is the Gaussian function. The output of the neuron is shown below.
The regularization layer: This layer represents the fuzzy rule section. The number of neurons in the regularization layer equals that of the fuzzification layer. Its output is as follows.
The output layer: This layer represents the output of the recursive fuzzy neural network. The weighted factor method is used, and the output is as follows.
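The four layers described above can be sketched as a single forward step. This is a minimal illustration, not the paper's exact formulation: the Gaussian form exp(-(x - c)^2 / (2 b^2)), the additive self-feedback term wh_k * r_k(t-1) in the regularization layer, and the function name `rfnn_forward` are all assumptions. The parameter names c, b, w, and wh follow the parameter set p(t) used later in the text.

```python
import math

def rfnn_forward(x, c, b, w, wh, r_prev):
    """One forward step of a recurrent fuzzy neural network (illustrative).

    x      : list of m inputs
    c, b   : n x m Gaussian centers and widths (assumed parameterization)
    w, wh  : n output weights and n self-feedback weights
    r_prev : n regularization-layer outputs from the previous time step
    """
    # Fuzzification layer: product of Gaussian memberships for each rule.
    z = []
    for c_k, b_k in zip(c, b):
        exponent = sum((xi - ci) ** 2 / (2.0 * bi ** 2)
                       for xi, ci, bi in zip(x, c_k, b_k))
        z.append(math.exp(-exponent))
    total = sum(z)  # total fuzzification output, called SUM in the text
    # Regularization layer: normalized firing strength plus self-feedback
    # (the additive feedback form is an assumption).
    r = [z_k / total + wh_k * rp_k
         for z_k, wh_k, rp_k in zip(z, wh, r_prev)]
    # Output layer: weighted sum of the regularization-layer outputs.
    y = sum(w_k * r_k for w_k, r_k in zip(w, r))
    return z, r, y
```

With all feedback weights set to zero, the regularization-layer outputs sum to one, so the network reduces to an ordinary normalized FNN.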
Boston matrix
The Boston matrix is a method for business analysis that, because of its effectiveness, is now widely used in various fields. It divides products into four types according to market attractiveness and enterprise strength. The sales growth rate is the index that best reflects market attractiveness, and it is the external determinant of whether the structure of the enterprise is reasonable. The market occupancy is the corresponding internal determinant.
According to sales growth rate and market occupancy, products are divided into four categories: stars, dogs, question marks, and cash cows. Stars are products with a high sales growth rate and high market occupancy. Dogs are products with a low sales growth rate and low market occupancy. Question marks are products with a high growth rate and low market occupancy. Cash cows are products with a low growth rate and high market occupancy. Among them, dogs should be abandoned, while stars should be expanded.
The market occupancy is divided into absolute market occupancy and relative market occupancy; the calculation formulas are as follows.
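A minimal sketch of these quantities and of the four-way classification, assuming the common textbook definitions (absolute occupancy as a share of total market sales, relative occupancy as the ratio to the largest competitor's sales). The function names and the default thresholds of 10% growth and a relative occupancy of 1.0 are illustrative conventions and are not taken from this paper.

```python
def absolute_occupancy(own_sales, total_market_sales):
    """Share of the whole market held by the product."""
    return own_sales / total_market_sales

def relative_occupancy(own_sales, largest_competitor_sales):
    """Own sales relative to the largest competitor's sales."""
    return own_sales / largest_competitor_sales

def boston_category(growth_rate, rel_occupancy,
                    growth_threshold=0.10, occupancy_threshold=1.0):
    """Classify a product into one of the four Boston matrix quadrants."""
    high_growth = growth_rate >= growth_threshold
    high_occupancy = rel_occupancy >= occupancy_threshold
    if high_growth and high_occupancy:
        return "star"
    if high_growth:
        return "question mark"
    if high_occupancy:
        return "cash cow"
    return "dog"
```

For example, `boston_category(0.02, 0.3)` returns `"dog"`, the category that the matrix recommends abandoning.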
The relative market occupancy and sales growth rate of the Boston matrix are applied to the neural network to adjust the number of neurons in the regularization layer and thereby optimize the network structure.
Classification Index based on Boston matrix
The relative occupancy of importance, $sr_k$, is used to represent the importance of the k-th neuron. The calculation is as follows.
The input of SORFNN-BOSTON is regarded as the various determinants in the market. The output of the regularization layer changes as the input changes. The mean value of the output change is denoted $ur_k$ and taken as the change rate. $ur_k$ is calculated by the following equation.
Here, $ur_k$ is the change rate of the k-th regularization neuron. When $ur_k$ is large, the parameters of the neuron are still changing, so the neuron should not be altered arbitrarily. Conversely, when $ur_k$ is small, the neuron has stabilized and can be considered for deletion.
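The two indicators can be illustrated as follows. This is a sketch under stated assumptions rather than the paper's definitions: the relative occupancy is taken as each neuron's absolute output contribution |w_k r_k| divided by the largest contribution, and the change rate as the mean absolute change of the regularization-layer output over a window of time steps. The function names are hypothetical.

```python
def relative_occupancy_index(w, r):
    """Assumed sr_k: contribution |w_k * r_k| relative to the largest one."""
    contrib = [abs(w_k * r_k) for w_k, r_k in zip(w, r)]
    largest = max(contrib)
    if largest == 0.0:
        return [0.0] * len(contrib)
    return [c_k / largest for c_k in contrib]

def change_rate_index(r_history):
    """Assumed ur_k: mean absolute change of each neuron's output.

    r_history is a list over time of regularization-layer output lists.
    """
    steps = len(r_history) - 1
    if steps < 1:
        raise ValueError("need at least two time steps")
    n = len(r_history[0])
    return [sum(abs(r_history[t + 1][k] - r_history[t][k])
                for t in range(steps)) / steps
            for k in range(n)]
```

Under these assumptions, a neuron whose output no longer moves has a change rate near zero, which is the situation in which its relative occupancy can be trusted.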
The average of RMSE during a period of time is used to assess the performance of SORFNN-BOSTON. The RMSE and its mean value are calculated as follows.
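A minimal sketch of the two quantities, assuming that MEAN(S) denotes the average RMSE over a window of K consecutive epochs starting at epoch S; the window convention and the function names are assumptions.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between actual and predicted outputs."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def window_mean(rmse_per_epoch, start, length):
    """Average RMSE over `length` epochs beginning at index `start`."""
    window = rmse_per_epoch[start:start + length]
    return sum(window) / len(window)
```

Comparing `window_mean(history, S, K)` with `window_mean(history, S + K, K)` then gives the two values MEAN(S) and MEAN(S + K) used in the structural decisions.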
If MEAN(S) > MEAN(S + K), the convergence performance of SORFNN-BOSTON is insufficient at this moment, and the RMSE is close to the limit attainable under the current structure of SORFNN-BOSTON. To reduce the RMSE and enhance the performance of SORFNN-BOSTON, the number of neurons in the regularization layer should be increased. The neuron with the largest value of $sr_k$ in the regularization layer at this moment is found and split. Let the maximum value of $sr_k$ be $sr_u$. The condition for splitting the neuron is therefore:
The parameters of the original neuron and those of the new split neurons are adjusted as follows.
When MEAN(S) < MEAN(S + K), the convergence performance of SORFNN-BOSTON is fine. The values of $sr_k$ and $ur_k$ at this moment are used to judge whether redundant neurons exist. Let the minimum value of $sr_k$ be $sr_c$. If the c-th neuron of the regularization layer satisfies both $sr_c < a$ and $ur_c < b$, the neuron is redundant and should be deleted. The conditions for deleting a neuron are shown below.
When the neurons are removed, the parameters of the neurons with the largest relative occupancy in the regularization layer are regulated. The parameters are adjusted by the following equations.
The steps of structure adjustment for SORFNN-BOSTON are as follows:
Step 1: The initial number of neurons in the fuzzification layer and regularization layer of SORFNN-BOSTON is given randomly, the parameters are initialized, and the training of SORFNN-BOSTON starts.
Step 2: Calculate the MEAN for each stage. If MEAN(S) > MEAN(S + K), go to step 3; otherwise go to step 5.
Step 3: Split the neuron of the regularization layer with the largest relative occupancy.
Step 4: Adjust the parameters of the original neuron and the new neuron according to formulas (9)–(14). Go to step 7.
Step 5: Find the neuron with $sr_u$ and the neuron with $sr_c$ in SORFNN-BOSTON, and judge whether their parameters meet conditions (15)–(17). If the conditions are met, go to step 6; otherwise go to step 7.
Step 6: Delete the neuron with $sr_c$ and adjust the parameters of the neuron with $sr_u$ according to formulas (18)–(20).
Step 7: Train the neural network.
Step 8: Stop the training when the specified number of epochs is reached; otherwise go to step 2.
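The decision made in steps 2, 3, 5, and 6 above can be sketched as a single function. It only selects the action and the neuron index; the parameter adjustments of formulas (9)–(14) and (18)–(20) are not reproduced. The function name `structure_action` and the guard that keeps at least one neuron are assumptions added for illustration.

```python
def structure_action(mean_s, mean_s_plus_k, sr, ur, a, b):
    """Choose the structural action for the regularization layer.

    sr, ur : relative occupancy and change rate of each neuron
    a, b   : thresholds on sr and ur for deletion
    Returns ("split", u), ("delete", c), or ("keep", None).
    """
    u = max(range(len(sr)), key=lambda k: sr[k])  # largest relative occupancy
    c = min(range(len(sr)), key=lambda k: sr[k])  # smallest relative occupancy
    if mean_s > mean_s_plus_k:
        # Steps 2 and 3: split the most important neuron.
        return ("split", u)
    # Steps 5 and 6: delete only a neuron that is both unimportant and stable.
    # Keeping at least one neuron is an added safeguard (assumption).
    if len(sr) > 1 and sr[c] < a and ur[c] < b:
        return ("delete", c)
    return ("keep", None)
```

With the thresholds a = 0.05 and b = 0.1, a neuron with sr = 0.01 but ur = 0.5 is kept, because its parameters have not yet stabilized.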
Because MEAN is used as the standard for judging the performance of SORFNN-BOSTON, no separate loss-function threshold needs to be set. $ur_k$ and $sr_k$ are used as the indicators of structural adjustment: $ur_k$ indicates how the parameters related to a neuron are changing, and $sr_k$ indicates the proportion of the neuron in the total output. In the neuron deletion stage, neurons for which both indicators are small are deleted. This ensures that the importance of a neuron is judged only after its relevant parameters are basically stable, which avoids the erroneous deletion of neurons caused by incomplete parameter updates. In the neuron growth stage, the output value of the neuron with the largest relative occupancy is reduced to ensure that the value of the new neuron is not too small.
Structural adjustment stage
During the neuronal growth phase, the output of SORFNN-BOSTON is
After the neuron number changes, the output of SORFNN-BOSTON is
According to the parameter adjustment for neuron growth, the output $z_u(t)'$ of the neuron with $sr_u$ in the fuzzification layer is:
Therefore, $z_u(t)_{new} + z_u(t)' = z_u(t)$. Thus, the total output of the fuzzification layer is the same before and after the structure adjustment. The total output of the fuzzification layer is defined as SUM. The output of the neuron with $sr_u$ in the regularization layer of the original neural network is as follows.
After the structure adjustment, the output of the neuron with $sr_u$ in the regularization layer is as follows.
The output of new neurons in the regularization layer is as follows.
Because the new neuron has no output from the previous time step, $r_u(t-1)_{new} = 0$. Therefore, the output of the new neuron is as follows.
During the neuron deletion phase, when a neuron is deleted and the parameters are not adjusted, the output of SORFNN-BOSTON is $y(t)_{after}$. Let the minimum value of $sr_k$ be $sr_c$. Then the output after the neuron parameter adjustment is as follows.
Therefore, the output after parameter adjustment is consistent with that of the initial neural network in the deletion stage.
Because the parameter adjustment keeps the output of SORFNN-BOSTON unchanged before and after the structure adjustment, the error fluctuation after a structure adjustment is reduced and the final convergence of SORFNN-BOSTON can be guaranteed.
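The invariance for the growth stage can be checked numerically in a simplified setting. The sketch below assumes a normalized layer without the self-feedback term, assumes the split neuron's firing strength is divided between the original and the new neuron by a factor alpha, and assumes the new neuron copies the output weight of the original; the paper's actual adjustment formulas may differ. The function names are hypothetical.

```python
def normalized_output(z, w):
    """Output of a normalized fuzzy layer without feedback (simplification)."""
    total = sum(z)
    return sum(w_k * z_k / total for w_k, z_k in zip(w, z))

def split_neuron(z, w, u, alpha=0.5):
    """Split neuron u so that its two parts sum to the original strength."""
    z_new = list(z)
    w_new = list(w)
    z_new[u] = alpha * z[u]             # adjusted original neuron
    z_new.append((1.0 - alpha) * z[u])  # new neuron
    w_new.append(w[u])                  # new neuron copies the output weight
    return z_new, w_new

z, w = [0.2, 0.5, 0.3], [1.0, -2.0, 0.5]
z_split, w_split = split_neuron(z, w, u=1, alpha=0.3)
# The total firing strength and the network output are both preserved.
print(abs(sum(z) - sum(z_split)) < 1e-12)
print(abs(normalized_output(z, w) - normalized_output(z_split, w_split)) < 1e-12)
```

Because the total firing strength is unchanged, the normalization of every other neuron is unaffected, which is why the output does not jump at the moment of the split.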
The parameter updating method used in this paper is the Adam algorithm. At present, many algorithms are improved by changing their parameter update method. References [25] and [26] optimize the parameter update method of the traditional algorithm by simulating the habits of animals, which greatly improves the speed of finding the global optimal solution. However, this kind of optimization generally cannot prove the stability of the new parameter update process. The method adopted in this paper does not change the parameter update process, which avoids this problem to a certain extent. The convergence of the algorithm is proved in reference [27]. The principle is as follows.
Here, the initial values of m(0) and v(0) are 0, and p(t) = [c(t), b(t), w(t), wh(t)] is the set of all parameters in SORFNN-BOSTON.
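For reference, one step of the standard Adam update with bias correction can be sketched as follows. The default hyperparameter values (learning rate 0.001, β1 = 0.9, β2 = 0.999, ε = 1e-8) are the commonly used defaults and are assumptions here, since the values used in the experiments are not stated.

```python
import math

def adam_step(p, g, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a flat list of parameters.

    p, g : parameters and their gradients
    m, v : first and second moment estimates (all zeros at t = 0)
    t    : iteration counter, starting at 1
    """
    m = [beta1 * m_i + (1.0 - beta1) * g_i for m_i, g_i in zip(m, g)]
    v = [beta2 * v_i + (1.0 - beta2) * g_i ** 2 for v_i, g_i in zip(v, g)]
    # Bias correction compensates for the zero initialization of m and v.
    m_hat = [m_i / (1.0 - beta1 ** t) for m_i in m]
    v_hat = [v_i / (1.0 - beta2 ** t) for v_i in v]
    p = [p_i - lr * mh / (math.sqrt(vh) + eps)
         for p_i, mh, vh in zip(p, m_hat, v_hat)]
    return p, m, v
```

In SORFNN-BOSTON the list `p` would hold the centers, widths, output weights, and feedback weights; after a split or deletion the moment lists must be resized to match, a detail this sketch leaves out.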
The following conditions are assumed: (1) the loss function to be optimized is convex; (2) the variable is bounded, $\|p - p'\|_2 \le D$ for all $p, p'$; (3) the gradient is bounded, $\|g_t\|_2 \le G$ for all $t$, where $g_t = \nabla f_t(p(t))$. The judgment index selected in this paper is as follows.
When T→ ∞ and R (T) is bounded, the convergence of the algorithm can be proved.
Since $f_t(p(t))$ is a convex function, $R(T)$ can be rewritten as follows.
Distinguishing further by the dimensions of the variables and changing the order of summation gives the following.
From the iterations of m(t) and v(t), the original formula is rewritten as follows.
Let β1 = β1,t, the value of β1 does not change monotonically with the number of iterations.
Transforming the above formula gives the following.
Therefore, SORFNN-BOSTON can be considered to converge during the stages in which its structure is fixed.
To validate the method presented in this paper, the Mackey-Glass time series is used as the simulation benchmark. The Mackey-Glass time series is recognized as a benchmark problem in references [28–30] and is generated by the following differential equation.
Here, y = 0.1, c = 17, and g = 0.2. In the generated time series, four past values are used to predict the current value: x(t) is predicted from x(t-p), x(t-2p), x(t-3p), and x(t-4p), with p = 6. The performance of the different algorithms is assessed by the root mean square error (RMSE).
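The data preparation can be sketched as follows. The sketch assumes the standard Mackey-Glass form dx/dt = g·x(t-c)/(1 + x(t-c)^10) - y·x(t), with the three constants given above mapped to the gain, the delay, and the decay rate. The unit-step Euler discretization, the constant initial history x0 = 1.2, and the function names are illustrative assumptions.

```python
def mackey_glass(n_points, gain=0.2, decay=0.1, delay=17, x0=1.2):
    """Generate a Mackey-Glass series by Euler integration with unit step."""
    # History for times -delay..0 is held constant at x0 (assumption).
    x = [x0] * (delay + 1)
    for _ in range(n_points - 1):
        x_delayed = x[-delay - 1]
        x_now = x[-1]
        x.append(x_now + gain * x_delayed / (1.0 + x_delayed ** 10)
                 - decay * x_now)
    return x[delay:]

def lagged_samples(series, p=6, n_lags=4):
    """Build inputs [x(t-p), x(t-2p), ..., x(t-n_lags*p)] and targets x(t)."""
    inputs, targets = [], []
    for t in range(n_lags * p, len(series)):
        inputs.append([series[t - k * p] for k in range(1, n_lags + 1)])
        targets.append(series[t])
    return inputs, targets

series = mackey_glass(1000)
inputs, targets = lagged_samples(series)
```

With p = 6 and four lags, the first 24 points of the series serve only as history, so 1000 generated points yield 976 input-target pairs.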
Here, K is the number of data samples, $y(t)_{pred}$ is the predicted output, and $y(t)_{true}$ is the actual output.
SORFNN-BOSTON is trained for 500 epochs with the initial number of neurons set to 4, a = 0.05, and b = 0.1.
Figure 2 shows how the number of neurons in the regularization layer changes. It can be seen from Fig. 2 that the number of neurons fluctuates during training and eventually stabilizes at six. Figure 3 shows the RMSE during training, which follows a downward trend. Because of the parameter adjustment method presented in this paper, the difference in RMSE before and after a change in the number of neurons is small, which shows that the parameter adjustment method is effective. Figure 4 compares the output of SORFNN-BOSTON with the actual output; the predicted value tracks the actual value well. Figure 5 shows the deviation between the actual value and the predicted value, which lies within [–0.06, 0.04], with most errors within [–0.02, 0.02]. Therefore, SORFNN-BOSTON has superior nonlinear fitting ability.

Figure 2. Change of the number of neurons.

Figure 3. Training error.

Figure 4. Predicted output and actual output of the test samples.

Figure 5. Test sample error.
Comparison of RMSE between SORFNN-BOSTON and RFNN
Comparing the training and testing errors of RFNN and SORFNN-BOSTON shows that the structural adjustment method used in this paper can enhance the nonlinear fitting ability of neural networks. The results of the simulation are shown in Table 2. The training RMSE and testing RMSE of SORFNN-BOSTON are lower than those of RFNN. The number of neurons in the regularization layer of RFNN is 6.
Comparison of different network performance
The final number of neurons is six. Figure 6 shows the testing error between the predicted and actual output of SORFNN-BOSTON, and Figure 7 shows that of RFNN. Figure 8 shows the RMSE of the two networks during training. It can be seen that the convergence performance of SORFNN-BOSTON is better than that of RFNN, and that the error between the predicted and actual values of RFNN is greater than that of SORFNN-BOSTON. The training error and testing error of RFNN are 0.0221 and 0.0221, respectively. The training error and testing error of SORFNN-BOSTON are 0.0114 and 0.0117, respectively. Compared with RFNN, the training error and testing error of SORFNN-BOSTON are reduced by 48.4% and 47.1%, respectively. This shows that the algorithm enhances the convergence performance of the network. However, because of the structural changes during training, the RMSE of SORFNN-BOSTON fluctuates more than that of RFNN; this problem needs further consideration.
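The percentage reductions follow directly from the reported RMSE values. The short check below recomputes them; the function name `percent_reduction` is illustrative.

```python
def percent_reduction(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline

training = percent_reduction(0.0221, 0.0114)  # RFNN vs SORFNN-BOSTON, training
testing = percent_reduction(0.0221, 0.0117)   # RFNN vs SORFNN-BOSTON, testing
print(f"{training:.1f} {testing:.1f}")  # prints "48.4 47.1"
```

The 48.4% figure therefore corresponds to the training error and the 47.1% figure to the testing error.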

Figure 6. Testing error of SORFNN-BOSTON.

Figure 7. Testing error of RFNN.

Figure 8. Training errors of the two networks.
Comparison with the simulation results of other self-organizing algorithms shows that the method used in this paper optimizes the network structure better. To verify the effectiveness of the method, it is compared on Mackey-Glass time series prediction with the following five self-organizing fuzzy neural networks: the self-organizing neuro-fuzzy network based on first-order effect sensitivity analysis (NFN-FOESA [23]), the improved genetic-based learning method for fuzzy artificial neural networks (SOFNNGA [15]), the online self-organizing fuzzy modified least-squares network (SOFMLS [12]), the parsimonious fuzzy neural network based on a fast and accurate online self-organizing scheme (FAOS-PFNN [22]), and the self-organizing fuzzy neural network with adaptive computation algorithm (SOFNN-ACA [24]). Thirty simulation experiments were conducted for each method, and the average over the thirty runs was taken as the final result. It can be seen from Table 2 that SORFNN-BOSTON has the smallest testing RMSE. Compared with the SOFMLS, SOFNNGA, SOFNN-ACA, NFN-FOESA, and FAOS-PFNN algorithms, the testing error of SORFNN-BOSTON is reduced by 76.8%, 16.7%, 46.9%, 13.4%, and 13.4%, respectively. This shows that SORFNN-BOSTON optimizes the network structure better and obtains better nonlinear approximation and generalization ability with a small number of neurons.
Conclusions
By applying the classification method based on the Boston matrix to RFNN, this paper puts forward a structure adjustment method and a parameter adjustment method. From simulations of the Mackey-Glass time series, in comparison with a fixed-structure neural network and other neural networks, the following conclusions are obtained. Compared with RFNN, the training error and testing error of SORFNN-BOSTON are reduced by 48.4% and 47.1%, respectively, which indicates that the structure adjustment algorithm used in this paper enhances the convergence ability of the network. Compared with other self-organizing fuzzy neural networks, the final number of hidden-layer neurons of SORFNN-BOSTON is 6, and the final testing error is 0.0110. Compared with the SOFMLS, SOFNNGA, SOFNN-ACA, NFN-FOESA, and FAOS-PFNN algorithms, the testing error of SORFNN-BOSTON is reduced by 76.8%, 16.7%, 46.9%, 13.4%, and 13.4%, respectively. SORFNN-BOSTON has the fewest hidden-layer neurons and the lowest testing error. The self-organizing algorithm used in this paper gives the neural network stronger nonlinear fitting ability. The structural change indexes used in this paper are determined from the parameters of the structural adjustment process, and they reflect both the changes of neuron parameters and the importance of neurons. These indexes address three problems of other self-organizing fuzzy neural networks: the need to set a loss-function threshold, cumbersome preliminary experiments, and structural adjustment before the parameters are stable. Because of the structural changes during training, the training error of SORFNN-BOSTON fluctuates more than that of RFNN, which is the problem to be addressed in future work.
Acknowledgments
Project (61803191) supported by the National Natural Science Foundation of China; Project (2019-KF-03-05) supported by Natural Science Fund Project of Liaoning Province.
