Abstract
This paper presents a new multiobjective genetic programming (MOGP) approach, to realize an all-in-one automatic nonlinear system design (NSD). The nonlinear system design is here modeled as a multiobjective optimization problem (MOP) to solve parameter estimation, structure optimization and feature selection simultaneously. The novel MOGP method is then proposed to rank individuals according to the ‘compromise distance’ between them, which has the benefit of combining decision making for NSD with the optimization process to get the final compromise solution in a single process. The effectiveness of the proposed learning approach for nonlinear system design is verified through experiments on the classical nonlinear autoregressive with extra inputs (NARX) system by comparison with classical aggregating method and a Pareto-based method for MOP. Finally, experimental results demonstrate the proposed approach is available to explore the unknown structure of nonlinear systems as well as the features and parameters with high accuracy and efficiency.
Keywords
Introduction
Nonlinear system design is crucial but a complex problem in many areas of engineering, i.e., industrial control systems [1], biomedical data modeling [2], and chemical process systems [3]. This problem not only requires to accurately estimate the unknown parameters of the system model, but also demands to identify the uncertainty structure of the nonlinear model even without any prior knowledge about the system. Based on these demands, genetic programming (GP) was employed by many researchers to discover and optimize the structure of nonlinearmodels [2, 3]. The reason is that GP learning algorithm can not only effectively detect the underlying relations among the huge data with little existed theory, but also take powerful global search in the function space and is able to co-evolve the model structure and parameters [4]. Consequently, it is better to solve nonlinear system design using GP method than the empirical black box models [5] which lack the capability of model reconstruction and provide little insight about the underlying dynamic of the system.
However, uncertainty nonlinear model structure without prior information makes the search function huge so that the GP algorithm convergence is time-consuming and the optimal solution is difficult to obtain, especially in the case that the input variables of system affect each other and significant features are not easy to recognize. Typically more than one model structure can be predicted by the nonlinear dynamic system with the same smaller error, thus many candidate models can get better model accuracy, but with redundant structure and poor comprehensibility. On the other hand, the rapid growth of tree sizes in the GP algorithm tends to cause the algorithm stagnating and leads to the phenomenon of bloating, which is resulted from evolution of non-functional subtrees and inefficient crossover operation [6]. In another words, the complexity of a GP tree should be considered as an objective to be restrained in a low level.
To overcome the above challenges, multiobjective GP (MOGP) approach was considered to learn the best nonlinear model with highest accuracy, most parsimonious model and appropriate features to exhibit the relationship between inputs and the response output of the nonlinear system. In mathematics, assume that an unknown system can be expressed in the form of a general regression model as Equation (1).
In our paper, a novel EMO approach based on GP,is proposed with the purpose of combing the preference of decision making with the optimization process to get the final compromise solution in a single run. Consider that all the classical EMO algorithms require predefining some information (i.e.: weight values or goal threshold) about the special application, which is hardly able to be determined exactly for decision maker without any prior information of a nonlinear system. In addition, Pareto-based approaches (such as, the multiobjective genetic algorithms (MOGA) [7], the nondominated sorting genetic algorithms (NSGA-II) [8] and the strength Pareto evolutionary algorithm (SPEA2) [9]) can only get the Pareto-optimal set which have many redundant solutions and then tend to shade the ‘best’ solution. Consequently, the complex multi-criteria decision making has to be made by designers to obtain the final ‘best’ model of nonlinear systems, which usually require goal and preference information as well. Therefore, in order to obtain the satisfied solution efficiently and effectively, this paper suggests a new multiobjective optimization approach, which cooperate a new multiobjective ranking approach with GP, to address nonlinear system design problem.
To certify the validity of our proposed method, a simulation test is then implemented to identify the model design of a classical nonlinear autoregressive with extra inputs (NARX) model, which is the most popular method as a foundation for model construction [2]. The paper is therefore organized as follows. The next section will present the multiobjective optimization model of nonlinear design problem. In Section 3, the definition and implementation of the proposed MOGP is presented in detail. This approach is then applied to a nonlinear system with NARX model in Section 4. Experimental results are presented along with a comparison to traditional single-objective GP method, as well as a classical MOP method and a Pareto-based MOP approach. Concluding remarks are given in Section 5.
System modeling and system identification are two main difficulties of nonlinear systems design. The objective of system modeling is to determine the structure of a rule expressing the relationship between inputs and outputs of systems. After that, system identification aims to accurately estimate the unknown parameters of the system model. Traditional methods for nonlinear system modeling generate the physical relations in accordance with the prior knowledge about the system, and the regressors vector is simply composed of the values of all the n related process inputs, that is,
Actually, it can be seen that these criteria for nonlinear system design have the following characteristics with regard to model performance and model structure. The criteria of model structure complexity and model performance are incompatible to some extent. Usually a model with higher complexity structure performs better than others; Due to the redundant terms appeared in a model, the same model performance can correspond to more than one model structure, such as: the result of (x1 * x2 + x2) is the same as that of but the later structure includes redundant terms; Due to the dependence relationship of different variables, different feature combinations may generate the same model performance, such as: the result of (x1 * x2 + x2) is the same as that of (x1 * (x3 + x4) + x2) when x2 = x3 + x4 is satisfied. Besides, it is worth noting that some feature combinations may cause over-fitting problem.
Therefore, nonlinear system design problem can be formulated as a multiobjective optimization (MO) problem, in order to meet the demand for the solution model with high accuracy and parsimonious structure. This idea has been introduced by some literatures [2], aiming to identify the uncertainty structure of nonlinear systems which is the most difficulty for nonlinear system design. Assume an autoregressive nonlinear system model for the unknown output vector
These three objectives effect and restrict with each other. Compared with the single objective of approximation error, it benefits to increase the search speed to the optimal space and avoid the redundant structure and features appearing in the solution model. But the solution is much more difficult to obtain the satisfied solution than that of the single-objective method using the conventional stochastic optimization techniques, because the solution search space of multiple objectives increased enormously compared with that of a single objective. Thus, considering evolutionary algorithm is characterized by the parallel computing ability to find the optimal set, evolutionary multiobjective optimization (EMO) techniques are applied to address this problem (3).
A new rank approach in evolutionary algorithm for NSD
The rank method of individuals with multiple inconsistent objective functions is a difficulty of MOP. Traditionally, classical EMO approaches convert the MOP to a single objective optimization by aggregating the cooperating objectives, and obtain only one final solution in one process. On the other side, Pareto-based EMO approaches generally rank the nondominance degree of individuals, and then achieve the solution as a vector which is composed by a set of ‘nondominated’ solutions for all the objectives. Thus, the final solution should be determined by the goal information provided by designers. The definition about ‘
Note that both of these methods used the comparison of absolute distance of different individuals for ranking. Generally, the classical aggregating method defines fixed weights of the multiple objectives while dominance rank method assumes that all objectives have an equal weighting for optimization, both of which are not suit for solving NSD problem. Obviously, for NSD problem (3), those three multiple objectives are required in the different priority levels in different cases for solving the optimal design of nonlinear systems. Particularly, the final solution prefers more to be a smallest approximate error model with appropriate structure, rather than a simplest model without smaller approximate error model. That means f1 has the higher priority level than f2 and f3 in the model (3), while f2 and f3 locate in the same preference level, especially near the optimum solution on the Pareto Front. However, the weighting coefficients is usually difficult to be exactly defined to balance the different relative role of the objectives, the value of which often changed for different cases as well. In addition, the desired goal values of each objective can not also be determined in our problem. Therefore, a new rank approach of evolutionary algorithm is proposed here to be involved in the EMO solution algorithm for NSD problem, aiming to enhance the convergence efficient and directly evolve the unique accurate solution satisfied by the users’ favor.
Our new rank method can conduct the final Pareto-optimal solution in one process by comparing two individuals according to their compromise distances as the following definition. Let
Because it is found that compromise distance can adaptively reflect the characteristics of these three objectives as above, we propose the rank rule of this new rank method as following: (Assume all the objective values are positive which is consistent with NSD problem). Here, the term ‘rank’ is used to measure the performance of every individual. For our minimization problem, the smaller rank the better optimal solution. when d
ij
=0, then rank( when d
ij
>0, then rank( when d
ij
<0 and d
ji
>0, then rank( when d
ij
<0 and d
ji
<0, if d
ij
<d
ji
, then rank(
If
If rank(
In the situation I assuming that and for all objectives,
Then, according to Equation (3), it can be induced that d
According to the rule 2), it can be concluded that rank( In the situation II that part of That means
and
Then, d
So, it can be concluded that this proposed rank approach assign the lower dominated front lower rank than the higher dominated front, which is consistent with dominate rule.
The flowchart of the proposed MOGP approach is shown in Fig. 1.
Initialization
Because GP considers a tree to represent a chromosome, the structures of trees express the nonlinear system structure, and the leaves of them are features selected. In order to let model structures consistent with semantic restraints, appropriate values within the function and terminal sets of GP should be determined to restrict the search space. Here, we take a test of classical polynomial NARX model, since it has been demonstrated as one of the most common model of the deterministic nonlinear systems in practical applications [11]. Then, the function set of our GP algorithm is defined as {+, *}. And the terminal set of our GP algorithm involves all the n decision variables of a nonlinear system as Equation (2).
In the initialization stage, the same as original GP, the ramped half-and-half method is adopted to select the initial trees. The operators of the system model are randomly chosen from the function set, and significant features are selected by random among all the n input variables as the leaves of every tree to form the initial generation. Nevertheless, this method would result in the problem that an internal node could be chosen from the feature set and its leaf nodes might be nonsensical as a solution. So every tree individual should be checked as Fig. 2 before calculating the fitness values of it. This procedure is called as model certification. In this procedure, all the internal nodes are checked to ensure whether it is a mathematics operator or a variable symbolic. If an internal node is a variable symbolic, the subtree from this internal node down to its leaf nodes is deleted. Model certification is of particular importance for calculating the complexity of the nonlinear model in the following evaluation step.
Evaluation
After generating the structures of candidate trees in the first step, Least squared method (LS) is employed to estimate parameters in the model this candidate tree equals. After this, three objective functions as Equation (5) are calculated tree by tree to compose a multi-objective values vector.
Whereafter, the candidate trees in one generation are assigned a fitness rank value according to their multi-objective values vectors based on its compromise distance proposed in the previous section. The proposed rank scheme for our model (5) is implemented as follows: Select the first tree T1 and calculate the compromise distance d1j from T1 to any othertrees T
j
; Decide the rank value of T1 using d1j according to four rules of the compromise rank method. If the relationship between the rank of T1 and the value of d1j meets these rules, it doesn’t need to change any rank value; otherwise, the tree in the higher rank side should be re-ranked and its rank value is defined as one more than the rank of the tree in the lower rank side; Select the next tree and repeat the steps (1-2) until all trees’ rank are set.
In the evolution stage, selection, reproduction, crossover and mutation operations are adopted to generate new individual trees. First, tournament selection method is carried out to select individuals from the population. This method compares any two individuals and select the individual with better fitness value. Thus, the number of these selected individuals is the half of the population size. Then, these selected individuals are separated into three parts with the probability to apply for reproduction, crossover and mutation respectively. In the reproduction processing, the new individuals are generated by reproducing the individuals with better fitness values. In the crossover processing, every two individuals are selected, and a subtree of each individual is randomly chosen. Two new individuals are generated by exchanging the selected subtree of each individual. In the mutation processing, for every individual, a new individual is generated by replacing a random subtree of the individual with a new subtree. This new subtree is generated by the same method as the initial trees. After the operations of reproduction, crossover and mutation, an elitism mechanism is employed in order to let the better individual survive to avoid the lost of excellent genes due to random effects. This mechanism uses a competition strategy to obtain better offspring by a comparison of parents and their children. In this competition strategy, the fitness rank values of parents and children trees are evaluated and two trees with lower ranks are selected to construct the new generation. This elitism mechanism has been demonstrated the convergence property [12] and successfully applied in the genetic algorithm to solve some real application. It could improve the convergence speed and the accuracy of solutions.
Therefore, the main loop of the proposed algorithm is implemented as follows: Initialize the function set and terminal set, set the parameters of the algorithm, such as the population size, generation number, crossover probability, mutation probability, etc. Evaluate the rank value of every tree in a generation using the proposed rank rule based on compromise distance. Operate the evolution processing to generate a new population. First, select some individuals using tournament selection method and the candidate tree with smaller rank is selected in the tournament comparison. Then, assign every tree of the selected individuals with an evolution operation among three different operations: reproduction, crossover and mutation, based on operation probability. Realize the corresponding operations of parent trees to generate their children trees, and calculate the fitness rank value of each children tree. After that, combine all the parent trees with children trees as an intermediate population and rank them by the proposed rank rule. Put the trees with smaller ranks in the next generation set until the population set is met. Go to Step 2 and evaluate the new population. When the fitness value with the smallest rank meets the goal value or the terminate conditions are satisfied, the loop ends. Otherwise, continue to realize following steps.
Simulation
In order to test and illustrate the performance of this proposed algorithm, this approach is applied to the task of designing a typical nonlinear system with the NARX model and two comparison experiments are made. Firstly, the comparison of the results of the proposed MOGP approach with single-objective GP is presented in order to show that NSD problemcan be addressed better by considering as a multi-objective problem than a single objective problem. Without generalization, this single-objective GP choose the approximate error of model as the objective function. Secondly, one kind of Pareto-based EMO, NSGAII, is chosen to deal with the evaluation and evolution process of multi-objective GP. Here, this cooperation of EMO algorithm is called as NSGAII-GP. By comparing the proposed MOGP approach with NSGAII-GP, the validity and superior property of the proposed rank method based on compromise distance is demonstrated.
Assume the input regression matrix
In the simulation, three objective functions as (5) are defined to optimize the model performance and mode structure. f1 calculates the NSE between the estimated and true output, f2 calculates the number of multiply operators and the number of terms in the model structure, f3 calculates the number of selected features in the model structure. The optimal solution should have a smallest NSE, and a parsimonious model structure with fewer selected features, so the optimal solution is obtained by minimizing f1, f2 and f3 in (5) simultaneously. Thus, the proposed MOGP method and NSGAII-GP method are applied to solve the multi-objective optimization model (5), and the single-objective GP method uses the sum of f1, f2 and f3 with weights. Tables 1–3 present the results of the proposed MOGP approach, NSGAII-GP and single-objective GP in 10 trails. It is shown that the proposed MOGP approach can obtain the optimal solution with the minimum NSE and a parsimonious model structure that is the same as the real one.
As the comparison of the proposed MOGP and the single-objective GP, Table 3 reveals that the single-objective GP is unstable while the convergence of the proposed MOGP approach is fast and stable shown in the Table 1. For instance, the trial 2, 6 and 8 in the GP simulations do not even converged due to its excessive dependence on the initial population, and this convergence problem does not occur in the proposed MOGP simulation. Even though some trials of GP may converge to the minimum model error, the model solution is not in the parsimonious form (such as: trial 1, 5, 7) or has certain redundant terms (such as: trial 3, 10). In addition, the accuracy of solution model structure is 100% in the proposed MOGP for several trials but less than 50% in single-objective GP. The reason is that original GP tends to get stuck in local optimal point when more than one structure correspond to a smaller error. Therefore, it is more reasonable to treat NSD problem as a multi-objective optimization problem than as a single objective optimization problem.
Through the comparison of Tables 1 and 2, it can be found that NSGAII-GP cannot obtain the global optimal structure of models while the proposed MOGP can converge to the best solution. NSGA-II can only find the Pareto-optimal set, from which designers need use multi-criteria decision making (MCDM) techniques to obtain the best solution. But the realization of MCDM always needs some weight information among multiple objectives or goal information about the special application, which can hardly be obtained for most NSD problem. Hence, NSGAII-GP chooses the smallest approximated error in the Pareto-optimal set as the final result. It is observed that the proposed MOGP provides a better way to solve the NSD problem without any prior information and is able to obtain a satisfiedsolution.
The convergence properties of these three algorithms are shown in Fig. 3. In Fig. 3(a), average learning curves of three different algorithms in 10 trials are presented respectively by converting multiple objectives to a single objective as f1 + λ1f2 + λ2f3. Here, the experience weights λ1 and λ2 are chosen as 0.001 by many training experiments. It can be seen that the proposed MOGP can converge quickly to the global optimum of multiple objectives while other two algorithms can not. To understand the reasons behind, their average learning curves for model error objective (f1) and model complexity objectives (f2 + f3) are plotted in Fig. 3(b) and 3(c) respectively. It is observed that all these algorithms can converge to the best model error but only the proposed MOGP can converge to the best model structure with low complexity at the same time.
Due to the excellent convergence property of the proposed MOGP in terms of model complexity, tree chromosomes of the proposed MOGP are simpler than those of other two algorithms after some generations, consequently the proposed MOGP requires less time than other algorithms in a single run. This statement can also be validated by Table 4 which reports CPU computing time for comparison.
The above simulation results show that the proposed approach has superior performance to design nonlinear polynomial systems than single-objective GP and Pareto-based EMO, especially it can obtain the correct model structure for the nonlinear systems whose input variables are dependent. It can converge to more simple structures with fewer features in case of smallest approximated error, compared with single-objective GP and NSGAII-GP. Moreover, the optimum result of the proposed MOGP is almost the same as the real solution Equation (6) and the proposed MOGP can obtain the real structure of model in this test.
The reason why the proposed MOGP can have such a good performance for the NSD problem attributes to the special relationship of three objectives. For this test, some objective vectors (log10f1, f2, f3) around the global optimum are shown in Fig. 4 where model error is calculated by NSE. It can be observed that the model error values of the global optimum or local optimum are away from other points by more than 103 but the difference of model complexity values is less than 10. Therefore, the methods of aggregating all the priori information as a fitness function tend to get stuck in the local optimum, and the proposed ranking method based on compromise distance can be more easy to achieve the global optimum.
In sum, the results show that the proposed MOGP is available to design a nonlinear system, especially the systems whose input variables are not independent. Moreover, it can converge to more simple structure with fewer features in the case of smallest approximate error, compared with single-objective GP and NSGAII-GP.
An all-in-one automated nonlinear system design scheme based on the proposed MOGP approach and NARX representation was presented. The nonlinear system design is modeled as hybrid estimation of mode parameters, model structure and feature selection simultaneously. The proposed approach combined the benefit of multi-objective optimization and structure learning attribute of genetic programming to identify the nonlinear system. In this new approach, a new multiobjective rank evolutionary algorithm for Pareto-based EMO is proposed to conduct the final Pareto-optimal solution in one process by comparing the compromise distances of individuals. The validity of this approach for nonlinear system design is proved by a dataset with classical NARX model. Simulation results present that the results of the proposed approach show higher solution accuracy and faster convergence than the single-objective GP and another multi-objective GP algorithm (NSGAII-GP). It can be concluded that multi-objective optimization techniques are more suitable than the single objective optimization methods to address the NSD problem. In addition, the proposed multiobjective evolutionary algorithm can obtain the final best solution with little information about the special applications, which is a superior character compared with other Pareto-based EMO algorithm. In the future, this algorithm can be implemented for more real-world applications to uncover the underlying models of whom structure and significant features are unknown as a priori.
Footnotes
Acknowledgments
This work was supported by the National Natural Science Foundation of China (Grant no. 61401145), the Natural Science Foundation of Jiangsu Province (Grant no. BK20140858, Grant no. BK20151501), and the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD). The authors would like to thank all the people who help us to overcome many difficulties during this research work.
