Abstract
Although many scholars claim that their algorithms outperform others in the state estimation problem, only a few of these algorithms have been convincingly applied in engineering practice. The reason is that such algorithms outperform others only in some aspects, such as estimation accuracy or computation load. To address the problem of evaluating the performance of state estimation algorithms, this paper proposes the comprehensive evaluation measures (CEM) for nonlinear estimation algorithms (NEAs), which comprehensively reflect the performance of the NEAs. First, we introduce three types of NEAs. Second, the CEM, which combines the flatness, estimation accuracy, and computation time of the NEAs, is designed to evaluate these NEAs. Finally, the superiority of the CEM is verified by a numerical example, which provides theoretical and technical support for decision makers choosing among nonlinear estimation algorithms.
Keywords
Introduction
Nonlinear estimation algorithms (NEAs) are widely used in many fields, such as target tracking [1, 2], satellite navigation [3], fault detection [4], signal processing [5], and optimal control implementation [33]. Typical NEAs for nonlinear dynamical systems include the extended Kalman filter (EKF) [7, 8], the unscented Kalman filter (UKF) [9], and the particle filter (PF) [10, 11]. The EKF is the most popular of these and is based on Taylor series expansions of the nonlinear functions in the model description. In the last two decades, a number of new filters based on the KF framework have been proposed. Instead of Taylor series expansions, they employ stochastic linearization to estimate a random variable by a group of points that are transformed through the nonlinear functions [12]. Such a filter is called a sigma-point filter, and its advantage is that it does not need to compute the Jacobian matrices of the nonlinear functions. The unscented transform (UT) adopted in the UKF [13] is one of the stochastic linearization methods. However, an important assumption for the EKF and UKF is that the prior density is Gaussian. As a result, several modifications were presented to account for the non-Gaussianity of the various densities, such as the PF [14, 15], which implements recursive filters via Monte Carlo simulations. The above two approaches can cope well with nonlinear and/or non-Gaussian filtering and estimation. Ref. [16] proposed a new filtering technique to solve a nonlinear state estimation problem and used the root mean square error, the percentage of track loss, and the execution time to demonstrate that the proposed filter provides more accurate results than the cubature Kalman filter, the unscented Kalman filter, and the Gauss-Hermite filter. Ref. [17] evaluated the performance of its EKF-STDF against some representative positioning algorithms by numerical simulations in terms of the root mean squared error (RMSE) between the estimated positions and the true positions.
For the tight integration of low-cost ultra-wideband (UWB) ranging sensors, the EKF. Ref. [18] used the average time and standard deviation to compare the Double-EKF with the EKF, the indirect Kalman filter (IKF), the UKF, the factored quaternion algorithm (FQA), and the algebraic quaternion algorithm (AQUA). Ref. [19] applied the extraction error, the filtering deviation (DE), and the RMSE to compare the performance of the radial basis function neural network based on the improved UKF algorithm, the traditional UKF algorithm, and the long short-term memory algorithm at different SNRs. The position errors and the estimated 3σ2 bounds along the East and North directions are utilized to compare the performance of the reduced UKF, consensus-based distributed UKF, federated UKF, sequential UKF, and standard UKF [20]. In [21], the RMSE is used to evaluate the performance of the UKF, strong tracking UKF, M-estimation UKF, variational Bayesian VBHUKF, removed strong tracking UKF, and generalized strong tracking UKF. The prediction performance of the PF, unscented PF, radial basis function, long short-term memory, and PF+LSTM methods is evaluated and compared based on multiple criteria, including the absolute error, mean absolute error, RMSE, and relative accuracy [22]. Furthermore, the cumulative probability distribution (CPD) of errors is used to evaluate the performance of the inertial measurement unit-only method, the EKF algorithm, the UKF algorithm, and the grey wolf optimizer-PF [23].
In practical applications, many scholars claim that their algorithms outperform others in the state estimation problem [24]. However, only a few of these algorithms have been convincingly applied in engineering. It is therefore important to explore and design more comprehensive metrics for evaluating these algorithms, and an extensive literature is now available on the performance evaluation of NEAs (PE-NEA) (see, e.g., [25–27]). As shown in Fig. 1(a), the NEA is still a hot research topic, and the EKF, UKF, and PF remain the mainstream algorithms for solving nonlinear estimation problems. Figure 1(b) to (d) illustrate the research trends and applications of these three commonly used estimation algorithms in various fields.

Number of publications on the NEA, EKF, UKF, and PF in the past 10 years.
In performance evaluation, the error spectrum (ES) is a comprehensive measure because it aggregates several commonly used incomprehensive metrics [28]. Therefore, the ES has recently attracted much research aimed at improving it [29–32]. Unfortunately, it is difficult to analyze an NEA's performance based on the ES, because the ES is a three-dimensional (3D) plot over the total time span. Accordingly, [30] proposed a dynamic error spectrum (DES) based on the average value of the ES. However, it was found that the DES suffers from an information loss problem. Therefore, we proposed the range error spectrum induced area (RESA) and the DES induced area (DESA) to overcome this problem [27, 31]. Even so, it is still hard to distinguish which estimator performs better by using the DES, RESA, and DESA. A volume error spectrum (VES) was therefore proposed to further address the evaluation problem for dynamic systems [32]. In fact, the above metrics (i.e., DES, RESA, and DESA) need to combine the whole three-dimensional plot (the ES curve in dynamic systems) into a single two-dimensional plot. Clearly, it is still not easy to identify the better estimator when two ES curves intersect over the whole time horizon.
Therefore, this paper proposes the comprehensive evaluation measures (CEM) to evaluate existing nonlinear estimation algorithms, which provides theoretical and technical support for decision makers choosing among the NEAs. Three types of metrics are used to quantify the flatness, measure the estimation accuracy, and calculate the computation time of each NEA. The CEM, which combines the flatness, estimation accuracy, and computation time of an NEA, is then designed to evaluate its performance.
This paper is organized as follows. Three NEAs are summarized in Section 2. In Section 3, the CEM is proposed to evaluate these three NEAs. A numerical example is provided in Section 4 to illustrate the superiority of the CEM as a comprehensive metric. Section 5 concludes this paper.
In this paper, the model is given by
As is well known, there exist several types of NEAs. For example, the EKF is an NEA that uses the linearization of the process and observation equations [33]. The UKF estimates the above model by calculating the statistics of a random variable that undergoes a nonlinear transformation [13]. The PF is based on the Monte Carlo method with sequential importance sampling, which gives a numerical method for nonlinear non-Gaussian state estimation [34].
However, in practical applications, choosing the best nonlinear estimation algorithm in a scientific and reasonable way is very important for designing a reasonable estimator. To solve this problem, we propose the dynamic enhanced error spectrum in the following section.
In the following, the comprehensive evaluation measures (CEM) for the PE-NEA are proposed, which include the flatness of the estimator, the estimation accuracy, and the computation time.
Quantifying the flatness of an NEA
In practical applications, the stability of an estimator mainly means that its estimation errors remain within an acceptable range over a certain period of time. To quantify the flatness of a nonlinear estimation algorithm, we previously proposed the range error spectrum induced area (RESA), i.e., [31]
Using the random simulation, at time instant t, the RESA can be calculated as
Since it is difficult to calculate $f^{-1}(S(r_i))$ analytically for $r_i \in [-1, 2]$, $i = 1, 2, \cdots, n$, an approximation method is used in Equation (4) as
In particular, for $r_n = 2$ and $r_1 = -1$, we see that
In fact, $S(r_n = 2)$ and $S(r_1 = -1)$ are the RMSE and the harmonic average error (HAE), respectively.
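As a rough illustration of how an error spectrum can be swept over $r \in [-1, 2]$, the following sketch assumes the usual power-mean form $S(r) = (\mathrm{mean}(e^r))^{1/r}$ of the ES, for which $r = 2$ recovers the RMSE. The function names and the trapezoidal area are hypothetical stand-ins chosen for this example; they are not the RESA definition of [31], whose equations are not reproduced in this text.

```python
import numpy as np


def error_spectrum(errors, r):
    """Power mean of strictly positive error norms (assumed ES form).

    S(r) = (mean(e**r))**(1/r); r = 2 gives the RMSE and the limit
    r -> 0 is the geometric mean of the errors.
    """
    e = np.asarray(errors, dtype=float)
    if np.any(e <= 0):
        raise ValueError("error norms must be strictly positive")
    if abs(r) < 1e-12:  # treat r ~ 0 as the limiting geometric mean
        return float(np.exp(np.mean(np.log(e))))
    return float(np.mean(e ** r) ** (1.0 / r))


def spectrum_area(errors, n=31):
    """Trapezoidal area under S(r) on a grid r_1 = -1, ..., r_n = 2.

    Hypothetical illustration only: an area-type summary of the ES,
    not the RESA formula of the paper.
    """
    r_grid = np.linspace(-1.0, 2.0, n)
    s = np.array([error_spectrum(errors, r) for r in r_grid])
    return float(np.sum(0.5 * (s[1:] + s[:-1]) * np.diff(r_grid)))


if __name__ == "__main__":
    sample_errors = [1.0, 2.0, 4.0]  # made-up error norms at one time instant
    print("S(2)  =", error_spectrum(sample_errors, 2.0))
    print("area  =", spectrum_area(sample_errors))
```

In a Monte Carlo setting, `errors` would hold the error norms of all runs at one time instant, so the sweep is repeated for every t.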
According to Equation (3), the DESA was defined as follows [31]
Over the time interval $[T_1, T_m]$ and over M Monte Carlo runs, the average computation time (ACT) of the nonlinear estimation algorithm is defined as
where M is the number of Monte Carlo runs and $t = 1, 2, \cdots, m$.
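The measurement behind the ACT can be sketched as follows. This is a minimal example that assumes the ACT at time step t is the wall-clock time of one filter update averaged over the M runs; `step_fn` is a hypothetical placeholder for a single EKF, UKF, or PF update, and the exact defining equation is not reproduced in this text.

```python
import time

import numpy as np


def average_computation_time(step_fn, n_runs, n_steps):
    """Per-step wall-clock time averaged over n_runs Monte Carlo runs.

    step_fn(run, t) is a hypothetical callable that performs one
    filter update; the result has shape (n_steps,).
    """
    elapsed = np.zeros((n_runs, n_steps))
    for run in range(n_runs):
        for t in range(n_steps):
            start = time.perf_counter()
            step_fn(run, t)
            elapsed[run, t] = time.perf_counter() - start
    return elapsed.mean(axis=0)


if __name__ == "__main__":
    # Dummy workload standing in for a real filter update.
    act = average_computation_time(lambda run, t: sum(range(1000)), n_runs=5, n_steps=4)
    print(act)
```

`time.perf_counter` is used because it is a monotonic, high-resolution clock, which matters when a single update takes well under a millisecond.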
To take full advantage of the RESA, the DESA, and the ACT, the CEM combines the three metrics into a single objective function [31].
Since the RESA, the DESA and the ACT have different magnitudes and units, normalization should be carried out first, i.e.,
Thus, the CEM is defined as
Furthermore, we propose an additive form and a multiplicative form of the CEM.
(a) If a prior preference about the weights is available, the additive form of the CEM is given as
Clearly, the estimator with a smaller $\mathrm{CEM}_M$ is better. Furthermore, Equation (14) indicates that the ADES, ARES, and ACT are equally important in the PE-NEA.
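The two forms can be illustrated with a short sketch. The normalization and weighting formulas are not reproduced in this text, so the code below makes explicit assumptions: each metric is normalized by its maximum over the compared estimators, the additive form is a weighted sum, and the multiplicative form is a plain product. All function names and the sample numbers are hypothetical.

```python
import numpy as np


def normalize(values):
    """Scale a metric by its maximum over the estimators (assumed scheme)."""
    v = np.asarray(values, dtype=float)
    return v / v.max()


def cem_additive(resa, desa, act, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of the normalized metrics; smaller is better."""
    w1, w2, w3 = weights
    return w1 * normalize(resa) + w2 * normalize(desa) + w3 * normalize(act)


def cem_multiplicative(resa, desa, act):
    """Product of the normalized metrics; all three count equally."""
    return normalize(resa) * normalize(desa) * normalize(act)


if __name__ == "__main__":
    # Made-up values for two estimators, for illustration only.
    resa, desa, act = [2.0, 4.0], [1.0, 2.0], [4.0, 1.0]
    print("additive:      ", cem_additive(resa, desa, act))
    print("multiplicative:", cem_multiplicative(resa, desa, act))
```

Each input holds one value per estimator at a given time instant. Changing `weights` expresses a decision maker's preference in the additive form, whereas the multiplicative form needs no weights.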
As stated before, we obtain the architecture diagram of the PE-NEA as shown in Fig. 2 and summarize the above comprehensive evaluation measures in Table 1.

The architecture diagram of the PE-NEA.
Comprehensive evaluation measures
In this section, a test case for the performance evaluation of nonlinear estimation algorithms is designed to illustrate the superiority of the CEM in the PE-NEA.
State equation and measurement equation
In this part, we consider the following state equation and measurement equation
Table 2 shows the initialization parameters of the above three NEAs.
Initialization parameters
Let $[T_0, T_m] = [0, 50\,\mathrm{s}]$. After M = 500 Monte Carlo runs, we obtained the tracking results of the three NEAs shown in Fig. 3, which shows that the measurement y matches the state x within the time intervals [0, 5 s] and [20, 30 s], and does not match the state x within the time intervals [6, 19 s] and [31, 50 s].

The estimated results of the above three NEAs.
To illustrate the superiority of the CEM, we used the average of the estimation value among the above 500 Monte-Carlo runs at time instant t, i.e.,
In addition, the absolute errors are given as
First, we used the RMSE, ADES (t), and ARES (t) to evaluate the three NEAs; the results are shown in Figs. 4–6, respectively.
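The per-instant quantities used in this comparison can be computed from the stored Monte Carlo estimates as in the sketch below. It assumes that the absolute error is the absolute difference between the run-averaged estimate and the true state, and that the RMSE is taken over runs at each time instant; the function name and array layout are hypothetical.

```python
import numpy as np


def mc_summary(estimates, truth):
    """Summarize Monte Carlo estimates at every time instant.

    estimates: array of shape (M, m), one row per Monte Carlo run.
    truth:     array of shape (m,), the true state trajectory.
    Returns the run-averaged estimate, its absolute error, and the RMSE.
    """
    est = np.asarray(estimates, dtype=float)
    x = np.asarray(truth, dtype=float)
    mean_est = est.mean(axis=0)                       # average over the M runs
    abs_err = np.abs(mean_est - x)                    # assumed absolute-error form
    rmse = np.sqrt(np.mean((est - x) ** 2, axis=0))   # RMSE over runs
    return mean_est, abs_err, rmse


if __name__ == "__main__":
    # Two made-up runs over two time instants.
    mean_est, abs_err, rmse = mc_summary([[1.0, 2.0], [3.0, 2.0]], [2.0, 1.0])
    print(mean_est, abs_err, rmse)
```

Note that the averaged estimate can have zero absolute error while the RMSE is nonzero, because the RMSE keeps the spread of the individual runs.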

Evaluated results of the RMSE curves.

Evaluated results of the ARES (t) curves.

Evaluated results of the ADES (t) curves.
From Fig. 4, we see that the $\mathrm{RMSE}_{PF}(t)$ curve is lower than the $\mathrm{RMSE}_{UKF}(t)$ curve, and the $\mathrm{RMSE}_{EKF}(t)$ curve lies between them within the time interval [0 s, 31 s], so we have
Likewise, Fig. 6 shows that
Clearly, for the estimation accuracy, the evaluation results obtained using the ADES (t) are the same as those obtained using the RMSE. However, in practical applications, we may need to consider not only the accuracy of the estimator but also its flatness and estimation speed. Evaluation results obtained in this way can provide a better scientific basis for decision-making.
Furthermore, as depicted in Fig. 5, we see that
Similarly, we can see from the time interval [32s, 50s] that
So, among these three NEAs, the estimation results of the PF are the most stable, which shows that the PF is the best NEA when only the flatness of the estimator is considered.
Second, over the time interval [0, T], the estimation times of the EKF, UKF, and PF are shown in Fig. 7.

Evaluated results of the ACT curves.
So, according to Equation (10), we obtained the ACT, i.e.,
That is,
Clearly, the EKF has the shortest computation time, the UKF is next, and the PF has the longest computation time among the three NEAs.
Furthermore, according to Equations (19) and (22), we can see that different metrics lead to different evaluation results. Therefore, to jointly consider the flatness of the estimator, the estimation accuracy, and the computation time in the PE-NEA, the $\mathrm{CEM}_A$ and the $\mathrm{CEM}_M$ were used to evaluate the three NEAs, as shown in Figs. 8 and 9, respectively.

Evaluated results of the $\mathrm{CEM}_A$ curves.

Evaluated results of the $\mathrm{CEM}_M$ curves.
Similarly, the $\mathrm{CEM}_A$ curves of the three NEAs give the same ranking as when only the flatness of the estimator or only the estimation accuracy is considered. From Fig. 8, the PF is the best of the three NEAs because its $\mathrm{CEM}_A$ curve is the lowest of all the curves. In fact, the ACT is too small to affect the value of the $\mathrm{CEM}_A$, since it is dominated by the larger terms in the additive form. Fortunately, this problem can be solved by the $\mathrm{CEM}_M$.
From Fig. 9, in the time intervals [1, 9 s] and [18, 30 s], we obtained the following results from the $\mathrm{CEM}_M$ curves, i.e.,
However, in the time intervals [10, 17 s] and [31, 50 s], we obtained
Clearly, the total length of the intervals [10, 17 s] and [31, 50 s], which is 28 s, is greater than the total length of the intervals [1, 9 s] and [18, 30 s], which is 22 s. That is, the EKF outperforms the UKF over the larger part of the time span. So, the EKF is better than the UKF.
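The interval lengths in this comparison can be checked with a few lines of code. The sketch assumes a 1 s sampling period and counts both endpoints of each interval, which reproduces the 28 s and 22 s quoted in the text; `count_instants` and `count_wins` are hypothetical helper names.

```python
import numpy as np


def count_instants(intervals):
    """Number of integer time instants covered by closed intervals [lo, hi]."""
    return sum(hi - lo + 1 for lo, hi in intervals)


def count_wins(curve_a, curve_b):
    """Number of time instants at which curve_a lies strictly below curve_b."""
    a = np.asarray(curve_a, dtype=float)
    b = np.asarray(curve_b, dtype=float)
    return int(np.sum(a < b))


if __name__ == "__main__":
    longer = count_instants([(10, 17), (31, 50)])   # 8 + 20 = 28
    shorter = count_instants([(1, 9), (18, 30)])    # 9 + 13 = 22
    print(longer, shorter, longer > shorter)
```

When the two $\mathrm{CEM}_M$ curves are available as arrays, `count_wins` gives the same tally directly without reading the intervals off the figure.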
Similarly, we got
So, we obtained
According to Equations (25) and (27), we got
In other words, we obtained
where EKF ≻ UKF means that the EKF outperforms the UKF.
As discussed above, if only the flatness of the estimator is considered, we have
Furthermore, if only the estimation accuracy is considered, we obtain
However, if only the computation time is considered, we get
Finally, if the flatness, the estimation accuracy, and the computation time are considered together, we get
The primary contribution of this paper is the comprehensive evaluation measure (CEM) for evaluating existing nonlinear estimation algorithms, which comprehensively reflects their flatness, estimation accuracy, and computation time. (1) The additive form of the CEM not only integrates the stability, accuracy, and computation time of estimation algorithms, but also reflects the decision maker's intentions through the choice of weights. (2) The multiplicative form of the CEM balances the stability, accuracy, and computation time of estimation algorithms and is not affected by extreme values. In future work, we will apply the CEM to other NEAs.
Footnotes
Acknowledgments
We would like to thank the anonymous reviewers for their helpful comments and suggestions. This work was supported in part by the National Natural Science Foundation of China under Grants No. 72271243, No. 71801222, and No. 61973253; the National Science Foundation of Shaanxi Province of China under Grant No. 2018JQ6019; and the National Postdoctoral Program for Innovative Talents under Grant No. BX201700104.
