Abstract
The methods of searching for optimized parameters have substantial effects on the forecast accuracy of ensemble data assimilation systems. The selection of these factors is usually performed using trial-and-error methods, and poor parameterizations may lead to filter divergence. Combined with the local ensemble transform Kalman filtering method (LETKF), a technique for an automated search of the best configuration (parameters) of a data assimilation system is proposed. To obtain better assimilation, a differential evolution (DE) algorithm-based multiple-factor parameterization method results in the corresponding circumstances. By combining with fast-searching DE algorithms, we may retrieve the most ideal parameter combinations. Several numerical experiments performed with the Lorenz-96 model show that new methods performed better than the original one-parameter optimization methods. As the basis of DE methods, the best combinations of the local radius and the covariance inflation parameter, which can guarantee the best DA performances in the corresponding circumstances, are retrieved. It is found that the new method is capable of outperforming previous search algorithms under both perfect and imperfect model scenarios, and the calculation cost in Lorenz-96 model is lower. However, how to apply the new proposed method to more complex atmospheric or land surface models requires further verification.
Keywords
Introduction
Data assimilation (DA) provides an objective methodology to combine prior information, observational data, numerical model predictions, and their corresponding error statistics in order to produce a better estimate of the state of a geophysical system [1]. To reduce sampling errors and remove spurious long-distance correlations in the ensemble-based error covariance, inflation [2] and localization [3, 4] have been introduced into ensemble Kalman filters (EnKF) to offset the underestimation of the true variance. Moreover, DA involves two key control parameters, i.e., the inflation factor and localization length scale, which may significantly influence the optimization performance of the DA. Therefore, the optimal values of those tuning parameters, which may depend on different cases (e.g., models and ensemble sizes), are traditionally determined using trial-and-error methods. Meanwhile, it is difficult to select appropriate factors in real data assimilation due to the lack of real model state, limited observational information and the missing model and observational errors [1, 5].
A series of techniques have proposed within the DA literature to deal with these tuning factors. For example, a relationship between localization lengths and ensemble sizes in domain localization and observation localization has been proposed by Kirchgessner et al. [6]. In addition, an efficient serial scheme was introduced in the EnKF analysis step for the purpose of mitigating the effects of observations’ perturbations [7]. Moreover, Raanes et al. [8] studied multiplicative inflation with the complementary scaling of the state covariance in the EnKF. In practice, Gharamti [9] proposed an enhanced adaptive inflation algorithm for ensemble filters to combat the loss of variance caused by various models and sampling errors in the forecast. Additionally, Attia and Constantinescu [10] presented an optimal adaptive tuning experimental design framework to minimize the uncertainty during the DA processes. However, there was not a theoretical method presented to select the best parameters, especially for the multiple factors that influence the whole DA process.
Inspired by Genetic Algorithm (GA) theories, a new data assimilation method coupled with the crossover principles of vector algorithms based on ensemble transform Kalman filters (ETKFs) was proposed to solve the difficult problem of error adjustment factor searches [11]. Subsequent to this work, a new parameterization method was developed using evolutionary strategies that were both relatively easy to program and computationally efficient [12]. However, only one parameter could be selected within those frameworks. In this study, inspired by the previous evidence of error handling methods, we extend the previous studies to propose a multiple-factor optimization method with DA to choose better values within the DA framework. The search algorithm belongs to the class of derivative-free evolutionary optimization methods designed to mimic biological (Darwin’s) evolution. In the data analysis part, two or more factors could be selected using DE algorithm-based DA methods, which provide the best assimilation results in certain circumstances.
The remainder of this paper is organized as follows. In Section 2, the classical local ensemble transform Kalman filter (LETKF) is reviewed as the theoretical basis of this study. Then, a DE-based parameter optimization system is introduced in Section 3. Lorenz-96 model is used to verify the new method in Section 4. A summary of the main results followed by a general discussion concludes the paper in Section 5.
Related works and theoretical basis
The LETKF was proposed as a more computationally efficient version of the ETKF [3, 4]. In this scheme, the number of ensemble members is
where
Refer to Eq. (2),
where
The estimated parameter disturbance is obtained by Eq. (4)
Where
The analysis covariance matrix
In the LETKF-based data assimilation system, many factors affect the assimilation results. The performance of DA systems is largely based on the optimal choice of the key parameters of the DA methods, such as the covariance inflation factors, the localized radius and the ensemble numbers. Unfortunately, the optimization methods that have been used to solve static data assimilation problems face difficulties in direct dynamic DA applications. The main reason is that the traditional GAs usually converge to a single optimum, while in dynamic DA problems, the solution landscape continues to change, which means that there are several local optimal points that will be revealed during the solving process. One of the ways to overcome the shortcomings of traditional genetic algorithm-based methods is to apply a differential evolution algorithm. Figure 1 illustrates the simultaneous estimation cycle.
Schematic representation of the optimal inflation factors and localization radius searching DA cycles using the Differential Evolution (DE) algorithm.
Similar to the DA systems [11], Fig. 2 shows the schematic diagram of the DA systems coupled with the DE algorithms. The idea behind this approach and the connection between the DA models and the DE is shown in Fig. 1. The DA systems in Fig. 2 will be the same as those in traditional DA systems. However, the fitness minimum mRMSE function defined later will be selected based on the real DA systems. The DE part of Fig. 2 will handle to the select the multiple optimal factors, such as the combination (referred to as the “gene” in the DE algorithm) of the local radius parameter and the covariance inflation parameter. In terms of the coupling strategy, our method is similar to those presented in previous studies [14]. The choice of the fitness function can refer to those in the literature for different DA systems.
Flowchart of a DE-based parameter optimized DA system. The DE search part is connected to the normal DA part serially, which forms a kind of feedback mechanism of the DA systems.
The DE algorithm is a population-based, heuristic global search algorithm [15]. Like other evolutionary algorithms, the technique proposed in this paper defines a “population” of parameter vectors (“individuals”). “Individuals” in the “population” give “birth” to a new “generation” whose “individuals” (“children”) are computed by a number of simple rules called “mutation”, “crossover”, etc. From the “parents”, “individuals” in the new “generation” are evaluated by a “fitness function” (the time and space averaged forecast RMSE). “Individuals” with the smallest fitness function “survive” and “give birth” to the next “generation”.
The population for which the data assimilation method parameters are considered as vectors is initialized. The original population is defined as follow:
Here,
where rand (0, 1) denotes the uniformly distributed random variables within the range (0, 1). The prescribed minimum and maximum parameter bounds are:
The difference between the differential evolution algorithm and the ordinary GA is that the former employs a difference strategy to achieve the variation. The main idea of the difference strategy is to randomly select two different individuals in the population and take the adjustment value of the individuals’ difference with a third individual to obtain the trial vector according to Eq. (9):
where
The crossover operation is used to reduce the overly fast convergence of a vector perturbation. The population
In Eq. (10), the crossover rate
The selection is the decisive operation in the DE algorithm that changes the population composition. The standard used to determine whether the generation population individual
The mRMSE of time is used to evaluate the performance of the DA system. The mRMSE is computed according to Eq. (11):
Where
The selection standard is defined by Eq. (12):
The
In addition, because of the bounds of the parameter space of the DA system, there may be a cross-border situation in the mutation process, resulting in meaningless system parameters. In this paper, the evolutionary optimization process is continued by using a random value in the parameter space instead of the crossed value.
The performance of the DE algorithm depends mainly on two aspects, namely the selected test vector generation strategy and the relevant parameter values used. In the past decade, many empirical guidelines have been proposed for choosing trial vector generation strategies and their associated control parameter settings [15, 16, 17]. In our case, according to Storn [15], we choose a reasonable value for the crossover operation
Numerical experiments
The Lorenz-96 model is a strongly nonlinear dynamical system and is used to verify the validity of the DA algorithm [18]. The state dimension is chosen as 40 and the governing equations are given in the following:
The main components of the experimental system are a high-dimensional nonlinear Lorenz-96 model, the LETKF and the DE algorithm. In this experiment, the optimal parameter combination of the local radius (irad) parameter and the covariance inflation (infla) parameter are sought, and the corresponding mRMSE is the minimum value in the parameter space. In the following experiments, for a normal DA system parameter setting, the ensemble number chosen is 10. The total number of iteration steps is 10000. To avoid the transient effect, the first 1000 steps are omitted. The observation error covariance is defined as 2. Usually atmospheric data are dense in the Northern Hemisphere (over United States, Europe, …) and very sparse in places like the Pacific. A good strategy would be to observe the first 20 variables and see how the optimization algorithm work. The observation numbers are set to 20 and are the fraction of one point at which observations are taken.
To check the trend of the mRMSE with changing parameters in the DA process, the mRMSE for LETKF in the parameter space of the local radius and covariance inflation is calculated. In the experiment, we set the range of covariance error inflation and localized radius as follows:
From Fig. 3, we can draw several conclusions. (1) With the increase in the local radius (irad), the mRMSE of LETKF and the growth rate of mRMSE become larger. (2) The experiments with severe model errors (Fig. 3a–d) produce very similar results to those of
Time mRMSE of inflation and local radius vectors. (a)–(d) with severe model error and (b) without model error.
Using the local radius and the covariance inflation parameter as the gene in the DE framework, the possibility of searching for the optimal parameter combination is investigated. The individual distribution of the DA method parameter population with the population update is studied. In the experiment, the number of populations is set to be 15 and the total number of generations is set to be 20. The forcing factor
The individual distributions with the evolutions with severe model error (
Figure 4 shows the individual distributions of the initial population and the populations of the 1st, 4th, 8th, 12th, 16th and 20th generations. The red points in Fig. 4 represent the individuals “genes”. The individual distribution of the initial population is random. (1) With the increase in the generation of the DE algorithm, the individuals with more minimal mRMSE values are preserved, such that the individuals in Fig. 4 gradually gather to the optimal individual. (2) The higher the population generation is, the more frequently the same vectors appear. This results in an increase in the number of repetitive individuals appearing in Fig. 4 and a decrease in the number of red points in Fig. 4. (3) In the 12th generation, the optimal individual appears. In the 20th generation, the vectors of all the individuals are the same, verifying that this is the optimal individual in the population. The vectors of the optimal individual, which is the optimal value of the local radius (
The individual distributions with the evolutions with severe model error (
In Fig. 5, the experimental conditions are set; the number of the population is 15, and the number of total generations is set to 20. The observation numbers are set to 20. The forcing factor
Furthermore, the convergence of the minimal mRMSE values in the population with increasing generations is studied. The number of the population is set to 15 and the number of the total generations to 20.
The convergence of the DE-based parameter-optimized DA system. (a) without model error (
As shown in Fig. 6, with the increase in the population generation, the optimal individual in the current population becomes gradually closer to the optimal individual in the global parameter space. At the same time, when the number of generations reaches a certain value, the optimal individual is selected and will not fluctuate again, which indicates that the method in this paper has better convergence. Although each experiment has different waveforms, all can show a good convergence. Notably, in the presence of severe model error, the new DE algorithms perform moderately better than the same cases without model error.
In the DE algorithm, the vector difference values between the vectors reflect the differences between the population individuals. The number of populations is set to 15 and the number of the total generations to 20.
In the process of the optimization of the parameters of the DA system method using the DE algorithm, the differential values of the vectors reflect the degrees of individual differences. Figure 7 shows that with the increase in the population generations, the maximum values of the difference are gradually reduced, and the global distribution is closer to the zero value in Fig. 7. This verifies that the search using the method of the global region shifts to the local region. Finally, the exact combination of the specific vectors corresponding to the optimal individual (minimal mRMSE) is sought, and in the subsequent population generations, the vector differential distributions no longer change.
The vectors difference value distributions of the initial population, the 5
Influence of generation steps on the optimization result
After the searching part of the experiments, the optimal vector combination (
A comparison of the mRMSEs of the optimal vectors, suboptimal and poor selections with different strength forcings 
To simulate that the atmospheric data are dense in the Northern Hemisphere (over the United States, Europe …) and very sparse in places such as the Pacific, a good strategy is taken to observe the first 20 variables in Lorenz-96 models and to see how the proposed algorithms work. As shown in Fig. 9, comparing the mRMSE of the first 20 observation used the best combination parameters with the 40 observations without the best parameters setting, we concluded that the former is better than the latter. Meanwhile, the computational complexity, as shown in Table 1 (The time unit is seconds), is increased with the increase of generation size. The running time for both of them increases gradually. However, when the generations steps exceed 16 generations, the running time of the two cases is almost the same. Therefore, combined with the LETKF, a DE algorithm-based multiple-factor method was developed to obtain better assimilation results.
A comparison of the mRMSEs of the first 20 observation used the best combination of parameters with the 40 observations without the best parameter settings for 
Evolutionary (and related) optimization techniques can be interesting to meteorologists, but there are quite a few papers in meteorology and other applied sciences that discuss this approach. Significant effort has been made to seeking to enhance the state of DA technology by developing new coupled DA methods aimed at maximize the strengths of both forms of data assimilation while offsetting the weaknesses of each. However, Tian and Xie [19] suggested that several issues, such as how to choose the appropriate localization radius and the relaxation coefficient, still need to be addressed. In this study, in considering the optimization of DA parameters, a new scheme was developed to improve error parameterization methods [12]. By applying the DE, the best combination of the two key factors of LETKF was generated through a typical procedure using the DE operation. The impact of the new searching method on the robustness of the DA system under selected special conditions, such as different force parameters, was then investigated. We use the Lorenz-96 model to verify the proposed optimization method. The numerical experiments show that for the Lorenz-96 model with 40 variables, two parameters of the LETKF analysis can be successfully optimized with the “population” size equal to 15 and the number of “generations” equal to 20. The results indicate that the proposed DE method has the strong robustness when the system environment changes. Additionally, when compared with the traditional filtering methods, the proposed robust optimization method can improve the assimilation performance. The main differences of the technique presented in the paper are twofold. (a) Two parameters are optimized here instead of one parameter [11]. (b) The “mutation” process is somewhat different here.
Conclusions
There is a well-established area of parameter estimation within DA [3, 20]. With that approach, the state and parameters can be estimated simultaneously, implying, potentially, much greater efficiency than with “black-box” schemes, such as the one described in this paper. In particular, the estimation of parameters is embedded in the DA system and thus is achieved “on-line”, without the need to examine multiple systems’ configurations (which is costly). The optimization of the parameters in the DA method in an efficient manner is a priority for optimizing the performance index of the mRMSE. In this paper, the parameter optimization scheme of the DA system based on the DE algorithm is proposed due to the non-enumeration of multiparameter vector space and the discontinuity of the mRMSE index function. The effectiveness of this scheme is verified by the change in the distribution of the parameter population, the convergence of the optimal combination, and the distribution of the vector difference values. Simultaneously, the heuristic convergence of the DE method for an optimization problem is presented.
In an ensemble DA system, the state dimension of the models is typically very high, and in most cases small ensemble can be used. The localization technique is used to calculate the local analysis of each grid point to solve the problems of false correlation and lack of freedom. Additionally, different localization methods were proposed, and extensive tuning of the localization parameters is necessary to achieve the optimal results, including adaptive localization methods [21, 22]. Normally, in order to obtain most of the relevant covariances, a sufficiently large localization radius should be selected. Therefore, the relationship between ensemble numbers and localization radii should be further investigated in the new scheme in the future. Meanwhile, multi-objective dynamic DE methods can be adopted to achieve better assimilation results.
Footnotes
Acknowledgments
This work is supported by the NSFC (National Natural Science Foundation of China) project (grant number: 41861047, 41461078) and the Northwest Normal University young teachers’ scientific research capability upgrading program (NWNU-LKQN-17-6).
