A brief survey of robust optimization

Abstract

To counteract the performance degradation of electromagnetic devices due to mechanical tolerances, assembly inaccuracies, poor knowledge of material parameters, etc., many robust design techniques have been proposed. A first class of methods takes advantage of statistical analysis to improve the average performance, taking into account the actual, randomly distributed values of the design parameters rather than their nominal values. A second class, making use of the “sensitivity” of device performance with respect to the different parameters, looks for solutions that are as insensitive as possible to modifications of the design parameters. With the aim of presenting a concise selection of existing methods, this paper gives a brief survey of the different approaches. A simple test case is considered for demonstration purposes.

Keywords

Robust optimization, sensitivity analysis, Taguchi approach, worst-case approach, Pareto optimality

1. Introduction

Due to construction and assembly tolerances, imprecise knowledge of material parameters, aging and other unpredicted effects, the actual behaviour of electromagnetic devices usually differs from what is achieved during the design process using “nominal” parameter values and assuming that they are perfectly known. In order to counteract such performance degradation, a Tolerance Analysis (TA) can be carried out, that is, an assessment of the effect of parameter uncertainty, performed using either a stochastic analysis with random variations of the parameters, or a deterministic analysis with fixed, yet small, variations of the parameters to estimate the performance sensitivity. Alternatively, “robust” behaviour can be required already in the design phase, by including the minimization of the impact of uncertainties in the Device Performance Function (DPF). In such cases, a Robust Design (RD) approach is used [1, 2, 3, 4, 5, 6]. RD can be achieved using both of the methods introduced for TA, the main difference being that the impact of tolerances is assessed already in the design phase.

In the following, attention is focused on numerical design processes, and consequently only approaches taking advantage of numerical analyses are discussed. Note that both TA and RD have a relevant impact on the computational cost since, for each trial solution, the full local behaviour of the DPF must be analysed, and not just the single, nominal design [2]. When the design process relies on mock-ups or prototypes, specific procedures must be used [1], but this is outside the scope of the present discussion.

Two possible indexes of the performance degradation are usually considered, for both TA and RD:

The first one, based on a statistical point of view, tries to control the expected performance and its variability, using the mean and variance of the DPF. This approach is due to Taguchi [3], and is known as the “total quality” approach. It is well suited for mass production processes where, in order to reduce production costs by relaxing tolerances, a (small) number of device samples can be rejected.

The second performance degradation index is the worst-case performance [5, 6], which indicates how much the worst possible combination of uncertain parameter values degrades the DPF. This approach is typically used for small production series, when performance is critical or, at least, more important than cost. This is the case, for example, of life-support devices (e.g. pacemakers) or of expensive devices built as a single prototype (e.g. magnets for plasma confinement and control in Tokamak devices).

Both approaches imply computational efforts that, in case of high complexity systems, can become too high for the computing resources available to the designer. As a consequence, possible techniques able to reduce the computational burden are very beneficial, examples being Latin Hypercube Sampling [8] or Unscented Transform [9] in the statistical approach, or Sensitivity Analysis in the worst-case approach [1, 4, 6].

In this paper, with the aim of providing some highlights about robust optimization in electromagnetism, a few formulations of the TA problem and of the best-suited RD methods will be discussed, including Paretian formulations [10]. The focus will be on the methods rather than applications, and, due to room limitations, only the main ideas will be introduced, although a simple example, the design of a Helmholtz coil, will be used to ease comprehension.

2. Problem formulations

Optimal design problems in electromagnetism are usually formulated as the search for a (possibly constrained) minimum or maximum of a scalar DPF $f$ , able to quantify the goals to be achieved in the device design. The function $f$ depends on the geometrical and physical parameters describing each trial device configuration, both those under the direct control of the designer (the “Degrees of Freedom” DOF, p) and those beyond the control of the optimization process (the Uncontrolled Parameters UP, q). Unfortunately, due to process tolerances during the manufacturing and assembly of real devices, or due to aging during the lifetime of the device, the exact values used in the design stage for both DOF and UP cannot be guaranteed; therefore, the effect of such uncertainties on the performance must be assessed, either at the end of the design process or during it. As anticipated, this problem can be tackled using statistical or worst-case approaches. In the first case, tolerance calculations are performed by considering the random distribution of the actual parameters around their nominal values, while in the second, the objective function also takes into account the possible performance decrease at the limits of the tolerance ranges.

For the sake of exposition, it will be assumed that both the N DOF and M UP are real numbers (p $\in\Re^{\text{N}}$ , q $\in\Re^{\text{M}}$ ). The DOF array corresponding to the nominal values, possibly found as the result of an optimal design procedure, will be denoted as p ${}_{0}$ , while uncertainties as $\delta$ p; therefore, p $=$ p ${}_{0}+\delta$ p; similar symbols (q ${}_{0}$ , $\delta$ q) are used for the UP parameters and, in addition, q and p are assumed uncorrelated. As a consequence, the DPF is a function $f$ (p, q): $\Re^{\text{N}}\times\Re^{\text{M}}\to\Re$ . Note that $f$ is deterministic, but its arguments can be random variables, so its outcome can be seen as a random variable as well.

2.1 Statistical approach

Several statistical formulations can be adopted; the most widespread ones refer to Taguchi’s RD philosophy [3]. The attention is not on the “best performing” device, but rather on the best trade-off among cost, reliability and performance. Taguchi’s approach is based on a suitable “Quality Function” $F_{Q}\left(\underline{\text{p}},\underline{\text{q}}\right)=\langle f\left(\underline{\text{p}},\underline{\text{q}}\right)\rangle+\alpha\sigma_{f}^{2}\left(\underline{\text{p}},\underline{\text{q}}\right)$ , where the design parameters p and the system parameters q are considered as random variables, $\sigma_{f}^{2}$ is the variance of $f$ , $\alpha$ is a suitable weight and $\langle\bullet\rangle$ is the statistical average operator. The nature of the manufacturing process and other similar considerations help in choosing the best-suited probability distribution for p and q, the most usual being the normal one.

Such a formulation implies that the “best solution” is the one minimizing the effect of inaccuracies together with optimizing the performance in an averaged sense:

$\displaystyle\left\{\begin{array}{l}\mathop{\text{min!}}\limits_{\underline{\text{p}}}F_{Q}(\underline{\text{p}},\underline{\text{q}})\\ \text{subject to }h_{i}(\underline{\text{p}},\underline{\text{q}})<0\quad i=1,2,\ldots,N_{C}\end{array}\right.$ (1)

where $h_{i}$ (p, q) is the i-th constraint function, which must be satisfied by each instance of the parameters and of the performance function $f$ .

2.2 Deterministic approach

Worst-case approaches can be cast in several ways; here, just the “minimax” formulation is sketched, based on the minimization of the maximum discrepancy with respect to the nominal performance [1, 6]. In these cases, the worst-case performance $f_{W}$ has to be optimized together with the target performance $f$ . A possible formulation can then be:

$\displaystyle\left\{\begin{array}{l}\mathop{\text{min!}}\limits_{\underline{\text{p}}}F_{W}\left(\underline{\text{p}},\underline{\text{q}}\right)\\ \text{subject to }\left\{\begin{array}{l}f_{\text{min}}\leqslant f\left(\underline{\text{p}}_{0},\underline{\text{q}}_{0}\right)\leqslant f_{\text{MAX}}\\ h_{i}\left(\underline{\text{p}},\underline{\text{q}}\right)<0\quad i=1,2,\ldots,N_{C}\end{array}\right.\end{array}\right.$ (2)

where $F_{W}(\underline{\text{p}},\underline{\text{q}})=f(\underline{\text{p}}_{0},\underline{\text{q}}_{0})+\alpha f_{W}(\underline{\text{p}},\underline{\text{q}})$ , $f_{\text{min}}$ and $f_{\text{MAX}}$ are the bounds of the allowed range for $f$ , and $f_{W}$ (p, q) is the “worst” performance evaluated in a suitable subdomain of the DOF and UP space, centered on the nominal values p ${}_{0}$ and q ${}_{0}$ , respectively. Note that the constraint on $f$ (p ${}_{0}$ , q ${}_{0})$ states that the nominal performance must be guaranteed by any set of nominal design parameters, while the term $f_{W}$ expresses the performance degradation around the nominal design due to parameter tolerances, and depends on the specific application. Additional constraints can be included in this case as well. As with the stochastic approach, the deterministic one implies an increase in the computational cost, since at each optimization step a further local search for the worst case must be performed.
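
As a minimal sketch of how the worst-case term might be evaluated, the snippet below scans the vertices of a tolerance box centred on the nominal design and returns the largest degradation with respect to the nominal performance. The vertex-only search, the function name `worst_case_degradation` and the toy performance function are assumptions made for illustration only; for a general $f$ the worst case need not lie on a vertex, and a local search inside the box would then be required.

```python
from itertools import product


def worst_case_degradation(f, x0, tol):
    """Largest increase of f over the vertices of the box x0 +/- tol.

    Assumption: the worst case lies on a vertex of the tolerance box,
    which holds e.g. when f is monotonic in each parameter over the box.
    """
    f_nominal = f(x0)
    worst = 0.0
    for signs in product((-1.0, 1.0), repeat=len(x0)):
        x = [xi + s * ti for xi, s, ti in zip(x0, signs, tol)]
        worst = max(worst, f(x) - f_nominal)
    return worst


def toy_f(x):
    # Hypothetical performance function (to be minimized), not from the paper.
    return (x[0] - 1.0) ** 2 + 0.5 * abs(x[1])


x_nominal = [1.0, 0.0]
tolerances = [0.1, 0.2]
f_w = worst_case_degradation(toy_f, x_nominal, tolerances)
alpha = 1.0  # illustrative weight of the worst-case term
F_W = toy_f(x_nominal) + alpha * f_w  # scalarized objective of Eq. (2)
```

The cost grows as $2^{\text{N}+\text{M}}$ function calls per trial design, which is one reason why sensitivity-based estimates of the worst case are attractive.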

2.3 Pareto formulation of robust optimization problems

Equations (1) and (2) define generalized scalar error functionals, incorporating both the design criterion and its sensitivity to tolerances; this gives rise to a single-objective optimization process, leading to a unique solution, which is taken as the (robust) optimum. Alternatively, the RD problem can be formulated in terms of Pareto optimality, which deals with a set of optimal solutions characterized by different degrees of sensitivity. The problem reads: find the family of non-dominated solutions $\tilde{\underline{p}}_{0}$ minimizing $f\left(\tilde{\underline{p}}_{0},\underline{\text{q}}_{0}\right)$ and either $f_{W}\left(\underline{\text{p}},\underline{\text{q}}\right)$ or $\sigma_{f}^{2}\left(\underline{\text{p}},\underline{\text{q}}\right)$ , subject to the problem constraints [10]. This approach will be further pursued in the remainder of the paper.
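
The notion of non-dominated solutions can be illustrated with a short sketch. It assumes that every candidate design has already been evaluated and is summarized by a pair (nominal performance, sensitivity measure), both to be minimized; the function name `non_dominated` and the sample values are hypothetical.

```python
def non_dominated(points):
    """Return the points not dominated by any other (all objectives minimized)."""
    front = []
    for i, p in enumerate(points):
        dominated = False
        for j, q in enumerate(points):
            if j == i:
                continue
            # q dominates p if it is no worse in every objective
            # and strictly better in at least one.
            no_worse = all(qk <= pk for qk, pk in zip(q, p))
            better = any(qk < pk for qk, pk in zip(q, p))
            if no_worse and better:
                dominated = True
                break
        if not dominated:
            front.append(p)
    return front


# (nominal performance, sensitivity) pairs for five hypothetical designs
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.5, 3.5)]
front = non_dominated(candidates)
```

This pairwise check costs a number of comparisons quadratic in the number of candidates, which is usually negligible compared with the evaluation of the DPF itself.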

3. Tools to reduce computational cost

As anticipated, RD is a computationally expensive process. As a consequence, a number of measures have been proposed to simplify or speed up the analysis, basically trading accuracy in the DPF evaluation for speed; just a few of them are cited here, but the list is long, and the literature is full of examples [2, 5].

3.1 MonteCarlo and Latin hypercube samplings

Among the viable approaches for the stochastic analysis involved in statistical approaches, the MonteCarlo (MC) technique is the simplest one. In the classical MC analysis, a probabilistically based sampling procedure is used to generate a finite number K of sampling points $\tilde{p}_{k}$ in the DOF space, and to get a collection of $f$ values [1]. These values are used to approximate mean and variance by a weighted summation, the weights being simply 1/K. Note that the mapping obtained in this way can also be used to evaluate the effects of single parameters, e.g. by using scatter plots, or by building a regression model on the sampling points, but for these tasks more effective approaches will be introduced below. The same considerations of course apply to UP, but are omitted here for the sake of brevity. Uncertainty quantification obtained by MC is reliable, but computationally very inefficient (the estimation errors of the mean and of the variance decrease only as $1/\sqrt{K}$ ). As the computational burden of every evaluation of the objective function is in general not negligible, this “brute force” approach can be infeasible in most cases. In addition, since plain MC does not exploit information about $f$ , but just probabilistic models of DOF, there is no assurance that a sample element will be generated from any particular subset of the sample space, and points with low probability but high consequences (e.g. the worst case) are likely to be missed.
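
A plain MC estimate of the mean and variance of $f$ , combined into the quality function of Section 2.1, might look as follows. This is only a sketch under stated assumptions: the DOF are taken as independent and normally distributed, and the names `mc_quality` and `toy_f`, the weight `alpha` and the numerical values are illustrative choices that do not come from the paper.

```python
import random
import statistics


def mc_quality(f, p0, sigma, n_samples, alpha=1.0, seed=0):
    """Plain Monte Carlo estimate of mean, variance and F_Q = mean + alpha*var.

    Assumption: independent normal DOF with means p0 and standard
    deviations sigma.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n_samples):
        p = [rng.gauss(m, s) for m, s in zip(p0, sigma)]
        values.append(f(p))
    mean = statistics.fmean(values)                # weights 1/K
    var = statistics.pvariance(values, mu=mean)    # weights 1/K
    return mean, var, mean + alpha * var


def toy_f(p):
    # Hypothetical linear performance function: exact mean 5, variance 0.17.
    return p[0] + 2.0 * p[1]


mean, var, f_q = mc_quality(toy_f, [1.0, 2.0], [0.1, 0.2], n_samples=20000)
```

Halving the statistical error of these estimates requires roughly four times as many samples, which is the $1/\sqrt{K}$ behaviour mentioned above.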

Stratified sampling [1, 8] provides a way to mitigate the problems of MC by specifying subsets from which sample elements will be selected, but the approach requires the definition of “desired” subsets (or strata), taking advantage of information about $f$ , and the computation of a probability distribution that allows the DOF to be sampled non-uniformly. When the number N of DOF is high, the determination of strata and strata probabilities becomes a major undertaking. Latin Hypercube Sampling (LHS) can be viewed as a compromise procedure that incorporates many of the desirable features of random sampling and stratified sampling, and also produces more stable analysis outcomes than MC sampling. LHS operates in the following manner to generate a sample of size K from the DOF space, coherent with the probabilistic properties of the DOF:

the tolerance range of each DOF is exhaustively divided into K disjoint intervals of equal probability, and one value is randomly selected from each interval;

the K values for the first DOF are paired at random (without replacement) with the K values for the second DOF. These K pairs are combined at random (without replacement) with the K values for the third DOF, generating K triples. The procedure iterates until K N-tuples are generated;

the set of K N-tuples constitutes the Latin Hypercube, and the performance function $f$ is evaluated in the points of the tuples, to generate the set of values for mean and variance estimation.

This method works only if DOF are not correlated, but a number of methods exist to generate a set of sampling points showing the same correlation as for the DOF [8].
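
The three steps above can be sketched in a few lines for uncorrelated, normally distributed DOF. The normal distribution, the function name `latin_hypercube_normal` and the numerical values are assumptions chosen for illustration.

```python
import random
from statistics import NormalDist


def latin_hypercube_normal(p0, sigma, k, seed=0):
    """Generate K sample points for independent normal DOF by LHS."""
    rng = random.Random(seed)
    columns = []
    for mean, std in zip(p0, sigma):
        dist = NormalDist(mean, std)
        col = []
        for i in range(k):
            # One value drawn at random inside the i-th of K
            # equal-probability intervals of this DOF.
            u = (i + rng.random()) / k
            u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep inv_cdf well defined
            col.append(dist.inv_cdf(u))
        rng.shuffle(col)  # random pairing without replacement across DOF
        columns.append(col)
    # The K N-tuples forming the Latin Hypercube.
    return [list(point) for point in zip(*columns)]


samples = latin_hypercube_normal([0.0, 1.0], [1.0, 2.0], k=8, seed=1)
```

The performance function would then be evaluated at each element of `samples`, and mean and variance estimated with weights 1/K as in the MC case.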

3.2 Unscented transform

While the classical MC approaches keep generality at the price of a high number of function calls, the Unscented Transform (UT) deliberately chooses a reduced number of points (sigma points) in the DOF (or UP) space so as to reproduce the known statistical properties of the parameters, and “propagates” the sigma points through $f$ to obtain its statistics as a weighted sum of the values at the sigma points. The most widely adopted set of sigma points was proposed in [9]. In the case of DOF, the set is composed of K $=$ 2N $+$ 1 points $\tilde{p}_{k}$ ( $k=$ 0, …, 2N), with related weights $w_{k}$ , placed symmetrically around the nominal point p ${}_{0}$ :

$\displaystyle\left.\begin{array}{ll}\tilde{p}_{0}=\underline{\text{p}}_{0},&w_{0}=\varepsilon/\left(\text{N}+\varepsilon\right)\\ \tilde{p}_{k}=\underline{\text{p}}_{0}+\left(\sqrt{\left(\text{N}+\varepsilon\right)\underline{\underline{\Sigma}}}\right)_{k},&w_{k}=1/\left(2\text{N}+2\varepsilon\right)\\ \tilde{p}_{k+\text{N}}=\underline{\text{p}}_{0}-\left(\sqrt{\left(\text{N}+\varepsilon\right)\underline{\underline{\Sigma}}}\right)_{k},&w_{k+\text{N}}=1/\left(2\text{N}+2\varepsilon\right)\end{array}\right.\quad k=1,\ldots,\text{N}$ (3)

where $\underline{\underline{\Sigma}}$ is the covariance matrix of DOF, $\left(\sqrt{\left(\text{N}+\varepsilon\right)\underline{\underline{\Sigma}}}\right)_{k}$ is the k-th row of the matrix square root of (N $+$ $\varepsilon$ ) $\underline{\underline{\Sigma}}$ , and $\varepsilon\in\Re$ is a control parameter, to be chosen to fit higher order statistical moments of DOF. Of course, similar considerations apply to UP, not reported for the sake of brevity. Mean and variance of $f$ are then estimated as:

$\displaystyle\langle\tilde{f}\rangle=\sum\limits_{k=0}^{2\text{N}}w_{k}f(\tilde{p}_{k}),\qquad\tilde{\sigma}_{f}^{2}=\sum\limits_{k=0}^{2\text{N}}w_{k}\left(f(\tilde{p}_{k})-\langle\tilde{f}\rangle\right)^{2}$ (4)
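
For uncorrelated DOF the covariance matrix is diagonal and its square root is trivial, so the sigma-point set and the two weighted sums can be sketched without any linear algebra library. The diagonal covariance, the function name `unscented_mean_var`, the value chosen for the control parameter and the toy function are assumptions for illustration.

```python
import math


def unscented_mean_var(f, p0, sigma, eps=1.0):
    """Unscented estimate of mean and variance of a scalar f.

    Assumption: diagonal covariance, so the k-th row of the matrix
    square root of (N + eps) * Sigma is sqrt(N + eps) * sigma[k]
    along the k-th axis and zero elsewhere.
    """
    n = len(p0)
    scale = math.sqrt(n + eps)
    points = [list(p0)]                 # central sigma point
    weights = [eps / (n + eps)]
    for k in range(n):
        for sign in (1.0, -1.0):        # symmetric pair along axis k
            p = list(p0)
            p[k] += sign * scale * sigma[k]
            points.append(p)
            weights.append(1.0 / (2.0 * (n + eps)))
    values = [f(p) for p in points]
    mean = sum(w * v for w, v in zip(weights, values))
    var = sum(w * (v - mean) ** 2 for w, v in zip(weights, values))
    return mean, var


def toy_f(p):
    # Hypothetical linear performance function: exact mean 5, variance 0.17.
    return p[0] + 2.0 * p[1]


ut_mean, ut_var = unscented_mean_var(toy_f, [1.0, 2.0], [0.1, 0.2])
```

Only 2N + 1 evaluations of $f$ are needed, here five, and for a linear function the estimates coincide with the exact mean and variance.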

3.3 Functional expansions

Some of the most widespread tools to speed up DPF evaluations are based on approximating the DPF as a sum of analytical functions, which are much faster to evaluate. Typical examples are Taylor expansions, but approaches based on data collection are also well known, such as Fourier series, spline approximations, wavelet transforms and so on. For the sake of brevity, we limit ourselves here to just two cases, namely Taylor expansion and Kriging. We also present an approach based on polynomial chaos, which fits quite well the contrasting needs of speed and of an accurate representation of the random nature of the problem that are typical of RD.

3.3.1 Taylor series and Sensitivity Analysis

It could be reasonably expected that the relative values of $\delta$ p and $\delta$ q are rather small, and their effect on the device performance could be expressed by using the k-th order truncated Taylor expansion of $f$ with respect to $\delta$ p and $\delta$ q [9]:

$\displaystyle f\left(\underline{\text{p}}_{0}+\delta\underline{\text{p}},\underline{\text{q}}_{0}+\delta\underline{\text{q}}\right)=f\left(\underline{\text{p}}_{0},\underline{\text{q}}_{0}\right)+\underline{\text{s}}_{\text{p}}^{T}\delta\underline{\text{p}}+\underline{\text{s}}_{\text{q}}^{T}\delta\underline{\text{q}}+\frac{1}{2}\delta\underline{\text{p}}^{T}\underline{\underline{\text{H}_{\text{p}}}}\delta\underline{\text{p}}+\frac{1}{2}\delta\underline{\text{q}}^{T}\underline{\underline{\text{H}_{\text{q}}}}\delta\underline{\text{q}}+\cdots+o\left(|\delta\underline{\text{p}}|^{\text{k+1}},|\delta\underline{\text{q}}|^{\text{k+1}}\right)$ (5)

The arrays s ${}_{\text{p}}$ and s ${}_{\text{q}}$ are the gradients of $f$ in the space of DOF and of UP respectively, and are typically called “sensitivity arrays” [7]; if derivatives of $f$ are not easily available in closed form, they can be computed using interpolation, or using covariance computation. The matrices $\underline{\underline{\text{H}_{\text{p}}}}$ and $\underline{\underline{\text{H}_{\text{q}}}}$ are the Hessian matrices in the space of DOF and UP respectively. It should be noticed that in Eq. (5) all the mixed second order derivatives of $f$ with respect to pairs of (p, q) variables are neglected. Higher order terms require the computation of higher derivatives, and are not reported here for the sake of brevity.
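
When no closed-form derivative is available, one simple option is to estimate the sensitivity array by central finite differences and to use it in the first-order truncation of the expansion. The sketch below makes this concrete; the function names, the step sizes and the toy function are illustrative assumptions.

```python
def sensitivity(f, x0, steps):
    """Central finite-difference estimate of the gradient of f at x0."""
    grad = []
    for k, h in enumerate(steps):
        x_plus = list(x0)
        x_minus = list(x0)
        x_plus[k] += h
        x_minus[k] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2.0 * h))
    return grad


def first_order_estimate(f, x0, dx, steps):
    """First-order Taylor estimate f(x0) + s^T dx of f(x0 + dx)."""
    s = sensitivity(f, x0, steps)
    return f(x0) + sum(sk * dk for sk, dk in zip(s, dx))


def toy_f(x):
    # Hypothetical performance function with gradient [2*x0, 3].
    return x[0] ** 2 + 3.0 * x[1]


s = sensitivity(toy_f, [1.0, 2.0], steps=[1e-3, 1e-3])
f_est = first_order_estimate(toy_f, [1.0, 2.0], dx=[0.01, -0.02], steps=[1e-3, 1e-3])
f_true = toy_f([1.01, 1.98])
```

Each gradient estimate costs 2(N + M) evaluations of $f$ when applied to both DOF and UP, and the first-order estimate differs from the true value only by the neglected second-order term.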

When Eq. (5) is truncated to the first order, the approach is commonly known as Sensitivity Analysis (SA). Note that Eq. (5) provides a local sensitivity; another possible definition of sensitivity makes reference to global figures, obtained e.g. by correlation analysis on the whole DOF (or UP) domain. Different computing strategies can lead to different values and to different results in terms of robust optima. In the following example, the elements of the sensitivity array are computed using central finite differences.

3.3.2 Kriging methods

Kriging (or Gaussian process regression) is a method of interpolation in which the interpolated values are modeled by a Gaussian process governed by a-priori covariances, as opposed to a piecewise polynomial spline chosen to optimize the smoothness of the fitted values [12]. The basic idea of kriging is to predict the value of a function at a given point by computing a weighted average of the known values of the function in the neighborhood of the point. Under suitable assumptions on the priors, kriging gives the best linear unbiased prediction of the intermediate values. Some advantages of kriging are:

it helps to compensate for the effects of data clustering, assigning individual points within a cluster less weight than isolated data points (or, treating clusters more like single points);

it gives an estimate of the estimation error (kriging variance), along with the estimate of the function $f$ itself (with an error map which is basically a scaled version of a map of the distance to the nearest data point);

availability of estimation error provides basis for stochastic simulation of possible realizations of $f$ .

All kriging estimators are but variants of the basic linear regression estimator $\tilde{f}$ defined as [10]:

$\displaystyle f\left(\underline{\text{p}}\right)\approx\tilde{f}\left(\underline{\text{p}}\right)=\tilde{f}_{0}+\sum\limits_{k=1}^{K\left(\underline{\text{p}}_{0}\right)}w_{k}\left[\tilde{f}\left(\underline{\text{p}}_{k}\right)-\tilde{f}_{0}\right]$ (6)

where p ${}_{0}$ is the estimation point, p ${}_{\text{k}}$ are the neighbouring points, $K(\underline{\text{p}}_{0})$ is the number of data points used for the estimation, $\tilde{f}_{0}$ is the trend component (usually, the average), and $w_{k}$ are the kriging weights assigned to the data for estimation at location p ${}_{0}$ . Note that the same datum can be weighted differently when used for different estimation points. The goal of the kriging process is to determine the weights $w_{k}$ that minimize the variance of the estimator $\sigma_{\tilde{f}}^{2}=var[\tilde{f}(\underline{\text{p}}_{k})-\tilde{f}_{0}]$ under the unbiasedness constraint $\langle\tilde{f}(\underline{p}_{k})-\tilde{f}_{0}\rangle=0.$ Different kriging algorithms choose the estimator functions $\tilde{f}$ on the basis of either a priori knowledge about the problem or smoothness requirements. As usual, quite similar results apply to UP, and are omitted for the sake of compactness.
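
A one-dimensional sketch of the estimator in Eq. (6) is given below for the simplest variant (simple kriging, with a known constant trend). Several choices are assumptions made only for illustration: the Gaussian covariance model and its length scale, the use of all data points instead of a neighbourhood, the small diagonal term added for numerical stability, and the names `gaussian_cov` and `simple_kriging`.

```python
import numpy as np


def gaussian_cov(a, b, length=0.5, variance=1.0):
    """Assumed a-priori covariance between two sets of 1-D locations."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length) ** 2)


def simple_kriging(x_data, f_data, x_new, trend):
    """Kriging estimate and kriging variance at the locations x_new."""
    # Data-to-data covariance, with a tiny diagonal term for stability.
    C = gaussian_cov(x_data, x_data) + 1e-10 * np.eye(len(x_data))
    c0 = gaussian_cov(x_data, x_new)       # data-to-target covariance
    w = np.linalg.solve(C, c0)             # weights, one column per target
    estimate = trend + w.T @ (f_data - trend)
    variance = gaussian_cov(x_new, x_new).diagonal() - np.sum(c0 * w, axis=0)
    return estimate, variance


x_data = np.array([0.0, 0.5, 1.0, 1.5])
f_data = np.sin(x_data)                    # stand-in for expensive DPF values
trend = f_data.mean()
x_new = np.array([0.5, 0.75])
estimate, variance = simple_kriging(x_data, f_data, x_new, trend)
```

At a data location the estimate reproduces the datum and the kriging variance is practically zero, while between data points the variance is positive, which is the error map mentioned among the advantages listed above.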

3.3.3 Polynomial chaos

In the last few years, the use of Polynomial Chaos Expansion (PCE) in optimization has been proposed [13]. PCE was introduced by the work of N. Wiener and essentially consists in expanding the uncertain variable in a suitable series and then computing the statistical moments of the truncated expansion. This method allows the variance of the output to be decomposed as a sum of the contributions of each input variable, or of their combinations. This numerical scheme belongs to the family of statistical tools called ANOVA (ANalysis Of Variance) [11]. The homogeneous chaos expansion can be employed to approximate any function in the Hilbert space of square-integrable functions. Considering the function $f$ as a random variable with finite variance, and neglecting the dependence on q for the sake of readability, $f$ can be represented by a spectral expansion on a Hermite polynomial basis:

$\displaystyle f\left(\underline{\text{p}}_{0},\delta\underline{\text{p}}\right)=\sum\limits_{k=1}^{\infty}F_{k}(\underline{\text{p}}_{0})\Phi_{k}(\xi\left(\delta\underline{\text{p}}\right))$ (7)

with $F_{k}(\underline{\text{p}}_{0})$ the unknown deterministic coefficients and $\Phi_{k}$ the multivariate Hermite polynomials. The polynomial basis is chosen to be orthogonal with respect to the distribution type: for instance, the Hermite polynomials are related to the Gaussian distribution, while the Legendre polynomials are related to the uniform one. The use of a variational approach and the exploitation of the orthogonality among the expansion polynomials increase the computational efficiency of the process. The PCE coefficients are calculated by means of the scalar product between the objective function and the multivariate polynomials. In this case the use of sparse numerical quadrature techniques, like the Smolyak sparse grids method, can increase the numerical efficiency of the whole procedure [14]. Numerical tests on analytical functions make the advantage of PCE over MC apparent. Due to its computational performance, PCE seems to be a promising approach for RD, where the evaluation of statistical values of the objective function is required.
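
For a single standard normal variable the procedure can be sketched with NumPy's probabilists' Hermite module. The restriction to one dimension, the use of a plain Gauss-Hermite rule in place of sparse grids, the zero-based numbering of the coefficients (the first one being the mean) and the names `pce_coefficients` and `pce_mean_var` are assumptions made for illustration.

```python
import math

import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval


def pce_coefficients(f, order, n_quad=8):
    """Project f(xi), xi ~ N(0, 1), onto Hermite polynomials He_0..He_order."""
    x, w = hermegauss(n_quad)
    w = w / math.sqrt(2.0 * math.pi)   # weights of the standard normal density
    fx = f(x)
    coeffs = []
    for k in range(order + 1):
        basis = np.zeros(k + 1)
        basis[k] = 1.0                  # selects He_k in hermeval
        he_k = hermeval(x, basis)
        # Scalar product <f, He_k> divided by the norm <He_k, He_k> = k!
        coeffs.append(np.sum(w * fx * he_k) / math.factorial(k))
    return np.array(coeffs)


def pce_mean_var(coeffs):
    """Mean and variance of the truncated expansion, from orthogonality."""
    mean = coeffs[0]
    var = sum(c ** 2 * math.factorial(k) for k, c in enumerate(coeffs) if k > 0)
    return mean, var


def toy_f(xi):
    # Hypothetical response: xi**2 + xi = He_2 + He_1 + He_0.
    return xi ** 2 + xi


coeffs = pce_coefficients(toy_f, order=2)
pce_mean, pce_var = pce_mean_var(coeffs)
```

For this toy response the three coefficients are all equal to one, giving mean 1 and variance 3 with only eight function evaluations, whereas an MC estimate of comparable accuracy would need far more samples.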

4. Continuous flock of starlings

In order to limit the impact of “robust” formulations on the computational burden required by the optimization, highly efficient optimization algorithms must be adopted, for example the Continuous Flock of Starlings Optimization (CFSO) [14]. The core equations underlying this algorithm belong to a well-known variation of the Particle Swarm Optimization (PSO) algorithm called the Flock of Starlings Optimization (FSO) [15]. This algorithm features the same update equations that are usually found in the PSO, but introduces a component in the velocity of the particles that takes into account the average of the velocities of a subset of particles belonging to the swarm. As a consequence, particles tend to follow each other, which strongly increases the collective behavior and makes the swarm less prone to being trapped in local minima. The strong mobility and exploration capabilities of this algorithm make it well suited to solve complex optimization problems, both as a stand-alone algorithm and in conjunction with other local-search algorithms. The CFSO core algorithm is derived from FSO by considering a fictitious continuous “time” instead of the discrete time-step that can be postulated in the position and velocity update equations. More specifically, the FSO update equations are re-written as differential state equations of a dynamic system [14]:

$\displaystyle\dot{v}_{n}^{j}(t)=\omega v_{n}^{j}(t)+\lambda\left(u_{\textit{best}_{n}}^{j}(t)-p_{n}^{j}(t)\right)+\gamma\left(g_{\textit{best}}^{j}(t)-p_{n}^{j}(t)\right)+\sum\limits_{m=1}^{N}h_{nm}v_{m}^{j}(t),\qquad\dot{p}_{n}^{j}(t)=v_{n}^{j}(t)$ (8)

where $p_{n}^{j}$ is the n-th DOF of the j-th individual, and $\dot{p}_{n}^{j}\left(t\right)$ is the “derivative” of the n-th DOF of the j-th individual with respect to the “fictitious” time, also corresponding to the particle velocity $v_{n}^{j}$ . The parameters of the update equations are: $\omega$ , known as inertial coefficient (tendency of a particle to keep its current velocity); $\lambda$ , known as cognitive coefficient (tendency of a particle to be attracted to the personal best position found so far, $u_{\textit{best}_{n}}^{j}$ ); $\gamma$ , known as social coefficient (tendency of a particle to be attracted to the global best position $g_{\textit{best}}^{j}$ found by the swarm so far); $h$ , known as topological coefficient (tendency of a particle to follow a subset of K other particles of the swarm).

The result of this approach is a set of time-continuous trajectory expressions for the particles [14], and analytical expressions for the poles of the dynamic system. Those trajectory equations, coupled with the stability control obtained by manipulating the poles, constitute the CFSO algorithm. The great advantage of having analytical expressions of the poles is the ability to control the behavior of the swarm: converging, diverging and oscillating trajectories can be enforced. Those behaviors can be combined to create a hybrid algorithm maximizing both exploration and exploitation capabilities. This algorithm has already been used with success in magnetics [13] for model identification purposes, in a parallel-strategy variant called CFSO3. In this strategy, the algorithm is implemented in a master-slave architecture. The master runs the CFSO algorithm with a diverging/oscillating configuration of the poles (complex conjugates with positive real part) to coarsely explore the solution space. During the exploration, some areas are selected as candidates to search for the final solution. In those areas, slaves are individually initialized with converging/oscillating configurations of the poles (complex conjugates with negative real part) to refine the final solution.
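
The link between coefficients and swarm behavior can be illustrated on a strongly simplified case. The sketch below is not the CFSO algorithm of [14]: it considers a single particle coordinate, neglects the coupling term with coefficients $h_{nm}$ , and treats the personal and global best positions as constants. Under these assumptions Eq. (8) reduces to $\ddot{p}=\omega\dot{p}-(\lambda+\gamma)p+\text{const}$ , whose poles are the roots of $s^{2}-\omega s+(\lambda+\gamma)=0$ . The function names and the numerical values in the usage lines are illustrative.

```python
import cmath


def single_particle_poles(omega, lam, gamma):
    """Roots of s**2 - omega*s + (lam + gamma) = 0.

    Assumptions: one particle coordinate, no coupling with the other
    particles, and best positions frozen in time.
    """
    disc = cmath.sqrt(omega ** 2 - 4.0 * (lam + gamma))
    return (omega + disc) / 2.0, (omega - disc) / 2.0


def classify(poles):
    """Label the trajectory type implied by a pair of poles."""
    oscillating = any(s.imag != 0.0 for s in poles)
    converging = all(s.real < 0.0 for s in poles)
    # Marginal cases (zero real part) are lumped with "diverging" here.
    kind = "converging" if converging else "diverging"
    return kind + (" oscillations" if oscillating else "")


exploit_poles = single_particle_poles(omega=-0.8, lam=0.4, gamma=0.6)
explore_poles = single_particle_poles(omega=0.8, lam=0.4, gamma=0.6)
exploit_kind = classify(exploit_poles)
explore_kind = classify(explore_poles)
```

In this simplified picture a negative inertial coefficient with a large enough sum of cognitive and social coefficients yields complex conjugate poles with negative real part, i.e. converging oscillations, while changing the sign of the inertial coefficient yields diverging oscillations.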

5. Example of application

In order to show the capabilities of the discussed approaches, a pair of Helmholtz coils is analyzed as a test device. Helmholtz coils are a pair of “twin” coils, with radius $R_{c}$ , fed by the same current I, and at axial distance $D_{c}=R_{c}$ ; this choice makes the second derivative of the axial field vanish at $z=$ 0 and, since the odd derivatives vanish by symmetry, the first non-constant term along the axis is of order z ${}^{4}$ . For a pair of thin (filamentary) circular coils, the central field is: $B_{z}\left(z=0\right)=B_{0}=\frac{\mu_{0}I}{R_{c}}\left(\frac{4}{5}\right)^{3/2}$ , while the axial field component is:

$B_{z}\left(z\right)=\frac{\mu_{0}IR_{c}^{2}}{2}\left(\frac{1}{\sqrt{\left(R_{c}^{2}+\left(z+\frac{D_{c}}{2}\right)^{2}\right)^{3}}}+\frac{1}{\sqrt{\left(R_{c}^{2}+\left(z-\frac{D_{c}}{2}\right)^{2}\right)^{3}}}\right).$
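
The axial field expression can be coded directly and checked against the closed-form central field. The sketch assumes filamentary coils, as in the formula above; the function name `bz_axis` and the chosen radius and current are illustrative.

```python
import math

MU0 = 4.0e-7 * math.pi  # vacuum permeability (H/m)


def bz_axis(z, radius, distance, current):
    """On-axis field (T) of two coaxial filamentary coils, centred at z = 0."""
    pref = MU0 * current * radius ** 2 / 2.0
    a = (radius ** 2 + (z + distance / 2.0) ** 2) ** 1.5
    b = (radius ** 2 + (z - distance / 2.0) ** 2) ** 1.5
    return pref * (1.0 / a + 1.0 / b)


R_c, I = 0.10, 1.1  # radius (m) and current (A), illustrative values
b0_closed_form = MU0 * I / R_c * (4.0 / 5.0) ** 1.5
b0_numeric = bz_axis(0.0, R_c, R_c, I)  # Helmholtz condition D_c = R_c
```

The two values of the central field agree to machine precision, and the field is symmetric with respect to $z=$ 0.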

In this example, the DOF are p $=$ [ $R_{c}$ , $D_{c}$ ] and the UP are q $=$ [ $\Delta R_{1}$ , $\Delta Z_{1}$ ; $\Delta R_{2}$ , $\Delta Z_{2}$ ].

Figure 1.

A pair of Helmholtz coils. In the left figure, coordinates of the current barycentre ( $R_{c}$ and $D_{c})$ are indicated, together with radial and axial thickness of the coils ( $\Delta R$ , $\Delta Z$ ). In the right figure, axial field component is shown, in a.u., for $R_{c}=$ 10 cm and $I=$ 1.1 A. The radius of the uniform region is $r_{\text{VOI}}=$ 5 mm.

The DPF $f$ of the device is the uniformity of the magnetic field, expressed as:

$\displaystyle f\left(\underline{\text{p}},\underline{\text{q}}\right)=\frac{\text{max}\left|B_{z}\left(z;\underline{\text{p}},\underline{\text{q}}\right)-B_{0}\right|}{B_{0}}$ (9)
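
A possible numerical evaluation of Eq. (9) is sketched below. Several simplifications are assumptions of this sketch: the coils are filamentary (the UP are ignored), the maximum is taken over equally spaced points on the axis within a region of half-length `r_voi`, and $B_{0}$ is taken as the central field of the trial configuration. The names `bz_axis` and `uniformity` and the default values are illustrative.

```python
import math

MU0 = 4.0e-7 * math.pi  # vacuum permeability (H/m)


def bz_axis(z, radius, distance, current):
    """On-axis field (T) of two coaxial filamentary coils, centred at z = 0."""
    pref = MU0 * current * radius ** 2 / 2.0
    a = (radius ** 2 + (z + distance / 2.0) ** 2) ** 1.5
    b = (radius ** 2 + (z - distance / 2.0) ** 2) ** 1.5
    return pref * (1.0 / a + 1.0 / b)


def uniformity(radius, distance, current=1.0, r_voi=5e-3, n_points=101):
    """Maximum relative deviation of the axial field from its central value."""
    b0 = bz_axis(0.0, radius, distance, current)
    zs = [-r_voi + 2.0 * r_voi * i / (n_points - 1) for i in range(n_points)]
    return max(abs(bz_axis(z, radius, distance, current) - b0) for z in zs) / b0


f_helmholtz = uniformity(0.10, 0.10)   # D_c = R_c
f_detuned = uniformity(0.10, 0.11)     # distance off by 10 %
```

With these values the Helmholtz configuration gives a deviation of a few parts per million, while the detuned one is worse by roughly two orders of magnitude, which shows how strongly the DPF depends on the geometry.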

In order to show the Paretian nature of the RD problem, we report in Fig. 2 the Pareto front obtained for a stochastic approach based on Eq. (1). To estimate mean and variance at each trial solution, an MC analysis with 300 cases has been run. Tolerances $\delta R_{c}=$ 200 $\mu$ m and $\delta D_{c}=$ 200 $\mu$ m on $R_{c}$ and $D_{c}$ respectively have been considered, which amount to 0.1% of the maximum allowed radius, $R_{c\_\text{max}}=$ 0.2 m (and of the maximum allowed distance $D_{c\_\text{max}}=$ $R_{c\_\text{max}}$ ). Figure 3 shows the Pareto front for a worst-case approach based on Eq. (2). In this case, sensitivity is computed as $|B^{+}-B|+|B-B^{-}|$ , where $B^{+}=B(R_{c}+\delta R_{c}/2,D_{c}+\delta D_{c}/2;z=r_{\text{VOI}})$ and $B^{-}=B(R_{c}-\delta R_{c}/2,D_{c}-\delta D_{c}/2;z=r_{\text{VOI}})$ , with $\delta R_{c}/R_{c}=\delta D_{c}/D_{c}=$ 1%.

Figure 2.

Pareto front for a stochastic approach.

Figure 3.

Pareto front for a worst-case approach.

Table 1

Results of TA on $R_{c}$ , $Z_{c}$ , $\Delta R_{c}$ and $\Delta Z_{c}$ for Helmholtz coils

Figure	MC ($10^{5}$ cases)	LHS ($5\times 10^{4}$ cases)	UT
Average Uniformity (p.p.m.)	7.79	7.79	7.68
Standard Deviation (p.p.m.)	3.86	3.84	4.04
Worst Case (p.p.m.)	17.99	17.19	–
Time Required (on Pentium i7) (s)	545	253	0.1

It is interesting to note that the solution located at the left end of the Pareto front in Fig. 3 (minimal field inhomogeneity, i.e. best uniformity) corresponds to the classical Helmholtz solution, with $R_{c}=$ 3.66 cm and $D_{c}=$ 3.62 cm (i.e. $D_{c}=R_{c}$ holds to a quite good approximation). On the other hand, the rightmost solution in the Pareto front, characterized by the minimal sensitivity to tolerances, shows $R_{c}=$ 7.82 cm and $D_{c}=$ 4.00 cm.

Note that the symmetry of the design is lost when the effect of tolerances is considered, and each parameter must be treated as distinct from the others. A TA was also performed for this test example to assess the performance reduction due to this effect: tolerances on all parameters have been assumed uncorrelated and normally distributed, with a standard deviation equal to 1.2% of the nominal value (corresponding to 99.9% of cases falling inside the tolerance range), except for the radius, for which a tolerance of 50 $\mu$ m has been assumed, since uniformity is very sensitive to this parameter. A comparison among different approaches for the statistical tolerance analysis of this device was reported in [16], and is briefly recalled in Table 1 to help assess the impact of the different tools in reducing computational costs.

Table 2

CFSO optimization results

Best solution
DOF		Fitness
$D_{c}$ (m)	0.136	Average Uniformity	8.11e-04
$R_{c}$ (m)	0.139	Uniformity Standard Deviation	4.89e-09

Table 3

CFSO configuration and computational time

CFSO configuration
Stability type	Converging oscillations
Algorithm parameters
$\omega$	$\lambda$	$\gamma$	H	# of Individuals	# of Iterations
$-0.815$	0.421	0.579	9.52e-9	30	25
Cost function
Computational time (no robust analysis) (s)	$6.03\times 10^{-5}$
Computational time (robust analysis) (s)	$9.61\times 10^{-3}$

Figure 4.

Solutions in the objective space for both approaches (LHS $+$ LM and CFSO). Green and blue dots represent, respectively, the guess values and the final values for the LHS $+$ LM (2000 cases). The red line represents the optimization path followed by the CFSO.

To assess the impact of the “robust” formulation on the optimization of an electromagnetic device, the CFSO algorithm was used to find the best robust configuration of the coils. Radius and mutual distance of the coils are both treated as DOF, independent of each other, while coil thickness, length and current are treated as UP. For each trial solution a stochastic tolerance analysis is performed, and the objective function is a linear combination of the average and the standard deviation of the field uniformity. Different weights were tried for the linear combination, but the best results were obtained with a 1:1 ratio (i.e. a direct sum). The cost function was optimized using two techniques: an exhaustive LHS paired with a Levenberg-Marquardt (LM) local search algorithm, used to trace the boundaries of the objective space, and a swarm-based global optimization technique based on the CFSO algorithm. Results are shown in Fig. 4. The best numerical solution found by the CFSO is reported in Table 2, while the configuration parameters and the computational cost of the cost function are reported in Table 3. Computational times refer to an Intel Core i7 processor @ 2.40 GHz.
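The structure of such a robust cost function can be sketched generically in Python. The names `robust_cost`, `performance` and `up_samples` are hypothetical, and the toy performance function in the usage example merely stands in for the field-uniformity computation, which is not reproduced here.

```python
from statistics import mean, stdev


def robust_cost(dof, performance, up_samples, w_mean=1.0, w_std=1.0):
    """Weighted sum of average and standard deviation of a performance figure.

    dof         : trial values of the degrees of freedom
    performance : callable (dof, up) -> float, evaluated once per UP sample
    up_samples  : realisations of the uncertain parameters (UP)
    The default weights give the 1:1 combination (direct sum).
    """
    values = [performance(dof, up) for up in up_samples]
    return w_mean * mean(values) + w_std * stdev(values)


if __name__ == "__main__":
    # Toy stand-in: quadratic nominal performance plus an additive UP effect.
    def toy_performance(dof, up):
        return (dof[0] - 1.0) ** 2 + up

    fixed_up = [-1.0, 0.0, 1.0]  # same UP realisations reused for every trial
    print(robust_cost([1.0], toy_performance, fixed_up))
    print(robust_cost([3.0], toy_performance, fixed_up))
```

Reusing one fixed set of UP samples for every trial solution is a design choice of this sketch: it keeps the cost deterministic, so the optimizer compares candidates on the same realisations rather than on sampling noise.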

6. Conclusions

Robust design and tolerance analysis are increasingly emerging as a lively research field in the area of optimization in electromagnetism, and different approaches are possible to cope with the problem. In this paper, with the aim of presenting different possible ways to formulate robust design algorithms, two methods have been presented and their Paretian nature has been shown. The additional computational effort required by robust formulations calls for tools able to reduce computation times: a few of them have been presented, together with an original, highly effective optimization algorithm based on a continuous formulation of the flock-of-starlings optimization, which provides control of the trajectories in the DOF space through control of the poles of a continuous system. A simple illustrative example has been used to show in practice the different aspects of the robust design problem and of the tools discussed in the paper.

Acknowledgments

This paper is the joint work of different research groups in the Italian community of Optimization and Inverse Problems in Electromagnetism. The authors wish to thank all the researchers in the community, and particularly Dr. Lozito, from Univ. Roma Tre, and Dr. Chiariello, from Univ. della Campania “Luigi Vanvitelli”, for their contribution to the work.

References

1. Montgomery, Design and Analysis of Experiments, 6th ed., New York: J. Wiley and Sons, 2005.

2. Diwekar U.M. and Kalagnanam J.R., Robust Design using an Efficient Sampling Technique, Computers and Chemical Engineering 20 (1996), 389–394.

3. Taguchi, Systems of Experimental Design, Unipub/Kraus International Publications & American Supplier Inst., 1978.

4. Dorf R.C. and Kusiak, Handbook of Design, Manufacturing, and Automation, New York: J. Wiley, 1994.

5. Shen, Ameta et al., A Comparative Study of Tolerance Analysis Methods, Journ. of Computing and Information Science in Engineering 5 (2005), 247–256.

6. Greenwood W.H. and Chase K.W., Worst-Case Tolerance Analysis with Nonlinear Problems, New York: John Wiley, 1992.

7. Saltelli, Chan and Scott E.M., Sensitivity Analysis: Gauging the Worth of Scientific Models, New York: Wiley, 2000.

8. Helton J.C. and Davis F.J., Latin Hypercube Sampling and the Propagation of Uncertainty in Analyses of Complex Systems, Reliability Eng. and Safety 81 (2003), 23–69.

9. Julier and Uhlmann, Unscented Filtering and Nonlinear Estimation, Proc. of IEEE 92 (2003), 401–422.

10. Di Barba, Dughiero, Forzan and Sieni, A Paretian approach to optimal design with uncertainties: application in induction heating, IEEE Trans. on Magn. 50 (2014), 917–920.

11. Ambrisi, Formisano and Martone, Tolerance Analysis of NMR Magnets, IEEE Trans. on Magn. 46 (2010), 2747–2750.

12. Matheron, Principles of Geostatistics, Economic Geology 58 (1963), 1246–1266.

13. Nagy Z.K. and Braatz R.D., Distributional uncertainty analysis using power series and polynomial chaos expansions, Journ. of Process Control 17 (2007), 229–240.

14. Laudani, Fulginei F.R., Lozito G.M. and Salvini, Swarm/flock optimization algorithms as continuous dynamic systems, App. Math. and Comp. 243 (2014), 670–683.

15. Coco, Laudani, Riganti Fulginei and Salvini, Accurate design of Helmholtz coils for ELF Bioelectromagnetic interaction by means of Continuous FSO, Int. Journ. of Appl. Electrom. and Mech. 39 (2012), 651–656.

16. Formisano and Martone, Quick Tools for Stochastic Tolerance Analysis, Proc. of Compumag 2011, 12–15 July 2011, Sydney, Australia.