Abstract
This paper presents a new method called multilayer embedded bat algorithm (ME-BAT) to solve the general curve reconstruction problem with free-form parametric B-splines. Opposed to previous approaches in the literature, this method computes the optimal values of all free variables (data parameters, breakpoints, and poles), a very difficult task because they are strongly intertwined in a highly nonlinear way. The method is based on the idea of applying the bat algorithm at different layers: a main bat algorithm at an upper layer to compute the breakpoints and a second bat algorithm at a lower layer to compute the data parameters. This second bat algorithm is embedded into the first one and executed for each breakpoint vector of the population and at each iteration step of the main algorithm. Then, the poles are calculated by least-squares minimization through SVD. The method has been applied to three real-world engineering examples. The experimental results show that the method performs very well, being able to recover the underlying shape of data with high accuracy. A comparison with eleven alternative methods (including six classical methods in the field and all the metaheuristic methods applied so far to this problem) shows that this method outperforms the previous approaches in the field for all instances in the benchmark.
Keywords
Introduction
A striking feature of current manufacturing processes is the increasing trend of mass customization. Nowadays, products are manufactured in lesser amounts but with greater product diversity. A critical issue is how to use flexible computer-aided engineering systems to optimize mass customization without the corresponding increase in production costs [19]. In this context, design is becoming a major factor in the product development lifecycle. Individual customization implies that the geometric model of a product has to be frequently modified during the design process. As a result, computer-aided design and manufacturing (CAD/CAM) systems are becoming indispensable tools for industrial production and manufacturing.
A common strategy for mass customization in many industrial fields is to build prototypes to guide the early conceptual design. Typically, the process begins with a volume of raw material (such as clay, foam, wood, or plastics) from which a physical prototype is created. This physical model is digitally stored through data sampling by using technologies such as 3D laser scanning. The output is a cloud of data points. Then, curve and surface reconstruction techniques are applied to recover the underlying shape of data. This process, called reverse engineering, has many advantages: the digital model is much easier to understand and analyze than the physical model. It is also cheaper to customize, since any modification is simulated by computer and can readily be reversed if needed. Finally, digital models can be efficiently stored, archived and transferred among manufacturers, providers and customers, using the current high-speed networks and enhanced methods for feature-based data exchange and handling and big data cloud manufacturing [3, 34, 45].
Owing to these reasons, reverse engineering is the prevalent technology in many industrial fields such as CAD/CAM and computer-numerically controlled (CNC) machining and milling for automotive, aerospace and ship building industries. Another popular example is rapid prototyping (the generation of scale models of physical objects from CAD data), mostly due to the popularity and affordable prices of 3D printers, additive layer manufacturing technologies, and handheld scanning devices. Other examples include pressure machines in shoemaking industry, coordinate measuring machines for metrology and quality assessment, and computer tomography and magnetic resonance for medicine and biomedical engineering (e.g., prosthesis and customized medical implants).
In this context, this paper presents a method called multilayer embedded bat algorithm (ME-BAT) to solve the general curve reconstruction problem with parametric B-splines. The method relies on the bat algorithm, a powerful metaheuristic technique for continuous optimization problems. Opposed to all previous approaches, our method computes the optimal values of all free variables at full extent. The experimental results show that it performs very well, being able to recover the underlying shape of data with high accuracy. Comparison with eleven alternative methods shows that this method outperforms previous approaches in the field for all instances in the benchmark.
The structure of this paper is as follows: Section 2 describes the B-spline curve reconstruction problem and the state of the art in the field. Section 3 describes our proposed method in detail. The experimental results for three illustrative real-world examples are reported in Section 4. A comparison of this method with other alternative approaches is discussed in Section 5. The paper closes in Section 6 with the main conclusions and some plans for future work.
Related work
The problem of curve reconstruction from data points can be stated as follows:
Given a (generally large) collection of (usually noisy) data points, obtain the curve providing the best fitting to the input data points, according to an error metrics.
When data is acquired from accurate sources and smooth shapes, it can be performed through popular interpolation techniques. However, real-world data are usually affected by measurement noise, so interpolation methods fail since they force the fitting functions to pass through the noisy outliers. To overcome this drawback, approximation techniques are applied instead. In this case, curve reconstruction can be mathematically formulated as an optimization problem, generally a least-squares fitting problem [6, 26, 27].
Solving this optimization problem requires an adequate choice of the fitting functions. Arguably, the best ones are the free-form parametric curves, which are very flexible and well-suited for interactive design. Among them, the B-spline curves are the most powerful fitting curves, and are now included as basic primitives in all major CAD/CAM programs of the market. Based on these reasons, in this paper we address the curve reconstruction problem with B-splines curves, which are briefly described in next paragraphs. The interested reader is referred to [27] for further details.
Let
for
for
A B-spline curve of order
where
Let us now consider an input set of measured data points
for
where
System Eq. (5) is generally overdetermined. Consequently, the matrix of basis functions
where
Metaheuristic methods applied to B-spline curve reconstruction.
Metaheuristic methods applied to B-spline curve reconstruction.
Curve reconstruction with free-form parametric curves has been an active field of research for a very long time. First approaches in the field were based on classical mathematical optimization methods [6]. Early approaches were developed in the 60 s and 70 s for Bézier curves, where the bottleneck is the parameterization step. Due to its simplicity, the uniform parameterization (where all data parameters are equally spaced) is a popular choice. However, in many practical settings (e.g., CNC machining and milling for manufacturing, metrology and quality assessment) it is more convenient to consider arc-length parameterization. This method was later extended to the centripetal method, a variation of the arc-length parameterization that yields better results for shapes with sharp turns [27].
It was soon noticed however that Bézier curves are affected by the so-called global control: moving a pole automatically modifies the entire curve, making interactive design very tedious and difficult to handle. Due to these reasons, Bézier curves have been replaced in industrial applications by the more powerful B-spline curves. Many B-spline curve reconstruction approaches focus on the problem of data parameterization and skip the (more difficult) problem of computing the breakpoints. In general, these approaches proceed by setting the number of breakpoints a priori and then computing their location according to some formula [16, 24, 25, 32]. The simplest way to do it is to consider equally spaced values (uniform breakpoint vector). However, this approach may lead to singular systems of equations and does not reflect the real distribution of data. A more refined procedure consists of the averaging method and its variations [27], which allocate breakpoints to ensure that every interval span contains at least one parametric value.
It has been shown that B-spline curve reconstruction improves dramatically if the breakpoints are treated as free variables of the problem [2, 17]. Typically this is carried out by changing their number through either breakpoint insertion or removal. These methods require terms or parameters (tolerance errors, smoothing factors, cost functions, initial breakpoint locations) whose values are chosen subjectively [6, 11, 17]. Therefore, they fail to automatically generate a good breakpoint vector. In addition, these methods tend to be prone to errors and time-consuming, and yield an unnecessarily large number of breakpoints. Other approaches use error bounds [24], curvature-based squared distance minimization [32], or dominant points [25]. In general, they perform well but require some strong constraints (such as high differentiability, closed curves, noiseless data) that are difficult to meet for many real-world applications. Other methods use curvature information extracted from input data [14, 18, 21], so they are restricted to smooth data points and strongly affected by the noise in data.
In the last few years, different artificial intelligence techniques have been applied to this problem. Some examples are neural networks [12], Kohonen’s SOM (Self-Organizing Maps) nets [13], functional networks [15], and support vector machines [16]. All these methods are restricted to particular (very simple) cases and do not address the general case.
A very promising line of research is given by the bio-inspired metaheuristic techniques, which have been intensively applied to solve difficult optimization problems [1, 4, 5, 20, 22, 33, 39, 49]. Table 1 shows all the metaheuristic methods for B-spline curve reconstruction reported in the literature (arranged in rows in chronological order). For each method, the table shows (in columns) the method used and its bibliographic reference, the year it was introduced, and its discrete/continuous nature. Then, three true/false questions follow to enrich the discussion: parametric B-spline curves supported; data parameters
The bat algorithm
The bat algorithm is a powerful bio-inspired metaheuristic algorithm for optimization based on the echolocation behavior of bats and originally proposed in 2010 by Yang [35, 36]. Despite the short time since its appearance, it has already been applied to several engineering and industrial problems. See also [37] for a comprehensive, updated review of the bat algorithm and all its variants and applications.
This paper considers the standard bat algorithm (as described in the original paper in [35]), whose pseudo-code is shown in Algorithm 1. Basically, the algorithm considers an initial population of
where
Bat Algorithm
(Initial Parameters)Population size:
Initialize the location and velocity
Define pulse frequency
Initialize pulse rates
Generate new solutions by adjusting frequency,
and updating locations and velocities
Generate a new solution by local random walk
Accept new solutions
Increase
Rank the bats and find current best
If the new solution achieved is better than the previous best one, it is probabilistically accepted depending on the value of the loudness. In that case, the algorithm increases the pulse rate and decreases the loudness (lines 16–19). The evolution rules for loudness and pulse rate are as:
The method introduced in this paper applies the bat algorithm to solve the difficult problem given by Eq. (6). This requires to optimize all variables of the problem, instead of some of them as previous methods do. This cannot be done by a simple application of the bat algorithm; a more sophisticated approach is needed.
The method proposed addresses this problem by applying the bat algorithm at two different levels (called layers). The first (upper) layer computes the optimal vector of breakpoints. To this aim, the method is initialized with a random population of breakpoint vectors. For each, a second population of parameter vectors is randomly generated and then optimized through bat algorithm to obtain the best parameter vector associated with such a breakpoint vector. After this initialization stage, bat algorithm is applied to the population of breakpoint vectors for a given number of iterations. At each iteration step, each individual (the breakpoint vector) is modified according to Eqs (7)–(9), then, the change is probabilistically accepted according to the improvement of the fitness function and the value of the loudness. However, the fitness function depends not only on the breakpoint vector but also on the parameter vector. Therefore, any update of the breakpoint vector means that a new optimal parameter vector is to be computed. This task is carried out at a second (lower) layer, consisting of another bat algorithm embedded into the first (main) one. In other words, for each iteration step of the bat algorithm for the breakpoint vector population, another bat algorithm for a new parameter vector population is executed. The best parameter vector is selected to evaluate the fitness and the main bat algorithm continues until convergence. To reflect this nested structure of layers where the second bat algorithm is embedded into the first one at each iteration step, this procedure is called multilayer embedded bat algorithm (ME-BAT for short).
The output of this ME-BAT algorithm is the optimal pair of parameter and breakpoint vectors. They are then used to compute the optimal values for the poles. The resulting system of equations is overdetermined, so they are computed by using the least-squares method through either LU decomposition or SVD. This final procedure yields the B-spline curve that fits the data points better in the least-squares sense.
Description of the method
In this section, the method is formalized and each step of the method is discussed in detail.
Input data
The input of the method consists of:
the order of the B-spline curve, the input data, the number of poles,
The issue of finding suitable values for the parameters of the method will be discussed in Section 3.4.
Breakpoint placement – data parameterization
As indicated above, these two tasks are dependent of each other and must be conducted simultaneously. They are carried out by applying the new approach introduced in this paper, ME-BAT. It consists of two stages: initialization and main procedure.
Stage 1: Initialization. Firstly, the method is initialized with a random population of breakpoint vectors:
where the right superscript
is randomly generated. Then, an initial optimization on this population of parameter vectors is performed through a bat algorithm. This procedure is denoted as:
where
where
The output of this first stage is given by the collection of pairs
Stage 2: Main procedure. At this stage, the ME-BAT approach is applied. It consists of applying the bat algorithm at different layers. The upper layer computes the optimal vector of breakpoints through the bat algorithm for a given number of iterations,
where the meaning of all symbols is already known. Further decomposition of this bat algorithm is:
where, once again, all symbols are known. However, the fitness operator
that is, at iteration step
The final output of this process is the best pair of parameter vector and breakpoint vector:
The output of the previous step is now used to compute the poles of the curve. Using matrix notation and inserting the optimal values for data parameters and breakpoints, Eq. (6) can be written as:
where the optimization is now performed only on the poles
which leads to the normal equation:
where
and
for
Parameter tuning is a critical issue for metaheuristic techniques, as their performance is strongly dependent on the parameter values. Such values are also problem-dependent, so they cannot be determined in advance. The approach in this paper is a mixed one: good values described in the literature are used as an initial seed to guide the process; then, they are refined by numerous computer simulations for different parameter values. As a result, the final choice is necessarily empirical. The different parameters used in this method are arranged in rows in Table 2. For each parameter, the table shows (in columns) its symbol, meaning, range of values, and the parameter value chosen in this paper.
Parameters and values used in the ME-BAT algorithm
Parameters and values used in the ME-BAT algorithm
The most important parameters of the method are:
Population size (for breakpoint vectors,
Maximum number of iterations (for breakpoint vectors,
Initial loudness, Initial pulse rate,
Butterfly example: (top) initial noisy data points; (middle) reconstructed curve with the control polygon and the poles; (bottom) best reconstructed B-spline curve.
Convergence diagram for the butterfly example: RMSE vs. iterations of main bat algorithm, 
The method introduced in previous section has been applied to three real-world examples from different engineering fields: a butterfly model, a hand drill, and a fan blade. In all cases, the input data is a collection of scanned noisy data points. They are fitted through a single parametric B-spline curve. The order of the curve,
For all examples in this paper, 20 independent executions have been carried out to remove any spurious effect derived from stochasticity.
First example corresponds to the shape of a butterfly model firstly introduced in [40] as an example about real-time NURBS interpolation for high-speed motion control in the context of CNC machine tools for manufacturing. Since then, it has been used as a suitable benchmark instance for various industrial applications [38, 43, 44], including the use of a commercial CNC system to produce a machined contour of this shape [43]. Originally, the curve was described as a cubic NURBS curve with 51 poles and non-unit weights at some corner points to strengthen the pull of the curve towards the control points. A NC code reporting the parameter values of the curve can be found in [40]. This example has been primarily chosen not only because it is a real-world example widely used as a benchmark in previous works but also because it is mathematically described as a NURBS curve. It is therefore a challenging shape for the method introduced in this paper, which does not consider weights.
Hand drill example: (top) initial noisy data points; (middle) reconstructed curve with the control polygon and the poles; (bottom) best reconstructed B-spline curve.
Convergence diagram for the hand drill example: RMSE vs. iterations of main bat algorithm, 
Figure 1(top) shows the collection of 472 scanned data points. Middle and bottom pictures in this figure show the best fitting B-spline curve, obtained for the value
Whole three-blade fan of Example 3.
Cross-section curves are very important in several applied fields. For example, in the medical area it is possible to reconstruct the outer surface of the organ under study from a set of parallel slices corresponding to different levels. Parametric curves are also used to represent active contours in ultrasound imaging segmentation [23]. Similarly, in CAD/CAM many objects are designed by defining a number of cross-sections; the shape of the surface in-between is computed by some interpolation scheme. The second example corresponds to the profile curve of the central cross-section of a commercial hand drill of a popular manufacturer. Figure 3(top) shows the collection of 466 scanned data points. This example is a good candidate to check the performance of our method against the noise, as data points are affected by measurement noise of varying intensity for different regions. The best fitting curve, obtained for
Lower fan blade example: (top) initial noisy data points; (middle) reconstructed curve with the control polygon and the poles; (bottom) best reconstructed B-spline curve.
RMSE of the three examples for different noise intensity
Convergence diagram for the fan blade example: RMSE vs. iterations of main bat algorithm, 
Last example corresponds to a commercial three-blade fan shown in Fig. 5. The 271 sampled data points correspond to the lower blade of the fan, displayed in Fig. 6(top). The other two blades of the fan are similar and can readily be obtained by simple rotation, so they are not considered here. The blade is reconstructed with a B-spline curve with 31 poles, displayed in middle and bottom figures, with a very good RMSE of 1.68241
Robustness against noise
To check the robustness of our method against noise, the (already noisy) initial data for the three examples have been perturbed by an additive Gaussian white noise of intensity rate
Implementation issues
All computations in this paper have been performed on a 2.6 GHz. Intel Core i7 processor with 8 GB of RAM. The source code has been implemented by the authors in the popular scientific program Matlab, version 2013b. Equation (17) was solved through specialized Matlab commands for Gaussian elimination with partial pivoting and singular value decomposition (SVD) for squared and non-squared systems, respectively. In this paper, SVD has been primarily used since it provides the best numerical answer when the exact solution is not possible. Matlab also provides routines for ill-conditioned matrices, a situation that can happen in practice, for instance, when one or several singular values in SVD decomposition are null or very near to zero. Advisable answer to this problem is to set reciprocals of such singular values to zero. Matlab command svd handles this problem automatically. Another problem is that, although matrix
RMSE with different methods (best results in bold)
RMSE with different methods (best results in bold)
CPU time (in minutes) for the three examples in this paper
It is well-known that metaheuristic techniques are not well suited for real-time applications. Also, it is very difficult to determine the CPU time in advance, as it can vary greatly depending on the complexity of the model, size of input data, population size, number of iterations, hardware and programming language used for execution, and many other factors. Table 5 shows the CPU time (in minutes) for all the metaheuristic methods applied so far to the parametric B-spline curve reconstruction problem, including our method (reported in last row). Although our CPU times might seem very large, they are not surprising at all considering the large number of executions of the bat algorithm required by the multilayer embedded structure of our approach. The comparison with the CPU times of other alternative methods in Table 5 show that they are of similar order as ours, although some are actually better. However, such better CPU times are obtained at the expense of much larger fitting errors (see Section 5 for details). The emphasis of our method is on quality, not on computational speed. Note also that the embedded bat algorithms in Eqs (11) and (12) are executed for parameter vector populations associated with independent breakpoint vectors and similar happens for Eq. (3.3.2). This means that they can be computed simultaneously, so the method is well suited for partial parallelization.
Comparison with other methods
As shown in Table 1, most of the previous work about B-spline curve reconstruction reported in the literature is focused on the simplest case: explicit curves. For the parametric case, only partial solutions are provided: in general, either the data parameters or the breakpoints are not optimized but fixed. To the best of authors knowledge, no single metaheuristic technique addressed the general B-spline curve reconstruction problem so far. This fact is not accidental, but a clear indication of the difficulty of this task.
This lack of previous references prevents the comparison with any other general method. It is possible, however, to make a comparison with other methods commonly accepted in the field, even although they provide only partial solutions. This includes the three classical parameterization methods indicated in Section 2.1: uniform, arc-length, and centripetal, along with two methods for breakpoint placement: uniform and averaging. Since solving the general case requires the full computation of data parameters and breakpoints, these partial approaches have to be combined, yielding six feasible couples. They are listed in rows 1–6 in Table 4. Five other methods for parametric B-spline curves (listed in rows 7–11 of Table 4) can be found in the literature. The first one applies an artificial immune system called ClonalG to compute the breakpoints from a centripetal parameterization [30]. The second method computes the breakpoints by the averaging method, while particle swarm optimization (PSO) is applied to the data parameters [29]. The method in [46] relies on the arc-length parameterization and applies an estimation of distribution algorithm (EDA) with Gaussian mixture distributions to compute the breakpoints. This method can only be applied to closed curves. Fourth method applies the firefly algorithm to data parameterization while the breakpoints are obtained by using De Boor’s method [8]. The immune approach in [30] is replaced in [31] by a Pareto Envelope-Based Selection Algorithm (PESA), a type of genetic algorithm for multi-objective optimization. The ME-BAT method is considered in last row.
Table 4 reports the RMSE values for all these methods (in rows) and the three examples (in columns) of this paper. Best results are highlighted in bold for easier identification. We also show (in columns) the error rate (in percent) with respect to our method for better assessment. As the reader can see, the ME-BAT algorithm outperforms all other methods for all instances in the benchmark. Among the classical methods, the uniform parameterization is generally the worst, while the arc-length parameterization is slightly better than the centripetal, but they still perform very similarly. Although based on a powerful metaheuristic technique, the method in [30] does not improve the classical methods. This can be explained by the fact that the method is not continuous, but based on the conversion of the original continuous optimization problem into a combinatorial optimization problem. This conversion process introduces large discretization errors, even for very simple shapes. This discrete approach is also applied in [31], although the fitting errors improve in all cases. The methods in [8, 29, 46] perform pretty well: they improve the results of the classical methods and those in [30, 31] significantly. Among them, the method in [46] is the best for the two first examples. However, it cannot be applied to the third one (an open curve). Still, their results are worse than those of the ME-BAT method. As a conclusion, the method introduced here outperforms all previous methods in the literature on our benchmark.
Conclusions and future work
This paper presents a method called multilayer embedded bat algorithm (ME-BAT) to solve the general reconstruction problem with parametric B-spline curves. The method relies on a powerful metaheuristic technique called bat algorithm aimed at solving difficult continuous optimization problems. Opposed to the previous methods in the literature, the proposed method computes the optimal values of all free variables (data parameters, breakpoints, and poles) at full extent. This is very difficult task because these variables are strongly intertwined in a nonlinear and complicated way, so they cannot be obtained independently. The proposed approach is based on the idea of applying the bat algorithm at different layers, with a main bat algorithm at an upper layer to compute the breakpoints and a second bat algorithm at a lower layer to compute the data parameters. This second bat algorithm is embedded into the first one and executed for each individual breakpoint of the population and at each iteration step of the main bat algorithm. Then, the poles are calculated by least-squares minimization through SVD. The method has been applied to three real-world examples from engineering fields. The experimental results show that the method performs very well, being able to recover the underlying shape of data with high accuracy. A comparison with eleven alternative methods (including six classical methods in the field and all the metaheuristic methods applied to this problem so far) shows that the proposed method outperforms previous methods in the literature for the examples in this benchmark.
Main contributions of the paper are:
A new bat algorithm-based scheme (ME-BAT) for continuous optimization where the variables are related to each other in a highly nonlinear and complicated way. This methodology is very general and can readily be adapted for application to any other problem exhibiting similar features. The application of this scheme to the general curve reconstruction problem with free-form parametric B-splines. Opposed to all previous approaches, the proposed method computes all free variables of the problem at full extent. This is the first reported method able to solve this problem in all its generality.
Main limitations of the method are the computation time and the determination of the optimal number of poles. The first problem can be alleviated by partial parallelization of the embedded bat algorithm, but it is still unclear how much improvement could be obtained. An exciting trend in this regard is the increasing power of GPUs with their highly parallel structure, making them more efficient than CPUs for algorithms where large blocks of code or data can be processed in parallel (see [47, 48] for two recent examples of the use of GPUs to optimize the performance of metaheuristic techniques). The determination of the optimal number of poles might be addressed by adding a careful chosen penalty term to current fitness function. These tasks are part of the future research in the field.
This method could be extended to NURBS at the expense of introducing another set of free variables (the weights) that must be computed as well. The method should be modified to account for these extra variables. Theoretically, the addition of a new layer might probably solve the problem but the computation time could become unacceptable for many practical problems. Clearly, further research is still needed to solve these and other challenging issues related to curve reconstruction. The application of this method to other engineering problems is also part of our future work.
Footnotes
Acknowledgments
This research is supported by the Computer Science National Program of the Spanish Ministry of Economy and Competitiveness, Project Ref. TIN2012-30768 and Toho University (Funabashi, Japan). Special thanks are due to the editors and the four anonymous reviewers for their encouraging and constructive comments and very helpful feedback that allowed us to improve our paper significantly.
