Abstract
Genetic Algorithms are efficient for the travelling salesman problem, but they have a premature convergence problem resulting in suboptimal solutions. As the initialization step has a profound impact on the algorithm’s performance, this study proposes a knowledge-based initialization technique to learn the patterns of evolved populations, and integrates four heuristic strategies to generate the initial population. Advanced initial solutions and high quality gene blocks can be quickly created with this method. Instances in the TSPLIB library are used to set the parameters and test different initialization methods. The results show that this proposed technique can improve the initial population and optimization performance of genetic algorithms.
Introduction
The travelling salesman problem (TSP) is a widely studied combinatorial problem in operations research and computer science. It has various applications in engineering, such as controlling unmanned aerial vehicles [1], dairy transportation [2] and the design of electric vehicle information service systems [3]. The TSP is an NP-hard problem [4]. Exact algorithms, including enumeration methods and branch and bound algorithms [5] are only suitable for small scale problems due to limitations on time and memory. In contrast, heuristic techniques [6] and intelligent optimization algorithms such as genetic algorithms [7, 8], ant colony algorithms [9], neural networks [10, 11], simulated annealing [12] and differential evolution algorithms [13, 14] are able to reliably find acceptable solutions within a reasonable time.
Among these, genetic algorithms (GA) [15], which are based on an imitation of natural selection and evolution, provide a powerful means of solving the TSP. The first step is initialization, which aims to generate the initial population. The initial population, representing a set of initial candidate solutions, has been shown to play an important role in the performance of GAs [16]. The traditional method of initialization is to randomly generate a population of solutions [17]. This method is common, simple and fast, but is likely to lead to a time-consuming optimization process and unsatisfactory solutions because of the uncertain quality of the initial population.
To improve the performance of GAs, initialization techniques have been extensively studied [18]. Yugay et al. [19] sorted the initial population by fitness value. The Nearest Neighbor heuristic algorithm (NN) is an alternative to the random method. To generate an individual candidate solution, it chooses a start node randomly, and then the nearest node is chosen iteratively [20]. The Neighbor Field method is similar to NN, but the node is chosen from a set of unused points [21]. Wei et al. [22] used the gene bank(a matrix of permutations of nodes according to their distance) to generate the initial population. Paul et al. [23, 24] proposed a Vari-begin and Vari-diversity (VV) population seeding technique. It generates a permutation-coded population by selecting different start nodes and using a variation of the gene bank which is named the Ordered Distance Vector (ODV).
The above initialization heuristics use the greedy principle in terms of distance to some extent. However, the key to optimizing the initialization is to identify important patterns of better solutions (evolved populations) and features of high quality gene blocks (combinations of genes). Based on the knowledge learned from evolution, this paper proposes a practical technique with several strategies to generate advanced initial populations with high quality gene blocks.
In Section 2, we build the mathematical model of the TSP. In Section 3, the principles of GAs are introduced. In Section 4, the mechanism of knowledge-based initialization techniques is explained. In Section 5, the parameters are determined and the effectiveness of the proposed method is proven experimentally. Finally, we provide a summary and discussion in Section 6.
Travelling salesman problem
The travelling salesman problem is easy to describe but hard to solve. It aims to find the shortest path to visit each city once and only once. A TSP can be represented by a complete graph G = (N, A), where N is a set of n cities, and A is a set of arcs. The cost of distance of each arc (i, j) ∈ A is represented by d ij in the distance matrix D. The problem is defined as follows:
The variables:
The objective function:
The constraints:
Equation (1) is the objective function minimizing the total cost (length) of the tour. Constraints (2-5) assure that each city is visited once and only once.
Genetic algorithms are based on the principles and mechanisms of natural selection. With the advantages of a population of candidate solutions and the help of different genetic operators, GAs can solve complex problems effectively. The population is a collection of individual solutions that are represented by chromosomes. A chromosome is a permutation of genes. The pseudocode of a GA is shown in Algorithm 1.
The parameters include the maximum number of iterations, population size, and probabilities of crossover and mutation. The path representation method is widely used [25] to encode and decode the chromosomes. In other words, a chromosome C is a permutation of cities [c1, c2, ⋯ , c
n
]. For instance, C = [1, 2, 4, 3] represents a path 1 → 2 →4 → 3 →1. The process of evolution ends when requirements on the number of iterations or the fitness value are satisfied. The fitness is the criterion of individual solutions’ quality and is associated with the objective function. The fitness function can be defined as Equation (6) for the TSP, in which represents the tour length.
A larger fitness value means that the solution is better and has a greater chance of being selected into the next generation. Both crossover and mutation are probabilistic processes. The crossover operator uses two existing chromosomes to generate new chromosomes, while the mutation operator changes chromosomes individually. When the termination conditions are satisfied, the GA decodes chromosomes into solutions. Figure 1 illustrates the typical development of a solution.
General patterns of evolution
The optimization process of a GA is based on the evolution of populations. As shown in Fig. 1, the general patterns of evolution for the TSP are: Initially, the solution has several intersections in the path. The solution is improved by iteration with a decrease in the number of intersections. For example, the number of intersections is decreased from two to zero in Fig. 1. There are high quality gene blocks in the evolved solution, which represent combinations of several neighboring cities. For instance, gene blocks [6, 5] and [2, 7] in Fig. 1.
Framework of the initialization technique
In practice, uncertainty in the quality of the population will influence the robustness of the GA. Furthermore, low gene diversity can result in the algorithm converging to a local optimum (premature convergence). Therefore, the initialization step should generate advanced candidate solutions and others with high quality gene blocks that will be used as components of the final solution.
The patterns discovered inspire the proposal of a knowledge-based initialization (KI) technique to ensure gene diversity and quality. As knowledge about the evolved population’s features and patterns is gained, we can generate an advanced initial population with high quality gene blocks and enable the GA to achieve a better solution in a short run time, thanks to improvement of the initial population.
Figure 2 displays the mechanism of the KI. First, a constructed solution strategy can generate an initial solution without path intersections quickly. Second, a random strategy is used to keep gene diversity. Last, some high quality gene blocks can be created by using a local optimization strategy and a cross detection and deletion strategy.
Constructed solution strategy
The constructed solution strategy uses coordinate transformation and the polar angle to generate an initial route without path crossing. The key is to divide the plane into non-intersecting sectors. Using a rule based on increasing the polar angle to choose cities to visit can avoid intersections of paths.
Figure 3 illustrates the process to generate initial solutions (individuals) using this strategy. The steps of the constructed solution strategy are shown below:
Random strategy
The random strategy is the classic way to generate individuals, and it can increase the diversity of the initial population. The process is shown below.
Local optimization strategy
A local optimization strategy is used to improve the quality of gene blocks of individuals based on the principles of greedy algorithms. Figure 4 illustrates the main process.
1) Gene block determination: Choose the length (L) of a gene block of chromosome C randomly; L ∈ [4, min (10, n)]; n is the number of cities.
2) Gene block optimization:
Cross detection and deletion strategy
A cross detection and deletion strategy is another way to optimize the gene blocks in terms of crossing of paths. Figure 5 represents a strategy to detect and delete crossovers in the path [26, 27]. But when the scale of the problem is large, it is time-consuming to examine every combination of two cities. Therefore, a random number I is used to limit the number of pairs to be examined. The main steps are listedbelow.
Experiments and results
The program is coded in the Matlab environment on a computer with a 2.94 GHz CPU and 2 GB physical RAM, using the parameters from [23], as shown in Table 1. According to Equation (6), the tour length is inversely proportional to the fitness of the individual. Therefore, the tour length is used as the optimality criterion for solutions.
Parameter setting
To determine optimal parameters for the knowledge-based initialization technique, i.e., the percentages of individuals generated by different strategies, parameter setting experiments are done.
Table 2 shows parameter sets of different strategies. SN, CSS, RS, LOS and CDDS represent the sequence number, constructed solution strategy, random strategy, local optimization strategy and cross detection and deletion strategy respectively. For example, the first parameter of CSS (25%) means that 25% of the initial population is generated by applying the constructed solution strategy in the KI. The instance KroA100 in TSPLIB is used to run the tests and the results of 50 runs are averaged.
The results in Table 2 indicate that the 6th parameter set is the best, and helps the KI achieve the best average initial length of tour and solutions. Furthermore, its initialization time is the second-shortest. The KI with the 7th parameter set is the fastest, in which over half of the initial solutions are generated with the random strategy. It is noted that the local optimization strategy and the cross detection and deletion strategy are relatively time-consuming.
Comparative studies
The performance of the GAs with different initialization techniques are compared to verify the effectiveness of the KI technique.
Comparative instances are from TSPLIB (Eil51, Pr76, KroA100, Pr144 and Gil262). The competitors are Random initialization technique (RD), Nearest Neighbor technique (NN), Gene Bank technique (GB) and Vari-begin with Vari-diversity technique (VV). Their performance in [23] is compared with that of the KI, using the parameters given in Table 1. For each case, the average result of 50 runs is collected for analysis.
The results in Table 3 show that the KI helps the GA obtain the best solution and the greatest improvement in all instances. Because the platforms are different, computation times cannot be compared. However, the initialization time of the KI is shortest, indicating that the computational complexity is acceptable.
Application
An important real-life application of this novel initialization method is for path planning for Unmanned Aerial Vehicles (UAV) in surveillance or aerial filming. For example, a UAV should visit several places in one flight to monitor an area. Considering the time effectiveness and cost, an optimized solution for the flight path can be generated more quickly using a genetic algorithm with the KI technique. The case lin105 in TSPLIB is used for testing on the Matlab platform.
Figure 6 shows the optimization process of the GA initialized by the RD and the KI in a run for lin105. The initial tour length of the GA with the KI is much shorter than that of the GA with the RD in the beginning, which illustrates that the initial candidate solutions generated by the KI are advanced. The other advantage of the proposed initialization technique is the improvement in convergence rate, that is, we can get a better solution in shorter time.
Figures 7 and 8 illustrate path planning results using these two initialization methods respectively. It is noted that the path in Fig. 8 is an improvement over that in Fig. 7 because there is no path crossover.
Conclusion
This paper proposes a knowledge-based initialization technique to enhance the ability of a genetic algorithm to solve the travelling salesman problem. It integrates four heuristic strategies to use the patterns and features of evolved populations to generate a high quality initial population. Experimental results indicate that the proposed initialization technique outperforms other published methods for the improvement of GAs.
The performance of the GA is also determined by its operators and parameters. For this reason, in further research, we intend to design and improve operators of the GA to cooperate with the initialization technique and optimize the parameters.
Footnotes
Acknowledgments
This research is supported by the National Natural Science Foundation of China (Nos. 71331008 and 71101150), the Youth Training Program for Innovation and Entrepreneurship Platform of Science and Technology in Hunan Province, Foundation for the Author of National Excellent Doctoral Dissertation of PR China (201492), the Outstanding Youth Fund Project of Hunan Provincial Natural Science Foundation, the Top-notch Innovative Talents Training Plan of National University of Defense Technology, the Outstanding Youth Fund Project of National University of Defense Technology, and the Program for New Century Excellent Talents in University.
