Abstract
To overcome the disadvantages of low optimization accuracy and prematurity of the canonical PSO algorithm, we proposed an improved particle swarm optimization based on the interaction mechanism between leaders and individuals (PSO-IBLI), and used it to implement robot global path planning. In the PSO-IBLI algorithm, in different stages, each particle learns from the elites according to different regular. Moreover, the improved algorithm divides the execution state into two categories, where the parameters and the evaluation mechanisms are varied accordingly. In this way, the global best particles no longer walk randomly and have more learning objects. At the same time, other particles learn from not only the global best position, their historical best positions, but also the other elites. The learning strategy makes the search mode always in the adaptive adjustment, and it improves the speed of convergence and promotes this algorithm to find a more precise solution. The experimental results suggest that the precision and convergence speed of the PSO-IBLI algorithm is higher than the other three different algorithms. Additionally, some experiments are carried out to plan the robot’s entire collision-free path using the PSO-IBLI algorithm and the other three algorithms. The results show that the PSO-IBLI algorithm can obtain the shortest collision-free way in four algorithms.
Keywords
Introduction
Robot global path planning (RGPP) is one of the critical issues of autonomous navigation in the field of the mobile robot. Gemeinder proposed an extending method based Genetic Algorithm (GA). First, the path to avoid obstacles is found, and then the GA is used to find the optimal path [1]. Mohanta proposed a novel knowledge-based GA, The Petri-Net model combines with the GA and makes integrated navigation controller [2]. Oleiwi developed a hybrid approach that incorporated the ant colony optimization with GA. He uses ant colony optimization algorithm to obtain the suboptimal paths without obstacles and then takes them as the initial paths of GA [3]. Combining discrete Levy flight and discrete particle swarm optimization (PSO) algorithm, Wang proposed the Levy-PSO algorithm and used it to plan the robot path [4]. Zhang presented a pigeon-inspired algorithm to plan the three-dimension path of a UCAV. [5]. Given the features of fast convergence and simple model, PSO algorithm has been used widely in the optimization field. Therefore, we focus on studying PSO algorithm for RGPP.
PSO algorithm was proposed in 1995. Now it has gained widely attention [6]. In PSO algorithm, many particles cooperate and interact with each other to find solutions in the problem space. Each particle flies at the velocity that is changed based on its best historical position and the best global position of the entire swarm to date [7–9]. PSO algorithm has few parameters, and it is easy to be implemented. Compared with traditional algorithms, it deals with all kinds of benchmark optimization problems faster [10]. Nowadays, PSO algorithm is widely used to solve many practical problems. For instance, the parameter optimization in PID controllers [11], the boxing-covering problem in complex networks [12], and multi-robot path planning [13]. Relative approaches have been made to improve the canonical PSO. For example, Liang proposed a constraint processing mechanism based on multi-population [14]. Niu introduced a new variant of PSO inspired by the symbiotic cooperation of natural ecosystems [15]. Referring to the controller, Xiang proposed a PSO based on PID to accelerate the convergence and get rid of local optimality [16]. Based on above research, we studied the social learning ability of the particles and proposed an improved PSO algorithm.
This paper is organized as follows. In section 2, we describe the canonical PSO algorithm. The PSO-IBLI algorithm is depicted in section 3 with experimental verifications to show the improvement of our proposed idea. In section 4, the PSO-IBLI algorithm is applied to plan robot global path. Finally, it is summed up in section 5.
Canonical PSO and its variants
Because the PSO algorithm has few parameters, simple implementation, and superior performance, it has become a swarm intelligence algorithm that researchers like to study.
Canonical PSO algorithm
PSO algorithm can optimize almost any problem that can abstract fitness function. Firstly, The PSO algorithm randomly initializes a set of particles. Then, it updates the velocity and position of particles according to the Equation (1) and the Equation (2), respectively.
In this paper, t is the amount of iterations, N is the amount of the particles, D represents the dimension of the search space, and X is treated as a population of particles: X = (x1, x2, ⋯ , x
N
). In the canonical PSO algorithm, the i
th
particle is associated with two vectors: x1 and v
i
. The vector x1 represents the position of the i
th
particle (or called the i
th
candidate solution):
When the optimization precision or the preset amount of iterations is satisfied, the algorithm ends. Algorithm 1 described the canonical PSO in detail.
Algorithm 1: Canonical PSO
Initialize a population of particles X; Calculate the fitness of all particles f (X); Set xpbest.i to be the personal historical best position of the i
th
particle; Set x
gbest
to be the current global best position; for i = 1 : N do Update x
i
and v
i
according to the formula (1) and (2); Update xpbest.i and x
gbest
;
Researchers mainly studied the parameters, learning models, and mixed models to propose the variations of PSO.
In terms of parameter research, researchers mainly studied the inertia weight, the individual cognitive factor, the social cognitive factor, and shrinkage factor. In the canonical PSO, the inertia weight ω is a constant value. Therefore, the algorithm cannot automatically update the inertia weight according to the optimization accuracy and the number of iterations, so that the convergence velocity of the algorithm cannot be automatically adjusted according to the actual situation. The existing algorithms mainly set the inertia weight as a function and make the inertia weight adjust itself according to the actual evolution. For example, the commonly used update rule was adopted in literature [18], that is, ω dropped from 0.9 to 0.4. In literature [19] and [20], nonlinear adjustment functions were proposed considering the nonlinear factors of search. The two parameters c1 and c2 are also studied usually. In the existing studies, c1 is generally greater than c2 in the exploration phase, and c2 is greater than c1 in the development phase. At the same time, the sum of the two parameters is equal to 4 usually [16]. There isn’t the shrinkage factor in the canonical PSO algorithm, but this parameter can increase the convergence velocity effectively. We use χ to represent the shrinkage factor, and χ is generally used in the Equation (3), where d ∈ [1, D].
The learning model plays a pivotal role in the accuracy and convergence speed because it determines the way of information transmission in the crowd. To optimize the learning model of PSO, researchers have integrated learning ideas from other fields (For example, psychology and automatic control.) into PSO. The researchers improved PSO mainly by expanding the learning range of particles. For examples, Literature [21] proposed an optimization method for two populations to adopt different learning strategies. Literature [22] studied bottleneck targets and proposed a new learning strategy to carry out multi-objective co-evolution of PSO. Literature [23] enhanced exploration and exploitation and improved PSO. But the expansion of the learning range will lead to an increase in complexity. Therefore, when improving the learning model, the contradiction between the learning range and the time complexity of the algorithm must be balanced.
Each swarm intelligence algorithm has its advantages and disadvantages. Many researchers are committed to integrating two or more intelligent algorithms to improve PSO. For example, GA has better global search performance because of its better population diversity in the iterative process, so some researchers conducted a lot of comprehensive studies on GA and PSO. For instance, Literature [24] used a new framework to combine PSO with another optimization technique. To avoid the population falling into potential local optima, Literature [25] proposed a multilevel adaptive strategy and a purposeful detection strategy, and proposed an improved PSO algorithm by combining them.
Most variants of PSO also have some disadvantages. First, generally, the parameters are set as functions related to evolution times, to achieve adaptive evolution. However, evolution times cannot accurately reflect the evolution process of the algorithm. Secondly, when the algorithm is in the development stage, all particles gather together to the global best solution, and the inertia weight is minimal. Once the algorithm falls into the local best solution, it isn’t effortless to jump out to explore the new domain. Besides, in terms of learning strategies, learning samples are mostly selected based on fitness values. This learning strategy results in a single learning orientation that is likely to keep the particle at a local best solution.
This paper proposed a variant of PSO, which set the corresponding parameters and built the learning model according to the convergence characteristics of the PSO algorithm in the optimization process. The variant is called PSO based on the interaction mechanism between leaders and individuals in the collective (PSO-IBLI).
PSO-IBLI Algorithm
This chapter will describe the critical improvements of the proposed algorithm and the learning model in detail.
Improved learning strategy
In human society, leaders adjust their decisions by asking for the opinions of some excellent individuals, thus leading a collective to complete the tasks successfully. An individual in the collective learns from the leader and its excellent peers, whose ability is above the individual, and the style of these peers is often different from it. Based on the learning strategy in the collective of human society, this paper proposed a learning model to regulate the learning mode of each particle adaptively.
Dividing the evolutionary process into many stages
With reference to literature [17], this paper also divided the evolutionary process into stages clearly. We can cover the entire evolutionary process in two kinds of phase: exploration and development. What is noteworthy is that the operation process of the algorithm does not consist of two only separate stages, but the operation process of the algorithm may consist of many discontinuous exploration and development stages.
In each iteration, we calculate d
ij
, the Euclidean distance of each particle from the others, and simultaneously obtain the maximum and minimum values of these distances: dmax, dmin,where i, j ∈ [1, N],and i ≠ j. In addition, the average distance between the current best particle and the others is treated as d
gbest
.Thus, we can get a function f, as shown in the Equation (4).
Where f is the value between 0 and 1. To distinguish the stages of evolution, we segment the function f. We need to set a parameter a1, if a1 < f ≤ 1,the algorithm is in the exploration phase, then if 0 ≤ f < a1, it is in the development phase. a1 is set to 0.1 according to the results of many experiments in this paper.
At each stage, we set different parameters for the algorithm and assign different learning models to the particles.
Some parameters need to be set and processed before describing the new algorithm model. In this case, we assume that M is the size of learning sample of the i
th
particle. The fitness values of M particles are bigger than the others besides the global best solution and itself. If we suppose di1, di2, ... , and diM to be the distance between the i
th
particle and the learning samples, then g
ij
represent their normalized values, and p
ij
represent the normalized values of their reciprocals, where j ∈ [1, M], g
ij
and p
ij
can be expressed as the formula (5) and (6).
If we suppose fiti1, fiti2, ... , and fitiM to be the fitness value of learning samples of the i
th
particle, then q
ij
represents their normalized values and expresses as the formula (7).
In addition, k ij is used to express the sum of q ij and g ij : k ij = q ij + g ij , l ij is used to express the sum of q ij and p ij : l ij = q ij + p ij .
According to the above foreshadowing, the improved algorithm model is described in detail as follows.
(1) Exploration stage
At this stage, we sort N particles according to the value of l ij in descending order and select the first M particles as learning samples of the global optimal particle. The velocity of the global optimal particle can be updated according to the formula (8).
If a particle is not the global best particle, M is set to 2. We specify it learns from the global best position, its own historical best position, and the two elites close to it. The velocity of the i th particle can be renewed according to formula (9).
In formula (9), c1 + c2 + c3 + c4 = 4. r1, r2, r3, and r4 are the values between 0 and 1.
At this stage, ω is set to be a maximum ωmax,and ωmax = 0.9. Each particle is updated according to the Equation (3), and χ = 1.
(2) Development stage
At this stage, the algorithm gradually approaches the optimal solution. But because it is impossible to tell whether the algorithm is approaching the global best solution or the local best solution, it is most likely to get into the local best solution. Therefore, at this stage, it is not only necessary to promote algorithm convergence, but also to explore the diversity of solutions and encourage particles to learn from elite ones which are far away from them, so as to avoid the algorithm falling into local optimization. So we stipulate that under the probability p, the global best particle learns from the nearby elites; under the probability 1 - p, the global optimal particle learns from the instant elite. The velocity of the global optimal particle can be updated according to the formula (10), where r is between 0 and 1.
To make each particle converge to the optimal solution and maximize the diversity of the population, under the probability p, it learns from the best global position, its best historical position, and two elites close to it; under the probability 1 - p, it learns from the instant elite. The velocity of the i th particle can be updated according to the formula (11).
At this stage, we control the inertia weight and contraction factor adaptively according to the formula (12) and (13).
Where t is the number of iterations now, T is the maximum number of iterations, and ωmin = 0.4.
According to the change of the χ, each particle is updated according to the Equation (3). Description of the improved PSO is as follow.
Algorithm 2: PSO-IBLI
Initialize a population of particles X; Calculate the fitness of all particles; Set xpbest.i to be the individual historical best position of the i
th
particle; Set x
gbest
to be the current global best position; Judge the value of f
Update x
i
and v
i
according the formula (8), (9) and (3), and χ = 1; Update x
i
and v
i
according the formula (10), (11), (12) and (13);
Test results of PSO-IBLI and other algorithms on four test functions
The flow chart of PSO-IBLI algorithm is showed in Fig. 1, from which we can see that the algorithm performs more mathematical operations, so the real-time performance of the algorithm in real-time path planning may not be as good as that of the typical PSO algorithm. However, in the field of static searches, such as robot global path planning, the experiment shows that the improved algorithm can still meet its time requirement.

Flow chart of PSO-IBLI algorithm.
To verify PSO-IBLI, we selected four multidimensional test functions, which are used by researchers commonly. There are two unimodal functions and two multi-peak functions in four functions. The names, the dimensions of the variable, the scopes of the variable initialization, and the minimum values of the four functions are shown in Table 1.
Four test functions
Four test functions
Four functions are used to test PSO-IBLI, PSO and two improved algorithms in literature [1] and [2]. The improved algorithms in literature [14] is called CLPSO, and the improved algorithms in literature [15] is called SALMPSO. The maximum iteration times of all algorithms are 200 times, the population size is 100, and the experimental platform is Matlab2015a. Each algorithm conducted 30 independent experiments on the same test function and calculated the mean, standard deviation. Table 2 shows the test results from which we can see that the precision of PSO-IBLI algorithm is higher than that of the other three algorithms.
To further verify the performance of PSO-IBLI algorithm, the convergence speed of PSO-IBLI algorithm is compared with the other three ones. Figure 2 shows the results indicating that the fitness value refers to the average of optimal fitness values for each generation of 30 independent experiments. The minimum values of the four functions we used for testing are 0. Figure 2 shows that the PSO-IBLI algorithm has the smallest fitness value under the same number of iterations, so its convergence speed is the fastest among the four algorithms.

Convergence speed of the PSO-IBLI and the other three algorithms.
The above theoretical analysis and experiments verify the property of PSO-IBLI, the contribution of which is described as follows. This algorithm establishes an adaptive learning model so that each particle can choose its learning samples adaptively according to its evolutionary state. For the global best particle of each generation, the elites close to it is selected as its learning sample in the exploration phase, which helps to improve the global convergence speed. In the development phase, the elites far away from it are learned with a certain probability, which promotes the algorithm jump out of the local best solution. For other particles in the exploration phase, the global best position, its historical best location, and nearby elites are used as learning samples to improve the speed of global convergence. In the development phase, with a certain probability p, these particles learn from the elites far away from them which increases the diversity of the population. According to the different stages of evolution. In the exploration stage, the larger inertia weight and shrinkage factor increase the search step length, which is conducive to convergence. However, during the development phase, these parameters are gradually reduced to adjust the step size to prevent it from getting too large and missing the optimal solution.
Robot global path planning algorithm based on PSO -IBLI
In this section, PSO-IBLI is used to intelligently design the path of the robot and the performance of this algorithm is demonstrated by experiments.
Basic idea of robot global path planning based on PSO-IBLI
The two-dimensional environment model established in this paper is shown in Fig. 3. The starting and ending points are marked with two five-pointed stars. In order to connect the particles with the path of robot, the path point circles are established with the obstacle position as the radius, and these circles are represented by dotted lines. After locating the path points on these circles, a path is formed by connecting the starting one, the ending one, and the path points in turn. If there are n obstacles, n path points should be found. Every particle is an answer of the optimization problem, which means every particle is a path.

Two-dimensional environment model.
To find a better path using PSO algorithm, we need to build fitness function to judge whether the current solution is better than the current global best solution searched in the previous iteration. It is also the basis for PSO algorithm to update historical optimal position of individual particle and the current global optimal position of all particles. If there are n particles, they can be described as X = (x1, x2, ⋯ x n ),each particle can be described as
x
i
= (x1, x2, ⋯ x
D
), where D is the amount of path points. The fitness function of the i
th
particle is expressed in formula (14), where dist (x
i
) represents the path length of x
i
, and α is a marker parameter for whether the path point crosses the obstacle(If α = 1, path point didn’t cross the obstacle, otherwise, the path point crosses the obstacle).
The average path length obtained by four algorithms
We carried out the following experiments to use the PSO-IBLI algorithm for robot global path planning.
Firstly, we obtained two paths using the PSO algorithm and PSO-IBLI algorithm, respectively. The two paths are shown in Fig. 4. From Fig. 4, we can clearly see that the path optimized by PSO is obviously shorter than that optimized by PSO-IBLI.

Result of robot global path planning using PSO and PSO-IBLI.
To further illustrate the superiority of PSO-IBLI, we adopted 4 algorithms, each of which was run 30 times, and obtained average length and the shortest length of each one. Table 3 shows the results indicating that average length and the shortest length obtained by the PSO-IBLI algorithm are the shortest in the four algorithms.
This paper mainly studied the PSO algorithm, and an improved PSO-IBLI algorithm was proposed and used to implement robot path planning. Firstly, the algorithm is divided into two kinds of stages, and the whole algorithm process includes many explorations and many development stages. Secondly, the global best particle and the others have different learning mechanisms in various stages. Lastly, the parameter of the algorithm is changed according to the change of the stage. The PSO-IBLI algorithm was tested by four functions and the test results reveal that it is more superior than the other algorithms in convergence speed and accuracy. When PSO-IBLI is applied to the actual robot path planning, it has higher precision in path planning than the other algorithms.
