Abstract
In recent years, multiple robots have been successfully applied in various fields, including the handling of logistics factories, agriculture, and disaster relief. This study proposes a novel method for multi-robot deployment and navigation in dynamic environments. To address the problem of location deployment, a grid-based method was used to simplify environmental input information, and a self-clustering method was used to adjust location deployment. To address the problem of navigation, a behavior manager was used as a navigation strategy to control the towards-goal behavior and wall-following behavior (WFB) of mobile robots. An interval type-2 fuzzy controller based on improved particle swarm optimization (IPSO) was proposed to implement the WFB control. The proposed IPSO improved the search ability and enhanced the convergence speed of traditional PSO. Additionally, an escape mechanism was proposed to avoid a dead cycle. Experimental results show that the proposed IPSO is superior to other methods used for WFB and navigation control.
Keywords
Introduction
Remarkable progress has recently been made in the research and development of robotics. In addition to industrial robots used for automating production lines, numerous types of service robots have been developed to meet the diverse demands of daily life; for instance, cleaning [7], home care [21], and home safety [6]. Professional service robots include those deployed in the fields of farming and animal husbandry [1, 20], medical care [14, 15], object handling [2], and navigation [5]. With the advances in technology, problems relevant to industrial robots, service robots, and multi-robot cooperation [2, 12] have attracted increasing research interest. For example, the robot lifeguard [12] can help ensure a rapid response to the correct location in a vast sea; robots monitor temperature and conduct humidity testing in large factories (such as wineries); logistics robots help place goods on shelves in large shopping malls; and robots assist in the sprinkling of water and pesticides in fields used for farming. However, there are common problems associated with the deployment and navigation of multi-robots.
To overcome the aforementioned problems, Reif and Wang [9] proposed autonomous robot behavior control of a social potential field, which is given to each robot by defining the rules between themselves and the robot group. This reflects the social relations among robots and enables them to arrange themselves into teams. Parker [11] proposed a low-level controller based on an artificial potential field to imitate the concept of an electric field force. This perceives the movement of the robot in the environment as an abstract, artificial force field. Although the two methods [9, 11] are simple in structure and provide easy control, the disadvantage of both methods is an existing local optimum that can easily become trapped in a deadlocked situation. Therefore, Cortes et al. [8] proposed a distributed processing method based on Lloyd’s [23] method and the computational geometry of the Voronoi diagram, in which a gradient descent algorithm is used to solve unconstrained optimization problems and deploy the decentralized processing of a robot team in the best position within a region. Although this method is simple and easy to implement, it has a slow convergence rate and easily falls into a local optimal solution. The K-means clustering algorithm [10] is used to divide n points into K clusters so that each point belongs to the cluster that corresponds to its nearest mean value as the standard value of the cluster. Although the K-means clustering algorithm is simple and rapid, the number of cluster center points must first be defined because an inappropriate number of clusters yields poor results which considerably limits the usefulness of the clustering method. To solve this problem, a self-clustering method is proposed that can dynamically evaluate the data set and determine the current population center in the input data space without to the requirement of predefining the number of clusters.
In terms of navigation, avoiding collisions with dynamic obstacles or adapting to changes in the environment, such as the appearance of a pedestrian or a moving object and a dead road caused by collapsed goods, is difficult. To realize navigation control of the mobile robot, obstacle avoidance is important in enabling the mobile robot to reach the target location. In recent years, intelligent robots have been widely used, most commonly through the methods of fuzzy control and neural network control. Researchers, including Juang et al. [3] and Wai et al. [17], have combined the concepts of fuzzy logic and neural network in robot controllers that directly transmit the sonar message received from the sensor to the designated fuzzy neural network (FNN) to achieve mobile robot control. However, this is affected by the uncertainty of input, output, and noise signals in a real environment. Although an FNN can successfully facilitate the wall-following control, its performance may not be optimal. To solve these problems, some researchers [4, 19] have applied a type-2 fuzzy neural network (T2-FNN).
This network was developed to solve the shortcomings of a type-1 fuzzy neural network (T1-FNN). The T2-FNN uses a fuzzy set as the membership value and is therefore an extension of the T1-FNN, which uses a crisp set as membership value. Ownership of these fuzzy sets provides a footprint of uncertainty, which enables the handling of uncertainties. Therefore, T2-FNN generally performs more effectively than T1-FNN, although its computational complexity increases significantly. In this research, an interval type-2 fuzzy controller (IT2FC) was therefore used to reduce computational complexity, and the center of sets (COS) reduction process [16] was adopted to simplify the reduced order process.
In parameter learning, evolutionary algorithms [18, 27] are widely used to adjust the parameters of a neural or fuzzy neural network, such as in the artificial bee colony (ABC) [26], difference evolution (DE) [27], quantum-behaved particle swarm optimization (QPSO) [22], and particle swarm optimization (PSO) [18]. This study focused on the particle swarm algorithm [18], which is an evolutionary calculus inspired by the collective behavior of social animals, which has the advantages of fast convergence and simple implementation. However, it also has several shortcomings, including low precision, faster speed of convergence, and a tendency to readily fall into the optimal solution in complex applications. To solve these problems, an improved PSO is proposed to maintain the diversity of particles and avoid falling into the local minimum solution.
The purpose of this study was to deploy and navigate multiple mobile robots for use as water lifesaving robots, logistics robots, or in factories or farmland. To solve the problem of location allocation, we used gridding to simplify the input of environmental information and utilized the proposed self-organizing algorithm to deploy the robot. To help the robot safely reach the deployment point, an IT2FC was designed for avoiding obstacles. Finally, the proposed method was applied to the simulated field and its performance was compared with other methods identified in the literature.
Deployment of multiple mobile robots
The self-clustering method
In this subsection, a self-clustering method (SCM) is proposed to cluster input data. The main advantages of this algorithm are as follows. First, it is a one-pass algorithm that can dynamically evaluate the cluster entries in the input space. Second, it calculates the current data and automatically determines the number of clusters. Thus, if the user knows the environment must allocate several robots, they can adjust the threshold in SCM to change the effect of clustering. Using two-dimensional data as an example, the details of the SCM are as follows.

The self-clustering method in two-dimensional space.
To determine whether the candidate sample belongs to the jth cluster, the restriction conditions are as follows:
If
The threshold D thr is the most important parameter in the SCM. A low threshold value produces more clusters, whereas a higher threshold value produces fewer clusters. Selection of the threshold value therefore affects the number of clusters and is ascribed according to the performance of the robot and the problem. Figure 2 shows the effect of using different threshold values.

Location deployment of mobile robots using SCM with different thresholds (red dots denote robots).
This section describes the use of mobile robots and the proposed navigation control method. The proposed method is to design a behavior manager (BM) to change a robot’s behavior according to the relationship it has with the environment.
Mobile robot description
Figure 3 shows the Pioneer 3-DX robot used in this study. It is manufactured by Mobile Robots USA and exhibits many features, such as high load, high endurance, highly scalable, and a software development kit across platforms, which also includes robot motion control, a client-server model, and various equipped libraries. Users can apply this robot in a variety of areas and integrate it with all the peripherals to achieve their research and development goals.

Pioneer3-DX Mobile Robot.
The robot moves towards the target when there is no obstacle preventing it from doing so. If there is an obstacle, it needs to bypass this to find the target point. In this subsection, we describe a BM designed to control two mobile robot behaviors: towards-goal behavior (TGB) and wall-following behavior (WFB). The mobile robot is divided into four regions R_1, R_2, R_3, and R_4, as shown in Fig. 4. The position (R_i) of the target point in the mobile robot is then determined as shown in Fig. 5. If it is located at R_(1∼3), the BM detects whether there is an obstacle in this area. If an obstacle is detected (S _ i ≤ 1m), the BM switches to the WFB; if not, it executes the TGB. If the target point is located at R_4, the robot switches to the WFB until it reaches the target point.

The robot is divided into four areas.
To realize the WFB control, an IT2FC based on improved particle swarm optimization (IPSO) is proposed for collision avoidance.
Interval type-2 fuzzy controller
Figure 5 shows the IT2NC architecture which comprises five layers: the input layer, fuzzy layer, firing layer, output processing layer, and output layer. The IF-THEN rule can then be expressed as follows:
THEN

Structure of the interval type-2 fuzzy controller.
The membership degree of the Gaussian primary membership function
and
The output of each node is represented as an interval [
and
Nodes in Layer 5 defuzzify the output by computing the average of y
l
and y
r
to provide the crisp value of y.
Traditional PSO presents the advantages of fast convergence and simple implementation, but also the shortcomings of low precision and a tendency to fall readily into the local best solution in complex applications. To improve these shortcomings, this study proposes an IPSO, the flowchart for which is shown in Fig. 6.

Flowchart of improved particle swarm optimization.
If Dis i < ADIS g and Fit i < AFIT g , this particle is similar to the group leader. They are then placed in the same group and their group number is updated to g If this is not the case, the particle does not belong to the group, and no action is taken.
If ungrouped particles remain, the algorithm returns to step 2 and particles with the highest fitness value are set as the leader of the new group. Steps 2 through 3 are repeated; conversely, if all particles have been grouped, Step 3 is complete.
The entire system block diagram is shown in Fig. 7. The IT2FC has four input signals; the four sonar sensed distances S0, S1, S2 and S3. The sonar detection range is limited to between 0.1 and 1 m, while the IT2FC outputs the rotational speeds V L and V R of the two axles. The output range is –5.24∼5.24 rad/s. The duration of the execution cycle is 500 ms and is referred to in this study as a time step.

Block diagram of the learning behavior of the wall-following control.
To ensure that the mobile robot is implementing WFB during the learning process, it possesses three termination conditions [25]: (1) Mobile robot collides with the wall; (2) Mobile robot deviates from the wall; (3) The actual distance the mobile robot moves is more than one lap of the training environment. The fitness function is used to evaluate the effectiveness of wall following until the termination condition of the algorithm is established. The fitness function contains four sub-fitness functions: SF1, SF2, SF3, and SF4, which are defined as follows:
The weighting coefficients of the control factors are[α1, α2, α3, α4] = [0.35, 0.35, 0.25, 0.05]. The higher the weight setting, the more important it is to evaluate this item. In this research, the objective functions SF1 and SF2 are the most important factors in learning to succeed.
The objective functionSF1 evaluates the moving distance of the mobile robot. When the moving distance T
dis
is closer to the preset value T
stop
, the robot has moved closer to a successful detour training environment, as defined below:
where d
wall
is a preset fixed distance value. The objective function SF2is defined as the average moving time RD (t):
The objective function SF3 evaluates the angle θ between the mobile robot and the side walls. When the action robot is parallel to the wall, θ=90°.θ (t)is defined as follows:
To ensure that the mobile robot is parallel to the wall during the move, SF3is defined as the average of all moving times|θ (t) -90|, formulated as follows:
The Objective function SF4 evaluates the moving speed of the mobile robot. To ensure that the average speed of the robot remains close to the desired preset speed, mobile robots not only maintain a fixed distance from the wall but also increase their speed of movement, which is defined as follows:
To verify the effectiveness of the proposed navigation control method, the experiments were divided into two parts: the WFB control and the deployment and navigation control of multiple mobile robots.
Mobile robot wall-following control
To determine the effectiveness of this method, the experimental results were compared with the performance results of other evolutionary algorithms. To demonstrate the stability of the algorithms, ten evaluations of each algorithm was conducted. Table 1 presents the initial preset parameters of the proposed IPSO, which comprises the total number of particles, the inertia weight ω, the accelerating constants C1andC2, the algebra, and the number of fuzzy rules. For many problems, the inertia weight ω is difficult to determine; therefore, it was set to 0.2, 0.3, 0.4, 0.6, and 0.8, as shown in the experimental results presented in Table 2. These results show that when ω = 0.3, performance is improved; therefore, ω was set to 0.3 in subsequent experiments.
Initial set parameters for the proposed IPSO method
Initial set parameters for the proposed IPSO method
Evaluation using different inertia weights
The experimental results for the fuzzy rule number are shown in Table 3. These show that although fewer fuzzy rules reduce computing time and memory space, their performance is poor (the higher fitness value is better). However, an increase in the number of fuzzy rules requires more evolutionary time and memory space to achieve superior results. Therefore, six fuzzy rules were set in this experiment.
Evaluation of effectiveness using different fuzzy rules
To verify the effectiveness of the proposed method, the experimental results were compared with the performance of other evolutionary algorithms [21–24]. Figure 8 presents the learning curve of different algorithms for the mobile robot WFB control. A detailed comparison of the results is presented in Table 4 and encompasses the best fitness, worst fitness, average fitness, standard deviation (STD), the number of training successes (NTS), and learning time (LT). The number of training successes indicates that the mobile robot can successfully learn to navigate the training environment within 10 evolutionary simulations. Figure 8 and Table 4 show that under the same conditions, the proposed IPSO performs more effectively because less time is used in the WFB control and the standard deviation is lower, indicating that the proposed algorithm has high stability.

Learning curve of different algorithms for the mobile robot wall-following control.
Evaluation of learning behavior using different methods
Two testing environments were also established to verify that the WFB control can be successfully implemented in unknown environments after learning different algorithms. The environment shown in Fig. 9 contains many difficult large bends, whereas the environment shown in Fig. 10 contains many consecutive curves. Quantified indicators include the distance and time taken for a robot WF a circle and the average distance between the robot and the wall. Table 5 presents the experimental results, which show that the performance of the proposed IPSO algorithm is superior to other algorithms [18, 27].

Paths of movement of the mobile robot using (a) IPSO, (b) PSO, (c) QPSO, (d) ABC, and (e) DE methods, respectively, in test environment 1.

Paths of movement of the mobile robot using (a) IPSO, (b) PSO, (c) QPSO, (d) ABC, and (e) DE methods, respectively, in test environment 2.
Comparison of various methods in the two test environments
To verify the proposed navigation method, three test environments were established. In the following experiments, the green R represents the mobile robot, the red G represents the deployment target, the white O represents the dynamic obstacle, and the arrow represents the direction in which the obstacle was moving.
R2 have successfully reached the target points G1 and G2.
Dynamic obstacle avoidance
The test environment is shown in Fig. 11. Its purpose was to ascertain whether the proposed method can avoid the dynamic obstacle located in the environment. Figure 11 (a) shows the initial positional relationship between the mobile robot and the obstacle. Figure 11 (d)–(e) shows that the mobile robot R1 avoids the mobile robot R2 as a dynamic obstacle. Figure 11(g) shows that the mobile robot is safe from dynamic obstacles. Finally, in Fig. 11(h), both R1 and R2 have successfully avoided obstacles and reached the target points G1 and G2, respectively.
Beach life-saving robot
The beach environment is based on an aquatic life-saving robot test site located on ShaWan Beach at Kenting Pak in Taiwan. We first meshed the input map, as shown in Fig. 12, and then used the proposed self-clustering method to assign the robot’s position (Fig. 13) before moving the robot to the configured location. Figure 14 (a) shows the positional relationship between the first mobile robot and the obstacle. In Fig. 14 (b)–(e), the red box shows that the mobile robot is safely moving away from the dynamic obstacle. Figure 14 (f) shows that R1–R4 have successfully reached the target point.

Dynamic obstacle avoidance of two mobile robots (R1 and R2 represent the starting positions of mobile robots, G1 and G2 represent the deployment targets of mobile robots, and the arrow symbol represents the moving directions of obstacles).

Meshed beach environment.

Deployment of mobile robots using D thr = 20in the beach environment.

Beach life-saving robots (green points denote starting positions, red points denote deployment targets, arrows denote the direction in which the obstacles moved, and the red circle denotes obstacle avoidance).
The wine cellar environment sensing robot is based on a test site constructed as a floor plan of a wine cellar in Beijing, China. We meshed the entered map, as shown in Fig. 15, and then assigned the mobile robot’s position using the proposed SCM (Fig. 16) before moving the mobile robot to the configured spot. Figure 17 (a) shows the positional relationship between the first robot and the obstacle. In Figs. 17 (b)–(c), (e)–(f), the red box shows that the robot is safe from moving obstacles. In Fig. 17 (d), R2 records the shortest distance for leaving the dead zone. Figures 17 (e)–(g) shows that R2 has entered the shortest distance range and successfully exited the infinite loop. An obstacle between R2 and the target was identified so it continued to follow the wall. Finally, all mobile robots in Fig. 17 (h) successfully arrive at the target point.

Meshed wine cellar environment.

Deployment of the mobile robots using D thr = 20 in the wine cellar environment.

Wine cellar environment sensor robot (green points denote the starting positions, red points denote deployment targets, arrows denote the direction in which the obstacles moved, and the red circle denotes obstacle avoidance).
This study used SCM and IT2FC for multi-robot deployment and navigation in dynamic environments. The reinforcement learning method enables the mobile robot to adaptively develop the controller without the need for experts to design rules and without additional paired training data. The proposed SCM is a one-pass algorithm that can dynamically evaluate the cluster entries in the input space and automatically determines the number of clusters. Additionally, the proposed IPSO introduces the concept of grouping to cluster, preserves the diversity of particles, improves the stability of convergence, and improves the local search. The experimental results were divided into two parts. The first experiment verified the effectiveness of the proposed IT2FC for a mobile robot WFB control. The experimental results show that the performance (the distance and time taken for a robot WF a circle) of the proposed IT2FC with IPSO in terms of WFB control is more efficient than that of other methods in unknown environments. The second experiment verified the performance of the deployment and navigation of multiple mobile robots. The three test environments were dynamic obstacle avoidance, a beach life-saving robot, and a wine cellar environment sensing robot. The experimental results show that the proposed method can effectively avoid moving obstacles to move the robot to the configuration point.
Footnotes
Acknowledgments
The authors would like to thank the Ministry of Science and Technology of the Republic of China, Taiwan for financially supporting this research under Contract No. MOST 107-2218-E-005-023.
