Abstract
Path planning is the basis and prerequisite for unmanned aerial vehicle (UAV) to perform tasks, and it is important to achieve precise location in path planning. This paper focuses on solving the UAV path planning problem under the constraint of system positioning error. Some nodes can re-initiate the accumulated flight error to zero and this type of scenario can be modeled as the resource-constrained shortest path problem with re-initialization (RCSPP-R). The additional re-initiation conditions expand the set of viable paths for the original constrained shortest path problem and increasing the search cost. To solve the problem, an effective preprocessing method is proposed to reduce the network nodes. At the same time, a relaxed pruning strategy is introduced into the traditional Pulse algorithm to reduce the search space and avoid more redundant calculations on unfavorable scalable nodes by the proposed heuristic search strategy. To evaluate the accuracy and effectiveness of the proposed algorithm, some numerical experiments were carried out. The results indicate that the three strategies can reduce the search space by 99%, 97% and 80%, respectively, and in the case of a large network, the heuristic algorithm combining the three strategies can improve the efficiency by an average of 80% compared to some classical solution.
Introduction
The path planning is becoming one of the key technologies of the unmanned aerial vehicle (UAV) autonomous control module [13] and the main purpose of drone path planning is to design a flight path directed to the target with the low-cost, i.e., the minimal probability of being destroyed with the UAV performance requirements [17]. Therefore, the path planning problem can be viewed as a complex optimization problem that requires efficient algorithms to solve [10].
The RCSPP is a classical problem in combinatorial optimization and there exists a part of the path planning problems can be modeled as the resource constrained shortest problem (RCSPP) where a quantity of resource in the network and the aim of RCSPP is to find a path with the smallest cost, such that resource consumption is less than the maximum amount of available resource [9]. The applications of RCSPP include robot global path planning [14, 18], intelligent transportation cloud computing system [4], UAV task path planning [10, 11] and so on.
When there are resource reset nodes in the network, the cumulative resources on the path are reinitialized to zero, and the problem is transformed into resource-constrained shortest path problem with re-initialization (RCSPP-R). Compared with RCSPP, there are relatively few literatures about RCSPP-R or its similarity. Cabral [2] considers the shortest path problem with relays (SPPR), which is similar to RCSPP-R, a path can accumulate no more than a given resource limit before replenishment must occur as while and uses the structure of feasible path, regards the optimal path as a series of sub paths between reset nodes, and decomposes SPPR into multiple sub SPP (shortest path problem). While the algorithm is found to be far less efficient due to the combination operation of sub paths. To solve similar problems, Laporte and Pascal [5] proposed a label correcting algorithm in the network where all costs and weights are non negative integers, which takes advantage of the characteristic that the number of labels required by any node does not exceed the number of resources plus one. Based on the assumption that the resource limit is relatively low, a matrix that can store tags to increase memory usage in order to achieve better time performance. In order to improve the time efficiency of the label correction method, Smith and Boland [12] proposed a standard preprocessing process and boundary calculation method to reduce the search space and improve the time efficiency of the algorithm without increasing the memory usage.
In this paper, there are two re-initiable resources (vertical and horizontal errors), reset nodes have associated constraints and only one resource can be re-initiated. Compared with the traditional RCSPP-R, the problem is more complicated because the reset of multiple resources expands the range of solution space and traditional solutions may get local optimal solutions. To effectively solve the constrained shortest path problem of multi resource reset, a heuristic based on the Pulse algorithm (PA) [6] and the branch bounds algorithm is proposed, which is called Pulse-R. The search space is initially reduced by first removing arcs or nodes whose cumulative resources do not satisfy the constraints through maximum resource capacity and re-initiation constraints. Second, considering the differences in the re-initiation constraints of the resources, the relevant nodes are pruned using an inverse search to calculate the boundary cost and the boundary optimal path after they are properly relaxed. In addition, to further improve the efficiency of the algorithm, a heuristic strategy is proposed to optimize the node expansion method. A large number of numerical experiments have been used to evaluate the stability and effectiveness of the proposed strategy, and the proposed algorithm has been compared with label correction method [12], ant colony algorithm [1] and the PA [6], indicating the effectiveness of the solution.
The main contributions of this paper can be summarized as follows:
•The existence of multi resource re-initialized RCSPP-R is studied. Compared with the traditional RCSPP-R, the problem is more complicated due to the mutual initialization of multi resources;
•In order to solve the problem of multi resource re-initialization RCSPP-R, a high performance algorithm based on pulse algorithm is proposed;
•A loop free pruning strategy is developed, and the special condition of the problem (re-initialization constraint) is included in the preprocessing to reduce the network size;
•A heuristic pruning strategy is proposed, which uses relaxed reverse search to pre order the subsequent extended nodes, so as to improve the search efficiency of the algorithm;
The remainder of this article is organized as follows. In Section 2, we describe a system model that is navigated by a drone in the presence of error correction, and in Section 3, we give specific design steps for an algorithm that can effectively solve such problems. In Section 4, some numerical experiments are presented to discuss the performance of the proposed algorithm. Section 5 discusses the results obtained and the paper concludes in Section 6.
System model and problem formulation
Path planning in a complex environment is an important problem of intelligent aircraft control. Due to the limitation of the system structure, the positioning system of UAV cannot locate itself accurately. Once the positioning error accumulates to a certain extent, the task may fail. Therefore, it is important and necessary for intelligent aircraft path planning to correct the positioning error during flight.
System model
As shown in Fig. 1, We consider a path planning system for UAV under the constraint of flight error, which consists of vertical error and horizontal error. Here gives some notation for mathematical describing of the problem. Starting node S, destination node E and N other error correction nodes denoted by s1, s2, ⋯ , s N . After setting s0 = S and sN+1 = E, all nodes in the system can be represented by A ={ s0, s1, s2, ⋯ , s N , sN+1 }. Without loss of generality, the nodes set can be denoted by the simplified index set Λ = {0, 1, 2, . . . , N + 1}.

Scene of path planning based on UAV.
Regarding the evaluation of one candidate flight path, it can generally be constructed from fuel cost C
f
and threat cost C
t
as follows [7, 15]:
Two types of nodes can correct only one of the horizontal error and vertical error independently. As shown in Figure 2, the horizontal error ɛ
h
can be corrected by the black nodes and the vertical error ɛ
v
can be corrected by the red nodes. ɛ
h
will increase by δ
h
after a flight over a unit distance and ɛ
h
will increase by δ
v
. For example, in the path fragment i → j, the cumulative horizontal error at node j (before correction) can be computed by plus a increment to the horizontal error at node i (after correction), i.e.

A two-dimensional case and the correction ranges.
After necessary error correction in some nodes, the UAV can arrive the destination with the error constraints satisfied. However, there are some difficult in searching a shortest path with such a re-initialization condition. First, the constraints formulation are related to the nodes type and nodes serial in a candidate path. They are changed with path evolution. Second, the additional re-initialization role expands the feasible path set and some traditional algorithms can’t achieve expected efficiency even lead to fail. We will give the solution in the following sections.
The goal is to find a shortest path with the error constraints satisfied in the UAV system. After introducing some binary constants T i , the related optimization problem, which is based on [3, 16], can be mathematically formulated as
The above optimization problem is very different from the traditional constrained path planning because the resource (i.e. flight error) in the problem can be re-initialized, and the constraints are linked to the accumulated resource consumption which is changed with the path evolution. In order to solve this problem effectively, we introduce a heuristic strategy and a relaxation pruning strategy to the Pulse algorithm [6].
The Pulse algorithm [6] only propagates one pulse at each moment, and another pulse starts when it reaches the end or is cut by the pruning strategy. Fig. 3 shows the flow of the Pulse algorithm. In order to reduce the search space, three pruning strategies are used: (1) infeasible pruning is used to judge whether the current path is valid; (2) Bounds pruning narrows the search range of the algorithm through the current optimal cost; (3) Rollback pruning is used to reevaluate whether the previous node is selected correctly when the algorithm is extended to a certain point. While the boundary pruning strategy in the Pulse algorithm only aims at the monotonic increase or decrease of accumulated resources, so the traditional Pulse algorithm can not solve the optimization problem (2).

The flow chart of the Pulse algorithm.
To distinguish it from the original algorithm, it is named Pulse-R for the proposed algorithm. The algorithm consists of two phases: (1) in the delimitation stage, the lower limit of path length from each node to the destination node can be calculated by relaxation constraints; (2) in the recursive exploration stage, the pulse is transmitted recursively based on depth-first search until the pulse extends to the end or is cut off. In the recursive exploration phase, the pulse will store the current optimal solution of each node, avoiding the need to store all nodes’ information. To reduce the search space, different pruning strategies are used on each node to try to prevent pulse propagation. Algorithm 1 outlines the Pulse-R algorithm.
Lines 1 to 3 initialize the path
The recursive calling procedure for Pulse-R is exhibited by the algorithm 2. IsFeasible and IsBounds, respectively, representing two pruning strategies (see 3.3). The re-initialize in row 4 represents a correction for the accumulated flight error. The Pathacceleration function is represented as a heuristic policy using the lower bound path obtained in the boundary policy for node expansion (see 3.5). Lines 13-14 make a deep recursive extension to the rest of the child node k calling algorithm 2 for node j, where
Preprocessing
The process will construct a distance adjacency matrix
From constrains (5), Specifically, the distance between node i, j must satisfy one of the following inequalities
Moreover, there will be a cycle in the viable path since flight error can be re-initiated. Obviously, this is not the optimal solution. To avoid the cycle, a no fallback strategy is proposed, which constructs an adjacency matrix by comparing the direct distance between the node and the starting node S. For example, as Fig. 4, an adjacency matrix of node i was constructed that satisfies the equations (9) and (10) and satisfies the assumption that the child nodes are jand k. The distance from the node i to S is expressed as ξ (i), and correspondingly, for the nodes j and k, the distance to S is ξ (j), ξ (k). If ξ (i) > ξ (j), it means that i to j are regressive relationships (in Fig. 4). To avoid cycle, set

Two typical non optimal path segments.

Original and relaxed correction constraint.
IsFeasible pruning. Discarding infeasible local paths is a commonly used method in labeling algorithm [8]. When a partial path reaches the node i and prepares to expand the node j, the strategy will check whether the node not worthy to be expanded and whether the capacity limit for resource accumulation is exceeded. If any of these events happen, the part is not feasible because it can be safely pruned. Moreover based on the branch and bound principle, pruning the feasible solution that does not satisfy the current optimal solution can reduce the search space of the solution. This pruning looks for the node which worthy of being expanded, and it checks the cost c′ (j) which between j and the starting node S is less than the current optimal cost c (j). If c′ (j) < c (j), the node j is reserved.
IsBounds Pruning. To restrict the search space more strictly, a boundary policy verifies whether the current node is worth expanding. Specifically, the partial path distance at the node j is c (j), and the boundary distance (the minimum path distance from node j to E)denoted by
Bounds pruning
A bounder strategy that is based on lower bounds on the cost, is proposed. This boundary strategy stores the lower cost of the path length of each node, which is realized by relaxing the constraint condition of each correct node. That means that the cumulative error satisfies one of the following inequalities at node j (assuming that the horizontal error can be corrected, the same is true in other cases),
In the boundary pruning strategy proposed by Lozano et al. [6], only the lower bound of the path cost
Since the lower bound path
For example, in Fig. 6, the dashes line as the shortest path. The node expansion way of traditional algorithms is shown in Fig. 6(a), expanding the children of each node in turn. The darker nodes in the graph represent the node extension of traditional algorithm, and it can be observed that almost all the nodes in the network are visited. The node extension range the proposed algorithm based on the heuristic strategy are represented by Fig. 6(b), where the dash-dotted line is the lower bound path corresponding to the starting node S. The algorithm scales along this path, expanding its children in the traditional way of node expansion if there is a node that does not satisfy the constraints (4), (5). The optimal path can be obtained quickly without extensive access to the nodes in the network.

Node extension for traditional and heuristics.
In this section, the proposed solution strategy will be evaluate and all algorithms have been coded in Matlab and the numerical results are carried out on an Intel(R) Core(TM) i5-1135G7 2.40 GHz 16.00GB RAM machine under Microsoft Windows 10. All computational times are reported in seconds.
Algorithms
The following seven algorithms will be used in comparative experiments.
LC denotes the label-correcting algorithm proposed in [12]. PA denotes the Pulse algorithm proposed in [6]. AC denotes the ant colony algorithm proposed in [1]. PRW solves the problem with Pulse-R does not use preprocessing and any pruning strategies in section 3. PRP solves the problem with Pulse-R uses preprocessing in section 3.2 but without uses any pruning strategies in section 3. PRB implements the preprocessing and bounds pruning in sections 3.2,3. PR denotes the complete Pulse-R algorithm which combines preprocessing, boundary strategy and heuristic search in section 3.5.
In order to make a fair comparison between the solution strategies under consideration, we use the same data structure to implement and modify them appropriately to solve the problem.
Instances
In order to construct the data set more objectively, the instances generation method similar to [12] is used to carry out the numerical experiments. The instances set used in the experiment is randomly generated and the three-dimensional coordinates of each node, and the range of x-axis, y-axis and z-axis are [0,100000]. Based on the experimental design in Smith[12], The resource re-initialized constraint is generated by 13 and the constraints used for the experiments can be viewed by Table 1. To verify the influence of different types of re-initialized node distributions on the proposed algorithm, the network is constructed by using re-initialized nodes based on different distributions, and each network is randomly generated by section 4.1. In the network with the same number of nodes corresponding to different constraints, only the correction type of nodes is changed. The nodes are respectively uniformly distributed (1, N), normal distribution (N/2, N/4) and the exponential distribution (N/2), where N is the number of network nodes.
The constrain of network
The constrain of network
The coordinate of the starting node is (0,50000,5000) and the destination node is (100000,60000,5000). At the same time, setting the upper limit of the algorithm’s time consumption to 300 seconds, if the time consumed by the algorithm exceeds this upper limit, it is considered that the algorithm has no solution. The cost of each arc is the Euclidean distance and the α- = 15, α+ = 25, β+ = 20, ς- = 10, ς+ = 25, θ+ = 30 and the correlation coefficient δ v = 0.001, δ h = 0.001.
In order to comprehensively test the effectiveness of the proposed strategies. This section will test the performance of PRW, PRP, PRB and PR. The instances are randomly generated from Section 4.1.
Table 2 shows the average execution time(CPUs) of each algorithm in different networks, where the preprocessing time in PRW is to calculate the cost between any two nodes, without boundary computation time. Pretreatment times from Section 3.2 are used as included in the PRP, and boundary calculation times from Section 3 are also included in the PRB. PR contains the time for the preprocessing, boundary calculation, and heuristic strategies involved in this paper.
Average time for Algorithm 1 with various pruning strategies
Average time for Algorithm 1 with various pruning strategies
On average, the preprocessing process is very useful in improving the efficiency of the algorithm. In fact, we can conclude that almost all PRB calculations in Table 2 take more than 10 seconds, while PRP is only 1.5 seconds at most. With the increase of the number of network nodes, the calculation time of PRB almost presents an exponential growth trend. In addition, the average computing time of PRB is less than 0.15 seconds, which indicates that the addition of boundary strategy can further reduce the time consumption of the algorithm. At the same time, the time consumption of PR integrating all strategies is lower, even to the constraint V, when the number of network nodes is 1000, it only consumes 0.2 seconds, which is 1500 times higher than the PRW efficiency. On the other hand, the performance of PR algorithm under three different distributions is almost the same.
To better highlight the benefits of each strategy in the article, Figures 7, 8 and 9 respectively show the percentage of node extensions of the algorithm using preprocessing, boundary policy and heuristic search strategy ψ, that is

Node extensions comparison (PRW vs PRP).

Node extensions comparison (PRP vs PRB).

Node extensions comparison (PRB vs PR).
Furthermore, to evaluate the effectiveness of PR algorithm under different cost indicators, six test sets were randomly generated in a fixed network with constraint II, number of network nodes 800 in which the cost weight in each test set obeyed (0.3, 0.05) normal distribution.The convergence curve of the PR algorithm is shown in Fig. 10. it can conclude that the PR algorithm has a better convergence rate, and all of the PR algorithms in six instances can converge near the fixed cost and indicating the effectiveness of the proposed algorithm.

Comparative convergence curves of algorithms.
To test the effectiveness and accuracy of the proposed algorithm, the performance of LC, PA, AC and PR will be calculated in this section, and the instances are randomly generated from Section 4.1.
Table 3 shows the average execution time (CPUs) of LC, PA, PR and AC in different networks, including preprocessing and boundary calculation time. It can be concluded from the table that in addition to AC, the other three algorithms consume less time, which indicates that the scale of network nodes is within a certain range, and the swarm intelligence algorithm has time efficiency compared to the deterministic algorithm. In addition, PR and PA are based on the depth-first search idea. When the number of nodes is small (such as 600 or less), the time consumption of LC will be better than the two, but as the number of nodes increases, the effect of the boundary strategy begins to gradually Increase, so that the time consumption of PR and PA when the number of nodes is more than 800 is less than that of LC. On the other hand, the tests of different strategies in Section 3.5 show that it is effective in improving the efficiency of the algorithm, so the calculation time of PR is smaller than that of PA.
Average time for various algorithms in different constraint
Average time for various algorithms in different constraint
In order to show the time consumption of the four algorithms more clearly, Figures 11, 12, and 13 are based on the time consumption of PR and calculate the time efficiency improvement Φ between each other, that is

Average time comparison (LC vs PR).
The comparison between the AC algorithm and the proposed algorithm is shown in Fig. 12. Due to the intelligent algorithms such as AC are solutions for very large networks, AC algorithms may be better than Pulse-R. While the heuristic methods improve the computational efficiency in large networks, the reduction percentage of the time between Pulse-R and AC fall slowly. Fig. 12 can show that when the number of nodes is less than 1000, the time consumed by the Pulse-R algorithm is reduced by more than 99% compared to the AC algorithm.

Average time comparison (AC vs PR).
In Fig. 13, due to the existence of a heuristic search strategy, and the existence of a second inspection and correction process for feasible paths in PA, the time efficiency of PR is higher than that of PA when the same preprocessing and boundary strategies are used. This shows that the heuristic search strategy has a better effect than the PA rollback strategy.

Average time comparison (PA vs PR)
To evaluate the convergence and convergence speed of the proposed heuristic, experiments were performed in a instances with a network node number of 600 and constraint III, compared with LC, PRW, PRP, PRB and PR.
The convergence curves of the six algorithms are illustrated in Fig. 14. Form these curves we can see that the best convergence rate belongs to the PR. The PRW has a poor convergence rate, and the convergence effect of PRP, PRB and LC are preferable but worse than the proposed algorithm. Moreover, it can be found by PRW that the solution space is huge when no pruning strategy is adopted, and although LC can quickly converge to the optimal solution, not quickly obtaining the feasible solution is its inadequacy.The convergence rate of PRP and PRB can conclude that the preprocessing and boundary strategies are effective, but there is a problem of insufficient efficiency in the late stage of search, while the incorporation of heuristic search strategy (PR) can avoid redundant search in the late stage and improve the efficiency of the algorithm.

Comparative convergence curves of algorithms.
The resource constrained shortest path problem with reinitilization has demonstrated its wider application in Smith [12]. Because there are resources to reinitiate the conditions, making the problem more complex while also expanding the search for viable solutions, the solution solving efficiency is tested.
Algorithms such as LC based on breadth first search (BFS) can be very effective in solving the shortest path problem, while its cost update on destination node is often at the end of the algorithm iteration, making the effect of some pruning strategies less obvious, such as the time consumption when the number of nodes is 1000 in Table 3. While due to the idea of DFS, PR, PA, etc. can combine the pruning strategy to delete redundant nodes, so that the efficiency of the algorithm can be greatly improved. It also shows that the efficiency of algorithms such as PR, PA is very dependent on the relevant pruning strategy.
PA has a better effect in larger scale networks [6], the reason for improving the boundary strategy is due to the large amount of computational time that needs to be consumed for the boundary calculation of this algorithm. Based on the idea of boundary pruning from the original algorithm and relaxing the constraints to obtain the lower boundary cost, the constraints on the lower boundary cost are less effective but the time consumed is significantly less compared with the traditional pulse algorithm. Meanwhile, the improved algorithm utilizes the heuristic search strategy of lower bound path to preorder the intended extended nodes, which can quickly exclude the nodes under increasing a certain memory demand, thus significantly improving the solving efficiency of the improved algorithm by up to 90%, compared with the original algorithm. At the same time, all of the improved algorithms have so elevated that they are largely based on boundary cost calculation, so how to obtain a better boundary cost while not significantly increasing the computational cost is a direction that needs to be studied in the future to improve the algorithms.
In summary, from the computational results, it was shown that the time consumption of PR did not significantly increase when the number of network nodes increased gradually, especially when the number of re-initialization nodes increases or the associated constraints are more stringent. Several pruning strategies and heuristic node expansion strategies are major contributions to this advantage. This study shows that improved algorithms are well suited to solve the problems.
Conclusion
In this paper, the UAV path planning problem with flight error correction is investigated, which can be modeled as RCSPP-R. The re-initiation condition of the resource expands the conventional feasible path set for constrained shortest path problems, increasing both the problem solving difficulty and the search cost. To solve such optimization problem, a heuristic algorithm (PR) based on PA [6] and branch bounds algorithm is proposed. In preprocessing, arcs and nodes that are not part of any feasible solution are removed by resource capacity and reinitiation constraints. The boundary clipping strategy reduces the redundant computation of the algorithm by calculating the effective lower boundary cost of relaxing the reinitiation constraint, and subsequently, the heuristic strategy preorders the extended nodes to improve the efficiency of the algorithm.
The instances are randomly generated by taking a similar approach to Smith [12], and the computational results show that the Pulse-R algorithm has a good performance of preprocessing and boundary pruning, with up to 99% and 97% of arcs and nodes can be removed with PRP and PRW, respectively. Compared with the pulse algorithm, the heuristic strategy based on the effective lower bound path, although increasing a certain degree of memory use but reducing the number of node extensions by up to 90%, greatly improves the time efficiency of the algorithm and the more obvious it is when the number of nodes of the network increases gradually. Through the experimental results with the label correction method, improved ant colony algorithm and pulse algorithm, etc., the proposed algorithm can solve the problem on the same scale in a shorter time, and the time efficiency can be improved up to 85.6% and 99%, respectively, which can provide a more effective solution in a network graph with thousands of nodes.
Footnotes
Acknowledgement
The authors would like to thank the editor and referees for their valuable comments and suggestions that help to improve this paper. This work is supported in part by Sichuan Science and Technology Program (No. 2021YJ0084) and the Research Project (G202012) of State Key Laboratory of Oil and Gas Reservoir Geology and Exploitation (Southwest Petroleum University).
