Abstract
Among the indoor localization algorithms, the algorithm based on traditional Back Propagation Neural Network (BPNN) has the problems of slow convergence and easy to fall into local optimum. It is difficult to apply the algorithm in noisy environments. Therefore, in this paper, we propose a novel indoor localization algorithm where the whole localization process is divided into two parts: data preprocessing and localization output. Data preprocessing means using filtering algorithm to process the Received Signal Strength Indication (RSSI) sequence. It is considered that the initial value of the received sequence has a significant impact on the performance of Kalman Filter (KF). An improved Kalman Filtering algorithm (DBSCAN-KF) is proposed based on the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm. First, the RSSI values that are seriously disturbed by noise in the sequence are removed using the DBSCAN algorithm, and then the RSSI sequences are processed using KF so that the RSSI values can be closer to the theoretical values. The localization output part is to reduce the localization error caused by the BPNN. In this paper, the Differential Evolution (DE) algorithm and Particle Swarm Optimization (PSO) algorithm are combined, and the Differential Evolution Particle Swarm Optimization (DE-PSO) algorithm is proposed. The BPNN weights and thresholds are optimized in parallel, which improves the speed and ability of global optimization search and further avoids the shortcomings of traditional BPNNs that are prone to fall into local optimization in the training process. Experimental results show that the BPNN localization algorithm based on DBSCAN-KF improves the average localization accuracy by 0.26m compared with the BPNN localization algorithm without filtering. After filtering, the localization algorithm based on DE-PSO improved BPNN (DE-PSO-BP) improves the average localization accuracy by about 24% compared with the localization algorithm based on DE-PSO-BP. The localization algorithm based on DE-PSO-BP improves the average localization accuracy by about 61% compared with the traditional BPNN.
Introduction
In recent years, with the rapid development of Internet of Things technology, indoor localization technology based on Internet of Things has become a new research direction [1]. There are various technologies to realize indoor localization, such as WiFi fingerprint [2–4], radio frequency identification [5], magnetic field [6], trilateration algorithm [7], channel estimation [8], internal measurement unit [9], etc. Among them, in the indoor environment, WiFi is widely deployed and the cost is low, so the indoor fingerprint localization algorithm based on WiFi Received Signal Strength Indication (RSSI) is the most common. In addition, with the rapid advance of artificial intelligence, neural networks have been widely used in the field of indoor localization technology. Due to its strong nonlinear learning ability, it can effectively learn the signal intensity characteristics in space and realize the pre-estimation of the location to be measured. Because the structure of Back Propagation Neural Network (BPNN) is simple and easy to realize, indoor localization technology based on BPNN has been widely used [10].
Reference [11] proposes an indoor localization algorithm for WiFi signals based on BPNN. First, the location area is sampled to obtain the fingerprint database, and then the BPNN is trained to output the location model to realize indoor localization. Reference [12] uses BPNN and probability method to locate WiFi fingerprint. Neural networks can learn the complex relationship between inputs and ideal outputs independently, and the experiment proves the effectiveness of this method. However, Reference [11] and [12] do not consider the influence of environmental noise on RSSI, which will reduce the localization accuracy. Reference [13] proposes to use Kalman Filter (KF) to preprocess RSSI data, which reduces the effect of environmental noise on RSSI and establishes a more accurate fingerprint database, thus improving the localization accuracy. However, a mutation in the initial value of the sequence can have a significant impact on KF, and stabilizing the initial value of the sequence can improve the effect of KF [14]. It is easy to fall into local optimum in the process of training BPNN [15]. In Reference [16], the Particle Swarm Optimization (PSO) algorithm is used to improve the BPNN, and in Reference [17], the Differential Evolution (DE) algorithm is used to improve the BPNN, which prevents the BP neural network from falling into local optimum to some extent. But BPNN may also fall into local optimum in the training process.
By summarizing the advantages and disadvantages of the proposed methods in the above documents, it is found that the lack of indoor localization accuracy is mainly caused by two aspects. One aspect is the localization error caused by fluctuations in RSSI values due to environmental noise. On the other hand, BPNN failed to find the globally optimal weights and thresholds during training, resulting in localization errors. Therefore, in this paper, we improve the data filtering and BPNN models. Considering the influence of sudden change of sequence initial value on KF, the combination of Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm and KF algorithm (DBSCAN-KF) is used. First, the RSSI values seriously disturbed by noise are removed by the DBSCAN algorithm, and then the RSSI sequences are processed by KF. This brings the filtered RSSI sequences closer to the theoretical value and thus reduces the noise-induced localization error. To further avoid the BPNN falling into local optimum, the PSO and DE algorithms are used to optimize the BPNN in parallel to make the localization accuracy more accurate. Experiments show that the proposed algorithm is more suitable for noisy environments and has higher localization accuracy.
The paper is structured as follows: Section 2 describes that the proposed algorithm is divided into two parts: data preprocessing and localization output, as well as the whole process of algorithm processing. Section 3 details the algorithm used in data preprocessing. Section 4 details the algorithm used in the localization output part. Section 5 describes the experimental process and analyzes the results. Section 6 summarizes the advantages and contributions of the proposed algorithm.
DBSCAN-KF location model based on DE-PSO-BP algorithm
As shown in Fig. 1, the entire localization algorithm processing process is divided into two parts: data preprocessing and localization. Clustering and Kalman filtering algorithms are used in the data preprocessing part, and DE-PSO and BPNN are used in the localization part. In the following two subsections, data processing is involved and the algorithms used are detailed.

Complete structure diagram of indoor location model.
The DBSCAN-KF algorithm is used to filter the received RSSI, and then the DE-PSO algorithm is used to iterate the weights and thresholds of the BPNN. Finally, the trained BPNN is applied in the localization test. Figure 2 shows the full structure of the proposed localization algorithm. The algorithm is divided into two phases: training and location. The training process consists of three stages: preprocessing RSSI with DBSCANKF, initialization of BPNN with DE-PSO, and training of BPNN. In the localization phase, the indoor location model generated in the training phase will be used. Note that the location phase also requires the same preprocessing as RSSI.

Complete structure diagram of indoor location model.
KF algorithm
KF is a filter estimation algorithm that combines priori experience and actual measurements. The whole algorithm is divided into two phases, forecast and update. The specific formulas are as follows:
Forecast:
Update:
In the formulas,
The KF algorithm execution program diagram is shown in Fig. 3. Note that when the algorithm starts to execute, the parameters

Kalman filtering algorithm flow chart.
When the initial value error is too large, the KF algorithm will be inaccurate. The DBSCAN algorithm can remove the data with large deviations in the sequence and make up for the shortcomings of KF.
Let’s briefly describe the DBSCAN process [18]. At the beginning of the algorithm, all objects in the dataset are marked as “unvisited”, and then randomly select an unvisited object p, calculate the number of object m in the neighborhood of p, mark p as “visited”. If m < MinPts, p is marked as noise, otherwise a new cluster C is created, all objects in a ɛ neighborhood of p are put into candidate set N. Objects in the set that do not belong to other clusters are added to C, and is also marked as “visited”. In this process, calculate the number of objects in the neighborhood of each . If , add the objects in the ɛ neighborhood of to set N until the objects in C no longer increase. If N is an empty set, output C. Repeat the above process until all objects are marked as “visited”.
Where, ɛ is the neighborhood radius and MinPts is the clustering threshold.
In this paper, the clustering result should be only one class. In order to ensure that the number of dimensions of the input and output data is the same, the arithmetic mean of the clustering results is taken to fill the filtered data. After the DBSCAN algorithm, the abrupt data will be deleted. Then, the data with abrupt noise removed is fine-tuned by the KF algorithm to further restore the real data.
BPNN based on DE-PSO algorithm
Hybrid algorithm based on PSO and DE
PSO algorithm
In the traditional PSO algorithm, some particles are initially generated. X
i
= (xi1, xi2, ⋯ , x
iD
) is the position vector of the ith particle, V
i
= (vi1, vi2, ⋯ , v
iD
) represents the velocity vector of the ith particle, where D represents the particle space dimension. The optimal position of the ith particle is called the individual optimal position P
best
. The optimal position searched by the group is recorded as the global optimal position P
g
[19–21]. Its speed and position update formulas can be expressed as (3) and (4).
DE algorithm is an efficient global optimization algorithm, which contains fewer control parameters, simple structure and fast convergence speed. For most nonlinear and multi-objective optimization problems, it has good reliability and applicability [22]. Randomly generate NP parameter vectors with a uniform probability distribution as the initial population of each generation by (5).
Mutation operation. Randomly select three different individuals in each generation of the initial population and the variation vector is generated by (6).
Cross operation. In order to enrich the diversity of the population, the objective vector and the variation vector are binomially crossed to generate a crossed population.
Select operation. One-to-one selection is made between the test vector and the target vector, and the vector with the better fitness value is reserved for the next generation. The selection operation expression is (8).
The mutation and crossover steps of the DE algorithm enhance population diversity. However, the implementation power of mutation and crossover operation comes from the difference between individual vectors. At the late stage of iteration, the information difference between individual vectors gradually disappears, resulting in a slow convergence rate in the late search process. The combination of PSO and DE can enhance the diversity of the population and improve the ability of individuals to jump out of the optimal local solution. The two algorithms have low complexity and easy to program, and mixed processing will not increase the amount of computation in the iterative process. DE and PSO are divided into two parallel iterative subpopulations in the initial structure of the hybrid algorithm. Each iteration, find the optimal fitness value of particle population and the optimal fitness value of difference particle population. The third part of the PSO velocity formula expresses the particle’s ability to approach the optimal individual of the global population.
In order to enhance the particle’s ability to learn from the best individuals in the two subpopulations, the optimal fitness values of the two subpopulations are determined each iteration. If the optimal fitness value in the differential population is smaller than the optimal fitness value in the particle population, the optimal position Pg in the third part of the velocity update (3) is replaced by the optimal position individual Xg in the differential population, which is used to guide the particles to approach the optimal individual in the global population. The flow of the hybrid algorithm is shown in Fig. 4. The particle velocity update formulas in the hybrid algorithm are (9) and (10).

Hybrid algorithm flow of particle differential evolution.
If xbest < gbest , then
If x best ⩾ g best , then
BPNN algorithm
BPNN is a multi-layer feedforward neural network. The basic idea of its algorithm consists of forward propagation of signal and backward propagation of error. The error is minimized by constantly changing the weight and threshold of BPNN. Forward propagation is the propagation process of signals from the input layer through the hidden layer and finally to the output layer, while reverse propagation is the opposite. In this propagation process, the initial weights and thresholds of each BPNN layer are constantly adjusted to achieve the purpose of training the BPNN [25]. Figure 5 shows a schematic diagram of the BPNN double hidden layer.

BP double hidden layer feed-forward network.
In order to express the back propagation process conveniently, a cost function is defined as (11).
Firstly, the initial weights and thresholds of BPNN are normalized. Then, the DE-PSO algorithm is applied to iteratively optimize the position of the optimal particle. The optimal particle position is used as the initial weight and bias of the BPNN training. Finally, the training model is output and applied to RSSI positioning. Figure 6 shows the DE-PSO-BP algorithm.

Flow chart of DE-PSO-BP network model.
Simulation conditions and sample set composition
The performance of the location algorithm is tested by simulating the real situation. A logarithmic path loss model is used as in (13).
In this paper, A = -41dB, n = 4. A virtual environment of 10m × 10m is generated by MATLAB software and the distance between fingerprint points is 1m. The wireless access point (AP) adopts the arc triangle layout model [26], the distance between adjacent APs is 5m, and the area surrounded by three arcs is an arc triangle. AP layout is shown in Fig. 7. Remove the location point where the AP is located, there are 118 fingerprint points.

Schematic diagram of AP location.
Before the experiment began, we simulated the theoretical curve of RSSI value changing with distance and the actual curve affected by noise, as shown in Fig. 9. In order to bring the actual measured value closer to the theoretical value, the traditional KF and the improved KF algorithm were used to denoise and correct the RSSI value in the first experiment, and the performance of the two filtering algorithms was compared.

RSSI logarithmic path loss model.

KF algorithm curve graph.
The RSSI value is sampled 8 times for a certain position repeatedly, and the KF algorithm is used to filter the sampled RSSI sequence. In Fig. 9(a), the performance of the KF algorithm is shown. The “sampling/times”ž on the X axis indicates the number of times a site is repeatedly sampled. Where “Actual”ž represents the actual unfiltered measured signal curve. The “Actual After KF”ž represents this signal curve after KF. The “Theoretical”ž represents the theoretical RSSI value without any noise. It can be seen that the “Actual After KF”ž curve after KF fluctuates less, and has smaller errors than the actual measured value. Therefore, the KF algorithm can effectively correct the original measured value and reduce the error. However, there are some abnormal data in the experiment. As shown in Fig. 9(b), the RSSI curve after KF has larger errors, which we don’t want to see.
When the error of the initial RSSI value is relatively large, the filtering result will be more deviated. This is because initial value will be set in the Kalman modeling, and the abnormal initial value will lead to inaccurate correction in the filtering algorithm. Therefore, the DBSCAN method is introduced in this paper to eliminate filtering errors caused by burst factors.
Performance of DBSCAN-KF
Multiple sets of RSSI values are collected from seven different location points, and abnormal values are classified and eliminated by the DBSCAN algorithm. Figure 10 shows the results of the DBSCAN of the RSSI sample dataset.

Comparison of clustering results under different parameters.
On the x-axis, the “Location/Points”z represents the number of location points. When ɛ is equal to 0.2, 0.3 and 0.5, respectively, the outliers obtained by clustering results account for 33.9%, 14.2% and 0% of the total value respectively. Too much abnormal data or too little abnormal data does not fit our experimental background, so when the input value is in the neighborhood ɛ = 0.3 with the minimum number M = 3, the clustering effect is better.
Figure 11 shows the influence of the DBSCAN algorithm on the filtering effect. As can be seen from the figure, the abrupt point after the DBSCAN algorithm is removed, and the average signal value is added to the eliminated point, where “actual after DBSCAN” represents the RSSI curve after the DBSCAN algorithm. Figure 12 shows the performance of the DBSCAN-KF algorithm. DBSCAN-KF is the result of KF based on the DBSCAN algorithm. KF is the result of direct KF. The figure shows that the fluctuation range of the RSSI curve is smaller and the error is smaller after the DBSCAN algorithm and the KF algorithm. Compared to other algorithms, DBSCAN-KF has the smallest fluctuation range. It can also be seen from the figure that using KF alone cannot reduce the error when the initial value deviates too much. Therefore, the DBSCAN algorithm has a good effect on KF and plays an important role in filtering and correcting RSSI values.

Denoising performance of DBSCAN algorithm.

Denoising performance of DBSCAN-KF algorithm.
Use the KF algorithm and the DBSCAN-KF algorithm to denoise the RSSI dataset, respectively, and average the denoised data. The error formula is (15). Table 1 shows the denoising effect of clustering algorithms at different distances. It can be clearly seen that the RSSI containing noise is closer to the theoretical value after algorithm filtering, and the farther the distance is, the better the denoising effect will be.
The denoising effect of the DBSCAN-KFalgorithm
The denoising effect of the DBSCAN-KFalgorithm
The ultimate goal of filtering RSSI value is to reduce positioning error, so the DBSCAN-KF algorithm is added to BPNN to test the positioning performance of this algorithm. Location error is shown in Fig. 13, where OG is the original unfiltered location error. It can be seen that the filtering algorithm can improve the positioning accuracy. The improved KF algorithm has lower loaction error and higher accuracy.

Comparison of DBSCAN-KF algorithm localization error.
DE-PSO algorithm is introduced to optimize BPNN, which mainly solves the problems of slow convergence speed, many iterative steps, and easily falls into the local optimal solution.
DE-PSO algorithm initialization
According to the BPNN topological structure diagram, it is determined that the dimension of particles is 132, the population number of particles is 20, the learning factors c1, c2 are all 1.494, w is set as the nonlinear inertia weight, and the speed and position of each particle are between [-1,1]. The mutation operator F is 0.5, and the cross probability CR is 0.9.
Influence of DE-PSO algorithm optimizing BPNN on positioning error
Randomly generate 20 real localization points, and the generated localization points may coincide. BP algorithm, PSO-BP algorithm, DE-BP algorithm, DE-PSO-BO algorithm are used for localization. The deviation of the location point from the real point is shown in Fig. 14, where the red diamond indicates the real position. It can be seen that PSO algorithm and DE algorithm make up for the deficiency of BPNN to some extent, but obviously DE-PSO-BO algorithm has higher positioning accuracy. Figure 15 shows the error curves of four positioning models. Table 2 shows the maximum positioning error, minimum positioning error and average positioning error of four positioning models. It can be concluded from Fig. 15 and Table 2 that BPNN improved by DE-PSO algorithm has better performance in indoor positioning.

Localization errors of four models.

Localization error curve.
Localization errors of four algorithms
Figure 16 shows the iteration time and corresponding fitness values of three improved BPNN algorithms. It can be seen that DE-PSO algorithm has better performance in optimizing BPNN.

Iterative curves of three algorithms.
The experimental scenario and environment are shown in Fig. 17(a), and the whole laboratory area is m × 14m. The localization system adopts ZigBee (XBee S2C) WSN, which consists of seven nodes: seven beacon nodes and a mobile location point. The beacon nodes have been fixed at each corner of the building ceiling. The mobile location point is connected to the laptop via a USB cable and powered by the laptop. The laptop uses X-CTU software to record mobile localization points and collect RSSI samples of beacon nodes. Figure 17(b) shows the connection between the laptop and the localization system. Figure 18 shows the top view of the experimental scenario and the layout of the beacon nodes. The red dots represent the reference nodes separated by a distance of 2m.

Actual measurement scenariome.

Top view of experimental scene.
The signals obtained after clustering Kalman filtering are sequentially localized by the PSO-BP, DE-BP and DE-PSO-BP algorithms, and the localization effect is shown in Fig. 19. As can be seen from Fig. 19, the localization accuracy based on DE-PSO-BP algorithm is obviously higher than the other two localization algorithms, reaching an average error of 0.95m. This is a 24% improvement over the localization algorithm using DE-BP only, and a 25% improvement over the localization algorithm using PSO-BP.

Algorithm performance comparison figure.
In this paper, we propose a DE-PSO-BP indoor localization algorithm based on improved filtering. The whole indoor localization process is divided into two parts: data preprocessing and localization output. In the data preprocessing part, the problem of localization error due to fluctuations in the RSSI values is addressed by filtering the RSSI values using the DBSCAN-KF algorithm, which brings the filtered RSSI values closer to the theoretical values. In the localization output part, BPNN converges slowly and easily falls into local optimum. In this paper, we combine the DE and PSO algorithms to search for the weights and thresholds of BPNNs in parallel, which greatly improves the optimization speed and accuracy of BPNNs. Simulation results show that the proposed algorithm can significantly improve the accuracy of indoor person localization even in noisy environments. Among them, the BPNN localization algorithm based on DBSCAN-KF improves the average localization accuracy by 0.26m compared with the traditional BPNN localization algorithm. After RSSI filtering, the average localization accuracy of the localization algorithm based on DE-PSO-BP is improved by 0.12m compared with the localization algorithm based on PSO-BP. It achieves a 61 percent improvement over the traditional BPNN localization algorithm.
Acknowledgment
We are grateful to the Project of Young YuYou of North China University of Technology, 2018 and the Research on Key Technologies of accurate positioning of BINGTUAN transportation key vehicles and infrastructure based on Beidou under Grant 2108AB028.
