Abstract
To address the problem that the population diversity of the sparrow search algorithm (SSA) decreases as it approaches the global optimum, so that it easily falls into a local optimum, this paper proposes an artificial immune algorithm-sparrow search algorithm (AIA-SSA) that combines the artificial immune algorithm with the sparrow search algorithm. AIA-SSA is evaluated on 10 benchmark functions and compared with five widely used intelligent algorithms and with SSA. The experimental results show that AIA-SSA overcomes this deficiency of SSA and improves search accuracy, convergence speed, and stability. AIA-SSA is also applied to network intrusion detection, where it is used to construct a detection model based on the support vector machine (SVM). In testing, the prediction accuracy of AIA-SSA-SVM for various network attacks is greatly improved. This shows that AIA-SSA-SVM has broad application prospects in the field of network security, and it also verifies the feasibility and advancement of AIA-SSA in solving practical engineering problems.
Introduction
Bionic swarm intelligence optimization algorithms have developed rapidly in recent years. Examples include Ant Colony Optimization (ACO) [1], Cuckoo Search (CS) [2], Grey Wolf Optimization (GWO) [3], the Whale Optimization Algorithm (WOA) [4], Particle Swarm Optimization (PSO) [5], and the Sparrow Search Algorithm (SSA). These algorithms are based on the behavior and habits of various creatures and are applied to find optimal solutions of complex problems. A swarm intelligence algorithm can only solve constrained optimization problems, because the size of the search space must be specified; the core idea is to find the optimal solution within a given range of the search space. SSA was proposed in 2020 as a new swarm intelligence optimization algorithm [6].
The algorithm is applied to mathematical problems by imitating the foraging behavior of sparrows in nature. Compared with other intelligent optimization algorithms, SSA is characterized by high search accuracy, fast convergence, stability, and robustness [7]. However, it shares a problem common to swarm intelligence algorithms: when the search approaches the global optimum, population diversity tends to decrease, which makes the algorithm prone to falling into a local optimum [8–10]. Scholars have therefore carried out a great deal of research in this area [11–14].
Mao proposed initializing the population with a cubic map and introducing elite particles through an opposition-based (backward) learning strategy to enhance population diversity and expand the search region, and then introduced a sine cosine algorithm with a linearly decreasing strategy to balance the exploitation and exploration abilities of the algorithm [15]. Zhang proposed an immune particle swarm algorithm based on adaptive search by incorporating the artificial immune algorithm [16]. The artificial immune optimization algorithm (AIA) [17] was developed from the artificial immune system; it controls the diversity of the antibody population through its concentration mechanism, memory regulation mechanism, and vaccination mechanism, ensuring that good antibodies are present at low concentration. AIA was developed on the basis of the genetic algorithm (GA) [18] and overcomes GA's tendency to fall into a local optimum, but its disadvantage is also obvious: although the population evolves in the same way as in GA, its convergence speed is significantly slower. In fusion algorithms, AIA is therefore often used as an auxiliary algorithm to improve population diversity [19].
Meanwhile, the intrusion detection system is a proactive security protection technology [20]. Through real-time monitoring of the network, it can effectively perceive network attacks and provide response decisions for security managers [21]. The main existing techniques are the K-nearest neighbor algorithm [22], the decision tree [23], and the support vector machine (SVM) [24]. Compared with the other algorithms, the support vector machine handles small-sample problems better, has strong generalization ability, and is considered a more effective intrusion detection algorithm [25]. Sahu normalized the KDD1999 data with the z-score, used a compression sampling method for feature compression, and then classified the compressed results with an SVM. The proposed method had a low False Positive Rate (FPR) and could effectively detect denial-of-service attacks and probe attacks [26]. Jiang adopted an SVM-based data mining method together with rough set theory for feature selection, thus reducing the need for manual analysis [27]. Ahmim proposed an improved genetic algorithm (GA) to optimize a support vector machine intrusion detection method, and designed a fitness function based on classification accuracy, false positive rate, and data feature dimension [28].
Inspired by the above literature, this paper fuses the search method of the classical sparrow algorithm with the artificial immune algorithm to improve the sparrow search algorithm, introducing the concentration regulation and affinity mechanisms of the traditional immune algorithm. The subpopulation sizes and the search range are adjusted dynamically according to the maximum antibody concentration, which increases the diversity of the whole sparrow population, thereby addressing the local convergence problem of the sparrow algorithm and improving its convergence accuracy and global search capability. To prevent the traditional artificial immune algorithm from interfering excessively with the sparrow algorithm's own search, we introduce a two-branch iteration of the antibody population, in which inheritance from excellent sparrows and mutation proceed at the same time, and the weight of the two branches is adjusted dynamically by the immune concentration. The resulting algorithm combines the complementary advantages of the artificial immune algorithm and the sparrow search algorithm. Finally, we apply AIA-SSA to the support vector machine to construct the AIA-SSA-SVM network intrusion detection model, and we test and compare its detection performance.
Sparrow search algorithm
The Sparrow Search Algorithm (SSA) can be abstracted as a discoverer-joiner-reconnaissance warning mechanism model.
Discoverer mechanism: assume that there are $N$ sparrows in a $D$-dimensional search space, and that the position of the $i$th sparrow is $x_i = [x_{i1}, \ldots, x_{id}, \ldots, x_{iD}]$, where $i = 1, 2, \ldots, N$ and $x_{id}$ denotes the position of the $i$th sparrow in the $d$th dimension.
The discoverers generally account for 10% to 20% of the population, and their positions are updated according to Equation (1).
In Equation (1), $t$ is the current iteration number and $T$ is the maximum number of iterations. $\alpha$ is a random value in $(0, 1]$. $Q$ is a random number that obeys the standard normal distribution. $L$ denotes a $1 \times d$ matrix whose elements are all 1. $R_2 \in [0, 1]$ and $ST \in [0.5, 1]$ denote the warning value and the safety value, respectively. When $R_2 < ST$, the population has not detected any predator or other danger, so the discoverer can search widely and guide the population toward higher fitness; when $R_2 \geq ST$, a scouting sparrow has found a predator and immediately released a danger signal, so the population immediately adjusts its search strategy and moves quickly toward the safe area.
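The discoverer update can be sketched as follows. This is a minimal illustration that assumes the form of Equation (1) commonly given for SSA in the literature (shrink each coordinate by $\exp(-i/(\alpha T))$ when $R_2 < ST$, otherwise add $Q \cdot L$); the function name `update_discoverer` and its argument layout are illustrative and not taken from the paper.

```python
import math
import random


def update_discoverer(x, i, T, ST, rng=random):
    """Return the new position of the i-th discoverer (i is 1-based).

    Assumed form of Equation (1):
      R2 <  ST: x_id * exp(-i / (alpha * T))   (wide search)
      R2 >= ST: x_id + Q * L                   (move toward the safe area)
    """
    R2 = rng.random()                  # warning value in [0, 1)
    if R2 < ST:
        alpha = 1.0 - rng.random()     # random value in (0, 1], never 0
        scale = math.exp(-i / (alpha * T))
        return [xd * scale for xd in x]
    Q = rng.gauss(0.0, 1.0)            # standard normal random number
    return [xd + Q for xd in x]        # L is a row of ones, so Q is added to every coordinate
```

With `ST = 1.0` the first branch is always taken and every coordinate shrinks toward zero; with `ST = 0.0` the second branch is always taken and every coordinate is shifted by the same normal step.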
Joiner mechanism: all sparrows other than the discoverers are treated as joiners, and their positions are updated according to Equation (2).
In Equation (2): the
Early-warning agent mechanism: the reconnaissance (early-warning) sparrows generally account for 10% to 20% of the population, and their positions are updated according to Equation (3).
In Equation (3), $\beta$ is the step-size control parameter, a normally distributed random number with mean 0 and variance 1. $K$ is a random number in $[-1, 1]$, and $e$ is a very small constant used to avoid a zero denominator. $f_i$ denotes the fitness value of the $i$th sparrow, and $f_g$ and $f_w$ denote the best and worst fitness values in the current sparrow population, respectively. When $f_i \neq f_g$, the sparrow is at the edge of the population and is highly vulnerable to predators; when $f_i = f_g$, the sparrow is in the middle of the population.
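A minimal sketch of the early-warning update follows. It assumes the form of Equation (3) usually stated for SSA: a sparrow at the edge of the population jumps around the current best position with step $\beta$, while a sparrow in the middle moves relative to the current worst position with step $K$. The function name `update_scout` and the parameter `eps` (standing in for the small constant $e$) are illustrative.

```python
import random


def update_scout(x, f_i, x_best, f_g, x_worst, f_w, eps=1e-50, rng=random):
    """Assumed form of Equation (3) for one early-warning sparrow."""
    if f_i != f_g:
        # Edge of the population: move around the best position.
        beta = rng.gauss(0.0, 1.0)             # mean 0, variance 1
        return [b + beta * abs(xd - b) for xd, b in zip(x, x_best)]
    # Middle of the population: step scaled by the distance to the worst position.
    K = rng.uniform(-1.0, 1.0)
    denom = (f_i - f_w) + eps                  # eps avoids a zero denominator
    return [xd + K * abs(xd - w) / denom for xd, w in zip(x, x_worst)]
```

In both branches the step length is proportional to a distance, so a sparrow that already coincides with the reference position does not move.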
Basic idea of the algorithm
The traditional sparrow algorithm has a degree of randomness in its position updates, which makes it very easy to fall into local convergence during the search for the optimum. To address this, most solutions in the literature determine whether aggregation has occurred by comparing the fitness values of the sparrows, but this approach proves to be of very limited effect on high-dimensional functions.
(1) The concentration regulation mechanism: it regulates the diversity of the antibody population by ensuring that highly adapted antibodies at low concentration are present in moderate numbers, while individuals with high concentration or poor adaptation are present only in small numbers. This process increases population diversity and prevents the population from being trapped in a local optimum by a large number of similar, highly adapted individuals. The calculation is shown in Equations (4) and (5).
$\rho(x_i)$ is the number of sparrows in the population that are similar to the $i$th sparrow. Two sparrows are judged to be similar when they are similar in every dimension, in which case the pairwise indicator takes the value 1, and 0 otherwise. $d(x_i)$ is the concentration of the $i$th sparrow.
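The concentration calculation can be sketched as follows. Since Equations (4) and (5) are not reproduced here, this sketch makes three assumptions: two sparrows are similar when every coordinate differs by at most a tolerance `tol`, a sparrow counts as similar to itself, and $d(x_i) = \rho(x_i) / N$. The name `concentrations` and the value of `tol` are hypothetical.

```python
def concentrations(population, tol=1e-3):
    """Return d(x_i) for every sparrow, assuming d(x_i) = rho(x_i) / N."""
    n = len(population)
    result = []
    for xi in population:
        # rho(x_i): number of sparrows similar to x_i in all dimensions
        rho = sum(
            1
            for xj in population
            if all(abs(a - b) <= tol for a, b in zip(xi, xj))
        )
        result.append(rho / n)
    return result


pop = [[0.0, 0.0], [0.0, 0.0005], [1.0, 1.0], [5.0, 5.0]]
print(concentrations(pop))
```

In this toy population the first two sparrows are within the tolerance of each other, so each of them has twice the concentration of the two isolated sparrows.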
(2) The affinity mechanism: the evaluation index of the AIA algorithm is determined by the affinity [29], which includes the affinity of the antibody to the antigen (fitness) and the affinity between the antibodies (concentration). The affinity is calculated by Equation (6).
In Equation (6), $p(x_i)$ is the affinity value of the $i$th sparrow, $f(x_i)$ is the fitness of the $i$th sparrow, $d(x_i)$ is the concentration of the $i$th sparrow, and $\alpha$ is the weighting factor.
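As an illustration of how fitness and concentration might be combined into a single score, the sketch below uses a weighted sum of min-max normalized fitness and concentration. This is only one plausible reading: the exact form of Equation (6) is not reproduced here, and the normalization, the "lower is better" convention, the name `affinity`, and the default `alpha` are all assumptions.

```python
def affinity(fitness, concentration, alpha=0.7):
    """Weighted score per sparrow; lower means more worth keeping (assumption).

    fitness:       objective values to be minimised, one per sparrow
    concentration: d(x_i) values in [0, 1], one per sparrow
    """
    f_min, f_max = min(fitness), max(fitness)
    span = (f_max - f_min) or 1.0      # avoid division by zero when all fitness values are equal
    return [
        alpha * (f - f_min) / span + (1.0 - alpha) * d
        for f, d in zip(fitness, concentration)
    ]
```

Under this reading, two sparrows with equal fitness are separated by concentration alone, so the one in the less crowded region receives the better score.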
If the immune algorithm is simply placed in series with the sparrow algorithm, i.e., the population is first screened by the artificial immune algorithm during the population update and the sparrow positions are then updated by the standard sparrow algorithm, the diversity of sparrow positions improves considerably overall. However, introducing the concentration mechanism at the early stage of evolution, when the diversity of sparrow positions is already sufficient, instead reduces the diversity of the population. Therefore, this paper reduces the role of the artificial immune algorithm at the early stage of sparrow population evolution, when diversity is sufficient, to improve the efficiency of the algorithm, and increases its role at the later stage of evolution to prevent premature stagnation.
To solve this problem, this paper starts from the position coordinates of each sparrow and borrows the idea of the artificial immune algorithm: in each iteration, a portion of the position coordinates of the outstanding sparrows from the previous iteration is adaptively retained, and a custom sparrow mutation mechanism is added to expand the sparrow population. Finally, the total sparrow population is controlled before it enters a new round of iteration in the sparrow algorithm.
The sparrow population is divided into two subpopulations of variable size, $l_A$ and $l_B$. Subpopulation $l_A$ retains the well-performing sparrows, which have low concentration and low fitness values. Subpopulation $l_B$ holds the poorly performing sparrows, which have high concentration and high fitness values; these sparrows are retained and then mutated.
Here we use the concentration ratio $C_r$ to control the sizes of the two subpopulations, as defined in Equation (7).
In Equation (7), $C_{max}$ is the maximum sparrow concentration value, $C_i$ is the concentration of the $i$th sparrow, and $M$ is the total number of sparrows in the population.
A lower $C_r$ value means that the diversity of the sparrow population is good at this point, so the size of the $l_A$ subpopulation can be increased and the number of immune-algorithm mutations reduced. A higher $C_r$ value means that the diversity of the sparrow population is poor, so the size of the $l_A$ subpopulation can be reduced and the share of the $l_B$ subpopulation in the whole population increased correspondingly.
Here $M$ is the total number of sparrows. The $l_A$ subpopulation consists of the sparrows ranked within the top $num_A$ by affinity, and the remaining sparrows form the $l_B$ subpopulation. Thus, the sizes of the two subpopulations vary with the sparrow concentration.
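The split into the two subpopulations can be sketched as follows. The sketch takes a precomputed concentration ratio `c_r` in $[0, 1]$ and assumes $num_A = \mathrm{round}(M (1 - C_r))$, clamped so that $l_A$ always keeps at least one sparrow; this mapping, the "lower score ranks higher" convention, and the name `split_population` are assumptions made for illustration, not the definitions from the paper.

```python
def split_population(scores, c_r):
    """Split sparrow indices into (l_A, l_B) given affinity scores and C_r.

    Assumptions: a lower score means a better rank, and
    num_A = round(M * (1 - c_r)), clamped to the range [1, M].
    """
    m = len(scores)
    num_a = max(1, min(m, round(m * (1.0 - c_r))))
    ranked = sorted(range(m), key=lambda i: scores[i])   # best score first
    return ranked[:num_a], ranked[num_a:]


l_a, l_b = split_population([0.4, 0.1, 0.9, 0.2], c_r=0.5)
print(l_a, l_b)
```

A low `c_r` (good diversity) leaves most sparrows in $l_A$ and mutates few; a high `c_r` shrinks $l_A$ and sends more sparrows to $l_B$ for mutation, matching the qualitative behavior described above.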
The $l_B$ subpopulation requires mutation. To ensure both the diversity of the mutated sparrows and their outward expansion around the positions of the outstanding sparrows, the mutation step of the AIA-SSA constructed in Section 3.1 is based on the mutation operator of the differential evolution algorithm. Here $(x_{r1}, x_{r2}, x_{r3})$ denotes three sparrows selected from the $l_A$ subpopulation among those with the lowest fitness, i.e., those converging toward the optimal solution. If $n$ individuals are used, $n$ must be a multiple of three and the $n$ sparrows with the lowest fitness in the current population are taken, which yields $n/3$ mutations; the $l_B$ subpopulation then chooses evenly among these $n/3$ mutations. In the experiments in this paper, however, the three sparrows with the lowest fitness were selected.
In Equation (9), $X_{i,j}(t + 1)$ is the value of the $j$th dimension of the $i$th sparrow antibody after the $(t + 1)$th mutation, $lb$ is the lower boundary, and $ub$ is the upper boundary. $F$ is a random number between 0 and 2, and $\mathrm{random}(lb, ub)$ is a random number between $lb$ and $ub$.
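A minimal sketch of a differential-evolution-style mutation consistent with these symbols is given below. It assumes the classic DE form $x_{r1} + F (x_{r2} - x_{r3})$ and assumes that $\mathrm{random}(lb, ub)$ is used to redraw coordinates that leave the search range; Equation (9) itself is not reproduced here, so both points, and the name `mutate`, are illustrative assumptions.

```python
import random


def mutate(x_r1, x_r2, x_r3, lb, ub, rng=random):
    """DE-style mutation x_r1 + F * (x_r2 - x_r3), kept inside [lb, ub]."""
    F = rng.uniform(0.0, 2.0)                  # scaling factor between 0 and 2
    child = []
    for a, b, c, lo, hi in zip(x_r1, x_r2, x_r3, lb, ub):
        v = a + F * (b - c)
        if v < lo or v > hi:
            # Assumed role of random(lb, ub): redraw out-of-range coordinates.
            v = rng.uniform(lo, hi)
        child.append(v)
    return child
```

Because the difference vector is built from well-performing sparrows, the mutated sparrow stays close to them when they have converged and spreads further out when they are still far apart.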
Equations (4) to (9) thus generate random sparrows around well-performing sparrows, which ensures that good traits are inherited between iterations, satisfies the need for position diversity in low-dimensional search, and performs better in high-dimensional search. The improved AIA-SSA algorithm can therefore effectively improve the global search ability and reduce local convergence.
The flow of the algorithm implementation discussed in this paper is as follows.
If no: return to Step 8;
If yes: stop the iteration and output the optimal solution.
Experimental results and analysis
Experimental design and environment
Selection of the benchmark function
To test the optimization ability of AIA-SSA on different functions, as well as the feasibility and superiority of the algorithm, ten benchmark functions of different types are selected and simulated in different dimensions, verifying the ability of the algorithm to find the optimum in both low- and high-dimensional spaces, as shown in Table 1. The dimensions selected for functions F1 to F10 are 20, 2, 20, 2, 5, 20, 20, 20, 2, and 2, respectively [30].
Basis test functions
All experiments in this paper were carried out on a computer with an Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz, 16GB of memory, and the Windows 10 operating system. The experimental code is written in Python (version 3.8.3), the development tool is PyCharm (version 2020.3.3), and the graphing tool for the benchmark functions is MATLAB (version 2018a).
AIA-SSA algorithm test benchmark function
To verify the feasibility and the specific performance of the AIA-SSA algorithm, we ran PSO, GWO, WOA, ACO, CS, and SSA together with AIA-SSA on the ten benchmark functions and compared their optimization performance. The parameters were set as follows: the number of iterations is 100 and the population size is 100. To avoid chance effects, each test was repeated ten times to obtain the average optimal value, the average running time, and the best value among the ten runs; the results are shown in Table 2. Figure 1 shows the iterative process, for functions F1 to F10, of the run in which each of SSA, AIA-SSA, GWO, PSO, WOA, ACO, and CS achieved its best value among the ten tests, where the y-axis is the optimal value and the x-axis is the number of iterations.
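The repeated-run protocol can be sketched as follows. The optimizer shown is a plain random search used only as a stand-in (it is not AIA-SSA or any of the compared algorithms), the sphere function is an illustrative objective that is not necessarily one of F1 to F10, and the use of the sample standard deviation is an assumption; `summarize` and `random_search` are hypothetical names.

```python
import random
import statistics


def sphere(x):
    """Illustrative objective: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def random_search(f, dim, lb, ub, iters, pop, rng):
    """Stand-in optimizer (NOT AIA-SSA): best of iters * pop uniform samples."""
    best = float("inf")
    for _ in range(iters * pop):
        best = min(best, f([rng.uniform(lb, ub) for _ in range(dim)]))
    return best


def summarize(optimizer, f, runs=10, **kwargs):
    """Repeat the optimizer `runs` times and report mean, std, and best value."""
    rng = random.Random(0)
    results = [optimizer(f, rng=rng, **kwargs) for _ in range(runs)]
    return {
        "mean": statistics.mean(results),
        "std": statistics.stdev(results),      # sample standard deviation
        "best": min(results),
    }


stats = summarize(random_search, sphere, dim=2, lb=-100.0, ub=100.0, iters=100, pop=100)
print(stats)
```

Any optimizer with the same call signature can be passed to `summarize`, so the same ten-run statistics can be collected for each algorithm under identical iteration and population settings.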
Comparison of the seven algorithms SSA, AIA-SSA, GWO, PSO, WOA, ACO, and CS

Convergence curves of SSA, AIA-SSA, GWO, PSO, WOA, ACO, and CS on ten test functions.
(1) Evaluation of convergence accuracy: the average value column in Table 2 shows clearly that the optimal values obtained by AIA-SSA over the ten experiments are more competitive than those of SSA. In particular, on the F1 function the convergence accuracy of AIA-SSA is 20 orders of magnitude better than that of SSA, and on F9 and F10 the result equals the theoretical optimal value. AIA-SSA is also far better than the other five algorithms.
(2) Evaluation of stability: each test function is tested 10 times, the averages are reported in Table 2, and the standard deviation of the test results is calculated. The smaller the standard deviation, the more stable the optimization results of the algorithm. The standard deviation column shows that AIA-SSA is more stable than SSA and far superior to the other five algorithms. For the F9 and F10 functions the theoretical optimal value is reached directly, so the stability is the best.
(3) Evaluation of convergence speed: Fig. 1 shows clearly that AIA-SSA has the fastest convergence speed, especially on the five test functions F1, F3, F5, F8, and F10, where it leads throughout the whole process. On the other five functions its final convergence accuracy is still the best, because the other algorithms tend to fall into local convergence.
We select the F1, F2, F3, F4, and F6 functions and draw their 3D images (see the first column of Fig. 2), the distribution of the sparrow population in the search space during a random iteration (see the second and third columns of Fig. 2), the search trajectory during SSA optimization (see the fourth column of Fig. 2), and the corresponding convergence curve (see the fifth column of Fig. 2). The parameters of this experiment were set as follows: the maximum number of iterations was 100, the dimension was 2, and the population size was 100.

3D image of the delta function.
Data preprocessing
To verify the feasibility of the proposed algorithm and the improvement it brings in practical applications, this paper selects the SVM model commonly used in network intrusion detection, uses the AIA-SSA algorithm to optimize the model parameters, and uses the optimized SVM to predict and evaluate network intrusions on the KDD99 dataset.
Each record in the KDD99 dataset has 41 feature dimensions and one attack label dimension; the specific information is shown in Table 3. The dataset contains four main categories of abnormal attack behavior, each of which contains several specific attack types, for a total of 38 attack types (see Table 4). We took the first 41 columns of each record as features and quantified them. Finally, we randomly selected 5000 records from the processed data as the dataset used in the experiment.
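The quantification and sampling step might look like the sketch below. It assumes that "quantified" means mapping each symbolic feature value to an integer code per column, which is one common choice but is not stated in the text; the function name `encode_and_sample` and the toy rows in the usage example are hypothetical.

```python
import random
from collections import defaultdict


def encode_and_sample(rows, n_samples=5000, seed=0):
    """Map string-valued feature columns to integer codes, then draw a random subset.

    Each row is a list of feature values followed by the attack label
    (41 features plus 1 label for KDD99). Integer coding is an assumption.
    """
    codes = defaultdict(dict)              # column index -> {symbol: integer code}
    encoded = []
    for row in rows:
        features = []
        for col, value in enumerate(row[:-1]):
            if isinstance(value, str):
                col_codes = codes[col]
                value = col_codes.setdefault(value, len(col_codes))
            features.append(float(value))
        encoded.append((features, row[-1]))
    rng = random.Random(seed)
    return rng.sample(encoded, min(n_samples, len(encoded)))


toy = [[0, "tcp", "http", "normal"], [1, "udp", "http", "smurf"], [2, "tcp", "ftp", "normal"]]
print(encode_and_sample(toy, n_samples=2))
```

Fixing the seed keeps the random subset reproducible, so that the different optimized models are compared on the same data.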
Attribute information for the KDD dataset
Various attacks on KDD datasets
To avoid the influence of other uncontrollable factors on the training accuracy, we use ten-fold cross-validation and take the average. The results are compared with the prediction results of an SVM without optimized parameters and with those of SVMs whose parameters are optimized by the Sparrow Search Algorithm (SSA) and the Genetic Algorithm (GA), respectively. For SSA, GA, and the algorithm in this paper, the parameters are set as lb = [0.005, 200].
The maximum number of iterations is 20 and the dimension is 2, corresponding to the two quantities being optimized: the SVM parameter C and the parameter g of the RBF function. The optimized parameters and the prediction accuracy are shown in Table 5.
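The fitness evaluated for each candidate pair (C, g) can be sketched as the ten-fold cross-validated error. In this sketch `train_and_score` is a hypothetical callback that trains an RBF-kernel SVM on the training fold and returns the accuracy on the held-out fold (for example, it could wrap scikit-learn's `SVC(C=C, gamma=g, kernel="rbf")`); the fold construction and the choice of `1 - accuracy` as the quantity to minimize are assumptions.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k roughly equal shuffled folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def cv_fitness(params, n_samples, train_and_score, k=10):
    """Fitness to minimise for one candidate (C, g): 1 - mean fold accuracy.

    train_and_score(C, g, train_idx, test_idx) is a hypothetical callback
    returning the test accuracy of an SVM trained with these parameters.
    """
    C, g = params
    scores = [train_and_score(C, g, tr, te) for tr, te in k_fold_indices(n_samples, k)]
    return 1.0 - sum(scores) / len(scores)
```

Each sparrow position is then a two-dimensional vector (C, g), and the optimizer only needs to call `cv_fitness` on it, which keeps the search algorithm independent of the classifier.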
Comparison of the results of optimized SVM by SSA and AIA-SSA algorithms
As shown in Table 5, compared with SSA, the AIA-SSA in this paper brings a large improvement in optimizing the SVM model and raising the prediction accuracy. The overall prediction accuracy of AIA-SSA-SVM is 10.29% higher than that of GA-SVM and 3.29% higher than that of SSA-SVM. The prediction accuracy of AIA-SSA-SVM also improves greatly for the individual attack types, most obviously for the U2R attack, where it is 16.76% higher than that of SSA-SVM; the prediction accuracy for the R2L attack is greatly improved as well. Predicting these two attacks is a weakness of SSA-SVM, and AIA-SSA-SVM largely compensates for it.
Fig. 3 compares the predicted values with the true values for the SVM models whose parameters were optimized by the algorithm in this paper and by the SSA algorithm, respectively. On the vertical axis, 1, 2, 3, and 4 indicate a DoS attack, a Probe attack, an R2L attack, and a U2R attack, respectively, and 0 indicates that the sample is not under attack; the horizontal axis gives the index of each sample in the test set, which contains 1000 samples in total.
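The per-attack accuracy behind such a comparison can be computed as in the following sketch. It uses the label coding 1 to 4 for DoS, Probe, R2L, and U2R, and assumes that 0 denotes normal traffic; the function name `per_class_accuracy` and the toy labels are illustrative.

```python
from collections import Counter

# Label coding; treating 0 as normal traffic is an assumption.
LABELS = {0: "Normal", 1: "DoS", 2: "Probe", 3: "R2L", 4: "U2R"}


def per_class_accuracy(y_true, y_pred):
    """Fraction of correctly predicted samples for each class present in y_true."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {LABELS[c]: correct[c] / total[c] for c in sorted(total)}


print(per_class_accuracy([0, 0, 1, 1, 4], [0, 1, 1, 1, 3]))
```

Reporting accuracy per class rather than only overall makes weaknesses on rare classes such as U2R and R2L visible, since these contribute little to the overall figure.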

Comparison of three intrusion detection models (GA-SVM, SSA-SVM, and AIA-SSA-SVM).
The blue star indicates the label value of each sample in the test set, and the red dot indicates the value predicted by the trained SVM model. A visible blue star marks a failed prediction, whereas a red dot covering the blue star indicates that the sample was predicted successfully. It can be seen clearly that optimizing the SVM with AIA-SSA improves the prediction results more than the original SSA and GA do.
This paper studies the problem that the SSA algorithm falls very easily into local convergence while searching for the optimal value. Considering the good performance of the immune algorithm in global convergence, we integrate the concentration regulation mechanism of the immune algorithm, add the two-branch population iteration, and incorporate the idea of differential evolution into the mutation process. The result is an adaptive sparrow algorithm based on the concentration mechanism of the immune algorithm, AIA-SSA, which retains the good search performance of SSA while broadening the search range of the sparrows in each iteration, improving global search ability and avoiding local convergence. The two sets of experiments show that the algorithm achieves a large improvement in feasibility and optimization performance, both in theory and in practical application, and that its optimization results are much better than those of the other six algorithms under the same parameter settings and operating environment. The algorithm also has good application prospects in network intrusion detection: the performance of the optimized SVM model shows that AIA-SSA is a clear improvement over both SSA and GA, and that its good performance carries over to a practical application.
The AIA-SSA algorithm is still at an initial stage of research, and its application to practical engineering will be the main direction and focus of future improvement. Examples include testing and further improving the AIA-SSA-SVM model in a real network environment, applying AIA-SSA to power system scheduling, path planning, and other areas, and making the improvements required in these practical scenarios.
