Artificial immune algorithm-sparrow search algorithm and its application in network intrusion detection

Abstract

To address the problem that the population diversity of the sparrow search algorithm (SSA) decreases as it approaches the global optimum, so that it easily falls into a local optimum, this paper proposes an artificial immune algorithm-sparrow search algorithm (AIA-SSA) that combines the artificial immune algorithm with the sparrow search algorithm. AIA-SSA is simulated on 10 benchmark functions and compared with five widely used intelligent algorithms and with SSA. Experimental results show that AIA-SSA overcomes this deficiency of SSA and improves search accuracy, convergence speed, and stability. AIA-SSA is also applied to network intrusion detection, where it is used to construct a detection model based on the support vector machine (SVM). In testing, the prediction accuracy of AIA-SSA-SVM for various network attacks is greatly improved. This shows that AIA-SSA-SVM has broad application prospects in the field of network security and verifies the feasibility and advanced nature of AIA-SSA in solving practical engineering problems.

Keywords

Intrusion detection, SSA, intelligence algorithm, adaptive search, differential evolution algorithm

1 Introduction

Bionic population intelligence optimization algorithms have developed rapidly in recent years. Examples include Ant Colony Optimization (ACO) [1], Cuckoo Search (CS) [2], Grey Wolf Optimization (GWO) [3], Whale Optimization Algorithm (WOA) [4], Particle Swarm Optimization (PSO) [5], and the Sparrow Search Algorithm (SSA), which are based on the behavior and habits of various creatures and are applied to find optimal solutions of complex problems. Because the size of the search space must be specified, swarm intelligence algorithms can only solve constrained optimization problems; their core idea is to find the optimal solution within a certain range of the search space. The Sparrow Search Algorithm (SSA) was proposed in 2020 as a new population intelligence optimization algorithm [6].

The algorithm is applied to mathematical problems by imitating the foraging behavior of sparrows in nature. Compared with other intelligent optimization algorithms, SSA is characterized by high search accuracy, fast convergence, stability, and robustness [7]. However, it shares a problem common to population intelligence algorithms: when the search approaches the global optimum, population diversity is highly prone to decrease, which leads to problems such as falling into a local optimum [8–10]. Scholars have therefore carried out a great deal of research in this area [11–14].

Mao proposed using cubic mapping to initialize the population and an opposition-based (reverse) learning strategy that introduces elite particles to enhance population diversity and expand the search region, and then introduced a sine cosine algorithm with a linearly decreasing strategy to balance the exploitation and exploration abilities of the algorithm [15]. Zhang proposed an immune particle swarm algorithm based on adaptive search by incorporating the artificial immune algorithm [16]. The artificial immune optimization algorithm (AIA) [17] was developed from the artificial immune system; it controls the diversity of the antibody population through its concentration mechanism, memory regulation mechanism, and vaccination mechanism, ensuring that good antibodies are present at low concentration. AIA was developed on the basis of the genetic algorithm (GA) [18] and overcomes GA's tendency to fall into a local optimum, but its disadvantages are also obvious: although the population evolves in the same way as in GA, its convergence speed is significantly slower. In fusion algorithms, AIA is therefore often used as an auxiliary algorithm to improve population diversity [19].

At the same time, the intrusion detection system is a proactive security protection technology [20]. Through real-time monitoring of the network, it can effectively perceive network attacks and provide response decisions for security managers [21]. The main existing techniques are the K nearest neighbor algorithm [22], the decision tree [23], and the support vector machine (SVM) [24]. Compared with other algorithms, the support vector machine handles small-sample problems better, has strong generalization ability, and is considered a more effective intrusion detection algorithm [25]. Sahu normalized the KDD1999 data with the z-score, used a compression sampling method for feature compression, and then classified the compressed results with SVM. The proposed method had a low False Positive Rate (FPR) and could effectively detect denial-of-service attacks and probe attacks [26]. Jiang adopted an SVM-based data mining method and rough set theory for feature selection, thus reducing the need for manual analysis [27]. Ahmim proposed an improved genetic algorithm (GA) to optimize an SVM-based intrusion detection method and designed a fitness function based on classification accuracy, false positive rate, and data feature dimension [28].

Inspired by the above literature, this paper fuses the search method of the classical sparrow algorithm with the artificial immune algorithm to improve the sparrow search algorithm, introducing concentration regulation and affinity mechanisms on the basis of the traditional immune algorithm. The subpopulation size and the search range are adjusted dynamically according to the maximum concentration value of the antibodies, which increases the diversity of the whole sparrow population, thus solving the local convergence problem of the sparrow algorithm and giving the overall algorithm good convergence accuracy and global search capability. To keep the traditional artificial immune algorithm from interfering excessively with the sparrow algorithm's own optimization search, we introduce the idea of a two-branch antibody population iteration, i.e., inheritance of excellent sparrows and mutation proceed at the same time, and the weight of the two branches is dynamically mediated by the immune concentration. The algorithm thus combines the complementary advantages of the artificial immune algorithm and the sparrow search algorithm. Finally, we introduce AIA-SSA into the support vector machine to construct the AIA-SSA-SVM network intrusion detection model, and test and compare the detection performance of the model.

2 Sparrow search algorithm

The Sparrow Search Algorithm (SSA) can be abstracted as a discoverer-joiner-reconnaissance warning mechanism model.

Discoverer mechanism: assume that there are N sparrows in a D-dimensional search space, and that the position of the ith sparrow is x_i = [x_i1, . . . , x_id, . . . , x_iD], where i = 1, 2, . . . , N and x_id denotes the position of the ith sparrow in the dth dimension.

The discoverers generally account for 10% to 20% of the population, and their location update is given by Equation (1). $x_{id}^{t + 1} = \begin{cases} x_{id}^{t} \cdot \exp \left( \frac{- i}{\alpha \cdot T} \right), & R_{2} < ST \\ x_{id}^{t} + Q \cdot L, & R_{2} \geq ST \end{cases}$ (1)

In Equation (1), t is the current iteration number and T is the maximum number of iterations. α is a random value between 0 and 1 (excluding 0). Q is a random number that obeys the standard normal distribution. L denotes a 1 × D matrix whose elements are all 1. R₂ ∈ [0, 1] and ST ∈ [0.5, 1] denote the warning value and the safety value, respectively. When R₂ < ST, the population has not detected predators or other dangers, so the discoverer can search extensively and guide the population toward higher adaptation; when R₂ ⩾ ST, a scouting sparrow has found a predator and immediately released a danger signal, so the population immediately adjusts its search strategy and rapidly approaches the safe area.
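As an illustration of Equation (1), the following is a minimal pure-Python sketch of one discoverer update. The function name `update_discoverer`, the 1-based index `i`, and the way α is sampled are assumptions made for this example rather than details taken from the paper's implementation.

```python
import math
import random

def update_discoverer(x, i, T, ST, rng=random):
    """Return the new position of the i-th discoverer (i is 1-based), per Eq. (1)."""
    R2 = rng.random()                      # warning value in [0, 1)
    if R2 < ST:
        # No danger detected: shrink the position by exp(-i / (alpha * T)).
        alpha = 1.0 - rng.random()         # uniform in (0, 1], so alpha is never 0
        scale = math.exp(-i / (alpha * T))
        return [xd * scale for xd in x]
    # Danger detected: add the same standard-normal step Q to every dimension,
    # which is Q * L with L a 1 x D row of ones.
    Q = rng.gauss(0.0, 1.0)
    return [xd + Q for xd in x]
```

In this sketch the first branch can only pull a coordinate toward zero, while the second branch translates the whole position by a single random offset.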

Joiner mechanism: except for the discoverers, the remaining sparrows are treated as joiners, and their positions are updated according to Equation (2). $x_{id}^{t + 1} = \begin{cases} {xb}_{d}^{t + 1} + \frac{1}{D} \sum_{d = 1}^{D} \left( \mathrm{rand} \{- 1, 1\} \cdot | x_{id}^{t} - {xb}_{d}^{t + 1} | \right), & i \leq \frac{n}{2} \\ Q \cdot \exp \left( \frac{{xw}_{d}^{t} - x_{id}^{t}}{- i^{2}} \right), & i > \frac{n}{2} \end{cases}$ (2)

In Equation (2), ${xw}_{d}^{t}$ denotes the worst position in the dth dimension at the tth iteration of the population, and ${xb}_{d}^{t + 1}$ denotes the best position in the dth dimension at the (t + 1)th iteration of the population. When i > n/2, the ith joiner has obtained no food, is in a hungry state, and has a low adaptation level. When i ⩽ n/2, the ith joiner forages at a random location near the current optimal location.
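The joiner update of Equation (2) can be sketched in the same style. The function name `update_joiner` and its argument layout are hypothetical, and the denominator of the second branch is taken from Equation (2) as printed.

```python
import math
import random

def update_joiner(x, i, n, best, worst, rng=random):
    """Return the new position of the i-th joiner (1-based rank i among n sparrows)."""
    D = len(x)
    if i <= n / 2:
        # Mean over all dimensions of randomly signed distances to the best position;
        # the same scalar step is then added to every coordinate of the best position.
        step = sum(rng.choice((-1, 1)) * abs(x[d] - best[d]) for d in range(D)) / D
        return [best[d] + step for d in range(D)]
    # Hungry joiner: the denominator -i^2 follows Eq. (2) as printed.
    Q = rng.gauss(0.0, 1.0)
    return [Q * math.exp((worst[d] - x[d]) / -(i ** 2)) for d in range(D)]
```

A joiner that already sits on the best position stays there in the first branch, because every distance term in the sum is zero.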

Early-warning agent mechanism: reconnaissance early-warning sparrows generally account for 10% to 20% of the population, and their location is updated as in Equation (3). $x_{id}^{t + 1} = \begin{cases} {xb}_{d}^{t} + \beta (x_{id}^{t} - {xb}_{d}^{t}), & f_{i} \neq f_{g} \\ x_{id}^{t} + K \left( \frac{x_{id}^{t} - {xw}_{d}^{t}}{| f_{i} - f_{w} | + e} \right), & f_{i} = f_{g} \end{cases}$ (3)

In Equation (3), β denotes the step control parameter, a normally distributed random number with mean 0 and variance 1. K is a random number in [–1, 1], and e is a very small constant used to avoid a zero denominator. f_i denotes the fitness value of the ith sparrow, and f_g and f_w denote the best and worst fitness values of the current sparrow population, respectively. When f_i ≠ f_g, the sparrow is at the edge of the population and is highly vulnerable to predators; when f_i = f_g, the sparrow is in the middle of the population.
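A minimal sketch of the early-warning update in Equation (3) follows. The function name `update_warner` and the default value of the small constant (written `e` in the text, `eps` here) are assumptions chosen for illustration.

```python
import random

def update_warner(x, f_i, f_g, f_w, best, worst, eps=1e-12, rng=random):
    """Return the new position of an early-warning sparrow, per Eq. (3)."""
    D = len(x)
    if f_i != f_g:
        # Edge of the population: move relative to the best position.
        beta = rng.gauss(0.0, 1.0)         # step control, mean 0 and variance 1
        return [best[d] + beta * (x[d] - best[d]) for d in range(D)]
    # Middle of the population: step away from the worst position.
    K = rng.uniform(-1.0, 1.0)
    denom = abs(f_i - f_w) + eps           # eps keeps the denominator nonzero
    return [x[d] + K * (x[d] - worst[d]) / denom for d in range(D)]
```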

3 Improved hybrid intelligence algorithm

3.1 Basic idea of the algorithm

The traditional sparrow algorithm has a certain degree of randomness when transforming sparrow positions, which makes it very easy to fall into local convergence during optimization. To solve this problem, most solutions in the literature determine whether aggregation has occurred by comparing the fitness values of the sparrows, but on high-dimensional functions the effect is found to be very limited.

(1) The concentration regulation mechanism: it regulates the antibody diversity of the population by ensuring that highly adapted antibodies with low concentration are present in moderate amounts, while individuals with low concentration or low adaptation are present in small amounts. This process increases population diversity and prevents the population from being trapped in a local optimum by a large number of highly adapted individuals. The calculation is shown in Equations (4) and (5). $\rho (x_{i}) = \sum_{j = 1}^{M} \begin{cases} 1, & 0.95 \leq \frac{x_{i}}{x_{j}} \leq 1.05 \\ 0, & \text{otherwise} \end{cases}$ (4) $d (x_{i}) = \frac{\rho (x_{i})}{M}, \quad i = 1, 2, 3, \dots, M$ (5)

ρ (x_i) is the number of sparrows in the population that are similar to the ith sparrow. Two sparrows are judged to be similar when they are similar in all dimensions; each term of the sum is 1 if the pair is similar and 0 otherwise. d (x_i) is the concentration of the ith sparrow.
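The concentration of Equations (4) and (5) can be sketched as follows. The helper names `similar` and `concentrations` are hypothetical, and the handling of a zero coordinate, where the ratio is undefined, is an assumption that the text does not address.

```python
def similar(xi, xj, lo=0.95, hi=1.05):
    """True if xi / xj lies in [lo, hi] in every dimension."""
    for a, b in zip(xi, xj):
        if b == 0.0:
            # Ratio undefined; assumption: only 0 / 0 counts as similar.
            if a != 0.0:
                return False
        elif not lo <= a / b <= hi:
            return False
    return True

def concentrations(population):
    """d(x_i) = rho(x_i) / M, where rho counts the sparrows similar to x_i."""
    M = len(population)
    return [sum(similar(xi, xj) for xj in population) / M for xi in population]

population = [[1.0, 1.0], [1.02, 0.98], [5.0, 5.0], [-1.0, 1.0]]
print(concentrations(population))  # [0.5, 0.5, 0.25, 0.25]
```

In this example the first two sparrows are within 5% of each other in both dimensions, so each has ρ = 2 and concentration 0.5, while the other two are similar only to themselves.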

(2) The affinity mechanism: the evaluation index of the AIA algorithm is determined by the affinity [29], which includes the affinity of the antibody to the antigen (fitness) and the affinity between the antibodies (concentration). The affinity is calculated by Equation (6). $p (x_{i}) = α f (x_{i}) + (1 - α) d (x_{i})$ (6)

In Equation (6), p (x_i) is the affinity value of the ith sparrow, f (x_i) is the fitness of the ith sparrow, d (x_i) is the concentration value of the ith sparrow, and α is the weighting factor.
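Equation (6) is a weighted sum and can be written directly. The default `alpha=0.5` below is only a placeholder, since the text does not state the value of the weighting factor, and any scaling of the fitness values is likewise left open.

```python
def affinity(fitness, concentration, alpha=0.5):
    """p(x_i) = alpha * f(x_i) + (1 - alpha) * d(x_i) for every sparrow, per Eq. (6)."""
    return [alpha * f + (1.0 - alpha) * d for f, d in zip(fitness, concentration)]

print(affinity([0.0, 2.0], [0.5, 0.25]))  # [0.25, 1.125]
```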

If the immune algorithm is simply run in tandem with the sparrow algorithm, i.e., during population update the population is first selected by the artificial immune algorithm and the sparrow positions are then updated by the standard sparrow algorithm, the diversity of sparrow positions improves considerably. However, introducing the concentration mechanism at the early stage of evolution, when the diversity of sparrow positions is already sufficient, instead reduces the diversity of the population. Therefore, this paper reduces the role of the artificial immune algorithm at the early stage of sparrow population evolution, when diversity is sufficient, to improve the efficiency of the algorithm, and increases its role at the later stage of evolution to prevent premature stagnation.

In order to solve the problem, this paper proposes to start from the position coordinates of each sparrow, borrow the idea of artificial immunity algorithm, adaptively retain a part of the position coordinates of the outstanding sparrows from the previous iteration during each round of iteration, and add a custom sparrow variation mechanism to expand the population of sparrows. Finally, the total sparrow population is controlled to enter a new round of iterations in the sparrow algorithm.

The sparrow population is divided into two subpopulations, l_A and l_B, whose sizes are variable. Subpopulation l_A retains the well-performing sparrows, which have low concentration and low fitness values. Subpopulation l_B contains the poorly performing sparrows, which have high concentration and high fitness values; these sparrows are retained and then mutated.

Here we use the concentration ratio C_r to control the sizes of the two subpopulations, i.e. $C_{r} = \frac{\sum_{i = 1}^{M} C_{i}}{M C_{\max}}$ (7)

In Equation (7), C_max is the maximum sparrow concentration value, C_i is the concentration of the ith sparrow, and M is the total number of sparrows in the population.

A lower C_r value means that the diversity of the sparrow population is better at this time, so the size of the l_A subpopulation can be increased and the number of immune-algorithm mutations reduced. A higher C_r value means that the diversity of the sparrow population is poor, so the size of the l_A subpopulation can be reduced and the proportion of the l_B subpopulation in the whole population increased correspondingly. $\begin{cases} {num}_{A} = M \cdot C_{r} \\ {num}_{B} = M - {num}_{A} \end{cases}$ (8)

Here M is the total number of sparrows. The l_A subpopulation consists of the sparrows whose affinity ranks within the top num_A, and the rest form the l_B subpopulation. Thus, the sizes of the two subpopulations vary with the sparrow concentration.
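A small worked example of Equations (7) and (8), as printed, is given below. The function name `split_sizes` is hypothetical, and rounding num_A to the nearest integer is an assumption, since the text does not say how a fractional M · C_r is handled.

```python
def split_sizes(concentration):
    """Return (C_r, num_A, num_B) from the per-sparrow concentrations."""
    M = len(concentration)
    c_max = max(concentration)
    c_r = sum(concentration) / (M * c_max)   # Eq. (7)
    num_a = int(round(M * c_r))              # Eq. (8); rounding is an assumption
    return c_r, num_a, M - num_a

print(split_sizes([0.5, 0.5, 0.25, 0.25]))  # (0.75, 3, 1)
```

With four sparrows whose concentrations sum to 1.5 and peak at 0.5, the ratio is 1.5 / (4 × 0.5) = 0.75, which gives num_A = 3 and num_B = 1.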

The sparrow subpopulation l_B needs to be mutated in a way that ensures both the diversity of the mutated sparrows and their outward expansion around the locations of the outstanding sparrows. For this purpose, the mutation step of the artificial immune algorithm-sparrow search algorithm in this paper is constructed from the mutation operator of the differential evolution algorithm. Here (x_r1, x_r2, x_r3) denotes three sparrows selected from l_A, i.e., sparrows with lower fitness that are converging to the optimal solution. If n individuals are selected, n must be a multiple of three and the n sparrows with the lowest fitness in the current population are taken, so there are n/3 variations, and the l_B subpopulation chooses evenly among these n/3 variations. The experiments in this paper select the three sparrows with the lowest fitness. $X_{i, j} (t + 1) = \begin{cases} X_{r1, j} (t) + F (X_{r2, j} (t) - X_{r3, j} (t)), & lb \leq X_{r1, j} (t) + F (X_{r2, j} (t) - X_{r3, j} (t)) \leq ub \\ \mathrm{random} (lb, ub), & \text{otherwise} \end{cases}$ (9)

In Equation (9), X_i,j (t + 1) is the value of the jth dimension of the ith sparrow antibody after the (t + 1)th mutation, lb is the lower boundary, and ub is the upper boundary. F is a random number between 0 and 2, and random (lb, ub) is a random number between lb and ub.
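The mutation of Equation (9) can be sketched as below. The function name `mutate` is hypothetical, and drawing a single F per mutated sparrow (rather than one per dimension) is an assumption, since the text does not specify it.

```python
import random

def mutate(xr1, xr2, xr3, lb, ub, rng=random):
    """Differential-evolution style mutation of Eq. (9) with a random fallback."""
    F = rng.uniform(0.0, 2.0)              # assumption: one F per mutated sparrow
    child = []
    for a, b, c in zip(xr1, xr2, xr3):
        v = a + F * (b - c)
        # Keep the mutated value if it lies in [lb, ub]; otherwise resample it.
        child.append(v if lb <= v <= ub else rng.uniform(lb, ub))
    return child
```

When x_r2 and x_r3 coincide, the difference term vanishes and the mutated sparrow is a copy of x_r1, which illustrates why the mutants stay close to the outstanding sparrows once the population has contracted.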

Equations (4) to (9) thus generate some random sparrows around well-performing sparrows, which ensures that excellent individuals are inherited between iterations, satisfies the requirement of low-dimensional search for diversity of sparrow locations, and performs better in high-dimensional search. The improved AIA-SSA algorithm can therefore effectively improve the global search ability and reduce local convergence.

3.2 Algorithm flow

The flow of the algorithm implementation discussed in this paper is as follows.

Step1: Define the population size N, the position dimension M of the sparrows, the search space [lb, ub], the maximum number of iterations T, the proportion of discoverers, and the proportion of early-warning sparrows. Initialize the sparrow population X^t.

Step2: Calculate the concentration and the affinity of each sparrow position in the initialized population according to Equations (4) to (6).

Step3: Generate the antibody information table from the sparrow antibody population, listing the sparrows in order of affinity.

Step4: Divide the population into the l_A and l_B sparrow subpopulations according to Equations (7) and (8), and select the three (or n) sparrows with the lowest adaptation, denoted (x_r1, x_r2, x_r3).

Step5: Mutate the sparrow population according to Equation (9) to obtain the (t + 1)th result of l_B.

Step6: Fuse the mutation results of Step5 with the sparrow subpopulation l_A to form a new population X^t+1.

Step7: Check whether the maximum number of iterations has been reached.

If no: go to Step8;

If yes: stop the iteration and output the optimal solution.

Step8: Bring the new population X^t+1 into the sparrow algorithm, and update the discoverers, joiners, and early-warning sparrows according to Equations (1) to (3), respectively. Update the global optimum and go to Step2.

4 Experimental results and analysis

4.1 Experimental design and environment

4.1.1 Selection of the benchmark function

To test the optimization ability of AIA-SSA on different functions, as well as the feasibility and superiority of the algorithm, ten benchmark functions of different types are selected and simulated in different dimensions, verifying the optimization ability of the algorithm in both low- and high-dimensional spaces, as shown in Table 1 below. The numbers of dimensions selected for functions F1 to F10 are 20, 2, 20, 2, 5, 20, 20, 20, 2, and 2, respectively [30].

Table 1
Benchmark test functions

Marker	Baseline test functions	Function Formula	Search space	Dim.	Optimum value
F1	Sphere function	$f (x) = \sum_{i = 1}^{D} x_{i}^{2}$	[–100,100]	20	0
F2	Rosenbrock function	$f (x) = \sum_{i = 1}^{D - 1} [100 {(x_{i}^{2} - x_{i + 1})}^{2} + {(x_{i} - 1)}^{2}]$	[–2.48,2.48]	2	0
F3	Ackley function	$f (x) = - 20 e^{- 0.2 \sqrt{\frac{1}{D} \sum_{i = 1}^{D} x_{i}^{2}}} - e^{\frac{1}{D} \sum_{i = 1}^{D} cos {(2 π x_{i})}^{2}} + e + 20$	[–32,32]	20	0
F4	Schwefel function	$f (x) = 418.4829 D - \sum_{i = 1}^{D} x_{i} sin \sqrt{| x_{i} |}$	[–500,500]	2	0
F5	Schaffer function	$f (x) = 0.5 + \frac{{(sin \sqrt{\sum_{i = 1}^{D} x_{i}^{2}})}^{2} - 0.5}{{[1 + 0.001 (\sum_{i = 1}^{D} x_{i}^{2})]}^{2}}$	[–10,10]	5	0
F6	Rastrigin function	$f (x) = \sum_{i = 1}^{D} [x_{i}^{2} - 10 cos (2 π x_{i}) + 10]$	[–5.12,5.12]	20	0
F7	Griewank function	$f (x) = 1 + \frac{1}{4000} \sum_{i = 1}^{D} x_{i}^{2} - \prod_{i = 1}^{D} cos \frac{x_{i}}{\sqrt{i}}$	[–600,600]	20	0
F8	Quartic function with noise	$f (x) = \sum_{i = 1}^{D} i x_{i}^{4} + random [0, 1)$	[–1.28,1.28]	20	0
F9	Easom function	f (x) = -cos(x₁) cos(x₂) exp [- (x₁ - π) ² - (x₂ - π) ²]	[–100,100]	2	–1
F10	Schaffer function	$f (x) = {(x_{1}^{2} + x_{2}^{2})}^{0.25} [50 {(x_{1}^{2} + x_{2}^{2})}^{0.1} + 1]$	[–100,100]	2	0
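For reference, three of the functions in Table 1 (F1, F6, and F7) can be written in a few lines of Python. This is a minimal sketch for checking the listed optimum values, not the paper's test harness; `math.prod` requires Python 3.8 or later.

```python
import math

def sphere(x):
    """F1: sum of squares."""
    return sum(v * v for v in x)

def rastrigin(x):
    """F6: sum of x_i^2 - 10 cos(2 pi x_i) + 10."""
    return sum(v * v - 10.0 * math.cos(2.0 * math.pi * v) + 10.0 for v in x)

def griewank(x):
    """F7: 1 + sum(x_i^2) / 4000 - prod(cos(x_i / sqrt(i))), with i starting at 1."""
    s = sum(v * v for v in x) / 4000.0
    p = math.prod(math.cos(v / math.sqrt(i)) for i, v in enumerate(x, start=1))
    return 1.0 + s - p

origin = [0.0] * 20
print(sphere(origin), rastrigin(origin), griewank(origin))  # 0.0 0.0 0.0
```

All three functions evaluate to 0 at the origin, which matches the optimum value column of Table 1.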

4.1.2 Experimental environment

The experiments in this paper are all carried out on a computer with an Intel(R) Core(TM) i7-10750H CPU @ 2.60 GHz, 16 GB of memory, and the Windows 10 operating system. The experimental code is written in Python (version 3.8.3) with PyCharm (version 2020.3.3) as the development tool, and the benchmark functions are plotted with MATLAB (version 2018a).

4.2 AIA-SSA algorithm test benchmark function

To verify the feasibility and performance of the AIA-SSA algorithm, we compare it with PSO, GWO, WOA, ACO, CS, and SSA on the ten benchmark functions. The parameters are set as follows: the number of iterations is 100 and the population size is 100. To avoid chance results, each test is run ten times, and we record the average optimal value, the average running time, and the best optimal value among the ten runs; the results are shown in Table 2. Figure 1 shows the iterative processes of SSA, AIA-SSA, GWO, PSO, WOA, ACO, and CS for the run in which each achieves its optimal value among the ten tests, for functions F1 to F10, where the y-axis is the optimal value and the x-axis is the number of iterations.

Table 2
Comparison of the seven algorithms SSA, AIA-SSA, GWO, PSO, WOA, ACO, and CS

Function	Properties	SSA	AIA-SSA	GWO	PSO	WOA	ACO	CS
F1	Optimum value	3.74E-14	1.02E-33	8.18E-07	2.87E-03	4.34E-05	1.24E-01	4.23E-05
	Average value	1.66E-11	1.57E-31	1.67E-06	2.88E-02	1.74E-03	6.46E+01	7.28E-02
	Running time	0.223 s	1.462 s	0.749 s	0.428 s	0.515 s	1.002 s	2.975 s
F2	Optimum value	5.88E-08	9.45E-16	5.48E-06	1.35E-04	4.26E-05	5.77E-15	1.54E-05
	Average value	4.86E-07	1.42E-12	9.71E-05	5.22E-04	6.26E-06	3.21E-10	7.57E-02
	Running time	0.116 s	0.225 s	0.156 s	0.127 s	0.097s	0.388 s	0.417 s
F3	Optimum value	1.33E-15	3.41E-22	1.89E-05	3.86E+00	1.70E-02	2.54E-06	4.21E-03
	Average value	1.41E-13	9.52E-20	2.54E-05	4.27E+00	7.12E-02	6.83E+00	5.44E+00
	Running time	0.302 s	1.019 s	0.925 s	0.142 s	0.537 s	1.360 s	3.114 s
F4	Optimum value	1.77E-04	1.85E-18	1.51E-02	1.16E-03	1.69E-02	8.51E-02	9.27E-01
	Average value	2.34E-04	1.58E-12	7.55E-01	2.15E-03	1.35E+00	1.18E+02	2.96E+02
	Running time	0.094 s	0.279 s	0.134 s	0.104 s	0.072 s	0.375 s	0.419 s
F5	Optimum value	7.77E-10	1.45E-19	2.44E-03	9.58E-03	2.44E-03	5.98E-08	6.18E-05
	Average value	2.25E-08	9.16E-13	3.13E-03	1.28E-02	2.89E-03	2.45E-03	2.76E-02
	Running time	0.147 s	0.324 s	0.434 s	0.270 s	0.181 s	0.674 s	0.953 s
F6	Optimum value	1.21E-02	2.08E-07	6.64E+00	1.46E+02	3.33E+00	4.58E-03	5.32E-08
	Average value	8.44E-02	3.71E-05	8.19E+00	1.65E+02	4.31E+00	1.41E+01	3.80E-05
	Running time	0.143 s	0.467 s	0.783 s	0.437 s	0.545 s	1.177 s	3.008 s
F7	Optimum value	1.26E-15	1.05E-19	9.71E-07	1.08E+01	5.73E-04	5.81E-05	8.21E-08
	Average value	2.72E-14	1.84E-17	1.02E-06	1.37E+01	1.21E-03	2.40E+00	3.01E-04
	Running time	0.464 s	1.178 s	1.247s	0.664 s	0.782 s	1.362 s	3.176 s
F8	Optimum value	4.01E-06	1.24E-08	7.41E-05	4.47E-03	6.32E-05	3.84E-04	2.96E-05
	Average value	6.09E-04	2.33E-05	2.52E-03	7.49E-01	9.06E-03	9.02E-02	1.52E-03
	Running time	0.649 s	1.521 s	2.008 s	0.965 s	1.253 s	1.438 s	3.192 s
F9	Optimum value	–1.00E+00	–1.00E+00	–1.00E+00	–1.00E+00	–1.00E+00	–1.00E+00	–1.00E+00
	Average value	0.00E+00	0.00E+00	4.25E-25	8.26E-06	4.96E-09	7.93E-16	6.14E-36
	Running time	0.149 s	0.284 s	0.233 s	0.196 s	0.155 s	0.395 s	0.380 s
F10	Optimum value	0.00E+00	0.00E+00	7.63E-35	4.97E-09	1.29E-26	9.57E-30	5.29E-26
	Average value	0.00E+00	0.00E+00	4.25E-22	5.84E-01	4.12E-16	2.83E-20	2.85E-17
	Running time	0.170 s	0.301 s	0.230 s	0.202 s	0.156 s	0.413 s	0.371 s

Fig. 1

Convergence curves of SSA, AIA-SSA, GWO, PSO, WOA, ACO, and CS on ten test functions.

(1) Algorithm convergence accuracy evaluation: the average value rows in Table 2 clearly show that the optimal values obtained by the AIA-SSA algorithm over ten experiments are more competitive than those of SSA. In particular, on the F1 function the convergence accuracy of AIA-SSA is about 20 orders of magnitude better than that of SSA. On F9 and F10, the result is the theoretical optimal value. AIA-SSA is also far better than the other five algorithms.

(2) Stability evaluation of the algorithm: the average values in Table 2 come from 10 repeated tests of each test function, and the standard deviation of the test results is calculated. The smaller the standard deviation, the more stable the optimization results of the algorithm. These results show that AIA-SSA has better stability than SSA and is far superior to the other five algorithms. On the F9 and F10 functions, the theoretical optimal value is reached directly, and the stability is the best.

(3) Evaluation of algorithm convergence speed: Fig. 1 clearly shows that the AIA-SSA algorithm has the fastest convergence speed, especially on the five test functions F1, F3, F5, F8, and F10, where it leads throughout the whole process. On the other five functions its final convergence accuracy is still the best, because the other algorithms tend to fall into local convergence.
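The per-function statistics used in the evaluation above (best value, average value, and standard deviation over repeated runs) can be computed with the standard library. The function name `summarize` is hypothetical, and the choice of the population standard deviation (`pstdev`) rather than the sample standard deviation is an assumption, since the text does not say which one is used.

```python
import statistics

def summarize(run_results):
    """Summarize the final objective values of repeated runs of one algorithm."""
    return {
        "best": min(run_results),                 # optimum value over the runs
        "mean": statistics.mean(run_results),     # average value
        "std": statistics.pstdev(run_results),    # spread; smaller means more stable
    }

print(summarize([1.0, 2.0, 3.0]))
```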

We select the F1, F2, F3, F4, and F6 functions and draw their 3D images (see the first column of Fig. 2), the distribution of the sparrow population in the search space during a random iteration (see the second and third columns of Fig. 2), the search trajectory during SSA optimization (see the fourth column of Fig. 2), and the corresponding convergence curve (see the fifth column of Fig. 2). The parameters of this experiment were set as follows: the maximum number of iterations was 100, the dimension was 2, and the population size was 100.

Fig. 2

3D image of the delta function.

4.3 AIA-SSA-SVM model

4.3.1 Data preprocessing

In order to verify the feasibility and improvement of the algorithm proposed in this paper in practical applications, this paper selects the SVM model commonly used in network intrusion detection, uses the AIA-SSA algorithm for the optimization of the model parameters, and uses the optimized SVM for the prediction and evaluation of network intrusion detection for the KDD99 dataset.

The KDD99 dataset has 41 feature dimensions and 1 attack-type dimension; the specific information is shown in Table 3. The dataset contains four main categories of abnormal attack behavior, each of which contains several attack subtypes, for a total of 38 attack types (see Table 4). We took the first 41 columns of each record as features and quantified them. Finally, we randomly selected 5000 records from the processed data as the data set used in the experiment.

Table 3
Attribute information for the KDD dataset

Characteristic attribute name	Serial number
Basic features of network connections	1–9
Content characteristics of a network connection	10–22
Traffic characteristics based on time	23–31
Host-based traffic characteristics	32–41
Type of attack	42

Table 4

Various attacks on KDD datasets

Attack Category	Attacks in KDDCup’99 Training set	Additional attacks in KDDCup’99 Test set	Replace mark
Dos	Back, Neptune, smurf, teardrop, land, pod	Apache2, mailbomb, processtable	1
R2L	Warezmaster, warezclient, ftpwrite, guesspassword, imap, multihop, phf, spy	Sendmail, named, snmpgetattack, snmpguess, xlock, xsnoop, worm	2
Probe	Satan, portsweep, ipsweep, nmap	Mscan, saint	3
U2R	Rootkit, bufferoverflow, loadmodule, perl	Httptunnel, ps, sqlattack, xterm	4
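The preprocessing described above can be illustrated with a minimal Python sketch. It is not the authors' code: the paper does not say how symbolic features were quantified, so the integer coding by first appearance, the mark 0 for normal traffic, the label normalization, and all function names are assumptions made for illustration. Attack names are spelled as in Table 4.

```python
import random

# Category marks follow the "Replace mark" column of Table 4.
# Mark 0 for normal traffic is an assumption of this sketch.
ATTACKS_BY_MARK = {
    1: "back neptune smurf teardrop land pod apache2 mailbomb processtable",
    2: "warezmaster warezclient ftpwrite guesspassword imap multihop phf spy "
       "sendmail named snmpgetattack snmpguess xlock xsnoop worm",
    3: "satan portsweep ipsweep nmap mscan saint",
    4: "rootkit bufferoverflow loadmodule perl httptunnel ps sqlattack xterm",
}
NAME_TO_MARK = {name: mark
                for mark, names in ATTACKS_BY_MARK.items()
                for name in names.split()}


def label_to_mark(label):
    """Map a raw label such as 'smurf.' to its category mark (0 = normal)."""
    name = label.strip().rstrip(".").replace("_", "").lower()
    return 0 if name == "normal" else NAME_TO_MARK[name]


def encode_rows(rows):
    """Quantify the first 41 columns and turn column 42 into a category mark."""
    codebooks = {}  # column index -> {symbol: integer code}
    features, labels = [], []
    for row in rows:
        encoded = []
        for col, value in enumerate(row[:41]):
            try:
                encoded.append(float(value))
            except ValueError:
                # Symbolic value: assign the next unused integer in this column.
                book = codebooks.setdefault(col, {})
                encoded.append(float(book.setdefault(value, len(book))))
        features.append(encoded)
        labels.append(label_to_mark(row[41]))
    return features, labels


def sample_rows(features, labels, n, seed=0):
    """Draw n records at random without replacement."""
    picked = random.Random(seed).sample(range(len(features)), n)
    return [features[i] for i in picked], [labels[i] for i in picked]


# Toy records with 41 feature columns plus one label column.
rows = [
    ["0", "tcp", "http"] + ["0"] * 38 + ["normal."],
    ["0", "icmp", "ecr_i"] + ["1"] * 38 + ["smurf."],
    ["2", "tcp", "ftp"] + ["0"] * 38 + ["warezclient."],
]
features, labels = encode_rows(rows)
X, y = sample_rows(features, labels, n=2)  # the paper draws 5000 records
print(labels)
```

Integer coding is only one possible way to quantify symbolic columns such as the protocol type; one-hot encoding followed by scaling would be an equally plausible choice.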

4.3.2 AIA-SSA-SVM model testing

To reduce the influence of uncontrolled factors on the training accuracy, we use ten-fold cross-validation and report the average. The results are compared with the predictions of an SVM without optimized parameters and of SVMs whose parameters were optimized by the Sparrow Search Algorithm (SSA) and the Genetic Algorithm (GA), respectively. SSA, GA, and the algorithm in this paper use the same parameter settings, with lb = [0.005, 200].

The maximum number of iterations is 20 and the dimension is 2, corresponding to the two parameters being optimized: the SVM penalty parameter C and the parameter g of the RBF kernel function. The resulting prediction accuracies are shown in Table 5.
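To make the role of ten-fold cross-validation in the parameter search concrete, the following Python sketch shows one way a candidate pair (C, g) could be scored. It is an illustration under stated assumptions, not the authors' implementation: the fitness is taken to be one minus the mean cross-validated accuracy, all function names are hypothetical, and a nearest-centroid classifier that ignores C and g stands in for the RBF-kernel SVM so that the sketch runs without external libraries.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_val_accuracy(X, y, fit_predict, k=10, seed=0):
    """Average test accuracy over k folds.

    fit_predict(X_train, y_train, X_test) must return predicted labels.
    """
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for held_out, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != held_out
                     for j in fold]
        predictions = fit_predict([X[j] for j in train_idx],
                                  [y[j] for j in train_idx],
                                  [X[j] for j in test_idx])
        hits = sum(p == y[j] for p, j in zip(predictions, test_idx))
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)


def make_fitness(X, y, build_classifier, k=10):
    """Return a function mapping a candidate (C, g) to 1 - CV accuracy."""
    def fitness(params):
        C, g = params
        return 1.0 - cross_val_accuracy(X, y, build_classifier(C, g), k)
    return fitness


def nearest_centroid(C, g):
    """Stand-in for an RBF-SVM trainer; it ignores C and g (assumption)."""
    def fit_predict(X_train, y_train, X_test):
        sums, counts = {}, {}
        for row, label in zip(X_train, y_train):
            acc = sums.setdefault(label, [0.0] * len(row))
            for d, value in enumerate(row):
                acc[d] += value
            counts[label] = counts.get(label, 0) + 1
        centroids = {c: [v / counts[c] for v in s] for c, s in sums.items()}

        def closest(row):
            return min(centroids, key=lambda c: sum(
                (a - b) ** 2 for a, b in zip(row, centroids[c])))
        return [closest(row) for row in X_test]
    return fit_predict


# Toy data: two well-separated clusters labelled 0 and 1.
rng = random.Random(1)
X = ([[rng.gauss(0, 0.1), rng.gauss(0, 0.1)] for _ in range(30)]
     + [[rng.gauss(5, 0.1), rng.gauss(5, 0.1)] for _ in range(30)])
y = [0] * 30 + [1] * 30
fitness = make_fitness(X, y, nearest_centroid)
print(fitness((157.635, 0.051)))  # candidate (C, g) taken from Table 5
```

In an actual experiment, `build_classifier` would wrap an SVM trainer that uses C and g, and the optimizer (GA, SSA, or AIA-SSA) would call `fitness` on each candidate position in the two-dimensional search space.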

Table 5
Comparison of the results of optimized SVM by SSA and AIA-SSA algorithms

	Test prediction accuracy	Dos attack Prediction Accuracy	Probe attack Prediction Accuracy	R2L Attack Prediction Accuracy	U2R Attack Prediction Accuracy	C	g
GA-SVM	86.91%	83.51%	60.87%	85.71%	97.14%	0.682	0.587
SSA-SVM	93.91%	94.48%	93.23%	78.75%	81.81%	0.323	0.038
AIA-SSA SVM	97.20%	97.75%	93.75%	98.62%	98.57%	157.635	0.051

As shown in Table 5, AIA-SSA optimizes the SVM model considerably better than SSA and yields higher prediction accuracy. The overall prediction accuracy of AIA-SSA-SVM is 10.29 percentage points higher than that of GA-SVM and 3.29 percentage points higher than that of SSA-SVM. The prediction accuracy of AIA-SSA-SVM also improves for the individual attack types. The gain is especially pronounced for U2R attacks, where AIA-SSA-SVM is 16.76 percentage points more accurate than SSA-SVM (98.57% versus 81.81%), and for R2L attacks (98.62% versus 78.75%). Predicting these two attack types is a weakness of SSA-SVM, and AIA-SSA-SVM largely compensates for it.

Fig. 3 compares the predicted values with the true values for SVMs whose parameters were optimized by the algorithm in this paper and by SSA, respectively. On the vertical axis, 1, 2, 3, and 4 indicate a DoS, Probe, R2L, and U2R attack, respectively, and 0 indicates that the record is not under attack; the horizontal axis gives the index of each record in the test set, which contains 1000 records in total.
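The differences quoted above follow directly from Table 5. As a small worked check, the Python snippet below recomputes them; the dictionary layout and the `gain` helper are illustrative names, and the values are copied from the table.

```python
# Accuracies in percent, copied from Table 5.
TABLE_5 = {
    "GA-SVM":      {"overall": 86.91, "DoS": 83.51, "Probe": 60.87,
                    "R2L": 85.71, "U2R": 97.14},
    "SSA-SVM":     {"overall": 93.91, "DoS": 94.48, "Probe": 93.23,
                    "R2L": 78.75, "U2R": 81.81},
    "AIA-SSA-SVM": {"overall": 97.20, "DoS": 97.75, "Probe": 93.75,
                    "R2L": 98.62, "U2R": 98.57},
}


def gain(column, baseline, model="AIA-SSA-SVM"):
    """Accuracy difference in percentage points, rounded to two decimals."""
    return round(TABLE_5[model][column] - TABLE_5[baseline][column], 2)


print(gain("overall", "GA-SVM"))   # 10.29
print(gain("overall", "SSA-SVM"))  # 3.29
print(gain("U2R", "SSA-SVM"))      # 16.76
print(gain("R2L", "SSA-SVM"))      # 19.87
```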

Fig. 3

Comparison of three intrusion detection models (GA-SVM, SSA-SVM, and AIA-SSA-SVM).

The blue stars indicate the true label of each record in the test set, and the red dots indicate the value predicted by the trained SVM model. A visible blue star marks a failed prediction, whereas a red dot covering a blue star indicates that the record was predicted correctly. It can be clearly seen that optimizing the SVM with AIA-SSA improves the prediction results more than optimizing it with the original SSA or with GA.
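The per-category accuracies reported in Table 5 correspond to counting, for each true label, how often the predicted value coincides with it (a red dot covering a blue star). A minimal Python sketch of that count is given below; the function name and the toy label sequences are hypothetical and are not data from the experiment.

```python
from collections import Counter


def per_class_accuracy(y_true, y_pred):
    """Fraction of correctly predicted records for each true label."""
    total, correct = Counter(), Counter()
    for truth, predicted in zip(y_true, y_pred):
        total[truth] += 1
        if truth == predicted:
            correct[truth] += 1
    return {label: correct[label] / total[label] for label in sorted(total)}


# Toy labels: 0 for no attack, 1 to 4 for the four attack categories.
y_true = [0, 0, 1, 1, 1, 2, 3, 4, 4, 4]
y_pred = [0, 0, 1, 1, 2, 2, 3, 4, 4, 1]

by_class = per_class_accuracy(y_true, y_pred)
overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(by_class)
print(overall)  # 0.8
```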

5 Conclusion

In this paper, we address the tendency of the SSA algorithm to fall into local convergence while searching for the optimal value. Considering the good global convergence of the immune algorithm, we integrate its concentration mediation mechanism into SSA, add a two-branch population transfer, and incorporate the idea of differential evolution into the variation process. The resulting adaptive sparrow algorithm, AIA-SSA, retains the good optimization performance of SSA while broadening the search range of the sparrows in each iteration, which improves its global search ability and helps it avoid local convergence. The two sets of experiments show that the algorithm improves on SSA in both benchmark optimization and practical application: under the same parameter settings and operating environment, its optimization results are better than those of the other six algorithms. The performance of the optimized SVM model in network intrusion detection shows that AIA-SSA is a clear improvement over SSA and GA, which suggests that the algorithm has good application prospects in network intrusion detection.

Research on the AIA-SSA algorithm is still at an early stage, and applying it to practical engineering problems will be the main direction of future work. Examples include testing and further improving the AIA-SSA-SVM model in a real network environment, applying AIA-SSA to power system scheduling, path planning, and similar problems, and making the improvements required by these practical scenarios.

References

1. Dorigo, Maniezzo and Colorni, Ant system: optimization by a colony of cooperating agents, IEEE Trans Syst Man Cybern Part B 26(1) (1996), 29–41.
2. Xiangtao, Jianan and Minghao, Enhancing the performance of cuckoo search algorithm using orthogonal learning method, Neural Computing and Applications (6) (2014).
3. Mirjalili, Mirjalili S.M. and Lewis, Grey wolf optimizer, Adv Eng Softw 69 (2014), 46–61.
4. Seyedali and Andrew, The Whale Optimization Algorithm, Advances in Engineering Software (2016), 95.
5. Kennedy and Eberhart, Particle swarm optimization, In: Proceedings of the 1995 IEEE International Conference on Neural Networks 194 (1995), 2–8.
6. Xue and Shen, A novel swarm intelligence optimization approach: sparrow search algorithm, Systems Science & Control Engineering 8 (2020), 22–34.
7. and Mu X.-D., Chaotic Sparrow Search Optimization Algorithm, Journal of Beijing University of Aeronautics and Astronautics (2020), 1–10.
8. Ouyang C.-T. and Ouyang D.-L., Research on Multi-strategy Improved Sparrow Search Algorithm Fused with K-means, Electronics Optics & Control (2021), 1–9.
9. Feng and Liu, Finding Key Node Sets in Complex Networks Based on Improved Discrete Fireworks Algorithm, Journal of Systems Science and Complexity (2021).
10. Alexandros and Yiannis, Optimizing the seismic response of base-isolated liquid storage tanks using swarm intelligence algorithms, Computers & Structures (2021), 243.
11. Ouyang J.Q. and Wei Y.P., Financial Sequence Prediction Based on Swarm Intelligence Algorithms of Internet of Things, Computational Economics (2021).
12. Zhiheng and Jianhua, Flamingo Search Algorithm: A New Swarm Intelligence Optimization Algorithm, IEEE Access 9 (2021), 88564–88582. doi:10.1109/ACCESS.2021.3090512.
13. Jianhua and Zhiheng, A Hybrid Sparrow Search Algorithm Based on Constructing Similarity, IEEE Access 9 (2021), 117581–117595. doi:10.1109/ACCESS.2021.3106269.
14. Tang A.-D. and Han, An unmanned aerial vehicle path planning method based on chaotic sparrow search algorithm, Journal of Computer Applications (2021), 1–11.
15. Mao Q.H. and Zhang, An improved sparrow algorithm combining Cauchy mutation and reverse learning, Journal of Frontiers of Computer Science and Technology (2020), 1–12.
16. Zhang and Li, Immune Particle Swarm Optimization Algorithm Based on Adaptive Search, Chinese Journal of Engineering 39(01) (2017), 125–132.
17. Wang and Pan, Immune algorithm, Acta Electronica Sinica 7 (2000), 74–78.
18. Wang X.F. and Zhang X.J., A Genetic Algorithm Based on Immune Principle, Journal of Chinese Computer Systems 2 (1999), 38–41.
19. M.J. and Luo, Artificial immune algorithm and its applications, Control Theory Appl 21 (2004), 153.
20. Ghosh, Neogy, Das P.K., et al., Intrusion Detection at International Borders and Large Military Barracks with Multi-sink Wireless Sensor Networks: An Energy Efficient Solution, Wireless Personal Communications 98(1) (2018), 1083–1101.
21. Mohamed M.B., Meddeb-Makhlouf and Fakhfakh, Intrusion cancellation for anomaly detection in healthcare applications, International Wireless Communications & Mobile Computing Conference (IWCMC) (2019), 313–318.
22. Liang, Ma, Sadiq and Yeung, A filter model for intrusion detection system in Vehicle Ad Hoc Networks: A hidden Markov methodology, Knowledge-Based Systems (2019).
23. Senthilnayaki, Venkatalakshmi and Kannan, Intrusion detection system using fuzzy rough set feature selection and modified KNN classifier, Int Arab J Inf Technol 16(4) (2019), 746–753.
24. Reddy R.R., Ramadevi and Sunitha K.V.N., Effective discriminant function for intrusion detection using SVM, International Conference on Advances in Computing, Communications and Informatics (ICACCI) (2016), 1148–1153.
25. Wen Z.-Y. and Xie, Multi-objective Sparrow Search Algorithm Based on New Crowding Degree Distance, Computer Engineering and Applications (2021), 1–10.
26. Sahu and Mehtre B.M., Network intrusion detection system using J48 Decision Tree, International Conference on Advances in Computing, Communications and Informatics (ICACCI) (2015), 2023–2026.
27. Jiang, Chun C.P. and Zeng H.F., Relative Decision Entropy Based Decision Tree Algorithm and Its Application in Intrusion Detection, Computer Science 39(4) (2012), 223–226.
28. Ahmim, Maglaras L.A., Ferrag M.A., et al., A Novel Hierarchical Intrusion Detection System Based on Decision Tree and Rules-Based Models, International Conference on Distributed Computing in Sensor Systems (DCOSS) (2019), 228–233.
29. Zhang P.W. and Chu S.-L., Application of Immune Particle Swarm Optimization Algorithm in Collision Avoidance Planning of Night Ship, Ship Science and Technology 43(2) (2021), 64–66.
30. Arora and Singh, Butterfly optimization algorithm: a novel approach for global optimization, Soft Computing (2018). https://doi.org/10.1007/s00500-018-3102-4