Network intrusion detection method based on IEHO-SVM

Abstract

With the growth of network technology, network intrusion has become an increasingly serious problem. To address the difficulty and low accuracy of intrusion detection, a network intrusion detection method based on an improved elephant herding optimization algorithm and a support vector machine is proposed. The method first uses the improved elephant herding optimization algorithm to select features from the intrusion data and then uses the same algorithm to optimize the parameters of the support vector machine. Finally, a detection model is constructed based on the optimized support vector machine. The main contribution of the research is a network intrusion detection method that combines the improved elephant herding optimization algorithm with a support vector machine. By optimizing the parameters of the support vector machine classifier with the improved algorithm, the method significantly improves the accuracy and stability of classification in network intrusion detection. The experimental results show that the detection model maintains a stable average accuracy of around 94% in detecting four types of intrusion data, surpassing the other commonly used algorithms that were compared. The results validate the effectiveness of the improved elephant herding optimization algorithm and demonstrate its advantage in intrusion detection tasks.

Keywords

elephant herding optimization, support vector machine, network intrusion detection, data processing

Introduction

With the rapid advancement of network technology, more users are participating in online interactions, and the internet has become an indispensable part of daily life. However, as the number of users increases, network intrusion events have become increasingly frequent.^1,2 By 2025, the global cost of cybercrime is expected to reach $10.5 trillion per year, highlighting the need for stronger network security measures. According to statistics, the Asia Pacific region is the most affected by network attacks, accounting for 31% of all reported incidents, while Europe and North America account for 28% and 25%, respectively.^3,4 Over the past year, the global threat of ransomware has evolved rapidly. The number of ransomware victims worldwide has increased by 46%, a record high. Large enterprises are the target of most attacks, accounting for approximately 40% of victims. Small organizations account for 25%, followed closely by medium-sized enterprises at 23%. Overall, the number of attacks on enterprises is steadily increasing.^5,6 In response to these frequent intrusion events, efficient and accurate network intrusion detection (NID) methods have become a research focus in recent years. Intrusion detection is the identification of malicious use of computer and network resources. It covers both external intrusion into a system and unauthorized behavior by internal users. The technology safeguards computer systems by promptly identifying and reporting irregularities, and it can also be used to monitor and analyze users' compliance with established security policies within a network. Many intrusion detection methods exist, such as expert system-based methods and neural network-based methods.
However, faced with large volumes of intrusion data and diverse attack methods, most current detection methods still perform poorly. Most current models are trained on outdated or imbalanced datasets, resulting in poor generalization in real-world applications. Traditional machine learning methods are sensitive to high-dimensional data and rely on manual adjustment of model parameters, which can lead to overfitting. Feature selection and classification modeling are often carried out independently, which limits the overall improvement of detection performance. These limitations are the main reason why traditional intrusion detection methods cannot cope with increasingly complex intrusion behaviors. To address this issue, an NID method that combines improved elephant herding optimization (IEHO) with a support vector machine (SVM) is proposed, with the aim of raising the efficiency and accuracy of detection. Although SVM performs well on high-dimensional datasets, training directly on raw high-dimensional data may lead to overfitting and reduce the model's generalization ability. Feature selection is therefore crucial: it reduces data dimensionality, improves training efficiency, and enhances model accuracy and stability. The main contributions of the research are as follows. First, the IEHO algorithm is proposed, which improves global search capability and convergence speed. Second, a unified detection framework is developed that combines IEHO and SVM optimization, integrating feature selection with model parameter tuning. Third, a parallel version of IEHO based on Apache Spark is designed and implemented to improve the efficiency of large-scale intrusion data processing.

Related work

Exploring efficient and accurate NID methods is of great significance for addressing increasingly serious network intrusion behaviors. Wei et al. proposed an improved K-means clustering network security detection model for NID. Through K-means clustering, the model could compute and select special data more effectively. The results showed the effectiveness of the model.⁷ For anomaly detection in network traffic, Imran M et al. proposed a new method that uses an artificial neural network optimized with the cuckoo search algorithm. The results showed that this method had significant advantages over methods that do not use parameter optimization.⁸ For detecting abnormal traffic behavior in IoT systems, Li et al. used artificial neural networks to detect abnormal behavior in medical IoT systems. In the proposed method, the butterfly optimization algorithm was used to select the best features for training the artificial neural network. The results demonstrated the ability of the butterfly optimization algorithm to identify the distinguishing features of network traffic data and proved the effectiveness of the method.⁹ Sarkar et al. proposed an effective machine learning ensemble technique for NID classification. They demonstrated that preprocessing data can effectively improve detection quality, and that correcting the training dataset can aid category recognition, especially for attacks such as Remote to Local (R2L) and User to Root (U2R) attacks. The results indicated that classification accuracy improved significantly after the data were preprocessed.¹⁰ The above literature indicates that NID involves multiple types of intrusion, which poses challenges for detection models, and that preprocessing intrusion data benefits the detection quality of the model.

Methods and materials

A new detection method was proposed to address the low accuracy and high false alarm rates in NID. First, the study introduced the Levy flight strategy and the sparrow search algorithm (SSA) to improve the traditional elephant herding optimization (EHO) algorithm. Subsequently, the study used the resulting IEHO algorithm to reduce the high dimensionality of intrusion data and to optimize the SVM parameters. Finally, an NID model based on the improved SVM was proposed.

Attribute selection of network intrusion data based on IEHO algorithm

Network intrusion data is often high-dimensional and contains complex, useless attributes, which greatly reduces the speed and accuracy of detection.^11–13 Preprocessing the data is therefore important for detecting intrusions well. The study used IEHO to preprocess intrusion data. IEHO introduces the Levy flight strategy and SSA on the basis of EHO. EHO is used to solve global unconstrained optimization problems and is inspired by the herding behavior of elephants in nature. So far, EHO has been successfully applied to multi-level thresholding and many other problems. Although EHO is a relatively new metaheuristic algorithm, it has a simple structure, few control parameters, and combines easily with other methods, so it can solve optimization problems effectively.^14–16 In EHO, local search is modeled by elephants living in their clans, while global search is modeled by elephants leaving their clans. In the basic EHO algorithm, the update operations are performed first; they determine the search direction and the granularity of the local search. They are followed by the separation operations. The process thus includes two stages: clan update and clan separation.^17–19 In the clan update operation, elephants live together and are led by a female elephant, as shown in equation (1).

x_{new,ci,j} = x_{ci,j} + α \times (x_{best,ci} - x_{ci,j}) \times r

(1)

In equation (1), $x_{new,ci,j}$ is the updated position of elephant $j$ in clan $ci$, $x_{ci,j}$ is its position in the previous generation, $α$ is the scale factor that determines how strongly the female leader with the best position in the clan influences individual elephants, $x_{best,ci}$ is the position of that female leader, and $r$ is a random number that the algorithm uses to improve population diversity in the later stage. The position of the female leader of the clan is given by equation (2).

x_{best,ci} = β \times x_{center,ci}

(2)

In equation (2), $β$ is the second algorithm parameter, which controls the influence of the clan center, and $x_{center,ci}$ is the clan center. The clan center is calculated as shown in equation (3).

x_{center,ci,d} = \frac{1}{n_{ci}} \sum_{j = 1}^{n_{ci}} x_{ci,j,d}

(3)

In equation (3), $d$ is the dimension index and $n_{ci}$ is the number of elephants in clan $ci$. The clan separation operation is shown in equation (4).

x_{worst,ci} = x_{\min} + (x_{\max} - x_{\min} + 1) \times \mathrm{rand}

(4)

In equation (4), $x_{worst,ci}$ is the position of the fixed number of elephants with the worst fitness value in the clan. $x_{\min}$ and $x_{\max}$ are the lower and upper limits of the search space, respectively, and $\mathrm{rand}$ is a function that generates random numbers. To address premature convergence and entrapment in local optima in the traditional EHO algorithm, this study introduces the Levy flight strategy into the EHO update operation and SSA into the separation operation. The optimized clan update operation is shown in equation (5).

x_{new,ci,j} = x_{ci,j} + \mathrm{Levy}(λ) + α \cdot (x_{best} - x_{ci,j}) \cdot r

(5)

In equation (5), $\mathrm{Levy}(λ)$ is the introduced Levy flight term. This strategy lets clan individuals search over a wider range and avoid premature convergence. The separation operation is optimized using SSA, as shown in equation (6).

x_{new,worst,ci} = x_{current} + r \times (x_{current} - x_{worst,ci})

(6)

In equation (6), $x_{new,worst,ci}$ is the updated position of the fixed number of elephants with the worst fitness value in the clan, and $x_{current}$ is the current position of the elephant. This update exploits the strong global search capability of SSA. The fitness value of the candidate position for the worst-placed individual in the clan is calculated, and the individual is updated only if the candidate achieves a better fitness value; otherwise, the individual is left unchanged. The IEHO algorithm flowchart is shown in Figure 1.

Figure 1.

IEHO algorithm flowchart.

In Figure 1, the IEHO algorithm first initializes the population and sets the population size. The clan positions are then set randomly and their fitness values are calculated. After the clan update and separation operations, the local optimal position is obtained and the termination condition is checked. If it is satisfied, the optimal solution is updated; otherwise, the update operation continues. Because intrusion data often contains a large amount of useless data, the data must be preprocessed. The algorithm pseudocode is shown in Figure 2.

Figure 2.

Pseudocode of the IEHO algorithm.
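As a rough illustration of equations (2), (3), (5), and (6), one IEHO generation for a single clan might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the Levy term is drawn with Mantegna's algorithm and scaled by a hypothetical `step_scale`, $x_{best}$ in equation (5) is taken to be the clan best, $x_{current}$ in equation (6) is interpreted as the best position in the updated clan, and the parameter defaults and function names are illustrative only.

```python
import math
import random


def levy_step(lam=1.5, rng=random):
    """Draw one heavy-tailed step with Mantegna's algorithm (assumed sampler)."""
    num = math.gamma(1 + lam) * math.sin(math.pi * lam / 2)
    den = math.gamma((1 + lam) / 2) * lam * 2 ** ((lam - 1) / 2)
    sigma = (num / den) ** (1 / lam)
    u = rng.gauss(0.0, sigma)
    v = max(abs(rng.gauss(0.0, 1.0)), 1e-12)  # guard against division by zero
    return u / v ** (1 / lam)


def clan_center(clan):
    """Equation (3): per-dimension mean of all positions in the clan."""
    n = len(clan)
    return [sum(x[d] for x in clan) / n for d in range(len(clan[0]))]


def clip(x, lo, hi):
    """Keep a position inside the search bounds [lo, hi]."""
    return [min(max(v, lo), hi) for v in x]


def ieho_generation(clan, fitness, lo, hi, alpha=0.5, beta=0.1,
                    lam=1.5, step_scale=0.01, rng=random):
    """One generation for a single clan; lower fitness is better."""
    best_i = min(range(len(clan)), key=lambda i: fitness(clan[i]))
    best, center = clan[best_i], clan_center(clan)
    new_clan = []
    for i, x in enumerate(clan):
        if i == best_i:
            # Equation (2): the leader is placed relative to the clan center.
            cand = [beta * c for c in center]
        else:
            # Equation (5): Levy term plus attraction toward the clan best.
            r = rng.random()
            cand = [xi + step_scale * levy_step(lam, rng) + alpha * (bi - xi) * r
                    for xi, bi in zip(x, best)]
        new_clan.append(clip(cand, lo, hi))

    # Equation (6) with greedy acceptance. Assumption: x_current is the
    # best position in the updated clan.
    order = sorted(range(len(new_clan)), key=lambda i: fitness(new_clan[i]))
    cur, worst_i = new_clan[order[0]], order[-1]
    r = rng.random()
    cand = clip([c + r * (c - w) for c, w in zip(cur, new_clan[worst_i])], lo, hi)
    if fitness(cand) < fitness(new_clan[worst_i]):
        new_clan[worst_i] = cand
    return new_clan
```

Calling `ieho_generation` repeatedly on each clan, with a fitness function such as a classification error, gives the iterative loop described for Figure 1; the greedy check at the end mirrors the rule that the worst individual is replaced only when the candidate is better.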

The basic process of feature detection is shown in Figure 3.

Figure 3.

Feature detection flowchart.

In Figure 3, the raw data is first preprocessed to ensure that its quality is suitable for subsequent analysis. IEHO is then used to select the most relevant features: it evaluates the contribution of each feature to the classification model and selects the features with the highest predictive ability. After feature selection, the selected features are evaluated, and cross-validation is used to verify the performance of the selected subset. In this process, the features are ranked according to the evaluation results, so that the final feature subset both represents the data efficiently and significantly improves the detection accuracy of the model. Because IEHO is inefficient on massive intrusion data, this study introduces Apache Spark for large-scale data processing. Apache Spark is a fast, general-purpose computing engine; with it, the search for each elephant's best position and solution becomes a parallel computation over individuals. Parallel processing greatly improves the efficiency of IEHO in selecting intrusion data features, as shown in Figure 4.

Figure 4.

Flowchart of IEHO algorithm for Spark parallelization.

In Figure 4, the steps of Spark Parallel IEHO (SPIEHO) are the same as those of standalone IEHO, but the position update and optimal-value evaluation of each individual are carried out in parallel. In the traditional IEHO algorithm, the computation is serial, which may result in low efficiency and long delays when processing large amounts of feature data. When IEHO is combined with Apache Spark, Spark's distributed computing capability enables the position updates and optimal-value evaluations of the individuals to run in parallel. This parallelization not only improves the efficiency of the algorithm but also makes it more scalable on large datasets. Regarding complexity, the time complexity of SPIEHO is dominated by the feature selection, parameter optimization, and model training stages, while the space complexity is dominated by the storage of the feature set, the individuals and their fitness values, and the SVM model. Under the Apache Spark framework, Spark's RDD and DataFrame structures allow algorithm execution to be parallelized, which reduces the load on each computing node and improves the response speed of the whole system. When processing datasets with millions of records, Apache Spark can pool computing resources, optimize memory usage, and allow the algorithm to scale to larger training sets.
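The parallel evaluation step can be illustrated with a small Python sketch. It is only a local stand-in: a thread pool plays the role that a Spark map over the population would play on a cluster, and the wrapper fitness (a weighted sum of a classifier score and the fraction of features dropped, with a hypothetical `weight` of 0.9) is an assumption, since the source does not state the exact fitness function. `score_fn` is a placeholder for the cross-validated accuracy of a classifier trained on the selected features.

```python
from concurrent.futures import ThreadPoolExecutor


def mask_fitness(mask, score_fn, weight=0.9):
    """Hypothetical wrapper fitness: reward accuracy, penalize large subsets."""
    k = sum(mask)
    if k == 0:
        return 0.0  # an empty feature subset cannot be evaluated
    return weight * score_fn(mask) + (1 - weight) * (1 - k / len(mask))


def evaluate_population(masks, score_fn, workers=4):
    """Score every individual independently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda m: mask_fitness(m, score_fn), masks))


# Toy score: fraction of the (assumed) informative features that are selected.
USEFUL = (1, 1, 0, 0)


def toy_score(mask):
    return sum(m for m, u in zip(mask, USEFUL) if u) / sum(USEFUL)


population = [(1, 1, 0, 0), (1, 1, 1, 1), (0, 0, 0, 0)]
scores = evaluate_population(population, toy_score)
```

Because each individual is scored independently of the others, the same function could be mapped over a distributed collection instead of a thread pool without changing the fitness logic.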

Intrusion detection model based on IEHO-SVM

To build an intrusion detection model, in addition to using IEHO to preprocess the intrusion data, the incoming data must also be classified. For this purpose, SVM is introduced in the study. SVM is used for classification and regression analysis. Its basic aim is to find an optimal hyperplane in the feature space that maximizes the distance between the classification boundary and the nearest training samples.^20,21 The kernel function is an important concept in SVM: it maps data from the original space to a high-dimensional space, thereby handling nonlinear problems. The kernel function avoids the computational cost of working in the high-dimensional space by calculating the similarity between samples. The performance of SVM is affected by the penalty coefficient $C$, which controls the complexity and fault tolerance of the model. In binary classification problems, SVM finds a hyperplane that separates the two classes of data while keeping the nearest data points as far from the boundary as possible. The hyperplane expression is shown in equation (7).

\bar{w} \cdot \bar{x} - b = 0

(7)

In equation (7), $\bar{x}$ and $\bar{w}$ are a point on the hyperplane and the normal vector, respectively, and $b$ is the intercept term. If the data is linearly separable, the region between the two parallel hyperplanes that separate the data is the margin. These hyperplanes are given by equation (8).

\begin{cases} \bar{w} \cdot \bar{x} - b = 1 \\ \bar{w} \cdot \bar{x} - b = - 1 \end{cases}

(8)

According to equation (8), the distance between the two hyperplanes is $\frac{2}{‖ \bar{w} ‖}$. To maximize this distance, $‖ \bar{w} ‖$ must be minimized. At the same time, to guarantee that no sample falls inside the margin, every sample $i$ should satisfy equation (9) or (10).

\bar{w} \cdot {\bar{x}}_{i} - b \geq 1, y_{i} = 1

(9)

In equation (9), $y_{i}$ is the class label of sample $i$, which takes the value $1$ or $-1$.

\bar{w} \cdot {\bar{x}}_{i} - b \leq - 1, y_{i} = - 1

(10)

Every data point must lie on the correct side of the margin, so equations (9) and (10) can be merged into equation (11).

y_{i} (\bar{w} \cdot {\bar{x}}_{i} - b) \geq 1, \quad \text{for all } 1 \leq i \leq n

(11)

In equation (11), $n$ is the total number of samples. When the dataset is not linearly separable, a soft margin and a loss function (the hinge loss) are introduced, as shown in equation (12).

\max (0, 1 - y_{i} (\bar{w} \cdot {\bar{x}}_{i} - b))

(12)

When a sample satisfies the constraint in equation (11), that is, when it lies on the correct side of the margin, equation (12) equals 0. For a sample on the other side of the margin, the loss is positive and is to be minimized. When the data is linearly separable, the classification result is the same as that of the original SVM. When the data is not linearly separable, the soft-margin method still retains a certain classification ability, as shown in equation (13).

[\frac{1}{n} \sum_{i = 1}^{n} \max (0, 1 - y_{i} (\bar{w} \cdot {\bar{x}}_{i} - b))] + λ {‖ \bar{w} ‖}^{2}

(13)

In equation (13), $λ$ is the parameter that balances enlarging the margin against keeping samples on the correct side of it. In practice, the soft margin can only learn feasible classification rules for nonlinear classification and cannot fully handle nonlinear data. Kernel functions are therefore introduced when the dataset is nonlinear. The general expression of the kernel function is shown in equation (14).

K (x_{i}, x_{j}) = \phi (x_{i}) \cdot \phi (x_{j})

(14)

In equation (14), $x_{i}$ and $x_{j}$ are different vectors in the original space, $\phi (\cdot)$ is the mapping function, and $K (x_{i}, x_{j})$ is the kernel function value. A kernel function maps a low-dimensional dataset into a high-dimensional feature space, which transforms a problem that is not linearly separable into a linearly separable one and makes classification easier. In essence, the kernel function replaces the dot product in the low-dimensional space with the dot product in the high-dimensional space, so that SVM can classify nonlinear data. There are many types of kernel functions; this study uses the Gaussian kernel function for its nonlinear mapping ability, as shown in equation (15).

K (x_{i}, x_{j}) = \exp (- γ {‖ x_{i} - x_{j} ‖}^{2})

(15)

In equation (15), $γ$ is the parameter that determines the width of the kernel, and ${‖ x_{i} - x_{j} ‖}^{2}$ is the squared Euclidean distance between the two vectors. The optimal hyperplane theorem for SVM is shown in equation (16).

\min_{\bar{w}, b} \frac{1}{2} {‖ \bar{w} ‖}^{2} \quad \text{subject to} \quad y_{i} (\bar{w} \cdot {\bar{x}}_{i} - b) \geq 1, \ \forall i

(16)

This theorem indicates that SVM minimizes the complexity of the classification model by maximizing the margin around the decision boundary. The hyperplane is determined by the support vectors, the data points that lie on the margin boundaries, which are crucial for the final performance of the model. A schematic diagram of how the dataset is partitioned in the high-dimensional space after the kernel function is applied is shown in Figure 5.

Figure 5.

Higher dimensional hyperplane.
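The quantities in equations (8), (12), (13), and (15) can be checked numerically with a few lines of Python. This is a minimal sketch that follows the sign convention $\bar{w} \cdot \bar{x} - b$ used in equations (7) to (13); the function names and the toy data are illustrative and do not come from the study.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def margin_width(w):
    """Distance 2 / ||w|| between the two hyperplanes of equation (8)."""
    return 2.0 / math.sqrt(dot(w, w))


def hinge_loss(w, b, x, y):
    """Equation (12): zero when the sample is on the correct side of the margin."""
    return max(0.0, 1.0 - y * (dot(w, x) - b))


def soft_margin_objective(w, b, X, Y, lam):
    """Equation (13): mean hinge loss plus lam * ||w||^2."""
    mean_loss = sum(hinge_loss(w, b, x, y) for x, y in zip(X, Y)) / len(X)
    return mean_loss + lam * dot(w, w)


def rbf_kernel(xi, xj, gamma):
    """Equation (15): Gaussian kernel on the squared Euclidean distance."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(xi, xj)))


# Toy data: two samples outside the margin and one inside it.
w, b = [1.0, 0.0], 0.0
X = [[2.0, 0.0], [-2.0, 0.0], [0.5, 0.0]]
Y = [1, -1, 1]
losses = [hinge_loss(w, b, x, y) for x, y in zip(X, Y)]
objective = soft_margin_objective(w, b, X, Y, lam=0.1)
```

Only the third sample contributes to the objective, because it lies inside the margin. The kernel value is 1 for identical points and decays toward 0 as the points move apart, at a rate set by $γ$.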

In Figure 5, the dot product of the nonlinear dataset in the low-dimensional space is replaced with a dot product in the high-dimensional space, which allows SVM to partition the data. The hyperplane defined by the normal vector in that space separates data that cannot be separated in the original low-dimensional space. To address the problem that SVM has many parameters to tune, the study adopts IEHO to optimize them (IEHO-SVM). IEHO optimizes SVM mainly by adjusting the penalty coefficient $C$ and the kernel function parameter $γ$, and the optimization process is shown in Figure 6.

Figure 6.

SVM parameter optimization flowchart.

In Figure 6, the optimization of SVM parameters involves initializing the population, learning samples of the intrusion data with SVM, and optimizing the SVM parameters with IEHO. Fitness values are calculated to determine the optimal positions for the clan update and clan separation operations. The optimal solution and the local optimal position of each clan are updated, and the termination condition is checked. If it is satisfied, the global optimal solution and optimal position are updated; otherwise, the clan update and separation operations are performed again. SVM is then applied to classify the samples and build the model. Finally, the model is used to classify the intrusion data and output the results. IEHO optimizes the SVM parameters, in particular the penalty factor and the kernel function parameter. Proper parameter selection not only improves the classification accuracy of the model but also reduces the risk of overfitting. IEHO searches the parameter space for the combination that maximizes the classification performance of the SVM. For feature selection, the parallel processing capability of IEHO greatly improves the efficiency of exploring the feature space. By efficiently selecting features that are highly relevant to the intrusion detection task, IEHO reduces the amount of data that the SVM model needs to process, which speeds up training and improves model performance. The pseudocode for this optimization process is shown in Figure 7.

Figure 7.

Pseudocode for SVM parameter optimization.
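One way to connect an optimizer's position vector to the two SVM parameters is sketched below. It is a hedged illustration, not the procedure from the study: the log-scale search ranges for $C$ and $γ$ are assumptions, and `cv_accuracy` is a hypothetical callback that would train an SVM with the given parameters and return its cross-validated accuracy. A synthetic accuracy surface is used here so that the example runs on its own.

```python
import math


def decode(position, c_range=(1e-2, 1e3), gamma_range=(1e-4, 1e1)):
    """Map a position in [0, 1]^2 to (C, gamma) on a log scale (assumed ranges)."""
    def log_interp(t, lo, hi):
        t = min(max(t, 0.0), 1.0)  # keep the coordinate inside the unit interval
        return 10 ** (math.log10(lo) + t * (math.log10(hi) - math.log10(lo)))
    return (log_interp(position[0], *c_range),
            log_interp(position[1], *gamma_range))


def best_parameters(population, cv_accuracy):
    """Return (score, (C, gamma)) for the best-scoring individual."""
    scored = []
    for position in population:
        params = decode(position)
        scored.append((cv_accuracy(*params), params))
    return max(scored, key=lambda item: item[0])


def toy_accuracy(C, gamma):
    """Synthetic stand-in for cross-validated accuracy, peaking at C=10, gamma=0.1."""
    return 1.0 / (1.0 + (math.log10(C) - 1) ** 2 + (math.log10(gamma) + 1) ** 2)


population = [[0.0, 0.0], [0.6, 0.6], [1.0, 1.0]]
score, (best_C, best_gamma) = best_parameters(population, toy_accuracy)
```

Searching on a log scale is a common choice for these two parameters because useful values often span several orders of magnitude; in the full method, the candidate positions would be produced by the IEHO update and separation operations rather than listed by hand.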

The SVM classification detection model optimized by IEHO algorithm is shown in Figure 8.

Figure 8.

Structure diagram of intrusion detection model based on IEHO-SVM.

In Figure 8, the detection model first extracts the intrusion data, performs dimensionality reduction on it, and transforms it into a data type that SVM can process. It then initializes the parameters and iterates using the fitness function until the termination condition is met. Finally, IEHO is used to optimize the SVM parameters and a prediction model is obtained. The prediction model is then trained on intrusion samples to construct the intrusion detection model.

Results

To assess the efficacy of the proposed NID model, the feasibility of the proposed IEHO algorithm was tested first. Subsequently, the effectiveness of IEHO in selecting intrusion data attributes was verified. Finally, the proposed NID model was compared with other models to verify its superiority.

Analysis of network intrusion detection methods based on IEHO

The experiments were conducted on a workstation with an Intel Xeon Gold 5218 CPU, 256 GB RAM and a Hadoop cluster (4 slave nodes) running Apache Spark 3.3.1. The framework was implemented in Java (JDK 1.8), with Weka 3.8 used for classification. All datasets were normalized, label-encoded, and split into training and test sets (7:3). Classification was carried out on the UCI dataset, and two real-world network security datasets, CICIDS 2017 and UNSW-NB15, were also included. The UCI dataset refers to KDDCup99 (≈4.9 million records, 41 features), covering DoS, R2L, U2R, and Probe attacks. CICIDS 2017 contains real traffic scenarios with benign and malicious behaviors (e.g., DDoS and brute force). UNSW-NB15 includes ≈2.5 million records and 49 features, covering 9 recent attack types. A subset of the UCI dataset can be accessed at the following link: https://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html. This dataset contains various types of information about network connections, including normal connections as well as attacks, and it is widely used to evaluate and compare the performance of NID systems (Table 1).

Table 1.

Dataset parameters.

Name	Sample size	Number of features	Number of categories
Glass	215	10	6
Heart	272	13	2
Wine	177	13	3
Ionosphere	350	34	2
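The preprocessing steps named in the experimental setup (normalization, label encoding, and a 7:3 split) could look like the following Python sketch. The study itself used Java and Weka, and it does not say which normalization was applied, so min-max scaling, the fixed shuffle seed, and all function names here are assumptions made for illustration.

```python
import random


def min_max_normalize(rows):
    """Scale every column to [0, 1]; constant columns are mapped to 0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - lo) / s for v, lo, s in zip(row, lows, spans)] for row in rows]


def label_encode(labels):
    """Replace class names with integer codes, assigned in sorted order."""
    mapping = {name: i for i, name in enumerate(sorted(set(labels)))}
    return [mapping[name] for name in labels], mapping


def train_test_split(rows, labels, train_fraction=0.7, seed=0):
    """Shuffle the indices once, then cut them into training and test parts."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = int(round(train_fraction * len(idx)))
    train, test = idx[:cut], idx[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])


features = min_max_normalize([[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]])
codes, mapping = label_encode(["dos", "normal", "dos", "probe"])
```

Normalizing before training matters for SVM with a Gaussian kernel, because the kernel depends on Euclidean distances and unscaled features with large ranges would otherwise dominate them.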

To verify the efficacy of the designed IEHO algorithm, two unimodal and two multimodal benchmark functions were selected for testing, as shown in Figure 9.

Figure 9.

Fitness curves of the algorithms on different benchmark functions. (a) Performance of the two algorithms on the Rosenbrock function. (b) Performance of the two algorithms on the SumSquare function. (c) Performance of the two algorithms on the Griewank function. (d) Performance of the two algorithms on the Schaffer function.
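For reference, the four benchmark functions named in Figure 9 are commonly defined as in the Python sketch below. The source does not give the formulas, the dimensions, or the search ranges it used, and "Schaffer" has several variants, so the two-dimensional Schaffer N.2 form shown here is an assumption.

```python
import math


def rosenbrock(x):
    """Rosenbrock function; minimum 0 at x = (1, ..., 1)."""
    return sum(100 * (x[i + 1] - x[i] ** 2) ** 2 + (1 - x[i]) ** 2
               for i in range(len(x) - 1))


def sum_squares(x):
    """Sum Squares function: sum of i * x_i^2 with i starting at 1."""
    return sum((i + 1) * v * v for i, v in enumerate(x))


def griewank(x):
    """Griewank function; many regularly spaced local minima, global minimum 0."""
    total = sum(v * v for v in x) / 4000
    product = math.prod(math.cos(v / math.sqrt(i + 1)) for i, v in enumerate(x))
    return total - product + 1


def schaffer_n2(x):
    """Schaffer N.2 (assumed variant), defined for two variables."""
    a, b = x[0], x[1]
    numerator = math.sin(a * a - b * b) ** 2 - 0.5
    denominator = (1 + 0.001 * (a * a + b * b)) ** 2
    return 0.5 + numerator / denominator
```

All four functions have a known global minimum of 0, which makes the number of iterations needed to approach 0 a convenient measure of convergence speed.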

In Figure 9(a), on the unimodal Rosenbrock function, IEHO converged after 12 iterations and EHO after 25 iterations. In Figure 9(b), on the unimodal SumSquare function, IEHO converged after 6 iterations and EHO after 14 iterations. In Figure 9(c), on the multimodal Griewank function, IEHO converged after 8 iterations and EHO after 12 iterations. In Figure 9(d), on the multimodal Schaffer function, IEHO converged after 7 iterations and EHO after 16 iterations. The experimental data showed that IEHO converged faster and performed better than EHO, which demonstrates the feasibility of the IEHO algorithm. To verify the efficacy of IEHO in selecting intrusion data attributes, experiments were conducted to compare the convergence of fitness values on different datasets for four algorithms: IEHO, moth flame optimization (MFO), ant colony optimization (ACO), and EHO, as shown in Figure 10.

Figure 10.

Fitness values in different datasets. (a) Optimal feature subset size for different algorithms on the Wine dataset. (b) Optimal feature subset size for different algorithms on the Glass dataset. (c) Optimal feature subset size for different algorithms on the Ionosphere dataset. (d) Optimal feature subset size for different algorithms on the Heart dataset.

In Figure 10(a), on the Wine dataset, the optimal feature subset sizes for ACO, MFO, EHO, and IEHO were 6, 6, 5, and 4, respectively. In Figure 10(b), on the Glass dataset, the optimal feature subset sizes for IEHO, ACO, MFO, and EHO were all 6. In Figure 10(c), on the Ionosphere dataset, the optimal feature subset sizes for IEHO, ACO, MFO, and EHO were 5, 10, 9, and 7, respectively. In Figure 10(d), on the Heart dataset, IEHO selected the smallest optimal feature subset, with only 4 features, while ACO, MFO, and EHO each selected at least 5. The experimental data showed that the IEHO algorithm is suitable for data attribute selection. SPIEHO was proposed to handle the processing of large amounts of data. The experiment compared the running time of IEHO and SPIEHO on datasets of different sizes to verify the superiority of SPIEHO, as shown in Figure 11.

Figure 11.

Comparison of runtime between IEHO and SPIEHO on different datasets. (a) Processing time of the two algorithms on 2.5 million records. (b) Processing time of the two algorithms on 5 million records. (c) Processing time of the two algorithms on 7.5 million records. (d) Processing time of the two algorithms on 10 million records.

In Figure 11(a), on a dataset of 2.5 million records, IEHO completed processing in 18561 seconds and SPIEHO in 4125 seconds. In Figure 11(b), on a dataset of 5 million records, IEHO and SPIEHO had processing times of 42659 seconds and 8614 seconds, respectively. In Figure 11(c), on the dataset of 7.5 million records, SPIEHO took only 16842 seconds to complete, while IEHO took 75689 seconds. In Figure 11(d), on a dataset of 10 million records, IEHO was much slower than SPIEHO, taking 196485 seconds to complete, while SPIEHO took only 32671 seconds. The experimental data showed that SPIEHO processed large amounts of data faster and more efficiently.
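The runtimes reported for Figure 11 can be turned into speedup factors with a short calculation. The sketch below only divides the IEHO time by the SPIEHO time for each dataset size; the dictionary layout and function name are illustrative.

```python
# Reported runtimes in seconds: dataset size (millions of records) -> (IEHO, SPIEHO)
runtimes = {
    2.5: (18561, 4125),
    5.0: (42659, 8614),
    7.5: (75689, 16842),
    10.0: (196485, 32671),
}


def speedups(table):
    """Serial time divided by parallel time for each dataset size."""
    return {size: ieho / spieho for size, (ieho, spieho) in table.items()}


ratios = speedups(runtimes)
for size, ratio in ratios.items():
    print(f"{size} million records: {ratio:.2f}x faster with SPIEHO")
```

On these figures SPIEHO is roughly 4.5 to 5 times faster than IEHO up to 7.5 million records and about 6 times faster at 10 million records.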

Analysis of intrusion detection model based on IEHO-SVM

To assess the efficacy of the NID model based on IEHO-SVM, experiments were run on the same dataset to compare the IEHO-SVM detection model with detection models that combine other optimization algorithms with SVM, as shown in Figure 12.

Figure 12.

Fitness changes of four algorithm models for different attack types. (a) Fitness changes of different algorithms for DoS attacks. (b) Fitness changes of different algorithms for Probe attacks. (c) Fitness changes of different algorithms for R2L attacks. (d) Fitness changes of different algorithms for U2R attacks.

As shown in Figure 12(a), for the Denial of Service (DoS) attack, the IEHO-SVM detection model converged after 19 iterations, while ACO-SVM, MFO-SVM, and EHO-SVM converged after 22, 21, and 45 iterations, respectively; the IEHO-SVM model had the highest fitness value, 0.842. As shown in Figure 12(b), for the Probe attack, the IEHO-SVM detection model converged after 62 iterations, while ACO-SVM, MFO-SVM, and EHO-SVM converged after 58, 19, and 12 iterations, respectively; the IEHO-SVM model had the highest fitness value, 0.675. According to Figure 12(c), for the Remote to Local (R2L) attack, the IEHO-SVM detection model converged after 57 iterations, while ACO-SVM, MFO-SVM, and EHO-SVM converged after 58, 57, and 58 iterations, respectively; the IEHO-SVM model had the highest fitness value, 0.846. According to Figure 12(d), for the U2R attack, the IEHO-SVM detection model converged after 52 iterations, while ACO-SVM, MFO-SVM, and EHO-SVM converged after 79, 8, and 40 iterations, respectively; the IEHO-SVM model had the highest fitness value, 0.894. The experimental data showed that the IEHO-SVM detection model had the highest fitness value and a fast convergence speed, which verified the effectiveness of the model. To verify the stability of the IEHO-SVM detection model, accuracy experiments were conducted against three other models. Advanced methods of the same type, Convolutional Neural Networks (CNN) and Random Forest (RF), were selected for comparison. The results are shown in Figure 13.

Figure 13.

Comparison of detection accuracy of four models for different intrusion data.

In Figure 13, the IEHO-SVM detection model had a detection accuracy of 94% on normal traffic. The EHO-SVM model had the second highest detection accuracy, at 89%. For DoS attacks, the IEHO-SVM detection model had a detection accuracy of 87%, while that of the RF model was 82%. For Probe attacks, the IEHO-SVM detection model had a detection accuracy of 85%; among the other models, the RF model had the highest detection accuracy, at 81%. The IEHO-SVM detection model had a detection accuracy of 89% for R2L attacks and 92% for U2R attacks. The experimental data showed that the IEHO-SVM detection model had the highest detection accuracy for the different kinds of intrusion data. To further verify the superiority of the IEHO-SVM detection model, the four models were run independently to detect different intrusion data and were compared by average accuracy, as shown in Figure 14.

Figure 14.

Comparison of average accuracy of four models for detecting different intrusion data.

As shown in Figure 14, the average accuracy of the IEHO-SVM detection model across the four types of intrusion data remained stable at around 94%. The average accuracy of the CNN and RF detection models was around 84% and 80%, respectively. The accuracy curve of the EHO-SVM model fluctuated considerably, indicating poor stability. The IEHO-SVM detection model thus had both the highest detection accuracy and the most stable performance across the different intrusion data, which demonstrates its advantage in detection performance over the other models.
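Stability of the kind discussed for Figure 14 can be quantified by summarizing the accuracy of repeated runs with a mean and a measure of spread. The sketch below is an illustration under assumptions: the helper `stability_summary` and both accuracy lists are hypothetical, chosen only to contrast a flat curve with a fluctuating one, and they are not values read from the figure.

```python
from statistics import mean, pstdev


def stability_summary(accuracies):
    """Summarize repeated-run accuracies by mean, standard deviation, and range."""
    if not accuracies:
        raise ValueError("at least one accuracy value is required")
    return {
        "mean": mean(accuracies),
        "std": pstdev(accuracies),  # population standard deviation of the runs
        "range": max(accuracies) - min(accuracies),
    }


# Hypothetical accuracy values for illustration only.
stable_runs = [0.94, 0.93, 0.95, 0.94]    # flat curve
volatile_runs = [0.80, 0.92, 0.74, 0.90]  # strongly fluctuating curve

stable = stability_summary(stable_runs)
volatile = stability_summary(volatile_runs)
```

A model with a high mean but a large standard deviation or range would match the description given for EHO-SVM, while a high mean with a small spread matches the behavior reported for IEHO-SVM.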

Discussion and conclusion

The rapid development of network technology has brought about numerous security issues, and traditional NID methods increasingly struggle to handle current network intrusion problems. In response, this study optimized SVM parameters using IEHO and established a detection model based on IEHO-SVM. Experimental data showed that on the wine dataset, the optimal feature subsets selected by ACO, MFO, and EHO contained 6, 6, and 5 features, respectively, while the optimal subset selected by IEHO contained 4. On the glass dataset, the optimal feature subsets for IEHO, ACO, MFO, and EHO all contained 6 features. On the ionospheric dataset, the optimal subset for IEHO contained 5 features, while those for ACO, MFO, and EHO contained 10, 9, and 7, respectively. On the heart disease dataset, IEHO produced the smallest optimal feature subset, with only 4 features, while ACO, MFO, and EHO each required at least 5. On a dataset of 2.5 million, IEHO completed processing in 18561 seconds and SPIEHO in 4125 seconds. On a dataset of 5 million, IEHO and SPIEHO took 42659 seconds and 8614 seconds, respectively. On a dataset of 7.5 million, SPIEHO took only 16842 seconds, while IEHO took 75689 seconds. On a dataset of 10 million, IEHO was much slower than SPIEHO, taking 196485 seconds against 32671 seconds. The IEHO-SVM detection model performed well across all traffic categories: its detection accuracy on the Normal class and on DOS, Probe, R2L, and U2R attacks was 94%, 87%, 85%, 89%, and 92%, respectively. The accuracy of the other models on the different intrusion data was lower than that of the IEHO-SVM detection model.
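The efficiency gap between IEHO and SPIEHO is easier to judge as a speedup ratio than as raw runtimes. The short calculation below uses only the processing times reported in this section; the variable and function names are illustrative.

```python
# Reported processing times in seconds, keyed by dataset size in millions:
# (IEHO, SPIEHO)
runtimes = {
    2.5: (18561, 4125),
    5.0: (42659, 8614),
    7.5: (75689, 16842),
    10.0: (196485, 32671),
}


def speedups(times):
    """Return the IEHO runtime divided by the SPIEHO runtime for each size."""
    return {size: ieho / spieho for size, (ieho, spieho) in times.items()}


ratios = speedups(runtimes)
for size, ratio in ratios.items():
    print(f"{size} million: SPIEHO is {ratio:.2f}x faster than IEHO")
# Ratios are about 4.50, 4.95, 4.49, and 6.01 for 2.5, 5, 7.5, and 10 million.
```

On these figures SPIEHO is roughly 4.5 to 5 times faster up to 7.5 million, and about 6 times faster at 10 million, where the IEHO runtime grows most sharply.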
The outcomes indicated that IEHO had the best preprocessing effect on intrusion data and that SPIEHO processed data most efficiently. They also show that the IEHO-SVM detection model proposed in this study is effective in dealing with network intrusion problems. The current IEHO-SVM model is mainly trained and tested on traditional network traffic characteristics. However, the increasing number of adversarial attacks against NID may degrade model performance, since attackers can evade detection through carefully designed traffic patterns. Future work should therefore evaluate the robustness of the IEHO-SVM model under adversarial attacks, and an adaptive mechanism could be introduced to improve the stability and accuracy of the model in such situations. Deep learning techniques could also be combined with the IEHO-SVM model. Deep learning performs well in automatic feature extraction and complex pattern recognition and could provide a stronger foundation for IEHO-SVM: a deep model would generate an initial feature set, and IEHO would then refine the selection, which may improve overall accuracy and robustness.

Footnotes

ORCID iD

Jia Guo

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

References

1. Abdajabar, Yunus NAM. A review on the impact of cybersecurity crimes in financial institutions during the time of Covid-19. Acta Inform Malays 2023; 7(1): 19–23.

2. Olorunfemi Adams, Azikwe, Zubair. Artificial neural network analysis of some selected KDD Cup 99 dataset for intrusion detection. Acta Inform Malays 2022; 6(2): 55–61.

3. Wang. Improved elephant herding optimization using opposition-based learning and K-means clustering to solve numerical optimization problems. J Ambient Intell Hum Comput 2023; 14(3): 1753–1784.

4. Chandra, Bedi. Survey on SVM and their application in image classification. Int J Inf Technol 2021; 13(5): 1–11.

5. Akhyar Damanik. Securing data network for growing business VPN architectures cellular network connectivity. Acta Inform Malays 2022; 6(1): 01–06.

6. Nanda. Network/security threats and countermeasures for cloud computing. Acta Electron Malays 2022; 6(1): 01–03.

7. Wei, Zang, Pan, et al. Strategic application of AI intelligent algorithm in network threat detection and defense. J Theory Pract Eng Sci 2024; 4(01): 49–57.

8. Imran, Khan, Hlavacs, et al. Intrusion detection in networks using cuckoo search optimization. Soft Comput 2022; 26(20): 10651–10663.

9. Ghoreishi, Issakhov. Improving the accuracy of network intrusion detection system in medical IoT systems through butterfly optimization algorithm. Wirel Pers Commun 2022; 126(3): 1999–2017.

10. Sarkar, Sharma, Singh. A supervised machine learning-based solution for efficient network intrusion detection using ensemble learning based on hyperparameter optimization. Int J Inf Technol 2023; 15(1): 423–434.

11. Injadat, Moubayed, Nassif, et al. Multi-stage optimized machine learning framework for network intrusion detection. IEEE Trans Netw Serv Manage 2020; 18(2): 1803–1816.

12. Al-Turaiki, Altwaijry. A convolutional neural network for improved anomaly-based network intrusion detection. Big Data 2021; 9(3): 233–252.

13. Wang, Cao, Hong. A network intrusion detection system based on convolutional neural network. J Intell Fuzzy Syst 2020; 38(6): 7623–7637.

14. Azizan, Mostafa, Mustapha, et al. A machine learning approach for improving the performance of network intrusion detection systems. AETiC 2021; 5(5): 201–208.

15. Zhou, Liang, et al. Hierarchical adversarial attacks against graph-neural-network-based IoT network intrusion detection system. IEEE Internet Things J 2021; 9(12): 9310–9319.

16. Liu, Zhao, et al. Parameter identification of photovoltaic cell model based on improved elephant herding optimization algorithm. Soft Comput 2023; 27(9): 5797–5811.

17. Bhosle, Musande. Evaluation of deep learning CNN model for recognition of Devanagari digit. Artif Intell Appl 2023; 1(2): 114–118.

18. Molina-Coronado, Mori, Mendiburu, et al. Survey of network intrusion detection methods from the perspective of the knowledge discovery in databases process. IEEE Trans Netw Serv Manage 2020; 17(4): 2451–2479.

19. Sharma, Vijayvargiya. An optimized neuro-fuzzy network for software project effort estimation. IETE J Res 2023; 69(10): 6855–6866.

20. Apruzzese, Pajola, Conti. The cross-evaluation of machine learning-based network intrusion detection systems. IEEE Trans Netw Serv Manage 2022; 19(4): 5152–5169.

21. Qiu, Dong, Zhang, et al. Adversarial attacks against network intrusion detection in IoT systems. IEEE Internet Things J 2020; 8(13): 10327–10335.