Abstract
Wireless sensor nodes, the smallest components of a wireless sensor network (WSN), combine sensing and communication capabilities. Sensor nodes have basic networking capabilities, such as a wireless connection to other nodes, data storage, and a microcontroller for basic processing. The intrusion detection problem is well studied and numerous techniques exist to address it, but they suffer from poor detection accuracy and a high false alarm ratio. To overcome this challenge, a novel Intrusion Detection via Salp Swarm Optimization based Deep Learning Algorithm (ID-SODA) is proposed, which classifies nodes as intrusion or non-intrusion nodes. The proposed ID-SODA technique uses the k-means clustering algorithm to perform clustering. The Salp Swarm Optimization (SSO) technique takes residual energy, distance, and cost into account during cluster head selection (CHS). The selected cluster heads are given as input to a multi-head convolutional neural network (MHCNN), which classifies nodes into intrusion and non-intrusion nodes. The performance of the suggested ID-SODA is evaluated in terms of accuracy, precision, F1 score, detection rate, recall, false alarm rate, and false negative rate. The suggested ID-SODA achieves an accuracy of 98.95%. The results show that ID-SODA improves overall accuracy by 6.56%, 2.94%, and 2.95% over SMOTE, SLGBM, and GWOSVM-IDS, respectively.
Keywords
Introduction
WSN is considered a promising technology in various fields, including smart home automation and industrial manufacturing [1]. WSNs are typically remote, self-organized networks constrained by the energy supply, computational power, and communication abilities of individual nodes [2]. These networks are typically deployed in locations with limited power and maintenance resources. Typical deployment areas include military applications, traffic monitoring, and environmental monitoring [3]. WSN security has become a growing concern due to the rapid growth and widespread use of smart sensor networks [4]. The nodes in WSNs are more susceptible to attacks than the nodes in resource-rich wired networks [5]. Because of the limitations of WSNs, however, relying only on preventive security measures is insufficient to protect against attacks. It is widely agreed that intrusion detection (ID) technologies should be used to protect WSNs from attacks [6]. Detection methods that decide whether an intrusion has occurred from the presence or absence of sound have a high error rate, but they are still in use because they are inexpensive. Due to WiFi’s growing use and declining cost, several technologies and approaches for intrusion detection based on channel state data have been put forth by domestic and foreign research organizations [7]. Intrusion detection systems (IDS) are network security solutions that detect vulnerabilities in applications or computers. An IDS can not only protect against intruder attacks but also improve the system’s protection capabilities based on identified attacks [8]. If the IDS is not configured with a reactive strategy, it only identifies the threat and alerts the network security administrator. The response is then to eliminate the threat and secure the infrastructure [9].
IDSs may be classified according to their data sources into three groups: (1) Host-based IDS, which does not rely on network data and instead checks whether data from the ID library within the system is abnormal [10]. It consumes a lot of CPU resources and is not appropriate for small, dispersed devices. (2) Network-based IDS, which captures real-time system data packets and builds an ID database for frequency analysis, pattern matching, and decisions on the data packets, although database updates are expensive. (3) Distributed IDS, in which the system combines both of the above, i.e., it can inspect both host operational data and network information. Deep learning (DL) is currently one of the fastest-emerging technologies for classifying and predicting data efficiently [11, 12], particularly in the field of communication and computer technology [13]. One of the newest technologies in the communication sector is WSN. The architecture and restricted resource availability of WSNs make it challenging to ensure security and energy efficiency. Routing and clustering are just two of the many techniques offered for secure WSNs. Existing work such as DSKMS [14] has addressed security, privacy [15, 16], violation detection, power consumption improvement, scalability [17, 18], and privacy in the face of these threats [19]. Other work speeds up the learning process and improves botnet resolution [20]. An IDS scans networks for intruders based on user actions but can have a high false alarm rate. To overcome this challenge, a novel ID-SODA is proposed. This deep learning-driven approach not only detects intrusions quickly but also minimizes network energy consumption due to its efficient processing.
The major contributions of this research are as follows. Initially, sensor nodes are clustered using the K-means clustering algorithm, which is a significant clustering technique for unsupervised learning tasks. After clustering, the cluster head is selected using the Salp Swarm Optimization technique, which considers parameters such as residual energy, distance, and cost. The MHCNN then classifies nodes into intrusion nodes and non-intrusion nodes. Finally, the intrusion node is blocked and the non-intrusion node is passed to the base station.
The remainder of the paper is structured as follows: Section II reviews the literature in detail. Section III describes the suggested ID-SODA technique. The results are given in Section IV and, finally, Section V presents the conclusion.
Literature survey
In recent years, many deep learning and machine learning approaches to intrusion detection and classification have been proposed. This section gives an overview of a few recent advancements and techniques for IDS.
In 2019, Vinayakumar et al. [22] presented Scale-Hybrid-IDS-AlertNet, which monitors network traffic and host-level events to continuously alert against potential network threats. Passing the input through the system’s multiple hidden layers yields an abstract, high-dimensional representation of the data. On the other hand, advanced DNN models are computationally expensive.
In 2019, Zhang W. et al. [23] introduced a hierarchical IDS based on the functions of nodes, with the aim of increasing the WSN intrusion detection system’s accuracy in detecting abnormal behavior and reducing the rate of false alarms. A multi-kernel ELM was created by linearly combining multiple kernel functions to build the ideal WSN intrusion detection system.
In 2019, Borkar GM. et al. [24] devised an effective clustering method with an ACSO algorithm for selecting CHs. This adaptive strategy considerably lowers the network’s time consumption while also improving the network’s lifetime and scalability. Moreover, an adaptive SVM classifier was used for two-stage classification in the IDS, where the abnormal sensor nodes were identified via an acknowledgment-based approach.
In 2019, Tan, X. et al. [25] proposed an approach in which SMOTE is used to balance the dataset. The intrusion detection classifier is then trained using the random forest method. In simulations using a benchmark intrusion dataset, the random forest algorithm’s accuracy was found to be 92.39%, which is greater than that of the other algorithms compared.
In 2020, Wang W. et al. [26] proposed SLGBM for WSN-based IDS. The SBS approach is used to reduce the dimensionality of the feature space of the real traffic data and so lessen the computing cost. A LightGBM method is then used to detect various network attacks. The proposed model achieves a better F1-score for normal classes.
In 2021, Zhao R. et al. [27] designed a deep learning LDAN-based network IDS. LDAN uses lightweight units to extract features, and the autoencoder was trained using a modified loss function for NID. The lightweight unit’s expansion and compression (EC) design can extract the characteristics and capture crucial data to identify threats while drastically reducing computation cost and system size.
In 2021, Hasheminejad, E. et al. [28] introduced a reliable data aggregation technique based on a tree structure in WSN. It consists of three phases: building a binary tree, authentication, and reliable data aggregation. The method was evaluated on NS2, and the simulation results show a significant reduction in energy consumption.
In 2021, Naghibi, M. et al. [29] introduced a secure data aggregation technique based on a combination of star and tree structures. In the Secure Hybrid Structure Data Aggregation (SHSDA), each node is assigned a parent node for data transfer. Experimental results show that the average power consumption and data delivery delay of SHSDA are lower than those of traditional methods.
In 2021, Gowdhaman V. and Dhanpal R. [30] introduced an IDS based on deep learning for WSN. Cross-correlation was used to select the best features, which reduced the network’s computational cost. The selected features were used as the building blocks for a deep neural network, and this model yields an accuracy of 95.53%.
In 2021, Yue C. et al. [31] designed an ensemble intrusion detection approach to defend the railway ECN from attacks such as Port Scan, IP Scan, DoS, and MITM. Six base classifiers are constructed from popular CNN and RNN architectures: LeNet-5, VGGNet, AlexNet, Simple RNN, LSTM, and GRU. The approach achieves an accuracy of 97.5%.
In 2021, Safaldin M. et al. [32] introduced a modified binary gray wolf optimization algorithm with an SVM classifier for enhanced IDS (GWOSVM-IDS). The objective of the proposed solution was to improve the accuracy and detection rate by minimizing the false alarm ratio and the large number of features produced by IDSs in the WSN environment, while also reducing the processing time.
In 2022, Liu G. et al. [33] devised an edge intelligence framework that can detect DoS intrusions against a WSN using the k-NN algorithm and the arithmetic optimization algorithm (AOA) in the growth estimation. To improve the accuracy of the network, an equivalent method was used to relay information among the population, and the Lévy flight approach was implemented to adapt the optimization.
Research gap
A thorough examination of the literature revealed the following research gaps. Despite many advances in securing wireless communications, wireless sensor networks still face difficulties when it comes to secure data transmission, and research is still underway to develop new reliable and secure routing protocols for data transmission. Most articles focused on energy-efficient routing, including single-hop and multi-hop routing, and few addressed the security of data transmission during routing.
In this paper, a novel ID-SODA is proposed which classifies intrusion nodes and non-intrusion nodes in wireless sensor networks. The design goals of ID-SODA are: improving network security by removing cryptanalysis nodes, reducing energy consumption in the network, reducing required memory, and improving the accuracy rate while decreasing the false alarm rate by using a deep neural classification network.
Proposed methodology
In this section, a novel ID-SODA is proposed for detecting intrusions in wireless sensor networks to improve network security. Initially, sensor nodes are clustered using the K-means clustering algorithm, which is a significant clustering technique for unsupervised learning tasks. After clustering, the cluster head is selected using the Salp Swarm Optimization technique, which considers parameters such as residual energy, distance, and cost. To save energy, an MHCNN is then used to classify nodes into intrusion nodes and non-intrusion nodes. Finally, the intrusion node is blocked and the non-intrusion node is passed to the base station. The overall workflow of the proposed system is shown in Fig. 1.

The schematic representation of the ID-SODA
K-means clustering is a significant clustering technique for unsupervised learning tasks. It is primarily built on Euclidean distances, and the node residual energies influence the choice of the cluster head. The proposed method uses the k-means clustering algorithm to perform clustering. By minimizing an objective function based on a squared-error function (SEF), this technique seeks to locate the optimal cluster centroids. To this end, the central node gathers the position, residual energy, and node ID of each node and saves them in a list. It begins running the clustering algorithm after collecting this data from all nodes. K-means forms better clusters when the average distance between the nodes within each cluster is reduced. It is effective at distributing the nodes evenly across clusters and balancing the network’s load, which makes it helpful for creating clusters in various WSN applications. The K-means algorithm’s objective function is defined as:

Intrusion detection system (IDS).
Where $\mathrm{dis}(a_x, z_y)^2$ is the squared Euclidean distance between node $a_x$ and its cluster centroid $z_y$, $x$ runs over the nodes, and $y$ runs over the clusters. This algorithmic process comprises several parts.
This results in the clustering of nodes into k clusters, with one cluster head to be selected for each of the k clusters.
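The clustering step can be sketched as follows. This is a minimal NumPy illustration of standard K-means (Lloyd iterations) that minimizes the squared-error function; the function name `kmeans_nodes`, the default iteration count, and the random seed are assumptions made for illustration and are not taken from the proposed implementation.

```python
import numpy as np

def kmeans_nodes(positions, k, n_iter=50, seed=0):
    """Cluster node positions into k groups with plain Lloyd iterations.

    Returns (labels, centroids, sef), where sef is the squared-error
    function: the sum of squared Euclidean distances between each node
    and the centroid of the cluster it is assigned to.
    """
    rng = np.random.default_rng(seed)
    positions = np.asarray(positions, dtype=float)
    # Initialise the centroids from k distinct, randomly chosen nodes.
    centroids = positions[rng.choice(len(positions), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every node to every centroid: shape (n, k).
        d2 = ((positions[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = positions[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Final assignment and objective value for the converged centroids.
    d2 = ((positions[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    sef = d2[np.arange(len(positions)), labels].sum()
    return labels, centroids, sef
```

In this sketch only node positions drive the clustering; residual energy and node ID, which the central node also collects, would be used afterwards when a cluster head is chosen within each cluster.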
The salp swarm algorithm (SSA) is used to optimize the network lifetime. If a node is damaged and unable to transmit data, nearby nodes are used to replace it. The cluster head SSO version described in this work improves on the previous SSO by using node replacement. Biological studies of salps are challenging because their habitats are difficult to access and the animals are hard to keep within a confined area. In deep waters, salps swarm in a formation known as the salp chain, which allows greater motility while foraging. Eq. 3 provides the mathematical model for SSO.
Here L denotes the current round, M is the maximum number of rounds, and $q_1$ is the key coefficient of SSA. The SSA first updates the location of the leader and then updates the positions of the followers.
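The leader and follower updates can be illustrated with a short sketch of the basic salp swarm update. Several things here are assumptions rather than details from the text: the coefficient $q_1$ is taken to follow the standard SSA schedule $2e^{-(4L/M)^2}$, the leader uses the usual random coefficients (named `q2` and `q3` here), and the objective is a generic `fitness` callable, since the exact weighting of residual energy, distance, and cost is not given. The function name `salp_swarm_minimize` is hypothetical.

```python
import numpy as np

def salp_swarm_minimize(fitness, lb, ub, n_salps=20, max_rounds=100, seed=0):
    """Minimise `fitness` over the box [lb, ub] with the basic SSA update."""
    rng = np.random.default_rng(seed)
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    dim = lb.size
    salps = rng.uniform(lb, ub, size=(n_salps, dim))
    scores = np.array([fitness(s) for s in salps])
    best = salps[scores.argmin()].copy()  # food source: best position so far
    best_score = scores.min()
    for rnd in range(1, max_rounds + 1):
        # q1 shrinks from about 2 towards 0 as the rounds progress.
        q1 = 2.0 * np.exp(-(4.0 * rnd / max_rounds) ** 2)
        for i in range(n_salps):
            if i == 0:
                # The leader moves around the food source.
                q2 = rng.random(dim)
                q3 = rng.random(dim)
                step = q1 * ((ub - lb) * q2 + lb)
                salps[i] = np.where(q3 >= 0.5, best + step, best - step)
            else:
                # Each follower moves to the midpoint with the salp ahead of it.
                salps[i] = 0.5 * (salps[i] + salps[i - 1])
        salps = np.clip(salps, lb, ub)
        scores = np.array([fitness(s) for s in salps])
        if scores.min() < best_score:
            best_score = scores.min()
            best = salps[scores.argmin()].copy()
    return best, best_score
```

For cluster head selection, `fitness` would score a candidate position using residual energy, distance, and cost; any such scoring function can be passed in without changing the update loop.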
An IDS is a typical security measure in computer systems. Unlike a firewall, an IDS is installed inside the system to monitor all internal traffic, and it is an effective way to secure the system. Intrusion detection systems operate in networked systems to detect violations of security procedures and standard security measures. IDS units are part of the network. IDSs are categorized as host-based, network-based, or hybrid. Real-time data are collected, pre-processed using data mining techniques, and passed to the IDS. A network-based IDS module gets its input entirely from the network. A packet-inspection IDS detects suspicious attacks by inspecting every packet that passes through the network.
In contrast, network devices with host-based IDS are equipped with an IDS module. An IDS observes the network for any intrusions based on user actions, but it may create a high false alarm ratio. To overcome this challenge, the deep neural network is implemented. The output of the IDS is given to a multi-head convolutional neural network, which will extract the features and classify them into intruded data and non-intruded data.
Multi-Head Convolutional Neural Network (MHCNN)
In a multi-head convolution, each time series is processed by a separate CNN called a convolution head. As shown in Fig. 3, MHCNNs are built using one-dimensional convolutions, and the dimensionality affects how the incoming data is processed. The convolution heads are in charge of extracting usable information from the sensor data. In many industrial applications, there is no correlation between the sensors installed in machines. They frequently form part of a heterogeneous sensor network, which lets them collect data of various types, real-value scales, and even intervals. It is therefore appropriate to treat them separately. Instead of processing a time series all at once, the proposed design processes it using a sliding window. In industrial systems with multiple phases of operation, it is common for a single time series to represent multiple behaviors. A total of 80 features common to all sub-datasets are included. These common features make it possible to develop and evaluate any attack model in the same environment. Finally, the 79 features other than “timestamp” are selected, as shown in Table 1.

Architecture of Multi-head Convolutional neural network.
Extracted features for CNN-based intrusion detection
Each traffic record has 41 features, one class label, and one difficulty label. The features comprise basic features (No. 1–No. 10), content features (No. 11–No. 22), and traffic features (No. 23–No. 41). Because the time series is processed window by window, the network extracts features for each episode. The alternative is to extract features from the full time series, which risks missing the important phases. A convolution is described as:
Where y is the result of applying the filter f(i, j), of length P and breadth Q, to the input x(i, j). To handle numerous time series, an MHCNN uses many one-dimensional convolutions in one channel. When dealing with a large number of time series, it is common to use CNNs with multiple channels, where each channel represents a different time series. When a large number of time series are analyzed by a multi-channel CNN, a single feature map is produced that contains the key traits of every time series. The final result is determined by the combination of the features, which are extracted independently for each channel, each with its own set of filters. Merging all of the sensor data in this way can eliminate the distinctive properties of each sensor. An MHCNN instead isolates the features of each time series, so each time series is given its own feature map. On the other hand, this affects the number of attributes in the convolutional network. The number of attributes in each layer of a typical MHCNN is calculated using Equation (7).
Where fn is the number of filters, fs is the filter size, and l is the length of the output vectors produced by the previous layer. Additionally, at the MHCNN level the number of attributes is scaled by the number of sensors.
Because the convolutional network in this design seeks to extract the essential properties of each input, the characteristics of each input can only be preserved by processing the inputs on separate convolutional networks. As a consequence, the number of attributes increases linearly with the number of sensors. Each head of the MH-CNN requires a four-dimensional input, which is calculated as follows:
The convolution is applied to each window to produce a collection of feature maps, which are then arranged in order within the sequence. These patterns are interconnected because they influence one another. Feature maps with the same window number (w) are fused together, and concatenating all feature maps yields the complete set of feature maps. For multiple time series, CNNs have typically been used in multi-channel form, with each channel representing a different variable. To analyze multiple time series, a multi-head CNN instead uses many one-dimensional convolutions in a single channel.
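The window-based, per-sensor processing described above can be sketched in NumPy. This is an illustration under stated assumptions rather than the proposed network: the filters are fixed arrays instead of learned weights, all heads share one kernel size so that their feature maps can be concatenated, a ReLU activation is assumed, and the names `sliding_windows`, `conv_head`, and `multi_head_features` are hypothetical.

```python
import numpy as np

def sliding_windows(series, width, stride):
    """Split a 1-D series into overlapping windows: shape (n_windows, width)."""
    starts = range(0, len(series) - width + 1, stride)
    return np.stack([series[s:s + width] for s in starts])

def conv_head(windows, filters):
    """Apply one head's 1-D filters to every window ('valid' mode, ReLU).

    windows: (n_windows, width); filters: (n_filters, kernel_size).
    Returns feature maps of shape
    (n_windows, n_filters, width - kernel_size + 1).
    As in most CNN libraries, the "convolution" is a cross-correlation.
    """
    out = np.stack([
        np.stack([np.correlate(w, f, mode="valid") for f in filters])
        for w in windows
    ])
    return np.maximum(out, 0.0)

def multi_head_features(sensor_series, head_filters, width, stride):
    """One convolution head per sensor; fuse maps that share a window index."""
    per_head = [
        conv_head(sliding_windows(np.asarray(s, float), width, stride), f)
        for s, f in zip(sensor_series, head_filters)
    ]
    # Concatenate along the filter axis so each window keeps every head's maps.
    return np.concatenate(per_head, axis=1)
```

Because each sensor has its own filter bank, the total number of filter weights grows linearly with the number of sensors, which matches the linear growth noted in the text.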
MH denotes the multi-head operation used to fuse the heads $h_1, h_2, \ldots, h_n$ with the transformation matrix $T$. When there are multiple time series, a multi-channel CNN creates a single feature map that includes the key characteristics of all the time series. In comparison, the multi-head CNN extracts the features of each time series separately, so each time series is given its own feature map. The feature vectors $v_{avg}$ are produced by applying average pooling to the output matrix, and they are used as the input to the fully connected layer. Finally, a dense layer sends the network’s output to the final SoftMax classifier, which produces the classification outcomes shown in Equation 11.
The weight matrix is denoted by $w_m$. MH-CNN uses the SoftMax function as the nonlinear activation function for classification. The SoftMax function computes the probability of each class, and the class with the highest probability is selected. The basic softmax activation function is:
Where x is the input to the softmax function, $e^{x_i}$ is the standard exponential function applied to each input element, and k is the number of classes. The outputs sum to one, so they form a valid probability distribution. Cross-entropy is used as the loss function, and the suggested approach can effectively capture important information at all scales, from the local to the global. The cross-entropy loss function is defined as follows:
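As a concrete illustration, the standard textbook forms of softmax and cross-entropy can be written in a few lines of NumPy. The sketch assumes integer class labels and averages the loss over the batch; the function names and the small clipping constant `eps` are choices made here for illustration, not details from the proposed model.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis (the k classes)."""
    z = x - np.max(x, axis=-1, keepdims=True)  # shift to avoid overflow
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(probs, labels, eps=1e-12):
    """Mean cross-entropy between predicted probabilities and integer labels."""
    probs = np.clip(probs, eps, 1.0)  # guard against log(0)
    picked = probs[np.arange(len(labels)), labels]
    return -np.mean(np.log(picked))
```

Subtracting the row maximum before exponentiating leaves the result unchanged but keeps the exponentials in a safe numeric range.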
Results and discussion
In this section, the suggested ID-SODA approach is assessed on the NSL-KDD dataset using various metrics, including accuracy, precision, F1 score, recall, and detection rate. The performance of the suggested strategy against novel attacks is also confirmed, and the benchmark includes the overall accuracy rate, which is specifically defined and evaluated. Additionally, a comparison between the suggested ID-SODA system and traditional DL models is presented.
Table 2 lists the KDD Train data set, used as the training set, and the KDD Test data set, used as the test set; both contain normal records and four different types of attack records.
Various classifications in the NSL-KDD dataset
The performance of the suggested ID-SODA model is measured using the following evaluation metrics: F1 score, precision, accuracy, detection rate (recall), and false alarm rate.
Where TP is the number of true positives, TN the number of true negatives, FP the number of false positives, and FN the number of false negatives among the samples.
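These metrics follow directly from the four confusion-matrix counts. The sketch below uses their standard definitions, with the false alarm rate taken as the false positive rate and the detection rate as recall; the function name `ids_metrics` and the counts in the usage example are illustrative assumptions, not results from this study.

```python
def ids_metrics(tp, tn, fp, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # also called the detection rate
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "false_alarm_rate": fp / (fp + tn),     # false positive rate
        "false_negative_rate": fn / (fn + tp),  # equals 1 - recall
    }

# Illustrative counts only.
metrics = ids_metrics(tp=90, tn=85, fp=15, fn=10)
```

With these example counts the accuracy is 0.875, the false alarm rate is 0.15, and the false negative rate is 0.10.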
Table 3 shows the classification results for several classes of WiFi intrusion detection with respect to particular parameters, reporting the average accuracy, recall, F1 score, precision, and specificity of the proposed ID-SODA. Figure 4 illustrates the average accuracy of the proposed ID-SODA, with reported values of 99.25% and 98.65%.
Performance analysis of the suggested ID-SODA

Performance analysis of the ID-SODA.
Figure 6 shows the number of packets detected from the attacker’s source to the target node within a fixed time window. Figure 5 shows the attack detection times for attacks against networks with different numbers of nodes. Intrusion detection is calculated in relation to the number of anomalous packets arriving at their destination over a period of time. Due to heavy network traffic, packets are diverted to another route to reach their destination. As the number of routers in the network increases, the number of alternate routes to the target also increases, smoothing out the time it takes attack packets to reach the target.

Intrusion detection time based on network size.

Instant packet reaching the target.
The performance of the suggested strategy was compared with that of existing techniques to demonstrate that it is more effective. In this comparative study, the suggested model is compared against four existing approaches. Table 4 compares the overall performance of the suggested method with DL models such as ResNet, AlexNet, DenseNet, and CNN.
Comparative analysis of deep learning networks with the ID-SODA
Table 4 shows the outcomes for the overall accuracy rate. Compared with MHCNN, traditional networks such as ResNet, AlexNet, and DenseNet achieve lower accuracy. MHCNN achieves a high accuracy of 95.98%. Figure 7 shows that the accuracy obtained by ResNet, AlexNet, and DenseNet is 90.34%, 94.39%, and 89.92%, respectively. The specificity obtained by ResNet, AlexNet, DenseNet, and MHCNN is 91.19%, 89.98%, 90.18%, and 93.89%, respectively. In the same order, the precision is 89.58%, 93.87%, 91.49%, and 94.29%; the recall is 93.51%, 89.59%, 92.87%, and 95.18%; and the F1 score is 90.31%, 89.98%, 87.89%, and 95.34%. MHCNN thus achieves a higher accuracy rate than the currently used models.

Architecture of GhostNet.
Figure 8 shows the false alarm ratio evaluation of the proposed MHCNN with traditional deep learning classifiers. The proposed MHCNN achieves higher detection accuracy than other traditional DL methods with a very low false alarm rate (FAR) or false positive rate.

FAR comparison of MHCNN with traditional DL classifiers.
Figure 9 shows the false negative rate evaluation of the proposed MHCNN with traditional deep-learning classifiers. The proposed MHCNN achieves higher detection accuracy than other traditional DL methods with a very low false negative rate (FNR).

FNR comparison of MHCNN with traditional DL classifiers.
The effectiveness of the suggested ID-SODA approach was compared with the classic DL models based on the classification results, in which the proposed ID-SODA achieves a better accuracy rate of 98.04%. From these results, we conclude that the proposed ID-SODA approach produces better accuracy with a low false alarm rate. Figure 10 compares the proposed ID-SODA method with existing approaches such as SMOTE, SLGBM, and GWOSVM-IDS in terms of network lifetime. The proposed strategy increases network lifespan. As the number of nodes increases, the lifespan of the nodes increases. The graph below shows the network node lifetime pattern (for multiple technologies). With ID-SODA, the network has a longer lifespan, performs much better than previous approaches, and has a lower dead node percentage.

Increasing network lifetime.
In terms of energy consumption, Fig. 11 compares the proposed ID-SODA with existing approaches such as SMOTE, SLGBM, and GWOSVM-IDS. The proposed protocol consumes less energy than the other existing protocols. The proposed ID-SODA involves a minimal number of nodes in the packet forwarding process. The node with the highest is always preferred when forwarding packets, thus saving a lot of energy. Conversely, in the existing methods, more nodes are responsible for forwarding the same packet. As a result, SMOTE, SLGBM, and GWOSVM-IDS consume more power.

Comparison of total energy consumption.
Intrusion detection time is the total amount of time needed to identify the attacker, identify susceptible behavior, and contact the coordinator to validate other properties. Figure 12 displays the average amount of time needed to identify a single attack. Compared with existing approaches, the suggested ID-SODA has the shortest attack detection time and is about 65% more effective than SMOTE, SLGBM, and GWOSVM-IDS. However, the proposed method does not identify the specific type of attack.

Intrusion detection time.
Figure 13 compares the accuracy of the proposed ID-SODA with that of SMOTE, SLGBM, and GWOSVM-IDS. The proposed ID-SODA achieves higher detection accuracy than the other methods, with the proposed MHCNN-IDS achieving an accuracy of 98.95%.

Compare proposed accuracy with existing methods.
As Table 5 illustrates, the suggested ID-SODA model improves overall accuracy by 6.56%, 2.94%, and 2.95% compared with SMOTE, SLGBM, and GWOSVM-IDS, respectively. According to this comparison, the suggested ID-SODA model outperforms the current models in terms of accuracy. Future developments will aim to improve the suggested method’s accuracy further.
Comparison between existing and suggested ID-SODA
Conclusion
In this paper, a novel Intrusion Detection via Salp Swarm Optimization based Deep Learning Algorithm (ID-SODA) is proposed, which classifies nodes as intrusion or non-intrusion nodes. The proposed method was simulated using MATLAB. Its performance is evaluated in terms of accuracy, precision, F1 score, detection rate, recall, false alarm rate, and false negative rate. The proposed ID-SODA achieves an accuracy of 98.95%. The results show that the proposed ID-SODA improves overall accuracy by 6.56%, 2.94%, and 2.95% over SMOTE, SLGBM, and GWOSVM-IDS, respectively. In the future, the proposed method can be extended to detect a wider range of attacks, using different algorithms to predict the final result. Moreover, different feature selection methods can be analyzed to reduce the computational complexity of the system. Additionally, the problem of intrusion detection in wireless cellular networks remains to be addressed.
