Abstract
Network Intrusion Detection Systems (NIDS) play a vital role in securing Wireless Sensor Networks (WSNs). However, traditional NIDS models face critical constraints when handling network traffic data because modern attacks have grown in complexity. These constraints directly affect the overall performance of the WSN. In this paper, a robust network intrusion classification framework based on an enhanced pre-trained Visual Geometry Group (VGG-19) model is proposed to improve the performance of WSNs. First, pre-trained weights from the ImageNet dataset are used to initialize the parameters of VGG-19. Afterward, a Hybrid Deep Neural Network based on a Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) is employed to extract influential features from network traffic data and thereby increase intrusion detection accuracy. The proposed VGG-19 + Hybrid CNN-LSTM model uses both binary and multi-class classification to classify traffic as either normal or attacked. A network intrusion benchmark dataset is used to assess the performance of the proposed system. The results show that the proposed VGG-19 + Hybrid CNN-LSTM learning system surpasses other pre-trained models, with an accuracy of 98.86% in the multi-class classification test.
Introduction
In recent years, WSNs have been widely employed in smart applications such as smart healthcare, smart cities, military, aerospace, and environmental monitoring [1]. Nevertheless, forming a secure WSN is challenging for these applications because of dynamically changing routing, resource-constrained sensor nodes, distributed monitoring, the small size of the nodes, and the absence of switches or gateways supervising the information flow. In the last few decades, several security areas of WSNs have been investigated: (a) encryption algorithms, (b) multiple attack models and defence tactics, (c) key management, (d) intrusion detection and response models, and (e) network security architecture [2–4]. Multiple security defence and detection mechanisms may be applied in different network layers, where they support each other.
While WSNs have numerous benefits such as minimal installation costs and unattended network operation, the lack of a physical line of defence makes security a major concern [5]. An attacker can snoop on radio traffic and/or interrupt shared readings in wireless networks. As a result, it is essential to safeguard sensor node data, particularly where secrecy is critical. WSNs are also typically exposed to many new attacks that are not found in traditional networks [6].
An open challenge is how to increase the intrusion detection system’s ability to detect unexpected threats and how to choose appropriate algorithms. Certain algorithms are effective at identifying known attacks, whereas others are better at detecting unknown ones. Some algorithms perform well with a flat network structure, whereas others work well with a hierarchical network structure. The optimal algorithm should be chosen or designed based on the network’s requirements.
Most present intrusion detection systems add overhead in terms of messages exchanged for decision-making, which depletes a lot of energy and shortens the life of WSNs [7–9]. NIDS are vital in the defence of WSNs because they detect unauthorized access and investigate potential threats. Traditional NIDS struggle to keep up with newly developed security threats that are unpredictable and complex [10]. The security threats to traffic data are growing at an exponential rate, and newly developed attacks are becoming increasingly complex and varied. As a result, typical NIDS approaches, including heuristic, signature-based, and behaviour-based detection, are ineffective against such malicious events. Hence, an efficient NIDS is needed to reliably detect the various attacks in a WSN environment.
Two key constraints exist when developing an effective and responsive NIDS for unknown future threats [11]. First, selecting the appropriate features from a network traffic dataset for anomaly detection is difficult. Because attack conditions constantly change and evolve, features learned for one type of attack may not work well for other types. Second, there is a lack of labelled traffic datasets from real networks that can be used to build a NIDS.
In the last few decades, Machine Learning (ML) algorithms have shown remarkable results in classifying network intrusions [12]. These ML algorithms can overcome some limitations of standard detection methods and deliver satisfactory accuracy. The Deep Neural Network (DNN) is an advanced ML approach that is widely employed in applications such as natural language processing, computer vision, and speech recognition [13].
There are two ways to train the classifier model: training from scratch and transfer learning. Training from scratch starts from randomly initialized weights. In transfer learning, the model is pre-trained on a related task and is then optimized for the target task [14]. Rather than developing the network from scratch, transfer learning solves similar problems by reusing weights or other forms of knowledge from previously trained models (e.g., pre-trained VGG).
Transfer learning with a pre-trained model is a successful and widely used approach, but it has rarely been employed for network intrusion detection. It is an extended ML approach that uses knowledge, such as the weights of previously trained models, to improve learning on a new problem [15]. Transfer learning with pre-trained models has been applied effectively to image classification problems to reduce computation cost and increase accuracy. Widely used pre-trained models for image classification and visual recognition include VGG-16, VGG-19, AlexNet, GoogleNet, ResNet, MobileNet, and Inception [16]. To improve classification performance, these models use deep network architectures trained on the ImageNet dataset.
In 2012, an 8-layer deep architecture called AlexNet was developed and achieved a top-5 classification error of 16.4 percent on the ImageNet challenge. In 2014, the VGG-based architectures, including VGG-16 (sixteen layers) and VGG-19 (nineteen layers), achieved a top-5 classification error of 7.3 percent. Therefore, VGG-19 is preferred in this work for intrusion detection in WSNs.
The key hypothesis of the proposed model is that several attacks can be detected in minimal time. In this work, a new enhanced hybrid architecture is introduced that combines the VGG-19 architecture with pre-trained convolutional layers to support transfer learning for intrusion detection. The Hybrid DNN model is integrated with VGG-19 to enhance the classification process. Here, the intrusion inputs are converted to a grayscale image, which is then converted to a Red-Green-Blue (RGB) image format and finally fed into the pre-trained VGG-19 model. The results show that the proposed model improves intrusion detection, saves sensor node resources, and reduces energy consumption in the WSN environment.
The contributions of this work are as follows. A Hybrid DNN model is formulated from CNN and LSTM algorithms. The resulting CNN-LSTM model is integrated with VGG-19 to extract temporal and spatial features from network traffic data. This hybrid formulation helps the proposed framework achieve better intrusion detection during the classification phase. The proposed framework uses both binary and multi-class classification to classify nodes as either normal or attacked.
The remainder of the article is laid out as follows. Related work is discussed in Section 2. The proposed technique and its requirements are described in Section 3. Section 4 presents the experimental configuration and the results. Finally, Section 5 concludes the work.
Related work
This section discusses various ML and transfer learning approaches for intrusion detection in WSNs. Typical NIDS models face difficulties owing to the exponential growth of network traffic data and the demands of current attacks. To detect modern attacks in WSNs, a Decision Tree (DT)-based classifier model has been established for NIDS [17]. The DT model performs both binary and multi-class classification of network traffic data. A multi-class classification scheme offers more detailed attack statistics than a binary classification scheme. Like a binary classification scheme, a multi-class classification scheme recognizes a node as either normal or attacked. The model begins with data pre-processing to eliminate duplicate records and missing values from the input dataset. The feature selection stage then learns the critical attributes by removing redundant attributes of the input dataset. In the final phase, a robust DT classifier is applied to classify the various network attacks. However, the model suffers from class imbalance, which is more pronounced in multi-class datasets.
Toldinas et al. proposed a hybrid model that integrates ResNet50 and a DNN to detect attacks in WSNs [18]. The learned features are converted into four channels. They test the hybrid model on two different datasets. The results show that the combined model is more accurate and takes less time to train and test than existing models. When identifying network intrusions in real time, class imbalance is the key issue with a predominant impact on classification results. The hybrid ResNet50 + DNN model predicts the majority classes well but cannot identify the sparse classes. Inadequate identification can result in the loss of vital information and disrupt the classification process.
Liu et al. [19] examined the accuracy of intrusion detection using deep learning (DL) and ML models. A Random Forest (RF) classifier is introduced to detect various attacks in WSNs. This model uses binary classification to divide the NSL-KDD dataset into benign and attack categories. They also perform multi-class classification, which divides the data into four distinct groups. While all four models are highly accurate in binary classification, intrusion detection effectiveness in multi-class classification differs depending on the model type.
In [20], a NIDS model was established through a DNN based on the VGG-16 framework. The VGG-16 + DNN framework detects attacks in two stages. First, features are learned via the VGG-16 model. Next, the DNN is applied to the learned features for binary classification. The combined model provides higher accuracy with fewer misclassification errors. Nevertheless, this model adds complexity to the classifier network.
Transfer learning is a well-recognized ML approach that has been widely applied to classification tasks. There are a few documented studies using transfer learning for NIDS in WSNs. A transfer learning-based technique built on VGG-16 was presented to acquire useful information from a source dataset and transfer it to the learning of a target dataset [21]. VGG-16 is combined with a DNN (VGG-16 + DNN), with NSL-KDD as the target dataset and UNSW-NB15 as the source dataset. The VGG-16 model weights are accessible on a variety of platforms, including Keras, and may be used for further investigation, modelling, and application development. The study also examines the Inception V3 + DNN and VGG-19 + DNN techniques for classifying attacks in WSNs. In the experimental results, using transfer learning instead of a standard CNN yielded considerable performance gains. However, this kind of combined model yields higher misclassification errors in the NIDS.
In the last few decades, a huge number of attacks have emerged in the WSN environment. Researchers have employed DNN models to effectively find malware activity and recognize various network intrusions [22]. First, a sparse autoencoder with optimal normalization is used to extract sparse features. Next, a DNN is used to predict and categorize the different attacks, including multi-class classification from the learned features. The optimal normalization helps to train an efficient model on high-dimensional datasets. However, the sparse autoencoder struggles to learn dense features in intrusion detection.
Identifying malicious nodes during data transmission between a sensor node and the central sink node is an active research topic. To this end, a Support Vector Machine (SVM) model has been presented to detect attacks in minimal time [23]. The SVM model employs a two-phase classification mechanism to address the constraints of intrusion detection. In the first phase, an acknowledgment message is flooded to determine whether the corresponding node is abnormal. Based on the received information, the second phase identifies the category of attack. The acknowledgment-based strategy offers accurate detection of malicious nodes in WSNs. However, the acknowledgment strategy depletes additional energy at each sensor node.
In [24], a collaborative model based on a CNN and a Recurrent Neural Network (RNN) was proposed to address various issues related to intrusion detection in WSNs. Here, the two most popular DNN models are employed to learn the spatial and temporal features. CNNs are quite effective in image recognition. However, it has been demonstrated that a CNN can only analyze a single input packet; it is unable to assess timing information across a traffic flow. In an attack scenario, a single packet may look like normal data, and the traffic becomes malicious only when a large number of packets are transmitted simultaneously or within a short duration. A CNN alone does not capture this case, which could result in a large number of missed warnings.
In [25], a robust Logistic Regression (LR)-based NIDS model was introduced to detect intrusions in the WSN environment. The LR model takes its input from local node parameters. This information is then used to detect both benign and malicious behavior. The model also builds a profile of typical behavior, which helps to identify abnormalities within a constrained node. The training stage of LR determines whether the model can be used to detect malicious activity of the deployed nodes in the WSN. The experimental findings showed good precision in both binary and multi-class classification. However, the LR model failed to obtain high classification accuracy on larger datasets.
In [26], a Naive Bayes (NB) classifier was proposed to address intrusion detection issues in the WSN environment. The likelihood of every feature belonging to every class is computed for the prediction process. The class with the maximum likelihood is then selected as the output of the NB classifier. Similar to [17], this model is evaluated on both binary and multi-class classification. The NB classifier provides better accuracy and precision on larger datasets. Nevertheless, it struggles to categorize modern types of attacks and malicious nodes in WSNs.
In general, the signs of some attacks are easily recognized, whereas other kinds of attacks show only slight deviations from normal patterns. A K-Nearest Neighbour (KNN) classifier is applied in [27] to identify several attacks in WSNs. The model mainly focuses on two categories of intrusion detection: host-based and network-based. The host-based technique makes its decisions based on the information received from a single host. The network-based technique acquires data by monitoring the network traffic throughout the sensing field. The aim of KNN is to categorize a new attack according to its characteristics and the training samples. It does not need to fit a model because it is memory-based. Nonetheless, the KNN model typically suffers from high computational time.
According to the above analysis, most of the existing frameworks fail to enhance the efficiency and accuracy of intrusion detection in WSNs, which in turn reduces the overall performance of the WSN. To address this, the proposed framework is built on the enhanced VGG-19 model to optimize the efficiency and accuracy of intrusion detection in WSNs. In the enhanced VGG-19 architecture, an extra convolutional layer is added to each of the 3rd, 4th, and 5th blocks of VGG-16, which is the key difference between the existing VGG-16 and VGG-19 models. The optimal utilization of blocks in the proposed enhanced VGG-19 model facilitates accurate classification of various attacks. The implementation and results of the proposed enhanced VGG-19 and Hybrid DNN models are discussed in the following sections.
Materials and methods
Datasets
The CIC-IDS-2017 [28] dataset is employed in the proposed work. It includes benign traffic and up-to-date known attacks, and it closely reflects real-world data (PCAPs). The results of network traffic analysis using CICFlowMeter are also provided as CSV files, comprising flows labelled on the basis of timestamps, source and destination IPs, source and destination ports, protocols, and attack vectors. Each network connection record contains data regarding the interaction with the ID traffic input as well as a label that indicates whether the connection is normal or attacked.
The CIC-IDS-2017 dataset is built with several properties, such as anonymity, capture of the entire network traffic, diversity of attacks, available protocols, monitoring of the entire network interaction, definition of the entire network architecture, and labelled data samples. The dataset was created using the notion of profiles to provide well-ordered data that reflects in-depth knowledge of attacks as well as abstract knowledge of various application models, protocols, and network devices. Both binary and multi-class classification are performed using the files in the dataset. An optimal NIDS is capable of accurately detecting each attack type and hence prolongs the lifespan of the network.
Data pre-processing
The data in PCAP format is used to build the model input in the data pre-processing step. To formulate the model input, the proposed framework performs several steps, namely time division, traffic segmentation, production, and labelling of XML files, as depicted in Fig. 1. First, time division extracts the relevant period from the original PCAP file based on the attack time and type. Subsequently, traffic segmentation splits the PCAP file obtained in the first step into matching sessions according to the IP addresses of the victim host and the attack host. After extraction, the PCAP file remains enormous, which makes it difficult to load the data into the model. This work therefore serializes the traffic using Python’s pickle module to speed up data reading.

Fig. 1. Sequential flow of the pre-processing phase.
During the data pre-processing phase, data items with missing values are filled in with patches, and all data is subsequently encoded into numerical values. To improve learning efficiency, the values are further scaled (standardized). Finally, the data is reshaped into the input format required by the ConvNet.
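As an illustration of the final reshaping step, the following minimal sketch shows one way a scaled feature vector could be arranged into a grayscale image and then replicated into three channels. The function name `features_to_rgb_image`, the zero-padding to a square, and the channel replication are assumptions made for illustration only; they are not taken from the proposed framework.

```python
import math


def features_to_rgb_image(features):
    """Map a feature vector with values in [0, 1] to grayscale and RGB images.

    Assumption: the vector is zero-padded to the next square size, reshaped
    into a side x side grayscale image with 8-bit intensities, and the single
    channel is copied three times to form an RGB image.
    """
    side = math.ceil(math.sqrt(len(features)))
    padded = list(features) + [0.0] * (side * side - len(features))
    # Grayscale image: one 8-bit intensity per pixel.
    gray = [
        [round(255 * padded[r * side + c]) for c in range(side)]
        for r in range(side)
    ]
    # RGB image: copy the grayscale intensity into the R, G, and B channels.
    rgb = [[[pixel, pixel, pixel] for pixel in row] for row in gray]
    return gray, rgb


# Hypothetical five-feature record, already scaled to [0, 1].
gray, rgb = features_to_rgb_image([0.0, 0.2, 1.0, 0.25, 0.75])
```

Five features are padded to nine values here, giving a 3 × 3 image. A real flow record would have many more features and a correspondingly larger image.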
The proposed framework has been evaluated using the CIC-IDS-2017 dataset. The pre-processing stage is essential for converting the original raw information into an image format that can be fed into the proposed framework. The categorical data in CIC-IDS-2017 must first be converted into numeric data, and the whole dataset must be standardized. Service, protocol type, and flag are the three categorical attributes of CIC-IDS-2017 that are encoded into numeric form with the one-hot encoding approach.
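The one-hot encoding of a categorical attribute can be sketched as follows. This is a minimal pure-Python illustration; the helper name `one_hot_encode` and the sample protocol values are hypothetical and are not drawn from the dataset.

```python
def one_hot_encode(values):
    """One-hot encode a list of categorical values.

    Returns the sorted list of categories and one 0/1 vector per input value,
    with a single 1 at the position of that value's category.
    """
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        encoded.append(row)
    return categories, encoded


# Hypothetical protocol-type column, for illustration only.
categories, encoded = one_hot_encode(["tcp", "udp", "icmp", "tcp"])
```

Each categorical attribute is expanded into as many binary columns as it has distinct values, so the encoded columns can be scaled and reshaped together with the numeric features.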
For CIC-IDS-2017, the proposed model uses the min-max standardization method to scale the raw information to a predefined range of 0 to 1. The standardization keeps the data distribution consistent and mitigates the problem of exploding gradients during the training stage. The min-max standardization is given by Eq. 1, where X and X_scaled are the original and normalized values, respectively, and min(X) and max(X) are the minimum and maximum values of the data.
$$X_{\text{scaled}} = \frac{X - \min(X)}{\max(X) - \min(X)} \tag{1}$$
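The min-max scaling described here, X_scaled = (X − min(X)) / (max(X) − min(X)), can be sketched in a few lines. The function name `min_max_scale` and the handling of constant columns (mapped to 0) are assumptions made for this example.

```python
def min_max_scale(column):
    """Scale a list of numbers to the range [0, 1] with min-max scaling."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant column carries no information, so map it to 0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Hypothetical feature column, for illustration only.
scaled = min_max_scale([2.0, 4.0, 6.0, 10.0])
print(scaled)  # [0.0, 0.25, 0.5, 1.0]
```

In practice the minimum and maximum would typically be computed on the training portion only and then reused for the test portion, so that no information leaks from the test data.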
Developing a CNN from the ground up has advantages and disadvantages, but it requires a very large dataset. Using models pre-trained on huge datasets, such as ResNet, VGG-16, and VGG-19, is a promising alternative. Transfer learning allows existing models to be reused for a new objective. A common method of transfer learning is to use a pre-trained model to extract features from new images, although many other techniques are available. Because the dataset is small, the pre-trained model is used as the feature extractor, which means that only the fully connected classifier is trained.
In this work, the enhanced VGG-19 model is formulated by combining VGG-19 with the CNN mechanism. This combined model offers an excellent accuracy rate when processing large datasets like ImageNet. The ImageNet dataset comprises 1.2 million general object images from 1,000 distinct object categories, which are used to train the parameters of the enhanced VGG-19 model. VGG-19 has 19 trainable layers (convolutional and fully connected), complemented by max-pooling and dropout layers.
The proposed framework combines a pre-trained convolutional base with a tailored classification component that includes a densely connected classifier and a dropout layer for regularization. The updated version of the VGG-19 model is illustrated in Fig. 2. At each position, the convolutional layers apply a convolution to the image (feature map) and pass the output to the next layer. The convolutional layer’s filters are generally 3 × 3 trainable feature extractors.

Fig. 2. Proposed system architecture.
Each convolutional layer stack has a max-pooling operation and a rectified linear unit (ReLU) activation function. ReLU, defined as f(x) = max(0, x) where x represents the neuron’s input, is the most often used non-linear activation function. Compared to the sigmoid function, it is more computationally efficient, converges better, and avoids the vanishing gradient issue [24].
A down-sampling max-pooling layer is applied after the ReLU activation. The proposed model uses a filter of size 2 × 2 with a stride of the same size. The output is the largest value in each sub-region. The classification stage, which follows a succession of convolutional layers (conv1 to conv5), includes a dropout layer and a densely connected classifier. A dense layer is fully connected: each neuron receives information from every neuron in the preceding layer.
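The ReLU activation followed by 2 × 2 max pooling with a stride of 2 can be illustrated on a small feature map. This is a minimal pure-Python sketch; the function names and the sample values are hypothetical, and an even height and width are assumed.

```python
def relu(x):
    """Rectified linear unit: max(0, x)."""
    return max(0, x)


def max_pool_2x2(feature_map):
    """2 x 2 max pooling with stride 2 on a 2-D list (even height and width assumed)."""
    pooled = []
    for r in range(0, len(feature_map) - 1, 2):
        row = []
        for c in range(0, len(feature_map[0]) - 1, 2):
            # Keep the largest value of each non-overlapping 2 x 2 sub-region.
            row.append(max(
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            ))
        pooled.append(row)
    return pooled


# Hypothetical 4 x 4 feature map, for illustration only.
feature_map = [
    [1, -3, 2, 0],
    [4, 6, -1, 1],
    [0, -2, 9, 5],
    [-1, 1, 3, -7],
]
activated = [[relu(v) for v in row] for row in feature_map]
pooled = max_pool_2x2(activated)
print(pooled)  # [[6, 2], [1, 9]]
```

Pooling halves each spatial dimension, so the 4 × 4 map becomes 2 × 2 while the strongest response in each sub-region is preserved.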
A densely connected layer learns from all of the information in the preceding layer, while a convolutional layer relies on local features within a small receptive field. An activation function must be specified for the densely connected layer. In the dropout layer, a random set of activations is set to zero, which forces the network to learn redundant representations. During the training phase, neurons are randomly dropped (with dropout rate p) to prevent overfitting.
A convolution layer extracts the visual information, and a Fully Connected Layer (FCL) determines the class of the input image. These properties make the CNN the most widely used deep learning model for image recognition. The convolution layer retrieves the image’s distinctive features while maintaining the input/output and spatial information. The size of the feature data is reduced by adding a pooling layer after the convolution layer. The output size of a convolution is given by the following equation:
$$O = \frac{L - K + 2P}{S} + 1$$
Here, O is the output length and L is the length of the input image. K is the kernel size, and P is the number of zeros padded at each end of a dimension. The kernel’s stride is represented by S.
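The relation between input length L, kernel size K, padding P, and stride S is, in its standard form, O = (L − K + 2P) / S + 1, where O is the output length. The sketch below evaluates it; the function name `conv_output_length` and the example sizes are assumptions chosen for illustration (224 is the usual VGG input size).

```python
def conv_output_length(L, K, P, S):
    """Output length of a convolution or pooling along one dimension.

    L: input length, K: kernel size, P: zeros padded at each end, S: stride.
    Integer division is used, as is common when the stride does not divide evenly.
    """
    return (L - K + 2 * P) // S + 1


# A 3 x 3 convolution with padding 1 and stride 1 keeps the spatial size.
same_size = conv_output_length(L=224, K=3, P=1, S=1)
# A 2 x 2 pooling window with stride 2 and no padding halves it.
halved = conv_output_length(L=224, K=2, P=0, S=2)
print(same_size, halved)  # 224 112
```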
Using a CNN to train the system on the properties of a single packet as the basis for judging the type of traffic makes classification difficult. The LSTM addresses this by grouping the data of a single connection (from start to termination) and using the attributes of all the packets in that group, as well as their relationships, to judge the nature of the traffic. The LSTM phase comprises the FCL, Softmax, and output layers. Further, LSTM networks are well suited to time-dependent data because of their selective memory and forget gates. The first step in the LSTM layer is to decide which data from the cell state may be discarded by the model. The following equation is used to make this decision [29].
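In the standard LSTM formulation, this decision is made by the forget gate. A sketch of that standard form is given below, written with the sigmoid ϑ, the previous hidden state hv_{i-1}, and the current input w_i; the gate output f_i, the weight matrix W_f, and the bias b_f are names assumed here for illustration, and the exact form used in [29] may differ.

```latex
f_i = \vartheta\left( W_f \, [\, hv_{i-1},\; w_i \,] + b_f \right)
```

Each component of f_i lies between 0 and 1: a value near 0 discards the corresponding entry of the previous cell state, and a value near 1 retains it.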
The proposed model accepts grayscale or RGB images as input after they have been processed by the pre-trained VGG-19 model. Two other parameters of the CNN model can be changed: the number of convolutional layers and the size of the kernel. These parameters are referred to as hyperparameters, and they are used to build 18 different scenarios.
The CNN model has one, two, or three convolutional layers, with the number of kernels increasing by a factor of two as the number of neurons per layer increases. The kernel size is typically set to 3 × 3. The proposed model uses 3 × 3 as the middle value and conducts experiments with sizes of 2 × 2 and 4 × 4 to determine the best size. The kernel builds a feature map by sliding over the image in steps equal to the stride value.
The CNN section examines a pre-processed PCAP file and sends a multi-dimensional packet vector to the LSTM component. The LSTM phase processes a set of multi-dimensional packet vectors to generate a vector reflecting the likelihood that the input belongs to each class. Based on this probability vector, the Softmax stage delivers the final classification result. The proposed framework learns the characteristics of the data via the CNN phase. The data then passes to the LSTM phase, which considers past and future information about the input data. In this way, the time-series information that reaches the proposed Hybrid CNN-LSTM model is properly categorized into the various attacks.
Figure 3 shows the block diagram of the proposed LSTM model. The sigmoid (ϑ) and hyperbolic tangent (tanh) functions are employed as the activation functions inside the unit cell. In this work, the LSTM takes the output of the proposed CNN model as its input at every consecutive time step. The LSTM decides which data is relevant to the input information and maintains the cell state l_i accordingly. At every timestamp i, the LSTM phase takes l_{i-1}, hv_{i-1}, and w_i to carry out its operations. The forget gate identifies which preceding data in l_{i-1} is no longer important, the input gate selects relevant data from the input information w_i, and the output gate produces the hidden state hv_i for time i. The output from one LSTM phase is the input for the next phase. The output from the CNN is propagated forward through time and upward through three levels of LSTMs.
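The interaction of the forget, input, and output gates can be made concrete with a single LSTM step on scalar values. This is a sketch of the standard LSTM cell, not the exact layer used in this work; the function `lstm_step`, the `params` layout, and the all-zero weights in the example are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, used for the three gates."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(w_i, hv_prev, l_prev, params):
    """One scalar LSTM step.

    w_i: current input, hv_prev: previous hidden state, l_prev: previous cell state.
    params maps each gate name to (input weight, hidden weight, bias).
    Returns the new hidden state hv_i and the new cell state l_i.
    """
    def pre_activation(gate):
        a, b, c = params[gate]
        return a * w_i + b * hv_prev + c

    forget = sigmoid(pre_activation("forget"))        # what to drop from l_prev
    inp = sigmoid(pre_activation("input"))            # how much new data to admit
    candidate = math.tanh(pre_activation("candidate"))
    out = sigmoid(pre_activation("output"))           # what to expose as hv_i
    l_i = forget * l_prev + inp * candidate
    hv_i = out * math.tanh(l_i)
    return hv_i, l_i


# Hypothetical all-zero weights: every gate evaluates to 0.5, the candidate to 0.
zero_params = {g: (0.0, 0.0, 0.0) for g in ("forget", "input", "candidate", "output")}
hv_i, l_i = lstm_step(w_i=1.0, hv_prev=0.0, l_prev=2.0, params=zero_params)
```

Calling `lstm_step` repeatedly, feeding each returned pair back in together with the next input, processes a sequence of per-packet feature values one time step at a time.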

Fig. 3. Block diagram of the LSTM model.
Learning consists of iteratively tuning the network toward a target objective function. In this work, the model is trained on the dataset using this learning mechanism. The model is trained using 75% of the data, and the remaining 25% is used for testing. The goal of learning on the dataset is to extract enough information to aid in the learning of the target dataset. The model is finalized once training has converged with consistent detection accuracy. When training a ConvNet, it is important to remember that an output layer (FCL) should be used to generate the classification results.
Results and discussions
The efficacy of the proposed framework is evaluated through simulation in Python. Table 1 lists the training parameters of the simulation environment. The proposed framework is evaluated for both binary and multi-class classification through the metrics enumerated in Section 4.1.
Table 1. Training parameters
The CIC-IDS-2017 [28] dataset is employed to examine the performance of the proposed method. The dataset is huge, and there is no official separation between training and testing samples. Therefore, we randomly select 20,540 records for analyzing the proposed method. The model is trained using 75% of the data, and the remaining 25% is used for testing. Among the 20,540 records, 15,405 are used for training the model and 5,135 are used for testing the system. In the testing data, 3,250 records are normal whereas 1,885 are attacked.
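The 75%/25% split can be reproduced with a short sketch. The function name `train_test_split`, the fixed seed, and the use of integer record identifiers in place of real records are assumptions made for illustration.

```python
import random


def train_test_split(records, train_fraction=0.75, seed=0):
    """Shuffle the records and split them into training and testing parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed for a repeatable split
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Integer identifiers stand in for the 20,540 selected records.
train, test = train_test_split(range(20540))
print(len(train), len(test))  # 15405 5135
```

The resulting sizes match the 15,405 training and 5,135 testing records reported for the experiment.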
A 5-fold Cross-Validation (5CV) strategy is used to substantiate the efficiency of the proposed VGG19 + Hybrid DNN. It separates the entire training set into five distinct subsets (folds). In each round, the VGG19 + Hybrid DNN framework is trained on four (5 − 1) of the folds and validated on the remaining one. This validation process is repeated 5 times so that every fold is used once for validation, which ensures that the whole dataset is evenly evaluated. Figure 4 shows the structure of the 5CV system.
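The fold construction behind 5-fold cross-validation can be sketched as follows, assuming the standard scheme in which each fold serves once as the validation set while the other four are used for training. The generator name `k_fold_indices` is hypothetical, and shuffling is omitted for brevity.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    # Distribute any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, validation
        start += size


# Folds over the 15,405 training records.
folds = list(k_fold_indices(15405, k=5))
print([len(validation) for _, validation in folds])  # [3081, 3081, 3081, 3081, 3081]
```

Averaging the validation score over the five rounds gives an estimate that does not depend on a single choice of validation subset.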

Fig. 4. Construction of the 5CV system.
The effectiveness of the proposed models has been assessed via several metrics, including accuracy, precision, recall, and F1-score. The mathematical formulation of these metrics is represented in the following equations. These metrics are computed from the True Positives (T_P), True Negatives (T_N), False Positives (F_P), and False Negatives (F_N) [30–34]. T_P is the number of attacks correctly recognized as attacks, whereas F_P is the number of normal records mistakenly classified as attacks. F_N is the number of attack records wrongly categorized as normal, while T_N is the number of normal records correctly classified.
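In their standard form, the four metrics are defined from the confusion-matrix counts as shown below. These are the usual textbook definitions, given here as a reference sketch; the notation T_P, T_N, F_P, and F_N follows the text.

```latex
\begin{aligned}
\text{Accuracy}  &= \frac{T_P + T_N}{T_P + T_N + F_P + F_N} \\
\text{Precision} &= \frac{T_P}{T_P + F_P} \\
\text{Recall}    &= \frac{T_P}{T_P + F_N} \\
\text{F1-score}  &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
\end{aligned}
```

Precision penalizes false alarms on normal traffic, recall penalizes missed attacks, and the F1-score balances the two.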
In this paper, a confusion matrix is used to analyze the testing records. In practice, it is applied to assess the efficacy of classifiers. It tabulates the predicted values against the actual values of the considered samples, as shown in Fig. 5. Most normal and attacked records are recognized correctly, but thirteen of the 1,885 attacked records are incorrectly identified as normal. Likewise, eight of the 3,250 normal records are incorrectly identified as attacked. The detection accuracy obtained from the confusion matrix is 99.46%. The precision, recall, and F1-score are likewise evaluated from the confusion matrix.

Fig. 5. Confusion matrix of the proposed classifier.
The performance of the proposed VGG19 + Hybrid DNN classifier has been computed and compared with existing state-of-the-art methods, namely NB [26], DT [17], KNN [27], SVM [23], RF [19], LR [25], ResNet50 + DNN [18], and VGG16 + DNN [21]. In this work, the CIC-IDS-2017 dataset is used to analyse these methods.
The binary classification accuracy obtained using the different existing learning models is illustrated in Fig. 6. From Fig. 6(a–e), the training accuracy outcome of the DT, KNN, LR, NB, and VGG 16 + DNN is 83.65%, 84.66%, 84.91%, 83.36%, and 88.30% respectively. These results validate that the conventional models require a higher number of parameters to train the classifier model.

Binary classification accuracy obtained using different existing learning models.
Similarly, Fig. 7 depicts the binary classification accuracy achieved by the proposed VGG19 + Hybrid DNN learning approach. Based on the simulated results in Figs. 6 and 7, the proposed learning approach outperforms the existing learning models with the highest accuracy of 94.76%. This improvement is attributed to the stronger learning model that the VGG19 architecture provides in the proposed framework. A minimal number of parameters is used to train the proposed framework, and this lower parameter count contributes to better accuracy than the existing learning models.

Binary classification accuracy of the suggested approach with 94.76%.
The accuracy achieved by the proposed framework is listed in Table 2, and the performance comparison of binary classification for various classifiers is given in Table 3. Table 3 shows that the proposed VGG19 + Hybrid DNN achieves a superior accuracy of 99.46% and a precision of 99.61% compared with the conventional classifiers.
Computation of recognition rate, error rate, and execution time for 10 epochs
Performance comparison of binary classification for various classifiers
The main reason for this result is that the proposed model employs appropriate learning and a combined classifier. Additionally, this combined model offers an excellent accuracy rate when processing large datasets such as ImageNet. The ImageNet dataset comprises 1.2 million general object images from 1,000 distinct object categories, which are used to train the parameters of the enhanced VGG-19 model. The proposed VGG-19 has 19 trainable layers and is built from convolutional, fully connected, dropout, and max-pooling layers. Based on these layers, the proposed framework achieves accurate classification of various attacks in the WSN environment.
In contrast, the existing classifiers are not designed to achieve high accuracy in the NIDS. They have difficulty training the system because they rely on the properties of a single packet as the basis for judging the type of traffic. In particular, the parameters of the existing SVM, NB, and DT models are not suitably chosen for attack classification, and this inadequate selection can result in lower accuracy than the proposed classifier. The VGG-16 + DNN and ResNet50 + DNN models use adequate parameters for the classification process, which leads to better accuracy than the SVM, NB, and DT models. However, the VGG-16 + DNN and ResNet50 + DNN models fail to categorize the modern types of attacks and malicious nodes in WSN. These issues are overcome in the proposed framework, which effectively classifies the various modern attacks in the network.
Based on the Table 3 results, the proposed framework yields a higher recognition rate and a lower execution time in the NIDS of the WSN than the existing classifiers. The key reason is that the proposed model uses min-max normalization to scale the raw data to a predefined range of 0 to 1. This scaling keeps the data distribution consistent and mitigates the problem of exploding gradients during the training stage, which makes it possible to learn more useful features from the larger input dataset. Furthermore, the proposed framework combines a learned convolutional base with a tailored classification component that includes a densely connected classifier and a dropout layer for regularization. This layer selection reduces the misclassification errors and the execution time during the training stage. The existing classifiers, by contrast, typically suffer from high computational time because of an inappropriate selection of processing layers.
The binary classification performance of the existing and proposed learning models is also compared on the other metrics, F1-score and recall, as illustrated in Fig. 8. Fig. 8 shows that the proposed framework achieves a higher recall (99.32%) and F1-score (98.4%) than the existing classifiers. This is attributed to the use of suitable filters in the pre-processing stages of the proposed framework.
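The min-max scaling step can be sketched as follows. The function name `min_max_scale`, the feature values, and the handling of constant columns are illustrative assumptions; the paper only states that features are scaled to the range 0 to 1.

```python
def min_max_scale(columns):
    """Scale each feature column to [0, 1] with x' = (x - min) / (max - min)."""
    scaled = []
    for column in columns:
        lo, hi = min(column), max(column)
        span = hi - lo
        # A constant column has no spread; map it to 0.0 to avoid dividing by zero.
        scaled.append([(x - lo) / span if span else 0.0 for x in column])
    return scaled

# Hypothetical flow features: duration in ms and total bytes, one list per feature.
features = [[12.0, 480.0, 96.0], [1500.0, 64.0, 782.0]]
print(min_max_scale(features))
```

In practice the minimum and maximum would be taken from the training split only and reused for the validation and test data, so that no information leaks from held-out records.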

Evaluation of Recall and F1-score for different classifiers.
The proposed convolutional layer filters are trainable feature extractors, generally of size three. Each stack of convolutional layers includes a ReLU activation function and a max-pooling operation. These operations help the proposed framework attain superior recall values in the classification process. The existing classifiers, on the other hand, fail to formulate optimal functions, which causes larger processing errors. They employ traditional filters in the pre-processing stages, which let redundant information from the input dataset pass through. This redundant information produces a lower recall value than that of the proposed framework.
The results of the multi-class classification of network intrusions are shown in Fig. 9. The findings show that the proposed framework (VGG-19 + Hybrid DNN) has the highest accuracy (98.86%) and precision (98.24%) for network intrusion detection on the CIC-IDS-2017 dataset, whereas VGG-16 + DNN has the second-highest accuracy (95%).
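A single convolution, ReLU, and max-pooling stage can be illustrated with plain Python lists. The 3 × 3 kernel size, the 2 × 2 pooling window, the fixed kernel weights, and the input values are all assumptions made for the sketch; in the actual model the kernel weights are learned during training.

```python
def conv2d_valid(image, kernel):
    """Slide a square kernel over a 2-D input (no padding, stride 1).

    As in most deep learning libraries, this is a cross-correlation:
    the kernel is not flipped.
    """
    k = len(kernel)
    rows, cols = len(image) - k + 1, len(image[0]) - k + 1
    return [[sum(image[r + i][c + j] * kernel[i][j] for i in range(k) for j in range(k))
             for c in range(cols)] for r in range(rows)]

def relu(feature_map):
    """Replace negative activations with zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]

def max_pool(feature_map, size=2):
    """Non-overlapping max-pooling; rows/columns that do not fill a window are dropped."""
    rows, cols = len(feature_map) // size, len(feature_map[0]) // size
    return [[max(feature_map[r * size + i][c * size + j]
                 for i in range(size) for j in range(size))
             for c in range(cols)] for r in range(rows)]

# Hypothetical 6 x 6 input and a fixed 3 x 3 vertical-edge kernel.
image = [[1, 0, 2, 1, 0, 1],
         [0, 1, 3, 0, 1, 2],
         [2, 1, 0, 1, 2, 0],
         [1, 3, 1, 0, 0, 1],
         [0, 2, 2, 1, 3, 0],
         [1, 0, 1, 2, 1, 1]]
kernel = [[1, 0, -1],
          [1, 0, -1],
          [1, 0, -1]]

pooled = max_pool(relu(conv2d_valid(image, kernel)))
print(pooled)  # 6x6 input -> 4x4 feature map -> 2x2 pooled map
```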

Comparison of multi-class classification results over various classifiers.
The main reason for the improved accuracy of the proposed framework is the combination of the 3rd, 4th, and 5th blocks of VGG-16 with an extra convolutional layer in the enhanced VGG-19 architecture. In addition, the output from the CNN is propagated forward through time and upward through three levels of LSTMs. The CNN-LSTM model groups the data of a single connection (from start to termination) and uses the attributes of all the packets in that group, as well as their relationships, to predict the nature of the traffic. LSTM networks are well suited to modelling such time-ordered data because of their selective memory and forget gates.
In contrast, transfer learning approaches such as ResNet50 + DNN and VGG-16 + DNN show lower accuracy than the proposed framework. This is owing to the fact that the existing transfer learning approaches fail to learn the dense features in intrusion detection. The accuracy of the SVM is close to that of other methods such as the RF and LR classifiers. These models face class imbalance constraints, which are more pronounced in multiclass datasets.
This kind of class imbalance yields higher misclassification errors in the NIDS. Even after feature extraction, the PCAP file remains very large, which makes it difficult for the existing models to capture the data. Inadequate identification can result in the loss of vital information and disturb the classification process, leading to lower accuracy than the proposed framework.
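The grouping of packets into per-connection sequences, which the LSTM then consumes in time order, might look like the sketch below. The packet dictionary layout, the direction-independent endpoint key, and the feature values are hypothetical; the paper does not specify how a connection is keyed.

```python
from collections import defaultdict

def group_packets_by_connection(packets):
    """Group packet feature vectors into per-connection sequences, ordered by timestamp."""
    flows = defaultdict(list)
    for pkt in sorted(packets, key=lambda p: p["ts"]):
        # Assumption: a connection is identified by its two endpoints and protocol,
        # regardless of direction, so both halves of a conversation share one key.
        endpoints = tuple(sorted([(pkt["src"], pkt["sport"]), (pkt["dst"], pkt["dport"])]))
        flows[(endpoints, pkt["proto"])].append(pkt["features"])
    return dict(flows)

# Hypothetical packets: features are [length in bytes, SYN flag].
packets = [
    {"ts": 2.0, "src": "10.0.0.2", "sport": 80, "dst": "10.0.0.1", "dport": 5000,
     "proto": "tcp", "features": [60, 1]},
    {"ts": 1.0, "src": "10.0.0.1", "sport": 5000, "dst": "10.0.0.2", "dport": 80,
     "proto": "tcp", "features": [40, 0]},
    {"ts": 1.5, "src": "10.0.0.3", "sport": 53, "dst": "10.0.0.1", "dport": 4000,
     "proto": "udp", "features": [90, 0]},
]

flows = group_packets_by_connection(packets)
for key, sequence in flows.items():
    print(key, sequence)
```

Each resulting sequence is one sample for the recurrent part of the model, with one time step per packet.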
The accuracy for the various traffic classes detected by the proposed model is shown in Fig. 10. The proposed framework identifies the classes ‘Benign’, ‘PortScan’, ‘Bot’, ‘DoS slowloris’, ‘DoS Slowhttptest’, ‘DoS Hulk’, ‘DoS GoldenEye’, ‘Heartbleed’, ‘Web attack’, ‘Infiltration’, ‘FTP-Patator’, and ‘SSH-Patator’. Among these, the Heartbleed and Infiltration attacks are detected with high accuracy. This is because the optimal use of blocks in the proposed enhanced VGG-19 model facilitates accurate classification of various attacks.

Comparison of the various attack classification.
In the proposed framework, the CNN phase examines a pre-processed PCAP file and sends a multi-dimensional packet vector to the LSTM component. The LSTM phase processes a set of these multi-dimensional packet vectors to generate a vector of the probabilities that the input belongs to each class. Based on this probability vector, the Softmax stage delivers the final classification outcome. Moreover, the optimization strategy increases the resilience of the proposed framework by reducing the number of imbalanced instances of attack types in the training set. As a result, the proposed framework achieves a higher accuracy of 99.46% on the CIC-IDS-2017 dataset, which is well above that of the other classification models. These results indicate that the proposed model improves intrusion detection and can be used in various real-time applications such as smart cities, smart grids, and smart healthcare.
The WSN is increasingly vulnerable to various network attacks, which poses a severe security risk. To mitigate this, a novel enhanced VGG-19 + Hybrid DNN (CNN+LSTM) model has been introduced to classify modern network attacks in the WSN environment. Initially, the proposed framework learns the characteristics of the data via the CNN phase. The output then passes to the LSTM phase, which considers past and future information about the input data. The proposed framework is evaluated in Python on the CIC-IDS-2017 dataset. The performance results reveal that the proposed VGG-19 + Hybrid DNN framework achieves a maximum accuracy of 99.46% and 98.86% for binary and multi-class classification, respectively. These accuracy results indicate that the proposed framework effectively classifies the various modern network attacks. Furthermore, the proposed framework achieves a higher recall (99.32%) and F1-score (98.4%) than the existing state-of-the-art classifiers.
In the future, it is planned to experiment with other deep learning models for learning spatial and temporal characteristics. This learning strategy is expected to further enhance the classification accuracy in intrusion detection.
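The final Softmax stage can be illustrated with a short sketch that turns raw class scores into a probability vector and picks the most likely class. The scores and the three-class subset are hypothetical values chosen for the example, not outputs of the proposed model.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three of the traffic classes named in the text.
classes = ["Benign", "PortScan", "DoS Hulk"]
scores = [0.3, 2.1, -1.0]

probs = softmax(scores)
predicted = classes[probs.index(max(probs))]  # class with the highest probability
print(predicted, [round(p, 3) for p in probs])
```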
