Abstract
The Internet of Things (IoT) is developing so quickly that cloud-centric computing struggles to keep up with its demands for usability and low latency. Edge computing unifies computing, networking, storage and applications into a distributed open system and provides intelligent services at the edge of the IoT. The edge network is made up of several wired and wireless networks, and edge nodes have constrained memory and processing power; these factors leave the edge network vulnerable to several types of cyberattacks. Large-scale network data gathering and detection for IoT security is also difficult for an IoT edge node to provide. Data analytics can give intrusion detection systems (IDSs) high accuracy, but implementing such algorithms on IoT devices is challenging because of the limited resources of edge nodes. Motivated in part by these challenges, we propose an IDS based on a generative adversarial network (GAN). This article proposes a novel method for detecting intrusions in IoT networks that uses the Colony Predator Algorithm (CPA) to optimize the detection process. By using GANs to create realistic data samples, the proposed technique supports reliable anomaly and intrusion detection. The integration makes the training process more efficient and enables the GAN-CPA system to better differentiate between benign and malicious activities, which raises the system's overall efficiency.
Keywords
Introduction
In the current environment, the number of Internet of Things (IoT) devices is growing with no sign of slowing, which creates a pressing demand for robust defence mechanisms that can protect vital information and infrastructure from the cyber threats that emerge every day. 1 This demand has been accompanied by a rise in research aimed at creating powerful intrusion detection systems (IDSs) that can operate in the complicated, dynamic environments of the IoT. Numerous studies show that security specialists are using technologies such as generative adversarial networks (GANs) 2 and optimization algorithms to reinforce their security measures and increase resilience, especially at the edge of IoT networks, where they are most susceptible to attacks. 3
The development of complex IDSs has advanced significantly in recent years, 4 but straightforward, scalable solutions that adequately address the unique characteristics of IoT edge devices remain scarce. Integrating IoT devices presents several hurdles, including resource limitations, diverse network topologies and the dynamics of IoT rollouts. 5 Intrusion detection therefore calls for next-generation techniques that achieve both high accuracy and high efficiency in detecting and defending against threats within the complex, resource-constrained perimeters of IoT edge environments. 6
The IoT ecosystem is evolving rapidly, driven by the trend of interconnecting devices and the adoption of edge computing paradigms. 7 This has introduced new security challenges in the IoT environment, such as protecting the data transferred via IoT networks. Current cloud computing architectures cannot meet the usability and low-latency demands imposed by the IoT expansion. 8 These constraints are even more challenging at the edge, where resources are limited. Edge nodes are thus at high risk from a spectrum of cyber threats, including attacks that target their minimal hardware and software resources. 9
A GAN is a machine learning model that consists of two neural networks competing with each other throughout the training process: a generator and a discriminator. While the discriminator seeks to discern between samples of genuine data and those produced by the generator, the GAN strives to teach the generator how to make data samples that closely mimic real data. Figure 1 shows the GAN's fundamental design.

Figure 1. Overview of GAN.
The main challenge in assuring the security of the IoT is creating effective IDSs that can precisely identify unusual and potentially harmful activities in the massive volumes of data generated by the many IoT devices. 10 Despite the high accuracy of big-data-driven intrusion detection methods, implementing them on resource-limited IoT edge nodes is computationally demanding and may not be feasible in practice. Moreover, the heterogeneous nature of edge networks, which combine wired and wireless links, makes large-scale network data collection and compromise detection even more difficult. 11
This study is motivated by the need to address the security concerns facing the rapidly expanding IoT ecosystem, particularly in edge computing settings. 12 As the number of IoT devices grows, the cloud computing paradigm tries to absorb this rise, but it falls short on usability and low latency, especially at the edge, where computational resources are constrained. This gap between the evolving digital infrastructure and the resources available to handle the challenges of the IoT environment creates weaknesses that diverse cyber threats can exploit. 13
The study outlines a new approach to network security for IoT systems that uses GANs optimized by the Colony Predator Algorithm (CPA) 14 for intrusion detection. The main motivation for this approach is the opportunity to address shortcomings of existing intrusion detection techniques by combining advanced machine learning with the ability to create realistic data samples, thereby enhancing anomaly and intrusion detection. By streamlining the training process and making the GAN more effective at discriminating between benign and malicious activity, the proposed model aims to significantly improve the security and sustainability of IoT networks and so protect sensitive data and infrastructure from cyberattacks. 15
The goal of this research is to use GANs to build a framework for secure and efficient data transmission across the IoT. The main purpose is to address the most critical security issues specific to IoT environments, especially at the edge, where resources are limited. By proposing a GAN-based IDS, the research aims to reduce the openings for cyberattacks in edge networks. The framework does this by using GANs to produce realistic data samples, which enables dependable anomaly and intrusion detection and thereby improves the security and resilience of IoT networks. In addition, the CPA is integrated to streamline the training process and enhance the system's ability to distinguish benign from malicious activity, which increases the system's efficiency. The research includes an analysis and comparison of the proposed framework with existing approaches to demonstrate its higher accuracy, with the aim of improving IoT security and data integrity. The major contributions of this study are as follows:
This paper describes an IDS that uses a GAN in conjunction with the CPA. The approach is novel in using GANs to create synthetic data samples that can be dependably used to support anomaly and intrusion detection in the IoT.
A method is proposed in which the CPA is coupled with the GAN to increase the GAN's ability to differentiate between harmless and dangerous activities. The CPA streamlines training and boosts the efficiency of the GAN, making it suitable for intrusion detection.
The proposed GAN-CPA IDS is designed to be accurate, fast and efficient enough to serve as a security mechanism for IoT networks. It addresses the limited capabilities of edge devices by offering a smart and secure way to transmit data in IoT networks.
The GAN has two main components, the generator and the discriminator, which perform different tasks. The generator produces data samples that appear authentic, and the discriminator then decides whether they are real or fake.
The GAN-CPA model is trained end to end, with CPA optimization used alongside backpropagation to fine-tune the weights of the generator and discriminator. This loop allows the model to improve progressively.
The model's performance is evaluated in experiments on five widely used datasets: KDDCUP99, NSL-KDD, CICIDS 2017, WSN-DS and UNSW-NB15.
The remainder of the document is structured as follows. The background and literature review section provides context for the research with an overview of the major frameworks and a brief synopsis of previous work. The methodology section sets out the comparative analysis strategy, including the criteria and tools employed. The comparative analysis examines what does and does not work in well-known frameworks. The conclusion section sums up the key concepts and highlights the importance of such frameworks in this field.
Several research papers have been dedicated to new assurance methods for IoT systems. Bagaa and colleagues 16 proposed a machine learning security framework for IoT systems. The suggested architecture employed correlation techniques to acquire sensor data and was found to be highly effective in detecting attacks. Its limitations include the possible need for a standardized interface among the framework modules and the trade-off between security requirements and QoS.
Ferrag and others 17 present a GAN-based cyber-threat intelligence detection framework, which was found to be effective for intrusion detection, although its limitations are unclear.
Tabassum and Lebda 18 highlighted different security mechanisms for implementing strong IoT security. The proposed framework was considered significant, although it had limitations such as the need for improved performance against real-time cyber-attacks and scalability concerns. Likewise, Charaan et al. 19 presented an improved IoT security framework, but because it was tested only on a particular setup, it may not generalize to other testbeds. Danda and Hota 20 also address the shortage of datasets for IoT IDSs, which can be mitigated by a customized data collection framework, although its generalizability is very low.
In contrast, Dhifallah et al. 21 discussed the HybridChainIDS framework for bi-level intrusion detection, which was assessed using simulation, without mentioning any limitations. Sharadqh and colleagues 22 presented an intrusion detection proposal focused on edge computing that achieves an accuracy of 99.5% but struggles to detect encrypted traffic. Saranya and Ramachandran 23 suggested an intelligent IoT attack detection framework, for which no limitations were specified.
Olakanmi and Odeyemi 24 presented a new security and aggregation scheme for IoT devices that aims to enhance security and efficiency, aspects that have yet to be addressed in detail. Sugitha et al. 25 introduced a blockchain-facilitated framework for intrusion detection that improves precision, but they did not indicate the weaknesses of the approach. Table 1 gives an overview of the literature review.
Table 1. Summary of literature review.
The literature currently available on assurance techniques in IoT systems still has several research gaps. First, despite rapid developments in machine learning and deep learning, frameworks still face challenges of scale and generalization. Because most existing methods have been tried in only a few contexts or use cases, it is unclear how well they work in real deployments and whether they scale to different IoT scenarios. Second, research on the performance and scalability of real-time security mechanisms is lacking. Some frameworks have demonstrated high accuracy in detecting attacks, but they cannot always handle encrypted traffic or previously unseen attacks. Moreover, QoS and security requirements can conflict, which creates a trade-off between them. Finally, standardized interfaces and cooperation between the different framework modules should be developed to enhance application compatibility and interoperability.
The proposed method, which uses a GAN enhanced by the CPA for intrusion detection in IoT networks, extends the existing literature. It improves efficiency and the GAN's classification ability by streamlining the training process and increasing its capability to differentiate between normal and abnormal activity, which is especially important for resource-constrained edge nodes. It also gives the technique access to realistic data samples, which improve the efficacy and accuracy of anomaly and intrusion detection.
Materials and methods
The proposed system is an IDS for IoT networks in which GAN technology is enhanced by the CPA. The approach aims to provide a security mechanism that addresses the resource constraints of edge devices in IoT networks while remaining precise, quick and effective. The following sections detail the components of the proposed framework:
Overview of the proposed system
The proposed methodology takes a comprehensive approach to creating and refining a GAN model that combines a discriminator and a generator. The generator converts noise sampled from a standard normal distribution into data samples that closely resemble real data in order to fool the discriminator. The generator is made up of layers such as dense, convolutional or transposed convolutional layers, with non-linear activation functions to aid learning. The discriminator, in contrast, distinguishes real from fake samples and supplies probability scores that guide the generator's improvement. GAN training is based on a min-max objective function that optimizes the generator and the discriminator at the same time. To stabilize training and improve performance, the methodology incorporates the CPA, which is inspired by the cooperative predation behaviour of social animals, so that the model can be optimized efficiently. Furthermore, a deep network for intrusion detection is designed that applies convolutional and LSTM layers to long input sequences and combines short-term and long-term dependencies to improve accuracy. The approach tackles two primary issues with GANs, mode collapse and vanishing gradients, by using suitable loss functions and optimization techniques. Figure 2 gives a visual overview of the procedure.

Figure 2. Overview of the proposed approach.
The generator and the discriminator are the two primary parts of the GAN architecture.
Generator
The generator is the component of the GAN that produces realistic-looking data samples from a noise input. It starts with a latent (noise) vector drawn from a standard normal distribution, which serves as the starting point of the generator's transformation; a fresh noise vector is drawn for each generated sample. The generator's output is an array of data that closely resembles the real data. Its primary goal is to create samples that the discriminator cannot distinguish from the actual data. The generator is built from several layers, which may be dense (fully connected), convolutional or transposed convolutional (also known as deconvolution) layers. The number of layers depends on the type of data being generated and the required level of complexity. Non-linear activation functions such as ReLU or Leaky ReLU introduce non-linearity and thereby aid learning; they allow the generator to capture the complexity of the data. A loss function measures how closely the generated samples resemble the actual data. This loss is generally computed from the output of the discriminator, which pushes the generator to produce samples that the discriminator cannot tell apart from real ones. The generator uses this feedback to keep refining its output until the generated samples are hard to differentiate from genuine data.
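The forward pass described above can be illustrated with a minimal sketch in plain Python. It is not the architecture used in this study: the layer sizes, the weight initialisation, the `tanh` output (which assumes features scaled to the range -1 to 1) and all function names are hypothetical choices made only to show how a noise vector is mapped to a synthetic feature vector.

```python
import math
import random

def leaky_relu(x, slope=0.2):
    """Leaky ReLU: passes positives through, scales negatives by `slope`."""
    return x if x > 0 else slope * x

def dense(inputs, weights, biases, activation):
    """Fully connected layer: one output per row of `weights`."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def init_layer(rng, n_in, n_out):
    """Small random weights and zero biases (illustrative initialisation)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def generate(rng, layers, latent_dim):
    """Map a standard-normal noise vector to one synthetic feature vector."""
    z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
    hidden = dense(z, *layers[0], activation=leaky_relu)
    # tanh keeps outputs in (-1, 1), assuming features were scaled to that range.
    return dense(hidden, *layers[1], activation=math.tanh)

rng = random.Random(0)
LATENT_DIM, HIDDEN_DIM, FEATURE_DIM = 8, 16, 5  # hypothetical sizes
layers = [init_layer(rng, LATENT_DIM, HIDDEN_DIM),
          init_layer(rng, HIDDEN_DIM, FEATURE_DIM)]
sample = generate(rng, layers, LATENT_DIM)
```

Each call to `generate` draws a new noise vector, so repeated calls yield different synthetic samples from the same weights.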
Discriminator
The discriminator, an integral part of the GAN, assesses the input data and classifies it as real or fake. Data is fed into this neural network either from the generator (fake data) or from the training set (real data). It outputs a probability score that indicates how likely the sample is to be real: the higher the score, the more likely the sample is genuine, and the lower the score, the more likely it is fake. The discriminator generally has several layers, which may include dense (fully connected) or convolutional layers. It is built to learn the patterns and features of the data, which is what allows it to separate real from fake samples. The discriminator uses activation functions such as ReLU, Leaky ReLU or sigmoid to add non-linearity and aid learning; the choice of activation function affects the model's ability to differentiate real data from fake data. A loss function assesses how accurately data samples are classified as authentic or fake. The loss is the discrepancy between the true labels (real or fake) and the model's predictions, and it guides the discriminator in tuning its parameters to improve classification performance.
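As a rough illustration of how a probability score arises, the sketch below reduces the discriminator to a single logistic unit. A real discriminator would stack several layers before the sigmoid; the weights, the example feature vectors and the 0.5 decision threshold are assumptions made for the example, not values from this study.

```python
import math

def sigmoid(x):
    """Numerically stable logistic function, output in (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def discriminator_score(features, weights, bias):
    """Probability that `features` is a real sample (single logistic unit)."""
    logit = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(logit)

def classify(score, threshold=0.5):
    """Label a sample from its score; the 0.5 threshold is an assumption."""
    return "real" if score >= threshold else "fake"

weights, bias = [1.5, -2.0, 0.5], 0.0  # hypothetical learned parameters
p = discriminator_score([1.0, 0.0, 1.0], weights, bias)  # logit = 2.0
q = discriminator_score([0.0, 1.0, 0.0], weights, bias)  # logit = -2.0
```

Here `p` is above 0.5 and `q` is below it, so `classify` labels the first sample real and the second fake.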
The goal of a GAN is to learn the distribution of the training set, that is, its density function. The generator and discriminator networks that make up a GAN are trained simultaneously using a min-max objective function. While the generator tries to trick the discriminator by producing realistic instances, the discriminator aims to accurately distinguish between real and fake data. At each iteration the generator takes a random noise vector as input and maps it through a differentiable function to approximate the real samples. The discriminator is typically a neural network that takes both generated and real samples as input and outputs a probability that is close to 0 for fake data and close to 1 for real data. The weights of the generator and discriminator networks are updated based on the discriminator's error, which is measured with the cross-entropy function. 39
Equation (1) gives the objective function of a GAN, where G and D denote the generator and the discriminator, respectively. The probability distribution of the generated data is denoted by $P_g$, and the probability distribution of the original data by $P_r$. The discriminator seeks to maximize the objective function, whereas the generator seeks to minimize it:
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim P_r}[\log D(x)] + \mathbb{E}_{\tilde{x} \sim P_g}[\log(1 - D(\tilde{x}))] \qquad (1)$$
It has been shown that the GAN objective is concave-concave rather than convex-concave, which makes the optimization process unstable even when the generator and discriminator are linear. In theory, the discriminator should be fully optimized before each generator update. In practice a fully optimized discriminator is difficult to achieve, and the stronger the discriminator becomes, the worse the generator update gets. Other issues with GANs include mode collapse and vanishing gradients. In mode collapse, the generator learns only a small portion of the training distribution that is enough to trick the discriminator, and so produces samples that lack variety. Several approaches that modify the objective function or the structure of the discriminator and generator have been proposed to overcome these problems.
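To make the idea of a fully optimized discriminator concrete, the following sketch evaluates the standard GAN value function on a small finite domain, where expectations become weighted sums. For a fixed generator, the value is maximized pointwise by D*(x) = p_real(x) / (p_real(x) + p_gen(x)), a well-known property of the original GAN formulation. The three-point distributions and the function names are made up for illustration.

```python
import math

def value(d, p_real, p_gen):
    """V(D, G) = E_real[log D(x)] + E_gen[log(1 - D(x))] on a finite domain."""
    return sum(pr * math.log(dx) + pg * math.log(1.0 - dx)
               for dx, pr, pg in zip(d, p_real, p_gen))

def optimal_discriminator(p_real, p_gen):
    """Pointwise maximiser of V for a fixed generator: p_real / (p_real + p_gen)."""
    return [pr / (pr + pg) for pr, pg in zip(p_real, p_gen)]

p_real = [0.5, 0.3, 0.2]
p_gen = [0.2, 0.3, 0.5]          # a generator that has not matched the data yet
d_star = optimal_discriminator(p_real, p_gen)
d_worse = [min(0.99, dx + 0.1) for dx in d_star]   # any other discriminator

# When the generator matches the data, D* is 0.5 everywhere and V = -log 4.
d_eq = optimal_discriminator(p_real, p_real)
v_eq = value(d_eq, p_real, p_real)
```

The equilibrium case shows why a discriminator score close to 0.5 on generated samples is commonly read as a sign that the generator has matched the data distribution.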
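Definition 1 (f-divergence). Let P and Q be two distributions with continuous density functions p and q, respectively, and let $f: \mathbb{R}_+ \to \mathbb{R}$ be a convex, lower semi-continuous function with $f(1) = 0$. The f-divergence between P and Q is
$$D_f(P \,\|\, Q) = \int q(x)\, f\!\left(\frac{p(x)}{q(x)}\right) dx.$$
Variational lower bound of the f-divergence. Let $f^*$ denote the convex conjugate of f, $f^*(t) = \sup_{u} \{ut - f(u)\}$, and let $\mathcal{T}$ be a class of functions $T: \mathcal{X} \to \mathbb{R}$. 21 Then
$$D_f(P \,\|\, Q) \ge \sup_{T \in \mathcal{T}} \left( \mathbb{E}_{x \sim P}[T(x)] - \mathbb{E}_{x \sim Q}[f^*(T(x))] \right).$$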
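For discrete distributions the f-divergence reduces to a finite sum, D_f(P || Q) = sum over x of q(x) f(p(x) / q(x)), which makes it easy to check by hand. The sketch below uses two standard choices of the convex function f (Kullback-Leibler and total variation); the example distributions and helper names are illustrative and not taken from this study.

```python
import math

def f_divergence(p, q, f):
    """D_f(P || Q) = sum_x q(x) * f(p(x) / q(x)) for discrete P, Q with q > 0."""
    return sum(qx * f(px / qx) for px, qx in zip(p, q))

def f_kl(u):
    """Generator function of the KL divergence: f(u) = u log u, with f(0) = 0."""
    return u * math.log(u) if u > 0 else 0.0

def f_tv(u):
    """Generator function of the total variation distance: f(u) = |u - 1| / 2."""
    return 0.5 * abs(u - 1.0)

p = [0.5, 0.5]
q = [0.25, 0.75]
kl = f_divergence(p, q, f_kl)   # equals sum_x p(x) log(p(x) / q(x))
tv = f_divergence(p, q, f_tv)   # equals 0.5 * sum_x |p(x) - q(x)|
```

Both functions satisfy f(1) = 0, so the divergence of a distribution from itself is zero.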
The CPA is modelled on the group predation behaviour of social animals. It is designed around the cooperation and communication that animals such as wolves or lions use when hunting in groups. Because collaborative, communicative predation raises the likelihood of a successful hunt, the algorithm seeks to replicate this behaviour in optimization problems.
Equation (10) mimics the way that social animals use communication in pursuit of their prey:
During the chasing phase, the predator uses encircling motions in addition to the dispersion of the prey as strategies to improve the odds of a successful predation. The process of prey dispersion, which involves pushing the prey in different directions and undermining the prey group, is modelled by equation (11):
Equation (15) illustrates how the predators use a siege assault on the prey once they have been well scattered:
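The sketch below is a generic population-based search loop that is only loosely inspired by the phases described above. It is explicitly not the CPA update equations, which are not reproduced here: the pull toward the best-known position stands in for communication, and the shrinking random perturbation stands in for dispersion. The toy `sphere` objective, the step sizes and all names are assumptions chosen for illustration.

```python
import random

def sphere(x):
    """Toy objective to minimise; stands in for a validation loss."""
    return sum(v * v for v in x)

def cooperative_search(objective, dim, pop_size=10, steps=50, seed=0):
    """Generic population search loosely inspired by group predation.

    NOT the CPA update equations: each agent moves part of the way toward
    the best position found so far (communication), plus a shrinking random
    perturbation (dispersion).
    """
    rng = random.Random(seed)
    pop = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(pop_size)]
    best = min(pop, key=objective)[:]
    history = [objective(best)]
    for step in range(steps):
        spread = 1.0 - step / steps          # perturbation shrinks over time
        for agent in pop:
            for i in range(dim):
                pull = rng.random() * (best[i] - agent[i])
                agent[i] += pull + spread * rng.uniform(-0.5, 0.5)
            if objective(agent) < objective(best):
                best = agent[:]              # keep the best position seen so far
        history.append(objective(best))
    return best, history

best, history = cooperative_search(sphere, dim=3)
```

Because the best position is only replaced by a strictly better one, `history` never increases from one step to the next.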
Finally, we propose a deep network for intrusion detection. Two one-dimensional convolutional layers analyze the long input sequences to extract local features. The resulting feature maps are computed with the following formula, where f denotes the input of length n and the kernel has length m:
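A one-dimensional convolution followed by max pooling can be sketched in a few lines. The version below uses the unpadded sliding dot product that deep-learning libraries call convolution (strictly a cross-correlation), so an input of length n and a kernel of length m give a feature map of length n - m + 1. The example signal, the kernel values and the pooling size are arbitrary illustrative choices.

```python
def conv1d_valid(signal, kernel):
    """Slide `kernel` over `signal` without padding (cross-correlation,
    the operation deep-learning libraries call "convolution")."""
    n, m = len(signal), len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(m))
        for i in range(n - m + 1)
    ]

def max_pool1d(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]

signal = [1, 3, 2, 5, 4, 6]                  # input f, length n = 6
kernel = [1, 0, 1]                           # kernel, length m = 3
feature_map = conv1d_valid(signal, kernel)   # [3, 8, 6, 11], length n - m + 1
pooled = max_pool1d(feature_map, 2)          # [8, 11]
```

Pooling halves the sequence length here, which is how the convolutional front end shortens long inputs before they reach a recurrent layer.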
To reduce execution time, convolutional and max-pooling layers extract key features from the long sequences, and these features are then passed to the LSTM layer. Together, the CNN and LSTM capture both short- and long-term dependencies: the CNN extracts local features with a convolving filter, and the LSTM, with its several gates, stores correlations between these features in memory. The attention mechanism then learns the relative importance of the different sequence positions. By shifting the focus across the input sequence and adjusting the contribution of particular segments, it aims to increase accuracy. The attention mechanism is modelled after the human visual system, which does not scan the whole scene when it detects an item but concentrates on a specific region that depends on the target. By scoring features, attention discovers the relationships between them and how they affect target identification. The stages are described by the following equations, in which $w_0$ and $w_1$ denote the weight matrices and $h_i$ denotes the hidden states of the LSTM layer. These weights are learned during training. Once the scores are computed, the softmax function turns them into attention weights that sum to one. The weights are then applied to the output by multiplying the weight vector by the input vector:
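The scoring, softmax and weighting stages can be sketched as follows. The exact scoring function is an assumption: the sketch uses the common additive form, score_i = w1 . tanh(w0 h_i), which may differ from the formulation used in this study. The hidden states and weight values are made-up numbers.

```python
import math

def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to one."""
    peak = max(scores)                       # subtract max for stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(hidden_states, w0, w1):
    """Additive attention (assumed form): score_i = w1 . tanh(w0 @ h_i)."""
    scores = []
    for h in hidden_states:
        projected = [math.tanh(sum(w * x for w, x in zip(row, h))) for row in w0]
        scores.append(sum(w * p for w, p in zip(w1, projected)))
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # Context vector: attention-weighted sum of the hidden states.
    context = [sum(a * h[k] for a, h in zip(weights, hidden_states))
               for k in range(dim)]
    return weights, context

hidden_states = [[0.1, 0.2], [0.9, 0.8], [0.4, 0.5]]   # hypothetical LSTM outputs h_i
w0 = [[1.0, 0.0], [0.0, 1.0]]                          # hypothetical weight matrix
w1 = [1.0, 1.0]                                        # hypothetical weight vector
weights, context = attention(hidden_states, w0, w1)
```

With these values the second hidden state receives the largest weight, so it contributes most to the context vector.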
The overview of the training process is shown in Figure 3.

Figure 3. Training process of GAN.
The generator takes the input noise vector, also called the latent vector, and the process of creating new samples starts. This vector is drawn from a standard normal distribution and gives the generator a random starting point. The latent vector is the basis on which the generator produces varied and unique data samples. The generator's goal is to become proficient at using this input noise to produce samples that are as close to real data as possible.
When the input noise vector reaches the generator, the generator feeds it through the model to create a data sample that closely resembles the real data from the training dataset. The noise passes through a series of neural network layers, namely dense (fully connected), convolutional or transposed convolutional layers, depending on the type of data being generated. During training, the generator learns to model the real data by adjusting its internal parameters. The generator's main objective is to produce samples that cannot be told apart from the real data, so its parameters must be tuned carefully enough to capture the small variations and nuances present in the real data. As the generator improves, more of its samples closely match genuine data, which makes it harder for the discriminator to distinguish generated samples from actual data. Sample generation therefore plays a decisive role in GAN training and performance.
Sample generation
The discriminator receives two types of data samples for evaluation: real samples from the training dataset and fake samples created by the generator. Real samples are drawn from the actual distribution of the original data and serve as the reference for what genuine data looks like. The generator, by contrast, produces fake samples from random noise vectors that are meant to imitate the main features of the real data.
Upon receiving the inputs, the discriminator must evaluate every sample and determine whether it is real or fake. It uses the data distribution it has learnt to tell genuine data from fake data. The input passes through a series of neural network layers in which the model extracts and analyzes the characteristics that distinguish real from fake data. The discriminator outputs a probability score for each input sample that expresses how likely the sample is to be real. This score runs from 0 to 1, where a higher score means a higher probability that the sample is real and a lower score a higher likelihood that it is fake. The discriminator's classification is central to the GAN, since it gives the generator feedback on sample quality and forms the basis of the GAN's learning process. The training cycle enables both the generator and the discriminator to keep improving with each iteration.
Loss calculation
Loss calculation is the most important part of the GAN training cycle, because it directs the discriminator and generator to perform better. The generator's loss measures its ability to deceive the discriminator into believing that its (fake) samples are real. It is computed from the probabilities that the discriminator assigns to the generated samples. A higher generator loss means that the discriminator easily identifies the generator's samples as fake, which implies that the generator is not yet producing realistic or convincing samples. The generator's goal is to reduce its loss by producing samples that are as close as possible to the real data. This drives the generator to learn the underlying patterns and features of the data distribution on which it is trained. As it minimizes its loss over successive iterations, the generator fools the discriminator more and more successfully. The discriminator's loss measures how well it separates real from fake samples. It is computed from the deviation between the discriminator's predicted probabilities and the ground-truth labels (real or fake) of the input samples. A larger discriminator loss indicates that the discriminator is misclassifying real and generated samples. The discriminator aims to decrease this loss by learning the specific patterns and features that distinguish real from fake data. By continuously reducing its error, the discriminator comes to classify fake samples as fake and real samples as real.
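The two losses can be written down directly from the discriminator's scores. The sketch below uses binary cross-entropy for the discriminator and, as an assumption, the widely used non-saturating form for the generator (the negative log of the score given to generated samples), which matches the behaviour described above: the generator's loss is high when its samples are easily flagged as fake. The score values are invented for the example.

```python
import math

EPS = 1e-12  # guards against log(0)

def discriminator_loss(scores_real, scores_fake):
    """Binary cross-entropy with real samples labelled 1 and generated ones 0."""
    real_term = -sum(math.log(s + EPS) for s in scores_real) / len(scores_real)
    fake_term = -sum(math.log(1.0 - s + EPS) for s in scores_fake) / len(scores_fake)
    return real_term + fake_term

def generator_loss(scores_fake):
    """Non-saturating generator loss: large when D scores generated samples near 0."""
    return -sum(math.log(s + EPS) for s in scores_fake) / len(scores_fake)

# Discriminator easily spots the fakes: low D loss, high G loss.
d_easy = discriminator_loss([0.9, 0.95], [0.05, 0.1])
g_easy = generator_loss([0.05, 0.1])
# Discriminator is fooled: generated samples score close to real ones.
d_fooled = discriminator_loss([0.6, 0.55], [0.5, 0.45])
g_fooled = generator_loss([0.5, 0.45])
```

The two scenarios move the losses in opposite directions, which is the adversarial tension that drives training.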
Parameter updates
Backpropagation is the algorithm in neural networks that computes how the generator and discriminator parameters should be adjusted in light of their errors. The generator and discriminator loss functions are calculated once the GAN has processed a batch of data. Backpropagation then computes the gradients of these loss functions with respect to each network's parameters (weights and biases). Each gradient gives the direction and size of the adjustment needed for a parameter to lower the associated loss, and so tells the model how to modify its internal parameters to improve performance. The computation starts at the output layer of each network and moves backwards, layer by layer, until it reaches the input layer. This is the origin of the name ‘backpropagation’.
The calculated gradients are then used by the CPA optimization step to update the generator and discriminator parameters. The weights and biases of each network are tuned to decrease the loss functions and make the networks more accurate. Each parameter is adjusted by subtracting a fraction (the learning rate) of its gradient, so that the model improves step by step with each iteration. In the implementation, Adam is used for this update; it computes adaptive learning rates per parameter from estimates of the first and second moments of the gradients. This makes the training of neural networks faster and more stable.
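The Adam update mentioned above follows a standard rule, sketched here for a single scalar parameter. The hyperparameter defaults are the commonly used ones, and the quadratic toy loss is an assumption made so that the gradient can be written by hand; how this step is combined with the CPA in the proposed system is not shown.

```python
import math

def adam_step(param, grad, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; `state` holds (m, v, t)."""
    m, v, t = state
    t += 1
    m = beta1 * m + (1.0 - beta1) * grad            # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad * grad     # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)                  # bias correction
    v_hat = v / (1.0 - beta2 ** t)
    param -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, (m, v, t)

# Minimise the toy loss L(w) = (w - 3)^2, whose gradient is 2 (w - 3).
w, state = 0.0, (0.0, 0.0, 0)
first_w, _ = adam_step(w, 2.0 * (w - 3.0), state, lr=0.1)
for _ in range(500):
    w, state = adam_step(w, 2.0 * (w - 3.0), state, lr=0.1)
```

On the first step the bias-corrected ratio has magnitude close to one, so the parameter moves by roughly the learning rate regardless of the gradient's scale.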
Training cycle
The training cycle in a GAN is an iterative process that involves several key steps: the generator producing data samples, the discriminator evaluating these samples, the loss functions calculated for each network, and the network parameters updated based on these losses. This cycle is continued for several epochs or iterations to allow both the generator and discriminator to get better at their performance as they go. The generator is guaranteed to create a fresh data sample with the noise vector as input at the start of each iteration. After giving a try to these samples and comparing them with the actual data from the training dataset, the discriminator finally classifies these samples as either authentic or phoney. The generator and discriminator loss functions are calculated once the classification result is obtained. In the end, backpropagation and optimization techniques are utilized to minimize losses and optimize the network's parameters.
As training goes on, the discriminator becomes increasingly adept at differentiating between fake and authentic data, while the generator produces samples that are increasingly similar to the data in the training dataset. The goal of training is for the generator's samples to be as close to the real data as feasible, so that the discriminator can no longer tell them apart and its output for them approaches chance level (i.e., a probability close to 0.5). Reaching this equilibrium is not easy, because the generator and the discriminator are constantly updating themselves to overcome each other. The generator attempts to produce more realistic samples to deceive the discriminator, and the discriminator in turn tries to improve its classification accuracy. It is this rivalry that drives the learning of both networks.
The training process continues until a stopping criterion is met. Typical criteria are a maximum number of epochs or iterations, an error or accuracy threshold, or convergence of the generator and discriminator loss functions. Training may also be interrupted early if signs of over-fitting or mode collapse are detected, since either can damage the quality of the generated samples. Training is considered complete when the generator produces high-quality samples that capture most features of the real data, while the discriminator outputs a score close to 0.5 when it classifies the generator's samples. At that point the generator and the discriminator are in balance, and the GAN model is ready for deployment or further testing.
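Two of these stopping criteria, the epoch limit and a discriminator score near 0.5, can be expressed as a small helper. This is a sketch under assumptions: the tolerance `tol`, the window length `patience` and the function name are hypothetical, and detection of over-fitting or mode collapse is not covered.

```python
def should_stop(epoch, max_epochs, fake_scores, tol=0.05, patience=5):
    """Decide whether GAN training should stop.

    fake_scores: mean discriminator output on generated samples, one value
    per completed epoch. Returns a (stop, reason) pair.
    """
    if epoch >= max_epochs:
        return True, "max_epochs"
    window = fake_scores[-patience:]
    # Equilibrium: the score has stayed within tol of 0.5 for `patience` epochs.
    if len(window) == patience and all(abs(s - 0.5) <= tol for s in window):
        return True, "equilibrium"
    return False, "continue"
```

Requiring the score to stay near 0.5 for several consecutive epochs, rather than for a single epoch, guards against stopping on a transient fluctuation.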
Experimentation and results
We selected several popular datasets for this study to assess the learning model's performance in depth: KDD Cup99, NSL-KDD, CICIDS 2017, WSN-DS and UNSW-NB15. Each offers a particular set of features and problems related to cybersecurity and network intrusion detection. Each dataset was divided into training and testing sets in a 75:25 ratio to enable a thorough analysis. The experiments were run using Google Colab, a cloud-based development environment that gives the user access to high-performance computational resources. The software was written in Python 3.3, a widely used programming language in the data science and machine learning domains. The machine used an 11th Gen Intel(R) Core(TM) i5 processor, which offered sufficient processing power for modelling and handling large datasets, together with 16 GB of RAM for running the data analysis and learning algorithms efficiently. The operating system was Windows 11, which provided compatibility with the required software and tools. In addition, 145 GB of storage was available, which was enough to hold the large datasets and the intermediate data produced during training and testing. The widely used open-source library TensorFlow 2.13.0 was used for the machine learning and deep learning components. An NVIDIA GTX 750 GPU served as the accelerator, which increased processing speed and made it possible to handle large, complex models. This setup made it possible to evaluate the model's performance over the whole of each dataset, which in turn provided the evidence needed to justify using the model for cybersecurity and intrusion detection applications.
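The dataset split can be sketched as follows, assuming the stated ratio means that 75% of the records are used for training and 25% for testing. The shuffling step, the seed and the function name are assumptions, since the text does not say how the records were assigned to each set.

```python
import random

def train_test_split(records, train_fraction=0.75, seed=42):
    """Shuffle a copy of `records` and split it into training and testing sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed for a reproducible split
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]
```

For a dataset of 100 records this yields 75 training and 25 testing records, with every record appearing in exactly one of the two sets.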
Dataset used
KDD CUP99
The KDD Cup ‘99 dataset originates from DARPA's IDS evaluation program. The collection comprises four gigabytes of compressed ‘tcpdump’ data captured over seven weeks of network activity, which can be processed into about five million connection records of around 100 bytes each. The dataset consists of about 4,900,000 connection vectors, each with 41 attributes. It covers 22 different attack types, which may be divided into four main groups.
NSL-KDD
The NSL-KDD dataset is derived from the KDD Cup 99 intrusion dataset. Filters were applied to remove duplicate records from the KDD Cup 99 dataset. More precisely, the resulting dataset does not contain the data values 136,489 and 136,497. NSL-KDD can therefore reduce potential bias in ML algorithms. Compared with the KDD Cup 99 dataset, it supports misuse detection well. However, it does not precisely capture all characteristics and specifications of network profiles for real-time communication.
CICIDS 2017
The CICIDS 2017 dataset offers a representation of real-time network traffic that includes both benign and attack behaviour, with a primary focus on traffic collected in real time. The B-profile was used to collect benign background traffic. The dataset contains information on the email, FTP, SSH, HTTP and HTTPS usage of 25 users. The network traffic was collected and logged over five consecutive days.
WSN-DS
The WSN-DS dataset was developed specifically for IDSs in wireless sensor networks (WSNs). It contains four distinct types of attacks. The data were gathered in Network Simulator 2 using the low-energy adaptive clustering hierarchy (LEACH) protocol.
UNSW-NB15
Because of the limitations of the KDD Cup 99 and NSL-KDD datasets, the Australian Centre for Cyber Security developed a new dataset called UNSW-NB15. The data were collected with a hybrid approach that combines malicious and benign network traffic in real time. The traffic was produced with the IXIA PerfectStorm tool, which includes a wide range of attacks and is continuously updated from the Common Vulnerabilities and Exposures (CVE) list, a repository of publicly available information on security flaws and vulnerabilities. The IXIA traffic-generating setup employed two servers, one acting as a load generator and the other as a verification server. Within the network, one server generated benign activity and the other generated malicious activity. The tcpdump tool was used to capture the network packets. In total, 100 GB of data were captured over a period of several hours, and tcpdump was also used to split the capture into 1 GB pcap files. In addition, all the data packets were analysed by 12 algorithms written in C# as part of the feature extraction.
Performance metrics
We used six key metrics to evaluate the IDS: accuracy, precision, specificity, sensitivity, false negative rate (FNR) and false positive rate (FPR). These metrics are explained in detail below:
Accuracy
Accuracy measures the proportion of all instances, attacks and normal traffic alike, that the IDS assigns to the correct category. It is a fundamental metric for classification problems and offers an overall assessment of the correctness of the model. In the formulas below, TP, TN, FP and FN denote the numbers of true positives, true negatives, false positives and false negatives, respectively. The mathematical representation is:
\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \]
Precision quantifies the quality of positive predictions as the proportion of predicted positives (intrusions) that are real positives. Precision can be computed as:
\[ \text{Precision} = \frac{TP}{TP + FP} \]
The false positive rate, also known as the false alarm rate or fall-out, is a statistical metric used to assess a classification model's performance, particularly in circumstances where the accurate identification of negative cases is crucial. It quantifies the rate at which the model incorrectly classifies actual negative cases as positive. The false positive rate is calculated as follows:
\[ \text{FPR} = \frac{FP}{FP + TN} \]
The false negative rate is a statistical metric that matters most in situations where every positive case should be identified. It is the proportion of actual positives that the model incorrectly classifies as negative, so a higher value means that the model more often fails to identify a condition that is present. This metric is very important in medical diagnostics, since a delayed or missed diagnosis can lead to wrong treatment; in intrusion detection, a false negative corresponds to an attack that goes undetected. The FNR is calculated as follows:
\[ \text{FNR} = \frac{FN}{FN + TP} \]
Specificity, which in medicine refers to a test's ability to correctly identify the absence of a specific disease or condition in healthy individuals, measures here the proportion of actual negatives (normal traffic) that are correctly identified as negative. Specificity can be calculated as:
\[ \text{Specificity} = \frac{TN}{TN + FP} \]
A classifier's sensitivity is the ratio of correctly recognized positive instances to all actual positive instances. The sensitivity can be calculated as:
\[ \text{Sensitivity} = \frac{TP}{TP + FN} \]
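The six metrics can be computed together from the four confusion-matrix counts, using their standard definitions. The sketch below is illustrative only: the function names and the example counts are hypothetical, and TP, TN, FP and FN stand for true positives, true negatives, false positives and false negatives.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0

def ids_metrics(tp, tn, fp, fn):
    """Compute the six evaluation metrics from confusion-matrix counts."""
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "precision": _ratio(tp, tp + fp),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "fpr": _ratio(fp, fp + tn),
        "fnr": _ratio(fn, fn + tp),
    }

# Hypothetical counts, for illustration only.
example = ids_metrics(tp=90, tn=80, fp=20, fn=10)
```

With these counts the accuracy is 0.85, the sensitivity 0.9 and the specificity 0.8. The FNR is the complement of the sensitivity and the FPR is the complement of the specificity.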
The performance of the proposed IDS on the KDD Cup99 dataset is displayed in Table 2 and Figure 4. The dataset covers several classes: U2R (User to Root), DoS (Denial of Service), Probe, Normal and R2L (Remote to Local). Each class is assessed using accuracy, FNR, sensitivity, FPR, precision and specificity. For U2R attacks, the IDS achieved an accuracy of 97.6%, which implies that the number of correct classifications was high. The FNR is low at 3.2%, which means that relatively few real U2R attacks were missed by the system. The sensitivity, or proportion of correctly identified true positives, is 96.8%, while the FPR is very small at 0.7%. This shows that normal traffic is rarely misclassified as a U2R attack. The specificity of 99.3% demonstrates a strong capability to correctly identify real negatives. Precision, which reflects the percentage of real U2R attacks among all instances recognized as U2R, is 97.6%.

Experimental results of proposed IDS on the KDD Cup99 dataset.
Experimental results of proposed IDS on the KDD Cup99 dataset.
Likewise, the IDS has an accuracy of 96.8% for DoS attacks. The FNR is 3.7%, which suggests a low likelihood of missing true DoS attacks. Sensitivity is 96.3%, although the FPR is somewhat higher at 4.2%. The specificity remains at a satisfactory 95.8%. A precision of 92.7% demonstrates the competency of the approach in correctly recognizing DoS attacks.
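The U2R and DoS figures reported above can be checked for internal consistency. By definition, sensitivity and FNR are complements, as are specificity and FPR, so each pair should sum to 100%. The short check below uses only the reported percentages.

```latex
\begin{align*}
\text{Sensitivity} &= 100\% - \text{FNR}, &
\text{Specificity} &= 100\% - \text{FPR} \\
\text{U2R:}\quad 100\% - 3.2\% &= 96.8\%, &
100\% - 0.7\% &= 99.3\% \\
\text{DoS:}\quad 100\% - 3.7\% &= 96.3\%, &
100\% - 4.2\% &= 95.8\%
\end{align*}
```

Both attack classes satisfy the two identities exactly.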
Table 3 and Figure 5 show how the suggested IDS performed when tested against the NSL-KDD dataset and its variety of attacks. The accuracy and effectiveness of the IDS differ across attack types. In one case, the accuracy is 84.1%, with a sensitivity of 86.4% and a specificity of 72.3%; the corresponding FPR of 27.7% is large, which means that a considerable number of routine cases might be misclassified as attacks. By contrast, for the R2L attack the IDS achieved a detection accuracy of 99.8% with a very low FNR and a high precision of 99.1%. Furthermore, the model proved competent against U2R and DoS attacks, with accuracies of 90.6% and 91.5%, respectively, and good sensitivity and specificity. For the U2R attack, the low FPR of 0.2% indicates that there are very few false alarms. On Probe attacks, the IDS obtains an accuracy of 92.4% with reasonably balanced sensitivity and specificity. These findings show that, despite performance variations across the attack classes, the recommended IDS is capable of detecting a broad variety of attack types from the NSL-KDD dataset.
The ratios for the training and testing datasets, as well as the performance indicators of our IDS approach against various attacks, are shown in Table 4 and Figure 6. Every attack type is assessed in terms of accuracy, FNR, sensitivity, specificity and precision. The results show that the detection performance of the IDS varies with the attack. For example, in Web attack detection it achieves an accuracy of 91.5%, with a sensitivity of 92.6% and a precision of 84.8%. Despite this, the FPR of 17.6% is notable, which means that normal behaviour may be mistaken for an attack. Similarly, for Normal instances the IDS demonstrates good precision (89.6%) and accuracy (95.3%), with balanced sensitivity and specificity. In contrast, for the SSH-Patator attack the system has a higher FNR (23.6%) and a lower recall (87.2%), which suggests that there is still room for improvement in identifying and correctly labelling these attacks. Overall, the experiment shows both the strengths of the IDS across multiple attack classes and the areas where it needs to be enhanced for detecting and dealing with cyber-attacks.
Experimental results of proposed IDS on NSL-KDD dataset.
Table 5 and Figure 7 display the experimental findings of the recommended IDS on the WSN-DS dataset, which demonstrate its capacity to detect various sorts of attacks. The IDS achieves good accuracy in general, with Normal cases the highest at 97.5%. This shows that the IDS recognizes normal behaviour reliably. The detection accuracy for Blackhole attacks is 95.6%, combined with an FNR of 14.0%, which implies that the IDS may miss some of these attacks. Grayhole attacks are detected with an accuracy of 91.9% and a lower FNR of 4.0%, which indicates better sensitivity. The IDS performs particularly well on Flooding and Scheduling attacks, with accuracy values of 99.4% and 99.2%, respectively. In addition, these two attack types have zero FPR, which is evidence of the ability of the IDS to separate normal from deviant behaviour. Overall, these findings indicate that the IDS can identify the distinct attack types in the WSN-DS dataset, with especially good accuracy in Flooding and Scheduling attack detection.

Experimental results of proposed CPA-GAN on NSL-KDD dataset.
Experimental results of proposed IDS on CICIDS 2017 dataset.

Experimental results of the proposed CPA-GAN model on CICIDS 2017 dataset.
Experimental results of the CPA-GAN on the WSN-DS dataset.

Experimental results of proposed CPA-GAN on WSN-DS dataset.
Table 6 and Figure 8 display the results of the IDS on the UNSW-NB15 dataset and give an insight into how it performs against various attack types. The IDS detects the different threats with varying degrees of success. For Normal traffic, it obtains an accuracy of 79.7%, with a sensitivity of 95.6% and a precision of 87.9%. However, the FPR of 28.5% suggests that a substantial share of regular activity may be misclassified as cyber-attacks. The IDS demonstrates a high degree of accuracy for certain threats such as Fuzzers, DoS, Shellcode, Backdoors, Worms, Reconnaissance, Exploits and Analysis, with figures of up to 99.9%. This suggests that the IDS detects these attacks with few missed detections (false negatives) and few false alarms (false positives). By contrast, detection of the Generic attack class was not as straightforward: its reported outcome was 78.3%, 12% lower than that of the phishing and spam attacks, with a sensitivity of 91.5% and an accuracy of 89.2%. Overall, the experimental findings show that the proposed IDS recognizes a wide range of attack types on the UNSW-NB15 dataset, with strong performance on particular classes and some difficulty in accurately discriminating the Generic attack category.
Experimental results of proposed CPA-GAN on the UNSW-NB15 dataset.

Experimental results of proposed CPA-GAN on the UNSW-NB15 dataset.
According to these results, accuracy is generally high, although the FPR is also high for some classes. Even so, the detection performance achieved on data dominated by normal traffic, in which anomalies are rare, indicates that this IDS is adequate for intrusion detection. There are noticeable variances in the results when comparing the ‘U2R’, ‘DoS’ and ‘Probe’ categories. The fundamental reason for this difficulty is the limited quantity of data in the training sets for these attack types; as a consequence, these attack categories are assigned less weight by the classifiers during training. The aim in multi-class classification is to increase the classifier's power to accurately detect every individual attack. We also compared the recommended IDS with the most well-known ML models. Six ML models were selected for the research, and the five datasets were used to analyse the models' accuracy and other properties. The DT and RF classifiers outperformed the logistic regression and naïve Bayes classifiers in terms of accuracy. Moreover, the DT and RF classifiers performed consistently on all datasets, whereas the outcomes of logistic regression, naïve Bayes, KNN and SVM vary from dataset to dataset. This implies that the DT and RF classifiers may be applied in a range of scenarios and can quickly detect novel types of attacks. When accuracy is taken into consideration, the recommended IDS outperforms all other techniques.
Table 7 shows how well the ML models recognized different types of attacks on each dataset. Each dataset (KDD Cup 99, NSL-KDD, CICIDS 2017, WSN-DS and UNSW-NB15) is investigated separately using a variety of machine learning approaches, such as logistic regression, decision trees, random forests and naïve Bayes. The suggested CPA-GAN approach performs consistently well across the datasets and nearly always receives the highest score. For instance, it outperformed the other models with an accuracy of 93.1% on the KDD Cup 99 dataset. On the CICIDS 2017 dataset, it again yields the top accuracy of 95.2%. Furthermore, it achieves the highest accuracy of all the models analysed here, 99.8%, on the WSN-DS dataset. Nevertheless, each ML model has its own strengths: it may do well on some datasets while other models are superior on others. In essence, the results show how well the suggested CPA-GAN model performs in accurately categorizing attacks across the various datasets. They also emphasize the need to select machine learning models that are tailored to the specific dataset features to achieve the best possible results in intrusion detection tasks.
Detailed test results for categorizing attacks on different datasets.
The CPA-GAN model is particularly good at identifying different types of attacks, as it has been shown to outperform other machine learning methods across a variety of datasets. While the model performed well for the majority of attack classes, there is still room for improvement, especially with regard to the high FPR and FNR observed in some cases. The CPA-GAN model can also handle a variety of datasets and attack strategies. Nevertheless, the wrongful classification of regular activities as attacks remains a matter of concern and needs further improvement. The study also recommends classifiers such as decision trees or random forests for detecting new attack types in diverse environments, which could make machine learning-based detection more efficient.
Practical implications
The proposed framework provides useful insights for the security of IoT networks, especially at the edge computing level, and offers practical recommendations that can considerably strengthen that security. IoT edge networks have limited resources and are exposed to unauthorized attacks, which makes them targets of unlawful activities. By combining GANs with CPA-based optimization, the framework implements a powerful IDS designed specifically to detect threats at IoT edge nodes. The IDS is intended to be highly accurate in anomaly detection while generating very few false positives, which is critical for quick threat response. In addition, the CPA streamlines the training process, which helps to optimize resource utilization and reduce computational overhead, so that the framework can be deployed in diverse IoT edge environments. Extensive evaluation and simulation showed that the proposed framework exceeds the accuracy of existing methods, with a maximum improvement of 4% in detection accuracy. This improvement makes the system stronger against malicious activities and thereby raises the security level of the IoT deployment. Further, the framework is designed to be scalable and adaptable, which increases its usefulness in changing IoT landscapes with growing networks and a greater variety of devices. Its cost-effectiveness makes it feasible for enterprises of all sizes to implement robust security measures at moderate cost.
Limitations
Whether the proposed framework scales to real IoT deployments remains a matter of concern. The paper shows improved intrusion detection accuracy compared with existing methods, but it may not have answered all the questions that arise in implementing and maintaining such a framework in large-scale IoT networks. Running the GAN and CPA on edge nodes with reduced memory and processing power consumes computational resources and might limit scalability. In addition, the suitability of the framework for different types of IoT environments and its ability to cope with the data volumes of bigger networks must be examined to ensure its feasibility. Solving these scaling issues is one of the key factors in the framework's success, particularly for the move to real-time applications, which is one of its main intended uses.
Conclusion and future scope
The objective of the study was to evaluate the accuracy of an IDS, including measures such as false positives and false negatives. The research focused on the stability of the model in detecting different types of attacks, and several datasets were used, namely KDD Cup99, NSL-KDD, CICIDS 2017, WSN-DS and UNSW-NB15. These datasets were used to compare our suggested CPA-GAN model with logistic regression, random forests, decision trees and naïve Bayes. In summary, the CPA-GAN model outperforms existing machine learning models in most cases and can effectively handle a variety of attacks on a wide range of datasets. An important feature of this system is its adaptability and versatility, which make it valuable for cybersecurity and IDSs. However, additional work has to be done to lower the model's FPR and FNR and thereby achieve higher accuracy.
Future research could direct its efforts towards addressing the indicated weaknesses of the CPA-GAN model. In particular, the model's algorithms and approaches could be improved to perform more accurately against different categories of attacks while reducing the misclassification of normal actions as attacks. Moreover, drawing on classifiers such as decision trees and random forests may produce more robust and refined intrusion detection software. Finally, scaling up and testing the framework in multiple real-world scenarios will be vital for its practical application in cybersecurity systems.
Footnotes
Abbreviations
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
