Generate qualified adversarial attacks and foster enhanced models based on generative adversarial networks

Abstract

In cybersecurity, intrusion detection systems (IDSes) are of vital importance, allowing companies and their departments to identify malicious attacks in massive volumes of network traffic; however, the effectiveness and stability of these artificial intelligence-based systems are challenged by adversarial attacks. This work explores a creative framework based on a generative adversarial network (GAN), with a series of training algorithms, that aims to generate adversarial attack instances and use them to help establish a new neural-network-based IDS that can replace the old IDS without knowledge of any of its parameters. Furthermore, to verify the quality of the generated attacks, a transfer mechanism is proposed for calculating the Fréchet inception distance (FID). Experiments show that, based on the original CICIDS2017 dataset, the proposed framework can generate four types of adversarial attacks (DDoS, DoS, Bruteforce, and Infiltration) that drive four types of classifiers (Decision Tree, Random Forest, Adaboost, and Deep Neural Network), set as black-box old IDSes, to low detection rates; additionally, the IDSes newly established by the proposed framework have an average detection rate of 98% on both generated adversarial attacks and original attacks.

Keywords

Adversarial attacks, deep learning (DL), generative adversarial networks (GAN), intrusion detection system (IDS), machine learning (ML)

1. Introduction

With the development of the internet and communication technology, the scale of network traffic has sharply increased, causing rapid growth in the number of network attacks and malware programs. For many institutions and companies, protecting the systems from these potential threats is never trivial; thus, it is necessary to build stable and powerful network intrusion detection systems (IDSes) that can detect and identify these malicious actions [1].

Recently, improvements in computing power have promoted the use of machine learning (ML) and deep learning (DL) techniques in a wide variety of fields, including network attack detection. By training on a sufficiently large set of normal and attack instances, an ML- or DL-based classifier can surpass the classifying ability of humans and perform well in detecting anomalies in network flows [2]. However, these classifiers are not without their disadvantages, one of the most notable of which is their vulnerability to adversarial attacks [3].

The goal of an adversarial attack is to look for flaws in these ML- or DL-based models and 'trick' them into reporting misleading classification results [4]. Traditionally, adversarial attacks are created by methods such as convex programming, local search, and combinatorial optimization. Starting from the original attack, these methods find perturbations that can confuse the target IDS [5]. Generative adversarial networks (GANs), proposed by Goodfellow in 2014, are useful networks for generating such attacks [6]. Attacks generated with GANs can continuously adapt to the target IDS while the target IDS attempts to make predictions regarding the attack; as a result, the attacks more easily expose vulnerabilities in ML- or DL-based classifiers [7].

GANs are able to generate adversarial attacks by continuously adapting to the target IDS, and these attacks can also be fed into models as training samples, enabling direct enhancement of the original ML- or DL-based IDS [8, 9]. However, many institutions have long-deployed IDSes whose parameters are usually unavailable, which prevents the configuration of these models from being changed; in these cases, there is a need to establish a new IDS that both withstands adversarial attacks and maintains the classifying ability of the original IDS, whose construction is unknown.

This paper presents a solution to address this need; its main contributions are as follows:

• A creative framework based on a GAN and a series of training algorithms is proposed for generating qualified adversarial attacks and developing a new classifier that detects both adversarial attacks and the attacks in the original dataset.
• A comprehensive detection experiment is performed to evaluate both the original ML/DL IDSes and the developed classifiers on both adversarial attacks and the original dataset.
• Another experiment is conducted to evaluate the authenticity of the generated adversarial attacks at all training steps.

The remainder of the paper is structured as follows. Sections 2 and 3 discuss the related works and the restriction in this domain, respectively. Section 4 explains the construction of the proposed framework and the training algorithms. The experimental setup and results are given in Section 5. Section 6 presents critical comparisons with related works. Final conclusions and future work are provided in Section 7.

2. Related works

2.1 Intrusion detection system

To enhance defensive capabilities in cybersecurity, IDSes were created as virtual protection mechanisms that monitor data traffic on the internet or local networks for malicious activity or policy violations [1]. ML and DL are families of algorithms developed to recognize and model patterns in data features, based on chunks of datasets or continuous data streams, and to predict unknown and upcoming features. With advancements in these fields, researchers and developers today typically use popular methods such as Decision Trees (DTs), Random Forests (RFs), and Deep Neural Networks (DNNs) to create IDSes based on a number of specific datasets [2]. One example benchmark dataset is KDDCup99, which is widely used by academic researchers and companies to create effective models [10]; alternative datasets include UNSW-NB15, Kyoto, WSN-DS, and CICIDS2017 [11].

2.2 Adversarial attacks

In 2014, Christian Szegedy found that DNNs learn input-output mappings that are fairly discontinuous to a significant extent [12]; this indicates that certain, almost imperceptible perturbations may cause the DNN to misclassify testing samples. For example, when a perturbation is added to a set of images, and these images are applied to a typical DNN for sex classification, these images are falsely classified as male despite the ground truth, according to human eyes, of female. Following Szegedy’s work, the concept of an adversarial attack, which can confuse a neural network with slight perturbations, was developed. Currently, scientists have proposed numerous methods to generate this perturbation: the fast gradient sign method (FGSM), one-pixel attack, and so on [13]. In addition to image attacks, adversarial attacks can be applied in other domains, such as video attacks and reinforcement learning.

2.3 Adversarial training

Adversarial training is the process of training a model to correctly classify both original and adversarial attack instances [14], which can improve not only the robustness of the model to modified attacks but also its generalization performance for unmodified instances. A common way to perform adversarial training is to utilize adversarial attacks as training samples; thus, adversarial training is usually integrated with the generation of adversarial attacks.

2.4 Generative adversarial networks

In 2014, Ian Goodfellow and colleagues proposed GANs, whose main structure consists of two neural networks named the generator and the discriminator [6]; the basic architecture of a GAN is shown in Fig. 1. The generator receives a noise vector drawn from a Gaussian distribution and converts it into an instance in such a way that the instance can bypass the discriminator. Injecting noise into the generator input also serves as data augmentation, preventing overfitting and improving model generalization. The discriminator is a binary classifier with a sigmoid activation function in its last layer that outputs a value from 0 to 1; if an instance is assigned a value greater than 0.5 by the discriminator, it is treated as an authentic instance, and otherwise as a fake instance. The purpose of the discriminator is to distinguish instances that originate from the real dataset from those produced by the generator. In the training phase, the generator and discriminator compete continuously with each other to achieve their purposes. After a certain number of training epochs, ideally, the generator is able to create authentic samples that the discriminator has difficulty distinguishing from real samples.

Figure 1.

GAN architecture.

Shortly after their creation, GANs became a popular topic, and many researchers are currently attempting to improve the stability of generation, the resolution, or the authenticity of the generated vectors, with variants such as WGAN [15], CycleGAN [16], SEGAN [17], and StyleGAN [18]. Meanwhile, GANs have been applied in many fields: in image processing, GANs are used to generate unseen but realistic pictures [19], while domains such as video generation [20] and medical care [21] have also made use of this technology.

2.5 Attack generation and adversarial training in domain of intrusion detection

Lin et al. [22] proposed a framework named IDSGAN for generating adversarial attacks and bypassing target IDSes by using NSL-KDD; the discriminator in this framework continuously simulates the target IDS, and as a result, this discriminator, rather than the target IDS, is the target that the generator must bypass. Yan et al. [7] applied a WGAN to generate attack traffic that bypasses an IDS automatically, using the standardized Euclidean distance and information entropy to assess the model; on the KDDCup99 dataset, the authors reduced the accuracy of the IDS from 97.34% to 47.62%. Lee et al. [23] designed an autoencoder-conditional GAN to improve the performance of an RF model; in experiments, the proposed GAN-RF obtained an F1-score of nearly 0.95 on the CICIDS2017 dataset after data augmentation. Usama et al. [5] proposed a GAN-based architecture to generate adversarial attacks by feeding the content features from KDDCup99 to the model; the generated attacks successfully evaded ML- and DL-based IDSes and reduced the average accuracy of these IDSes to approximately 50%. The generator in this architecture was then used to perform adversarial training to strengthen these IDSes, after which their average accuracy reached nearly 80%. Msika et al. [9] proposed a framework named SIGMA that uses a GAN, metaheuristics, and local search to generate attacks. SIGMA then applied these attacks to strengthen the IDS and finally improved the performance of an existing IDS to a detection rate of up to 100% on these generated attacks. To detect adversarial attacks in network traffic, Ye et al. [24] designed an adversarial sample detector (ASD) based on the bidirectional GAN. The generator was trained to reflect the normal data distribution, and the reconstruction error and the discriminator matching error were then calculated from the samples. In experiments, the ASD helped a network IDS defend against adversarial attacks generated via three typical methods (e.g., FGSM). The framework proposed in this paper aims to perform both adversarial attack generation and IDS building.

3. Restriction on attack generation

To preserve the functional behavior of modified attacks, constraints are imposed on adversarial attack generation [5]; for example, for computer vision, the visual appearance of generated instances should be the same as that of the original instances. In language processing, the semantic meaning cannot be changed when generating adversarial text instances.

In network intrusion detection, the generated network traffic should not invalidate specific network traffic features, such as the intrinsic features in NSL-KDD [22] and the flow duration in CICIDS2017 [9]. These features are named functional features and are related to the feasibility of network traffic, while the remaining features are named nonfunctional features and are allowed to be modified. As a result, when performing attack generation, only nonfunctional features can be generated in a network vector.

Due to this restriction, typical methods such as the Fast Gradient Sign Method (FGSM) used in image attack generation are not suitable for network traffic generation because they do not distinguish between functional and nonfunctional features and instead attempt to modify every feature. GANs thus serve as an alternative method to generate network traffic instances, provided that the features fed into the GAN are nonfunctional features only.
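As a minimal sketch of this restriction, the following Python snippet splits a flow record into functional and nonfunctional features and allows only the nonfunctional part to be replaced. The functional feature names are those listed for the DDoS group in Table 3; the record values, the helper names `split_features` and `join_features`, and the stand-in perturbation are hypothetical and are not part of the proposed framework.

```python
# Functional features of the DDoS group (Table 3); these must stay unchanged.
DDOS_FUNCTIONAL = {
    "Flow Duration", "Bwd Packet Length Std", "Average Packet Size",
    "Packet Length Std", "Flow IAT Std", "ACK Flag Count",
}

def split_features(record, functional_names):
    """Split one flow record (a dict) into functional and nonfunctional parts."""
    func = {k: v for k, v in record.items() if k in functional_names}
    nonf = {k: v for k, v in record.items() if k not in functional_names}
    return func, nonf

def join_features(func, nonf):
    """Recombine the two parts into a full record."""
    return {**func, **nonf}

# Hypothetical, already-normalized record (two functional, two nonfunctional features).
record = {"Flow Duration": 0.42, "ACK Flag Count": 1.0,
          "Destination Port": 0.01, "Fwd IAT Min": 0.30}

func, nonf = split_features(record, DDOS_FUNCTIONAL)
# Stand-in for a generator: only the nonfunctional values are replaced.
generated_nonf = {k: min(1.0, v + 0.05) for k, v in nonf.items()}
adversarial = join_features(func, generated_nonf)
```

A gradient-based method such as FGSM would perturb all four values in `record`, whereas here the two functional values are carried over untouched.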

4. Proposed methods

4.1 GANCIDS framework

To establish a new IDS to both prevent adversarial attacks and maintain the classifying ability of the original IDS, the authors propose a Generative Adversarial Network with Classifier for Intrusion Detection System (GANCIDS) to both generate adversarial attacks and perform adversarial training. This framework contains three trainable models, named Generator (G), Discriminator (D), and Classifier (C), as well as a pretrained model named Feature Extractor (FE), all of which are based on neural networks. The details of GANCIDS are shown in Fig. 2. In addition to the four mentioned models, the figure also shows IDS, which refers to the old IDS and is deployed as a target model (i.e., target IDS) for G to bypass and C to replace.

Figure 2.

Structure of GANCIDS.

Due to the restriction in Section 3, the features in every instance must be separated into functional and nonfunctional features, only the latter of which can be modified. The resulting process for generating an adversarial attack is illustrated in the Generating Part of Fig. 2. First, the nonfunctional features are separated from the original attack and transformed into a latent vector by FE; second, if the dimension of this latent vector is $k$, a $k$-dimensional random noise vector is sampled from a Gaussian distribution; third, the latent vector and the noise are merged into a $2k$-dimensional vector by concatenation and fed into G; fourth, G outputs the generative, nonfunctional part, which is integrated with the functional features from the original attack to form a generative instance; finally, D is used to determine whether this generative instance can be considered authentic. As described in Section 2.4, a GAN discriminator generally uses 0.5 as the threshold to judge whether a generated vector is real or fake: if this generative instance obtains a value of less than 0.5 from D, it is treated as a fake instance and discarded; otherwise, it is treated as a real adversarial attack.

After attack generation, using different training algorithms, features are passed to C and the target IDS to determine the corresponding losses, which help train G and C.

Figure 3.

Structure of feature extractor.

Figure 4.

T-SNE before and after feature extractor.

In this framework, nonfunctional features are not input to G directly because they are usually complicated and high dimensional. To retain the most useful information while reducing the complexity of these features, FE is designed as a feature extractor based on transfer learning [25, 26]. It is constructed from a pre-trained model via a two-step process: first, nonfunctional features and labels from the original attacks and normal examples are fed into a multilayer neural network to train a binary classifier; second, the last layer of this neural network is discarded. The construction of FE is illustrated in Fig. 3.
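The two-step construction can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the layer sizes are the FE sizes listed in Table 5, the weights are random because the training of the binary classifier (step 1) is omitted, and the LeakyReLU slope of 0.01 as well as the helper names are hypothetical.

```python
import random

def leaky_relu(v, slope=0.01):          # slope value is an assumption
    return [x if x > 0 else slope * x for x in v]

def linear(v, weights, bias):
    """Fully connected layer: `weights` holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, bias)]

def make_layer(n_in, n_out, rng):
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

rng = random.Random(0)
sizes = [78, 128, 64, 32, 16, 1]        # FE sizes listed in Table 5

# Step 1 (training the binary classifier) is omitted; the weights are random here.
classifier_layers = [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

# Step 2: discard the last layer; the remaining stack acts as the feature extractor.
fe_layers = classifier_layers[:-1]

def feature_extractor(x):
    for weights, bias in fe_layers:
        x = leaky_relu(linear(x, weights, bias))
    return x

latent = feature_extractor([rng.random() for _ in range(sizes[0])])
```

With these sizes, dropping the final one-unit layer leaves a 16-dimensional output, which matches the latent-space size of 16 reported in the experiments.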

4.2 Training methodology

For mathematical interpretation, some symbols and equations are introduced in advance:

• $x$ is a batch of original instances.
• $\bar{x}$ is a batch of generative attack instances.
• $\tilde{x}$ is a batch of attack instances, including original and generative attack instances.
• $\hat{x}$ is a batch of mixed instances, including benign, original attack, and generative attack instances.
• $z$ is a batch of noise vectors sampled from a Gaussian distribution.
• $y$ and $\hat{y}$ are the batches of real labels of $x$ and $\hat{x}$, respectively; the label of an attack equals 1 and the label of a benign instance equals 0.
• $x^{\textit{func}}$ and $x^{\textit{nonf}}$ are the functional and nonfunctional features in $x$, respectively.
• $\theta_{G}$, $\theta_{D}$, and $\theta_{C}$ are the parameters of the Generator, Discriminator, and Classifier, respectively.
• $F(x)$ is the output of the Feature Extractor for input $x$ (i.e., the features after extraction).
• $G(x)$ is the output of the Generator for input $x$ (i.e., the generated nonfunctional features).
• $D(x)$ is the output of the Discriminator for input $x$.
• $T(x)$ and $C(x)$ are the outputs of the target IDS and the Classifier for input $x$, respectively.
• $J(x_{1},x_{2})$ is a joint function of $x_{1}$ and $x_{2}$ that connects functional and nonfunctional features; thus, $x$ and $\bar{x}$ can be calculated by Eqs (1) and (2), respectively:

$\displaystyle x=J(x^{\textit{func}},x^{\textit{nonf}})$ (1)

$\displaystyle\bar{x}=J(x^{\textit{func}},G(F(x^{\textit{nonf}}),z))$ (2)
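To make Eqs (1) and (2) concrete, the sketch below composes them in Python. The functions `F` and `G` are hypothetical stand-ins (not trained networks), and `J` is assumed to be a plain concatenation with the functional features first; the actual joint function may instead place each feature back at its original column position.

```python
import random

def J(x_func, x_nonf):
    """Join functional and nonfunctional features into one instance (Eq. (1))."""
    return list(x_func) + list(x_nonf)

def F(x_nonf, k=4):
    """Hypothetical feature extractor: averages chunks into a k-dimensional latent vector."""
    step = len(x_nonf) // k
    return [sum(x_nonf[i * step:(i + 1) * step]) / step for i in range(k)]

def G(latent, z, out_dim):
    """Hypothetical generator: maps the 2k-dimensional input to out_dim values in [-1, 1]."""
    merged = latent + z                       # concatenation, 2k-dimensional
    return [max(-1.0, min(1.0, merged[i % len(merged)])) for i in range(out_dim)]

rng = random.Random(0)
x_func = [0.2, 0.9]                           # functional part, kept unchanged
x_nonf = [rng.random() for _ in range(8)]     # nonfunctional part, may be regenerated
x = J(x_func, x_nonf)                         # Eq. (1)

latent = F(x_nonf)
z = [rng.gauss(0.0, 1.0) for _ in range(len(latent))]   # k-dimensional Gaussian noise
x_bar = J(x_func, G(latent, z, out_dim=len(x_nonf)))    # Eq. (2)
```

Whatever G produces, the functional part of `x_bar` is identical to that of `x`, which is how the restriction in Section 3 is enforced by construction.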

Using these symbols and equations, the following paragraph introduces the training strategies used for GANCIDS. For each attack label and target model, GANCIDS is trained in a fixed order: first, G and D are trained to generate attack instances with high authenticity; second, based on the original dataset, C is trained to simulate the classifying ability of the target IDS; third, G is trained to generate attacks to bypass both C and the target IDS; finally, by using these generative attack instances and the original dataset, C is trained to improve its classifying ability on adversarial attacks. These four steps are named Training G and D, Simulation, Bypassing, and Surpassing.

4.2.1 Training G and D

For this part, only original attack instances are used to train G and D. First, every instance is separated into a functional part and a nonfunctional part; then, FE maps the nonfunctional part to a latent vector, a noise vector is sampled from a Gaussian distribution, and these two vectors are combined as the input of G. G then outputs a generative, nonfunctional feature vector, which is combined with the functional part from the original attack, forming a generative attack that is fed into D. Consequently, the loss function for G and D in this step is defined as follows:

$\displaystyle L(\theta_{G},\theta_{D})=\frac{1}{n}\sum\nolimits_{i=1}^{n}\log(1-D(\bar{x_{i}}))+\frac{1}{n}\sum\nolimits_{i=1}^{n}\log D(x_{i})$ (3)

The objective of D is to distinguish generated attack instances from attack instances in the original dataset (i.e., to increase the probabilities of original attacks being detected as true and generated attacks being detected as false); thus, it maximizes $L$ in Eq. (3). The objective of G is to generate instances that can bypass D (i.e., to reduce the probability of generated attacks being detected as false); thus, it minimizes $L$ in Eq. (3). Meanwhile, because $\theta_{G}$ affects only the generation of $\bar{x}$, the second term in Eq. (3) can be omitted. Furthermore, to speed up the training of G across all training steps, nonsaturating loss functions (i.e., $-\log(D(x))$), which provide a higher gradient than saturating loss functions (i.e., $\log(1-D(x))$), are used [27]. As a result, after combining with Eq. (2), the loss function of G can be rewritten as follows:

$\displaystyle L_{G}(\theta_{G})=-\frac{1}{n}\sum\nolimits_{i=1}^{n}\log(D(J(x_{i}^{\textit{func}},G(F(x_{i}^{\textit{nonf}}),z_{i}))))$ (4)

The whole process is outlined in Algorithm 4.2.1. The objective of this step is to choose suitable $\theta_{G}$ and $\theta_{D}$ to solve a minimax problem, similar to that used in training a vanilla GAN [6]:

$\displaystyle\min_{\theta_{G}}\max_{\theta_{D}}L(\theta_{G},\theta_{D})$ (5)

Algorithm 4.2.1: Training G and D
Initialize:
    $X^{O}$: Original attacks;
for each training epoch do
    for each batch in $X^{O}$ do
        Obtain real attacks $\{x_{i}\}_{i=1}^{n}$ from $X^{O}$
        Sample prior $\{z_{i}\}_{i=1}^{n}\sim N(0,1)$
        Gain $\bar{x}$ by Eq. (2)
        Update $\theta_{D}$ to maximize Eq. (3)
        Update $\theta_{G}$ to minimize Eq. (4)
    end for
end for
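The two quantities optimized in this step, Eq. (3) for D and the nonsaturating loss of Eq. (4) for G, can be evaluated directly from discriminator outputs. The sketch below does this in plain Python; the function names and the numeric D outputs are hypothetical, and no network is actually trained.

```python
import math

def discriminator_objective(d_real, d_fake):
    """Eq. (3): mean log(1 - D(x_bar)) + mean log D(x); D maximizes this value."""
    n = len(d_real)
    return (sum(math.log(1.0 - p) for p in d_fake) / n
            + sum(math.log(p) for p in d_real) / n)

def generator_loss(d_fake):
    """Eq. (4): nonsaturating loss, -mean log D(x_bar); G minimizes this value."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

d_real = [0.9, 0.8, 0.95]         # hypothetical D outputs on original attacks
d_fake_early = [0.1, 0.2, 0.05]   # early in training, D easily rejects generated attacks
d_fake_late = [0.6, 0.7, 0.55]    # later, generated attacks look more authentic

obj_early = discriminator_objective(d_real, d_fake_early)
obj_late = discriminator_objective(d_real, d_fake_late)
loss_early = generator_loss(d_fake_early)
loss_late = generator_loss(d_fake_late)
```

As the generated attacks become more convincing, the generator loss falls while the discriminator objective also falls, which reflects the opposing goals in the minimax problem of Eq. (5).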

4.2.2 Simulation

Before C and the target IDS are attacked, C is trained to acquire the basic classifying ability of the target IDS on the original dataset. The objective of C in this step is to minimize the binary cross-entropy loss function, where the output from the target IDS is treated as the ground truth for training C. The loss function $L_{C}$ is defined in Eq. (6), and the whole process is outlined in Algorithm 4.2.2.

$\displaystyle L_{C}(\theta_{C})=-\frac{1}{n}\sum\nolimits_{i=1}^{n}(1-T(x_{i}))*\log(1-C(x_{i}))-\frac{1}{n}\sum\nolimits_{i=1}^{n}T(x_{i})*\log(C(x_{i}))$ (6)

Algorithm 4.2.2: Simulation
Initialize:
    $X^{O}$: Original attacks;
    $X^{B}$: Benign instances;
Mix $X^{O}$, $X^{B}$ into $X$
for each training epoch do
    for each batch in $X$ do
        Obtain instances $\{x_{i}\}_{i=1}^{n}$ from $X$
        Gain predictions: $T(x_{i})$, $C(x_{i})$
        Update $\theta_{C}$ to minimize Eq. (6)
    end for
end for
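The loss in Eq. (6) is an ordinary binary cross-entropy in which the labels come from the black-box target IDS rather than from the dataset. A minimal Python sketch follows; the function name, the clamping constant `eps`, and the example outputs are assumptions made for illustration.

```python
import math

def simulation_loss(t_out, c_out, eps=1e-12):
    """Eq. (6): binary cross-entropy of C's outputs against the target IDS outputs."""
    n = len(t_out)
    total = 0.0
    for t, c in zip(t_out, c_out):
        c = min(max(c, eps), 1.0 - eps)   # clamp to avoid log(0); eps is an assumption
        total += (1.0 - t) * math.log(1.0 - c) + t * math.log(c)
    return -total / n

t_out = [1, 1, 0, 0]                  # hypothetical hard decisions of the target IDS
c_agree = [0.9, 0.8, 0.1, 0.2]        # C mostly agrees with the target IDS
c_disagree = [0.2, 0.1, 0.9, 0.8]     # C mostly disagrees with the target IDS

loss_agree = simulation_loss(t_out, c_agree)
loss_disagree = simulation_loss(t_out, c_disagree)
```

Only the predictions `t_out` of the target IDS are needed, so this step works without access to its parameters.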

4.2.3 Bypassing

After the Simulation step, C is a suitable attack target. G is then enhanced to generate attacks that both bypass D and confuse C: bypassing D means that an instance should obtain a value larger than 0.5 from D; confusing C means that the loss function of G should combine its original adversarial loss with a new loss term that is opposite to the loss of C. The loss function $L_{G}$ is defined as follows:

$\displaystyle L_{G}(\theta_{G})=-\frac{1}{n}\sum\nolimits_{i=1}^{n}\log D(\bar{x_{i}})-\frac{1}{n}\sum\nolimits_{i=1}^{n}T(\bar{x_{i}})*\log(1-C(\bar{x_{i}}))-\frac{1}{n}\sum\nolimits_{i=1}^{n}(1-T(\bar{x_{i}}))*\log(C(\bar{x_{i}}))$ (7)

There are three terms in this formula: the first term expresses the objective of G to bypass D; the remaining two terms are the binary cross-entropy of Eq. (6) with the labels inverted, in which the output from the target IDS is treated as the wrong label (the "ground false") to guide G to confuse the target IDS. Simultaneously, C is still trained to approach the classification ability of the target IDS; thus, the loss function of C is the same as Eq. (6) in the Simulation step. The entire process of this step is presented in Algorithm 4.2.3: for each outer epoch, G uses the original attacks to generate attack instances until the number of generated attacks reaches a certain number $N$. Then, these generative attacks are used across several inner epochs to train G and C. Finally, G is able to generate attacks that confuse both C and the target IDS.

Algorithm 4.2.3: Bypassing
Initialize:
    $X^{O}$: Original attacks;
for each training outer epoch do
    Create $X^{G}$ as an empty array
    while $|X^{G}|<N$ do
        for each batch in $X^{O}$ do
            Obtain $\{x_{i}\}_{i=1}^{n}$ from $X^{O}$
            Sample prior $\{z_{i}\}_{i=1}^{n}\sim N(0,1)$
            Gain $\bar{x}$ by Eq. (2)
            if $D(\bar{x_{i}})>0.5$ then
                $X^{G}.\textit{append}(\bar{x_{i}})$
            end if
        end for
    end while
    Shuffle $X^{G}$
    for each inner epoch do
        for each batch in $X^{G}$ do
            Obtain $\{\bar{x_{i}}\}_{i=1}^{n}$ from $X^{G}$
            Gain predictions: $T(\bar{x_{i}})$, $C(\bar{x_{i}})$
            Update $\theta_{C}$ to minimize Eq. (6)
            Update $\theta_{G}$ to minimize Eq. (7)
        end for
    end for
end for
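Two pieces of the Bypassing step lend themselves to a short Python sketch: collecting generated instances that D accepts until $N$ of them are available, and evaluating the generator loss of Eq. (7). All names, the scalar stand-in "instances", and the numeric outputs are hypothetical; as a simplification, the collection loop below also stops exactly at $N$ instead of finishing the current batch.

```python
import math
import random

def collect_generated(generate_batch, d_score, N):
    """Keep generating until N instances with D(x_bar) > 0.5 have been collected."""
    kept = []
    while len(kept) < N:
        for x_bar in generate_batch():
            if d_score(x_bar) > 0.5 and len(kept) < N:
                kept.append(x_bar)
    return kept

def bypassing_loss(d_out, t_out, c_out):
    """Eq. (7): adversarial term plus cross-entropy with the target IDS labels inverted."""
    n = len(d_out)
    adversarial = -sum(math.log(d) for d in d_out) / n
    confusion = -sum(t * math.log(1.0 - c) + (1.0 - t) * math.log(c)
                     for t, c in zip(t_out, c_out)) / n
    return adversarial + confusion

rng = random.Random(0)

def generate_batch():
    # Hypothetical generator: each "instance" is a single number in [0, 1).
    return [rng.random() for _ in range(8)]

# Hypothetical discriminator: the instance value itself is used as D's score.
X_G = collect_generated(generate_batch, d_score=lambda v: v, N=20)

# Instances flagged by the target IDS (T = 1): the loss of G falls as C's score drops.
loss_detected = bypassing_loss([0.8, 0.8], [1, 1], [0.9, 0.9])
loss_evading = bypassing_loss([0.8, 0.8], [1, 1], [0.2, 0.2])
```

The comparison at the end shows the intended direction of the update: when the target IDS outputs 1, G is rewarded for pushing C's output toward 0.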

4.2.4 Surpassing

By training on these generative attacks, C can perform better than the target IDS in terms of classification ability. In contrast to the Bypassing step, the real labels of the related instances are used as training targets. In this step, original instances and generative attacks are mixed to train C. The objective of C is to classify all instances correctly, and its loss function $L_{C}$ is as follows:

$\displaystyle L_{C}(\theta_{C})=-\frac{1}{n}\sum\nolimits_{i=1}^{n}(1-\hat{y_{i}})*\log(1-C(\hat{x_{i}}))-\frac{1}{n}\sum\nolimits_{i=1}^{n}\hat{y_{i}}*\log(C(\hat{x_{i}}))$ (8)

G is still trained to strengthen itself, but not in the same way as C. Instead, G is trained only by generative attack instances; thus, the label is always equal to 1 when training G. As a result, the loss function of G is simplified as follows:

$\displaystyle L_{G}(\theta_{G})=-\frac{1}{n}\sum\nolimits_{i=1}^{n}\log D(\tilde{x_{i}})-\frac{1}{n}\sum\nolimits_{i=1}^{n}\log(1-C(\tilde{x_{i}}))$ (9)

The entire process is illustrated in Algorithm 4.2.4. Although there are many similarities between the Bypassing and Surpassing algorithms, there are two key differences: the Surpassing algorithm uses real labels instead of the outputs from the target IDS, and it trains C with all types of instances instead of only with generative attacks. After this step, C should be able to distinguish these generative attacks.

Algorithm 4.2.4: Surpassing
Initialize:
    $X^{O},Y^{O}$: Original attacks with labels;
    $X^{B},Y^{B}$: Benign instances with labels;
for each training outer epoch do
    Create $X^{G},Y^{G}$ as empty arrays
    while $|X^{G}|<N$ do
        for each batch in $X^{O}$ do
            Obtain $\{x_{i}\}_{i=1}^{n}$ from $X^{O}$
            Sample prior $\{z_{i}\}_{i=1}^{n}\sim N(0,1)$
            Gain $\bar{x}$ by Eq. (2)
            if $D(\bar{x_{i}})>0.5$ then
                $X^{G}.\textit{append}(\bar{x_{i}})$; $Y^{G}.\textit{append}(1)$
            end if
        end for
    end while
    Mix all instances, labels into $(X,Y)$
    Mix $X^{G}$, $X^{O}$ into $X^{\prime}$
    for each inner epoch do
        for each batch in $X$ do
            Obtain $\{\hat{x_{i}}\}_{i=1}^{n}$, $\{\hat{y_{i}}\}_{i=1}^{n}$ from $(X,Y)$
            Gain Classifier prediction: $C(\hat{x_{i}})$
            Update $\theta_{C}$ to minimize Eq. (8)
        end for
        for each batch in $X^{\prime}$ do
            Obtain instances $\{\tilde{x_{i}}\}_{i=1}^{n}$ from $X^{\prime}$
            Gain Classifier prediction: $C(\tilde{x_{i}})$
            Update $\theta_{G}$ to minimize Eq. (9)
        end for
    end for
end for
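The relation between the generator losses of the Bypassing and Surpassing steps can be checked numerically: replacing the target IDS output in Eq. (7) with the constant label 1 leaves exactly the two terms of Eq. (9). The Python sketch below performs this check and also evaluates Eq. (8); the function names and all numeric outputs are hypothetical.

```python
import math

def surpassing_classifier_loss(y_true, c_out):
    """Eq. (8): binary cross-entropy of C against the real labels."""
    n = len(y_true)
    return -sum((1 - y) * math.log(1.0 - c) + y * math.log(c)
                for y, c in zip(y_true, c_out)) / n

def surpassing_generator_loss(d_out, c_out):
    """Eq. (9): every instance is an attack (label 1), so only log(1 - C) remains."""
    n = len(d_out)
    return (-sum(math.log(d) for d in d_out) / n
            - sum(math.log(1.0 - c) for c in c_out) / n)

def bypassing_generator_loss(d_out, t_out, c_out):
    """Eq. (7), repeated here only for the comparison below."""
    n = len(d_out)
    return (-sum(math.log(d) for d in d_out) / n
            - sum(t * math.log(1.0 - c) + (1 - t) * math.log(c)
                  for t, c in zip(t_out, c_out)) / n)

d_out = [0.7, 0.6, 0.9]      # hypothetical D outputs on attack instances
c_out = [0.4, 0.8, 0.3]      # hypothetical C outputs on the same instances

eq9 = surpassing_generator_loss(d_out, c_out)
eq7_with_label_one = bypassing_generator_loss(d_out, [1, 1, 1], c_out)

loss_c_good = surpassing_classifier_loss([1, 0], [0.9, 0.1])
loss_c_bad = surpassing_classifier_loss([1, 0], [0.1, 0.9])
```

Here `eq9` and `eq7_with_label_one` coincide, which is the simplification described in the text.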

In summary, the proposed framework is functionally complete and obeys the restriction on attack generation described in Section 3. Moreover, its training steps are arranged in a fixed order, and its loss functions are derived from those of the original GAN, which keeps the derivation easy to follow.

5. Experimental setup and results

The experiments in this paper are implemented for three purposes: to train the proposed GANCIDS, to test the capability and quality of the generative attacks, and to compare the classifying ability of the developed classifier C with that of the target IDS. The experiments were conducted on a 3.60 GHz Intel Core i9-9900K processor, 32 GB DDR3, and GeForce RTX 2080Ti graphics card using the Python programming language and a Jupyter notebook as the running platform.

5.1 Dataset preparation and preprocessing

CICIDS2017 is chosen as the experimental dataset because it contains data on modern network traffic and the most up-to-date and common attacks [28]. Although the original CICIDS2017 consists of 3,119,345 instances, 288,602 instances have a missing class label and 203 instances have missing information [29]. After these instances are removed, 2,830,540 instances remain; basic information on the resulting dataset is shown in Table 1. In this dataset, benign instances represent normal network traffic, whereas instances with other labels are attacks.

Table 1
Overall characteristics of CICIDS2017

Dataset	CICIDS2017	Dataset type	Multiclass
Release year	2017	Num instances	2,830,540
Num features	78	Categories	15
Features	1. Destination Port: target port number.
	2. Bwd Packet Length Std: standard deviation of the backward packet length.
	3. Fwd IAT Min: minimum forward inter-arrival time.
	…
Label	Benign, DDoS, DoS Hulk, DoS GoldenEye, DoS Slowloris, DoS Slowhttptest, Bruteforce, PortScan, FTP-Patator, SSH-Patator, Bot, Infiltration, XSS, SQL Injection, Heartbleed

Table 2

Sublabels of each attack group

Attack group	Sublabels
DDoS	DDoS
DoS	DoS Hulk, DoS GoldenEye, DoS Slowloris, DoS Slowhttptest
Bruteforce	Bruteforce, PortScan, FTP-Patator, SSH-Patator, Bot
Infiltration	Infiltration, XSS, SQL Injection, Heartbleed

Table 3

Functional features of each attack group

Attack group	Functional features	Attack group	Functional features
DDoS	Flow Duration, Bwd Packet Length Std, Average Packet Size, Packet Length Std, Flow IAT Std, ACK Flag Count	Bruteforce	PSH Flag Count, Flow Duration, Total Length of Fwd Packets, Init Win bytes forward, Packet Length Std, Subflow Fwd Bytes, Fwd PSH Flags
DoS	Flow Duration, Active Mean, Average Packet Size, Packet Length Std, Flow IAT Mean, PSH Flag Count, Idle Max	Infiltration	Subflow Fwd Bytes, Total Length of Fwd Packets, Flow Duration, Idle Mean, Active Mean, Init Win bytes backward, PSH Flag Count

The functional and nonfunctional features in CICIDS2017 were divided based on the work of Msika et al. [9]: first, four attack groups (DDoS, DoS, Bruteforce, and Infiltration) were selected, with details on the composition of these attack groups shown in Table 2; second, based on the results of feature selection by the dataset creators [28], the functional features for each attack group were selected (Table 3). In the experiments, for every attack group, the corresponding functional features never change; only the other features in CICIDS2017 are fed into FE and G in the training step.

Next, CICIDS2017 was divided into a training set and a testing set at an 80% to 20% ratio, respectively. Then, to extract basic information from these instances, the data were normalized: all feature values were mapped into the range from 0 to 1 by the following min-max formula, where $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the feature:

$\displaystyle x^{\prime}_{(i)}=\frac{x_{(i)}-x_{\min}}{x_{\max}-x_{\min}}$ (10)
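As an illustration of this normalization step, the sketch below scales one feature column, assuming the standard min-max form in which the denominator is the range $x_{\max}-x_{\min}$. The function name, the handling of constant columns, and the sample values are hypothetical.

```python
def min_max_normalize(column):
    """Scale a list of values to [0, 1] with (x - min) / (max - min)."""
    lo, hi = min(column), max(column)
    if hi == lo:                      # constant feature: avoid division by zero (assumption)
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

flow_duration = [3.0, 120.0, 57.0, 3.0, 998.0]   # hypothetical raw values of one feature
scaled = min_max_normalize(flow_duration)
```

In practice, the minimum and maximum would typically be computed on the training set and reused for the testing set, although the text does not state which choice was made here.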

5.2 Evaluation metrics

Certain performance metrics are commonly used by researchers to evaluate their models. In this paper, the following metrics are used for the binary classification of instances as attack (positive) or normal (negative):

• True positive (TP): the total number of correctly predicted attack samples.
• True negative (TN): the total number of correctly predicted normal samples.
• False positive (FP): the total number of normal samples predicted as attacks.
• False negative (FN): the total number of attack samples predicted as normal.
• Detection rate (DR), true positive rate (TPR), or recall: the ratio of the number of correctly detected attack samples to the total number of attack samples, that is, the number of correctly classified positive samples over the number of actually positive samples.

$\displaystyle\textit{DR}=\textit{TPR}=\textit{Recall}=\frac{\textit{TP}}{\textit{TP}+\textit{FN}}$ (11)

• False positive rate (FPR): the ratio of the number of normal samples incorrectly classified as attack samples to the total number of normal samples.

$\displaystyle\textit{FPR}=\frac{\textit{FP}}{\textit{FP}+\textit{TN}}$ (12)

DR is selected as a metric to assess the detecting capability of a model with attack samples, and FPR is selected to assess the stability of the model with normal samples. A model is said to have good performance on a specific type of attack when it possesses a high DR and a low FPR.
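The two selected metrics can be computed from predicted and true labels as in the following Python sketch, which treats attack (label 1) as the positive class and normal (label 0) as the negative class. The function name and the example labels are hypothetical.

```python
def detection_metrics(y_true, y_pred):
    """Return (DR, FPR) with attack = 1 as the positive class and normal = 0."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    dr = tp / (tp + fn) if tp + fn else 0.0      # Eq. (11)
    fpr = fp / (fp + tn) if fp + tn else 0.0     # Eq. (12)
    return dr, fpr

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # four attacks, six normal samples
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]   # one missed attack, one false alarm
dr, fpr = detection_metrics(y_true, y_pred)
```

In this example, three of the four attacks are detected (DR = 0.75) and one of the six normal samples raises a false alarm (FPR = 1/6).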

Table 4
DR (%)/FPR (%) from 16 target IDSes for different attack groups and structures

Attack group	DT	ADA	RF	DNN	Attack group	DT	ADA	RF	DNN
DDoS	98.3/0.2	98.4/0.3	98.6/0.1	98.3/1.0	Bruteforce	97.9/0.7	97.7/0.3	97.9/0.3	96.5/6.9
DoS	99.0/0.4	98.9/0.4	99.3/0.3	98.8/5.4	Infiltration	98.2/0.5	97.7/0.9	98.4/0.2	95.2/8.3

5.3 Target IDS preparation

For the original ML- or DL-based IDSes, 16 binary classification models were prepared, one for each combination of the four attack groups and four structures. For each attack group, anomaly detection IDSes based on DT, RF, Adaboost (ADA), and DNN were pretrained on the original CICIDS2017 training set, and their performance was measured by the DR for the corresponding attack group and the FPR on the original CICIDS2017 testing set. The results are collected in Table 4, which shows that these IDSes perform well on the original CICIDS2017 dataset. In the following experiments, these IDSes were set as the black-box target IDSes in GANCIDS; only their output feedback was available, while their parameters were hidden.

5.4 GANCIDS training process and analysis

Regarding the configuration of GANCIDS, the models in this framework are all based on a DNN (i.e., a neural network with at least one hidden layer); the activation function of the hidden layers for all models is LeakyReLU, the activation function of the last layer for FE, D, and C is the sigmoid function, and the activation function of the last layer of G is the tanh function. All layers in all models are linear; in addition, BatchNorm1d is applied to each hidden layer of FE, and dropout is applied to each hidden layer of D.

For training, the Adam optimizer was selected; FE, G, and D were trained with a learning rate of 0.001, and C was trained with a learning rate of 0.0002. $N$ was set to 10,000, the size of the latent space was set to 16, and the size of the Gaussian vector was set to 16. To find the optimal structure of GANCIDS (i.e., FE, G, D, and C), a grid search with 5-fold cross-validation was used to select the structure that possessed the highest DR for an adversarial attack. All hyperparameters configured for constructing and training GANCIDS obtained after the grid search are presented in Table 5. In this table, $k$ denotes the number of nonfunctional features, which differs between attack groups.
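The selection procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the grid values, the helper names (`k_fold_indices`, `grid_search`, `toy_score`), and the scoring function are hypothetical, and `toy_score` merely stands in for "train GANCIDS with this configuration on the training folds and return the DR on the validation fold". The folds are contiguous and unshuffled for simplicity.

```python
from itertools import product


def k_fold_indices(n_samples, n_folds=5):
    """Yield (train_idx, val_idx) pairs; fold sizes differ by at most one."""
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


def grid_search(grid, score_fn, n_samples, n_folds=5):
    """Return the configuration with the highest mean validation score."""
    best_config, best_score = None, float("-inf")
    keys = sorted(grid)
    for values in product(*(grid[key] for key in keys)):
        config = dict(zip(keys, values))
        scores = [score_fn(config, train_idx, val_idx)
                  for train_idx, val_idx in k_fold_indices(n_samples, n_folds)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_config, best_score = config, mean_score
    return best_config, best_score


def toy_score(config, train_idx, val_idx):
    """Hypothetical stand-in for the validation DR of a trained model."""
    return (1.0
            - abs(config["hidden_width"] - 64) / 128
            - abs(config["batch_size"] - 64) / 256)


# Hypothetical search grid; the real search space is not given in the text.
grid = {"hidden_width": [32, 64, 128], "batch_size": [32, 64]}
best, score = grid_search(grid, toy_score, n_samples=103)
print(best, score)
```

In practice, the scoring function would dominate the cost, since every configuration is trained and evaluated once per fold.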

Table 5
Hyperparameters configured for constructing and training GANCIDS after grid search

Hyperparameter	Parameter values or selections	Hyperparameter	Parameter values or selections
Batch size	64	Learning rate	0.001 (FE, G, D), 0.0002 (C)
Optimizer	Adam	N	10,000
Size of latent space	16	Size of vector from Gaussian sampling	16
Epochs for Training G and D	25	Epochs for Simulation	3
Epochs for Bypassing	5 outer, 15 inner	Epochs for Surpassing	10 outer, 15 inner
Sizes of FE	78, 128, 64, 32, 16, 1	Sizes of G	32, $k\times 0.5$ , $k\times 0.75$ , $k$ , $k\times 1.25$ , $k$
Activation functions of FE	LeakyReLU, LeakyReLU, LeakyReLU, LeakyReLU, LeakyReLU, Sigmoid	Activation functions of G	LeakyReLU, LeakyReLU, LeakyReLU, LeakyReLU, LeakyReLU, Tanh
Layer type of FE	Linear with BatchNorm1d	Layer type of G	Linear
Sizes of D	78, 32, 16, 1	Sizes of C	78, 128, 64, 32, 16, 1
Activation functions of D	LeakyReLU, LeakyReLU, LeakyReLU, Sigmoid	Activation functions of C	LeakyReLU, LeakyReLU, LeakyReLU, LeakyReLU, LeakyReLU, Sigmoid
Layer type of D	Linear with Dropout	Layer type of C	Linear
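The layer sizes in Table 5 can be turned into concrete layer shapes with a short Python sketch. Several points are assumptions rather than statements from the table: each size list is read as the input width followed by the successive layer widths, fractional widths of G are rounded up, and the helper names are hypothetical.

```python
import math

# Layer widths as listed in Table 5.
FE_SIZES = [78, 128, 64, 32, 16, 1]
D_SIZES = [78, 32, 16, 1]
C_SIZES = [78, 128, 64, 32, 16, 1]


def generator_sizes(k):
    """Widths of G from Table 5: 32, 0.5k, 0.75k, k, 1.25k, k.

    k is the number of nonfunctional features of the attack group.
    Rounding fractional widths up is an assumption.
    """
    hidden = [math.ceil(k * factor) for factor in (0.5, 0.75, 1.0, 1.25)]
    return [32] + hidden + [k]


def linear_shapes(sizes):
    """(in_features, out_features) of each consecutive linear layer."""
    return list(zip(sizes[:-1], sizes[1:]))


# Example with a hypothetical k = 20 nonfunctional features.
print(generator_sizes(20))     # [32, 10, 15, 20, 25, 20]
print(linear_shapes(D_SIZES))  # [(78, 32), (32, 16), (16, 1)]
```

Under this reading, the input width of 32 for G would match the concatenation of a 16-dimensional latent vector and a 16-dimensional Gaussian vector, although the table does not state this explicitly.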

For each attack group, first, FE was constructed by pretraining a binary model with the CICIDS2017 training dataset and deleting its last layer, as shown in Fig. 3. Then, the training methodology was applied to G, D and C step by step.

To analyze the training process, the losses of the models in GANCIDS, recorded every 100 batches in each training step, are collected in Fig. 5. Only the models based on the RF target model for DDoS attack detection are shown; they serve as representatives of the loss variation for the other target models and attack groups.

In Fig. 5a, the losses of G and D are shown for the Training G and D step, which is identical to the training step for a vanilla GAN. The two plots show that, in this step, the losses of G and D are both noisy because the two models were constructed to fool each other steadily; consequently, the two losses were still unstable after this step, as is typical of the losses for a vanilla GAN.

Figure 5.

Training losses of models in GANCIDS per 100 batches based on the RF target model and DDoS attack detection.

Figure 6.

Epochs for $X^{G}$ to reach $N$ in the Bypassing and Surpassing steps.

Figure 5b presents the loss of C during training to simulate the target IDS; the loss of C drops dramatically at the beginning and remains relatively low with slight variability, indicating that the classifying capability of the target IDS had been transferred to C to a certain extent.

Figure 5c shows the losses of G and C in the Bypassing step; the loss of G drops and remains near 1.9 with slight variability, whereas the loss of C is almost 0. These two losses indicate that G was trying to generate attacks to confuse C while C was still simulating the target IDS in this step.

Figure 5d shows the losses of G and C in the Surpassing step; the loss of G increases gradually with some variability, finally exceeding 25 after 100,000 batches, while the loss of C remains at nearly 0 with slight variation. These two losses indicate that it became increasingly difficult for G to generate attacks to confuse C while C was constantly learning from the correct labels in this step.

In Fig. 6, for every attack group, the number of epochs for $X^{G}$ (i.e., the generated attacks from G) to reach $N$ in Algorithms 4.2.3 and 4.2.4 is also recorded. In these two steps, each epoch generates $1,500\sim 2,000$ DDoS, DoS, or Bruteforce adversarial attacks but fewer than 1,000 Infiltration adversarial attacks. Hence, G has greater difficulty generating Infiltration attacks than the other three types of attacks; additionally, for every attack group, the generation speed does not change significantly from the Bypassing step to the Surpassing step.
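As a rough arithmetic check of the per-epoch figures above, the following Python sketch estimates how many epochs are needed for the generated attacks to reach $N=10{,}000$. It assumes a constant number of accepted attacks per epoch, which is a simplification, and the function name is hypothetical.

```python
import math

N = 10_000  # target number of generated attacks (Table 5)


def epochs_to_reach(n_target, accepted_per_epoch):
    """Smallest number of epochs whose cumulative yield reaches n_target."""
    if accepted_per_epoch <= 0:
        raise ValueError("accepted_per_epoch must be positive")
    return math.ceil(n_target / accepted_per_epoch)


# DDoS, DoS, Bruteforce: roughly 1,500 to 2,000 attacks per epoch.
print(epochs_to_reach(N, 2000))  # 5
print(epochs_to_reach(N, 1500))  # 7

# Infiltration: fewer than 1,000 attacks per epoch.
print(epochs_to_reach(N, 999))   # 11
```

Under this assumption, the first three attack groups would need about 5 to 7 epochs, whereas Infiltration would need more than 10.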

5.5 GANCIDS performance and analysis

For each attack group and each structure, the target IDS was trained 5 times, and DR was used to assess the detection capability of C on adversarial attacks and original attacks from the test dataset. As a reference, the DR of the target IDS was also tested. Figure 7 shows the corresponding average DRs of C and the target IDS on DDoS, DoS, Bruteforce, and Infiltration attacks.

From these four plots, many similarities between the attack groups can be found: first, all target models show great performance on original attacks, but their detection capabilities decrease sharply when facing adversarial attacks generated by GANCIDS, as shown by the drop in the DRs of the target models on adversarial attacks to 3.7%–55.5%; second, the C produced by GANCIDS performs excellently in classifying both generated attacks and original attacks, with average DRs of 99.2% and 97.9%, respectively.

In addition to attack detection, the performance of the developed C on normal examples was also tested; the average FPRs of each developed C were measured and are collected in Table 6, which shows that the developed C models still possess low FPRs and have good performances on normal instances.

Table 6
Average FPR (%) of generated classifiers for different attack groups and IDS target bases

Attack groups	DT	ADA	RF	DNN	Attack groups	DT	ADA	RF	DNN
DDoS	0.1	0.2	0.2	0.9	Bruteforce	0.3	0.3	0.4	3.2
DoS	0.3	0.6	0.5	5.5	Infiltration	1.1	0.9	0.6	7.7

Figure 7.

Average detection rates of the target IDS and classifier on each attack group.

Figure 8.

Construction of the FID calculating mechanism.

Figure 9.

FIDs of generated instances in training steps.

Integrating the information from Table 4, Table 6, and the four DR plots in Fig. 7, two interesting comparisons can be found: the DR of C on original attacks is lower than that of the target IDS, and the FPR of C is higher than that of the target IDS. From these two comparisons, it can be seen that the detection capability of C on the original CICIDS2017 (i.e., the original attack and benign instances) is slightly weaker than that of the target IDS, indicating that C trades off the ability to perfectly detect the original dataset in order to be able to powerfully detect adversarial attacks.

5.6 Frechet inception distance for generated attacks

The Frechet inception distance (FID) is a metric for evaluating the quality of generated sets and was specifically developed to evaluate the performance of GANs. In image processing, before the FID is computed, a pretrained feature extractor, such as Inception V3 [30], is applied to extract useful, condensed features from a set of generated and real images. For datasets in the IDS domain, however, direct application of Inception V3 as a pretrained feature extractor is unsuitable; instead, another transfer model, the Frechet inception distance feature extractor (FIDFE), is utilized. The construction of the FIDFE is almost the same as that of FE, but the FIDFE originates from a multiclass classifier pretrained on CICIDS2017. The construction of the whole FID calculation mechanism is shown in Fig. 8.
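For reference, the FID between two feature sets is commonly defined as shown below. The symbols are notational assumptions that do not appear in the original text: $\mu_{r}$ and $\Sigma_{r}$ denote the mean vector and covariance matrix of the FIDFE features of the original attacks, and $\mu_{g}$ and $\Sigma_{g}$ denote those of the attacks generated by G.

```latex
\mathrm{FID} = \lVert \mu_{r} - \mu_{g} \rVert_{2}^{2}
  + \operatorname{Tr}\!\left( \Sigma_{r} + \Sigma_{g}
  - 2\left( \Sigma_{r}\Sigma_{g} \right)^{1/2} \right)
```

The FID is zero when the two feature distributions have identical means and covariances, so lower values indicate generated attacks whose features are closer to those of the original attacks.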

In the experiments, to verify the quality of the generated attacks, this mechanism was used to record the FIDs of the attacks generated by G for each attack group for all training steps related to G: Training G and D, Bypassing and Surpassing. Figure 9 shows the FID of generated attacks in these training steps.

In this chart, for all attack groups, the FIDs start at nearly 500. When G is trained to bypass D in Algorithm 4.2.1, the FIDs drop dramatically and reach lower values. After 15 epochs, the FIDs are all under 100; specifically, the DDoS, DoS and BruteForce attack scores are under 20. Finally, the DDoS, DoS and BruteForce FIDs have approached 0, whereas the Infiltration score converges to approximately 50. This chart can also be explained as follows: the features generated by G are random and disordered at first; then, under the Training G and D step, the distribution of the nonfunctional features produced by G gradually approaches that of the original attacks, and G is ultimately able to generate attacks of high quality; in the Bypassing and Surpassing steps, G maintains the quality of its generated and strengthened attacks to bypass the target IDS and simultaneously allow C to catch up with the target IDS.

Overall, all FIDs remain low after training; compared with other FID experiments [19, 20], the Infiltration score remains acceptable; thus, the quality of the generated attacks is basically guaranteed.

6. Comparison with related works

The works in [7, 22] concentrate on attack generation only, with the objective of bypassing the IDS by using a GAN; in other words, these works only played the role of attackers. By comparison, GANCIDS is designed for not only adversarial attack generation but also improved IDS establishment, which includes concrete and systematic strategies for cybersecurity defenders.

Based on Section 3, feasibility-related features cannot be modified; however, the work in [23] did not mention any information regarding this aspect and fed the GANs with all features, which may lead the generated instances to lose feasibility. Meanwhile, in Ye’s research [24], the adversarial attacks detected by the ASD were generated by typical attack models such as the FGSM, and the restriction in Section 3 is also violated. In this work, functional and nonfunctional features are divided clearly.

Usama et al. [5] only applied the generator model and did not filter the generated traffic with the discriminator in the adversarial training phase, which may cause fake traffic to be fed into the classifier and affect the strengthening of the IDSes; in addition, no information about the authenticity of the generated traffic was provided. By comparison, to guarantee the authenticity of the generated attacks, the discriminator in this work was always utilized to filter the generated attacks from the generator; furthermore, the FIDs of the generated attacks in this work were also measured after filtration.

In addition, the works in [5, 22, 24] utilized NSL-KDD or KDDCup99; these two datasets are in fact too old to be applied because of their inability to reflect modern network traffic and attacks [31, 32]. Instead, this work uses CICIDS2017, a relatively suitable dataset for the current network environment.

Yin et al. [8] focused on botnet detection, whereas this work focuses on the detection of four other attack groups: DoS, DDoS, BruteForce, and Infiltration. SIGMA [9] focused on ways to strengthen the model from an existing IDS; however, sometimes the existing IDS was not trainable. Compared with SIGMA, GANCIDS establishes an improved IDS from the beginning. In addition, although reinforced models can achieve nearly 100% detection rates on generated attacks, no information about how they perform on the original datasets was given. By comparison, this work measured the detection capability of the models on the original dataset and found that C sacrificed the ability to perfectly detect the original dataset after adversarial training.

7. Conclusion and future work

In this paper, GANCIDS was proposed both to bypass target models selected from traditional ML- and DL-based classifiers and to produce improved classifiers to replace these traditional classifiers. Experiments showed that the traditional models, including DT, ADA, RF, and DNN, had average DRs ranging from 3.7% to 55.5% on adversarial attacks generated by GANCIDS; furthermore, classifiers developed by GANCIDS demonstrated strong performance in detecting both generated and original attacks, with an average DR of 98.4%, as well as low FPRs and stability when coping with normal instances. In addition, FIDs were utilized to verify the quality of the generated adversarial attacks; the quality of the generated DDoS, DoS, and BruteForce attacks was found to be good, whereas that of the Infiltration attacks was acceptable.

A number of changeable structures and researchable angles can yet be addressed with regard to GANCIDS. Future studies should consider the following points. First, only the original GAN was used in this work, and other widely used versions, such as the WGAN, are planned as future research objects. Second, only binary classifiers were used as the target models in this paper, and other versions of GANCIDS should be capable of bypassing multiclass or regression models. Third, other up-to-date datasets for generating adversarial attacks and corresponding improved classifiers are planned to be utilized.

Acknowledgments

This work is supported by the Major Science and Technology Special Project of Sichuan Province (No. 2018GZDZX0009) and the Introducing Program of Dongguan for Leading Talents in Innovation and Entrepreneur (Dongren Han [2018], No. 738).

References

1. Tiwari and Mishra, Review of intrusion detection system, International Journal of Scientific Research and Engineering Trends 5 (2019), 2018–2025.
2. Xin, Kong, Liu, Chen, Zhu, Gao, Hou and Wang, Machine learning and deep learning methods for cybersecurity, IEEE Access 6 (2018), 35365–35381.
3. Usama, Qadir and Al-Fuqaha, Adversarial attacks on cognitive self-organizing networks: The challenge and the way forward, in: Proceedings of the 43rd Annual IEEE Conference on Local Computer Networks, LCN Workshops 2018, 2019, pp. 90–97.
4. Kolosnjaji, Demontis, Biggio, Maiorca, Giacinto, Eckert and Roli, Adversarial malware binaries: Evading deep learning for malware detection in executables, in: European Signal Processing Conference, 2018-Septe, pp. 533–537.
5. Usama, Asim, Latif, Qadir and Al-Fuqaha, Generative adversarial networks for launching and thwarting adversarial attacks on network intrusion detection systems, in: 2019 15th International Wireless Communications and Mobile Computing Conference, IWCMC 2019, 2019, pp. 78–83.
6. Goodfellow, Pouget-Abadie, Mirza, Warde-Farley, Ozair, Courville and Bengio, Generative adversarial nets, in: Ghahramani, Welling, Cortes, Lawrence and Weinberger K.Q., eds, Advances in Neural Information Processing Systems, Curran Associates, Inc., Vol. 27, 2014, pp. 2672–2680.
7. Yan, Wang, Huang, Luo and Richard Yu, Automatically synthesizing DoS attack traces using generative adversarial networks, International Journal of Machine Learning and Cybernetics 10(12) (2019), 3387–3396.
8. Yin, Zhu, Liu, Fei and Zhang, An enhancing framework for botnet detection using generative adversarial networks, in: 2018 International Conference on Artificial Intelligence and Big Data, ICAIBD 2018, 2018, pp. 228–234.
9. Msika, Quintero and Khomh, SIGMA: Strengthening IDS with GAN and Metaheuristics Attacks, 2019, 1–11.
10. Chandrasekhar A.M. and Raghuveer, Confederation of fcm clustering, ann and svm techniques to implement hybrid nids using corrected kdd cup 99 dataset, in: 2014 International Conference on Communication and Signal Processing, 2014, pp. 672–676.
11. Mishra, Varadharajan, Tupakula and Pilli E.S., A detailed investigation and analysis of using machine learning techniques for intrusion detection, IEEE Communications Surveys and Tutorials 21(1) (2019), 686–728.
12. Szegedy, Zaremba, Sutskever, Bruna, Erhan, Goodfellow and Fergus, Intriguing properties of neural networks, in: 2nd International Conference on Learning Representations, ICLR 2014 – Conference Track Proceedings, 2014, pp. 1–10.
13. Akhtar and Mian, Threat of adversarial attacks on deep learning in computer vision: A survey, IEEE Access 6(August) (2018), 14410–14430.
14. Miyato, Dai A.M. and Goodfellow, Adversarial training methods for semi-supervised text classification, in: 5th International Conference on Learning Representations, ICLR 2017 – Conference Track Proceedings, 2017, pp. 1–11.
15. Arjovsky, Chintala and Bottou, Wasserstein generative adversarial networks, in: Precup and Teh Y.W., eds, Proceedings of the 34th International Conference on Machine Learning, Vol. 70 of Proceedings of Machine Learning Research, PMLR, 06–11 Aug 2017, pp. 214–223.
16. Jay, Renou J.-P., Voinnet and Navarro, Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks Jun-Yan, in: Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 183–202.
17. Miyato, Kataoka, Koyama, Yoshida and Networks, Spectral normalization for generative adversarial networks, 2018.
18. Karras, Laine and Aila, A style-based generator architecture for generative adversarial networks, in: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2019-June, pp. 4396–4405.
19. Guo, Chen, Shi and Tan, Auto-embedding generative adversarial networks for high resolution image synthesis, IEEE Transactions on Multimedia 21(11) (2019), 2726–2737.
20. Chen, Van Den Hengel and Tan, Scripted video generation with a bottom-up generative adversarial network, IEEE Transactions on Image Processing 29 (2020), 7454–7467.
21. Ravishankar, Thiruvenkadam and Venkataramani, Unsupervised anomaly detection with generative adversarial networks to guide marker discovery, International Conference on Information Processing in Medical Imaging 6533(9623) (2017), 622–632.
22. Lin, Shi and Xue, IDSGAN: Generative Adversarial Networks for Attack Generation against Intrusion Detection, 2018.
23. Lee J.H. and Park K.H., GAN-based imbalanced data intrusion detection system, Personal and Ubiquitous Computing, 2019.
24. Peng, Luo and Yan, Detecting Adversarial Examples for Network Intrusion Detection System with GAN, in: Proceedings of the IEEE International Conference on Software Engineering and Service Sciences, ICSESS, 2020-October, pp. 6–10.
25. Marcelino, Transfer learning from pre-trained models, Website, 2018. https://towardsdatascience.com/transfer-learning-from-pre-trained-models-f2393f124751.
26. Gupta, Transfer learning and the art of using pre-trained models in deep learning, Website, 2017. https://analyticsvidhya.com/blog/2017/06/transfer-learning-the-art-of-fine-tuning-a-pre-trained-model/.
27. Goodfellow, Nips 2016 tutorial: Generative adversarial networks, 12 2016.
28. Sharafaldin, Lashkari A.H. and Ghorbani A.A., Toward generating a new intrusion detection dataset and intrusion traffic characterization, in: ICISSP 2018 – Proceedings of the 4th International Conference on Information Systems Security and Privacy, 2018-Janua (Cic), pp. 108–116.
29. Panigrahi and Borah, A detailed analysis of CICIDS2017 dataset for designing Intrusion Detection Systems, International Journal of Engineering and Technology (UAE) 7(3.24 Special Issue 24) (2018), 479–482.
30. Szegedy, Vanhoucke, Ioffe, Shlens and Wojna, Rethinking the Inception Architecture for Computer Vision, in: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2016-December, pp. 2818–2826.
31. Creech and Hu, Generation of a new IDS test dataset: Time to retire the KDD collection, in: IEEE Wireless Communications and Networking Conference, WCNC, 2013, pp. 4487–4492.
32. Al Tobi A.M. and Duncan, Kdd 1999 generation faults: A review and analysis, Journal of Cyber Security Technology 2(3–4) (2018), 164–200.
