Abstract
Whereas traditional negative selection algorithms produce detectors randomly in the whole state space, the boundary-fixed negative selection algorithm (FB-NSA) non-randomly produces a layer of detectors closely surrounding the self space. However, the false alarm rate of FB-NSA is higher than that of many anomaly detection methods, and its detection rate is very low when normal data are close to the boundary of the state space. This paper proposes an improved FB-NSA (IFB-NSA) to solve these problems. IFB-NSA enlarges the state space and adds auxiliary detectors in appropriate places to improve the detection rate, and it uses variable-sized training samples to reduce the false alarm rate. We present experiments on synthetic datasets and the UCI Iris dataset to demonstrate the effectiveness of this approach. The results show that IFB-NSA outperforms FB-NSA and the other anomaly detection methods in most cases.
Introduction
Machine learning has been widely used to solve complex problems in engineering and science [24, 30]. As a significant branch of machine learning, anomaly detection plays an essential role and is widely applied in fields such as computing, finance, control systems, and medicine [2, 29].
As a typical anomaly detection algorithm, the negative selection algorithm (NSA) was inspired by immune tolerance in the T-cell maturation process of the biological immune system [3, 16] and was proposed by Forrest in 1994 [22]. It can recognize an unlimited number of abnormal samples after training on a limited number of normal samples, and it has been broadly applied in fault detection, fraud detection, intrusion detection, and health monitoring [1, 28].
In its early stages, the samples and detectors of NSA were encoded as binary strings [22]. The real-valued negative selection algorithm (RNSA) was proposed in 2003; its samples and detectors are represented by constant-sized hyperspheres [10]. To improve the detection efficiency and reduce the number of detectors (ND), a variable-sized hypersphere detector (V-detector) was proposed [32, 33]. Soon after, detectors of other shapes were proposed, such as the hypercube detector [11], the hyper-ellipsoid detector [13], and the multi-shaped detector [21]. The hypersphere detector significantly outperforms the others, so many detector generation methods have been proposed for it [23, 34]. A hypersphere detector is described by two parameters: its radius and the coordinates of its center. Although hypersphere detectors have achieved a great deal in various studies, they are generated randomly in the whole state space. Their size, location, and quantity differ in each training session, even for the same training data and parameters. This restricts the further development and application of RNSA, especially work on RNSA with continual or online learning ability.
This paper proposes an improved boundary-fixed negative selection algorithm (IFB-NSA). It non-randomly produces a layer of detectors closely surrounding the self space in an enlarged state space. Under the same parameters, the size, location, and quantity of its detectors are constant and depend only on the training data.
The remainder of this paper is structured as follows. Section 2 reviews the development of RNSA. Section 3 presents the necessity of improving FB-NSA. The model of IFB-NSA and the experimental analyses are provided in Sections 4 and 5, respectively. Section 6 presents conclusions and future work.
Related works
The samples and detectors of RNSA are described by hyperspheres, and its detectors are produced randomly in the whole state space. Its detection rate (DR) increases as the detectors cover more of the non-self space. However, the number of detectors increases as the detector coverage increases, and the running efficiency decreases as the number of detectors increases. To increase the detector coverage while reducing the number of detectors, many improved methods have been proposed.
Variable-sized training samples have been used to reduce the false alarm rate (FA) [14]. Boundary detectors have been allowed to cover part of the training samples to improve DR [8]. A penalty factor has been used to optimize the distribution of detectors [19]. To reduce the time complexity, an NSA with grid cells was proposed [28]. An efficient proactive anomaly detection and prevention system based on an artificial immune system was proposed, in which detectors evolve to give better coverage of the self and non-self spaces [18]. RNSA with evolutionary preference was proposed to ensure that the detectors cover the non-self space more effectively [26]. An NSA based on a reduced dataset with a filter/ranking feature selection technique was proposed for network anomaly detection [17]. A hybrid approach based on NSA and the clonal selection algorithm was proposed [9]; it is used to detect abnormal web traffic on the network and achieved better anomaly detection performance.
Although the methods mentioned above achieve better anomaly detection performance, their detectors are generated randomly. Negative selection algorithms that generate detectors non-randomly have been proposed in recent years. The interface detector with constant-sized boundary samples (I-detector) [5] and the boundary-fixed negative selection algorithm (FB-NSA) [4] are typical representatives. The I-detector, as shown in Fig. 1, consists of one or more closed hypersurfaces. It is described by constant-sized boundary samples and their position information. Abnormal samples reside outside the I-detector, and normal samples reside inside it. FB-NSA, as shown in Fig. 2, produces a layer of detectors closely surrounding the self space. Abnormal samples are within the detectors or on one side of them, and normal samples are on the other side. At present, there are three kinds of FB-NSA [4]: the boundary-fixed negative selection algorithm with variable-sized detectors (VFB-NSA), shown in Fig. 2a; the boundary-fixed negative selection algorithm with constant-sized detectors (CFB-NSA), shown in Fig. 2b; and the fine boundary-fixed negative selection algorithm (FFB-NSA), shown in Fig. 2c.

Fig. 1. I-detector.

Fig. 2. Different kinds of FB-NSA.
The size, location, and quantity of these detectors are fixed in each training session for the same self space (which is formed by the training samples) and the same parameters. This promotes the development of anomaly detection and fault diagnosis methods with continual learning abilities [5–7]. However, the experiments on synthetic datasets in reference [4] did not consider the case in which the self space is close to the boundary of the state space. When the self space is near the boundary of the state space and the parameter m is relatively small, the detectors cannot completely enclose the self space, as shown in Fig. 3a, which leads to a lower DR. The experiments on the Iris dataset in reference [4] also reveal this problem. With Versicolor taken as normal data and Setosa and Virginica taken as abnormal data, the detection rate is only 57% and 63.12% when all and half of the Versicolor data, respectively, are used to train FB-NSA (m = 3). When the parameter m is large enough, as shown in Fig. 3b, the FB-NSA detectors can completely enclose the self space, but the larger number of detectors leads to lower detection efficiency.

Fig. 3. The FB-NSA detectors with different parameter m in [0,1]^2.
Besides, the detective blind area (DBA) has a great influence on the DR of FB-NSA [4]: as the DBA increases, the DR of FB-NSA decreases. Moreover, the FA of FB-NSA is higher than that of many other anomaly detection methods at a similar DR.
Therefore, FB-NSA needs to be improved further to meet the needs of many fields. This paper presents an improved FB-NSA, IFB-NSA, which increases DR and decreases FA to some extent by extending the range in which detectors are produced, covering the detective blind areas with auxiliary detectors, and training the model with variable-sized training samples.
The FB-NSA detector is defined as [4]:
FB-NSA produces only a layer of detectors closely surrounding the self space in the state space, as shown in Fig. 2. The size, location, and quantity of the detectors do not change when the geometric shape of the self space and the parameter m are constant.
The effects of the radii of training samples
The self space consists of hyperspheres formed by the self samples and their radii. In the training stage, the training samples form the self space. In general, the radii of the training samples are constant. When the radius is relatively large, the training samples cover more of the non-self space, which can lead to a lower DR. When the radius is relatively small, there are many holes among the samples, which can lead to a higher FA.
The shape of the self space depends only on the outermost layer of samples. To reduce the FA of FB-NSA, variable-sized training samples based on their distribution can be used: the radii of the outermost layer of samples can be relatively small, and those of the other samples can be relatively large.
One way to calculate the radii of the training samples is based on their distribution density, which can be calculated from the distances between training samples. The global distribution and the local distribution around each training sample should be considered together. The global distribution can be obtained from the average distance over all training samples with Eq. (1). The local distribution of a training sample can be obtained from the average distance to its k nearest neighbors with Eq. (2).
Therefore, the radius of every training sample can be calculated with Eq. (3).
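The two distance statistics described above can be sketched as follows. This is only an illustration under stated assumptions: Euclidean distance is assumed, the global statistic is taken as the mean over all unordered pairs of samples, and the function names `global_mean_distance` and `local_mean_distance` are hypothetical. How the two values are combined into a radius is left to Eq. (3) and is not reproduced here.

```python
import math
from itertools import combinations


def global_mean_distance(samples):
    """Mean Euclidean distance over all unordered pairs of training samples.

    Assumption: this is one reading of "the average distance of all
    training samples"; the exact form of Eq. (1) may differ.
    """
    pairs = list(combinations(samples, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def local_mean_distance(samples, i, k):
    """Mean Euclidean distance from sample i to its k nearest neighbours."""
    dists = sorted(
        math.dist(samples[i], s) for j, s in enumerate(samples) if j != i
    )
    return sum(dists[:k]) / k


if __name__ == "__main__":
    # Four corners of the unit square as a toy training set.
    pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    print(global_mean_distance(pts))       # global density statistic
    print(local_mean_distance(pts, 0, 2))  # local statistic for sample 0
```

Both functions are quadratic in the number of training samples, which is acceptable for the sample counts used in the experiments (100 and 1 000) but would need a spatial index for much larger sets.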
The outermost layer of training samples is easy to find, because these are the samples nearest to the FB-NSA detectors and the ones used to calculate the detectors' radii.
The FB-NSA detectors are produced in the state space [0,1]^n, around the self space. When the self space is at a distance from the boundary of the state space and the parameter m is relatively small, the detectors can completely enclose the self space, as shown in Fig. 2. When the self space is near the boundary of the state space and the parameter m is relatively small, the detectors surround the self space incompletely, as shown in Fig. 3a, which leads to a lower DR.
One approach to this issue is to increase the parameter m and produce more detectors, but doing so decreases the detection efficiency. The number of FB-NSA detectors increases from 16 to 57 when the parameter m increases from 10 to 30, as shown in Fig. 3.
Another approach is to enlarge the state space, in other words, to expand the range in which FB-NSA detectors are produced. To reduce the amount of computation, the detectors can be produced in [–1/m, 1 + 1/m]^n, as shown in Fig. 4, where the parameter m is the number of segments per dimension and n is the dimension of the space. The ND of FB-NSA is 23 in this condition, so this approach produces fewer detectors than the first.

Fig. 4. The FB-NSA detectors in [–1/m, 1 + 1/m]^2 (m = 10, 23 detectors).
DBA has a great influence on the DR of FB-NSA: DR decreases as DBA increases. There are two kinds of DBA [4]. The first kind is located between the self space and the detectors. It is caused by the inherent characteristics of the hypersphere and cannot be eliminated. The second kind can appear when two adjacent detectors do not overlap enough, and it can be eliminated. Figure 5 shows the detective blind area in [0,1]^2. The dark gray area is the self space, and the other colored areas are DBA. Arrows denote the second kind of DBA; the rest is the first kind.

Fig. 5. The detective blind area.
When two adjacent detectors have different dimensions and do not overlap enough, the second kind of detective blind area is generated and the first kind is increased. When two adjacent detectors have the same dimension and do not overlap enough, only the first kind of detective blind area is increased; the second kind is not generated.
Consider the two adjacent detectors shown in Fig. 6. They have the same radius r, and their position information is {

Fig. 6. The second kind of DBA between same-sized adjacent FB-NSA detectors.
Consider the two adjacent FB-NSA detectors shown in Fig. 7. Their radii are r3 and r4 (r3 > r4), and their position information is {

Fig. 7. The second kind of DBA between different-sized adjacent FB-NSA detectors.
If there is an auxiliary detector between the two adjacent detectors when their distance is larger than

Fig. 8. The auxiliary detector.
However, FB-NSA detectors are produced from boundary hypercubes and their position information, so it is hard to obtain the position information of the auxiliary detector directly. It can instead be obtained as the union of the position information of the two adjacent detectors.
Besides, the FB-NSA detectors are based on the boundary hypercubes. The position of every detector is fixed, so adjacent FB-NSA detectors are easy to obtain by calculating the distance between them. The minimum distance of the two adjacent detectors is τ, and the maximum distance of the two adjacent detectors is
The auxiliary detector is defined as,
The auxiliary detectors are used to eliminate the second kind of detective blind area and decrease the first kind of detective blind area.
The model of IFB-NSA
IFB-NSA reduces FA by using variable-sized training samples and improves DR in two ways. The first is to expand the range in which detectors are produced. The second is to eliminate the second kind of detective blind area and reduce the first kind by means of auxiliary detectors.
The training process of IFB-NSA
The training process of IFB-NSA, which is based on that of FB-NSA, includes the following seven main steps.
Step 1: Set the parameters r base , m, and k.
Step 2: Calculate δ i according to Eq. (3).
Step 3: Calculate the distance between the self space and the boundary of the state space, divide the state space into uniform hypercubes, and calculate their center coordinates.
If this distance is greater than or equal to 2/m, the state space T is [0, 1]^n and is evenly divided into m^n hypercubes. If this distance is between 1/m and 2/m, the state space T is enlarged to [–1/m, 1 + 1/m]^n and evenly divided into (m + 2)^n hypercubes, and detectors are produced in [0, 1]^n. If this distance is less than 1/m, the state space T is enlarged to [–2/m, 1 + 2/m]^n and evenly divided into (m + 4)^n hypercubes, and detectors are produced in [–1/m, 1 + 1/m]^n.
Step 4: Determine whether or not each hypercube is empty.
The property of a hypercube is identified using r base + 0.5/m in the FB-NSA training stage, and using δ i r base + 0.5/m in this training stage.
Step 5: Determine the boundary hypercubes and generate the corresponding position information.
This is the same as in FB-NSA.
Step 6: Calculate the detectors' radii.
Set the radii of the outermost layer of training samples to r base . The radius r i of each detector can be obtained according to Definition 2.
Step 7: Generate the auxiliary detectors and encode their position information.
The center coordinates, radii, and position information of the auxiliary detectors can be generated according to Definition 1.
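The choice of state space in Step 3 can be sketched as below. This is a minimal illustration, not the authors' implementation: the function names `choose_state_space` and `hypercube_centers` are hypothetical, the distance between the self space and the boundary is taken as an input rather than computed, and the lower bound 1/m is assumed to belong to the middle case because the last case is stated as "less than 1/m".

```python
from itertools import product


def choose_state_space(dist_to_boundary, m):
    """Return (low, high, cells_per_dim) for the grid used in Step 3.

    The hypercube side length stays 1/m in every case; only the number
    of padding cells added on each side of [0, 1] changes.
    """
    if dist_to_boundary >= 2.0 / m:
        pad = 0          # T = [0, 1]^n, m cells per dimension
    elif dist_to_boundary >= 1.0 / m:
        pad = 1          # T = [-1/m, 1 + 1/m]^n, m + 2 cells per dimension
    else:
        pad = 2          # T = [-2/m, 1 + 2/m]^n, m + 4 cells per dimension
    return -pad / m, 1 + pad / m, m + 2 * pad


def hypercube_centers(low, cells_per_dim, m, n):
    """Center coordinates of all hypercubes of side 1/m in an n-dimensional grid."""
    side = 1.0 / m
    axis = [low + (i + 0.5) * side for i in range(cells_per_dim)]
    return list(product(axis, repeat=n))


if __name__ == "__main__":
    low, high, cells = choose_state_space(dist_to_boundary=0.15, m=10)
    centers = hypercube_centers(low, cells, m=10, n=2)
    print(low, high, cells, len(centers))
```

Because the number of hypercubes grows as (m + 2)^n or (m + 4)^n, enumerating every center as done here is practical only for low-dimensional data.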
The testing process of IFB-NSA
The testing process of IFB-NSA includes the following three main steps, which are the same as the testing process of FB-NSA.
Step 1: Calculate the distance between the testing sample t and the IFB-NSA detectors.
Step 2: Find the minimum distance d. If d ≤ r i ,
Step 3: Find the 2 detectors nearest to t, and decode their position information. If t simultaneously meets this position information,
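Steps 1 and 2 can be sketched as follows. The sketch assumes Euclidean distance to detector centers and assumes that a sample lying inside its nearest detector is treated as abnormal, which is consistent with the earlier statement that abnormal samples are within the detectors. The position-information check of Step 3 depends on the encoding defined in [4] and is deliberately left out. The names `nearest_detector` and `covered_by_nearest_detector` are hypothetical.

```python
import math


def nearest_detector(t, detectors):
    """Return (distance, radius) of the detector whose center is closest to t.

    `detectors` is a list of (center, radius) pairs, where center is a
    coordinate tuple with the same dimension as t.
    """
    return min((math.dist(t, center), radius) for center, radius in detectors)


def covered_by_nearest_detector(t, detectors):
    """Step 2 check: True when the minimum distance d satisfies d <= r_i.

    Assumption: a covered sample is reported as abnormal. Samples that are
    not covered still need the Step 3 position test, which is not
    implemented in this sketch.
    """
    d, r = nearest_detector(t, detectors)
    return d <= r


if __name__ == "__main__":
    dets = [((0.0, 0.0), 0.1), ((1.0, 1.0), 0.2)]
    print(covered_by_nearest_detector((0.05, 0.0), dets))
    print(covered_by_nearest_detector((0.5, 0.5), dets))
```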
Experiments
We tested IFB-NSA on synthetic datasets and the UCI Iris dataset to assess its performance and possible advantages. Our results were compared to those obtained by other anomaly detection methods. DR and FA [32] are defined as follows:
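A minimal sketch of the two rates, assuming the usual confusion-count definitions with abnormal samples as the positive class: DR = TP / (TP + FN) and FA = FP / (FP + TN), where TP and FN count abnormal samples reported as abnormal and normal, and FP and TN count normal samples reported as abnormal and normal. The function name `detection_metrics` is hypothetical.

```python
def detection_metrics(y_true, y_pred):
    """Return (DR, FA) from boolean labels, where True means abnormal.

    DR = TP / (TP + FN): fraction of abnormal samples that are detected.
    FA = FP / (FP + TN): fraction of normal samples flagged as abnormal.
    """
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            tp += 1
        elif truth:
            fn += 1
        elif pred:
            fp += 1
        else:
            tn += 1
    # Guard against an empty class so the function never divides by zero.
    dr = tp / (tp + fn) if (tp + fn) else float("nan")
    fa = fp / (fp + tn) if (fp + tn) else float("nan")
    return dr, fa


if __name__ == "__main__":
    truth = [True, True, True, True, False, False]
    pred = [True, True, True, False, True, False]
    print(detection_metrics(truth, pred))
```

With the test sets used in the synthetic experiments, the DR denominator is the 800 abnormal samples and the FA denominator is the 200 normal samples.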
Experiment on synthetic datasets
To demonstrate the detection performance of IFB-NSA, three types of synthetic datasets were used as the self space, as shown in Fig. 9.

Fig. 9. Different-shaped self spaces.
A total of 10 000 points were generated randomly in [0, 1]^2. The numbers of normal samples in the three datasets are 3 849, 3 774, and 3 544, respectively. Training samples are randomly selected from the normal samples, and testing samples are randomly selected from the remaining normal samples and the abnormal samples. The numbers of training samples (NTS) are 100 and 1 000. The number of testing samples is 1 000, including 200 normal samples and 800 abnormal samples. All experiments were repeated 100 times and the results averaged.
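The sampling protocol can be sketched as below. The self region used here, a disk of radius 0.35 centered at (0.5, 0.5), is a placeholder assumption and is not one of the shapes in Fig. 9; the names `make_split` and `in_disk` are hypothetical, and uniform sampling is assumed for "generated randomly".

```python
import random


def in_disk(p, center=(0.5, 0.5), radius=0.35):
    """Placeholder self region (assumption): a disk inside the unit square."""
    return (p[0] - center[0]) ** 2 + (p[1] - center[1]) ** 2 <= radius ** 2


def make_split(is_self, n_points=10_000, n_train=100,
               n_test_normal=200, n_test_abnormal=800, seed=0):
    """Draw points in [0, 1]^2 and split them as described in the text."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n_points)]
    normal = [p for p in points if is_self(p)]
    abnormal = [p for p in points if not is_self(p)]

    # Training samples come from the normal set; test normals come from
    # the remaining normal samples, so the two never overlap.
    rng.shuffle(normal)
    train = normal[:n_train]
    test_normal = normal[n_train:n_train + n_test_normal]
    test_abnormal = rng.sample(abnormal, n_test_abnormal)
    return train, test_normal, test_abnormal


if __name__ == "__main__":
    train, test_normal, test_abnormal = make_split(in_disk)
    print(len(train), len(test_normal), len(test_abnormal))
```

Repeating the call with 100 different seeds and averaging the resulting DR and FA would mirror the averaging described above.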
The parameter k is the number of nearest neighbors of a training sample and is used to calculate the adjustment coefficient δ of the training samples. The detection performance of IFB-NSA varies with k.
Figure 10 shows how the DR and FA of IFB-NSA change with the parameter k. DR changes little, because the variable-sized training samples have little effect on the shape of the self space and DR depends only on the self space. When NTS is relatively small, FA shows an overall downward trend. Many holes in the self space lead to a higher FA; the radii of the training samples increase as k increases, which reduces the holes. Once k reaches a certain value, the radii of the training samples change little, and so does FA. When NTS is relatively large, FA changes little, because there are few holes in the self space and they have little effect on FA. To reduce the computational complexity, k = 16 is used in this experiment.

Fig. 10. The effect of parameter k on the detection performance of IFB-NSA (r base = 0.02).
FB-NSA and IFB-NSA respond in the same way to changes in the parameters r s (r base ), m, and NTS, because they are based on the same principles. Their DR decreases as r s (r base ), m, and NTS increase. Their FA increases as m increases and decreases as r s (r base ) and NTS increase [4].
Because FB-NSA outperforms the other anomaly detection methods in most cases [4], this experiment compares only the performance of FB-NSA and IFB-NSA. The rate of change of detection rate (RCDR), the rate of change of false alarm rate (RCFA), and the rate of change of detector number (RCDN) are defined as follows:
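One plausible reading of these three quantities is a relative change with FB-NSA as the baseline, expressed in percent. The sketch below uses that form purely as an assumption; the sign convention and the normalization are not confirmed by the text, and the function name `rate_of_change` is hypothetical.

```python
def rate_of_change(ifb_value, fb_value):
    """Relative change of an IFB-NSA measurement with respect to FB-NSA, in percent.

    ASSUMPTION: RCDR, RCFA and RCDN all take the form
    (IFB-NSA value - FB-NSA value) / FB-NSA value * 100.
    A positive result means the IFB-NSA value is larger.
    """
    return (ifb_value - fb_value) / fb_value * 100.0


if __name__ == "__main__":
    # Applied to DR, FA and ND respectively with illustrative inputs.
    print(rate_of_change(0.9, 0.6))
    print(rate_of_change(0.05, 0.10))
    print(rate_of_change(23, 16))
```

Under this reading, a negative RCDR corresponds to the cases where the DR of IFB-NSA is lower than that of FB-NSA, and a negative RCFA corresponds to a reduced false alarm rate.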
Figure 11 shows RCDR for the different self spaces when all other settings are the same. Compared with FB-NSA, the DR of IFB-NSA is lower when r base = 0.04 and 0.05, as shown in Fig. 11b. In this case, the side length of the square is 1/40 = 0.025, which is less than r base , and this can generate many detective blind areas near the boundary of the state space. The DR of IFB-NSA is higher than that of FB-NSA in all other cases. This follows from the nature of IFB-NSA, which has more detectors than FB-NSA.

Fig. 11. RCDR of different self spaces.
Figure 12 shows the RCFA for the different self spaces when all other settings are the same. Compared with FB-NSA, the FA of IFB-NSA is higher when r base = 0, because many IFB-NSA detectors cover the self space. The FA of IFB-NSA is also higher than that of FB-NSA when r base = 0.01 and m = 20, as shown in Fig. 12a, 12c, 12e, and 12f. When the parameters m, r base , and NTS are relatively small, many holes can be generated in the self space, and IFB-NSA produces more detectors to cover these holes, leading to a higher FA. The FA of IFB-NSA is lower than that of FB-NSA in all other cases. On the one hand, the variable-sized training samples reduce the holes in the self space; on the other hand, the additional detectors of IFB-NSA improve the correct recognition rate.

Fig. 12. RCFA of different self spaces.
Figure 13 shows the RCDN for the different self spaces when all other settings are the same. Compared with FB-NSA, the ND of IFB-NSA is lower in some cases, as shown in Fig. 13a, 13c, and 13e. There are many holes in these cases, and FB-NSA produces detectors to cover them when the training samples have the same radius. When the training samples have variable radii, there are fewer holes, which leads to a smaller ND for IFB-NSA. The ND of IFB-NSA is higher in all other cases. The holes change little in these cases, and the higher ND follows from the nature of IFB-NSA.

Fig. 13. RCDN of different self spaces.
We present experiments on the UCI Iris dataset to show the advantages of IFB-NSA. Table 1 shows the comparison results. The Iris dataset includes 3 classes of 50 instances each. One class is used as normal data, and the others are used as abnormal data. The models were trained with all or half of the normal data. The results of FB-NSA and IFB-NSA were calculated only once when all the normal data were used for training, because the size, location, and quantity of the FB-NSA and IFB-NSA detectors are constant under the same conditions. The other experiments were repeated 100 times and the results averaged, with the parameter r (r base ) = 0.1. The results of IFB-NSA were obtained with k = 7, and the others are taken from previous literature [4, 32].
Table 1. Comparison between different anomaly detection methods using the Iris dataset
The top DR and FA are obtained by the I-detector. NSA and MILA have the largest ND. Compared with FB-NSA, the DR of IFB-NSA is higher when Versicolor or Virginica is used as normal data. In particular, DR rises by 59.7% and 49.1% when Versicolor data are used as normal data and all or half of them, respectively, are used to train the model, because the detectors of IFB-NSA can completely enclose the self space. However, the FA of IFB-NSA is higher than that of FB-NSA. This is because NTS is small and many of the training samples belong to the outermost layer of the self space, so the advantage of variable-sized training samples is not fully realized.
Compared with V-detector, the DR of IFB-NSA is higher in all cases. Because the detectors of V-detector are generated randomly, there are many holes in the non-self space, which leads to a lower DR. The FA of IFB-NSA is higher when Setosa or Virginica data are used as normal data and half of them are used to train the model, because the detectors of IFB-NSA cover more holes near the self space.
Taken together, IFB-NSA outperforms the other anomaly detection methods in most cases.
Conclusions
In this paper, we introduced an improved boundary-fixed negative selection algorithm (IFB-NSA) for anomaly detection. To improve its applicability and maintain a high DR, IFB-NSA produces detectors in an enlarged state space so that the detectors completely surround the self space. It generates auxiliary detectors to eliminate the second kind of detective blind area and reduce the first kind. At the same time, variable-sized training samples are used to train the model, leading to a lower FA.
We presented experiments on synthetic datasets and the UCI Iris dataset to show the advantages of IFB-NSA. In most cases, IFB-NSA has a higher DR and a lower FA than FB-NSA and other anomaly detection methods that produce detectors randomly.
IFB-NSA still needs to be improved. It will be necessary to combine it with other methods to improve its performance, for example to further reduce FA, select the parameter k automatically, and deal with high-dimensional data.
Acknowledgments
This work was sponsored by the National Natural Science Foundation of China (Grant No. 52075310, 51575331).
