Abstract
With the growth in the amount of data available today, the responsibility of keeping that data safe has become more pressing. To prevent confidential data from falling into the hands of an attacker, an effective system is needed that can classify network traffic as either an attack or normal. Intrusion Detection Systems (IDS) are designed for this task. Many machine learning algorithms have been used to develop efficient IDS, and these provide remarkable results. However, ensemble-based IDS using voting have been seen to outperform individual approaches (Support Vector Machine and ExtraTree). Since the voting methodology can combine both theoretically similar and different classifiers and produce a single classifier based on the majority characteristics, it proved to be better than the other ensemble-based techniques. In this paper, an ensemble IDS implementation is presented based on the voting ensemble method, using two algorithms, Support Vector Machine (SVC) and ExtraTree. The experiment is performed on the KDDCup99 dataset. The performance of the proposed method is evaluated by comparison with an unoptimized implementation of the same. The experiment, implemented in Python, achieved an accuracy of 99.90%.
Introduction
Introduction of intrusion detection systems
Internet usage is increasing day by day, and the current generation depends on the internet for its livelihood. Online shopping, financial transactions and social media all take place over the internet. Despite using the internet so heavily, most users are unaware of how much security it relies on and how easily their data can be misused. Whenever internet access is established, protecting information from any type of attack is of the highest importance. Therefore, two-level security is used in modern technology to protect information and assets.
Intrusion Detection Systems (IDSs) are used to detect, and help stop, attacks or malicious activity on a network or a host. Attackers can carry out malicious activity easily when the computer system contains vulnerabilities, for example weak encryption/decryption of messages or improper session management. An IDS is used to protect the confidential information on a computer system or network from the attacker [1].
An IDS continuously observes a computer system or network in order to detect malicious activity, analyses it, and sends an alert in real time to the administrator who operates the system or network [2]. An effective IDS should perform most of its tasks without needing instructions every time [3]. Apart from automating these tasks, it should be able to recover from system crashes by itself. A high detection rate and a low false positive rate are also the main features of an effective machine learning based IDS [4]. Firewalls and other security systems are well versed with some attacks and are programmed to stop those attacks from occurring, but any activity that is alien to them is completely ignored. Hence, they are not as secure as they are considered to be, which creates a need for a system that also provides security against unknown attacks [5, 6].
A basic representation of an IDS is depicted in Fig. 1 [7].

Basic Intrusion detection system.
Intrusion detection systems are classified into two major categories: signature based and anomaly based. Signature-based intrusion detection detects an intrusion by matching activity against existing data on known attacks, whereas anomaly-based intrusion detection identifies patterns by examining the data itself. Intrusions take place because of vulnerabilities present in the system. Efficient intrusion detection is therefore a very important factor in network security [8].
These two basic techniques are described below.
–Signature-based IDS
The system has a pre-defined library of attack patterns. Whenever it encounters an activity, it matches that activity against the attacks defined in the library. If a match is found, it classifies the activity as an intrusion and takes the necessary steps against it. The disadvantage of this IDS is that it can only identify the attacks that are listed in its library; any other attack is treated as normal activity.
–Anomaly-based IDS
In the case of an anomaly-based IDS, the system monitors its own past activity and uses it as a reference against which the current activity is compared. When the system encounters unusual behaviour, it marks it as an intrusion. The disadvantage of this IDS is that an intruder can learn to carry out the intrusion using only actions that are already present in the system, so that the intrusive activity looks like normal activity. In that case, the system will not be able to detect that an intrusion is being executed.
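The contrast between the two techniques can be illustrated with a minimal Python sketch. It is not taken from any cited system: the signature library, the event fields and the three-standard-deviation threshold are all hypothetical choices made for illustration.

```python
from statistics import mean, stdev

# Hypothetical signature library: each signature is a set of (field, value) pairs.
SIGNATURES = {
    "syn_flood": {("protocol", "tcp"), ("flag", "S0")},
    "ftp_bruteforce": {("service", "ftp"), ("login_failed", True)},
}

def signature_match(event):
    """Return the name of the first signature fully contained in the event, else None."""
    items = set(event.items())
    for name, pattern in SIGNATURES.items():
        if pattern <= items:          # every pair of the pattern occurs in the event
            return name
    return None

def is_anomalous(value, history, threshold=3.0):
    """Flag a value lying more than `threshold` standard deviations from the past mean."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold

# An event outside the library is not matched, which is the weakness described above.
print(signature_match({"protocol": "tcp", "flag": "S0", "service": "http"}))
print(signature_match({"protocol": "udp", "flag": "SF"}))
print(is_anomalous(500, [100, 110, 95, 105, 90]))
```

The signature check can only return names that are already in the library, whereas the anomaly check needs no list of attacks but depends entirely on how representative the recorded history is.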
Ensemble technique
Ensemble-based learning is the use of multiple learning algorithms at the same time. It amalgamates multiple similar or non-similar classifiers into one model whose combined efficiency and accuracy are typically better than those of any individual model. The working of the ensemble technique is illustrated in Fig. 2. It provides a highly efficient solution to the problem of low accuracy rates, and it helps an IDS by not being specific to certain attacks. The use of ensemble techniques helps an IDS to detect unknown attacks to a certain degree.

Ensemble based learning.
There are multiple techniques under the umbrella of ensemble-based learning. They differ in implementation, but the primary goal of each is to achieve better predictions as a combined methodology.
The ensemble techniques are bagging, stacking, boosting and voting.
Bagging
Bagging is also known as Bootstrap Aggregation. Suppose there is a dataset ‘D’ and four base classifiers M1, M2, M3 and M4. A different sample of data, drawn from D, is provided to each of these classifiers, and each base classifier is trained on the sample given to it. On testing, each classifier produces its outcome individually, and the final decision is taken by majority voting over these outcomes.
Bagging is a well-known ensemble technique. Zhao [9] used a bagging ensemble technique on base classifiers, as did Patel et al. [10] and Gaikwad et al. [11].
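The bagging procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the base learner `train_stump` is a hypothetical one-feature threshold rule (it assumes that class 1 has the larger feature values), and the toy dataset is invented; none of this is taken from the cited works.

```python
import random
from collections import Counter

def train_stump(sample):
    """Hypothetical base learner: predict 1 when x exceeds the midpoint of the class means."""
    ones = [x for x, y in sample if y == 1]
    zeros = [x for x, y in sample if y == 0]
    if not ones or not zeros:          # degenerate bootstrap sample: constant prediction
        label = 1 if ones else 0
        return lambda x: label
    threshold = (sum(ones) / len(ones) + sum(zeros) / len(zeros)) / 2
    return lambda x: 1 if x > threshold else 0

def bagging_fit(data, n_models=4, seed=0):
    """Train each base classifier on its own bootstrap sample of the dataset."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        sample = [rng.choice(data) for _ in data]   # sampling with replacement
        models.append(train_stump(sample))
    return models

def bagging_predict(models, x):
    """Majority vote over the individual predictions."""
    votes = Counter(m(x) for m in models)
    return votes.most_common(1)[0][0]

# Toy dataset D: (feature value, label), with label 1 standing for an attack.
D = [(float(i), 0) for i in range(1, 6)] + [(float(i), 1) for i in range(8, 13)]
models = bagging_fit(D, n_models=25)
print(bagging_predict(models, 0.0), bagging_predict(models, 20.0))
```

The default of four models mirrors M1 to M4 in the text; an odd number of models avoids ties in the vote.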
Stacking
Stacking is another widely used ensemble technique. It uses two levels of classifiers: the first-level classifiers are the individual learners, and the second-level classifier is the meta-classifier. Once the first-level classifiers are trained, their outputs are provided to the meta-classifier as features, and the meta-classifier is then trained on them. After this, the final prediction is made. Roy et al. [12] used stacking for their ensemble, and Abirami et al. [13] obtained an accuracy of 95% with their stacking model.
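The two-level structure can be illustrated with the following sketch. It makes strong simplifying assumptions: the first-level learners are fixed, hypothetical threshold rules rather than trained models, and a simple lookup table stands in for the meta-classifier (the cited works use learned models such as logistic regression). In practice the meta-classifier is usually trained on held-out predictions.

```python
from collections import Counter, defaultdict

def make_threshold_rule(feature, threshold):
    """Hypothetical first-level learner: predicts 1 when one feature exceeds a threshold."""
    return lambda row: 1 if row[feature] > threshold else 0

def fit_meta(level_one, rows, labels):
    """Second-level learner: for each tuple of first-level outputs, keep the majority label."""
    table = defaultdict(Counter)
    for row, label in zip(rows, labels):
        key = tuple(clf(row) for clf in level_one)   # first-level outputs used as features
        table[key][label] += 1
    default = Counter(labels).most_common(1)[0][0]   # fallback for unseen output tuples
    lookup = {key: counts.most_common(1)[0][0] for key, counts in table.items()}
    return lambda row: lookup.get(tuple(clf(row) for clf in level_one), default)

# Invented toy data: an attack (label 1) only when both features are high.
rows = [(0.1, 5), (0.2, 40), (0.9, 3), (0.8, 50), (0.7, 45), (0.3, 2)]
labels = [0, 0, 0, 1, 1, 0]
level_one = [make_threshold_rule(0, 0.5), make_threshold_rule(1, 20)]
stacked = fit_meta(level_one, rows, labels)
print(stacked((0.95, 60)), stacked((0.95, 1)))
```

Neither first-level rule is correct on its own here, but the meta-level learns which combination of their outputs corresponds to an attack.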
Boosting
Boosting is another widely used ensemble technique. Suppose there is a dataset ‘D’. Here, the base learners are built sequentially, i.e. at the start only one base learner is created, trained on the dataset and then tested. The next base learner is then trained on the data that was wrongly classified by the previous base learner. This process repeats until the specified number of base learners has been created. Some boosting algorithms are AdaBoost, Gradient Boosting and XGBoost.
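As an illustration of sequential base learners, the sketch below implements a small AdaBoost-style procedure with one-dimensional decision stumps. Note the difference from the description above: rather than training the next learner only on the misclassified data, AdaBoost keeps all samples and increases the weights of the misclassified ones. The stump learner, the labels in {-1, +1} and the toy data are assumptions made for this example.

```python
import math

def best_stump(xs, ys, weights):
    """Pick the threshold/polarity pair with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            preds = [polarity if x >= threshold else -polarity for x in xs]
            err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best

def adaboost_fit(xs, ys, n_rounds=5):
    """Build `n_rounds` stumps sequentially; labels must be -1 or +1."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        err, threshold, polarity = best_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)       # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)     # vote strength of this learner
        ensemble.append((alpha, threshold, polarity))
        # Raise the weight of misclassified samples, lower the rest, then renormalise.
        preds = [polarity if x >= threshold else -polarity for x in xs]
        weights = [w * math.exp(-alpha * y * p) for w, y, p in zip(weights, ys, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def adaboost_predict(ensemble, x):
    """Sign of the alpha-weighted sum of the stump predictions."""
    score = sum(a * (p if x >= t else -p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1

ensemble = adaboost_fit([1, 2, 3, 4, 5, 6], [-1, -1, -1, 1, 1, 1])
print(adaboost_predict(ensemble, 0.5), adaboost_predict(ensemble, 5.5))
```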
Voting
The voting classifier is an ensemble-based classifier that can combine both theoretically similar and different classifiers and produce a single classifier based on the majority characteristics via voting. It combines the individual classifier outputs and takes a new vote among them to select the final prediction. Two techniques are used to count the votes of the classifiers, namely hard voting and soft voting [14].
–Hard Voting
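This process is also known as majority voting. The mode of the class labels predicted by the individual classifiers is calculated, and the label with the highest count is declared the voted winner.
The class label $\hat{y}$ is the mode of the labels predicted by the $m$ classifiers $C_1, \dots, C_m$ for an input $x$: $\hat{y} = \operatorname{mode}\{C_1(x), C_2(x), \dots, C_m(x)\}$.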
–Soft Voting
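In this type of voting, the prediction for the class label is based on the class probabilities predicted by the individual classifiers.
The class label $\hat{y}$ is the class $i$ with the largest weighted sum of predicted probabilities over the $m$ classifiers, $\hat{y} = \arg\max_i \sum_{j=1}^{m} w_j\, p_{ij}$, where $p_{ij}$ is the probability of class $i$ predicted by classifier $j$ and $w_j$ is the weight assigned to that classifier.
The output class is the one selected by the above formula.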
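The difference between hard and soft voting can be seen in a short Python sketch. The function names, the dictionary representation of the class probabilities and the example values are assumptions made for illustration, not part of the proposed implementation.

```python
from collections import Counter

def hard_vote(labels):
    """Majority (mode) of the predicted class labels; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]

def soft_vote(probabilities, weights=None):
    """Class with the largest weighted average probability.

    probabilities: one dict {class_label: probability} per classifier.
    """
    weights = weights or [1.0] * len(probabilities)
    total = sum(weights)
    classes = probabilities[0].keys()
    averaged = {
        c: sum(w * p[c] for w, p in zip(weights, probabilities)) / total
        for c in classes
    }
    return max(averaged, key=averaged.get)

# Three hypothetical classifiers: one is very confident, two are only mildly confident.
probs = [
    {"normal": 0.90, "attack": 0.10},
    {"normal": 0.40, "attack": 0.60},
    {"normal": 0.45, "attack": 0.55},
]
labels = [max(p, key=p.get) for p in probs]
print(hard_vote(labels), soft_vote(probs))
```

In this example the two schemes disagree: two of the three classifiers vote for an attack, but the averaged probabilities favour normal traffic because the first classifier is far more confident than the other two.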
The rest of the paper is organized as follows. Section 2 reviews the related work. Section 3 describes the proposed scheme. The results of the proposed scheme are presented in Section 4. The comparison and discussion are given in Section 5. Finally, the paper is concluded in Section 6.
Kavitha et al. [15] used a neutrosophic logic classifier and an improvised genetic algorithm to create an ensemble classifier. The KDDCup99 dataset was used for this purpose. The attacks were divided into four categories, namely Denial of Service (DoS), User to Root (U2R), Remote to Local (R2L) and Probe.
Real-world data is incomplete, inconsistent and redundant, or, in other words, fuzzy. Many methods can be used to handle such data, among them fuzzy theory and probability theory. The disadvantage of these methods is that they can handle only one or two of these flaws at a time, not all of them together. Neutrosophic logic was therefore introduced, as it can handle all these flaws under one framework. A false alarm rate of 3.19% was achieved by this work.
Govindaranjan et al. [16] used a Radial Basis Function and a Support Vector Machine for the ensemble. The dataset used for this work was NSL-KDD. Since the purpose of the work was to determine whether individual classifiers or the ensemble classifier perform better, the individual classifiers were evaluated first and the results of the ensemble technique were then compared with them. Individually, the Radial Basis Function gave an accuracy of 83.57% and the Support Vector Machine gave an accuracy of 83.58%, whereas the ensemble technique gave an accuracy of 85.19%.
Hui Zhao [9] obtained accuracies of 93.17% for DoS, 94.26% for Probing, 92.30% for U2R and 95.03% for R2L. In this work, a neighbourhood rough set was used for feature selection, and the parameters of the SVM were set using Particle Swarm Optimisation.
Patel et al. [10] used SVM and decision trees with the bagging technique for the ensemble approach. The boosting technique was also applied, and the results of both techniques were compared. It was found that the bagging ensemble technique performed better than boosting.
Roy et al. [12] used the KDDCup99 dataset. The attacks were classified into four classes: Probe, U2R, R2L and DoS. Smurf and Neptune were assigned to DoS, Satan and ipsweep to Probe, Warezclient to R2L, and buffer_overflow and rootkit to U2R. An accuracy of 82.7206% was obtained.
Abirami et al. [13] combined SVM, Random Forest, Naïve Bayes and Logistic Regression using stacking. Logistic regression is used as the meta-classifier and the others are the first-level classifiers. The predictions of the first-level classifiers are given as features to the meta-classifier, which then makes the final predictions. An accuracy of 95% is obtained from this model on the UNSW-NB15 dataset. An algorithm named “Least Square Support Vector Machine” is also developed, which gives an accuracy of 95% on the KDD Cup 99, NSL-KDD and Kyoto datasets.
Aburomman et al. [17] discussed Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and SVM for the model. The KDD99 dataset was used for this work. Data pre-processing involved two steps: first, symbolic features were converted into numeric ones; second, normalisation was performed. The attacks were grouped into the classes Probe, DoS, U2R and R2L. Feature extraction is needed because unnecessary features may lead to overfitting, and PCA and LDA are used for this purpose. Weighted majority voting is used for combining the classifiers.
Gaikwad [18] used REPTree and the bagging technique on the NSL-KDD dataset. Since dimensionality reduction was needed, the important features were selected from the feature set of the dataset. REPTree builds a number of regression trees and selects the best one for representation. The bagging method is used because the time taken to build a model with bagging is very low.
Jabbar et al. [19] used a random forest (RF) and the Average One-Dependence Estimator (AODE) for the ensemble. A vote is cast by each of the random forest’s trees, and the decision is then made by majority voting. AODE can be called a semi-Naïve Bayes method. It makes classes according to the individual predictions of single classifiers, all of which are based on the same parent. To show that the ensemble method outperforms the single-classifier approach, results were obtained from both the ensemble and the single-classifier models. Naïve Bayes gives an accuracy of only 82.5%, while AODE and RF give 89.68% and 89.34% respectively. RF-AODE outperforms them with an accuracy of 90.51%.
Abdulrahaman et al. [20] used a Multilayer Perceptron Neural Network (MPNN) and Sequential Minimal Optimization (SMO) for the ensemble. The Kyoto 2006+ dataset was used for this work. SVM is a supervised machine learning algorithm that finds a hyperplane in an N-dimensional space to classify the data points effectively, where ‘N’ is the number of features. SMO is an optimization algorithm used to train SVMs. MPNN gave a detection rate of 95.63%, while SMO gave a detection rate of 96.92%. The ensemble of SMO and MPNN provides an accuracy of 96.92%.
Khonde et al. [21] presented Random Forest (RF), SVM, ANN, Decision Trees (DT) and kNN as classifiers. The final prediction of the experiment is based on the combination of the results of these classifiers, which is performed using the weighted majority voting technique. For feature selection, the Gini index and variable importance methods have been adopted. The variable importance measures include the Variable Importance Index and the Probability Index. The ensemble method outperforms the individual methods by obtaining the highest accuracy, 98.97%, whereas the accuracies of ANN, DT, kNN, RF and SVM are 91.91%, 97.74%, 96.91%, 97.77% and 89.73% respectively.
The related work is summarised in Table 1 for the convenience of the reader.
Summary of related work
As noted in the introduction, firewalls and other security systems stop the attacks they are programmed for but ignore any activity that is alien to them, which creates a need for a system that also provides security against unknown attacks. The capability of the aforementioned techniques, and of other IDS implementations such as signature-based IDS, to find anomalies is much lower than that of the voting-based IDS implementation. Their accuracy in identifying attacks is also much lower than that of the proposed voting-based IDS. In addition, the voting-based IDS works more efficiently with real-time data.
In this paper, an ensemble-based IDS is presented that uses a voting classifier as the basis of its implementation. The proposed scheme is based on the Support Vector Machine algorithm and the ExtraTree algorithm, and voting is performed on the outputs of these two algorithms. The block diagram of the proposed scheme is shown in Fig. 3.

Block diagram of the proposed scheme.
Tavallaee et al. [22] analysed the KDD dataset, which was created by Stolfo et al. [18] using the data captured in the DARPA'98 IDS evaluation program. Every sample has 41 features and is labelled either as normal or as one of four categories of attack.
The dataset has been split into the ratio of 4:1, i.e., 80% for training and 20% of the dataset is used for testing the experiment.
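A 4:1 split of this kind can be sketched as follows. The helper name, the shuffling step and the fixed seed are assumptions for illustration; the paper does not state whether the records were shuffled before splitting.

```python
import random

def train_test_split(records, test_fraction=0.2, seed=42):
    """Shuffle a copy of the records and hold out the last `test_fraction` for testing."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)     # fixed seed keeps the split reproducible
    cut = round(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

train, test = train_test_split(range(100))
print(len(train), len(test))
```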
The aforementioned features fall into the following groups:
–Basic features: features that can be extracted from a TCP/IP connection without inspecting the payload.
–Same host traffic features: features evaluated over a time window of two seconds, using only those connections that have the same destination host as the current connection.
–Same service traffic features: features evaluated over a time window of two seconds, using only those connections that have the same service as the current connection.
–Content features: features used to detect suspicious behaviour in the data portion of a connection.
Data Pre-processing
Data collected from various sources is often considered to be “dirty”, meaning that the captured data can be inconsistent or poorly documented. Such defects are known as noise, which includes wrong values and duplicate records. Data pre-processing is performed to fix these errors: the data is cleaned by filling in missing values, its integrity is verified, multiple formats are converted into one, and finally the dimensions are reduced to simplify analysis.
During data pre-processing, the following steps have also been carried out in the implementation of the proposed scheme:
–Scaling: rescaling the values of the features to a comparable range, as required by the classifier.
–Feature selection: selecting the most valuable features for training the classifier, i.e. those that contribute most to an accurate prediction.
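Two typical pre-processing steps, the conversion of symbolic features to numeric codes and the scaling of numeric features, can be sketched as below. Min-max scaling and integer coding are assumed here as common choices; the paper does not specify which scaling or encoding method was used.

```python
def min_max_scale(column):
    """Rescale a list of numbers linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:                      # constant feature: nothing to rescale
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

def encode_symbolic(column):
    """Map each distinct symbolic value (e.g. a protocol name) to an integer code."""
    codes = {value: i for i, value in enumerate(sorted(set(column)))}
    return [codes[v] for v in column], codes

print(min_max_scale([0, 5, 10]))
print(encode_symbolic(["tcp", "udp", "icmp", "tcp"]))
```

Scaling matters in particular for SVM-type classifiers, whose decision function depends on distances between feature vectors, so that features with large numeric ranges would otherwise dominate.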
Training and testing
–Ensemble-based machine learning technique
The ensemble method is an advanced machine learning technique that combines multiple classifiers to provide a very high prediction rate, improving on the individual results of those classifiers. Ensemble-based learning amalgamates similar or non-similar classifiers to obtain better accuracy and predictive power than its individual constituents.
It provides highly efficient classifiers and an IDS that is not restricted to certain types of attacks. Ensemble-based learning is illustrated in Fig. 2.
–Voting
The voting classifier is an ensemble-based classifier that can combine both theoretically similar and different classifiers and produce a single classifier based on the majority characteristics via voting. It combines the individual classifier outputs and takes a new vote among them to select the final prediction. The working of the technique is illustrated in Fig. 4. Two techniques are used to count the vote of a classifier, namely (a) hard voting and (b) soft voting [21].

Voting classifier.
In the process of voting, different datasets are extracted from the same database, with a few changes, to give the classifiers heterogeneous training, which causes the classifiers to produce different outcomes. The final outcome is based on either the mode or the average of all the outcomes. In either case, the more often the same outcome occurs, the more likely it is to become the final outcome.
In majority voting, which is the most widely used voting technique, the binary outputs of all the classifiers are collected and the output with the highest number of votes is selected. There are other voting rules as well, such as taking the average of the probabilities, finding the mode of the probabilities, or multiplying the probabilities.
In Equation 1, for instance, let $LC_i(X)$ represent the largest probability, which is used to determine the output [24].
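The alternative combination rules mentioned above (average, product and largest probability) can be compared in a small sketch. The function `combine`, its rule names and the example probabilities are hypothetical and serve only to show that the rules need not agree.

```python
import math

def combine(probabilities, rule="average"):
    """Combine per-classifier class probabilities and return the winning class.

    probabilities: one dict {class_label: probability} per classifier.
    """
    classes = probabilities[0].keys()
    if rule == "average":
        score = {c: sum(p[c] for p in probabilities) / len(probabilities) for c in classes}
    elif rule == "product":
        score = {c: math.prod(p[c] for p in probabilities) for c in classes}
    elif rule == "max":
        score = {c: max(p[c] for p in probabilities) for c in classes}
    else:
        raise ValueError(f"unknown rule: {rule}")
    return max(score, key=score.get)

# One classifier is almost certain of an attack; three lean towards normal traffic.
probs = [
    {"normal": 0.01, "attack": 0.99},
    {"normal": 0.70, "attack": 0.30},
    {"normal": 0.70, "attack": 0.30},
    {"normal": 0.70, "attack": 0.30},
]
for rule in ("average", "product", "max"):
    print(rule, combine(probs, rule))
```

With these values the average rule follows the three moderate classifiers, while the product and maximum rules are dominated by the single near-certain classifier.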
The votes of the class members are weighted by a function that lowers the value of the vote of a classifier with lower accuracy on the training set. The algorithm is presented in Table 2 [25, 26].
Voting Algorithm [23]
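The idea of down-weighting less accurate classifiers can be illustrated with a minimal sketch of accuracy-weighted voting. It does not reproduce the algorithm of Table 2: using the training accuracy directly as the weight is an assumption, and the three rule-based classifiers and the data are invented.

```python
from collections import defaultdict

def training_accuracy(predict, rows, labels):
    """Fraction of training rows that a classifier labels correctly."""
    correct = sum(1 for r, y in zip(rows, labels) if predict(r) == y)
    return correct / len(labels)

def weighted_vote(classifiers, weights, row):
    """Add each classifier's weight to the label it predicts; return the heaviest label."""
    tally = defaultdict(float)
    for predict, w in zip(classifiers, weights):
        tally[predict(row)] += w
    return max(tally, key=tally.get)

rows = [1, 2, 3, 4]
labels = ["normal", "attack", "attack", "attack"]
classifiers = [
    lambda x: "attack" if x >= 2 else "normal",   # correct on every training row
    lambda x: "normal",                           # always predicts normal
    lambda x: "attack" if x >= 4 else "normal",   # too conservative
]
weights = [training_accuracy(c, rows, labels) for c in classifiers]
print(weights, weighted_vote(classifiers, weights, 3))
```

For the input 3, an unweighted majority vote would return normal (two votes against one), whereas the weighted vote follows the single classifier that was most accurate on the training set.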
The proposed system can detect several attack classes, which are as follows:
–Denial of Service attack (DoS): the attacker denies users access to the network by congesting it with a huge number of requests to the server.
–Probe attack (PRB): the attacker scans a computer or network to collect information about it in order to circumvent its security controls.
–User to Root attack (U2R): the attacker takes over the root control of a system through an infiltrated local user account.
–Remote to Local attack (R2L): an attacker without an account on a machine exploits its network connectivity to gain local access to it.
In addition to these attack classes, the proposed system can identify one more class, namely normal behaviour.
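Grouping the raw connection labels into these classes amounts to a simple lookup, sketched below. The table contains only the example attacks named for Roy et al. [12] in the related work, so it is deliberately incomplete; the trailing full stop that is stripped reflects the label format of the KDDCup99 files, and the "Unknown" fallback is an assumption of this sketch.

```python
# Partial, illustrative mapping; only the attacks named in the related work are listed.
ATTACK_CATEGORY = {
    "smurf": "DoS", "neptune": "DoS",
    "satan": "Probe", "ipsweep": "Probe",
    "warezclient": "R2L",
    "buffer_overflow": "U2R", "rootkit": "U2R",
    "normal": "Normal",
}

def to_category(raw_label):
    """Map a raw connection label to its class; unlisted labels are reported as 'Unknown'."""
    return ATTACK_CATEGORY.get(raw_label.rstrip(".").lower(), "Unknown")

print(to_category("smurf."), to_category("normal."), to_category("rootkit."))
```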
Empirical evaluation and results
The proposed system’s performance is evaluated by comparing it with an unoptimized implementation of voting on the SVC and ExtraTree algorithms. The proposed method is an optimized version of this voting ensemble. The proposed system was implemented in Python. The performance analysis of the proposed system is based on the generated classification report, which uses the following measures:
–Precision: the fraction of instances assigned to a class that actually belong to it, i.e. the ability of the classifier not to label a negative instance as positive.
–Recall: the fraction of instances of a class that the classifier finds, i.e. its ability to find all the relevant instances.
–F1 score: the harmonic mean of precision and recall, where the best score is 1 and the worst score is 0.
–Support: the number of occurrences of each class in the specified dataset.
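The four measures of the classification report can be computed per class from the true and predicted labels, as in the following sketch. It is written from the standard definitions and is not the code used to produce the reported tables; the example labels are invented.

```python
def classification_report(y_true, y_pred):
    """Per-class precision, recall, F1 score and support."""
    report = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        report[c] = {"precision": precision, "recall": recall, "f1": f1, "support": tp + fn}
    return report

y_true = ["dos", "dos", "dos", "normal", "normal"]
y_pred = ["dos", "dos", "normal", "normal", "dos"]
for label, scores in classification_report(y_true, y_pred).items():
    print(label, scores)
```

Accuracy, the headline figure reported in this paper, is the fraction of all predictions that are correct, which is 3 out of 5 in this toy example.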
Performance evaluation of unoptimized classifier on KDDCup99 Dataset
In this section, the performance of the unoptimized classifier based on SVC and ExtraTree is evaluated on the KDDCup99 dataset.
In Table 3, the classification report attained using the classifier is shown.
Classification report without optimization
The unoptimized system provides an accuracy of 81.31% for the KDDCup99 dataset.
For comparison, the performance of the proposed method is evaluated on the same KDDCup99 dataset. The resulting classification report is shown in Table 4.
Classification report of the proposed method
The proposed system provides an accuracy of 99.90% for the KDDCup99 dataset.
In this section, a comparative analysis based on the results is presented between the optimized and unoptimized implementations of the voting ensemble on the SVC and ExtraTree algorithms. The voting method is used because it provides a highly efficient solution to the problem of low accuracy rates and helps the IDS by not being specific to certain attacks. It combines the individual classifier outputs and takes a new vote among them to select the final prediction. The results generated by both implementations on the KDDCup99 dataset are given in Section 4 of this paper. The metrics used for the comparison are accuracy, precision, recall and F1-score.
Considering the results from Section 4, the unoptimized voting implementation achieved an accuracy of only 81.31%. These results are not acceptable in the present scenario, as the method competes with highly efficient classifiers like XGBoost and AdaBoost. Hence it had to be optimized, and the optimization raised the accuracy drastically, to 99.90%. The difference between the optimized and the unoptimized implementations of voting in IDS is shown in Fig. 5.

Comparison between unoptimized voting classifier and proposed scheme.
The proposed scheme is also compared with earlier work on the topic in terms of accuracy, graphically in Fig. 6 and in tabular form in Table 5.

Comparison of proposed scheme with related work.
Comparison of the proposed method with other machine learning algorithms
This paper reviews the data mining approaches for developing an efficient Intrusion Detection System. The ensemble methods for IDS are analysed, and it is found that the ensemble approach outperforms the individual approach in almost all cases. All the reviewed methods are robust in handling the large amounts of network traffic of the present time. Considering the results from Section 4, the unoptimized voting implementation achieved an accuracy of only 81.31%, which is not acceptable in the present scenario, as the method competes with highly efficient classifiers like XGBoost and AdaBoost. Hence it had to be optimized, and the optimization raised the accuracy drastically, to 99.90%. The IDS implementation based on the voting ensemble method, using the two algorithms SVC and ExtraTree, is robust enough to handle a large amount of network traffic and achieves a higher accuracy than the related work with which it is compared in Table 5.
In the future, the need for IDS will increase as cities become smart and the security of data becomes vital; an IDS that can work on real-time data streams will become a necessity. Such an IDS would make securing smart city appliances far more seamless and efficient.
