Abstract
In recent decades, network security has become increasingly important for organizations and individuals, and intrusion detection systems play a key role in protecting it. To improve detection performance, a variety of machine learning techniques have been applied with promising results. However, these methods are reliable only when enough well-labeled training data are available and the training and test data come from the same distribution. In practice, the limited labeled data generated by a single organization are not enough to train a reliable model, and data collected by different organizations rarely share the same distribution. In addition, organizations protect their privacy and data security by keeping their data in isolated data islands. Therefore, this paper proposes an efficient privacy-preserving intrusion detection method using transfer learning and support vector machines (FETLSVMP). FETLSVMP aggregates the data distributed across organizations through federated learning, then uses transfer learning and support vector machines to build a personalized model for each organization. Specifically, FETLSVMP first builds a transfer support vector machine model to address the differences in data distribution among organizations; then, under the federated learning mechanism, the model is trained without sharing the training data of any organization, which protects data privacy; finally, the intrusion detection model is obtained. Experiments are carried out on NSL-KDD, KDD CUP99 and ISCX2012. The results verify that the proposed method achieves better and more robust detection performance, especially for small samples and emerging intrusion behaviors, while protecting data privacy.
Introduction
Networks have become the foundation of society and modern life, and they store large amounts of data related to people’s private information and national security. Computer networks and the Internet are now fundamental components of society; they have contributed greatly to the economy and shaped people’s work and lifestyles [1]. At the same time, attacks on networks are increasing at an alarming rate. If a network is intruded upon or attacked, normal activities and national security are threatened. Network security has therefore become increasingly important, and cybersecurity is attracting the attention of a growing number of people [2, 3, 4]. Researchers have proposed and implemented many measures to protect networks from intrusion and attack, such as firewalls, digital signatures and Intrusion Detection Systems (IDS) [5]. As an emerging security defense technology, IDS [6, 7] improves the reliability and security of a system by detecting and responding to malicious behaviors, actively protects the network from illegal external attacks, and has become an important technical means of protecting cyberspace against network attacks and intrusions.
In recent years, with the rapid development of machine learning, deep learning and artificial intelligence, their application to intrusion detection has become a research hotspot in network security [8]. Li et al. [9] proposed an active transfer learning algorithm based on the support vector machine (SVM) that combines the advantages of transfer learning and active learning; experiments show the effectiveness of the algorithm. Cheng et al. [10] proposed a basic Extreme Learning Machine (ELM) method based on random features and a kernel-based ELM classification method, both of which outperform the SVM in training speed, testing speed and detection accuracy. Singh et al. [11] proposed an intrusion detection technique based on an online sequential extreme learning machine (OS-ELM), which uses alpha analysis to reduce time complexity and discards irrelevant features through feature selection based on filtering, correlation and consistency. Wang et al. [12] proposed a kernel-based extreme learning machine (KELM) with supervised learning capabilities to shorten the training cycle. Abdulla et al. [13] proposed a new ensemble construction method that creates classifier ensembles with higher accuracy in intrusion detection.
Although these methods perform well in intrusion detection and reduce the false positive rate and the miss rate, they still face several problems: (1) the labeled intrusion detection data generated by a single organization are limited in volume and diversity, whereas sufficient high-quality data are required because they directly affect the model [14]; (2) the data generated by different organizations follow different distributions, but machine learning achieves good results only when the data are independent and identically distributed, so directly aggregating the data of several organizations leads to poor detection performance [15]; (3) in reality, data usually exist as isolated islands: although each organization holds a wealth of data, the data are stored independently and cannot be pooled centrally for collection and sharing, so models trained only on the independent data of individual departments cannot reach a global optimum [16]; (4) data privacy and security are receiving more and more attention: each organization has a strict privacy policy to protect its own data, and exchanging data without explicit user approval is forbidden [17].
Recently, federated learning [18] has become one of the most promising directions in machine learning. Its purpose is to train collaboratively without sharing private data. Instead of gathering the training data for centralized computation, it transmits encrypted gradient-related data and uses multi-source data to collaboratively train the same model [19]. It allows traditional machine learning models to achieve better training results while ensuring data security and privacy, and offers distributed collaboration, good scalability, strong privacy protection and low cost. Federated learning has even been described as the last mile of artificial intelligence [17]. Since federated learning was proposed, it has been studied in areas such as edge computing [20, 21], wearable devices [22], privacy protection [24, 44], mobile keyboard prediction [25] and intrusion detection [26]. Another machine learning approach, distributed machine learning (DML) [45, 46, 47], has much in common with federated learning; for example, both use decentralized data sets and distributed model training. Many researchers regard federated learning as a special form of DML [48, 49, 50], or as the next stage in the development of DML. Compared with DML, however, federated learning has significant advantages in data decentralization, in breaking down data islands and in privacy protection. In this paper, building on these advantages, FETLSVMP, which is based on federated transfer learning and SVM, is proposed to address data islands, scarce labeled samples, data privacy protection and personalization in intrusion detection.
FETLSVMP uses federated learning [27] and homomorphic encryption [28] to build a powerful SVM model by aggregating data from independent institutions while protecting data privacy. It aggregates the data distributed across organizations through federated learning, and then uses transfer learning and SVM to build a personalized model for each organization. First, a transfer SVM model is constructed to address the differences in data distribution among organizations; then, under the federated learning mechanism, the model is trained without sharing the training data of any institution, which protects each institution's data privacy; finally, an intrusion detection model is obtained.
Our contributions are highlighted as follows:
To the best of our knowledge, we are the first to apply federated transfer learning and SVM to intrusion detection, and we propose FETLSVMP. It aggregates intrusion detection data from different organizations without sacrificing data privacy and security, and at the same time, through knowledge transfer, obtains a strong learning model that captures the individual behavior of each organization. The experimental results show excellent performance: compared with traditional machine learning methods, FETLSVMP achieves an accuracy of more than 99% on the Normal, Probe and DOS classes, which have many samples, and more than 70% on R2L and U2R, which have few samples, significantly better than the best benchmark algorithm. FETLSVMP therefore improves detection accuracy, especially for small samples and new intrusion behaviors, while also protecting privacy.
The rest of the paper is arranged as follows: Section 2 reviews the related works of federated transfer learning and SVM; in Section 3, an intrusion detection algorithm based on federated transfer learning and SVM is proposed; in Section 4, the effectiveness of the algorithm is verified on NSL-KDD, KDD CUP99 and ISCX2012; Section 5 summarizes the main work of this paper.
Related works
Federated transfer learning
Federated learning was first proposed by McMahan et al. [18] in 2016 and was used by Google to train machine learning models on mobile phones distributed around the world. Traditional machine learning algorithms require large amounts of high-quality data to be collected from various institutions and trained centrally on a cloud server. Federated learning instead allows each user to train the model on a local machine and upload the encrypted model to the server for aggregation; a global learning model is finally obtained through multiple iterations. This approach protects user privacy and avoids the uncontrollable data flows and sensitive data leakage that data aggregation can cause. The process of federated learning is shown in Fig. 1.
Process of Federated Learning.
It can be seen from Fig. 1 that the learning process of federated learning is as follows:
(1) Each organization downloads the global model from the central server; (2) each organization updates the model on its local data; (3) each organization uploads the locally updated model to the central server; (4) the central server performs a weighted aggregation of the received models to obtain the new global model.
In order to ensure data privacy, federated learning only allows all remote devices to exchange model gradients with a central server. In this process, each distributed device uses local data to train its own model, and then uploads the local model to the central server. After aggregating all the collected models, the server returns the new global model to each device.
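The weighted aggregation step on the central server can be sketched as follows. This is a minimal illustration rather than the paper's implementation: it assumes each local model is a flat parameter vector and that each organization is weighted by its share of the training samples; the function name `federated_average` is hypothetical, and encryption of the uploaded models is omitted.

```python
from typing import List, Sequence


def federated_average(local_weights: Sequence[Sequence[float]],
                      sample_counts: Sequence[int]) -> List[float]:
    """Weighted average of local model parameter vectors.

    Assumption: each organization is weighted by its share of the total
    number of training samples.
    """
    if len(local_weights) != len(sample_counts) or not local_weights:
        raise ValueError("need one sample count per local model")
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(local_weights[0])
    global_weights = [0.0] * dim
    for weights, count in zip(local_weights, sample_counts):
        if len(weights) != dim:
            raise ValueError("all local models must have the same dimension")
        share = count / total
        for j, w in enumerate(weights):
            global_weights[j] += share * w
    return global_weights


# Three organizations holding 100, 300 and 600 samples respectively.
local_models = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
global_model = federated_average(local_models, [100, 300, 600])
```

Weighting by sample count lets organizations with more data contribute more to the global model; other weighting schemes are possible.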
According to how the samples and the feature space are distributed, federated learning can be divided into three categories: horizontal federated learning, vertical federated learning, and federated transfer learning [27]. Horizontal federated learning suits the case where the features of two data sets overlap substantially but their users overlap little; vertical federated learning suits the case where the features overlap little but the users overlap substantially. Federated transfer learning [29] differs from both. It is used when both the users and the features of the two data sets rarely overlap: instead of segmenting the data, it uses transfer learning [30] (the transfer of knowledge from an existing domain to a new, related domain) to overcome the lack of data or labels, and it is often used to address differing feature spaces and scarce labeled samples. Therefore, the federated transfer learning formed by federated learning
SVM was formally published by Vapnik [33] in 1995 and is based on the VC dimension theory of statistical learning and the principle of structural risk minimization. The learning strategy of SVM is margin maximization, which can be formalized as a convex quadratic programming problem, so training an SVM amounts to solving a convex quadratic program. The basic SVM assumes that the training samples are linearly separable in the sample space or feature space, but linearly separable data are rare in practice. To relax this assumption, the SVM is allowed to make errors on some samples, which introduces the “soft margin”. Typical algorithms derived from SVM include [34, 35, 36, 37, 38].
Given training dataset
In Eq. (1),
The soft margin SVM handles linearly inseparable data through slack variables, but it requires the training and test samples to follow the same distribution. Moreover, for a target domain with only a small number of training samples, the soft margin SVM is not sufficient to obtain an accurate model. In this situation, transferring knowledge from a similar domain that has sufficient training samples can both speed up learning in the target domain and alleviate the loss of accuracy caused by the lack of training samples. In addition, special attention must be paid to negative transfer: once it occurs, the classifier obtained with the similar-domain knowledge may perform worse than one trained without it. This is one of the problems that this paper focuses on.
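To make the role of the slack variables concrete, the following sketch trains a linear soft margin SVM in its standard textbook form, minimizing 0.5·||w||² + C·Σ max(0, 1 − yᵢ(w·xᵢ + b)) by full-batch subgradient descent. It is an illustration under assumptions, not the solver used in the paper: the helper names, the learning rate and the toy data are all hypothetical, and the quadratic-programming dual is replaced here by a simple primal method.

```python
from typing import List, Sequence, Tuple


def soft_margin_objective(w: Sequence[float], b: float,
                          X: Sequence[Sequence[float]], y: Sequence[int],
                          C: float) -> float:
    """0.5 * ||w||^2 + C * sum of hinge losses (the slack variables)."""
    reg = 0.5 * sum(wj * wj for wj in w)
    slack = 0.0
    for xi, yi in zip(X, y):
        margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
        slack += max(0.0, 1.0 - margin)
    return reg + C * slack


def train_linear_svm(X, y, C=1.0, lr=0.01, epochs=200) -> Tuple[List[float], float]:
    """Full-batch subgradient descent on the soft margin objective."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)  # gradient of the regularizer 0.5 * ||w||^2
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # sample violates the margin: hinge is active
                for j, xj in enumerate(xi):
                    grad_w[j] -= C * yi * xj
                grad_b -= C * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x) -> int:
    """Return +1 or -1 according to the sign of the decision function."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Toy data: labels are +1 (e.g. attack) and -1 (e.g. normal).
X = [[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]]
y = [1, 1, -1, -1]
w, b = train_linear_svm(X, y)
```

A larger C penalizes margin violations more heavily, while a smaller C tolerates more slack in exchange for a wider margin.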
Definition of problems
Given data
Among them,
Framework of FETLSVMP
Framework of FETLSVMP.
FETLSVMP aims to achieve accurate detection of malicious network intrusions without sacrificing the privacy and security of data. Without loss of generality, we assume that there are 3 organizations (users) and 1 server, and more organizations can be expanded according to actual conditions. Figure 2 gives an overview of the framework.
The framework mainly includes four procedures: firstly, the cloud model on the server is trained according to the public data set; then, the cloud model is distributed to all organizations, and each organization can train their own models on their own data; subsequently, the model of each organization can be uploaded to the cloud server, and the new cloud model can be trained through model aggregation; finally, each organization can use cloud models and data as well as local data for training to build a personalized model. In the last step, because there is a huge distribution difference between the distributed server data and the data of each institution, it is necessary to adopt the method of transfer learning to perform probabilistic adaptation to obtain a model that is more suitable for each organization (as shown in Fig. 3, organization
Process of transfer knowledge.
It can be seen from Fig. 3 that domain knowledge plays an important role in the whole process. For example, firstly, the cloud model SVM is obtained by using the knowledge training of the source domain
The learning process of FETLSVMP involves model establishment and parameter sharing. After the cloud model is established, it can be applied directly to the various organizations. In practice, however, the samples on the server and the data of the individual organizations follow very different probability distributions. Traditional intrusion detection models therefore fail at personalization, whereas transfer learning can adapt to these distribution differences and so achieve personalization. In addition, because of the data privacy and security concerns of the various institutions, their models cannot be updated easily and continuously.
(1) Construction of transfer learning model
Federated learning solves the problem of data islands between institutions, so the data of all organizations can be used to build a cloud model that each organization can then use directly. However, because the data of each organization and the cloud data follow different distributions, the cloud model does not perform well for specific users; that is, it cannot provide users with personalized features. This paper therefore uses transfer learning to build a personalized model for each user (organization), as shown in Fig. 3. With the acquired cloud model parameters, transfer learning is performed for each user to learn a personalized model.
Server as a source domain
According to the principles of structural risk minimization and domain negative similarity minimization, the objective function constructed based on SVM is shown in Eq. (3.3):
In Eq. (3.3),
Since
Thus, Eq. (3.3) can be transformed into the following Eq. (3.3):
Introducing a slack variable in source domain, further transforming into Eq. (3.3):
The solution to Eq. (3.3) can be transformed into the dual problem in theorem 3.1.
Where
Substitute Eqs (9)
Simplify Eq. (3.3) to obtain Eq. (14):
Simplifying Eq. (13) yields a quadratic programming problem of Eq. (3.3).
The Eq. (3.3) is solved to obtain the optimal solution
The objective decision function can be obtained from Eqs (14) and (15):
It can be seen from Eqs (14) and (15) that the target domain classifier parameters use the model knowledge from the source domain; that is, the knowledge in the source domain is transferred into the target domain, where it effectively improves the performance of the target classifier. This means that the server can transfer its own knowledge to different users, helping each user obtain a personalized learning model.
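The idea of letting target-domain parameters draw on source-domain knowledge can be illustrated with a generic parameter-transfer SVM. The sketch below is not the paper's objective function, which is not reproduced here; it assumes a common formulation that adds a penalty 0.5·μ·||w − w_src||² pulling the target weights toward the source weights. The function names, the parameter `mu` and the toy data are hypothetical, and the bias term is omitted for brevity.

```python
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def train_transfer_svm(X_t, y_t, w_src, C=1.0, mu=1.0,
                       lr=0.01, epochs=300) -> List[float]:
    """Subgradient descent on the assumed objective
        0.5*||w||^2 + C*sum_i max(0, 1 - y_i * w.x_i) + 0.5*mu*||w - w_src||^2
    using only the few labeled target samples.
    """
    w = list(w_src)  # warm start from the source (cloud) model
    for _ in range(epochs):
        # Gradient of the regularizer plus the pull toward the source weights.
        grad = [wj + mu * (wj - sj) for wj, sj in zip(w, w_src)]
        for xi, yi in zip(X_t, y_t):
            if yi * dot(w, xi) < 1.0:  # hinge loss is active for this sample
                for j, xj in enumerate(xi):
                    grad[j] -= C * yi * xj
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


w_source = [1.0, 0.0]                  # hypothetical source-domain weights
X_target = [[0.0, 2.0], [0.0, -2.0]]   # two labeled target samples
y_target = [1, -1]
w_near = train_transfer_svm(X_target, y_target, w_source, mu=10.0)
w_free = train_transfer_svm(X_target, y_target, w_source, mu=0.0)
```

With a large `mu` the target model stays close to the source model, while `mu = 0` ignores the source knowledge entirely. Choosing `mu` too large when the domains differ substantially is one way negative transfer can arise.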
(2) Federated learning process
FETLSVMP uses federated learning to implement encryption model training and sharing. The learning process of federated learning is mainly composed of two key parts: cloud model learning and user model learning. The basic learning model is SVM.
Among them,
After obtaining the cloud model
The learning model of
In all organizational models,
Among them,
From Section 3.3, the learning process of FETLSVMP is given in Algorithm 1. The algorithm can continue to work as new organizational data emerge, updating the user model and the cloud model at the same time. Therefore, the longer FETLSVMP is used, the more personalized the model of each organization becomes, and the better the intrusion detection performance.
Experimental results
Experimental setting and evaluation criteria
All experiments in this paper are performed on a PC with an Intel Core (TM) 3.6 GHz processor, 8 GB RAM and the Windows 10 operating system. To verify the effectiveness and generalization performance of the proposed FETLSVMP algorithm in intrusion detection, three intrusion detection data sets, NSL-KDD, KDD CUP99 and ISCX2012, are used. The benchmark algorithms selected in the experiment are ELM [41], ACTrAdaBoost [42] and NBSVM [43], and the results in Table 4 [51] are added.
10-fold cross-validation is a standard method for evaluating machine learning algorithms, so this paper uses it to evaluate the proposed intrusion detection model. Specifically, the original data set is randomly partitioned into 10 mutually exclusive subsets of equal size. In each run, nine subsets are used to train the intrusion detection model and the remaining one is used to test it. Repeating this process 10 times gives each subset an equal chance of being used for training and for testing. The performance of the detection model is obtained by averaging the results on the test subsets, and the average over all experiments, each repeated ten times, is used as the final comparison result.
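The partitioning described above can be sketched as follows. This is a minimal illustration, not the evaluation code used in the experiments; the names `k_fold_indices` and `cross_validate`, the fixed random seed, and the `evaluate` callback (which would train on the training indices and return a score on the test indices) are assumptions.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0
                   ) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for each of k disjoint folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for i, test_fold in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test_fold


def cross_validate(n_samples, evaluate, k=10, seed=0) -> float:
    """Average the score returned by evaluate(train_idx, test_idx) over k folds."""
    scores = [evaluate(tr, te) for tr, te in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores)


splits = list(k_fold_indices(100, k=10))
```

Each sample appears in exactly one test fold, so every record is used nine times for training and once for testing.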
Commonly used evaluation indicators for detection include accuracy, precision, detection rate, false positive rate and miss rate. Accuracy is the proportion of correctly classified samples among all samples; the larger, the better. Precision is the proportion of true positive samples among all samples classified as positive; the larger, the better. The detection rate is the proportion of positive samples that are classified as positive among all positive samples, and the miss rate is the proportion of positive samples that are classified as negative. Precision and detection rate are a pair of conflicting indicators: the higher the precision, the fewer the false positives, and the higher the detection rate, the fewer the false negatives. Raising one typically lowers the other. In intrusion detection, the false positive rate is the proportion of negative samples misclassified as positive among all negative samples. The smaller this value, the better; a high false positive rate easily leads to a “cry wolf” effect.
The formal description of precision rate, detection rate, accuracy rate, false positive rate and miss rate is as follows:
Among them,
In this paper, the average accuracy rate, false positive rate and miss rate obtained by 10-fold cross-validation are used as the overall evaluation indicators to verify the effectiveness and accuracy of the algorithm.
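As an illustration of how these indicators are computed from predictions, the sketch below uses the standard confusion-matrix definitions for a binary problem (1 = intrusion, 0 = normal). The function name and the toy labels are assumptions; for the multi-class data sets, the same counts would be taken per attack type in a one-versus-rest fashion.

```python
from typing import Dict, Sequence


def detection_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Binary metrics with 1 = intrusion (positive) and 0 = normal (negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num: int, den: int) -> float:
        # Return 0.0 when a class is absent to avoid division by zero.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "detection_rate": ratio(tp, tp + fn),
        "false_positive_rate": ratio(fp, fp + tn),
        "miss_rate": ratio(fn, tp + fn),
    }


# Four intrusions (one missed) and six normal records (one false alarm).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = detection_metrics(y_true, y_pred)
```

Note that the detection rate and the miss rate always sum to one, since every positive sample is either detected or missed.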
Dataset
This section describes NSL-KDD, KDD CUP99 and ISCX2012 data sets and preprocesses them.
a. Dataset
ISCX2012 dataset
Researchers have noticed that the attack types considered in the KDD CUP99 intrusion detection data set are now out of date. In 2012, the Information Security Centre of Excellence (ISCX) of the University of New Brunswick (UNB) released an intrusion detection data set named ISCX2012 [32]. This data set contains seven days of original network traffic data, including normal traffic and four intrusion types: Infiltration, HttpDos, DDoS and BFSSH. See Table 1 for details. In the experiment, 2% of the data are selected from the training data set; the label information of most of these samples is removed to form the source domain data set, the remaining labeled samples form the target data set, and the two together constitute the training data set. Similarly, 1% of the data are taken from the test data set as the test data set.
Distribution of attack types in ISCX2012 dataset
KDD CUP99
KDD CUP99 is a widely used competition data set for intrusion detection provided by the Lincoln Laboratory of the Massachusetts Institute of Technology. It is among the most influential and credible intrusion detection data sets in academia [37]. The data set has about 5 × 10^6 records, each with 41 feature attributes and 1 class label. It contains about 38 attack types, of which 21 appear in the training data set and another 17 unknown attack types appear only in the test data set. This design is intended to test the generalization ability of the classifier model: the ability to detect unknown attack types is one of the important indicators for evaluating classifiers in intrusion detection applications.
To date, researchers have most often used the 10% KDD CUP99 data set (including a training data set and a test data set), which is a 10% sample of the full KDD CUP99 data set; it is also used in this paper. The 10% data set contains one Normal class and four major network attack types: DOS, Probing, U2R and R2L. In the two 10% data sets, the four attack types contain different numbers of attack behaviors. Table 2 details the 22 attack behaviors in the training data set and the 39 attack behaviors in the test data set; normal data are also counted as one type of attack in the table.
KDD 99 dataset
Distribution of attack types in KDD 99 dataset
To test whether an intrusion detection algorithm can recognize new attack behaviors after learning from the training data set, the test data set in Table 3 contains more new attack behaviors than the training data set. In Table 3, the proportion of Normal is basically the same in the two 10% data sets, but the proportions of the four attack types differ significantly. Because U2R and R2L account for very small proportions, most current detection algorithms have difficulty detecting these two types of attacks.
NSL-KDD
NSL-KDD [38] is an improved version of the KDD CUP99 data set: some duplicate records are removed, records of different classification difficulty levels are included, and the class sizes are more balanced, so it can serve as an effective benchmark data set for evaluating the detection ability of a model. The NSL-KDD data set includes 4 sub-data sets: KDDTrain
Distribution of attack types in KDD 99 dataset
b. Data preprocessing
The intrusion detection data sets contain non-numerical data, and the numerical attributes differ in scale, so the data must be converted into numerical form and brought to a common scale. The data preprocessing therefore consists of character type digitization, data cleaning and data normalization.
Character type digitization
The ISCX2012, NSL-KDD and KDD CUP99 data sets are processed in the same way. In each record, the symbolic features are converted into numerical data using 1-to-N encoding. Taking KDD CUP99 as an example, the 3 network connection types, 70 network service types, 23 attack types (including the normal type Normal) and 11 network connection states, all of character type, are converted into numerical values. The converted forms of the 11 network connection states are shown in Table 5; the other character types are handled similarly.
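A minimal sketch of 1-to-N encoding for one symbolic column is given below. The function names are hypothetical, and the sketch assumes that codes are assigned in order of first appearance and that values unseen during fitting map to 0; the actual code assignment in Table 5 may differ.

```python
from typing import Dict, Hashable, List, Sequence


def fit_label_encoding(values: Sequence[Hashable]) -> Dict[Hashable, int]:
    """Assign codes 1..N to the distinct values, in order of first appearance."""
    mapping: Dict[Hashable, int] = {}
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping) + 1
    return mapping


def encode_column(values: Sequence[Hashable], mapping: Dict[Hashable, int],
                  unknown: int = 0) -> List[int]:
    """Replace each symbol by its code; unseen symbols get the `unknown` code."""
    return [mapping.get(value, unknown) for value in values]


# The three protocol types found in KDD CUP99-style records.
protocols = ["tcp", "udp", "icmp", "tcp", "udp"]
protocol_codes = fit_label_encoding(protocols)
encoded = encode_column(protocols, protocol_codes)
```

Fitting the mapping on the training data and reusing it for the test data keeps the codes consistent between the two sets.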
Data cleaning
Real data sets are very vulnerable to noise, missing values and inconsistent data, because they are very large and mostly come from multiple heterogeneous data sources. Low-quality data lead to low-quality results, so the data must be cleaned.
Network connection type conversion
Duplicate records are identified and removed. Missing values are filled statistically: numerical attributes are filled with the mean, and categorical attributes with the mode (the most frequent category). Noisy data are handled by binning, a simple and commonly used preprocessing method that determines the final value by examining neighboring data. A bin is a sub-interval of the attribute's value range; an attribute value that falls within a sub-interval is said to be placed in the bin that the sub-interval represents. In this paper, the minimum entropy method is used to place the values to be processed (one column of attribute values) into bins, after which the data in each bin are examined and smoothed separately. Two main questions must be settled when binning is used: how to divide the bins and how to smooth the data in each bin.
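The cleaning steps can be sketched as follows. This is an illustration under assumptions rather than the paper's procedure: missing values are represented by `None`, all helper names are hypothetical, and the bins are formed by equal frequency and smoothed by bin means, which stands in for the minimum entropy binning mentioned above.

```python
from collections import Counter
from statistics import mean
from typing import List, Optional, Sequence


def drop_duplicates(rows: Sequence[Sequence]) -> List[list]:
    """Keep the first occurrence of each record, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(list(row))
    return unique


def impute_mean(column: Sequence[Optional[float]]) -> List[float]:
    """Fill missing numerical values (None) with the mean of the known values."""
    fill = mean(v for v in column if v is not None)
    return [fill if v is None else v for v in column]


def impute_mode(column: Sequence[Optional[str]]) -> List[str]:
    """Fill missing categorical values (None) with the most frequent category."""
    fill = Counter(v for v in column if v is not None).most_common(1)[0][0]
    return [fill if v is None else v for v in column]


def smooth_by_bin_means(column: Sequence[float], bin_size: int) -> List[float]:
    """Equal-frequency binning: replace each value by the mean of its bin."""
    order = sorted(range(len(column)), key=lambda i: column[i])
    smoothed = [0.0] * len(column)
    for start in range(0, len(order), bin_size):
        members = order[start:start + bin_size]
        bin_mean = mean(column[i] for i in members)
        for i in members:
            smoothed[i] = bin_mean
    return smoothed


smoothed = smooth_by_bin_means([4.0, 8.0, 15.0, 21.0, 21.0, 24.0], bin_size=3)
```

Smoothing by bin medians or bin boundaries are common alternatives to bin means.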
Average accuracy rate, false positive rate (%) and miss rate on NSL-KDD dataset (%)
Average accuracy rate, false positive rate (%) and miss rate on KDD 99 dataset (%)
Data normalization
After digitization, the continuous feature attributes in the data set are measured on different scales, which strongly affects the distances computed between records and, in turn, the accuracy of the results. To avoid this, the differences in scale between features are eliminated. For discrete features, normalization is used to map the values into [0, 1]; for continuous features, the Z-score method is used, as shown in Eqs (20) and (21).
Among them,
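For reference, the two rescaling schemes in their standard textbook forms can be sketched as follows. The sketch assumes min-max normalization, x' = (x − min) / (max − min), and Z-score standardization, z = (x − mean) / standard deviation, with the population standard deviation; whether these match Eqs (20) and (21) exactly is an assumption, and the function names are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def min_max_normalize(column: Sequence[float]) -> List[float]:
    """Map values linearly into [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def z_score_standardize(column: Sequence[float]) -> List[float]:
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]
    return [(v - mu) / sigma for v in column]


durations = [0.0, 5.0, 10.0]
scaled = min_max_normalize(durations)
standardized = z_score_standardize(durations)
```

Min-max normalization bounds the output to [0, 1], whereas Z-score values are unbounded and centered on zero.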
In this section, the experimental results of the algorithms on NSL-KDD, KDD CUP 99 and ISCX2012 datasets are analyzed to verify the effectiveness of the algorithm proposed in this chapter. In addition, the influence of the adjustable parameter
Average accuracy rate, false positive rate (%) and miss rate on ISCX2012 dataset (%)
Tables 6–8 report the average accuracy, false positive rate and miss rate of the algorithms on the NSL-KDD, KDD CUP99 and ISCX2012 data sets. The following conclusions can be drawn:
Sufficient training samples are the basis for training a high-accuracy classifier. On the NSL-KDD and KDD CUP99 data sets, the Normal, Probe and DOS classes have a large number of samples, and all algorithms achieve high accuracy on these three classes, above 95%. Similarly, on the ISCX2012 data set, the accuracy of all algorithms on the well-represented Normal, Infiltration and DDoS classes reaches 80%. On NSL-KDD and KDD CUP99, the attack types U2R and R2L have few samples, which are not enough for traditional intrusion detection algorithms to train a high-accuracy detection model, so their accuracy on these two types is low. FETLSVMP and ACTrAdaBoost are transfer learning algorithms that use knowledge from a large number of well-labeled intrusion detection samples to train detectors for U2R and R2L, so their detection rates for these types improve. Tables 6 and 7 show that FETLSVMP and ACTrAdaBoost improve the accuracy on U2R and R2L, especially R2L: the accuracy on R2L is above 70%, and on U2R above 46%. On the ISCX2012 data set in Table 8, the transfer learning algorithms FETLSVMP and ACTrAdaBoost are more accurate than ELM, NB-SVM, KNN Cubic and SVM-Linear on the less frequent attack types HttpDos and BFSSH. FETLSVMP therefore significantly improves the detection rate of attack types with few samples, such as U2R and R2L. In terms of false positive rate, Tables 6 and 7 show that for Normal, Probe and DOS on NSL-KDD and KDD CUP99, no algorithm exceeds 5%. Among them, FETLSVMP has the lowest false positive rate, below 4% in all cases. For U2R and R2L, the three non-transfer benchmark algorithms perform poorly.
The false positive rate of R2L and U2R on the NSL-KDD data set reaches 10% and 9% or more, and on the KDD CUP99 data set it exceeds 9%. FETLSVMP, however, performs relatively well on these two data sets, staying below 6%. On the ISCX2012 data set, Table 8 shows that FETLSVMP has the lowest false positive rate for all attack types. In terms of miss rate, Tables 6–8 show that FETLSVMP achieves the lowest value on all nine attack behaviors compared with the benchmark algorithms.
The experimental results show that on the KDD 99 and NSL-KDD data sets, accuracy of FETLSVMP for the five attack types is higher than the benchmark algorithm, and the accuracy of the attack type with a small number of samples has also been significantly improved; in ISCX2012 data set, FETLSVMP’s accuracy for the 5 attack types is also better than the benchmark algorithms.
Therefore, FETLSVMP improves the detection rate of all 9 attack behaviors, especially for the R2L attack behavior with sparse samples. No attack behavior suffers from an excessively low detection rate, and the detection rates do not differ widely across attack types, which effectively alleviates the class imbalance problem in attack type detection with machine learning algorithms. FETLSVMP also shows significant advantages in the false positive rate and the false negative rate. In other words, FETLSVMP achieves the best classification accuracy for all intrusions. This is because federated learning can indirectly use more of the information in the distributed data to train better models, and transfer learning makes the model more consistent with the characteristics of each organization; compared with traditional methods (ELM, NB-SVM, KNN Cubic and SVM-Linear), this greatly improves the recognition results. In short, the experiments demonstrate the effectiveness of FETLSVMP, and the proposed algorithm can also protect data privacy, which the other algorithms lack. The experimental results on the three data sets also show that FETLSVMP has better generalization performance.
Average training time on NSL-KDD, KDD CUP99 and ISCX2012(s)
Table 9 shows the average training time of the algorithms on the three intrusion detection data sets. Unlike the transfer learning algorithms ACTrAdaBoost and FETLSVMP, the non-transfer learning algorithms ELM, NB-SVM, KNN Cubic and SVM-Linear have no transfer learning process and do not need to process additional data, so their training time is relatively short. FETLSVMP needs to transfer the knowledge in the auxiliary data to help build the target learning task, so its training time is slightly longer.
It can be seen from Eq. (11) that the objective function of FETLSVMP includes the parameter
The sensitivity of parameter 
The sensitivity of parameter 
The sensitivity of parameter 
From Figs 4–6, the conclusions obtained through analysis are as follows:
Using the parameter grid search method provided by literature [39], determine the value of
In this paper, FETLSVMP, an intrusion detection method based on federated transfer learning and support vector machines, is proposed. FETLSVMP aggregates data from different organizations without compromising privacy and security, adapts to each user through knowledge transfer, and learns a personalized model. It provides a basis for future research on intrusion detection. The effectiveness of the algorithm is evaluated experimentally. The results show that FETLSVMP effectively addresses the low detection accuracy for small samples and emerging attack behaviors, as well as the privacy protection and data island problems, and improves detection performance. In future work on the transfer learning process, further consideration will be given to measuring the differences in the conditional probability distributions of each organization’s data and to improving training efficiency.
