Abstract
The development and utilization of network big data is accompanied by data theft and destruction, so network security monitoring is particularly important. This study therefore applies the fuzzy C-means clustering algorithm to a network security model. However, the algorithm has major defects in processing discrete data and in accounting for the influence of feature weights. The study introduces the concept of local density and optimizes the initial clustering centers to address the algorithm's sensitivity to initialization and its reliance on empirical settings. It also introduces adaptive methods for the fuzzy index and feature weighting, and uses concepts such as the fuzzy distribution centroid to avoid problems such as the model converging too fast and being unable to handle discrete data. Finally, the study performs a simulation analysis of each module and compares the overall algorithm with other models. The experimental results show that, on the IDS Dataset, the false detection rate of the overall algorithm is 8.57% lower than that of the particle swarm algorithm. The adaptive weighted fuzzy C-means algorithm based on local density proposed in the study can therefore effectively improve network intrusion detection performance.
Introduction
The popularity of networks means that a large amount of private data now resides in them, and the continuous emergence of security incidents has drawn scholars' attention to the maintenance of network security. Intrusion detection is a technique that automatically removes hidden dangers before an attack occurs, and it maintains network order better than passive detection models, which are deficient in timeliness and other respects. However, as intrusion methods continue to grow stronger, the simplest active network security models can no longer meet the needs of monitoring large networks. Network security models can be divided into misuse detection and anomaly detection. The former requires continuous training and learning to enhance its performance and is therefore not effective enough at identifying unknown intrusions; the latter is free of the limitations of data labeling and identifies anomalous behavior through feature detection and analysis [1]. Clustering is a common technique for identifying anomalous behavior; it uses unsupervised learning to search for anomalous data autonomously and thereby identify unknown intrusions. Clustering methods are of various types, such as partition-based, density-based, grid-based, and hierarchical, and the choice among them should follow the characteristics of the dataset [2]. Clustering algorithms are widely used in the field of network security. However, most studies do not go beyond the traditional algorithms and pay little attention to the characteristics of intrusion data or to the influence of algorithm parameter selection on detection performance. Intrusion detection datasets often contain both continuous and discrete features, yet most studies analyze only one of them. 
Moreover, most current clustering algorithms have difficulty adapting to the high-dimensional space of intrusion data, and their parameters are adjusted manually according to experience. Adaptive adjustment of the clustering algorithm is therefore the key to combining clustering with network security detection, and a fuzzy C-means clustering algorithm model based on local density is proposed. The innovations of this study are as follows. First, an initial cluster center selection algorithm based on local density is introduced, and a new iterative adaptive method for searching for the optimal number of clusters is provided. Second, a new fuzzy clustering index is used to analyze the correlation between clusters, which improves the performance of the algorithm. Finally, the concept of the fuzzy distribution centroid is introduced, and the feature weights of mixed-feature datasets are calculated adaptively, which improves the accuracy and applicability of the algorithm. Overall, the research improves the flexibility and effectiveness of clustering algorithms in different application scenarios, and these advances are of great value to researchers in data science and machine learning. The study is divided into four main parts. The first part introduces the development status of intrusion detection technology; the second part designs the adaptively weighted fuzzy C-means clustering algorithm model based on local density and introduces the overall network security model; the third part presents simulation experiments, in two modules, on the performance of the algorithm; and the fourth part summarizes the experimental results.
Related works
The maintenance of network security is a popular research topic. Chopra [3] designed a security maintenance model for independent mobile self-organizing networks. The model uses the concept of the dynamic mean-field game to identify false nodes in data transmission; false nodes obtain data information with short paths and high destination sequences. Statistical simulation techniques from game theory can effectively avoid this problem, and this dynamic identification technique also enables real-time updating of thresholds; the optimal strategy enhances utility, and eliminating corrupted paths improves model performance measures such as throughput. Ding et al. [4] applied fuzzy logic to anomaly analysis: anomalies are identified from the fractal characteristics of traffic, a network model is designed based on fractal theory and wavelet analysis, and fuzzy logic is applied to detecting the intensity of anomalous behavior in distributed denial-of-service attacks. The experimental results verify the effectiveness of this method in network security maintenance. Muthukumaran et al. [5] concluded that long short-term memory networks can effectively reduce the network load, introduced traffic repair techniques into this model, and connected it with traffic failure detectors to achieve normal and efficient operation of the network. Lopez-Martin et al. [6] used contrastive learning to calculate and analyze the distance between each data point and the embedding space, obtain the loss function, and compare the similarity of the data. Because the traditional method relies too much on data features, the study introduced label information into the model to classify against label prototypes, and also optimized the loss function to better identify non-smooth abnormal labels and improve network intrusion detection.
Wang et al. [7] concluded that the generalization performance of the classifier is no longer sufficient to meet the requirements of network intrusion detection, so the study proposed a detection model based on convolutional neural network and used a symmetric structure instead of a normal network, and introduced cyclic cosine annealing learning for the training of the model, and the experimental results showed that the accuracy of the model on the NSL-KDD and UNSW-NB15 datasets reached 85.82% and 80.38%. Gavagsaz [8] concluded that the traditional k-neighborhood algorithm could not cope with large-scale datasets, based on this, the study proposed an improved k-neighborhood algorithm by introducing a hierarchical clustering algorithm to classify the dataset while performing a four-stage improvement to produce a better clustering effect, and the experimental results showed that the optimized algorithm was superior in terms of performance such as recognition accuracy. Hao et al. [9] applied the long and short term memory network to the network information maintenance of high-dimensional data, and also introduced the attention mechanism to transmit the model to the hidden layer before learning, which eliminated the impact of dimensionality and other data features on the accuracy of the model and further reduced the false detection rate of the model. Rathish et al. [10] applied the MANET model to the detection of intrusion data, and introduced the Ad-hoc network for clustering, using a path-weighted clustering algorithm to achieve distributed clustering and improve the timeliness of the network, and finally experimentally verified the effectiveness of the clustering algorithm model.
Numerous studies have demonstrated the reliability of clustering analysis in intrusion detection. However, general clustering algorithms are inadequate for high-dimensional and discrete data. Based on this, this study introduces local density, an adaptive mechanism, and feature weighting into the fuzzy C-means clustering algorithm to compensate for the limitations of traditional clustering analysis.
Structure of the paper
Clustering is equivalent to an automated classification process, which meets the requirements of network intrusion data detection. Therefore, unsupervised clustering algorithms are commonly used for anomaly monitoring and protection of networks, but clustering algorithms often rely on the correct selection of the number of clusters, and therefore need to be continuously enhanced in this area to achieve better intrusion detection results.
Improved fuzzy c-mean clustering algorithm based on local density and adaptive mechanism
Clustering algorithms are often applied in the field of network security maintenance, and it is usually automated learning through natural grouping. The Fuzzy C-Means (FCM) algorithm is selected as the basis for intrusion detection. In the feature space, FCM algorithm determines the membership degree of data points and the center of each cluster by iterative optimization. However, the FCM algorithm is very sensitive to the selection of the initial cluster center. The wrong or inappropriate initial clustering center may cause the algorithm to converge to the local optimal solution, produce inaccurate clustering results, and affect the final clustering quality. A large number of studies have been conducted to optimize this problem. Oskouei et al. [11] reduced the sensitivity to initialization by automatic cluster weighting, and combined with local feature weighting strategy to improve image segmentation. Finally, the feature weight allocation process is optimized with the imperialist competition algorithm. Hashemzadeh et al. [12] adjusted the feature weights of each cluster through the cluster weighting process, which was able to better deal with the interrelationships among features in fuzzy clustering and reduce the impact of initialization selection on the final clustering results. It can be seen that they all adopt the way of feature weighting to improve the initialization sensitivity, which can solve the problem to a certain extent. However, the optimization of algorithm convergence is neglected, which may lead to the reduction of operation efficiency. The adaptive cluster selection is realized in the form of fuzzy cluster index. This paper not only provides an upper bound setting for the number of clusters to avoid using empirical rules, but also introduces a penalty function to suppress the decline of index value when the number of clusters is too large, so as to determine the optimal number of clusters adaptively. 
The penalty ensures that the index value does not become 0 as the number of clusters increases, which improves the convergence speed of the algorithm and reduces the number of iterations. Compared with other methods, this achieves a balance between adaptive clustering optimization and convergence performance. Clustering algorithms usually require an exact number of clusters to achieve high-quality results. This parameter is generally set empirically, with low accuracy: when it is too large, the results become cumbersome and difficult to interpret; when it is too small, important data are lost. The study therefore introduces an adaptive mechanism to address this drawback. First, the dissimilarity between data must be calculated for clustering; the study uses the squared Euclidean distance combined with simple matching, as shown in Equation (1).
In the above Equation (1), l represents the dimension; x_i and x_j represent two different data; r represents numerical attributes; c represents categorical attributes; l_r and l_c represent the number of numerical attributes and categorical attributes, respectively; and γ represents the weight of categorical attributes.
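The dissimilarity described for Equation (1) can be illustrated with a minimal sketch. It assumes the common mixed-attribute form, in which the squared Euclidean distance over the l_r numerical attributes is added to γ times the number of mismatching categorical attributes; the function name and argument layout are hypothetical and are not taken from the paper.

```python
import numpy as np

def mixed_dissimilarity(x_num_i, x_num_j, x_cat_i, x_cat_j, gamma=1.0):
    """Squared Euclidean distance on numerical attributes plus
    gamma-weighted simple matching on categorical attributes.
    Assumed form; the paper's exact Equation (1) is not reproduced here."""
    x_num_i = np.asarray(x_num_i, dtype=float)
    x_num_j = np.asarray(x_num_j, dtype=float)
    numeric_part = float(np.sum((x_num_i - x_num_j) ** 2))
    # Simple matching: a categorical attribute contributes 0 if the two
    # values are equal and 1 otherwise.
    mismatches = sum(a != b for a, b in zip(x_cat_i, x_cat_j))
    return numeric_part + gamma * mismatches

# Two records with two numerical and two categorical attributes each.
d = mixed_dissimilarity([0.0, 1.0], [3.0, 5.0],
                        ["tcp", "http"], ["udp", "http"], gamma=0.5)
print(d)  # 9 + 16 from the numerical part, 0.5 * 1 from the categorical part
```

A larger γ makes categorical mismatches count more heavily relative to numerical differences, which is why its value matters on mixed intrusion data.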
In the above Equation (2), ρ_i denotes the local density of the data x_i, and d_c denotes the truncation distance. The truncation distance is obtained by sorting the distances between the data and comparing their sizes. The local density of a datum depends on its distances to the rest of the data: when most of these distances are smaller than the truncation distance, the local density is large, and otherwise it is small, as shown in Fig. 1.

Selection of cluster centers based on local density.
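As an illustration of the local density idea, the sketch below counts, for each datum, how many other data lie closer than the truncation distance. This cutoff-kernel form and the percentile rule for choosing the truncation distance are assumptions borrowed from common density-peak practice; the paper's Equation (2) may differ, and the `percent` parameter is hypothetical.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix between all rows of X."""
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))

def cutoff_distance(D, percent=2.0):
    """Pick the truncation distance from the sorted pairwise distances.
    `percent` is an assumed tuning knob, not a value from the paper."""
    upper = np.triu_indices_from(D, k=1)      # each pair counted once
    d_sorted = np.sort(D[upper])
    pos = int(round(len(d_sorted) * percent / 100.0))
    pos = min(max(pos, 1), len(d_sorted)) - 1
    return float(d_sorted[pos])

def local_density(D, dc):
    """Number of other points whose distance is below the truncation distance."""
    within = D < dc
    np.fill_diagonal(within, False)           # a point is not its own neighbor
    return within.sum(axis=1)

X = [[0, 0], [0, 1], [1, 0], [10, 10]]
D = pairwise_distances(X)
print(local_density(D, dc=1.5))  # the isolated point gets density 0
```

Points inside a tight group receive a high density, while the isolated point receives zero, which matches the qualitative behavior described above.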
From Fig. 1, it can be seen that the truncation distance is closely related to the number of initial clusters, from which the relationship between the local density and the number of clusters follows; this indicates that d_c is robust. Assuming that x_i belongs to the set X, the conditions that this point must satisfy to become a core point are shown in Equation (3):
When the distance between two data is less than the truncation distance, it indicates that the two data are directly density reachable; the direct density reachable and density reachable objects of a certain data belong to the neighborhood of that data, and the points that do not contain the neighborhood data are boundary points. Therefore, the basic operation process of FCM algorithm based on local density is roughly divided into four steps. First, the distance matrix, truncation distance and local density are calculated; then the local density values are sorted; after selecting the data with the largest local density, the central cluster is selected according to the nearest neighbor principle; finally, the information output of the initial clustering centers and their numbers can be performed. The visualization flow of the algorithm is shown in Fig. 2.

Visual flow of FCM algorithm based on local density.
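The four steps above can be sketched as a greedy procedure: compute distances and densities, sort by density, repeatedly take the densest unassigned point as a center, and absorb its neighbors within the truncation distance. This is a simplified reading under stated assumptions: only directly density-reachable points are absorbed, and the `min_density` cutoff for core points is hypothetical.

```python
import numpy as np

def initial_centers(X, dc, min_density=1):
    """Greedy selection of initial cluster centers from local density.
    Returns the chosen centers and their number (a sketch, not the
    paper's exact procedure)."""
    X = np.asarray(X, dtype=float)
    # Step 1: distance matrix and local density for a given truncation distance.
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
    within = D < dc
    np.fill_diagonal(within, False)
    rho = within.sum(axis=1)
    # Step 2: sort points by decreasing local density.
    order = np.argsort(-rho, kind="stable")
    # Step 3: take the densest unassigned point as a center and assign
    # its direct neighbors to it.
    assigned = np.zeros(len(X), dtype=bool)
    centers = []
    for i in order:
        if assigned[i] or rho[i] < min_density:
            continue  # already covered, or too sparse to be a core point
        centers.append(i)
        assigned[i] = True
        assigned[within[i]] = True
    # Step 4: output the centers and their number.
    return X[centers], len(centers)

X = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centers, k = initial_centers(X, dc=1.5)
print(k, centers.tolist())
```

Because the number of centers falls out of the procedure, it does not have to be supplied in advance, which is the property the study relies on.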
In Fig. 2, the triangular symbols indicate the boundary points, the four-pointed stars of different colors represent different clusters, and the diamonds indicate the cluster centers. To further approach the global optimum, the study introduces a new fuzzy clustering validity index, as shown in Equation (4).
In the above Equation (4), U denotes the fuzzy partition matrix; V denotes the clustering centers; n_k denotes the fuzzy cardinality; u_ik denotes the affiliation degree; c denotes the optimal number of clusters; and v denotes the number of clustering centers. The study also introduces the fuzzy index m into the algorithm; the larger this parameter, the fuzzier the partition, and its effect on clustering is shown in Fig. 3.

Influence of fuzzy index on clustering effect.
It can be seen that the ambiguity disappears when this parameter is equal to 1, but as the value of the parameter keeps increasing, the division characteristic is weakened again. Therefore, the algorithm needs to select an optimal fuzzy value that guarantees the effect on both sides. The operation of the objective function of the algorithm to derive the fuzzy value is shown in Equation (5) [13].
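The effect of the fuzzy index can be checked numerically with the standard FCM membership update, u_ik = 1 / Σ_j (d_ik / d_jk)^(2/(m−1)). The sketch below uses that textbook formula; the function name and the toy data are illustrative only.

```python
import numpy as np

def fcm_memberships(X, V, m=2.0, eps=1e-12):
    """Standard FCM membership update for data X (n x d) and centers V (c x d)."""
    if m <= 1.0:
        raise ValueError("the fuzzy index m must be greater than 1")
    X = np.asarray(X, dtype=float)
    V = np.asarray(V, dtype=float)
    # Squared distances, shape (n, c); eps avoids division by zero when a
    # point coincides with a center.
    d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=-1) + eps
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

x = [[1.0]]             # one point, closer to the first center
V = [[0.0], [3.0]]
for m in (1.1, 2.0, 10.0):
    print(m, fcm_memberships(x, V, m=m)[0])
```

With m close to 1 the point belongs almost entirely to the nearer center (a crisp partition), while a large m pushes both memberships toward 0.5, so the partition loses its discriminating power.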
In the above Equation (5), J_m(U, V) denotes the FCM objective function and ∂J_m(U, V) its partial derivative with respect to the fuzzy index. The affiliation lies in the interval [0, 1], so this partial derivative is never greater than zero, which also causes the gradient to vanish during iteration. Based on this, the study defines the fuzzy correlation degree ρ_jk between clusters j and k, as shown in Equation (6) [14].
From the above Equation (6), the numerator and denominator reflect the degree of fuzziness between clusters and within clusters, respectively, and the fuzzy correlation reaches its maximum value of 1 when the two clusters are equal. However, this formula does not reflect the case in which ρ_jk tends to zero, so the study uses the improved formula shown in Equation (7).
The validity function of the fuzzy C-mean clustering algorithm has a certain proportional relationship with the fuzzy correlation value, and the algorithm divides best when the validity function is the minimum value [15].
When the data in the FCM algorithm are too homogeneous, the clustering results are greatly affected. Moreover, network data features are usually high-dimensional and of low validity. Based on this, the study introduces the concept of feature weighting to address this drawback. Traditional feature weighting generally cannot cope with discrete intrusion information, and its weights are fixed and constant for different clusters, so it does not completely solve problems such as dimensionality explosion [16]. The study therefore introduces rough sets and shadow sets in a hybrid weighting scheme to achieve a better clustering effect. Each cluster class is represented by three components: the centroid, the lower approximation, and the boundary. The lower approximation contains the data that definitely belong to the cluster, the upper approximation contains the data that do not necessarily belong to the cluster, and the boundary reflects the uncertainty of the data. According to the above relationship, the representation of the cluster centers can be obtained as shown in Equation (8).
In the above Equation (8),
The study uses shadow sets to select the thresholds dynamically. Shadow sets are similar to fuzzy sets: the final value of the function is determined by comparing the affiliation function with the threshold β, which is essentially the process of projecting the fuzzy set into the three-dimensional space [17]. The shadow set construction is shown in Fig. 4.

Shadow set construction with relative membership degree.
From Fig. 4, it can be seen that the ambiguity of the features is preserved when the affiliation is in the range [β, 1 − β]. The clustering results of the shadow set are usually discontinuous, so its objective function should also be minimized; the corresponding objective function O(β) is shown in Equation (10).
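A threshold search of this kind can be sketched as follows. The objective used here is the balance-of-uncertainty criterion commonly associated with shadowed sets (memberships reduced to 0, plus memberships elevated to the maximum, minus the size of the shadow region); it is an assumption that the paper's Equation (10) takes this form, and the grid search over `steps` candidates is a hypothetical simplification.

```python
import numpy as np

def shadow_objective(u, beta):
    """Balance between the membership mass that is reduced or elevated
    and the number of elements left in the shadow region (assumed form)."""
    u = np.asarray(u, dtype=float)
    u_max = u.max()
    reduced = u[u <= beta].sum()                      # pushed down to 0
    elevated = (u_max - u[u >= u_max - beta]).sum()   # pushed up to u_max
    shadow = np.count_nonzero((u > beta) & (u < u_max - beta))
    return abs(reduced + elevated - shadow)

def best_threshold(u, steps=200):
    """Grid search for beta over [u_min, (u_min + u_max) / 2]."""
    u = np.asarray(u, dtype=float)
    lo, hi = u.min(), (u.min() + u.max()) / 2.0
    candidates = np.linspace(lo, hi, steps)
    values = [shadow_objective(u, b) for b in candidates]
    return float(candidates[int(np.argmin(values))])

# Memberships of seven data in one cluster (illustrative values).
u = [0.05, 0.1, 0.45, 0.5, 0.55, 0.9, 0.95]
beta = best_threshold(u)
print(beta, shadow_objective(u, beta))
```

Running the search separately on each cluster's membership column yields a different threshold per cluster, which is what makes the selection dynamic rather than a single hand-set constant.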
In the above Equation (10), O(β), ψ2, and ψ3 represent the reduced part of affiliation; u_max and u_min are the upper and lower limits of affiliation, respectively. The range of the threshold value is [u_min, (u_min + u_max)/2]. Actual datasets often contain various mixed features, so the study introduces the concept of the fuzzy distribution centroid to express the cluster class centers and also redefines the weighted objective function. The study defines the jth discrete feature as A_j and its value domain as
From the above Equation (11), it can be seen that the clustering center of the feature is the set of
The study introduces the concept of feature weighting for the autonomous operation of the iterative process with the objective function shown in Equation (13) [19].
In the above Equation (13), w_kj represents the weight of the jth feature in the kth cluster, and its value range is [0, 1]; a represents the exponent of the feature weight and lies in the interval [−10, 0) ∪ (1, 10].
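For intuition, the sketch below shows the closed-form weight update that results when an objective of the form Σ_k Σ_j w_kj^a · D_kj is minimized subject to Σ_j w_kj = 1 using Lagrange multipliers, where D_kj is the membership-weighted dispersion of feature j in cluster k. That this matches the paper's Equations (13) and (14) is an assumption; the form is the one familiar from weighted k-means style derivations, and the function name is hypothetical.

```python
import numpy as np

def update_feature_weights(D, a=2.0, eps=1e-12):
    """Per-cluster feature weights from dispersions D (clusters x features).
    w_kj = D_kj^(-1/(a-1)) / sum_t D_kt^(-1/(a-1)); assumed closed form."""
    if 0.0 <= a <= 1.0:
        raise ValueError("the exponent a must lie outside [0, 1]")
    D = np.asarray(D, dtype=float) + eps   # eps guards against zero dispersion
    inv = D ** (-1.0 / (a - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

# One cluster, two features: the second feature is four times more dispersed.
w = update_feature_weights([[1.0, 4.0]], a=2.0)
print(w)  # the compact feature receives the larger weight
```

Each row of the result sums to 1 and stays within [0, 1], consistent with the stated value range of w_kj, and the excluded interval for the exponent is exactly where the closed form breaks down.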
In the above Equation (14), λ_i and γ_k denote Lagrange multipliers. However, the above objective function is not comprehensive: it accounts for intra-cluster tightness but ignores inter-cluster separation. Based on this, the study updates the objective function as shown in Equation (15) [20, 21].
From the above Equation (15), it can be seen that the numerator is related to the intra-cluster tightness and the two are negatively correlated; the denominator is related to the inter-cluster separation and the two are positively correlated, and the value represents the mean value of the separation of the cluster class centers. The overall flow of the adaptive feature-weighted FCM algorithm based on local density is shown in Fig. 5.

Operation flow of adaptive feature-weighted FCM algorithm based on local density.
As can be seen from Fig. 5, the algorithm can be roughly divided into seven steps. First, the data are initialized: the set of cluster centers is obtained from the local density, and the fuzzy distribution centroids are computed. Second, the affiliation matrix is calculated. Third, the threshold of each cluster class is determined; these thresholds differ between cluster classes. Fourth, the boundary conditions of each cluster class are calculated according to the operational formula. Fifth, the objective function values before and after the iteration are compared with the threshold, and the next step is chosen according to the result. Sixth, the cluster centers are updated. Seventh, the feature weights are calculated. The study uses a network security detection model based on the optimized FCM algorithm, as shown in Fig. 6.

Overall framework of network security detection model.
As can be seen from Fig. 6, in the overall model, the client and the server side are included. The former needs to collect the data and transmit it to the server, and the latter forensically analyzes it, which in turn enables the client to download forensic reports, etc. Both cooperate with each other and work together. Network intrusion forensics is mainly divided into six stages, which is carried out through two ports.
To further validate the effectiveness of the intrusion detection model in maintaining network security, simulation experiments are carried out. In the following, the algorithm designed in this study is referred to as the Optimized Fuzzy C-Means Clustering (OFCM) algorithm. The experiments are divided into two parts: first, the performance of each module in the algorithm is verified; then the designed algorithm is compared with other commonly used algorithms to verify its reliability.
Performance verification experiments on each combined module of OFCM algorithm
The data set used in the experiment and its basic information are shown in Table 1.
Introduction of experimental data set
The study first analyzes the selection of the initial clustering centers, comparing the random generation method used in the original algorithm with the improved method based on local density. The fuzzy index is set to 2, the maximum number of iterations to 100, and the convergence threshold to 10^-5. An artificial dataset with a two-dimensional Gaussian distribution, the Iris dataset, the Wine dataset, and the SubKDD dataset are selected for comparison. To eliminate chance effects caused by random selection, each algorithm was run 30 times on each dataset, and the experimental results are shown in Fig. 7.

Iteration times of each algorithm on different data sets.
From Fig. 7, it can be seen that with randomly selected initial clusters the number of iterations is higher, unstable, and strongly fluctuating, with a higher chance error, whereas initial cluster selection based on local density effectively reduces the number of iterations. On the artificial dataset with two-dimensional Gaussian distribution, the mean number of iterations of the former is 39, while that of the local-density-based algorithm is only 14, a decrease of 25 iterations. On the Iris, Wine, and SubKDD datasets, the mean iteration count of the latter is lower than that of the random selection algorithm by 5, 9, and 8 iterations, respectively. The difference between the two algorithms is largest on the artificial dataset, where the random selection algorithm ranges from a maximum of 64 iterations to a minimum of 23, with large overall fluctuations. Therefore, introducing local density not only speeds up the convergence of the clustering algorithm but also noticeably improves its stability and clustering effect. To verify the effectiveness of the adaptive mechanism, the study simulates it on the artificial dataset, the Iris dataset, and the SubKDD dataset. Three centroids are selected to construct the artificial dataset, each cluster class has 45 samples arranged according to a Gaussian distribution, and the Particle Swarm Optimization (PSO) algorithm combined with the FCM algorithm is introduced as the control experiment. The particle swarm size is set to 20, the adjustment factor to 100, the initial and final inertia weights to 0.8 and 0.2, respectively, and the learning factors are all set to 2. 
The maximum number of iterations is still set to 100, and the experimental results are shown in Table 2.
Clustering effect analysis of each algorithm in manual data set
Among the sum of the Euclidean distances between the cluster centers obtained by the three algorithms and the actual cluster centers, the OFCM algorithm has the smallest value, with a decrease of 0.195 and 0.068 compared to the initial FCM algorithm and the particle swarm optimization FCM algorithm, respectively. The difference in the Xie-Beni index is not significant, but still improved, with a decrease of 0.67% and 0.39%, respectively. The above data indicate that the FCM optimization algorithm used in the study can better identify the actual clustering centers with relatively smaller errors, and also, the algorithm has the best fuzzy delineation effect. The study further conducted simulation experiments on the Iris dataset and the intrusion detection dataset, and also tested the best value of the fuzzy metric for each dataset, and finally the fuzzy metric was selected as 2.2 in the Iris dataset and 1.8 in the intrusion detection dataset, and the experimental results of each algorithm in each dataset are shown in Fig. 8.
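The Xie-Beni index compared above is a standard validity measure: the membership-weighted within-cluster scatter divided by the number of samples times the smallest squared distance between two centers, with lower values indicating a better partition. The sketch below implements that textbook definition; the toy data are illustrative and unrelated to the reported results.

```python
import numpy as np

def xie_beni(X, V, U, m=2.0):
    """Xie-Beni index for data X (n x d), centers V (c x d, c >= 2) and
    membership matrix U (n x c). Lower is better."""
    X = np.asarray(X, dtype=float)
    V = np.asarray(V, dtype=float)
    U = np.asarray(U, dtype=float)
    d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=-1)   # (n, c)
    compactness = np.sum((U ** m) * d2)
    center_d2 = ((V[:, None, :] - V[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(center_d2, np.inf)       # ignore a center's distance to itself
    separation = center_d2.min()
    return compactness / (len(X) * separation)

X = [[0.0], [1.0], [9.0], [10.0]]
V = [[0.5], [9.5]]
U = [[1, 0], [1, 0], [0, 1], [0, 1]]          # crisp memberships for simplicity
print(xie_beni(X, V, U))
```

Because the denominator depends on the closest pair of centers, the index penalizes partitions that split one natural group into two nearby clusters.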

Comparison between the adaptive FCM algorithm and other algorithms in different data sets.
As can be seen in Fig. 8(a), the FCM algorithm with the adaptive module has the highest clustering accuracy on the Iris dataset, 90.09%, while the clustering accuracies of the FCM algorithm and the PSO + FCM algorithm are 89.12% and 89.61%, which are lower by 0.97% and 0.48%, respectively. The adaptive module thus improves the clustering effect of the model. As can be seen in Fig. 8(b), the detection accuracy and false detection rate of the initial FCM algorithm are 91.02% and 11.49%, respectively, while the detection accuracy and false detection rate of the FCM algorithm with the adaptive module are 91.27% and 11.49%, respectively, with a relative improvement of 0.25% in accuracy and a relative reduction of 0.17% in error rate; therefore, the adaptive module effectively improves the recognition performance of the model. In addition, the optimization effect of the feature weighting module on the initial clustering sensitivity is analyzed. Experiments before and after optimization were compared using different initial cluster numbers and positions. The results are shown in Fig. 9.

Optimization effect of feature weighting module on initial clustering sensitivity.
Figure 9(a) and Fig. 9(b) show the sensitivity of the feature weighted module to the number of initial clustering centers of the algorithm before and after optimization. It can be seen that the convergence performance of the model without feature weighted optimization decreases with the increase of the number of initial clustering centers. Except for the case where the initial cluster number is 10, the global convergence is completed, and the remaining cases fall into local optimal, and the final convergence number is 7/8/12 respectively. After the feature weighted optimization, the number of cluster centers can converge to 3 regardless of the initial cluster number. This indicates that the optimized algorithm is insensitive to the initial cluster number and can achieve global convergence better. Figure 9(c) and Fig. 9(d) show the sensitivity of the feature weighted module to the location of the initial clustering center of the algorithm before and after optimization. The models without feature weighted optimization are greatly affected by the initial clustering position, and almost all fail to achieve global convergence. However, the models with feature weighted optimization are insensitive to the initial clustering position and can converge smoothly. The above data indicate that the feature weighting method proposed in this study can produce better optimization effect on the initial clustering sensitivity of the algorithm.
The experiments on the overall performance of the OFCM algorithm are divided into three parts. The first is the experiment on continuous-feature datasets, using the artificial dataset from the above experiment, the Iris dataset, and the Glass dataset; the Glass dataset has 214 samples in total, with nine features and two cluster classes. The fuzzy index of the designed algorithm is set to 2, the maximum number of iterations and the convergence threshold are the same as in the previous experiments, the contribution weight of the lower approximation is set to 0.95, and the iteration step of each cluster class threshold is set to 0.001. The study also introduces the k-means algorithm, the particle swarm algorithm, and the genetic algorithm (GA) for comparison, and the clustering results of each algorithm on the artificial dataset are shown in Fig. 10.

Clustering results of each algorithm in artificial data set.
The clustering results of each algorithm can be clearly seen in Fig. 10, where Fig. 10(a) shows the initial clustering state. The initial FCM algorithm and the k-means algorithm can only classify the data simply and find the clustering center of each cluster class. The other algorithms can identify the boundary of each cluster class. Fig. 10(d) shows that the particle swarm algorithm yields relatively few boundary data. Fig. 10(e) and Fig. 10(f) show that the clustering results of the OFCM algorithm and the genetic algorithm are basically the same, although their different feature weights lead to some differences in the identified boundaries; both algorithms identify more boundary samples than the particle swarm algorithm, which indicates that their clustering is more deterministic and their results are more accurate. The experimental results of each algorithm on the Iris dataset and the Glass dataset are shown in Fig. 11.

Clustering accuracy results of each algorithm in Iris dataset and Glass dataset.
As can be seen from Fig. 11, the initial FCM algorithm has the lowest clustering accuracy, while the improved FCM algorithm has the highest. In Fig. 11(a), the clustering accuracies of the FCM algorithm, particle swarm algorithm, k-means algorithm, genetic algorithm, and improved FCM algorithm stabilize at 90.02%, 91.34%, 92.68%, 95.38%, and 96.14%, respectively. The accuracy of the OFCM algorithm is 6.12% higher than that of the initial FCM algorithm; its performance is closest to that of the genetic algorithm, which it still exceeds by 0.76%. The improved FCM algorithm therefore has the best clustering effect on the Iris dataset. On the Glass dataset, however, the performance of most algorithms decreases. In Fig. 11(b), the clustering accuracies of the FCM algorithm, particle swarm algorithm, k-means algorithm, genetic algorithm, and improved FCM algorithm stabilize at 87.91%, 92.92%, 92.45%, 94.62%, and 94.87%, respectively. The clustering accuracy of the OFCM algorithm is 1.27% lower than on the Iris dataset, a relatively small decrease; it is 6.96% higher than that of the initial FCM algorithm, and 1.95%, 2.42%, and 0.25% higher than those of the particle swarm algorithm, k-means algorithm, and genetic algorithm, respectively. In summary, the improved FCM algorithm has the best clustering results on both datasets. 
Finally, the study applies each algorithm to a mixed feature dataset and an intrusion detection dataset. The study combines the Heart Disease dataset and the SubKDD dataset as the mixed feature dataset. The Heart Disease dataset has 6 discrete features and 9 continuous features, with 303 samples in total, and the study divides it into 2-cluster data and 5-cluster data according to the different classification features. The IDS Dataset dataset is used for intrusion detection, and the performance results of each algorithm are shown in Table 3.
Experimental results of each algorithm in mixed feature data set and intrusion detection data set
As can be seen from Table 3, all algorithms perform relatively poorly on the Heart Disease dataset with 5 clusters, but the improved FCM algorithm still achieves the highest clustering accuracy, 18.26% higher than the lowest-scoring particle swarm (PSO) algorithm. When this dataset is divided into two clusters, the clustering accuracy of the OFCM algorithm and the genetic algorithm is significantly higher, whereas the accuracy of the PSO and k-means algorithms does not increase significantly; the accuracy of the OFCM algorithm is 27.47% higher than that of the PSO algorithm. In the detection experiments on the intrusion data, the improved FCM algorithm achieves the highest detection rate and the lowest false detection rate. In summary, the adaptive weighted FCM algorithm based on local density can effectively improve the accuracy of network intrusion detection and help maintain network security. To verify the performance of the model on high-dimensional large data sets, the study applies it to the MNIST dataset and the UCI Letter Recognition dataset. The comparative experimental results are shown in Fig. 12.
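The detection rate and false detection rate are not defined explicitly in this section. A minimal sketch under the usual intrusion detection convention, which is an assumption here, takes the detection rate as the share of intrusions that are flagged and the false detection rate as the share of normal records that are wrongly flagged. The function name `detection_metrics` and the toy labels are illustrative only.

```python
def detection_metrics(y_true, y_pred):
    """Return (detection_rate, false_detection_rate).

    Labels: 1 = intrusion, 0 = normal traffic.
    Assumed definitions (not confirmed by the study):
      detection rate       = TP / (TP + FN)
      false detection rate = FP / (FP + TN)
    """
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    # Guard against empty classes so the sketch never divides by zero.
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_detection_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_detection_rate


# Toy labels (made up for illustration): 4 intrusions, 6 normal records.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
dr, fdr = detection_metrics(y_true, y_pred)
print(f"detection rate = {dr:.2%}, false detection rate = {fdr:.2%}")
```

Under these definitions, an algorithm is preferable when the first value is high and the second is low, which is the sense in which the results above are read.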

Performance analysis of the model in a high-dimensional large data set.
As can be seen from Fig. 12, the accuracy of the proposed model is always higher than that of the genetic algorithm (GA). In Fig. 12(a), the average accuracy of the model is 91.26%, an increase of 6.13% compared with GA. In Fig. 12(b), the performance gap between the two models remains large: the average accuracy of the proposed model is 96.54%, an increase of 4.37% compared with GA. In summary, the proposed model maintains good detection performance when processing high-dimensional and large-volume data.
Network security maintenance is an inevitable trend in the development of today’s data-based society. The study proposes an adaptive feature-weighted FCM algorithm based on local density, which handles discrete feature data and optimizes both the selection of the initial clustering centers and the setting of the number of clusters, further improving the detection performance of the algorithm. The study then runs simulation experiments on each module of the algorithm and on the overall algorithm. It compares the number of iterations of the local density algorithm with that of the random selection algorithm on four data sets. The results show that the local density algorithm effectively reduces the number of model iterations and improves the stability of the model; for example, the mean value of its iteration number decreased by 25 times in the artificial data set. In the experiments on the adaptive module, the partition entropy of the algorithm is reduced by 5.86%. The experiments on the overall algorithm are divided into three parts: the continuous feature datasets, the mixed feature dataset, and the intrusion detection dataset. In the first type of experiment, the improved FCM algorithm shows a more stable and accurate clustering effect, and on the Iris dataset its clustering accuracy is 3.8 percentage points higher on average than that of the other algorithms. On the mixed feature dataset, its accuracy is 18.26% higher than that of the PSO algorithm. In intrusion detection, its false detection rate is 8.57% lower. The improved FCM algorithm proposed in the study therefore outperforms the compared algorithms.
However, the model still has some shortcomings. To meet the required detection accuracy, the algorithm sacrifices simplicity and speed, which makes the model somewhat heavyweight. Follow-up research should therefore focus on improving the time and space efficiency of the model.
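As a worked check of the average Iris improvement quoted in the conclusion, the figure can be reproduced from the accuracies reported for Fig. 11(a). The sketch below assumes that the average is the unweighted mean of the absolute accuracy differences against the four other algorithms; the variable names are illustrative.

```python
# Iris clustering accuracies (%) as reported for Fig. 11(a).
iris_accuracy = {
    "FCM": 90.02,
    "PSO": 91.34,
    "k-means": 92.68,
    "GA": 95.38,
    "improved FCM": 96.14,
}

reference = iris_accuracy["improved FCM"]

# Absolute gain of the improved FCM over each other algorithm, in percentage points.
gains = {
    name: round(reference - acc, 2)
    for name, acc in iris_accuracy.items()
    if name != "improved FCM"
}

# Assumption: the conclusion's average is the unweighted mean of these gains.
mean_gain = sum(gains.values()) / len(gains)

for name, gain in gains.items():
    print(f"{name}: +{gain:.2f} percentage points")
print(f"mean gain: {mean_gain:.3f} percentage points")
```

The individual gains are 6.12, 4.80, 3.46, and 0.76 percentage points, and their mean rounds to 3.8, which agrees with the value stated in the conclusion.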
