Abstract
This paper proposes a deep mining method for high-dimensional abnormal data in the Internet of things based on an improved ant colony algorithm. First, the high-dimensional abnormal data of the Internet of things are preprocessed and the data correlation feature quantity is extracted. Second, the ant colony algorithm is improved by updating the pheromone and the state transition probability. Finally, with the improved ant colony algorithm, the feature response signal of the high-dimensional abnormal data is extracted, the judgment threshold of the abnormal data is determined, and an objective function is constructed to optimize the mining depth, so as to realize deep data mining. The results show that the average error of the proposed method is only 0.48%.
Introduction
At present, with the continuous development of the mobile Internet, Internet of things technology and social media applications, the amount of data generated in practical applications keeps increasing, and its dimensionality and complexity keep growing [11]. Compared with other types of data, such as interactive information, website text, mobile Internet communication data, and multimedia video, audio and image data, the data in the Internet of things has higher dimensionality [4]. The data types of the Internet of things are complex and strongly heterogeneous, and the Internet of things has been widely used in terminal systems to better monitor different objects in the real world. The Internet of things collects a large amount of data and constantly enriches the data types, while more complex data formats also appear [2]. However, Internet of things data are dynamic, complex in time and space, and partially incomplete, which easily leads to a large amount of abnormal data. Therefore, it is of great significance to study the deep mining of high-dimensional abnormal data in the Internet of things.
In reference [15], an unsupervised data mining method based on deep learning is proposed to separate normal operation from fault conditions in chemical processes, so as to create a labeled database and build a fault diagnosis model effectively. The method consists of three steps: feature extraction with a convolutional stacked autoencoder (SAE), feature visualization with the t-distributed stochastic neighbor embedding (t-SNE) algorithm, and clustering, which together realize unsupervised data mining. However, this method does not account for data with a large number of dimensions and suffers from poor mining accuracy. In reference [12], a modeling method for mining public Internet information based on a probabilistic topic model is proposed. In view of the growing amount of network data, the method uses the probabilistic topic model to mine and analyze network information in depth and obtain the deep intelligence knowledge required by the military. Data mining technology is introduced into military intelligence analysis, and a network intelligence analysis system based on the data mining model is constructed to mine public Internet information. This method does not consider the local optimum problem of the algorithm, and it suffers from long mining time and small mining capacity. In reference [5], a deep mining method for high-dimensional sensor data based on a deep belief network is proposed. The method obtains the high-dimensional abnormal data in the Internet of things through sensors, determines the characteristics of this type of data, reduces the dimension of the original data, and introduces a sliding window to complete the high-dimensional data mining. It can effectively determine the dimensions of Internet of things data, but it considers few factors in feature extraction, which limits the in-depth acquisition of the data.
To solve the above problems, this paper proposes a deep mining method for high-dimensional abnormal data in the Internet of things based on an improved ant colony algorithm. The specific route of this paper is as follows. First, the high-dimensional abnormal data of the Internet of things are preprocessed, the correlation feature quantity of the data information is extracted, and the high-dimensional abnormal data features are classified, which completes the extraction of the high-dimensional abnormal data of the Internet of things. Second, the principle of the ant colony algorithm is analyzed, and the algorithm is improved by updating the ant colony pheromone and changing the state transition probability. Third, the improved ant colony algorithm extracts the feature response signal of the high-dimensional abnormal data in the Internet of things, determines the data gain and the judgment threshold of the abnormal data, and constructs an objective function to optimize the mining depth, which realizes the deep data mining.
Extraction of high-dimensional abnormal data from Internet of things
High-dimensional abnormal data in the Internet of things usually refers to data with hundreds of dimensions. Unlike common one- and two-dimensional data, such data are characterized by high dimensionality, large storage requirements, and difficult dimensionality reduction. Mining the high-dimensional abnormal data is therefore very important for ensuring the normal operation of the Internet of things.
To mine the high-dimensional abnormal data of the Internet of things, the data are first preprocessed to extract the correlation feature quantity of the data information, and the abnormal data features are then classified according to the feature correlation threshold. The high-dimensional abnormal data of the Internet of things are finally extracted according to the data mining constraints.
If the high-dimensional data acquisition sample of the Internet of things terminal is
represented by ω, the initial high-dimensional data set of the Internet of things
terminal is expressed as
In formula (1),
If the change rate of high-dimensional data clustering is expressed as
σ, the average time consumed in clustering high-dimensional
data in the Internet of things terminal is expressed as
In formula (3),
If the number of training samples for high-dimensional abnormal data set of the
connected network terminal is expressed
In formula (5),
In summary, the extraction stage preprocesses the high-dimensional abnormal data of the Internet of things, extracts the correlation features of the data information, and classifies the abnormal data features, which completes the extraction of the high-dimensional abnormal data of the Internet of things.
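The preprocessing and correlation feature extraction described in this section can be illustrated with a minimal sketch. The formulas of this section are not reproduced here, so the sketch assumes a common stand-in: z-score scaling followed by Pearson correlation between dimensions. The function names, the toy samples and the 0.9 correlation threshold are all hypothetical and are not taken from the method itself.

```python
import math


def standardize(rows):
    """Column-wise z-score scaling; a constant column is mapped to 0.0."""
    n, dims = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n)
        for j in range(dims)
    ]
    return [
        [(r[j] - means[j]) / stds[j] if stds[j] > 0 else 0.0 for j in range(dims)]
        for r in rows
    ]


def correlation_matrix(z_rows):
    """Pearson correlation between the dimensions of standardized data."""
    n, dims = len(z_rows), len(z_rows[0])
    return [
        [sum(r[i] * r[j] for r in z_rows) / n for j in range(dims)]
        for i in range(dims)
    ]


def correlated_pairs(corr, threshold):
    """Dimension pairs whose absolute correlation reaches the threshold."""
    dims = len(corr)
    return [
        (i, j)
        for i in range(dims)
        for j in range(i + 1, dims)
        if abs(corr[i][j]) >= threshold
    ]


# Hypothetical toy samples: 5 records with 3 dimensions each.
samples = [
    [1.0, 2.1, 5.0],
    [2.0, 3.9, 3.0],
    [3.0, 6.2, 4.0],
    [4.0, 8.0, 1.0],
    [5.0, 9.9, 2.0],
]

corr = correlation_matrix(standardize(samples))
pairs = correlated_pairs(corr, threshold=0.9)  # assumed threshold
print(pairs)
```

In this toy data only dimensions 0 and 1 move together closely enough to pass the assumed threshold, so they would be grouped as one correlated feature before classification.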
Improvement of ant colony algorithm and realization of high-dimensional abnormal data mining in Internet of things
Improvement of ant colony algorithm
The ant colony algorithm is inspired by the foraging behavior of ants. During foraging, ants secrete a chemical called pheromone and deposit it along the paths they travel. Over time, the colony converges on the optimal path from the nest to the food source. The principle of the ant colony algorithm is shown in Fig. 1.

Fig. 1. Principle of ant colony algorithm.
Point A represents the ant colony nest, point F represents the food source, and the region between points C and D represents an obstacle between the nest and the food source. The ant colony can only reach the food source by travelling from point B to point C or from point B to point D, and then onward from point C or from point D.
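The convergence toward the better path can be illustrated with a small deterministic sketch of two alternative branches around an obstacle. It assumes the standard ant colony rules: ants split between branches in proportion to pheromone, pheromone evaporates at rate `rho`, and each branch receives a deposit inversely proportional to its length. The branch names, the lengths and all parameter values are hypothetical, since Fig. 1 does not state which branch is shorter.

```python
def simulate_two_branches(short_len=1.0, long_len=2.0, rho=0.1, q=1.0,
                          iterations=200):
    """Mean-field simulation of pheromone on two alternative branches.

    Returns the final share of pheromone on each branch.
    """
    lengths = {"short": short_len, "long": long_len}
    tau = {"short": 1.0, "long": 1.0}  # equal pheromone at the start
    for _ in range(iterations):
        total = sum(tau.values())
        # Fraction of the colony choosing each branch in this iteration.
        share = {name: tau[name] / total for name in tau}
        for name in tau:
            # Evaporation, then a deposit that favours the shorter branch.
            tau[name] = (1.0 - rho) * tau[name] + share[name] * q / lengths[name]
    total = sum(tau.values())
    return {name: tau[name] / total for name in tau}


final_share = simulate_two_branches()
print(final_share)
```

Because the shorter branch receives a larger deposit per ant, its pheromone share grows in every iteration and the colony concentrates on it. The same positive feedback is what can trap the search in a local optimum when an early, suboptimal choice is reinforced.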
Because the ant colony algorithm is prone to falling into local optima when searching
for the optimal target, this paper improves it. First, the ant colony pheromone is
updated. After an ant completes the construction of a rule, all pheromones belonging
to the property terms of that rule need to be updated globally, that is:
In formula (7),
If the amount of ant colony pheromone is large, the pheromone belonging to this
property gradually decreases during the search, or even approaches zero, which
weakens the global search ability. In addition, a condition term may be selected
again in the current search, which further reduces global search performance [8].
It is therefore necessary to determine the dynamic variation of the ant colony
pheromone dispersion and introduce
the
In formula (8),
On this basis, the state transition probability is changed to improve the
performance of the ant colony algorithm. If the heuristic factor and the ant colony
pheromone interact [7], the term belonging to this property is placed in the current rule [3]. Set the property to be
represented as
In formula (9), the
Because the path that the ant colony has already traveled belongs to this property
term and classifies the actual number covered well [13], the heuristic factor can be expressed as:
In formula (10),

Fig. 2. Improvement of ant colony algorithm.
By updating the ant colony pheromone and changing the state transition probability, the search efficiency is improved, the calculation time is reduced, and the problems of local optima and system stagnation are avoided.
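The two improvement steps can be sketched as follows. The exact update formulas of this section are not reproduced here, so the sketch uses the standard ant colony forms as a stand-in: evaporation plus reinforcement of the terms used by a rule, and a transition probability proportional to pheromone raised to `alpha` times the heuristic factor raised to `beta`. The lower bound `tau_min` (in the spirit of MAX-MIN ant systems) is an assumed substitute for the dynamic pheromone adjustment, and all names and values are hypothetical.

```python
def update_pheromone(tau, used_terms, quality, rho=0.1, tau_min=0.01):
    """Global update: evaporate every term, reinforce the terms of the rule.

    tau_min keeps pheromone from approaching zero, so that rarely used
    terms can still be selected later (assumed stand-in, see lead-in).
    """
    new_tau = {}
    for term, value in tau.items():
        value = (1.0 - rho) * value          # evaporation
        if term in used_terms:
            value += quality                 # reinforcement by rule quality
        new_tau[term] = max(value, tau_min)  # lower bound on pheromone
    return new_tau


def transition_probabilities(tau, eta, alpha=1.0, beta=2.0):
    """Probability of adding each term, from pheromone and heuristic factor."""
    weights = {t: (tau[t] ** alpha) * (eta[t] ** beta) for t in tau}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}


# Hypothetical terms with their pheromone (tau) and heuristic factor (eta).
tau = {"t1": 1.0, "t2": 1.0, "t3": 0.005}
eta = {"t1": 0.5, "t2": 0.8, "t3": 0.2}

tau = update_pheromone(tau, used_terms={"t2"}, quality=0.6)
probs = transition_probabilities(tau, eta)
print(tau)
print(probs)
```

The reinforced term becomes the most likely choice, while the floor on pheromone keeps the probability of the weakest term above zero, which is one simple way to preserve global search ability.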
In the deep optimization mining of high-dimensional abnormal data in the Internet of things, the feature response signal of the abnormal data is extracted with the improved ant colony algorithm, the gain of the abnormal data is determined, and the judgment threshold of the abnormal data is obtained.
If the high-dimensional abnormal data channel signal of the Internet of things
terminal is expressed as
In formula (11), the value of genetic diffusion
regulation is expressed by
In formula (12), the ant colony cross-regulation values
of high-dimensional abnormal data in the Internet of things are expressed by
In formula (13), the scale parameters of the high-dimensional abnormal data
training samples of the Internet of things are represented by
In the above mining process, the depth of the updated mining position in the ant
colony search is insufficient for high-dimensional abnormal data mining in the
Internet of things. Therefore, an objective function is constructed to optimize the
mining depth, as follows:
In formula (15), a represents the
current depth of data mining, b represents the number of items
of this nature,
With the improved ant colony algorithm, the feature response signal of the high-dimensional abnormal data in the Internet of things is obtained, the gain of the abnormal data attribute set is calculated, and the judgment threshold of the abnormal data is determined, which completes the deep mining of the high-dimensional abnormal data.
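The role of the judgment threshold can be illustrated with a minimal sketch. The threshold formula of this section is not reproduced here, so the sketch assumes a simple statistical stand-in: a record is judged abnormal when its response score exceeds the mean score plus `k` standard deviations. The scores, the value of `k` and the function names are hypothetical.

```python
import math


def judgment_threshold(scores, k=2.0):
    """Assumed threshold: mean plus k population standard deviations."""
    mean = sum(scores) / len(scores)
    std = math.sqrt(sum((s - mean) ** 2 for s in scores) / len(scores))
    return mean + k * std


def flag_abnormal(scores, threshold):
    """Indices of records whose response score exceeds the threshold."""
    return [i for i, s in enumerate(scores) if s > threshold]


# Hypothetical feature response scores, one per record.
scores = [0.9, 1.1, 1.0, 0.95, 1.05, 4.0, 1.0, 0.98]

threshold = judgment_threshold(scores)
abnormal = flag_abnormal(scores, threshold)
print(threshold, abnormal)
```

Only the record with the clearly larger response score is flagged in this toy input. Any other threshold rule could be substituted in `judgment_threshold` without changing the rest of the sketch.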
Experimental analysis
Design of experimental scheme
To verify the effectiveness of the deep mining method for high-dimensional abnormal data in the Internet of things based on the improved ant colony algorithm, the experiment uses a computer with an Intel Core 2 Duo CPU at 2.94 GHz, 32.0 GB of memory, an 800 GB hard disk and the 64-bit Windows 7 operating system, together with the DARPA database and the VC.NET 2005 compiler. The deep mining method is built on the CloudSim simulation platform, and the simulation results are compared and analyzed.
To verify the effectiveness of the method, the parameters are set according to actual needs. The experimental data parameter settings are shown in Table 1.
Table 1. Parameter setting of experimental data
The experimental data are selected from the Museum, glass, balance scale and nurse datasets, whose data are consistent with the research object of this experiment. The specific characteristics of the datasets are shown in Table 2.
Table 2. Specific characteristics of data set
In this experiment, 2000 records were randomly selected, and every reported result
is the average of 20 experimental runs. The mining accuracy is calculated
as:
In the formula,
The formula of mining time is:
In the formula,
Comparison of depth mining error of high-dimensional abnormal data in Internet of things
Under the above simulation environment, experimental parameters and dataset characteristics, the method in reference [15], the method in reference [12] and the proposed method are compared. The error comparison results of the different methods for deep mining of high-dimensional abnormal data in the Internet of things are shown in Fig. 3.

Fig. 3. Error comparison results of different methods for deep mining of high-dimensional abnormal data in the Internet of things.
According to Fig. 3, when the number of iterations is 500, the average error of the method in reference [15] is 1.35%, the average error of the method in reference [12] is 0.92%, and the average error of the proposed method is only 0.48%. The deep mining error of the proposed method is therefore the smallest, which shows that it effectively improves the deep mining accuracy of high-dimensional abnormal data in the Internet of things. The reason is that, according to the cumulative variance contribution rate of the high-dimensional abnormal data characteristics, the proposed method uses the improved ant colony algorithm and fuzzy theory to establish the training samples of the abnormal data, obtain the judgment threshold, and deeply mine the high-dimensional abnormal data of the Internet of things.
To verify the mining time of the proposed method, the method in reference [15], the method in reference [12] and the proposed method are compared. The comparison results of the deep mining time of high-dimensional abnormal data in the Internet of things with different methods are shown in Fig. 4.

Fig. 4. Time comparison results of different methods for deep mining of high-dimensional abnormal data in the Internet of things.
Fig. 4 shows that the deep mining time of all methods increases with the number of high-dimensional data samples. When the number of high-dimensional data samples is 500, the deep mining time is 27.5 s for the method in reference [15], 34.2 s for the method in reference [12], and only 17.6 s for the proposed method. The proposed method therefore requires the shortest deep mining time. This is because the proposed method is based on the improved ant colony algorithm: by updating the ant colony pheromone and changing the state transition probability, it improves the search efficiency, reduces the calculation time, and effectively shortens the deep mining time of high-dimensional abnormal data in the Internet of things.
To further verify the deep mining capacity of the proposed method, the method in reference [15], the method in reference [12] and the proposed method are compared. The comparison results of the deep mining capacity of high-dimensional abnormal data in the Internet of things with different methods are shown in Table 3.
Table 3. Comparison results of deep mining capacity of high-dimensional abnormal data of Internet of things with different methods
According to Table 3, the deep mining capacity of all methods increases with the number of high-dimensional data samples. When the number of high-dimensional data samples is 500, the deep mining capacity is 39 T for the method in reference [15], 32 T for the method in reference [12], and 53 T for the proposed method. This is because the proposed method uses the improved ant colony algorithm to extract the feature response signal of the high-dimensional abnormal data, determines the judgment threshold of the abnormal data in the Internet of things, and constructs an objective function to optimize the mining depth. Therefore, the proposed method has the largest capacity for deep mining of high-dimensional abnormal data in the Internet of things.
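The reported comparisons can be put on a common scale by computing the relative change of the proposed method against each reference method. The sketch below is only a worked calculation on the figures quoted in the text (500 iterations for the error, 500 samples for time and capacity); the dictionary layout and function name are illustrative choices.

```python
def relative_change(baseline, proposed):
    """Relative change of the proposed value against a baseline, in percent."""
    return (proposed - baseline) / baseline * 100.0


# Values quoted in the text; units as reported there.
reported = {
    "average error": {"ref15": 1.35, "ref12": 0.92, "proposed": 0.48},
    "mining time (s)": {"ref15": 27.5, "ref12": 34.2, "proposed": 17.6},
    "mining capacity (T)": {"ref15": 39.0, "ref12": 32.0, "proposed": 53.0},
}

changes = {
    metric: {
        ref: relative_change(values[ref], values["proposed"])
        for ref in ("ref15", "ref12")
    }
    for metric, values in reported.items()
}

for metric, per_ref in changes.items():
    for ref, change in per_ref.items():
        print(f"{metric} vs {ref}: {change:+.1f}%")
```

Negative values indicate a reduction (lower error or shorter time), and positive values indicate an increase (larger capacity). For example, the mining time is 36% shorter than that of the method in reference [15].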
Conclusion
This paper proposes a deep mining method for high-dimensional abnormal data in the Internet of things based on an improved ant colony algorithm. The method preprocesses the high-dimensional abnormal data in the Internet of things, extracts the correlation feature quantity of the data information, and classifies the features. The ant colony algorithm is improved by updating the ant colony pheromone and changing the state transition probability. With the improved algorithm, the feature response signal and the judgment threshold of the high-dimensional abnormal data are obtained, so as to realize the deep mining of high-dimensional abnormal data in the Internet of things. The method effectively improves the search efficiency and data mining accuracy, reduces the calculation time, and avoids falling into local optima and system stagnation. Compared with traditional methods, this method has the following advantages:
(1) The average error of the high-dimensional abnormal data mined by the proposed method is only 0.48%, which indicates reliable mining results;
(2) The time of mining high-dimensional abnormal data with the proposed method is always less than 20 s, so the mining speed is fast;
(3) The capacity of high-dimensional abnormal data mined by the proposed method reaches 53 T at 500 samples, which is larger than that of both comparison methods.
Acknowledgements
This work was supported by the National Key R&D Program of China under grant no. 2018YFB0105205.
