Abstract
To overcome the poor timeliness and low mining accuracy of traditional methods, this paper designs a bit-object-based maximum frequent pattern mining method for intensive cloud computing data. The support number is first judged according to the bit object of the maximum frequent pattern. Intensive cloud computing data is then collected according to the difference between the load value of the cloud data and the true load value, which improves the accuracy of subsequent mining results. Finally, the maximum frequent pattern of the data is mined by combining the bit object. Experimental results show that the maximum time to generate mining results is only 4.6 s, the maximum bit error rate of the output results is only 7%, and the maximum memory occupancy is only 3.90%. These results indicate that the method is well suited to practical mining applications.
Introduction
With the rapid development of the Internet, cloud computing and communication technology, the amount of data in the network is increasing rapidly. In the intensive cloud computing environment, data is characterized by large volume, many modes and rapid growth. Unlike static data in the network, an intensive cloud computing data sequence grows rapidly over time. Because of this, intensive cloud computing data is difficult to store completely and apply effectively [4, 13]. If the association rules of the data series are mined, the storage and application efficiency of intensive cloud computing data can be effectively improved. In data association rule mining, maximum frequent pattern mining is a key step [3, 5]. Effective mining of the maximum frequent patterns of data has therefore drawn close attention from related fields.
Reference [1] designs a fast frequent pattern mining method based on coding lists. The method first builds a node coding model to improve the bitmap representation of patterns and establishes a bitmap tree in the model, using the bitmap tree nodes as the structure for collecting frequent data. Fast frequent pattern mining is then achieved through a support-counting pruning strategy. In practical application, however, this method generates mining results slowly. Reference [8] designs a data frequent pattern mining method based on correlation measurement. After establishing a tree structure, this method assigns a measurement parameter to the tree so as to improve the accuracy of mining results, and then uses the uncertain confidence index to mine the frequent patterns in the data. In practical application, however, this method has high memory occupancy. Reference [7] designs a Hadoop-based data frequent pattern mining method. This method improves the efficiency of data mining in stand-alone mode by combining a pruning strategy with compressed data streams, and then mines frequent patterns of data on the Hadoop platform using the Kulczynski measure. In practical application, however, the output results of this method have a high bit error rate.
In view of the shortcomings of traditional methods, and in order to improve the ability to analyze and apply intensive cloud computing data, this study designs a new maximum frequent pattern mining method based on bit objects. The design idea is as follows. First, the theory of maximum frequent patterns and bit objects is analyzed, and the support number of the maximum frequent pattern is judged according to its bit object. This reduces the amount of computation in the subsequent mining process and thus improves the timeliness of the method. Second, intensive cloud computing data is collected according to the difference between the cloud data load value and the true load value, which lays a foundation for improving the accuracy of subsequent mining results. Finally, the maximum frequent pattern tree of the data is constructed, and the maximum frequent pattern of the data is mined with bit objects.
Maximum frequent patterns and bit-object theory analysis
This study analyzes the theory of maximum frequent pattern and bit object, and then judges the number of support for frequent pattern according to the bit object of maximum frequent pattern, so as to effectively reduce the computation amount of subsequent mining process and improve the timeliness of mining process.
Frequent pattern and maximum frequent pattern analysis
Suppose there is a transaction database E that contains n transactions, each with its own tag number, denoted as
For a given transaction database E, a minimum support count σ and a minimum support τ, the item set
On this basis, the frequent itemsets in the transaction database E are arranged according to its minimum support τ from small to large to form the list of frequent itemsets. In this list, the item collection
Any maximum frequent pattern set is composed of items in the frequent item set list, and all subsets of the maximum frequent patterns are also frequent patterns.
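The definitions above can be illustrated with a small brute-force sketch. It is not the mining procedure of this paper; it only shows, on a toy database, what distinguishes a frequent pattern from a maximum frequent pattern. The function names, the toy transactions and the threshold `min_count=3` are assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_count):
    """Return {itemset: support count} for every itemset meeting min_count."""
    items = sorted({item for t in transactions for item in t})
    frequent = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            count = sum(1 for t in transactions if cset <= t)
            if count >= min_count:
                frequent[cset] = count
                found = True
        if not found:
            break  # no frequent set of this size, so no larger one exists
    return frequent


def maximal_itemsets(frequent):
    """Keep only frequent itemsets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < other for other in frequent)]


# Toy transaction database (hypothetical data).
transactions = [frozenset(t) for t in (
    {"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"},
)]
frequent = frequent_itemsets(transactions, min_count=3)
maximal = maximal_itemsets(frequent)
```

In this toy database every single item and every pair is frequent, while the triple is not, so the pairs are the maximum frequent patterns and each of their subsets is itself frequent.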
Bit object data format and its operations
A bit-string object is a bit-string format that represents the maximum frequent item set with a fixed number of binary bits. The fixed length is determined by the number of item sets in the list of frequent item sets. Following the order of the frequent item sets by minimum support, each binary bit is given a weight from small to large [2, 12, 19].
Each binary bit takes the value 0 or 1: if the item set at that position in the frequent item set list is present, the bit is 1; otherwise it is 0 [6, 18]. Once every position in the frequent item set list has been evaluated and weighted, the resulting binary string is the bit object of the frequent item set.
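A minimal sketch of this encoding is given below. It assumes that each position in the list holds a single item, that ties in support are broken alphabetically, and that the support number of a pattern can be counted with a bitwise AND against the bit objects of the transactions. These choices and all function names are illustrative assumptions, not the equations of this paper.

```python
from collections import Counter


def build_item_order(transactions):
    """Order items by support, smallest first; position i carries weight 2**i."""
    support = Counter(item for t in transactions for item in t)
    return sorted(support, key=lambda item: (support[item], item))


def to_bit_object(itemset, order):
    """Encode an itemset as an integer whose bit i is 1 when order[i] is present."""
    bits = 0
    for position, item in enumerate(order):
        if item in itemset:
            bits |= 1 << position
    return bits


def to_bit_string(bits, length):
    """Render a bit object as a fixed-length string, highest weight first."""
    return format(bits, f"0{length}b")


def support_count(pattern_bits, transaction_bits):
    """Count transactions whose bit object contains every bit of the pattern."""
    return sum(1 for t in transaction_bits if (t & pattern_bits) == pattern_bits)


# Toy data (hypothetical).
transactions = [{"a", "c", "d"}, {"a", "b"}, {"a", "b", "c", "d"}, {"d"}]
order = build_item_order(transactions)
encoded = [to_bit_object(t, order) for t in transactions]
pattern = to_bit_object({"a", "c"}, order)
pattern_string = to_bit_string(pattern, len(order))
pattern_support = support_count(pattern, encoded)
```

Because the containment test is a single AND and comparison per transaction, the support number can be judged without rescanning the item lists, which is the kind of saving the bit object is meant to provide.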
It is assumed that
According to the results shown in Equation (2), the supporting number of frequent itemsets
Mining the maximum frequent pattern of intensive cloud computing data based on bit-object
In this section, intensive cloud computing data is collected according to the difference between the load value of the cloud data and the true load value, so as to improve the accuracy of subsequent mining results. The maximum frequent pattern tree of the data is then constructed, and the maximum frequent pattern of the data is mined with the bit object.
Collect intensive cloud computing data
Due to the large amount of intensive cloud computing data, it is necessary to accurately collect the intensive cloud computing data in order to reduce the amount of computing. In this study, the load truth value of each cloud data source in the period of t on day q is denoted as
Where, p represents the number of cloud data sources, while
If multiple data nodes in a cloud data source meet the constraint conditions in Equation (4) in period t on day q, this indicates that the cloud data source contains intensive cloud computing data, and the collection of intensive cloud computing data can be completed accordingly.
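The selection step can be sketched as follows. Since the exact form of the constraint is not reproduced here, the sketch assumes, purely for illustration, that a node satisfies it when the absolute difference between its load value and the true load value of its source stays within a tolerance, and that "multiple" means at least two nodes. The function name, the parameters `tolerance` and `min_nodes`, and the sample values are all hypothetical.

```python
def sources_with_intensive_data(measured, true_load, tolerance, min_nodes=2):
    """Return the sources in which at least min_nodes nodes meet the constraint.

    measured:  {source: {node: load value in the period}}
    true_load: {source: true load value in the same period}
    Assumed constraint: |load value - true load value| <= tolerance.
    """
    selected = []
    for source, node_loads in measured.items():
        matching = [
            node for node, load in node_loads.items()
            if abs(load - true_load[source]) <= tolerance
        ]
        if len(matching) >= min_nodes:
            selected.append(source)
    return selected


# Hypothetical load values for one period.
measured = {
    "s1": {"n1": 0.82, "n2": 0.79, "n3": 0.40},
    "s2": {"n1": 0.30, "n2": 0.95},
}
true_load = {"s1": 0.80, "s2": 0.60}
selected = sources_with_intensive_data(measured, true_load, tolerance=0.05)
```

Only data from the selected sources would then be passed on to the mining stage, which is how the collection step reduces the amount of computation.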
When collecting data, attention should be paid to the fact that different sources of data vary greatly in their characteristics, leading to different degrees of integration of the information contained therein [15–17]. Therefore, the data characteristics should be strongly correlated during the collection process.
Build the data maximum frequent pattern tree
After collecting intensive cloud computing data, due to the large amount of information in transaction database E, if all frequent patterns are directly extracted from transaction database E, the subsequent computation will be greatly increased. Therefore, the extracted frequent patterns are compressed and integrated. If only the maximum frequent patterns in the transaction database E are compressed and integrated, the result is the data maximum frequent pattern tree.
The maximum frequent pattern tree is a classical structure in data association rule analysis. Constructing it requires only the minimum support and the minimum support count, since these are the only quantities needed when frequent patterns are compressed and integrated. This study therefore determines the minimum support count after scanning the database and searching all frequent item sets. On this basis, to facilitate traversal of the maximum frequent pattern tree, the corresponding item header table is created. Each frequent item set in the item header table points, through a node chain, to its occurrences in the pattern tree in turn [11, 20]. The item sets in the maximum frequent pattern tree are connected in turn by the nodes of the pattern tree. After the database is scanned, the maximum frequent pattern tree is constructed from the root node, the prefix subtrees and the item header table. The structure of the maximum frequent pattern tree is shown in Fig. 1.

Fig. 1. Schematic diagram of the maximum frequent pattern tree structure.
Through the above process, the conversion of the intensive cloud computing data structure is realized, which fundamentally improves the mining efficiency of the data maximum frequent pattern.
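The tree structure described above (root node, prefix subtrees and an item header table with node chains) can be sketched in the style of a conventional FP-tree. The sketch assumes that items within a transaction are inserted in descending order of support, which is the common FP-tree convention rather than something stated in this paper. The class and function names are hypothetical.

```python
from collections import Counter


class Node:
    """One node of the pattern tree."""

    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}
        self.next = None  # node chain: next node holding the same item


def build_pattern_tree(transactions, min_count):
    """Scan the database, then insert each transaction as a prefix path."""
    support = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in support.items() if c >= min_count}
    root = Node(None, None)
    header = {item: None for item in frequent}  # item header table
    tails = {}  # last node of each chain, for constant-time appends
    for t in transactions:
        ordered = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                if header[item] is None:
                    header[item] = child
                else:
                    tails[item].next = child
                tails[item] = child
            child.count += 1
            node = child
    return root, header


def item_total(header, item):
    """Follow the node chain of an item and sum the counts along it."""
    total, node = 0, header[item]
    while node is not None:
        total += node.count
        node = node.next
    return total


# Toy database (hypothetical data).
transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
root, header = build_pattern_tree(transactions, min_count=3)
```

Following the node chain of an item from the header table visits every occurrence of that item in the tree, so its total support can be recovered without rescanning the database.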
On the basis of the above intensive cloud computing data collection and data structure transformation, this study performed set pair analysis on the maximum frequent item set of the data, and extracted the average set pair characteristic quantity of the maximum frequent item set of the data according to the analysis results.
In the set pair cluster of frequent item sets of intensive cloud computing data, the value in each data sequence is subtracted by 1 to transform the data sequence into a space-frequency structure. Discrete Fourier transform is applied to the weight of the structure, and the iteration step size is adjusted. If the first data values
Here, J represents the inequality constraint condition, α represents the weighted output amplitude of each frequent item set, and ρ represents the adaptive adjustment parameter of the maximum frequent item set of the data. On this basis, the effective probability calculation equation of the scattering cluster is used to complete the set pair analysis for frequent item set mining of the data, and the regulatory factor η of the set pair analysis process is obtained as follows:
After the analysis of the set pair, the adaptive filter is used to suppress the intercode interference of the frequent item set according to the regulating factor. Assuming that s represents the number of codes of the broadband beam in the process of data transmission, the number of codes of the data frequent item set beam is:
At this point, if the transmission code rate of the frequent item set is less than the theoretical modulation rate, it is necessary to carry out widening processing on the main lobe beam of the data calculation process. Thus, the time-frequency joint distribution characteristics of the maximum frequent item set of the data are obtained as follows:
In conclusion, through the set pair analysis of the maximum frequent item set of intensive cloud computing data, the number of codes output has a direct proportional relationship with the number of codes s of the broadband beam in the data transmission process, which can fundamentally improve the mining ability of the maximum frequent item set of data.
After obtaining the time-frequency joint distribution characteristics of the maximum frequent item sets, the lower limit of the confidence of frequent pattern is introduced, and the frequent pattern mining is completed by analyzing the relationship between frequent pattern item sets.
For any candidate frequent mode I, if its support is set as
According to the calculation results of the lower limit of confidence of submode
In the formula,
Following the above steps, the frequent pattern graph corresponding to each data item in the transaction database is traversed, which completes the mining of the maximum frequent pattern of intensive cloud computing data.
Experimental verification and result analysis
In order to verify the practical applicability of the bit-object-based maximum frequent pattern mining method for intensive cloud computing data, the following simulation experiment process is designed.
Experimental environment and scheme design
In order to ensure that the experimental process is clear and the results are true and reliable, the following experimental scheme is designed:
(1) Experimental environment: The experiment was completed on the Matlab platform, the host device was equipped with Windows 10 operating system, the host processor model was I9-9980HK, the CPU main frequency was 2.56 GHz, and the programming language was Visual C++. The data used in the experiment came from Oracle database. Before the formal experiment, the experimental data were normalized. Other experimental environmental parameters are as follows: Length of frequent itemsets for intensive cloud computing data: 3000; Central acquisition frequency: 10 GHz; Noise gain during data transmission: −15 dB; Structural beam weights: 1.25; Equilibrium coefficient: 1.75.
(2) Comparison methods: To avoid drawing conclusions from a single set of results, the data frequent pattern mining method based on correlation measurement (Method of Reference [8]) and the data frequent pattern mining method based on Hadoop (Method of Reference [7]) are used as comparison methods.
Experiment indicators
(1) Time consumed in outputting the mining results. This is a basic verification index that reflects the timeliness of different maximum frequent pattern mining methods. The less time it takes to output the mining results, the higher the timeliness of the mining method.
(2) The bit error rate of the output result of the maximum frequent itemset of data. This index can directly reflect the effectiveness of mining methods for maximum frequent patterns of different data. The lower the bit error rate of the maximum frequent pattern mining, the more effective the mining method is.
(3) Memory footprint. This index is an extensibility index, which can reflect the resource utilization of different mining methods of maximum frequent patterns of data, thus reflecting the practical effect of different methods to some extent.
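The three indicators could be measured along the following lines. This is only a sketch under stated assumptions: the bit error rate is taken to be the fraction of positions at which an output bit string differs from a reference bit string, and memory is reported as the peak number of bytes traced by Python's `tracemalloc` rather than as a percentage of system memory. Neither definition is given in this paper, and the helper names are hypothetical.

```python
import time
import tracemalloc


def bit_error_rate(output_bits, reference_bits):
    """Fraction of positions where two equal-length bit strings differ."""
    if len(output_bits) != len(reference_bits) or not reference_bits:
        raise ValueError("bit strings must be non-empty and of equal length")
    errors = sum(1 for o, r in zip(output_bits, reference_bits) if o != r)
    return errors / len(reference_bits)


def measure(func, *args):
    """Run func and return (result, elapsed seconds, peak traced bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, peak


# Usage with a stand-in workload instead of a real mining method.
result, elapsed, peak = measure(sorted, [3, 1, 2])
rate = bit_error_rate("0110", "0100")
```

Wrapping each mining method in the same `measure` call keeps the timing and memory figures comparable across methods.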
Comparison and analysis of results
Comparison test of time spent in the output process of mining results
First, the output time of mining results of different data maximum frequent pattern mining methods was tested, and the comparison results were shown in Table 1.
Table 1. Comparison of time consumed in the output process of mining results of different methods (s)
The results in Table 1 show that the time taken to generate mining results differs among the three methods, but in all cases it increases with the number of experimental iterations. The maximum time to generate mining results is 9.1 s for Method of Reference [8], 9.3 s for Method of Reference [7], and only 4.6 s for the method of this paper. The method of this paper therefore takes less time to generate the maximum frequent pattern mining results of intensive cloud computing data, indicating that it has higher timeliness.
The reason for the above results is that on the basis of the analysis of the theory of maximum frequent pattern and bit object, the support number of the maximum frequent pattern is judged according to the bit object of the maximum frequent pattern, which effectively reduces the computation amount in the subsequent mining process and thus greatly improves the timeliness of the mining method.
In order to highlight the performance of Method of This Paper in high-precision mining, the bit error rates of the output results of the maximum frequent itemsets of data obtained by Method of This Paper, Method of Reference [8] and Method of Reference [7] are compared. Bit error rate comparison results of output results of different methods are shown in Fig. 2.

Fig. 2. Bit error rate comparison of the output results of different methods.
The results in Fig. 2 show that the bit error rate of the output results differs among the three methods, and that its variation does not follow a fixed rule. The maximum bit error rate of the output results is 14% for Method of Reference [8] and 9.5% for Method of Reference [7], while that of the method of this paper is only 7%. The bit error rate of the output results of the method of this paper is therefore lower, indicating that its mining results are more effective.
The reason for the above results is that Method of this paper accurately collects intensive cloud computing data according to the difference between the load value of cloud data and the true value of load, which reduces the interference of unnecessary data and lays a foundation for improving the accuracy of the mining results of maximum frequent patterns.
The memory occupancy rate is taken as the index to test the resource utilization of the different methods, so as to reflect their practical effect. The changes in memory occupancy of the different methods during the experiment are shown in Table 2.
Table 2. Changes in memory occupancy of different methods (%)
By analyzing the results shown in Table 2, it can be seen that the memory occupancy of the three different methods increases with the increase of the number of experimental iterations. Among them, the maximum memory footprint of Method of Reference [8] and Method of Reference [7] is 9.32%, while the maximum memory footprint of Method of this paper is only 3.90%. In contrast, the memory occupancy rate of Method of this paper is lower, indicating that the feasibility of this Method is higher in practical application.
(1) In order to effectively improve the storage and application efficiency of intensive cloud computing data, this study designed a data maximum frequent pattern mining method based on bit objects. This method can judge the support number according to the bit object of the maximum frequent pattern, which can fundamentally reduce the computation amount in the subsequent mining process and improve the timeliness of the mining method. Then, according to the difference between the load value of cloud data and the true value of load, the intensive cloud computing data are accurately collected to lay a foundation for improving the accuracy of mining results. Then, the maximum frequent pattern tree of data is constructed to accurately mine the maximum frequent pattern of data with bit objects.
(2) The experimental results show that the method takes only 4.6 s to generate the mining results, the maximum bit error rate of the output results is only 7%, and the maximum memory occupation rate is only 3.90%. The above results indicate that this method has high timeliness and accuracy, and is more suitable for practical application because of its low memory occupancy.
(3) Although this method performs well at the present stage, within the overall process of applying and analyzing intensive cloud computing data it can only support classification based on maximum frequent pattern mining. How to improve this classification process by assigning appropriate weights, so as to strengthen its effect on data analysis, remains a problem for further study.
Acknowledgement
This work was supported by research subject of Heilongjiang Bayi Agricultural University.
