Abstract
With the rapid development of database and Internet technologies, collecting and storing large volumes of data has become possible. However, it is often impossible to analyze the valuable information contained in the data correctly, and extracting that information is becoming more difficult. We therefore face a situation of “rich data and scarce knowledge”. Traditional information processing technology can no longer meet practical needs. There is an urgent need for more capable and effective information processing techniques that help us extract the information we need from big data and guide us to make the right decisions. Data mining technology was born against this background. It is one of the effective methods for closing the gap between rich data and scarce knowledge, and it is also one of the main research topics in the field of information science. Related research and applications have greatly improved people’s decision-making ability. Data mining has been recognized as one of the frontiers of data research and has very broad application prospects. Large databases often contain attributes that are redundant or unnecessary for rule discovery, so removing redundant attributes can greatly improve the clarity of the knowledge latent in the system and reduce the time complexity of finding rules. At the same time, it enables efficient operation and improved adaptability. Because the structure of a neural network is variable, it has strong self-organization, self-learning, nonlinearity, and high fault tolerance, but its ability to express and interpret knowledge is very poor, and its network parameters lack a clear physical meaning. Therefore, forming a fuzzy neural network that combines the characteristics of fuzzy systems and neural networks has become an inevitable trend. Exploring the organic combination of rough sets and fuzzy neural networks is thus of great significance for data mining research.
This paper proposes a data mining method based on the combination of rough set theory and fuzzy neural network technology. Rough sets are used to discover rules in the database, these rules determine the initial structure of the fuzzy neural network, and the training data are then used to train the network. Because the fuzzy neural network constructed in this way has a topology that reflects the distribution of the data from the beginning, the network scale can be greatly reduced and the training speed improved.
Introduction
Faced with the explosive growth of data and databases in the information society, the human ability to analyze data and extract useful information from them falls far short of actual needs. Large data sets that are rich in information and complex in structure are common in all areas of business, science, and engineering, and making good use of them allows companies to maximize their benefits. Although a database management system (DBMS) can effectively implement management functions such as real-time data entry, retrieval, storage, and maintenance, it cannot discover association rules in the data or predict future trends. Existing database systems cannot provide the functions most needed for synthesizing and analyzing data. In addition, effective market analysis must focus on issues that are closely related to decision making, and a DBMS is not designed around these issues. Therefore, there is an urgent need for a new technical tool that can intelligently and automatically transform data into useful information and knowledge. Demand drives development: the combination of database management systems with machine learning from artificial intelligence led to the birth of new technologies for knowledge discovery in databases. Knowledge discovery is an interdisciplinary subject involving machine learning, pattern recognition, statistics, large databases, knowledge extraction, high performance computing, artificial intelligence, and expert systems. Data mining has broad connotations, and meeting its theoretical and technical requirements is very difficult. Nevertheless, the computer industry quickly took up the concept of data mining. Data mining figuratively treats large data sets as a mine of valuable information and extracts or discovers useful information through effective knowledge discovery techniques.
In data mining, a long-standing focus has been on intelligent ways to acquire knowledge, express knowledge, and make informed decisions from a large number of observations and experiments, especially when the raw data are inaccurate, incomplete, lacking prior knowledge, and contain a lot of noise.
This paper addresses open problems of data mining methods based on rough set theory. Rough sets are used to discover rules, the initial structure of the fuzzy neural network is determined from these rules, and the training data are used to train the network. The main idea is to first simplify the database using rough set theory, then use the neural network to complete the attribute reduction of the database during self-learning and to filter noise out of the data, and finally return the filtered data to rough set theory. The final mined knowledge is obtained from the simplified database through rule extraction. The effectiveness of the proposed method is verified by comparison with existing data mining methods. The learning algorithm model based on rough set theory and fuzzy neural network technology is characterized by fast learning, strong fault tolerance, and high interpretability.
Related work
Fuzzy logic systems are easy to understand, while neural networks have strong adaptive learning capabilities. How to combine fuzzy logic systems with neural networks and use the advantages of both to improve the learning ability and expressive ability of the whole system is a question well worth attention [1–4]. Fuzzy neural networks are an emerging technology born in this context. In recent years they have become a focal point in intelligent control and automation as well as in data mining. They combine fuzzy expression, connectionist learning, and distributed information processing [5]. In data mining, knowledge is acquired from a large number of observations and experiments, then expressed and used to make informed decisions. Handling data that are inaccurate, incomplete, lacking prior knowledge, and noisy has always been a focus of intelligent data mining research. Rough set theory and fuzzy neural networks have become important research tools in this field by virtue of their distinctive methods. Because their research methods differ, they have different characteristics. When rough set theory is combined with fuzzy neural network technology, the indiscernibility relation of rough set theory and its knowledge reduction methods are applied to find reduced, approximate rules from a large amount of original data. The fuzzy neural network model is then established according to the obtained rules, which also determine the connections between hidden layer nodes [6]. This gives the network a good topology from the beginning and greatly reduces its scale.
A random 7:3 split is used to assess the model’s goodness of fit and predictive power. Robustness was evaluated as the stability of model performance under changes to the data set, through three repeated rounds of training and testing [7]. In the case where the point estimate of the confounding effect is zero, the adjusted analysis retains power, while the unadjusted analysis greatly reduces power. Although full adjustment for the true confounding factors has the best performance, propensity score matching under moderate misspecification significantly improves the type 1 error and power compared with no adjustment. A clinical review and safety study should then be conducted to quantify the extent of the impact and to target confounding control [8]. Zou et al. proposed two learning processes to train an evolving fuzzy neural network. First, the K-means method is used to divide the input samples into clusters, and a Gaussian fuzzy membership function is designed for each cluster to measure the degree of membership of a sample with respect to the cluster center. The prediction performance of the proposed model is then compared with that of six traditional models, including an artificial neural network, a support vector machine, an autoregressive integrated moving average model, and a vector autoregressive model. Lin et al. proposed an effective method for image defogging that combines a fuzzy inference system with a neural network filter. Finally, the brightest 1% of atmospheric light is used to calculate the color vector of the atmospheric light and eliminate the color cast [9–11].
Methods
Data mining methods
Data mining usually refers to extracting information or patterns that are implicit, previously unknown, and potentially useful from a data warehouse. Data mining can not only learn existing knowledge from hundreds of thousands of incomplete, noisy, and vague data records, but also discover new, previously unknown knowledge, present the knowledge obtained, and support understanding, storing, and applying it. Therefore, data mining has been widely recognized from the beginning and is one of the frontier research directions in computer science and information decision-making [12]. Data mining is a new application area in database research. It combines theory and technology from many fields, such as databases, artificial intelligence, machine learning, and statistics. It can identify potential links between data in order to predict future trends and behaviors and thus support decisions.
(1) Brief overview of data mining
In this information age of “data explosion”, people want to be able to analyze data at a higher level and make better use of them. Over time, data mining technology has become a reality. Data preparation is the selection and integration of data from various data sources. Rule mining uses some method to find regularities in the data. Rule representation presents the discovered rules in a form as close as possible to the user’s habits. The main goal of data mining is to help decision makers find potential associations, characteristics, and trends, or to identify factors that might otherwise be overlooked and that are useful for forecasting and decision making [13].
Data mining is an advanced process. Its core technologies are artificial intelligence, machine learning, and statistics, but it is not a simple combination of several technologies. It integrates them into a whole, and it also needs the support of other auxiliary technologies to complete data processing. The sequence of data acquisition, preprocessing, data analysis, and result representation is a multi-step process in which the steps interact and are repeatedly adjusted, forming a spiral [14].
(2) Data mining tasks and mining techniques
The generalized knowledge that can be found by data mining includes the following types: knowledge that reflects the common features of similar things, knowledge that reflects the relationships between aspects of things, and knowledge that reflects the differences between the attributes of various things. The following focuses on four very important mining tasks and their methods: data extraction, classification discovery, clustering, and association rule discovery.
1) Data extraction
The purpose of data extraction is to condense the data and provide a compact description of it. The traditional method is to calculate statistical values for the fields of the database or to present them as graphs. Data mining mainly describes data from the perspective of data generalization. Data generalization is the process of abstracting related data in a database from a low conceptual level to a higher one. There are currently two main methods of data generalization, namely multidimensional data analysis and attribute-oriented induction [15].
2) Classification discovery
Classification is a critical step in data mining and is currently the task most widely used in business. The purpose of classification is to construct a classification function or classification model that maps data items in the database to one of the given categories. Classifiers are constructed with methods from machine learning, statistics, and neural networks.
3) Clustering
Clustering groups individuals into categories according to their similarity. The goal is to make the distance between individuals in the same category as small as possible and the distance between individuals in different categories as large as possible. Clustering methods include statistical methods, machine learning methods, neural network methods, and database-oriented methods.
4) Association rule discovery
The basic object of association rule discovery is the transaction database, which may contain a very large number of association rules. To find meaningful association rules, two thresholds must be given: a minimum support and a minimum confidence. The minimum support is the lowest frequency, in a statistical sense, that a group of items must reach in the database, and the minimum confidence is the lowest reliability that an association rule must reach.
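The two thresholds can be illustrated with a minimal sketch. The toy transactions, the item names, the threshold values, and the function names below are assumptions made for illustration only and do not come from the paper.

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(transactions: List[FrozenSet[str]],
               antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """support(A and B together) / support(A) for the rule A -> B."""
    a = frozenset(antecedent)
    s_a = support(transactions, a)
    if s_a == 0.0:
        return 0.0
    return support(transactions, a | frozenset(consequent)) / s_a


def is_strong_rule(transactions, antecedent, consequent,
                   min_support: float, min_confidence: float) -> bool:
    """A rule is kept only if it reaches both thresholds."""
    both = frozenset(antecedent) | frozenset(consequent)
    return (support(transactions, both) >= min_support
            and confidence(transactions, antecedent, consequent) >= min_confidence)


# Hypothetical toy transaction database (not taken from the paper).
TRANSACTIONS = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

if __name__ == "__main__":
    # Assumed thresholds: minimum support 0.4, minimum confidence 0.6.
    print(is_strong_rule(TRANSACTIONS, {"bread"}, {"milk"}, 0.4, 0.6))
```

In this toy database the rule bread -> milk appears in 2 of 4 transactions, and bread alone appears in 3 of 4, so the rule is judged against a support of 0.5 and a confidence of 2/3.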
(3) Evaluation criteria for data mining tools
Evaluating a data mining tool can be considered from the following aspects:
1) Number of pattern types generated and ability to solve complex problems
Increasing the amount of data is needed to improve the accuracy of the model, but it also leads to more complex problems. Combining different models and different categories of data mining systems can help us find useful patterns and reduce the complexity of the problem [16]. For example, first clustering the data and then mining the prediction patterns within each cluster is more efficient and more accurate than simply operating on the entire data set.
Data selection and transformation are often complicated by the large number of data items. Some data items are redundant and some are completely irrelevant. Their presence may hinder the discovery of important information. A very important capability of a data mining system is handling data complexity by providing tools to select the right data items and transform data values. Rough set theory and fuzzy neural networks have great advantages in this respect.
2) Ease of operation
Ease of operation is an important factor. Some tools have a graphical interface that guides users semi-automatically, while others use a scripting language. Some tools also provide data mining APIs that can be embedded in other programming languages. Some data mining tools can use SQL statements to read data directly from the database management system, which simplifies data preparation and takes advantage of the database’s capabilities. Many factors must be taken into account for a data mining tool, so it is difficult to judge its merits in the abstract. What matters most is the user’s specific needs.
Data mining analysis and improvement based on rough set
Among the many data mining methods, knowledge from mathematical statistics, machine learning, neural networks, and other fields has been applied and developed. Recently, the application of rough set theory to data mining has made great progress, and it has become one of the main methods of data mining [17]. This paper first analyzes the data mining process based on rough sets, then introduces three sub-processes of that process in detail, analyzes and compares the existing algorithms applied to them, and then introduces the concepts of class distribution list and class distribution.
(1) Data mining analysis based on rough set
By studying existing knowledge systems and fuzzy neural network systems, we can divide the fuzzy neural network system based on rough set theory into three parts: the data preprocessing part, the reduction part for attributes and attribute values, and the part for rule generation and rule synthesis.
In order to deal with large, incomplete data tables, rough set theory is combined with fuzzy neural network methods, and database technology is used so that rough set theory and sorting algorithms can process large data directly on disk. This reduces the complexity of processing and makes large-scale data mining achievable [18].
Of the three components of data mining based on rough sets, the discretization process in the data preprocessing part and the attribute reduction and attribute value reduction processes in the reduction part are extracted as three independent modules for analysis.
1) Data discretization
The data obtained through data mining analysis can be divided into two types. One is a continuous quantitative attribute that represents some measurable property of the described object, with values drawn from a continuous interval, such as temperature or length. Consider a decision table $S = (U, R, V, f)$, where $R = C \cup \{d\}$ is the set of attributes, $C = \{a_i \mid i = 1, 2, \cdots, n\}$ is called the set of condition attributes, $\{d\}$ is called the set of decision attributes, and $U = \{x_1, x_2, \cdots, x_n\}$ is the universe of discourse. Let the number of decision classes be $r(d)$. A breakpoint on the value domain $V_a$ of attribute $a$ is denoted $(a, c)$, where $a \in R$ and $c$ is a real number. Any set of breakpoints on $V_a$ divides $V_a$ into intervals. Therefore, any choice of breakpoint sets for the condition attributes defines a discretized decision table.
Discretization is actually a problem of partitioning the space formed by conditional attributes using selected breakpoints. The process of selecting breakpoints is also the process of merging attribute values. By reducing the number of attribute values and the complexity of the problem, combining attribute values also helps to improve the applicability of the rule knowledge gained in the knowledge acquisition process [19].
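The partitioning by breakpoints can be sketched as follows. This is a minimal illustration, assuming that the breakpoints of each attribute are already chosen and sorted; the attribute name `temp`, the cut values, and the function names are hypothetical and not taken from the paper.

```python
from bisect import bisect_right
from typing import Dict, List, Sequence


def discretize_value(value: float, cuts: Sequence[float]) -> int:
    """Return the index of the interval that `value` falls in.

    `cuts` must be sorted in ascending order. k breakpoints produce
    k + 1 intervals numbered 0..k. A value equal to a breakpoint is
    assigned to the interval above it.
    """
    return bisect_right(cuts, value)


def discretize_table(rows: List[Dict[str, object]],
                     cuts_by_attr: Dict[str, Sequence[float]]) -> List[Dict[str, object]]:
    """Replace each continuous attribute that has breakpoints by its interval index.

    Attributes without an entry in `cuts_by_attr` (for example the
    decision attribute) are copied unchanged.
    """
    result = []
    for row in rows:
        new_row = dict(row)
        for attr, cuts in cuts_by_attr.items():
            new_row[attr] = discretize_value(float(row[attr]), cuts)
        result.append(new_row)
    return result


if __name__ == "__main__":
    # Hypothetical decision table with one continuous condition attribute.
    table = [
        {"temp": 36.0, "d": "no"},
        {"temp": 37.2, "d": "no"},
        {"temp": 39.1, "d": "yes"},
    ]
    print(discretize_table(table, {"temp": [36.5, 38.0]}))
```

Merging two adjacent intervals corresponds to deleting the breakpoint between them, which is why selecting breakpoints and merging attribute values are two views of the same process.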
2) Attribute reduction
The concept of attribute reduction is the core of rough set theory. When discussing the reduction of a decision information system, a set of condition attributes $A$ corresponds to an equivalence relation, also called an indiscernibility relation, which partitions the universe $U$ into $U/A$. All condition attributes in the decision table constitute the condition attribute set $C$, and $U/C$ is the corresponding partition of the universe $U$. Likewise, the decision attribute set $D = \{d\}$ forms the partition $U/D$. These two partitions of the samples in the universe represent the classification knowledge formed by the condition attributes and by the decision attributes, respectively.
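As a minimal sketch of these partitions, the code below computes $U/A$ for a set of attributes and the standard rough set dependency degree (the fraction of objects in the positive region), which is one common way to check whether a subset of condition attributes preserves the classification. The toy table and all function names are assumptions for illustration; the paper does not specify an implementation.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Set

Row = Dict[str, Hashable]


def partition(table: List[Row], attrs: Sequence[str]) -> List[Set[int]]:
    """U/attrs: group the indices of objects that agree on every attribute in attrs."""
    blocks: Dict[tuple, Set[int]] = defaultdict(set)
    for index, row in enumerate(table):
        blocks[tuple(row[a] for a in attrs)].add(index)
    return list(blocks.values())


def positive_region(table: List[Row], cond: Sequence[str], dec: Sequence[str]) -> Set[int]:
    """Objects whose condition class lies entirely inside one decision class."""
    decision_blocks = partition(table, dec)
    positive: Set[int] = set()
    for block in partition(table, cond):
        if any(block <= d for d in decision_blocks):
            positive |= block
    return positive


def dependency(table: List[Row], cond: Sequence[str], dec: Sequence[str]) -> float:
    """Degree to which the decision attributes depend on the condition attributes."""
    if not table:
        return 0.0
    return len(positive_region(table, cond, dec)) / len(table)


# Hypothetical decision table: attribute b alone determines d.
TABLE = [
    {"a": 0, "b": 0, "d": "no"},
    {"a": 0, "b": 1, "d": "yes"},
    {"a": 1, "b": 0, "d": "no"},
    {"a": 1, "b": 1, "d": "yes"},
]

if __name__ == "__main__":
    print(dependency(TABLE, ["a", "b"], ["d"]))
    print(dependency(TABLE, ["b"], ["d"]))
    print(dependency(TABLE, ["a"], ["d"]))
```

In this toy table, dropping attribute `a` leaves the dependency degree unchanged, so `{b}` keeps the same classification ability as `{a, b}`.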
The time complexity of an attribute reduction algorithm based on the discernibility matrix and logical operations consists of two parts: computing the discernibility matrix and performing the logical operations. The time complexity of computing the discernibility matrix is $O(MN^2)$, where $M$ is the number of condition attributes in the decision information system and $N$ is the number of samples in the decision information system. The cost of computing the discernibility matrix is already large, and the matrix must additionally be simplified to reduce the cost of the logical operations, so the total time complexity of the algorithm is greater than $O(MN^2)$.
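The source of the $O(MN^2)$ term can be seen in a minimal sketch of the matrix construction: every pair of samples is visited once, and each visit compares all $M$ condition attributes. The toy table and function names are assumptions for illustration. The sketch also reads the relative core off the singleton entries, which is a standard property of the discernibility matrix for a consistent decision table.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Hashable, List, Sequence, Set, Tuple

Row = Dict[str, Hashable]


def discernibility_matrix(table: List[Row], cond: Sequence[str],
                          dec: str) -> Dict[Tuple[int, int], FrozenSet[str]]:
    """For each pair (i, j), i < j, with different decisions, store the
    condition attributes on which the two objects differ.

    There are N(N-1)/2 pairs and each needs M comparisons, hence O(M N^2).
    """
    matrix: Dict[Tuple[int, int], FrozenSet[str]] = {}
    for i, j in combinations(range(len(table)), 2):
        if table[i][dec] != table[j][dec]:
            matrix[(i, j)] = frozenset(
                a for a in cond if table[i][a] != table[j][a]
            )
    return matrix


def core(matrix: Dict[Tuple[int, int], FrozenSet[str]]) -> Set[str]:
    """Attributes that occur as singleton entries form the relative core."""
    return {next(iter(entry)) for entry in matrix.values() if len(entry) == 1}


# Hypothetical decision table (not from the paper).
TABLE = [
    {"a": 0, "b": 0, "d": "no"},
    {"a": 0, "b": 1, "d": "yes"},
    {"a": 1, "b": 0, "d": "no"},
    {"a": 1, "b": 1, "d": "yes"},
]

if __name__ == "__main__":
    m = discernibility_matrix(TABLE, ["a", "b"], "d")
    for pair, attrs in sorted(m.items()):
        print(pair, sorted(attrs))
    print(core(m))
```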
The attribute reduction algorithm based on mutual information (the MIBARK algorithm) is a heuristic algorithm. In most cases it obtains an attribute reduction of the decision table, but it cannot guarantee a minimal one. Assuming that the algorithm uses a discernibility matrix to obtain the core of $C$ relative to $D$, $\mathrm{Core}_D(C)$, the time complexity of the algorithm is $O(MN^2) + O(N^3)$, where $M$ is the number of condition attributes of the decision information system and $N$ is the number of samples in the decision information system.
The attribute reduction algorithm based on feature selection can obtain a subset of condition attributes on which the decision attributes depend strongly. When $\beta = 0$, a relative reduction of the set of condition attributes is obtained. The time complexity of the algorithm is $O(N^2)$, where $N$ is the number of samples in the decision information system.
3) Attribute value reduction
Attribute reduction removes some redundant attributes from the decision table, but it does not remove all redundant information from the table. Attribute value reduction further simplifies the decision table and highlights the impact of key attributes and key attribute values on the decision.
(2) Improvement plan
After several years of research, algorithms based on heuristic information have proved more effective than traditional algorithms for data discretization, attribute reduction, and attribute value reduction, and have achieved satisfactory results. Heuristic reduction based on rough sets mainly includes reduction algorithms based on attribute importance, attribute frequency, conditional information, genetic algorithms, and the discernibility matrix.
The class distribution list represents $U/P$. It implicitly records each $Y_j \cap X_i$ and consists of a linked list of compatible (single-class) distributions and a linked list of incompatible (multi-class) distributions, so the conditional information can easily be obtained from the class distribution list. Because the conditional information of a compatible class is 0, only the conditional information of the incompatible class distribution list needs to be calculated.
Before explaining the use of the incompatible class distribution list to calculate the conditional information, we recall its definition. The conditional information $H(Q \mid P)$ of knowledge $Q$ ($U/\mathrm{IND}(Q) = \{Y_1, Y_2, \cdots, Y_m\}$) relative to knowledge $P$ ($U/\mathrm{IND}(P) = \{X_1, X_2, \cdots, X_n\}$) is defined as
$$H(Q \mid P) = -\sum_{i=1}^{n} p(X_i) \sum_{j=1}^{m} p(Y_j \mid X_i) \log p(Y_j \mid X_i),$$
where $p(X_i) = |X_i| / |U|$, $p(Y_j \mid X_i) = |Y_j \cap X_i| / |X_i|$, $i = 1, 2, \cdots, n$, and $j = 1, 2, \cdots, m$. To obtain the conditional information from an incompatible class distribution list, the list must be scanned from start to finish. During the scan, a temporary distribution table is created dynamically, and the conditional information is calculated and added to the total.
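The calculation can be sketched in a few lines. This is a minimal illustration computed directly from a decision table rather than from the class distribution list data structure, which the paper does not specify in detail. The logarithm base (2), the toy table, and the function name are assumptions.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Hashable, List, Sequence

Row = Dict[str, Hashable]


def conditional_entropy(table: List[Row], q_attrs: Sequence[str],
                        p_attrs: Sequence[str]) -> float:
    """H(Q|P) = -sum_i p(X_i) sum_j p(Y_j|X_i) log2 p(Y_j|X_i).

    X_i are the classes of U/IND(P) and Y_j the classes of U/IND(Q).
    Base 2 is an assumption; the source does not state the base.
    """
    n = len(table)
    # One pass over the table: for each P-class, count its Q-classes.
    counts: Dict[tuple, Counter] = defaultdict(Counter)
    for row in table:
        x = tuple(row[a] for a in p_attrs)
        y = tuple(row[a] for a in q_attrs)
        counts[x][y] += 1

    total = 0.0
    for q_counts in counts.values():
        size_x = sum(q_counts.values())          # |X_i|
        for size_xy in q_counts.values():        # |Y_j ∩ X_i| > 0
            p = size_xy / size_x                 # p(Y_j | X_i)
            total -= (size_x / n) * p * math.log2(p)
    return total


# Hypothetical decision table (not from the paper).
TABLE = [
    {"a": 0, "b": 0, "d": "no"},
    {"a": 0, "b": 1, "d": "yes"},
    {"a": 1, "b": 0, "d": "no"},
    {"a": 1, "b": 1, "d": "yes"},
]

if __name__ == "__main__":
    print(conditional_entropy(TABLE, ["d"], ["b"]))
    print(conditional_entropy(TABLE, ["d"], ["a"]))
```

A class $X_i$ that falls entirely inside one $Y_j$ has $p(Y_j \mid X_i) = 1$ and contributes nothing to the sum, so only the classes that mix several decision values add to $H(Q \mid P)$.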
Class distribution lists can be thought of as index blocks based on large data sets, making it easier to work with large data sets. At the same time, there are two advantages to using a class distribution link list to handle large data sets.
The software that implements the rough set algorithm in this paper is Rose2, and the training of the neural network is implemented in MATLAB [20]. Two experimental data sets were used: the lymphography data set from the UCI data sets and the cancer data set from the sample data sets of the Rose2 software. Their composition is shown in Table 1.
Experimental data set
Implementing C3RST involves four steps
A random sample of 150 records was selected from the lymphography data set as the training set. When the information system consisting of these 150 training samples was reduced by the heuristic search algorithm, 10 different reduction results were obtained in 10 seconds. In this paper, we randomly select Red(1), Red(5), Red(7), and Red(9) to construct four neural network classifiers. Classifier 1 has 6 input nodes and 28 hidden nodes; classifier 2 has 6 input nodes and 22 hidden nodes; classifier 3 has 7 input nodes and 21 hidden nodes; classifier 4 has 7 input nodes and 15 hidden nodes. Each classifier has one node in its output layer. The outputs of the four classifiers are then combined, with the sum of the attribute importance values of the attributes corresponding to each classifier used as its weight.
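One possible reading of this combination step is a weighted average of the four classifier outputs. The sketch below is only an illustration under that assumption: the normalization of the weights, the function name, and all numeric values are hypothetical and are not reported in the paper.

```python
from typing import Sequence


def combine_outputs(outputs: Sequence[float],
                    importance_sums: Sequence[float]) -> float:
    """Weighted average of classifier outputs.

    `importance_sums[k]` stands for the summed attribute importance of the
    attributes in the reduct used by classifier k. Dividing by the total
    so that the weights sum to 1 is an assumption, not the paper's rule.
    """
    if len(outputs) != len(importance_sums):
        raise ValueError("one weight is required per classifier output")
    total = sum(importance_sums)
    if total <= 0:
        raise ValueError("importance sums must add up to a positive value")
    return sum(o * w for o, w in zip(outputs, importance_sums)) / total


if __name__ == "__main__":
    # Hypothetical outputs of four classifiers and hypothetical weights.
    outputs = [1.0, 0.0, 1.0, 1.0]
    importance_sums = [0.4, 0.2, 0.3, 0.1]
    print(combine_outputs(outputs, importance_sums))
```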
Analysis of data mining methods based on rough sets
First, the classification rate of the method based on rough set information measures is tested, and the number of effective rules is obtained. These results are compared with the classification results of the classical fuzzy neural network method and of the rough set rule derivation method RSES in order to verify the algorithm. The fuzzy neural network method is a very common machine learning method, a general top-down derivation method based on information gain, and it is widely used for comparison with new algorithms. The RSES method, like the proposed method, is based on rough set theory: it starts from rough set reduction and rule generation to achieve effective classification. The proposed method derives the final classification rule set by hierarchical decomposition.
The selected data sets represent different kinds of data tables: the attributes of Car and Dermatology are discrete, while the Australian and Ecoli data sets contain both discrete and continuous attributes. Therefore, continuous data must be discretized before this method is used for classification. Here we use a simple equal-width discretization method. The experimental results are shown in Figs. 1 and 2.
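Equal-width discretization can be sketched as follows. The number of intervals is not given in the paper, so the value used here, the sample values, and the function names are assumptions for illustration.

```python
from typing import List, Sequence


def equal_width_cuts(values: Sequence[float], bins: int) -> List[float]:
    """Return the bins - 1 interior breakpoints that split
    [min(values), max(values)] into `bins` intervals of equal width."""
    if bins < 1:
        raise ValueError("bins must be at least 1")
    low, high = min(values), max(values)
    width = (high - low) / bins
    return [low + k * width for k in range(1, bins)]


def equal_width_discretize(values: Sequence[float], bins: int) -> List[int]:
    """Map each value to an interval index in 0..bins-1."""
    if bins < 1:
        raise ValueError("bins must be at least 1")
    low, high = min(values), max(values)
    if high == low:
        # All values are identical, so they share a single interval.
        return [0 for _ in values]
    width = (high - low) / bins
    # The maximum value would land in index `bins`, so clamp it.
    return [min(int((v - low) / width), bins - 1) for v in values]


if __name__ == "__main__":
    sample = [0.0, 2.5, 5.0, 7.5, 10.0]   # hypothetical attribute values
    print(equal_width_cuts(sample, 4))
    print(equal_width_discretize(sample, 4))
```

Equal-width intervals ignore the decision attribute, which keeps the method simple but may place breakpoints where they do not help separate the classes.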

Comparison of the number of rules.

Comparison of classification rate results.
As can be seen from Figure 1 and Figure 2, the data mining algorithm based on rough sets is an effective machine learning algorithm. The comparison of classification results in Figure 1 shows that the classification rate of the algorithm is slightly lower than that of the fuzzy neural network and RSES methods, but Figure 2 shows that it produces far fewer rules than the other two algorithms, usually 1/5 to 1/3 as many. The data mining method based on rough sets thus achieves a high classification rate with a much smaller rule set. The classification model built from these rules is therefore relatively simple and very effective for decision analysis.
As the number of samples in the data set increases, the recognition rate of the algorithm decreases slightly, with some fluctuation in between, but the overall recognition rate remains above 90%, indicating that the improved algorithm can process large data sets without a significant loss of recognition rate. Fig. 3 shows the change in the recognition rate of the algorithm.

Change rate of recognition rate.
Figure 4 shows the change in the correct rate of the algorithm. The correct rate curve is similar to the recognition rate curve. It fluctuates slightly more than the recognition rate, but the overall correct rate is also above 90%. This also means that the improved algorithm does not lose the correct rate even when processing large data sets.
Figure 5 shows the change in the time required to acquire knowledge. As the amount of data increases, the time required for the algorithm to process the data also increases, and the increase is approximately linear.

Curve of correct rate.
The improved algorithm proposed in this paper improves scalability, so, as can be seen in Figs. 4 and 5, it can adapt to larger data sets, process more data, and perform better than the original algorithm. The algorithm can handle large amounts of data well without degrading performance measures such as the correct rate and recognition rate. However, because of the knowledge reduction algorithm itself and the efficiency of the SQL Server used for testing, the processing time is relatively long. The next step is to find better optimization methods to speed up the processing of large amounts of data. This paper tests the rough set algorithm improved by the class distribution list from the two aspects of classification accuracy and scalability. The experimental data show that the improved algorithm has better performance.

Time curve required for knowledge acquisition.
Rough set theory is a powerful tool for dealing with vague data and uncertain information. It has attracted wide interest from researchers since its introduction and has been widely used in data mining, pattern recognition, automatic control, fault diagnosis, and other fields. With the maturity of database technology and the development of data storage technology, data mining has kept advancing and has become a research hotspot in the development of information resources. Rough set theory is a new mathematical tool for dealing with vague and uncertain information, and it offers an approach different from traditional intelligent information processing methods. The combination of rough set theory and data mining will bring new methods and ideas to many data analysis problems and has important research value.
In this paper, a fuzzy neural network construction method combining rough set theory and fuzzy neural networks is proposed, which makes full use of the strengths of each to make up for the weaknesses of the other. When rough set theory is used for attribute reduction and rule extraction, the characteristics of the sample data can be fully exploited, so that the configured network has a good topology from the beginning, which significantly reduces the network size. Rough set theory also supplies a great deal of prior knowledge for setting the initial weights of the fuzzy neural network, so the systematic error starts small and learning is fast.
Acknowledgments
This research is supported by Provincial Quality Engineering Project of Colleges and Universities in Anhui Province (2008yljc286), Natural Science Project of Hefei Technology College (201814KJA005), Anhui Provincial Natural Science Foundation (1808085ME126), the Provincial (Key) Natural Science Research Project of Anhui Colleges (KJ2017A538), Talent Research Fund Project of Hefei University in 2016-2017(16-17RC25), the Support Program Project for Excellent Youth Talent in Higher Education of Anhui Province (gxyq2018069).
