Abstract
Data mining refers to discovering previously unknown, valid and practical information in large databases. After a brief introduction to data mining, and drawing on an explanation of the data mining process and an analysis of Bayesian networks, this work investigates the implementation of Bayesian networks for data mining. Experimental results suggest that the proposed method is feasible and correct. Its distinctive representation of uncertain knowledge, rich probabilistic expressiveness, and incremental learning that integrates prior knowledge allow a Bayesian network to express the probability distributions of objects and the causal relations among them, which has made it one of the most prominent of the current data mining methods.
Introduction
Computer teaching is an interdisciplinary subject of preventive computer science, closely related to human health. Its teaching content mainly introduces functional factors, the development of a variety of functional computers, and the evaluation and testing of processing technologies for functional computers. For students to gain a better grasp of this course and to meet the training goals of the computer science and engineering specialty and of computer nutrition and safety, we must renew our educational concepts, continuously organize reasonable teaching content, improve teaching methods and ways of thinking, increase the amount of information as much as possible, and improve the teaching quality of the relevant specialized computer courses.
Introduction of data mining. With the development of global informatization, automatic data acquisition tools and mature database technologies have led to massive amounts of data being stored in databases. It is very important to extract from these data knowledge that is reliable, novel, effective and understandable to people, and data mining has therefore attracted great attention in the information industry. Its application fields include agriculture, medical diagnostics, business management, product control, market analysis, engineering design and scientific research.
Data mining process. Data mining is the process of mining interesting knowledge from large amounts of data stored in databases, data warehouses or other information repositories. This is the definition given by Dr. Han in his work Data Mining: Concepts and Techniques. Data mining also refers to the process of discovering knowledge from large amounts of data, as shown in Fig. 1, which represents the knowledge discovery process well. An important step in data mining is to define the business problem clearly and to determine the purpose of the mining. Mining data simply for its own sake is blind and will not succeed.
Flow chart of data mining.
Data preparation. Data pretreatment eliminates inconsistent and noisy data and combines data from different sources. Data selection searches all internal and external data relating to the business objects and chooses those suitable for data mining. Data transformation converts the data into a form suitable for the chosen mining algorithm.
Data mining. Intelligent methods are applied to the selected and transformed data. Apart from choosing the right mining algorithm, all the remaining work can be completed automatically.
Knowledge assessment. The results are interpreted and evaluated. The analytical methods adopted are generally determined by the data mining operation, and visualization techniques are often used.
Knowledge representation. The knowledge obtained through the analysis is provided to the users or integrated into the organizational structure of the business information system.
Classification and forecasting. Based on a known training set, classification finds models or functions that describe and distinguish data classes or concepts, so that the models can be used to predict the class of objects whose class label is unknown. In forecasting, by contrast, the predicted values are numerical.
Cluster analysis. Cluster analysis takes a collection of data objects and groups entities with the same characteristics into one category, so that objects in the same category are as similar as possible while objects in different categories differ greatly. It then uses certain rules to describe the common properties of each category.
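The data preparation steps described above (integrating sources, removing inconsistent records, transforming values into a form suitable for mining) can be sketched as follows. This is only an illustrative sketch: the record layout, the field names `id` and `score`, and the two-level discretisation are assumptions made for the example and are not taken from the paper.

```python
from statistics import mean

def prepare(records_a, records_b):
    """Integrate, clean and transform two hypothetical record sources."""
    # Integration: combine the records from both sources.
    merged = records_a + records_b
    # Cleaning: drop records with a missing value and exact duplicates.
    seen, clean = set(), []
    for rec in merged:
        key = tuple(sorted(rec.items()))
        if None in rec.values() or key in seen:
            continue
        seen.add(key)
        clean.append(rec)
    # Transformation: discretise the numeric "score" field into two levels
    # relative to the mean of the cleaned data (an assumed, simple rule).
    avg = mean(r["score"] for r in clean)
    return [{**r, "score_level": "high" if r["score"] >= avg else "low"}
            for r in clean]

source_a = [{"id": 1, "score": 80}, {"id": 2, "score": None}]
source_b = [{"id": 1, "score": 80}, {"id": 3, "score": 60}]
prepared = prepare(source_a, source_b)
```

Here the record with a missing score and the duplicated record are removed before the remaining scores are discretised.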
Clustering is, in essence, an unsupervised learning method whose purpose is to find similarities and differences in a data set and to group data objects sharing common characteristics into the same category. The characteristics of each cluster can usually be analyzed and explained.
Outlier analysis. Outliers are data that are inconsistent with the general behavior or model of most of the data in a data source. Such data are often regarded as noise or anomalies and discarded. However, in fields such as customer behavior analysis, credit fraud screening, data quality control, network security management and fault detection, they are more interesting than normal data.
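As a minimal illustration of outlier analysis, the sketch below flags values whose distance from the mean exceeds a chosen number of standard deviations. The z-score rule and the threshold of 2 are assumptions chosen for the example; the paper does not prescribe a particular outlier test.

```python
from statistics import mean, pstdev

def find_outliers(values, threshold=2.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing deviates from the rest.
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical measurements with one value far from the others.
readings = [10, 11, 9, 10, 12, 10, 11, 50]
suspicious = find_outliers(readings)
```

In a fraud or fault detection setting, the flagged values would be kept for inspection rather than discarded as noise.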

The research motivation of this paper can be summarized as follows. To improve the performance of data mining methods, this work investigates the implementation of Bayesian networks. We study teaching methods for the relevant specialized computer courses and introduce data mining techniques, thereby improving the effectiveness of students' learning.
A Bayesian network, also known as a probabilistic causal network, web of trust or knowledge graph, is a directed acyclic graph. A Bayesian network is composed of two parts: a directed acyclic graph
An example of a Bayesian network structure.
The network constituted by a graph
A Bayesian network is a graphical model that describes the dependence among data variables. The description consists of two parts:
Network structure; local probability distribution.
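A minimal sketch of these two parts: the structure is stored as a list of parents for each variable, and the local distributions as conditional probability tables. The joint probability of a complete assignment is then the product of each variable's probability given its parents. The three-variable network and all numbers below are hypothetical and serve only to illustrate the factorization.

```python
from itertools import product

# Structure: each variable maps to the tuple of its parents (hypothetical network).
parents = {"A": (), "B": ("A",), "C": ("A", "B")}

# Local distributions: cpt[var][parent_values][value] = P(var = value | parents).
cpt = {
    "A": {(): {0: 0.6, 1: 0.4}},
    "B": {(0,): {0: 0.7, 1: 0.3}, (1,): {0: 0.2, 1: 0.8}},
    "C": {
        (0, 0): {0: 0.9, 1: 0.1},
        (0, 1): {0: 0.5, 1: 0.5},
        (1, 0): {0: 0.4, 1: 0.6},
        (1, 1): {0: 0.1, 1: 0.9},
    },
}

def joint_probability(assignment):
    """Multiply P(x_i | parents of x_i) over all variables in the network."""
    p = 1.0
    for var, pa in parents.items():
        pa_values = tuple(assignment[q] for q in pa)
        p *= cpt[var][pa_values][assignment[var]]
    return p

# P(A=1, B=1, C=1) = 0.4 * 0.8 * 0.9
p_all_one = joint_probability({"A": 1, "B": 1, "C": 1})

# The joint probabilities of all eight assignments should sum to one.
total = sum(joint_probability(dict(zip("ABC", values)))
            for values in product((0, 1), repeat=3))
```

Storing only the local tables is what makes the representation compact: each table grows with the number of parents of one variable, not with the number of variables in the network.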
For any data variable
At this moment, the variable in
A Bayesian network is confirmed according to
For example, after a manufacturing enterprise adopts a new technique, through trial production, it is necessary to assess effectiveness of this technique. Now, there are several data variables:
Effective value; Product disqualification quantity; Product disqualification quantity; Age; Sex
Bayesian network structure of the example.
Based on existing experience, we identify the causal relationships among the data variables and obtain the Bayesian network structure
The following conditional independence relationships can then be derived from the causal relationships among the data variables in Fig. 3.
Probability calculation is conducted by use of Bayesian network. For example, there are
Bayesian network learning is, in effect, the search for a Bayesian network model that reflects the dependence relationships among the data variables in an existing database.
Posterior probability
Here, value range of
For the Bayesian network learning process, we make the following three assumptions:
The data are a random sample.
The parametric variables are mutually independent, i.e.
The parametric variables follow a Dirichlet distribution, i.e.,
Where,
In example
Next, we predict the probability of the lth example with the first l-1 examples. Similar to treatment of Eq. (1),
According to assumed condition Eq. (3), i.e. parametric variables are of Dirichlet distribution, the probability of example
Where,
Then, the following is gained:
It thus can be seen that joint probability
Then,
A Bayesian network constructed from the users' prior knowledge is called a prior Bayesian network, and a Bayesian network obtained by combining a prior Bayesian network with data is called a posterior Bayesian network. The process of obtaining a posterior Bayesian network from a prior Bayesian network is known as Bayesian network learning. A Bayesian network can keep learning: the posterior network obtained in one round of learning can serve as the prior network for the next. Before each round, users can adjust the prior Bayesian network so that the new network better reflects the knowledge contained in the data, as shown in Fig. 4.
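The continuous learning described above can be illustrated for a single variable with a Dirichlet prior. In this sketch the prior is a table of pseudo-counts, learning adds the observed counts to it, and the resulting posterior is reused as the prior for the next batch. The variable, its two values and the batches are hypothetical, and the restriction to one variable without parents is a simplifying assumption.

```python
from collections import Counter

def update(prior_counts, observations):
    """Return posterior Dirichlet pseudo-counts after seeing the observations."""
    posterior = dict(prior_counts)
    for value, n in Counter(observations).items():
        posterior[value] = posterior.get(value, 0) + n
    return posterior

def predictive(counts):
    """Probability of each value for the next example: count / total count."""
    total = sum(counts.values())
    return {value: c / total for value, c in counts.items()}

prior = {"yes": 1, "no": 1}           # uniform prior knowledge (assumed)
batch1 = ["yes", "yes", "no"]         # first round of data (hypothetical)
batch2 = ["yes", "no", "no", "no"]    # second round of data (hypothetical)

after_first = update(prior, batch1)           # posterior of round one
after_second = update(after_first, batch2)    # reused as prior for round two
all_at_once = update(prior, batch1 + batch2)  # same data in a single round
```

Learning in two rounds gives the same posterior as learning from all the data at once, which is why the posterior of one round can safely serve as the prior of the next.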
Continuous learning graph of a Bayesian network.
Learning based on a Bayesian network comprises two tasks, parameter learning and structure learning, and, depending on the nature of the sample data, each task covers two cases: complete instance data and incomplete instance data. Parameter learning, that is, learning the conditional probability tables (CPTs), mainly uses methods based on classical statistics and methods based on Bayesian statistics. Structure learning mainly uses scoring methods based on Bayesian statistics and scoring methods based on coding theory. Structure learning is presented below.
In a Bayesian network, firstly a random variable
where
On the premises of multinomial distribution without constraints, independent parameters, and the adoption of Dirichlet priori and complete data, the structure likelihood of the data is exactly equal to the product of the structure likelihood of every
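Under the premises listed above, a commonly used closed form for this likelihood is the Bayesian-Dirichlet score, in which each variable contributes one factor computed from its counts N_ijk (variable i, parent configuration j, state k). The sketch below assumes that form, assumes the factors are per-variable terms, and assumes a uniform hyperparameter `alpha` for every cell; the paper's own notation and choice of hyperparameters may differ. Logarithms are used so that the product over variables becomes a sum.

```python
from math import exp, lgamma

def log_family_score(counts, alpha=1.0):
    """Log marginal likelihood contributed by one variable.

    counts: one list per parent configuration j, holding the counts N_ijk
            for each state k of the variable.
    alpha:  Dirichlet hyperparameter used for every cell (assumed uniform).
    """
    score = 0.0
    for n_jk in counts:
        a_j = alpha * len(n_jk)  # sum of the hyperparameters over the states
        score += lgamma(a_j) - lgamma(a_j + sum(n_jk))
        score += sum(lgamma(alpha + n) - lgamma(alpha) for n in n_jk)
    return score

def log_structure_score(families, alpha=1.0):
    """Sum of the per-variable log scores, i.e. the log of their product."""
    return sum(log_family_score(counts, alpha) for counts in families)

# A binary variable without parents, observed once in each state:
# the marginal likelihood of that sequence is 1/2 * 1/3 = 1/6.
one_of_each = exp(log_family_score([[1, 1]]))
```

Because the score decomposes into one term per variable, two candidate structures that differ only in the parents of a single variable can be compared by recomputing that variable's term alone.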
Through an investigation of the postgraduate entrance examinations taken by university graduates in one region, the following variables were identified in order to study their effects on the pursuit of advanced studies:
Gender (A): male, female; Intelligence (B): low, medium and below, medium and above, high; Family economy (C): poor, medium and below, medium and above, excellent; Employment situation (D): bad, good; Whether to take postgraduate program (E): yes, no.
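The five variables above span 2 × 4 × 4 × 2 × 2 = 128 joint configurations, which is the number of cells that a full contingency table over them requires. A short enumeration, using only the category names listed above:

```python
from itertools import product

# Value sets of the five variables, as listed in the text.
variables = {
    "A": ["male", "female"],
    "B": ["low", "medium and below", "medium and above", "high"],
    "C": ["poor", "medium and below", "medium and above", "excellent"],
    "D": ["bad", "good"],
    "E": ["yes", "no"],
}

# Every combination of one value per variable is one cell of the table.
configurations = list(product(*variables.values()))
cell_count = len(configurations)  # 2 * 4 * 4 * 2 * 2
```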
Table 1 shows the statistical results for 10000 university students. There are 128 entries in total. In Table 1, the first entry in the first row is the number of A
Statistical results of 10000 university students
A Bayesian network is constructed to find the causal relationships among these variables for data mining. The network is built over the variables gender (A), intelligence (B), family economy (C), employment situation (D) and whether to take a postgraduate program (E). Based on existing expert knowledge, we consider only the two most plausible network structures, a and b. They differ in the causal relationship between students' employment situation and family economy and in the causal relationship between intelligence and family economy, as shown in Fig. 5.
The two most plausible network structures, a and b.
For this example, classical sample statistical method is adopted to calculate probability parameter of each variable. For example, for Fig. 5a, calculate P (B
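A minimal sketch of the classical sample statistical method: a conditional probability is estimated as the relative frequency of the child value among the records that share the parent value. The records below are hypothetical and are not the data of Table 1, and the choice of A as the parent of B is made only for illustration.

```python
from collections import Counter

def estimate_cpt(records, child, parent):
    """Estimate P(child | parent) by relative frequency in the records."""
    pair_counts = Counter((r[parent], r[child]) for r in records)
    parent_counts = Counter(r[parent] for r in records)
    return {(p, c): n / parent_counts[p] for (p, c), n in pair_counts.items()}

# Hypothetical records: 4 male students (3 high, 1 low) and 2 female students (both high).
records = (
    [{"A": "male", "B": "high"}] * 3
    + [{"A": "male", "B": "low"}]
    + [{"A": "female", "B": "high"}] * 2
)
cpt_b_given_a = estimate_cpt(records, child="B", parent="A")
```

Value pairs that never occur in the records are simply absent from the result, which corresponds to an estimated probability of zero. Smoothing with prior pseudo-counts is the usual remedy when that is undesirable.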
For the above case, the decision tree method is used to analyze the data again, and the results are compared with those of the Bayesian network method. Figure 4 shows the learning curves of the Bayesian network method and the decision tree method for this case. The learning curve of the decision tree method shows that the proportion of correct predictions on the test set increases on the whole up to 5000 samples, but decreases as the number of samples grows beyond 5000. This indicates that when the data set is large and the data are complex, the decision tree algorithm struggles and its performance becomes increasingly poor. The learning curve of the Bayesian network method shows that the proportion of correct predictions rises as the number of samples increases and starts to approach 1 at about 7000 samples. When the data size is large, the Bayesian network method is therefore superior to the decision tree method.
A Bayesian network has causal and probabilistic semantics. Prior knowledge and sample data can be combined organically, uniting subjective and objective information, and the representation is clear. Hence, it can reflect the internal connections and essence of data objects more comprehensively and objectively, and makes data mining convenient. Repeated verification suggests that this conclusion is generally applicable.
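A learning curve of the kind compared above is obtained by training on growing portions of the training data and measuring the proportion of correct predictions on a fixed test set. The sketch below shows only that procedure. The learner is a hypothetical stand-in that memorizes the most frequent label for each input, not a Bayesian network or a decision tree, and the data are invented for the example.

```python
from collections import Counter

def learning_curve(train, test, sizes, fit, predict):
    """Return [(n, accuracy)] for models fitted on the first n training records."""
    curve = []
    for n in sizes:
        model = fit(train[:n])
        correct = sum(predict(model, x) == y for x, y in test)
        curve.append((n, correct / len(test)))
    return curve

def fit_lookup(records):
    """Stand-in learner: remember the most common label seen for each input."""
    by_input = {}
    for x, y in records:
        by_input.setdefault(x, Counter())[y] += 1
    return {x: labels.most_common(1)[0][0] for x, labels in by_input.items()}

def predict_lookup(model, x):
    # Unseen inputs yield None, which never matches a label and counts as an error.
    return model.get(x)

train = [("a", "yes"), ("b", "no"), ("c", "yes"), ("a", "yes")]
test = [("a", "yes"), ("b", "no"), ("c", "yes")]
curve = learning_curve(train, test, [1, 2, 3], fit_lookup, predict_lookup)
```

Passing `fit` and `predict` as arguments lets the same procedure be applied to any two methods, so that their curves are measured on identical training prefixes and the same test set.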
Conclusions
In recent years, the data accumulated by governments and enterprises have grown increasingly large, and a great deal of important information is hidden behind this soaring volume of data. Fully understanding the data has exceeded the capacity of the human brain, and data mining has become a research hotspot. Data mining refers to the complete process of mining previously unknown, valid and practical information from large databases and using that information to make decisions or enrich knowledge. The Bayesian model and its prediction algorithms are prediction methods developed in order to predict emergencies. They depend not only on the historical data before time t and on model knowledge, but also make use of experts' experience and subjective judgment. This is especially useful for predicting emergencies, which historical data and models specified in advance cannot completely reflect. When the model performs poorly, experts' experience and information can be used to improve it. Experimental results suggest that the proposed method is feasible and correct.
The characteristics of the proposed method can be summarized as follows. Its distinctive representation of uncertain knowledge, rich probabilistic expressiveness, and incremental learning that integrates prior knowledge allow a Bayesian network to express the probability distributions of objects and the causal relations among them, which has made it one of the most prominent of the current data mining methods.
Future research directions can be summarized as follows. The first is to carry out more application studies. The second is to build practical, usable software.
