Abstract
The core problem in ontology mapping and in many ontology engineering applications is computing the similarity between concepts in an ontology. From the machine learning point of view, an optimal ontology similarity function is learned from a sample set, so that each pair of concepts is mapped to a positive real number that reflects the similarity between them. After the ontology is represented as a graph, the goal of ontology learning is to obtain a real-valued function that maps each pair of vertices to the real axis and uses distances to reflect the similarity between the concepts corresponding to those vertices. In this paper, we present an ontology learning algorithm based on ontology geometry distance computation and deep learning techniques. An iterative procedure is designed, and experiments show the effectiveness of the proposed ontology algorithm.
Introduction
As a model for structured data storage, processing, analysis and computation, ontology is widely used in mainstream areas of computer science such as information retrieval, graphics and image processing, pattern recognition and granular computing. At the same time, owing to its powerful auxiliary functions, it is applied in many disciplines such as physics, chemistry, biology, pharmacy, medicine and materials science. As research deepens, new techniques are continually developed and put into practice, making ontology increasingly effective. See Swanson [1], Teymourlouie et al. [2], Viani et al. [3], Hippolyte et al. [4], Roth and Jornet [5], Cooper et al. [6], Pfaff and Krcmar [7], Travin et al. [8], Seipel et al. [9], and Blondet et al. [10] for more details.
In particular, an ontology is a set of concepts related by some form of structure. In general, this structure can be represented as a graph model. Each vertex in the graph corresponds to a concept in the ontology, and an edge between two vertices indicates a direct superior-subordinate relationship between the two concepts. Combined with learning theory, the goal of ontology learning is to obtain an ontology function that is used to calculate the similarity between ontology vertices. That is, by learning from ontology samples, we get
Several papers have contributed to ontology learning algorithms from both theoretical and applied points of view. Gao and Xu [20] presented a stability analysis of learning algorithms for ontology similarity computation. Gao et al. [21] studied the strong and weak stability of the k-partite ranking based ontology learning algorithm. Gao et al. [22] proposed an ontology learning algorithm for similarity measuring and ontology mapping by means of linear programming. Gao and Farahani [23] determined generalization bounds and uniform bounds for multi-dividing ontology algorithms with a convex ontology loss function from a statistical point of view. Gao et al. [24] discussed distance learning techniques for ontology similarity measuring and ontology mapping.
In this paper, we focus on an ontology learning algorithm based on geometry distance computation and deep learning techniques. The rest of the paper is organized as follows: first, we introduce the geometry distance computation setting for the ontology problem and review the ontology distance computation algorithm that projects onto the positive cone in a reproducing kernel Hilbert space; then, we propose our main ontology geometry distance learning algorithm using deep neural networks; finally, we verify the effectiveness of the algorithm through several experiments.
Overview of geometry distance computation
In order to express the ontology algorithm as a mathematical model, the information of each vertex on the ontology graph is represented by a p-dimensional vector, i.e.,
Let $I_n$ be the $n \times n$ identity matrix. The Frobenius norm of a matrix V is $\|V\|_F = \sqrt{\operatorname{tr}(V^{T}V)}$.
In this fashion, the class of Mahalanobis distances with
In this section, we review the distance-based ontology learning algorithm associated with a reproducing kernel Hilbert space. In this setting, the training data can be denoted as
A large $\varrho(u_i, v_i)$ implies that $u_i$ and $v_i$ are dissimilar, and vice versa. Let $\mathrm{Proj}(\cdot)$ be a projection onto the cone of positive definite matrices. Then, the Mahalanobis matrix in the ontology setting can be obtained by
Let
Next, we show how to get
However,
By computation, we yield
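For illustration only, the two ingredients reviewed in this section, a Mahalanobis-type distance between vertex vectors and a projection onto the cone of positive matrices, can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions rather than the algorithm of this paper: the function names are hypothetical, the distance is taken in the usual form $\sqrt{(u-v)^{T}W(u-v)}$, and the projection is the standard Frobenius-nearest one obtained by symmetrizing and clipping eigenvalues at a floor `eps` (with `eps = 0` this gives the closed positive semidefinite cone).

```python
import numpy as np


def mahalanobis_distance(u, v, W):
    """Return sqrt((u - v)^T W (u - v)) for a symmetric PSD matrix W.

    Assumption: this is the usual Mahalanobis form; the names are illustrative.
    """
    diff = np.asarray(u, dtype=float) - np.asarray(v, dtype=float)
    # Guard against tiny negative values caused by rounding.
    return float(np.sqrt(max(diff @ W @ diff, 0.0)))


def project_to_psd_cone(A, eps=0.0):
    """Frobenius-nearest symmetric matrix whose eigenvalues are all >= eps."""
    S = (A + A.T) / 2.0                      # symmetrize first
    eigvals, eigvecs = np.linalg.eigh(S)     # eigendecomposition of a symmetric matrix
    clipped = np.clip(eigvals, eps, None)    # raise eigenvalues below eps to eps
    return (eigvecs * clipped) @ eigvecs.T   # V diag(clipped) V^T


# Usage: project an indefinite matrix, then measure a distance with the result.
A = np.array([[2.0, 0.0], [0.0, -1.0]])
W = project_to_psd_cone(A)
d = mahalanobis_distance([1.0, 1.0], [0.0, 0.0], W)
```

Clipping eigenvalues is a common choice because it yields the closest matrix in Frobenius norm; a strictly positive `eps` keeps the result positive definite.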
Deep learning based ontology distance calculating algorithm
Deep learning is a relatively new research field in machine learning. In recent years, it has made breakthrough progress in a wide range of applications such as speech recognition and computer vision. Its motivation is to build a model that simulates the neural connections of the human brain. When dealing with signals such as images, sounds and texts, the data features are described hierarchically through multiple transformation stages, from which an interpretation of the data is obtained (for related studies, see Phong et al. [25], Hassan et al. [26], Oh et al. [27], Proenca and Neves [28], Biswas et al. [29], Ren et al. [30], Rad et al. [31], Lore et al. [32], Bianco et al. [33], Olmos et al. [34], and Treder et al. [35]). In this section, we introduce our main ontology distance function learning algorithm based on deep neural network learning.
Assume that there are M + 1 layers in the designed network and d(m) units in the m-th layer, where m ∈ {1, 2, ⋯, M}. Let
Similarly, the projection matrix, bias, and nonlinear activation function in the second layer are denoted by
Define the threshold parameters τ1 and τ2 with τ2 > τ1 > 0, where τ > 1 is related to τ1 and τ2. We use
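To make the layered construction described above more concrete, the following Python sketch passes a vertex vector through a stack of layers, each with a projection matrix, a bias, and a nonlinear activation, and takes the Euclidean distance between the top-layer outputs as the learned distance. It is a hedged illustration, not the objective of this paper: the function names, the tanh activation, the random weights, and the hinge-style penalty built from τ1 and τ2 (similar pairs are penalized when the squared distance exceeds τ1, dissimilar pairs when it falls below τ2) are all assumptions made for the example.

```python
import numpy as np


def forward(x, weights, biases, activation=np.tanh):
    """Apply h <- activation(W h + b) layer by layer and return the top output."""
    h = np.asarray(x, dtype=float)
    for W, b in zip(weights, biases):
        h = activation(W @ h + b)
    return h


def deep_distance(u, v, weights, biases):
    """Euclidean distance between the top-layer representations of u and v."""
    return float(np.linalg.norm(forward(u, weights, biases) - forward(v, weights, biases)))


def pair_hinge_loss(u, v, similar, weights, biases, tau1, tau2):
    """Assumed penalty: squared distance should be <= tau1 for similar pairs
    and >= tau2 for dissimilar pairs (tau2 > tau1 > 0)."""
    d2 = deep_distance(u, v, weights, biases) ** 2
    if similar:
        return max(0.0, d2 - tau1)
    return max(0.0, tau2 - d2)


# Hypothetical two-layer network on 4-dimensional vertex vectors.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(3, 4)), rng.normal(size=(2, 3))]
biases = [np.zeros(3), np.zeros(2)]
u = np.array([1.0, 0.0, 0.5, -0.5])
v = np.array([0.9, 0.1, 0.4, -0.6])
print(deep_distance(u, v, weights, biases))
print(pair_hinge_loss(u, v, True, weights, biases, tau1=1.0, tau2=3.0))
```

In a full training loop, the weights and biases would be updated iteratively to reduce the total penalty over all training pairs; that step is omitted here.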
Experiments
In this section, we present four experiments to show the effectiveness of our proposed ontology learning algorithm.
Similarity measuring experiment on gene data
In our first experiment, we use the gene ontology data from http://www.geneontology. The P@N average precision ratio is used to evaluate the results.
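The P@N measure used in these experiments can be sketched as follows. This is a minimal illustration under an assumed definition: for each vertex, the N vertices ranked most similar by the learned function are compared with a reference set of closest vertices (for example, one provided by domain experts), the precision is the fraction of the top N that lies in the reference set, and the reported value is the average over all vertices. The function names and the toy data are hypothetical.

```python
def precision_at_n(ranked, relevant, n):
    """Fraction of the first n ranked vertices that appear in the reference set."""
    if n <= 0:
        raise ValueError("n must be positive")
    hits = sum(1 for vertex in ranked[:n] if vertex in relevant)
    return hits / n


def average_precision_at_n(rankings, relevant_sets, n):
    """Mean of P@N over all vertices that have a ranking."""
    scores = [precision_at_n(rankings[v], relevant_sets[v], n) for v in rankings]
    return sum(scores) / len(scores)


# Toy data: for each vertex, a ranking by learned similarity and a reference set.
rankings = {"a": ["b", "c", "d"], "b": ["a", "d", "c"]}
relevant_sets = {"a": {"b", "d"}, "b": {"c"}}
print(average_precision_at_n(rankings, relevant_sets, 2))
```

For the toy data, vertex "a" has one hit among its top two and vertex "b" has none, so the average P@2 is 0.25.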
Table 1 shows that the precision ratio obtained by our newly proposed algorithm is higher than that of the algorithms in Gao et al. [36, 39] for N = 3, 5, 10 and 20. Meanwhile, the precision ratios keep increasing as N increases. As a result, the experiment shows the efficiency and superiority of our newly proposed algorithm compared with the methods in Gao et al. [36–38].
The experiment data for gene ontology
In our second experiment, we use the plant ontology data from http://www.plantontology.org. Again, the P@N average precision ratio is used to evaluate the results.
Table 2 shows that, for N = 3, 5, 10 and 20, the precision ratio of our newly proposed algorithm is much higher than that of the algorithms previously proposed in Gao et al. [36, 39]. Moreover, the precision ratios keep increasing as N increases. Therefore, the new algorithm is more effective than the algorithms previously proposed in [36, 39].
The experiment data for plant ontology
In the third experiment, we use the physical education ontology data, which is widely used in ontology learning applications. The P@N average precision ratio is again used to evaluate the results.
Table 3 shows that the precision ratio obtained by our newly proposed algorithm is higher than that of the algorithms in Gao et al. [36, 39]. Moreover, the larger N is, the more efficient our newly proposed algorithm is.
The experiment data for physical education ontology
In the last experiment, we use the humanoid robotics ontology data defined in Gao and Zhu [39]. We use the P@N average precision ratio to evaluate the results.
Table 4 shows that the precision ratio obtained by our newly proposed algorithm is higher than that of the algorithms in Gao et al. [36, 37] and Gao and Zhu [39]. Moreover, the larger N is, the more apparent the contrast between them becomes. In other words, the newly proposed algorithm turns out to be more efficient than the other three algorithms.
The experiment data for humanoid robotics ontology
Conclusion
Deep learning is closely related to unsupervised feature learning, in which features are learned from the data rather than designed by hand. Deep learning is essentially a non-linear combination of representation learning methods, where representation learning refers to learning representations (or features) from data in order to extract information that is useful for classification and prediction. Deep learning begins with raw data and transforms each representation (or feature) layer by layer into a higher-level, more abstract representation, thereby discovering the intricate structure of high-dimensional data. In this paper, we design an ontology distance learning algorithm by means of deep learning. It is applied to ontology similarity measuring and ontology mapping, and can be used in various engineering applications.
Conflict of Interests
The authors hereby declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
We thank the reviewers for their constructive comments, which improved the quality of this paper. This work has been partially supported by the Postdoctoral Research Grant of China (2017M621690) and the Postdoctoral Research Grant of Jiangsu Province (1701128B).
