Abstract
Knowledge graphs exhibit a typical hierarchical structure and are widely applied across artificial intelligence domains. However, large-scale knowledge graphs are usually incomplete, which limits their performance in downstream tasks. Knowledge graph embedding methods have emerged as a primary approach to knowledge graph completion. These methods represent entities and relations as low-dimensional vectors and focus on handling relation patterns and multiple relation types. However, they pay too little attention to hierarchical relationships, a crucial feature of real-world knowledge graphs. We propose a novel knowledge graph embedding model called
Keywords
Introduction
A knowledge graph (KG) is a knowledge base composed of nodes and edges that represents complex real-world relationships in the form of a graph. With the rapid development of artificial intelligence, knowledge graphs have been widely used in fields such as information extraction [1], question answering [2], and recommendation systems [3]. Knowledge graphs are powerful representations of structured information: they capture relationships between entities and provide a foundation for many applications. However, knowledge graphs are often incomplete, with important relations and facts missing. This incompleteness significantly hampers downstream tasks that rely on accurate and comprehensive knowledge representations. For instance, in question-answering systems, missing relationships can lead to incorrect or incomplete responses. In recommendation systems, missing information about user preferences or item characteristics leads to suboptimal recommendations. Similarly, incomplete knowledge graphs limit the effectiveness of search and retrieval algorithms in information retrieval. Knowledge graph completion (KGC) alleviates this incompleteness by using the existing data in the knowledge base to infer relations or entities that are absent from the graph, thereby enriching the knowledge graph. KGC can be formulated as predicting missing relations or entities in a knowledge graph. We denote the knowledge graph as KG = {(h, r, t)}, where h represents the head entity, r the relation, and t the tail entity. The objective of KGC is to predict the missing triples (h, r, t) in the knowledge graph.
Researchers widely use knowledge graph representation learning for the knowledge graph completion task. This approach learns representations of entities and relations in a low-dimensional continuous vector space to obtain improved semantic representations. Knowledge graph embedding offers better generalization and easier transfer to downstream tasks. An example of a knowledge graph is illustrated in Fig. 1, which contains multiple relationships and entities arranged hierarchically. Consider, for instance, the entities “City,” “Honolulu,” “State,” and “California”: the hierarchy of “City” is higher than that of “Honolulu,” while the hierarchy of “State” is higher than that of “California.”

An example of the knowledge graph.
To address knowledge graph incompleteness, researchers have proposed many knowledge graph embedding models, such as TransE [4], TransH [5], and RotatE [6]. These models represent entities and relations as low-dimensional vectors such that the head entity, tail entity, and relation vectors satisfy specific mathematical relations. The RotatE model makes full use of spatial relationships: it models each relation as a rotation between the head and tail entities, a mechanism that better models symmetry/anti-symmetry, inversion, and composition. In practical applications, however, knowledge graphs often contain multiple relation types, which the RotatE model cannot model accurately [7]. The hierarchical relationship between entities is also crucial for model training, yet the RotatE model does not consider it [8]. Multi-dimensional space can increase the expressive power of a model. MRotatE [7] and StructurE [9] make full use of 2D space and achieve excellent performance in link prediction tasks. Rotate3D [10] maps relations and entities into a 3D vector space, further improving performance. However, these models still do not incorporate hierarchical information.
The existing models have three main issues: (1) Translation models fail to handle all relation patterns and relation types; for example, the TransE model cannot handle symmetric/anti-symmetric relation patterns, and the RotatE model cannot handle complex relations. (2) Translation models cannot differentiate between entities at different semantic levels. For instance, in the triple (palm, _hypernym, tree), the head entity “palm” has a lower semantic hierarchy than the tail entity “tree”. (3) Translation models cannot differentiate between entities at the same level. For instance, although the head entities of the two triples (palm, _hypernym, tree) and (olive, _hypernym, tree) have the same semantic hierarchy, they represent different meanings. Distinguishing entities at the same level can enhance a model’s performance. Hierarchical features are therefore significant for knowledge graph embedding models.
The

Illustration of entities at different levels of the hierarchy.
The contributions of this paper are summarized as follows. First, we present the HPRE model framework, which comprehensively captures simple and complex relation patterns, multiple relation types, and hierarchical features. It models both simple one-to-one relationships between entities and more complex patterns involving multiple entities and relations. The framework also accounts for the hierarchical nature of many datasets, capturing both the relationships between individual entities and the broader hierarchical structure of the knowledge graph. Second, we model hierarchical features in polar coordinates. Capturing hierarchical features is essential for accurately representing knowledge graphs, but traditional coordinate systems such as Cartesian coordinates may be a poor choice for this purpose because of their linear and orthogonal nature. In contrast, polar coordinates offer a more natural way of describing hierarchical structures that exhibit circular or radial symmetry, so polar coordinate pairs let the HPRE model better capture the multi-scale and multi-level features common in complex systems. Third, the experimental results indicate that the HPRE model achieves commendable performance in the link prediction task and effectively discriminates the hierarchical characteristics within the knowledge graph.
The rest of this paper is organized as follows. Section 2 introduces the problem definition and the related work on knowledge graph embedding. Section 3 discusses the definitions and models used in this paper. Section 4 describes the experimental process and results. Finally, Section 5 presents our conclusions and future work.
Knowledge graph embedding is widely used in link prediction tasks. At present, most researchers project relations and entities into low-dimensional vectors and define operations between them [4–6]. Many methods improve the expressive ability of the model by adding auxiliary information, such as path information [12–14], text descriptions [15–17], and type features [18, 19]. In this paper, we do not discuss models that use additional information. The main goal of a knowledge graph embedding model is to represent entities and relations through low-dimensional vectors. In the rest of this section, we introduce the related work on knowledge graph embedding, which we divide into three types: translation models, multiplication models, and network models. We describe each type and its representative models and explain their connection to the HPRE model.
Details of several knowledge graph embedding models
HPRE differs from other models in the link prediction task in several key respects. Unlike traditional translation and multiplication models, HAKE explicitly considers the hierarchical structure present in knowledge graphs. By incorporating the hierarchical relations between entities, HAKE captures both fine-grained and high-level semantic information, which enhances its representation power and improves its ability to model complex relationships. Compared with HAKE, HPRE explicitly models the interactions between entity pairs, whereas HAKE does not. HPRE also generates relation-specific representations for each entity and relation type. By learning distinct embeddings for different relations, HPRE captures the unique characteristics associated with each relation, which lets it distinguish between relation types and make more accurate predictions. Therefore, compared to HAKE, HPRE has advantages in handling complex relational patterns, particularly many-to-many relationships. Compared to neural network models, the HPRE model exhibits stronger interpretability, superior performance, and faster execution speed. Overall, HPRE combines hierarchy-aware embeddings with joint entity-relation embedding and achieves state-of-the-art performance, making it a promising approach for knowledge graph completion and link prediction tasks.
In this section, we present the implementation details of the HPRE model. The HPRE model comprises two main components: the paired relation part and the hierarchy-aware part. We begin by introducing the problem formulation and the definitions used in this paper. We then describe the two model components in detail and explain how they are fused. Next, we present the loss function of the HPRE model, outlining its formulation and significance. Finally, we show through mathematical formulas that the model can capture relation patterns, handle multiple relation types, and leverage hierarchical features.
Problem formulation and notations
The link prediction task is crucial for evaluating knowledge graph embeddings. This task aims to predict the missing triples in the knowledge graph by leveraging the existing triples. To address the incompleteness of the knowledge graph, we answer queries of the form (h, r, ?) or (?, r, t). We treat the triples in the knowledge graph as positive samples and the triples that do not exist in the knowledge graph as negative samples. In this paper, we represent the entity sets as
To leverage the hierarchical features present in knowledge graphs, we propose the HPRE model. The HPRE model comprises two key components: the paired relation and hierarchy-aware parts. This model represents the head entity embedding, relation embedding, and tail entity embedding by
Figure 3 illustrates the overall architecture of the model, showing its components and data flow. The input layer comprises the embedding vectors of the head entity, the relation, and the tail entity. Pink dots denote the head entity vector, red dots denote the relation vector, and orange dots denote the tail entity vector. The model is partitioned into the paired relation part and the hierarchy-aware part. The embedding vectors of the head entity and the tail entity are each split into two vectors, which serve as inputs to the paired relation part and the hierarchy-aware part, respectively. Similarly, the embedding vector of the relation is divided into three vectors, two of which are input to the paired relation part and one to the hierarchy-aware part. The paired relation part handles multiple relation types and patterns, while the hierarchy-aware part captures the hierarchical features of relations. The model derives the final prediction by combining these two parts.

Illustration of the architecture of HPRE.
The paired relation is crucial in handling relation patterns and multiple relations. In this part, we map the head and tail entities to a low-dimensional vector space, denoted as.
The anti-symmetry relations described as
The inverse relations described as
The composition relations described as
The paired relation part can handle multiple relation types, including one-to-many, many-to-one, and many-to-many relationships. During training, it is natural to separate a relation’s head entity features from its tail entity features: handling multiple relations essentially involves associating a head entity (tail entity) with various tail entities (head entities) depending on the specific relation. Inspired by this characteristic, our model splits each relation into two parts during training, a design that our experiments validate as effective. The score function of the triples in the paired relation part is defined as follows:
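As a minimal sketch, assuming the paired relation part takes the same form as PairRE [11] (each relation is a pair of vectors, one applied to the head and one to the tail, and entity vectors are normalized to unit length), the score of a triple could be computed as below. The function names, the L1 distance, and the toy vectors are illustrative assumptions rather than the authors' implementation.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def paired_relation_score(h, r_head, r_tail, t):
    """Assumed PairRE-style score: -|| h * r_head - t * r_tail ||_1.

    h, t           : head and tail entity vectors
    r_head, r_tail : the two halves of the paired relation vector
    A score close to 0 means the triple is judged plausible.
    """
    h, t = l2_normalize(h), l2_normalize(t)
    distance = sum(
        abs(hi * rh - ti * rt)
        for hi, rh, ti, rt in zip(h, r_head, t, r_tail)
    )
    return -distance


# A triple whose projected head and tail coincide scores (almost exactly) 0,
# while a mismatched relation pair yields a clearly negative score.
good = paired_relation_score([3.0, 4.0], [4.0, 3.0], [3.0, 4.0], [4.0, 3.0])
bad = paired_relation_score([3.0, 4.0], [1.0, 1.0], [1.0, 1.0], [4.0, 3.0])
```

Because the head and the tail are each scaled by their own relation vector, one head can be mapped close to several different tails, which is the property the text relies on for one-to-many and many-to-many relations.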
The primary purpose of this score function is to assess the scores of triples, assigning high scores to positive samples and low scores to negative samples. Let $(h_i, r_i, t_i)$ represent a true triple in the knowledge base, and $(h_j, r_j, t_j)$ represent a false one. Our objective is to maximize $f_{r_i,p}$ (
The hierarchy-aware part of the model primarily simulates the hierarchical features present in the knowledge graphs. Within this component, the head embedding and tail embedding represent
In the HPRE model, the hierarchical relationships between entities are captured using a polar coordinate system, which provides a robust framework for representing the hierarchical structure of the knowledge graph. In this system, each entity is represented by a radial coordinate and an angular coordinate. The radial coordinate models entities at different levels of the hierarchy: it captures an entity’s hierarchical depth, that is, its distance from a reference point such as the root of the hierarchy. Entities closer to the reference point have smaller radial coordinates, indicating a higher position in the hierarchy, while entities farther away have larger radial coordinates, signifying a lower position. The angular coordinate distinguishes entities at the same level of the hierarchy. By assigning different angular coordinates to entities within the same level, the HPRE model can represent their unique characteristics within the hierarchy. The corresponding score function is:
where sin(·) is the sine function. Incorporating the sine function into the base score function enables it to capture periodicity effectively. The periodic form of the score function is inspired by the pRotatE [6] model, which has demonstrated its efficacy in handling such patterns.
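As a minimal sketch, assuming the hierarchy-aware part combines a radial (modulus) distance with a sine-based angular (phase) distance in the style of HAKE [8] and pRotatE [6], the score of a triple could be computed as follows. The function names, the `phase_weight` parameter, and the choice of norms are illustrative assumptions, not the authors' formulation.

```python
import math


def modulus_distance(h_m, r_m, t_m):
    """Radial part: L2 distance between the scaled head radius and the tail radius."""
    return math.sqrt(sum((h * r - t) ** 2 for h, r, t in zip(h_m, r_m, t_m)))


def phase_distance(h_p, r_p, t_p):
    """Angular part: L1 norm of sin((h + r - t) / 2), which is periodic in 2*pi."""
    return sum(abs(math.sin((h + r - t) / 2.0)) for h, r, t in zip(h_p, r_p, t_p))


def hierarchy_score(h_m, r_m, t_m, h_p, r_p, t_p, phase_weight=1.0):
    """Assumed hierarchy-aware score: negative weighted sum of both distances."""
    return -(
        modulus_distance(h_m, r_m, t_m)
        + phase_weight * phase_distance(h_p, r_p, t_p)
    )


# The relation doubles the radius (head is one level "above" the tail radius 2.0)
# and rotates the phase by pi / 2 to reach the tail's angle.
score = hierarchy_score(
    h_m=[1.0], r_m=[2.0], t_m=[2.0],
    h_p=[0.0], r_p=[math.pi / 2], t_p=[math.pi / 2],
)
```

The sine makes phases that differ by a full turn equivalent: `phase_distance([0.0], [2 * math.pi], [0.0])` is zero up to floating-point error, which is the periodic behaviour the text attributes to the sine term.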
The paired relation and hierarchy-aware parts exhibit limitations when applied individually in knowledge graph applications. However, their functionalities are complementary, motivating the integration of both parts in the HPRE model. The HPRE model can simultaneously address relational patterns, multi-relation scenarios, and hierarchy-level features by combining the paired relation part and the hierarchy-aware part. Incorporating relation pairs and hierarchical features enhances the interaction between entities and relations, improving the robustness and performance of the HPRE model. We present the formulation of HPRE as follows:
The composite score function for HPRE captures the complex relationships between entities and relations in the knowledge graph while considering the relational patterns and hierarchy features associated with relations. The score function combines the paired relation part score function and the hierarchy-aware part score function to generate a total score for triple evaluation. The composite score function is defined as follows:
Initialize:
    r ← zeros(‖R‖, k) for each
    [...]
    /* Sample a minibatch of size b */
    T_batch ← sample(T, b)
    /* Initialize the set of triples */
    U_batch ← ∅
    [...]
        /* Sample a corrupted triple */
        [...]
        U_batch ← U_batch ∪ {((h, r, t), (h′, r, t′))}
    [...]
    Calculate the score
    Update the loss function
    [...]
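The minibatch steps of the procedure above (sample a batch of true triples, then pair each one with a corrupted triple) can be sketched as follows. This is a sketch under stated assumptions: the corruption strategy (replace either the head or the tail with a uniformly sampled entity, rejecting known triples) and the names `corrupt` and `build_batch` are hypothetical and are not taken from the paper.

```python
import random


def corrupt(triple, entities, known, rng, max_tries=100):
    """Return a corrupted copy of `triple` that is not a known true triple.

    Assumption: either the head or the tail (chosen with equal probability)
    is replaced by a uniformly sampled entity; the relation is kept.
    """
    h, r, t = triple
    for _ in range(max_tries):
        e = rng.choice(entities)
        candidate = (e, r, t) if rng.random() < 0.5 else (h, r, e)
        if candidate not in known:
            return candidate
    raise RuntimeError("could not sample a corrupted triple")


def build_batch(triples, entities, batch_size, rng):
    """Sample a minibatch T_batch and pair every triple with a corrupted one."""
    known = set(triples)
    t_batch = rng.sample(triples, batch_size)  # T_batch <- sample(T, b)
    u_batch = []                               # U_batch <- empty set
    for triple in t_batch:
        u_batch.append((triple, corrupt(triple, entities, known, rng)))
    return u_batch


entities = ["a", "b", "c", "d"]
triples = [("a", "r", "b"), ("b", "r", "c"), ("c", "r", "d")]
batch = build_batch(triples, entities, batch_size=2, rng=random.Random(0))
```

Each element of `batch` is a (positive, negative) pair; the score function would then be evaluated on both members of each pair before the loss is updated.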
In this section, we present the procedure and results of the experiments, focusing on the evaluation metrics, comparison of results with other models in the link prediction task, and further analysis of the effects of various components of the HPRE model.
Datasets
To evaluate the effectiveness of the HPRE model, we conducted experiments using five distinct datasets: WN18 [4], WN18RR [28], FB15k [38], FB15k-237 [39], and YAGO3-10 [40]. The specific details of these datasets are presented in Table 2. The
Statistics of datasets
When assessing the performance of models in the link prediction task, several evaluation metrics are commonly used. Three widely used metrics are Mean Rank (MR), Mean Reciprocal Rank (MRR), and Hits@N. The Mean Rank metric measures the average rank of the correct answer among all possible answers. For each test triple, the model ranks the true tail entity among all possible tail entities, and MR is the average of these ranks across all test triples. A lower MR value indicates better performance, meaning the model ranks the correct answer higher. The MR is defined as follows:
The Hits@N metric measures the percentage of test triples for which the true entity is within the top N predicted entities, that is, how frequently the top N ranks include the correct answer. Typical values for N include 1, 3, and 10. A higher Hits@N value indicates better performance. We use
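The three metrics can be computed directly from the list of ranks assigned to the correct entities. The sketch below uses the standard definitions (MR is the mean of the ranks, MRR is the mean of their reciprocals, Hits@N is the fraction of ranks not exceeding N); the function names and the example ranks are illustrative and do not come from the paper's experiments.

```python
def mean_rank(ranks):
    """MR: average rank of the correct entity (lower is better)."""
    return sum(ranks) / len(ranks)


def mean_reciprocal_rank(ranks):
    """MRR: average of 1 / rank (higher is better, at most 1)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_n(ranks, n):
    """Hits@N: fraction of test triples whose correct entity is ranked in the top N."""
    return sum(1 for r in ranks if r <= n) / len(ranks)


# Ranks are 1-based: rank 1 means the correct entity received the best score.
example_ranks = [1, 2, 4, 10]
mr = mean_rank(example_ranks)
mrr = mean_reciprocal_rank(example_ranks)
hits3 = hits_at_n(example_ranks, 3)
```

MR is sensitive to a few very badly ranked triples, whereas MRR and Hits@N are dominated by the top of the ranking, which is why a model can improve on MRR while its MR gets worse.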
We implemented the HPRE model in Python using the PyTorch deep learning framework. According to our experiments, the optimal hyperparameter settings are as follows. For the FB15k-237 dataset, the embedding dimension is 1000; the learning rate is 5e-4; the hidden dimension and negative sample size are 500 and 1024; the hyperparameters ξ and γ are 1.0 and 11.0; and the hyperparameters α and β are 3.5 and 1.0. For the FB15k dataset, the embedding dimension is 1000; the learning rate is 5e-4; the hidden dimension and negative sample size are 500 and 1024; the hyperparameters ξ and γ are 1.5 and 17.0; and the hyperparameters α and β are 0.8 and 0.3. For the WN18RR dataset, the embedding dimension is 1000; the learning rate is 5e-4; the hidden dimension and negative sample size are 1000 and 1024; the hyperparameters ξ and γ are 0.5 and 8.0; and the hyperparameters α and β are 0.5 and 0.5. For the WN18 dataset, the embedding dimension is 1000; the learning rate is 5e-4; the hidden dimension and negative sample size are 1000 and 1024; the hyperparameters ξ and γ are 1.0 and 8.0; and the hyperparameters α and β are 0.5 and 0.5. For the YAGO3-10 dataset, the embedding dimension is 1000; the learning rate is 2e-4; the hidden dimension and negative sample size are 1000 and 1024; the hyperparameters ξ and γ are 1.0 and 28.0; and the hyperparameters α and β are 1.0 and 0.5.
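For reference, the settings listed above can be collected into a single configuration mapping. The values are copied from the text; the dictionary layout and the key names (`xi`, `gamma`, `alpha`, `beta`, and so on) are hypothetical and do not correspond to any released code.

```python
def make_config(hidden_dim, learning_rate, xi, gamma, alpha, beta):
    """Build one per-dataset configuration; shared values are filled in here."""
    return {
        "embedding_dim": 1000,        # identical for all five datasets
        "negative_sample_size": 1024, # identical for all five datasets
        "hidden_dim": hidden_dim,
        "learning_rate": learning_rate,
        "xi": xi,
        "gamma": gamma,
        "alpha": alpha,
        "beta": beta,
    }


HYPERPARAMS = {
    "FB15k-237": make_config(500, 5e-4, xi=1.0, gamma=11.0, alpha=3.5, beta=1.0),
    # The text writes "FB15" for this entry; it is assumed to mean FB15k.
    "FB15k": make_config(500, 5e-4, xi=1.5, gamma=17.0, alpha=0.8, beta=0.3),
    "WN18RR": make_config(1000, 5e-4, xi=0.5, gamma=8.0, alpha=0.5, beta=0.5),
    "WN18": make_config(1000, 5e-4, xi=1.0, gamma=8.0, alpha=0.5, beta=0.5),
    "YAGO3-10": make_config(1000, 2e-4, xi=1.0, gamma=28.0, alpha=1.0, beta=0.5),
}
```

Laid out this way, it is easy to see that only the hidden dimension, the learning rate for YAGO3-10, and the four scalar hyperparameters vary between datasets.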
Main results on link prediction
In the link prediction task, we performed experiments on five datasets, and Tables 3–7 present the HPRE model’s experimental results on these datasets. We compared against DistMult [23], ComplEx [24], ConvE [28], TransE [4], TransH [5], TransR [20], ConvKB [29], CapsE [41], RotatE [6], QuatE [21], ModE [8], HAKE [8], PairRE [11], DensE [43], DualDE [44], and DiriE [45]. In the following parts, we provide a detailed analysis of the experimental results.
Link prediction results on the FB15k-237
Table 3 presents the HPRE model’s results on the FB15k-237 dataset, which show favorable experimental outcomes. Compared with the PairRE model, HPRE improves MRR by 0.6%, which we attribute to its effective use of the hierarchical features inherent in the knowledge graph. HPRE also improves on HAKE by 1.1% on MRR, 0.9% on Hits@10, and 1.9% on Hits@1. Similarly, compared to ModE, HPRE improves by 1.6% on MRR, 1.7% on Hits@10, and 2.5% on Hits@1. These results show that the HPRE model effectively leverages the relation patterns embedded within the knowledge graph, leading to improved performance compared to both HAKE and ModE.
HPRE improves MRR by 6.3%, 7.3%, and 4.7% compared to TransE, TransH, and TransR, respectively. Compared to QuatE and RotatE, HPRE improves MRR by 0.9% and 1.9%, respectively. Compared to DualDE, HPRE improves MRR by 2%, and it outperforms DensE by 0.6% on MRR. We attribute this performance to HPRE’s effective handling of relational patterns and hierarchical features. The TransE model is limited to handling a single relationship and cannot address complex relationship patterns or hierarchical features, so HPRE significantly outperforms it. Apart from HAKE, the other translation models cannot handle hierarchical features, so HPRE outperforms most translation models. Furthermore, the multiplication models, which utilize less relational information, are inferior to the HAKE model, as is evident from the data presented in the table.
HPRE also has performance advantages over neural network models. Compared with ConvE, HPRE improves by 3.2% on MRR, 5% on Hits@10, 2.3% on Hits@3, and 3.2% on Hits@1. The table also shows that the ConvKB model outperforms the HPRE model, primarily because it exploits global relationships among the entity and relation embeddings. Additionally, we observed that ConvKB effectively generalizes the transitional characteristics of transition-based embedding models. Our future research will focus on incorporating hierarchical features into convolutional neural networks to improve performance further.
On the FB15k-237 dataset, the HPRE model has a lower MR than most models, meaning that it ranks the correct triples higher. However, translation-based models such as RotatE and QuatE currently achieve a lower MR than HPRE, which we attribute to a slight decrease in model stability after adding hierarchical features. The QuatE model obtains a more stable representation in a four-dimensional space, which results in a lower MR than the HPRE model.
Table 4 shows the results of HPRE on the FB15k dataset, where HPRE achieves competitive results on various metrics. Compared with PairRE and RotatE, HPRE improves by 1.3% and 2.7% on MRR, 0.2% and 1.4% on Hits@10, 0.6% and 2.1% on Hits@3, and 0.8% and 2.7% on Hits@1, respectively. The RotatE model fits relation patterns through relational rotation but lacks hierarchical features. HPRE also improves significantly on the translation models. Compared with TransR, HPRE improves by 21.1% on Hits@10. HPRE employs more sophisticated scoring functions than TransR; these functions capture complex relationship patterns and provide a more nuanced data representation, which contributes to the improvement on Hits@10. Compared to the neural network model ConvE, HPRE improves MRR by 16.7%. We attribute this to HPRE’s effective handling of relational patterns and integration of hierarchical features, whereas ConvE focuses solely on extracting implicit features of entities and relations.
Link prediction results on the FB15k
Table 5 presents the results of the HPRE model on the WN18RR dataset. The HPRE model achieves excellent results on this dataset, benefiting from paired relations that provide additional relation features. HPRE improves MRR by 1.3% and 1.7% compared to DualE and QuatE, respectively. We attribute these gains to the model’s handling of relational patterns and its incorporation of hierarchical features, which allow it to capture more nuanced and complex relationships within the knowledge graph. Compared to CapsE, HPRE improves by 9% on MRR, 1.8% on Hits@10, and 12.2% on Hits@1, which can likewise be attributed to the combination of hierarchical feature integration and effective handling of relational patterns. The CapsE model tends to focus on extracting implicit features of entities and relations rather than explicitly modeling the relationships, so it cannot fully leverage the relationship-specific features and interactions in the knowledge graph. On the WN18RR dataset, the neural network model demonstrates better generalization than the HPRE model, and the latter has significantly higher MR values. Noise or outliers in the WN18RR dataset can affect different models differently. If the HPRE model is more sensitive to such instances, its MR values will be higher, whereas the neural network model may be more robust to noise or outliers because it can learn more flexible representations, resulting in lower MR values.
The DualE model exhibits higher performance in Hits@10 compared to the HPRE model, while the HPRE model outperforms the DualE model in other evaluation indicators. We attribute this phenomenon to data fluctuations.
Link prediction results on the WN18RR
Table 6 shows the results of HPRE on the WN18 dataset. Compared with QuatE and DualE, HPRE improves MRR by 0.6% and 0.4%, respectively. Compared with RotatE, HPRE improves MRR by 0.9%. Compared with DistMult and ComplEx, HPRE improves by 15.9% and 1.5% on MRR and by 1.4% and 1.3% on Hits@10, respectively. The DistMult and ComplEx models use simple scoring functions based on inner products or bilinear forms. While these functions can capture fundamental interactions between entities and relations, they may fail to capture more nuanced relationships and higher-order interactions. We attribute HPRE’s advantage over DistMult and ComplEx on MRR and Hits@10 to its incorporation of hierarchical features, improved modeling of relation patterns, ability to handle multiple relation types, and optimization of the score function. In the WN18RR, HPRE exhibits slightly higher MR values than other models. We attribute these larger MR values to the integration of too many features, which reflects the model’s instability. In future work, we will focus on enhancing model stability to address this issue.
Link prediction results on the WN18
Table 7 shows the results of HPRE on the YAGO3-10 dataset. Compared with HAKE and ModE, HPRE improves MRR by 0.4% and 3.9%, respectively. Compared with the rotation-based model RotatE, HPRE improves by 5.4% on MRR, 4.3% on Hits@10, 4.0% on Hits@3, and 6.7% on Hits@1. Compared with DensE, HPRE improves MRR by 0.8%. One limitation of the DensE model relative to HPRE is its inability to effectively capture and utilize hierarchical features: it focuses on the dense embeddings of entities and relations and neglects the hierarchical structure in the data. We also compared the HPRE and HAKE models on the structural dataset YAGO3-10. The results indicate no significant performance difference between the two models. Both models show comparable performance because both are optimized for the model structure, which leaves room for improvement. On the other hand, the HPRE model performs significantly better than the ModE and RotatE models, owing to its ability to fit structural features. On the YAGO3-10 dataset, the hierarchical model based on HPRE surpasses neural network models by exhibiting significantly higher MR values. The hierarchical model’s superiority over neural network models is a notable advantage of using HPRE-based models.
Link prediction results on the YAGO3-10
Visualizing the triple embeddings verifies that the HPRE model effectively uses the hierarchical features of the knowledge graph. In this section, we conduct a visualization analysis of triple embeddings from three models: RotatE, PairRE, and HPRE. The entity vectors have a dimensionality of 1000. We project each model’s head and tail entity vectors onto a 2D space using a rectangular coordinate system, so that each head or tail entity is mapped to 1000 points, and observe the distribution of these points. A clear boundary between the head and tail entity points indicates that a model has learned the hierarchical features of the knowledge graph.
Figure 4 illustrates the visualization results obtained from the WN18RR dataset. Fig. 4a, 4d, and 4g present the embedding visualization using the RotatE model; Fig. 4b, 4e, and 4h use the PairRE model; and Fig. 4c, 4f, and 4i use the HPRE model. In the figure, the blue dots represent the head entity, while the orange dots represent the tail entity. We project the entity vector onto a coordinate system, and the distribution of points reflects the hierarchical features of the entity. Notably, since all modulus values are less than 1, a larger radius in the figure corresponds to a smaller modulus. Fig. 4a, 4b, and 4c show the triple (syngnathidae, _member_meronym, syngnathus). The distributions of points in Fig. 4a and 4b do not exhibit a clear distinction. However, in Fig. 4c, the points representing the head and tail entities are clearly differentiated. Specifically, the entity “syngnathidae” is at a higher hierarchical level under the relation “_member_meronym” than the entity “syngnathus.” Consequently, in Fig. 4c, the tail entity “syngnathus” is located closer to the center, while the head entity “syngnathidae” is positioned farther away. Fig. 4d, 4e, and 4f illustrate the triple (advantageous, _similar_to, meanwhile). Under the relation “_similar_to”, the head entity “advantageous” and the tail entity “meanwhile” are at the same hierarchical level, and the points representing them are uniformly distributed with no prominent differentiation. Fig. 4g, 4h, and 4i present the triple (inoculate, _hypernym, stick). The distributions of points in Fig. 4g and 4h lack clear differentiation. However, in Fig. 4i, there is a clear distinction between the points representing the head entity and those representing the tail entity. Under the relation “_hypernym”, the hierarchical level of “inoculate” is lower than that of “stick”. Consequently, in Fig. 4i, the head entity “inoculate” is positioned closer to the center, while the tail entity “stick” is situated farther away. These figures demonstrate the ability of HPRE to capture hierarchical semantic features accurately. Conversely, the RotatE and PairRE models have difficulty distinguishing hierarchical semantic features, as the hierarchies of different entities are not easily discernible.

Visualization of the embedding of triples from the WN18RR dataset.
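The projection described above can be sketched in a few lines. This is a minimal illustration, not the authors' plotting code: it assumes each embedding dimension carries a modulus and a phase (as in HAKE-style models), and it assumes the radius is the negative logarithm of the modulus, which is one scaling consistent with the statement that a larger radius corresponds to a smaller modulus when all moduli are below 1. The function names `polar_points` and `mean_radius` are hypothetical.

```python
import math


def polar_points(moduli, phases, eps=1e-12):
    """Map each embedding dimension (modulus, phase) to one 2D point.

    Assumption: radius = -log(modulus), so moduli below 1 give positive
    radii and a smaller modulus lands farther from the origin.
    """
    if len(moduli) != len(phases):
        raise ValueError("moduli and phases must have the same length")
    points = []
    for m, p in zip(moduli, phases):
        radius = -math.log(max(m, eps))  # clamp to avoid log(0)
        points.append((radius * math.cos(p), radius * math.sin(p)))
    return points


def mean_radius(points):
    """Average distance of the points from the origin."""
    return sum(math.hypot(x, y) for x, y in points) / len(points)


# Toy example: the head entity has smaller moduli than the tail entity,
# so its points lie farther from the center.
head = polar_points([0.1, 0.2], [0.0, math.pi / 2])
tail = polar_points([0.8, 0.9], [0.0, math.pi / 2])
```

With a 1000-dimensional entity vector, `polar_points` returns 1000 points; comparing `mean_radius` for the head and tail entities gives a rough numeric counterpart to the visual separation discussed for Fig. 4c and 4i.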
Convolutional neural networks (CNNs) can fit contextual features and capture implicit relationships. Specifically, CNNs aggregate local information to obtain general information and capture implicit semantic features from context. In contrast, the HPRE model captures the hierarchical features of the knowledge graph, which are tree-like features that constitute an explicit structure. In a knowledge graph, hierarchical relationships are relationships between entities in a tree-like structure, where each entity is a child or a descendant of a parent entity. Semantic or contextual relationships, on the other hand, are relationships between entities based on their meanings or contexts. In summary, hierarchical relationships in a knowledge graph are based on a fixed classification hierarchy, while semantic relationships are based on similarities, associations, or co-occurrences between entities.
To demonstrate the advantages of the HPRE model on the 1-to-1, 1-to-N, N-to-1, and N-to-N relation types, we evaluated the model's performance on each type. Figure 5 shows the performance comparison between PairRE, HAKE, and HPRE on these relation types. As can be seen from the figure, HPRE significantly outperforms the other models on multiple relation types. To observe the effectiveness of HPRE in more detail, we also evaluated its performance on specific relations within each relation type. Table 8 shows the results of the HPRE model on these specific relations. HPRE achieved good results on them compared to PairRE and HAKE. The superior performance of HPRE across relation types is mainly due to the effective learning of the paired relation vectors: because HPRE represents each relation as a pair of vectors, it outperforms the HAKE model on complex relations. The experimental results suggest that the HPRE model excels at handling multiple relation types.

Performance comparison between RotatE, PairRE, and HPRE.
Hits@10 and MR on some 1-to-1, 1-to-N, N-to-1, and N-to-N relations in FB15K-237.
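The relation types and metrics used above can be illustrated with a short sketch. The source does not state how relations were assigned to the four types; the code below assumes the common convention of averaging the number of tails per head and heads per tail for each relation and comparing both averages against a threshold of 1.5. The function names are hypothetical, and `ranks` stands for the rank of the correct entity in each test query.

```python
from collections import defaultdict


def categorize_relations(triples, threshold=1.5):
    """Label each relation as 1-to-1, 1-to-N, N-to-1, or N-to-N.

    Assumption: a side counts as "N" when its average fan-out is at
    least `threshold` (1.5 is a common convention, not from the source).
    """
    tails = defaultdict(set)  # (relation, head) -> distinct tails
    heads = defaultdict(set)  # (relation, tail) -> distinct heads
    for h, r, t in triples:
        tails[(r, h)].add(t)
        heads[(r, t)].add(h)

    tph_counts = defaultdict(list)  # tails per head, grouped by relation
    hpt_counts = defaultdict(list)  # heads per tail, grouped by relation
    for (r, _), ts in tails.items():
        tph_counts[r].append(len(ts))
    for (r, _), hs in heads.items():
        hpt_counts[r].append(len(hs))

    categories = {}
    for r in tph_counts:
        tph = sum(tph_counts[r]) / len(tph_counts[r])
        hpt = sum(hpt_counts[r]) / len(hpt_counts[r])
        head_side = "N" if hpt >= threshold else "1"
        tail_side = "N" if tph >= threshold else "1"
        categories[r] = f"{head_side}-to-{tail_side}"
    return categories


def hits_at_k(ranks, k=10):
    """Fraction of test queries whose correct entity is ranked in the top k."""
    return sum(1 for rank in ranks if rank <= k) / len(ranks)


def mean_rank(ranks):
    """Average rank of the correct entity (lower is better)."""
    return sum(ranks) / len(ranks)


# Toy triples with made-up entity and relation names.
toy_triples = [
    ("a", "born_in", "x"), ("b", "born_in", "x"), ("c", "born_in", "x"),
    ("p", "has_part", "q1"), ("p", "has_part", "q2"),
    ("m", "spouse", "n"),
]
toy_categories = categorize_relations(toy_triples)
```

Grouping the test ranks by the category returned from `categorize_relations` and applying `hits_at_k` and `mean_rank` per group yields a per-type breakdown of the kind reported in Table 8.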
To verify the effectiveness of the HPRE model on relation patterns, we visualize the relation vectors. Figure 6 shows the distribution of the entries of each relation vector. Figure 6a shows the symmetric relation _similar_to; the obtained vectors are approximately symmetric. Figure 6b shows the properties of inverse relations. We use _hypernym-1 to denote the inverse relation of _hypernym, and we plot the distributions of _hypernym and _hypernym-1. The distributions of the two relations are approximately symmetric to each other, which shows that HPRE can effectively distinguish inverse relations. Figure 6c shows the properties of composition relations. In Fig. 6c, winner represents the relation /award/award_category/winners./award/award_honor/award_winner, for1 represents the relation /award/award_nominee/award_nominations./award/award_nomination/nominated_for, and for2 represents the relation /award/award_category/nominees./award/award_nomination/nominated_for. ⊗ denotes the composition operation. In the FB15k-237 dataset, for2 is a composition of for1 and winner. Fig. 6c shows that the vector entry distribution is symmetrical and that the composition of for1 and winner is very similar to for2.

Histograms of angles corresponding to some relation embeddings.
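The pattern checks behind these histograms can be made concrete under one explicit assumption: that each relation is described by a vector of phase angles that add under composition, as in RotatE. The source does not confirm that HPRE's paired relation vectors are parameterized this way, so the following is only an illustrative sketch with hypothetical function names.

```python
import math

TWO_PI = 2.0 * math.pi


def wrap(phase):
    """Wrap an angle into the interval [-pi, pi)."""
    return (phase + math.pi) % TWO_PI - math.pi


def compose(phases_a, phases_b):
    """Compose two relations by adding their phases (RotatE-style assumption)."""
    return [wrap(a + b) for a, b in zip(phases_a, phases_b)]


def inverse(phases):
    """Inverse relation: negate every phase."""
    return [wrap(-p) for p in phases]


def angular_gap(phases_a, phases_b):
    """Mean absolute wrapped difference between two phase vectors."""
    total = sum(abs(wrap(a - b)) for a, b in zip(phases_a, phases_b))
    return total / len(phases_a)


def is_symmetric(phases, tol=1e-6):
    """A relation is symmetric when applying it twice is the identity,
    i.e. every phase is a multiple of pi."""
    return all(abs(wrap(2.0 * p)) < tol for p in phases)


# Toy phases: composing a relation with its inverse gives the identity.
rel = [0.3, -1.2, 3.0]
identity_gap = angular_gap(compose(rel, inverse(rel)), [0.0, 0.0, 0.0])
```

Under this assumption, the composition claim for Fig. 6c would correspond to `angular_gap(compose(for1, winner), for2)` being close to zero, where `for1`, `winner`, and `for2` are the learned phase vectors of the three relations.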
We measured the running time of the different operators in each model and counted the model parameters using the fvcore toolkit. Table 9 displays the results. The RotatE model has the shortest running time and the fewest parameters, whereas the HPRE and HAKE models have the same running time. Regarding performance, the HPRE model outperforms the RotatE model, albeit at the cost of slower running speed. In future work, we will focus on improving the running speed of the HPRE model without compromising its accuracy.
Running time and parameters
Our research proposed a knowledge graph embedding approach for link prediction, utilizing a multi-dimensional space to model various relation types, relation patterns, and hierarchical features. The HPRE model consists of two main components: the paired relation part and the hierarchy-aware part. In the paired relation part, we leverage paired relations to effectively capture multiple relation types and discern relation patterns within the knowledge graph. The hierarchy-aware part of HPRE models the hierarchical characteristics of different entities by assigning them specific angles within a polar coordinate space. To integrate these components, we introduced a composite score function explicitly tailored for the HPRE model. We conducted extensive experiments to validate the effectiveness of HPRE. The experimental results demonstrated that our model outperforms most current models for link prediction. Moreover, our model can simultaneously handle multiple relation types, relation patterns, and hierarchical features within a unified framework.
In the future, we plan to enhance the capabilities of our model by incorporating first-order logic rules to encode more intricate relationships within the knowledge graph. First-order predicate logic allows us to define and manipulate variables, constants, predicates, and quantifiers. Using first-order predicate logic, we can capture intricate patterns, dependencies, and constraints within a knowledge graph or domain. By combining the expressive power of first-order logic with the flexibility and representation learning capabilities of neural network models, we aim to achieve a more comprehensive understanding of complex relationship patterns and improve the performance of our model accordingly.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (No. 61976032).
Competing interests
The authors have no competing interests to declare that are relevant to the content of this article.
