A faster and less aggressive algorithm for correcting conservativity violations in ontology alignments

Abstract

Ontologies are computational artifacts that model consensual aspects of reality. In distributed contexts, applications often need to utilize information from several distinct ontologies. In order to integrate multiple ontologies, entities modeled in each ontology must be matched through an ontology alignment. However, imperfect alignments may introduce inconsistencies. One kind of inconsistency that is often introduced is the violation of the conservativity principle, which states that the alignment should not introduce new subsumption relations between entities from the same source ontology. We propose a two-step quadratic-time algorithm for automatically correcting such violations, and evaluate it against datasets from the Ontology Alignment Evaluation Initiative 2019, comparing the results to a state-of-the-art approach. The proposed algorithm was significantly faster and less aggressive; that is, it performed fewer modifications over the original alignment when compared to the state-of-the-art algorithm.

Keywords

Ontology, ontology alignment, semantic interoperability

1. Introduction

Studer et al. (1998) defined ontology as a “formal and explicit specification of a shared conceptualization”. In this definition, “formal” means that it should be machine-readable, “explicit” reflects that the concepts and constraints on their use must be defined explicitly, “shared” means that the knowledge contained in an ontology is consensual among a group of people and it is not private to some individual and, finally, “conceptualization” refers to an abstract model of a portion of reality. Thus, ontologies specify well-defined meaning for the information used by some community. It is often the case that several communities build distinct ontologies for domains that are related or even the same. Frequently, it is useful to integrate information from these separate ontologies, especially in distributed contexts such as the Semantic Web. In order to integrate two ontologies, it is necessary to find mappings between concepts from both ontologies. The process of finding such mappings is called ontology matching, and the result of this process is an ontology alignment.

Although there are several distinct approaches to match ontologies, none is free of errors and ontology alignments frequently lead to unintended consequences. Among the most common consequences of flawed alignments is the violation of the conservativity principle. The conservativity principle states that merging two ontologies through an alignment should not introduce new relations of subsumption or of equivalence between concepts originating from the same source ontology. A subsumption relation holds between two concepts c and d if and only if every instance of d is also an instance of c – thus, we say that c subsumes d. An equivalence relation holds between two concepts if they share all their instances in every possible state of affairs, i.e., c is equivalent to d if and only if every instance of c is also an instance of d and every instance of d is an instance of c. Thus, equivalence implies mutual subsumption and is implied by it. Therefore, c is equivalent to d if and only if both c subsumes d and d subsumes c.

If an alignment violates the conservativity principle, the merged ontology may not preserve the original meanings of the concepts specified by the source ontologies. For example, if an alignment matches the concepts person, client and company from two ontologies A and B, where ontology A specifies that person subsumes client (that is, every client is a person) and ontology B states that client subsumes company (i.e., every company is a client), a query for person in the merged ontology will return every instance of company in the knowledge base, which does not reflect the results of the same query in either ontology A or ontology B separately. That is, the merged ontology does not agree with the intended meaning of either source ontology, i.e., it is incompatible with their conceptualizations.
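The effect described in this example can be reproduced with a minimal sketch. The dictionaries below are a hypothetical encoding of the two ontologies (concept names are simply reused as identifiers), and the helper `subsumers` is an illustrative name introduced here, not part of any ontology library.

```python
def subsumers(concept, parents):
    """Return every concept that subsumes `concept`, including itself."""
    seen, stack = {concept}, [concept]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

# Hypothetical encoding of the example: concept -> direct subsumers.
ontology_a = {"client": {"person"}}    # A: every client is a person
ontology_b = {"company": {"client"}}   # B: every company is a client

# Merging through the alignment identifies the equally named concepts,
# so the subsumption axioms of both ontologies end up in one hierarchy.
merged = {}
for ontology in (ontology_a, ontology_b):
    for concept, direct in ontology.items():
        merged.setdefault(concept, set()).update(direct)

# Subsumers of company that appear only after the merge.
new_for_a = subsumers("company", merged) - subsumers("company", ontology_a)
new_for_b = subsumers("company", merged) - subsumers("company", ontology_b)
print(new_for_a, new_for_b)
```

In this sketch, person subsumes company only in the merged hierarchy, which is exactly the kind of new subsumption relation that the conservativity principle forbids.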

This disagreement indicates that the alignment is matching concepts with divergent meanings in each ontology. Such matches occur due to terminological or structural similarities between the concepts, which lead the domain expert or the matching algorithm to believe their meaning is the same even when it is not. Figure 1 depicts an abstraction of this problem. An alignment may cover any portion of the intersection between ontologies A and B. However, a large part of the intersection of the ontologies’ terminologies does not account for their conceptualizations, i.e., the intended meaning of the terms. The goal of this work is to improve ontology alignments by removing incorrect matches that originate from this false agreement. The violations of the conservativity principle are evidence of such incorrect matches.

Fig. 1.

False agreement between ontologies A and B.

In the previous example, the concept client in ontology A refers specifically to personal clients, i.e., persons that are clients of some company, while in ontology B the concept client refers to a broader range of entities, including corporate clients, that is, companies which are clients of other companies. Further, in ontology B the concept company actually refers only to corporate clients, excluding suppliers, for example, while the concept company in ontology A does not have this restriction. Therefore, concepts client and company do not have the same meaning in both ontologies. The introduction of new subsumption relations by the alignment, which violates the conservativity principle, evidences this fact.

Another frequent consequence of flawed alignments is the inconsistency of the merged ontology, that is, the presence of concepts that cannot be satisfied. While inconsistency reflects logical flaws in the alignment, conservativity violations reflect mismatching of meaning. That is, inconsistency arises from contradictory axioms in the merged ontology, regardless of the meaning of the concepts being the same. Conservativity violations, on the other hand, arise from conflicting sets of subsumption and equivalence relations. This conflict reveals disagreements between the meaning of concepts as expressed in the ontologies, regardless of the effect these sets have on the satisfiability of the resulting axioms.

As such, conservativity violations are harder for the user to perceive, and much of their danger lies in the fact that the resulting ontology upholds a divergent view of the world, a fact that is not immediately clear for the users. For the same reasons, non-conservative alignments are even more common than inconsistent alignments. The present work deals only with the problem of detecting and correcting conservativity violations. Consistency issues fall outside the scope of this work.

It should be noted that there are distinct perspectives on conservativity violations, regarding whether such violations indicate flaws in the alignment or in one or both source ontologies (Solimando et al., 2017). The latter interpretation holds that the introduction of new subsumption relations in the merged ontology is evidence of missing subsumption relations in the source ontologies. This interpretation is adopted by Lambrix and Liu (2013) and Ivanova and Lambrix (2013). We adopt the perspective that violations of conservativity denote a mismatching of the aligned entities, since separate ontologies are usually designed with distinct outlooks on the world, which will culminate in conflicting sets of subsumption relations – a fact which is disregarded if we consider conflicts as flaws in the ontology itself. On the contrary, we emphasize this consideration by understanding the conflicting subsumption relations as signs of a defective alignment.

Another important aspect of conservativity violations is that there are two types of violations, violations of subsumption conservativity and violations of equivalence conservativity. Violations of subsumption conservativity occur when the alignment introduces new subsumption relations between concepts originating from the same source ontology. Violations of equivalence conservativity occur when the alignment introduces new equivalence relations between concepts originating from the same source ontology. Alignments may set forth new equivalence relations due to either introducing circular chains of subsumption relations or to mapping two or more entities from one of the source ontologies to a single entity in the other.

The algorithm presented in this paper is an evolution of the one proposed by Antunes et al. (2019). Our algorithm corrects both kinds of conservativity violations, i.e., violations of equivalence and of subsumption conservativity. Since equivalence implies (and is implied by) mutual subsumption, the algorithms do not explicitly differentiate violations of subsumption conservativity from violations of equivalence conservativity, even though other approaches deal differently with violations of these two distinct kinds, such as in the work of Solimando et al. (2014b, 2017).

The remainder of this paper is organized as follows. Section 2 presents and discusses related work on the correction of conservativity violations in ontology alignments. Section 3 describes the proposed algorithm. We present a theoretical time-complexity analysis in Section 4 and sketch an analysis of correctness in Section 5. Section 6 discusses the evaluation of the algorithm against reference datasets from the Ontology Alignment Evaluation Initiative (OAEI) and a comparison with a state-of-the-art algorithm. Finally, Section 7 contains a brief conclusion.

2. Related work

The conservativity of ontology mappings appears in the work of Meilicke (2006) under the name of “stability”, along with an algorithm that checks stability in mappings from individual ontologies into the merged ontology. The check is conducted by selecting every concept in the local ontology and verifying if the set of superclasses is the same in the local context (i.e., inside the source ontology) and in the distributed context (in the merged ontology). If the sets are different for at least one concept, the mapping is not stable. However, the algorithm does not correct the detected violations.

In the ASMOV algorithm (ASMOV stands for Automated Semantic Merging of Ontologies with Verification), proposed by Jean-Mary et al. (2009), the semantic verification of ontology alignments is conducted by iteratively selecting two pairs of matched entities and verifying several kinds of inference. The algorithm verifies the occurrence of multiple-entity correspondences (introduction of equivalence relations due to the matching of a single entity to several), crisscross correspondences (introduction of equivalence due to the matching of pairs of entities with inverse subsumption relations in each ontology), disjointness-subsumption contradiction (introduction of subsumption between disjoint entities), subsumption and equivalence incompleteness (introduction of subsumption and equivalence in general), and domain and range incompleteness (mismatch of the domains of matched properties). Excluding domain and range incompleteness, the checks conducted refer to violations of the conservativity principle.

The technique used by Jiménez-Ruiz and Grau (2011) checks only for violations of equivalence conservativity by verifying whether the alignment directly maps two different concepts from one source ontology to a single concept in the other. To the best of our knowledge, this is the only existing approach for correcting conservativity that does not require the merging of the aligned ontologies. However, it does not account for violations of subsumption conservativity, and it ignores the cases in which violations of equivalence conservativity arise indirectly from the inclusion of circular subsumption chains.

In the work of Lambrix and Liu (2013), a network of aligned ontologies is debugged by computing a merged ontology over the network and checking if every subsumption relation in the merged ontology between a pair of concepts from the same source ontology also exists in the source ontology. Their approach assumes the opposite perspective from the one adopted in the present paper, that is, it assumes that the violations detected indicate missing subsumption relations in the source ontologies. Therefore, their algorithm corrects the violations through the inclusion of new subsumption relations in the original ontologies. The method conducts this correction by selecting, for each missing subsumption relation “a subsumes b”, a concept c that subsumes a and does not subsume b and a concept d subsumed by b that is not subsumed by a, and creating a new subsumption relation “c subsumes d”. The algorithm presents all possible corrections to the user, who then selects the preferred correction. Ivanova and Lambrix (2013) modified this approach, in order to acknowledge that at least some conservativity violations are indications of flaws in the alignments, by submitting the detected violations to a domain expert for validation.

Solimando et al. (2014a) first extract locality-based modules from both the aligned ontologies, encode the modules as Horn propositional theories and extend such theories with disjointness axioms. By following the assumption of disjointness (i.e., that all concepts that do not share subsumees are disjoint), the problem of detecting subsumption conservativity violations is reduced to the detection of unsatisfiable concepts in the Horn clauses. However, this assumption is not reasonable, since many ontologies accept the instantiation of multiple concepts without a common subsumee, mainly due to the combinatorial explosion of concepts that would otherwise have to be made explicit. Take as an example a small ontology with concepts man, woman, teenager, adult, elderly, student, artist, retail worker, farmer. Making explicit every possible combination of non-disjoint concepts, even if we assume that the sets {man, woman}, {teenager, adult, elderly} and {student, artist, retail worker, farmer} are each pairwise disjoint, would require the inclusion of at least 61 new concepts, including concepts such as teenager woman, elderly farmer and adult male artist. This means that the small ontology would grow more than five times. If we do not assume that the concepts in each set are disjoint, this number grows even larger. Another important aspect of the mentioned work is that locality-based modules are extracted from the aligned ontologies and codified as Horn propositional theories prior to their extension with additional disjointness axioms and the detection of unsatisfiable concepts.

Solimando et al. (2014b, 2017) extended this approach to cover both types of conservativity violations by using it in conjunction with an algorithm that detects violations of equivalence conservativity by searching for loops in the ontologies represented as directed graphs, where nodes are concepts and edges are subsumption relations. The proposed algorithm was implemented as an extension of the LogMap ontology matching and mapping repair system (Jiménez-Ruiz and Grau, 2011).
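The idea of detecting new equivalences through loops can be illustrated with a small sketch. It is not the LogMap implementation: the graph encoding (edges from a concept to its subsumers, with each equivalence mapping stored as two edges), the `source_of` dictionary and both function names are assumptions made for this example, and the sketch further assumes that the reported pairs were not already equivalent in their source ontology.

```python
def reachable(start, successors):
    """Nodes reachable from `start` by following at least one edge."""
    seen, stack = set(), [start]
    while stack:
        for nxt in successors.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def new_equivalences(successors, source_of):
    """Pairs of distinct concepts from one ontology that lie on a common loop."""
    reach = {node: reachable(node, successors) for node in successors}
    return {
        frozenset((u, v))
        for u in reach
        for v in reach[u]
        if u != v and source_of[u] == source_of[v] and u in reach.get(v, ())
    }

# Hypothetical merged graph: a1 is subsumed by a2 in A, b1 by b2 in B,
# and the alignment states a1 = b2 and a2 = b1 (a crisscross mapping).
graph = {
    "a1": {"a2", "b2"},
    "a2": {"b1"},
    "b1": {"b2", "a2"},
    "b2": {"a1"},
    "a3": {"a1"},  # a3 is subsumed by a1 but lies on no loop
}
source_of = {"a1": "A", "a2": "A", "a3": "A", "b1": "B", "b2": "B"}
pairs = new_equivalences(graph, source_of)
```

Computing reachability from every node is quadratic in the worst case; a strongly connected components algorithm would give the same pairs more efficiently, but the simpler form is kept here for readability.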

Table 1 summarizes the reviewed work. As we have seen, most approaches to the validation of conservativity in ontology alignments rely on merging the aligned ontologies. The work of Solimando et al. (2014a,b, 2017) departs from earlier work by not iterating over all entities in the ontologies, but still requires heavy reasoning tasks after merging the ontology modules. The last two rows list our previous work (Antunes et al., 2019), which will be detailed in the next section, and the current paper. Our techniques do not require merging any portion of the ontologies, and both iterate only over the entities present in the alignment. However, the previous approach was limited to detecting violations without correcting them. The next section presents our current approach in detail.

Table 1
Comparison of alignment conservativity validation approaches

Approach	Computes repair	Requires merge	Violation as flaw in	Automatic	Extracts modules
Meilicke (2006)	No	Yes	Alignment	Yes	No
Jean-Mary et al. (2009)	Yes	Yes	Alignment	Yes	No
Jiménez-Ruiz and Grau (2011)¹	Yes	No	Alignment	Yes	Yes
Lambrix and Liu (2013)	Yes	Yes	Ontologies	Semi	No
Ivanova and Lambrix (2013)²	Yes	Yes	Either	Semi	No
Solimando et al. (2014a)³,⁴	Yes	Yes	Alignment	Yes	Yes
Solimando et al. (2014b, 2017)⁴	Yes	Yes	Alignment	Yes	Yes
Antunes et al. (2019)	No	No	Alignment	Yes	No
This paper	Yes	No	Alignment	Yes	No

¹ Only considers violation of equivalence conservativity.
² The user defines whether each violation indicates a flaw in the alignment or in the ontologies.
³ Only considers violation of subsumption conservativity.
⁴ Follows the assumption of disjointness.

Additionally, while some authors regard conservativity violations as flaws in the aligned ontologies, the largest trend is to acknowledge them as flaws in the alignment itself. We follow this trend and take a “better safe than sorry” approach, aiming to remove every violation from the alignment.

3. Our approach

In a previous paper (Antunes et al., 2019), we analyzed conservativity violations under category theory, a branch of mathematics that studies the structure in systems of composable relations. That work defined a category of ontologies where such relations are subsumption-preserving total mappings between ontological structures, and an alignment between ontologies A and B is a structure with two mappings $f_{A} : V \to A$ and $f_{B} : V \to B$ , where equivalent entities in A and B are mapped from the same entity in V. Figure 2 depicts one such alignment, where entity c in ontology A is matched to entity g in ontology B and d is matched to h. We note that this formalization suffices only for mappings of equivalence between entities in the aligned ontologies, and is not enough to represent other types of mappings (e.g., an alignment stating that entity e in A subsumes entity g in B). From this analysis, the paper proposed an algorithm for detecting conservativity violations in ontology alignments by computing, for each aligned ontology, the range ontology that contains every entity in the alignment and every subsumption relation between them in the respective ontology. Then, the algorithm checks if there are mappings between the range ontologies that preserve both subsumption and the original mappings, that is, for any entity x in V, $f_{A} (x)$ should be mapped to $f_{B} (x)$ and vice versa. The approach described in Antunes et al. (2019), however, does not include a method for correcting the detected violations.

Fig. 2.

Alignment between ontologies A and B.

In the remainder of this section, we present our current approach for correcting conservativity violations in ontology alignments. The category-theoretic formalization guides the development of our approach, understanding alignments as pairs of mappings and conservativity as the existence of subsumption-preserving mappings between the range ontologies. Essentially, the algorithms search the category of ontologies for an object with an injective mapping into the original alignment, i.e., a “subset” of the alignment, such that the composition of this injective mapping with the original mappings (from the alignment and into the ontologies) yields a new alignment for which the subsumption-preserving mappings between the range ontologies do exist.

In the scope of the proposed algorithm, we formalize ontologies as pairs (E,S) of a set of entities E (which contains both concepts and relations among them) and a reflexive, transitive and antisymmetric relation $S \subseteq E \times E$ that denotes subsumption between entities.

The implementation of the algorithms deals with OWL 2 ontologies. In this context, the set E in the formalization is actually the union of the set of classes and properties in the OWL ontology, and S is the union of the sets of subclass and subproperty relations, both declared and inferred. We utilize the function “ancestors”, that returns the set of superclasses or superproperties for classes and properties respectively. In this manner, the algorithm is able to give a single treatment for both classes and properties, although in reality they constitute disjoint sets. Thus, we present both in a single set for simplification.

Following our previous work, first it is necessary to build the taxonomy of the concepts for each of the range ontologies A’ and B’ that include all aligned entities (and no other entity) and every subsumption relation between them in A and B respectively. These range ontologies are rather “copies” of the alignment, extended to include the subsumption relations in the respective ontology. Therefore, their sets of entities are isomorphic to the set of entities in V. For simplification, we will assume $E_{V} = E_{A^{'}} = E_{B^{'}}$ .

Even when the alignment maps two entities c and d into a single entity e in A (or in B), the two entities will appear in the respective range ontology – with additional subsumption relations c subsumes d and d subsumes c, given that subsumption is a reflexive relation and therefore e subsumes e. Similarly, if the alignment maps c and d respectively to e and f, such that an equivalence relation holds between e and f, the range ontology will include relations c subsumes d and d subsumes c, since equivalence implies mutual subsumption (i.e., e is equivalent to f if and only if e subsumes f and f subsumes e).

Once we have the taxonomies built, still according to Antunes et al. (2019), the existence of a subsumption preserving mapping between the range ontologies determines the conservativity of the alignment. Since the sets of entities in the range ontologies are isomorphic, such mapping relies on compatible subsumption relations in the range ontologies. Thus, the method detects conservativity violations by finding discrepancies between the subsumption relations from the range ontologies, i.e., pairs $(e_{1}, e_{2})$ such that $(e_{1}, e_{2}) \in S_{A^{'}} \oplus (e_{1}, e_{2}) \in S_{B^{'}}$ , where ⊕ denotes exclusive logical disjunction – that is, pairs for which a subsumption relation holds in only one of the aligned ontologies.
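As a minimal sketch of this detection step, the discrepancies can be computed as the symmetric difference of the two subsumption relations. The encoding below is an assumption made for illustration: each relation is a set of (subsumed, subsumer) pairs over the shared entity set, the helper `reflexive` is a hypothetical name, and the data reuse the person, client and company example from the introduction.

```python
def reflexive(entities, pairs):
    """Add the reflexive pairs (e, e) to a set of (subsumed, subsumer) pairs."""
    return set(pairs) | {(e, e) for e in entities}

# Shared entity set of the range ontologies (E_V = E_A' = E_B').
entities = {"person", "client", "company"}

# Subsumption in each range ontology, as (subsumed, subsumer) pairs.
s_a = reflexive(entities, {("client", "person")})   # A': client under person
s_b = reflexive(entities, {("company", "client")})  # B': company under client

# A pair is a discrepancy when it holds in exactly one range ontology,
# which is the exclusive disjunction used in the text.
discrepancies = s_a ^ s_b
```

The reflexive pairs appear in both relations and therefore cancel out, so only the two conflicting subsumption relations remain.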

Therefore, in order to correct the detected conservativity violations, it is necessary to remove from the alignment V a set of entities comprehensive enough that no subsumption relation between a pair of aligned entities occurs in only one of the aligned ontologies. We propose to model the offending subsumption relations as edges in a graph whose nodes are the aligned entities. Thus, every vertex cover of the graph is a suitable set of entities to be removed from the alignment for it to become conservative.

Problem statement. Given a graph G such that:

The nodes of G are the entities in the alignment, and

For any two nodes a and b, (a,b) is an edge if and only if $(a, b) \in S_{A^{'}} \oplus (a, b) \in S_{B^{'}}$ ,

Find a vertex cover C for G, i.e., a set of nodes such that for every edge in G at least one of its endpoints is in the cover, defining

$c_{i} = 1$ if $i \in C$ , and $c_{i} = 0$ otherwise.

Goal function. $\begin{array}{ll} (1) & \forall (a, b) \in edges,\ c_{a} + c_{b} \geq 1 \\ (2) & \text{minimize } \sum_{n \in nodes} c_{n} \end{array}$

It is preferable that the chosen set be as small as possible, in order to minimize the difference between the original alignment and the corrected alignment and preserve the largest possible amount of information. However, finding the minimum vertex cover of a graph is NP-hard (its decision version is NP-complete, as proved by Karp (1972)), and no polynomial-time algorithm is known for this task.

Nevertheless, the minimum vertex cover can be approximated within a factor of 2 by a simple greedy algorithm described by Papadimitriou and Steiglitz (1998). This algorithm finds a cover that is at most twice the size of the minimum vertex cover by repeatedly choosing an edge from the graph, removing both of its endpoints from the graph and inserting them into the cover, until no edges are left. While slightly better approximations exist, such as the one proposed by Karakostas (2005), our approach takes advantage of the greedy algorithm by exploiting the fact that the graph is not given as input, but must be constructed from the alignment. Thus, we compute the cover at the same time as building the graph. Whenever the algorithm, during the graph construction process, would introduce an edge in the graph, it instead removes both endpoints from the graph and includes them in the cover.
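The greedy 2-approximation can be sketched in a few lines. The function name and the sample edge list are illustrative assumptions; the graph is given here as an explicit list of edges, whereas the approach described above discovers the edges while traversing the alignment.

```python
def greedy_vertex_cover(edges):
    """Pick uncovered edges one at a time and take both endpoints."""
    cover = set()
    for a, b in edges:
        # Skipping covered edges has the same effect as deleting the
        # endpoints (and their incident edges) from the graph.
        if a not in cover and b not in cover:
            cover.update((a, b))
    return cover

# Path c - a - b - d: the minimum cover is {a, b}, of size 2.
edges = [("c", "a"), ("a", "b"), ("b", "d")]
cover = greedy_vertex_cover(edges)
```

For this edge order the sketch selects the edges (c, a) and (b, d) and returns all four nodes, which is exactly twice the minimum and illustrates why a later improvement step is useful.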

Figure 3 shows a diagram with our approach’s main steps. First, the algorithm builds the graph from the alignment (step 1), selecting edges and removing both endpoints from the graph (step 2) while including those nodes in the vertex cover until no edge remains (step 3). We perform an additional improvement procedure to reduce the repair size by verifying if reintroducing the removed nodes leads to conservativity violations (step 4) and reinserting the “safe” nodes (step 5). Finally, the computed vertex cover is the set of mappings to be removed from the alignment for it to become conservative.

Fig. 3.

Depiction of the repair computation for a sample alignment.

We present our greedy algorithm in detail in Algorithm 1. This algorithm performs steps 1 through 3 from Fig. 3. In the remainder of this work, we will refer to Algorithm 1 as GREEDY. It receives as input the alignment (that is, V and the two mappings $f_{A} : V \to A$ and $f_{B} : V \to B$ ) and the concept hierarchies from both input ontologies provided by an OWL 2 reasoner. The algorithm initializes the repair R as an empty set. The sets $t x_{A}$ and $t x_{B}$ contain the taxonomies of subsumption relations in the range ontologies from A and B, respectively, and are also initialized as empty sets. GREEDY iterates over the entities in the alignment (line 4) and, for each entity e, checks its ancestors first in A ( $a \in ancestors (f_{A} (e))$ , line 6) and then in B ( $b \in ancestors (f_{B} (e))$ , line 10), and finds the entities in the alignment that are mapped to each ancestor ( $e_{a} \in f_{A}^{- 1} (a)$ in line 7 and $e_{b} \in f_{B}^{- 1} (b)$ in line 11). Then, the algorithm includes the corresponding subsumption relation in $t x_{A}$ or $t x_{B}$ accordingly ( $(e, e_{a})$ and $(e, e_{b})$ , respectively). Whenever the algorithm finds ancestors in A while building $t x_{A}$ , it includes them in an auxiliary set and later compares them to the ancestors in B after building $t x_{B}$ (line 17). Whenever the algorithm finds a subsumption relation in B while building $t x_{B}$ , it compares it to those in $t x_{A}$ . If one such relation is present in only one of the taxonomies, GREEDY removes both involved entities from the alignment and inserts them into the repair (lines 13 and 20). In this case, the algorithm moves on to the next entity that is still in the alignment.

It is noteworthy that this approach is enough to cover violations of equivalence conservativity as well as violations of subsumption conservativity. Since subsumption is reflexive (i.e., every entity subsumes itself), if the alignment maps two entities e and d in V to a single entity in A (i.e., $f_{A} (e) = f_{A} (d)$ ) and into different entities in B, the subsumption relations (e, d) and (d, e) will be present in $t x_{A}$ , while $t x_{B}$ will not contain both matching relations unless $f_{B} (e) = f_{B} (d)$ , that is, unless the alignment also maps e and d to equivalent entities in B. In this manner, the algorithm detects the introduction of a new equivalence relation in B.

Algorithm 1

GREEDY. Greedy algorithm for computing alignment repair
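The procedure described in the text can be sketched in Python as follows. This is an illustrative reading of the description rather than the original listing, so the line numbers cited in the text do not apply to it. All names (`greedy_repair`, `anc_a`, `anc_b`, `inv_a`, `inv_b`) are assumptions, the ancestor dictionaries stand in for the hierarchies supplied by a reasoner and are assumed to be reflexive and transitively closed, and the taxonomies are represented implicitly by the per-entity sets `in_a` and `in_b` instead of explicit sets of pairs.

```python
from collections import defaultdict

def greedy_repair(entities, f_a, f_b, anc_a, anc_b):
    """Return the set of aligned entities to remove (the repair R).

    entities: list of alignment entities (the set V).
    f_a, f_b: dicts mapping each alignment entity to its concept in A / B.
    anc_a, anc_b: dicts mapping a concept to its ancestors, itself included.
    """
    # Inverse mappings: concept -> alignment entities mapped to it.
    inv_a, inv_b = defaultdict(set), defaultdict(set)
    for e in entities:
        inv_a[f_a[e]].add(e)
        inv_b[f_b[e]].add(e)

    repair = set()
    for e in entities:
        if e in repair:
            continue  # e was already removed from the alignment
        # Aligned entities that subsume e in A and in B, respectively.
        in_a = {x for a in anc_a[f_a[e]] for x in inv_a[a] if x not in repair}
        in_b = {x for b in anc_b[f_b[e]] for x in inv_b[b] if x not in repair}
        # A subsumer present on only one side is a violation (an edge):
        # both endpoints go to the repair and the loop moves on.
        offending = in_a ^ in_b
        if offending:
            repair.update((e, min(offending)))  # min() only for determinism
    return repair

# Hypothetical data for the person / client / company example.
entities = ["person", "client", "company"]
f_a = f_b = {e: e for e in entities}
anc_a = {"person": {"person"}, "client": {"client", "person"},
         "company": {"company"}}
anc_b = {"person": {"person"}, "client": {"client"},
         "company": {"company", "client"}}
repair = greedy_repair(entities, f_a, f_b, anc_a, anc_b)
```

On this input the sketch removes client and person: the relation between them holds only in A, and once client is gone the remaining entity company no longer conflicts with anything.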

Another important aspect is that, given that GREEDY is interested only in those subsumption relations that hold between entities in the alignment and that it iterates over the entire alignment, it is not necessary to look into both ancestors and descendants of each entity. Instead, it suffices to look into only one of these groups, since, if the algorithm checks only ancestors (respectively, descendants) for each entity, the subsumption relations between the entity and its descendants (ancestors) will be accounted for when the iteration reaches these descendants (ancestors).

As previously discussed, this algorithm yields an approximation of the minimum repair necessary for removing the conservativity violations from the alignment. Since GREEDY is built based on the algorithm analyzed by Papadimitriou and Steiglitz (1998), the approximation may be up to twice the size of the minimum repair, which frequently leads to extremely aggressive repair strategies.

To mitigate this and reduce the impact of the repair, we include a subsequent step of repair improvement. Algorithm 2 depicts this step, corresponding to steps 4 and 5 of Fig. 3. We will refer to Algorithm 2 as IMPROVE. The IMPROVE algorithm iterates over the entities in the original repair ( $e \in R$ ), i.e., the entities removed from the alignment, and checks if reinserting them would lead to conservativity violations.

Algorithm 2

IMPROVE. Repair improvement algorithm

Since IMPROVE does not iterate over all the entities in the alignment, it is necessary to check both ancestors (lines 4 and 13) and descendants (lines 9 and 20) of each entity in order not to reintroduce any conservativity violation. IMPROVE checks every entity in the original repair to verify if reinserting it in the alignment indeed leads to conservativity violations and inserts those that do in the new repair. Thus, while GREEDY includes two entities at a time in the repair, IMPROVE includes only one. Apart from these two key differences, the algorithms are very similar.

It is essential to note that when IMPROVE takes an entity from the previous repair and reinserts it into the alignment, it also includes all subsumption relations in which that entity is involved in the taxonomies $t x_{A}$ and $t x_{B}$ . Thus, when the algorithm conducts the verification for the next entity in the repair, it takes into consideration all subsumption relations between it and any other entity previously introduced by IMPROVE. Therefore, the algorithm will verify all interactions between the mappings being reinserted and guarantee not to introduce new conservativity violations in this manner.
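The improvement step can be sketched along the same lines. This is again an assumption-laden illustration and not the original listing: instead of separate loops over ancestors and descendants, it tests both directions through membership checks on the ancestor sets, which is a simplification with the same effect under the assumption that those sets are reflexive and transitively closed. The names `improve_repair` and `subsumed` are hypothetical.

```python
def improve_repair(entities, repair, f_a, f_b, anc_a, anc_b):
    """Try to reinsert each repaired entity; return the reduced repair."""
    kept = [e for e in entities if e not in repair]
    new_repair = set()

    def subsumed(x, y, f, anc):
        """True if the concept of y is an ancestor of the concept of x."""
        return f[y] in anc[f[x]]

    for e in (x for x in entities if x in repair):
        # e is safe if, against every entity currently in the alignment,
        # both ontologies agree on whether k subsumes e and e subsumes k.
        safe = all(
            subsumed(e, k, f_a, anc_a) == subsumed(e, k, f_b, anc_b)
            and subsumed(k, e, f_a, anc_a) == subsumed(k, e, f_b, anc_b)
            for k in kept
        )
        if safe:
            kept.append(e)     # later checks also see reinserted entities
        else:
            new_repair.add(e)  # only one entity per violation is kept out
    return new_repair

# Hypothetical data: the example alignment and a repair of two entities.
entities = ["person", "client", "company"]
f_a = f_b = {e: e for e in entities}
anc_a = {"person": {"person"}, "client": {"client", "person"},
         "company": {"company"}}
anc_b = {"person": {"person"}, "client": {"client"},
         "company": {"company", "client"}}
improved = improve_repair(entities, {"client", "person"},
                          f_a, f_b, anc_a, anc_b)
```

Here person is reinserted because it conflicts with nothing that remains, while client stays in the repair because B places company under client and A does not. The repair shrinks from two entities to one.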

It is also worth pointing out that, for every edge in the graph that shares no endpoint with other edges, IMPROVE puts only one of the connected nodes in the repair, while GREEDY puts both. In the case where no edges share nodes, which is the worst-case scenario for GREEDY, where the computed vertex cover size is two times the number of edges in the graph, the improvement provided by IMPROVE reduces the repair to half its size, i.e., the minimum possible repair. In the empirical evaluation of the algorithms (Section 6), this was the case for all alignments in the Conference dataset. While for many other cases the reduction will not be so significant, it is enough to state that the resulting repair is always smaller than twice the size of the minimum.

4. Time-complexity analysis

The for loop in line 6 in Algorithm 1 (GREEDY) iterates over each ancestor of an entity in one of the input ontologies and contains a nested for loop (line 7) that runs over the entities in the alignment that are mapped to said ancestor. Since each entity in the alignment is mapped to exactly one entity in each ontology, the sets of alignment entities mapped to distinct ancestors are disjoint, and the total number of iterations considering the two nested loops cannot be greater than the number of entities in the alignment. This property is expressed in Eq. (3), where f is the function that maps the entities in the alignment to the entities in the aligned ontology and $| f^{- 1} (a) |$ denotes the number of entities in the alignment mapped to a. Since ontologies are usually not densely connected, the actual number of iterations is normally much smaller. Nevertheless, the upper bound for the time complexity of the loop is $O (n)$ , where n is the size of the alignment.

$\begin{matrix} (3) & \forall e \in V, \sum_{a \in ancestors (f (e))} | f^{- 1} (a) | \leq | V | \end{matrix}$

Two other loops with a similar structure follow (lines 10 and 17), and all three loops are inside the main loop (beginning on line 4). The main loop iterates over the entities in the alignment, that is, it runs n times. Thus, the time complexity of the algorithm is asymptotically bounded by $O (n^{2})$ .

IMPROVE has a similar structure, but with the duplication of the internal loops in order to account for the entity’s descendants as well as the ancestors. Since the new loops are included sequentially and present the same complexity limits as the old ones, they do not alter the run-time complexity.
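The bound in Eq. (3) can be checked on a toy input. The sketch below counts, for each aligned entity, how many times the two nested loops would run; the helper names (`ancestors`, `inner_iterations`), the dictionary-based taxonomy and the example data are assumptions made for illustration and do not reproduce the paper's data structures.

```python
from collections import defaultdict


def ancestors(node, parents):
    """Return every strict ancestor of `node` in a toy is-a hierarchy."""
    seen, stack = set(), list(parents.get(node, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def inner_iterations(f, parents):
    """For each aligned entity e, sum |f^-1(a)| over the ancestors a of f(e)."""
    preimage = defaultdict(set)
    for entity, target in f.items():
        preimage[target].add(entity)
    return {
        entity: sum(len(preimage.get(a, ())) for a in ancestors(target, parents))
        for entity, target in f.items()
    }


# Hypothetical ontology (child -> parents) and alignment (entity -> class).
parents = {"Dog": ["Mammal"], "Cat": ["Mammal"], "Mammal": ["Animal"]}
f = {"m1": "Dog", "m2": "Cat", "m3": "Mammal", "m4": "Animal"}

counts = inner_iterations(f, parents)
print(counts)                                    # {'m1': 2, 'm2': 2, 'm3': 1, 'm4': 0}
print(all(c <= len(f) for c in counts.values()))  # True
```

Because f maps each aligned entity to exactly one class, the preimages of distinct ancestors are disjoint, so no per-entity count can exceed n and the total over the main loop cannot exceed $n^{2}$.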

5. Correctness analysis

We have asserted that, in order for an alignment to violate the conservativity principle, there must exist, for each violation, a pair of aligned entities $e_{1}, e_{2} \in V$ such that a subsumption relation between those entities holds in one of the aligned ontologies but not in the other, that is $(e_{1}, e_{2}) \in S_{A^{'}} \oplus (e_{1}, e_{2}) \in S_{B^{'}}$ .

In the context of Algorithms 1 and 2, the subsumption relations between the aligned entities in each ontology are stored in the taxonomies $t x_{A}$ and $t x_{B}$ . Thus, for the resulting repaired alignment to violate the conservativity principle, either there must remain a pair of entities out of the repair for which a subsumption relation holds in $t x_{A}$ or in $t x_{B}$ but not in both, or for which a subsumption relation holds in ontology A or in ontology B and the subsumption relation was not stored in the respective taxonomy.

Property 1.
At the end of execution, there is no pair of entities $(e_{1}, e_{2})$ such that $e_{1} \notin R \land e_{2} \notin R$ for which $(e_{1}, e_{2}) \in S_{A^{'}} \land (e_{1}, e_{2}) \notin t x_{A}$ or $(e_{1}, e_{2}) \in S_{B^{'}} \land (e_{1}, e_{2}) \notin t x_{B}$ .
Property 2.
At the end of each iteration, there is no pair of entities $(e_{1}, e_{2})$ such that $e_{1} \notin R \land e_{2} \notin R \land (e_{1}, e_{2}) \in t x_{A} \oplus (e_{1}, e_{2}) \in t x_{B}$ .
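Property 2 can be phrased as a check on sets of pairs. The following is a minimal sketch, assuming each taxonomy is represented as a set of `(e1, e2)` pairs over aligned entities and the repair R as a set of entities; the function name `disagreements` and the sample data are hypothetical.

```python
def disagreements(tx_a, tx_b, repair):
    """Pairs outside the repair whose subsumption holds in exactly one taxonomy."""
    return {
        (e1, e2)
        for (e1, e2) in tx_a ^ tx_b  # symmetric difference mirrors the XOR in Property 2
        if e1 not in repair and e2 not in repair
    }


# Hypothetical taxonomies over aligned entities m1, m2, m3.
tx_a = {("m1", "m2"), ("m2", "m3")}
tx_b = {("m2", "m3")}

print(disagreements(tx_a, tx_b, repair=set()))   # {('m1', 'm2')}
print(disagreements(tx_a, tx_b, repair={"m1"}))  # set()
```

Property 2 holds for a given repair exactly when the returned set is empty.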

Algorithm 1 (GREEDY) iterates over every entity in the alignment and every ancestor of each entity, reaching every subsumption relation holding between aligned entities. Since the algorithm includes the corresponding subsumption relation in the taxonomy, there can be no relevant subsumption relation in the ontologies that is not in the taxonomies. The only cases in which the algorithm does not store the subsumption relation are those where the corresponding relation does not exist in the other taxonomy (that is, in lines 13 and 20), and in those cases the algorithm immediately includes the involved entities in the repair, ensuring Property 1. For these same reasons, there can also be no pair of entities involved in such a relation that was not included in the repair, guaranteeing Property 2.

Similarly, Algorithm 2 (IMPROVE) iterates over every entity in the repair computed by GREEDY and every ancestor and descendant of those entities. When validating each entity, the algorithm compares all its subsumption relations in each ontology, stores the relations in the respective taxonomies (securing Property 1) and includes the entity in the repair in all the cases when there is a disagreement between the taxonomies (lines 16, 23, 30 and 35, securing Property 2). As we have also pointed out, this strategy ensures that the algorithm takes into account the interactions between all entities from the input repair before reintroducing them.

6. Experimental evaluation

For the empirical evaluation of the algorithm, we used reference alignments from three tracks from the Ontology Alignment Evaluation Initiative 2019.¹

¹ http://oaei.ontologymatching.org

The Conference track provides 21 manually created alignments between 7 ontologies from the OntoFarm collection (Zamazal and Svátek, 2017), covering the domain of academic conferences. This dataset contains small ontologies (the biggest has 141 classes) and small alignments (up to 26 mappings). Out of the 21 alignments in the dataset, only 10 present conservativity violations. The following analysis focuses on these 10 alignments and leaves out the remaining alignments. The Anatomy track comprises the Adult Mouse Anatomy (AMA) ontology and a part of the National Cancer Institute (NCI) Thesaurus describing human anatomy (Zhang et al., 2004), along with a reference alignment containing 1516 mappings. Finally, the LargeBio track involves three ontologies with tens of thousands of classes each: the Foundational Model of Anatomy (FMA), the Systematized Nomenclature of Medicine (SNOMED) – Clinical Terms and the NCI Thesaurus. The track also contains three reference alignments based on the Unified Medical Language System (UMLS) Metathesaurus (Bodenreider, 2004) and built by a combination of automatic techniques, expert assessment, and auditing protocols. The reference alignments from OAEI 2019 contain only equivalence mappings, and therefore the limitation of the proposed algorithm in dealing with other types of mappings had no bearing on the evaluation performed.

The algorithm was implemented and executed with the reference alignments as input.² Table 2 presents the results. The column “input size” displays the size of the original alignment, and the columns “GREEDY time (s)” and “GREEDY repair size” present the execution time in seconds of the GREEDY algorithm and the size of the resulting repair, i.e., the number of modifications over the alignment. IMPROVE then receives this repair as input, and the columns “IMPROVE time (s)” and “IMPROVE repair size” display its execution time and the final repair size. Finally, the column “Repair size (%)” gives the percentage of mappings from the input alignment that were altered (in the present case, removed) by the final repair after the improvement step. This last column indicates the actual impact of the computed repair over the alignment. The most aggressive repair computed removed 42.4% of the mappings from the SNOMED-NCI alignment from the LargeBio dataset.

² The algorithm was implemented in Python 3 with the libraries RDFLib and Owlready2, the latter of which includes a modified version of the HermiT reasoner. All evaluations were performed on an Ubuntu 18.10 desktop computer with an Intel Core i7-7700 CPU @ 3.60 GHz × 8 and 16 GB of RAM. The source code and corrected alignments are available at http://github.com/BDI-UFRGS/conservativity/.

Table 2

Results of the experimental evaluation over alignments from OAEI 2019

Dataset	Ontologies	Input size	GREEDY repair size	IMPROVE repair size	GREEDY time (s)	IMPROVE time (s)	Repair size (%)
Conference	cmt-conference	15	2	1	0.00034	0.00008	6.7
	cmt-confof	16	4	2	0.00035	0.00023	12.5
	cmt-ekaw	11	2	1	0.00024	0.00020	9.1
	cmt-sigkdd	12	2	1	0.00030	0.00010	8.3
	conference-edas	17	2	1	0.00038	0.00013	5.9
	conference-ekaw	25	2	1	0.00057	0.00032	4.0
	conference-iasted	14	2	1	0.00034	0.00013	7.1
	confof-ekaw	20	4	2	0.00048	0.00016	10.0
	edas-ekaw	23	4	2	0.00049	0.00021	8.7
	edas-iasted	19	4	2	0.00045	0.00028	10.5
Anatomy	AMA-NCI	1516	338	181	0.04307	0.03713	11.9
LargeBio	FMA-NCI	3024	1376	800	0.11024	0.13855	26.5
	FMA-SNOMED	9008	5218	3244	0.73732	1.85429	36.0
	SNOMED-NCI	18844	12028	7995	1.43010	8.73326	42.4

It is noteworthy that the improvement step (i.e., the IMPROVE algorithm) reduced the repair resulting from the GREEDY algorithm to half its size for all alignments from the Conference dataset. As previously discussed, this means that the algorithm has reached the minimum possible repair that removes the conservativity violations from the alignment. The reduction rate provided by IMPROVE ranged from 33.5% (again for the SNOMED-NCI alignment) to the maximum possible reduction of 50% in the Conference dataset.
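The two derived quantities discussed here (the percentage repair size and the reduction rate achieved by IMPROVE) follow directly from the columns of Table 2. The short sketch below recomputes them for three rows; the function names are introduced only for this illustration.

```python
def repair_percentage(input_size, final_repair_size):
    """Share of the input mappings altered by the final repair, in percent."""
    return 100 * final_repair_size / input_size


def reduction_rate(greedy_size, improve_size):
    """How much IMPROVE shrinks the repair computed by GREEDY, in percent."""
    return 100 * (greedy_size - improve_size) / greedy_size


# (input size, GREEDY repair size, IMPROVE repair size), copied from Table 2.
rows = {
    "cmt-confof": (16, 4, 2),
    "AMA-NCI": (1516, 338, 181),
    "SNOMED-NCI": (18844, 12028, 7995),
}

for name, (n, greedy, improve) in rows.items():
    print(name,
          round(repair_percentage(n, improve), 1),
          round(reduction_rate(greedy, improve), 1))
# cmt-confof 12.5 50.0
# AMA-NCI 11.9 46.4
# SNOMED-NCI 42.4 33.5
```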

The execution time of both steps of the algorithm was small: even for the largest alignment, with over 18 thousand mappings, the combined total was 10.16 seconds. The most significant bottleneck in the entire correction process was actually the execution of the OWL 2 reasoner, which is outside the scope of the current paper but is a prerequisite for the algorithm to remove all possible conservativity violations. The use of module extraction techniques to select only the significant portion of the input ontologies prior to running the reasoner may go a long way towards reducing said bottleneck, but the modularization approach must be chosen carefully, since modules of poor quality will reflect directly on conservativity violations persisting in the resulting alignment.

We compare those results to the performance of the LogMap³ conservativity extension designed by Solimando et al. (2014a,b, 2017), referred to in the remainder of this paper as LogMapC. We executed the tests with the same inputs and in the same context⁴, and Table 3 outlines the results. Besides removing mappings from the alignment, LogMapC also replaces some equivalence mappings with subsumption mappings. The total repair size is simply the sum of the number of removed mappings and the number of equivalence mappings replaced with subsumption mappings.

³ LogMapC was compiled from the source code available at http://github.com/ernestojimenezruiz/logmap-conservativity/.

⁴ While the authors of LogMapC present a softer version of the conservativity principle, both the referenced paper and the algorithm implementation address the usual definition of conservativity, as taken here. All tests conducted take the stronger notion of conservativity into account.

Table 3

Results of the evaluation of LogMapC over alignments from OAEI 2019

Dataset	Ontologies	Input size	LogMapC repair size	Removed mappings	Replaced mappings	LogMapC time (s)	Repair size (%)
Conference	cmt-conference	15	4	4	0	0.056	26.7
	cmt-confof	16	6	6	0	0.039	37.5
	cmt-ekaw	11	1	1	0	0.036	9.1
	cmt-sigkdd	12	1	0	1	0.025	8.3
	conference-edas	17	3	1	2	0.062	17.6
	conference-ekaw	25	2	1	1	0.029	8.0
	conference-iasted	14	1	0	1	0.042	7.1
	confof-ekaw	20	2	2	0	0.032	10.0
	edas-ekaw	23	4	0	4	0.073	17.4
	edas-iasted	19	4	4	0	0.069	21.1
Anatomy	AMA-NCI	1516	302	132	170	1.372	19.9
LargeBio	FMA-NCI	3024	1639	934	705	5.998	54.2
	FMA-SNOMED	9008	7802	3864	3938	192.840	86.6
	SNOMED-NCI	18844	12706	6684	6022	2167.728	67.4

In the case of LogMapC, the percentage of mappings altered (i.e., either removed or replaced with subsumption mappings) ranged from 7.1% to 86.6%. The latter is observed for the FMA-SNOMED alignment in the LargeBio dataset, with 7802 removed or replaced mappings out of the original 9008. The execution time, which again does not include the time required for computing modules from the source ontologies and running the reasoner, reached over 36 minutes for the SNOMED-NCI alignment.

We summarize the results from both algorithms in Tables 4 and 5. Table 4 presents a size comparison of the repairs computed by both algorithms. The columns under “Removals × 1” present the respective sizes when counting either the removal of an equivalence mapping or its replacement with a subsumption mapping as a single modification. The columns under “Removals × 2” present the respective sizes when counting the removal of an equivalence mapping as the removal of two subsumption mappings, that is, as two modifications. Table 5 presents a comparison of the execution time of both algorithms in seconds.

Table 4

Repair size comparison between GREEDY+IMPROVE and LogMapC (the smaller the better)

Dataset	Ontologies	Input size	GREEDY+IMPROVE (removals × 1)	LogMapC (removals × 1)	GREEDY+IMPROVE (removals × 2)	LogMapC (removals × 2)
Conference	cmt-conference	15	1	4	2	8
	cmt-confof	16	2	6	4	12
	cmt-ekaw	11	1	1	2	2
	cmt-sigkdd	12	1	1	2	1
	conference-edas	17	1	3	2	4
	conference-ekaw	25	1	2	2	3
	conference-iasted	14	1	1	2	1
	confof-ekaw	20	2	2	4	4
	edas-ekaw	23	2	4	4	4
	edas-iasted	19	2	4	4	8
Anatomy	AMA-NCI	1516	181	302	362	434
LargeBio	FMA-NCI	3024	800	1639	1600	2678
	FMA-SNOMED	9008	3244	7802	6488	12251
	SNOMED-NCI	18844	7995	12706	15990	19645
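The two counting conventions used in Table 4 can be written as a single weighted sum. The sketch below is illustrative only (the function names are not from the paper) and recomputes the AMA-NCI row from the removal and replacement counts reported in Tables 2 and 3.

```python
def repair_size(removed, replaced, removal_weight=1):
    """Count each replacement once and each removal `removal_weight` times."""
    return removal_weight * removed + replaced


def relative_saving(ours, theirs):
    """How much smaller `ours` is than `theirs`, in percent."""
    return 100 * (theirs - ours) / theirs


# AMA-NCI: GREEDY+IMPROVE removes 181 mappings and replaces none (Table 2);
# LogMapC removes 132 and replaces 170 (Table 3).
ours_x1 = repair_size(181, 0)                       # 181
logmapc_x1 = repair_size(132, 170)                  # 302
ours_x2 = repair_size(181, 0, removal_weight=2)     # 362
logmapc_x2 = repair_size(132, 170, removal_weight=2)  # 434

print(ours_x1, logmapc_x1, ours_x2, logmapc_x2)
print(round(relative_saving(ours_x2, logmapc_x2), 1))  # 16.6
```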

Our algorithm compares very favorably to LogMapC. Even when counting the removal of an equivalence mapping as two modifications (which favors LogMapC, since it is the only algorithm that performs other types of modifications), there were only two cases where LogMapC’s repair was smaller: the alignments cmt-sigkdd and conference-iasted in the Conference dataset. In both cases, the proposed algorithm’s repair contains a single removal, while the repair computed by LogMapC replaces a single equivalence mapping with a subsumption mapping. In two other cases, the alignments cmt-ekaw and confof-ekaw, both algorithms recommended the removal of the same number of mappings. Finally, for the alignment edas-ekaw, LogMapC recommends replacing four equivalence mappings with subsumption mappings, while our algorithm recommends removing two equivalence mappings (see Tables 2 and 3); since removing an equivalence mapping counts as two modifications, both repairs are deemed equally aggressive. These three cases are also in the Conference dataset. For every other alignment, the repair computed by the proposed algorithm is considerably less aggressive, ranging from 16.6% to 75% smaller when counting the removal of an equivalence mapping as two modifications.

Table 5

Execution time comparison between GREEDY+IMPROVE and LogMapC

Dataset	Ontologies	Input size	GREEDY+IMPROVE	LogMapC
Conference	cmt-conference	15	0.00042	0.056
	cmt-confof	16	0.00058	0.039
	cmt-ekaw	11	0.00044	0.036
	cmt-sigkdd	12	0.00040	0.025
	conference-edas	17	0.00051	0.062
	conference-ekaw	25	0.00089	0.029
	conference-iasted	14	0.00047	0.042
	confof-ekaw	20	0.00064	0.032
	edas-ekaw	23	0.00070	0.073
	edas-iasted	19	0.00073	0.069
Anatomy	AMA-NCI	1516	0.08020	1.372
LargeBio	FMA-NCI	3024	0.25879	5.998
	FMA-SNOMED	9008	2.59161	192.840
	SNOMED-NCI	18844	10.16336	2167.728

When it comes to the execution time, GREEDY+IMPROVE was over 17 times faster than LogMapC for every input case. In the case of the SNOMED-NCI alignment, which was the most time-intensive for both algorithms, our algorithm was 213.3 times faster than LogMapC.
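The speedup figures quoted above are the ratios of the two time columns in Table 5. A brief sketch that recomputes them from the tabulated values (the dictionary name `times` is introduced only for this illustration):

```python
# (GREEDY+IMPROVE time, LogMapC time) in seconds, copied from Table 5.
times = {
    "cmt-conference": (0.00042, 0.056),
    "cmt-confof": (0.00058, 0.039),
    "cmt-ekaw": (0.00044, 0.036),
    "cmt-sigkdd": (0.00040, 0.025),
    "conference-edas": (0.00051, 0.062),
    "conference-ekaw": (0.00089, 0.029),
    "conference-iasted": (0.00047, 0.042),
    "confof-ekaw": (0.00064, 0.032),
    "edas-ekaw": (0.00070, 0.073),
    "edas-iasted": (0.00073, 0.069),
    "AMA-NCI": (0.08020, 1.372),
    "FMA-NCI": (0.25879, 5.998),
    "FMA-SNOMED": (2.59161, 192.840),
    "SNOMED-NCI": (10.16336, 2167.728),
}

speedup = {name: theirs / ours for name, (ours, theirs) in times.items()}
slowest_gain = min(speedup, key=speedup.get)

print(slowest_gain, round(speedup[slowest_gain], 1))  # AMA-NCI 17.1
print(round(speedup["SNOMED-NCI"], 1))                # 213.3
```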

Apart from the differences in execution time and repair size between the algorithms, it is worthwhile to compare the computed repairs qualitatively. We briefly describe the similarities and distinctions between the computed repairs for the alignments in the Conference dataset; for the remaining alignments (from the Anatomy and the LargeBio datasets), we lay out comparison tables.

For the alignments in the Conference dataset, the two repairs were identical in one case (confof-ekaw), and in another our algorithm removed a subset of the mappings removed by LogMapC (cmt-conference). In two cases, there was an overlap between the sets of mappings to be removed (cmt-confof and edas-iasted), and, in one case, the repairs were disjoint (cmt-ekaw). In the remaining five alignments, our algorithm removed either exactly the same mappings (cmt-sigkdd and conference-iasted) or a subset of the mappings (conference-edas, conference-ekaw and edas-ekaw) which LogMapC replaced with subsumption mappings.

The cases where our algorithm removed mappings that were replaced by LogMapC (cmt-sigkdd, conference-iasted, conference-edas, conference-ekaw and edas-ekaw) reflect that there is a limit after which the algorithms cannot reduce the repair further by reinserting equivalence mappings in the alignment without reintroducing conservativity violations. Instead, LogMapC further reduces the repair’s impact by replacing those equivalence mappings with subsumption mappings. The inclusion of an additional step in our algorithm to check if the removed mappings may be reintroduced as subsumption mappings could result in even less aggressive repairs.

For the three alignments from the Conference dataset in which our algorithm removed mappings that LogMapC left unchanged (cmt-ekaw, cmt-confof and edas-iasted), it is essential to notice that, for an isolated violation, removing either end of the offending relation from the alignment suffices to repair the violation. This was precisely the case for the cmt-ekaw alignment, which introduces a new subsumption relation between the concepts paper author and conference participant from the ekaw ontology through the mappings between ekaw’s paper author and cmt’s author and between ekaw’s conference participant and cmt’s conference member. Our algorithm removed the latter mapping, while LogMapC removed the former. The same holds for cmt-confof, which introduces a subsumption relation between author and member.

Finally, a similar situation occurs for edas-iasted, but with the introduction of three new subsumption relations in edas centered on the concept attendee. While our algorithm removed only the mapping between edas’ attendee and iasted’s delegate, LogMapC removed the mappings for the other three concepts (i.e., author, reviewer and session chair). Our algorithm obtains the significantly smaller repair as follows. GREEDY removes both endpoints of one of the three violating relations (say, author and attendee), which simultaneously eliminates the other two relations, since one of their endpoints is no longer in the alignment. IMPROVE then attempts to reinsert the removed entities: reinserting attendee would reintroduce violating subsumption relations with reviewer and session chair, while reinserting author would not.

Tables 6 through 9 lay out the comparison between repairs computed for the alignments AMA-NCI, from the Anatomy dataset, FMA-NCI, FMA-SNOMED and SNOMED-NCI, from the LargeBio dataset. The rows present the actions our algorithm performed on the mappings, while the columns present the actions taken by LogMapC. Each cell contains the number of times that the algorithms chose those actions for a mapping in the alignment. The tables refer to the equivalence mappings exchanged for subsumption mappings by LogMapC as “replaced” mappings.

The first thing to notice in these tables is that both algorithms agree to a large extent on the non-altered mappings. Our algorithm maintained between 82.59% (FMA-SNOMED) and 96.21% (AMA-NCI) of the mappings that LogMapC kept unchanged. Another point of some agreement is the removed mappings. Close to half of the mappings removed by our algorithm were also removed by LogMapC for the LargeBio alignments (59.25% for FMA-SNOMED, 49% for FMA-NCI and 45.73% for SNOMED-NCI). The percentage is lower for the AMA-NCI alignment, where it is 19.89%.

Table 6

Comparison of repairs for the AMA-NCI alignment

AMA-NCI	Maintained by LogMapC	Replaced by LogMapC	Removed by LogMapC
Maintained	1168	71	96
Removed	46	99	36

Table 7

Comparison of repairs for the FMA-NCI alignment

FMA-NCI	Maintained by LogMapC	Replaced by LogMapC	Removed by LogMapC
Maintained	1242	440	542
Removed	143	265	392

Table 8

Comparison of repairs for the FMA-SNOMED alignment

FMA-SNOMED	Maintained by LogMapC	Replaced by LogMapC	Removed by LogMapC
Maintained	996	2826	1942
Removed	210	1112	1922

Table 9

Comparison of repairs for the SNOMED-NCI alignment

SNOMED-NCI	Maintained by LogMapC	Replaced by LogMapC	Removed by LogMapC
Maintained	5177	2644	3028
Removed	961	3378	3656
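The agreement percentages quoted in the text can be recomputed from the cells of Tables 6 through 9. The sketch below does so for two of them; the function names and the tuple layout (maintained, replaced, removed by LogMapC) are assumptions made for this illustration.

```python
def kept_agreement(ours_maintained, ours_removed):
    """Share of mappings kept by LogMapC that our algorithm also kept, in percent."""
    kept_by_logmapc = ours_maintained[0] + ours_removed[0]
    return 100 * ours_maintained[0] / kept_by_logmapc


def removed_agreement(ours_removed):
    """Share of mappings removed by our algorithm that LogMapC also removed, in percent."""
    return 100 * ours_removed[2] / sum(ours_removed)


# Each row is (maintained, replaced, removed) by LogMapC, copied from the tables.
ama_nci = {"maintained": (1168, 71, 96), "removed": (46, 99, 36)}
fma_snomed = {"maintained": (996, 2826, 1942), "removed": (210, 1112, 1922)}

for name, table in (("AMA-NCI", ama_nci), ("FMA-SNOMED", fma_snomed)):
    print(name,
          round(kept_agreement(table["maintained"], table["removed"]), 2),
          round(removed_agreement(table["removed"]), 2))
# AMA-NCI 96.21 19.89
# FMA-SNOMED 82.59 59.25
```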

A final relevant aspect is that LogMapC does not guarantee that the resulting alignment is completely free of conservativity violations. In fact, for the alignments AMA-NCI from the Anatomy track and FMA-SNOMED and SNOMED-NCI from the LargeBio track, the alignments still contained 4, 56 and 528 violations, respectively, after LogMapC’s repair, according to the algorithm’s output.

We submitted the alignments resulting from our repair process to LogMapC for validation. LogMapC found no remaining violations in any alignment but one. For the SNOMED-NCI alignment, however, LogMapC detected 1227 remaining violations, although it proposed no repair for those violations. In order to diagnose such violations, we computed the merge of SNOMED and NCI through the alignment and submitted the merged ontology to the HermiT reasoner for classification. Each violation of the conservativity principle should appear in the resulting hierarchy as a subsumption or equivalence relation. Nevertheless, we found none of the supposed violations in the set of relations that actually occur in the resulting ontology. For example, LogMapC indicated violating subsumption relations between Carcinoma in situ of lip and Malignant tumor of lip in SNOMED and between Recurrent Neuroblastoma and Relapse in NCI, but neither relation exists in the final merged ontology. The same is true for the remaining reported violations. Therefore, we argue that the violations detected by LogMapC are in fact false positives. These false positives may originate from the assumption of disjointness: it is possible that LogMapC reports violations when merging the ontologies introduces a common subsumee between concepts which previously had none.
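The diagnostic procedure described above (merge, classify, then look up each reported violation in the inferred hierarchy) can be illustrated on a toy input. In the sketch below a plain transitive closure over is-a edges stands in for the HermiT reasoner, which is a strong simplification; the function names and all concept names are hypothetical.

```python
from collections import defaultdict


def entailed_subsumptions(edges):
    """Transitive closure of directed (sub, super) edges; a toy stand-in for a reasoner."""
    supers = defaultdict(set)
    for sub, sup in edges:
        supers[sub].add(sup)
    closure = set()
    for start in list(supers):
        seen, stack = set(), list(supers[start])
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(supers.get(node, ()))
        closure.update((start, sup) for sup in seen if sup != start)
    return closure


def confirmed_violations(reported, merged_edges):
    """Keep only the reported pairs that the merged hierarchy actually entails."""
    return set(reported) & entailed_subsumptions(merged_edges)


# Hypothetical merge: ontology B axiom plus equivalence mappings as two-way edges.
merged = [
    ("B:Writer", "B:Participant"),
    ("A:Author", "B:Writer"), ("B:Writer", "A:Author"),
    ("A:Member", "B:Participant"), ("B:Participant", "A:Member"),
]
reported = [("A:Author", "A:Member"), ("A:Member", "A:Author")]

print(confirmed_violations(reported, merged))  # {('A:Author', 'A:Member')}
```

Under this reading, a reported pair that is absent from the entailed set, such as the second pair in the example, would be classified as a false positive.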

7. Conclusion

In the context of the Semantic Web, targeted at computer programs rather than human users, the validation and correction of ontology alignments play a prominent role in the actual realization of semantic interoperability. Further, efficiency in evaluating alignments obtained from diverse sources on the web is critical in enabling the automatic processing of decentralized information.

While ontology alignments are essential for the interoperability of ontology-based systems, alignments frequently lead to conservativity violations. This is the case even for the reference alignments from the Ontology Alignment Evaluation Initiative. This work proposed a quadratic-time algorithm for correcting such violations and evaluated it over 14 alignments from three different tracks of the OAEI 2019 edition. The algorithm’s performance compared favorably to a state-of-the-art algorithm for correcting conservativity violations. For all alignments, the proposed algorithm was significantly faster, ranging from 17 times to 213 times faster. For most of the alignments, the computed repair was also considerably less aggressive, preserving more information from the input alignment. Finally, the resulting repaired alignments were verified to contain no remaining conservativity violations.

However, further work may improve some aspects of the proposed algorithm. The algorithm is currently limited to alignments containing only equivalence mappings. A development allowing the algorithm to run over more complex alignments would vastly increase its utility. Further, the inclusion of an additional step (or the modification of the repair improvement step) to check if the removed equivalence mappings may be reintroduced in the alignment as subsumption mappings without introducing new violations may lead to even less aggressive repairs. Also, the inclusion of a modularization step prior to submitting the source ontologies to a reasoner might reduce the most significant bottleneck for the actual execution of the algorithm. Finally, the algorithm did not consider that a mapping between relations (properties, in OWL terminology) necessarily implies a mapping between the domain and range of such relations. A possible easy fix would be for the algorithm to include the derived domain and range mappings explicitly in the alignment, treating the removal of any such mapping as a removal of the original mapping between the relations. We note, however, that this limitation did not affect the comparison with LogMapC presented in Section 6, as evidenced by the fact that LogMapC found no remaining violations in the alignments repaired by our algorithm. This may be due to the absence of violations of this kind in the reference alignments or to redundancies in said alignments which enable the detection of such violations from other mappings.

Acknowledgements

This work was funded by Brazilian agencies CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior) and CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico).

References

Antunes, C.R., Rademaker, A. & Abel, M. (2019). A category-theoretic approach for the detection of conservativity violations in ontology alignments. In J.P.A. Almeida, Bax, Berardi and Baião (Eds.), XII Seminar on Ontology Research in Brazil (pp. 11–20). Porto Alegre.

Bodenreider, O. (2004). The Unified Medical Language System (UMLS): Integrating biomedical terminology. Nucleic Acids Research, 32, 267–270. doi:10.1093/nar/gkh061.

Ivanova, V. & Lambrix, P. (2013). A unified approach for aligning taxonomies and debugging taxonomies and their alignments. In Cimiano, Corcho, Presutti, Hollink and Rudolph (Eds.), The Semantic Web: Semantics and Big Data (pp. 1–15). Berlin, Heidelberg: Springer.

Jean-Mary, Y.R., Shinoroshita, E.P. & Kabuka, M.R. (2009). Ontology matching with semantic verification. Journal of Web Semantics, 7, 235–251. doi:10.1016/j.websem.2009.04.001.

Jiménez-Ruiz, E. & Grau, B.C. (2011). Logmap: Logic-based and scalable ontology matching. In Aroyo, Welty, Alani, Taylor, Bernstein, Kagal, Noy and Blomqvist (Eds.), The Semantic Web – ISWC 2011 (pp. 273–288). Berlin, Heidelberg: Springer. doi:10.1007/978-3-642-25073-6_18.

Jiménez-Ruiz, E., Grau, B.C., Horrocks, I. & Berlanga, R. (2011). Logic-based assessment of the compatibility of UMLS ontology sources. Journal of Biomedical Semantics, 2, S2. doi:10.1186/2041-1480-2-S1-S2.

Karakostas, G. (2005). A better approximation ratio for the vertex cover problem. In Caires, G.F. Italiano, Monteiro, Palamidessi and Yung (Eds.), Automata, Languages and Programming (pp. 1043–1050). Berlin, Heidelberg: Springer. doi:10.1007/11523468_84.

Karp, R.M. (1972). Reducibility among combinatorial problems. In R.E. Miller, J.W. Thatcher and J.D. Bohlinger (Eds.), Complexity of Computer Computations (pp. 85–103). Boston: Springer. doi:10.1007/978-1-4684-2001-2_9.

Lambrix, P. & Liu, Q. (2013). Debugging the missing is-a structure within taxonomies networked by partial reference alignments. Data & Knowledge Engineering, 86, 179–205. doi:10.1016/j.datak.2013.03.003.

Lamy, J.B. (2017). Owlready: Ontology oriented programming in Python with automatic classification and high level constructs for biomedical ontologies. Artificial Intelligence in Medicine, 80, 11–28. doi:10.1016/j.artmed.2017.07.002.

Meilicke, C. (2006). Reasoning about ontology mappings in distributed description logics. PhD thesis, University of Mannheim.

Papadimitriou, C.H. & Steiglitz, K. (1998). Combinatorial Optimization: Algorithms and Complexity. New York: Dover Publications, Inc.

Solimando, A., Jiménez-Ruiz, E. & Guerrini, G. (2014a). Detecting and correcting conservativity principle violations in ontology-to-ontology mappings. In Mika, Tudorache, Bernstein, Welty, Knoblock, Vrandecic, Groth, Noy, Janowicz and Goble (Eds.), The Semantic Web – ISWC 2014 (pp. 1–16). Cham: Springer.

Solimando, A., Jiménez-Ruiz, E. & Guerrini, G. (2014b). A multi-strategy approach for detecting and correcting conservativity principle violations in ontology alignments. In C.M. Keet and Tamma (Eds.), Proceedings of the 11th International Workshop on OWL: Experiences and Directions (pp. 13–14). Cham: Springer.

Solimando, A., Jiménez-Ruiz, E. & Guerrini, G. (2017). Minimizing conservativity violations in ontology alignments: Algorithms and evaluation. Knowledge and Information Systems, 51, 775–819. doi:10.1007/s10115-016-0983-3.

Studer, R., Benjamins, V.R. & Fensel, D. (1998). Knowledge engineering: Principles and methods. Data & Knowledge Engineering, 25, 161–198. doi:10.1016/S0169-023X(97)00056-6.

Zamazal, O. & Svátek, V. (2017). The ten-year OntoFarm and its fertilization within the onto-sphere. Web Semantics: Science, Services and Agents on the World Wide Web, 43, 46–53. doi:10.1016/j.websem.2017.01.001.

Zhang, S., Mork, P. & Bodenreider, O. (2004). Lessons learned from aligning two representations of anatomy. In Han (Ed.), Proceedings of the First International Workshop on Formal Biomedical Knowledge Representation (pp. 102–108).