Towards fuzzy lexical reasoning

Abstract

Proximity-based Logic Programming is a formal framework for representing general or non-specialized knowledge. Although it is a powerful tool, it is complex to use because the values of the proximity equations (fuzzy binary relations that establish the relationships among the symbols of a first-order language) must be defined manually by the designer of the system. In this paper, we propose a new framework for Proximity-based Logic Programming enhanced with WordNet and Interval-Valued Fuzzy Sets. Its main contribution is to compile automatically the information provided by WordNet and to generate an interval-valued proximity relation on the set of its words. This proposal is completely integrated into the unification mechanism of the Bousi~Prolog system, which allows us to introduce the lexical knowledge induced from a linguistic resource, such as WordNet, into an approximate reasoning system. To the best of our knowledge, this is the first time that WordNet has been introduced into the core of a Prolog system by means of compilation techniques and that lexical knowledge has been combined with a proximity-based unification framework.

Keywords

Fuzzy lexical reasoning, knowledge representation, proximity-based logic programming, Bousi∼Prolog, WordNet

1 Introduction

One of the most relevant sources of human awareness is general or non-specialized knowledge [3]. Consequently, the modelling of this type of information is one of the main challenges in the field of Artificial Intelligence. However, this is a complex effort given that vagueness and uncertainty are essential in this type of knowledge, and both phenomena must be adequately addressed in order to develop an efficient computational representation.

In the literature, we distinguish two main approaches to encoding it. The first assumes that common sense is analogous to any other type of knowledge, such as mathematical knowledge, and can consequently be described by a formal framework. Proposals following this approach include Situation Calculus [4], Circumscription [5] and Multi-agent Cognitive Architecture [7]. The second approach, by contrast, assumes that this type of human knowledge is too complex to be represented using mathematical logic and that formal approaches are therefore not valid. Instead, it is induced from a large collection of facts (knowledge bases) and the relationships among them. Examples of this approach are the Cyc Project [9], WordNet [10] and ConceptNet [11].

In this paper, we adopt the second approach because it allows us to incorporate lexical knowledge, a source of non-specialized knowledge that uses linguistic information about words and the relationships between them to perform valid inferences. However, developing a computational approach without a formal framework is a difficult task. For that reason, we propose to use the framework of Proximity-based Logic Programming (PLP) [13], which enhances logic programming with two important capabilities: (a) the capability to specify the meaning of words drawn from natural language using so-called Proximity Equations (PE); and (b) the ability to reason and compute with words and clauses by means of a unification algorithm based on proximity, performing a sort of approximate reasoning.

A challenge in PLP is the management of linguistic knowledge, because this involves dealing with vagueness and word synonymy. Linguistic vagueness has been addressed in the literature by means of PEs, but these are mostly defined for a specific domain [17–20], and it is the designer of the system who fixes the values of these equations manually, which makes PLP systems harder to use in real applications. Synonymy, in turn, has been analyzed through the development of semantic similarity measures. In the literature, we can find different families of measures that yield different results, but none of them is predominant [22], which leads us to consider the assessment of synonymy between words as a field with high uncertainty. From our point of view, PEs can also be defined for non-specific domains if the definitions otherwise provided by the designer are obtained automatically from a general ontology. Using this approach, it is not necessary to establish manually the set of PEs associated with the PLP system; they can be generated automatically, even for vague expressions.

In addition, enhancing knowledge representation with lexical knowledge and linguistic resources allows us to: i) add new specific information for more or less specialized domains; ii) infer new knowledge from the same set of facts and rules; and iii) use the power of lexical knowledge in applications such as text cataloging, natural queries on databases, computing with words and perceptions, fuzzy expert systems, the semantic web or approximate reasoning 1 .

The seminal ideas of this paper are described in [6, 21], which initiated a line of research towards a ‘natural’ fuzzy linguistic Prolog system. The requirements defined for such a system are that it should be able to work together with linguistic sources (e.g., electronic linguistic dictionaries) and that the resolution rule and the unification mechanism should also handle linguistic relations (e.g., synonymy or antonymy). This paper is also a contribution in this direction, given that we propose to use WordNet in the formalization of a PLP system; in other words, we enhance the inference mechanism of a Prolog system by using lexical resources.

In the literature, there is an implementation available for using WordNet in Prolog [29]. It consists of files containing the WordNet database in a Prolog-readable format, but it does not implement a Prolog interface to WordNet. Since the Prolog database is very large and may take many minutes to load into the Prolog workspace, a separate file has been created for each WordNet relation. This requires the user to load only those parts of the database that are of interest in each case and, hence, demands a degree of advanced knowledge. In contrast, our framework allows a programmer to create an interface with WordNet in a clear and transparent way, avoiding any built-in command; therefore, the programmer does not need to be an expert on WordNet or on the management of vagueness. Moreover, as the knowledge provided by WordNet is generated at compilation time, our approach is a useful alternative for incorporating WordNet into a Prolog system based on some type of weak unification.

Our main aim in this paper is to deal with linguistic knowledge from a holistic view, which allows us to address the weaknesses explained above. Thus, PEs are automatically obtained from a general ontology such as WordNet [10], which allows us to automate the task of knowledge representation and to avoid manual definition by the designer, reducing the time needed to build the system. Since synonymy is one of the most typical cases modelled by PEs, we propose to use semantic similarity measures [22] for assessing it. As we said, this is a field with a high degree of uncertainty and, for that reason, we shall apply Interval-Valued Fuzzy Relations (IVFRs) [27] for modelling the results of the PEs. IVFRs are an extension of standard fuzzy relations which allows us to improve the expressive power of the system in two respects: i) we can define a lower and an upper bound of the similarity degree according to different metrics; and ii) we can widen or narrow the interval of use of linguistic concepts according to the context of use. This paper contains the general procedure for generating the PEs of the knowledge base of a Prolog system automatically and independently of a designer, with the PEs expressed using IVFRs. It enhances and formalizes the approach presented in [12] and it is also a contribution to the development of a general PLP framework connected with lexical resources.

The remainder of this paper is organized as follows: in Section 2 we describe the main concepts used in our framework; in Sections 3 and 4, our proposal for lexical reasoning, combining PLP with semantic similarity measures, is introduced; Section 5 describes the details of the implementation of our proposal; Section 6 describes some examples of possible applications of our framework; and, finally, Section 7 summarizes the main conclusions of this paper as well as some proposals for future work.

2 Preliminary concepts

In this section, we introduce the main concepts used in the definition of our framework.

2.1 Proximity equations

PEs are at the core of the PLP framework, since they are used for defining linguistic similarity relationships inside the knowledge base. From a syntactic point of view, a PE “a ∼ b = α” defines an entry ℛ (a, b) = α of a relation ℛ establishing that a is close to b with a degree α ∈ [0, 1]. Thus, PEs play an essential role in the unification step, i.e., in the Proximity-based Unification Algorithm. Roughly speaking, this algorithm states that two terms f (t₁, …, t_n) and g (s₁, …, s_n) weakly unify if the root symbols f and g are close, with a certain degree, and each pair of arguments t_i and s_i matches in a soft sense. In Example 1, we illustrate the behaviour of Bousi~Prolog (BPL), a practical implementation of this approach, compared with a standard Prolog system.

Example 1. Let us assume a fragment of a deductive database that stores information about people and their preferences.

%% mary loves mountaineering

loves(mary,mountaineering).

%% john likes football

likes(john,football).

%% peter plays basketball

plays(peter,basketball).

%% if a person practises sports

%% then he/she is a healthy person

healthy(X):- practises(X,sport).

In a standard Prolog program, if we ask about healthy people with “?-healthy(X).”, the system fails because nobody in the database practises a sport. However, “mary”, “john” and “peter” are reasonable candidates to be healthy persons, since the verbs to love, to like and to play are semantically related to the verb to practise. Using BPL, we can add this information by means of PEs. Thus, the semantic relations among ‘like’, ‘practise’, ‘love’ and ‘play’, on one side, and among ‘sport’, ‘basketball’, ‘football’ and ‘mountaineering’, on the other side, are defined as follows 2 :

practises~loves=0.9 sport~basketball=1.0

practises~likes=0.7 sport~football=1.0

practises~plays=1.0 sport~mountaineering=0.8

Now, the BPL program allows us to get the answers: “X=mary with 0.8”, “X=john with 0.7”, and “X=peter with 1.0”. The system operates as follows: since we have specified that “sport” is close to “mountaineering” with a degree of 0.8, and that “practises” is close to “loves” with a degree of 0.9, the terms “practises(X,sport)” and “loves(mary, mountaineering)” may ‘weakly’ unify with a degree of 0.8 ∧ 0.9 = 0.8, generating the binding X=mary; i.e., the assertion healthy(mary) is stated with a truth degree of 0.8. The remaining answers are obtained analogously.
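The three degrees can be reproduced with a small Python sketch. It is only an illustration and not BPL itself: it assumes the minimum as the t-norm (as in 0.8 ∧ 0.9 = 0.8), hard-codes the query body practises(X, sport), and the names PES, FACTS, proximity and healthy are hypothetical.

```python
# Proximity equations of Example 1 (symmetric, so each pair is stored once).
PES = {
    ("practises", "loves"): 0.9,
    ("practises", "likes"): 0.7,
    ("practises", "plays"): 1.0,
    ("sport", "basketball"): 1.0,
    ("sport", "football"): 1.0,
    ("sport", "mountaineering"): 0.8,
}

# Facts of Example 1 as (predicate, person, activity) triples.
FACTS = [
    ("loves", "mary", "mountaineering"),
    ("likes", "john", "football"),
    ("plays", "peter", "basketball"),
]


def proximity(a, b):
    """Degree to which two symbols are close (reflexive and symmetric)."""
    if a == b:
        return 1.0
    return PES.get((a, b), PES.get((b, a), 0.0))


def healthy():
    """Weakly match practises(X, sport) against every fact (min t-norm assumed)."""
    answers = {}
    for predicate, person, activity in FACTS:
        degree = min(proximity("practises", predicate), proximity("sport", activity))
        if degree > 0.0:
            answers[person] = degree
    return answers


print(healthy())
```

Under these assumptions the sketch yields 0.8 for mary, 0.7 for john and 1.0 for peter, matching the answers quoted above.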

2.2 Source for lexical knowledge: WordNet

WordNet [10] is an electronic English thesaurus. The basic elements in its knowledge base are words, word senses and synsets (i.e., sets of cognitive synonyms). It is worth noting that the same word may have different senses and different parts of speech associated with it and, consequently, it appears with different meanings in different pragmatic contexts. Each synset is described by a gloss, which is a brief textual description of its meaning. The lexical and semantic relations included in WordNet, such as hypernymy, hyponymy, antonymy, etc., hold among synsets, not among words.

In this paper, we focus on the use of WordNet as an ontology. It has been used in different applications for general natural language tasks [22], but its behaviour in specialized discourses shows significant limitations [1]. In order to address this problem, an extension of WordNet using fuzzy ontologies has been proposed, because it provides a better framework for the representation of lexical and semantic relationships [1, 2].

2.3 Interval-valued fuzzy sets

IVFSs are a fuzzy formalism based on two membership mappings instead of a single one. They are called, analogously to ordinary fuzzy sets, the lower membership function and the upper membership function. Both are established on a universe of discourse 𝒳, and map each element from 𝒳 to a real number in the [0, 1] interval.

Definition 1. An interval-valued fuzzy set A in 𝒳 is a (crisp) set of ordered triples: $A = \{(x, {\underline{μ}}_{A} (x), {\bar{μ}}_{A} (x)) : x \in 𝒳;\ {\underline{μ}}_{A}, {\bar{μ}}_{A} : 𝒳 \to [0, 1]\}$ where ${\underline{μ}}_{A}$ and ${\bar{μ}}_{A}$ are the lower and the upper membership functions, respectively, satisfying the following condition: $0 \leq {\underline{μ}}_{A} (x) \leq {\bar{μ}}_{A} (x) \leq 1 \ \forall x \in 𝒳$

The name interval-valued comes from the fact that the values ${\underline{μ}}_{A} (x)$ and ${\bar{μ}}_{A} (x)$, computed for any x ∈ 𝒳, are the lower and upper bounds of an interval number, which is the membership degree of x in the set A. That interval is included in [0, 1] and closed at both ends.

Some arithmetic operations on interval numbers are recalled here, since they are useful in operating on cardinalities of IVFSs. Let $a = [\underline{a}, \bar{a}]$ and $b = [\underline{b}, \bar{b}]$ be intervals in $\mathbb{R}$, and $r \in \mathbb{R}^{+}$. The arithmetic operations ‘+’, ‘-’, ‘·’ and power are defined as follows:

$[\underline{a}, \bar{a}] + [\underline{b}, \bar{b}] = [\underline{a} + \underline{b}, \bar{a} + \bar{b}]$

$[\underline{a}, \bar{a}] - [\underline{b}, \bar{b}] = [\underline{a} - \bar{b}, \bar{a} - \underline{b}]$

$[\underline{a}, \bar{a}] \cdot [\underline{b}, \bar{b}] = [\min (\underline{a} \cdot \underline{b}, \underline{a} \cdot \bar{b}, \bar{a} \cdot \underline{b}, \bar{a} \cdot \bar{b}), \max (\underline{a} \cdot \underline{b}, \underline{a} \cdot \bar{b}, \bar{a} \cdot \underline{b}, \bar{a} \cdot \bar{b})]$

$([\underline{a}, \bar{a}])^{r} = [{\underline{a}}^{r}, {\bar{a}}^{r}]$ for non-negative $\underline{a}$, $\bar{a}$
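The four operations translate directly into code. The following Python sketch is illustrative only; the Interval type and the function names are hypothetical and not part of the framework described in this paper.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """Closed interval [lo, hi] of real numbers."""
    lo: float
    hi: float


def add(a, b):
    """[a_lo + b_lo, a_hi + b_hi]"""
    return Interval(a.lo + b.lo, a.hi + b.hi)


def sub(a, b):
    """[a_lo - b_hi, a_hi - b_lo]: the bounds of b are crossed."""
    return Interval(a.lo - b.hi, a.hi - b.lo)


def mul(a, b):
    """Minimum and maximum over the four products of the bounds."""
    products = (a.lo * b.lo, a.lo * b.hi, a.hi * b.lo, a.hi * b.hi)
    return Interval(min(products), max(products))


def power(a, r):
    """[a_lo ** r, a_hi ** r], for non-negative bounds and positive r."""
    if a.lo < 0 or r <= 0:
        raise ValueError("power requires non-negative bounds and r > 0")
    return Interval(a.lo ** r, a.hi ** r)


a = Interval(0.25, 0.5)
b = Interval(0.5, 0.75)
print(add(a, b), sub(a, b), mul(a, b), power(a, 2))
```

Note that subtraction can leave [0, 1] even when both operands are membership degrees, which is why these operations are stated on intervals of real numbers rather than on the lattice of degrees.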

The operations of union and intersection for IVFSs are defined by triangular norms. Let A, B be IVFSs in 𝒳, t a t-norm and s a t-conorm. The union of A and B is the interval-valued fuzzy set A ∪ B with the membership function $μ_{A \cup B} (x) = [s ({\underline{μ}}_{A} (x), {\underline{μ}}_{B} (x)), s ({\bar{μ}}_{A} (x), {\bar{μ}}_{B} (x))]$ (1) and the intersection of A and B is the IVFS A ∩ B in which $μ_{A \cap B} (x) = [t ({\underline{μ}}_{A} (x), {\underline{μ}}_{B} (x)), t ({\bar{μ}}_{A} (x), {\bar{μ}}_{B} (x))]$ (2)

Thus, de Morgan’s laws for IVFSs A, B in 𝒳 are $(A \cup B)^{c} = A^{c} \cap B^{c}, \quad (A \cap B)^{c} = A^{c} \cup B^{c}$ (3)
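Equations (1) to (3) can be checked numerically at a single point x. The sketch below is written under two assumptions that the text does not fix: the t-norm is the minimum and the t-conorm is the maximum, and the complement of a degree [μ̲, μ̄] is taken as [1 − μ̄, 1 − μ̲]. All function names are hypothetical.

```python
def union(a, b, s=max):
    """Equation (1): apply the t-conorm s to lower and to upper bounds."""
    return (s(a[0], b[0]), s(a[1], b[1]))


def intersection(a, b, t=min):
    """Equation (2): apply the t-norm t to lower and to upper bounds."""
    return (t(a[0], b[0]), t(a[1], b[1]))


def complement(a):
    """Assumed complement: the bounds are reflected and swapped."""
    return (1 - a[1], 1 - a[0])


# Membership degrees of one element x in two IVFSs A and B.
mu_a = (0.25, 0.5)
mu_b = (0.125, 0.75)

print(union(mu_a, mu_b))         # degree of x in A ∪ B
print(intersection(mu_a, mu_b))  # degree of x in A ∩ B

# De Morgan's laws (3), checked pointwise.
print(complement(union(mu_a, mu_b)) == intersection(complement(mu_a), complement(mu_b)))
print(complement(intersection(mu_a, mu_b)) == union(complement(mu_a), complement(mu_b)))
```

With other dual pairs of t-norm and t-conorm the same check applies, provided that s and t are passed explicitly.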

Definition 2. Let L be a lattice of intervals in [0, 1] that satisfies:

L = {[x₁, x₂] ∈ [0, 1] ² with x₁ ≤ x₂}

[x₁, x₂] ≤ _L [y₁, y₂] iff x₁ ≤ y₁ and x₂ ≤ y₂

Also by definition:

[x₁, x₂] < _L [y₁, y₂] ⇔ x₁ < y₁, x₂ ≤ y₂ or x₁ ≤ y₁, x₂ < y₂

[x₁, x₂] = _L [y₁, y₂] ⇔ x₁ = y₁, x₂ = y₂

0_L = [0, 0] and 1_L = [1, 1] are the smallest and the greatest elements in L.
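The order of Definition 2 is only partial: two intervals may be incomparable. A minimal Python sketch of the three comparisons follows; the function names are hypothetical.

```python
def leq_L(x, y):
    """[x1, x2] <=_L [y1, y2] iff x1 <= y1 and x2 <= y2."""
    return x[0] <= y[0] and x[1] <= y[1]


def eq_L(x, y):
    """[x1, x2] =_L [y1, y2] iff both bounds coincide."""
    return x[0] == y[0] and x[1] == y[1]


def lt_L(x, y):
    """Strict order: <=_L holds and at least one bound is strictly smaller."""
    return leq_L(x, y) and not eq_L(x, y)


ZERO_L = (0.0, 0.0)  # smallest element of L
ONE_L = (1.0, 1.0)   # greatest element of L

print(leq_L((0.1, 0.5), (0.25, 0.8)))  # comparable pair
print(leq_L((0.2, 0.9), (0.3, 0.5)), leq_L((0.3, 0.5), (0.2, 0.9)))  # incomparable pair
```

Incomparable pairs matter when a cut value is used as a threshold: an interval with a higher lower bound but a lower upper bound than the cut is not above it.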

Definition 3. A binary interval-valued fuzzy relation on a set U is an IVFS on U × U (that is, a mapping U × U → L, where L is the lattice of intervals in [0, 1] previously defined).

Since interval-valued fuzzy relations are interval-valued fuzzy subsets they have [ $\underline{α}$ , $\bar{α}$ ]-cuts.

Definition 4. If ℛ is an interval-valued fuzzy relation on U, its [ $\underline{α}$ , $\bar{α}$ ]-cut is the crisp relation $ℛ^{[\underline{α}, \bar{α}]} = \{(x, y) \mid ℛ (x, y) \geq_{L} [\underline{α}, \bar{α}]\}$.

A binary interval-valued fuzzy relation on a set U is an interval-valued fuzzy subset on U × U (that is, a mapping U × U ⟶ L ([0, 1])).

3 Enriching PLP with interval-valued fuzzy relations

Any extension of a logic programming language requires keeping a balance between two levels [14]: i) the knowledge level, where the facts and the behaviour of a particular world are described; ii) the symbolic level, where symbols are used for representing the knowledge level. Thus, considering deductive reasoning, the symbolic level can simulate reasoning and infer new knowledge.

In the case of a logic programming language enriched with linguistic resources and IVFSs, it is also necessary to distinguish between crisp and uncertain knowledge. Uncertain knowledge is not usually present at the knowledge level, but it plays an important role in the inference process and, therefore, it must be considered by the inference mechanism. From the point of view of the design of a programming language, this is crucial because the user/programmer should not have to define fuzzy sets manually; in other words, the system should compute all types of fuzzy sets automatically. Thus, assuming these ideas, we separate the representation of precise and uncertain knowledge, which will improve the user’s experience. To this aim, we employ Interval-Valued Fuzzy Relations (IVFRs).

3.1 Interval-valued fuzzy relations in PLP

A binary IVFR ℛ is a proximity relation if it fulfils the reflexive and symmetric properties. In standard PLP, different syntactic symbols represent proximity information. Generalizing this idea, presented in [16] and introduced in [15], an IVFR ℛ is defined on the alphabet of a first order language. This makes it possible to treat as indistinguishable two syntactic symbols which are related by ℛ with a certain degree greater than zero.

The IVFR ℛ on the alphabet of a first order language can be extended to terms and atomic formulas by structural induction in the usual way [15]:

Let f and g be two n-ary function symbols and let t₁, …, t_n, s₁, …, s_n be terms. Then $ℛ (f (t_{1}, \dots, t_{n}), g (s_{1}, \dots, s_{n})) = ℛ (f, g) \land (⋀_{i = 1}^{n} ℛ (t_{i}, s_{i}))$;

Let p and q be two n-ary predicate symbols and let t₁, …, t_n, s₁, …, s_n be terms. Then $ℛ (p (t_{1}, \dots, t_{n}), q (s_{1}, \dots, s_{n})) = ℛ (p, q) \land (⋀_{i = 1}^{n} ℛ (t_{i}, s_{i}))$;

Let $C = A_{0} \leftarrow A_{1}, \dots, A_{n}$ and $C^{'} = A_{0}^{'} \leftarrow A_{1}^{'}, \dots, A_{m}^{'}$ be two Horn clauses. If n = m, $ℛ (C, C^{'}) = ⋀_{i = 0}^{n} ℛ (A_{i}, A_{i}^{'})$;

Here, ∧ represents any t-norm acting on interval-valued approximation degrees. In any other case, the approximation degree of two expressions is zero.
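The structural induction can be sketched as a recursive function over ground terms. The Python code below is an illustration under stated assumptions: ∧ is taken to be the componentwise minimum, compound terms are encoded as tuples (functor, arg1, …, argn), constants as strings, and the single proximity equation is the one used later in Example 2. The names PES, meet, symbol_degree and term_degree are hypothetical.

```python
ZERO = (0.0, 0.0)
ONE = (1.0, 1.0)

# Interval-valued proximity equations between symbols (symmetric).
PES = {frozenset(("horror", "mystery")): (0.7, 0.9)}


def meet(a, b):
    """Assumed t-norm on intervals: componentwise minimum."""
    return (min(a[0], b[0]), min(a[1], b[1]))


def symbol_degree(f, g):
    """R on symbols: reflexive, symmetric, zero when no equation relates them."""
    if f == g:
        return ONE
    return PES.get(frozenset((f, g)), ZERO)


def term_degree(t, s):
    """R extended to ground terms and atoms by structural induction."""
    t = t if isinstance(t, tuple) else (t,)
    s = s if isinstance(s, tuple) else (s,)
    if len(t) != len(s):
        return ZERO  # different arities: the approximation degree is zero
    degree = symbol_degree(t[0], s[0])
    for t_i, s_i in zip(t[1:], s[1:]):
        degree = meet(degree, term_degree(t_i, s_i))
    return degree


print(term_degree(("book", "dracula", "horror"), ("book", "dracula", "mystery")))
```

The same function covers atomic formulas, since predicates and function symbols are treated alike in the first two clauses of the definition.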

Although substitutions can be “fuzzified” in some respect, in this paper we use the classical notion of a substitution, disregarding other possible alternatives in our setting. A substitution σ is a mapping from the set of variables 𝒳 to the set of terms 𝒯 such that its domain dom (σ) = {x ∈ 𝒳 ∣ xσ ≠ x} is finite. We frequently identify a substitution σ with the set {x/xσ ∣ x ∈ dom (σ)}. We denote the identity substitution by id. The restriction σ_↾𝒱 of a substitution σ to a set 𝒱 of variables is defined by xσ_↾𝒱 = xσ if x ∈ 𝒱 and xσ_↾𝒱 = x if x ∉ 𝒱. We write σ = θ [𝒱] iff σ_↾𝒱 = θ_↾𝒱.

When we apply a substitution to two expressions (terms or atoms) which are similar w.r.t. a proximity relation, ℛ, and a cut value, [ $\underline{λ}$ , $\bar{λ}$ ], their instances remain similar with the same approximation degree.

Proposition 1. [15, p. 397] Let ℛ be a proximity relation and [ $\underline{λ}$ , $\bar{λ}$ ] > [0, 0] be a cut value. For any substitution θ and terms t, t′, if ℛ (t, t′) ≥ [ $\underline{λ}$ , $\bar{λ}$ ] then ℛ (tθ, t′θ) = ℛ (t, t′) ≥ [ $\underline{λ}$ , $\bar{λ}$ ].

Also, in the context of a proximity relation, ℛ, we can define an analogous concept to the classical notion of variant substitutions. Roughly speaking, two substitutions are similar if they share the same domain and the terms of their respective ranges are pairwise similar w.r.t. ℛ and a cut value [ $\underline{λ}$ , $\bar{λ}$ ].

Definition 5. Let ℛ be a proximity relation and [ $\underline{λ}$ , $\bar{λ}$ ] be a cut value. The substitution σ is similar to the substitution θ with level [ $\underline{λ}$ , $\bar{λ}$ ], denoted by σ ≈ _{ℛ,[ $\underline{λ}$ , $\bar{λ}$ ]}θ, if and only if there exists a renaming substitution ρ such that, for any variable x in dom (σ) = dom (θ), ℛ (xθ, xσρ) ≥ [ $\underline{λ}$ , $\bar{λ}$ ].

The following definition introduces a weaker version of the notion of more general substitution.

Definition 6. Let ℛ be a proximity relation and [ $\underline{λ}$ , $\bar{λ}$ ] be a cut value. The substitution σ is more general than the substitution θ with level [ $\underline{λ}$ , $\bar{λ}$ ], denoted by σ ⪅ θ, if there exists a substitution δ such that, for any variable x in dom (σ) ∪ dom (θ), ℛ (xθ, xσδ) ≥ [ $\underline{λ}$ , $\bar{λ}$ ].

Note that the relation ⪅ is a pre-order on the set of substitutions of a first order language [15, p. 410].

The notion of proximity between substitutions can be characterized in terms of the weaker version of more general substitution just introduced in the last definition, which we name the weak most general unifier (wmgu).

Note that a soundness result on the lattice [0, 1] has been proved in [16] and, since the proposal defined in this paper is an extension of it, those properties are preserved. Nevertheless, as future work, soundness and completeness will also be formally proved in this framework.

3.2 Weak unification algorithm based on interval-valued fuzzy relations

Now, we extend the weak unification algorithm presented in [12] with IVFRs on syntactic domains. It is formalized as a transition system supported on an interval-valued unification relation “⇒”. The unification of two expressions ɛ₁₁ = f (t₁, …, t_n) and ɛ₁₂ = g (s₁, …, s_n) is obtained by a state transformation sequence starting from an initial state $〈 G, id, [{\underline{α}}_{0}, {\bar{α}}_{0}] 〉$, where G = {t₁ ≈ s₁, …, t_n ≈ s_n} is a set of unification problems 3 , id is the identity substitution and $[{\underline{α}}_{0}, {\bar{α}}_{0}] = [1, 1]$ is the initial interval-valued degree: $〈 G, id, [{\underline{α}}_{0}, {\bar{α}}_{0}] 〉 \Rightarrow 〈 G_{1}, θ_{1}, [{\underline{α}}_{1}, {\bar{α}}_{1}] 〉 \Rightarrow \dots \Rightarrow 〈 G_{n}, θ_{n}, [{\underline{α}}_{n}, {\bar{α}}_{n}] 〉.$ When the final state $〈 G_{n}, θ_{n}, [{\underline{α}}_{n}, {\bar{α}}_{n}] 〉$, with G_n = ∅, is reached (i.e., the equations in the initial state have been solved), the expressions ɛ₁₁ and ɛ₁₂ are unifiable by proximity with wmgu θ_n and unification degree $[{\underline{α}}_{n}, {\bar{α}}_{n}]$. Therefore, the final state $〈 ∅, θ_{n}, [{\underline{α}}_{n}, {\bar{α}}_{n}] 〉$ signals that unification has succeeded. On the other hand, when the expressions ɛ₁₁ and ɛ₁₂ are not unifiable, the state transformation sequence ends with failure (i.e., G_n = Fail).
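The state transformation sequence can be illustrated with a simplified Python sketch. The transition rules themselves are not listed in this section, so the code assumes a Martelli-Montanari-style rule set in which the symbol-clash rule is relaxed to a proximity check; it also assumes the componentwise minimum as the t-norm, starts from the single problem ɛ₁₁ ≈ ɛ₁₂ so that the decomposition rule accounts for ℛ (f, g), and ignores any additional bookkeeping that a non-transitive proximity relation may require. Terms are tuples (functor, args…), variables are capitalized strings, and all names are hypothetical.

```python
ZERO, ONE = (0.0, 0.0), (1.0, 1.0)
PES = {frozenset(("horror", "mystery")): (0.7, 0.9)}


def is_var(t):
    return isinstance(t, str) and t[:1].isupper()


def meet(a, b):
    """Assumed t-norm on intervals: componentwise minimum."""
    return (min(a[0], b[0]), min(a[1], b[1]))


def proximity(f, g):
    return ONE if f == g else PES.get(frozenset((f, g)), ZERO)


def substitute(t, var, value):
    """Replace every occurrence of variable var in term t by value."""
    if t == var:
        return value
    if isinstance(t, tuple):
        return (t[0],) + tuple(substitute(a, var, value) for a in t[1:])
    return t


def occurs(var, t):
    if t == var:
        return True
    return isinstance(t, tuple) and any(occurs(var, a) for a in t[1:])


def weak_unify(t, s):
    """Return (substitution, degree) on success or None on failure."""
    problems, theta, degree = [(t, s)], {}, ONE  # state <G, theta, degree>
    while problems:
        a, b = problems.pop()
        if is_var(b) and not is_var(a):
            a, b = b, a  # swap so that the variable is on the left
        if is_var(a):
            if a == b:
                continue  # trivial problem X = X is removed
            if occurs(a, b):
                return None  # occurs check fails
            problems = [(substitute(x, a, b), substitute(y, a, b)) for x, y in problems]
            theta = {v: substitute(u, a, b) for v, u in theta.items()}
            theta[a] = b
            continue
        a = a if isinstance(a, tuple) else (a,)
        b = b if isinstance(b, tuple) else (b,)
        d = proximity(a[0], b[0])
        if len(a) != len(b) or d == ZERO:
            return None  # roots are not close or arities differ
        degree = meet(degree, d)  # decomposition lowers the degree
        problems.extend(zip(a[1:], b[1:]))
    return theta, degree


print(weak_unify(("book", "X", "mystery"), ("book", "dracula", "horror")))
```

In this sketch the empty problem list plays the role of G_n = ∅ and the None result plays the role of Fail.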

3.3 Weak SLD resolution based on interval-valued fuzzy relations

Let Π be a set of Horn clauses and ℛ an interval-valued fuzzy relation on the alphabet of a first order language ℒ. Let Λ = {[ ${\underline{λ}}_{0}$ , ${\bar{λ}}_{0}$ ] , . . . , [ ${\underline{λ}}_{n}$ , ${\bar{λ}}_{n}$ ]} be the set of approximation levels of ℛ. We extend the Weak SLD (WSLD) resolution presented in [12]. Now, this is a transition system 〈E, ⇒_WSLD〉 where E is a set of triples 〈𝒢, θ, [ $\underline{α}$ , $\bar{α}$ ]〉 (goal, substitution, interval-valued degree), each of which we call a state of a computation, and whose transition relation ⇒_WSLD ⊆ (E × E) is the smallest relation that satisfies:

$\frac{𝒞 = (𝒜 \leftarrow 𝒬) \ll Π, \quad σ = wmgu (𝒜, 𝒜^{'}) \neq fail, \quad [\underline{β}, \bar{β}] = ℛ (𝒜σ, 𝒜^{'}σ) \geq [\underline{λ}, \bar{λ}]}{〈 (\leftarrow 𝒜^{'}, 𝒬^{'}), θ, [\underline{α}, \bar{α}] 〉 \Rightarrow_{WSLD} 〈 \leftarrow (𝒬, 𝒬^{'}) σ, θσ, [\underline{α}, \bar{α}] \land [\underline{β}, \bar{β}] 〉}$

where 𝒬, 𝒬′ are conjunctions of atoms, the notation “𝒞≪Π” means that 𝒞 is a standardized apart clause in Π, and the value [ $\underline{λ}$ , $\bar{λ}$ ] is a cut value in Λ, which imposes a limit on the expansion of the search space in a computation. We say that the performed step is a step of level [ $\underline{λ}$ , $\bar{λ}$ ] because the computed approximation degree is greater than or equal to [ $\underline{λ}$ , $\bar{λ}$ ].

A WSLD derivation of level [ $\underline{λ}$ , $\bar{λ}$ ] for Π ∪ {𝒢₀} and ℛ is a sequence of steps of level [ $\underline{λ}$ , $\bar{λ}$ ]: $〈 𝒢_{0}, id, [1, 1] 〉 \Rightarrow_{WSLD} \dots \Rightarrow_{WSLD} 〈 𝒢_{n}, θ_{n}, [{\underline{β}}_{n}, {\bar{β}}_{n}] 〉.$ That is, each [ ${\underline{β}}_{i}$ , ${\bar{β}}_{i}$ ] ≥ [ $\underline{λ}$ , $\bar{λ}$ ]. A WSLD refutation of level [ $\underline{λ}$ , $\bar{λ}$ ] for Π ∪ {𝒢₀} and ℛ is a WSLD derivation of level [ $\underline{λ}$ , $\bar{λ}$ ] for Π ∪ {𝒢₀} and ℛ of the form $〈 𝒢_{0}, id, [1, 1] 〉 \Rightarrow_{WSLD}^{*} 〈 □, σ, [\underline{β}, \bar{β}] 〉,$ where the symbol “□” stands for the empty clause, σ is the computed substitution and [ $\underline{β}$ , $\bar{β}$ ] is its interval-valued approximation degree. The output of a WSLD refutation is the pair 〈σ, [ $\underline{β}$ , $\bar{β}$ ]〉, which is said to be the computed answer. Indeed, a WSLD refutation computes a family of answers, in the sense that, if σ = {x₁/t₁, …, x_n/t_n} then, by definition, any substitution θ′ = {x₁/s₁, …, x_n/s_n} such that ℛ (s_i, t_i) ≥ [ $\underline{λ}$ , $\bar{λ}$ ], for all 1 ≤ i ≤ n, is also a computed substitution with interval-valued approximation degree $[\underline{β}, \bar{β}] \land (⋀_{i = 1}^{n} ℛ (s_{i}, t_{i}))$ .

Example 2. Let us consider a database storing information on books, including readers’ preferences and some subjective information concerning the semantic relations between some syntactic entities. It is then possible to perform an inference step in which the antecedent of a conditional formula is allowed to match some premise only approximately. For example, the statement that if X is a mystery book then X is a good one can be modelled with the rule “good (X) ← book (X, mystery)”; the statement that Dracula is a horror book can be modelled with the fact “book (dracula, horror)”; finally, the knowledge that horror books are similar to mystery books can be modelled with an interval-valued PE “horror ∼ mystery = [0.7, 0.9]”. Now, by means of weak resolution based on interval-valued fuzzy sets we can infer “good (dracula) : - [0.7, 0.9]”, that is, Dracula is a good book with an interval-valued degree given by the lower and the upper degrees. The corresponding weak resolution steps are shown in Figs. 1 and 2.

Thus, the sequence of steps that yields the answer X = dracula with [0.7, 0.9] is: 〈good (X₁) , {X/X₁} , [1, 1]〉 ⇒_WSLD 〈 ← book (X₁, mystery) , {X/X₁} , [1, 1]〉 ⇒_WSLD 〈 □ , {X/X₁, X₁/dracula} , [0.7, 0.9]〉.
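The interval-valued degree of this refutation can be checked by hand. The following derivation is a sketch that assumes ∧ is the minimum t-norm applied componentwise to the bounds (the framework allows any t-norm); the subscripts 1 and 2 refer to the first and second resolution steps.

```latex
\begin{align*}
[\underline{\beta}_{1}, \bar{\beta}_{1}]
  &= [1,1] \land \mathcal{R}(good, good) = [1,1] \land [1,1] = [1,1],\\
\mathcal{R}(book(dracula, horror),\, book(dracula, mystery))
  &= \mathcal{R}(book, book) \land \mathcal{R}(dracula, dracula) \land \mathcal{R}(horror, mystery)\\
  &= [1,1] \land [1,1] \land [0.7, 0.9] = [0.7, 0.9],\\
[\underline{\beta}_{2}, \bar{\beta}_{2}]
  &= [1,1] \land [0.7, 0.9] = [\min(1, 0.7), \min(1, 0.9)] = [0.7, 0.9].
\end{align*}
```

Only the second step lowers the degree, because it is the only one in which two different but close symbols (horror and mystery) are matched.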

4 Semantic proximity equations

As we said, WordNet can be interpreted as an ontology, given that nouns and verbs are organized into hierarchies based on “is-a” relations. In the literature, these relationships have been used to obtain different semantic resemblance metrics, which can be classified into two main categories [22]:

Semantic similarity measures: They are exclusively based on “is-a” relations. Some very well known examples are Path length [23] or Wu and Palmer [24].

Semantic relatedness measures: They include other linguistic relations (meronymy, glosses, etc.) in addition to “is-a” relations. Some examples are Tversky [25] or Hirst and St-Onge [26].

As we said in the introduction, we propose to use general knowledge for enhancing PLP. Although logic programming requires formal expressions, we propose to induce their values from WordNet, a linguistic resource. In this paper, we shall use semantic similarity measures, which may be classified into five different groups [22]:

Edge-counting measures: They are based on the minimum path length between two terms present in WordNet.

Information content-based measures: These metrics introduce the concept of information content (IC); i.e., the more abstract a concept is, the less IC it has. There are two main ways for calculating the IC of a concept: i) statistical corpora analysis; and, ii) using the topological parameters of ‘is-a’ taxonomy.

Feature-based measures: These measures are based on the concept of feature, the similarity being estimated according to the weighted sum of common and non-common characteristics. The main source for defining features in this approach is the glosses provided by WordNet or another dictionary.

Gloss-based measures: These measures quantify the overlap between the glosses of two concepts with their semantic neighbours.

Hybrid measures: These consist of the combination of different proposed methods.

Although we use WordNet, in general any other thesaurus (𝒯) can be applied. Thus, a word w₁ ∈ 𝒯 is related to a word w₂ ∈ 𝒯 with a degree [ $\underline{α}$ , $\bar{α}$ ], which is represented in Bousi~Prolog by a PE of the form: $w_{1} \sim w_{2} = [\underline{α}, \bar{α}]$ (4) where w₁ and w₂ stand for the corresponding terms and [ $\underline{α}$ , $\bar{α}$ ] denotes the lower and the upper approximation degrees obtained by using a set of similarity metrics {M₁, …, M_n}. If only a single metric is used, then the PE reduces to the expression w₁ ∼ w₂ = α. When w₁ and w₂ are in the same synset, [ $\underline{α}$ , $\bar{α}$ ] = [1, 1] or α = 1. Note that in the first case [ $\underline{α}$ , $\bar{α}$ ] ∈ L ([0, 1]) and in the second one α ∈ [0, 1]. Our framework allows IVFSs and ordinary fuzzy sets to be used interchangeably.

Thus, words and their relations become part of the first order language alphabet and take part naturally in the inference process. This requires an inference rule based on the semantic relation that has been used to obtain the set of PEs between words. More formally, we call a program, Π, a set of first order Horn clauses in which there is a set of symbols that we interpret as words. This program is associated with a thesaurus (𝒯) through what we call the vocabulary of Π, which is composed of all those symbols (predicate, function or constant) that are present in the program and defined in the thesaurus 𝒯.
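One way to read equation (4) is sketched below in Python. How the bounds are derived from the metrics {M₁, …, M_n} is an assumption here: the sketch takes the minimum and the maximum of the scores, so that a single metric gives a degenerate interval [α, α]. The metric names and score values are hypothetical placeholders and do not come from WordNet or any library.

```python
def interval_pe(w1, w2, scores, same_synset=False):
    """Build the interval-valued PE w1 ~ w2 = [lower, upper] from metric scores.

    scores maps a metric name to a similarity value in [0, 1].
    Assumption: lower and upper bounds are the minimum and maximum score.
    """
    if same_synset:
        return (w1, w2, (1.0, 1.0))  # words in the same synset are fully close
    values = list(scores.values())
    if not values:
        raise ValueError("at least one metric score is required")
    if any(not 0.0 <= v <= 1.0 for v in values):
        raise ValueError("similarity scores must lie in [0, 1]")
    return (w1, w2, (min(values), max(values)))


# Hypothetical scores of two metrics for one pair of words.
print(interval_pe("w1", "w2", {"metric_a": 0.25, "metric_b": 0.8}))
# A single metric collapses the interval to one value.
print(interval_pe("w1", "w2", {"metric_a": 0.25}))
```

Taking the extremes keeps every individual metric inside the interval, which matches the stated aim of bounding the similarity degree according to different metrics.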

Definition 7. [Vocabulary of a program Π] Given a program Π, the vocabulary of Π, denoted by $𝒱_{Π}$, is made up of all the predicate, function or constant symbols in the program.

Definition 8. [Vocabulary of Π and 𝒯] Given a program Π and a thesaurus 𝒯. The vocabulary of Π and 𝒯, denoted by $𝒱_{Π}^{𝒯}$ , is made up of all the predicate, function or constant symbols in the program Π that are words in 𝒯. That is, symbols which are interpreted as words in a thesaurus 𝒯.

Example 3. Let Π be the program of Example 1 and let the thesaurus 𝒯 be WordNet. The vocabulary is $𝒱_{Π}^{𝒯} =$ {loves, likes, plays, mountaineering, football, basketball, healthy, practises, sport}.

Now, each of the elements in $𝒱_{Π}^{𝒯}$ is looked up in the thesaurus 𝒯 in order to obtain the PEs between the words.

The next step is to define the reasoning model; that is, we shall define the inference mechanism corresponding to this framework, which is based on the semantic relations among words.
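Definitions 7 and 8 amount to collecting the symbols of the program and intersecting them with the words known to the thesaurus. The Python sketch below runs on the program of Example 1; the clause encoding, the function names and, in particular, the small TOY_THESAURUS set (a stand-in for a real WordNet lookup) are assumptions made for illustration.

```python
# Clauses as (head, body) pairs; atoms as tuples (predicate, args...).
PROGRAM = [
    (("loves", "mary", "mountaineering"), []),
    (("likes", "john", "football"), []),
    (("plays", "peter", "basketball"), []),
    (("healthy", "X"), [("practises", "X", "sport")]),
]

# Stand-in for the set of words defined in the thesaurus (hypothetical).
TOY_THESAURUS = {
    "loves", "likes", "plays", "practises", "healthy",
    "sport", "football", "basketball", "mountaineering", "book",
}


def symbols(term):
    """Yield predicate, function and constant symbols, skipping variables."""
    if isinstance(term, tuple):
        yield term[0]
        for argument in term[1:]:
            yield from symbols(argument)
    elif not term[:1].isupper():  # capitalized names are variables
        yield term


def vocabulary(program):
    """Definition 7: every symbol occurring in the program."""
    return {sym for head, body in program for atom in [head, *body] for sym in symbols(atom)}


def thesaurus_vocabulary(program, thesaurus_words):
    """Definition 8: the symbols of the program that are words in the thesaurus."""
    return vocabulary(program) & thesaurus_words


print(sorted(thesaurus_vocabulary(PROGRAM, TOY_THESAURUS)))
```

With this toy set, the proper names mary, john and peter belong to the vocabulary of Π but not to the vocabulary of Π and 𝒯.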

4.1 Reasoning using WordNet and IVFSs

Given a program Π, a thesaurus 𝒯 and its associated vocabulary $𝒱_{Π}^{T}$ , then for all $w \in 𝒱_{Π}^{𝒯}$ , the set of terms related with w extracted from 𝒯 is denoted as $𝒮 (w)_{𝒱_{Π}^{𝒯}} = {s_{1}, \dots, s_{n}}$ . Once the vocabulary of synonyms has been constructed, the proximity equations are created applying the corresponding semantic similarity measures. In this paper, we use two of the most simple measures from edge-counting category 4 : Path length’s [23] and Wu and Palmer’s [24] ones. Both metrics are calculated using the library called WS4J 5 . In addition, we introduce a $[\underline{λ}, \bar{λ}] - cut$ as threshold in order to select only words with a high degree of similarity.

Example 4. Let Π be the program of Example 1, a thesaurus 𝒯 = WordNet and $𝒱_{Π}^{𝒯}$ the vocabulary of Example 1, we obtain the terms of a vocabulary by using the Path length and Wu-Palmer measures with a fixed λ-cut=[0.1,0.5]. 6 : $𝒮_{𝒱_{Π}^{𝒯_{[0.1, 0.5]}}} (loves) = {$ (passion,[0.25,0.8]); (eroticlove,[0.1,0.7]); (sexual love,[0.1,0.7]); (enjoy,[0.3,0.5]); (know,[0.3,0.5]); (make out,[0.3,0.5])}; $𝒮_{𝒱_{Π}^{𝒯_{[0.1, 0.5]}}} (plays) = {$ (dramatic play,[1.0,1.0]); (drama, [1,1]); (act,[0.25,0.7]); (wager,[0.3,0.75])}; $𝒮_{𝒱_{Π}^{𝒯_{[0.1, 0.5]}}} (football) = {$ (football game, [1.0,1.0]) }; $𝒮_{𝒱_{Π}^{𝒯_{[0.1, 0.5]}}} (basketball) = {$ (hoops, [1,1]); (basketball game,[1,1]) }; $𝒮_{𝒱_{Π}^{𝒯_{[0.1, 0.5]}}}$ (pract-ises) = {(rehearse, [1,1])}; $𝒮_{𝒱_{Π}^{𝒯_{[0.1, 0.5]}}} (sport) = {$ (variation, [0.1,0.5]); (fun,[0.3,0.8]); (boast,[0.1,0.7]); (feature,[0.5,0.8]); (frolic,[0.3,0.8]); (romp,[0.1,0.5]); (gambol,[0.3,0.8]); (frisk,[0.2,0.7])}. Then the PEs showed in Fig. 3 are created.
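The role of the cut as a threshold can be sketched in Python. The first three candidates and their intervals are taken from the set for loves in Example 4; the fourth entry is an invented candidate added only to show a rejection. The comparison uses the lattice order of Definition 2, and all names are hypothetical.

```python
CANDIDATES_LOVES = {
    "passion": (0.25, 0.8),
    "erotic love": (0.1, 0.7),
    "enjoy": (0.3, 0.5),
    "hypothetical_word": (0.05, 0.6),  # invented entry, below the cut
}


def above_cut(degree, cut):
    """degree >=_L cut: both bounds must reach the bounds of the cut."""
    return degree[0] >= cut[0] and degree[1] >= cut[1]


def proximity_equations(word, candidates, cut):
    """Keep the candidates at or above the cut, as (word, other, interval) triples."""
    return [
        (word, other, degree)
        for other, degree in sorted(candidates.items())
        if above_cut(degree, cut)
    ]


for equation in proximity_equations("loves", CANDIDATES_LOVES, (0.1, 0.5)):
    print(equation)
```

The invented candidate is dropped even though its upper bound exceeds 0.5, because its lower bound falls below 0.1.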

Therefore, as the system knows that “mary loves mountaineering”, it is now able to infer that “mountaineering is a passion of mary” or “mary enjoys mountaineering”: when asked “?- passion(mary,mountaineering), enjoy(mary,mountaineering).”, the system responds “Yes with [0.25,0.8]”. Analogous queries can be posed for the rest of the symbols occurring in the source program.

5 Incorporation of interval-valued fuzzy sets and WordNet into Bousi∼Prolog

In this section, we explain how the reasoning model detailed in the previous sections can be incorporated into the Bousi~Prolog system [12, 28]. Two steps are needed: i) incorporating interval-valued fuzzy set theory into the core of BPL; and ii) connecting WordNet with Bousi~Prolog at compilation time (footnote 7).

5.1 Discussion about the incorporation of IVFSs

The incorporation of the IVFSs into the BPL system requires two main steps: i) enhancing the compiler in order to allow a programmer to deal with interval-valued PEs defined above; ii) enhancing the abstract machine in order to introduce IVFSs inside the inference process. In particular, we have enhanced the Similarity-based WAM [28] for the execution of interval-valued Bousi~Prolog programs.

5.2 Discussion about the incorporation of WordNet

Introducing WordNet into Bousi~Prolog involves addressing syntactic aspects and different implementation phases, and generating the data structures needed to facilitate reasoning with words.

During the syntactic analysis phase, the source program is checked for syntactic errors and, at the same time, the syntax tree that is the basis for later code generation is built. Additionally, in this phase, the directive “thesaurus” creates a connection with the thesaurus and generates the vocabulary of Π and 𝒯. This Bousi~Prolog directive allows us to specify the thesaurus 𝒯 to which we want to connect, in this case WordNet. Its syntax is:

:-thesaurus(Thesaurus_Name,[M_1,...,M_N],Lambda_Cut,[semantic_relations]).

where:

Thesaurus_Name is the name of the thesaurus.

[M_1,...,M_N] is the list of semantic metrics employed to generate the set of proximity equations. By default, the path or wup metrics are considered.

Lambda_Cut is a threshold $[\underline{λ}, \bar{λ}] \in L([0, 1])$ that indicates the minimum degree of semantic similarity. By default, the $[\underline{λ}, \bar{λ}]$-cut is [0,0], that is, no threshold is imposed.

Example 5. The following directive connects a program to WordNet using the semantic measures ‘Path length’ and ‘Wu and Palmer’. Additionally, a λ-cut = [0.1, 0.5] is required:

:-thesaurus(wordnet,[path,wup],[0.1,0.5],[semantic_relations]).

Once the connection has been established, the next step is the generation of PEs from the semantic relations indicated as arguments in the directive. We proceed as follows: i) we start from a program Π, a thesaurus 𝒯, a set of semantic metrics [M₁, …, M_N] (for now, the Path length and/or Wu-Palmer measures) and a λ-cut $[\underline{λ}, \bar{λ}]$; ii) the interval-valued fuzzy relation is defined and initialized: $ℛ := ∅$; iii) the vocabulary $𝒱_{Π}^{𝒯}$ is computed; iv) for each $w \in 𝒱_{Π}^{𝒯}$ we compute the set of related terms $𝒮_{𝒱_{Π}^{𝒯_{[\underline{λ}, \bar{λ}]}}}(w) = \{s_{1}, \dots, s_{n}\}$ and add the corresponding interval-valued proximity equations, $ℛ := ℛ \cup \{ℛ(w, s_{i}) = [\underline{α}_{i}, \bar{α}_{i}] \mid 1 \le i \le n\}$; v) finally, the set of entries that defines the semantic proximity relation ℛ is returned.
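The generation procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Bousi~Prolog implementation: the names `generate_pes`, `at_least`, `toy_related`, `toy_path` and `toy_wup` are hypothetical, the interval is assumed to be the minimum and maximum of the metric scores, the cut is assumed to be applied componentwise, and the thesaurus is replaced by a small in-memory table.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Interval = Tuple[float, float]
Metric = Callable[[str, str], float]


def at_least(degree: Interval, cut: Interval) -> bool:
    """Componentwise comparison of two intervals (assumed ordering)."""
    return degree[0] >= cut[0] and degree[1] >= cut[1]


def generate_pes(
    vocabulary: Iterable[str],
    related_terms: Callable[[str], List[str]],
    metrics: List[Metric],
    lambda_cut: Interval = (0.0, 0.0),
) -> Dict[Tuple[str, str], Interval]:
    """Build an interval-valued proximity relation for the vocabulary."""
    relation: Dict[Tuple[str, str], Interval] = {}  # starts empty
    for w in vocabulary:
        for s in related_terms(w):
            scores = [metric(w, s) for metric in metrics]
            # Assumption: the interval spans the lowest and highest score.
            degree = (min(scores), max(scores))
            if at_least(degree, lambda_cut):
                relation[(w, s)] = degree
    return relation


# Toy stand-in for the thesaurus and metrics. The first two rows are chosen
# to reproduce intervals from Example 4; which metric yields which bound is
# an assumption, and the third row is invented to show the effect of the cut.
TOY_SCORES = {
    ("loves", "passion"): {"path": 0.25, "wup": 0.8},
    ("loves", "enjoy"): {"path": 0.3, "wup": 0.5},
    ("loves", "adore"): {"path": 0.05, "wup": 0.3},
}


def toy_related(w: str) -> List[str]:
    return [s for (v, s) in TOY_SCORES if v == w]


def toy_path(w: str, s: str) -> float:
    return TOY_SCORES[(w, s)]["path"]


def toy_wup(w: str, s: str) -> float:
    return TOY_SCORES[(w, s)]["wup"]


pes = generate_pes(["loves"], toy_related, [toy_path, toy_wup], (0.1, 0.5))
for (w, s), (low, up) in pes.items():
    print(f"{w}~{s}=[{low},{up}]")
```

With the cut [0.1, 0.5], the invented pair is discarded and the two remaining entries carry the intervals listed for “loves” in Example 4; with the default cut [0, 0] all three pairs would be kept.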

Example 6. Continuing with Example 1, suppose that we want to reason using semantic relations. This can be coded in Bousi~Prolog as follows:

:-thesaurus(wordnet,[semantic_relations]).

love(mary, mounting).

like(john, football).

play(peter, basketball).

healthy(X):- practise(X,sport).

The algorithm allows us to generate the equations shown in Example 4. Therefore, when asked “?- likes(john,football_game).”, the system responds “Yes with [1,1]”. Finally, as shown in Example 4, our framework allows the system to automatically acquire knowledge that is not initially incorporated in the original knowledge bases.
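To make the role of the generated equations concrete, the following Python sketch answers a ground query against the facts of Example 6 by comparing symbols through a proximity relation. It is a simplification under stated assumptions, not the Bousi~Prolog unification algorithm: `proximity`, `meet` and `solve` are hypothetical names, atoms are ground tuples, the relation is treated as reflexive and symmetric, degrees are combined with the componentwise minimum, and the query uses the predicate symbol `like` exactly as it is written in the fact.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[float, float]
Atom = Tuple[str, ...]
TOP: Interval = (1.0, 1.0)


def meet(a: Interval, b: Interval) -> Interval:
    """Componentwise minimum of two intervals (assumed combination)."""
    return (min(a[0], b[0]), min(a[1], b[1]))


def proximity(rel: Dict[Tuple[str, str], Interval], x: str, y: str) -> Optional[Interval]:
    """Degree to which x and y are close, or None if they are unrelated."""
    if x == y:
        return TOP  # reflexivity
    if (x, y) in rel:
        return rel[(x, y)]
    return rel.get((y, x))  # symmetry


def solve(query: Atom, facts: List[Atom], rel: Dict[Tuple[str, str], Interval]) -> List[Interval]:
    """Return one approximation degree per fact that the query matches."""
    answers: List[Interval] = []
    for fact in facts:
        if len(fact) != len(query):
            continue
        degree: Optional[Interval] = TOP
        for q_sym, f_sym in zip(query, fact):
            p = proximity(rel, q_sym, f_sym)
            if p is None:
                degree = None
                break
            degree = meet(degree, p)
        if degree is not None:
            answers.append(degree)
    return answers


# Facts of Example 6, written as (predicate, arg1, arg2).
FACTS = [
    ("love", "mary", "mounting"),
    ("like", "john", "football"),
    ("play", "peter", "basketball"),
]
# One equation from Example 4.
REL = {("football", "football_game"): (1.0, 1.0)}

answers = solve(("like", "john", "football_game"), FACTS, REL)
print(answers)
```

Under these assumptions the query matches the second fact with degree [1,1], while a query that mentions a symbol unrelated to every fact yields no answer.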

It is worth noting that the general rules have been generated automatically from WordNet rather than manually, item by item, and that our reasoning schema builds on that resource. However, this way of proceeding is not free of inefficiencies and problems, because all the semantic relations of a given word are included in the program. An immediate solution is to apply automatic techniques to prune and improve ontologies, but this task is beyond the scope of this paper and is left as future work.

6 Applications and empirical study

Bousi∼Prolog enriched with IVFSs is well suited to making the query answering process more flexible, given the interval-valued fuzzy unification algorithm. But, in addition, there are several practical applications where our extension can be useful: advanced pattern matching; flexible deductive databases; knowledge-based systems; information retrieval, where textual information is selected or analyzed using an ontology; or approximate reasoning.

Generating PEs automatically using linguistic resources allows us to perform knowledge representation in a simpler and quicker way. In the case of big knowledge bases, an automatic procedure for dealing with vagueness is mandatory for addressing the problem within a reasonable time.

In order to perform an empirical comparison, we use the examples shown in [13], focusing on flexible deductive databases. The example in [13] assumes a database storing information on the films shown at some cinema of a specific neighborhood in the city of Los Angeles. The database consists of three tables (relations, represented by BPL facts) with a total of eight attributes. The film table has three attributes: title, director and category of the film. The theater table is characterized by the name, owner and location of the theater. The engagement table links the information stored in the first two tables and has two attributes: the title of the film and the name of the theater. The fuzzy component is defined by two proximity relations. The first one states the similarity between the different film categories (i.e., it is defined on the syntactic domain of film categories) and the second one states the closeness of two theater locations (i.e., it is defined on the syntactic domain of theater locations). In this example, both fuzzy relations are implemented explicitly by means of a set of proximity equations:

bervely_hills~downtown=0.3.

bervely_hills~santa_monica=0.45.

bervely_hills~hollywood=0.56.

bervely_hills~westwood=0.9.

downtown~hollywood=0.45.

downtown~santa_monica=0.23.

downtown~westwood=0.25.

hollywood~santa_monica=0.3.

hollywood~westwood=0.45.

santa_monica~westwood=0.9.

comedy~drama=0.6.

comedy~adventure=0.3.

comedy~suspense=0.3.

drama~adventure=0.6.

drama~suspense=0.6.

adventure~suspense=0.9.

Supposing that a user takes twenty seconds to write down each proximity equation, the 16 equations above would require 320 seconds.
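As an illustration of how such equations make a query flexible, the Python sketch below looks up which locations are close to a given one above a chosen threshold, using only the location equations listed above. The function names `closeness` and `close_to` and the threshold value 0.5 are assumptions made for this example; unrelated pairs are assumed to have degree 0.

```python
from typing import Dict, List, Tuple

# Location equations copied from the listing above (identifiers kept as-is).
EQUATIONS: Dict[Tuple[str, str], float] = {
    ("bervely_hills", "downtown"): 0.3,
    ("bervely_hills", "santa_monica"): 0.45,
    ("bervely_hills", "hollywood"): 0.56,
    ("bervely_hills", "westwood"): 0.9,
    ("downtown", "hollywood"): 0.45,
    ("downtown", "santa_monica"): 0.23,
    ("downtown", "westwood"): 0.25,
    ("hollywood", "santa_monica"): 0.3,
    ("hollywood", "westwood"): 0.45,
    ("santa_monica", "westwood"): 0.9,
}


def closeness(a: str, b: str) -> float:
    """Reflexive, symmetric lookup; unrelated pairs get 0.0 (assumption)."""
    if a == b:
        return 1.0
    return EQUATIONS.get((a, b), EQUATIONS.get((b, a), 0.0))


def close_to(location: str, threshold: float) -> List[str]:
    """Other locations whose closeness to `location` reaches the threshold."""
    places = {place for pair in EQUATIONS for place in pair}
    return sorted(
        p for p in places
        if p != location and closeness(p, location) >= threshold
    )


near_westwood = close_to("westwood", 0.5)
print(near_westwood)
```

A query about theaters in westwood could then also accept theaters located in any of the returned neighborhoods, each answer being qualified by the corresponding degree.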

Another important case arises when an ontology is employed to query a database [8]. For example, assume a fragment of a database that stores information about people and their jobs. We want to know who is middle-aged and likes science. Suppose a relational table named “person” (see Table 1), where a person is defined by a key, name, age and job.

In this example, the attribute Age is associated with the linguistic variable age, and the attribute Profession is associated with an ontology, WordNet in our case. From it, we obtain the following interval-valued proximity equations:

programmer~computer_programmer=[1.0,1.0]

programmer~coder=[1.0,1.0]

programmer~software_engineer=[1.0,1.0]

teacher~instructor=[1.0,1.0]

engineer~applied_scientist=[1.0,1.0]

engineer~technologist=[1.0,1.0]

engineer~railroad_engineer=[0.1,0.4]

engineer~locomotive_engineer=[0.1,0.4]

engineer~engine_driver=[0.1,0.4]

engineer~organise=[0.1,0.2]

engineer~mastermind=[0.25,0.84]

engineer~direct=[0.07,0.13]

engineer~organize=[0.2,0.5]

engineer~orchestrate=[0.14,0.4]

Supposing that a user takes twenty seconds to write down each proximity equation, the 14 equations above would require 280 seconds.

The empirical study is simple and illustrative: it takes into account the PEs employed in a knowledge representation task. The above examples support our initial claim. It is hard for an expert to model all the semantic knowledge involved in a particular problem. Moreover, even if the expert could do it, the task would take too much time to complete. The connection to a thesaurus allows us to automate this step, making the task easier for the expert and reducing the time required.

7 Conclusions and future work

In this paper, within the PLP framework, we have proposed a model of logic programming for Bousi~Prolog in which the PEs are automatically calculated using semantic metrics based on linguistic resources such as WordNet. This allows us to calculate the PEs independently of the designer's background, relying instead on natural language semantics. As illustrated by the examples, we can generate a significant number of PEs for inferring acceptable conclusions in the presence of the intrinsic vagueness of natural language. One of the most important features of our approach is its ability to compile all the information regarding the thesaurus defined in a program.

It is relevant to note that the use of IVFSs allows us to work with several semantic similarity metrics simultaneously, which are automatically computed. This spares the programmer from handling them manually. Lastly, it is also worth noting that our framework allows the system to automatically acquire knowledge that is not initially incorporated in the original knowledge bases (e.g., general rules have been generated automatically using WordNet, not manually item by item).

As future work, we propose to enhance the inference patterns by introducing reasoning schemes other than deductive ones and by considering other linguistic relationships available in WordNet, such as meronyms, sister terms, etc., as well as applying techniques to improve the behaviour of WordNet as an ontology. We also propose to explore the field of semantic metrics, designing an experiment to determine which one provides the best results in PLP.

Footnotes

1. In Section 6, some of these applications are explained in more detail and illustrated by examples.

2. In this example, the values of the PEs are manually defined by the designer of the system.

3. Here, the symbol “≈” means that the arguments in ɛ₁₁ and ɛ₁₂ can be made equal by proximity with an interval-valued fuzzy degree; that is, a substitution σ can be computed such that ℛ(ɛ₁₁σ, ɛ₁₂σ) > [0, 0].

4. We chose these measures only to illustrate how our proposal works; other measures can be used. An experimental study of which measure is best for this framework is beyond the scope of this paper and is left as future work.

5. Available at:

6. Note that we use $𝒮_{𝒱_{Π}^{𝒯_{λ}}}(w)$ to indicate that a threshold λ ∈ L([0, 1]) is employed to compute the synonyms of a symbol w.

7. A beta version of this incorporation can be found at the URL:

Acknowledgments

Thanks to the anonymous referees for their valuable comments. This work has been done in collaboration with the research group SOMOS (SOftware-MOdelling-Science) funded by the Research Agency and the Graduate School of Management of the Bío-Bío University under grant 130415 GI/EF, by the European Regional Development Fund (ERDF/FEDER) under the projects CN2012/151 and GRC2014/030 of the Galician Ministry of Education, the Postdoctoral Training Grants funded by the Galician Ministry of Education (2016), the program Becas Iberoamérica, Jóvenes Profesores e Investigadores, Santander Universidades (España, 2014) funded by Grants Santander Universities and the Spanish Ministry for Economy and Competitiveness under the grants TIN2016-76843-C4-2-R and TIN2014-56633-C3-1-R.

References

1. Bobillo, Gómez-Romero and León-Araúz, Fuzzy Ontologies for Specialized Knowledge Representation in WordNet, in: Proceedings of 14th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU2012), vol. I, 2012, pp. 430–439.

2. León-Araúz, Gómez-Romero and Bobillo, A Fuzzy Ontology Extension of WordNet and EuroWordnet for Specialized Knowledge, in: Proceedings of the Terminology and Knowledge Engineering Conference 2012 (TKE-2012), 2012, pp. 139–154.

3. Davis and Morgenstern, Introduction: Progress in formal commonsense reasoning, Artificial Intelligence 153 (2004), 1–12.

4. Levesque, Pirri and Reiter, Foundations for the situation calculus, Electronic Transactions on Artificial Intelligence 2(34) (1998), 159–178.

5. McCarthy J.L., Applications of circumscription to formalizing common sense knowledge, Artificial Intelligence 28 (1984), 89–116.

6. Fernández-Lanza, Prolog for automatic processing of synonymy, Logica Trianguli 4 (2000), 25–34.

7. Minsky, The emotion machine: Commonsense thinking, artificial intelligence and the future of human mind, Simon and Schuster, 2006.

8. Martínez-Cruz et al., Flexible queries on relational databases using fuzzy logic and ontologies, Information Sciences 366 (2016), 150–164.

9. Lenat D.B. and Guha R.V., Building large knowledge-based systems; representation and inference in the Cyc project, Addison-Wesley Longman Publishing Co., Inc., 1989.

10. Miller G.A., WordNet: A lexical database for English, Communications of the ACM 38 (1995), 39–41.

11. Liu and Singh, Commonsense Reasoning in and over Natural Language, Proc. of the 8th International Conference on Knowledge-Based Intelligent Information and Engineering Systems, 2004, pp. 293–306.

12. Rubio-Manzano and Julián-Iranzo, Reasoning with words: A first approximation, Proceedings of the 2014 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2014), 2014, pp. 569–574.

13. Rubio-Manzano and Julián-Iranzo, A fuzzy linguistic prolog and its applications, Journal of Intelligent and Fuzzy Systems 26(3) (2014), 1503–1516.

14. Newell, The knowledge level, Artificial Intelligence 18(1) (1982), 87–127.

15. Sessa M.I., Approximate reasoning by similarity-based SLD resolution, Theoretical Computer Science 275 (2002), 389–426.

16. Julián-Iranzo and Rubio-Manzano, A sound semantics for a similarity-based logic programming language, in: International Work-Conference on Artificial Neural Networks, Springer Berlin Heidelberg, 2011, pp. 421–428.

17. Fontana and Formato, Likelog: A logic programming language for flexible data retrieval, Proc. of the ACM SAC, 1999, pp. 260–267.

18. Julián-Iranzo et al., A Fuzzy Logic Programming Environment for Managing Similarity and Truth Degrees, arXiv preprint arXiv:1501.02034.

19. Loia, Senatore and Sessa M.I., Similarity-based SLD resolution and its role for web knowledge discovery, Fuzzy Sets and Systems 144(1) (2004), 151–171.

20. Rodriguez-Artalejo and Romero-Diaz, A declarative semantics for CLP with qualification and proximity, Theory and Practice of Logic Programming 10(4-6) (2010), 627–642.

21. Sobrino, The Role of Synonymy and Antonymy in ’Natural’ Fuzzy Prolog, in: Soft Computing in Humanities and Social Sciences, Seising Rudolf, Sanz Veronica, eds., Springer, 2012, pp. 209–236.

22. Hadj Taieb M.A., Ben Aouicha and Ben Hamadou, Ontology-based approach for measuring semantic similarity, Engineering Applications of Artificial Intelligence 36 (2014), 238–261.

23. Rada, Mili, Bicknell and Blettner, Development and application of a metric on semantic nets, IEEE Transactions on Systems, Man and Cybernetics 19 (1989), 17–30.

24. Wu and Palmer, Verbs semantics and lexical selection, in: Proceedings of the 32nd Annual Meeting on Association for Computational Linguistics (ACL ’94), Association for Computational Linguistics, Stroudsburg, USA, 1994, pp. 133–138.

25. Tversky, Features of similarity, Psychological Review 84 (1977), 327–352.

26. Hirst and St-Onge, Lexical chains as representations of context for the detection and correction of malapropisms, WordNet: An electronic lexical database, 1998, pp. 305–332.

27. Bustince, Interval-valued fuzzy sets in soft computing, International Journal of Computational Intelligence Systems 3(2) (2010), 215–222.

28. Julián-Iranzo and Rubio-Manzano, A similarity-based WAM for Bousi∼Prolog, Lecture Notes in Computer Science, 5517, Springer, Heidelberg, 2009, pp. 245–252.

29. Witzig, Accessing WordNet from Prolog, Artificial Intelligence Centre, University of Georgia, 2003, pp. 1–18.

30. Schallehn et al., Efficient similarity-based operations for data integration, Data and Knowledge Engineering 48(3) (2004), 361–387.