Abstract
Proximity-based Logic Programming is a formal framework for representing general or non-specialized knowledge. Although it is a powerful tool, it is too complex because the values of the proximity equations (fuzzy binary relations that establish the relationships among the symbols of a first-order language) must be manually defined by the designer of the system. In this paper, we propose a new framework for Proximity-based Logic Programming enhanced with WordNet and Interval-Valued Fuzzy Sets. Its main contribution is to compile automatically the information provided by WordNet and generate an interval-valued proximity relation on the set of their words. This proposal is completely integrated inside the unification mechanism of
Keywords
Introduction
One of the most relevant sources of human awareness is general or non-specialized knowledge [3]. Consequently, the modelling of this type of information is one of the main challenges in the field of Artificial Intelligence. However, this is a complex effort given that vagueness and uncertainty are essential in this type of knowledge, and both phenomena must be adequately addressed in order to develop an efficient computational representation.
In the literature, we distinguish two main approaches to encode it. For the first one, it is assumed that common-sense is analogous to any other type of knowledge, such as a mathematical one, and consequently, it can be described by a formal framework. Thus, some proposals assuming this frame are Situation Calculus [4], Circumscription [5] or Multi-agent Cognitive Architecture [7]. In the second approach, by its hand, it is assumed that this type of human knowledge is too complex for being represented using mathematical logic, and therefore, formal approaches are not valid. Instead, it is induced from a large collection of facts (knowledge bases) and the relationships among them. Some examples of this frame are Cyc Project [9], WordNet [10] or Concept Net [11].
In this paper, we assume the second framework because it allows us to incorporate the field of lexical knowledge, a source of non-specialized knowledge, which proposes the use of linguistic information about words and relationships between them to perform valid inferences. However, developing a computational approach without a formal framework is a difficult task. Thus, for that reason, we propose to use the frame of Proximity-based Logic Programming (PLP) [13], which enhances logic programming with two important capabilities: (a) the capability to specify the meaning of words drawn from natural language using the named Proximity Equations (PE); and, (b) the ability to reason and compute with words and clauses by means of a unification algorithm based on proximity, performing a sort of approximate reasoning.
A challenge in PLP is the management of linguistic knowledge, because this involves dealing with vagueness and word synonymy. Linguistic vagueness has been addressed in the literature by means of PEs, but they are mostly defined for a specific domain [17–20], being the designer of the system who fixes the values of these equations manually, which makes harder to use PLP systems in real applications. On the other hand, synonymy has been analyzed with the development of semantic similarity measures. In the literature, we can find different families and measures which provide us different results, but there is not a predominant one [22], which leads us to consider assessing synonym between words as a high uncertainty field. From our point of view, PEs also can be defined for non-specific domains whether the definitions provided by the designer are obtained automatically from a general ontology. Using this approach, it is not necessary to establish the set of PEs associated to the PLP system but these can be generated in an automatic way, even also vague expressions.
On the other hand, enhancing knowledge representation with lexical knowledge and linguistic resources allows us: i) adding new specific information for more or less specialized domains, ii) inferring new knowledge from the same set of facts and rules, iii) using the power of lexical knowledge for applications as text cataloging, natural queries on databases, computing with words and perceptions, fuzzy experts systems, semantic web or approximate reasoning 1 .
The seminal ideas of this paper are described in [6, 21], where a research line to get a ‘natural’ fuzzy linguistic Prolog system is initiated. Requisites defined to this system have the capability of working together with linguistic sources (e.g., electronic linguistic dictionaries) and the resolution rule and unification mechanism also should be treated with linguistic relations (e.g., synonymy or antonym). This paper is also a contribution on this matter, given we propose to use WordNet in the formalization of a PLP system; in other words, we enhance the inference mechanism of Prolog system by using lexical resources.
In the literature, there is an implementation available for using WordNet in Prolog [29]. This is formed by files containing the WordNet database in a Prolog-readable format, but a Prolog interface to WordNet is not implemented. Since the Prolog database is very large and may take many minutes to load into the Prolog workspace, a separate file has been created for each WordNet relation. This requires from the user the ability to load only those parts of the database that they are interested in in each case and, hence, a sort of advanced knowledge. In contrast with this proposal, our frame allows a programmer to create an interface with WordNet in a clear and transparent way, avoiding any built-in command; therefore, the programmer does not need to be an expert on WordNet or on the management of vagueness. Moreover, as knowledge provided by WordNet is generated at compilation time, our approach is a useful alternative of incorporating WordNet into a Prolog system based on some type of weak unification.
Our main aim in this paper is to deal with linguistic knowledge from a holistic view, which allows us to address the weaknesses explained above. Thus, PEs are automatically obtained from a general ontology such us WordNet [10], which allows us to automatize also the task of knowledge representation and to avoid the manually definition by the designer, reducing the time for performing the system. Since synonymy is one of the most typical cases that are modeled by PEs, we propose to use semantic similarity measures [22] for assessing it. As we said, this is a field with a high degree of uncertainty and, for that reason, we shall apply Interval-Valued Fuzzy Relations (IVFRs) [27] for modeling the results of the PEs. IVFRs are an extension of standard Fuzzy Sets which allows us to improve the expressive power of the system in two interesting points: i) we can define a lower and an upper bound of similarity degree according to different metrics; and, ii) we can expand or narrow the use interval of linguistics concepts according to the use context. This paper contains the general procedure for generating automatically and independently of a designer the PEs of a knowledge-base of a Prolog system, which are expressed using IVFRs. It enhances and formalizes the approach presented in [12] and it is also a contribution to the development of a general PLP framework connected with lexical resources.
The remainder of this paper is organized as follows: in Section 2 we describe the main concepts used in our framework; in Section 3 and 4, our proposal of lexical reasoning, combining PLP with semantic similarity measures, is introduced; Section 5 describes the details of the implementation of our proposal; Section 6 describes some examples of possible applications of our frame; and, finally, Section 7 summarizes the main conclusions of this paper as well as some proposal for future work.
Preliminary concepts
In this section, we introduce the main concepts used in the definition of our framework.
Proximity equations
PEs are in the core of the PLP frame, since they are used for defining linguistic similarity relationships inside the knowledge base. From a syntactic point of view, a PE “a ∼ b = α” defines an entry ℛ (a, b) = α of a relation ℛ establishing that a is close to b with a degree α ∈ [0, 1]. Thus, PEs play an essential role in the unification step; i.e., Proximity-based Unification Algorithm. Roughly speaking, this algorithm states that two terms f (t1, …, t
n
) and g (s1, …, s
n
) weakly unify if the root symbols f and g are close, with a certain degree, and each of their arguments t
i
and s
i
matches in a soft sense. For Example 1, we illustrate the behaviour of
In a standard Prolog program, if we ask about healthy people
Now, the BPL program allows us to get the answers: “
Source for lexical knowledge: WordNet
WordNet [10] is an electronic English thesaurus. The basic elements in its knowledge base are words, word senses and synsets (i.e., the set of cognitive synonyms). It is worth noting that the same word has associated different senses and different part-of-speech and, consequently, it appears with different meanings in different pragmatic contexts. Each synset is described by a gloss, which is a brief textual description of the meaning of a synset. The lexical and semantic relations included in WordNet, such as hypernym, hyponym, antonymy, etc., are among synsets, not among words.
In this paper, we focus on the use of WordNet as an ontology. It has been used in different applications both for general/natural language tasks [22], but its behaviour in specialized discourses show significant limitations [1]. In order to address this problem, an extension of WordNet using fuzzy ontologies is proposed because it provides a better frame for the representation of lexical and semantic relationships [1, 2].
Interval-valued fuzzy sets
IVFSs are a fuzzy formalism based on two membership mappings instead of a single one. They are called, analogously to ordinary fuzzy sets, the lower membership function and the upper membership function. Both are established on a universe of discourse 𝒳, and map each element from 𝒳 to a real number in the [0, 1] interval.
With respect to the name of this kind of fuzzy sets, Interval-valued, values of and , computed for any x ∈ 𝒳 are the lower and upper bounds of the interval number and it is the membership degree for x to the set A. That interval is included in [0, 1] and closed at both ends.
Some arithmetic operations on interval-numbers have been recalled since they are useful in operating on cardinalities of IVFSs. Let a =[, ā], b =[, ] be intervals in R, and r∈ R +. The arithmetic operations ‘+’, ‘-’, ‘·’ and power are defined as follows: [, ā] + [, ] = [ + , ā + ] [, ā] - [, ] = [ - , ā - ] [, ā] · [, ] = [min ( · , · , ā · , ā · ) , max ( · , · , ā · , ā · )] ([, ā])
r
= [, ā
r
] for non-negative ā,
The operations of union and intersection for IVFSs are defined by triangular norms. Let A,B be IVFSs in 𝒳, t a t-norm and s a t-conorm. The union of A and B is the interval-valued fuzzy set A ∪ B with the membership function.
Thus, de Morgan’s laws for IVFSs A,B in 𝒳 are
L = {[x1, x2] ∈ [0, 1] 2 with x1 ≤ x2}
[x1, x2] ≤ L [y1, y2] iff x1 ≤ y1 and x2 ≤ y2
Also by definition:
[x1, x2] < L [y1, y2] ⇔ x1 < y1, x2 ≤ y2 or x1 ≤ y1, x2 < y2
[x1, x2] = L [y1, y2] ⇔ x1 = y1, x2 = y2
0 L = [0, 0] and 1 L = [1, 1] are the smallest and the greatest elements in L.
Since interval-valued fuzzy relations are interval-valued fuzzy subsets they have [, ]-cuts.
A binary interval-valued fuzzy relation on a set U is an interval-valued fuzzy subset on U × U (that is, a mapping U × U ⟶ ([0, 1])).
Any extension of a logic programming language requires keeping a balance between two levels [14]: i) the knowledge level, where the facts and the behaviour of a particular world are described; ii) the symbolic level, where symbols are used for representing the knowledge level. Thus, considering deductive reasoning, the symbolic level can simulate reasoning and inferring new knowledge.
In the case of a logic programming language enriched with linguistic resources and IVFSs, it is also necessary to distinguish between crisp and uncertain knowledge. Uncertain knowledge is not usually present at knowledge level, but it plays an important role in the inference process and, therefore, it must be considered by the inference mechanism. From the point of view of the design of a programming language, this is crucial because the user/programmer must not handle manually the definition of fuzzy sets; in other words, the system should compute automatically all type of fuzzy sets. Thus, assuming these ideas, we separate the representation of precise and uncertain knowledge, which will improve the user’s experience. To this aim, we employ Interval-Valued Fuzzy Relations (IVFR).
Interval-valued fuzzy relations in PLP
A binary IVFR ℛ is a proximity relation if it fulfils the reflexive and symmetric properties. In standard PLP, different syntactic symbols represent proximity information. Generalizing this idea presented in [16] and introduced in [15], an IVFR ℛ is defined on the alphabet of a first order language. This makes possible to treat as indistinguishable two syntactic symbols which are related by ℛ with a certain degree greater than zero.
The IVFR ℛ on the alphabet of a first order language can be extended to terms and atomic formulas by structural induction in the usual way [15]: Let f and g be two n-ary function symbols and let t1, …, t
n
, s1, …, s
n
be terms. ℛ (f (t1, …, t
n
) , ; Let p and q be two n-ary predicate symbols and let t1, …, t
n
, s1, …, s
n
be terms. ℛ (p (t1, …, t
n
) , ; Let C = A0←A1, …, A
n
and be two Horn clauses. If n = m, ;
∧ represents any t-norm acting on interval-valued approximation degrees. Otherwise, the approximation degree of two expressions is zero.
Although substitutions can be “fuzzified” in some respect, in this paper we use the classical notion of a substitution disregarding other possible alternatives in our setting. A substitution σ is a mapping from the set of variables 𝒳 to the set of terms 𝒯 such that its domain Dom (σ) = {x ∈ 𝒳 ∣ xσ ≠ x} is finite. We frequently identify a substitution σ with the set {x/xσ ∣ x ∈ dom (σ)}. We denote the identity substitution by id. The restriction σ↾𝒱 of a substitution σ to a set 𝒱 of variables is defined by xσ↾𝒱 = xσ if x ∈ 𝒱 and xσ↾𝒱 = x if x ∉ 𝒱. We write σ = θ [𝒱] iff σ↾𝒱 = θ↾𝒱.
When we apply a substitution to two expressions (terms or atoms) which are similar w.r.t. a proximity relation, ℛ, and a cut value, [, ], their instances remain similar with the same approximation degree.
Also, in the context of a proximity relation, ℛ, we can define an analogous concept to the classical notion of variant substitutions. Roughly speaking, two substitutions are similar if they share the same domain and the terms of their respective ranges are pairwise similar w.r.t. ℛ and a cut value [, ].
The following define a weaker version of the notion of more general substitution.
Note that, the relation lap is a pre-order in the set of substitutions of a first order language [15, pag. 410].
The notion of proximity between substitutions can be characterized in terms of the weaker version of more general substitution just introduced in the last definition, which we name the weak most general unifier (wmgu).
Note that a result of soundness on the lattice [0, 1] has been proved in [16] and, since the proposal defined in this paper is an extension of it, those properties are preserved. Nevertheless, as future work, soundness and completeness will be also formally in proved this frame.
Weak unification algorithm based on interval-valued fuzzy relations
Now, we extend the weak unification algorithm presented in [12] with IVFRs on syntactic domains. It is formalized as a transition system supported on an interval-valued unification relation “⇒”. The unification of two expressions ɛ11 = f (t1, …, t
n
) and ɛ12 = g (s1, …, s
n
) is obtained by a state transformation sequence starting from an initial state 〈G, id, [, ], where G = {t1 ≈ s1, …, t
n
≈ s
n
} is a set of unification problems
3
, id is the identity substitution and [, ] = [1, 1] is the initial interval-valued degree:
Weak SLD resolution based on interval-valued fuzzy relations
Let Π be a set of Horn clauses and ℛ an interval-valued fuzzy relation on the alphabet of a first order language ℒ. Let Λ = {[, ] , . . . , [, ]} be the set of approximation levels of ℛ. We extend the Weak SLD (WSLD) resolution presented in [12]. Now, this is a transition system 〈E, ⇒WSLD〉 where E is a set of triples 〈𝒢, θ, [, ] (goal, substitution, interval-valued degree), that we call the state of a computation, and whose transitional relation ⇒WSLD ⊆ (E × E) is the smallest relation that satisfies:
𝒞 = (𝒜 ← 𝒬) ≪Π,
σ = wmgu (𝒜, 𝒜′) ≠ fail,
[, ] = ℛ (𝒜σ, 𝒜′σ)) ≥ [, ]
〈 (← 𝒜′, 𝒬′) , θ, [, ] ⇒WSLD 〈← (𝒬, 𝒬′) σ, θσ, [, ] ∧ [, ]
where 𝒬, 𝒬′ are conjunctions of atoms, the notation “𝒞≪Π” is representing that 𝒞 is a standardized apart clause in Π, and that the value [, ] is a cut value in Λ, which imposes a limit to the expansion of the search space in a computation. We say that the performed step is a step of level [, ] because the computed approximation degree is greater than or equal to [, ].
A WSLD derivation of level [, ] for Π ∪ {𝒢0} and ℛ is a sequence of steps of level [, ]:
Thus, the sequence of steps to achieve the answer is: X = dracula with [0.7, 0.9] is:〈good (X1) , X/X1, [1, 1] ⇒WSLD〉〈 ← book (X1, mystery) , {X/X1} , [1, 1] ⇒WSLD〉〈 □ , {X/X1, X1/dracula} , [0.7, 0.9].
Semantic proximity equations
As we said, WordNet can be interpreted as an ontology, given nouns and verbs are organized into hierarchies based on “is-a” relations. In the literature, these relationships have been used to obtain different resemblance semantic metrics, which can be classified into two main categories [22]: Semantic similarity measures: They are exclusively based on “is-a” relations. Some very well known examples are Path length [23] or Wu and Palmer [24]. Semantic relatedness measures: They include other linguistic relations (meronym, glosses, etc.) in addition to “is-a” relations. Some examples are Tversky [25] or Hirst and St-Onge [26].
As we said in the introduction, we propose to use general knowledge for enhancing PLP. Although logic programming requires formal expressions, we propose to induce their values from WordNet, a linguistic resource. In this paper, we shall use semantic similarity measures, which may be classified into five different groups [22]: Edge-counting measures: They are based on the minimum path length between two terms present in WordNet. Information content-based measures: These metrics introduce the concept of information content (IC); i.e., the more abstract a concept is, the less IC it has. There are two main ways for calculating the IC of a concept: i) statistical corpora analysis; and, ii) using the topological parameters of ‘is-a’ taxonomy. Featured-based measures: These measures are based on the concept of feature, being the similarity estimated according to the weighted sum of common and non-common characteristics. The main source for defining features in this approach is using glosses provided by WordNet or other dictionary. Gloss-based measures: These measures quantify the overlap between the glosses of two concepts with their semantic neighbours. Hybrid measures: These consist of the combination of different proposed methods.
Although we use WordNet, in general, any other thesaurus (𝒯) can be applied. Thus, a word w1 ∈ 𝒯 is related to a word w2 ∈ 𝒯 with a degree [, ], which is represented in
Thus, words and their relations become part of the first order language alphabet and take part naturally in the inference process. This requires an inference rule based on the semantic relation that has been used to obtain the set of PEs between words. More formally, we call program, Π, a set of first order horn clauses in which there is a set of symbols that we interpret as words. This program is associated with a thesaurus (𝒯) that we call vocabulary of Π and that are composed of all those symbols (predicate, function or constant) present in the program and that are defined in a thesaurus 𝒯.
Now, each one of the elements in , are provided by the thesaurus 𝒯 in order to obtain the PEs between the words.
The next step is to define the reasoning model, therefore, we shall define the inference mechanism corresponding to this frame, which is based on the semantic relations among words.
Given a program Π, a thesaurus 𝒯 and its associated vocabulary , then for all , the set of terms related with w extracted from 𝒯 is denoted as . Once the vocabulary of synonyms has been constructed, the proximity equations are created applying the corresponding semantic similarity measures. In this paper, we use two of the most simple measures from edge-counting category 4 : Path length’s [23] and Wu and Palmer’s [24] ones. Both metrics are calculated using the library called WS4J 5 . In addition, we introduce a as threshold in order to select only words with a high degree of similarity.
Therefore, as the system knows that “mary loves mountaineering”, now it is able to infer that “mountaineering is a passion of mary” or “mary enjoys mountaineering” by asking if “
Incorporation of interval-valued fuzzy sets and Wordnet into Bousi∼Prolog
In this section, we are going to explain how the model of reasoning detailed in the previous sections can be incorporated into the
Discussion about the incorporation of IVFs
The incorporation of the IVFSs into the BPL system requires two main steps: i) enhancing the compiler in order to allow a programmer to deal with interval-valued PEs defined above; ii) enhancing the abstract machine in order to introduce IVFSs inside the inference process. In particular, we have enhanced the Similarity-based WAM [28] for the execution of interval-valued
Discussion about the incorporation of WordNet
Introducing WordNet in
During the syntactic analysis phase, it is verified whether there are syntactic errors in the source of the program. At the same time, the syntactic tree, which is the basis for later code generation, is built. Additionally, in this phase, the directive “thesaurus” creates a connection with the thesaurus and generates the vocabulary of Π and 𝒯.
where: Thesaurus _ Name is the name of the thesaurus. [M1, …, M
N
] is the semantic metrics employed to generate the set of proximity equations. By default path or wup metrics are considered. Lambad _ Cut a threshold that indicates the minimum degree of semantic similarity. By default the value is zero ([0,0]) (that is, no threshold is imposed).
Once the connection has been established, the next step is the generation of PEs from the semantic relations indicated as arguments in the directive. We proceed as follows: i) we start from a program Π, a thesaurus 𝒯, a set of semantic metrics [M1, …, M N ] (for now Path Length and/or Wu-Palmer measures) and a λ-cut ; ii) the interval-valued fuzzy relation is defined and initialized: ℛ : =∅; iii) the vocabulary is computed; iv) for each w we compute compute interval-valued proximity equations 𝒮: ℛ : = ℛ ∪ {ℛ (w, s i ) = [, ]}; iv) finally, a set of entries which defines a semantic proximity relation ℛ is returned.
The algorithm allows us to generate the equations shown in Example 4. Therefore, it could be asked if “
It is worth noting that general rules have been developed automatically using WordNet; not manually item by item. On the other hand, our reasoning schema is based on it. However, it is also relevant to note that this way to proceed is not free of inefficiencies and problems because all the semantic relations of a given word are included in the program. An immediate solution is to apply automatic techniques in order to prune and improve ontologies, but this task is out of the aim of this paper and it will be proposed as future work.
Applications and empirical study
Bousi∼Prolog enriched with IVFSs is well suited to making the query answering process more flexible, given the interval-valued fuzzy unification algorithm. But, in addition, there are several practical applications where our extension can be useful: advanced pattern matching; flexible deductive databases; knowledge-based systems; information retrieval, where textual information is selected or analyzed using an ontology; or approximate reasoning.
Generating PEs automatically using linguistic resources allows us to perform knowledge representation in a simple and quicker way. In the case of big knowledge bases, an automatic procedure for dealing with vagueness is mandatory for addressing the problem within a reasonable time.
In order to perform an empirical comparative, we are going to use the examples shown in [13]. We focus on flexible deductive databases. Here, the fuzzy component is defined by two proximity relations. In the example shown in [13], it is assumed a database storing information on films which are shown at some cinema of a specific neighborhood in Los Angeles city. The database consists of three tables (relations, represented by BPL facts) with a total of eight attributes. The
Supposing that a user writes down each proximity equation in twenty seconds, the user would need 320 seconds.
Another important case is when an ontology is employed to the querying database [8]. For example, assume a fragment of a database that stores information about people and their jobs. We want to know who is middle-aged and likes science. Suppose a relational table named “person” (see Table 1), where a person is defined by a key, his name, age and job.
In this example the attribute Age is associated to the linguistic variable age, the attribute Profession is associated to an ontology, WordNet in our case. From it, we obtain the following interval-valued proximity equations
Supposing that a user writes down each proximity equation in twenty seconds, the user would need 280 seconds.
The empirical study is simple, illustrative. We must take into account the PEs employed in a knowledge representation task. The above examples support our initial affirmation. It is hard for an expert to model all the semantic knowledge involved in a particular problem. However, although the expert could do it, this is a task which needs too much time to be completed. The connection to a thesaurus allows us to automatize this step, making the task easier for the expert and reducing time.
Conclusions and future work
In this paper, framed into a PLP framework, we have proposed a model of logic programming,
It is relevant to note that the use of IVFSs allows us to work with several semantic similarity metrics simultaneously, which are automatically computed. This avoid to the programmer handle them manually. Lastly, it is also worth noting that our framework allows the system to automatically acquire knowledge which is not initially incorporated in the original knowledge bases (e.g., general rules have been developed automatically using WordNet; not manually item by item).
As future work, we propose to enhance the inference patterns introducing new reasoning schemes different to deductive ones, and considering other linguistic relationships involved in WordNet, such as meronym, sister terms, etc. as well as applying techniques to improve the behaviour of WordNet as an ontology. On the other hand, we also propose to explore the field of semantic metrics, designing an experiment in order to know which one provides us with the best results in PLP.
Footnotes
In Section 6, some of these applications are explained with more detail and illustrated by examples.
In this example, the values of PEs, are manually defined by the designer of the system.
Here, the symbol “≈” means that the arguments in ɛ11 and ɛ12 are capable to be equals by proximity using an interval-valued fuzzy degree, that is, a substitution σ can be computed such that ℛ (ɛ11σ, ɛ12σ) > [0, 0].
We choose these measures only for illustrating how our proposal works and other measures can be used. It is out of the scope of this paper to perform an experimental study about which is the best measure to this framework; it is a task to be addressed as future work.
Note that, we use for indicating that a threshold λ ∈ L ([0, 1]) is employed to compute the synonyms of a symbol w.
Acknowledgments
Thanks to the anonymous referees for their valuable comments. This work has been done in collaboration with the research group SOMOS (SOftware-MOdelling-Science) funded by the Research Agency and the Graduate School of Management of the Bío-Bío University under grant 130415 GI/EF, by the European Regional Development Fund (ERDF/FEDER) under the projects CN2012/151 and GRC2014/030 of the Galician Ministry of Education, the Postdoctoral Training Grants funded by the Galician Ministry of Education (2016), the program Becas Iberoamérica, Jóvenes Profesores e Investigadores, Santander Universidades (España, 2014) funded by Grants Santander Universities and the Spanish Ministry for Economy and Competitiveness under the grants TIN2016-76843-C4-2-R and TIN2014-56633-C3-1-R.
