Abstract
RDF (Resource Description Framework) and RDF Schema (RDF(S)) are the normative languages for representing information on the Web. Extracting RDF(S) from existing data sources is becoming an important research issue. However, imprecise and uncertain information exists in many real-world applications, and fuzzy data modeling has therefore been extensively investigated in various data models. In particular, the fuzzy Entity-Relationship (ER) model has been widely applied to fuzzy data modeling in many application domains. How to extract RDF(S) from existing fuzzy ER models is therefore an important problem. In this paper, we propose a formal approach for extracting RDF(S) from fuzzy ER models and implement a prototype tool. We first give formal definitions of fuzzy ER models and RDF(S). We then propose a formal approach for extracting RDF(S) from fuzzy ER models and provide an extraction example. Finally, based on the proposed approach, we implement a prototype extraction tool, and experiments show that the approach and the tool are feasible.
Introduction
The Semantic Web is an extension of the current Web in which Web resources are given computer-understandable semantics [25]. RDF (Resource Description Framework) and its vocabulary description language RDF Schema (RDF(S)) are the normative languages recommended by the W3C for describing Web resources and their semantics [3, 21]. Their formal acceptance by the W3C has stimulated their use in many areas. However, most information resources are not yet available in RDF(S) format. Therefore, how to construct RDF(S) from existing resources is an important problem in the context of the Semantic Web.
Several approaches have been proposed for extracting RDF(S) from various data sources. The transformation of XML documents into RDF data was investigated in [2, 18]. Approaches for extracting RDF(S) from databases were proposed in [8, 27]. In addition, a proposal for building RDF from semi-structured legal documents was presented in [6], and rules for constructing RDF from spreadsheets were proposed in [14].
Among these approaches, constructing RDF(S) by acquiring information from data models has attracted increasing attention, because much domain information is still modeled with data models (e.g., the ER model). The entity-relationship (ER) model was introduced by P.P. Chen [17] and has played a crucial role in database design and information systems analysis. The ER model, using varying notations and with some semantic variations, enjoys remarkable and increasing popularity in the research community, the computer science curriculum, and industry. In step with the increasing diffusion of relational platforms, ER modeling is growing in popularity [4, 24]. Therefore, in order to use the information of an ER model in a semantic context, it has to be mapped into RDF(S), the data format of the Semantic Web. In recent years, several proposals have been presented to establish the relationships between data models and RDF(S) [12, 26]. The relationships between RDF Schema and the data model UML (Unified Modeling Language) were briefly discussed in [19, 26]. Moreover, how to convert RDF Schema into the ER and UML data models was briefly discussed in [12, 15]; that direction is the opposite of ours, since in this paper we aim at extracting RDF(S) from fuzzy data models.
In real-world applications, information is often vague or ambiguous. This is especially true in data-intensive areas such as information systems and databases [7]. To model the fuzziness in real-world applications, fuzzy set theory [13] was first applied to some of the basic ER concepts in [1]. Since then, many efforts have been made to extend the ER model with fuzzy features [9, 23] (see [30] for a survey). To the best of our knowledge, there are so far no comprehensive reports on the extraction of RDF(S) from fuzzy ER models.
Based on the observations above, this paper proposes a formal approach for extracting RDF(S) from fuzzy ER models and implements a prototype tool. The main contributions are as follows: (1) we propose a formal approach, in the form of a set of extraction rules, for extracting RDF(S) from fuzzy ER models; (2) we provide an extraction example and discuss the approach; (3) we implement a prototype tool, give the extraction algorithm, and carry out experiments. The results show that the approach and the tool are feasible.
The remainder of this paper is organized as follows. Section 2 recalls formal definitions of fuzzy ER models and RDF(S). Section 3 proposes an approach for extracting RDF(S) from fuzzy ER models, provides an example, and discusses the approach. Section 4 describes the implementation of a prototype tool. Section 5 presents conclusions and future work.
Formalizations of RDF(S) and fuzzy ER models
In this section, we recall formal definitions of RDF(S) and fuzzy ER models, which will help establish the correspondences between them.
Formal definition of RDF(S)
Resource Description Framework (RDF) [21] is a framework for expressing the Web resource information. The basic idea of RDF is: anything is called “
Further, the RDF Vocabulary Description Language—RDF Schema, uses the notion of “
Formal definition of fuzzy ER models
Zvieli and Chen [1] first applied fuzzy set theory to some of the basic ER concepts, introducing fuzzy entities, fuzzy relationships, and fuzzy attributes to model fuzziness in entity and relationship occurrences and in attribute values. In the following, we recall a fragment of the formal definition of fuzzy ER models. We first define labeled tuples: for two finite sets X and Y, we call a function from a subset of X to Y an X-labeled tuple over Y, and we write T(X, Y) for the set of all X-labeled tuples over Y. The labeled tuple T that maps xi ∈ X to yi ∈ Y (i ∈ {1, …, k}) is denoted by [x1 : y1, …, xk : yk], and we also write T[xi] to denote yi [17].
A fuzzy ER model is a tuple ɛ = (LS, ≤S, attE, attR, relS), where: LS = ES ∪ AS ∪ US ∪ RS ∪ DS is a finite alphabet, in which ES is a set of fuzzy entity symbols, AS is a set of fuzzy attribute symbols, US is a set of role symbols, RS is a set of fuzzy relationship symbols, and DS is a set of domain symbols; ≤S ⊆ ES × ES is a binary relation over ES that denotes the inheritance relation (i.e., ISA) between two fuzzy entities; attE: ES → T(AS, DS) is a function that maps each fuzzy entity symbol in ES to an AS-labeled tuple over DS; attR: RS → T(AS, DS) is a function that maps each fuzzy relationship symbol in RS to an AS-labeled tuple over DS; relS: RS → T(US, ES) is a function that maps each fuzzy relationship symbol in RS to a US-labeled tuple over ES. Without loss of generality, we assume that each role is specific to exactly one relationship, and that for each role U ∈ US there is a fuzzy relationship R and a fuzzy entity E such that relS(R) = […, U : E, …].
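To make the definition concrete, the following is a minimal sketch of how a fuzzy ER model could be held in memory. It is an illustration only, not the data structure used by the authors: the class name FuzzyERModel, its field names, and the check method are assumptions introduced here. Labeled tuples are modeled as dictionaries, and check tests the side conditions stated above (≤S is over ES, attE and attR map into AS-labeled tuples over DS, relS maps into US-labeled tuples over ES, and each role belongs to exactly one relationship).

```python
from dataclasses import dataclass, field


@dataclass
class FuzzyERModel:
    """Hypothetical in-memory form of (LS, <=S, attE, attR, relS)."""
    entities: set        # ES
    attributes: set      # AS
    roles: set           # US
    relationships: set   # RS
    domains: set         # DS
    isa: set = field(default_factory=set)      # <=S as (sub, super) pairs
    att_e: dict = field(default_factory=dict)  # entity -> {attribute: domain}
    att_r: dict = field(default_factory=dict)  # relationship -> {attribute: domain}
    rel: dict = field(default_factory=dict)    # relationship -> {role: entity}

    def check(self):
        """Return a list of violated side conditions (empty if none)."""
        problems = []
        for sub, sup in sorted(self.isa):
            if sub not in self.entities or sup not in self.entities:
                problems.append(f"ISA pair ({sub}, {sup}) is not over ES")
        for name, fn, owners in (("attE", self.att_e, self.entities),
                                 ("attR", self.att_r, self.relationships)):
            for symbol, tup in fn.items():
                if symbol not in owners:
                    problems.append(f"{name} is defined on unknown symbol {symbol}")
                for label, dom in tup.items():
                    if label not in self.attributes or dom not in self.domains:
                        problems.append(f"{name}({symbol}) has bad entry {label}: {dom}")
        for relationship, tup in self.rel.items():
            if relationship not in self.relationships:
                problems.append(f"relS is defined on unknown symbol {relationship}")
            for role, entity in tup.items():
                if role not in self.roles or entity not in self.entities:
                    problems.append(f"relS({relationship}) has bad entry {role}: {entity}")
        # Each role must be specific to exactly one relationship.
        for role in sorted(self.roles):
            owners = [r for r, tup in self.rel.items() if role in tup]
            if len(owners) != 1:
                problems.append(f"role {role} occurs in {len(owners)} relationships")
        return problems


# Symbols taken from the examples used later in the paper.
model = FuzzyERModel(
    entities={"YoungPerson", "Lecturer", "Course", "Student", "Postgraduate_Student"},
    attributes={"Name", "FUZZYAge", "u"},
    roles={"tea", "tea_by"},
    relationships={"Teach"},
    domains={"string", "integer", "real"},
    isa={("Postgraduate_Student", "Student")},
    att_e={"YoungPerson": {"Name": "string", "FUZZYAge": "integer", "u": "real"}},
    rel={"Teach": {"tea": "Lecturer", "tea_by": "Course"}},
)
print(model.check())
```

Using dictionaries for labeled tuples keeps the notation T[xi] directly available as ordinary indexing, e.g. model.rel["Teach"]["tea"].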
Extracting RDF(S) from fuzzy ER models
On the basis of the formalizations of RDF(S) and fuzzy ER models in Section 2, in this section, we propose a formal approach for extracting RDF(S) from fuzzy ER models. Also, we provide an example, and make a discussion about the approach.
Extraction approach
By comparing and analyzing the characteristics of fuzzy ER models and RDF(S), we first briefly summarize the correspondences between them to give readers an initial understanding of the process. In general, the fuzzy entities and attributes in a fuzzy ER model can be directly represented by rdfs:Class and rdf:Property in RDF(S); the roles in a fuzzy ER model correspond to rdf:Property in RDF(S), and thus the fuzzy relationships in a fuzzy ER model have to be transformed into rdfs:Class in RDF(S); the inheritance relation between two fuzzy entities in a fuzzy ER model can be represented by rdfs:subClassOf in RDF(S); and the functions in a fuzzy ER model can be represented by RDF(S) triple statements.
In the following we propose several rules of extracting RDF(S) from fuzzy ER models. Let ɛ = (LS, ≤ S, attE, attR, relS) be a fuzzy ER model, the corresponding RDF(S)
Comments: In a fuzzy ER model, an instance may belong to a fuzzy entity with a membership degree in [0, 1]. A special attribute u ∈ [0, 1] is introduced into the fuzzy entity to represent this membership degree. The special attribute u is represented by an RDF(S) property φ(u), as shown in Rule 2.
The triple statement states that the domain of the RDF(S) property φ(A) is φ(E) and that its range is φ(D).
For example, given a fuzzy entity YoungPerson ∈ ES with attributes attE(YoungPerson) = [Name: string, FUZZYAge: integer, u: real] in a fuzzy ER model, the following RDF(S) class, properties, and triple statements are created according to rules 1–3 and 6 above:
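The listing for this example is not reproduced in the present text. As a rough illustration of the kind of output described, the sketch below generates triples for YoungPerson under stated assumptions: the entity becomes an rdfs:Class, each attribute (including the membership attribute u) becomes an rdf:Property, and each property receives an rdfs:domain and rdfs:range triple. The naming function phi with its "ex:" prefix and the XSD lookup table are hypothetical choices made here, not the paper's definition of φ.

```python
# Assumed lookup from ER domain symbols to XSD datatypes (not given in the text).
XSD = {"string": "xsd:string", "integer": "xsd:integer", "real": "xsd:double"}


def phi(symbol):
    """Hypothetical resource-naming function: prefix the ER symbol with 'ex:'."""
    return "ex:" + symbol


def extract_entity(entity, att_tuple):
    """Return RDF(S) triples for one fuzzy entity and its labeled attribute tuple."""
    triples = [(phi(entity), "rdf:type", "rdfs:Class")]
    for attr, dom in att_tuple.items():
        triples.append((phi(attr), "rdf:type", "rdf:Property"))
        triples.append((phi(attr), "rdfs:domain", phi(entity)))
        triples.append((phi(attr), "rdfs:range", XSD[dom]))
    return triples


young_person = extract_entity(
    "YoungPerson", {"Name": "string", "FUZZYAge": "integer", "u": "real"}
)
for s, p, o in young_person:
    print(s, p, o)
```

In this sketch the membership attribute u is treated like any other attribute, which matches the comment that u is represented by a property φ(u).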
Comments: In a fuzzy ER model, a relationship among entities may be fuzzy. A special symbol β ∈ [0, 1] is introduced into a fuzzy relationship to denote the membership degree with which the relationship holds among the participating entities. The special symbol β is treated as an attribute of the fuzzy relationship and is thus represented by an RDF(S) property φ(β), as shown in Rule 7.
For example, given a relationship relS(Teach) = [tea: Lecturer, tea_by: Course], the following RDF(S) classes, properties, and triple statements are created according to rules 1, 4, 5, and 8 above:
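The listing for this example is likewise not reproduced here. The sketch below shows one plausible reading of the correspondence summarized earlier: the relationship Teach becomes an rdfs:Class, and each role becomes an rdf:Property. That the domain of a role property is the relationship class and its range is the participating entity class is an assumption of this sketch, as is the hypothetical naming function phi.

```python
def phi(symbol):
    """Hypothetical resource-naming function: prefix the ER symbol with 'ex:'."""
    return "ex:" + symbol


def extract_relationship(relationship, role_tuple):
    """Return RDF(S) triples for one fuzzy relationship and its labeled role tuple."""
    triples = [(phi(relationship), "rdf:type", "rdfs:Class")]
    for role, entity in role_tuple.items():
        triples.append((phi(entity), "rdf:type", "rdfs:Class"))
        triples.append((phi(role), "rdf:type", "rdf:Property"))
        # Assumption: a role links an occurrence of the relationship to an entity instance.
        triples.append((phi(role), "rdfs:domain", phi(relationship)))
        triples.append((phi(role), "rdfs:range", phi(entity)))
    # Drop duplicates (an entity may fill several roles) while keeping order.
    return list(dict.fromkeys(triples))


teach = extract_relationship("Teach", {"tea": "Lecturer", "tea_by": "Course"})
for s, p, o in teach:
    print(s, p, o)
```

Turning the relationship into a class means that each occurrence of Teach can be a resource of its own, to which role values and a membership degree can be attached.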
For example, given an ISA relation Postgraduate_Student ≤S Student, the following RDF(S) classes and triple statement are created according to rules 1 and 9:
To further explain the above rules for extracting RDF(S) from fuzzy ER models, the following section provides an example.
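The listing for this example is not reproduced here either. A minimal sketch of the ISA case follows, assuming the same hypothetical "ex:" naming function as a stand-in for φ: both entities are declared as classes, and the inheritance pair becomes a single rdfs:subClassOf triple.

```python
def phi(symbol):
    """Hypothetical resource-naming function: prefix the ER symbol with 'ex:'."""
    return "ex:" + symbol


def extract_isa(isa_pairs):
    """Return RDF(S) triples for a list of (sub-entity, super-entity) pairs."""
    triples = []
    for sub, sup in isa_pairs:
        triples.append((phi(sub), "rdf:type", "rdfs:Class"))
        triples.append((phi(sup), "rdf:type", "rdfs:Class"))
        triples.append((phi(sub), "rdfs:subClassOf", phi(sup)))
    # Remove repeated class declarations while keeping order.
    return list(dict.fromkeys(triples))


isa_triples = extract_isa([("Postgraduate_Student", "Student")])
for s, p, o in isa_triples:
    print(s, p, o)
```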
An extraction example
In the following, we provide an extraction example to illustrate the rules proposed in Section 3.1. Figure 1 shows a fuzzy ER model ɛ1 and part of the instance information with respect to that model.

A fuzzy ER model ɛ1 and a part of instance information.
By applying the rules in Section 3.1 to the fuzzy ER model ɛ1 in Fig. 1, we obtain the RDF(S) shown in Fig. 2, where the URLs of the RDF(S) resources are omitted.

The RDF(S)
The model in Fig. 1 includes the main elements of a typical fuzzy ER model. Fig. 2 shows that this fuzzy ER model can be transformed into RDF(S) and that the extracted RDF(S) can represent the entities, properties, relationships, and constraints of the fuzzy ER model. Together, these observations indicate that the approach for extracting RDF(S) from fuzzy ER models is feasible.
Moreover, it should be noted that, owing to the limited expressiveness of RDF(S), some enhanced features of the basic fuzzy ER model (e.g., generalization hierarchies and cardinality constraints) cannot be directly represented in RDF(S) during extraction.
The only direct relationship between entities that can be expressed in the basic fuzzy ER model is the ISA relation discussed in our work. A common extension allows arbitrary Boolean constructs on entities, the so-called generalization hierarchies (see also [22]), which make it possible to state that the extension of a fuzzy entity is the disjoint union of the extensions of other fuzzy entities. RDF(S) cannot represent such disjointness constraints. Such enhanced features may be represented by more expressive extension languages of RDF(S) (e.g., the ontology language OWL [10, 29]).
An optional cardinality constraint (m, n) can be imposed on a fuzzy relationship to specify that each instance of a fuzzy entity participates in the fuzzy relationship at least m times and at most n times. Since RDF(S) cannot represent cardinality constraints, they cannot be carried over directly when extracting RDF(S) from fuzzy ER models and can only be represented by a more expressive extension language of RDF(S).
So far, the approach proposed in the previous sections can extract RDF(S) from fuzzy ER models, and the main features of basic fuzzy ER models and their instance information can be represented in RDF(S), the data format of the Semantic Web. To automate the extraction, we develop a prototype tool in the following section.
Prototype extraction tool
In this section, as a proof-of-concept for the proposed approach in the previous sections, we developed a prototype tool called FER2RDF, which can extract RDF(S) from fuzzy ER models.
In the following, we briefly introduce the design and implementation of the prototype tool FER2RDF. The core function of FER2RDF is to read an XML-coded file of a fuzzy ER model and then extract RDF(S) from that model. FER2RDF is implemented on the Java 2 JDK 1.6 platform, and its graphical user interface (GUI) is built with the java.awt and javax.swing packages. The overall architecture of FER2RDF is shown in Fig. 3.

Module structure graph of FER2RDF.
As Fig. 3 shows, FER2RDF includes four main modules, i.e., the input module, the parse module, the extraction module, and the output module:
The input module reads in a fuzzy ER model (the XML-coded file of the model produced by the CASE tool).
The parse module parses the input file and stores the parsed information in Java ArrayList objects. The features of the fuzzy ER model in the XML-coded file (such as fuzzy entities, attributes, fuzzy relationships, inheritance, and the constraints mentioned in Definition 2) can be parsed.
The extraction module transforms the parsed results of the fuzzy ER model into RDF(S) according to the algorithm Extract_FER2RDF shown in Table 1, which is based on the approach proposed in Section 3. The algorithm briefly describes the extraction process from fuzzy ER models to RDF(S) and does not repeat the detailed extraction steps given in Section 3. In brief, the algorithm performs two kinds of operations: the extraction from fuzzy ER symbols to RDF(S) resource identifiers, and the extraction from the fuzzy ER constraints (e.g., the inheritance relation and the functions mentioned in Definition 2) to RDF(S) triples. The time complexity of the algorithm is briefly analyzed below.
The output module produces the resulting RDF(S), which is saved as a text file and displayed on the tool screen.
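The two kinds of operations performed by the extraction module can be sketched as follows. This is an illustrative sketch in Python, not the Java implementation of FER2RDF and not the algorithm of Table 1: the function names extract_fer2rdf and to_text, the dictionary layout of the parsed model, the "ex:" prefix, the spelled-out attribute name beta, and the domain-to-datatype table are all assumptions made for the example.

```python
def extract_fer2rdf(model):
    """Two-step sketch: (1) name the resources, (2) create the triples."""
    # Step 1: map every fuzzy ER symbol to an RDF(S) resource identifier.
    phi = dict(model["domains"])  # assumed domain -> datatype table
    for kind in ("entities", "relationships", "attributes", "roles"):
        for symbol in model[kind]:
            phi[symbol] = "ex:" + symbol

    # Step 2: turn the constraints (attE, attR, relS, ISA) into triples.
    triples = set()
    for symbol in model["entities"] + model["relationships"]:
        triples.add((phi[symbol], "rdf:type", "rdfs:Class"))
    for fn in ("att_e", "att_r"):
        for owner, tup in model[fn].items():
            for attr, dom in tup.items():
                triples.add((phi[attr], "rdf:type", "rdf:Property"))
                triples.add((phi[attr], "rdfs:domain", phi[owner]))
                triples.add((phi[attr], "rdfs:range", phi[dom]))
    for relationship, tup in model["rel"].items():
        for role, entity in tup.items():
            triples.add((phi[role], "rdf:type", "rdf:Property"))
            triples.add((phi[role], "rdfs:domain", phi[relationship]))
            triples.add((phi[role], "rdfs:range", phi[entity]))
    for sub, sup in model["isa"]:
        triples.add((phi[sub], "rdfs:subClassOf", phi[sup]))
    return sorted(triples)


def to_text(triples):
    """Serialize triples as one 'subject predicate object .' line each."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Hypothetical parsed model, standing in for the output of the parse module.
parsed = {
    "entities": ["Lecturer", "Course", "Student", "Postgraduate_Student"],
    "relationships": ["Teach"],
    "attributes": ["Name", "u", "beta"],
    "roles": ["tea", "tea_by"],
    "domains": {"string": "xsd:string", "real": "xsd:double"},
    "att_e": {"Lecturer": {"Name": "string", "u": "real"}},
    "att_r": {"Teach": {"beta": "real"}},
    "rel": {"Teach": {"tea": "Lecturer", "tea_by": "Course"}},
    "isa": [("Postgraduate_Student", "Student")],
}
result = extract_fer2rdf(parsed)
print(to_text(result))
```

In this sketch every symbol and every labeled-tuple entry is visited once, and the identifier lookup of Step 1 is used as a sub-operation while the triples of Step 2 are created.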
The algorithm Extract_FER2RDF
Here we briefly analyze the time complexity of algorithm Extract_FER2RDF. Since the extraction from fuzzy ER symbols to RDF(S) resource identifiers (i.e., Step 1 of the algorithm) can be simultaneously made as sub-operations in creating RDF(S) triples (i.e., Step 2 of the algorithm), we can ignore the amount of work done in the first step and consider only the creation of triples in the second step. Also, we only consider the extraction operations and ignore the preprocessing operations (i.e., the parsing of the XML-coded file of the fuzzy ER model), that is, we exclude the amount of work done by an XML parser (e.g., the DOM API for Java in our implementation) that parses the fuzzy ER model (i.e., an XMI-coded file) and prepares the element data in computer memory for the usage in the extraction procedure of the algorithm. Moreover, here we also only consider the fuzzy ER model and ignore the instance information. In this case, the time complexity of the algorithm mainly depends on the structure of fuzzy ER model. Suppose the scale of fuzzy ER model is
We carried out extraction experiments on several fuzzy ER models using the tool on a PC (P4 3.0 GHz CPU, 3.0 GB RAM, Windows XP). Some of these fuzzy ER models were designed by us with the CASE tool, and some are taken from existing work. The case studies show that the approach proposed in Section 3 is feasible and that the FER2RDF tool is efficient.
In the following, we provide a running example of FER2RDF. Figure 4 shows a screen snapshot of the tool running one of the case studies, namely the extraction of RDF(S) from the fuzzy ER model ɛ1 in Section 3.2. In Fig. 4, the XML-coded fuzzy ER model file, the parsed results, and the extracted RDF(S) are displayed in the left, middle, and right areas, respectively. The extracted results show that the proposed approach is feasible.

Screen snapshot of the tool FER2RDF.
Conclusions and future work
In this paper, we proposed an approach and implemented a prototype tool for extracting RDF(S) from fuzzy ER models. After recalling the formal definitions of RDF(S) and fuzzy ER models, we proposed a formal extraction approach from fuzzy ER models to RDF(S) and provided the detailed rules. We then gave an extraction example and briefly discussed the approach. Finally, based on the proposed approach, we implemented a prototype tool, and the experiments show that the approach and the tool are feasible. Our work may act as a bridge between existing fuzzy ER modeling applications and the Semantic Web.
As for future work, we plan to investigate the extraction of RDF(S) from enhanced fuzzy data models (e.g., fuzzy EER models and fuzzy UML models), taking more expressive modeling features into account. We also intend to enhance the prototype tool and conduct larger-scale tests.
Acknowledgments
The work is supported by the Natural Science Foundation of Liaoning Province (No.2015020048), Fundamental Research Funds for the Central Universities (N151704001) and National Natural Science Foundation of China (61672139).
