Abstract
Methods for constructing domain ontologies are now mature, yet most domain ontologies are still built manually or semi-automatically. Building on a survey of metadata and its registration standards, this paper proposes a method for the automatic construction of domain ontologies based on the Metadata Registry (MDR). Mapping rules are formulated from the MDR conceptual metamodel and from the MDR data description metamodel to the ontology. Based on these two sets of mapping rules, a method for building a domain ontology from MDR is proposed, the upstream oil field ontology is constructed automatically, and the resulting ontology is evaluated. The results show that the MDR-based method is feasible and effective for the petroleum field, and that it is also applicable to other fields.
Introduction
Metadata is data about other data, that is, structured data describing the characteristics and attributes of an information resource. As an important means of organizing information, metadata has been widely used in fields such as education and e-government [1]. As research on data element technology has deepened, international standards bodies have issued several series of standards, including ISO/IEC 20943 “Procedures for achieving metadata registry content consistency” [2, 3] and ISO/IEC 19763 “Metamodel framework for interoperability (MFI)” [4, 5, 6].
ISO/IEC 11179 is the Metadata Registry (MDR) standard [7, 8, 9]. Its latest core part, ISO/IEC 11179-3:2013, issued jointly by the International Organization for Standardization and the International Electrotechnical Commission, introduces a “registration” mechanism and places greater emphasis on metadata interoperability than earlier editions. Research on the construction and representation of the MDR conceptual system has accordingly been extended on the basis of the classification, semantics and life cycle of data elements. Drawing on existing results in data element theory, the concepts of MDR are constructed and their semantics are defined, which provides strong theoretical support for organizing and sharing concepts in MDR and in metadata catalogs.
The standard also proposes a metadata registration framework model, which clarifies the composition of the concepts in MDR by describing the contents and connections of each layer in the model. These concepts are associated through the MDR conceptual metamodel, and the concepts together with the relations among them constitute the MDR conceptual system. Because the concepts and relations in MDR are recognized by domain experts, MDR is a well-founded basis for constructing domain ontologies. This paper studies how to transform the MDR conceptual system into a domain ontology: by establishing a mapping algorithm between the MDR conceptual system and the ontology, it realizes the mapping and conversion of the former to the latter.
Figure 1. A typical ontology-based data integration method.
Figure 2. Schema of object class.
Acquiring the domain concepts and knowledge contained in existing, available data resources is one of the most efficient ways to develop and extend an ontology database. Ontologies are commonly used in computer science either as a reference model to support semantic interoperability, or as an artifact that should be efficiently represented to support tractable automated reasoning [11]. Ontologies have also been used to build a networked part resource service platform for the rapid design of complex products. The combination of ontology with computer technology across many fields has given rise to research on the construction and application of so-called domain ontologies [13]. Figure 1 presents a typical ontology-based data integration method. This method takes the hybrid approach, and the system is based on a mediation model. When implementing it in an integration system, it is very important to define and develop the integrated ontology and to establish the mappings. In this paper, we use the ontology-based integration approach together with the ISO 15926 standard to solve oil and gas data querying and reasoning problems.
The structural representation of the MDR conceptual system uses XML to represent the concepts in MDR and the relations among them. XML is used because it plays a significant role in the acquisition, processing and exchange of MDR information. The specific steps are as follows:
Definition of the registry entry Schema: The structure of each registry entry in MDR is defined through a series of legal elements. The registry entry Schema mainly includes structural elements and semantic elements. The structural elements describe the composition of an entry, and the semantic elements describe its attributes. The Schema keeps the structure itself simple and correct, and keeps the construction rules extensible.
Extension of the registry entry Schema: The relations among concepts in the MDR conceptual system are added to the Schema, mainly as the constituent relations of registry entries and the relations among registry entries. Figure 2 shows part of the conceptual schema of the object class.
Setup of the mapping rules: Each attribute of a registry entry in MDR corresponds to an element in the Schema. The mapping rules should be set with reference to the MDR data model and the Schema definition, so that XML can be generated from MDR semi-automatically or automatically.
Namespace division: This step occurs only in the global MDR. Because the global MDR is composed of multiple industry MDRs, different namespaces should be used to distinguish the Schemas and avoid element conflicts when generating XML files.
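The namespace division step can be illustrated with a minimal sketch. The namespace URIs, the element names RegistryEntry and Name, and the identifier OC001 are all hypothetical and are not taken from the MDR Schema; the sketch only shows how two industry registries can reuse the same local name and identifier without conflict.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, one per industry MDR (assumptions for illustration).
NS_OIL = "http://example.org/mdr/petroleum"
NS_GEO = "http://example.org/mdr/geology"


def registry_entry(ns: str, entry_id: str, name: str) -> ET.Element:
    """Build one registry entry whose elements live in the namespace `ns`."""
    entry = ET.Element(f"{{{ns}}}RegistryEntry", {"id": entry_id})
    ET.SubElement(entry, f"{{{ns}}}Name").text = name
    return entry


# A global MDR that merges entries from two industry MDRs.
global_mdr = ET.Element("GlobalMDR")
global_mdr.append(registry_entry(NS_OIL, "OC001", "well"))
global_mdr.append(registry_entry(NS_GEO, "OC001", "well"))

# Same local name and id, but distinct qualified names, so the entries do not clash.
tags = [child.tag for child in global_mdr]
print(tags)
```

ElementTree stores a qualified name as `{namespace}localname`, so the two entries stay distinguishable even though their local names and identifiers coincide.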
A structural representation of the MDR conceptual system can describe all the concepts in MDR as well as the relations among them. The Schema defines the structure of all entries in MDR and their relations, and the entries are defined in a similar way. Figure 3 is an XML representation of the data elements in the MDR conceptual system. It can be divided into three parts: the “crude oil production” data element concept, the “crude oil production of the well” data element and the “monthly crude oil production of the well” derived data element. Each XML element describes its structure, properties and relations according to its schema. The figure shows that the “crude oil production” data element concept is an integral part of both data elements, and that the association is made through the element DECID (data element concept number). The relation between “crude oil production of the well” and “monthly crude oil production of the well” is a genus-species relation, expressed through the attribute values of relational elements in XML. In this way the metadata registration system and the Schema are combined to realize the semi-automatic representation of the MDR conceptual system in XML.
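As a rough sketch of the three-part structure described above, the following XML fragment links both data elements to the data element concept through DECID and expresses the genus-species relation through attribute values. Only DECID is named in the text; every other element name, attribute name and identifier here is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names and identifiers; only DECID is named in the text.
MDR_XML = """\
<MDR>
  <DataElementConcept id="DEC001">
    <Name>crude oil production</Name>
  </DataElementConcept>
  <DataElement id="DE001">
    <Name>crude oil production of the well</Name>
    <DECID>DEC001</DECID>
  </DataElement>
  <DerivedDataElement id="DE002">
    <Name>monthly crude oil production of the well</Name>
    <DECID>DEC001</DECID>
    <Relation type="kindOf" target="DE001"/>
  </DerivedDataElement>
</MDR>
"""

root = ET.fromstring(MDR_XML)
concepts = {c.get("id"): c.findtext("Name") for c in root.iter("DataElementConcept")}

# Resolve each data element to its data element concept through DECID.
links = {}
for elem in root:
    decid = elem.findtext("DECID")
    if decid is not None:
        links[elem.findtext("Name")] = concepts[decid]

# Collect genus-species links expressed through the Relation attribute values.
kind_of = [
    (elem.get("id"), rel.get("target"))
    for elem in root
    for rel in elem.findall("Relation")
    if rel.get("type") == "kindOf"
]
print(links)
print(kind_of)
```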
Table 1. Comparison of MDR conceptual system representation languages.
Figure 3. Structural representation example of MDR conceptual system.
The MDR conceptual system model is abstract and broad in definition; it is the “programmatic” model of the whole MDR. The notation attribute in the MDR conceptual architecture model expresses the MDR conceptual system in a particular construction language. The supported construction languages are the Simple Knowledge Organization System (SKOS), the Web Ontology Language (OWL), the Unified Modeling Language (UML) and object-relational mapping (ORM). Table 1 describes the correspondence between these modeling languages and the elements of the conceptual system, where N/A means not applicable.
SKOS provides an extensible language for describing structural terminology systems like taxonomies and thesauri. Based on RDF, it inherits the power of flexibility and distribution. SKOS extends RDF with specific notions of linkage between structural vocabularies, making it particularly useful for enterprise vocabulary management.
OWL models are designed precisely for data exchange and model integration. Through a formalism that is founded on interoperability principles of uniform identifiers and canonical representations of modeling facts, they have intrinsic properties for linking data and models.
UML unifies various methods for different types of systems, different stages of development and different internal concepts, thus eliminating unnecessary differences between modeling languages. However, models expressed in UML are limited in their interoperability because XMI is ineffective as an exchange mechanism between UML tools, which often implement proprietary formats that extend current XMI standards.
ORM provides another model of the persistence layer. It uses mapping metadata to describe the object-relational mapping, so that ORM middleware can act as a bridge between the business logic layer and the database layer of any application.
Figure 4. Framework for MDR conversion to ontology.
Given the MDR conceptual component metamodel, and in order to extend the conceptual representation and knowledge reasoning of MDR while taking into account the role of each representation language in practice, SKOS and OWL are the recommended languages. Although neither language defines Relation or Link explicitly, both notions are embodied in other elements of the modeling language when the conceptual system is represented.
Because ontology construction methods have become increasingly mature and ontology-based knowledge analysis has achieved good results, we use ontology as the representation language of the MDR conceptual system.
To represent the MDR conceptual system, an MDR ontology conversion framework is proposed, as shown in Fig. 4. The transformation of MDR to ontology relies mainly on a mapping model, which consists of two parts: mapping model A, from the MDR conceptual metamodel to the ontology, and mapping model B, from the MDR data description metamodel to the ontology. The mapping model can be expressed as MM
The conceptual metamodel, the data description metamodel and the ontology model are the three elements of the mapping model. They are represented as follows:
Figure 5. Illustration of mapping model A.
Figure 6. Illustration of mapping model B.
MDR conceptual metamodel: MDR
MDR data description metamodel: MDRD
Ontology model:
Through the study and comparison of the three model elements, the mapping rules are formulated and the mapping model MM is realized. The mapping rules are as follows.
Rule 1:
Rule 2:
Mapping model A defines the rules that map the classes and links of the conceptual metamodel to the corresponding elements of the ontology. Figure 5 shows the data element to the ontology conversion, while the left side shows the concept relation between “well concept” and “oil well” (
MDR data description metamodel to ontology mapping (mapping model B)
Rule 1:
Rule 2:
Mapping model B defines the rules that map each registry entry in the data description component to the ontology. Figure 6 shows the data element to the ontology conversion, while the left side shows the “well number” data element, which consists of the object class word “well” and the property “number” to construct the data element concept
Ontology generation in oil field based on the MDR
An ontology is an explicit formal specification of a shared conceptual model. According to the level at which they are studied, ontologies can be divided into top-level ontologies, domain ontologies, task ontologies and application ontologies [14, 15, 16]. A domain ontology expresses the definitions of concepts, the attributes of concepts, the relations between concepts and the activities of the field.
With the mapping model from the MDR conceptual system to the ontology in place, the ontology can be constructed from MDR step by step according to the MDR ontology conversion framework. The concrete steps are as follows:
Metadata selection: Some MDR metadata items should be converted to classes of the ontology, and others to properties of those classes. The metadata items in MDR therefore need to be classified, and the valuable items selected.
Class and relationship definition: Through mapping model A of the ontology conversion framework, the concepts of MDR and the relations between them are converted to classes and their relationships.
Class attribute definition: Through mapping model B of the ontology conversion framework, the valuable metadata items of MDR are converted to class attributes.
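The class and class attribute definition steps can be sketched for the “well number” data element mentioned earlier, whose object class is “well” and whose property is “number”. The function below is a hypothetical illustration of one plausible mapping (object class to owl:Class, property to owl:DatatypeProperty); it is not the rule set of mapping model B, and the choice of xsd:string as the value type is an assumption.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
XSD = "http://www.w3.org/2001/XMLSchema#"

for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("owl", OWL)):
    ET.register_namespace(prefix, uri)


def data_element_to_owl(object_class: str, prop: str, xsd_type: str) -> ET.Element:
    """Map one data element to an OWL class plus a datatype property (assumed rule)."""
    root = ET.Element(f"{{{RDF}}}RDF")
    ET.SubElement(root, f"{{{OWL}}}Class", {f"{{{RDF}}}ID": object_class})
    dp = ET.SubElement(root, f"{{{OWL}}}DatatypeProperty", {f"{{{RDF}}}ID": prop})
    # The property applies to the object class and takes values of the given XSD type.
    ET.SubElement(dp, f"{{{RDFS}}}domain", {f"{{{RDF}}}resource": f"#{object_class}"})
    ET.SubElement(dp, f"{{{RDFS}}}range", {f"{{{RDF}}}resource": XSD + xsd_type})
    return root


doc = data_element_to_owl("Well", "number", "string")
owl_text = ET.tostring(doc, encoding="unicode")
print(owl_text)
```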
The converted classes, the relations between classes, the attributes of classes and the data types of attributes together form an ontology. Figure 6 is a schematic diagram of the conversion of some metadata items to the ontology. The connotation and extension of concepts in MDR keep changing, and different relations arise between concepts. In the conceptual metamodel, these relations can be classified as identity, disparate, genus-species, cross and opposition relations [17]. In this paper, the ontology in oil field is defined as a six-tuple [18], as follows:
Concept representation: The concept is the most important element in the oil field development knowledge base. It is expressed as a class in the ontology and is represented in OWL by the tag "<owl:Class>". For example, the concept of "oilfield water" is defined as follows:
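A minimal sketch of what such a class declaration could look like in OWL (RDF/XML syntax) is given below, wrapped in a few lines of Python that check it is well formed. The identifier Oilfield_water is an assumption.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"

# Hypothetical class identifier chosen for illustration.
OWL_CLASS = f"""\
<rdf:RDF xmlns:rdf="{RDF}" xmlns:owl="{OWL}">
  <owl:Class rdf:ID="Oilfield_water"/>
</rdf:RDF>
"""

# Parsing confirms the fragment is well-formed XML and lists the declared classes.
root = ET.fromstring(OWL_CLASS)
class_ids = [c.get(f"{{{RDF}}}ID") for c in root.findall(f"{{{OWL}}}Class")]
print(class_ids)
```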
Relationship representation: Relations are described in OWL as follows:
Equivalence relationship (Equivalent-of): Equivalence is represented in OWL as an equivalent class, described by the "<owl:equivalentClass>" tag. For example, the equivalence between the full name "Oil extraction well" and the short name "Oil well" of the same concept is defined as follows:
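One possible encoding of this equivalence is sketched below. The class identifiers are assumptions; the Python wrapper only parses the fragment and extracts the equivalence pairs.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"

# Hypothetical class identifiers for the full and short names of the concept.
OWL_EQUIVALENCE = f"""\
<rdf:RDF xmlns:rdf="{RDF}" xmlns:owl="{OWL}">
  <owl:Class rdf:ID="Oil_well"/>
  <owl:Class rdf:ID="Oil_extraction_well">
    <owl:equivalentClass rdf:resource="#Oil_well"/>
  </owl:Class>
</rdf:RDF>
"""

root = ET.fromstring(OWL_EQUIVALENCE)
# Each pair is (class, class it is declared equivalent to).
equivalences = [
    (cls.get(f"{{{RDF}}}ID"), eq.get(f"{{{RDF}}}resource").lstrip("#"))
    for cls in root.findall(f"{{{OWL}}}Class")
    for eq in cls.findall(f"{{{OWL}}}equivalentClass")
]
print(equivalences)
```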
Inheritance relationship (Kind-of): Inheritance (the relation between superordinate and subordinate concepts) is represented in OWL by the "<rdfs:subClassOf>" tag. For example, the concepts "Groundwater" and "Oilfield water" are related as follows:
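A sketch of such a declaration follows. It assumes that "Oilfield water" is the subordinate concept and "Groundwater" the superordinate one; the direction and the identifiers are assumptions.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# Assumed direction: Oilfield_water is a kind of Groundwater.
OWL_INHERITANCE = f"""\
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:ID="Groundwater"/>
  <owl:Class rdf:ID="Oilfield_water">
    <rdfs:subClassOf rdf:resource="#Groundwater"/>
  </owl:Class>
</rdf:RDF>
"""

root = ET.fromstring(OWL_INHERITANCE)
# Map each subclass to its declared parent class.
parents = {
    cls.get(f"{{{RDF}}}ID"): sub.get(f"{{{RDF}}}resource").lstrip("#")
    for cls in root.findall(f"{{{OWL}}}Class")
    for sub in cls.findall(f"{{{RDFS}}}subClassOf")
}
print(parents)
```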
Parts and whole relationship (Part-of): OWL has no tag that directly represents the relationship between parts and whole, so a partOf attribute must be defined for this purpose. The partOf attribute is defined as follows:
For example, "Wellhead is composed of four-way, flange joints, casinghead and BOP components" is defined in OWL as follows:
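One common way to model this in OWL is to declare partOf as an object property and to state, for each component class, that its members are partOf some Wellhead. The sketch below generates such a fragment; the class identifiers and the use of an owl:someValuesFrom restriction are assumptions and may differ from the encoding used for the ontology in this paper.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("owl", OWL)):
    ET.register_namespace(prefix, uri)

# Hypothetical class identifiers derived from the sentence in the text.
WHOLE = "Wellhead"
PARTS = ["Four_way", "Flange_joint", "Casinghead", "BOP"]

root = ET.Element(f"{{{RDF}}}RDF")
ET.SubElement(root, f"{{{OWL}}}ObjectProperty", {f"{{{RDF}}}ID": "partOf"})
ET.SubElement(root, f"{{{OWL}}}Class", {f"{{{RDF}}}ID": WHOLE})

for part in PARTS:
    # Each part class is a subclass of "things that are partOf some Wellhead".
    cls = ET.SubElement(root, f"{{{OWL}}}Class", {f"{{{RDF}}}ID": part})
    sub = ET.SubElement(cls, f"{{{RDFS}}}subClassOf")
    restriction = ET.SubElement(sub, f"{{{OWL}}}Restriction")
    ET.SubElement(restriction, f"{{{OWL}}}onProperty", {f"{{{RDF}}}resource": "#partOf"})
    ET.SubElement(restriction, f"{{{OWL}}}someValuesFrom", {f"{{{RDF}}}resource": f"#{WHOLE}"})

owl_text = ET.tostring(root, encoding="unicode")
restriction_count = len(root.findall(f".//{{{OWL}}}Restriction"))
print(owl_text)
```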
Attribute relationship (Attribute-of): An attribute is a binary relationship that is usually divided into two types:
Data type attribute: Describes the relationship between an instance of a class and RDF literals or XML Schema data types.
Object attribute: Describes the relationship between instances of two classes.
For example, the data type attribute "Oil extraction Index" is defined as follows:
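A data type attribute of this kind could be declared as sketched below. The identifier oilExtractionIndex, the domain class Oil_well and the range xsd:float are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical property name, domain class and value type.
OWL_DATATYPE_PROPERTY = f"""\
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:ID="Oil_well"/>
  <owl:DatatypeProperty rdf:ID="oilExtractionIndex">
    <rdfs:domain rdf:resource="#Oil_well"/>
    <rdfs:range rdf:resource="{XSD}float"/>
  </owl:DatatypeProperty>
</rdf:RDF>
"""

root = ET.fromstring(OWL_DATATYPE_PROPERTY)
prop = root.find(f"{{{OWL}}}DatatypeProperty")
# Read back the domain class and the literal type of the property.
domain = prop.find(f"{{{RDFS}}}domain").get(f"{{{RDF}}}resource")
value_type = prop.find(f"{{{RDFS}}}range").get(f"{{{RDF}}}resource")
print(domain, value_type)
```

An object attribute would be declared the same way with owl:ObjectProperty, with a class rather than an XML Schema type as its range.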
Instance relationship (Instance-of): An instance is connected to a concept by object or data type attributes, together with mutual constraints between those attributes. For example, "In December 2016, the daily liquid production capacity of oil production well G75-3 is 65.9 t, the daily oil production capacity is 8.5 t, the water content is 20%, and the dynamic liquid surface is 549 m" is defined in OWL as follows:
The oil field domain ontology is built in Protégé 4.2 and described in OWL, as shown in Fig. 7.
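The instance in this sentence could be serialized roughly as follows. The values come from the sentence itself; the namespace, the class name Oil_production_well, the property names and the choice of xsd:decimal are assumptions.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"
XSD = "http://www.w3.org/2001/XMLSchema#"
EX = "http://example.org/oilfield#"  # hypothetical ontology namespace
for prefix, uri in (("rdf", RDF), ("owl", OWL), ("ex", EX)):
    ET.register_namespace(prefix, uri)

# Values are taken from the sentence in the text; property names are assumptions.
record = {
    "dailyLiquidProduction_t": 65.9,
    "dailyOilProduction_t": 8.5,
    "waterContent_percent": 20.0,
    "dynamicLiquidSurface_m": 549.0,
}

root = ET.Element(f"{{{RDF}}}RDF")
well = ET.SubElement(root, f"{{{OWL}}}NamedIndividual", {f"{{{RDF}}}about": EX + "G75-3"})
ET.SubElement(well, f"{{{RDF}}}type", {f"{{{RDF}}}resource": EX + "Oil_production_well"})
month = ET.SubElement(well, f"{{{EX}}}observationMonth", {f"{{{RDF}}}datatype": XSD + "gYearMonth"})
month.text = "2016-12"
for name, value in record.items():
    # One typed literal per measured quantity.
    prop = ET.SubElement(well, f"{{{EX}}}{name}", {f"{{{RDF}}}datatype": XSD + "decimal"})
    prop.text = str(value)

owl_text = ET.tostring(root, encoding="unicode")
print(owl_text)
```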
Figure 7. Schematic diagram of part of the ontology in oil field (upstream).
Domain ontology evaluation is the comprehensive assessment of a domain ontology's construction concept, application requirements, concept organization, function design and actual operating status, using scientific methods and a unified index system [19].
A scientific ontology evaluation system guides and controls the construction of a domain ontology. Although research on ontology evaluation at home and abroad has produced some results, no complete ontology evaluation standard or mechanism has yet been formed, and existing research does not provide a complete analysis framework. In addition, the existing evaluation indexes have operational weaknesses and other shortcomings that prevent them from being put into practical use.
Given this situation, this paper follows six basic principles of ontology evaluation (integrity, scientificity, universality, feasibility, orientation and openness) and combines quantitative evaluation with the practical application of the ontology to evaluate the constructed oil field domain ontology.
For the quantitative evaluation, the OntoQA method [20, 21], proposed by the LSDIS Laboratory of the Department of Computer Science at the University of Georgia, is used to evaluate the oil field ontology. In this method, a series of formulas is used to calculate and evaluate each index of the oil field domain ontology from different angles, and the validity of the constructed ontology is supported by actual data.
Richness of the relationship (RR): RR reflects the diversity of relations in the ontology. Generally speaking, an ontology that contains a variety of relationships is richer than one that contains only parent-child relationships.
Attribute richness (AR): In general, the more attributes are defined in the ontology, the more knowledge it can express. The attribute richness is the average number of attributes per class, that is, the number of attributes Attr in the ontology divided by the number of classes.
Inheritance richness (IR): This indicator describes the information distribution over the subclass tree of each class in the ontology, and it reflects well how knowledge is organized into different classes and subclasses.
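For reference, the three indicators as they are commonly stated for OntoQA [20, 21] can be written as below. Apart from Attr, which is named in the text, the symbols follow the usual OntoQA notation and are assumptions here: |P| is the number of non-inheritance relations, |SC| the number of subclass (inheritance) relations, |C| the number of classes, and |H^C(C_1, C_i)| the number of direct subclasses C_1 of class C_i.

```latex
\begin{align}
RR &= \frac{|P|}{|SC| + |P|} \\
AR &= \frac{|Attr|}{|C|} \\
IR &= \frac{\sum_{C_i \in C} \left| H^{C}(C_1, C_i) \right|}{|C|}
\end{align}
```

Under these definitions RR lies between 0 and 1, and a value close to 0 means that almost all relations are inheritance relations.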
The sub-ontology of drilling rig machinery and equipment in the oil field is taken as an example and evaluated with the above method; the indicators are shown in Table 2.
Table 2. Quantitative evaluation of oil field ontology.
The evaluation shows that the MDR-based oil field domain ontology is rich in concepts, attributes and instance relationships, and can meet the demand for oil field knowledge services.
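The three indicators used in this evaluation can be computed directly from counts of classes, relations and attributes. The sketch below assumes the usual OntoQA definitions and runs on a made-up toy schema; the class names, relations and attributes are hypothetical and do not reproduce the ontology or the values reported in Table 2.

```python
from collections import Counter

# Hypothetical toy schema for illustration; not the paper's ontology or its Table 2 data.
classes = ["Equipment", "Pump", "MudPump", "Derrick", "Drawworks"]
subclass_of = [  # (child, parent) inheritance links
    ("Pump", "Equipment"),
    ("MudPump", "Pump"),
    ("Derrick", "Equipment"),
    ("Drawworks", "Equipment"),
]
other_relations = [  # (subject, relation, object) non-inheritance links
    ("MudPump", "connectedTo", "Drawworks"),
    ("Drawworks", "mountedOn", "Derrick"),
]
attributes = {  # class -> attribute names
    "Equipment": ["model", "manufacturer"],
    "MudPump": ["ratedPressure"],
    "Derrick": ["height", "ratedLoad"],
}


def relationship_richness(n_other: int, n_subclass: int) -> float:
    """Share of non-inheritance relations among all relations."""
    return n_other / (n_subclass + n_other)


def attribute_richness(n_attributes: int, n_classes: int) -> float:
    """Average number of attributes per class."""
    return n_attributes / n_classes


def inheritance_richness(links, class_names) -> float:
    """Average number of direct subclasses per class."""
    children = Counter(parent for _, parent in links)
    return sum(children[c] for c in class_names) / len(class_names)


rr = relationship_richness(len(other_relations), len(subclass_of))
ar = attribute_richness(sum(len(a) for a in attributes.values()), len(classes))
ir = inheritance_richness(subclass_of, classes)
print(f"RR={rr:.3f} AR={ar:.3f} IR={ir:.3f}")
```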
Metadata is an effective way of organizing knowledge in a field, and after many years of international standardization it carries rich semantic knowledge. After studying metadata and its registration standard, this paper uses a structural representation method and an ontology-based knowledge representation method to represent the MDR conceptual system, and puts forward an MDR ontology conversion framework. Mapping rules from the MDR conceptual metamodel and from the data description metamodel to the ontology are formulated, and a method for building a domain ontology from MDR is given on the basis of these two sets of rules. With this method, the upstream oil field ontology is constructed and evaluated. The results show that building the oil field ontology from MDR is feasible and effective. Because the method is general, it is also suitable for converting MDR to ontology in other fields. Management of the domain ontology after it has been generated from MDR, and synchronization of the domain ontology with the evolution and change of MDR, need further study.
