Abstract
The growth of the web has touched everyone’s life, from economists, entrepreneurs, and academicians to farmers. The agriculture sector is important for any country’s growth, and farmers benefit when they are provided with the relevant information they need. This is not yet happening on a large scale for several reasons: information is available in different formats and on different platforms, and it is highly unstructured. The availability and real-time use of this information are hindered mainly by the way it is represented. Recently, ontology has emerged as a very expressive knowledge representation scheme that enables gathering information from heterogeneous sources and creates a common data model that is shared and agreed upon across diverse domains. The interoperability and deduction capabilities that an ontology offers are very useful for generating new knowledge. Most of the work on ontology development in the agriculture domain is crop specific, with information about fertilizers confined to a particular crop. Fertilizer is itself a concept that should be modeled independently; even for a crop-specific ontology, fertilizer is one of the most important concepts to be captured. To address this research gap, we have designed and developed a generic ontology for the fertilizer subdomain, taking into account the issues faced while constructing an ontology and the enormous amount of available data that is not easy to structure and present in the form of an ontology. We have validated the ontology using an available tool, and domain experts have also validated it.
Introduction
Agriculture plays a vital role in the economic growth of a country. With the evolution of the internet, the agriculture sector is also becoming IT focused. Various efforts are under way to increase productivity and improve farmers’ conditions by providing them with quick solutions through SMS, emails, portals etc. At the same time, the volume of data is increasing rapidly, so it needs to be represented and managed in a way that allows it to be utilized effectively. The increased number of information sources makes it challenging to incorporate information on a single platform because of its heterogeneity and unstructured nature [4].
In recent years, ontology has emerged as a research area with applications in almost all fields. The word “ontology” has its roots in philosophy, where it refers to the study of existence. Technically, an ontology is defined as a set of concepts and the relations among them in a particular domain [6]. Ontology development is not a straightforward task. Each domain has its own set of needs and desirable structure. However, keeping in mind the generic structure and nature of ontologies, an ontology should follow certain principles, which are as follows [15]:
The objective of ontology development should be clear, and the ontology should explicitly define the meanings of its terms, concepts and all other notions. There should be standard definitions and documentation that enable groups of users to reuse and port the ontology. Definitions in the ontology should be complete and expressible in some standard representation formalism. All the definitions in the ontology should permit inferences with a consistency check. Since the basic aim of an ontology is to enable the reuse of existing knowledge, there should be a standard mechanism to extend the ontology with more concepts and terms, in such a way that existing information in the ontology is unaffected by the inclusion of new knowledge. The ontology should be flexible so that it can be utilized and customized by the maximum number of people while imposing as few commitments on them as possible. All the modules need to be diversified by minimizing the coupling between them. The classes should be disjoint, and hierarchies should be formed in such a way that similar concepts are represented with similar primitives. The naming style should be consistent for the whole ontology.
Today, ontologies are being developed in many domains such as biomedicine, tourism, health, aerospace [20] etc. There are ontologies available in the field of agriculture, but they are crop specific. Crop-specific ontologies have their advantages, especially in the field of developing expert systems. Fertilizers always contribute significantly to a crop ontology; however, fertilizer is generally treated as a dependent entity. As a result, fertilizer information is inconsistent and underrepresented. Since fertilizer is a specific domain in its own right, a generic ontology is required for fertilizers, one that can represent information about fertilizers in its totality. The ontology can further be integrated with crop and soil ontologies to retrieve more specific information. The objective of this paper is to provide generic foundations for the design and development of a fertilizer ontology.
The paper is organized in six sections. The first section gives a general introduction and the principles behind ontology creation. In the second section, we discuss related work, particularly in the agriculture domain. In the third section, we discuss various research issues in ontology development in the agriculture domain. Section 4 gives the complete design and development of the proposed ontology; the criteria followed for each step in the ontology development process are clearly stated. Ontology validation is discussed in Section 5. The final section concludes the proposed work.
The literature review shows that a few ontologies have already been developed in the agriculture domain. The Thai Rice Ontology [16] is a prototype ontology for plant production that uses Thai rice as a case study. It covers all the stages of rice production in Thailand, from cultivation to harvesting, and has been designed to facilitate knowledge acquisition and information retrieval for research purposes. The User Centered Ontology for Sri Lankan Farmers [17] aims to provide information specific to the user’s context. It has been developed considering farmers’ needs and the factors that vary from farmer to farmer, such as farm environment, types of farmers, etc. There are some systems available [14] which provide tailor-built fertilizers to farmers. Such systems are made to target specific problems; they are not generic in nature. [18] have developed an ontology for social life networks. [19] have studied the life cycle of crops and identified different stages of the crop lifecycle to produce an ontology.
There are also many well-established controlled vocabularies in the agricultural domain. [5] has identified several limitations and drawbacks of current vocabularies, such as semantic ambiguity in the definitions and usage of vocabularies, a lack of high-level cross-domain concepts, and relationships whose meaning is not precisely defined. One of the most well-established and authentic controlled vocabularies in agriculture is AGROVOC [2]. AGROVOC is a multilingual, structured, controlled vocabulary/thesaurus designed to cover concepts and terminology in agriculture, forestry, fisheries, food and related domains, developed by the Food and Agriculture Organization (FAO) of the United Nations and the Commission of the European Communities. Agropedia [1] is an online knowledge repository for information related to agriculture in India, backed by the Government of India and sponsored by the World Bank through the National Agricultural Innovation Project of the Indian Council of Agricultural Research (ICAR). It is most useful for users whose needs are mainly crop based, as the repository is maintained per crop. Its main aim is to collect alerts on different crops from scientists and to keep farmers updated about them through text messages. The knowledge models it provides are written from a crop point of view, so ontologies that approach the domain from other concepts are needed for better coverage of the agriculture domain.
Research motivation and challenges
While going through the related work and its limitations, we found that there is scope for developing ontologies for representing and extracting information about fertilizers. The motivations for our research are as follows:
No online resource exists for obtaining information about fertilizers, even leaving ontologies aside. AGROVOC is the largest, most exhaustive and best-established thesaurus available today in the agriculture domain [5]. It includes some information about fertilizers, which indicates that fertilizer is considered an important part of the agriculture vocabulary. However, AGROVOC is not a complete ontology, nor does it contain complete information about fertilizers. The main target areas for ontology development in the field of agriculture are soil, crop and fertilizer. Some crop-specific ontology work has been done in the agriculture domain, for example the Thai Rice Ontology. However, in order to capture knowledge in these areas, it might be preferable to develop generic ontologies separately for each of them. While crop and soil ontologies are available, a fertilizer ontology is not. Fertilizer management is also an important area in its own right. Farm yields are adversely affected by wrong and naive usage of fertilizers, because farmers do not have a proper idea of the various fertilizers and their usage for different crops. However, fertilizer management has received less attention and deserves more. From a general perspective as well, it is important to organize the scattered information on fertilizers so that general-purpose questions on fertilizers can be answered.
In order to start building the fertilizer ontology, we faced the following research challenges:
No prior work has been done on developing an ontology for the fertilizer domain, so we had to start from scratch. Considering the application of this ontology, there can be various stakeholders, such as researchers, academicians, agriculture scientists, agriculture policy makers and finally farmers, who may benefit if we are able to provide proper ways of accessing the ontology. Satisfying them all is not possible, so we have to confine the scope. However, the foundations should be such that the ontology can be extended in future. The fertilizer taxonomy is not well defined, so it is not straightforward to create the hierarchical classes required in an ontology. Gathering standard and authentic information sources on fertilizers is not an easy task. Properties and constraints are not well defined for fertilizers, which increases the complexity of ontology construction.
Following are the main objectives of our work:
To explore the feasibility of a fertilizer ontology. To develop an ontology in the fertilizer domain that represents the existing information on fertilizers, such as type of fertilizer, nutrient contents, equivalent acidity, crops benefitted, preferred soil type, time of application, etc. To validate the developed ontology. Building an ontology for a specific domain can start from scratch or by modifying an existing ontology. In both cases, techniques for evaluating the characteristics and the validity of the ontology are necessary.
Overall ontology design process is given in Fig. 1.
Overall process.
We were not able to find a single complete and authentic resource containing the required information. Since AGROVOC is the most well-established and publicly available resource for agriculture, we started our work by capturing terms and taxonomical knowledge from it. As AGROVOC is not an ontology, neither its coverage nor its structure is sufficient for a fertilizer ontology. In addition to AGROVOC, we also considered other knowledge repositories (digital and non-digital) such as Agropedia [1], textbooks, reference books etc. To authenticate the information, we consulted experts from ICAR.
Since we did not find any fertilizer taxonomy that could be directly mapped to our ontology, we had to structure it on our own using a mix of knowledge available in AGROVOC, agriculture books, journals, magazines and expert advice.
While designing the ontology, we faced many challenges. Some of the specific challenges are presented as follows:
We started by developing some competency questions in order to determine the scope of our ontology. As discussed earlier, fertilizer as an application area covers many aspects, and it is not possible to cover each of them. We therefore aimed for an ontology that is properly structured at the basic level and whose basic knowledge about fertilizers is as complete as possible. Accordingly, we confined ourselves to certain types of competency questions on basic fertilizers and their use. Information should be incorporated in our ontology in such a way that it is possible to answer basic questions such as what the different categories of fertilizers are, what their properties and restrictions should be, etc. All this information has been identified and included after much discussion. A list of some focused questions is presented in a later subsection.
Deciding ontology development approach
Three different approaches have been proposed for the development of ontology [7]:
Top-down approach. Bottom-up approach. Combination approach.
No approach is inherently better than the other two. The choice depends on the developer’s understanding of the domain and on convenience. We have followed the combination approach, which combines the top-down and bottom-up approaches.
[11] have discussed several guidelines to keep in mind while developing a class hierarchy. We consider these guidelines and use them to check against the class hierarchy that we have created for our ontology.
Deciding class hierarchy.
Defining the class hierarchy is one of the most important steps in ontology development. Because well-defined hierarchies are unavailable in many domains, it is very difficult to arrive at a proper class hierarchy for such a domain. Even AGROVOC, the most well-established and authoritative thesaurus, shows some conflicts in its hierarchy when checked against ontology development methodologies. For example, in AGROVOC, the class fruits has the subclasses shown in Fig. 2.
Agrovoc hierarchy.
However, according to [5], all the siblings in the hierarchy (except the ones at the root) must be at the same level of generality. As Fig. 2 shows, bananas and citrus fruits are not at the same level of generality: citrus fruits are more general than bananas.
Since we are developing the ontology from scratch, extensive reading of books and other resources related to agriculture is required. The information that we gather and find possible to structure must also be verified by domain experts. Hence it is an iterative process which takes time and patience.
Deciding whether a particular concept should be a class or an individual instance of a class depends on the possible application of the ontology and the scope of the domain. Since the ontology that we are building is a generic ontology in the domain of agriculture with the fertilizer subdomain as the case study, it should have enough coverage of the information in the fertilizer subdomain.
Class versus property
Deciding whether a specific distinction should be modeled as a property value or as a set of classes depends on the scope of the ontology and its use. For example, whether nitrogenous fertilizer should be a separate class, or whether it should instead belong to the class ChemicalFertilizer with the property “nutrient contained” set to nitrogen, depends on the scope of the domain. The distinction between nitrogenous, potassic and phosphatic fertilizers is very important here, since we are developing a detailed ontology of fertilizers. Hence this distinction leads to the subclasses NitrogenousFertilizer, PotassicFertilizer and PhosphoricFertilizer.
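The two modeling options can be contrasted with a small Python sketch. This is only an illustration and not how FertOnt is encoded: the dictionaries, the helper functions and the spelling `nutrientContained` are hypothetical, while the class names and the BasicSlag individual come from the paper.

```python
# Option 1: the nutrient distinction is encoded as subclasses.
SUBCLASS_OF = {
    "NitrogenousFertilizer": "ChemicalFertilizer",
    "PotassicFertilizer": "ChemicalFertilizer",
    "PhosphoricFertilizer": "ChemicalFertilizer",
}
CLASS_OF = {"BasicSlag": "PhosphoricFertilizer"}

# Option 2: a single class, with the distinction stored as a property value.
# "nutrientContained" is a hypothetical spelling of the property name.
PROPERTIES_OF = {
    "BasicSlag": {"type": "ChemicalFertilizer", "nutrientContained": "Phosphorus"},
}


def phosphatic_by_class(class_of):
    """Find phosphatic fertilizers when the distinction is a subclass."""
    return sorted(ind for ind, cls in class_of.items()
                  if cls == "PhosphoricFertilizer")


def phosphatic_by_property(properties_of):
    """Find phosphatic fertilizers when the distinction is a property value."""
    return sorted(ind for ind, props in properties_of.items()
                  if props.get("nutrientContained") == "Phosphorus")
```

Both encodings answer the same question. The subclass form additionally gives each nutrient group a place to attach its own properties, which fits the paper's stated choice of separate subclasses.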
Relation between properties
There are two types of properties in an ontology. In simple words, object properties link individuals to other individuals, and datatype properties link individuals to data values. In some cases the choice between the two is obvious, and in others it is difficult to decide whether something should be a data property or an object property. The decision is based on the requirements of the domain.
Data properties: These are the properties that relate individuals to a user-defined value.
Object properties: These are the properties that relate the individuals of a class to individuals of another class.
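As a rough illustration of this distinction, the following Python sketch classifies a property by whether its values are individuals or plain data values. The `Literal` wrapper and the `property_kind` helper are hypothetical names introduced here; the triples reuse values that the paper reports for the BasicSlag individual.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """Hypothetical wrapper marking a plain data value (not an individual)."""
    value: object


# (subject, property, object) triples based on the BasicSlag example.
TRIPLES = [
    ("BasicSlag", "canBeMixedWith", "RockPhosphate"),            # individual to individual
    ("BasicSlag", "hasApplicationMethod", "BroadcastingAtPlanting"),
    ("BasicSlag", "hasPercentP", Literal("3–8")),                # individual to data value
]


def property_kind(prop, triples):
    """Return 'object' or 'data' depending on how the property is used."""
    kinds = {"data" if isinstance(obj, Literal) else "object"
             for _, p, obj in triples if p == prop}
    if len(kinds) != 1:
        raise ValueError(f"{prop!r} is unused or mixes individuals and literals")
    return kinds.pop()
```

For instance, `property_kind("canBeMixedWith", TRIPLES)` yields `"object"`, whereas `property_kind("hasPercentP", TRIPLES)` yields `"data"`.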
Keeping in mind the rules proposed by [11] and the various stages involved in the ontology development, the fertilizer ontology is built from scratch. The approach selected is manually-driven. The tool that we are using for the development of FertOnt is Protégé 4.3 (Build 304).
Step 1: Determine the domain and scope of the ontology
The scope of the ontology is identified through clearly defined competency questions. In order to answer these questions, the classes, concepts, properties and relationships have to be identified. The knowledge base formed as an ontology should be able to answer these questions.
Competency questions:
What is the nitrogen content in various Amide fertilizers?
Which chemical fertilizers can be mixed physically and used as mixed fertilizers?
Which phosphatic fertilizers are suitable for acidic soil?
Which nitrogenous fertilizers use Topdressing type of fertilizer application method?
List the potassic fertilizers that have potassium content greater than 20%.
List the phosphatic fertilizers that contain phosphorus in the form of dicalcium phosphate.
What is the N, P, K production and import for the year 2004–2005?
What is the time of application of basic slag?
Step 2: Consider reusing existing ontology
Although some ontologies are available in the agriculture field, none of them fits our requirements. We have therefore developed the ontology from scratch.
Step 3: Enumerate different terms in the ontology
The next step is to write down an exhaustive list of terms that the fertilizer subdomain covers. For example, the terms that usually come to mind when we think of fertilizer are chemical fertilizer, manure, compost, dosage, bio fertilizer, application time, application method, and so on.
Step 4: Define the classes and class hierarchy
A new class is introduced when there is something that can be said about the new class that is not true of its superclass. For example, bio fertilizers can have different organisms involved in their production and usage, whereas for fertilizers in general this property has no significance and is not used to describe them. Similarly, nitrogenous fertilizers have a property that describes their nitrogen content, which is not a very useful property for chemical fertilizers in general. Therefore, bio fertilizer and nitrogenous fertilizer are subclasses of fertilizer and chemical fertilizer respectively.
It is however, sometimes important to define new classes even if there are no new properties added for them. For example, the class ApplicationMethod has subclasses LiquidApplicationMethod and SolidApplicationMethod. This classification is just a hierarchy and the subclasses have the same set of properties.
Also, deciding whether a specific distinction should be modeled as a property value or as a set of classes depends on the scope of the ontology and its use. For example, whether nitrogenous fertilizer should be a separate class, or whether it should instead belong to the class ChemicalFertilizer with the property “nutrient contained” set to nitrogen, depends on the scope of the domain. The distinction between nitrogenous, potassic and phosphatic fertilizers is very important here, since we are developing a detailed ontology of fertilizers. Hence this distinction leads to the subclasses NitrogenousFertilizer, PotassicFertilizer and PhosphoricFertilizer.
Similarly, deciding whether a particular concept should be a class or an individual instance of a class depends on the possible application of the ontology and the scope of the domain. Since the ontology that we are building is a generic ontology in the domain of agriculture with the fertilizer subdomain as the case study, it should have enough coverage of the information in the fertilizer subdomain. In order to decide the level of granularity, i.e. the possible terms which will act as the individuals, we need to go back to the competency questions identified in step 1. The most specific terms that answer these questions are the candidates for individuals.
Individual instances are the most specific concepts represented in a knowledge base.
From the list created in step 3, we select certain concepts we are familiar with and then specialize and generalise them appropriately. For example, we start with a top-level concept Fertilizer, and a more specific concept AmideFertilizer. And then we can relate them to a middle-level concept, NitrogenousFertilizer.
Taxonomy of the concept Fertilizer.
Fertilizer is the most general concept. Biofertilizer, ChemicalFertilizer, etc. are the general top level concepts.
AmideFertilizer, AmmonicalFertilizer, etc are the bottom level concepts.
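The three levels of this taxonomy can be sketched as a parent map in Python. This is a minimal illustration, assuming that AmideFertilizer is a direct subclass of NitrogenousFertilizer; the `ancestors` and `is_subclass` helpers are hypothetical names and only a few of the classes are shown.

```python
# Child class -> direct superclass, for a small part of the taxonomy.
SUBCLASS_OF = {
    "Biofertilizer": "Fertilizer",
    "ChemicalFertilizer": "Fertilizer",
    "NitrogenousFertilizer": "ChemicalFertilizer",
    "AmideFertilizer": "NitrogenousFertilizer",  # assumed direct subclass
}


def ancestors(cls, subclass_of=SUBCLASS_OF):
    """Return the superclasses of cls, from most specific to most general."""
    chain = []
    while cls in subclass_of:
        cls = subclass_of[cls]
        chain.append(cls)
    return chain


def is_subclass(child, parent, subclass_of=SUBCLASS_OF):
    """True if parent is reachable from child by following subclass links."""
    return parent in ancestors(child, subclass_of)
```

Walking upward from the bottom-level concept passes through the middle-level concept before reaching the top: `ancestors("AmideFertilizer")` gives `["NitrogenousFertilizer", "ChemicalFertilizer", "Fertilizer"]`.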
Step 5: Define the properties of classes-slots
In order to answer the competency questions defined in step 1, we must describe properties of the classes.
There are two types of properties:
Data Properties: These are the properties that relate individuals to a user-defined value. The data property hierarchy displays the asserted and inferred hierarchy.
Object Properties: These are the properties that relate the individuals of a class to individuals of another class. The object property characteristics are functional, inverse functional, transitive, symmetric, asymmetric, reflexive and irreflexive.
From the terms that we have enumerated in step 3, once the classes have been selected, most of the remaining terms will be properties of these classes. For example, for the term Fertilizer, the properties would be a fertilizer’s nutrient content, application method, application time, etc.
Data properties.
Object properties.
Step 6: Define the facets of the slots
Defining facets of the slots is nothing but defining constraints of the properties. Properties can have different constraints such as allowed values, value type, cardinality (number of values) and other features.
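A facet check of this kind can be sketched in Python as follows. The facet values in `FACETS` (value type and cardinality bounds) are assumptions made for illustration and are not taken from FertOnt; `check_facets` is a hypothetical helper.

```python
# property -> (value type, minimum cardinality, maximum cardinality or None)
# These constraints are illustrative assumptions, not FertOnt's actual facets.
FACETS = {
    "hasApplicationTime": (str, 1, 1),    # assumed: exactly one value
    "canBeMixedWith": (str, 0, None),     # assumed: any number of values
}


def check_facets(values, facets=FACETS):
    """Return a list of facet violations for one individual's property values."""
    problems = []
    for prop, (value_type, min_card, max_card) in facets.items():
        found = values.get(prop, [])
        if len(found) < min_card:
            problems.append(f"{prop}: expected at least {min_card} value(s), found {len(found)}")
        if max_card is not None and len(found) > max_card:
            problems.append(f"{prop}: expected at most {max_card} value(s), found {len(found)}")
        for value in found:
            if not isinstance(value, value_type):
                problems.append(f"{prop}: {value!r} is not of type {value_type.__name__}")
    return problems


basic_slag = {
    "hasApplicationTime": ["WellBeforeSowingTheCrop"],
    "canBeMixedWith": ["RockPhosphate", "PotassiumSulphate"],
}
```

Under these assumed facets, `check_facets(basic_slag)` reports no violations, while an individual with no application time and a non-string mixing partner is reported twice.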
Step 7: Create instances
Creating instances of classes is the last step. We select a class, create an individual of that class and fill in the values of its properties. For example, we create an individual instance BasicSlag of the class PhosphoricFertilizer. A few properties of this instance are listed below:
canBeMixedWith: RockPhosphate
canBeMixedWith: PotassiumSulphate
hasApplicationTime: WellBeforeSowingTheCrop
hasApplicationMethod: BroadcastingAtPlanting
containsPInTheFormOf: DicalciumPhosphateForm
containsPInTheFormOf: CitricAcidSolublePhosphoricAcidForm
hasPercentP: 3–8
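The BasicSlag individual above can be written down as a small Python knowledge base and queried for two of the competency questions from step 1. This is a sketch only: the dictionary layout and the helpers `values_of` and `individuals_with` are hypothetical, and the property values are exactly those listed above.

```python
KB = {
    "BasicSlag": {
        "type": "PhosphoricFertilizer",
        "canBeMixedWith": ["RockPhosphate", "PotassiumSulphate"],
        "hasApplicationTime": ["WellBeforeSowingTheCrop"],
        "hasApplicationMethod": ["BroadcastingAtPlanting"],
        "containsPInTheFormOf": [
            "DicalciumPhosphateForm",
            "CitricAcidSolublePhosphoricAcidForm",
        ],
        "hasPercentP": ["3–8"],
    },
}


def values_of(kb, individual, prop):
    """All values asserted for one property of one individual."""
    return kb[individual].get(prop, [])


def individuals_with(kb, cls, prop, value):
    """Individuals of class cls that have the given value for prop."""
    return sorted(name for name, data in kb.items()
                  if data["type"] == cls and value in data.get(prop, []))
```

"What is the time of application of basic slag?" becomes `values_of(KB, "BasicSlag", "hasApplicationTime")`, and "List the phosphatic fertilizers that contain phosphorus in the form of dicalcium phosphate" becomes `individuals_with(KB, "PhosphoricFertilizer", "containsPInTheFormOf", "DicalciumPhosphateForm")`.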
Keeping these rules in mind, we have defined our class hierarchy as shown in Fig. 6.
Fertilizer class hierarchy.
Defining properties with more details is also an important part of the ontology building process. Protégé provides several ways to define object properties:
Inverse property.
This represents bidirectional relationships.
Example of inverse property.
Symmetric property.
This property defines relationships that are symmetric in nature. For example, rock phosphate can be mixed with basic slag implies that basic slag can be mixed with rock phosphate.
Example of symmetric property.
In both the examples, the blue lines show the inferred relationships.
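The kind of inference shown by the blue lines can be approximated with a short Python sketch. It is not the reasoner used in the paper; it applies the symmetric and inverse rules in a single pass. The property name `isApplicationMethodOf` is a hypothetical inverse introduced for illustration, and `infer` is a hypothetical helper.

```python
ASSERTED = {
    ("RockPhosphate", "canBeMixedWith", "BasicSlag"),
    ("BasicSlag", "hasApplicationMethod", "BroadcastingAtPlanting"),
}
SYMMETRIC = {"canBeMixedWith"}
# Inverse pairs are listed in both directions; the second name is hypothetical.
INVERSE_OF = {
    "hasApplicationMethod": "isApplicationMethodOf",
    "isApplicationMethodOf": "hasApplicationMethod",
}


def infer(asserted, symmetric, inverse_of):
    """Return triples implied by symmetric and inverse property axioms."""
    inferred = set()
    for subj, prop, obj in asserted:
        if prop in symmetric:
            inferred.add((obj, prop, subj))              # p(a, b) implies p(b, a)
        if prop in inverse_of:
            inferred.add((obj, inverse_of[prop], subj))  # p(a, b) implies q(b, a)
    return inferred - set(asserted)
```

From the single assertion that rock phosphate can be mixed with basic slag, the sketch derives that basic slag can be mixed with rock phosphate, mirroring the symmetric example in the text.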
Utmost care has been taken in deciding the key components of the ontology. We have been in continuous touch with domain experts. After the classes, subclasses and other restrictions are identified, we need to use some formal representation mechanism to define the competency questions. We have made use of First Order Logic to represent the competency questions. Below we have shown two examples from Section 4.1.
Competency Question: Which chemical fertilizers can be mixed physically and used as mixed fertilizers? (
The implemented ontology has been made available at the following link for reference
Directory Link:
File Link:
Ontology validation is a key activity in different ontology engineering scenarios such as development and selection. In this study, two aspects need to be validated; the correctness or appropriateness of the contents of the ontology and the correctness of the construction of the ontology. After designing the ontology, the contents of the ontology need to be validated by domain experts against the users’ requirements. Validation is the process of checking whether or not a certain design is appropriate for its purpose; meets all the constraints; and will perform as expected. Since the ontology is built from scratch, knowledge required for the fertilizers ontology development has been gathered from the following reliable sources:
Domain experts from the NCAP, ISRI, New Delhi; Research journals and papers, Fertilizer use by crop in India (2005); Text books (A handbook of Fertilizer, Soil and Manure; Agricultural Handbook); Fertilizer Statistics 2011–12, prepared by The Fertilizer Association of India; Online data sources from Authoritative Organisations (the Ministry of Agriculture – India); Mass media (newspaper, television, radio).
Expert response for some of the criteria
The aim behind this work is to make it available in real time for researchers, scientists and farmers. The correctness of the work has to be ensured before making it publicly available. The domain experts have validated the components of the ontology. Based on expert’s advice, ontology has been refined.
The Delphi method is a research technique that is used to obtain responses to a problem from a group of domain experts [22]. Since experts’ opinions are a source of information available for this purpose to clarify the complex real situations in the domain of agriculture, the Delphi method is selected to obtain expert advice and responses to validate the content of the ontology. This technique is used by the experts to suggest and confirm the criteria. The end product of this technique is a consensus among the experts by use of statistical information and includes their commentaries on each of the questionnaire items, organized as a written document. However, while going through the literature, it was observed that the response rate is very low to the questionnaire using this method and this method was not very effective to validate the constraints (criteria).
The modified Delphi method: When designing the ontology, the ontology developers have to decide how to represent complex real situations of the domain. Therefore, a consensus among the group members (domain experts) needs to be reached on each decision made by the developers to represent a complex situation, in order to validate assumptions and/or judgments. However, the Delphi method had some limitations for this purpose. To encourage more dialogue and active collaboration among the participants in the Delphi group, we arranged a discussion based on the modified Delphi method to validate the assumptions and/or judgments. The Delphi method was adapted for use in a face-to-face group meeting, allowing group discussions. This technique is very effective in generating a large quantity of creative new ideas. It was designed to allow every member of the group to express their ideas and to minimize the influence of other participants [23]. The modified Delphi method can be used effectively when many ideas need to be generated, when all members should participate freely without influence from other participants, and when priorities must be identified or a few alternatives selected for further examination. The use of consensus is common to both techniques.
Examples of patterns defined to detect pitfalls.
Screenshot displaying anomalies in FertOnt.
We sat with domain experts from Indian Agriculture Research Institute. First, the problems at hand were explained in detail to obtain the experts’ knowledge. Based on their responses, comments, and suggestions, we made judgments about the validity of the design criteria and the assumptions made during the design process. The feedback and comments on the questionnaire were analyzed. Table 1 shows some responses received from the experts. The contents of the ontology have been refined based on this feedback and these comments.
Ontology Validation using OOPS: We have also used a tool named OOPS! [21] to validate the ontology. It is a web-based tool used to detect anomalies in ontologies and is independent of any implementation language used for the ontology development. The tool tries to find as many anomalies as possible in ontologies. It is able to identify as many as 34 common pitfalls that appear while developing ontologies [21]. The pitfalls are related to the following dimensions:
Human understanding (P2, P7, P8, P11, P12, P13, P19, P20, and P22); Logical consistency (P5, P6, P19, P27, P28, and P29); Real world representation (P10); and Modelling issues (P2, P3, P4, P5, P6, P7, P10, P11, P12, P13, P19, P21, P24, P25, P26, P27, P28, and P29).
Fig. 9 shows some of the patterns used to detect pitfalls within OOPS!. For example, the pattern used to detect P5 (Defining wrong inverse relationships) consists of pairs of relationships that are defined as inverse but where the domain of one of them is not equal to the range of the other.
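The P5 pattern as described in the text can be expressed in a few lines of Python. This sketch only mirrors that description and makes no claim about how OOPS! implements the check. The `ObjectProperty` record, the `wrong_inverse_pairs` helper, the class ApplicationTime and all domains, ranges and inverse names are hypothetical; the last property is deliberately wrong so that the check has something to find.

```python
from typing import NamedTuple, Optional


class ObjectProperty(NamedTuple):
    name: str
    domain_cls: str
    range_cls: str
    inverse: Optional[str] = None


def wrong_inverse_pairs(properties):
    """Pairs declared inverse whose domain and range do not mirror each other."""
    by_name = {p.name: p for p in properties}
    flagged = set()
    for p in properties:
        q = by_name.get(p.inverse) if p.inverse else None
        if q is None:
            continue
        if p.domain_cls != q.range_cls or p.range_cls != q.domain_cls:
            flagged.add(tuple(sorted((p.name, q.name))))
    return sorted(flagged)


PROPERTIES = [
    ObjectProperty("hasApplicationMethod", "Fertilizer", "ApplicationMethod",
                   "isApplicationMethodOf"),
    ObjectProperty("isApplicationMethodOf", "ApplicationMethod", "Fertilizer"),
    ObjectProperty("hasApplicationTime", "Fertilizer", "ApplicationTime",
                   "isApplicationTimeOf"),
    # Deliberately wrong: the domain should be ApplicationTime.
    ObjectProperty("isApplicationTimeOf", "Fertilizer", "Fertilizer"),
]
```

Only the second pair is flagged, because the range of hasApplicationTime differs from the domain of its declared inverse.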
The source code of FertOnt is loaded into the tool and validated. The ontology is analyzed and the results are displayed, with the anomalies highlighted in grey. In the first run, a few anomalies were detected, as shown in Fig. 10.
Some pitfalls affect individual elements in the ontology, others affect more than one element, and some do not affect particular ontology elements but the whole ontology. As shown in Fig. 10, several pitfalls were detected, and a suggestion is also given.
P07 indicates that different concepts are merged in the same class.
P08 (Missing annotations) affects individual ontology elements. In this case, the output is grouped by (a) elements that have neither rdfs:label nor rdfs:comment defined and (b) elements that have no rdfs:comment.
P13 is the missing inverse relationship pitfall, which affects more than one ontology element.
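The grouping reported for P08 and the idea behind P13 can be sketched in Python as below. This is an approximation based only on the descriptions above, not the tool's own logic; the helper names, the annotation texts and the property `isApplicationMethodOf` are hypothetical.

```python
def missing_annotations(annotations):
    """Group elements as in the P08 output: (a) neither label nor comment, (b) no comment."""
    neither, no_comment = [], []
    for name in sorted(annotations):
        has_label = bool(annotations[name].get("label"))
        has_comment = bool(annotations[name].get("comment"))
        if not has_label and not has_comment:
            neither.append(name)
        elif not has_comment:
            no_comment.append(name)
    return {"neither_label_nor_comment": neither, "no_comment": no_comment}


def missing_inverses(object_properties, inverse_of):
    """Object properties with no declared inverse (a simplified reading of P13)."""
    return sorted(p for p in object_properties if p not in inverse_of)


# Hypothetical annotation state for three ontology elements.
ANNOTATIONS = {
    "Fertilizer": {"label": "Fertilizer", "comment": "Top-level concept."},
    "AmideFertilizer": {"label": "Amide fertilizer"},
    "canBeMixedWith": {},
}
OBJECT_PROPERTIES = ["hasApplicationMethod", "isApplicationMethodOf", "containsPInTheFormOf"]
INVERSE_OF = {
    "hasApplicationMethod": "isApplicationMethodOf",
    "isApplicationMethodOf": "hasApplicationMethod",
}
```

With this input, canBeMixedWith falls in group (a), AmideFertilizer in group (b), and containsPInTheFormOf is reported as lacking an inverse.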
Once the pitfalls were identified, the respective changes were made so that FertOnt conforms to the best ontology modelling practices. In this study, we have identified 100 concepts, 40 object properties and 38 data properties. We have also reasoned over our ontology using the Pellet reasoner and shown how knowledge can be inferred from asserted data. We have also answered the competency questions by querying the knowledge base built on our ontology using the SPARQL and DL query languages.
Currently, there is no ontology available for fertilizers. Our motive for this work is to develop a generic ontology for fertilizers that can fit into a generic framework for answering users’ queries. [12] has been studied to learn about the common pitfalls in ontology development. It is further possible to integrate our ontology with vocabularies such as AGROVOC or any other agriculture-based taxonomy, which will enhance the extensibility of our work. We have explained in detail the overall process and the different concerns involved in the design and development of an ontology in the agriculture domain. We have evaluated the ontology internally and resolved the problems encountered in consultation with agriculture scientists. We further wish to integrate our ontology with soil and crop ontologies so that we are able to extract more specific information. We have received valuable suggestions from domain experts and will further refine the ontology. The ontology can also be extended with dynamic information such as market prices, consumer behavior, information about places to buy and sell fertilizer products, etc., which will increase its practical utility. Nowadays, there are many government schemes and policies for the benefit of farmers that make use of up-to-date information. Incorporating that information could help farmers increase productivity.
Acknowledgments
We have been in touch with scientists from Indian Agriculture Research Institute (IARI). We thank Dr. Rajni Jain, scientist NCAP (IARI) and Ms. Pavithra, Scientist NCAP (IARI) for giving valuable suggestions on the kind of information that should be included in the ontology.
