A knowledge-based decision support system for inferring supportive treatment recommendations for diabetes mellitus

Abstract

BACKGROUND:

Diabetes Mellitus (DM) is a significant health risk and a major cause of blindness, kidney failure, heart attack, stroke, and lower limb amputation. A Clinical Decision Support System (CDSS) can assist healthcare practitioners in their daily work, improve the quality of healthcare provided to DM patients, and save time.

OBJECTIVE:

In this study, a CDSS that can predict DM risk at an early stage has been developed for use by health professionals, general practitioners, hospital clinicians, health educators, and other primary care clinicians. The CDSS infers a set of personalized and suitable supportive treatment suggestions for patients.

METHODS:

Demographic data (e.g., age, gender, habits), body measurements (e.g., weight, height, waist circumference), comorbid conditions (e.g., autoimmune disease, heart failure), and laboratory data (e.g., IFG, IGT, OGTT, HbA1c) were collected from patients during clinical examinations and used, through the ontology reasoning ability of the tool, to deduce a DM risk score and a set of personalized and suitable suggestions for the patients. The OWL ontology language, SWRL rule language, Java programming language, Protégé ontology editor, and the SWRL API and OWL API tools, all well-known Semantic Web and ontology engineering tools, were used to develop the ontology reasoning module that deduces a set of appropriate suggestions for the patient being evaluated.

RESULTS:

After the first round of tests, the consistency of the tool was 96.5%. At the end of the second round of tests, following some necessary rule changes and ontology revisions, the performance reached 100.0%. The developed semantic medical rules currently predict only Type 1 and Type 2 DM in adults; they do not yet make DM risk assessments or deduce suggestions for pediatric patients.

CONCLUSION:

The results obtained are promising in demonstrating the applicability, effectiveness, and efficiency of the tool. By raising public awareness of DM risk, it can help ensure that necessary precautions are taken in advance.

Keywords

Health informatics, diabetes mellitus, knowledge bases, clinical decision support systems, smart healthcare

1. Introduction

Diabetes Mellitus (DM) is a disease characterized by elevated blood glucose levels that occurs when the pancreas does not produce enough insulin hormone or when the insulin hormone cannot be used effectively [1]. DM can be caused by many factors, both genetic and environmental, such as age, gender, obesity, sedentary lifestyle, and smoking [1].

Diabetes comprises many disorders characterized by hyperglycemia. According to the clinical classification, there are two major types of DM: Type 1 and Type 2 diabetes. The distinction between the two types has historically been based on age at onset, degree of loss of $\beta$ cell function, degree of insulin resistance, presence of diabetes-associated autoantibodies, and requirement for insulin treatment for survival [1]. However, none of these characteristics unequivocally distinguishes one type of DM from the other, nor accounts for the entire spectrum of DM phenotypes [2]. Although Type 1 is more common at younger ages, young people and even children can now develop Type 2 owing to increasing childhood obesity. DM is therefore a crucial public health problem [2]. The International Diabetes Federation (IDF) stated that over 463 million people live with diabetes [2]. This number is expected to reach 578 million by 2030 and 700 million by 2045 [3].

The term diabetes describes a group of metabolic disorders characterized and identified by the presence of hyperglycemia in the absence of treatment [4]. The long-term specific effects of diabetes include retinopathy, nephropathy, and neuropathy, among other complications. People with diabetes are also at increased risk of other diseases including heart, peripheral arterial and cerebrovascular disease, obesity, cataracts, erectile dysfunction, and non-alcoholic fatty liver disease. They are also at increased risk of some infectious diseases, such as tuberculosis [4]. Many people who do not show the symptoms of DM have not yet been diagnosed, and they continue their lives without being aware of their risk. Medical diagnosis, the process by which physicians or health practitioners assess their patients’ risk based on observed symptoms and laboratory test results, is therefore essential.

With the evolution of technology, the applications of Decision Support Systems (DSSs) have evolved significantly since the early 1970s. Numerous technological and organizational developments have had an impact on this evolution. A DSS combines data and information from different fields and sources to provide users with the information they need, and thus aims to help individuals make informed decisions. It brings together human knowledge and technology to assist and improve decision-making [5]. In the medical field, DSSs have become popular computer-aided tools, especially in medical diagnosis [6].

Furthermore, a Clinical Decision Support System (CDSS) enhances the effectiveness of health systems while saving costs, eliminating duplicate medical tests, eliminating possible risks, and recommending less costly but equally effective treatments [6]. A CDSS applies a domain knowledge base and a set of predefined medical rules to a patient’s data to produce a diagnostic result for physicians or health practitioners [6, 7]. Therefore, physicians or health practitioners using a CDSS can obtain a second opinion in the diagnosis and treatment of diseases, avoiding or minimizing medical errors by using data collected from patients’ past and current medical conditions [6].

In this study, an ontology-based CDSS is developed as a clinical tool for healthcare professionals, general practitioners, hospital-based clinicians, health educators, and other primary care clinicians. The aim of the proposed system is to assist health practitioners and professionals, especially in primary health care, preventive health practices, and early diagnosis. First, the system takes the necessary information about a patient (e.g., patient profile, existing medical conditions, certain blood laboratory test results); it then calculates a risk score, attempts to predict the DM risk at an early stage, and deduces a set of personalized supportive treatment suggestions. For the inferencing mechanism of the system, an ontology knowledge base is built around medical knowledge in the form of rules and extensive conceptual information about diagnosing DM risk. Thanks to the semantic rules in the ontology, the system can offer a set of personalized supportive treatment suggestions to healthcare practitioners or professionals before they make any decision for a particular patient.

The remainder of the paper is structured as follows: Section 2 covers the background on recent methodologies used in diagnosing DM risk. Section 3 explores the technologies used in developing DSSs and the similar studies published in the literature. Section 4 presents the components and features of the proposed CDSS. Section 5 presents the development of the ontology knowledge base and semantic medical rules of the system. Section 6 presents a case study, and Section 7 discusses the evaluation of the experimental results and the contributions of the system. Finally, Section 8 presents the summary and conclusions.

2. Background

2.1 Diabetes mellitus

DM is a metabolic disease characterized by hyperglycemia due to the inability of the pancreas to produce enough insulin hormone. DM can be classified in different ways depending on its type and the patient's condition [8]. In the clinical classification of DM, there are two major types: (1) Type 1 diabetes (T1DM) and (2) Type 2 diabetes (T2DM). In addition, two further forms can be observed depending on the physical or medical conditions of individuals: (3) Prediabetes and (4) Gestational Diabetes.

(1) T1DM is a type of DM, which occurs when the pancreas cannot produce enough insulin in blood. Genetic and environmental factors are generally effective in the development of T1DM, but many risk factors have not yet been identified and remain a big challenge.

(2) T2DM is another type of DM, in which insulin is produced but is defective and cannot transport glucose into cells. T2DM has more risk factors than T1DM. In general, factors such as age, obesity, family history of diabetes, history of gestational diabetes, impaired glucose metabolism, physical inactivity, and race/ethnicity are highly associated with T2DM. According to the literature, T2DM accounts for approximately 90.0% to 95.0% of all diabetes cases diagnosed in adulthood [2, 8].

(3) Prediabetes is a condition in which individuals have high blood sugar or hemoglobin A1C levels but not high enough to be classified as DM. People with prediabetes have a higher risk of developing T2DM over time [8].

(4) Gestational Diabetes is diagnosed in the second or third trimester of pregnancy. It is a type of glucose intolerance that usually disappears once the pregnancy is over. Between 5.0% and 10.0% of women with gestational diabetes continue to have high blood sugar levels, which often develop into T2DM. In addition, the children of these women are at risk of developing obesity and diabetes later in life [2, 8].

2.2 Diagnosing DM in asymptomatic adults

Many people at risk of DM do not show any symptoms in the early stages. Periodic medical screening is therefore necessary, especially in asymptomatic cases with the following risk factors:

(1) Testing should be considered in adults with overweight or obesity, defined by Body Mass Index (BMI) [9] as BMI $\geqslant$ 25 kg/m ${}^{2}$ (or $\geqslant$ 23 kg/m ${}^{2}$ in Asian Americans), who have one or more of the following risk factors [10]:

• First-degree relative with diabetes,
• High-risk race/ethnicity (e.g., African American, Latino, Native American, Asian American, Pacific Islander),
• Presence of cardiovascular disease,
• Hypertension ( $\geqslant$ 140/90 mmHg or on therapy for hypertension),
• HDL cholesterol level $<$ 35 mg/dL (0.90 mmol/L) and/or a triglyceride level $>$ 250 mg/dL (2.82 mmol/L),
• Women with polycystic ovary syndrome,
• Physical inactivity,
• Other clinical findings associated with insulin resistance (e.g., severe obesity, acanthosis nigricans).

(2) Patients with prediabetes {A1C $\geqslant$ 5.7% [39 mmol/mol], Impaired Glucose Tolerance, or Impaired Fasting Glucose} should be tested yearly [10].

(3) Women who were diagnosed with gestational DM should have lifelong testing at least every 3 years [10].

(4) For all other patients, testing should begin at age 35 [10].

(5) If results are normal, testing should be repeated at a minimum of 3-year intervals, with consideration of more frequent testing depending on initial results and risk status [10].

(6) People with HIV [10].

If a person has neither an elevated BMI nor any of the factors given above, regular screening should begin after the age of 45. If results are normal, screening should be repeated at least every 3 years [10, 11]. DM can also be screened for using the medical blood tests detailed below.

Table 1
Comparison of T1DM and T2DM

Specifications	T1DM	T2DM
Starting age	Usually $\leqslant$ 30	Usually $\geqslant$ 30
Starting way	Usually acute	Slow, asymptomatic
Ketosis	Often	Not often
Starting weight	Usually thin	Usually overweight
Diabetes in family	Not obvious	Intense
C-Peptide	Low	Normal/high/low
ICA/Anti GAD/IA2Ab/IAA	Usually positive	Negative
Autoimmune	Yes	No

2.3 Diagnosing DM via laboratory medicine

As mentioned before, DM falls into two major categories: T1DM and T2DM [2, 8]. T1DM and T2DM may cause different symptoms in the human body. The comparison of T1DM and T2DM is detailed in Table 1 [10, 11].

• T1DM is caused by an immune attack in which the body’s immune system destroys the pancreas’ insulin-producing cells. The body generates little or no insulin. People with T1DM need daily insulin injections to keep their glucose levels within the normal range and survive. However, with daily insulin dosing, periodic blood glucose testing, knowledge, and care, they may live normally or reduce the consequences of DM [11].
• Hyperglycemia in T2DM is caused by the failure of the body’s cells to respond sufficiently to insulin, which is known as insulin resistance. In that case, the hormone is ineffective, which prompts a rise in insulin levels, and the pancreas eventually cannot produce enough insulin to meet the cells’ demand. T2DM typically occurs in adults, usually above 30 years of age. Recently, it has also been seen in children and young adults due to increased obesity and insufficient daily physical activity.
• Prediabetes occurs when the blood sugar level is higher than normal but not yet high enough to be classified as T2DM.

DM can be diagnosed in a variety of ways. The major tests used for diagnosing DM are (1) Fasting Plasma Glucose (FPG), (2) Random Plasma Glucose (RPG), (3) Oral Glucose Tolerance Test (OGTT), and (4) HbA1c. Meeting only one of these criteria is sufficient to diagnose DM [1, 2, 11].

Table 2
The limit values of FPG, RPG, OGTT, and HbA1c in diagnosing DM

Test	Limit value
Fasting Plasma Glucose (FPG)	$\geqslant$ 126 mg/dl
Random Plasma Glucose (RPG)	$\geqslant$ 200 mg/dl
Oral Glucose Tolerance Test (OGTT)	$\geqslant$ 200 mg/dl
HbA1c	$\geqslant$ 6.5%

The World Health Organization (WHO) recommends using the HbA1c test in diagnosing DM [4]. OGTT and HbA1c, which are widely used in the diagnosis of DM, have no superiority over each other in terms of diagnostic value [2, 12]. The limit values of these tests are shown in Table 2.

As mentioned before, prediabetes is a condition defined as blood sugar levels above normal but below the defined diabetes threshold [2, 12]. The same tests (given in Table 2) are also used to identify individuals with prediabetes. Furthermore, two other findings can be checked to diagnose prediabetes: Impaired Fasting Glucose (IFG) and Impaired Glucose Tolerance (IGT). WHO and IDF currently propose a 2-hour OGTT for the testing of IGT and IFG [2]. Physicians may identify prediabetes as IFG or IGT based on the test used to diagnose it. The ranges that define prediabetes according to IFG, IGT, and HbA1c values are shown in Table 3 [1, 2, 10, 11]. Individuals diagnosed with prediabetes are potential DM patients.

Table 3
The ranges for diagnosing “prediabetes”

	Plasma glucose (PG)
Under risk	Fasting (mg/dl)	Satiety (mg/dl)
Impaired Fasting Glucose (IFG)	100–125
Impaired Glucose Tolerance (IGT)		140–199
HbA1c	5.7%–6.4%

For decades, determining whether a person has DM, prediabetes, or neither has been based on glucose criteria, either FPG or the 75-g OGTT [1]. When the HbA1c test is applied, the patient can be said to have DM if the HbA1c value is 6.5% or higher. When it is between 5.7% and 6.4%, the patient is classified as having prediabetes, and when it is less than 5.7%, the result is considered normal.

Another common test is FPG and if its result is 126 mg/dl or higher, it is possible to say that the patient has a risk of DM. If the FPG result is less than 100 mg/dl and the patient has no symptoms, then it is called normal. Otherwise, a 2-hour plasma glucose test should be performed in order to detect IGT and/or IFG for prediabetes. The general algorithm based on FPG in diagnosing DM is shown in Fig. 1.
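The HbA1c and FPG thresholds described above can be expressed as a small classification sketch in Python. It is an illustration only, not the rule set of the system presented in this paper; the function names `classify_hba1c` and `classify_fpg` and the returned labels are assumptions made here. The cut-off values are those given in Tables 2 and 3.

```python
def classify_hba1c(hba1c_percent: float) -> str:
    """Classify an HbA1c result (in %) using the limits in Tables 2 and 3."""
    if hba1c_percent >= 6.5:
        return "diabetes"
    if hba1c_percent >= 5.7:
        return "prediabetes"
    return "normal"


def classify_fpg(fpg_mg_dl: float) -> str:
    """Classify a Fasting Plasma Glucose result (in mg/dl).

    Values of 100-125 mg/dl fall in the IFG range; as the text notes,
    a 2-hour plasma glucose test is then needed to confirm IGT and/or IFG.
    Symptoms are not modelled in this sketch.
    """
    if fpg_mg_dl >= 126:
        return "diabetes"
    if fpg_mg_dl >= 100:
        return "prediabetes range (2-hour test advised)"
    return "normal"


if __name__ == "__main__":
    for value in (5.2, 6.0, 7.1):
        print(value, classify_hba1c(value))
    for value in (92, 110, 130):
        print(value, classify_fpg(value))
```

Because a single criterion is sufficient for diagnosis, a combined check would report DM as soon as any one test falls in the diabetes range.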

Figure 1.
Diagnosis algorithm of DM by IDF [2].

2.4 Diagnosing DM by lifestyle

Prospective studies have revealed that lifestyle behaviors such as poor diet, alcohol consumption, insufficient physical activity, a sedentary life, and smoking are highly associated with DM [13]. To evaluate lifestyle-related risk factors and assess the prevalence of glucose metabolism disorders, a risk assessment questionnaire is used in DM evaluation [14]. In this questionnaire, each response option has a different weight in the assessment model. At the end of the assessment, a total risk score is calculated, which gives the person’s risk of developing T2DM within 10 years [14]. This questionnaire was used to determine a person’s DM risk score in the CDSS developed in our study, and the questions with their options are given in Appendix A [14]. If the total score is lower than 7, the risk of having DM within the next 10 years is low. If it is between 7 and 11, the patient has a slightly elevated DM risk. If it is higher than 11 but less than 15, the risk is moderate. If it is between 15 and 20, the risk is high. If the score is higher than 20, the patient is at very high risk of having DM within the next 10 years [14].
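The mapping from total questionnaire score to risk category can be sketched as follows. This Python fragment is a minimal illustration under the score bands stated above; the function name `risk_category` and the category labels are assumptions introduced here, and the calculation of the score itself (the weighted questionnaire in Appendix A) is not reproduced.

```python
def risk_category(total_score: float) -> str:
    """Map a total questionnaire score to a 10-year DM risk category.

    Bands follow the text: <7 low, 7-11 slightly elevated,
    >11 and <15 moderate, 15-20 high, >20 very high.
    """
    if total_score < 0:
        raise ValueError("score must be non-negative")
    if total_score < 7:
        return "low"
    if total_score <= 11:
        return "slightly elevated"
    if total_score < 15:
        return "moderate"
    if total_score <= 20:
        return "high"
    return "very high"


if __name__ == "__main__":
    for score in (3, 9, 13, 18, 23):
        print(score, risk_category(score))
```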

In conclusion, regular medical check-ups, sustained self-care, and sufficient knowledge are required to prevent the severe complications of DM and reduce its long-term effects. Every patient with T2DM should have access to a well-structured lifestyle change program that helps achieve and maintain a BMI in the appropriate range, an appropriate physical activity routine, and control of cardiovascular risk factors as well as glycemic index. The proposed CDSS, a health information technology tool, can be used to ease and manage these activities and to assess long-term complications over time. With the help of a systematic computerized CDSS tool, it is possible to diagnose and predict DM for a patient based on lifestyle and/or blood test results.

3. Methodology and related works

3.1 Technologies used in developing of DSS

A knowledge base is a collection of information that can be used in decision-making tasks. An expert system, also known as a knowledge-based system, is a computer program that embodies the knowledge and analytical skills of one or more human experts in a specific problem domain (e.g., medical diagnosis) [15]. As mentioned before, the DSS concept is broad, covering various areas of assisting people with decision-making and offering automated intelligent help when needed. A DSS, for example, can assist a physician in deciding which prescription to recommend based on a patient’s past health records and a drug company repository, whereas a recommender system can suggest similar products by evaluating previous user behaviors and immediately making recommendations from a product database [5].

By using knowledge bases together with Information Retrieval and Natural Language Processing (NLP) methods, a DSS can be developed that is smarter and more effective in decision-making tasks. The Semantic Web is an extension of the existing web and is also a sub-branch of artificial intelligence [16]. It promotes improved collaboration between user and computer and lets users find answers to their questions through ontologies. An ontology is the systematic representation of domain knowledge in terms of classes (or sets), attributes (or properties), and relationships (or relations among class members) [17]. It is used for sharing knowledge. In an ontology, a class represents a set such as patients, symptoms, or diseases. An attribute refers to a relationship between classes or sets. An instance represents an element of a class.

Many ontological languages, such as Resource Description Framework Schema (RDFS) and Web Ontology Language (OWL), have recently been suggested and standardized by W3C for modelling ontologies [18, 19]. OWL is a family of knowledge representation languages used for reasoning on ontologies, according to the World Wide Web Consortium (W3C). OWL can be categorized into three species or sub-languages: OWL Lite, OWL DL and OWL Full [16].

The rule language of the Semantic Web is intended to be the Semantic Web Rule Language (SWRL) [20, 21]. SWRL is used to define rules as well as logic, combining OWL DL or OWL Lite with a subset of the Rule Markup Language. The rules are used to infer new knowledge from existing knowledge in an OWL knowledge base.

In this study, a CDSS based on ontology knowledge was developed as a clinical tool, consisting of the following components: (1) a database, (2) an ontology knowledge base, (3) semantic medical rules, (4) an inference engine module, and (5) portal user interface.

• The database is a collection of patients’ data and domain-related information.
• The ontology knowledge base of the system is developed using OWL and comprises domain information on DM and its semantic medical rules for the diagnosis and treatment of DM.
• The semantic medical rules represent the knowledge of healthcare professionals who specialize in DM management and the directives in Clinical Practice Guidelines, which are conditional statements systematically developed to assist health practitioners/professionals in making decisions about appropriate health care under certain circumstances [10, 22].
• The reasoning component is an Inference Engine, a Java module that sends the data collected for a patient to the system ontology and executes the proper semantic rules on the patient’s instance (e.g., observed symptoms and signs, physical activities, risk factors, and blood test results) to deduce a set of personalized and supportive treatment suggestions and obtain a DM risk score.
• The portal user interface manages all operations on the patient data, provides access to the ontology knowledge base for inferencing tasks, and maintains the entire relationship between a user and the system.

3.2 Related works

There have been similar efforts on CDSSs that use an ontology knowledge base and semantic medical rules [23, 24, 25, 26, 29, 30, 31, 33]. As in similar studies, we have developed an ontology to model key concepts and relationships of clinical practice in DM diagnosis to allow for clinical knowledge sharing, updating, and reuse. This section discusses similar studies in the literature on designing and developing ontology-based CDSSs for the diagnosis and management of DM and other risks.

Chen et al. proposed a recommendation system that suggests the most appropriate drugs for DM patients. They analyzed the attributes and side effects of 51 medicines to build an anti-diabetic drug ontology and a patient data ontology [24, 25]. They built 96 rules related to anti-diabetic drugs using SWRL, and the Java Expert System Shell (JESS) was used for reasoning to select the most suitable prescription for each patient [27]. Test data from twenty (20) patients were used to check the precision of the system and the rate of recommended drugs. The recommended medicines were derived from patient parameters such as HbA1c value, liver function, renal function, gastrointestinal dysfunction, heart failure, and hypoglycemia. Their system was evaluated by a physician. According to their results, the system was successful, since its recommendations agreed with the doctor’s requirements.

Mahmoud and Elbeh proposed an individualized recommendation system for T2DM medication [27]. They built two different ontologies, one for anti-diabetic drugs and one for patient knowledge. Their system provided the HbA1c target, the drug selection, and its dose. They used SWRL and the JESS inference engine. Their system was evaluated by doctors on 30 patient records. Precision was calculated as 100.0% for all test cases.

Sappagh and Elmogy designed a fuzzy case base ontology that could model different types of features such as text, ordinal, semantic, numerical, and fuzzy types [28]. Their model had 63 OWL classes, 54 object type properties, 138 data type properties, 105 fuzzy data types, and 60 cases. They applied several evaluation methodologies: consistency checking, comparison against a gold standard, user-based or criteria-based evaluation, lexical, vocabulary or data level evaluation, and vagueness evaluation. Their results showed that their system was accurate and consistent, and that it addressed the concepts and reasoning of diabetes mellitus diagnosis.

Sappagh et al. introduced the DM Treatment Ontology (DMTO) for the treatment of T2DM patients [29]. Their system evaluated patients’ current conditions, lab results, symptoms, and family and medical background. It also supported monitoring of previous treatment plans if a patient already had T2DM, produced treatment plans based on the patient’s drugs and complications, and offered advice on diet and exercise where necessary. Their ontology included more than 10700 classes, 277 relations, and 214 semantic rules. They asserted that DMTO could be used as a knowledge base to analyse the most appropriate medicines, foods, and exercises for T2DM patients.

Putra et al. developed a system that diagnoses DM risk by using a weighted ontology and a weighted tree [30]. Two ontologies were built on DM knowledge and classification. The JENA reasoner was used to implement the inferencing task. Their system performs similarity matching on patients’ data. Data from one hundred patients were used, and the system was evaluated by doctors. The consistency of their system was measured as 93.0%.

Based on information fusion and the construction of a context ontology, Chen et al. presented a personalized recommender system for antihypertensive medicines [31]. Their system can detect the users’ context in real time using wearable and medical smart sensors, and it gives dependable antihypertensive medicine prescriptions. They employed Semantic Web and ontology engineering tools to assess user interests. SWRL was used to develop the reasoning rules with which their system recommends medicines.

Table 4
Comparison of the proposed system with similar systems

System features	Ours	[24, 25]	[27]	[28]	[29]	[30]	[31]	[33]
Creating a patient profile	$+$	$-$	$+$	$+$	$+$	$+$	$+$	$+$
Evaluates family DM history	$+$	$-$	$-$	$+$	$+$	$-$	$-$	$+$
Evaluates symptoms and signs observed	$+$	$-$	$-$	$+$	$+$	$-$	$+$	$+$
Evaluates daily lifestyle and physical activities info	$+$	$-$	$-$	$-$	$-$	$+$	$-$	$+$
Evaluates medical conditions and other risk factors	$+$	$+$	$+$	$+$	$-$	$-$	$+$	$+$
Evaluates medical laboratory test results	$+$	Only 1 test	$-$	$+$	$+$	Only 1 test	$-$	$+$
Keeps currently or previously taken drugs	$-$	$-$	$-$	$-$	$+$	$-$	$-$	$+$
Predicts type of DM risk	$+$	$-$	$-$	$-$	$+$	$+$	$-$	$+$
Predicts probability of having DM in 10 years	$+$	$-$	$-$	$-$	$-$	$-$	$-$	$-$
Calculating BMI	$+$	$-$	$-$	$-$	$-$	$+$	$-$	$+$
Calculating ideal weight	$+$	$-$	$-$	$-$	$-$	$-$	$-$	$-$
Predicts DM & deduces personalized supportive treatment suggestions	$-$	$-$	$-$	$-$	$+$	$-$	$-$	$+$
Runs as a CDSS tool/application	$+$	$+$	$+$	$+$	$-$	$+$	$+$	$-$
Case-based inferencing via semantic rules on ontology	$+$	$-$	$-$	$-$	$-$	$-$	$-$	$+$
Querying the case base fuzzy ontology	$-$	$-$	$-$	$+$	$-$	$-$	$-$	$-$
Involves own ontology for anti-diabetic drugs	$-$	$+$	$+$	$-$	$-$	$-$	$-$	$+$
Recommends anti-diabetic drugs with its descriptions	$-$	$+$	$+$	$-$	$-$	$-$	$-$	$+$
Recommends anti-hypertensive drugs	$-$	$-$	$-$	$-$	$-$	$-$	$+$	$-$

The researchers also used three types of information recommendation criteria that correspond to various priority levels, as well as a sorting algorithm to enhance the suggestions provided. They executed the SWRL rules with the JESS rule engine, developed by Ernest Friedman-Hill, to infer new knowledge. To match patterns, the engine employs the Rete algorithm [32].

Chandra et al. used Optical Character Recognition (OCR) and Natural Language Processing (NLP) methods to extract named entities such as name, age, gender, test parameters, and complications from patient documents. These extracted entities were then used to create the system ontology with OWL. Next, the researchers created 37 SWRL rules to provide constraints on their ontology. They claimed that information such as disease, diagnosis, treatment, and drugs in patient documents can be searched and edited with OCR. They focused on developing an appropriate treatment plan according to the needs of the patient and on developing a clinical decision support system for this purpose. The outputs of the study are (1) a DM diagnosis and management ontology, (2) guidelines for diagnosing T1DM, T2DM, and GDM and recommending care, and (3) SWRL guidelines for the diagnosis and treatment of DM. Their approach was tested on a character recognition dataset, where the OCR achieved 94.0% accuracy, while the overall system success was evaluated as 84.0% on 38 DM patients [33].

The proposed system and all similar studies discussed above are compared in Table 4.

4. System analysis and design

The system presents its results to physicians or health practitioners via its user interface and consists of the following components: (1) Diabetes Mellitus Diagnosis and Support Ontology (DMDSOnt), (2) Semantic Web Rule Knowledgebase (SWRL rules), (3) Inference Engine module, (4) System Database, and (5) Portal User Interface Framework. The components of the system are shown in Fig. 2 and their functions are described below.

Figure 2.
Components and functioning of the proposed system.

(1) and (2) DMDSOnt and SWRL Rules: IDF Clinical Practice Recommendations for the management of T2DM in primary care aim to summarize the available evidence for optimal management of people with T2DM [2, 10]. The guide is intended to be a decision support tool for general practitioners, hospital-based clinicians, and other primary care clinicians working in the DM field [2, 10]. For this reason, in this study, an ontology database, called Diabetes Mellitus Diagnosis and Support Ontology (DMDSOnt) has been developed using OWL 2.0 ontology language according to the directions determined in the IDF’s guidelines [2, 10]. As required and directed in the guidelines of IDF, DMDSOnt contains metadata about the signs and symptoms observed in DM, active lifestyle criteria expected from patients, patient genetic status possibilities, physical activities to be followed, necessary laboratory tests, other factors/criteria in the diagnosis of DM, and recommendations for the proper management of DM risk [2, 10].

(3) Inference Engine (IE), often known as the ontology reasoner, is a crucial component of the proposed system. IE is a Java module that executes SWRL rules on the data of a patient instance and produces inferencing results that experts can use to verify the patient’s DM risk. Many reasoners are available, including Pellet [34], HermiT [35], FaCT++ [36], Drools [37], and SWRL Rule Engine [38]. During the inference process, reasoners are the key tools for deducing new medical knowledge from existing knowledge. IE employs the Pellet and SWRL rule engines during reasoning and follows a forward-chaining approach. IE executes the relevant SWRL rules on DMDSOnt and processes the patient profile, medical history, blood test results, etc. to estimate a patient case’s DM risk level and deduce a set of supportive treatment activity suggestions appropriate for the patient.
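The idea of forward chaining can be illustrated with a small, self-contained Python sketch. It does not use Pellet, the SWRL API, or the OWL API, and it is not the Java module of the proposed system; the fact names, the two example rules, and the function `forward_chain` are hypothetical and serve only to show how new facts are derived from existing ones until nothing more can be inferred.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# A rule is a pair: (set of premises that must all hold, conclusion to add).
Rule = Tuple[FrozenSet[str], str]

# Hypothetical rules, written only for this illustration.
EXAMPLE_RULES: List[Rule] = [
    (frozenset({"hba1c_in_prediabetes_range"}), "prediabetes_risk"),
    (frozenset({"prediabetes_risk", "physically_inactive"}),
     "suggest_regular_exercise"),
]


def forward_chain(facts: Iterable[str], rules: List[Rule]) -> Set[str]:
    """Repeatedly fire rules whose premises are satisfied.

    Stops when a full pass over the rules adds no new fact.
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in derived and premises <= derived:
                derived.add(conclusion)
                changed = True
    return derived


if __name__ == "__main__":
    patient_facts = {"hba1c_in_prediabetes_range", "physically_inactive"}
    inferred = forward_chain(patient_facts, EXAMPLE_RULES) - patient_facts
    print(sorted(inferred))
```

In this toy example the second rule can fire only after the first has added its conclusion, which is the chaining behaviour that a rule engine automates over a much larger rule set.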

(4) Database consists of various medical records that are collected from registered patients (e.g., profile, observable symptoms, blood test results, genetic history, etc.).

(5) Portal User Interface Framework is a web/mobile application that provides the interaction between the user and the system. The interface displays patient information to its user (e.g., a healthcare professional, general practitioner, hospital-based clinician, health educator, or other primary care clinician), who uses it to assess a patient for DM. The application presents (1) the estimated DM risk level for a patient case and (2) a set of appropriate supportive treatment suggestions inferred by the system. Table 5 summarizes the main features of the system.

Table 5

Characteristics of the proposed system

Items	Characteristics
Aim of the CDSS tool proposed	Supporting physician(s)/health practitioner(s) to evaluate the risk level of DM for their patients and recommending a set of supportive treatment activities found appropriate.
Domain	Diabetes mellitus.
Knowledge resources	Expertise, diagnosing factors, treatment rules, and supportive activities in DM.
Knowledge acquisition Technique used	DFD, decision tables, requirements engineering.
Knowledge representation Technique used	OWL and SWRL are used to create the system ontology and its rules, respectively.
User interface	A portal user interface is developed using JSP.
Inference engine (IE)	Pellet Reasoner API [34] and SWRL Rule Engine [38] were used in the Java environment to develop the reasoning module, which executes the system’s semantic medical rules. Forward chaining is used.
Explanation facility	Results by triggered rules and the relationships on DMDSOnt.
Development method	Prototyping.
Development tools	Pellet Reasoner API [34], SWRL API [38], OWL API [39], Protégé ontology editor [40], and Java platform.
Development languages	OWL 2.0, SWRL, JavaScript, JSP, and Java.
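The forward-chaining behaviour attributed to the inference engine can be illustrated with a minimal sketch. It is written in Python for brevity and is not the authors' Java implementation; the function names `forward_chain`, `rule_bmi`, and `rule_bmi_point` are hypothetical, and the two rules only mirror Rules #1 and #16 of Table 8.

```python
def forward_chain(facts, rules):
    """Apply every rule repeatedly until no rule adds a new fact (a fixpoint)."""
    facts = dict(facts)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            for key, value in rule(facts).items():
                if key not in facts:
                    facts[key] = value
                    changed = True
    return facts


def rule_bmi(f):
    # Mirrors Rule #1: BMI = 10000 * weight / height^2 (weight in kg, height in cm).
    if "hasWeight" in f and "hasHeight" in f:
        return {"hasBMI": 10000 * f["hasWeight"] / f["hasHeight"] ** 2}
    return {}


def rule_bmi_point(f):
    # Mirrors Rule #16: a BMI greater than 30 contributes 3.0 risk points.
    if f.get("hasBMI", 0) > 30:
        return {"BMIPoint": 3.0}
    return {}


# The BMI point rule is listed first on purpose: it cannot fire on the first
# pass, but it does fire on the second pass once hasBMI has been derived.
patient = {"hasWeight": 130.0, "hasHeight": 175.0}
result = forward_chain(patient, [rule_bmi_point, rule_bmi])
print(result)
```

The point of the sketch is that rule order does not matter: derived facts feed later rule firings until nothing new can be inferred.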

The development of the system and ontology reasoning via its semantic medical rules will be discussed in detail in the following sections.

5. Diabetes mellitus diagnosis and support ontology (DMDSOnt)

DMDSOnt is a systematic model of the domain knowledge used in the diagnosis and management of DM: its concepts (classes), attributes (properties), and relationships (between class members). In addition, a set of semantic medical rules written in SWRL has been implemented on DMDSOnt [20]. DMDSOnt is developed in the OWL and SWRL languages and built with the Protégé editor [21].

Figure 3.

A portion of the system ontology – DMDSOnt.

5.1 Top-level classes developed

Figure 3 shows a portion of the DMDSOnt developed using Protégé. In DMDSOnt 8 high-level classes have been created: “Disease”, “Patient”, “Suggestion”, “Symptom”, “Gender”, “C_Peptid_Val”, “Diagnose_Test”, and “Risk_Score_Result”.

• “C_Peptid_Val” class contains 3 instances: “Low_Peptid”, “Normal_Peptid”, and “High_Peptid”.
• “Risk_Score_Result” class lists the risk of having T2DM within 10 years and has 5 categories: “Low”, “Slightly_Elevated”, “Moderate”, “High”, and “Very_High”.
• “Disease” class lists the 5 types of DM risk as instances (OWL individuals): “Diabetes” {(1) “T1DM”, (2) “T2DM”}, (3) “Prediabetes”, (4) “Asymptomatic”, and (5) “Gestational_DM”.
• “Patient” class keeps each patient case as an instance on DMDSOnt.
• “Symptom” class includes all potential signs and symptoms known and observed in DM as instances.
• “Gender” class holds the gender types (female/male).
• “Suggestion” class contains the medical suggestions assigned to the analysed patients after inferencing.
• “Diagnose_Test” class contains the types of laboratory tests applied in diagnosing DM, as proposed by IDF [10].

5.2 Data type properties (DTP) developed

To hold the variety of patient data and obtain a risk score on DMDSOnt, various Data Type Properties (DTP) have been created as relationships between classes and data types. In a DTP, the domain is an OWL class and the range is a data type: the domain restricts the class of the subject in any triple that uses the DTP, and the range restricts the type of its value. Table 6 lists some of the DTPs created on DMDSOnt with their domains and ranges.

Table 6
Data type properties

No	DTP	Domain	Range
1	AgePoint	Patient	xsd:double
2	BMIPoint	Patient	xsd:double
3	Do Exercise	Patient	xsd:boolean
4	FamilyHistoryPoint	Patient	xsd:double
5	Has Age	Patient	xsd:int
6	Is 2hOGTT	Diagnose_Test	xsd:double
7	hasPolycystic	Patient	xsd:boolean
8	Is FPG	Diagnose_Test	xsd:double
9	Is HBA1C	Diagnose_Test	xsd:double
10	Is Hypertension b	Diagnose_Test	xsd:double
11	Is Hypertension s	Diagnose_Test	xsd:double
12	TOTAL_SCORE	Patient	xsd:double
13	Is RPG	Diagnose_Test	xsd:double
14	Has BMI	Patient	xsd:double
15	Has IdealWeight MAX	Patient	xsd:double
16	Has Weight	Patient	xsd:double
17	Has Height	Patient	xsd:double
18	hasGDM	Patient	xsd:boolean
19	is_Asymptomatic	Patient	xsd:boolean
20	hasFamilyHistory	Patient	xsd:boolean
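The domain/range pairing described for DTPs can be made concrete with a small sketch. It takes a few rows of Table 6 and checks whether a proposed assertion matches the declared class and data type. This is only an illustration: in OWL itself, domain and range axioms drive inference rather than validation, and the names `DTP_SCHEMA` and `check_assertion` are hypothetical.

```python
# A few rows of Table 6: property name -> (domain class, Python stand-in for the xsd range).
DTP_SCHEMA = {
    "AgePoint": ("Patient", float),      # xsd:double
    "Has Age": ("Patient", int),         # xsd:int
    "hasGDM": ("Patient", bool),         # xsd:boolean
    "Is FPG": ("Diagnose_Test", float),  # xsd:double
}


def check_assertion(subject_class, prop, value):
    """Return True if (subject_class, value) agrees with the declared domain and range."""
    domain, py_type = DTP_SCHEMA[prop]
    # bool is a subclass of int in Python, so reject booleans for numeric ranges explicitly.
    if py_type is not bool and isinstance(value, bool):
        return False
    return subject_class == domain and isinstance(value, py_type)


print(check_assertion("Patient", "Has Age", 48))
print(check_assertion("Patient", "Is FPG", 123.0))
```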

5.3 Object type properties (OTP) developed

Object Type Properties (OTP) represent relationships established between instances of two classes. There are 7 OTPs created on DMDSOnt, which are listed in Table 7. In the table, the OTP column gives the name of the property, the Domain column gives its domain class, and the Range column gives its range class.

Table 7
Object type properties

No	OTP	Domain	Range
1	hasDisease	Patient	Disease
2	hasGender	Patient	Gender
3	hasSuggestions	Patient	Suggestion
4	hasSymptom	Patient	Symptom
5	is_C_peptid	Patient	C_Peptid_Val
6	isSymptomOf	Symptom	Disease

5.4 Semantic medical rules developed using SWRL

The semantic medical rules represent the knowledge of healthcare professionals working in the field of DM and the directives defined in Clinical Practice Guidelines [1, 2, 4, 9, 10, 11]. These directives are conditional statements systematically developed to assist health practitioners in making decisions about appropriate health care under specific circumstances. As mentioned earlier, SWRL is used to develop semantic medical rules that predict DM risk at an early stage based on a risk score and derive a set of personalized supportive treatment recommendations. In this study, 60 SWRL rules were designed based on the directives given in [1, 2, 4, 9, 10, 11]; a few of them are detailed in Table 8. For example, to determine the Prediabetes category, the FPG, 2hOGTT, or HbA1c values are checked, as mentioned earlier. Accordingly, Rule #22 identifies prediabetes when the FPG level is between 100 and 125 mg/dL (as depicted in Table 3). In addition, Rule #1 calculates a patient’s BMI, while Rule #18 calculates a patient’s total DM risk score, which is used to estimate the risk of having T2DM within 10 years based on the responses given to the CDSS questionnaire.
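The prediabetes checks described above can be sketched as a single function. The thresholds are those of Rules #8, #9, and #22 in Table 8; the function name `is_prediabetes` and its parameter names are hypothetical, and the Python form is an illustration rather than the SWRL implementation.

```python
def is_prediabetes(fpg=None, ogtt_2h=None, hba1c=None):
    """Return True if any supplied laboratory value falls in its prediabetes band."""
    checks = [
        fpg is not None and 100 <= fpg <= 125,          # Rule #22, FPG in mg/dL
        ogtt_2h is not None and 140 <= ogtt_2h <= 199,  # Rule #8, 2hOGTT in mg/dL
        hba1c is not None and 5.7 <= hba1c <= 6.4,      # Rule #9, HbA1c in %
    ]
    return any(checks)


print(is_prediabetes(fpg=123.0))
print(is_prediabetes(fpg=99.0, hba1c=5.0))
```

Each SWRL rule fires independently, which is why the sketch combines the three checks with `any`: one value in range is enough to assert Prediabetes.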

Table 8
An example set of semantic rules developed on DMDSOnt using SWRL

No	SWRL rule	Rule description
1	Patient (?p), hasHeight (?p, ?h), hasWeight (?p, ?w), eval (?BMI, "10000*w/(pow(h, 2))", ?w, ?h) -> hasBMI (?p, ?BMI)	BMI is calculated from the hasHeight and hasWeight values and stored in hasBMI.
4	Patient (?p), hasHeight (?p, ?h), eval (?ideal, "(18.5*(pow(h, 2)))/10000", ?h) -> hasIdealWeight_MIN (?p, ?ideal)	The minimum ideal weight for a patient is determined (the weight at a BMI of 18.5).
8	Patient (?p), is_2hOGTT (?p, ?ogtt), greaterThanOrEqual (?ogtt, "140"^^xsd:int), lessThanOrEqual (?ogtt, "199"^^xsd:int) -> hasDisease (?p, Prediabetes)	The patient has Prediabetes if the value of 2hOGTT (e.g., 195.0) is between 140 and 199.
9	Patient (?p), is_HBA1C (?p, ?a1c), greaterThanOrEqual (?a1c, 5.7f), lessThanOrEqual (?a1c, 6.4f) -> hasDisease (?p, Prediabetes)	Prediabetes is determined if the value of HBA1C is between 5.7 and 6.4.
11	greaterThanOrEqual (?tot, "7"^^xsd:int), Patient (?p), hasDisease (?p, Asymptomatic), TOTAL_SCORE (?p, ?tot) -> hasSuggestions (?p, SG_10), hasSuggestions (?p, SG_01), hasSuggestions (?p, SG_06), hasSuggestions (?p, SG_07), hasSuggestions (?p, SG_08), hasSuggestions (?p, SG_09), hasSuggestions (?p, SG_02), hasSuggestions (?p, SG_03), hasSuggestions (?p, SG_04), hasSuggestions (?p, SG_05)	If the patient’s total risk score is greater than or equal to 7 and the patient is asymptomatic, suggestions 1 to 10 are assigned.
16	Patient (?p), hasBMI (?p, ?bmi), greaterThan (?bmi, "30"^^xsd:int) -> BMIPoint (?p, 3.0f)	BMIPoint is set to 3.0 when the patient’s BMI, computed by Rule #1, is greater than 30.
18	eval (?res, "fm+hb+w+b+a+ex+d+bmi", ?fm, ?hb, ?w, ?b, ?a, ?ex, ?d, ?bmi), Patient (?p), ExercisePoint (?p, ?ex), BMIPoint (?p, ?bmi), WaistCircumferencePoint (?p, ?w), HealthyDietPoint (?p, ?d), BloodPressurePoint (?p, ?b), FamilyHistoryPoint (?p, ?fm), AgePoint (?p, ?a), HighBloodGlucosePoint (?p, ?hb) -> TOTAL_SCORE (?p, ?res)	TOTAL_SCORE is set to “?res”, the sum of the eight risk points.
19	hasFamHist_P_B_S_OC (?p, true), Patient (?p), hasFamilyHistory (?p, true) -> FamilyHistoryPoint (?p, 5.0f), is_Asymptomatic (?p, true)	FamilyHistoryPoint is set to 5.0 and is_Asymptomatic becomes true when the patient has a first-degree relative with DM.
22	Patient (?p), is_FPG (?p, ?fpg), greaterThanOrEqual (?fpg, "100"^^xsd:int), lessThanOrEqual (?fpg, "125"^^xsd:int) -> hasDisease (?p, Prediabetes)	The patient has Prediabetes if the value of FPG (e.g., 123.0) is between 100 and 125.
27	Patient (?p), hasDisease (?p, Prediabetes), hasBMI (?p, ?bmi), greaterThanOrEqual (?bmi, "25"^^xsd:int) -> hasSuggestions (?p, SG_19), hasSuggestions (?p, SG_20), hasSuggestions (?p, SG_05), hasSuggestions (?p, SG_21), hasSuggestions (?p, SG_22), hasSuggestions (?p, SG_23), hasSuggestions (?p, SG_24), hasSuggestions (?p, SG_25), hasSuggestions (?p, SG_26), hasSuggestions (?p, SG_27), hasSuggestions (?p, SG_28), hasSuggestions (?p, SG_29)	If the patient has Prediabetes and a BMI greater than or equal to 25.0, suggestions 19 to 29 and suggestion 5 are listed.
28	Patient (?p), TOTAL_SCORE (?p, ?tot), greaterThanOrEqual (?tot, "7"^^xsd:int) -> hasSuggestions (?p, SG_10), hasSuggestions (?p, SG_01), hasSuggestions (?p, SG_06), hasSuggestions (?p, SG_07), hasSuggestions (?p, SG_08), hasSuggestions (?p, SG_09), hasSuggestions (?p, SG_02), hasSuggestions (?p, SG_03), hasSuggestions (?p, SG_04), hasSuggestions (?p, SG_05)	If the patient’s total risk score is greater than or equal to 7, suggestions 1 to 10 are listed.
32	Patient (?p), is_Hypertension_b (?p, ?htb), greaterThanOrEqual (?htb, "140"^^xsd:int) -> is_Asymptomatic (?p, true)	is_Asymptomatic becomes true when is_Hypertension_b is greater than or equal to 140.0.
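The arithmetic inside the `eval` expressions of Rules #1 and #4 can be checked with a short sketch. The formulas are copied from Table 8, and the inputs are the weight and height of the case study in Table 9; the function names `bmi` and `min_ideal_weight` are hypothetical. Note that the factor 10000 implies a height in centimetres.

```python
def bmi(weight_kg, height_cm):
    # Rule #1: 10000 * w / h^2, with weight in kg and height in cm.
    return 10000 * weight_kg / height_cm ** 2


def min_ideal_weight(height_cm):
    # Rule #4: 18.5 * h^2 / 10000, i.e. the weight that gives a BMI of 18.5.
    return 18.5 * height_cm ** 2 / 10000


# Case-study inputs from Table 9: 130.0 kg and 175.0 cm.
print(round(bmi(130.0, 175.0), 1))        # 42.4, the BMI reported in Table 10
print(round(min_ideal_weight(175.0), 1))  # 56.7, the minimum ideal weight in Table 10
```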

To run the semantic medical rules on DMDSOnt, the inference engine module connects to the ontology knowledge base and executes all the SWRL rules against the data of a patient instance (e.g., profile details, observed signs and symptoms, physical activity, other risk factors, and recent blood test results) to evaluate a risk score and deduce a set of personalized supportive treatment suggestions for that patient. Based on the risk score obtained, the system can determine whether the patient has DM risk and what the risk level of developing DM within 10 years is.

The next section discusses the system usage via a case study. All rules developed are given in Appendix B.

Figure 4.

The user interface where patient data and laboratory results are collected on the left. The system calculates a risk score and deduces a set of personalized supportive treatment suggestions using the SWRL rules on the DMDSOnt for the patient.

Figure 5.

Mobile application interface of the system developed.

6. System implementation and assessment

6.1 Used tools

The system was developed on the NetBeans platform with the JavaScript, JSP, and Java languages. As mentioned earlier, Semantic Web technologies such as OWL and SWRL were used because they not only help capture the meaning of terms in a document but also help derive new data from existing data through semantic rules. To execute the SWRL rules from a Java class, the OWL API [39], Pellet Reasoner API [34], and SWRL API [38] were used. DMDSOnt was developed with the Protégé ontology editor [40]. Both the OWL API and the SWRL API are Java-based and allow the system’s Java classes to connect to DMDSOnt and perform various operations on it. While the OWL API performs operations on the ontology (e.g., connect, search, insert, delete, update), the SWRL API runs the SWRL rules built into the system ontology to infer new data from existing data and uses SQWRL for query operations on the ontology.

6.2 System evaluation from healthcare workers through a case study

Figure 4 illustrates the screen on which a user (e.g., a healthcare professional, general practitioner, hospital-based clinician, health educator, or other primary care clinician) enters a patient’s data while diagnosing and assessing DM risk in a clinical or home environment. As shown in Fig. 4, the user first fills in the patient’s data (e.g., name, date of birth, gender, physical activities, family medical history) using the profile form on the left, the observed symptoms (e.g., blurred vision, hyperglycemic crisis, weight gain) in the right corner, and recent laboratory test results (e.g., 2hOGTT, FPG, HbA1c) in the middle of the form, in order to obtain a DM risk score and deduce a set of supportive treatment suggestions for that patient. After the user has entered the required data, the system sends all the patient data to DMDSOnt. The next step is to determine a total DM risk score based on the responses to the questionnaire that the system presents to the user; this score takes into account the patient’s lifestyle and habits in addition to the information collected in the previous step. The lifestyle questionnaire used in this step is given in Appendix A.

As shown in Fig. 4, the profile details, daily lifestyle, observed signs and symptoms, and recent laboratory test results of the patient case (“Sam Grey”) were entered into the required fields on the form. As Fig. 4 also shows, the patient to be evaluated can be searched for and retrieved from the database. In the next step, the user activates the “Get Suggestions” button to initiate the inference of appropriate supportive treatment suggestions on DMDSOnt for that patient. After the inference task is complete, the system automatically deletes the most recently processed patient data from the system ontology so that DMDSOnt does not grow over time. The inference outcomes are always kept in the patients table in the database. Figure 5 shows the same case study in the mobile application interface.

Table 9 shows all input data collected for this case (“Sam Grey”). Based on these data, the suggestions inferred by the IE for the patient are presented in the “System Predictions” panel of the patient form, shown on the right in Fig. 4. The system inferred that the patient has prediabetes with a very high 10-year risk of diabetes (Table 10). In addition, it inferred a set of 21 personalized supportive treatment suggestions considered suitable for the patient and presented them to assist the physician/health practitioner (see Fig. 4).

Table 9
Data asserted to DMDSOnt for the case study

Features of the case study	Properties used on DMDSOnt	Assigned data to DMDSOnt
Name	hasName & hasSurname	Sam Grey
Age	hasAge	48
Weight (kg)	hasWeight	130.0
Height (cm)	hasHeight	175.0
Waistline (cm)	hasWaistCircumference	120.0
Gender (f/m)	hasGender	Male
Does exercise (yes/no)	DoesExercise	No
Healthy diet (yes/no)	hasHealthyDiet	No
High blood glucose(yes/no)	HighBloodGlucose	Yes
Blood pressure treatment (yes/no)	isTakingBloodPressureTreatment	Yes
Cardiovascular disease (yes/no)	hasCardiovascularDisease	Yes
Autoimmune disease (yes/no)	hasAutoimmuneDisease	No
Family history	hasFamHist_P_B_S_OC	Parent/brother/sister/own child has the risk
HIV	hasHIV	No
2hOGTT (mg/dL)	is_2hOGTT	195.0
FPG (mg/dL)	is_FPG	123.0
HBA1c (%)	is_HBA1C	6.0
RPG (mg/dL)	is_RPG	184.0
Hypertension B (mmHg)	is_Hypertension_b	140.0
Hypertension S (mmHg)	is_Hypertension_s	95.0
HDL cholesterol (mg/dL)	is_HDL_cholesterol	30
Triglycerides (mg/dL)	is_Triglycerides	300.0
Anti-GAD (yes/no)	is_AntiGAD	No
IA 2AB (yes/no)	is_IA_2Ab	No
IAA (yes/no)	is_IAA	No
ICA (yes/no)	is_ICA	No
Insulin resistance	is_Insulin_resistance	Yes
Ketosis (yes/no)	is_Ketosis	No
C peptide	is_C_peptid	Normal_Peptid

In summary, the results deduced from DMDSOnt for a queried patient are: “Type of Diabetes”, “Risk of Diabetes in 10 Years”, “BMI”, “Min. Ideal Weight”, “Max. Ideal Weight”, and “Personalized Supportive Treatment Suggestions”. The results inferred by the IE for the patient case (“Sam Grey”) are shown in Table 10.

Table 10

Results returned from DMDSOnt

Properties on DMDSOnt	Deduced results via DMDSOnt
Type of diabetes	Prediabetes
Risk of diabetes in 10 years	Very high
BMI	42.4
Min. ideal weight	56.7
Max. ideal weight	76.5
Suggestions deduced	SG_01, SG_02, SG_03, SG_04, SG_05, SG_06, SG_07, SG_08, SG_09, SG_10, SG_19, SG_20, SG_21, SG_22, SG_23, SG_24, SG_25, SG_26, SG_27, SG_28, SG_29
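The suggestion list in Table 10 is consistent with Rules #27 and #28 of Table 8 both firing for this patient, which the following sketch illustrates. The paper reports the risk category (“Very high”) but not the numeric TOTAL_SCORE, so the score of 20.0 used here is a hypothetical placeholder that merely satisfies the threshold of 7; the names `sg`, `RULE_27`, `RULE_28`, and `suggestions` are likewise hypothetical.

```python
def sg(n):
    """Format a suggestion identifier such as SG_05."""
    return f"SG_{n:02d}"


# Consequents of the two rules as listed in Table 8.
RULE_27 = {sg(n) for n in range(19, 30)} | {sg(5)}  # Prediabetes and BMI >= 25
RULE_28 = {sg(n) for n in range(1, 11)}             # TOTAL_SCORE >= 7


def suggestions(disease, bmi, total_score):
    out = set()
    if disease == "Prediabetes" and bmi >= 25:
        out |= RULE_27
    if total_score >= 7:
        out |= RULE_28
    return sorted(out)


# total_score=20.0 is an assumed value, not a figure from the paper.
case = suggestions("Prediabetes", 42.4, total_score=20.0)
print(len(case), case)
```

Because SG_05 appears in both rules, the union holds 21 distinct suggestions, the same number reported for the case study.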

In addition, system users can search and edit previously assessed and saved patient results, initiate new inference processes for new or existing patients, insert patients into or delete them from the system database, etc. Traditional database operations other than inferencing are not discussed in detail in this case study due to page limitations. All 30 supportive treatment suggestions created in the system ontology according to the five main DM risk categories and the patient risk score ranges are given in detail in Appendix C.

7. Discussion

Machine learning models such as Artificial Neural Networks and Bayesian Networks can be used for early classification of diabetes and cardiovascular diseases [41, 42]. Studies in the literature report that Artificial Neural Networks can achieve higher accuracy in both DM and cardiovascular disease classification. However, classification through machine learning techniques such as Artificial Neural Networks falls outside the assessment of knowledge- and rule-based expert systems. For an expert system, accuracy is calculated from the match between the recommendation deduced by the system and the ground-truth value; it therefore measures how accurate the system and its suggested recommendations are. The following sections discuss the accuracy-based performance of the system, considering the domain knowledge, the recommendations inferred by the IE, and the expected recommendations.

7.1 Validation of rules on the domain knowledge

In rule-based systems, each rule in the rule base is expected to preserve the integrity of the domain knowledge and to handle every circumstance of the knowledge it covers; this is how the accuracy of the rule is determined. Whether a rule represents all of, and only, the part of the domain knowledge that it is supposed to model gives a notion of the accuracy of the rule [43]. A rule that does not meet this condition will not behave exactly as the domain knowledge it claims to represent and can, in turn, lead to inaccurate responses from the system; such a rule is inaccurate [43]. For instance, let us consider the combination of Rule #32 and Rule #33 in our rule base as a single new rule, $\bm{R}_{i}$:

IF exists (a patient as ?p) AND (?p has Systolic_Hypertension_Val) AND (?p has Systolic_Hypertension_Val $\geqslant$ 140) AND (?p has Diastolic_Hypertension_Val) AND (?p has Diastolic_Hypertension_Val $\geqslant$ 90) THEN ASSERT (?p is Asymptomatic)

and the knowledge item $\bm{K}_{i}$ that had to be represented was: “If a patient has any of these risk factors: high systolic hypertension ($\geqslant$ 140 mmHg), high diastolic hypertension ($\geqslant$ 90 mmHg), or high triglycerides ($\geqslant$ 250 mg/dL), the patient is assigned to the category of Asymptomatic Diabetes” [2, 10, 22].

The rule $\bm{R}_{i}$ is inaccurate with respect to the knowledge item $\bm{K}_{i}$: if the universe of discourse U were that given in Table 11, $\bm{R}_{i}$ would select only the objects {p1, p2, p5}, whereas $\bm{K}_{i}$ would imply the objects {p1, p2, p3, p5}. Here, $\bm{R}_{i}$ violates the condition in the notion of accuracy, “whether a rule represents all that part of the knowledge that it is supposed to model” [43]. Hence $\bm{R}_{i}$ is characterized as inaccurate with respect to $\bm{K}_{i}$.

Table 11
A sample universe

No	Systolic blood pressure	Diastolic blood pressure	Triglycerides	Domain knowledge
p1	145 $\surd$	98 $\surd$	260 $\surd$	Asymptomatic
p2	130 $\surd$	100 $\surd$	235	Asymptomatic
p3	136	86	276 $\surd$	Asymptomatic
p4	110	82	216	Not asymptomatic
p5	184 $\surd$	110 $\surd$	230	Asymptomatic

Therefore, the rules should contain a complete model of the domain knowledge and be created in a way that does not impair the integrity and accuracy of the domain knowledge. This may require, in some special circumstances, the creation of individual but interoperable rules. In our case, the presence of any of these three conditions (high systolic hypertension, high diastolic hypertension, high triglyceride) is sufficient to automatically place the patient in the asymptomatic category [10, 22].

Rule #32: Patient (?p), is_Hypertension_b (?p, ?htb), greaterThanOrEqual (?htb, "140"^^xsd:int) -> is_Asymptomatic (?p, true)
Rule #33: Patient (?p), greaterThanOrEqual (?hts, "90"^^xsd:int), is_Hypertension_s (?p, ?hts) -> is_Asymptomatic (?p, true)
Rule #59: Patient (?p), is_Triglycerides (?p, ?trg), greaterThan (?trg, "250"^^xsd:int) -> is_Asymptomatic (?p, true)

Consequently, these rules are handled separately in our rule base. Other rules that process the outputs of these rules as inputs maintain the integrity of our domain (see Appendix B).
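The effect of keeping the three rules separate can be illustrated on the sample universe of Table 11. The sketch below applies each rule independently, so that any one of them is enough to mark a patient as asymptomatic. It assumes that is_Hypertension_b corresponds to the systolic column and is_Hypertension_s to the diastolic column, as the thresholds of 140 and 90 suggest; the Python function names are hypothetical.

```python
# Table 11: patient -> (systolic BP, diastolic BP, triglycerides).
UNIVERSE = {
    "p1": (145, 98, 260),
    "p2": (130, 100, 235),
    "p3": (136, 86, 276),
    "p4": (110, 82, 216),
    "p5": (184, 110, 230),
}


def rule_32(systolic, diastolic, triglycerides):
    return systolic >= 140       # greaterThanOrEqual 140


def rule_33(systolic, diastolic, triglycerides):
    return diastolic >= 90       # greaterThanOrEqual 90


def rule_59(systolic, diastolic, triglycerides):
    return triglycerides > 250   # greaterThan 250, as written in Rule #59


RULES = [rule_32, rule_33, rule_59]

# A patient is asymptomatic as soon as any single rule fires.
asymptomatic = sorted(
    p for p, values in UNIVERSE.items() if any(rule(*values) for rule in RULES)
)
print(asymptomatic)  # ['p1', 'p2', 'p3', 'p5']
```

The resulting set matches the objects implied by the knowledge item and the “Domain knowledge” column of Table 11.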

In this study, the rules were modelled in collaboration with a DM specialist to ensure the integrity of the domain knowledge, the interoperability of the rules, and their correct modelling. The rules were verified in test studies on 30 patient cases in the Protégé tool, and the accuracy of each rule was ensured after the necessary corrections were made. Further test studies on these patient cases are presented in the next section.

7.2 Evaluation of the results inferred via DMDSOnt

In the experimental studies, thirty (30) anonymous patient records from the Near East University Faculty of Medicine were collected retrospectively by the collaborating physician to evaluate the system results. Each record provides inputs similar to those shown in Table 9, on which the system runs its inferences. The system was evaluated using the following approaches:

7.2.1 Matching tests on inferred results by IE and expected results

The data of each patient case were entered into the system by the collaborating physician through the application interface, and the “Get Suggestions” button was activated for each case. The system then loaded each patient’s input data (as shown in Table 9) into the ontology to initiate the inference task.

After the inference task was complete, the system outputs obtained for each case were: “Type of Diabetes”, “Risk of Diabetes in 10 Years”, “BMI”, “Min. Ideal Weight”, “Max. Ideal Weight”, and “Personalized Supportive Treatment Suggestions”. The results obtained for each patient case matched the expected results, as formulated in the predefined rules in the system’s ontology.

7.2.2 Manual verification by the collaborating physician

The supportive treatment suggestions inferred by the system for each patient were then discussed with the collaborating physician for verification. The physician manually evaluated each deduced suggestion, went through the standard diagnosis and treatment processes on the same data set, and used professional reasoning to verify each patient’s condition and the medical/physical activities needed.

In the verification studies conducted with the collaborating physician on the 30 patient cases, no errors were observed in the formula- and score-based results determined by the IE (i.e., “Type of Diabetes”, “Risk of Diabetes in 10 Years”, “BMI”, “Min. Ideal Weight”, “Max. Ideal Weight”). However, in our first-round tests, the personalized supportive treatment suggestions produced for the 30 patients contained a total of 8 missing suggestions and 6 incorrectly classified suggestions. The consistency of the developed system was therefore measured as 96.5%. The missing suggestions were added to the appropriate SWRL rules on DMDSOnt, and the 6 misclassified suggestions were moved into the correct SWRL rules.
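Comparing inferred suggestions with the physician's expected suggestions amounts to two set differences, as the following sketch shows. The function name `compare_suggestions` and the example sets are hypothetical and are not data from the study; suggestions that are inferred but not expected are only a rough stand-in for the “incorrectly classified” category above.

```python
def compare_suggestions(inferred, expected):
    """Return suggestions the system missed and suggestions it should not have produced."""
    inferred, expected = set(inferred), set(expected)
    return {
        "missing": sorted(expected - inferred),     # expected but not inferred
        "unexpected": sorted(inferred - expected),  # inferred but not expected
    }


# Hypothetical example for a single patient.
report = compare_suggestions(
    inferred={"SG_01", "SG_02", "SG_07"},
    expected={"SG_01", "SG_02", "SG_03"},
)
print(report)  # {'missing': ['SG_03'], 'unexpected': ['SG_07']}
```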

After all corrections were conducted, 30 already enrolled patients on the system database were reprocessed by the IE in the second-round tests. It was observed that the suggestions produced by IE for each of the 30 patients were produced as correct and complete suggestions. The accuracy of the system increases as the rules are corrected and validated, so the system accuracy was found as 100.0% after the second round of tests and all the corrections made.

Overall, the results after the second-round tests show that the personalized supportive treatment suggestions deduced by the system for the 30 patient cases are in line with the manual evaluation of the collaborating physician. The results also show that DMDSOnt can deliver accurate reasoning while preserving the shareability and extensibility of the knowledge base. As the number of patient cases increases, ontology engineers will enhance DMDSOnt over time by adding any missing supportive treatment suggestions to the system ontology.

7.3 System contributions summary

The contributions of this system developed for the management and delivery of health services are as follows:

• Most people live with DM risk without realizing it, and this risk manifests itself at later ages. With this CDSS, DM risk can be predicted and acted on in the early years by monitoring the health signs and conditions of individuals.
• With this CDSS, it is possible to identify T1DM, T2DM, Prediabetes, Asymptomatic, and Gestational Diabetes in adult patients. The system does not currently assess the pediatric group: since BMI must be evaluated by percentiles for children, the system is designed for individuals over the age of 18. Integrating percentile assessment and adapting childhood assessment to the CDSS are planned as future studies.
• With this CDSS, individuals can make the required lifestyle changes and take precautions, because DM risk is recognized and treated at an early stage.
• With this CDSS, the incidence of DM risk in society can be followed through continuous evaluations, and public awareness of this risk can be increased.
• It can be used to develop and apply Clinical Practice Guidelines for DM [22], which are formalized statements that include recommendations intended to establish best practices and optimize public healthcare.
• With this CDSS, it is possible to facilitate preventive health practices, especially in primary health care services.
• With this CDSS, medical students and lecturers in this field can be supported, as can the processing of information collected in the clinical environment and its reinforcement with theoretical knowledge. It can therefore be used for educational purposes by those studying DM diagnosis and DM health care management (e.g., those who want to become experts in DM management and decision making, such as medical students, nurses, health practitioners, general practitioners, health educators, and other primary care clinicians).
• With this CDSS, feedback can be obtained on the follow-up of the treatment of patients who are currently receiving DM treatment.

8. Conclusions

DM is a chronic disease that can cause severe damage to the body by affecting crucial organs such as the heart, eyes, kidneys, skin, and feet. DM requires ongoing medical check-ups, self-care, and education to prevent severe complications and reduce the risk of long-term complications. A CDSS, a sophisticated health information technology component, can be used to facilitate this effort and assess the risk of long-term complications. In this study, an ontology-based CDSS is engineered as a tool that deduces DM risk at an early stage and returns a set of personalized supportive treatment suggestions for a patient. The system aims to facilitate preventive health practices and early diagnosis, especially in primary health care services.

The system consists of five parts: (1) Diabetes Mellitus Diagnosis and Support Ontology (DMDSOnt), (2) semantic medical rules (SWRL rules), (3) inference engine module, (4) system database, and (5) portal user interface.

Ontology knowledge bases provide a major step towards the development of smarter clinical systems for disease assessment. In this study, the DMDSOnt developed contains various information (e.g., symptoms and signs, physical activities, clinical history of patients, frequently used clinical laboratory tests and other risk factors) and relationships used in the diagnosis and treatment of DM risk. In addition, the system has its own semantic medical rules on the DMDSOnt which are executed by an inference engine of the system to deduce a DM risk score and a set of proper supportive treatment suggestions for a patient.

The rules are built on SWRL to enable high-level context reasoning and information evaluation. They are created from valid relationships between ontology classes (concepts) to diagnose DM risk and estimate its type. Through the reasoning ability of its inference engine, the system helps physicians/health practitioners diagnose DM risk at an early stage and provides feedback on progress in DM treatment, thus supporting proper management of the treatment process. Currently, the semantic medical rules of the system can only predict Type 1 and Type 2 DM in adults; the rules do not yet make DM risk assessments or deduce suggestions for pediatric patients. This part is planned as future work.

Funding

The authors report no funding.

Ethics statement

Due to the nature of this study, no formal consent was required.

Supplementary data

The supplementary files are available to download from https://dx.doi.org/10.3233/THC-230237.

Acknowledgments

The project was completed in collaboration between the Department of Pediatric Endocrinology, Faculty of Medicine, Near East University, and the Department of Computer Engineering, Engineering Faculty, Eastern Mediterranean University in North Cyprus, Turkey, with the technical contribution of a research team comprising four computer engineers and one DM physician. An updated version of the OWL form of DMDSOnt is provided in the BioPortal repository: https://bioportal.bioontology.org/ontologies/DMDSONT.

Conflict of interest

The authors declare that they have no conflict of interest.

Plasma glucose (PG) and HbA1c ranges indicating risk
Under risk	Fasting PG (mg/dl)	Satiety PG (mg/dl)	HbA1c (%)
Impaired Fasting Glucose (IFG)	100–125		
Impaired Glucose Tolerance (IGT)		140–199	
HbA1c			5.7–6.4
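As a minimal sketch, the ranges in the table above can be expressed as a simple check. The function name `under_risk_flags` and its parameter names are hypothetical, and treating every bound as inclusive is an assumption. This illustrates only the thresholds, not the system's ontology-based reasoning.

```python
from typing import List, Optional


def under_risk_flags(
    fasting_pg_mg_dl: Optional[float] = None,
    satiety_pg_mg_dl: Optional[float] = None,
    hba1c_percent: Optional[float] = None,
) -> List[str]:
    """Return the table criteria met by the given measurements.

    Missing measurements (None) are skipped. Bounds are assumed inclusive.
    """
    flags = []
    if fasting_pg_mg_dl is not None and 100 <= fasting_pg_mg_dl <= 125:
        flags.append("IFG")    # Impaired Fasting Glucose
    if satiety_pg_mg_dl is not None and 140 <= satiety_pg_mg_dl <= 199:
        flags.append("IGT")    # Impaired Glucose Tolerance
    if hba1c_percent is not None and 5.7 <= hba1c_percent <= 6.4:
        flags.append("HbA1c")  # HbA1c within the under-risk range
    return flags


if __name__ == "__main__":
    print(under_risk_flags(fasting_pg_mg_dl=110, hba1c_percent=6.0))
```

Values above the upper bounds return no flag here because the table covers only the "under risk" ranges. A fuller check would need the diagnostic thresholds, which this table does not give.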
