Abstract
The normal parameter reduction is used as a useful approach to identify the irrelevant parameters in soft set-based decision making systems. It finds a subset with least number of parameters that preserve the original classification of the decision alternatives. A number of algorithms have been developed for the normal parameter reduction of soft set but the case of repeated columns (i.e., e i = e j ) was only considered by Danjuma et al. In this study, first we address the limitations of the Danjuma et al.’s approach to normal parameter reduction of soft set. Then, we propose a new algorithm for normal parameter reduction of soft set which is free of all such limitations. Moreover, we compare the proposed algorithm with some of the existing algorithms of normal parameter reduction of soft set to show its efficiency. Finally, the application of the proposed algorithm is elaborated by a medical diagnostic problem.
Introduction
Different types of uncertainties such as; randomness, vagueness, and roughness often occur in practical decision systems. To handle these uncertainties in a befitting manner, a number of mathematical theories have been developed by different authors such as; probability theory [20], fuzzy set theory [31], intuitionists fuzzy set theory [4], interval mathematics theory [10] and rough set theory [28]. Each of the afore-mentioned theory has its inherent difficulties, which are mentioned by Molodtsov in [27]. Therefore, Molodtsov [27] developed a completely new approach for modeling uncertainty and vagueness, which is called soft set theory. Soft set theory is basically based on the parameterization tool which makes this theory more simple and applicable as compared to the other existing theories.
The applications of soft sets in real-world problems are progressing rapidly and many algorithms have been developed for solving practical problems. Zhang et al. [38] introduced the concepts of fuzzy soft β-minimal (maximal) descriptions and developed four new types of fuzzy soft β-neighborhoods. Similarly, by means of fuzzy soft β-coverings based fuzzy rough set, a new algorithm was proposed by Zhang and Zhan in [37]. Further, by means of soft neighborhoods, Zhan and Wang introduced a novel type of soft rough covering in [35]. Moreover, in [32], five new types of soft coverings based rough sets were developed by Zhan and Alcantud, where the first two types were used to proposed two special algorithms based on soft coverings based rough sets. Similarly in many directions, the applications of soft sets have been explored by different authors such as; association rules mining [8, 13], medical diagnosis [1, 30], incomplete data analysis [19, 40] and decision making [3, 39].
In the last few years, the problems related to the parameterization reduction and decision making have been gained considerable attention in dealing with uncertainties. Some successful implementations have been made by different researchers to elaborate the applications of the soft set theory in parameter reduction and decision making. In this regard, the first attempt was made by Maji et al. [26] who used a rough set approach for dimensionality reduction and applied it to a decision making problem. Later in [5], Chen et al. mentioned that the technique of Maji et al. [26] may produce some wrong results because they do the reduction process before computing the choice values of the objects in a decision system. Moreover, according to Chen et al. [5], the parameters reduction in soft set theory is a different approach from the attributes reduction in rough set theory and they cannot be used interchangeably for computing the optimal object in soft set-based decision systems. Thus in [5], they developed a new parameter reduction technique for soft sets and find the optimal decision on a general Boolean data set. However, their technique was failed to maintain all the levels of the suboptimal choices during the reduction process.
According to Kang et al. [22], most of the methods related to soft set reduction (i.e. Maji et al. [26], Chen et al. [5] etc.) have only considered the optimal choice and they ignored suboptimal choices at the time of decision making. In many decision making problems (such as; sold products, demand products etc.), we select the optimal choice from the data set and delete the data of the optimal object from the corresponding data set. If we want to make next decision on the same data set where the data of optimal object are already deleted, usually we need a new reduction which obviously wastes our time. Similarly, in some cases, if the character of objects in soft set cannot be embodied by a given parameter set, then some more parameters are added to the given parameter set. After adding new parameters, we need to make a new reduction of soft set for decision making which is wastes too much time. To overcome these problems, Kong et al. [22] introduced the concept of normal parameter reduction (NPR) and proposed an algorithm for it. Normal parameter reduction has the ability to reduce the dimensionality of a data set without disturbing the original classification of its decision alternatives (objects). Since the NPR algorithm presented by Kong et al. [22] was based on the parameter importance degree, which was hard to compute and involve a great amount of computation. Therefore, the new efficient normal parameter reduction algorithm (NENPR) for soft sets was proposed by Ma et al. in [24]. In NENPR algorithm, the whole reduction process was based on oriented-parameter sum and there was no need to compute the parameter importance degrees and decision partitions. Further, the particle swarm optimization algorithm was used by Kong et al. [23] to give a proper mathematical representation to the problem of normal parameter reduction of soft sets. For more study about soft set reduction, we refer [2, 33].
Since, all of the afore-mentioned algorithms have tried to reduce the running time and computational complexity of the normal parameter reduction process, however, they did not consider the case of repeated columns (i.e., the parameters e
i
= e
j
) in soft set data tables which can cause an extra computation burden on the reduction process. Therefore, in [7], Danjuma et al. addressed this issue and proposed a new algorithm called, the alternative normal parameter reduction algorithm (ANPR). Moreover, they compared the ANPR algorithm with some previous algorithms (i.e, NPR and NENPR algorithms) and also discussed some decision making problems. However, the ANPR algorithm proposed by Danjuma et al. is based on some miss conceptual results and cannot be considered as a legitimate approach to normal parameter reduction of soft set. The main contributions of this work are summarized as follows.
To highlight the deficiencies of the Danjuma et al.’s approach to normal parameter reduction of soft set. To propose a new improved algorithm for normal parameter reduction of soft set. To compare our proposed algorithm with some of the existing algorithms of normal parameter reduction of soft set. To give an application of the proposed algorithm in a medical diagnostic problem.
The organization of the paper is given as follows. In Section 2, we review some basic concepts about rough set theory and soft set theory. In Section 3, we analyze the NPR and NENPR algorithms as presented in [22] and [24], respectively. Section 4 analyzes some limitations of the ANPR algorithm proposed in [7]. In Section 5, initially, we give some useful definitions and results and then provide an improved algorithm for normal parameter reduction of soft set. In Section 6, some comparison results are given between the proposed algorithm and some of the existing algorithms. Section 7 provides the application of the proposed algorithm in a medical diagnostic problem. Finally, Section 8 presents the conclusion of the paper.
Preliminaries
This section reviews some basic concepts about soft set and rough set. Throughout the paper, we take U as initial universe, E the set of parameters which may be some attributes, properties or some characteristics of the objects in U, and P (U) the set of all subsets of U.
Equivalently, the soft set (F, E) over U can be thought as a parameterized family obtained from P (U). The mapping F is named as the approximate function and its functional value i.e., F (e) is called the set of e-approximate elements of (F, E). To clarify the concept, consider the following example.
F (e1) = {u1, u2, u5} , F (e2) = {u3, u4, u6} ,
F (e3) = {u2, u3, u6} , F (e4) = {u3, u4, u6} ,
F (e5) = {u1, u2, u5} , F (e6) = {u1, u3, u4} ,
F (e7) = {u1, u4, u6} , F (e8) = {u2, u3, u5, u6} ,
F (e9) = {u1, u2, u3, u4, u5, u6} and
F (e10) = {u3, u4, u6} .
Thus, (F, E) can be written as a parameterized family of subsets of U such as;
A soft set can also be represented by a Boolean-valued information system. The definition of an information system is given as follows.
An information system sometimes called a knowledge-based system that can be expressed by an information table. In particular, S is said to be a Boolean-valued information system if, V a = {0, 1}.
Proof. The proof is given in [24].■
Thus, by Proposition 2.1, the soft set (F, E) as defined in Example 2.1 can be considered as a Boolean-valued information system and its Boolean table is given by Table 1.
Tabular form of (F, E) given in Example 2.1
Tabular form of (F, E) given in Example 2.1
This section analyzes the two well-known algorithms namely; normal parameter reduction algorithm (NPR) and new efficient normal parameter reduction algorithm (NENPR) proposed by Kong et al. [22] and Ma et al. [24], respectively.
The normal parameter reduction algorithm (NPR)
Let U = {u1, u2,. . . , u n }, E = {e1, e2,. . . , e m } and u ij be the entries in the Boolean table of the soft set (F, E). For a soft set (F, E) over U, the choice value f E (.) of an object u i ∈ U is defined by f E (u i ) = ∑ j (u ij ).
To describe the rank and partitions of the objects in U, an indiscernibility relation is defined as follows.
For the soft set (F, E) over U, the partition C
E
= {{u1, u2,. . . , u
i
}
f
1
, {ui+1, ui+2,. . . , u
j
}
f
2
,. . . , {uk+i, uk+2,. . . , u
n
}
f
s
} is called the decision partition, which ranks the objects of U according to their choice values f
E
(.). Further, if we delete a parameter e
i
from E, then a new decision partition is obtained from C
E
, which is denoted by
and | . | denotes the cardinality of set.
The NPR algorithm was proposed by Kong et al. [22] as given by Algorithm 1.
The following example will give a clear description of the NPR algorithm.
CE–e1 = {{u3, u6} 7, {u4} 6, {u1, u2, u5} 4} ,
CE–e2 = {{u3, u6} 6, {u1, u2, u4, u5} 5} ,
CE–e3 = {{u3, u4, u6} 6, {u1} 5, {u2, u5} 4} ,
CE–e4 = {{u3, u6} 6, {u1, u2, u4, u5} 5} ,
CE–e5 = {{u3, u6} 7, {u4} 6, {u1, u2, u5} 4} ,
CE–e6 = {{u6} 7, {u3} 6, {u2, u4, u5} 5, {u1} 4} ,
CE–e7 = {{u3} 7, {u6} 6, {u2, u4, u5} 5, {u1} 4} ,
CE–e8 = {{u3, u4, u6} 6, {u1} 5, {u2, u5} 4} ,
CE–e9 = {{u3, u6} 6, {u4} 5, {u1, u2, u5} 4} ,
CE–e10 = {{u3, u6} 6, {u1, u2, u4, u5} 5} .
The importance degrees r e j , ∀ e j ∈ E are then computed as:
α1,e1 = | {u3, u6} – {u3, u6} |=0, α2,e1 = | {u4} – {u4} |=0, α3,e1 = | {u1, u2, u5} – φ|=3.
Thus,
Normal parameter reduction of Table 1.
We see that the normal parameter reduction of (F, E) as given by Table 2 has the decision partition
This section briefly analyzes the new efficient normal parameter reduction algorithm (NENPR), proposed by Ma et al. [24]. The NENPR algorithm finds the oriented-parameter sum instead of parameter importance degree and compute the candidate parameter set. This makes the NENPR algorithm more simple and easily implementable as compared to the NPR algorithm.
Based on the above definitions and Theorem 3.1, the NENPR algorithm is given by Algorithm 2. The following example will give a clear description of the NENPR algorithm.
Tabular form of
in Example 3.2
Tabular form of
The main setback of NENPR algorithm is that, it does not consider the issue of the repeated columns in soft set tables and performs the same process repeatedly which puts an extra computation burden on the reduction process. Therefore, to overcome this problem, Danjuma et al. [7] proposed the ANPR algorithm and applied it to some decision making problems. Since, the idea proposed in [7] was interesting but it has not been implemented so effectively to handle the afore-mentioned problem.
This section highlights the overall deficiencies of the alternative normal parameter reduction algorithm (ANPR) proposed by Danjuma et al. [7]. At the end of the section, an example is provided to show that the ANPR algorithm cannot be used for normal parameter reduction of soft sets.
The dialectic subsets of the parameter set E can be defined in the following way.
f A (u1) = f A (u2) = . . . = f A (u n ) and
f B (u1) = f B (u2) = . . . = f B (u n ).
The intersection of the dialectic subsets is defined as follows.
Based on Definitions 4.1, 4.2 and 4.3, the ANPR algorithm was proposed by Danjuma et al. as given by Algorithm 3.
If we consider Definitions 4.1, 4.2 and 4.3, then all of them are confusing and not very clear. For example, in Definition 4.1, the expression u1j = u2j = . . . = u
nj
is used to represent the parameters e
i
= e
j
, which is incorrect. In fact, this kinds of expression can only be used for the parameters
f A (u1) = f A (u2) = . . . = f A (u n ) and
f B (u1) = f B (u2) = . . . = f B (u n )
are used to define the dialectic subsets A and B of E which looks very strange because according to these two expressions, any two dispensable subsets of E are dialectic (see Definition 3.2). Later in Example 4 of [7] (in Step 4), it was observed that two subsets A and B of E are said to be dialectic if they are dispensable and have the same choice values for all u i ∈ U. Thus, the appropriate expressions that should be used in Definition 4.2 are
f A (u1) = f A (u2) = . . . = f A (u n ) = k and
f B (u1) = f B (u2) = . . . = f B (u n ) = k,
where k is nonnegative integer.
Moreover, the expression
On the other hand, if we consider Algorithm 3, then in Step 2, the authors reduce the parameters e
i
= e
j
from the parameter set E without providing any justifying result for it. We cannot directly remove two or more similar columns from a soft set table except
f A (u1) = f A (u2) = . . . = f A (u6) = k and
f B (u1) = f B (u2) = . . . = f B (u6) = k,
where k is a nonnegative integer. In this way we obtain only one subset A = {e1, e2} such that f A (u1) = f A (u2) = . . . = f A (u6) =1 and there is no other subset B which satisfy the given condition.
Tabular form of
in Example 4.1
Tabular form of
According to Table 4, the decision partition is given by
In this section, first, we give some useful definitions and results and then provide a new algorithm for normal parameter reduction of soft set.
The proposed technique
Since in Definition 4.1, the expression u1j = u2j = . . . = u nj used for the special entries is incorrect. In the following, we use a right expression for it and define an indiscernibility relation on the parameter set E.
The indiscernibility relation as defined in Definition 5.1 is an equivalence relation on the parameter set E. Moreover, based on the indiscernibility relation ∼, the parameter set E can be partitioned into disjoint indiscernibility classes
where t denotes the class label and r is the total number of the indiscernibility classes of E.
and
f T (u1) = f T (u2) = . . . = f T (u n )
also holds for T and hence by Definition 3.2, T is dispensable.■
We illustrate Theorem 5.1 by the following example.
f T i (u1) = f T i (u2) = . . . = f T i (u6) =2
is satisfied for all T
i
, where (1 ≤ i ≤ 6). Thus, by Theorem 5.1, the |A|-tuples generated from the cartesian product of
Based on the above definitions and results, the proposed algorithm is given by Algorithm 4. For a clear description of the proposed algorithm, consider the following example.
Tabular form of
The NENPR algorithm and the proposed algorithm follow the same footsteps to reach the optimal normal parameter reduction, however, they use different approaches to deal with the repeated columns (i.e., e
i
= e
j
) in soft set tables. The NENPR algorithm treats them separately in each step and do the same job repeatedly, which put an extra computation burden on the reduction process. While the proposed algorithm initially combines the parameters e
i
= e
j
into a single indiscernibility class and then selects only one parameter from each of the indiscernibility class to reduce the number of parameters in
Comparison results
It was mentioned in [24] that NENPR algorithm performed well and has less computational complexity as compared to the NPR algorithm of [22]. Therefore, in this section, first we compare the proposed algorithm with the NENPR algorithm in terms of computational complexity. Then, we provide some experimental results to show that the proposed algorithm outperforms some state of the art algorithms while finding the normal parameter reduction in a Boolean data set.
Comparison table
Comparison table
Estimating the candidate parameter reduction set
Estimating the candidate parameter reduction set is a complex part of the normal parameter reduction process. To compute the candidate parameter reduction set, the NENPR algorithm first needs to compute the oriented-parameter sum for each parameter e
j
∈ E, where (1 ≤ j ≤ m), therefore the total number of access entries (i.e., u
ij
) are given by m · n. Then, it checks every A ⊆ E for which S
A
is a multiple of |U| and computes the equation S
A
+ S
A
c
= S
E
for the multiplicity of its complement A
c
. In this whole process, the number of access oriented-parameter sums are given by
Filtering the candidate parameter reduction set
Suppose, the total number of the candidate parameter reduction sets of the proposed and NENPR algorithms are given by k and z, respectively. Let A = {e1, e2,. . . , e
p
} is one of the k candidate parameter reduction sets for the proposed algorithm such that, the parameters e
j
∈ A, where (1 ≤ j ≤ p), are belongs to p distinct indiscernibility classes
Results and discussion
To discuss the efficiency of the proposed algorithm in capturing the normal parameter reduction in a Boolean data set. We applied the four algorithms that is, the proposed algorithm, the NPR algorithm of [22], the NENPR algorithm of [24], and the ANPR algorithm of [7] to the same Boolean data set that was given in Table 1. All algorithms were implemented in Python programming language and executed on Intel Core 2 Duo processer with 4 GB memory and Window 8 operating system. The experimental results obtained are summarized in Table 6, where the notations PID, OPS, FPRS and CPRS stand for the parameter importance degree, oriented-parameter sum, feasible parameter reduction set and candidate parameter reduction set, respectively. From Table 6, we see that the output obtained from the ANPR algorithm is not a normal parameter reduction and cannot be compared with our proposed algorithm. Moreover, from Table 6, it is clear that the proposed algorithm has accessed less number of entries and oriented-parameter sums as compared to the NPR and NENPR algorithm, respectively. Also, the proposed algorithm has only 7 subsets to check for the dispensability condition f A (u1) = f A (u2) = . . . = f A (u n ), while the NPR and NENPR algorithms must check 127 and 63 subsets for the same dispensability condition, respectively. Thus, it is evident that the proposed algorithm has decreased the computational complexity at every stage of the reduction process, and efficiently captured the normal parameter reduction as compared to the other three algorithms. However, the proposed algorithm and the NENPR algorithm give the same results for a soft set having no similar parameters.
Application in decision making problem
This section deals with a real-life application of our proposed algorithm in a medical diagnostic problem. We reconsider the hiatal hernia disease problem that was discussed by Danjuma et al. in [7]. First we show that the given problem cannot be solved by the ANPR algorithm. Then we solve the same problem by our proposed algorithm and elaborate its application in decision making.
The hiatal hernia disease problem
The hiatal hernia disease can affect organs in the chest cavity and causes abnormalities of stomach and Acid reflux or Gastroesophageal reflux disease (GERD). Normally, it occurs when the diaphragm (a separating membrane between the chest and abdominal cavity) is affected by the enlargement of the upper part of the stomach. The data are collected from the Mariri comprehensive hospital in Kino stat, Nigeria, which is displayed in Table 7. According to Table 7, there are 50 patients where each patient is described with 13 different symptoms. Let U = {u1, u2,. . . , u50} is the set of all patients and E = {e1, e2,. . . , e13} is the set of symptoms where each e i for (1 ≤ i ≤ 13), stands for heartburn, chest pain, nausea, vomiting, burping, water brush, appearance of a large amount of saliva, cough, difficulty in swallowing, passing black stool, abdominal pain, belching and fever, respectively. From the choice values given in Table 7, the doctors can decide that the patients at optimal choice such as; {u7, u8, u26, u27, u47, u50} are those who are affected with hiatal hernia disease. Meanwhile, the patients at suboptimal choice such as; {u2, u4, u10, u12, u14, u15, u17, u18, u19, u22, u23, u27, u30, u34, u35, u46, u48} have the tendency of being affected with hiatal hernia disease, if the precautions have not been taken. Similarly for all other patients, the doctors can put their decision according to their level of choice values.
Tabular form of (F, E) given in hiatal hernia disease problem
Tabular form of (F, E) given in hiatal hernia disease problem
Now, Our problem is to find a least subset of E that can provide the same decision order for the patients as the entire set of parameters. In other words, we have to identify those parameters (symptoms) in E which are jointly sufficient and individually necessary for the original decision order of the patients. For this, let us apply the ANPR algorithm to the given soft set (F, E), then by Algorithm 3, the parameters {u7, u9} and {u3} can be putted into the reduced parameter sets C and D, respectively. Further, there exist two subsets A = {e1, e4, e5, e10} and B = {e2, e4, e6, e11} such that;
f A (u1) = f A (u2) = . . . = f A (u50) =3, and
f B (u1) = f B (u2) = . . . = f B (u50) =3.
Thus A ∩ B = {e4} is kept into the reduced parameter reduction set D. Finally, E – C – D – Q = {e1, e2, e5, e6, e8, e10, e11, e12, e13} is the optimal normal parameter reduction as given by Table 8. From Table 8, the optimal and suboptimal choices for the reduced table are given by u10 and {u7, u8, u26, u27, u47, u50}, respectively, which are different from Table 7. This implies that the ANPR algorithm was failed to provide a minimum subset of E that can provide the same decision ability as the entire set of parameters.
Normal parameter reduction of (F, E) by ANPR algorithm
On the other hand, if we apply our proposed algorithm to the same problem, then we proceed as follows.
Tabular form of
Normal parameter reduction of (F, E) by the proposed algorithm
Now consider the reduced Table 10, then the optimal and suboptimal choices are given by {u7, u8, u26, u47, u50} and {u2, u4, u10, u12, u14, u15, u17, u18, u19, u22, u23, u27, u30, u34, u35, u46, u48}, respectively, which are same as that of Table 7. This shows that our proposed algorithm has the ability to find a subset of E with least number of symptoms that can provide the same decision order for the patients as the entire set of parameters. Thus, our algorithm helps the doctors in providing the same decision partition of the patients with limited number of symptoms and save their time.
Many algorithms have been developed for the normal parameter reduction of soft set, however, the case of repeated columns has not been gained considerable attention so far, although this phenomenon is very useful to reduce the workload of an algorithm. Recently in 2017, Danjuma et al. [7] consider this issue and proposed the ANPR algorithm for normal parameter reduction of soft sets. In this study, initially we discussed some drawbacks of the ANPR algorithm and shown that it has no mathematical existence. Then we presented an improved algorithm for normal parameter reduction of soft set that has overcome the previous problems of the ANPR algorithm. Some comparison results are given to show that the proposed algorithm has relatively less computational complexity and workload as compared to the existing algorithms of normal parameter reduction of soft set. Finally, the proposed algorithm is applied to the hiatal hernia disease problem to elaborate its application in a real-world problem. Since very limited practical applications of soft set-based parameter reduction can be found in the existing literature. Thus, the practical applications of soft set reduction require more attention and should be explored further.
Conflict of interest
The authors declare that there is no conflict of interest regarding the publication of this article.
Footnotes
Acknowledgments
This work is supported by the National Natural Science Foundation of China (Grant No. 61673011).
