Abstract
This paper proposes a couple of criteria for evaluating the quality and relevance of a fuzzy partition. These criteria are established from a fuzzy classification system and its recursive De Morgan triplet. We propose a comparison process between the classes of a fuzzy partition, based on a translation invariant similarity relation. Therefore a classification process is carried out with the equivalence relations determined by the similarity relation. Such a relation is built on the commutative group structure formed by the elements of the fuzzy classification system. Our approach is illustrated through an example on image analysis by the fuzzy c-means algorithm.
Keywords
Introduction
The process of recognition and classification of different stimuli into broader perceptions and concepts is one of the most fundamental activities of human thinking. As a matter of fact, one of the most primitive and common activities of human reasoning consists in generalizing similar objects into concepts or classes. Since the paper presented by Bellman, Kalaba and Zadeh [5] fuzzy techniques stand as one of the fields with major advances and applications in the area of (soft) classification and pattern recognition. In [3], [4], [24], [28] and [14] we can find a variety of applications, perspectives and approaches that have been developed in relation to these two areas.
The concept of a partition is, without any doubt, one of the pillars of unsupervised classification problems. In the fuzzy case, a standard notion for a partition has been that proposed by Ruspini [32], where the total degree of membership is completely distributed among all classes, which was later generalized by Krishnapuram and Keller [22], only requiring that the membership of an object to any class is positive. Therefore, under the notion of a Ruspini’s partition, the total sum of the membership degrees of an object to all classes must be equal to 1, while in the perspective of Krishnapuram and Keller, it is sufficient that for each element, its membership degree is greater than zero. Based on these ideas, and considering that in most cases fuzzy partitions do not verify the Ruspini condition, the notion of fuzzy classification systems have been defined by means of families of aggregation functions, allowing to analyze the classes obtained in a fuzzy partition [1]. Therefore, a classification problem can be evaluated through a fuzzy classification system regardless of the notion of partition used.
Fuzzy classification systems, as proposed in [1] (see also [2]), were conceived as a structure allowing the treatment of complex classification problems following two key ideas. On the one hand, under the theoretical framework introduced by Dombi [15], [16] regarding aggregation operators, fuzzy classification systems were proposed as an alternative approach for non-associative connectives through the use of the concept of recursiveness.
On the other hand, fuzzy classification systems were proposed as a structure allowing the evaluation of the established classification from De Morgan triplets i.e., by using recursive rules (satisfying the De Morgan laws) to evaluate three key characteristics of the family of classes that are obtained in a fuzzy partition: redundancy, coverage and relevance.
Redundancy refers to a certain orthogonality in the family of classes, which is then viewed as a particular representation system of the set of objects [1]. Hence, redundancy suggests the possible existence of an alternative representation to be found by means of an appropriate redefinition of the family of classes or by removing some of the existent classes.
The second characteristic of coverage refers to how different aspects of reality are fully verified by a family of classes. Finally, the third characteristic of relevance is understood as the necessity of including, or not excluding, a class or family of classes from the classification. In this way, decision makers can have a hint on how to improve the classifier performance, e.g., searching for some missing classes, proposing greater or smaller classes, or deleting some classes.
According to the above, fuzzy classification systems propose indexes that allow measuring the degree of redundancy (overlapping) between classes, the degree in which all of the classes accurately cover the aspect(s) of reality under consideration, and the degree in which some classes can be disregarded.
This approach is similar to some standard procedures in statistics and image analysis, where a crisp partition is pursued in such a way that every pixel is member of one and only one category (see e.g. [6]). Any automatic learning procedure should also take into account a number of indices, each one capturing a specific aspect that should be jointly balanced out for learning purposes.
In this line of research, some first approaches led to the development of works such as those presented in [8–10], and [7], and more recently in [29], [30], about a more in-depth study of aggregation functions for evaluating the redundancy and coverage of a particular classification. Such studies allowed the development of what are now known as overlap functions [8], [17] and grouping functions [9], [10]. These new functions do not impose the associativity of the most classic operators, such as t-norms and t-conorms, with which the concepts of redundancy and coverage, respectively, were originally defined (see, e.g., [20]).
An issue that is still open and with a broad field of development, is the study of the relevance property [7], focusing on the optimal number of clusters for a fuzzy partition. Previous works have studied this property from a more statistical perspective [2], or even as a dimensionality reduction problem [12].
Here we propose some first steps towards the characterization of relevance, and with it, a general study of a global quality criterion for a fuzzy partition. Therefore, this approach also allows establishing a stopping criterion for the calculation of the optimal number of classes for a fuzzy partition, based on the relevance of its classes.
The above allows evaluating the quality of a fuzzy partition. In this sense, an important highlight for our approach is that it does not require external tools for its application, such as the comparison with a fuzzy control partition (as suggested in [18]).
Our approach defines relevance as an observable property of a fuzzy partition. Therefore, two processes are necessary to determine the relevance of a class in a fuzzy partition: a comparison process among the classes of the fuzzy partition, and a process allowing to determine the degree of relevance of a class or set of classes. Regarding this comparison process, we identify a commutative group structure on any fuzzy classification system. Such structure is formed by the aggregations rules of any fuzzy classification system and the negation of these rules.
As mentioned above, the rules of a fuzzy classification system determine the degree of coverage and overlap, so their negations represent the degree of non-coverage and non-overlap, respectively. Keeping this in mind, we are interested in comparing such degrees looking for fuzzy partitions with high degrees of coverage and low degrees of overlap. In this sense, similarity relations [23] provide an interesting tool to carry out the mentioned comparison process, as they will allow, under specific properties, defining criteria for evaluating the quality of fuzzy partitions and the relevance of its classes.
The consideration of similarity relations is motivated by two facts: 1) Similarity relations satisfy a “min-transitivity” property that is similar to one of the conditions in the definition of fuzzy subgroups. 2) Similarity relations can be considered to effectively group elements into crips sets whose members are similar to each other to some specified degree, i.e., they generate a set of equivalence relations that can be ordered through inclusion. However, establishing similarity relations may not be a simple task because obtaining operators verifying the property of min-transitivity can be certainly complex. Furthermore, not any similarity relationship is useful for the mentioned comparison process, as we shall see in Section 4.2. In order to examine this problem, proximity relations [21] will be used to subsequently obtain similarity relations through transitive closure. We establish the conditions that are required for both relations to remain invariant.
This paper is organized as follows. In the next section, some preliminary definitions and theorems are presented. In Section 3, we discuss a general framework to address the concept of relevance. Subsequently, we analyze such concept in the perspective of a fuzzy partition. In Section 4, we present the commutative group structure formed from any fuzzy classification system. On this structure, we propose to consider a similarity relation to compare its elements. This similarity relation is in turn constructed through a proximity relation. We prove that under certain conditions, some elements remain invariant. In Section 5, we define a quality criterion and a relevance criterion that allow, through an algorithm, to establish a stopping criterion in the determination of an adequate the number of classes in a fuzzy partition. In the same line, we establish a comparison criterion to select the optimal fuzzy partition as long as the above criteria are satisfied. In Section 6, the proposed method is applied to an image segmentation problem, for which fuzzy partitions are obtained by using the fuzzy c-means algorithm. Finally, Section 7 is devoted to present some conclusions and future work.
Preliminaries
In this section some preliminary definitions and necessary results for the development of the proposal are established.
A (0, . . . , 0) = 0 and A (1, . . . , 1) =1,
Note that for each
A recursive rule is a family of operators allowing a sequential reckoning by means of a successive application of binary operators, once data have been properly ordered: the ordering rule assures that new data do not introduce modifications in the relative position of items already ordered. Recursiveness assures consistency of such a family of operators by assuming that such a rule is operational, in the sense that it can be evaluated both from left and right by means of a sequence of binary operators. A standard recursive rule will be one based upon the identity ordering rule (i.e., the ordering rule that keeps the order of data as they are given to us). Hence, recursiveness indeed allows the generalization of associativity, i.e., in case such a sequence of binary operators is given by a unique binary operator (L i = R j , for all i, j), we should be talking about a rule based upon an associative binary operator.
Recursive rules constitute an elegant formal mechanism to deal with aggregations of any arbitrary number of elements. Moreover, the recursive approach allows this mechanism to be robust under changes in the cardinality of the data, as it guarantees that all the operators in a family of aggregation functions are tightly related to a unique aggregation procedure: the binary operators that build up the recursive rule (see also [31]). This is useful in the context of unsupervised classification, where the number of classes in which the data has to be segmented is typically unknown a priori. In this context, recursive rules provide a robust mechanism to deal with the aggregation of diverse class information for different number of classes [1], automatically assessing the classification performance. We recall next the definition of fuzzy classification systems.
φ is a standard recursive rule, where φ2 (0, 1) = φ2 (1, 0) =0; N : [0, 1] → [0, 1] is a strong negation function, i.e., a continuous strictly decreasing function such that N (N (a)) = a, ∀ a ∈ [0, 1]; φ is a standard recursive rule such that, ∀n > 1, φ
n
(a1, . . . , a
n
) = N-1 [φ
n
(N (a1) , . . . , N (a
n
)], ∀ (a1, . . . , a
n
) ∈ [0, 1]
n
.
According to Definition 3, a fuzzy classification system can be denoted by (C, φ, φ, N). Notice that in a fuzzy classification system each x ∈ X has a membership degree μ c (x) associated with each class c ∈ C. As our purpose is to analyze the fuzzy classes, from now on we consider each standard recursive rule to act on such membership degrees, that is, for any given object x ∈ X we are interested in the sets φ {μ c (x)/ c ∈ C} or φ {μ c (x)/ c ∈ C} . In other words, the elements a i in the previous definition will be given by the membership degrees μ i (x), where μ i (x) denotes the membership degree of the object x to the i-th class of C, i = 1, . . . , n.
Thus, for instance, we have that
In the same line, notice that φ is a conjunctive recursive rule, in the sense that φ n (μ1 (x) , …, μ n (x)) =0, whenever μ j (x) =0 for certain j ∈ {1, . . . , n}. As a direct consequence, φ is a disjunctive recursive rule, in the sense that φ n (μ1 (x) , …, μ n (x)) =1, whenever there is μ j (x) such that μ j (x) =1.
Then, given a fuzzy partition C the coverage of an object x ∈ X is analyzed by means of the disjunctive rule φ {μ c (x)/ c ∈ C}, and its redundancy is analyzed by means of the conjunctive rule φ {μ c (x)/ c ∈ C}. Without loss of generality, we refer to φ n (μ1 (x) , μ2 (x) , …, μ n (x)) as φ n (c1, c2, …, c n ).
N (μ (x)) =1 - μ (x)
According to each selected rule, an aggregated value is obtained, to be understood as the degree up to which the family of fuzzy classes C satisfies a particular property or characteristic with respect to an object x ∈ X (in this case, coverage and redundancy). In this sense, it is also desirable to analyze the aggregated information of such characteristic for all objects x. To this aim, we propose the following definition.
Such aggregation operator can be of very different nature (conjunctive, disjunctive or averaging). Definition 4 is a generalization of the definition given in [13] (about the degree of global coverage) to any property analyzed by a recursive rule ρ n . The Example 2 illustrates the aim of this definition.
Disjunctive recursive rule
Disjunctive recursive rule
If we select the aggregation operator
Next we will remember the definitions corresponding to similarity relations, proximity relations and fuzzy subgroups.
Reflexivity: w (x, x) =1 Symmetry: w (x, y) = w (y, x)
Reflexivity: v (x, x) =1 Symmetry: v (x, y) = v (y, x) Min-transitivity: v (x, z) ≥ min {v (x, y) , v (y, z)}
μ (x · y) ≥ min {μ (x) , μ (y)}, ∀x, y ∈ G and μ (x-1) ≥ μ (x), ∀x ∈ G
1
According to [27] it is well-known that if v is a similarity relation on a set Ω, then v t = {(x, y) |v (x, y) ≥ t} is an equivalence relation on Ω for all t ∈ [0, 1]. This leads to the following theorem.
According to Definition 8, the next theorem establishes the connection between a translation invariant similarity relation and a normal fuzzy subgroup.
In this section we address the concept of relevance, establishing the elements to determine the relevance of an object. We use such elements to evaluate the relevance of a class in a fuzzy partition from a fuzzy classification system.
General framework
Relevance is a vague concept and, from a more general and intuitive perspective, people may be able to distinguish irrelevant information or, in some cases, more relevant information from less relevant information. The fact that there is a linguistic notion of relevance with a vague and variable meaning exposes the complexity of the problem and reveals different ways to approach it. Moreover, intuitions of relevance are relative to contexts, and there is no way of controlling exactly which context someone will have in mind at a given moment, or how to understand such a context [33].
According to the above, establishing that an element is relevant in a given context means that if the object is added or eliminated, there is a change in the context, or there is a change in information about such a context. It is precisely on this last case that we focus our study. In particular, in a first stage we study the changes that arise when the object is eliminated from the context and we analyze the resulting information as well. Thus, relevance can be understood as a local property and not as a global property.
Determining the relevance of an element in a context consists of carrying out two essential activities: the first consists in establishing a comparison process between the information obtained before and after eliminating that element. Subsequently, it is necessary to establish a process to determine the relevance degree of the classes. A comparison process is performed comparing diverse information provided by the conformation of two sets: the information of the context with the object and the information of the context without the object. In our proposal, the information of the context corresponds to the values obtained by the application of the recursive rules and their negations, according to the De Morgan triplet selected. This is, regarding a fuzzy partition, we compare the degrees of redundancy, coverage, non-relevance and non-coverage of the set of classes being studied. As mentioned above, relevance is represented here as a fuzzy concept and, therefore, is a matter of degree.
As a technical concept which can be suitable for being measured by computational methods, relevance requires a characterization that allows its formal understanding for computational use. Keeping this in mind, here we propose a new approach over relevance and the means for evaluating and measuring it regarding a given fuzzy partition.
The relevance of a class in a fuzzy partition has been addressed in [2] in terms of a statistical test. In such a proposal, the relevance of a class is evaluated through the disjunctive operator φ, comparing the degrees of coverage of the whole partition with those obtained without the class under study. Similarly, in [1] the relevance of a class is studied with the disjunctive operator φ, this time without using a statistical test. In both proposals, a single recursive rule is considered. In general, a large number of proposals to determine the quality of a fuzzy partition are based on the study of a single index or the measurement of a single property. Extensive reviews are presented in [19] and [34]. Under our approach, all recursive rules given by the fuzzy classification system and its negations are considered.
Relevance in a fuzzy partition
Under our approach, relevance is understood as the necessity of including, or not excluding, a class or family of classes from a given fuzzy partition. This situation may occur, for instance, because a class can generate high degrees of overlap without improving coverage, or because, although a class generates high degrees of coverage, some elements are better explained by another class. Therefore, we consider the evaluation of relevance as a dimensionality reduction problem i.e., given a set of data, find the best fuzzy partition with the least number of classes. Example 4 illustrates this problem.
Global degree
Global degree
However, note that class c4 can be eliminated without deteriorating the degree of coverage of the remaining classes. Indeed, after eliminating such a class we obtain the values
Regarding the above questions, we consider it necessary to use all the information available by the fuzzy classification system (recursive rules and their negations) and establish a procedure that does not consider recursive rules in isolation. Therefore, two key aspects arise: firstly, how to establish a structure that allows comparing recursive rules and their negations. Secondly, once this structure is established, how to determine a criterion that allows finding the optimal number of classes in which a set of data should be partitioned, considering the properties of coverage and redundancy of the set of classes and the relevance of each class. Therefore, we require a structure that allows studying the fuzzy partition from two perspectives: on the one hand, measuring properties of the whole partition or of a subset of classes of the partition (global properties); and on the other hand, measuring properties of each class (local properties).
The relevance property, in the case of fuzzy partitions, is a key element because an element or object can belong simultaneously to several classes, and sometimes its membership in one of these classes may not provide any discriminant information about the object.
Keeping this in mind, our proposal is established under the following reasoning. We consider a fuzzy classification system (C, φ, φ, N), where C = {c1, …, c n } describes the given context and φ, φ, N provides us with information about that context. We identify a commutative group structure in the set of recursive rules and their negations, i.e., G = {φ, φ, N (φ), N (φ)}. In order to compare the elements of such a group, we propose using a translation invariant similarity relation v on G and the corresponding equivalence classes v t obtained from v as described in Theorem 1. From such a comparison, we establish a criterion that allows determining, when a partition has optimal levels of coverage and redundancy, and subsequently, determine whether there are irrelevant classes or sets of classes.
We propose this approach motivated by the following ideas: 1) Establishing a commutative group structure and a similarity relationship on it is motivated by the fact that the “min-transitivity” property, established for such relations, is similar to one of the conditions of the fuzzy subgroup definition (see Definition 7 above). 2) It is well known that the possible equivalence relations on any set, when ordered by set inclusion, form a complete lattice. Therefore, this provides a mechanism to compare the elements of the commutative group G by order. However, as it will be explained in the next section, performing a comparison process by establishing a similarity relation in advance is not simple, and at the same time may lack meaning for the intend purpose. Thus, one way to solve this difficulty is through proximity relations, which constitute a first step to obtain similarity relations.
In this section we present the commutative group structure formed from any fuzzy classification system. On such a structure we propose to consider similarity relations as a way of comparing the elements of such a structure.
Commutative group of aggregation operators
From a fuzzy classification system (C, φ, φ, N), we consider two new mappings σ
n
, δ
n
: [0, 1]
n
→ [0, 1] , defined for all aggregation operators φ
n
and φ
n
. In this way, σ
n
: [0, 1]
n
→ [0, 1] is defined as
Notice that when we use the strong negation N on Eq. (1) or Eq. (2), the resulting expression can be interpreted as the complement of a proposition related to the classes {μ1 (x) , …, μ n (x)}.
In particular, if φ n (μ1 (x) , …, μ n (x)) represents the degree of coverage of the classes, then N (φ n (μ1 (x) , …, μ n (x))) represents the degree of non-coverage of the classes, understanding φ n as a proposition and N (φ n ) as the negation of such a proposition. In a similar way, if φ n (μ1 (x) , …, μ n (x)) represents the degree of redundancy of the classes, then N (φ n (μ1 (x) , …, μ n (x))) represents the degree of non-redundancy of the classes.
In the perspective of Definition 4, the degree of global non-covering and the degree of global non-overlap are defined and denoted as
Let us recall from Definition 3 that if N is a strong negation operator, then φ
n
(μ1 (x) , …, μ
n
(x)) = N [φ
n
(N (μ1 (x)) , …, N (μ
n
(x)))] and thus, given the mapping σ
n
, it holds that σ
n
(μ1 (x) , …, μ
n
(x)) = N (N [φ
n
(N (μ1 (x)) , …, N (μ
n
(x)))] and therefore,
The idea that motivates this construction is based on the possibility of establishing a relationship between the conjunctive and disjunctive operators, together with their negations, in such a way that a fuzzy partition can be evaluated taking into account both global and local properties of the corresponding fuzzy partition. We seek to compare the degrees of coverage, overlap, non-coverage and non-overlap in a iterative process for fuzzy partitions with different number of classes, and determine the partition with the highest quality. Such iterative process will be explained in the next section.
A close relationship is established between the set of mappings φ
n
, φ
n
, σ
n
and δ
n
, as follows. According to Eq. (1) and Eq. (2), we can write φ
n
, σ
n
and δ
n
in terms of φ
n
, and therefore we denote:
Thus, “∗”, “∧” and “∼” are operations on φ
n
forming φ
n
, σ
n
and δ
n
i.e., each one is a unary operation. In this sense, an identity operation i may be also defined as,
Let B be the set of such operations, i.e., B = {∗ , ∧ , ∼ , i } and let ∘ be the composition of operations. If ▽ , △∈B, then it is
Without loss of generality, we refer to φ n = φ n (μ1 (x) , …, μ n (x)). We denote by φ kn = φ n (μ1 (x k ) , …, μ n (x k )) the aggregation of the membership functions μ n of the element x k for the n classes.
Keeping this in mind, we establish Definition 10.
For instance,
Based on the above, Table 3 presents the results for operation ⊙ .
Operation ⊙
Clearly, (G o , ⊙) can be viewed as a normal subgroup of the alternating group A4 2 . Let G p = {e, a, b, c} be the set of permutations of G o onto itself, where
e=
b=
Here, under the composition of permutations, e is the identity composition and each composition is its own inverse. Thus, there is an isomorphism τ : G p → G o where τ (e) = φ n , τ (a) = σ n , τ (b) = δ n and τ (c) = φ n .
In general, a partition is of quality when it has high degrees of covering and low degrees of overlap. Therefore, when we include the degree of non-coverage and the degree of non-overlap a partition is considered of quality if it has low non-coverage degree and high non-overlap degree. However, considering the group structure formed by the degrees of coverage, overlap and its negations, it is possible to establish a quality criterion by comparing all the elements of such group simultaneously.
According to the above, the idea of similarity relations on groups appears naturally and allows establishing new structures that preserve the properties of the fuzzy subgroup.
In this section, we present a way to establish a translation invariant similarity relation v on G. From this relation v it is possible to define equivalence relations v t on G for all t ∈ [0, 1], which can be ordered by inclusion. These equivalence relations are composed of the information coming from comparing the rules of the fuzzy classification system and its negations. Therefore, we can order such information and thus analyze, in a simultaneous way, both the coverage and the redundancy of the classes.
The above is an aspect to highlight in our proposal because a calibration procedure for different indices is being considered, and not just the indices separately or even a single index.
Based on the theoretical framework developed in [27], consider the G
o
and G
p
groups addressed in Proposition 1, and let v
o
be a similarity relation on G
o
and v the similarity relation defined as follows:
v (a, b) = min {v o (a (φ n ) , b (φ n )) , v o (a (σ n ) , b (σ n )) , v o (a (δ n ) , b (δ n )) , v o (a (φ n ) , b (φ n ))}
From Eq. (4) it is immediate that the following proposition is fulfilled.
Notice that the construction of the similarity relation v requires the similarity relation v0. The relation v is necessary in order to be consistent with the established group structure, since the relation v0 is not enough to this aim. However, establishing a similarity relation is not always a simple process because the property of min-transitivity is difficult to verify. Thus, to obtain a similarity relation v o on G, which will give rise to the similarity relation v via Eq. 4, we propose considering a fuzzy proximity relation denoted by w. Such relation ensures that in the comparison process, each element of G is declared as totally similar to itself, i.e., the similarity of the element with itself is equal to 1. In the same line, the symmetry condition of proximity relations assures a consistent comparison process because the differences or similarities between two elements should not depend on the order in which they are related. In this sense, the properties of reflexivity and symmetry are very appropriate for expressing the degree of "closeness" or "proximity" between elements.
Subsequently, we obtain the transitive clousure of w. Let us recall that since the lack of transitivity distinguishes proximity relations from similarity relations, the transitive clousures of proximity relations are similarity relations. Thus, if we denote by
It is important to note that the selected proximity relation must effectively reflect the idea of closeness.
At this point, let us remark two important aspects about the above line of reasoning: 1) The similarity relation v allows undertaking a comparison process by inclusion of equivalence relations. Regarding this process, the interpretation of the proximity relation is as follows: w (φ
n
, φ
n
) provides the degree that covers the redundancy of the classes, w (φ
n
, σ
n
) provides the degree of proximity between the covering and the non-covering and, w (φ
n
, δ
n
) provides the degree of coverage without considering the redundancy of the classes. Therefore, it is desirable, for instance, that the degree of similarity between φ
n
and φ
n
, i.e., the degree that covers the redundancy of the classes, be low and lower than the degree of similarity between
To see this theorems, let us recall that G
p
= {e, a, b, c} is the group formed by the set of even permutations of G
o
onto itself,
Operation ⊙
Let H be the membership matrix of w as shown in Table 4. Then, we compute H′ = H ∪ (H ∘ H) with the max operator for union and max - min composition to find v o , i.e., the transitive closure of w. If H′ ≠ H then set H = H′ and repeat the process until the equality is reached. After the first step, we obtain
If m > s = k = p, then H′ = H, and thus H′ is the membership matrix of v
o
. Therefore, we compute v (c, b) and v (c, a) as follow:
Now, in an analogous way we obtain v (c, a) = s. However, if m > s = k > p then H′ ≠ H, and therefore the process is repeated with H = H′, i.e., H′′ = H′ ∪ (H′ ∘ H′). This leads to
In this case, it is H′′ = H′ and again v (c, b) = m and v (c, a) = s. However, if m > s > k > p then H′′≠ H′, and therefore the process has to be repeated with H′ = H′′, i.e., H′′′ = H′′ ∪ (H′′ ∘ H′′). Now, it is
As H′′′ = H′′, the transitive closure has been obtained after considering all possible cases, and again it holds that v (c, b) = m and v (c, a) = s.□ Theorem 3 indicates that, under certain conditions, the similarity and proximity relations between
Additionally, Theorem 3 and Theorem 4 guarantee that it is possible to consider only the values obtained by the proximity relation and the order relation on them, without altering the order relation (by inclusion) on equivalence relations. Similarly, the structure of fuzzy subgroup obtained is preserved (see Theorem 2). We will use this result in the next section to establish a quality criterion that is invariant under both similarity and proximity relations.
Based on the algebraic structure of commutative group and a translation invariant similarity relation on such group, in this section we define a quality criterion for a fuzzy partition and a relevance criterion for classes. An algorithm that describes the comparison process is presented. Such algorithm allows establishing the minimum number of classes in a fuzzy partition in such a way that a high-quality fuzzy partition is obtained. Additionally, we present a criterion that allows comparing two fuzzy partitions that meet the quality criterion and the relevance criterion.
Following the arguments and remarks previous to Theorem 3, we propose the first quality criterion. Consider a fuzzy classification system (C, φ, φ, N) with C = {c1, . . . , c n }, where C is obtained from a fuzzy iterative, non-hierarchical classification algorithm (e.g., fuzzy c-means or possibilistic c-means among others). Also, consider the groups G o = {φ n , φ n , σ n , δ n } and G p = {e, a, b, c} together with a translation invariant similarity relation v on G p (according to Eq. 4). Then, it is possible to compute the equivalence relations v t = {(x, y) |v (x, y) ≥ t} for all t ∈ [0, 1]. In this way, it holds that if 0 ≤ t1 < ⋯ < t p = 1, then v t p ⊆ ⋯ ⊆ v t 1 = G o × G o .
Let
K = vt=k (x, y) ={ (x, y) |v (x, y) ≥ k },
M = vt=m (x, y) = {(x, y) |v (x, y) ≥ m},
S = vt=s (x, y) = {(x, y) |v (x, y) ≥ s}.
Taking into account the above considerations and as well as Theorem 3 and Theorem 4, the following quality criterion is defined:
Remember that m provides the degree of coverage without considering the redundancy of the classes. Furthermore, we can extend the same argument for a second relevance criterion.
Notice that this criterion proceeds by removing a class from the partition under consideration, one at a time, in order to study its relevance. In this sense, the removed class will be relevant if the aggregation of the remaining classes indicates that the degree of non-coverage is greater than the degree of overlap. In other words, a class is considered relevant if its elimination decreases the degree of coverage in a greater proportion than the degree of overlap.
Therefore, we can state on more formal terms the basic intuition supporting both criteria, by the following definition.
According to the criteria established above, it is possible to develop a procedure to assess the quality and relevance of a given fuzzy partition. This procedure is summarized in the Algorithm 1.
This Algorithm 1 proceeds as follows: In the first step, a partition C = {c1, c2} is computed through an iterative, non-hierarchical classification algorithm. For practical purposes we will use the fuzzy c-means. From a proximity relation, Algorithm 1 determines if the partition C is of quality. In the case n = 2 (two classes), the relevance of the classes is not evaluated because, under our framework, the aggregation of a single element or fuzzy partitions with a single class are not considered. Therefore, if the partition with two classes is of quality, then the algorithm ends. Otherwise, the algorithm increases the number of classes by one, returning to step 1, until a high-quality partition is found, that is, a quality partition in which all its classes are relevant.
1: C = output partition of the fuzzy c-means with n classes
2: Compute φ n , φ n , σ n and δ n . Consider G o = {φ n , φ n , σ n , δ n }
3: Compute w on G o . Let k = w (φ n , φ n ), m =. w (φ n , δ n ), s = w (φ n , σ n )
4:
5:
6: n = n + 1, return to 1.
7:
8:
9:
10:
11:
12: Compute C i for all i = 1, …, n
13: Compute φ i , φ i , σ i and δ i for each C i and consider G i = {φ i , φ i , σ i , δ i }
14:
15:
16:
17:
18:
19:
Notice that, in particular, the process ends when a high-quality fuzzy partition has been found, i.e, a partition fullfiling the quality and relevance criteria and with the smallest number of classes. However, it can happen that there is no optimal number of classes. In this case, it is suggested to restart the fuzzy c-means algorithm with a different seed or with a different fuzziness parameter. Similarly, it can happen that there are two partitions fulfilling all the proposed criteria. In principle, the partition with the smallest number of classes is desirable, but this does not guarantee that relevant information is not lost. Therefore, we propose an additional criterion to compare two high-quality fuzzy partitions when we use Algorithm 1 several times under different configurations (different seeds or fuzziness parameter values). Let us stress that this criterion is not used in Algorithm 1.
Application
In order to apply the defined criteria and Algorithm 1, we have selected the image presented in Fig 1., considering the unsupervised classification problem of obtaining classes of similar pixels. We consider the recursive triplet of Example 1 and the translation invariant similarity relation of Example 5.

Aurora borealis.
According to Algorithm 1, we start by obtaining a partition with n=2 classes through the fuzzy c-means algorithm, leading to obtain m = 1, s = 0.44, k = 0.44. As s = k, then the partition is of low quality.The corresponding classes of the fuzzy partition are presented in Fig. 2.

Classes obtained by applying the fuzzy 2-means algorithm (left: class 1, right: class 2). The gray scale represents the membership degree of each pixel to each class, where black = 0 and white = 1.
The low quality of the partition obtained from the image can be observed in some regions where objects with different intensity of color have been classified in the same class. For instance, there is almost no distinction between trees and much of the sky.
According to Algorithm 1, we again apply the fuzzy c-means obtain a new partition with n = 3 classes. Then, we compute the proximity relation w, which is shown in Table 5.
Proximity relation w
As w (φ3, δ3) =0.7 > w (φ n , σ n ) =0.61 this 3-class partition is a quality partition. After computing the transitive closure of w, we compute the (translation invariant) similarity relation v, which is shown in Table 6.
Similarity relation v
Thus, we have that vt=0.7 (c, b) ⊂ vt=0.61 (c, a). Equivalently, by Theorem 2 and Theorem 3, we have that v (c, b) =0.7 ⊂ v (c, a) =0, 61 (being v a fuzzy subgroup).
As the partition is a quality partition, following Algorithm 1 and according to Step 12, we evaluate the relevance of the classes considering the 2-class sub-partitions C1, C2, C3 and computing the corresponding values for w.
For C3 it is k3 = w (φ2, φ2) =0.67 > w (φ2, σ2) =0.46 = s3. For C2 it is k2 = w (φ2, φ2) =0.59 > w (φ2, σ2) =0.4 = s 2 and for C1 it is k3 = w (φ2, φ2) =0.69 > w (φ2, σ2) =0.48 = s3.
According to the Relevance Criterion and Step 16, all the classes are then highly relevant. The corresponding classes of the fuzzy partition are presented in Fig. 3.

Classes obtained after applying the fuzzy 3-means algorithm (left: class 1, center: class 2 and right: class 3). The gray scale represents the membership degree of each pixel to each class, where black = 0 and white = 1.
Under the procedure above, we may conclude that this 3-class partition obtained through the fuzzy c-means is a high-quality partition. In this partition, for example, there is a much better distinction than before between the trees and the sky. In general, with a fuzzy partition with three classes, the image is segmented into regions that allow understanding it in terms of the intensity of the color of the objects.
As a way of checking these results, we later applied the fuzzy c-means to obtain a partition with n = 4 classes, for which the different quality indicators were computed. Particularly, for this partition we obtained m = w (φ4, δ4) =0.62 < w (φ4, σ4) =0.73 = s. Therefore, this 4-class partition was assessed as of low quality, which somehow provides a confirmation of the previous result.
Furthermore, in order to illustrate the robustness of the obtained results, the procedure has been performed with two different De Morgan triplets (given in Table 7). Table 8 summarizes the results.
Two De Morgan Triplets
Quality indicators of the given 3-class partition with two different recursive triplets.
Under triplets 1 and 2 the same result was obtained, allowing us to assert that the optimal number of classes for the image worked are three classes.
Through this paper, some basic elements are examined for characterizing the property of relevance regarding the evaluation of a fuzzy partition from a classification system. In particular, three aspects are considered for the study of relevance: 1) the comparison process between classes and the way they cover the objects under consideration; 2) the estimation of degrees of intensity in the changes generated by the elements in the valuation space; and 3) the specification of a stopping criterion for inclusion of classes in a fuzzy partition.
We explore the algebraic group structure G o = {φ n , φ n , σ n , δ n }, established from a fuzzy classification system. On such a structure, a translation invariant similarity relation v is applied in such a way that the equivalence relations v t form a lattice under inclusion. Such equivalence relations acquire practical meaning because they allow comparing the coverage of a family of classes and the overlap degree, non-covering degree and non-overlap degree. However, our proposal establishes similarity relations based on proximity relations. In this way, some values of both relations remain invariant. This is, we prove that under certain conditions the degrees of similarity and the degrees of proximity for certain elements of the group are the same. Such conditions allow establishing a criterion of relevance for the classes and a criterion of quality for the partitions.
Degrees of similarity have been presented by examining pairs of properties, such as the similarity between the coverage degree and the overlap degree, or the similarity between the degree of grouping and non-overlap. Based on this, it is desirable that there is a greater similarity between the degrees of coverage and non-overlap, than between the degree of coverage and overlap. Hence, we proposed a couple of criteria that allow determining the quality and relevance of a fuzzy partition, showing that relevance is a local property and is responsible for determining the size of the partition, i.e., the optimal number of classes of a fuzzy partition.
We established a process that helps the decision maker to choose the best family of classes. Such process, that we have described algorithmically, is a iterative process because such an algorithm requires obtaining fuzzy partitions with different number of classes, and possibly also different initialization, as many times as necessary, until a high-quality partition is found. In this sense, the process ends when it has found the high-quality fuzzy partition with the lowest number of classes (if possible).
In general, a fuzzy partition is said to be of quality if it has high degrees of coverage and low degrees of overlap. However, this is a feature that several partitions may have. Therefore, we established an extra criterion that allows comparing two or more high-quality fuzzy partitions. In such a process, the evaluation of the relevance of the classes, as expected, is a determining aspect.
As future work, we propose to compare the established criteria with traditional indexes for fuzzy partitions, in such a way that two scenarios can be presented: on the one hand, complementing the existing indexes with the criteria we have proposed or, on the other hand, finding other advantages in the use of the criteria.
In the same line, it is necessary to carry out a study on the performance of the criteria with different families of De Morgan triplets and different proximity relations. In general, we established a process that does not depend on these two elements, however, we may carry out processes that are computationally inefficient. A key aspect in our proposal is to establish a proximity relation that effectively reflects the idea of closeness.
Another future work that may be addressed in this context is the introduction of paired fuzzy sets (see [25]). In this way we could extend the classification model not only introducing general aggregation tools like overlap and grouping degrees, but incorporating opposite arguments, as for instance, the degree of class separation, perhaps understanding separate as the opposite concept of overlap. Despite the apparent increase in computation difficulties due to the extra information being included, the existence of opposites should help in collecting more evidence for properly assessing the quality of a partition, and hopefully improve the confidence in the model. Similarly, our proposal may be extended to the alternative approach proposed in [26], where the concept of an aggregation rule should be defined from a computational point of view, focusing on the computational properties of such an aggregation, i.e., on the manner in which the aggregation values are computed.
Footnotes
In a group structure (G, ∗), x-1 denotes the inverse of the element x for operation ∗.
An alternating group is a group of even permutations on a set of length n, denoted A n or Alt (n). Alternating groups are therefore permutation groups. In particular, A4 = {id, (12) (34) , (13) (24) , (14) (23)} .
Acknowledgment
This research has been partially supported by the Government of Spain (grant PGC2018-096509-B-I00) Complutense University (UCM Research Group 910149) and Gran Colombia University (grant JCG2019-FCEM-01).
