Evaluation of the quality and relevance of a fuzzy partition

Abstract

This paper proposes a couple of criteria for evaluating the quality and relevance of a fuzzy partition. These criteria are established from a fuzzy classification system and its recursive De Morgan triplet. We propose a comparison process between the classes of a fuzzy partition, based on a translation invariant similarity relation. Therefore a classification process is carried out with the equivalence relations determined by the similarity relation. Such a relation is built on the commutative group structure formed by the elements of the fuzzy classification system. Our approach is illustrated through an example on image analysis by the fuzzy c-means algorithm.

Keywords

Relevance, fuzzy classification system, translation invariant similarity relation, fuzzy partitions, proximity relations

1 Introduction

The process of recognition and classification of different stimuli into broader perceptions and concepts is one of the most fundamental activities of human thinking. As a matter of fact, one of the most primitive and common activities of human reasoning consists in generalizing similar objects into concepts or classes. Since the paper presented by Bellman, Kalaba and Zadeh [5] fuzzy techniques stand as one of the fields with major advances and applications in the area of (soft) classification and pattern recognition. In [3], [4], [24], [28] and [14] we can find a variety of applications, perspectives and approaches that have been developed in relation to these two areas.

The concept of a partition is, without any doubt, one of the pillars of unsupervised classification problems. In the fuzzy case, a standard notion for a partition has been that proposed by Ruspini [32], where the total degree of membership is completely distributed among all classes, which was later generalized by Krishnapuram and Keller [22], only requiring that the membership of an object to any class is positive. Therefore, under the notion of a Ruspini partition, the sum of the membership degrees of an object to all classes must be equal to 1, while in the perspective of Krishnapuram and Keller, it is sufficient that for each element, its membership degree is greater than zero. Based on these ideas, and considering that in most cases fuzzy partitions do not verify the Ruspini condition, the notion of fuzzy classification system has been defined by means of families of aggregation functions, which allow the classes obtained in a fuzzy partition to be analyzed [1]. Therefore, a classification problem can be evaluated through a fuzzy classification system regardless of the notion of partition used.

Fuzzy classification systems, as proposed in [1] (see also [2]), were conceived as a structure allowing the treatment of complex classification problems following two key ideas. On the one hand, under the theoretical framework introduced by Dombi [15], [16] regarding aggregation operators, fuzzy classification systems were proposed as an alternative approach for non-associative connectives through the use of the concept of recursiveness.

On the other hand, fuzzy classification systems were proposed as a structure allowing the evaluation of the established classification from De Morgan triplets i.e., by using recursive rules (satisfying the De Morgan laws) to evaluate three key characteristics of the family of classes that are obtained in a fuzzy partition: redundancy, coverage and relevance.

Redundancy refers to a certain orthogonality in the family of classes, which is then viewed as a particular representation system of the set of objects [1]. Hence, redundancy suggests the possible existence of an alternative representation to be found by means of an appropriate redefinition of the family of classes or by removing some of the existent classes.

The second characteristic of coverage refers to how different aspects of reality are fully verified by a family of classes. Finally, the third characteristic of relevance is understood as the necessity of including, or not excluding, a class or family of classes from the classification. In this way, decision makers can have a hint on how to improve the classifier performance, e.g., searching for some missing classes, proposing greater or smaller classes, or deleting some classes.

According to the above, fuzzy classification systems propose indexes that allow measuring the degree of redundancy (overlapping) between classes, the degree in which all of the classes accurately cover the aspect(s) of reality under consideration, and the degree in which some classes can be disregarded.

This approach is similar to some standard procedures in statistics and image analysis, where a crisp partition is pursued in such a way that every pixel is member of one and only one category (see e.g. [6]). Any automatic learning procedure should also take into account a number of indices, each one capturing a specific aspect that should be jointly balanced out for learning purposes.

In this line of research, some first approaches led to the development of works such as those presented in [8–10] and [7], and more recently in [29], [30], about a more in-depth study of aggregation functions for evaluating the redundancy and coverage of a particular classification. Such studies allowed the development of what are now known as overlap functions [8], [17] and grouping functions [9], [10]. These new functions do not impose the associativity of the most classic operators, such as t-norms and t-conorms, with which the concepts of redundancy and coverage, respectively, were originally defined (see, e.g., [20]).

An issue that is still open and with a broad field of development, is the study of the relevance property [7], focusing on the optimal number of clusters for a fuzzy partition. Previous works have studied this property from a more statistical perspective [2], or even as a dimensionality reduction problem [12].

Here we propose some first steps towards the characterization of relevance, and with it, a general study of a global quality criterion for a fuzzy partition. Therefore, this approach also allows establishing a stopping criterion for the calculation of the optimal number of classes for a fuzzy partition, based on the relevance of its classes.

The above allows evaluating the quality of a fuzzy partition. In this sense, an important highlight for our approach is that it does not require external tools for its application, such as the comparison with a fuzzy control partition (as suggested in [18]).

Our approach defines relevance as an observable property of a fuzzy partition. Therefore, two processes are necessary to determine the relevance of a class in a fuzzy partition: a comparison process among the classes of the fuzzy partition, and a process that determines the degree of relevance of a class or set of classes. Regarding this comparison process, we identify a commutative group structure on any fuzzy classification system. Such a structure is formed by the aggregation rules of the fuzzy classification system and the negations of these rules.

As mentioned above, the rules of a fuzzy classification system determine the degree of coverage and overlap, so their negations represent the degree of non-coverage and non-overlap, respectively. Keeping this in mind, we are interested in comparing such degrees looking for fuzzy partitions with high degrees of coverage and low degrees of overlap. In this sense, similarity relations [23] provide an interesting tool to carry out the mentioned comparison process, as they will allow, under specific properties, defining criteria for evaluating the quality of fuzzy partitions and the relevance of its classes.

The consideration of similarity relations is motivated by two facts: 1) Similarity relations satisfy a “min-transitivity” property that is similar to one of the conditions in the definition of fuzzy subgroups. 2) Similarity relations can be considered to effectively group elements into crisp sets whose members are similar to each other to some specified degree, i.e., they generate a set of equivalence relations that can be ordered through inclusion. However, establishing similarity relations may not be a simple task, because obtaining operators that verify the property of min-transitivity can be complex. Furthermore, not every similarity relation is useful for the mentioned comparison process, as we shall see in Section 4.2. In order to examine this problem, proximity relations [21] will be used to subsequently obtain similarity relations through transitive closure. We establish the conditions that are required for both relations to remain invariant.

This paper is organized as follows. In the next section, some preliminary definitions and theorems are presented. In Section 3, we discuss a general framework to address the concept of relevance. Subsequently, we analyze this concept from the perspective of a fuzzy partition. In Section 4, we present the commutative group structure formed from any fuzzy classification system. On this structure, we propose to consider a similarity relation to compare its elements. This similarity relation is in turn constructed through a proximity relation. We prove that under certain conditions, some elements remain invariant. In Section 5, we define a quality criterion and a relevance criterion that allow, through an algorithm, establishing a stopping criterion in the determination of an adequate number of classes in a fuzzy partition. In the same line, we establish a comparison criterion to select the optimal fuzzy partition as long as the above criteria are satisfied. In Section 6, the proposed method is applied to an image segmentation problem, for which fuzzy partitions are obtained by using the fuzzy c-means algorithm. Finally, Section 7 presents some conclusions and future work.

2 Preliminaries

In this section some preliminary definitions and necessary results for the development of the proposal are established.

Definition 1. [11] An aggregation operator $A : ⋃_{n \in ℕ} {[0, 1]}^{n} \to [0, 1]$ is a function that satisfies:

A (0, . . . , 0) = 0 and A (1, . . . , 1) =1,

$\forall n \in ℕ$ , if x₁ ≤ y₁, . . . , x_n ≤ y_n, then A (x₁, . . . , x_n) ≤ A (y₁, . . . , y_n)

Note that for each $n \in ℕ$ an n-ary aggregation operator A_n is defined and therefore, A is understood as a family of n-ary aggregation operators. For simplicity, in what follows we shall use the term aggregation operator to refer to both a family of n-ary aggregation operators and n-ary operators themselves.

Definition 2. [1] A recursive rule ρ is a family of aggregation operators $\{ρ_{n} : [0, 1]^{n} \to [0, 1]\}_{n > 1}$ such that there exist an ordering rule π and two sequences of binary operators $\{L_{n} : [0, 1]^{2} \to [0, 1]\}_{n > 1}$ and $\{R_{n} : [0, 1]^{2} \to [0, 1]\}_{n > 1}$ such that for each n and for each (a₁, . . . , a_n) ∈ [0, 1]ⁿ,

$ρ_{n} (a_{π (1)}, \dots, a_{π (n)}) = L_{n} (ρ_{n - 1} (a_{π (1)}, \dots, a_{π (n - 1)}), a_{π (n)}) = R_{n} (a_{π (1)}, ρ_{n - 1} (a_{π (2)}, \dots, a_{π (n)}))$.

A recursive rule is a family of operators allowing a sequential reckoning by means of a successive application of binary operators, once data have been properly ordered: the ordering rule assures that new data do not introduce modifications in the relative position of items already ordered. Recursiveness assures consistency of such a family of operators by assuming that such a rule is operational, in the sense that it can be evaluated both from left and right by means of a sequence of binary operators. A standard recursive rule will be one based upon the identity ordering rule (i.e., the ordering rule that keeps the order of data as they are given to us). Hence, recursiveness indeed allows the generalization of associativity, i.e., in case such a sequence of binary operators is given by a unique binary operator (L_i = R_j, for all i, j), we should be talking about a rule based upon an associative binary operator.
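The left and right evaluations in Definition 2 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the identity ordering rule is used, ρ₁(a) = a is assumed as the base case, and the helper names (`left_eval`, `right_eval`, `mean_L`, `mean_R`) are hypothetical. The arithmetic mean is used as the example because it is not associative, yet it is recursive once the binary operators are allowed to depend on n.

```python
def left_eval(values, L):
    """rho_n(a_1..a_n) = L_n(rho_{n-1}(a_1..a_{n-1}), a_n), assuming rho_1(a) = a."""
    acc = values[0]
    for n in range(2, len(values) + 1):
        acc = L(n, acc, values[n - 1])
    return acc


def right_eval(values, R):
    """rho_n(a_1..a_n) = R_n(a_1, rho_{n-1}(a_2..a_n)), assuming rho_1(a) = a."""
    acc = values[-1]
    # At step k the accumulator aggregates the last k - 1 items; add the k-th from the right.
    for k in range(2, len(values) + 1):
        acc = R(k, values[-k], acc)
    return acc


def mean_L(n, partial, new):
    # L_n(m, a) = ((n - 1) * m + a) / n
    return ((n - 1) * partial + new) / n


def mean_R(n, new, partial):
    # R_n(a, m) = (a + (n - 1) * m) / n
    return (new + (n - 1) * partial) / n


data = [0.8, 0.1, 0.1, 0.4]
left_mean = left_eval(data, mean_L)
right_mean = right_eval(data, mean_R)

# Associative special case: a single binary operator (here min) for every n.
left_min = left_eval(data, lambda n, partial, new: min(partial, new))
right_min = right_eval(data, lambda n, new, partial: min(new, partial))
```

Both evaluations of the mean agree with the ordinary average of `data`, and both evaluations of the minimum agree with `min(data)`, which is the consistency that recursiveness is meant to guarantee.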

Recursive rules constitute an elegant formal mechanism to deal with aggregations of any arbitrary number of elements. Moreover, the recursive approach allows this mechanism to be robust under changes in the cardinality of the data, as it guarantees that all the operators in a family of aggregation functions are tightly related to a unique aggregation procedure: the binary operators that build up the recursive rule (see also [31]). This is useful in the context of unsupervised classification, where the number of classes in which the data has to be segmented is typically unknown a priori. In this context, recursive rules provide a robust mechanism to deal with the aggregation of diverse class information for different number of classes [1], automatically assessing the classification performance. We recall next the definition of fuzzy classification systems.

Definition 3. [1] Let us assume a finite set of objects X. A fuzzy classification system is a finite family C of n fuzzy classes, where each c ∈ C has an associated membership function μ_c : X → [0, 1], together with a recursive triplet (ϕ, φ, N) such that

ϕ is a standard recursive rule, where ϕ₂ (0, 1) = ϕ₂ (1, 0) = 0;

N : [0, 1] → [0, 1] is a strong negation function, i.e., a continuous strictly decreasing function such that N (N (a)) = a, ∀ a ∈ [0, 1];

φ is a standard recursive rule such that, ∀n > 1, φ_n (a₁, . . . , a_n) = N^-1 [ϕ_n (N (a₁), . . . , N (a_n))], ∀ (a₁, . . . , a_n) ∈ [0, 1]ⁿ.

According to Definition 3, a fuzzy classification system can be denoted by (C, ϕ, φ, N), where ϕ is the conjunctive rule and φ is the disjunctive rule. Notice that in a fuzzy classification system each x ∈ X has a membership degree μ_c (x) associated with each class c ∈ C. As our purpose is to analyze the fuzzy classes, from now on we consider each standard recursive rule to act on such membership degrees, that is, for any given object x ∈ X we are interested in the values ϕ {μ_c (x) / c ∈ C} or φ {μ_c (x) / c ∈ C}. In other words, the elements a_i in the previous definition will be given by the membership degrees μ_i (x), where μ_i (x) denotes the membership degree of the object x to the i-th class of C, i = 1, . . . , n.

Thus, for instance, we have that $φ_{n} (μ_{1} (x), μ_{2} (x), \dots, μ_{n} (x)) = N^{- 1} [ϕ_{n} (N (μ_{1} (x)), \dots, N (μ_{n} (x)))]$.

In the same line, notice that ϕ is a conjunctive recursive rule, in the sense that ϕ_n (μ₁ (x), …, μ_n (x)) = 0 whenever μ_j (x) = 0 for some j ∈ {1, . . . , n}. As a direct consequence, φ is a disjunctive recursive rule, in the sense that φ_n (μ₁ (x), …, μ_n (x)) = 1 whenever there is μ_j (x) such that μ_j (x) = 1.

Then, given a fuzzy partition C, the coverage of an object x ∈ X is analyzed by means of the disjunctive rule φ {μ_c (x) / c ∈ C}, and its redundancy is analyzed by means of the conjunctive rule ϕ {μ_c (x) / c ∈ C}. Without loss of generality, we refer to φ_n (μ₁ (x), μ₂ (x), …, μ_n (x)) as φ_n (c₁, c₂, …, c_n), and similarly for ϕ_n.

Example 1. We can consider a recursive triplet (ϕ, φ, N) such that

$ϕ_{n} (μ_{1} (x), \dots, μ_{n} (x)) = \frac{3 \prod_{k = 1}^{n} μ_{k} (x)}{1 + 2 \prod_{k = 1}^{n} μ_{k} (x)}$ (conjunctive rule),

$φ_{n} (μ_{1} (x), \dots, μ_{n} (x)) = \frac{1 - \prod_{k = 1}^{n} (1 - μ_{k} (x))}{1 + 2 \prod_{k = 1}^{n} (1 - μ_{k} (x))}$ (disjunctive rule),

$N (μ (x)) = 1 - μ (x)$.
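As a quick check that this triplet satisfies the De Morgan condition of Definition 3, write $P = \prod_{k = 1}^{n} (1 - μ_{k} (x))$ (a shorthand introduced here only for this verification). Evaluating the conjunctive rule at the negated memberships gives $\frac{3 P}{1 + 2 P}$, and applying N to this value yields

$1 - \frac{3 P}{1 + 2 P} = \frac{1 + 2 P - 3 P}{1 + 2 P} = \frac{1 - P}{1 + 2 P}$,

which is exactly the disjunctive rule of the example. Note that N^-1 = N for N (a) = 1 - a.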

According to each selected rule, an aggregated value is obtained, to be understood as the degree up to which the family of fuzzy classes C satisfies a particular property or characteristic with respect to an object x ∈ X (in this case, coverage and redundancy). In this sense, it is also desirable to analyze the aggregated information of such characteristic for all objects x. To this aim, we propose the following definition.

Definition 4. Let X = {x₁, . . . , x_m} be a finite set of objects, C = {c₁, . . . , c_n} a family of fuzzy classes over this set and ρ_n an aggregation operator representing the degree up to which the family of fuzzy classes satisfies a specific property for an object x ∈ X. Then the global degree of ρ_n on X, denoted $ρ_{n}^{T}$, is defined as the aggregation of the values of ρ_n for all objects x ∈ X by means of an aggregation operator A. That is, $ρ_{n}^{T} = A (ρ_{1 n}, \dots, ρ_{mn})$, where ρ_jn denotes the degree up to which the n fuzzy classes satisfy a specific property for object x_j, j = 1, …, m.

Such an aggregation operator can be of very different nature (conjunctive, disjunctive or averaging). Definition 4 is a generalization of the definition given in [13] (about the degree of global coverage) to any property analyzed by a recursive rule ρ_n. Example 2 illustrates the aim of this definition.

Example 2. Let C = {c₁, c₂, c₃} be a partition with three classes of the set X = {x₁, x₂, x₃, x₄}. We select the disjunctive recursive rule φ_n from Example 1 to evaluate the coverage (see Table 1).

Table 1
Disjunctive recursive rule

Objects	c₁	c₂	c₃	φ₃ (c₁, c₂, c₃)
x₁	0.8	0.1	0.1	0.6329
x₂	0.7	0.2	0.1	0.5474
x₃	0.3	0.62	0.08	0.5070
x₄	0.5	0.47	0.03	0.4908

If we select the aggregation operator $A (φ_{13}, φ_{23}, φ_{33}, φ_{43}) = \frac{1}{4} (φ_{13} + φ_{23} + φ_{33} + φ_{43})$ (remember that φ_ij denotes the degree of coverage of j classes with respect to the object i), then a global degree of coverage of the class set, i.e., $φ_{3}^{T} =$ A (φ₁₃, φ₂₃, φ₃₃, φ₄₃) =0.5445, is obtained.
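The computation in Example 2 can be reproduced with the short Python sketch below. It assumes the operators of Example 1 and the arithmetic mean as the aggregation operator A; the function names (`conj`, `disj`, `neg`) are chosen here for illustration. The values agree with Table 1 up to rounding in the last decimal place.

```python
from math import prod


def neg(a):
    # Strong negation N(a) = 1 - a from Example 1.
    return 1 - a


def conj(mu):
    # Conjunctive rule of Example 1: 3 * prod(mu) / (1 + 2 * prod(mu)).
    p = prod(mu)
    return 3 * p / (1 + 2 * p)


def disj(mu):
    # Disjunctive rule obtained by De Morgan duality: N(conj(N(mu_1), ..., N(mu_n))).
    return neg(conj([neg(m) for m in mu]))


# Membership degrees of each object in classes c1, c2, c3 (Table 1).
memberships = {
    "x1": [0.8, 0.1, 0.1],
    "x2": [0.7, 0.2, 0.1],
    "x3": [0.3, 0.62, 0.08],
    "x4": [0.5, 0.47, 0.03],
}

coverage = {obj: disj(mu) for obj, mu in memberships.items()}

# Global degree of coverage with A = arithmetic mean (Definition 4).
global_coverage = sum(coverage.values()) / len(coverage)

for obj, value in coverage.items():
    print(obj, round(value, 4))
print("global coverage:", round(global_coverage, 4))
```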

Next we recall the definitions of similarity relations, proximity relations and fuzzy subgroups.

Definition 5. Let X be a set and w be a fuzzy subset of X × X. Then w is called a proximity relation (also, tolerance relation) on X if the following properties hold, ∀x, y ∈ X:

Reflexivity: w (x, x) =1

Symmetry: w (x, y) = w (y, x)

Definition 6. [23] Let X be a set and v be a fuzzy subset of X × X. Then v is called a similarity relation on X if the following properties hold, ∀x, y, z ∈ X:

Reflexivity: v (x, x) =1

Symmetry: v (x, y) = v (y, x)

Min-transitivity: v (x, z) ≥ min {v (x, y) , v (y, z)}

Definition 7. [27] The set of all fuzzy subsets of X is called the fuzzy power set of X and is denoted by FP (X). Let G = (G, ·) be an arbitrary group and let μ ∈ FP (G). Then μ is called a fuzzy subgroup of G if

μ (x · y) ≥ min {μ (x) , μ (y)}, ∀x, y ∈ G and

μ (x^-1) ≥ μ (x), ∀x ∈ G.

Example 3. Let G = {e, a, b, ab} be the Klein four-group. The fuzzy subset of G defined by μ (e) = μ (ab) =0.7 and μ (a) = μ (b) =0.2 is a fuzzy subgroup of G.
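The claim of Example 3 can be checked by brute force over the two conditions of Definition 7. In the sketch below, the Klein four-group is encoded as pairs of bits under component-wise XOR; this encoding and the function name `is_fuzzy_subgroup` are choices made here, not part of the source.

```python
from itertools import product

# Klein four-group encoded as bit pairs; the group operation is component-wise XOR.
e, a, b, ab = (0, 0), (1, 0), (0, 1), (1, 1)
G = [e, a, b, ab]


def op(x, y):
    return (x[0] ^ y[0], x[1] ^ y[1])


def inv(x):
    # In the Klein four-group every element is its own inverse.
    return x


def is_fuzzy_subgroup(mu):
    # Condition 1: mu(x . y) >= min(mu(x), mu(y)) for all x, y.
    closed = all(mu[op(x, y)] >= min(mu[x], mu[y]) for x, y in product(G, repeat=2))
    # Condition 2: mu(x^-1) >= mu(x) for all x.
    inverses = all(mu[inv(x)] >= mu[x] for x in G)
    return closed and inverses


mu_example = {e: 0.7, ab: 0.7, a: 0.2, b: 0.2}   # fuzzy subset of Example 3
mu_counter = {e: 0.2, ab: 0.2, a: 0.7, b: 0.7}   # hypothetical counterexample

print(is_fuzzy_subgroup(mu_example))
print(is_fuzzy_subgroup(mu_counter))
```

The counterexample fails because a · b = ab, while μ (ab) = 0.2 is smaller than min {μ (a), μ (b)} = 0.7.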

Definition 8. [27] Let G be a group and let μ be a fuzzy subgroup of G. Then μ is called a normal fuzzy subgroup of G if it is an Abelian fuzzy subset of G, i.e., μ (x · y) = μ (y · x), ∀x, y ∈ G.

According to [27] it is well-known that if v is a similarity relation on a set Ω, then v_t = {(x, y) |v (x, y) ≥ t} is an equivalence relation on Ω for all t ∈ [0, 1]. This leads to the following theorem.

Theorem 1. [23] Let v be a similarity relation on a finite set G. Then there is a binary operation · on G such that G = (G, ·) is a group and a fuzzy subgroup μ of G such that v (x, y) = μ (x · y^-1) or v (x, y) = μ (x^-1 · y), ∀x, y, z ∈ G if and only if the equivalence classes determined by the crisp equivalence relation v_t = {(x, y) |v (x, y) ≥ t} have the same size for all t ∈ [0, 1].

Definition 9. [23] Let G be a group and let v be a similarity relation on G. Then v is said to be right-invariant if (∀ x, y, z ∈ G) v (x, y) = v (x · z, y · z). In a similar way, v is said to be left-invariant if (∀ x, y, z ∈ G) v (x, y) = v (z · x, z · y). v is translation invariant if it is both left and right-invariant.

According to Definition 8, the next theorem establishes the connection between a translation invariant similarity relation and a normal fuzzy subgroup.

Theorem 2. [23] Suppose v is a translation invariant similarity relation on G. Then v is a fuzzy subgroup of G × G if, and only if, μ is a fuzzy subgroup of G such that μ (x) = v (e, x) and μ is commutative, where e is the identity element of G.
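To make the link between fuzzy subgroups, similarity relations and their t-cuts more concrete, the following sketch builds v (x, y) = μ (x · y^-1) on the Klein four-group and checks the properties of Definitions 6 and 9. The fuzzy subgroup used here is hypothetical: it is chosen with μ (e) = 1 so that v is reflexive, and it is not the one of Example 3. The bit-pair encoding of the group is also an assumption of this sketch.

```python
from itertools import product

# Klein four-group as bit pairs under XOR (each element is its own inverse).
G = [(0, 0), (1, 0), (0, 1), (1, 1)]


def op(x, y):
    return (x[0] ^ y[0], x[1] ^ y[1])


# Hypothetical fuzzy subgroup with mu(e) = 1, chosen for illustration.
mu = {(0, 0): 1.0, (1, 1): 0.7, (1, 0): 0.2, (0, 1): 0.2}


def v(x, y):
    # v(x, y) = mu(x . y^-1); here y^-1 = y.
    return mu[op(x, y)]


def is_similarity():
    reflexive = all(v(x, x) == 1.0 for x in G)
    symmetric = all(v(x, y) == v(y, x) for x, y in product(G, repeat=2))
    transitive = all(
        v(x, z) >= min(v(x, y), v(y, z)) for x, y, z in product(G, repeat=3)
    )
    return reflexive and symmetric and transitive


def is_translation_invariant():
    right = all(v(x, y) == v(op(x, z), op(y, z)) for x, y, z in product(G, repeat=3))
    left = all(v(x, y) == v(op(z, x), op(z, y)) for x, y, z in product(G, repeat=3))
    return right and left


def t_cut_classes(t):
    # Equivalence classes of the crisp relation v_t = {(x, y) | v(x, y) >= t}.
    classes = []
    for x in G:
        for cls in classes:
            if v(x, cls[0]) >= t:
                cls.append(x)
                break
        else:
            classes.append([x])
    return classes


for t in (0.2, 0.5, 1.0):
    print(t, t_cut_classes(t))
```

For this μ, every t-cut produces classes of equal size (one class of four, two classes of two, or four singletons), in line with the condition stated in Theorem 1.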

3 About the concept of relevance

In this section we address the concept of relevance, establishing the elements to determine the relevance of an object. We use such elements to evaluate the relevance of a class in a fuzzy partition from a fuzzy classification system.

3.1 General framework

Relevance is a vague concept and, from a more general and intuitive perspective, people may be able to distinguish irrelevant information or, in some cases, more relevant information from less relevant information. The fact that there is a linguistic notion of relevance with a vague and variable meaning exposes the complexity of the problem and reveals different ways to approach it. Moreover, intuitions of relevance are relative to contexts, and there is no way of controlling exactly which context someone will have in mind at a given moment, or how to understand such a context [33].

According to the above, establishing that an element is relevant in a given context means that if the object is added or eliminated, there is a change in the context, or there is a change in information about such a context. It is precisely on this last case that we focus our study. In particular, in a first stage we study the changes that arise when the object is eliminated from the context and we analyze the resulting information as well. Thus, relevance can be understood as a local property and not as a global property.

Determining the relevance of an element in a context requires two essential activities. The first consists in establishing a comparison process between the information obtained before and after eliminating that element. Subsequently, it is necessary to establish a process to determine the relevance degree of the classes. The comparison process contrasts the information provided by two sets: the information of the context with the object and the information of the context without the object. In our proposal, the information of the context corresponds to the values obtained by the application of the recursive rules and their negations, according to the De Morgan triplet selected. That is, regarding a fuzzy partition, we compare the degrees of redundancy, coverage, non-redundancy and non-coverage of the set of classes being studied. As mentioned above, relevance is represented here as a fuzzy concept and, therefore, is a matter of degree.

As a technical concept which can be suitable for being measured by computational methods, relevance requires a characterization that allows its formal understanding for computational use. Keeping this in mind, here we propose a new approach over relevance and the means for evaluating and measuring it regarding a given fuzzy partition.

The relevance of a class in a fuzzy partition has been addressed in [2] in terms of a statistical test. In such a proposal, the relevance of a class is evaluated through the disjunctive operator φ, comparing the degrees of coverage of the whole partition with those obtained without the class under study. Similarly, in [1] the relevance of a class is studied with the disjunctive operator φ, this time without using a statistical test. In both proposals, a single recursive rule is considered. In general, a large number of proposals to determine the quality of a fuzzy partition are based on the study of a single index or the measurement of a single property. Extensive reviews are presented in [19] and [34]. Under our approach, all recursive rules given by the fuzzy classification system and its negations are considered.

3.2 Relevance in a fuzzy partition

Under our approach, relevance is understood as the necessity of including, or not excluding, a class or family of classes from a given fuzzy partition. A class may fail to be relevant, for instance, because it generates high degrees of overlap without improving coverage, or because, although it generates high degrees of coverage, some elements are better explained by another class. Therefore, we consider the evaluation of relevance as a dimensionality reduction problem, i.e., given a set of data, find the best fuzzy partition with the least number of classes. Example 4 illustrates this problem.

Example 4. Let C = {c₁, c₂, c₃, c₄} be a partition with four classes on the set X = {x₁, x₂, x₃, x₄, x₅}. Table 2 shows the membership degrees of these 5 objects into the 4 classes in C. We select the fuzzy classification system operators given by the disjunctive rule φ (c₁, c₂, c₃, c₄) = max(c₁, c₂, c₃, c₄), the conjunctive rule ϕ (c₁, c₂, c₃, c₄) = min(c₁, c₂, c₃, c₄) and N (x) = 1 - x. The last 2 columns in Table 2 present the degrees of coverage and redundancy of this partition C for each object in X. The global degrees of coverage and redundancy obtained by using A = mean are shown in the last row of this table.

Table 2
Global degree

Objects	c₁	c₂	c₃	c₄	Coverage φ (c₁, c₂, c₃, c₄)	Redundancy ϕ (c₁, c₂, c₃, c₄)
x₁	0.3	0.4	0.2	0.1	0.4	0.1
x₂	0.6	0.2	0.12	0.08	0.6	0.08
x₃	0.8	0.1	0.07	0.03	0.8	0.03
x₄	0.2	0.5	0.25	0.05	0.5	0.05
x₅	0.1	0.1	0.79	0.01	0.79	0.01
					$φ_{4}^{T} = 0.618$	$ϕ_{4}^{T} = 0.054$

However, note that class c₄ can be eliminated without deteriorating the degree of coverage of the remaining classes. Indeed, after eliminating such a class we obtain the global coverage $φ_{3}^{T} = 0.618$ and the global redundancy $ϕ_{3}^{T} = 0.138$. Therefore, the following questions arise: how can we determine which class can be eliminated? Is it better to just delete that class or to reconstruct the partition with one class fewer? The degree of redundancy has increased, so what would be an acceptable threshold?
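The figures of Example 4 can be reproduced with a short Python sketch, assuming max as the disjunctive rule, min as the conjunctive rule and the arithmetic mean as the aggregation operator A, as stated in the example. The helper name `global_degree` is hypothetical.

```python
# Membership degrees of x1..x5 in classes c1..c4 (Table 2).
memberships = [
    [0.3, 0.4, 0.2, 0.1],
    [0.6, 0.2, 0.12, 0.08],
    [0.8, 0.1, 0.07, 0.03],
    [0.2, 0.5, 0.25, 0.05],
    [0.1, 0.1, 0.79, 0.01],
]


def global_degree(rows, rule):
    # Definition 4 with A = arithmetic mean: average the per-object degrees.
    return sum(rule(row) for row in rows) / len(rows)


# Full partition with four classes.
coverage_4 = global_degree(memberships, max)
redundancy_4 = global_degree(memberships, min)

# Same objects after dropping class c4 (last column).
without_c4 = [row[:3] for row in memberships]
coverage_3 = global_degree(without_c4, max)
redundancy_3 = global_degree(without_c4, min)

print(round(coverage_4, 3), round(redundancy_4, 3))
print(round(coverage_3, 3), round(redundancy_3, 3))
```

Dropping c₄ leaves the global coverage unchanged because c₄ never attains the maximum membership for any object, while the global redundancy rises because c₄ attained the minimum for every object.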

Regarding the above questions, we consider it necessary to use all the information available by the fuzzy classification system (recursive rules and their negations) and establish a procedure that does not consider recursive rules in isolation. Therefore, two key aspects arise: firstly, how to establish a structure that allows comparing recursive rules and their negations. Secondly, once this structure is established, how to determine a criterion that allows finding the optimal number of classes in which a set of data should be partitioned, considering the properties of coverage and redundancy of the set of classes and the relevance of each class. Therefore, we require a structure that allows studying the fuzzy partition from two perspectives: on the one hand, measuring properties of the whole partition or of a subset of classes of the partition (global properties); and on the other hand, measuring properties of each class (local properties).

The relevance property, in the case of fuzzy partitions, is a key element because an element or object can belong simultaneously to several classes, and sometimes its membership in one of these classes may not provide any discriminant information about the object.

Keeping this in mind, our proposal is established under the following reasoning. We consider a fuzzy classification system (C, ϕ, φ, N), where C = {c₁, …, c_n} describes the given context and the conjunctive rule ϕ, the disjunctive rule φ and the negation N provide us with information about that context. We identify a commutative group structure in the set of recursive rules and their negations, i.e., G = {ϕ, φ, N (ϕ), N (φ)}. In order to compare the elements of such a group, we propose using a translation invariant similarity relation v on G and the corresponding equivalence relations v_t obtained from v as described in Theorem 1. From such a comparison, we establish a criterion that allows determining when a partition has optimal levels of coverage and redundancy and, subsequently, whether there are irrelevant classes or sets of classes.

We propose this approach motivated by the following ideas: 1) Establishing a commutative group structure and a similarity relation on it is motivated by the fact that the “min-transitivity” property, established for such relations, is similar to one of the conditions of the fuzzy subgroup definition (see Definition 7 above). 2) It is well known that the possible equivalence relations on any set, when ordered by set inclusion, form a complete lattice. Therefore, this provides a mechanism to compare the elements of the commutative group G by order. However, as will be explained in the next section, performing a comparison process by establishing a similarity relation in advance is not simple, and at the same time may lack meaning for the intended purpose. Thus, one way to solve this difficulty is through proximity relations, which constitute a first step to obtain similarity relations.

4 Translation invariant similarity relation on a group of aggregation operators

In this section we present the commutative group structure formed from any fuzzy classification system. On such a structure we propose to consider similarity relations as a way of comparing the elements of such a structure.

4.1 Commutative group of aggregation operators

From a fuzzy classification system (C, ϕ, φ, N), we consider two new mappings σ_n, δ_n : [0, 1]ⁿ → [0, 1], defined from the disjunctive operators φ_n and the conjunctive operators ϕ_n. In this way, σ_n : [0, 1]ⁿ → [0, 1] is defined as

$σ_{n} (μ_{1} (x), \dots, μ_{n} (x)) = N (φ_{n} (μ_{1} (x), \dots, μ_{n} (x)))$ (1)

and δ_n : [0, 1]ⁿ → [0, 1] is defined as

$δ_{n} (μ_{1} (x), \dots, μ_{n} (x)) = N (ϕ_{n} (μ_{1} (x), \dots, μ_{n} (x)))$ (2)

Notice that when we use the strong negation N on Eq. (1) or Eq. (2), the resulting expression can be interpreted as the complement of a proposition related to the classes {μ₁ (x) , …, μ_n (x)}.

In particular, if φ_n (μ₁ (x), …, μ_n (x)) represents the degree of coverage of the classes, then N (φ_n (μ₁ (x), …, μ_n (x))) represents the degree of non-coverage of the classes, understanding φ_n as a proposition and N (φ_n) as the negation of such a proposition. In a similar way, if ϕ_n (μ₁ (x), …, μ_n (x)) represents the degree of redundancy of the classes, then N (ϕ_n (μ₁ (x), …, μ_n (x))) represents the degree of non-redundancy of the classes.

In the perspective of Definition 4, the degree of global non-covering and the degree of global non-overlap are defined and denoted as $σ_{n}^{T}$ and $δ_{n}^{T}$ .

Let us recall from Definition 3 that if N is a strong negation operator, then φ_n (μ₁ (x), …, μ_n (x)) = N [ϕ_n (N (μ₁ (x)), …, N (μ_n (x)))] and thus, given the mapping σ_n, it holds that σ_n (μ₁ (x), …, μ_n (x)) = N (N [ϕ_n (N (μ₁ (x)), …, N (μ_n (x)))]) and therefore, since N (N (a)) = a,

$σ_{n} (μ_{1} (x), \dots, μ_{n} (x)) = ϕ_{n} (N (μ_{1} (x)), \dots, N (μ_{n} (x)))$ (3)
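Eq. (3) states that the non-coverage mapping σ_n coincides with the conjunctive rule evaluated at the negated membership degrees. A numerical spot check is sketched below, assuming the operators of Example 1; the function names and the test vector are illustrative choices, and a single vector is of course not a proof.

```python
from math import prod


def neg(a):
    # Strong negation N(a) = 1 - a.
    return 1 - a


def conj(mu):
    # Conjunctive rule of Example 1.
    p = prod(mu)
    return 3 * p / (1 + 2 * p)


def disj(mu):
    # Disjunctive rule defined by De Morgan duality (Definition 3).
    return neg(conj([neg(m) for m in mu]))


def sigma(mu):
    # Eq. (1): non-coverage, the negation of the disjunctive rule.
    return neg(disj(mu))


def delta(mu):
    # Eq. (2): non-redundancy, the negation of the conjunctive rule.
    return neg(conj(mu))


mu = [0.3, 0.62, 0.08]
lhs = sigma(mu)                      # left-hand side of Eq. (3)
rhs = conj([neg(m) for m in mu])     # right-hand side of Eq. (3)
print(abs(lhs - rhs) < 1e-12)
```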

The idea that motivates this construction is based on the possibility of establishing a relationship between the conjunctive and disjunctive operators, together with their negations, in such a way that a fuzzy partition can be evaluated taking into account both its global and its local properties. We seek to compare the degrees of coverage, overlap, non-coverage and non-overlap in an iterative process for fuzzy partitions with different numbers of classes, and to determine the partition with the highest quality. This iterative process will be explained in the next section.

A close relationship is established between the set of mappings ϕ_n, φ_n, σ_n and δ_n, as follows. According to Eq. (1) and Eq. (2), we can write φ_n, σ_n and δ_n in terms of the conjunctive operator ϕ_n, and therefore we denote:

$ϕ_{n}^{*} = N [ϕ_{n} (N (μ_{1} (x)), \dots, N (μ_{n} (x)))] = φ_{n} (μ_{1} (x), \dots, μ_{n} (x))$,

$ϕ_{n}^{\land} = ϕ_{n} (N (μ_{1} (x)), \dots, N (μ_{n} (x))) = σ_{n} (μ_{1} (x), \dots, μ_{n} (x))$,

$ϕ_{n}^{\sim} = N (ϕ_{n} (μ_{1} (x), \dots, μ_{n} (x))) = δ_{n} (μ_{1} (x), \dots, μ_{n} (x))$.

Thus, “∗”, “∧” and “∼” are operations on ϕ_n forming φ_n, σ_n and δ_n, i.e., each one is a unary operation. In this sense, an identity operation i may also be defined as $ϕ_{n}^{i} = ϕ_{n} (μ_{1} (x), \dots, μ_{n} (x))$.

Let B be the set of such operations, i.e., B = {∗ , ∧ , ∼ , i } and let ∘ be the composition of operations. If ▽ , △ ∈ B, then applying ▽ ∘ △ to ϕ_n gives $ϕ_{n}^{▽ △}$. For instance, $ϕ_{n}^{\sim \land}$ is given by the composition of “∼” and “∧”. As $ϕ_{n}^{\land} = σ_{n}$, then by Eq. (3) it holds that

$σ_{n}^{\sim} = N (σ_{n} (μ_{1} (x), \dots, μ_{n} (x))) = N (ϕ_{n} (N (μ_{1} (x)), \dots, N (μ_{n} (x)))) = φ_{n}$.
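The relations among the four mappings can be checked numerically. The following is a minimal sketch that assumes, purely for illustration, the min rule as the conjunctive (overlap) rule and N(x) = 1 - x as the negation; the function names are hypothetical and not part of the original formulation.

```python
def neg(x):
    """Strong negation, assumed here to be N(x) = 1 - x."""
    return 1.0 - x

def overlap(mu):
    """Conjunctive (overlap) rule, assumed here to be the minimum."""
    return min(mu)

def grouping(mu):
    """Dual (coverage) rule obtained by De Morgan duality: N[overlap(N(mu_1), ..., N(mu_n))]."""
    return neg(overlap([neg(u) for u in mu]))

def non_covering(mu):
    """sigma_n: negation of the degree of coverage."""
    return neg(grouping(mu))

def non_overlap(mu):
    """delta_n: negation of the degree of overlap."""
    return neg(overlap(mu))

mu = [0.8, 0.1, 0.1]  # illustrative membership degrees of one object

# Eq. (3): sigma_n(mu) coincides with the overlap of the negated memberships.
lhs = non_covering(mu)
rhs = overlap([neg(u) for u in mu])

# Negating sigma_n gives back the coverage degree.
coverage_again = neg(non_covering(mu))
```

With the min rule, the dual rule reduces to the maximum, so `grouping(mu)` agrees with `max(mu)` up to rounding.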

For brevity, we write φ_n for φ_n (μ₁ (x) , …, μ_n (x)). We denote by φ_kn = φ_n (μ₁ (x_k) , …, μ_n (x_k)) the aggregation of the membership degrees of the element x_k in the n classes.

Keeping this in mind, we establish Definition 10.

Definition 10. Let us denote G_o = {ϕ_n, φ_n, σ_n, δ_n}. An operation ⊙ : G_o × G_o → G_o on G_o may be defined as $θ_{n} ⊙ λ_{n} = ϕ_{n}^{▽} ⊙ ϕ_{n}^{△} = ϕ_{n}^{▽ △}$, for all θ_n, λ_n ∈ G_o, where ▽, △ ∈ B are the operations such that $θ_{n} = ϕ_{n}^{▽}$ and $λ_{n} = ϕ_{n}^{△}$.

For instance, $δ_{n} ⊙ σ_{n} = ϕ_{n}^{\sim} ⊙ ϕ_{n}^{\land} = ϕ_{n}^{\sim \land} = φ_{n}$. Similarly, $ϕ_{n} ⊙ φ_{n} = ϕ_{n}^{i} ⊙ ϕ_{n}^{*} = ϕ_{n}^{i *}$, and thus, as $ϕ_{n}^{*} = φ_{n}$ and $ϕ_{n}^{i} = ϕ_{n}$, it is ϕ_n ⊙ φ_n = φ_n.

Based on the above, Table 3 presents the results for operation ⊙ .

Table 3

Operation ⊙

⊙	ϕ_n	σ_n	δ_n	φ_n
ϕ_n	ϕ_n	σ_n	δ_n	φ_n
σ_n	σ_n	ϕ_n	φ_n	δ_n
δ_n	δ_n	φ_n	ϕ_n	σ_n
φ_n	φ_n	δ_n	σ_n	ϕ_n

Proposition 1. (G_o, ⊙) is a commutative group.

Proof. It is straightforward from Table 3: the table is symmetric, ϕ_n is the neutral element and each element is its own inverse.□
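The group properties can also be verified mechanically. The sketch below uses an encoding that is an assumption of this illustration, not of the original text: each element is represented by two flags applied to the base (overlap) rule, namely whether the inputs are negated and whether the output is negated. Since N is involutive, composing two operations amounts to XOR-ing the flags.

```python
from itertools import product

# (negate the inputs?, negate the output?) applied to the base overlap rule.
ELEMENTS = {
    "overlap":  (0, 0),  # identity operation i
    "sigma":    (1, 0),  # operation "∧": overlap of the negated memberships
    "delta":    (0, 1),  # operation "∼": negation of the overlap
    "grouping": (1, 1),  # operation "∗": De Morgan dual of the overlap
}
NAMES = {flags: name for name, flags in ELEMENTS.items()}

def odot(x, y):
    """Compose the operations that generate x and y (XOR of the flags)."""
    (a, b), (c, d) = ELEMENTS[x], ELEMENTS[y]
    return NAMES[(a ^ c, b ^ d)]

table = {(x, y): odot(x, y) for x, y in product(ELEMENTS, repeat=2)}

is_commutative = all(table[x, y] == table[y, x] for x, y in table)
is_associative = all(
    odot(odot(x, y), z) == odot(x, odot(y, z))
    for x, y, z in product(ELEMENTS, repeat=3)
)
has_identity = all(odot("overlap", x) == x for x in ELEMENTS)
self_inverse = all(odot(x, x) == "overlap" for x in ELEMENTS)
```

All four flags evaluate to true, which is the structure of the Klein four-group.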

Clearly, (G_o, ⊙) can be viewed as a normal subgroup of the alternating group A₄ (see Footnote 2). Let G_p = {e, a, b, c} be the set of permutations of G_o onto itself, where

$e = \begin{pmatrix} ϕ_{n} & σ_{n} & δ_{n} & φ_{n} \\ ϕ_{n} & σ_{n} & δ_{n} & φ_{n} \end{pmatrix}$, $a = \begin{pmatrix} ϕ_{n} & σ_{n} & δ_{n} & φ_{n} \\ σ_{n} & ϕ_{n} & φ_{n} & δ_{n} \end{pmatrix}$,

$b = \begin{pmatrix} ϕ_{n} & σ_{n} & δ_{n} & φ_{n} \\ δ_{n} & φ_{n} & ϕ_{n} & σ_{n} \end{pmatrix}$, $c = \begin{pmatrix} ϕ_{n} & σ_{n} & δ_{n} & φ_{n} \\ φ_{n} & δ_{n} & σ_{n} & ϕ_{n} \end{pmatrix}$.

Here, under the composition of permutations, e is the identity permutation and each permutation is its own inverse. Thus, there is an isomorphism τ : G_p → G_o where τ (e) = ϕ_n, τ (a) = σ_n, τ (b) = δ_n and τ (c) = φ_n.

4.2 Translation invariant similarity relation

In general, a partition is of quality when it has high degrees of covering and low degrees of overlap. Therefore, when we include the degree of non-coverage and the degree of non-overlap a partition is considered of quality if it has low non-coverage degree and high non-overlap degree. However, considering the group structure formed by the degrees of coverage, overlap and its negations, it is possible to establish a quality criterion by comparing all the elements of such group simultaneously.

According to the above, the idea of similarity relations on groups appears naturally and allows establishing new structures that preserve the properties of the fuzzy subgroup.

In this section, we present a way to establish a translation invariant similarity relation v on G. From this relation v it is possible to define equivalence relations v_t on G for all t ∈ [0, 1], which can be ordered by inclusion. These equivalence relations are composed of the information coming from comparing the rules of the fuzzy classification system and its negations. Therefore, we can order such information and thus analyze, in a simultaneous way, both the coverage and the redundancy of the classes.

The above is an aspect to highlight in our proposal because a calibration procedure for different indices is being considered, and not just the indices separately or even a single index.

Based on the theoretical framework developed in [27], consider the G_o and G_p groups addressed in Proposition 1, and let v_o be a similarity relation on G_o and v the similarity relation defined as follows:

$v (x, y) = \min \{ v_{o} (x (θ), y (θ)) \mid θ \in G_{o} \} \quad \forall x, y \in G_{p}$ (4)

Operationally, we have for instance that

v (a, b) = min {v_o (a (ϕ_n) , b (ϕ_n)) , v_o (a (σ_n) , b (σ_n)) , v_o (a (δ_n) , b (δ_n)) , v_o (a (φ_n) , b (φ_n))}

From Eq. (4) it is immediate that the following proposition is fulfilled.

Proposition 2. If v_o is a translation invariant similarity relation on G_o, then v is a translation invariant similarity relation on G_p.

Proof. Given v (zx, zy) = min {v_o (zx (θ) , zy (θ)) |θ ∈ G_o}, as v_o is translation invariant then it holds that v (zx, zy) = min {v_o (x (θ) , y (θ)) |θ ∈ G_o} = v (x, y). Analogously, for right-invariant similarity relation, it holds that v (xz, yz) = min {v_o (x (θ) , y (θ)) |θ ∈ G_o} = v (x, y).□

Notice that the construction of the similarity relation v requires the similarity relation v_o. The relation v is necessary in order to be consistent with the established group structure, since the relation v_o alone is not enough for this purpose. However, establishing a similarity relation is not always a simple process because the property of min-transitivity is difficult to verify. Thus, to obtain a similarity relation v_o on G, which will give rise to the similarity relation v via Eq. (4), we propose considering a fuzzy proximity relation denoted by w. Such a relation ensures that, in the comparison process, each element of G is declared totally similar to itself, i.e., the similarity of the element with itself is equal to 1. Along the same lines, the symmetry condition of proximity relations ensures a consistent comparison process, because the differences or similarities between two elements should not depend on the order in which they are related. In this sense, the properties of reflexivity and symmetry are very appropriate for expressing the degree of "closeness" or "proximity" between elements.

Subsequently, we obtain the transitive closure of w. Let us recall that, since only the lack of transitivity distinguishes proximity relations from similarity relations, the transitive closures of proximity relations are similarity relations. Thus, if we denote by $\tilde{w}$ the transitive closure of w, then $\tilde{w}$ is the desired similarity relation, i.e., $\tilde{w} = v_{o}$. Example 5 presents the construction of v for the group G_o and a specific proximity relation.

Example 5. Given the commutative group G_o = {ϕ_n, φ_n, σ_n, δ_n}, let us define the operator ξ : G_o × G_o → [0, 1] such that for all θ_n, λ_n ∈ G_o it is ξ (θ_n, λ_n) = |θ_n - λ_n|. In accordance with Definition 4, we define $ξ^{T} (θ_{n}, λ_{n}) = \frac{1}{m} \sum_{k = 1}^{m} | θ_{kn} - λ_{kn} |$, where m = |X|. Recall that θ_kn = θ_n (μ₁ (x_k) , …, μ_n (x_k)) is the aggregation of the membership degrees of the element x_k ∈ X in the n classes. Thus, we select the proximity relation $w (θ_{n}, λ_{n}) = 1 - ξ^{T} (θ_{n}, λ_{n}) = 1 - \frac{1}{m} \sum_{k = 1}^{m} | θ_{kn} - λ_{kn} |$. Once w is established, we compute its transitive closure to obtain v_o, i.e., $\tilde{w} = v_{o}$. Thus, in accordance with Eq. (4) we consider the translation invariant similarity relation given by v (x, y) = min {v_o (x (θ_n) , y (θ_n)) | θ_n ∈ G_o} where x, y ∈ G_p = {e, a, b, c}.
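The proximity relation of Example 5 can be sketched in a few lines. The code below assumes the min and max rules with N(x) = 1 - x for the aggregation, which is only one possible choice, and uses illustrative membership degrees; the helper names `degrees` and `proximity` are hypothetical.

```python
def degrees(memberships):
    """Aggregated degrees per object, assuming min (overlap), max (coverage) and 1 - x."""
    ov = [min(row) for row in memberships]
    gr = [max(row) for row in memberships]
    return {
        "overlap": ov,
        "grouping": gr,
        "sigma": [1 - g for g in gr],   # non-covering
        "delta": [1 - o for o in ov],   # non-overlap
    }

def proximity(theta, lam):
    """w = 1 - mean absolute difference between two lists of aggregated degrees."""
    m = len(theta)
    return 1.0 - sum(abs(t - l) for t, l in zip(theta, lam)) / m

# Illustrative membership degrees of four objects in three classes.
U = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.3, 0.62, 0.08],
    [0.5, 0.47, 0.03],
]

d = degrees(U)
names = ["overlap", "sigma", "delta", "grouping"]
w = {(a, b): proximity(d[a], d[b]) for a in names for b in names}

m_value = w["grouping", "delta"]    # coverage vs. non-overlap
s_value = w["grouping", "sigma"]    # coverage vs. non-covering
k_value = w["overlap", "grouping"]  # overlap vs. coverage
```

By construction, `w` is reflexive and symmetric, but it is not necessarily min-transitive, which is why the transitive closure is needed afterwards.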

It is important to note that the selected proximity relation must effectively reflect the idea of closeness.

At this point, let us remark two important aspects about the above line of reasoning:

1) The similarity relation v allows undertaking a comparison process by inclusion of equivalence relations. Regarding this process, the interpretation of the proximity relation is as follows: w (ϕ_n, φ_n) provides the degree that covers the redundancy of the classes, w (φ_n, σ_n) provides the degree of proximity between the covering and the non-covering, and w (φ_n, δ_n) provides the degree of coverage without considering the redundancy of the classes. Therefore, it is desirable, for instance, that the degree of similarity between ϕ_n and φ_n, i.e., the degree that covers the redundancy of the classes, be low and lower than the degree of similarity between $φ_{n}^{T}$ and $δ_{n}^{T}$, i.e., the degree of coverage without considering the redundancy. Notice that these ideas may help in establishing a criterion to determine when a partition is of quality and when it is a high-quality partition.

2) The similarity relation v generates a partition on G_p. However, as v has been constructed from w and G_p has been constructed from G_o, in this process the values of ϕ_n, φ_n, σ_n and δ_n given by the proximity relation can permute, and therefore change their meaning. Nevertheless, under certain conditions it is possible to guarantee the coherence of this reasoning, as will be proved in Theorem 3 and Theorem 4 below.

To state these theorems, let us recall that G_p = {e, a, b, c} is the group formed by the four even permutations of G_o onto itself defined above, $v (x, y) = \min \{ v_{o} (x (θ), y (θ)) \mid θ \in G_{o}^{T} \}$, ∀x, y ∈ G_p, and v_o is the transitive closure of the proximity relation w.

Theorem 3. Consider the groups G_o = {ϕ_n, φ_n, σ_n, δ_n} and G_p = {e, a, b, c}, as well as a proximity relation w on G_o. Let 1 = w (φ_n, φ_n), k = w (ϕ_n, φ_n), m = w (φ_n, δ_n), p = w (ϕ_n, δ_n) and s = w (φ_n, σ_n). If m > s ≥ k ≥ p, then v (c, b) = m and v (c, a) = s.

Proof. Table 4 summarizes the definition of the relation w.

Table 4

Proximity relation w

w	ϕ_n	σ_n	δ_n	φ_n
ϕ_n	1	m	p	k
σ_n	m	1	k	s
δ_n	p	k	1	m
φ_n	k	s	m	1

Let H be the membership matrix of w as shown in Table 4. Then, we compute H′ = H ∪ (H ∘ H), with the max operator for the union and the max-min composition, to find v_o, i.e., the transitive closure of w. If H′ ≠ H, then we set H = H′ and repeat the process until equality is reached. After the first step, we obtain

$H^{'} = \begin{pmatrix} 1 & m & k & s \\ m & 1 & s & s \\ k & s & 1 & m \\ s & s & m & 1 \end{pmatrix}$.

If m > s = k = p, then H′ = H, and thus H′ is the membership matrix of v_o. Therefore, we compute v (c, b) and v (c, a) as follows:

$\begin{aligned} v (c, b) &= \min \{ v_{o} (c (ϕ_{n}), b (ϕ_{n})), v_{o} (c (σ_{n}), b (σ_{n})), v_{o} (c (δ_{n}), b (δ_{n})), v_{o} (c (φ_{n}), b (φ_{n})) \} \\ &= \min \{ v_{o} (φ_{n}^{T}, δ_{n}^{T}), v_{o} (δ_{n}^{T}, φ_{n}^{T}), v_{o} (σ_{n}^{T}, ϕ_{n}^{T}), v_{o} (ϕ_{n}^{T}, σ_{n}^{T}) \} \\ &= \min \{ m, m, m, m \} = m . \end{aligned}$

Now, in an analogous way we obtain v (c, a) = s. However, if m > s = k > p then H′ ≠ H, and therefore the process is repeated with H = H′, i.e., H′′ = H′ ∪ (H′ ∘ H′). This leads to

$H^{''} = \begin{pmatrix} 1 & m & s & s \\ m & 1 & s & s \\ s & s & 1 & m \\ s & s & m & 1 \end{pmatrix}$.

In this case, it is H′′ = H′ and again v (c, b) = m and v (c, a) = s. However, if m > s > k > p then H′′ ≠ H′, and therefore the process has to be repeated with H = H′′, i.e., H′′′ = H′′ ∪ (H′′ ∘ H′′). Now, it is

$H^{'''} = \begin{pmatrix} 1 & m & s & s \\ m & 1 & s & s \\ s & s & 1 & m \\ s & s & m & 1 \end{pmatrix}$.

As H′′′ = H′′, the transitive closure has been obtained after considering all possible cases, and again it holds that v (c, b) = m and v (c, a) = s.□

Theorem 3 indicates that, under certain conditions, the similarity and proximity relations between $σ_{n}^{T}$ and $φ_{n}^{T}$ and between $φ_{n}^{T}$ and $δ_{n}^{T}$ are equivalent. Therefore, if m > s ≥ k ≥ p then, v_t=s (c, a) ⊂ v_t=m (c, b).
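The iteration used in the proof can be sketched as follows. This is a minimal illustration with hypothetical numeric values chosen so that m > s > k > p; rows and columns are assumed to follow the order of Table 4 (overlap rule, σ_n, δ_n, coverage rule).

```python
def max_min_compose(A, B):
    """Max-min composition of two square fuzzy relation matrices."""
    n = len(A)
    return [
        [max(min(A[i][z], B[z][j]) for z in range(n)) for j in range(n)]
        for i in range(n)
    ]

def transitive_closure(H):
    """Iterate H <- H ∪ (H ∘ H), with element-wise max as union, until a fixed point."""
    while True:
        HH = max_min_compose(H, H)
        H_next = [[max(a, b) for a, b in zip(r1, r2)] for r1, r2 in zip(H, HH)]
        if H_next == H:
            return H
        H = H_next

# Hypothetical values satisfying m > s > k > p.
m, s, k, p = 0.9, 0.6, 0.4, 0.1
H = [
    [1, m, p, k],
    [m, 1, k, s],
    [p, k, 1, m],
    [k, s, m, 1],
]
closure = transitive_closure(H)
```

For these values the fixed point has the block form obtained in the proof: m inside the two blocks and s everywhere else off the diagonal.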

Theorem 4. Consider the groups G_o = {ϕ_n, φ_n, σ_n, δ_n} and G_p = {e, a, b, c}, as well as a proximity relation w on G_o. Let 1 = w (φ_n, φ_n), k = w (ϕ_n, φ_n), m = w (φ_n, δ_n), p = w (ϕ_n, δ_n) and s = w (φ_n, σ_n). If m > k > s ≥ p or k > m > s ≥ p, then v (c, b) = m and v (c, e) = k.

Proof. Analogous to the proof of Theorem 3.□

In this case, Theorem 4 indicates that, under certain conditions, the similarity and proximity relations between $ϕ_{n}^{T}$ and $φ_{n}^{T}$ and between $φ_{n}^{T}$ and $δ_{n}^{T}$ are equivalent. Therefore, if m > k > s ≥ p or k > m > s ≥ p, then v_t=k (c, e) ⊂ v_t=m (c, b).

Additionally, Theorem 3 and Theorem 4 guarantee that it is possible to consider only the values obtained by the proximity relation and the order relation on them, without altering the order relation (by inclusion) on equivalence relations. Similarly, the structure of fuzzy subgroup obtained is preserved (see Theorem 2). We will use this result in the next section to establish a quality criterion that is invariant under both similarity and proximity relations.

5 Criteria and iterative comparison process

Based on the algebraic structure of commutative group and a translation invariant similarity relation on such group, in this section we define a quality criterion for a fuzzy partition and a relevance criterion for classes. An algorithm that describes the comparison process is presented. Such algorithm allows establishing the minimum number of classes in a fuzzy partition in such a way that a high-quality fuzzy partition is obtained. Additionally, we present a criterion that allows comparing two fuzzy partitions that meet the quality criterion and the relevance criterion.

Following the arguments and remarks previous to Theorem 3, we propose the first quality criterion. Consider a fuzzy classification system (C, ϕ, φ, N) with C = {c₁, . . . , c_n}, where C is obtained from a fuzzy iterative, non-hierarchical classification algorithm (e.g., fuzzy c-means or possibilistic c-means, among others). Also, consider the groups G_o = {ϕ_n, φ_n, σ_n, δ_n} and G_p = {e, a, b, c} together with a translation invariant similarity relation v on G_p (according to Eq. (4)). Then, it is possible to compute the equivalence relations v_t = {(x, y) | v (x, y) ≥ t} for all t ∈ [0, 1]. In this way, it holds that if 0 ≤ t₁ < ⋯ < t_p = 1, then $v_{t_{p}} ⊆ \dots ⊆ v_{t_{1}} = G_{o} × G_{o}$.

Let $1 = w (φ_{n}^{T}, φ_{n}^{T})$, $k = w (ϕ_{n}^{T}, φ_{n}^{T})$, $m = w (φ_{n}^{T}, δ_{n}^{T})$ and $s = w (φ_{n}^{T}, σ_{n}^{T})$, and let x, y ∈ G_p = {e, a, b, c}. Then, let us define the equivalence relations

K = v_t=k (x, y) ={ (x, y) |v (x, y) ≥ k },

M = v_t=m (x, y) = {(x, y) |v (x, y) ≥ m},

S = v_t=s (x, y) = {(x, y) |v (x, y) ≥ s}.
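The level relations v_t can be made concrete by listing their equivalence classes. The sketch below uses illustrative similarity values on G_p, not values taken from a specific data set; since v is min-transitive, comparing each element with one representative of a class is sufficient. The function name `level_classes` is hypothetical.

```python
def level_classes(v, items, t):
    """Equivalence classes of the crisp relation v_t = {(x, y) : v(x, y) >= t}."""
    classes = []
    for x in items:
        for cls in classes:
            if v[x, cls[0]] >= t:   # one representative suffices for a similarity relation
                cls.append(x)
                break
        else:
            classes.append([x])
    return classes

# Illustrative min-transitive similarity values on G_p = {e, a, b, c}.
items = ["e", "a", "b", "c"]
high, low = 0.7, 0.61
v = {(x, y): 1.0 if x == y else low for x in items for y in items}
for x, y in [("e", "a"), ("b", "c")]:
    v[x, y] = v[y, x] = high

classes_high = level_classes(v, items, high)
classes_low = level_classes(v, items, low)
classes_one = level_classes(v, items, 1.0)
```

Raising the threshold t can only split classes, which is the ordering by inclusion of the relations v_t described above.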

Taking into account the above considerations, as well as Theorem 3 and Theorem 4, the following quality criterion is defined:

Quality criterion. A fuzzy partition C with n classes is a quality partition if m > s > k, or equivalently, if it holds that S ⊂ M. Otherwise, we say that it is of low quality. Hence, the value of m determines the degree of quality of the fuzzy partition, and the higher, the better.

Remember that m provides the degree of coverage without considering the redundancy of the classes. Furthermore, we can extend the same argument to a second criterion, concerning relevance.

Relevance criterion. Given a fuzzy partition C with n classes, let C_i ⊆ C = {c₁, …, c_n} be the subset of n - 1 classes obtained by deleting class c_i ∈ C, with i ∈ {1, . . . , n}. Also, let k_i, m_i, and s_i be the corresponding values for w on C_i. If k_i > s_i, or equivalently if it holds that S_i ⊂ K_i, then c_i is a highly relevant class in partition C. Otherwise, we say that c_i is of low relevance for partition C.

Notice that this criterion proceeds by removing a class from the partition under consideration, one at a time, in order to study its relevance. In this sense, the removed class will be relevant if the aggregation of the remaining classes indicates that the degree of non-coverage is greater than the degree of overlap. In other words, a class is considered relevant if its elimination decreases the degree of coverage in a greater proportion than the degree of overlap.

Therefore, we can state on more formal terms the basic intuition supporting both criteria, by the following definition.

Definition 11. A fuzzy quality partition is called a high-quality fuzzy partition if all classes are relevant.

According to the criteria established above, it is possible to develop a procedure to assess the quality and relevance of a given fuzzy partition. This procedure is summarized in Algorithm 1.

Algorithm 1 proceeds as follows. In the first step, a partition C = {c₁, c₂} is computed through an iterative, non-hierarchical classification algorithm. For practical purposes we will use the fuzzy c-means. From a proximity relation, Algorithm 1 determines whether the partition C is of quality. In the case n = 2 (two classes), the relevance of the classes is not evaluated because, under our framework, the aggregation of a single element and fuzzy partitions with a single class are not considered. Therefore, if the partition with two classes is of quality, then the algorithm ends. Otherwise, the algorithm increases the number of classes by one, returning to step 1, until a high-quality partition is found, that is, a quality partition in which all classes are relevant.

Algorithm 1 Evaluation of quality of a fuzzy partition

Input: X finite set of objects, |X| = r, recursive rules ϕ and φ and the negation N (corresponding to a fuzzy classification system), proximity relation w and n = 2. Let z be an integer denoting the maximum number of classes to be considered.

1: C = output partition of the fuzzy c-means with n classes

2: Compute ϕ_n, φ_n, σ_n and δ_n. Consider G_o = {ϕ_n, φ_n, σ_n, δ_n}

3: Compute w on G_o. Let k = w (ϕ_n, φ_n), m = w (φ_n, δ_n), s = w (φ_n, σ_n)

4: if the quality criterion m > s > k does not hold then

5: if n < z then

6: n = n + 1, return to 1.

7: end if

8: RETURN “No quality partition found” and END.

9: else if n = 2 then RETURN C and END.

10: else go to 12

11: end if

12: Compute C_i for all i = 1, …, n

13: Compute ϕ_i, φ_i, σ_i and δ_i for each C_i and consider G_i = {ϕ_i, φ_i, σ_i, δ_i}

14: for each C_i do compute w on G_i and let k_i = w (ϕ_i, φ_i), s_i = w (φ_i, σ_i)

15: if k_i ≤ s_i for some i and n < z then n = n + 1, return to 1.

16: else if k_i > s_i for all i then RETURN C and END.

17: else RETURN “No high-quality partition found” and END.

18: end if
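The decision logic of the procedure can be sketched compactly, following the Quality criterion (m > s > k) and the Relevance criterion (k_i > s_i) as stated above. This is an illustrative outline rather than the authors' implementation: `evaluate` is a hypothetical user-supplied callable that runs the clustering for n classes and returns the partition, the triple (m, s, k), and the list of pairs (k_i, s_i) obtained by removing each class in turn.

```python
def is_quality(m, s, k):
    """Quality criterion: m > s > k."""
    return m > s > k

def is_relevant(k_i, s_i):
    """Relevance criterion for a removed class: k_i > s_i."""
    return k_i > s_i

def find_high_quality_partition(evaluate, z):
    """Return (n, partition) for the smallest n <= z that passes both criteria, else None."""
    for n in range(2, z + 1):
        partition, (m, s, k), sub_values = evaluate(n)
        if not is_quality(m, s, k):
            continue                    # low quality: try one more class
        if n == 2:
            return n, partition         # relevance is not evaluated for two classes
        if all(is_relevant(k_i, s_i) for k_i, s_i in sub_values):
            return n, partition
    return None

# Values reported for the image example (n = 2 and n = 3), used here as a stand-in
# for an actual clustering run.
reported = {
    2: ((1.0, 0.44, 0.44), []),
    3: ((0.70, 0.61, 0.32), [(0.69, 0.48), (0.59, 0.40), (0.67, 0.46)]),
}

def evaluate(n):
    values, sub_values = reported[n]
    return f"{n}-class partition", values, sub_values

result = find_high_quality_partition(evaluate, z=3)
```

Separating the criteria from the clustering step keeps the sketch independent of the particular algorithm (fuzzy c-means or otherwise) and of the chosen triplet.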

Notice that, in particular, the process ends when a high-quality fuzzy partition has been found, i.e., a partition fulfilling the quality and relevance criteria with the smallest number of classes. However, it can happen that there is no optimal number of classes. In this case, it is suggested to restart the fuzzy c-means algorithm with a different seed or with a different fuzziness parameter. Similarly, it can happen that there are two partitions fulfilling all the proposed criteria. In principle, the partition with the smallest number of classes is desirable, but this does not guarantee that relevant information is not lost. Therefore, we propose an additional criterion to compare two high-quality fuzzy partitions when we use Algorithm 1 several times under different configurations (different seeds or fuzziness parameter values). Let us stress that this criterion is not used in Algorithm 1.

Comparison criterion. Let $C_{n_{1}}$ and $C_{n_{2}}$ be two high-quality fuzzy partitions on X. Let m₁ and m₂ be the respective values of $m = w (φ_{n}^{T}, δ_{n}^{T})$ for each partition. Then $C_{n_{1}}$ is better than $C_{n_{2}}$ if m₁ > m₂.

6 Application

In order to apply the defined criteria and Algorithm 1, we have selected the image presented in Fig. 1, considering the unsupervised classification problem of obtaining classes of similar pixels. We consider the recursive triplet of Example 1 and the translation invariant similarity relation of Example 5.

Fig.1

Aurora borealis.

According to Algorithm 1, we start by obtaining a partition with n = 2 classes through the fuzzy c-means algorithm, which yields m = 1, s = 0.44 and k = 0.44. As s = k, the partition is of low quality. The corresponding classes of the fuzzy partition are presented in Fig. 2.

Fig.2

Classes obtained by applying the fuzzy 2-means algorithm (left: class 1, right: class 2). The gray scale represents the membership degree of each pixel to each class, where black = 0 and white = 1.

The low quality of the partition obtained from the image can be observed in some regions where objects with different intensity of color have been classified in the same class. For instance, there is almost no distinction between trees and much of the sky.

According to Algorithm 1, we again apply the fuzzy c-means to obtain a new partition with n = 3 classes. Then, we compute the proximity relation w, which is shown in Table 5.

Table 5

Proximity relation w

w	ϕ₃	σ₃	δ₃	φ₃
ϕ₃	1	0.7	0.03	0.32
σ₃	0.7	1	0.32	0.61
δ₃	0.03	0.32	1	0.7
φ₃	0.32	0.61	0.7	1

As w (φ₃, δ₃) = 0.7 > w (φ₃, σ₃) = 0.61 > w (ϕ₃, φ₃) = 0.32, this 3-class partition is a quality partition. After computing the transitive closure of w, we obtain the similarity relation v_o, shown in Table 6, from which the (translation invariant) similarity relation v follows via Eq. (4).

Table 6

Similarity relation v_o (transitive closure of w)

v_o	ϕ₃	σ₃	δ₃	φ₃
ϕ₃	1	0.7	0.61	0.61
σ₃	0.7	1	0.61	0.61
δ₃	0.61	0.61	1	0.7
φ₃	0.61	0.61	0.7	1

Thus, we have that v_t=0.7 (c, b) ⊂ v_t=0.61 (c, a). Equivalently, by Theorem 2 and Theorem 3, we have that v (c, b) = 0.7 ⊂ v (c, a) = 0.61 (v being a fuzzy subgroup).
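The values of v quoted above can be reproduced from Table 6 through Eq. (4). The sketch below rests on two assumptions that are made explicit here: the rows and columns of Table 6 follow the order (overlap rule, σ₃, δ₃, coverage rule), and the permutations a, b and c act on G_o as the translations by σ₃, δ₃ and the coverage rule, respectively. The labels and helper names are hypothetical.

```python
ORDER = ["overlap", "sigma", "delta", "grouping"]

# Transitive closure v_o as reported in Table 6.
V0 = [
    [1.00, 0.70, 0.61, 0.61],
    [0.70, 1.00, 0.61, 0.61],
    [0.61, 0.61, 1.00, 0.70],
    [0.61, 0.61, 0.70, 1.00],
]
v0 = {(a, b): V0[i][j] for i, a in enumerate(ORDER) for j, b in enumerate(ORDER)}

# Permutations of G_p written as mappings on G_o.
PERMS = {
    "e": {"overlap": "overlap", "sigma": "sigma", "delta": "delta", "grouping": "grouping"},
    "a": {"overlap": "sigma", "sigma": "overlap", "delta": "grouping", "grouping": "delta"},
    "b": {"overlap": "delta", "sigma": "grouping", "delta": "overlap", "grouping": "sigma"},
    "c": {"overlap": "grouping", "sigma": "delta", "delta": "sigma", "grouping": "overlap"},
}

def compose(p, q):
    """Permutation t -> p(q(t))."""
    return {t: p[q[t]] for t in ORDER}

def v_maps(px, py):
    """Eq. (4) for two permutations given as mappings."""
    return min(v0[px[t], py[t]] for t in ORDER)

def v(x, y):
    """Eq. (4) for two permutations given by name."""
    return v_maps(PERMS[x], PERMS[y])

# Left translation invariance: v(zx, zy) = v(x, y) for all x, y, z in G_p.
left_invariant = all(
    v_maps(compose(PERMS[z], PERMS[x]), compose(PERMS[z], PERMS[y])) == v(x, y)
    for z in PERMS for x in PERMS for y in PERMS
)
```

Under these assumptions, `v("c", "b")` and `v("c", "a")` evaluate to 0.7 and 0.61, matching the values stated in the text.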

As the partition is a quality partition, following Algorithm 1 and according to Step 12, we evaluate the relevance of the classes considering the 2-class sub-partitions C₁, C₂, C₃ and computing the corresponding values for w.

For C₃ it is k₃ = w (ϕ₂, φ₂) = 0.67 > w (φ₂, σ₂) = 0.46 = s₃. For C₂ it is k₂ = w (ϕ₂, φ₂) = 0.59 > w (φ₂, σ₂) = 0.4 = s₂, and for C₁ it is k₁ = w (ϕ₂, φ₂) = 0.69 > w (φ₂, σ₂) = 0.48 = s₁.

According to the Relevance Criterion and Step 16, all the classes are then highly relevant. The corresponding classes of the fuzzy partition are presented in Fig. 3.

Fig.3

Classes obtained after applying the fuzzy 3-means algorithm (left: class 1, center: class 2 and right: class 3). The gray scale represents the membership degree of each pixel to each class, where black = 0 and white = 1.

Under the procedure above, we may conclude that this 3-class partition obtained through the fuzzy c-means is a high-quality partition. In this partition, for example, there is a much better distinction than before between the trees and the sky. In general, with a fuzzy partition with three classes, the image is segmented into regions that allow understanding it in terms of the intensity of the color of the objects.

As a way of checking these results, we later applied the fuzzy c-means to obtain a partition with n = 4 classes, for which the different quality indicators were computed. Particularly, for this partition we obtained m = w (φ₄, δ₄) =0.62 < w (φ₄, σ₄) =0.73 = s. Therefore, this 4-class partition was assessed as of low quality, which somehow provides a confirmation of the previous result.

Furthermore, in order to illustrate the robustness of the obtained results, the procedure has been performed with two different De Morgan triplets (given in Table 7). Table 8 summarizes the results.

Table 7

Two De Morgan Triplets

Rules	ϕ_n (μ₁ (x) , …, μ_n (x))	φ_n (μ₁ (x) , …, μ_n (x))	N (x)
Triplet 1	min(μ₁ (x) , …, μ_n (x))	max(μ₁ (x) , …, μ_n (x))	1 - x
Triplet 2	$\frac{\sqrt{\prod_{i = 1}^{n} μ_{i} (x)}}{\sqrt{\prod_{i = 1}^{n} μ_{i} (x)} + 1 - \prod_{i = 1}^{n} μ_{i} (x)}$	$\frac{\sum_{i = 1}^{n} μ_{i} (x) - \prod_{i = 1}^{n} μ_{i} (x)}{\sqrt{\prod_{i = 1}^{n} (1 - μ_{i} (x))} + \sum_{i = 1}^{n} μ_{i} (x) - \prod_{i = 1}^{n} μ_{i} (x)}$	1 - x
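As a quick sanity check on Triplet 2, the following sketch evaluates both rules as written in Table 7 for two classes (n = 2) and compares the disjunctive rule with the De Morgan dual of the conjunctive one. The function names are hypothetical, and the check is restricted to n = 2, where the sum of the memberships minus their product equals 1 minus the product of the negated memberships.

```python
from math import prod, sqrt

def neg(x):
    """Negation N(x) = 1 - x."""
    return 1 - x

def conjunctive_t2(mu):
    """Conjunctive rule of Triplet 2 as written in Table 7."""
    p = prod(mu)
    return sqrt(p) / (sqrt(p) + 1 - p)

def disjunctive_t2(mu):
    """Disjunctive rule of Triplet 2 as written in Table 7."""
    p = prod(mu)
    q = prod(1 - u for u in mu)
    return (sum(mu) - p) / (sqrt(q) + sum(mu) - p)

mu = [0.7, 0.2]  # illustrative membership degrees for two classes

# De Morgan dual of the conjunctive rule: N[conjunctive(N(mu_1), N(mu_2))].
dual = neg(conjunctive_t2([neg(u) for u in mu]))
direct = disjunctive_t2(mu)
```

For this input, `dual` and `direct` agree up to floating-point rounding.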

Table 8

Quality indicators of the given 3-class partition with two different recursive triplets.

n = 3	Triplet 1	Triplet 2
C	m = 0.84, s = 0.38	m = 0.77, s = 0.57
C₁	m = 0.68, s = 0.36, k = 0.5	m = 0.69, s = 0.51, k = 0.65
C₂	m = 0.63, s = 0.33, k = 0.44	m = 0.62, s = 0.51, k = 0.59
C₃	m = 0.67, s = 0.36, k = 0.51	m = 0.68, s = 0.51, k = 0.67

Under triplets 1 and 2 the same result was obtained, allowing us to assert that the optimal number of classes for the image under study is three.

7 Final comments

Through this paper, some basic elements are examined for characterizing the property of relevance regarding the evaluation of a fuzzy partition from a classification system. In particular, three aspects are considered for the study of relevance: 1) the comparison process between classes and the way they cover the objects under consideration; 2) the estimation of degrees of intensity in the changes generated by the elements in the valuation space; and 3) the specification of a stopping criterion for inclusion of classes in a fuzzy partition.

We explore the algebraic group structure G_o = {ϕ_n, φ_n, σ_n, δ_n}, established from a fuzzy classification system. On such a structure, a translation invariant similarity relation v is applied in such a way that the equivalence relations v_t form a lattice under inclusion. Such equivalence relations acquire practical meaning because they allow comparing the coverage of a family of classes with the overlap degree, the non-covering degree and the non-overlap degree. However, our proposal establishes similarity relations based on proximity relations. In this way, some values of both relations remain invariant. That is, we prove that under certain conditions the degrees of similarity and the degrees of proximity for certain elements of the group are the same. Such conditions allow establishing a criterion of relevance for the classes and a criterion of quality for the partitions.

Degrees of similarity have been presented by examining pairs of properties, such as the similarity between the coverage degree and the overlap degree, or the similarity between the degree of grouping and non-overlap. Based on this, it is desirable that there is a greater similarity between the degrees of coverage and non-overlap, than between the degree of coverage and overlap. Hence, we proposed a couple of criteria that allow determining the quality and relevance of a fuzzy partition, showing that relevance is a local property and is responsible for determining the size of the partition, i.e., the optimal number of classes of a fuzzy partition.

We established a process that helps the decision maker to choose the best family of classes. This process, which we have described algorithmically, is iterative because it requires obtaining fuzzy partitions with different numbers of classes, and possibly also different initializations, as many times as necessary, until a high-quality partition is found. In this sense, the process ends when it has found the high-quality fuzzy partition with the lowest number of classes (if possible).

In general, a fuzzy partition is said to be of quality if it has high degrees of coverage and low degrees of overlap. However, this is a feature that several partitions may have. Therefore, we established an extra criterion that allows comparing two or more high-quality fuzzy partitions. In such a process, the evaluation of the relevance of the classes, as expected, is a determining aspect.

As future work, we propose to compare the established criteria with traditional indexes for fuzzy partitions, in such a way that two scenarios can be presented: on the one hand, complementing the existing indexes with the criteria we have proposed or, on the other hand, finding other advantages in the use of the criteria.

Along the same lines, it is necessary to carry out a study on the performance of the criteria with different families of De Morgan triplets and different proximity relations. In general, we established a process that does not depend on these two elements; however, it may lead to processes that are computationally inefficient. A key aspect in our proposal is to establish a proximity relation that effectively reflects the idea of closeness.

Another future work that may be addressed in this context is the introduction of paired fuzzy sets (see [25]). In this way we could extend the classification model not only introducing general aggregation tools like overlap and grouping degrees, but incorporating opposite arguments, as for instance, the degree of class separation, perhaps understanding separate as the opposite concept of overlap. Despite the apparent increase in computation difficulties due to the extra information being included, the existence of opposites should help in collecting more evidence for properly assessing the quality of a partition, and hopefully improve the confidence in the model. Similarly, our proposal may be extended to the alternative approach proposed in [26], where the concept of an aggregation rule should be defined from a computational point of view, focusing on the computational properties of such an aggregation, i.e., on the manner in which the aggregation values are computed.

Footnotes

In a group structure (G, ∗), x^-1 denotes the inverse of the element x for operation ∗.

An alternating group is the group of even permutations of a set of n elements, denoted A_n or Alt (n). Alternating groups are therefore permutation groups. In particular, {id, (12) (34) , (13) (24) , (14) (23)} is a normal subgroup of A₄ (the Klein four-group).

Acknowledgment

This research has been partially supported by the Government of Spain (grant PGC2018-096509-B-I00), Complutense University (UCM Research Group 910149) and Gran Colombia University (grant JCG2019-FCEM-01).

References

Amo

, Montero

, Biging

and Cutello

, Fuzzy classification systems, Eur J Oper Res 9 156 (2004), 495–507.

Amo

, Gomez

, Montero

and Biging

, Relevance and Recundancy in fuzzy classification systems, Mathw Soft Comput 8 (2001), 203–216.

Baraldi

and Blonda

, A survey of fuzzy clustering algorithms for pattern recognition - Part I, IEEE Trans Syst Man Cybern Part B Cybern 29 (1999), 778–785.

Baraldi

and Blonda

, A survey of fuzzy clustering algorithms for pattern recognition - Part II, IEEE Trans Syst Man Cybern Part B Cybern 29 (1999), 786–801.

Bellman

, Kalaba

and Zadeh

, Abstraction and pattern classification, J Math Anal Appl 13 (1966), 1–7.

Benzecri

J.-P.

, Statistical Analysis As a Tool To Make Patterns Emerge From Data, in: Watanabe

(Ed.), Methodol.attern Recognit., Academic Press (1969), 35–74.

Bezdek

J.C.

and Douglas Harris

, Fuzzy partitions and relations; an axiomatic basis for clustering, Fuzzy Sets Syst 1 (1978), 111–127.

Bustince

, Fernández

, Mesiar

, Montero

and Orduna

, Overlap functions, Nonlinear Anal, Theory Methods Appl 72 (2010), 1488–1499.

Bustince

, Barrenechea

, Pagola

, Fernández

, The notions of overlap and grouping functions, in: Stud, Fuzziness Soft Comput (2016), 137–156.

10.

Bustince

, Pagola

, Mesiar

, Hüllermeier

and Herrera

, Grouping, overlap, and generalized bientropic functions for fuzzy modeling of pairwise comparisons, IEEE Trans Fuzzy Syst 20 (2012), 405–415.

11.

Calvo

, Kolesárová

, Komorníková

and Mesiar

, Aggregation Operators. Studies in Fuzziness and Soft Computing 97 (2002), 3–104.

12.

Castiblanco

, Montero

, Rodríguez

J.T.

and Gómez

, Quality assessment of fuzzy classification: An application to solvency analysis, Fuzzy Econ Rev 22 (2017), 19–31. 9.

13.

Castiblanco

, Gómez

, Montero

, Rodríguez

J.T.

, Aggregation tools for the evaluation of classifications, in: IFSASCIS 2017 - Jt. 17th World Congr. Int. Fuzzy Syst. Assoc. 9th Int. Conf. Soft Comut. Intell. Syst., IEEE, Otsu, Japan, (2017), pp. 1–5.

14.

Das

, Pattern Recognition using the Fuzzy c-means Technique, Int J Energy Inf Commun 4 (2013), 1–14.

15.

Dombi

, A general class of fuzzy operators, the demorgan class of fuzzy operators and fuzziness measures induced by fuzzy operators, Fuzzy Sets Syst 8 (1982), 149–163.

16.

Dombi

, Basic concepts for a theory of evaluation: The aggregative operator, Eur J Oper Res 10 (1982), 282–293.

17.

Gómez

, Rodríguez

J.T.

, Montero

, Bustince

and Barrenechea

, N-Dimensional overlap functions, Fuzzy Sets Syst 287 (2016), 57–75.

18.

Gopal

and Woodcock

, Theory and methods for accuracy Passessment of thematic maps using fuzzy sets, Photogramm Eng Remote Sens 60 (1994), 181–188.

19.

Halkidi

, Batistakis

and Vazirgiannis

, On clustering validation techniques, J Intell Inf Syst 17 (2001), 107–145.

20.

Klement

E.P.

and Moser

, On the redundancy of fuzzy partitions, Fuzzy Sets Syst 85 (1997), 195–201.

21. Klir G.J. and Yuan, Fuzzy Sets and Fuzzy Logic: Theory and Applications. Prentice Hall PTR, New Jersey (1995).

22. Krishnapuram and Keller J.M., A possibilistic approach to clustering, IEEE Trans Fuzzy Syst 1 (1993), 98–110.

23. Kundu, Membership functions for a fuzzy group from similarity relations, Fuzzy Sets Syst 101 (1999), 391–402.

24. Mitra and Pal S.K., Fuzzy sets in pattern recognition and machine intelligence, Fuzzy Sets Syst 156 (2005), 381–386.

25. Montero, Bustince, Franco, Rodríguez J.T., Gómez, Pagola, Fernández and Barrenechea, Paired structures in knowledge representation, Knowledge-Based Syst 100 (2016), 50–58.

26. Montero, González-del-Campo, Garmendia, Gómez and Rodríguez J.T., Computable aggregations, Inf Sci (2018), 439–449.

27. Mordeson J.N., Bhutani K.R. and Rosenfeld, Fuzzy Group Theory (2005).

28. Nagalakshmi and Jyothi, A Survey on Pattern Recognition using Fuzzy Clustering Approaches, Int Ref J Eng Sci 2 (2013), 2319–183.

29. Qiao and Hu B.Q., On interval additive generators of interval overlap functions and interval grouping functions, Fuzzy Sets Syst 323 (2017), 19–55.

30. Qiao and Hu B.Q., On the migrativity of uninorms and nullnorms over overlap and grouping functions, Fuzzy Sets Syst (2017).

31. Rojas, Gómez, Montero and Rodríguez J.T., Strictly stable families of aggregation operators, Fuzzy Sets Syst 228 (2013), 44–63.

32. Ruspini E.H., A new approach to clustering, Inf Control 15 (1969), 22–32.

33. Sperber and Wilson, Précis of Relevance: Communication and Cognition, Behav Brain Sci 10 (1987), 697.

34. Wang and Zhang, On fuzzy cluster validity indices, Fuzzy Sets Syst 158 (2007), 2095–2117.

Table 1
Disjunctive recursive rule

| Objects | c₁ | c₂ | c₃ | φ₃(c₁, c₂, c₃) |
|---|---|---|---|---|
| x₁ | 0.8 | 0.1 | 0.1 | 0.6329 |
| x₂ | 0.7 | 0.2 | 0.1 | 0.5474 |
| x₃ | 0.3 | 0.62 | 0.08 | 0.5070 |
| x₄ | 0.5 | 0.47 | 0.03 | 0.4908 |