Self-organizing map with granular competitive learning: Application to microarray clustering

Abstract

Self-organizing map (SOM) models perform clustering based on competitive learning. The learning methods of these models involve a neighborhood function, such as a Gaussian, in the output layer, where the Euclidean distance from the winning node to an output node is used. In this study, a granular competitive learning of SOM (SOMGCL), involving a fuzzy distance, a distance based granular neighborhood function and fuzzy initial connection weights, is developed using the concepts of fuzzy rough sets. The fuzzy distance between a winning node and an output node of the SOM is computed from the averages of memberships belonging to the lower approximations and boundary regions of the clusters obtained at the nodes. The fuzzy distance is incorporated into a Gaussian function to define the proposed neighborhood function. Dependency values of features, computed using fuzzy rough sets, are encoded into the SOM as its fuzzy initial connection weights. Here, the concepts of fuzzy rough sets are based on a new fuzzy strict order relation. While the fuzzy distance defines the similarity measure in the clustering process, the distance based granular neighborhood function handles uncertainty in cluster boundary regions. The effectiveness of SOMGCL is demonstrated in clustering both the samples and the genes of microarrays having a large number of genes and classes, in terms of cluster evaluation metrics and quantization error. Further, the biological meaning of the gene clusters obtained using SOMGCL is elucidated using gene ontology.

Keywords

Fuzzy rough set, self-organizing map, information granulation, clustering, microarray

1. Introduction

Advanced computing technologies in the bioinformatics field produce different types of gene expression data in large quantities, resulting in massive high-dimensional datasets. Microarray technologies provide valuable information on disease subcategories, disease prognosis, and treatment outcome [32]. A microarray dataset is a large collection of expression values for thousands of genes (DNA sequences), collected from microarray experiments performed under different conditions. The conditions could be a time series during a biological process or a collection of tissue samples. A single type of DNA array that is based on small single-stranded oligonucleotides synthesized in situ or complementary DNA (cDNA) is used in microarray experiments. A typical workflow of a microarray experiment is described in [33]. In the following subsections, we discuss the need for better granular computing-based learning strategies for discovering meaningful clusters in microarray data, review prior work on granular neural networks for clustering and the gap in the literature, and give an overview of the proposed method.

1.1 Motivation: Discovering gene clusters with different algorithms

Data mining tasks (supervised and unsupervised learning methods) associated with microarray include the following: Discovering subgroups of samples that share common features, determining gene expression profiles which are different between two or more groups of samples, identifying differentially expressed genes, describing genes with similar biological functions, and discovering subclasses of samples affected by the same disease. Gene selection is an extremely important pre-processing step in microarray data mining due to the high-dimensional nature of the data. In [35, 36], clustering methods are used in selecting important genes in microarray data.

Clustering of microarray data can be categorized as sample-based clustering or gene-based clustering. Sample-based clustering involves discovering phenotypic structures or substructures of samples, where samples are treated as patterns and genes are considered features. Gene-based clustering partitions gene expression profiles into subgroups with common biological characteristics [23, 32]. Neural network-based hierarchical clustering methods were used in [11, 12]. In [22], clustering methods have proven to be helpful in the identification of diseased or normal genes and in understanding gene function, gene regulation and cellular processes. An overview of clustering algorithms for gene expression data can be found in [14].

In microarray data, cluster boundaries are ill-defined and are characterized by regions that are overlapping, highly connected or embedded. These characteristics result in uncertain, imprecise (noisy), and incomplete information. Noise arises in microarrays when massive experiments based on complex procedures are conducted [16]. As a result, set-theoretic methods designed under the granular computing framework for gene expression data, for example [15, 18, 21], have been shown to be more effective than conventional clustering methods. In [15], the initial $c$ -means are computed using a fuzzy entropy. In [18], rough sets are used to reduce the effect of the neighbors of a sample when assigning it to a cluster. A rough-fuzzy $c$ -means method [21] works on the principle of constructing the lower approximation and boundary region of a cluster using rough sets. Further, the boundary region is defined in terms of possibilistic and probabilistic memberships in order to quantify imprecision in a cluster definition.

1.2 Granular computing: Granular neural networks for clustering

Granular computing is an umbrella term that refers to computing with the following set theoretic methodologies: Fuzzy sets [37], rough sets [27] or a combination of fuzzy and rough sets [6]. The basic operations used for generating granules include indiscernibility, similarity, proximity, and functionality. Granulation using fuzzy sets refers to generalized fuzzy constraint that assigns memberships to samples in a set. In rough sets, crisp equivalence classes represent granules where indiscernibility between objects is considered. Here, the equivalence classes are used to approximate a set in terms of crisp lower and upper approximations. In this study, we use fuzzy rough set theoretic approach for granular computing. Here fuzzy equivalence classes (granules) are generated based on fuzzy reflexive relations so as to represent fuzzy information granules. These fuzzy equivalence granules are combined using fuzzy implication and T-norm to determine the memberships of their belonging to the lower and upper approximations of a cluster. That means, the cluster is approximated in terms of fuzzy lower and fuzzy upper approximations.

In [1, 26], different granular neural networks were developed by integrating rough set or fuzzy rough set methodologies and self-organizing map (SOM) [17]. In [25], rough sets were used to determine the initial connection weights (network parameters) of SOM. Using fuzzy rough sets, the initial connection weights and fuzzy rough neighborhood function were determined in [9] and [30], respectively. While the underlying structure in [9, 25, 30] is granular in nature, they differ in the design of the competitive learning process and the application case studies. The methodological differences are: Generation of rough rules from attribute reducts [25], evolution of information granules by exploiting $\alpha$ -cuts on fuzzy similarity matrix [9] and obtaining the information granules (clusters) with SOM for the first iteration [30]. Further, fuzzy information granule represented by the lower approximation memberships of a set is used in [9] and [30]. Applications include clustering real valued data [25], analysis of microarray gene expressions [9] and selection of genes based on sample clustering in microarray [30].

In contrast to the previous work in [9, 21, 25, 30] where a Euclidean distance was used as a measure of similarity, in this work, a new fuzzy Euclidean distance is used. Here, a fuzzy relation is also newly defined. While a degree of similarity or dissimilarity between two patterns is measured using the fuzzy Euclidean distance, the fuzzy relation determines the proximity between a pair of patterns corresponding to a feature. The proposed competitive learning strategy (SOMGCL) is described in Section 1.3. The effectiveness of SOMGCL is demonstrated using four different datasets with dimensions (features) ranging from 98 to 54675.

1.3 Overview of SOMGCL

Figure 1. Block diagram of the proposed SOMGCL.

In this paper, we propose a different competitive learning strategy for training the SOMGCL model. This strategy is based on three concepts: Fuzzy initial connection weights (FICW), a fuzzy distance measure (DM) and the DM based granular neighborhood function (FGNF). This is in contrast to the learning strategies and distance functions discussed in [9, 25, 30]. There are two main differences between this work and the one reported in [9]: i) in [9], the information granules were determined by $\alpha$ -cuts on a fuzzy similarity matrix to define the network parameters, whereas in this article the network parameters are based on the clusters obtained using SOM for the first iteration; ii) the model discussed in [9] uses a competitive learning method involving the Gaussian neighborhood function to update its network parameters, whereas in this paper the network parameters are updated using a new granular competitive learning method involving the fuzzy distance based granular neighborhood function. Figure 1 shows the initial architecture of SOMGCL. The proposed methodology is explained in two parts. In the first part, the fuzzy initial connection weights (FICW) of the SOM are determined. These are based on the output clusters of the SOM and fuzzy rough sets. Here, the SOM is trained through competitive learning for the first iteration and its clustering results are presented to a decision table. Subsequently, a dependency value for every feature in the decision table is computed using fuzzy rough sets (FRS) based on a new fuzzy strict order relation, as well as fuzzy decision classes. The dependency values of all features are incorporated into the SOM as its initial connection weights.

The second part deals with a fuzzy distance measure (DM) and the distance based granular neighborhood function (FGNF) of the self-organizing map. Here, we use the above decision table. The fuzzy rough set (FRS) defines membership values belonging to the lower approximation and boundary region of a set (cluster). The average membership values belonging to the lower approximation and boundary region of a set are computed. These average memberships are then used to compute the distance between two output nodes (the winner and any other output node), which forms the fuzzy distance measure (DM). A fuzzy granular neighborhood function (FGNF) is formulated by replacing the distance value in the Gaussian neighborhood with the fuzzy distance (DM). The process of defining FGNF is repeated in the second and all subsequent iterations. Together, FICW and FGNF form the granular competitive learning through which the SOM is trained. The contribution of this work is a novel granular competitive learning model (SOMGCL) for clustering microarray gene expression data that takes advantage of the strengths of the fuzzy rough methodology and the self-organizing map model. The performance of SOMGCL for clustering samples and gene expressions in microarrays is found to be superior to five benchmark algorithms in terms of the Rand, Jaccard, Fowlkes-Mallows, $\beta$ , DB and Dunn indices on four datasets.

This article is organized as follows: Preliminaries of fuzzy rough sets are discussed in Section 2. Section 3 describes the architecture of self-organizing map and its training method. Section 4 explains the proposed fuzzy strict order relation and the method of granular competitive learning of self-organizing map (SOMGCL). In Section 5, the performance of SOMGCL, as compared to related clustering methods, is discussed in terms of confusion matrices, external and inter cluster evaluation measures and quantization errors. Functional groups of genes obtained using SOMGCL and biologically significant gene ontology terms associated with the group of genes under GO-slim biological categories are also presented in this section. Conclusions of the present investigation are provided in Section 6.

2. Preliminaries of fuzzy rough sets

A fuzzy rough set is a generalized rough set, characterized by generalized lower and upper approximations. A set can be crisp or fuzzy set. The generalized rough set assigns memberships to patterns of a set, based on fuzzy operators. The fuzzy operators include a similarity value between a pair of patterns and their class labels to define the lower and upper memberships of a set. Approximation of a set using the lower and upper memberships in fuzzy feature space is called fuzzy rough set.

In fuzzy rough sets, a decision system is denoted by $S=\{U,\mathcal{A}\cup\{d\}\}$ . Here, $U$ , $\mathcal{A}$ and $\{d\}$ denote a set of patterns $\{x_{1},x_{2},\ldots,x_{m}\}$ , conditional features $\mathcal{A}=\{a_{1},a_{2},\ldots,a_{n}\}$ and decision features $\{X_{k},k=1,2,\ldots,c\}$ , respectively. The decision features labeled with $c$ values represent decision classes. Both the conditional and decision features can be defined as fuzzy. Let $R$ denote a fuzzy relation. The relation $R$ is called a fuzzy similarity relation when it satisfies reflexivity ( $R_{a_{1}}(x,x)=1$ ), symmetry ( $R_{a_{1}}(x,y)=R_{a_{1}}(y,x)$ ), and $T$ -transitivity ( $R_{a_{1}}(x,z)\geqslant R_{a_{1}}(x,y)\wedge R_{a_{1}}(y,z)$ ), with respect to a conditional attribute $a_{1}$ . The fuzzy similarity relations are generated by applying mathematical operators on the universe of patterns $U$ with respect to crisp decision classes. The fuzzy equivalence classes are induced from the fuzzy similarity relations. Note that the samples or genes in a microarray are considered as patterns. The relation $R$ is called a fuzzy reflexive relation when it satisfies only reflexivity.

The fuzzy logical counterparts of the connectives are applied to generalization of fuzzy relations, the lower and upper approximations of a fuzzy set. We now discuss some definitions. A t-norm $T(x,y)=x*y$ for all $x$ and $y\in$ [0, 1]. An implication operator $I(x,y)=1-x+xy$ for all $x$ and $y\in$ [0, 1]. These definitions can be improved by adding necessary properties: A triangular norm (t-norm for short) $T$ is any increasing, commutative, and associative $[0,1]^{2}\rightarrow[0,1]$ mapping satisfying $T(1,x)=x$ , for all $x\in[0,1]$ . Analogously, a fuzzy implication $I$ is any decreasing in its first and increasing in its second argument $[0,1]^{2}\rightarrow[0,1]$ mapping satisfying $I(0,0)=$ 1, $I(1,x)=x$ , for all $x$ in [0, 1].
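
A minimal Python sketch of the two operators given above, $T(x,y)=x*y$ and $I(x,y)=1-x+xy$ , may help make the boundary conditions concrete. The function names `t_norm` and `implication` are illustrative choices and do not come from the paper.

```python
def t_norm(x: float, y: float) -> float:
    """Product t-norm T(x, y) = x * y on [0, 1]."""
    return x * y


def implication(x: float, y: float) -> float:
    """Implication operator I(x, y) = 1 - x + x * y on [0, 1]."""
    return 1.0 - x + x * y


if __name__ == "__main__":
    # Boundary conditions stated in the text: T(1, x) = x, I(0, 0) = 1, I(1, x) = x.
    for x in (0.0, 0.25, 0.5, 1.0):
        print(x, t_norm(1.0, x), implication(1.0, x))
    print(implication(0.0, 0.0))
```

Both functions satisfy the listed properties on a few sample points; the checks are numerical spot checks, not a proof.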

Let $x_{1}$ and $x_{2}\in U$ denote a pair of patterns. A fuzzy reflexive relation $R_{a}$ [9] between the patterns corresponding to a feature $a\in\mathcal{A}$ is defined as

$\displaystyle R_{a}(x_{1},x_{2})=\left\{\begin{array}{ll}\max\left(\min\left(\frac{a(x_{2})-a(x_{1})+\sigma_{a_{k_{1}}}}{\sigma_{a_{k_{1}}}},\frac{a(x_{1})-a(x_{2})+\sigma_{a_{k_{1}}}}{\sigma_{a_{k_{1}}}}\right),0\right),&\text{if}\ a(x_{1})\ \text{and}\ a(x_{2})\in R_{d}(X_{k_{1}}),\\ \max\left(\min\left(\frac{a(x_{2})-a(x_{1})+\sigma_{a_{k_{2}}}}{\sigma_{a_{k_{2}}}},\frac{a(x_{1})-a(x_{2})+\sigma_{a_{k_{2}}}}{\sigma_{a_{k_{2}}}}\right),0\right),&\text{if}\ a(x_{1})\in R_{d}(X_{k_{1}}),\ a(x_{2})\in R_{d}(X_{k_{2}}),\ \text{and}\ k_{1}\neq k_{2},\end{array}\right.$ (1)

where $k_{1}$ and $k_{2}=1,2,\ldots,c$ , and $\sigma_{a_{k_{1}}}$ and $\sigma_{a_{k_{2}}}$ represent the standard deviations of patterns in the sets $k_{1}$ and $k_{2}$ , corresponding to decision attributes ${X_{k_{1}}}$ and ${X_{k_{2}}}$ , respectively. A set of fuzzy reflexive relations constitutes a fuzzy reflexive relational matrix. Every row of the matrix corresponds to a conditional feature. The fuzzy reflexive relation represents a fuzzy equivalence granule. It contains similarity values between all possible pairs of patterns. Fuzzy decision classes corresponding to a decision feature, based on the feature values, are defined as follows.
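
As an illustration of Eq. (1), the following Python sketch builds the pairwise relation for a single feature. It is only a sketch under stated assumptions: the names `reflexive_relation` and `relation_matrix` are hypothetical, the population standard deviation is used, and a small `eps` is added to guard against a zero standard deviation (the paper does not discuss that case).

```python
import numpy as np


def reflexive_relation(a1, a2, sigma):
    """Eq. (1) for one pair of feature values and one class standard deviation."""
    return max(min((a2 - a1 + sigma) / sigma, (a1 - a2 + sigma) / sigma), 0.0)


def relation_matrix(values, labels, eps=1e-12):
    """Pairwise relation for one feature; eps is an assumed guard against sigma = 0."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    # Standard deviation of the feature within each decision class.
    sigma = {k: values[labels == k].std() + eps for k in np.unique(labels)}
    m = len(values)
    R = np.empty((m, m))
    for i in range(m):
        for j in range(m):
            # Same class: sigma of k1. Different classes: sigma of k2.
            # In both cases this is the class of the second pattern.
            R[i, j] = reflexive_relation(values[i], values[j], sigma[labels[j]])
    return R


if __name__ == "__main__":
    print(relation_matrix([0.10, 0.12, 0.30, 0.80, 0.90], [0, 0, 0, 1, 1]).round(3))
```

The diagonal equals 1 by construction, which matches the reflexivity property of the relation.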

2.1 Fuzzy decision classes

Let $\overrightarrow{x}_{i}$ denote $n$ -dimensional vector corresponding to $i$ th pattern. The membership of $i$ th pattern to $k$ th class, denoted by $\mu_{k}(\overrightarrow{x}_{i})$ , is defined as

$\displaystyle\mu_{k}(\overrightarrow{x}_{i})=\frac{1}{1+\left(\frac{Z_{ik}}{f_{d}}\right)^{f_{e}}},$ (2)

where $Z_{ik}$ is a weighted distance, and $f_{d}$ and $f_{e}$ are the denominational and exponential fuzzy generators controlling the amount of fuzziness in the class membership lying in $[0,1]$ . The values of $f_{d}$ and $f_{e}$ are chosen as positive integers as mentioned in [9]. The weighted distance $Z_{ik}$ is defined as

$\displaystyle Z_{ik}=\sqrt{\sum_{j=1}^{n}\left[\frac{x_{ij}-O_{kj}}{V_{kj}}\right]^{2}},\quad\text{for}\ k=1,2,\ldots,c,$ (3)

where $O_{kj}$ and $V_{kj}$ denote the mean and standard deviation of the $j$ th feature in the $k$ th class, respectively.
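
Equations (2) and (3) can be sketched in a few lines of Python. The function name `class_membership` and the default values `f_d=1.0` and `f_e=2.0` are assumptions made for illustration; the paper only states that the generators are positive integers chosen as in [9].

```python
import numpy as np


def class_membership(x, mean, std, f_d=1.0, f_e=2.0):
    """Membership of pattern x to a class with per-feature mean and std.

    f_d and f_e are the denominational and exponential fuzzy generators;
    the defaults here are illustrative assumptions.
    """
    x, mean, std = (np.asarray(v, dtype=float) for v in (x, mean, std))
    z = np.sqrt(np.sum(((x - mean) / std) ** 2))  # weighted distance, Eq. (3)
    return 1.0 / (1.0 + (z / f_d) ** f_e)         # membership, Eq. (2)


if __name__ == "__main__":
    print(class_membership([1.0, 2.0], [1.0, 2.0], [0.5, 0.5]))
    print(class_membership([3.0, 4.0], [1.0, 2.0], [0.5, 0.5]))
```

A pattern located at the class mean receives membership 1, and the membership decreases as the weighted distance grows.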

Let $k$ and $u$ denote two classes. The membership values of patterns in the $k$ th class to its own class are represented as

$\displaystyle DD_{kk}=\mu_{k}(\overrightarrow{x}_{i}),\text{if}\ k=u,\text{and}$ (4)

The membership values of patterns in the $k$ th class to other classes are denoted as

$\displaystyle DD_{ku}=1,\text{if}\ k\neq u,$ (5)

where $k$ and $u$ = 1, 2, …, $c$ .

Equations (4) and (5) define memberships to patterns belonging to decision classes $k$ and $u$ . Let $R_{d}$ denote a fuzzy relation with respect to a decision feature. For a pattern $x_{1}$ belonging to $k$ th class, the fuzzy decision classes, denoted by $R_{d}(x_{1})$ , using Eqs (4) and (5), are defined as

$\displaystyle R_{d}(x_{1})=\left\{\begin{array}{ll}DD_{kk},&\text{if}\ x_{1}\in k\text{th class},\\ DD_{ku},&\text{otherwise}.\end{array}\right.$ (6)

Each pattern corresponding to $k$ th fuzzy decision feature, $X_{k}$ , $k=$ 1, 2, …, $c$ contains membership values belonging to its own class and other classes. The fuzzy lower and upper approximations of the set $A\subseteq U$ , based on the fuzzy reflexive relations and fuzzy decision classes, are described as follows.

2.2 Fuzzy lower and upper approximations

The fuzzy lower and upper approximations of a set, $A$ , based on the fuzzy reflexive relations and fuzzy decision classes, are defined as [4, 28]

$\displaystyle(R_{\mathcal{B}}\downarrow R_{d})(x_{1})=\inf_{x_{2}\in U}I(R_{\mathcal{B}}(x_{1},x_{2}),R_{d}(x_{1})),$ (7)

$\displaystyle(R_{\mathcal{B}}\uparrow R_{d})(x_{1})=\sup_{x_{2}\in U}T(R_{\mathcal{B}}(x_{1},x_{2}),R_{d}(x_{1})),$ (8)

where a pattern $x_{1}\in U$ , $R$ is a fuzzy relation and $\mathcal{B}\subseteq\mathcal{A}$ is a subset of features. Equations (7) and (8) refer to the lower and upper membership values of pattern in a set $A$ . Here, a fuzzy implication $I$ in Eq. (7) and a $t$ -norm $T$ in Eq. (8) are used.

The dependency value for the subset of features is now computed.

2.3 Dependency value

Let $\gamma_{\mathcal{B}}$ denote the dependency value for a subset of conditional features $\mathcal{B}\subseteq\mathcal{A}$ . It is computed as

$\displaystyle\gamma_{\mathcal{B}}=\frac{\sum_{x\in U}(R_{\mathcal{B}}\downarrow R_{d})(x)}{|U|},$ (9)

where $|U|$ denotes the cardinality of the set $U$ , and $0\leqslant\gamma_{\mathcal{B}}\leqslant 1$ .
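
The chain from Eqs (7) and (8) to Eq. (9) can be illustrated with a short Python sketch. It follows the equations as they are written, with the decision membership evaluated at $x_{1}$ , and uses the operators $T(x,y)=x*y$ and $I(x,y)=1-x+xy$ from earlier in this section. The function names and the toy relation matrix are assumptions made for illustration.

```python
import numpy as np


def lower_upper(R, d):
    """Lower and upper memberships for every pattern x1.

    R is an (m, m) relation matrix and d holds the decision membership
    R_d(x1) of each pattern, as written in Eqs (7) and (8).
    """
    R = np.asarray(R, dtype=float)
    D = np.asarray(d, dtype=float)[:, None]   # R_d(x1), broadcast over x2
    lower = np.min(1.0 - R + R * D, axis=1)   # inf over x2 of I(R, R_d)
    upper = np.max(R * D, axis=1)             # sup over x2 of T(R, R_d)
    return lower, upper


def dependency(lower):
    """Dependency value of Eq. (9): mean lower membership over all patterns."""
    return float(np.mean(lower))


if __name__ == "__main__":
    R = [[1.0, 0.5], [0.5, 1.0]]   # toy relation matrix (assumed values)
    d = [0.8, 1.0]                 # toy decision memberships (assumed values)
    lower, upper = lower_upper(R, d)
    print(lower, upper, dependency(lower))
```

Because the dependency value is an average of memberships, it stays within $[0,1]$ whenever the inputs do.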

3. Architecture of self-organizing map

The self-organizing map (SOM) [17] consists of an input layer and an output layer (competitive layer). There are $n$ nodes in the input layer, where $n$ is the number of features/attributes. In the output layer, the number of nodes is set to the expected number of clusters, which is determined a priori. The nodes in the input layer are connected to every node in the output layer. The weights of the links connecting the nodes in the input and the output layers are initialized with real numbers chosen randomly between 0 and 1. These are denoted by $\{w_{kj}(e),k=1,2,\ldots,c$ ; $j=1,2,\ldots,n\}$ . Here $e$ is the iteration index and $c$ is the number of nodes in the output layer. Let $x=x(e)\in R^{n}$ denote an input pattern. The SOM is trained through competitive learning.

Method of competitive learning of SOM: For all iterations, $e$ , the following steps are repeated.

i)
Present an $n$ -dimensional input vector $x(e)$ at the nodes in the input layer.
ii)
Compute the Euclidean distance, $d_{k}$ , between the input vector, $x_{j}(e)$ , and the weight vector, $w_{kj}(e)$ , using

$\displaystyle d_{k}=\parallel x_{j}(e)-w_{kj}(e)\parallel^{2}.$ (10)
iii)
Find the winning node $v$ among the output nodes, that is, the node with the minimum Euclidean distance to the input vector, using

$\displaystyle v=\text{argmin}\{d_{k}\},k=1,2,\ldots,c.$ (11)
iv)
Calculate the distance ( $D_{vk}$ ) between a winning node $v$ and an output neuron $k$ using

$\displaystyle D_{vk}(e)=\parallel r_{v}-r_{k}\parallel^{2},$ (12)

where $r_{v}$ and $r_{k}$ are the positions of a winning node $v$ and the neuron $k$ , respectively.
v)
Define a Gaussian neighborhood ( $N_{v}$ ) of a winning node, based on $D_{vk}$ , using

$\displaystyle N_{v}(e)=\exp\left(\frac{-D_{vk}^{2}}{2\sigma(e)^{2}}\right),$ (13)

where $\sigma(e)$ represents the width of the Gaussian neighborhood at iteration $e$ . The value of $\sigma$ is determined as

$\displaystyle\sigma(e)=\sigma_{0}\exp\left(-\frac{e}{\tau_{1}}\right),e=0,1,2,\ldots,$ (14)

where $\sigma_{0}$ is the initial width and $\tau_{1}$ is a time constant, chosen as follows. The value of $\tau_{1}$ is the ratio of the total number of training epochs to $\log(\sigma_{0})$ . Here, $\sigma_{0}$ is the maximum of the numbers of rows and columns in the output layer of SOM. If this maximum is greater than 2, then $\sigma_{0}$ is set to the maximum divided by 2, as in [17].
vi)
Modify the weights of a winning node (winner) and its neighborhood neurons using

$\displaystyle w_{kj}(e+1)=\left\{\begin{array}{ll}w_{kj}(e)+\alpha(e)N_{v}(e)(x_{j}(e)-w_{kj}(e)),&\text{if}\ k\in N_{v}(e),\\ w_{kj}(e),&\text{else}.\end{array}\right.$ (15)

Here, $\alpha$ is a learning parameter with a value between 0 and 1.
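
Steps i) to vi) for a single input vector can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: the function name `som_step` is hypothetical, the width `sigma` and learning rate `alpha` are passed in as plain numbers, and every node is updated with its Gaussian weight rather than testing neighborhood membership explicitly.

```python
import numpy as np


def som_step(weights, positions, x, sigma, alpha):
    """One competitive-learning update for a single input vector x.

    weights:   (c, n) connection weights, one row per output node
    positions: (c, p) grid coordinates of the output nodes
    """
    d = np.sum((x - weights) ** 2, axis=1)               # Eq. (10)
    v = int(np.argmin(d))                                # Eq. (11)
    D = np.sum((positions[v] - positions) ** 2, axis=1)  # Eq. (12)
    N = np.exp(-(D ** 2) / (2.0 * sigma ** 2))           # Eq. (13)
    # Eq. (15), simplified: all nodes move, scaled by their Gaussian weight.
    new_weights = weights + alpha * N[:, None] * (x - weights)
    return new_weights, v


if __name__ == "__main__":
    w = np.array([[0.0, 0.0], [1.0, 1.0]])
    pos = np.array([[0.0, 0.0], [0.0, 1.0]])
    new_w, winner = som_step(w, pos, np.array([0.1, 0.1]), sigma=1.0, alpha=0.5)
    print(winner)
    print(new_w)
```

The winner moves toward the input by the full factor `alpha`, while the other nodes move by a smaller amount that shrinks with grid distance.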

4. Methodology of granular competitive learning of self-organizing map

A method of granular competitive learning of self-organizing map (SOMGCL) for clustering is developed using the output clusters of self-organizing map and a fuzzy rough set which is based on a new fuzzy strict order relation defined as follows:

Proposed fuzzy strict order relation: Let $x_{1}$ and $x_{2}\in U$ denote a pair of patterns. By using the fuzzy logical operators discussed in Section 2, a fuzzy strict order relation $R_{a}$ between the patterns corresponding to a feature $a\in\mathcal{A}$ is defined as:

$\displaystyle R_{a}(x_{1},x_{2})=\left\{\begin{array}{ll}\max\left(\min\left(a(x_{2})-a(x_{1})+a(x_{1})\sigma_{a_{k_{1}}},\ 1-a(x_{1})+a(x_{1})\sigma_{a_{k_{1}}}-a(x_{2})\right),\sigma_{a_{k_{1}}}\right),&\text{if}\ a(x_{1})\ \text{and}\ a(x_{2})\in R_{d}(X_{k_{1}}),\\ \max\left(\min\left(a(x_{2})-a(x_{1})+a(x_{1})\sigma_{a_{k_{2}}},\ 1-a(x_{1})+a(x_{1})\sigma_{a_{k_{2}}}-a(x_{2})\right),\sigma_{a_{k_{2}}}\right),&\text{if}\ a(x_{1})\in R_{d}(X_{k_{1}}),\ a(x_{2})\in R_{d}(X_{k_{2}}),\ \text{and}\ k_{1}\neq k_{2},\end{array}\right.$ (16)

where $k_{1}$ and $k_{2}=1,2,\ldots,c$ , and $\sigma_{a_{k_{1}}}$ and $\sigma_{a_{k_{2}}}$ represent the standard deviations of patterns in the sets $k_{1}$ and $k_{2}$ , corresponding to decision attributes ${X_{k_{1}}}$ and ${X_{k_{2}}}$ , respectively. A relation $R_{a}$ is a fuzzy strict order relation when it is irreflexive, antisymmetric and transitive. The motivation is that the fuzzy strict order relation applies a criterion of maximizing the relevance between samples within a class with respect to a conditional attribute.
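
For a single pair of feature values, Eq. (16) reduces to one expression in which only the standard deviation changes between the two cases. The sketch below assumes that the caller supplies the appropriate class standard deviation; the name `strict_order_relation` is hypothetical.

```python
def strict_order_relation(a1: float, a2: float, sigma: float) -> float:
    """Eq. (16) for one pair of feature values a(x1), a(x2).

    sigma is the standard deviation of class k1 when both patterns share a
    class, and of class k2 otherwise (selection is left to the caller).
    """
    first = a2 - a1 + a1 * sigma
    second = 1.0 - a1 + a1 * sigma - a2
    return max(min(first, second), sigma)


if __name__ == "__main__":
    print(strict_order_relation(0.5, 0.5, 0.2))  # a pattern compared with itself
    print(strict_order_relation(0.2, 0.6, 0.1))
```

In the first call the value for a pattern compared with itself is below 1, which illustrates that this relation, unlike Eq. (1), is not reflexive.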

The following sections give computation details for determining FICW, DM and FGNF.

4.1 Computation of FICW

In a conventional SOM, initial connection weights are chosen randomly between 0 and 1. With the random initial connection weights, the SOM method has the following shortcomings: i) slower convergence speed results in large number of iterations in order to converge to a solution, ii) inability to handle uncertain decisions, iii) updating weights at each training instance. To avoid the above listed shortcomings, fuzzy initial connection weights (FICW) are determined. To define FICW, the self-organizing map is initially trained through competitive learning for the first iteration. The output clusters of SOM, thus obtained, are then presented to a decision table. The number of clusters is denoted by $c$ . We use the decision table thereafter.

The initial connection weights (FICW), using fuzzy rough sets based on the proposed fuzzy relation defined in Eq. (16), are defined as follows:

1.
Compute the fuzzy relational matrix for every feature using Eq. (16). Every row of the matrix represents an information granule.
2.
Define fuzzy decision classes, corresponding to fuzzy decision features, using Eq. (6).
3.
Compute the lower membership values of patterns in a set (concept) using Eq. (7), based on fuzzy decision classes and fuzzy relational matrix, for every feature. A concept refers to a set of decision classes. The lower membership values represent exactness in class belongingness of the patterns.
4.
Find the average of the lower membership values of patterns in a concept, representing a dependency value, using Eq. (9) for every feature. The dependency values of all the features, indicating domain knowledge about data, are encoded into the SOM as its initial connection weights.

We describe the initialization of the connection weights of the network as follows: Let $\{\gamma_{1}^{k},\gamma_{2}^{k},\ldots,\gamma_{n}^{k}\}$ be the dependency values, with respect to the $k$ th concept (set), for features $\{a_{1},a_{2},\ldots,a_{n}\}$ $\in\mathcal{B}$ . For a feature $j$ and a concept $k$ , the dependency value $\gamma_{j}^{k}$ is defined as

$\displaystyle\gamma_{j}^{k}=\frac{\sum_{x\in U_{k}}(R_{\mathcal{B}}\downarrow R_{d})(x)}{|U_{k}|},j=1,2,\ldots,n;k=1,2,\ldots,c.$ (17)

The weight $w_{kj}$ between the $j$ th node in the input layer and the $k$ th node in the output layer of SOM is initialized with $\gamma_{j}^{k}$ .

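
The initialization described by Eq. (17) amounts to averaging lower membership values per cluster and per feature. The sketch below assumes that the lower memberships have already been computed and that every cluster is non-empty; the name `ficw` and the array layout are illustrative choices.

```python
import numpy as np


def ficw(lower, labels, c):
    """Fuzzy initial connection weights from lower membership values.

    lower:  (n, m) array, lower membership of each of m patterns per feature
    labels: (m,) cluster index of each pattern after the first SOM iteration
    c:      number of clusters (each assumed to contain at least one pattern)
    """
    lower = np.asarray(lower, dtype=float)
    labels = np.asarray(labels)
    W = np.zeros((c, lower.shape[0]))
    for k in range(c):
        # Eq. (17): average over the patterns U_k of cluster k, per feature.
        W[k] = lower[:, labels == k].mean(axis=1)
    return W


if __name__ == "__main__":
    lower = [[0.8, 0.6, 1.0, 0.4],
             [0.2, 0.4, 0.6, 0.8]]
    print(ficw(lower, [0, 0, 1, 1], c=2))
```

Row $k$ of the returned matrix holds the initial weights of output node $k$ .
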
4.2 Computation of DM to define neighborhood neurons

In a conventional self-organizing map, the Euclidean distance between the positions of a winning node (winner) and each of the other nodes, which are arranged in a one or two dimensional (1D or 2D) array, is computed (see Eq. (12)) and transformed into a Gaussian function (see Eq. (13)). Here, we use a fuzzy distance function based on the strict order relation defined in Eq. (16). It is computed as follows:

Let $v$ denote a winning node (or winning neuron) and $k$ be any neuron in the output layer of the network. A neuron in the output layer which has minimum Euclidean distance to the input neurons denotes the winning node (winner).

1.
Compute the distance from a winning neuron $v$ to a neuron $k$ , denoted by $\wedge_{vk}$ , using

$\displaystyle\wedge_{vk}=\sum_{j=1}^{n}(||\gamma_{j}^{v}-\gamma_{j}^{k}||^{2}+||B_{j}^{v}-B_{j}^{k}||^{2}),$ (18)

where $\gamma_{j}^{v}$ and $\gamma_{j}^{k}$ are the averages of membership values belonging to lower approximations (using Eq. (9)), and $B_{j}^{v}$ and $B_{j}^{k}$ are the averages of membership values belonging to boundary regions, of sets obtained at the nodes $v$ and $k$ respectively for $j$ th feature.

(a)
$B_{j}^{v}$ is defined as

$\displaystyle B_{j}^{v}=\frac{\sum_{x\in U_{v}}(R_{\mathcal{B}}\uparrow R_{d})(x)}{|U_{v}|}-\frac{\sum_{x\in U_{v}}(R_{\mathcal{B}}\downarrow R_{d})(x)}{|U_{v}|},$ (19)

where $(R_{\mathcal{B}}\uparrow R_{d})(x)$ and $(R_{\mathcal{B}}\downarrow R_{d})(x)$ are the upper and lower approximation memberships, computed using Eqs (8) and (7), respectively. The proposed fuzzy strict order relation, Eq. (16), is employed in computing the lower and upper approximations of a set.
(b)
$B_{j}^{k}$ is defined using Eq. (19), where $v$ is replaced with $k$ .

2.
A neuron $k$ lies within the neighborhood of a winning node $v$ (neighborhood neuron of $v$ ), when

$\displaystyle\left(\frac{\wedge_{vk}}{\varrho}\right)\leqslant\sigma^{2}.$ (20)

Here, $\wedge_{vk}$ is the Euclidean distance from a winning neuron (winner) $v$ to a neuron $k$ and it is calculated using Eq. (18). Parameters $\varrho$ and $\sigma$ are defined as follows.

$\displaystyle\varrho=\text{max}\{N1,N2\},$ (21)

where $N1$ and $N2$ represent the numbers of nodes arranged in the rows and columns of output layer of SOM, respectively.
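
The distance of Eq. (18) and the neighborhood test of Eqs (20) and (21) can be sketched as follows. The arrays `gamma` and `B` are assumed to hold the per-node averages of lower and boundary memberships, one row per output node; the function names and the toy values are illustrative.

```python
import numpy as np


def fuzzy_distance(gamma, B, v, k):
    """Eq. (18): distance between output nodes v and k.

    gamma, B: (c, n) arrays of average lower and boundary memberships.
    """
    gamma = np.asarray(gamma, dtype=float)
    B = np.asarray(B, dtype=float)
    return float(np.sum((gamma[v] - gamma[k]) ** 2 + (B[v] - B[k]) ** 2))


def is_neighbor(dist, n_rows, n_cols, sigma):
    """Eqs (20) and (21): node k is a neighbor of v if dist / rho <= sigma^2."""
    rho = max(n_rows, n_cols)
    return dist / rho <= sigma ** 2


if __name__ == "__main__":
    gamma = [[0.7, 0.3], [0.5, 0.6]]   # toy values (assumed)
    B = [[0.1, 0.2], [0.2, 0.1]]       # toy values (assumed)
    dist = fuzzy_distance(gamma, B, v=0, k=1)
    print(dist, is_neighbor(dist, 1, 2, sigma=0.5))
```

A node always lies in its own neighborhood, since its distance to itself is zero.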

4.3 Computation of FGNF for updating connection weights

The proposed granular neighborhood function, based on the fuzzy distance in Eq. (18), is defined as

$\displaystyle\textit{NBD}_{v}(e)=\exp\left(\frac{-\wedge_{vk}^{2}}{2\sigma(e)^{2}}\right),$ (22)

where $\sigma(e)$ represents width of the neighborhood at iteration $e$ . The value of $\sigma$ in Eq. (22) is obtained using Eq. (14). The weights of a winning neuron and its neighborhood neurons are updated using

$\displaystyle w_{kj}(e+1)=\left\{\begin{array}{ll}w_{kj}(e)+\alpha(e)\textit{NBD}_{v}(e)(x_{j}(e)-w_{kj}(e)),&\text{if}\ \left(\frac{\wedge_{vk}}{\varrho}\right)\leqslant\sigma^{2},\\ w_{kj}(e),&\text{else}.\end{array}\right.$ (23)

A learning rate $\alpha(e)$ in Eq. (23) is chosen as in [17]

$\displaystyle\alpha(e)=\alpha_{0}\exp\left(-\frac{e}{\tau_{2}}\right),e=2,3\ldots,$ (24)

where $\tau_{2}$ is another time constant. The value of $\alpha_{0}$ is chosen between 0 and 1. The value of $\alpha(e)$ monotonically decreases as the number of iterations $e$ gradually increases.
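
A compact Python sketch of Eqs (22) to (24) is given below. It assumes that the fuzzy distances from the winner to every output node are already available in `dists`; the function names are hypothetical, and the width `sigma` is passed in as a number rather than computed from a schedule.

```python
import math

import numpy as np


def learning_rate(e, alpha0, tau2):
    """Eq. (24): exponentially decaying learning rate."""
    return alpha0 * math.exp(-e / tau2)


def granular_update(weights, x, dists, sigma, alpha, rho):
    """Eqs (22) and (23): update the winner and its neighborhood neurons.

    weights: (c, n) connection weights; dists: (c,) fuzzy distances from
    the winner to each output node; rho: max of grid rows and columns.
    """
    dists = np.asarray(dists, dtype=float)
    nbd = np.exp(-(dists ** 2) / (2.0 * sigma ** 2))   # Eq. (22)
    in_neighborhood = dists / rho <= sigma ** 2        # condition in Eq. (23)
    new_weights = weights.copy()
    new_weights[in_neighborhood] += (
        alpha * nbd[in_neighborhood, None] * (x - weights[in_neighborhood])
    )
    return new_weights


if __name__ == "__main__":
    w = np.array([[0.0, 0.0], [1.0, 1.0]])
    x = np.array([0.5, 0.5])
    print(granular_update(w, x, dists=[0.0, 10.0], sigma=1.0, alpha=0.5, rho=2))
    print(learning_rate(2, alpha0=0.9, tau2=10.0))
```

With a distance of zero the neighborhood value is 1, so the winner receives the full update, while nodes outside the neighborhood keep their weights.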

The proposed granular competitive learning of self-organizing map (SOMGCL) satisfies properties of maximality and convergence. An algorithm is said to be convergent when its confusion matrix has dominant element in each row of the matrix. An algorithm attains maximum value 1 if the distance from winning neuron (winner) to its neighborhood neuron in neighborhood function is 0. An algorithm satisfies maximality condition if the value of neighborhood is 1. One can refer to [30] for the details of maximality and convergence.

4.4 Pseudo code of the proposed SOMGCL

Pseudo code of the proposed SOMGCL is shown in Algorithm 4 as follows:

Algorithm 4: Pseudo code of SOMGCL for clustering

1: Inputs: A set of $m$ input patterns, each an $n$ -dimensional vector.
2: Output: Trained SOMGCL and output clusters.
3: Method:
4: $e\leftarrow 0$
5: while ( $e\leqslant$ the total number of iterations) { $e\leftarrow e+1$
6: if $e=1$ {
7: Self-organizing map (SOM) is trained through competitive learning (see Section 3) and $c$ clusters are attained at the nodes of the SOM's output layer.
8: Present the output clusters to a decision table as its decision classes.
9: Define initial connection weights of the SOM using the method of fuzzy initial connection weights (FICW) (see Section 4.1).
10: } /end if/
11: else {
12: Repeat the following steps 13–17 to perform training of the SOM based on granular competitive learning (SOMGCL).
13: Present an $n$ -dimensional vector $x_{j}(e)$ at the input nodes of SOMGCL.
14: Compute the Euclidean distance using Eq. (10) between the input vector $x_{j}(e)$ and the weight vector $w_{kj}(e)$ for the $k$ th output node of SOMGCL.
15: Find a winning neuron $v$ in the output layer of SOMGCL using Eq. (11).
16: Find neighborhood neurons of the winner $v$ using Eq. (20).
17: Modify the connection weights of the winner $v$ and its neighborhood neurons of SOMGCL using Eq. (23).
18: } /end else/
19: } /end while/

Computational complexity of Algorithm 4:

1.
In step 4, the complexity is O(1).
2.
In step 5, the while loop runs $e$ times. (This is an outer loop for steps 6–17.)
3.
In step 6, the if condition has O(1) time. (This is an outer loop for steps 7–9.)
4.
In step 7, for a sample, the complexity of computing the distance from the output nodes ( $c$ ) to the input nodes ( $n$ ) is O( $cn$ ), where $n$ and $c$ denote the number of features and classes (see Section 3). The complexity of determining a winning node among all the output nodes ( $c$ ) is O( $cn+c$ ) (see Section 3). The complexity of updating the connection weights between the output nodes ( $c$ ) and the input nodes ( $n$ ) is O( $cn$ ) (see Section 3). The computational complexity of SOM is O( $cn+cn+c+cn+K$ ). Here, a constant $K$ is the number of operations performed for determining neighborhood nodes of a winning node and updating the connection weights. For all the samples ( $m$ ), the computational cost of SOM is O( $mcn+mcn+mc+mcn+K$ ). The asymptotic complexity of SOM is O( $mcn$ ).
5.
In step 8, for all the samples ( $m$ ) with features ( $n$ ) and classes ( $c$ ), the computational complexity of fuzzy decision classes is O( $c((m_{c}n+m_{c}n+m_{c}n+m_{c})+(m_{c}(c-1)))$ ) (see Section 2.1). Here, O( $m_{c}n$ ), O( $m_{c}n$ ), O( $m_{c}n$ ) and O( $m_{c}$ ) represent the complexity for calculating mean, variance of a class (cluster), the weighted distance from a class to the mean and defining memberships to the samples in a class, respectively, where $m_{c}$ denotes the number of samples in a class. The last term O( $m_{c}(c-1)$ ) represents the complexity for fuzzy decision classes computed using Eq. (6). The asymptotic complexity is O( $c(m_{c}(n+c))$ ). One can refer to [30] for further details of complexity for fuzzy decision classes.
6.
In step 9, for defining fuzzy initial connection weights of SOM, fuzzy relational matrix and fuzzy lower approximations of all the classes corresponding to every feature are computed.

(a)
For a feature, the complexity of the fuzzy relational matrix of size $m\times m$ is O($cm_{c}+2m^{2}$). Here, $m_{c}$ is the number of samples in the $c$th class and $m$ is the total number of samples in all the classes. The asymptotic complexity is O($cm_{c}+m^{2}$). The complexity for computing the membership values of all the samples ($m$) in the lower approximations of all the classes ($c$), based on the above fuzzy decision classes, is O($cm_{c}(n+c)$) $+$ O($(cm_{c}+m^{2})+m^{2}+mm_{c}+m(m-m_{c})^{2}+m(m-m_{c})$).
(b)
For all the features ( $n$ ), the computational complexity of lower approximations of all classes is O( $cm_{c}(n+c)$ ) $+$ O( $n(cm_{c}+m^{2})+m^{2}+mm_{c}+m(m-m_{c})^{2}+m(m-m_{c})$ ). The asymptotic complexity is O( $cm_{c}(n+c)+nm^{2}$ ). Note that, the number of operations performed for computing the lower approximations (Eq. (7)) is equal to those in the upper approximations (Eq. (8)). So, the complexity of the upper approximation is same as the lower approximation.
(c)
For a feature, the complexity for computing the average of membership values in the lower approximation of a class is O( $m_{c}$ ). For all features ( $n$ ) and classes ( $c$ ), the complexity is O( $ncm_{c}$ ).
(d)
The complexity of fuzzy initial connection weights is O( $cm_{c}(n+c)+nm^{2}$ $+$ $ncm_{c}$ ). The asymptotic complexity is O( $cm_{c}(n+c)+nm^{2}$ ).

7.
The complexity of SOM with fuzzy initial connection weights is O($mcn+cm_{c}(n+c)+nm^{2}$). /* if condition is terminated in step 10 */
8.
In step 13, the computational complexity of a sample presented at the input nodes ( $n$ ) of SOMGCL is O( $n$ ).
9.
In step 14, for a sample, the complexity in calculating the distance from the output nodes ( $c$ ) to the input nodes ( $n$ ) of SOMGCL is O( $n+cn$ ). The asymptotic complexity is O( $c n$ ).
10.
In step 15, for a sample, the complexity in determining a winning node (winner) among the output nodes ( $c$ ) of SOMGCL is O( $cn+c$ ). For all the samples ( $m$ ), this is O( $mcn+mc$ ). The asymptotic complexity is O( $m c n$ ).
11.
In step 16, we first calculate the computational cost of the fuzzy distance from a winning node to an output node. The complexity of the fuzzy distance based neighborhood function (determining neighborhood nodes of the winning node) is then computed. This involves the above computational cost of the lower and upper approximations of all classes, that is,

$\displaystyle O(2(cm_{c}(n+c)+nm^{2}))$ (see item 6(b) above).

(a)
For a feature, the complexity in computing the average memberships in the boundary regions of a winning node and ( $c-1$ ) output nodes using Eq. (19) is O(1 $+$ ( $c-1$ )). Here the average membership of boundary region of an output node is the difference between the average memberships of the upper and lower approximations.
(b)
The complexity in computing the fuzzy distance from a winning node to $c-1$ output nodes, based on the average memberships of their lower and boundary regions, using Eq. (18) is O($(1+(c-1))+(1(c-1))$). For all features ($n$), the complexity is O($(n+n(c-1))+(n(c-1))$). For all the samples ($m$), this is O($m(n+2n(c-1))$). The asymptotic complexity is O($mn(c-1)$).

12.
The computational complexity for determining neighborhood nodes of a winning node using Eq. (20) is O($(mcn)+(2(cm_{c}(n+c)+nm^{2}))+(mn(c-1))+K$). Here, O($mcn$), O($2(cm_{c}(n+c)+nm^{2})$) and O($mn(c-1)$) denote the complexities of a winning node, the lower and upper approximations of all classes and the fuzzy distance, respectively. The constant $K$ represents the complexity of the total number of operations (maximum, division and multiplication) in Eqs (20) and (21). The asymptotic complexity is O($mcn+(cm_{c}(n+c)+nm^{2})$).
13.
In step 17, for a sample, the complexity for updating connection weights between the input nodes ( $n$ ) and output nodes ( $c$ ) of SOM is O( $c n$ ). For all the samples ( $m$ ), the complexity is O( $m c n$ ).
14.
The complexity of Algorithm 4, involving SOM with fuzzy initial connection weights for 1 iteration and granular competitive learning of SOM (determining neighborhood nodes of a winning node and updating the connection weights) for ($e-1$) iterations, is O($mcn+(cm_{c}(n+c)+nm^{2})+(e-1)((cm_{c}(n+c)+nm^{2})+mcn+mcn)$). Therefore, the asymptotic complexity is O($e((cm_{c}(n+c)+nm^{2})+mcn)$). /* while loop is terminated */
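The per-sample work in steps 13–17 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' C implementation: it assumes the standard Kohonen forms for the distance, winner selection, Gaussian neighborhood and weight update (Eqs (10), (11), (20) and (23) are not reproduced here), and the names `train_step` and `node_dist` are hypothetical. The table `node_dist` stands in for whichever node-to-node distance the neighborhood function uses.

```python
import math

def train_step(x, W, node_dist, sigma, alpha):
    """One SOM update for a single sample x (steps 13-17 of Algorithm 4).

    W         : list of c weight vectors, each of length n
    node_dist : c x c table of node-to-node distances used by the
                neighborhood function (hypothetical input: it may hold
                grid distances or fuzzy distances)
    Returns the index of the winner and the updated weights.
    """
    # Step 14: Euclidean distance from x to every weight vector, O(cn).
    d = [math.dist(x, w) for w in W]
    # Step 15: the winner is the node with the smallest distance, O(c).
    v = min(range(len(W)), key=d.__getitem__)
    # Step 16: Gaussian neighborhood value of every node w.r.t. the winner
    # (assumed form exp(-dist^2 / (2 sigma^2))).
    h = [math.exp(-node_dist[v][k] ** 2 / (2.0 * sigma ** 2))
         for k in range(len(W))]
    # Step 17: move every weight vector towards x, scaled by alpha * h, O(cn).
    W_new = [[wi + alpha * h[k] * (xi - wi) for wi, xi in zip(w, x)]
             for k, w in enumerate(W)]
    return v, W_new

# Toy usage: three output nodes on a chain, two features.
W = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
chain = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]   # toy node-to-node distances
winner, W_new = train_step([1.0, 1.2], W, chain, sigma=1.0, alpha=0.5)
```

Each of the three loops touches at most $c$ nodes and $n$ features, which is consistent with the O($cn$) per-sample cost stated for steps 14, 15 and 17.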

Similarity and difference between Eqs (13) and (22): The nodes in the output layer are arranged in a 2D array. Equations (13) and (22) are both based on the Gaussian neighborhood function.

In Eq. (22), the Euclidean distance $\wedge_{vk}$ from a winner ( $v$ ) to a neuron in the output layer ( $k$ ) is defined as fuzzy. Further, the membership values of neurons $v$ and $k$ , belonging to the lower approximation and the boundary region of a set (output cluster attained at either node $v$ or $k$ ) are determined using fuzzy rough sets, based on the proposed fuzzy strict order relation.

In contrast, the distance $D_{vk}$ in Eq. (13) is calculated using the positions of a winning neuron ( $v$ ) and an output neuron ( $k$ ) in the output layer.
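To make the position-based distance concrete, the following Python sketch computes $D_{vk}$ from node coordinates on a small 2D grid and feeds it into a Gaussian neighborhood function. The Gaussian form $\exp(-D_{vk}^{2}/(2\sigma^{2}))$ is the usual one and is assumed here, since Eq. (13) is not reproduced in this section; the function names and the 2 $\times$ 2 grid are illustrative. A fuzzy distance would replace the argument `distance` without changing the neighborhood function itself.

```python
import math

def grid_distance(pos_v, pos_k):
    """Euclidean distance between two node positions in the 2D output layer."""
    return math.dist(pos_v, pos_k)

def gaussian_neighborhood(distance, sigma):
    """Gaussian neighborhood value; equals 1 when the distance is 0."""
    return math.exp(-distance ** 2 / (2.0 * sigma ** 2))

# Illustrative 2 x 2 output layer; the winner is the node at (0, 0).
positions = [(r, c) for r in range(2) for c in range(2)]
winner = positions[0]
h = [gaussian_neighborhood(grid_distance(winner, p), sigma=2.0)
     for p in positions]
```

The first entry of `h` is exactly 1 because the winner is at distance 0 from itself, which is the maximality property mentioned earlier.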

Difference between Eqs (23) and (13): Equations (23) and (13) use the same parameters except for the distance measure. In Eq. (23), $\wedge_{vk}$ represents a fuzzy Euclidean distance computed between the winning neuron (winner) $v$ and a neuron $k$, whereas in Eq. (13), $D_{vk}$ is a Euclidean distance from the winner $v$ to a neuron $k$.

Difference between Eqs (17) and (9): Equation (17) denotes the dependency value of a feature with respect to a class, whereas Eq. (9) represents the average of the dependency values of all the classes for a feature. Here, the average of the membership values of patterns belonging to the lower approximation of a set denotes the dependency value.
5. Experimental results

The proposed self-organizing map based on granular competitive learning (SOMGCL) is implemented in C on a machine with an Intel Core i5-2430M CPU at 2.40 GHz and 16 GB RAM. The following datasets are used in our experiments: breast cancer [34], multi-A [13], GDS5218 and GDS5499. The GDS5218 and GDS5499 datasets are downloaded from http://www.ncbi.nlm.nih.gov/sites/GDSbrowser.

Table 1 gives details for each of the datasets in terms of sample sizes, classes and number of attributes.

Table 1

Characteristics of datasets

Data	Samples	Attributes	Classes	Attribute type
Breast cancer	98	1213	3	Real
Multi-A	103	5565	4	Real
GDS5218	110	54675	4	Real
GDS5499	140	48803	4	Real

Figure 2.

3D plot of breast cancer data in $F_{1}$-$F_{2}$-$F_{3}$ space.

Breast cancer: The data has 98 samples of three breast cancer types: distant metastases, disease free and BRCA1 germline mutations. There are 1213 gene expressions for each sample (see Table 1). As an example, a three-dimensional (3D) plot of the 98 samples in the 3 classes of breast cancer in $F_{1}$-$F_{2}$-$F_{3}$ space is shown in Fig. 2.

Multi-A: The multiple tissue type-A (Multi-A) data contains 103 samples belonging to four different tissue types (classes or groups), breast, prostate, lung and colon. Here, 5565 gene expressions are available for each sample.

GDS5218: The data consists of 110 muscle biopsy samples of female young, male young, female old and male old adults. Each sample has 54675 gene expressions.

GDS5499: The data is based on 140 samples of patients from four disease groups, idiopathic pulmonary arterial hypertension (IPAH), systemic sclerosis (SSc), SSc associated PAH (SSc-PAH), and SSc complicated by interstitial lung disease and PH (SSc-PH-ILD). Every sample contains 48803 gene expressions.

5.1 Algorithms for comparison

The proposed SOMGCL is compared with granular self-organizing map (GSOM) [30], self-organizing map (SOM) [17], robust rough fuzzy c-means (RRFCM) [21], clustering ensemble method (CEM) [19] and partition around medoids ( $c$ -medoids). The output clusters of all the clustering methods are evaluated using external cluster evaluation measures, Rand index [29], Jaccard index [31] and Fowlkes-Mallows (FM) index [8] and internal cluster evaluation measures, $\beta$ -index [24], DB-index [5] and Dunn-index [7]. The external cluster evaluation measures consider the actual class labels of samples, unlike the internal cluster evaluation measures. Values of Rand, Jaccard and FM indices closer to 1 imply that the samples in output clusters are strongly associated with the actual classes (the output clusters are highly similar to true classes).
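The three external measures can be computed from pair counts, as in the following Python sketch. It uses the standard pair-counting definitions of the Rand, Jaccard and Fowlkes-Mallows indices and assumes that [29], [31] and [8] follow them; the function names and the toy labels are illustrative.

```python
from itertools import combinations
from math import sqrt

def pair_counts(true_labels, cluster_labels):
    """Count sample pairs by agreement between classes and clusters."""
    ss = sd = ds = dd = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_class = true_labels[i] == true_labels[j]
        same_cluster = cluster_labels[i] == cluster_labels[j]
        if same_class and same_cluster:
            ss += 1          # same class, same cluster
        elif same_class:
            sd += 1          # same class, different clusters
        elif same_cluster:
            ds += 1          # different classes, same cluster
        else:
            dd += 1          # different classes, different clusters
    return ss, sd, ds, dd

def external_indices(true_labels, cluster_labels):
    """Return (Rand, Jaccard, Fowlkes-Mallows); assumes ss > 0."""
    ss, sd, ds, dd = pair_counts(true_labels, cluster_labels)
    rand = (ss + dd) / (ss + sd + ds + dd)
    jaccard = ss / (ss + sd + ds)
    fm = ss / sqrt((ss + sd) * (ss + ds))
    return rand, jaccard, fm

# Toy example: the second class is split across two clusters.
rand, jaccard, fm = external_indices([0, 0, 1, 1], [0, 0, 1, 2])
```

All three indices equal 1 when the clustering reproduces the true classes exactly, which matches the interpretation given above.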

The DB-index and Dunn-index are based on the idea of taking minimum and maximum computed over the distances between clusters and variance, considering a pair of clusters together. The $\beta$ -index is based on the principle of the variance computed over individual class, taking the overall feature space. Higher values of $\beta$ and Dunn-indices and lower values for DB index indicate better clustering quality. The internal indices explore the compactness of a cluster in terms of low intra-cluster distance and high inter-cluster distance.
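For illustration, the Dunn and DB indices can be sketched in Python as below. Several variants of both indices exist; this sketch uses commonly cited forms (Dunn as the minimum inter-cluster point distance over the maximum cluster diameter, DB as the mean over clusters of the worst scatter-to-separation ratio) and these may differ in detail from the definitions in [5] and [7]. The function names are hypothetical, and the code assumes at least two clusters, one of which has two distinct points.

```python
import math
from itertools import combinations

def _clusters(X, labels):
    """Group the points of X by cluster label."""
    groups = {}
    for x, k in zip(X, labels):
        groups.setdefault(k, []).append(x)
    return list(groups.values())

def dunn_index(X, labels):
    """Minimum inter-cluster distance divided by maximum cluster diameter."""
    groups = _clusters(X, labels)
    inter = min(math.dist(a, b)
                for g1, g2 in combinations(groups, 2) for a in g1 for b in g2)
    diam = max(math.dist(a, b) for g in groups for a in g for b in g)
    return inter / diam

def db_index(X, labels):
    """Mean over clusters of the largest (s_i + s_j) / d(center_i, center_j)."""
    groups = _clusters(X, labels)
    centers = [[sum(col) / len(g) for col in zip(*g)] for g in groups]
    scatter = [sum(math.dist(x, c) for x in g) / len(g)
               for g, c in zip(groups, centers)]
    worst = [max((scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
                 for j in range(len(groups)) if j != i)
             for i in range(len(groups))]
    return sum(worst) / len(worst)

# Toy data: two well separated pairs of points.
X = [[0, 0], [0, 1], [10, 0], [10, 1]]
good = (dunn_index(X, [0, 0, 1, 1]), db_index(X, [0, 0, 1, 1]))
bad = (dunn_index(X, [0, 1, 0, 1]), db_index(X, [0, 1, 0, 1]))
```

On this toy data the compact labeling gives a higher Dunn value and a lower DB value than the mixed labeling, in line with the stated interpretation of the two indices.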

The proposed SOMGCL explores mapping of training vectors in the input space onto neighboring locations in the output space, based on the quantization error [17]. The quantization error is calculated by considering the distance between every input vector and its best matching weight vector. A low quantization error implies that the input vectors and their best matching weight vectors are close.
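A minimal Python sketch of this computation is given below. Averaging the distances over all input vectors is an assumption made here (the text only states that the distance to the best matching weight vector is considered), and the function name is illustrative.

```python
import math

def quantization_error(X, W):
    """Mean distance from each input vector to its best matching weight vector.

    X : list of m input vectors, W : list of c weight vectors.
    """
    return sum(min(math.dist(x, w) for w in W) for x in X) / len(X)

# Toy usage: the first sample coincides with a weight vector, the second
# lies at distance 1 from its best matching weight vector.
qe = quantization_error([[0, 0], [2, 0]], [[0, 0], [2, 1]])
```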

5.2 Results of sample clustering

In this section, selection of parameters for clustering samples in the breast cancer data using SOMGCL is explained. This section also provides the results of SOMGCL in terms of confusion matrices, clustering indices, 3D plot of output clusters with final weights, and quantization errors for this data.

5.2.1 Breast cancer data

The SOMGCL uses the following parameters for training on the breast cancer data: the initial value $\sigma_{0}$ and a constant $\tau_{1}$ (given in Eq. (14)), $\alpha_{0}$ and a constant $\tau_{2}$ (given in Eq. (24)), the number of iterations $e$, and the number of output layer nodes $c$. During the training of SOMGCL, the value of $\sigma_{0}$ is chosen as the maximum number of neurons in either a row or a column of the output layer, and $\tau_{1}$ and $\tau_{2}$ are set to the total number of iterations. The value of $c$ for SOMGCL is fixed as in [30]. Here, $c$ is set to 3 as only 3 true classes exist in the data. Using these parameters, the training of SOMGCL is performed as follows.

The algorithm is trained for different values of $\alpha_{0}$ and $e$. During training, the values of $e$ are chosen ranging from 5 to 50. For a particular value of $e$, the value of $\alpha_{0}$ is changed from 0.1 to 0.9 in steps of 0.1. For these values of $\alpha_{0}$ and $e$, the output clusters of SOMGCL are evaluated using cluster evaluation metrics involving external indices (Rand, Jaccard and FM indices) and internal indices (DB-index, Dunn-index and $\beta$-index). For $\alpha_{0}=0.11$ and $e=20$, the evaluation metrics confirm that the performance of the proposed SOMGCL is superior to that of the 5 benchmarked methods. Figure 3 provides a 2-dimensional (2D) plot of the values of the DB-index for SOMGCL, GSOM and SOM, for values of $\alpha_{0}$ chosen between 0.1 and 0.8. Here, the values of $e$ for SOMGCL, GSOM and SOM are 20, 20 and 200, respectively. The DB-index value of SOMGCL at $\alpha_{0}=0.11$ is lower than that of GSOM and SOM and is marked by a dotted vertical line in the figure.

Table 2
Clustering solutions obtained using SOMGCL, GSOM, SOM, RRFCM, CEM and $c$ -medoids for breast cancer data

Method	$c_{1}$	$c_{2}$	$c_{3}$
SOMGCL	8	53	37
GSOM	15	52	31
SOM	15	50	33
RRFCM	8	56	34
CEM	15	47	36
$c$ -medoids	25	43	30

Table 3

Confusion matrices obtained using SOMGCL, GSOM, SOM, RRFCM, CEM and $c$ -medoids for breast cancer data

a) SOMGCL				b) GSOM				c) SOM				d) RRFCM				e) CEM				f) $c$ -medoids
	$c_{1}$	$c_{2}$	$c_{3}$		$c_{1}$	$c_{2}$	$c_{3}$		$c_{1}$	$c_{2}$	$c_{3}$		$c_{1}$	$c_{2}$	$c_{3}$		$c_{1}$	$c_{2}$	$c_{3}$		$c_{1}$	$c_{2}$	$c_{3}$
$c_{1}$:	5	2	1	$c_{1}$:	7	2	6	$c_{1}$:	6	5	4	$c_{1}$:	4	1	3	$c_{1}$:	4	8	3	$c_{1}$:	2	7	0
$c_{2}$:	4	48	1	$c_{2}$:	4	48	0	$c_{2}$:	4	46	0	$c_{2}$:	6	48	2	$c_{2}$:	4	43	0	$c_{2}$:	9	44	9
$c_{3}$:	2	1	34	$c_{3}$:	0	1	30	$c_{3}$:	1	0	32	$c_{3}$:	1	2	31	$c_{3}$:	3	0	33	$c_{3}$:	0	0	27

Figure 3.

2D plot of DB-index values for SOMGCL comparing with GSOM and SOM for different values of $\alpha$ chosen from 0.1 to 0.8 for breast cancer data. Here, values of e for SOMGCL, GSOM and SOM are chosen as 20, 20 and 200, respectively.

Table 4

Values of Rand, Jaccard and FM indices of SOMGCL comparing with GSOM, SOM, RRFCM, CEM and c-medoids for $c=$ 3 of breast cancer data

Method	Rand index	Jaccard index	Fow.-Mall. index	Parameters
SOMGCL	0.883	0.758	0.857	( $\alpha_{0}$ ) 0.11, ( $e$ ) 20
GSOM	0.862	0.72	0.837	( $\alpha_{0}$ ) 0.065, ( $e$ ) 20
SOM	0.856	0.697	0.822	( $\alpha_{0}$ ) 0.0005, ( $e$ ) 200
RRFCM	0.825	0.627	0.771	( $\delta$ ) 0.05, (Tr) 0.15
CEM	0.792	0.603	0.748	-
$c$ -medoids	0.599	0.409	0.571	-

Figure 4.

3D plot of output clusters of SOMGCL, as compared to GSOM and SOM, for breast cancer data in $F_{1}$-$F_{2}$-$F_{3}$ space, where $\circ$ represents a final weight vector.

The output clusters obtained using SOMGCL, as compared to the other methods, are provided in Table 2. When the clustering results of SOMGCL are compared with the true clusters, one can observe that the number of samples in a resultant cluster is close to that of the true cluster. For example, the number of samples in the output cluster $c_{3}$ of SOMGCL (37) is closer to the true cluster size (36) than that of GSOM (31), SOM (33), RRFCM (34) and $c$-medoids (30). Only the CEM method has a number of samples (36) equal to that of the true cluster, as shown in Table 2. However, SOMGCL is better than all the methods with respect to the number of correctly assigned samples in the output cluster. From Table 3a), for cluster $c_{3}$, 34 out of 37 samples are identified correctly. This is greater than the corresponding numbers for $c_{3}$ shown in b), c), d), e) and f) for GSOM, SOM, RRFCM, CEM and $c$-medoids, respectively. This implies that, although the size of a SOMGCL cluster may not exactly match the size of the true cluster, SOMGCL assigns more samples correctly than the remaining methods. Further, the sum of diagonal elements over all the clusters of SOMGCL (87) is found to be higher than that of GSOM (85), SOM (84), RRFCM (83), CEM (80) and $c$-medoids (73). Hence, the performance of SOMGCL is superior to all the remaining methods in terms of the sum of diagonal entries (number of correctly clustered samples).
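The figures quoted for SOMGCL can be checked directly from the entries of Table 3a), as in the short Python sketch below. Reading the rows as output clusters is an assumption, although it is supported by the row sums matching the SOMGCL row of Table 2.

```python
# Entries of Table 3a) (SOMGCL); rows are assumed to be output clusters.
somgcl = [[5, 2, 1],
          [4, 48, 1],
          [2, 1, 34]]

cluster_sizes = [sum(row) for row in somgcl]       # samples per output cluster
correct = sum(somgcl[i][i] for i in range(3))      # sum of diagonal entries
total = sum(cluster_sizes)                         # all samples
```

The row sums give 8, 53 and 37 samples, the diagonal sum is 87, and the total is the 98 samples of the breast cancer data.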

It is evident from Fig. 2 that cluster $c_{3}$ has overlapping regions with clusters $c_{1}$ and $c_{2}$. This information is reflected in the results of cluster $c_{3}$ of SOMGCL, as there are 2 samples in $c_{1}$ and 1 sample in $c_{2}$ which belong to $c_{3}$. In contrast, this information is only moderately reflected in the results of the other methods, as some of the non-diagonal entries corresponding to $c_{3}$ for those methods are zero (see Table 3). Similar comparisons can be made from the clustering results of $c_{1}$ and $c_{2}$ for all the methods in Table 3.

Furthermore, the output clusters of all the methods are evaluated using the Rand, Jaccard, Fowlkes-Mallows (FM), DB, Dunn and $\beta$ indices. The results of SOMGCL compared with the remaining methods are presented in Tables 4 and 5. Parameters for all the methods are shown in the last column of Table 4. For SOMGCL, GSOM and SOM, values of $\alpha_{0}$ and $e$ are provided, while thresholds for RRFCM ($\delta$ and Tr) are also given. These values correspond to the best performing metrics. The weighting exponents $w$ and $\tilde{w}$ and the possibilistic constant $b$ for RRFCM are chosen to be equal to 0.5, 0.5 and 0.7, respectively, similar to [21].

Higher values of the Rand, Jaccard and FM indices in Table 4 for SOMGCL indicate a stronger association between the output clusters and the actual clusters. Table 5 shows that the Dunn and $\beta$ indices for SOMGCL are high, while the DB-index is low, when compared to the other methods. Therefore, the performance of SOMGCL is superior to the other methods in terms of the DB, Dunn and $\beta$ indices.

Table 5

Comparison of SOMGCL with GSOM, SOM, RRFCM, CEM and $c$ -medoids, using DB, Dunn and $\beta$ indices, for breast cancer and $c=$ 3

Method	DB index	Dunn index	$\beta$ index
SOMGCL	2.845	0.502	1.252
GSOM	3.639	0.437	1.239
SOM	3.952	0.416	1.221
RRFCM	3.706	0.420	1.226
CEM	3.712	0.427	1.225
$c$ -medoids	3.956	0.415	1.204

Visualization of output clusters of SOMGCL, GSOM and SOM: Fig. 4 provides 3D plots of the output clusters of SOMGCL along with the final weight vectors in $F_{1}$-$F_{2}$-$F_{3}$ space, as compared to GSOM and SOM, for the breast cancer data. In the figure, the final weight vectors are denoted by circles ($\circ$). In Fig. 4(a), it is noteworthy that the distance between the circles is greater than that in Fig. 4(b) (GSOM) and Fig. 4(c) (SOM), where the circles appear very close to each other. In addition, the circles for SOMGCL lie at the centers of the clusters, while the circles for GSOM and SOM lie slightly away from the cluster centers.

Quantization error: The error of SOMGCL, compared to GSOM and SOM, for $c=3$ is shown in Fig. 5. It is clear from the figure that the error of SOMGCL from the fourth iteration onwards is lower than that of GSOM and SOM. Further, SOMGCL converges in fewer iterations ($e=9$) than GSOM ($e=20$) and SOM ($e=300$). It may be noted that only GSOM and SOM are used for the comparison of error, as these models use competitive learning, unlike RRFCM, CEM and $c$-medoids.

Figure 5.

2D plots of quantization errors of SOMGCL, GSOM and SOM for $c=$ 3 of breast cancer data.

A plot of the average quantization error of SOMGCL compared to GSOM and SOM is provided in Fig. 6. Here, the number of output nodes ($c$) for these methods is gradually increased from 2 to 10 in steps of 2. The average error for all the methods decreases, as expected, as the value of $c$ increases. However, the proposed SOMGCL has the least error for all the values of $c$, and hence performs better than GSOM and SOM.

Table 6

Comparison of SOMGCL with GSOM, SOM, RRFCM, CEM and $c$ -medoids, for $c=$ 4 of multi-A data in terms of Rand, Jaccard and FM indices

Method	Rand index	Jaccard index	FM index	Parameters
SOMGCL	0.862	0.56	0.718	( $\alpha_{0}$ ) 0.75, ( $e$ ) 10
GSOM	0.844	0.545	0.708	( $\alpha_{0}$ ) 0.065, ( $e$ ) 20
SOM	0.828	0.507	0.675	( $\alpha_{0}$ ) 0.5, ( $e$ ) 100
RRFCM	0.826	0.487	0.655	( $\delta$ ) 0.4, ( $T r$ ) 0.4
CEM	0.806	0.456	0.611	-
$c$ -medoids	0.794	0.448	0.605	-

Figure 6.

2D plots of average quantization errors of SOMGCL, GSOM and SOM for $c=$ 2, 4, 6, 8, and 10 of breast cancer data.

5.2.2 Multi-A data

For samples in Multi-A data, SOMGCL is trained through granular competitive learning. During training, the values of the parameters for SOMGCL are chosen in a manner similar to that for the breast cancer data. Tables 6 and 7 provide the results for the Rand, Jaccard and FM indices and for the DB, Dunn and $\beta$ indices, respectively. The parameter values for all methods corresponding to the best performance metrics are presented in Table 6. The results in the tables indicate that the performance of SOMGCL is superior to that of the other methods.

Table 7
Comparison of SOMGCL with GSOM, SOM, RRFCM, CEM and $c$ -medoids using DB, Dunn and $\beta$ indices for $c=$ 4 of multi-A data

Method	DB-index	Dunn-index	$\beta$ -index
SOMGCL	3.016	0.47	1.404
GSOM	3.289	0.437	1.382
SOM	3.654	0.416	1.356
RRFCM	3.536	0.42	1.369
CEM	3.393	0.427	1.355
$c$ -medoids	3.711	0.415	1.265

Table 8

Comparison of SOMGCL with other algorithms for $c=$ 4 in terms of Rand, Jaccard and FM indices

Data	Method	Rand index	Jaccard index	FM index	Parameters
GDS5218	SOMGCL	0.64	0.166	0.284	( $\alpha_{0}$ ) 0.76, ( $e$ ) 10
	GSOM	0.63	0.16	0.275	( $\alpha_{0}$ ) 0.15, ( $e$ ) 11
	SOM	0.628	0.162	0.269	( $\alpha_{0}$ ) 0.85, ( $e$ ) 200
	RRFCM	0.619	0.155	0.269	( $\delta$ ) 0.0025, ( $T r$ ) 0.0015
	CEM	0.621	0.156	0.27	-
	$c$ -medoids	0.605	0.153	0.26	-
GDS5499	SOMGCL	0.593	0.221	0.366	( $\alpha_{0}$ ) 0.7, ( $e$ ) 9
	GSOM	0.558	0.209	0.346	( $\alpha_{0}$ ) 0.3, ( $e$ ) 10
	SOM	0.554	0.2	0.33	( $\alpha_{0}$ ) 0.35, ( $e$ ) 200
	RRFCM	0.586	0.205	0.345	( $\delta$ ) 0.05, ( $T r$ ) 0.55
	CEM	0.565	0.19	0.322	-
	$c$ -medoids	0.556	0.198	0.334	-

Quantization error: Figure 7 shows the 2-dimensional plot of quantization errors obtained using SOMGCL, GSOM and SOM for multi-A data and $c=4$. The error curve for SOMGCL is seen to be much lower than those of GSOM and SOM. SOMGCL with the minimum error results in efficient mapping of training vectors in the feature space onto neighboring locations in the output space. Clearly, the mapping preserves neighborhood relations, i.e., nearby samples in the feature space remain close in the output space.

Table 9

Results of SOMGCL as compared to the other clustering methods for $c=$ 4 of GDS5218 and GDS5499 data using DB, Dunn and $\beta$ indices

Data	Method	DB-index	Dunn-index	$\beta$ -index
GDS5218	SOMGCL	3.198	0.372	1.465
	GSOM	3.608	0.42	1.374
	SOM	4.134	0.29	1.32
	RRFCM	3.314	0.353	1.373
	CEM	4.493	0.32	1.39
	$c$ -medoids	3.321	0.312	1.451
GDS5499	SOMGCL	1.763	0.872	2.295
	GSOM	1.8131	0.892	2.179
	SOM	2.164	0.66	2.033
	RRFCM	2.542	0.523	1.82
	CEM	2.74	0.421	2.017
	$c$ -medoids	2.863	0.812	2.093

Figure 7.

2D plot of quantization errors of SOMGCL, GSOM and SOM for multi-A data for $c=$ 4.

5.2.3 GDS5218 data and GDS5499 data

The parameter selection procedure for these datasets is the same as before. The quality of the output clusters of SOMGCL is assessed using the external and internal indices. The results of SOMGCL and the other methods for $c=4$ are provided in Tables 8 and 9. Table 8 also provides the best parameter values for all the methods.

It can be found from Table 8 that the values of the indices for SOMGCL are higher than those of the remaining methods for the two datasets. Further, it is evident from Table 9 that, for the two datasets, the values of the DB-index and $\beta$-index for SOMGCL are the lowest and the highest, respectively. The value of the Dunn-index for SOMGCL is better than that of SOM, RRFCM, CEM and $c$-medoids, but not GSOM: GSOM is the best and SOMGCL is the second best. The reason for this is that the Dunn-index uses a ratio of the minimum inter-cluster distance to the maximum cluster size, unlike the other indices. In other words, in Table 9, out of 30 pairwise comparisons (2 datasets $\times$ 5 methods $\times$ 3 indices), SOMGCL is the best in 28 cases and the second best in 2 cases. Note that the quantization errors of SOMGCL for these data can also be found in a similar way to the breast cancer and the multi-A data.
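The count of 28 out of 30 can be checked against the internal-index values reported in Table 9 (2 datasets $\times$ 5 competing methods $\times$ 3 indices). The Python sketch below does this; the dictionary layout and variable names are illustrative, and the values are copied from Table 9 in the order (DB, Dunn, $\beta$), with lower being better for DB and higher being better for the other two.

```python
# Values copied from Table 9: (DB-index, Dunn-index, beta-index).
table9 = {
    "GDS5218": {
        "SOMGCL": (3.198, 0.372, 1.465), "GSOM": (3.608, 0.42, 1.374),
        "SOM": (4.134, 0.29, 1.32), "RRFCM": (3.314, 0.353, 1.373),
        "CEM": (4.493, 0.32, 1.39), "c-medoids": (3.321, 0.312, 1.451),
    },
    "GDS5499": {
        "SOMGCL": (1.763, 0.872, 2.295), "GSOM": (1.8131, 0.892, 2.179),
        "SOM": (2.164, 0.66, 2.033), "RRFCM": (2.542, 0.523, 1.82),
        "CEM": (2.74, 0.421, 2.017), "c-medoids": (2.863, 0.812, 2.093),
    },
}
lower_is_better = (True, False, False)   # DB, Dunn, beta

wins, comparisons, losses = 0, 0, []
for data, rows in table9.items():
    ours = rows["SOMGCL"]
    for method, theirs in rows.items():
        if method == "SOMGCL":
            continue
        for idx, low in enumerate(lower_is_better):
            comparisons += 1
            better = ours[idx] < theirs[idx] if low else ours[idx] > theirs[idx]
            if better:
                wins += 1
            else:
                losses.append((data, method, idx))
```

The two comparisons that SOMGCL does not win are both against GSOM on the Dunn-index, one for each dataset.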

5.3 Gene clustering and biological significance

We now discuss the gene clustering process with SOMGCL and then explain the biological interpretation of the resulting gene clusters using the GO term finder.

The microarray gene expressions are partitioned into different groups or clusters. Hence, the number of nodes in the input layer is equal to the number of features or attributes. For example, the number of nodes in the input layer for training SOMGCL with breast cancer data is 1213 (shown in Table 1). Using SOMGCL, different numbers of gene clusters are generated for $c=$ 2, 4, 6, 8 and 10. As an example, the numbers of patterns in the output clusters obtained using SOMGCL for $c=4$ for all the datasets are provided in Table 10. The table also shows the parameter values of $\alpha_{0}$ and $e$ for the same value of $c$ (4). The biological meaning of the four gene clusters obtained using SOMGCL is revealed using the GO term finder as explained below.

Table 10
Number of patterns in an output cluster of SOMGCL for $c=$ 4 for the datasets

Data	$c_{1}$	$c_{2}$	$c_{3}$	$c_{4}$	Parameters
Breast	428	402	348	35	( $\alpha_{0}$ ) 0.095, ( $e$ ) 10
Multi-A	3071	2418	60	17	( $\alpha_{0}$ ) 0.95, ( $e$ ) 5
GDS5218	15588	14408	13336	11343	( $\alpha_{0}$ ) 0.9, ( $e$ ) 3
GDS5499	16394	13667	9611	9131	( $\alpha_{0}$ ) 0.95, ( $e$ ) 3

A gene ontology (GO) enrichment tool [3] is used to determine significant GO term annotations that are strongly associated with a group of genes under the GO-slim biological categories: biological process, molecular function and cellular component. A $p$-value associated with a GO term is obtained by using the hypergeometric distribution and the Bonferroni multiple hypothesis correction. A lower $p$-value ($<0.05$) indicates that the GO term associated with the group of genes is significant. The significant subcategories (GO terms) associated with each of the gene clusters under every category are initially identified. The most significant subcategory under each category is then selected. For example, using the aforesaid gene cluster $c_{1}$ (428 genes) of breast cancer data, 35 significant subcategories under GO-slim biological process are identified. The subcategory cytokine-mediated signaling pathway, out of all the subcategories, is found to be the most significant one. For breast cancer and multi-A data and $c=4$, the most significant subcategories are provided in Table 11. Although the significant subcategories are determined for different values of $c$, only the most significant subcategories obtained for $c=4$ are presented in the table. For the GDS5218 and GDS5499 datasets, the gene clusters obtained using SOMGCL for $c=4$ are found to be not biologically meaningful, i.e., no significant gene terms are found using the GO term finder. The reason for this could be that the output clusters contain a large number of gene expressions (see Table 10). It should be noted that biologically meaningful clusters for GDS5218 and GDS5499 are also not discovered by the comparative algorithms.
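The $p$-value computation described above can be sketched in Python as follows. This is a sketch under stated assumptions rather than the procedure of the tool in [3]: it assumes the usual over-representation (upper-tail) form of the hypergeometric test, and the names `N` (genes in the background), `K` (background genes annotated with the GO term), `n` (genes in the cluster) and `k` (cluster genes annotated with the term) are introduced here for illustration.

```python
from math import comb

def hypergeom_pvalue(N, K, n, k):
    """P(X >= k) when n genes are drawn from N, of which K carry the GO term."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)

def bonferroni(p, num_tests):
    """Bonferroni correction: multiply by the number of tested terms, cap at 1."""
    return min(1.0, p * num_tests)

# Toy numbers (assumptions): 10 genes, 4 annotated, cluster of 3 with 2 annotated.
p_raw = hypergeom_pvalue(N=10, K=4, n=3, k=2)
p_adj = bonferroni(p_raw, num_tests=2)
```

A GO term would then be called significant when the corrected value is below 0.05.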

Table 11

Significant gene ontology terms obtained using the proposed SOMGCL for $c=$ 4

Cluster	GO-slim biological category	GO term	$p$ -value
Breast cancer
$c_{1}$	Biological process	Cytokine-mediated signaling pathway	3.04E-07
	Molecular function	Catalytic activity	1.67E-09
	Cellular component	Extracellular space	9.70E-04
$c_{2}$	Biological process	Lipid metabolic process	2.48E-03
	Molecular function	Oxidoreductase activity	8.85E-06
	Cellular component	Extracellular space	5.76E-05
$c_{3}$	Biological process	Cellular process	7.29E-04
	Molecular function	Binding	4.77E-05
	Cellular component	Extracellular region	2.70E-02
$c_{4}$	Cellular component	Extracellular matrix	3.43E-02
Multi-A data
$c_{1}$	Biological process	Metabolic process	2.60E-37
	Molecular function	Binding	2.24E-19
	Cellular component	Intracellular	2.16E-22
$c_{2}$	Biological process	Cellular process	1.33E-20
	Molecular function	Catalytic activity	1.25E-17
	Cellular component	Cell part	5.14E-14
$c_{3}$	Biological process	MAPK cascade	2.01E-02
	Molecular function	Protein binding	1.73E-02
$c_{4}$	Biological process	Translation	1.01E-21
	Molecular function	Structural constituent of ribosome	3.44E-63
	Cellular component	Ribosome	4.24E-75

The gene terms in Table 11 are interpreted as follows. The lipid metabolism disorder is pathologically linked to hyperlipidemia, lipid storage disease, obesity and other related diseases. Abnormalities of related genes, hormones and enzymes lead to lipid metabolism disorders of cardiovascular disorders (CVD), metabolic diseases and cancers as specified in [20]. Cytokines are important in cancer initiation, progression, angiogenesis, metastasis and immunotherapy. Cytokines stimulate host immune responses not only against pathogens, but also against tumors. Host-derived cytokines can inhibit tumor progression. These can also promote proliferation, foster invasion and metastasis. Further details can be found in [10]. Catalytic activity represents correlations between the chemical composition and physical structure of a material. The relative reactivities of the lower alkanes in hydrogenolysis on a Pt/Al2O3 catalyst depend on the H2 pressure. These are pretreated in various ways for propane hydrogenolysis and a Ru/Al2O3 catalyst. In [2], the description of propane hydrogenolysis and a Ru/Al2O3 catalyst is presented. Biological descriptions of other gene functions and interpretations of gene terms are provided at https://scis.uohyd.ac.in/People/profile/avggoterm.xlsx.

6. Conclusions

In this study, a new method of granular competitive learning for self-organizing map (SOMGCL) is developed using fuzzy rough sets. A fuzzy strict order relation, which is an extension to the fuzzy reflexive relation, is proposed. The concepts of implication and t-norm are utilized in max and min composition of the proposed fuzzy relation. The proposed relation is incorporated into the lower and upper approximations of a set to define memberships belonging to their approximate regions. The lower and upper memberships of a set and the memberships of a class induced by the order relation represent information granules. The lower and upper memberships of a set are used to define the proposed SOMGCL involving a new fuzzy distance and the distance based granular neighborhood function, and the initial connection weights. The information granules embedded in SOMGCL help in efficient handling of uncertainty. The performance of SOMGCL for clustering of samples and genes is demonstrated on four well-known microarray datasets with dimensions ranging from 98 to 54675.

The performance of SOMGCL for clustering samples for all the datasets is found to be superior to GSOM, SOM, RRFCM, CEM and $c$ -medoids with respect to confusion matrices, Rand, Jaccard, Fowlkes-Mallows, $\beta$ and DB indices. The superiority of SOMGCL, when compared with the 5-benchmarked methods, in terms of Dunn-index, except GSOM for GDS5218 and GDS5499, is achieved. The SOMGCL obtains quantization error lower than GSOM and SOM for breast and multi-A datasets. Moreover, it is found that the SOMGCL identifies biologically significant gene clusters from the breast cancer and mutli-A datasets. The significant gene ontology terms associated with a group of genes are presented for four clusters. Although the significant gene terms are presented for only four clusters, similar results are observed for four other values of $c$ (clusters) discussed in this paper.

This work demonstrates the effectiveness of SOMGCL for clustering microarray data with different characteristics, including large dimensions and highly overlapping classes. The advantages of the proposed SOMGCL include faster convergence and better clustering results for data with highly overlapping classes. Its limitation is that it requires more computation time. Future work will investigate SOMGCL for clustering real-life data of different sizes and with multiple classes, where reducing computation time and finding biologically meaningful gene groups are the major tasks. Designing a granular clustering model that incorporates feature selection techniques may constitute another research problem, where maintaining low computational cost and high performance is important. Future work may also include testing the effectiveness of the proposed algorithm with different distance functions and similarity relations. Selecting an appropriate norm and fuzzy implication for the similarity relations is another research direction. It is also important to present the results of the proposed method to experts in the field from which the data originate, so that their usefulness in practice can be assessed.

Availability of data and material (data transparency)

Data and material would be available with the manuscript.

Code availability

Code would be available with the manuscript.

Compliance with ethical standards

Funding: No funding was granted to this study. Ethical approval: The article does not contain any studies with human participants or animals performed by any of the authors. Informed consent: No human participants were involved in the study, so there was no need to obtain informed consent.

Footnotes

Conflict of interest

The author declares no conflict of interest.

References

1. Bianchi, Scardapane, Rizzi, Uncini, Sadeghian. Granular computing techniques for classification and semantic characterization of structured data. Cognitive Computation. 2016; 8: 442-461.

2. Bond, Cunningham, Slaa. What do we mean by catalytic activity. Topics in Catalysis. 1994; 1: 19-24.

3. Boyle, Weng, Gollub, Jin, Botstein, Cherry, et al. GO::TermFinder: Open source software for accessing gene ontology information and finding significantly enriched gene ontology terms associated with a list of genes. Bioinformatics. 2004; 20: 3710-3715.

4. Cornelis, De Cock, Radzikowska. Fuzzy rough sets: From theory into practice. In: Pedrycz, Skowron, Kreinovich (eds.). Wiley, Chichester. 2008.

5. Davies, Bouldin. A cluster separation measure. IEEE Transactions on Pattern Analysis and Machine Intelligence. 1979; PAMI-1(2): 224-227.

6. Dubois, Prade. Rough fuzzy sets and fuzzy rough sets. International Journal of General System. 1990; 17(2-3): 191-209.

7. Dunn. A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters. Journal of Cybernetics. 1973; 3(3): 32-57.

8. Fowlkes, Mallows. A method for comparing two hierarchical clusterings. Journal of the American Statistical Association. 1983; 78(383): 553-569.

9. Ganivada, Ray, Pal. Fuzzy rough granular self-organizing map and fuzzy rough entropy. Theoretical Computer Science. 2012; 466: 37-63.

10. Guven-Maiorov, Acuner-Ozbabacan, Keskin, Gursoy, Nussinov. Structural pathways of cytokines may illuminate their roles in regulation of cancer development and immunotherapy. Cancers. 2014; 6(2): 663-683.

11. Haiying, Huiru, Francisco. Poisson-based self-organizing feature maps and hierarchical clustering for serial analysis of gene expression data. IEEE/ACM Transactions on Computational Biology and Bioinformatics. 2007; 4(2): 163-175.

12. Herrero, Valencia, Dopazo. A hierarchical unsupervised growing neural network for clustering gene expression patterns. Bioinformatics. 2001; 17: 126-136.

13. Hoshida, Brunet, Tamayo, Golub, Mesirov. Subclass mapping: Identifying common subtypes in independent disease data sets. PLoS One. 2007; 2(11): e1195.

14. Jelili, Itunuoluwa, Funke, Olufemi, Efosa, Faridah, et al. Clustering algorithms: Their application to gene expression data. Bioinformatics and Biology Insights. 2016; 10: 237-253.

15. Jiang, Min, Rao. Fuzzy c-means clustering based on weights and gene expression programming. Pattern Recognition Letters. 2017; 90: 1-7.

16. Klebanov, Yakovlev. How high is the level of technical noise in microarray data. Biology Direct. 2007; 2(9): 1977-1989.

17. Kohonen. Self-organizing maps. Proceedings of the IEEE. 1990; 78: 1464-1480.

18. Chen. An extension to rough c-means clustering based on decision-theoretic rough sets model. International Journal of Approximate Reasoning. 2014; 55: 116-129.

19. Qian, Wang, Dang, Jing. Clustering ensemble based on sample's stability. Artificial Intelligence. 2019; 273: 37-55.

20. Long, Zhang, Zhu, Yin, Tan, et al. Lipid metabolism and carcinogenesis, cancer development. American Journal of Cancer Research. 2018; 8: 778-791.

21. Maji, Paul. Rough-fuzzy clustering for grouping functionally similar genes from microarray data. IEEE/ACM Transactions on Computational Biology and Bioinformatics. 2013; 10(2): 286-299.

22. Olman, Mao. Parallel clustering algorithm for large data sets with applications in bioinformatics. IEEE/ACM Transactions on Computational Biology and Bioinformatics. 2008; 6(2): 344-352.

23. Oyelade, Isewon, Oladipupo, Aromolaran, Uwoghiren, Ameh, et al. Clustering algorithms: Their application to gene expression data. Bioinformatics and Biology Insights. 2016; 10: 237-253.

24. Pal, Ghosh, Shankar. Segmentation of remotely sensed images with fuzzy thresholding, and quantitative evaluation. International Journal of Remote Sensing. 2000; 21(11): 2269-2300.

25. Pal, Dasgupta, Mitra. Rough self organizing map. Applied Intelligence. 2004; 21(3): 289-299.

26. Pal, Ray, Ganivada. Granular neural networks, pattern recognition and bioinformatics. Springer-Verlag, Heidelberg. 2017.

27. Pawlak. Rough sets: Theoretical aspects of reasoning about data. Kluwer Academic, Massachusetts. 1992.

28. Radzikowska, Kerre. A comparative study of fuzzy rough sets. Fuzzy Sets and Systems. 2002; 126(2): 137-155.

29. Rand. Objective criteria for the evaluation of clustering methods. Journal of the American Statistical Association. 1971; 66(336): 846-850.

30. Ray, Ganivada, Pal. A granular self-organizing map for clustering and gene selection in microarray data. IEEE Transactions on Neural Networks and Learning Systems. 2016; 27(9): 1890-1906.

31. Rice, Belland. A simulation study of moss floras using Jaccard's coefficient of similarity. Journal of Biogeography. 1982; 9: 411-419.

32. Thalamuthu, Mukhopadhyay, Zheng, Tseng. Evaluation and comparison of gene clustering methods in microarray analysis. Bioinformatics. 2006; 22(19): 2405-2412.

33. Trevino, Falciani, Saldana HAB. DNA microarrays: A powerful genomic tool for biomedical and clinical research. Molecular Medicine. 2007; 13(9-10): 527-541.

34. van't Veer, Dai, van de Vijver, Hart, Mao, et al. Gene expression profiling predicts clinical outcome of breast cancer. Nature. 2002; 415(6871): 530-536.

35. Xie, Chi, Wang. A novel feature selection method based on binary differential evolution and feature subset correlation for microarray data. Soft Computing. 2022; doi: 10.21203/rs.3.rs-1283185/v1.

36. Xie, Wang. ILRC: A hybrid biomarker discovery algorithm based on improved L1 regularization and clustering in microarray data. BMC Bioinformatics. 2021; 22(1): 1-19.

37. Zadeh. Toward a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic. Fuzzy Sets and Systems. 1997; 90: 111-127.


1 These are downloaded from http://www.ncbi.nlm.nih.gov/sites/GDSbrowser.


Table 2 Clustering solutions obtained using SOMGCL, GSOM, SOM, RRFCM, CEM and c -medoids for breast cancer data

Table 7 Comparison of SOMGCL with GSOM, SOM, RRFCM, CEM and c -medoids using DB, Dunn and β indices for c = 4 of multi-A data


Table 10 Number of patterns in an output cluster of SOMGCL for c = 4 for the datasets
