Abstract
Bottom-up and top-down are two main computing models in granular computing by which the granule set including granules with different granularities. The top-down hyperbox granular computing classification algorithm based on isolation, or IHBGrC for short, is proposed in the framework of top-down computing model. Algorithm IHBGrC defines a novel function to measure the distance between two hyperbox hgranules, which is used to judge the inclusion relation between two hyperbox granules, the meet operation is used to isolate the ith class data from the other class data, and the hyperbox granule is partitioned into some hyperbox granules which include the ith class data. We compare the performance of IHBGrC with support vector machines and HBGrC, for a number of two-class problems and multiclass problems. Our computational experiments showed that IHBGrC can both speed up training and achieve comparable generalization performance.
Introduction
There have been many researchers working in the granular computing field. Zadeh1,2 has identified three basic concepts, namely granulation, organization, and causation, that underlie the process of human cognition. More specifically, granulation is a process that decomposes a universe into parts. Conversely, organization is the way in which parts are integrated into the universe by the operation between two granules. Causation involves the association of causes and effects.
In general, organization process of objectives, granules, information granules are easily used to form granular computing algorithms, such as Jamshidi and Kaburlasos, 3 Kaburlasos and Pachidis, 4 Papadakis et al., 5 and Kaburlasos and Kehagias 6 represent a granule by a vector, obtain the granule set including the granules with different granularaties by the partial ordered relation between the two granules.
An associative memory is a system such that given input x produces output y. That is, the memory associates x with y. An auto-associative memory is an associative memory such that
Sossa and Guevara 8 introduce an efficient training algorithm for a dendrite morphological neural network (DMNN), for a training set, they regard a set as a hypercube, and partition the hypercube into 2 n smaller hypercubes for N-dimensional space, until each hypercube includes the same class labels.
In this paper, isolation-based hyperbox granular computing classification algorithms are presented. Firstly, a granule is a hyperbox induced by the beginning point and the end point. Secondly, the ith class data are isolated from others and used to form the temporary hyperbox granule set. Thirdly, all the temporary hyperbox granule sets are united to the final hyperbox granule set.
The layout of the remainder of this paper is as follows. The second section introduces motivation and related work. The third section describes algorithm IHBGrC. The fourth section presents comparative experimental results. The final section summarizes our contribution and describes future work.
Motivation and related work
In this section, the motivation for this proposed research work is presented, and some related works are discussed.
Motivation
For granular computing classification (GrC) in view of set theory, a granule is represented as the subset for the training set S. In general, distance between two non-empty sets is the minimum of the distances between any two of their respective points,
9
i.e.
Distances defined by formula (1) between two sets.
On the other hand, the operations that join and meet are used to generate the another granule. In most cases, the join operation is used by granular computing algorithms, and meet operation is used to measure the fuzzy inclusion relation, such as the fuzzy inclusion measure σ(G1, G2) is defined by the positive valuation values of G1 and meet of G1 and G2.
Decomposition method is the main strategy for divide and conquer methods,10,11 which are used in the transformation between complex task and simple task. In view of the limitation of traditional distance formula (1) and the advantage of decomposition method, we form the granular classification algorithms by the novel distance between two hyperbox granules and the meet operation between two hyperbox granules, which utilize the decomposition methods to divide the data set into some subsets, these subsets are regarded as granules. The novel distance formula is used to determine whether a datum is located inside the meet hyperbox. The labels of data lying inside the meet hyperbox determine the isolation process.
Related work
GrC has been proposed and studied in many fields, including machine learning and data analysis.3–6,12–18 In general, GrC is an emerging computing paradigm of information processing based on lattice computing theory.
GrC includes two kinds of computing models according to lattice computing theory, one is granular structure in terms of the relation between object and attribute. The other is fuzzy lattice reasoning induced by the relation between two objects based on lattice theory, such as fuzzy lattice, which defines the fuzzy inclusion measure between two granules, and the granule is related to the object. Pedrycz and his colleagues set up a certain conceptual framework composed of some generic and conceptually meaningful entities-information granules, which is granular structure and relates to the problem formulation and problem solving. This becomes a framework in which they formulate generic concepts by adopting a certain level of abstraction, carry out further processing, and communicate the results to the external environment. A few examples offer compelling evidence with image processing, processing, and interpretation of time series.12–15 A triarchic structure of granular computing integrates three important perspectives, namely, philosophy of structured thinking, methodology of structured problem solving, and mechanism of structured information processing.16,17 Fuzzy lattice reasoning is proposed by Kaburlasos and his colleagues, and generate granules with different granularity by the meet operation and the join operation between the two granules.3–6,18
The difference between the granular structure and fuzzy lattice reasoning is the different fuzzy relations. The granular structure mainly discusses the relations between object and attribute, and fuzzy lattice reasoning mainly discusses the partial order relations between two objects. In the literature of granular computing, the distance between two granules is seldom discussed. So we discuss the distance between two granules and isolate the ith class data from the other data.
Granular computing classification algorithms based on isolation
For N-dimensional space, we form the top-down IHBGrC in terms of the following steps. Firstly, two points called the beginning point and the end point are used to represent the hyperbox granule, and each sample is regarded as the atomic hyperbox granule that cannot be divided. Secondly, the distance measure between two hyperbox granules is defined to measure the inclusion relation. Thirdly, the meet operation ∧ between two hyperbox granules are designed to isolate the ith class data from the other class data.
Representation of the hyperbox granule
For the training set S composed of ℓ N-dimensional input vectors, two points
≤ is the less than or equal relation between two scalars. Here, point
For example, in two-dimensional space, HB1 = [0.1, 0.2, 0.4, 0.6] represents the hyperbox granule shown in Figure 2, which has the beginning point (0.1, 0.2) and the end point (0.4, 0.6). The length of hyperbox granule equals 0.4, and its width equals 0.3. The granularity of hyperbox granule is 0.5, which is determined by the beginning point and the end point. The another example is the atomic hyperbox granule HB2 = [0.5, 0.6, 0.5, 0.6] shown in Figure 2 with the granularity 0, which represents the single point (0.5, 0.6).
Hypergranules in two-dimensional space.
Distance measure
Distance is a numerical description of how far apart objects are. Distance between two hyperbox granules is the measure of farness between two objects, such as hyperbox granules. In analytic geometry, the distance between two points of the xy-plane can be found using the distance formula. In the Euclidean space RN, the distance between two points is usually given by the Euclidean distance. In mathematics, in particular geometry, a distance function on a given set M is a function d: M × M → R, where R denotes the set of real numbers. Similarly, in granule space–induced hyperbox granules, we define the distance between two hyperbox granules HB1 = (Bp1, Ep1) and HB2 = (Bp2, Ep2) as follows.
Firstly, the distance between point P and hyperbox granule HB is defined as
Secondly, the distance between two hyperbox granules HB1 = (Bp1, Ep1, g1) and HB2 = (Bp2, Ep2, g2) is defined as
The distance between two hyperbox granule has the following properties.
Because D(Bp1, HB2) ≥ 0 and D(Ep1, HB2) ≥ 0,Theorem 3.1
Proof
D(HB1, HB2) = (D(Bp1,HB2) + D(Ep1, HB2))/2 ≥ 0.
If D(HB1, HB2) = 0, D(Bp1,HB2) = 0 and D(Ep1, HB2) = 0. Both Bp1 and Ep1 are included in hyperbox granule HB2, namely HB1 ⊆ HB2.
If HB1 ⊆ HB2, both Bp1 and Ep1 are included in hyperbox granule HB2. According to theorem 1, D(Bp1,HB2) = 0 and D(Ep1,HB2) = 0, namely
Especially, if HB1 is an atomic hyperbox granule induced by a point P, we can judge whether the point P is located in the hyperbox granule HB2 by D(HB1,HB2). If D(HB1,HB2) = 0, the point P is located in HB2, otherwise the point P is out of HB2. The distance (equation (3)) is not a metric distance and does not satisfy symmetry, namely
D(HB1, HB2) = (D(Bp1, HB2) + D(Ep1, HB2))/2 = (d(Bp1, Bp2) + d(Bp1, Ep2) − d(Bp2, Ep2) + d(Ep1, Bp2) + d(Ep1, Ep2) − d(Bp2, Ep2))/2 = (d(Bp1, Bp2) + d(Bp1, Ep2) + d(Ep1, Bp2) + d(Ep1, Ep2))/2 − d(Bp2, Ep2)Theorem 3.2
Proof
Similarly,
Generally, D(HB1, HB2) ≠ D(HB2, HB1), especially, D(HB1, HB2) = D(HB2, HB1) when d(Bp1, Ep1) = d(Bp2, Ep2).
For two-dimensional space, two hyperbox granules HB1 = [0.2 0.1 0.3 0.4] and HB2 = [0.25 0.15 0.4 0.5], the distance between HB1 and HB2 are computed as follows. d(Bp1, Bp2) = 0.1, d(Bp1, Ep2) = 0.6, d(Ep1, Bp2) = 0.4, d(Ep1, Ep2) = 0.2, d(Bp1, Ep1) = 0.3, d(Bp2, Ep2) = 0.5, D(HB1, HB2) = 0.15, D(HB2, HB1) = 0.35.
Operation between two granules
In N-dimensional space, any two points x = (x1, x2,…, xN) and y = (y1, y2,…, yN) can form a hyperbox granule HB = (Bp, Ep), where
To avoid the points lying in the line or hyperplane of hyperbox, the relaxation factor ξ is used to form hyperbox for the subset s.
The join operator ∨ between two hyperbox granules is designed to achieve the hyperbox granule with larger granularity compared with the original hyperbox granules. For two hyperbox granules HB1 = (Bp1, Ep1) and HB2 = (Bp2, Ep2), the join operation ∨ is designed as follows.
Conversely, the meet operation ∧ between two hyperbox granules is designed to obtain the hyperbox granule with the smaller granularity compared with the original hyperbox granules. The meet operation ∧ is designed as follows.
From formula (3), we can see Bp1 ∧ Bp2 ≼ Bp1, Bp1 ∧ Bp2 ≼ Bp2, Bp1 ≼ Ep1 ∨ Ep2, Bp2 ≼ Ep1 ∨ Ep2, ||Bp1 ∧ Bp2 − Ep1 ∨ Ep2||2 ≥ ||Bp1 − Ep1||2, ||Bp1 ∧ Bp2 − Ep1 ∨ Ep2||2 ≥ ||Bp2 − Ep2||2, namely the granularity of HB1 ∨ HB2 is greater than or equal to the granularities of HB1 and HB2, and the operation ∨ induces the hyperbox granule with larger granularity compared with original granules. From formula (5), we draw the opposite conclusion that the meet operation induces the hyperbox granule with the smaller granularity compared with original granules.
Isolation algorithm
For the training set
Input: the training set S Output: the hyperbox granule set GSi and the corresponding class label labi S1. GSi = Ø, labi = Ø S2. Extracting the ith class data from the training set S S3. Compute the hyperbox HB1 = [a1 b1] by the ith class and HB2 = [a2 b2] by the other classes S4. If HB1 is empty, the procedure is terminated, return GSi S5. If the meet hyperbox HB of HB1 and HB2 is empty, GSi = GSi ∪ {HB1}, then HBt = HB1, else S6. If HB = HB1, then HBt = HB1, else S7. Computing the vertex set P of HB1 by the beginning point and the end point S8. If the beginning point x of HB and the beginning point x1 of HB1 are identical, then the HBt is induced by x and the vertex set P, else S9. The HBt is induced by y and the vertex set P S10. Find labels of data included in hyperbox granule in HBt S11. If all the labels are equal to i, then GSi = GSi ∪ {HBt} and remove the data, else update S, namely S include the data with different class labels, and go to S2.Algorithm1: the ith class isolation algorithm
In S3, the hyperbox can be formed by formula (5).
In S4, there are 2
N
vertexes in N-dimensional space, for HB1 = [0.1 0.2 0.3 0.2 0.4 0.7] in three-dimensional space, the beginning point is Bp = (0.1 0.2 0.3), and the end point is Ep = (0.2 0.4 0.7), the vertex set P induced by HB1 are shown in Figure 3.

In S5, suppose HB1 = [0.1 0.2 0.3 0.2 0.4 0.7] and HB2 = [0.15 0.35 0.55 0.3 0.5 0.9], the HB1 has the positive class and HB2 has the negative class, the meet hyperbox is HB = [0.15 0.35 0.55 0.2 0.4 0.7], which is obtained by formula (4) and nonempty, the beginning point of HB1 is a = (0.15 0.35 0.55), and the end point of HB1 is b = (0.2 0.4 0.7).
In S7, we computed eight hyperboxes, they are [0.1 0.2 0.3 0.15 0.35 0.55], [0.1 0.2 0.55 0.15 0.35 0.7], [0.1 0.35 0.3 0.15 0.4 0.55], [0.1 0.35 0.55 0.15 0.4 0.7], [0.15 0.2 0.3 0.2 0.35 0.55], [0.15 0.2 0.55 0.2 0.35 0.7], [0.15 0.35 0.3 0.2 0.4 0.55], [0.15 0.35 0.55 0.2 0.4 0.7], where [0.15 0.35 0.55 0.2 0.4 0.7] may be include other class data, and the other seven formed hyperboxes are added to the hyperbox granule set GS.
Finally, the data included into hyperbox [0.15 0.35 0.55 0.2 0.4 0.7] are composed of the updated training set S, and re-perform the procedure from S2.
A key issue in the design of GrC is its training. For purposes of explaining the algorithm, a simple example of two-class problem with two attributes is used. Figure 4 shows the example patterns, each input includes two attributes, each output includes 1 or 2.
Training data in two-dimensional space.
Given n-class classification problem S = {(x, y)|x ∈ RN, y ∈ N+}, the IHBGrC is performed by the following steps: IHBGrC algorithm Input: the training set S Output: the hyperbox granule set GS and the corresponding class label lab S1. GS = Ø, lab = Ø S2. i = 1 S3. Obtain the hyperbox granule set GSi and labi by algorithm 1 for the ith class data S4. GS = GS ∪ GSi and lab = lab ∪ labi S5. if i = n, then the procedure is terminated, and GS is return S6. i = i + 1, go to S3Algorithm 2
For the training data shown in Figure 4, if set ξ = 0.0001, 12 hyperboxes and the corresponding class labels are obtained by performing algorithm 2. They are
If class label is 1, all the data with class label 1 are included in one of the hyperbox granule set GS1 shown in Figure 5 with class lab1 by performing S3. If class label is 2, all the data with class label 2 are included in one of the hyperbox granule set GS2 shown in Figure 6 with class labs by performing S3. The final hyperbox granule set GS = GS1 ∪ GS2, and all the training data lie inside one of the hyperbox granule set GS with the corresponding class labels shown in Figure 7.
The firsth class data and the formed hyperboxes. The second class data and the formed hyperboxes. The training data and their formed hyperboxes.


For testing, the input unknown datum x is represented by the atomic hyperbox granule [x x] whose beginning point and end point are identical, and the distance between the atomic hyperbox granule [x x] and the anyone of GS is computed, the corresponding label of hyperbox in GS which is nearest to [x x] is the class label of the input unknown datum x.
Taking x = (0.5, 0.4) for example, the atomic hyperbox granule induced by x is HB
x
= [0.5 0.4 0.5 0.4], the distances between HB
x
and any one of hyperbox granule set GS are
Similarly,
So the class label of datum x is the corresponding class label of HB1, namely the class label of datum x is 1.
For the achieved granule set, there are two significant properties, (1) there are not intersection between any two hyperbox granules and (2) each training sample is included in the only one hyperbox granule.
Experiments
In this section, validation experimental results are performed for two-class problems and n-class problems in two-dimensional space and N-dimensional space. We evaluated the effectiveness of IHBGrC using Intel(R) Core(TM) i5 CPU with 3.2 GHz and 8 GB memory, running Microsoft Win7, and Matlab2008. We mainly analyze and discuss IHBGrC compared with support vector machines (SVMs) 19 from the size of GS or support vector sets (SVs), testing accuracy (generalization ability) (%), and training time (s).
Classification problems in two-dimensional space
The first data set is the Ripley’s synthetic data set that consists of two classes in two-dimensional space.
20
The data are divided into a training data set and a test set consisting of 250 and 1000 samples, respectively, with the same number of samples belonging to each of the two classes. The training data set appears in Figure 8.
Ripley’s synthetic training data set.
For IHBGrC, the GS including 24 hyperbox granules with class label 1 and 26 hyperbox granules with class label 2 are obtained and shown in Figure 9 by IHBGrC. The testing data and hyperbox are shown in Figure 10. For Ripley’s data set, the training accuracy by IHBGrC is 100%, and the testing accuracy by IHBGrC is 88.6% when the parameter epsilon is set to 0.00001, testing accuracy by SLMP-R is 81.2%, and testing accuracy by SLMP-P is 87.8%.
21
For Ripley’s data set, IHBGrC testing accuracy is greater than SLMP-R and SLMP-P.
GS by IHBGrC and 250 training data for Ripley. GS by IHBGrC and 1000 testing samples for Ripley.

For SVMs, the best testing accuracy is the index of optimal SVMs, the best testing accuracy is 90.3% when C = 5000, the kernel function is polynomial kernel with order 5. The corresponding training accuracy is 89.6%. The training data and the classification boundary are shown in Figure 11.
250 Training data and the boundary by SVMs for Ripley, the training accuracy is 89.6%.
The second data set was the spiral synthetic data set that consists of three classes, which is used to verify the feasibility of IHBGrC for multi-class classification prolems in two-dimensional space.
22
All the training data are shown in Figure 12, and the achieved hyperbox granules are shown in Figure 13. The achieved hyperbox granule set includes 61 hyperbox granules, where 31 hyperbox granules have the class labels 1, which are marked by the red rectangles in Figure 13, 6 hyperbox granules have the class labels 2 and are marked by the green rectangles in Figure 13, and 17 hyperbox granules have the class labels 3 and are marked by the blue rectangles in Figure 13.
Training data of spiral classification problem. The achieved hyperboxes of IHBGrC for the spiral classification problem.

For SVMs, the one-vs-rest strategy is adopted to the multi-class classification problems. The training accuracy is 100% when the radial basis function (RBF) kernel function with the width 5 is selected to perform the SVMs. Figure 14 shows the boundary of 1-vs-rest SVMs, Figure 15 shows the boundary of 2-vs-rest SVMs, and Figure 16 shows the boundary of 3-vs-rest SVMs.
Boundary of 1-vs-rest SVMs for the spiral classification problem. Boundary of 2-vs-rest SVMs for the spiral classification problem. Boundary of 3-vs-rest SVMs for the spiral classification problem.


Classification problems in N-dimensional space
Classification problems in space RN.
Banknote data were extracted from images that were taken from genuine and forged banknote-like specimens. For digitization, an industrial camera usually used for print inspection was used. The final images have 400 × 400 pixels. Due to the object lens and distance to the investigated object, gray-scale pictures with a resolution of about 660 dpi were gained. Wavelet Transform tool were used to extract features from images.
The data set wilt contains some data from a remote sensing study by Johnson et al. (2013) 24 that involved detecting diseased trees in Quickbird imagery. There are few training samples for the diseased trees class and many for other land cover class.
The data set iris contains three classes of 50 instances each, where each class refers to a type of iris plant. One class is linearly separable from the other two; the latter are NOT linearly separable from each other.
The data set sensor4 contains four sensor readings named “simplified distances” and the corresponding class label. These simplified distances are referred to as the “front distance,” “left distance,” “right distance,” and “back distance.” They consist, respectively, of the minimum sensor readings among those within 60° arcs located at the front, left, right, and back parts of the robot.
For data set shuttle, approximately 80% of the data belongs to class 1. Therefore, the default accuracy is about 80%. The aim here is to obtain an accuracy of 99–99.9%.
Performance of IHBGrC for classification problems in RN.
The best performances are marked by the boldface ones.
For data set banknote, the minimal testing accuracy, the maximal testing accuracy, and the mean of testing accuracies by IHBGrC are greater than or equal to those of SVMs. For data set wilt, the testing accuracies of IHBGrC are better than those of SVMs, such as the minimal, maximal, and mean testing accuracies. For data set iris, the mean of IHBGrC is less than SVMs.
The experiments reveal that IHBGrC algorithm has the following features:
Convergence in a finite number of steps. Perfect classification of the training data. No overlap between hyperboxes with distinct class labels. Once the value of ξ is set, if there are the same training patterns, the hyperboxes generated are always the same. No dependency of class presentation order. IHBGrC can be applied to classification problems of n classes and N attributes.
IHBGrC is presented by the meet operation between two hyperboxes with different class labels, and achieves the comparable classification accuracies compared with HBGrC, which is bottom-up granular computing classification algorithm designed by the join operation between two hyperboxes with the same class label. Two main drawbacks of the proposed algorithm are (1) the number of output hyperbox granule grows exponentially as the number of describing dimensions increases and (2) testing accuracy for the classification problems with the numerical attributes is better than the testing accuracy for the classification problems with the symbol attributes. For the symbol attributes, we must transform them into numerical attributes to perform IHBGrC, for example, symbol a is transformed into 1, and symbol b is transformed into 2, this affects the classification accuracy.
For the computational complexity, IHBGrC and HBGrC can achieve the hyperbox granules through only one time scan of the training set, so their time complexities are less than SVMs. The aforementioned property (Theorem 1) of the distance between two hyperbox granules is mainly used to judge the inclusion relation between two hyperbox granules and compute the distance between two hyperbox granule, especially compute the distance between an atomic hyperbox granule and a hyperbox granule during the testing process. The non-symmetry of distance formula (3) results in the higher storage space compared with the symmetry distance, for hyperbox granule set including n hyperbox granules, the symmetry distances between any two hyperbox granules need n(n + 1)/2 storage units, and the distances by formula (3) need n2 storage units.
Conclusions
The top-down hyperbox granular computing classification algorithm based on isolation is proposed in the paper. Firstly, a granule was represented as a hyperbox with the beginning point and the end point. Secondly, the meet operation was used to isolate the ith class data from the other data. Thirdly, the novel distance between two hyperbox granules was used to measure the inclusion relation between the two hyperbox granules. Finally, we evaluated the effectiveness of IHBGrC using the benchmark data sets. These results demonstrate a superior learning performance of the proposed algorithm IHBGrC. Two main drawbacks of the proposed algorithm are (1) it grows exponentially as the number of describing dimensions increases and (2) testing accuracy for the classification problems with the numerical attributes is better than the testing accuracy for the classification problems with the symbol attributes.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported in part by the Natural Science Foundation of China (Grant no. 61170202, 61402393) and Natural Science Foundation of Henan (nos. 132300410421, 132300410422).
