Abstract
It has been proven that the dendritic lattice neural network (DLNN) offers fast calculation, freedom from convergence problems, and a superior capacity to store information. However, experiments on several datasets have shown that the DLNN still suffers from low classification accuracy. This paper proposes that the main reason is that the original DLNN cannot classify samples that fall outside all of the hyperboxes. To solve this problem, a fuzzy inclusion measure is introduced to improve the DLNN model's testing algorithm. The improved testing algorithm consists of two parts: (1) the classification of samples covered by a hyperbox with the DLNN model, and (2) the classification of samples outside all of the hyperboxes based on the principle of maximum membership degree. Four standard datasets were employed to evaluate the effectiveness of the improved DLNN in comparison with the original DLNN. Experimental results show that, on both the training and testing samples, the improved DLNN achieves higher classification accuracy than the original DLNN.
Introduction
The dendritic lattice neural network (DLNN) [9] was first proposed by introducing the model of single neuron computation to artificial neural networks (ANNs). This combination has been shown to compute a perfect approximation to any data distribution [5, 9]. The DLNN and its related models [2, 10–12] have been adopted quickly and used widely in nonlinear problems such as the double helix [9] and N-bit parity [7], as well as in disease detection [4, 5], pattern recognition [6, 13], and image processing [8]. The DLNN and its related models owe this adoption to the following advantages, among others: efficient training, fast calculation, easy hardware implementation, freedom from convergence problems, and superior information storage capacity.
In our previous studies, we conducted a series of experiments on classical datasets to evaluate the DLNN model. The experimental results showed that the DLNN still suffered from low classification accuracy. Our current study demonstrates that the DLNN cannot accurately classify the testing samples that fall outside of the hyperboxes.
In view of this shortcoming, we improved the DLNN model by utilizing a fuzzy inclusion measure to increase its classification accuracy. When a sample is confirmed to be outside all of the hyperboxes, we compute its fuzzy inclusion measure with respect to the first hyperbox of each class, and we then assign to the testing sample the class label of the hyperbox with the maximum fuzzy inclusion measure. The original DLNN is still used to identify the samples covered by hyperboxes. The experimental results on four standard datasets indicate that the improved DLNN (IDLNN) outperforms the original DLNN on both training and testing datasets.
The rest of this paper is structured as follows: Section 2 offers a brief introduction on the basic theories of the DLNN model; Section 3 describes the improvement in the DLNN via the use of a fuzzy inclusion measure; Section 4 shows our experimental results on four standard datasets; and Section 5 summarizes the study’s conclusions.
DLNN: Basic theories
There are input neurons $N_1, N_2, \dots, N_n$ and output neurons M (with a dendritic structure) in the DLNN model. Neuron $N_i$ (i = 1, 2, …, n) sends input information $x_i$ (i = 1, 2, …, n) through its synaptic branches to the dendritic trees of the output neurons. The symbol $w_{ik}^{l}$ denotes the connection weight between one synaptic branch of $N_i$ (i = 1, 2, …, n) and the kth dendrite of output neurons M. The superscript l ∈ {0, 1} distinguishes whether the synaptic branch causes excitation (l = 1) or inhibition (l = 0) on the dendrite. The computation formula of the kth dendrite is given by
For all dendrites $D = \{D_1, D_2, \dots, D_K\}$, where K denotes the total number of dendrites, the total input received by neurons M is given by
The activation function used in the single-layer morphological perceptron (SLMP) with dendrites is the hard limiter, that is, $f(\tau) = 1$ if $\tau \ge 0$ and $f(\tau) = 0$ otherwise.
The total state computation of neurons M is given by
In the DLNN model, the lattice structure is not determined in advance. Rather, morphological neurons generate new dendrites that connect with the synapses of input neurons based on need during the training process. The training algorithms of DLNN are given below.
More detailed descriptions of the DLNN model can be found in reference [9].
Basic theories on fuzzy inclusion measure
If a crisp lattice L and a fuzzy membership function $\mu_p : S = \{(x, y) : x, y \in L\} \to [0, 1]$, where $\mu_p(x, y)$ denotes the degree to which x is contained in y, satisfy the condition that $\mu_p(x, y) = 1$ if and only if x ≤ y, then the pair $(L, \mu_p(x, y))$ is called a fuzzy lattice.
The inclusion measure was proposed to describe quantitatively the inclusion relation between two elements of a fuzzy lattice. It has been optimized in reference [13] as follows:
When L denotes the Cartesian product of N lattices, then $L^{N} = L_1 \times L_2 \times \cdots \times L_N$. The function h defined on $L^{N}$ is given by
Suppose L is a complete lattice; the interval lattice τ(L) = {[a, b] : a, b ∈ L} defined on L is a subset of L; and the lattice meet and lattice join of τ(L) are defined as:
The corresponding partial ordering relation is given by
The partial ordering relation in $L^{\alpha}$, which denotes the dual of a lattice L, is the converse of that in L:
If an isomorphic function $\theta : L^{\alpha} \to L$ satisfies x ≤ y ⇔ θ(x) ≥ θ(y), then the mapping that transfers the inclusion measure from τ(L) to L is defined by ψ : [a, b] ∈ τ(L) → (θ(a), b) ∈ L. Therefore, the inclusion measure in τ(L) is defined as:
More in-depth descriptions of fuzzy lattices can be found in references [16–19].
From the training algorithm of the DLNN model, it follows that a dendritic morphological neural network generates a series of hyperboxes for the samples of each class. To classify a testing sample, its total response τ(x) to the first hyperbox of the kth class $C_k$ must be computed using Equation (2). If the response satisfies τ(x) ≥ 0, then f(τ) = 1, which means that the testing sample is assigned to $C_k$.
Figure 1 shows the corresponding classification hyperboxes for a two-class problem, in which the “+” represents the first class samples C1, the “∘” represents the second class samples C2, and where hyperboxes with a dark-colored background belong to C1 and those with a light-colored background belong to C2.
The point x1 (x1 ∈ C1) is a testing sample that falls outside all the hyperboxes in Fig. 1. Classifying the sample x1 according to the definition outlined in Section 2, we obtain $\tau_{C_1}(x_1) < 0$ and $\tau_{C_2}(x_1) < 0$. This means that the sample x1 is assigned neither to C1 nor to C2. This result confirms that the DLNN model cannot classify testing samples that fall outside all the hyperboxes.
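The behaviour described above can be illustrated with a minimal sketch. It assumes, as a simplification not spelled out in this text, that each dendrite encodes one axis-aligned hyperbox [lo, hi] and that its response is the smallest of the margins x_i - lo_i and hi_i - x_i, so the response is non-negative exactly when the sample lies inside the box. The function names are hypothetical, and the corner values are those quoted for the hyperboxes c1 and c2 of Fig. 1, read as (minimum corner, maximum corner).

```python
# Sketch only: the response formula and the (lo, hi) reading of c1 and c2
# are assumptions made for illustration, not the paper's Equation (2).

def dendrite_response(x, lo, hi):
    """Smallest margin of x to the faces of the hyperbox [lo, hi].

    The value is >= 0 exactly when x lies inside the hyperbox.
    """
    return min(min(xi - l, h - xi) for xi, l, h in zip(x, lo, hi))


def hard_limiter(tau):
    """Hard-limiter activation: 1 if tau >= 0, otherwise 0."""
    return 1 if tau >= 0 else 0


# First hyperbox of each class, as (min corner, max corner).
c1 = ((0.0070, 0.0014), (0.9892, 0.9797))
c2 = ((0.6089, 0.6149), (1.5806, 1.5912))
x1 = (0.1, 1.1)  # testing sample that lies outside both hyperboxes

tau_c1 = dendrite_response(x1, *c1)
tau_c2 = dendrite_response(x1, *c2)

# Both responses are negative, so the hard limiter fires for neither class.
print(tau_c1, hard_limiter(tau_c1))
print(tau_c2, hard_limiter(tau_c2))
```

Under these assumptions both activations are 0 for x1, which is the unclassified case that motivates the fuzzy inclusion measure.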
In this section, the fuzzy lattice inclusion measures introduced in Section 3.1 are calculated as the membership values of the testing samples in different hyperboxes. Subsequently, the class of the hyperbox with the maximum inclusion measure is assigned to the testing sample.
As an example, we classify the testing sample x1 using the hyperboxes of the two classes in Fig. 1. The isomorphic function θ is defined by θ(x) = 2 - x, with x1 = [0.1, 1.1], c1 = [0.0070, 0.0014, 0.9892, 0.9797], and c2 = [0.6089, 0.6149, 1.5806, 1.5912], where c1 (or c2) denotes the first hyperbox of C1 (or C2). We then computed the inclusion measures of x1 in hyperboxes c1 and c2, respectively, using Equation (10).
It is easy to see that the closer the sample is to a hyperbox, the larger the corresponding inclusion measure. The testing sample x1 is correctly assigned to the first class C1 following the principle of maximum membership degree.
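The worked example can be reproduced approximately with the sketch below. It relies on assumptions that this text does not state explicitly: the inclusion measure is taken in the common form σ(x ≤ w) = h(ψ(w)) / h(ψ(x) ∨ ψ(w)), the function h is taken to be the sum of the coordinates, the join ∨ is the coordinate-wise maximum, and c1 and c2 are read as (minimum corner, maximum corner). The function names are hypothetical, and the numbers are the output of this sketch, not values reported by the paper.

```python
# Sketch only: the form of sigma and the choice h = sum are assumptions.

def theta(v):
    """Order-reversing map from the example: theta(x) = 2 - x."""
    return 2.0 - v


def psi(lo, hi):
    """Map an interval [lo, hi] to the vector (theta(lo), hi)."""
    return tuple(theta(a) for a in lo) + tuple(hi)


def h(v):
    """Assumed positive valuation: the sum of the coordinates."""
    return sum(v)


def inclusion(x, box_lo, box_hi):
    """Degree to which the point x is included in the hyperbox."""
    px = psi(x, x)  # a point is the trivial interval [x, x]
    pw = psi(box_lo, box_hi)
    join = tuple(max(a, b) for a, b in zip(px, pw))
    return h(pw) / h(join)


c1 = ((0.0070, 0.0014), (0.9892, 0.9797))
c2 = ((0.6089, 0.6149), (1.5806, 1.5912))
x1 = (0.1, 1.1)

sigma_c1 = inclusion(x1, *c1)
sigma_c2 = inclusion(x1, *c2)
print(round(sigma_c1, 4), round(sigma_c2, 4))  # roughly 0.98 and 0.92
```

With these choices the measure equals 1 for any point inside a hyperbox and decreases as the point moves away from it, so x1 receives the larger value for c1 and is assigned to C1.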
Testing algorithm of IDLNN
The training algorithm of the IDLNN is the same as that of the DLNN model. The testing algorithm of the IDLNN, based on the principle of maximum membership degree, is as follows:
Step 1. Receive the testing sample $x_i$.
Step 2. Evaluate whether the testing sample $x_i$ is covered by the hyperboxes: is $\tau(x_i) \ge 0$?
If true, classify $x_i$ with the DLNN model described in Section 2. If false, proceed to the next step.
Step 3. Compute the inclusion measures with respect to the first hyperbox $w_k$ of the kth class $C_k$: $\sigma(x_i \le w_k)$, k = 1, 2, ⋯, K, where K denotes the total number of classes.
Step 4. Obtain the maximum membership degree and the corresponding hyperbox class $C_j$, j ∈ {1, 2, ⋯, K}.
Step 5. Assign the testing sample $x_i$ to $C_j$.
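The five steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: hyperboxes are stored per class as (minimum corner, maximum corner) pairs, coverage is tested with a simple margin-based response, and the inclusion measure uses the assumed form h(ψ(w)) / h(ψ(x) ∨ ψ(w)) with h as the coordinate sum and θ(x) = 2 - x. All function and variable names are hypothetical.

```python
# Sketch only: data layout, response formula and inclusion measure
# are illustrative assumptions.

def response(x, lo, hi):
    """Margin of x to the hyperbox faces; >= 0 means x is covered."""
    return min(min(xi - l, h - xi) for xi, l, h in zip(x, lo, hi))


def inclusion(x, lo, hi, top=2.0):
    """Assumed inclusion measure with theta(v) = top - v and h = sum."""
    px = tuple(top - v for v in x) + tuple(x)
    pw = tuple(top - v for v in lo) + tuple(hi)
    join = tuple(max(a, b) for a, b in zip(px, pw))
    return sum(pw) / sum(join)


def idlnn_classify(x, boxes_by_class):
    """Classify x following Steps 1 to 5 of the testing algorithm."""
    # Step 2: if some hyperbox covers x, keep the DLNN decision.
    for label, boxes in boxes_by_class.items():
        if any(response(x, lo, hi) >= 0 for lo, hi in boxes):
            return label
    # Step 3: inclusion measure to the first hyperbox of every class.
    scores = {
        label: inclusion(x, *boxes[0])
        for label, boxes in boxes_by_class.items()
    }
    # Steps 4 and 5: assign the class with the maximum membership degree.
    return max(scores, key=scores.get)


boxes = {
    "C1": [((0.0070, 0.0014), (0.9892, 0.9797))],
    "C2": [((0.6089, 0.6149), (1.5806, 1.5912))],
}
print(idlnn_classify((0.1, 1.1), boxes))  # outside both boxes
print(idlnn_classify((1.5, 1.5), boxes))  # covered by the C2 box
```

In this sketch a sample covered by hyperboxes of several classes is given the first matching label. How the original DLNN resolves such overlaps is not described in this section, so that choice is an assumption.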
Experimental results on standard datasets
In this section, the IDLNN model is applied to four standard datasets to verify the validity of the IDLNN model. The DLNN classifiers are used to provide a comparison. The classification accuracy referred to below is an average of 20 experiments.
Ripley dataset
The Ripley dataset is generated from a mixture of Gaussian distributions in a two-dimensional space [3]. It provides a training set of 250 samples and a testing set of 1000 samples for a binary classification problem.
Figure 2 shows the hyperboxes obtained by training the IDLNN model. Table 1 lists the classification accuracy of the DLNN and IDLNN on the training set and the testing set. As illustrated in Fig. 2, the hyperboxes cover most of the training samples. One particular hyperbox was comparable to the DLNN classifier. Ultimately, the IDLNN classifier achieves higher classification accuracy on both the training dataset and the testing dataset.
Spiral dataset
Spiral data [14] is a classical set for binary classification problems. It is generated by:
Figure 3 presents the hyperboxes obtained by training the IDLNN model on the Spiral dataset; it shows that most of the training data are covered by the hyperboxes. Table 2 lists the classification accuracies of the DLNN and IDLNN classifiers on the training and testing datasets. As shown, the IDLNN outperforms the DLNN on both.
Iris dataset
The Iris dataset [15] consists of three classes of iris flowers, with each class containing 50 samples. One of the three classes is linearly separable from the other two classes, and the other two classes are nonlinearly separable.
In each class, 25 samples were randomly selected for training, with the rest used as the testing dataset. Table 3 displays the classification accuracies of the DLNN and IDLNN for this three-class problem. The DLNN performs more poorly than the IDLNN on both training and testing datasets.
Wine dataset
The Wine dataset [1] consists of three classes of wine, with 59, 71 and 48 samples per class, respectively. From these classes, 30, 36 and 24 samples, respectively, were randomly selected as the training dataset. The remaining samples were used as the testing dataset. Table 4 shows the classification accuracies, which are much higher with the IDLNN than with the DLNN on both the training samples and the testing samples.
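The per-class random split described above can be sketched as follows. The helper name, the label encoding 0, 1, 2, and the fixed seed are hypothetical choices made for illustration. Only the class sizes (59, 71, 48) and the training counts (30, 36, 24) come from the text.

```python
import random


def split_per_class(labels, n_train_per_class, seed=None):
    """Randomly pick a fixed number of training indices from each class.

    Returns (train_indices, test_indices); the test set holds the rest.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train, test = [], []
    for label, indices in by_class.items():
        rng.shuffle(indices)
        k = n_train_per_class[label]
        train.extend(indices[:k])
        test.extend(indices[k:])
    return sorted(train), sorted(test)


# Class sizes and training counts as stated for the Wine dataset.
labels = [0] * 59 + [1] * 71 + [2] * 48
train_idx, test_idx = split_per_class(labels, {0: 30, 1: 36, 2: 24}, seed=0)
print(len(train_idx), len(test_idx))  # 90 training and 88 testing samples
```

Repeating the split with a different seed for each run would be one way to obtain the 20 experiments over which the accuracies are averaged, although the text does not say how the runs were randomized.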
Conclusion
In this paper, we have analyzed some of the reasons behind the poor classification results of the dendritic lattice neural network (DLNN). We then improved the network based on a fuzzy inclusion measure. Four classical datasets (the Ripley dataset, Spiral dataset, Iris dataset, and Wine dataset) were used to verify the validity of the new IDLNN model and to compare it with the older DLNN model.
Experimental results have demonstrated that the IDLNN achieves higher classification accuracy than the DLNN. For the two binary classification problems, using the Ripley dataset and the Spiral dataset, the DLNN classifier separated the training samples with 94% and 98% classification accuracy, lower than the 99.6% and 99% achieved by the IDLNN classifier. Meanwhile, the classification accuracy of the IDLNN model on testing samples is much higher than that of the DLNN model. For the two multi-class problems, the classification accuracy of the DLNN model on training and testing samples was not more than 80.17%, much lower than that of the IDLNN.
