New perspective on the classification accuracy based on neighborhood system

Abstract

Classification accuracy plays an important role in the evaluation of single-label and multi-label classification methods. Many evaluation methods have been proposed for different kinds of classification methods, and even within one kind there are many different evaluation methods. In this paper, we seek to present a unified evaluation criterion for measuring classification accuracy that applies to both single-label and multi-label classification methods. We use neighborhood system theory to design absolute accuracy (AA) and relative accuracy rate (RAR) to evaluate single-label classification methods, and then apply them to multi-label classification methods. Finally, some important examples illustrate the unified evaluation criterion in different classification situations.

Keywords

Classification method, accuracy, neighborhood system, single-label, multi-label

1. Introduction

Classification refers to categorizing samples into different groups under given labels, and it is a very important task in data mining. Many classification methods have been proposed [1, 2]. However, when a classification task is given, choosing the method that is more accurate than the others becomes a crucial problem. Estimating the accuracy of a classification method for the given task is important not only for predicting its future accuracy but also for choosing a method. So far, several methods [3, 4], such as holdout, cross-validation, bootstrapping and leave-one-out, have been used to estimate the accuracy of a classification method. While it is known that no accuracy estimation can be correct all the time [5], these are all prediction methods that use unknown (not yet classified) data to identify a classification model well suited to the given task. They usually assume that every sample has only one label, that is, each sample must be assigned to exactly one category. In practice, a sample may have more than one label, which is the concern of multi-label classification [6].

In this paper, we focus on accuracy measurement, which is an indispensable part of evaluating a classification method in the classification procedure. So far, there are many ways to measure classification accuracy [1, 7]; different classification methods, or different kinds of classification methods, use different ways to measure their accuracy. There is no unified measurement criterion that covers all classification situations. We therefore continue the research begun in our previous work [8]. We take full account of the two kinds of classification methods (single-label and multi-label) and seek a unified evaluation criterion, called the unified evaluation paradigm, for the accuracy of classification methods against the background of neighborhood system theory.

The structure of the paper is as follows. Section 2 briefly reviews previous related work on the measurement of classification accuracy. Section 3 gives some basic concepts and a new Theorem on neighborhood systems, which help us to present the unified evaluation criterion. Section 4 presents a unified accuracy evaluation paradigm induced by single-label classification and illustrates it with some important examples. Section 5 discusses the induced accuracy evaluation paradigm for multi-label classification and shows its application with an important example. Finally, Section 6 summarizes the conclusions.

2. Related works

For the measurement of single-label classification accuracy [1], the sensitivity and specificity measures can be used. Sensitivity is also referred to as the true positive (recognition) rate (that is, the proportion of positive samples that are correctly identified), while specificity is the true negative rate (that is, the proportion of negative samples that are correctly identified). The accuracy is then defined as a function of sensitivity and specificity.

Remarks 1 (limitations). In single-label classification problems, it is commonly assumed that all samples are uniquely classified, that is, that each training sample can belong to only one class. Yet, owing to the wide diversity of data in large databases, it is not always reasonable to assume that all samples are uniquely classified. Rather, it is more realistic to assume that each sample may belong to more than one class. How then can the accuracy of a classification method on large databases be measured? The accuracy measure above is not appropriate, because it does not take into account the possibility of samples belonging to more than one class.

Multi-label classification methods are increasingly required by modern applications, such as protein function classification, music categorization and semantic scene classification. For the measurement of multi-label classification accuracy [6, 7], different multi-label classification problems require different metrics, which differ from those used in traditional single-label classification. So far, several methods, such as Hamming Loss [9], Forgiveness Rate [10] and Accuracy-precision-recall [11], have been proposed to evaluate the accuracy of multi-label classification methods.

Remark 2 (limitations). As mentioned above, different multi-label classification problems require different metrics, so there is no unified measurement criterion for the accuracy of multi-label classification methods. Besides, existing evaluation methods have their own defects or limitations. For example, it is difficult to set the parameters of the forgiveness rate method, and Hamming loss is not comprehensive enough in application.

3. Preliminaries

Definition 1 [12, 13]. Let $U$ be a non-empty finite set and $p\in U$ , then:

1. A neighborhood of $p$, denoted by $N(p)$, is a non-empty subset of $U$. $N(p)$ may or may not contain $p$.
2. A neighborhood system of $p$, denoted by $\textit{NS}(p)$, is the maximal family of neighborhoods of $p$. If $p$ has no neighborhood, then $\textit{NS}(p)$ is an empty family.
3. A neighborhood system of $U$, denoted by $\textit{NS}(U)$, is the collection of $\textit{NS}(p)$ for all $p$ in $U$.
4. A set $U$ together with $\textit{NS}(U)$, written $(U,\textit{NS}(U))$, is called a neighborhood system space (NS-space) or simply a neighborhood system.

Definition 2 [14] (Neighborhood). Let $U$ be a set, $C$ a covering of $U$ . For any $p\in U$ , we define the neighborhood of $p$ as follows:

$N(p)=\cap\{K\in C|p\in K\}.$

Definition 3 [15] (Neighborhood System). Let $U$ be a set, $C$ a covering of $U$ . For any $p\in U$ , we define the neighborhood system of $p$ as follows:

$\textit{NS}(p)=\{K \mid K\in C,\ p\in K\}.$

Setting

$Q_{\textit{NS}(p)_{C}}=\{\textit{subscript}(K) \mid K\in\textit{NS}(p),\ K\in C\},$

we obtain the set of subscripts (that is, the indices in $C$) of the elements of $\textit{NS}(p)$.

For example, let $U=\{a,b,c,d,e,f,g\}$ and $C=\{C_{1},C_{2},C_{3}\}=\{\{b,c,d\},\{a,e,f\},\{a,d,g\}\}$. Then

$Q_{\textit{NS}(a)_{C}}=\{2,3\},\quad Q_{\textit{NS}(c)_{C}}=\{1\},\quad Q_{\textit{NS}(d)_{C}}=\{1,3\}.$
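The index sets in this example can be reproduced with a few lines of Python. This is only a minimal sketch: the dictionary `C` (keyed by the subscripts of $C_1$, $C_2$, $C_3$) and the function names `neighborhood_system` and `index_set` are illustrative choices, not notation from the paper.

```python
# Covering from the example above, keyed by the subscripts of C_1, C_2, C_3.
C = {1: {"b", "c", "d"}, 2: {"a", "e", "f"}, 3: {"a", "d", "g"}}


def neighborhood_system(p, covering):
    """NS(p): the elements of the covering that contain p (Definition 3)."""
    return [block for block in covering.values() if p in block]


def index_set(p, covering):
    """Q_{NS(p)_C}: subscripts of the covering elements that contain p."""
    return {i for i, block in covering.items() if p in block}


for p in ("a", "c", "d"):
    print(p, sorted(index_set(p, C)))
```

Running the loop prints the subscripts `[2, 3]`, `[1]` and `[1, 3]` for `a`, `c` and `d`, matching the sets listed above.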

We now introduce a new definition, the derived covering, based on the neighborhood system.

Definition 4 (Derived Covering). Let $U$ be a set and $C$ a covering of $U$. With $N(p)$ defined as in Definition 2 for every $p\in U$, we define the derived covering of $C$ as follows:

$cov(C)=\{N(p)|p\in U\}.$
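As an illustration, the derived covering of the example covering $C=\{\{b,c,d\},\{a,e,f\},\{a,d,g\}\}$ can be computed directly from Definitions 2 and 4. The sketch below assumes that every element of $U$ lies in at least one element of $C$ (which holds for a covering); the function names are hypothetical.

```python
U = set("abcdefg")
C = [{"b", "c", "d"}, {"a", "e", "f"}, {"a", "d", "g"}]


def neighborhood(p, covering):
    """N(p): intersection of all covering elements that contain p (Definition 2)."""
    blocks = [K for K in covering if p in K]
    return frozenset(set.intersection(*blocks))


def derived_covering(universe, covering):
    """cov(C): the family of neighborhoods N(p) for p in U (Definition 4)."""
    return {neighborhood(p, covering) for p in universe}


for Y in sorted(derived_covering(U, C), key=sorted):
    print(sorted(Y))
```

For this covering, $cov(C)$ consists of $\{a\}$, $\{d\}$, $\{b,c,d\}$, $\{a,e,f\}$ and $\{a,d,g\}$: the elements $a$ and $d$, which lie in two elements of $C$, get strictly smaller neighborhoods.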

Finally, we derive a new Theorem, which helps us to find the overlapped elements of $U$, that is, the elements that belong to more than one element of the covering $C$:

Theorem. Let $U$ be a set and $C$ a covering of $U$. Then $D=\cup\{Y \mid Y\in cov(C),\ \exists K\in C \text{ such that } Y\subset K\}$, where $\subset$ denotes proper inclusion, is the set of all overlapped elements of $U$ based on $C$.

Proof. By the construction of $D$, for every $p\in D$ there exist $Y\in cov(C)$ and $K\in C$ such that $p\in Y$ and $Y\subset K$. According to Definitions 2 and 4, $p$ belongs to at least one element of $C$, and because $Y\subset K$, there are at least two elements of $C$ containing $p$. Therefore, $p$ is an overlapped element of $U$ based on $C$.

4. A unified accuracy evaluation paradigm induced by single-label classification

4.1 Discussion for different single-label classification conditions

The purpose of this paper is to evaluate the accuracy of a classification method once both the classification result and the ideal classification result are known, as distinct from estimating accuracy, under different classification assumptions. To this end, we define absolute accuracy (AA) and relative accuracy rate (RAR) and use them to evaluate the accuracy of different classification methods. For single-label classification, there are four kinds of classification results under the two ideal classification assumptions.

4.1.1 Ideal classification result is a partition

Suppose that $U$ is a non-empty labeled sample space and the ideal classification result is $P=\{P_{1},P_{2},\ldots,P_{m}\}$, which is a partition of $U$ according to the classification labels. There are two kinds of classification results after the labeled samples in $U$ are classified by a classification method.

(1) Classification result is a partition, denoted by $P^{\prime}=\{P_{1}^{\prime},P_{2}^{\prime},\ldots,P_{m}^{\prime}\}$.

We have

$\textit{AA}^{\prime}=\frac{T_{pos}}{|U|},$

in which $T_{pos}=\sum\nolimits_{i=1}^{m}{|P_{i}\cap P_{i}^{\prime}|}$ , and

$\textit{RAR}^{\prime}=\frac{T_{pos}}{T_{neg}},$

where $T_{neg}=\sum\nolimits_{i=1}^{m}{|P_{i}^{\prime}-P_{i}|}$ or $|U|-T_{pos}$ .

Example 1. Let $U=\{a,b,c,d,e,f,g\}$, $P=\{P_{1},P_{2},P_{3}\}=\{\{b,c,d\},\{e,f\},\{a,g\}\}$ and $P^{\prime}=\{P_{1}^{\prime},P_{2}^{\prime},P_{3}^{\prime}\}=\{\{b,c\},\{e,f\},\{a,d,g\}\}$. Then

$\textit{AA}^{\prime}=\frac{T_{pos}}{|U|}=\frac{\sum\limits_{i=1}^{m}{|P_{i}\cap P_{i}^{\prime}|}}{|U|}=\frac{2+2+2}{7}=\frac{6}{7},$

$\textit{RAR}^{\prime}=\frac{T_{pos}}{T_{neg}}=\frac{\sum\limits_{i=1}^{m}{|P_{i}^{\prime}\cap P_{i}|}}{\sum\limits_{i=1}^{m}{|P_{i}^{\prime}-P_{i}|}}=\frac{6}{0+0+1}=6.$
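The arithmetic of Example 1 can be checked with a short script. This is a sketch under the assumption that the ideal and resulting classes are stored as index-aligned lists of sets; the helper names `t_pos` and `t_neg` are illustrative, and exact fractions are used so the values compare directly with the ones above.

```python
from fractions import Fraction

U = set("abcdefg")
P = [set("bcd"), set("ef"), set("ag")]       # ideal partition
P_res = [set("bc"), set("ef"), set("adg")]   # classification result P'


def t_pos(ideal, result):
    """Number of correctly placed samples, summed class by class."""
    return sum(len(p & q) for p, q in zip(ideal, result))


def t_neg(ideal, result):
    """Number of samples placed in a class they do not ideally belong to."""
    return sum(len(q - p) for p, q in zip(ideal, result))


aa = Fraction(t_pos(P, P_res), len(U))
rar = Fraction(t_pos(P, P_res), t_neg(P, P_res))
print(aa, rar)  # 6/7 6
```

Note that RAR is undefined for a perfect result ($T_{neg}=0$); `Fraction` would raise `ZeroDivisionError` in that case, and the text does not specify a convention for it.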
(2) Classification result is a covering, denoted by $P^{\prime}=\{P_{1}^{\prime},P_{2}^{\prime},\ldots,P_{m}^{\prime}\}$.

We have

$\textit{AA}^{\prime}=\frac{T_{pos}}{|U|},$

in which $T_{pos}=\sum\nolimits_{i=1}^{m}{|P_{i}\cap P_{i}^{\prime}|}$ , and

$\textit{RAR}^{\prime}=\frac{T_{pos}}{T_{neg}},$

where $T_{neg}=\sum\nolimits_{i=1}^{m}{|P_{i}^{\prime}-P_{i}|}$ .

Example 2. Let $U=\{a,b,c,d,e,f,g\}$, $P=\{P_{1},P_{2},P_{3}\}=\{\{b,c,d\},\{e,f\},\{a,g\}\}$ and $P^{\prime}=\{P_{1}^{\prime},P_{2}^{\prime},P_{3}^{\prime}\}=\{\{a,b,c\},\{e,f\},\{a,d,g\}\}$. Then

$\textit{AA}^{\prime}=\frac{T_{pos}}{|U|}=\frac{\sum\limits_{i=1}^{m}{|P_{i}\cap P_{i}^{\prime}|}}{|U|}=\frac{6}{7},$

$\textit{RAR}^{\prime}=\frac{T_{pos}}{T_{neg}}=\frac{\sum\limits_{i=1}^{m}{|P_{i}^{\prime}\cap P_{i}|}}{\sum\limits_{i=1}^{m}{|P_{i}^{\prime}-P_{i}|}}=3.$
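The alternative expression $|U|-T_{pos}$ for $T_{neg}$ is given only for the partition case, and the following sketch suggests why. When $P^{\prime}$ is a partition, $\sum_i |P_i^{\prime}|=|U|$, so the two expressions coincide; when $P^{\prime}$ is a covering, a sample can be counted in several classes and they may differ. The variable names are illustrative.

```python
U = set("abcdefg")
P = [set("bcd"), set("ef"), set("ag")]          # ideal partition
P_part = [set("bc"), set("ef"), set("adg")]     # result of Example 1 (partition)
P_cov = [set("abc"), set("ef"), set("adg")]     # result of Example 2 (covering)


def t_pos(ideal, result):
    return sum(len(p & q) for p, q in zip(ideal, result))


def t_neg(ideal, result):
    return sum(len(q - p) for p, q in zip(ideal, result))


for name, result in (("partition", P_part), ("covering", P_cov)):
    shortcut = len(U) - t_pos(P, result)
    print(name, t_neg(P, result), shortcut)
```

For the partition result both values are 1; for the covering result $T_{neg}=2$ while $|U|-T_{pos}=1$, because sample $a$ is placed both correctly (in $P_3^{\prime}$) and incorrectly (in $P_1^{\prime}$).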

4.1.2 Ideal classification result is a covering

Suppose that $U$ is a non-empty labeled sample space and the ideal classification result is $P=\{P_{1},P_{2},\ldots,P_{m}\}$, which is a covering of $U$ according to the classification labels. There are two kinds of classification results after the labeled samples in $U$ are classified by a classification method.

(1) Classification result is a partition, denoted by $P^{\prime}=\{P_{1}^{\prime},P_{2}^{\prime},\ldots,P_{m}^{\prime}\}$.

We have

$\textit{AA}^{\prime}={\displaystyle\frac{T_{pos}}{\sum\limits_{i=1}^{m}{|P_{i}% |}}},$

in which $T_{pos}=\sum\nolimits_{i=1}^{m}{|P_{i}\cap P_{i}^{\prime}|}$ , and

$\textit{RAR}^{\prime}=\frac{T_{pos}}{T_{neg}},$

where $T_{neg}=\sum\nolimits_{i=1}^{m}{|P_{i}-P_{i}^{\prime}|}$ .

Example 3. Let $U=\{a,b,c,d,e,f,g\}$, $P=\{P_{1},P_{2},P_{3}\}=\{\{b,c,d\},\{a,e,f\},\{a,g\}\}$ and $P^{\prime}=\{P_{1}^{\prime},P_{2}^{\prime},P_{3}^{\prime}\}=\{\{b,c\},\{e,f\},\{a,d,g\}\}$. Then

$\textit{AA}^{\prime}=\frac{T_{pos}}{\sum\limits_{i=1}^{m}{|P_{i}|}}=\frac{3}{4},$

$\textit{RAR}^{\prime}=\frac{T_{pos}}{T_{neg}}=\frac{\sum\limits_{i=1}^{m}{|P_{i}^{\prime}\cap P_{i}|}}{\sum\limits_{i=1}^{m}{|P_{i}-P_{i}^{\prime}|}}=3.$
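When the ideal result is a covering, two things change relative to the partition case: AA is normalized by $\sum_i |P_i|$ instead of $|U|$, and $T_{neg}$ counts the ideal memberships that the result misses ($P_i-P_i^{\prime}$). A minimal sketch of Example 3 under those formulas, with illustrative variable names:

```python
from fractions import Fraction

P = [set("bcd"), set("aef"), set("ag")]      # ideal covering: a carries two labels
P_res = [set("bc"), set("ef"), set("adg")]   # classification result P' (a partition)

t_pos = sum(len(p & q) for p, q in zip(P, P_res))   # correct memberships
total = sum(len(p) for p in P)                      # all ideal memberships
t_neg = sum(len(p - q) for p, q in zip(P, P_res))   # ideal memberships missed

aa = Fraction(t_pos, total)
rar = Fraction(t_pos, t_neg)
print(aa, rar)  # 3/4 3
```

Here `total` is 8 rather than 7, since sample $a$ contributes one membership to each of $P_2$ and $P_3$.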
(2) Classification result is a covering, denoted by $P^{\prime}=\{P_{1}^{\prime},P_{2}^{\prime},\ldots,P_{m}^{\prime}\}$.

We have

$\textit{AA}^{\prime}=\frac{T_{pos}}{\sum\limits_{i=1}^{m}{|P_{i}|}},$

in which $T_{pos}=\sum\nolimits_{i=1}^{m}{|P_{i}\cap P_{i}^{\prime}|}$ , and

$\textit{RAR}^{\prime}=\frac{T_{pos}}{T_{neg}},$

where

$T_{neg}=\sum\limits_{x\in T_{1}^{\prime}}x+\sum\limits_{y\in T_{2}^{\prime}}y,$

$T_{1}^{\prime}=\{|Q_{\textit{NS}(r)_{P^{\prime}}}-Q_{\textit{NS}(r)_{P}}| \mid r\in U-D_{P}\},$

$T_{2}^{\prime}=\{\max(|Q_{\textit{NS}(s)_{P^{\prime}}}-Q_{\textit{NS}(s)_{P}}|,|Q_{\textit{NS}(s)_{P}}-Q_{\textit{NS}(s)_{P^{\prime}}}|) \mid s\in D_{P}\}.$

Example 4. Let $U=\{a,b,c,d,e,f,g\}$, $P=\{P_{1},P_{2},P_{3}\}=\{\{b,c,d\},\{a,e,f\},\{a,g\}\}$ and $P^{\prime}=\{P_{1}^{\prime},P_{2}^{\prime},P_{3}^{\prime}\}=\{\{b,c\},\{e,c,f\},\{a,d,g\}\}$. Then

$\textit{AA}^{\prime}=\frac{T_{pos}}{\sum\limits_{i=1}^{m}{|P_{i}|}}=\frac{3}{4},$

and, according to the Theorem, $D_{P}=\{a\}$, $T_{1}^{\prime}=\{0,1,1,0,0,0\}$, $T_{2}^{\prime}=\{1\}$; finally,

$\textit{RAR}^{\prime}=\frac{T_{pos}}{T_{neg}}=\frac{6}{2+1}=2.$
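Example 4 combines the Theorem (to obtain $D_P$) with the index sets $Q$ of Definition 3. The sketch below retraces those steps; it reads $\subset$ in the Theorem as proper inclusion and keeps $T_1^{\prime}$ and $T_2^{\prime}$ as lists, since repeated values must each be counted in the sums. All function and variable names are hypothetical.

```python
from fractions import Fraction

U = "abcdefg"  # a string keeps a fixed iteration order for the printed lists
P = {1: set("bcd"), 2: set("aef"), 3: set("ag")}      # ideal covering
P_res = {1: set("bc"), 2: set("ecf"), 3: set("adg")}  # classification result P'


def index_set(x, family):
    """Q_{NS(x)}: subscripts of the classes in `family` that contain x."""
    return {i for i, block in family.items() if x in block}


def overlapped(universe, family):
    """D: union of the neighborhoods N(p) properly contained in some class."""
    blocks = list(family.values())
    D = set()
    for p in universe:
        n_p = set.intersection(*[K for K in blocks if p in K])
        if any(n_p < K for K in blocks):  # '<' tests proper inclusion
            D |= n_p
    return D


D_P = overlapped(U, P)
T1 = [len(index_set(r, P_res) - index_set(r, P)) for r in U if r not in D_P]
T2 = [max(len(index_set(s, P_res) - index_set(s, P)),
          len(index_set(s, P) - index_set(s, P_res))) for s in D_P]

t_pos = sum(len(P[i] & P_res[i]) for i in P)
aa = Fraction(t_pos, sum(len(b) for b in P.values()))
rar = Fraction(t_pos, sum(T1) + sum(T2))
print(D_P, T1, T2, aa, rar)
```

The script yields $D_P=\{a\}$, the lists `[0, 1, 1, 0, 0, 0]` and `[1]`, and the values $3/4$ and $2$, in agreement with the example.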

4.2 A unified accuracy evaluation paradigm based on neighborhood system

We have discussed four conditions for the accuracy measurement of single-label classification methods. Summarizing what these four conditions have in common, we obtain a unified form for evaluating classification accuracy based on the neighborhood system. Thus, for a given sample space and ideal classification assumption, a unified evaluation paradigm for the accuracy of single-label classification is given as follows, without the need to distinguish between kinds of classification results or kinds of ideal classification assumptions.

Suppose that $U$ is a non-empty labeled sample space and the ideal classification result is $P=\{P_{1},P_{2},\ldots,P_{m}\}$. The classification result of the labeled samples in $U$ produced by a classification method is $P^{\prime}=\{P_{1}^{\prime},P_{2}^{\prime},\ldots,P_{m}^{\prime}\}$.

We have the final

$\textit{AA}=\frac{T_{pos}}{\sum\limits_{i=1}^{m}{|P_{i}|}},$

in which $T_{pos}=\sum\nolimits_{i=1}^{m}{|P_{i}\cap P_{i}^{\prime}|}$ and

$\textit{RAR}=\frac{T_{pos}}{T_{neg}},$

where

$T_{neg}=\sum\limits_{x\in T_{1}}x+\sum\limits_{y\in T_{2}}y,$

$T_{1}=\{\max(|Q_{\textit{NS}(r)_{P^{\prime}}}-Q_{\textit{NS}(r)_{P}}|,|Q_{\textit{NS}(r)_{P}}-Q_{\textit{NS}(r)_{P^{\prime}}}|) \mid r\in U-D_{P}\},$

$T_{2}=\{\max(|Q_{\textit{NS}(s)_{P^{\prime}}}-Q_{\textit{NS}(s)_{P}}|,|Q_{\textit{NS}(s)_{P}}-Q_{\textit{NS}(s)_{P^{\prime}}}|) \mid s\in D_{P}\}.$

Remarks 3. The unified evaluation criterion for single-label classification methods, once the ideal classification assumption and the sample space are given, is the following: method 1 is more accurate than method 2 under the same classification assumption for the given sample space if and only if both the AA and the RAR of method 1 are not less than those of method 2.

Example 5. Let $U=\{a,b,c,d,e,f,g\}$ and $P=\{P_{1},P_{2},P_{3}\}=\{\{b,c,d\},\{a,e,f\},\{a,g\}\}$. The classification results of classification method 1 and method 2 are $P^{\prime}=\{P_{1}^{\prime},P_{2}^{\prime},P_{3}^{\prime}\}=\{\{b,c\},\{e,f\},\{a,d,g\}\}$ and $P^{\prime\prime}=\{P_{1}^{\prime\prime},P_{2}^{\prime\prime},P_{3}^{\prime\prime}\}=\{\{b,c\},\{e,c,f\},\{a,d,g\}\}$, respectively. From the unified accuracy evaluation paradigm:

For classification method 1 we have:

$\textit{AA}^{\prime}=\frac{T_{pos}^{\prime}}{\sum\limits_{i=1}^{m}{|P_{i}|}}=% \frac{3}{4},$

$D_{P}=\{a\},T_{1}^{\prime}=\{0,0,1,0,0,0\},T_{2}^{\prime}=\{1\},$

$\textit{RAR}^{\prime}=\frac{T_{pos}^{\prime}}{T_{neg}^{\prime}}=\frac{6}{1+1}=3,$

and classification method 2:

$\textit{AA}^{\prime\prime}=\frac{T_{pos}^{\prime\prime}}{\sum\limits_{i=1}^{m}% {|P_{i}|}}=\frac{3}{4},$

$D_{P}=\{a\},T_{1}^{\prime\prime}=\{0,1,1,0,0,0\},T_{2}^{\prime\prime}=\{1\},$

$\textit{RAR}^{\prime\prime}=\frac{T_{pos}^{\prime\prime}}{T_{neg}^{\prime% \prime}}=\frac{6}{2+1}=2.$

Here, according to Remarks 3, classification method 1 is more accurate than method 2 for the given sample space and ideal classification assumption.
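The unified paradigm and the comparison rule of Remarks 3 can be sketched in Python as follows. One observation, offered as a reading of the formulas rather than a statement from the paper: in the unified form $T_1$ and $T_2$ use the same expression, so $T_{neg}$ reduces to a single sum over all of $U$ and $D_P$ need not be computed separately. The names `unified_scores` and `at_least_as_accurate` are hypothetical.

```python
from fractions import Fraction


def index_set(x, family):
    """Q_{NS(x)}: subscripts of the classes in `family` that contain x."""
    return {i for i, block in family.items() if x in block}


def unified_scores(universe, ideal, result):
    """Return (AA, RAR); RAR is undefined (ZeroDivisionError) if T_neg is 0."""
    t_pos = sum(len(ideal[i] & result[i]) for i in ideal)
    total = sum(len(block) for block in ideal.values())
    t_neg = 0
    for x in universe:
        q_ideal, q_res = index_set(x, ideal), index_set(x, result)
        t_neg += max(len(q_res - q_ideal), len(q_ideal - q_res))
    return Fraction(t_pos, total), Fraction(t_pos, t_neg)


def at_least_as_accurate(scores_1, scores_2):
    """Comparison rule of Remarks 3: neither AA nor RAR of method 1 is smaller."""
    return scores_1[0] >= scores_2[0] and scores_1[1] >= scores_2[1]


U = set("abcdefg")
P = {1: set("bcd"), 2: set("aef"), 3: set("ag")}
method_1 = {1: set("bc"), 2: set("ef"), 3: set("adg")}
method_2 = {1: set("bc"), 2: set("ecf"), 3: set("adg")}

s1 = unified_scores(U, P, method_1)
s2 = unified_scores(U, P, method_2)
print(s1, s2, at_least_as_accurate(s1, s2))
```

Both methods reach $\textit{AA}=3/4$, while RAR is 3 for method 1 and 2 for method 2, so the rule ranks method 1 above method 2, as in Example 5.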

5. Discussion for the accuracy of multi-label classification

We now apply the induced accuracy evaluation criterion to multi-label classification. For a clear comparison with the induced criterion, we choose the accuracy-precision-recall measurement mentioned above for the accuracy evaluation of multi-label classification methods.

5.1 Measurement of accuracy-precision-recall [11]

Suppose that $U$ is a non-empty multi-labeled sample space consisting of $|U|$ multi-label samples $(p_{i},P_{i})$, where $P_{i}$ is the ideal label set of $p_{i}$ and $i=1,\ldots,|U|$. The classification result of the multi-label samples in $U$ produced by a multi-label classification method is $(p_{i},P_{i}^{\prime})$, where $P_{i}^{\prime}$ is the label set of $p_{i}$ after classification and $i=1,\ldots,|U|$. We then have the evaluation criterion

$\textit{Accuracy}=\frac{1}{|U|}\sum\limits_{i=1}^{|U|}{\frac{|P_{i}\cap P_{i}^{\prime}|}{|P_{i}\cup P_{i}^{\prime}|}},$

$\textit{Precision}=\frac{1}{|U|}\sum\limits_{i=1}^{|U|}{\frac{|P_{i}\cap P_{i}^{\prime}|}{|P_{i}^{\prime}|}},$

$\textit{Recall}=\frac{1}{|U|}\sum\limits_{i=1}^{|U|}{\frac{|P_{i}\cap P_{i}^{\prime}|}{|P_{i}|}}.$

5.2 Measurement of induced accuracy evaluation criterion

Suppose that $U$ is a non-empty multi-labeled sample space with $m$ labels and $P_{i}$ is the set of samples carrying the $i$th label, where $i=1,\ldots,m$, so that the ideal classification result is $P=\{P_{1},P_{2},\ldots,P_{m}\}$. The classification result of the multi-label samples in $U$ produced by a multi-label classification method is $P^{\prime}=\{P_{1}^{\prime},P_{2}^{\prime},\ldots,P_{m}^{\prime}\}$, where $P_{i}^{\prime}$ is the set of samples carrying the $i$th label after classification and $i=1,\ldots,m$. We then have the evaluation criterion

$\textit{AA}=\frac{T_{pos}}{\sum\limits_{i=1}^{m}{|P_{i}|}},$

in which $T_{pos}=\sum\nolimits_{i=1}^{m}{|P_{i}\cap P_{i}^{\prime}|}$ and

$\textit{RAR}=\frac{T_{pos}}{T_{neg}},$

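where, as in the unified paradigm of Section 4.2,

$T_{neg}=\sum\limits_{x\in T_{1}}x+\sum\limits_{y\in T_{2}}y,$

$T_{1}=\{\max(|Q_{\textit{NS}(r)_{P^{\prime}}}-Q_{\textit{NS}(r)_{P}}|,|Q_{\textit{NS}(r)_{P}}-Q_{\textit{NS}(r)_{P^{\prime}}}|) \mid r\in U-D_{P}\},$

$T_{2}=\{\max(|Q_{\textit{NS}(s)_{P^{\prime}}}-Q_{\textit{NS}(s)_{P}}|,|Q_{\textit{NS}(s)_{P}}-Q_{\textit{NS}(s)_{P^{\prime}}}|) \mid s\in D_{P}\}.$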

Remarks 4. Contrasting these two measurements, we find that they have similar structures: essentially, Accuracy can be substituted by AA, and Precision and Recall can be substituted by RAR. Each has its own advantages and disadvantages, but the main advantage of the induced accuracy evaluation criterion is that it can evaluate the accuracy of every classification method, whether single-label or multi-label. For the accuracy evaluation of multi-label classification methods as well, method 1 is more accurate than method 2 for the given multi-label sample space if and only if both the AA and the RAR of method 1 are not less than those of method 2, or the Accuracy, Precision and Recall of method 1 are all not less than those of method 2.

Table 1
Multi-label sample set

| $U$ \ Label | 1 | 2 | 3 |
|---|---|---|---|
| a |   | * | * |
| b | * |   |   |
| c | * |   |   |
| d | * |   |   |
| e |   | * |   |
| f |   | * |   |
| g |   |   | * |
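Table 1 can be read in two ways: row by row it gives the label set of each sample (the view used in Section 5.1), and column by column it gives the sample set of each label (the view used in Section 5.2). A small sketch of the conversion between the two views; the dictionary `labels_of` and the function `samples_by_label` are illustrative names.

```python
# Row view of Table 1: each sample mapped to its set of labels.
labels_of = {
    "a": {2, 3}, "b": {1}, "c": {1}, "d": {1},
    "e": {2}, "f": {2}, "g": {3},
}


def samples_by_label(labels_of, m):
    """Column view: for each label i in 1..m, the set of samples carrying it."""
    return {i: {x for x, labels in labels_of.items() if i in labels}
            for i in range(1, m + 1)}


P = samples_by_label(labels_of, 3)
for i, block in P.items():
    print(i, sorted(block))
```

The column view gives $P_1=\{b,c,d\}$, $P_2=\{a,e,f\}$ and $P_3=\{a,g\}$, a covering of $U$ in which $a$ is the only overlapped element.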

Suppose that there are two multi-label classification methods, method 1 and method 2, to classify the multi-label sample set listed in Table 1:

(1) According to the measurement of accuracy-precision-recall, $U=\{a,b,c,d,e,f,g\}$, $|U|=7$, $(a,P_{1})=\{2,3\}$, $(b,P_{2})=\{1\}$, $(c,P_{3})=\{1\}$, $(d,P_{4})=\{1\}$, $(e,P_{5})=\{2\}$, $(f,P_{6})=\{2\}$, $(g,P_{7})=\{3\}$. Now suppose that the classification results of methods 1 and 2 are, respectively, $(a,P_{1}^{\prime})=\{3\}$, $(b,P_{2}^{\prime})=\{1\}$, $(c,P_{3}^{\prime})=\{1\}$, $(d,P_{4}^{\prime})=\{3\}$, $(e,P_{5}^{\prime})=\{2\}$, $(f,P_{6}^{\prime})=\{2\}$, $(g,P_{7}^{\prime})=\{3\}$ and $(a,P_{1}^{\prime\prime})=\{3\}$, $(b,P_{2}^{\prime\prime})=\{1\}$, $(c,P_{3}^{\prime\prime})=\{2\}$, $(d,P_{4}^{\prime\prime})=\{3\}$, $(e,P_{5}^{\prime\prime})=\{2\}$, $(f,P_{6}^{\prime\prime})=\{2\}$, $(g,P_{7}^{\prime\prime})=\{3\}$. Then we have

method 1:

$\textit{Accuracy}^{\prime}=\frac{11}{14},\quad\textit{Precision}^{\prime}=\frac{6}{7},\quad\textit{Recall}^{\prime}=\frac{11}{14};$

method 2:

$\textit{Accuracy}^{\prime\prime}=\frac{9}{14},\quad\textit{Precision}^{\prime\prime}=\frac{5}{7},\quad\textit{Recall}^{\prime\prime}=\frac{9}{14};$

so, for the given multi-label sample set, according to Remarks 4, multi-label classification method 1 is more accurate than method 2.
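The accuracy-precision-recall values for the two methods can be recomputed from the per-sample label sets. This sketch assumes every ideal and predicted label set is non-empty (otherwise a denominator would be zero); the function name `accuracy_precision_recall` and the dictionary layout are illustrative.

```python
from fractions import Fraction


def accuracy_precision_recall(ideal, predicted):
    """Per-sample averages of |P & P'| / |P | P'|, / |P'| and / |P|."""
    n = len(ideal)
    acc = sum(Fraction(len(ideal[x] & predicted[x]),
                       len(ideal[x] | predicted[x])) for x in ideal) / n
    prec = sum(Fraction(len(ideal[x] & predicted[x]),
                        len(predicted[x])) for x in ideal) / n
    rec = sum(Fraction(len(ideal[x] & predicted[x]),
                       len(ideal[x])) for x in ideal) / n
    return acc, prec, rec


ideal = {"a": {2, 3}, "b": {1}, "c": {1}, "d": {1}, "e": {2}, "f": {2}, "g": {3}}
method_1 = {"a": {3}, "b": {1}, "c": {1}, "d": {3}, "e": {2}, "f": {2}, "g": {3}}
method_2 = {"a": {3}, "b": {1}, "c": {2}, "d": {3}, "e": {2}, "f": {2}, "g": {3}}

print(accuracy_precision_recall(ideal, method_1))
print(accuracy_precision_recall(ideal, method_2))
```

The results are $(11/14, 6/7, 11/14)$ for method 1 and $(9/14, 5/7, 9/14)$ for method 2, so method 1 is not smaller on any of the three measures.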

(2) According to the measurement of the induced accuracy evaluation criterion, $U=\{a,b,c,d,e,f,g\}$, $m=3$, $P=\{P_{1},P_{2},P_{3}\}=\{\{b,c,d\},\{a,e,f\},\{a,g\}\}$. Now suppose that the classification results of method 1 and method 2 are, respectively, $P^{\prime}=\{P_{1}^{\prime},P_{2}^{\prime},P_{3}^{\prime}\}=\{\{b,c\},\{e,f\},\{a,d,g\}\}$ and $P^{\prime\prime}=\{P_{1}^{\prime\prime},P_{2}^{\prime\prime},P_{3}^{\prime\prime}\}=\{\{b,c\},\{e,c,f\},\{a,d,g\}\}$. Then we have

method 1:

$\textit{AA}^{\prime}=\frac{T_{pos}^{\prime}}{\sum\limits_{i=1}^{m}{|P_{i}|}}=% \frac{3}{4},$

$D_{P}=\{a\},T_{1}^{\prime}=\{0,0,1,0,0,0\},T_{2}^{\prime}=\{1\},$

$\textit{RAR}^{\prime}=\frac{T_{pos}^{\prime}}{T_{neg}^{\prime}}=\frac{6}{1+1}=3;$

method 2:

$\textit{AA}^{\prime\prime}=\frac{T_{pos}^{\prime\prime}}{\sum\limits_{i=1}^{m}% {|P_{i}|}}=\frac{3}{4},$

$D_{P}=\{a\},T_{1}^{\prime\prime}=\{0,1,1,0,0,0\},T_{2}^{\prime\prime}=\{1\},$

$\textit{RAR}^{\prime\prime}=\frac{T_{pos}^{\prime\prime}}{T_{neg}^{\prime% \prime}}=\frac{6}{2+1}=2;$

so, according to Remarks 4, multi-label classification method 1 is also more accurate than method 2 for the given multi-label sample set by the measurement of induced accuracy evaluation criterion.

6. Conclusions

Using neighborhood system theory, this paper proposed a unified evaluation criterion for the accuracy of classification methods that applies to both single-label and multi-label methods. We illustrated this evaluation criterion with some important examples. We hope that it can provide useful guidance for the accuracy estimation, accuracy evaluation and accuracy improvement of classification methods in the future, without the need to consider the differences between classification methods.

References

1. Han J.W. and Kamber, Data Mining: Concepts and Techniques, New York: Morgan Kaufmann Publishers, 2001.

2. Tan P.N., Steinbach and Kumar, Introduction to Data Mining, New Jersey: Addison-Wesley, 2006.

3. Kohavi, A study of cross-validation and bootstrap for accuracy estimation and model selection, In the International Joint Conference on Artificial Intelligence (1995).

4. Weiss S.M. and Kulikowski C.A., Computer Systems That Learn: Classification and Prediction Methods from Statistics, Neural Nets, Machine Learning, and Expert Systems, San Mateo, CA: Morgan Kaufmann, 1991.

5. Schaffer, A conservation law for generalization performance, In Proceedings of 11th International Conference on Machine Learning, Morgan Kaufmann, (1994), 259–265.

6. Tsoumakas, Katakis and Vlahavas, Mining multi-label data, Data Mining and Knowledge Discovery Handbook, part 6, (2010), 667–685.

7. Tsoumakas and Katakis, Multi-label classification: An overview, International Journal of Data Warehousing and Mining 3(3) (2007), 1–13.

8. Chen Y.B., Liu and Ye, A unified paradigm for the accuracy of classification based on granular computing, In the IEEE International Conference on Granular Computing (2010), 669–672.

9. Schapire R.E. and Singer, Boostexter: A boosting-based system for text categorization, Machine Learning 39(2/3) (2000), 135–168.

10. Boutell M.R., Luo, Shen, et al., Learning multi-label scene classification, Pattern Recognition 37(9) (2004), 1757–1771.

11. Godbole and Sarawagi, Discriminative methods for multi-labeled classification, Lecture Notes in Computer Science 3056 (2004), 22–30.

12. Lin T.Y., Neighborhood systems: A qualitative theory for fuzzy and rough sets, Advances in Machine Intelligence and Soft Computing IV (1997), 132–155.

13. Sierpinski and Krieger, General Topology, CA: University of Toronto Press, 1956.

14. Zhu, Topological approaches to covering rough sets, Information Sciences 177 (2007), 1499–1508.

15. Yao Y.Y. and Yao B.X., Covering based rough set approximations, Information Sciences 200 (2012), 91–107.