Abstract
Extreme learning machine (ELM) has proven to be an efficient and effective machine learning method for pattern classification and regression. However, ELM has mainly been applied to traditional supervised learning problems and is not commonly used in multi-label image classification. In this paper, we propose a joint graph regularized extreme learning machine (JGELM) that simultaneously considers the feature information and the label correlation of the data. Specifically, we exploit the feature distance and label correlation in the local neighborhood. To this end, a joint graph regularizer, based on a newly designed graph Laplacian that characterizes both properties, is formulated and incorporated into the ELM objective. Four popular multi-label image data sets are employed to test the proposed method. The experimental results show that JGELM is competitive with state-of-the-art multi-label classification algorithms in terms of accuracy and efficiency.
Introduction
Due to the development of the Internet and visual data sharing websites, the number of available image databases has increased dramatically in the last decade. Providing efficient solutions to image classification has always been a major focus in computer vision [1, 2, 3, 4, 5, 6]. Recent state-of-the-art image classification methods mainly include the support vector machine (SVM), spatial pyramid matching (SPM), locality constrained linear coding (LLC) and so on.
The support vector machine (SVM) has been popular over the last two decades and was designed to overcome the drawbacks of the back propagation neural network (BPNN). However, an SVM model is usually large for a large training data set, because the number of selected support vectors grows with the size of the training data set. In addition, an SVM model with more support vectors takes longer to execute, so SVM may not fit the current requirements of a mathematical engine model.
One particular extension of the bag of features (BoF) model, called spatial pyramid matching (SPM) [7], has achieved remarkable success on a range of image classification benchmarks such as Caltech-101 [8] and Caltech-256 [3]. It has been found empirically that, to obtain good performance, both BoF and SPM must be applied together with a particular type of nonlinear Mercer kernel, e.g. the intersection kernel or the Chi-square kernel. Accordingly, the nonlinear SVM has to pay a computational complexity O(
The locality constrained linear coding (LLC) algorithm [9] is an efficient local coordinate linear coding method, which projects each descriptor into a local constraint system to obtain an effective codebook or dictionary. It has been shown to be a promising image representation method, and experimental results for image classification on several well-known data sets validate its good performance. However, because LLC encoding minimizes the reconstruction error subject to a shift-invariance constraint, the resulting codes may contain negative elements. When the negative and positive elements of a code offset each other, the coding can become unstable.
Extreme learning machine (ELM) [10, 11] has attracted the attention of more and more researchers in recent years [12, 13, 14]. Compared with traditional machine learning methods such as the support vector machine (SVM) and spatial pyramid matching (SPM), it provides better generalization performance at a much faster learning speed and with the least human intervention [15]. Although ELM has been well researched as a single classifier, ELM ensembles are less explored for pattern classification tasks. Recently, however, such a demand has been raised by so-called big data. This term refers to a collection of data sets so large and complex that it becomes awkward to work with using on-hand database management tools [17].
In this paper, based on the idea that similar samples should share similar properties, we propose a joint graph regularized extreme learning machine (JGELM). In JGELM, the constraint imposed on the output weights enforces that the outputs of the samples respect both feature-distance similarity and label correlation. The constraint is formulated as a regularization term added to the objective of the basic ELM model, which still allows the output weights to be solved analytically. We evaluate our new method on multi-label classification benchmark data sets and compare the results with state-of-the-art multi-label classification methods.
The remainder of this paper is organized as follows. Section 2 describes the basic extreme learning machine model as well as its
Extreme learning machine
In this section, we review the extreme learning machine (ELM) in detail. The extreme learning machine, proposed by Huang et al. [10, 11], is a simple learning algorithm for single-hidden-layer feedforward neural networks (SLFNs).
Given a training set
and
where
where
Therefore, the output weights matrix
where
In order to improve the stability and generalization performance of the ordinary ELM, Huang et al. proposed the equality constrained optimization-based ELM. In this method, the solution of the regularized ELM can be expressed as
where
The solution shown in Eq. (7) can be obtained by solving the following optimization problem.
where
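As an illustration of the regularized ELM reviewed in this section, the following is a minimal NumPy sketch and not the authors' implementation. It assumes a sigmoid activation, Gaussian random input weights, and the commonly used closed form β = (I/C + HᵀH)⁻¹HᵀT for the output weights; the function names `train_elm` and `predict_elm`, the parameter `n_hidden`, and the default values are hypothetical.

```python
import numpy as np

def train_elm(X, T, n_hidden=50, C=1.0, seed=0):
    """Fit a regularized ELM (sketch; sigmoid activation is an assumption).

    X: (N, d) inputs, T: (N, m) targets, C: regularization trade-off.
    """
    rng = np.random.default_rng(seed)
    # Hidden-layer parameters are drawn at random and never trained.
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    # Hidden-layer output matrix H, shape (N, n_hidden).
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    # Closed-form output weights: (I/C + H^T H) beta = H^T T.
    A = np.eye(n_hidden) / C + H.T @ H
    beta = np.linalg.solve(A, H.T @ T)
    return W, b, beta

def predict_elm(X, W, b, beta):
    """Network output for new inputs using the fixed hidden layer."""
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta
```

For multi-label data, `T` would hold one column per label, and a threshold on the outputs of `predict_elm` (a choice not specified here) would turn scores into label sets.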
Feature neighbor graph
Given a set of
Considering the problem of mapping the weighted graph G to the sparse representations
where
As discussed in [18], instead of using L directly, we can normalize it by
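To make the feature neighbor graph concrete, here is a small NumPy sketch under stated assumptions. The k-nearest-neighbor construction, the heat-kernel edge weights, and the symmetric normalization D^{-1/2} L D^{-1/2} are common choices assumed here for illustration; the function names and the parameters `k` and `sigma` are hypothetical.

```python
import numpy as np

def knn_heat_graph(X, k=5, sigma=1.0):
    """Symmetric k-NN affinity matrix with heat-kernel weights (assumed form)."""
    sq = np.sum(X ** 2, axis=1)
    # Pairwise squared Euclidean distances, clipped at zero for round-off.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(d2[i])
        nbrs = [j for j in order if j != i][:k]  # k nearest, excluding self
        W[i, nbrs] = np.exp(-d2[i, nbrs] / (2.0 * sigma ** 2))
    # Keep an edge if either endpoint selected the other as a neighbor.
    return np.maximum(W, W.T)

def laplacians(W):
    """Return the unnormalized Laplacian L = D - W and D^{-1/2} L D^{-1/2}."""
    d = W.sum(axis=1)
    L = np.diag(d) - W
    # Guard against isolated nodes (degree zero).
    d_inv_sqrt = np.where(d > 0, 1.0 / np.sqrt(np.where(d > 0, d, 1.0)), 0.0)
    L_norm = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    return L, L_norm
```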
Given a set of
We utilize the following cosine similarity to calculate label affinity matrix
Note that, similar to mapping the neighbor weighted graph G to the sparse representations Y, a reasonable criterion for choosing a “good” map is to minimize the following objective function [17]
where
Where
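The cosine-similarity label affinity described above can be sketched as follows. This assumes the similarity is taken between the binary label vectors of pairs of samples (rows of a label matrix `Y`), which is one reading of the text; the function name is hypothetical, and zeroing the diagonal is a design choice made here, not something stated in the paper.

```python
import numpy as np

def label_cosine_affinity(Y):
    """Cosine similarity between the label vectors of every pair of samples.

    Y: (N, m) binary label matrix, one row per sample (assumed layout).
    """
    Y = np.asarray(Y, dtype=float)
    norms = np.linalg.norm(Y, axis=1)
    # Samples with no labels get zero affinity instead of a division by zero.
    safe = np.where(norms > 0, norms, 1.0)
    Yn = Y / safe[:, None]
    S = Yn @ Yn.T
    np.fill_diagonal(S, 0.0)  # no self-loops in the label graph
    return S
```

A label graph Laplacian can then be formed from this affinity matrix in the same way as for the feature graph, i.e. as the degree matrix minus the affinity matrix.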
By modifying the ordinary ELM in Eq. (4), we formulate the proposed JGELM as:
where
set
As a result, we have
The algorithm description of our proposed JGELM is summarized in Algorithm 1.
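The following sketch shows how a graph regularizer can be folded into the ELM closed-form solution. It is not the paper's exact formulation: it assumes an objective of the form ½‖β‖² + (C/2)‖Hβ − T‖² + (λ/2)·tr(βᵀHᵀLHβ), and it assumes the joint Laplacian is a weighted sum of the feature and label Laplacians. The function names and the parameters `mu` and `lam` are hypothetical.

```python
import numpy as np

def joint_laplacian(L_feature, L_label, mu=1.0):
    """Combine the two graph Laplacians (weighted sum is an assumption)."""
    return L_feature + mu * L_label

def train_graph_regularized_elm(H, T, L_joint, C=1.0, lam=0.1):
    """Output weights of a graph-regularized ELM under the assumed objective.

    H: (N, n_hidden) hidden-layer outputs, T: (N, m) targets,
    L_joint: (N, N) graph Laplacian over the training samples.
    """
    n_hidden = H.shape[1]
    # Setting the gradient to zero gives
    # (I + C H^T H + lam H^T L H) beta = C H^T T.
    A = np.eye(n_hidden) + C * (H.T @ H) + lam * (H.T @ L_joint @ H)
    return np.linalg.solve(A, C * (H.T @ T))
```

With `lam=0` this reduces to the regularized ELM solution, so the graph term acts purely as an additional smoothness penalty on the outputs `H @ beta`.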
Experiment data
We test the JGELM on four popular multi-label image data sets, which have been widely used for evaluating multi-label learning algorithms.
The Barcelona image data set is composed of urban scenes from Barcelona, and consists of 139 urban scene images in “jpeg” format with minimum resolution of
The nature scene data set [2] contains 2407 images, each represented by a 294-dimensional vector and labeled with 6 semantic concepts.
PASCAL VOC 2007 is an extended visual object recognition challenge data set based on PASCAL VOC 2006. It has 9663 images with 4 groups of annotations, and each group is further divided into the following classes. Person: person; Animal: bird, cat, cow, dog, horse, sheep; Vehicle: bicycle, boat, bus, car, motorbike, train; Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor. We download the 512-dimensional GIST feature and the RGB 4096 feature as the image descriptors extracted from all the images.
MIR FLICKR2008 is a public image data set used for ACM-sponsored image retrieval evaluation. It has 25000 images with 38 classes downloaded from the social photography site Flickr through its public API. After removing the most common annotations, i.e. colors, seasons and place names, the average number of annotations per image is 8.94. In the collection there are 1386 annotations which occur in at least 20 images. We download the 512-dimensional GIST image descriptor extracted from all the images.
The data sets are summarized in Table 1.
Data sets summary
In all experiments, we use 5-fold cross validation. Specifically, we split the data evenly into 5 folds, using 4 folds for training and the remaining fold for testing. In each training step, we further divide the training data into 5 parts, using 4 parts for training and the remaining part as the validation set to tune the regularization parameter. We repeat the above procedure 5 times and report the average classification results.
We evaluate the proposed JGELM on four multi-label image data sets with different combinations of parameters
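The splitting protocol described above can be sketched as index bookkeeping. This is one possible reading, not the authors' code: it assumes a random shuffle before splitting and uses the first inner part as the validation set; the function name `nested_splits` and its parameters are hypothetical.

```python
import numpy as np

def nested_splits(n, n_folds=5, seed=0):
    """Yield (fit_idx, val_idx, test_idx) for each outer fold.

    Outer loop: 4 folds for training, 1 for testing.
    Inner split: training indices cut into 5 parts, 1 held out for validation.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    for i in range(n_folds):
        test_idx = folds[i]
        train_idx = np.concatenate(
            [folds[j] for j in range(n_folds) if j != i]
        )
        inner = np.array_split(train_idx, n_folds)
        val_idx = inner[0]                    # assumed: first part validates
        fit_idx = np.concatenate(inner[1:])   # remaining 4 parts fit the model
        yield fit_idx, val_idx, test_idx
```

The regularization parameter would be chosen on `val_idx` within each outer fold, and the reported score would be the average over the five `test_idx` folds.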
Multi-label classification results
We evaluate the performance of the algorithm by using the accuracy of multi-labeled image classification.
Classification performance comparison on the four multi-label image data sets
Table 2 reports the accuracies of five algorithms on the four data sets. The experimental results demonstrate that the proposed JGELM model performs very well in multi-label image classification compared with the conventional ELM. JGELM performs better on data sets with fewer classes. When the data set is larger and contains more categories, our algorithm is less accurate than LLC.
In this paper, we have proposed JGELM to extend the traditional ELM to multi-label classification. JGELM simultaneously considers the feature information and the label correlation of the data by exploiting the feature distance and label correlation in the local neighborhood. Compared to existing multi-label algorithms, the proposed JGELM maintains almost all the advantages of ELM, such as remarkable training efficiency and direct implementation for multi-class classification problems. It also achieved competitive results against several state-of-the-art multi-label classification algorithms while requiring significantly less training time. JGELM is expected to greatly expand the applicability of ELM and to provide new insights into the extreme learning paradigm.
