Abstract
Hyperspectral imaging provides new opportunities for improving face recognition accuracy. However, it poses challenges such as difficulty in data acquisition, low signal-to-noise ratio (SNR), and high dimensionality. In this paper, we propose a novel method for hyperspectral face recognition that achieves good recognition rates. We first reduce noise adaptively in each spectral band and then crop each face. We apply the minimum noise fraction (MNF) transform to the cropped face data cube in order to extract a number of MNF bands. We then extract histogram of oriented gradients (HOG) features from each MNF band. We conducted experiments to test this new method for hyperspectral face recognition, with very promising results. For the Hong Kong Polytechnic University Hyperspectral Face Database (PolyU-HSFD), we achieved an average correct recognition rate of 95.4% with a standard deviation of 2.6 (95.4% ±2.6). For the CMU Hyperspectral Face Database (CMU-HSFD), we achieved an average correct recognition rate of 98.1% with a standard deviation of 0.8 (98.1% ±0.8). We choose MNF for hyperspectral face recognition because it can separate noise from fine features in the face data cube and at the same time reduce the dimensionality of the face data cube. As a result, our proposed face recognition method should be faster than methods without dimensionality reduction.
Introduction
Hyperspectral imagery analysis is a very popular topic in remote sensing. However, it is not popular in face recognition due to its difficulty in data acquisition, low signal to noise ratio (SNR), and high dimensionality. In traditional hyperspectral imagery, there exist two-dimensional (2D) spatial information and one-dimensional (1D) spectral information, where misalignment between spectral bands is negligible. In existing hyperspectral face databases, many kinds of distortion among spectral bands coexist, such as spatial shift, rotation, scaling, eye opening and closing, facial expression, and so on. This presents a challenge to hyperspectral face recognition. As a result, there is a need to improve existing methods and develop new ones for hyperspectral face recognition.
A number of papers on this topic have been published in the literature, and we briefly review them here. Di et al. ([1, 2]) worked on hyperspectral face recognition in the visible spectrum with feature band selection. Their experimental results showed that hyperspectral face recognition with the selected feature bands outperformed recognition using a single band, all bands, or the conventional RGB color bands. Pan et al. ([3, 4]) used the 31-band spectral signatures at hand-picked locations of the face in the NIR spectrum (700–1000 nm) for face recognition. They reported a high recognition rate under pose variations on a database comprising 1400 hyperspectral images of 200 subjects. Shen and Zheng [5] proposed a hyperspectral face recognition method using 3D Gabor wavelets. They found that their method outperformed spectrum feature, PCA and 2D-PCA on the HK-PolyU database. Uzair et al. [6] studied hyperspectral face recognition with spatiospectral information fusion and partial least square (PLS) regression. They compared their method with 13 extended and five existing hyperspectral face recognition techniques on three standard data sets, and showed that it outperformed all of them by a significant margin.
Turk and Pentland [7] proposed a face recognition method with Eigenfaces. They treated face recognition as a 2D recognition problem and described facial images as a small set of 2D characteristic views. Facial images were projected onto a feature space that best encoded the variation in known facial images. Ahonen et al. [8] proposed a face recognition method using local binary patterns (LBP). The LBP is a very effective operator that labels the pixels of an image by thresholding the neighborhood of each pixel and treating the result as a binary number. Chen et al. [9] proposed a novel method for hyperspectral face recognition by extracting log-polar-Fourier features [10] and using CRC-based classifiers. Both partial voting and full voting were used in their experiments. They claimed that their voting techniques achieved higher recognition rates than those without voting.
Sun et al. [11] developed a robust tensor preserving projection (RTPP) method, in which a multispectral image is represented as a third-order tensor. RTPP constructs sparse neighborhoods and then computes weights of the tensor. It iteratively finds every spectral space transformation matrix by preserving sparse neighborhoods. Due to sparse representation, RTPP can keep the underlying spatial structure of multispectral images and also enhance robustness to distortions. Wang et al. [12] developed a new method using both spatial and spectral information for expression-invariant face recognition. The method utilized several three-dimensional Gabor filters to exploit spatial and spectral correlations, whereas principal component analysis (PCA) was used to model expression variation. Experimental results demonstrated the superiority of their method over other existing methods. Pan et al. [13] studied illumination-invariant face recognition in hyperspectral imagery, using weighted subspace projection over multiple tissue types for face classification. Their face dataset was very large and they achieved a good classification rate for hyperspectral face recognition. Sharma and Gool [14] developed a new method for hyperspectral face recognition. They showed that the discriminative spectral information in image-level features leads to significantly improved performance in face recognition. They studied the potential of traditional feature descriptors in hyperspectral facial images as well. Vinayak and Payal [15] proposed a hyperspectral face recognition method using multidimensional clustering on hyperspectral face images. They used Kekre’s Fast Codebook Generation Algorithm and Kekre’s Median Codebook Generation Algorithm to generate codebook feature vectors for each face sub-band. This feature vector set is used for identification of the person.
In this paper, we propose a new method for hyperspectral face recognition that extracts HOG features [16] and classifies unknown hyperspectral faces with collaborative representation-based classifiers (CRC). Instead of extracting features from the hyperspectral face cubes directly, we use the minimum noise fraction (MNF) transform to obtain a number of noise-reduced bands, which are well suited to hyperspectral face recognition. Our experimental results show that our classification rates are very competitive for both the Hong Kong Polytechnic University Hyperspectral Face Database (PolyU-HSFD) and the CMU-HSFD.
The organization of this paper is as follows. Section 2 proposes a new method for hyperspectral face recognition by extracting HOG features and using a collaborative representation-based classifier (CRC). Section 3 reports experiments on two hyperspectral face databases to evaluate the proposed method. Finally, Section 4 draws the conclusion of the paper and proposes future research directions.
Proposed method
In this section, we present a novel algorithm for hyperspectral face recognition by combining minimum noise fraction (MNF), HOG feature extraction, and collaborative representation-based voting classifiers. We briefly review these three concepts here.
Minimum noise fraction
Minimum noise fraction (MNF) ([17, 18]) transform can be used to do the following three things: (a) determining the inherent dimensionality of image data; (b) segregating noise in the data; (c) reducing the computational requirements for subsequent processing. The MNF transform as modified from Green et al. [17] is essentially two cascaded principal component transforms. The first transform is based on an estimated noise covariance matrix, and it decorrelates and rescales the noise in the data. This first step results in transformed data in which the noise has unit variance and no band-to-band correlations. The second step is a standard principal component transform of the noise-whitened data. For the purposes of further spectral processing, the inherent dimensionality of the data is determined by examination of the final eigenvalues and the associated images. The data space can be divided into two parts: one part associated with large eigenvalues and coherent eigenimages, and a complementary part with near-unity eigenvalues and noise-dominated images. By using only the coherent portions, the noise is separated from the data, thus improving spectral processing results.
In this paper, we choose a modified MNF that is more efficient than the standard MNF. Assume $\Sigma_N = W \Lambda_N W^T$ is an eigendecomposition of $\Sigma_N$. Also, assume
Therefore,
So, we know
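To make the two cascaded principal component transforms concrete, the following is a minimal NumPy sketch of a standard MNF transform. It is not the modified MNF used in this paper. The function name `mnf` and the noise estimator (differences between horizontally adjacent pixels) are assumptions made for illustration only.

```python
import numpy as np

def mnf(cube, n_components):
    """Standard MNF sketch. cube has shape (rows, cols, bands).

    Returns the first n_components MNF bands (highest SNR first) and
    all eigenvalues of the noise-whitened data, in decreasing order.
    """
    rows, cols, bands = cube.shape
    X = cube.reshape(-1, bands).astype(float)
    X = X - X.mean(axis=0)

    # Assumption: noise is estimated from differences of horizontally
    # adjacent pixels; the factor 1/2 accounts for differencing two samples.
    diff = (cube[:, 1:, :] - cube[:, :-1, :]).reshape(-1, bands).astype(float)
    sigma_n = np.cov(diff, rowvar=False) / 2.0

    # First transform: eigendecomposition Sigma_N = W Lambda_N W^T,
    # then rescale so the noise has unit variance and no band correlation.
    lam_n, W = np.linalg.eigh(sigma_n)
    lam_n = np.maximum(lam_n, 1e-12)      # guard against division by zero
    F = W / np.sqrt(lam_n)                # equals W @ diag(lam_n ** -0.5)
    Xw = X @ F

    # Second transform: ordinary PCA of the noise-whitened data.
    lam, V = np.linalg.eigh(np.cov(Xw, rowvar=False))
    order = np.argsort(lam)[::-1]         # large eigenvalues = coherent images
    Y = Xw @ V[:, order[:n_components]]
    return Y.reshape(rows, cols, n_components), lam[order]
```

Under these assumptions, the least noisy output band is the first one, `bands[:, :, 0]`, and eigenvalues close to one indicate noise-dominated components.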
Histogram of oriented gradients
Histogram of oriented gradients (HOG) [16] is a powerful descriptor for face recognition, because local object appearance and shape can be described by the distribution of intensity gradients. The facial image is divided into small connected regions (cells), and for the pixels within every region, a histogram of gradient directions is computed. The HOG descriptor is the concatenation of these histograms, and it is invariant to geometric and photometric transformations, except for object orientation. The implementation of the HOG includes the following five steps:
(a) Perform global image normalization to reduce the effects of illumination.
(b) Calculate first-order image gradients, which capture contour, silhouette and some texture information.
(c) Produce an encoding that is sensitive to local image content while remaining resistant to small changes in pose or appearance.
(d) Perform normalization, which takes local groups of cells and contrast-normalizes their overall responses.
(e) Collect the HOG descriptors from all blocks of a dense overlapping grid of blocks covering the detection window into a combined feature vector.
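The five steps can be sketched in NumPy as follows. This is a simplified illustration, not the implementation used in our experiments. The function name `hog_features`, the parameter values (8×8-pixel cells, 2×2-cell blocks, 9 unsigned orientation bins) and the hard bin assignment without interpolation are all assumptions.

```python
import numpy as np

def hog_features(image, cell=8, block=2, bins=9, eps=1e-6):
    """Simplified HOG descriptor for a 2D grayscale image."""
    # Global normalization: square-root compression of non-negative
    # intensities reduces the effect of illumination.
    img = np.sqrt(np.maximum(np.asarray(image, dtype=float), 0.0))

    # First-order gradients, with unsigned orientation in [0, 180).
    gy, gx = np.gradient(img)
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    bin_idx = (ang // (180.0 / bins)).astype(int) % bins

    # Local encoding: one magnitude-weighted orientation histogram per cell.
    n_cy, n_cx = img.shape[0] // cell, img.shape[1] // cell
    hist = np.zeros((n_cy, n_cx, bins))
    for i in range(n_cy):
        for j in range(n_cx):
            sl = (slice(i * cell, (i + 1) * cell),
                  slice(j * cell, (j + 1) * cell))
            hist[i, j] = np.bincount(bin_idx[sl].ravel(),
                                     weights=mag[sl].ravel(),
                                     minlength=bins)

    # Block normalization over overlapping groups of cells, followed by
    # concatenation into a single feature vector.
    feats = []
    for i in range(n_cy - block + 1):
        for j in range(n_cx - block + 1):
            v = hist[i:i + block, j:j + block].ravel()
            feats.append(v / np.sqrt(np.sum(v ** 2) + eps ** 2))
    return np.concatenate(feats)
```

With these parameter choices, a 64×64 image gives 7×7 overlapping blocks of 36 values each, that is, a 1764-dimensional feature vector.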
Collaborative representation-based classifier
Sparse representation based classification (SRC) [19] has been widely used for face recognition (FR). SRC first codes a testing sample as a sparse linear combination of all the training samples, and then classifies the testing sample by evaluating which class leads to the minimum representation error. As argued in [20], it is the collaborative representation, not the l1-norm sparsity, that makes SRC powerful for face classification, which motivates the collaborative representation-based classifier (CRC). In CRC, one has to solve the following optimization problem:

Fig. 1. The flowchart of our proposed method for hyperspectral face recognition.

Fig. 2. The ten MNF bands from a hyperspectral face cube in the PolyU-HSFD.
where $A^T$ means the transpose of the matrix $A$. Because $D = (A^T A + \lambda I)^{-1} A^T$ can be computed offline, it is fast to calculate
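As a rough illustration of the CRC step, the sketch below uses the regularized least-squares form commonly given in the CRC literature [20]: $\hat{\rho} = \arg\min_{\rho} \{ \|y - A\rho\|_2^2 + \lambda \|\rho\|_2^2 \}$, with closed-form solution $\hat{\rho} = Dy$. The symbols $y$ (test feature vector) and $\rho$ (coding vector), the function names, and the class-wise regularized residual used for the decision are assumptions made for this sketch.

```python
import numpy as np

def crc_fit(A, lam=0.001):
    """Precompute D = (A^T A + lam I)^{-1} A^T for a dictionary A.

    A has shape (d, n): each column is one training feature vector.
    """
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T)

def crc_predict(A, D, labels, y):
    """Classify y by the smallest class-wise regularized residual."""
    labels = np.asarray(labels)
    rho = D @ y                           # coding vector, one product per test
    best_label, best_score = None, np.inf
    for c in np.unique(labels):
        mask = labels == c
        residual = np.linalg.norm(y - A[:, mask] @ rho[mask])
        score = residual / (np.linalg.norm(rho[mask]) + 1e-12)
        if score < best_score:
            best_label, best_score = c, score
    return best_label
```

Since `D` depends only on the training data, it is computed once offline, and each test sample then costs a single matrix-vector product plus the per-class residuals.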
In this paper, we propose a novel method for hyperspectral face recognition. The flowchart of this method is given in Fig. 1. Our method consists of the following steps:
(a) Since hyperspectral face databases often contain a significant amount of noise, we perform denoising adaptively on every noisy band of the data cube using BM3D ([21]) or on the whole data cube directly using our methods ([22–25]). Note that this step is optional in our method.
(b) We crop the denoised facial images by using Masayuki Tanaka’s Matlab code [25] for face parts detection. This code finds the left eye, right eye, nose, mouth, and face bounding boxes in each face image. In this paper, we only use the face bounding box (bbx) to crop faces. Let the size of bbx be R×C, Dim = 128 and Dim2 = 64. If R > Dim, then
where $f_b$ is the finally cropped facial image.
(c) We perform MNF on the cropped bands 1:N, bands 1:N/2, bands N/2 + 1:N, bands 1:N/3, bands N/3 + 1:2N/3, bands 2N/3 + 1:N, bands 1:N/4, bands N/4 + 1:N/2, bands N/2 + 1:3N/4, and bands 3N/4 + 1:N, respectively. The value N is the number of bands in the hyperspectral face data cube. We extract the least noisy band from each of these MNF transforms. In this way, we obtain ten MNF bands that are robust to noise and to which feature extraction can be applied. Note that we use parallel processing for the ten MNF transforms in Matlab, which makes the implementation faster than sequential processing. Figures 2 and 3 show the ten MNF output bands of the PolyU-HSFD and CMU-HSFD data cubes, respectively.
(d) In order to alleviate distortion of face images, we extract the HOG features from every MNF band of the cropped face.
(e) We use the CRC to classify every MNF band into one of the M classes individually, and then the most frequent class among the ten MNF bands is chosen as the class of the original hyperspectral face data cube. We observe significant increases in recognition rates for this majority voting strategy when compared with pure single band-based CRC classification. As in [9], we use both partial voting and full voting to classify the facial images. For partial voting, we use the [low, high] bands for both the PolyU-HSFD and the CMU-HSFD datasets. The values of low and high are assigned as follows. Let T be the current MNF band index, with T ∈ [1, 10]. If T < 4, then low = 1 and high = T. If T ⩾ 4, then low = 3 and high = T. For full voting, we set low = 1 and high = 10. We use these values for our CRC voting throughout this paper.
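The voting rule can be sketched as follows, reading the [low, high] assignments literally from the text. The function names `voting_window` and `vote` are hypothetical, and the tie-breaking behaviour (the label seen first among the most frequent ones wins) is an assumption, since the text does not specify it.

```python
from collections import Counter

def voting_window(T, mode="partial", n_bands=10):
    """Return the 1-based inclusive band range [low, high] used for voting."""
    if mode == "full":
        return 1, n_bands
    if mode == "partial":
        return (1, T) if T < 4 else (3, T)
    raise ValueError("mode must be 'partial' or 'full'")

def vote(band_predictions, T, mode="partial"):
    """Majority vote over the per-band CRC labels inside the window.

    band_predictions[k] is the label predicted from MNF band k + 1.
    """
    low, high = voting_window(T, mode, len(band_predictions))
    window = band_predictions[low - 1:high]
    return Counter(window).most_common(1)[0][0]
```

For example, with T = 6 the partial vote uses only MNF bands 3 to 6, whereas the full vote always uses bands 1 to 10.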

Fig. 3. The ten MNF bands from a hyperspectral face cube in the CMU-HSFD.
Our new method takes advantage of the merits of the excellent denoising capability of the BM3D, the MNF transform and the HOG transform for hyperspectral face recognition. In addition, our experimental results in the next section demonstrate that our new method is very competitive when compared with other existing methods for hyperspectral face recognition. More importantly, our MNF can perform both denoising and dimension reduction, which are very important for today’s hyperspectral imagery with hundreds of spectral bands.
A natural question about our proposed method is why we introduce MNF instead of band selection. We argue that MNF is good at extracting noise-robust images from heavily noisy hyperspectral face images, whereas band selection does not have this capability. Since our MNF is implemented in Matlab in parallel, it is also fast in terms of CPU computing time, given the wide availability of multi-core personal computers (PC) and clusters of servers.
Another natural question about MNF is why we use it instead of all hyperspectral face bands in our proposed method. In our algorithm, we only extract 10 MNF output bands, which is fewer than the total number of the original hyperspectral face bands (24 for PolyU-HSFD and 65 for CMU-HSFD). This means that MNF-based feature extraction and classification should be faster than non-MNF-based methods. As a general guideline, we suggest using the MNF-based method in heavily noisy environments and non-MNF-based methods in less noisy environments. Furthermore, our MNF-based method should be used to speed up the classification process when there are hundreds or thousands of face bands in hyperspectral face datasets.
It should be pointed out that this paper is an extension of our previous conference paper [27] published in IEEE IGARSS 2016.
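The ten band groups from which the MNF output bands are taken (the full range, halves, thirds and quarters of the N bands) can be enumerated with a short helper. The name `mnf_band_groups` is hypothetical, and the use of integer floor division when N is not divisible by 2, 3 or 4 (for example N = 65) is an assumption, since the text does not state the rounding rule.

```python
def mnf_band_groups(n_bands):
    """Return ten 1-based inclusive (first, last) band ranges."""
    groups = []
    for parts in (1, 2, 3, 4):
        # Split points of an even partition into `parts` pieces,
        # rounded down (assumption for non-divisible band counts).
        edges = [(k * n_bands) // parts for k in range(parts + 1)]
        groups += [(edges[k] + 1, edges[k + 1]) for k in range(parts)]
    return groups
```

One MNF transform is run per range and only its least noisy output band is kept, which is how 24 or 65 input bands are reduced to 10.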
Experimental results

Fig. 4. A hyperspectral face cube from the PolyU-HSFD.
Table 1. Database details about the PolyU-HSFD and CMU-HSFD databases

Fig. 5. A hyperspectral face cube from the CMU-HSFD with only odd-numbered bands displayed.
We conduct a number of experiments on the PolyU-HSFD ([1, 2]) and CMU-HSFD ([3, 4]) datasets. In the PolyU-HSFD, each cube contains 33 bands covering the spectral range of 400–720 nm with a step size of 10 nm. The database contains appearance variations of the subjects, including hair style changes and skin conditions. The database has low signal-to-noise ratio (SNR) and interband alignment errors. Table 1 lists the details of this database. Figure 4 shows a hyperspectral face cube from the PolyU-HSFD and Fig. 2 displays the ten MNF bands generated from one hyperspectral face data cube. There are 48 subjects (13 females and 35 males) in this hyperspectral face database. Each of the first 25 subjects has 4 to 7 cubes and the remaining subjects have only one cube. As in [5] and [6], we use the first 25 subjects in our experiments, comprising 113 hyperspectral image cubes. The first 6 and last 3 bands are removed due to low SNR, following [5] and [6], which leaves 24 bands per cube. For each subject, two cubes are randomly selected for training (50 cubes in total) and the remaining 63 cubes are used for testing. Faces are cropped and resized to 64×64 pixels.
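The random per-subject split can be sketched as below. The helper name `split_per_subject` and the integer seed are assumptions, and the per-subject cube counts in the usage note are hypothetical values chosen only so that they total 113.

```python
import numpy as np

def split_per_subject(subject_ids, n_train, seed=0):
    """Randomly pick n_train cubes per subject for training.

    Returns sorted index lists (train, test) into subject_ids.
    """
    rng = np.random.default_rng(seed)
    subject_ids = np.asarray(subject_ids)
    train, test = [], []
    for s in np.unique(subject_ids):
        idx = rng.permutation(np.flatnonzero(subject_ids == s))
        train.extend(idx[:n_train].tolist())
        test.extend(idx[n_train:].tolist())
    return sorted(train), sorted(test)
```

For instance, with 25 subjects holding 113 cubes in total and `n_train=2`, the split yields 50 training cubes and 63 testing cubes, matching the counts above.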
Table 2. The correct classification rates (%) and standard deviation (STD) of different face recognition methods. The best method is highlighted in bold font
Table 3. The correct classification rates (%) and standard deviation (STD) of different face recognition methods with and without feature extraction. ‘MNF10’ means the 10 MNF bands obtained in step (c) of our proposed method. ‘ALL’ denotes using all bands of the original hyperspectral face data cube without MNF transform. ‘HOG’ is the histogram of oriented gradient features. ‘CRC’ is the CRC without voting and ‘Proposed’ is the results for CRC with partial voting. The best method is highlighted in bold font
The CMU-HSFD [3] is illustrated in Fig. 5 with only odd-numbered bands displayed. In this database, each hyperspectral face cube contains 65 bands covering the spectral range of 450–1090 nm with a step size of 10 nm. This database contains 48 subjects, and each subject has 4 to 20 cubes acquired in different sessions and under different lighting combinations. We selected only the cubes acquired with all lights turned on. Therefore, our experimental data contains 147 hyperspectral face cubes of 48 subjects, where each subject has 1–5 cubes. We randomly choose one cube from each class as the training data set and the remaining cubes are used as the testing data set. Table 1 lists the details of this database. Figure 3 shows the ten MNF bands from a hyperspectral face cube in the CMU-HSFD.
We compare our method with several existing methods in Table 2. We run the proposed method 10 times and then take the mean and standard deviation (STD) of the correct recognition rates. It can be seen that our method is very competitive among these methods. In our method, we set the eigenface dimension to 500 and the regularization parameter λ to 0.001 for our CRC classifier. The seed for the random number generator (RNG) is set to 3.14 for our experiments. We obtain a 95.4% ±2.6 correct recognition rate for the PolyU-HSFD face database. We also use partial least square (PLS [28]) regression for our method, but we only obtain 89.19% ±3.88, which is not as good as the results in [6]. This may be because our experimental settings differ from those in [6]. The correct recognition rates and STDs for spectral signature [3] (24.6% ±3.9), spectral angle-based classification [3] (25.40% ±4.36), eigenfaces [7] (76.5% ±4.5), 3D Gabor wavelet transforms [5] (91.3% ±2.1), and band fusion PLS [6] (95.2% ±1.6) are taken from [6]. For the CMU-HSFD database, we obtain a 98.1% ±0.8 correct recognition rate. The correct recognition rates and STDs for spectral signature [3] (38.1% ±1.9), spectral angle-based classification [3] (38.1% ±1.9), eigenfaces [7] (82.6% ±4.1), 3D Gabor wavelet transforms [5] (91.6% ±2.9), and band fusion PLS [6] (99.1% ±0.6) are taken from [6]. We performed PLS as well for this database, and we obtained a 93.1% ±2.5 recognition rate. Again, this is lower than the recognition rate reported in [6] since the experimental setup is different. Uzair et al. performed ten-fold cross validation for their experiments, whereas we randomly selected training and testing cubes and reported the average classification rate over the runs. This means that we did not perform cross validation in our experiments.
Table 3 shows the correct recognition rates of different classification methods for both the PolyU-HSFD and CMU-HSFD databases, with the best results in bold font. In this table, ‘MNF10’ means the 10 MNF bands obtained in step (c) of our proposed method. ‘ALL’ denotes using all bands of the original hyperspectral face data cube without MNF transform. ‘HOG’ means the HOG features are extracted. It can be seen that the standard CRC is always inferior to the proposed method by about 2.0% in terms of correct classification rates for both the PolyU-HSFD and CMU-HSFD databases. In addition, feature extraction with HOG is better than no feature extraction for face recognition, as shown in Table 3. For example, MNF10 + HOG achieves higher classification rates than MNF10, and ALL + HOG obtains higher classification rates than ‘ALL’.

Fig. 6. A comparison among CRC without voting, CRC with partial voting, and CRC with full voting for the PolyU-HSFD.

Fig. 7. A comparison among CRC without voting, CRC with partial voting, and CRC with full voting for the CMU-HSFD.
We have to point out that in Table 3 the recognition rates of ‘ALL’ are higher than those of MNF for both the PolyU-HSFD and CMU-HSFD databases. One reason is that we introduce the BM3D denoising method as a preprocessing step in our proposed method. We believe that without BM3D denoising ‘ALL’ cannot achieve higher classification rates than MNF, because MNF can extract noise-robust images from the noisy hyperspectral face data cubes whereas ‘ALL’ does not have this capability. Another reason is that the dimensionality reduction performed by MNF on the hyperspectral face data loses some useful features.
Figures 6 and 7 plot the curves of recognition rates versus the number of MNF bands for CRC without voting, CRC with partial voting, and CRC with full voting for the PolyU-HSFD and the CMU-HSFD, respectively. It can be seen that CRC with partial voting is always the best among the three methods in both figures, with only one exception (T = 1). Although CRC full voting does not perform well for T ∈ [1, 8], it outperforms the CRC without voting for T = 9 and 10 in both figures.
Conclusions
Hyperspectral face recognition has recently become one of the most important topics in biometrics. It offers capabilities that an ordinary 2D face recognizer cannot achieve. A number of papers in the literature have addressed this topic, but better methods still need to be developed in order to obtain higher classification rates for hyperspectral face recognition.
In this paper, we have proposed a novel method for hyperspectral face recognition. Our method extracts HOG features from the MNF output bands. These features are resistant to small changes in pose or appearance. Experiments show that our new method is very competitive when compared with several existing methods for the PolyU-HSFD and CMU-HSFD face databases.
Further research on this topic is ongoing in order to improve the recognition rates for hyperspectral face recognition. We would like to apply deep neural networks (DNN) to hyperspectral face recognition in order to obtain higher classification rates. We are currently working on deep convolutional neural networks (DCNN) for hyperspectral face recognition. We will apply the class-dependent sparse representation classifier [29] to hyperspectral face recognition as well.
Acknowledgments
We would like to thank Drs. David Zhang, Lei Zhang and Meng Yang for posting the PolyU-HSFD dataset and their source code of the CRC method on their websites. We also thank Pan et al. and Masayuki Tanaka for their CMU-HSFD dataset and face parts detection Matlab source code, respectively.
