Robust face recognition based on a new Kernel-PCA using RRQR factorization

Abstract

Over the last ten years, many variants of principal component analysis (PCA) have been suggested to fight the curse of dimensionality. Recently, A. Sharma et al. proposed a numerically stable algorithm based on the Householder QR decomposition (HQR), called QR PCA. This approach improves on PCA computed via the singular value decomposition (SVD) in terms of computational complexity. In this paper, we propose a new algorithm called RRQR PCA, which enhances the performance of QR PCA by exploiting the rank-revealing QR factorization (RRQR). We also improve the recognition rate of RRQR PCA by developing a nonlinear extension of it. In addition, a new robust RBF $L_{p}$-norm kernel is proposed in order to reduce the effect of outliers and noise. Extensive experiments on two well-known standard face databases, ORL and FERET, show that the proposed algorithm is more robust than conventional PCA, 2DPCA, PCA-L1, WTPCA-L1, LDA, and 2DLDA in terms of face recognition accuracy.

Keywords

Principal component analysis, singular value decomposition, Householder QR decomposition, rank-revealing QR factorization, face recognition

1. Introduction

Nowadays, insecurity has become a major issue in various sectors, as have the computing resources that must be deployed to address it. Verifying and identifying individuals is one of the means of ensuring security. Human beings use their visual system on a daily basis to identify people automatically, although the process involved is complex. Today, there are means of identity verification that rely either on what a person has, such as an identity card, or on what a person knows, such as a password or PIN. Nevertheless, these elements can be forgotten, stolen, or falsified. To circumvent these limitations, another means of security has been developed that uses information intrinsic to a person. This way of identifying individuals is biometrics [1]. The advantage of biometric characteristics is that they are universal, that is, present in all the people to be identified. They are also measurable and unique: two people cannot have exactly the same characteristic. Finally, they are permanent, which means that they do not vary over time. The main interest of biometrics is therefore to recognize and identify individuals automatically using their physiological or behavioral characteristics. Physiological characteristics include, for example, the face [2], the iris [3], and the fingerprint [4]. Behavioral characteristics include, for instance, the voice [5], the signature [6], and keystroke dynamics [7].

This study focuses on facial biometrics, which has many advantages, such as ease of use, user acceptance, and low cost. Nevertheless, facial images constitute large-scale data sets, which usually contain redundant and noisy features that often bias recognition accuracy. To overcome this problem, several conventional dimensionality reduction algorithms have been proposed, such as Principal Component Analysis (PCA) [8], Kernel PCA (KPCA) [9], Linear Discriminant Analysis (LDA) [10], Relevance Weighted LDA (RW-LDA) [11, 12], Kernel RW-LDA (KRWDA) [13], Regularized Discriminant Analysis (RDA) [14], Singular Value Decomposition (SVD) [15], Locality Preserving Projections (LPP) [16], and Independent Component Analysis (ICA) [17, 18]. Many hybrid techniques that combine several feature extraction algorithms have also been employed successfully. For example, [19] proposed a new score fusion of two robust dimensionality reduction algorithms in the discrete wavelet transform (DWT) domain: relevance weighted LDA with QR factorization (RW-LDA/QR) [11, 12] and SVD using the left and right singular vectors. In [22], the authors suggested a simple new discriminative post-processing framework to make dimensionality reduction methods robust to outliers. In the same context, Hanli Qiao proposed a new algorithm called Discriminative PCA [23], whose purpose is to retain the strengths of both PCA and LDA by finding a feature subspace that contains discriminant principal components.

Generally speaking, PCA remains the most suitable technique for extracting expressive features in recognition applications, but it has a very high computational cost. As a recent solution to this weakness, A. Sharma et al. [24] proposed exploiting the QR decomposition to speed up the PCA computation. In this paper, we focus on applying the RRQR factorization within the PCA algorithm to accelerate the computation of the principal components (PCs). We then develop a new nonlinear extension of PCA using the kernel trick, called KPCA/RRQR. In addition, a new robust $L_{p}$-norm RBF kernel is proposed to improve the recognition accuracy of the KPCA/RRQR algorithm. In our experiments, the proposed approach provides the best face recognition rates compared with several efficient feature extraction algorithms under both noisy and noise-free conditions. The face databases used in the experiments are ORL [20] and Face Recognition Technology (FERET) [25].

The rest of this paper is structured as follows. Section 2 briefly reviews some related PCA algorithms. The proposed algorithm is explained in Section 3. Experimental results are presented and discussed in Section 4. Finally, Section 5 concludes the paper.

2. Related work

Principal Component Analysis (PCA) is a linear dimensionality reduction method that is widely used in several fields, such as pattern recognition, artificial intelligence, data mining, and computer vision. Its main purpose is to reduce the number of variables of the input data $X=[x_{1},x_{2},\ldots,x_{n}]\in\mathbb{R}^{d\times n}$ to a low-dimensional representation $Y=[y_{1},y_{2},\ldots,y_{n}]\in\mathbb{R}^{d^{\prime}\times n}$ with $d^{\prime}\ll d$. This is done by searching for a projection basis $Z^{*}=\{z^{*}_{i}\}_{i=1}^{d^{\prime}}$ that minimizes the mean square error (MSE), as expressed by the following equation:

$\displaystyle Z^{*}=\text{argmin}_{Z^{T}Z=I_{d^{\prime}}}\sum_{i=1}^{n}\|e_{i}\|_{2}^{2}=\text{argmin}_{Z^{T}Z=I_{d^{\prime}}}\sum_{i=1}^{n}\|x_{i}-ZZ^{T}x_{i}\|_{2}^{2}$ (1)

where $\|\cdot\|_{2}$ denotes the $L_{2}$-norm of a vector and $I_{d^{\prime}}$ denotes the $d^{\prime}$-dimensional identity matrix. By simple algebra, minimizing Eq. (1) is equivalent to maximizing Eq. (2). Hence, the goal of PCA is to find the projection basis $Z^{*}$ that maximizes the variance of the centred data $F=\{f_{i}\}_{i=1}^{n}$, where $f_{i}=x_{i}-m$ and $m=\frac{1}{n}\sum_{i=1}^{n}x_{i}$ is the centroid of the training set.

$\displaystyle Z^{*}=\text{argmax}_{Z^{T}Z=I_{d^{\prime}}}\sum_{i=1}^{n}\|Z^{T}f_{i}\|_{2}^{2}=\text{argmax}_{Z^{T}Z=I_{d^{\prime}}}tr(Z^{T}C_{r}Z)$ (2)
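The algebra behind this equivalence can be sketched as follows, assuming the data have been centred (so that $x_{i}$ in Eq. (1) is replaced by $f_{i}$) and using $Z^{T}Z=I_{d^{\prime}}$:

$\displaystyle\|f_{i}-ZZ^{T}f_{i}\|_{2}^{2}=f_{i}^{T}f_{i}-2f_{i}^{T}ZZ^{T}f_{i}+f_{i}^{T}Z(Z^{T}Z)Z^{T}f_{i}=\|f_{i}\|_{2}^{2}-\|Z^{T}f_{i}\|_{2}^{2}$

The first term does not depend on $Z$, so minimizing the summed reconstruction error amounts to maximizing $\sum_{i=1}^{n}\|Z^{T}f_{i}\|_{2}^{2}=tr(Z^{T}FF^{T}Z)=n\,tr(Z^{T}C_{r}Z)$ with $C_{r}=\frac{1}{n}FF^{T}$, which has the same maximizer as Eq. (2).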

Here $C_{r}=\frac{1}{n}FF^{T}$ is the covariance matrix and $tr(\cdot)$ is the trace operator. The solution of both problems is given by the orthonormal eigenvectors of $C_{r}$ associated with the $d^{\prime}$ largest eigenvalues. In face recognition, PCA achieves good accuracy on ideal facial databases, but its major drawback is the high computational cost of the eigenvectors of the $d\times d$ covariance matrix that form the projection matrix. Several researchers have proposed practical solutions to this problem. Turk and Pentland [26, 27] overcame the obstacle with a subtle trick that computes the needed quantities from the much smaller $n\times n$ matrix $\tilde{C_{r}}=\frac{1}{n}F^{T}F$. It is easy to prove that the two matrices $C_{r}$ and $\tilde{C_{r}}$ have the same non-zero eigenvalues, denoted $\lambda_{i}$. Let $v_{i}$ denote the normalized eigenvectors of $\tilde{C_{r}}$. Since $\|Fv_{i}\|_{2}^{2}=v_{i}^{T}F^{T}Fv_{i}=n\lambda_{i}$, the normalized eigenvectors of the covariance matrix $C_{r}$ are $z_{i}=\frac{1}{\sqrt{n\lambda_{i}}}Fv_{i}$. Another formulation of PCA, based on the singular value decomposition (SVD) [28], is preferable to this conventional approach, because explicitly forming the covariance matrix involves many multiplications and additions and can cause a loss of precision. In this formulation, the eigenvectors $u_{i}$ and eigenvalues $\lambda_{i}$ are obtained from $[U,D,V^{T}]=\textit{svd}(F)$, where the columns of $U$ are the eigenvectors of $FF^{T}$, the columns of $V$ are the eigenvectors of $F^{T}F$, and $D$ is the diagonal matrix of singular values $\sigma_{1}\geqslant\sigma_{2}\geqslant\ldots\geqslant\sigma_{n}$ with $\sigma_{i}=\sqrt{\lambda_{i}(F^{T}F)}$. The SVD PCA method is presented in Algorithm 2.

Algorithm: SVD PCA
Input: Data matrix $X\in\mathbb{R}^{d\times n}$; $d^{\prime}$: number of PCs.
Output: Projection matrix $Z=\{z_{i}\}_{i=1}^{d^{\prime}}$
$m\leftarrow\textit{mean}(X)$, with $m\in\mathbb{R}^{d\times 1}$
for $i=1,2,\ldots,n$ do
    $F(:,i)\leftarrow X(:,i)-m$
end for
Compute the SVD of $F$ as $F=\textit{UDV}^{T}$
for $i=1,2,\ldots,d^{\prime}$ do
    $Z(:,i)\leftarrow\frac{1}{D(i,i)}FV(:,i)$
end for
return $Z$
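The SVD-based procedure can be sketched in NumPy as follows. This is a minimal illustration, not the authors' MATLAB code: the function name `svd_pca` and the random test data are assumptions made for the example, and samples are stored as columns as in the text.

```python
import numpy as np

def svd_pca(X, n_components):
    """Return (Z, m): a d x n_components projection basis and the d x 1 mean."""
    m = X.mean(axis=1, keepdims=True)          # centroid of the training set
    F = X - m                                  # centred data, d x n
    # Economy-size SVD: F = U @ diag(s) @ Vt, singular values in descending order
    U, s, Vt = np.linalg.svd(F, full_matrices=False)
    # Column i of Z is F v_i / sigma_i, which equals the left singular vector u_i
    Z = (F @ Vt[:n_components].T) / s[:n_components]
    return Z, m

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 12))   # hypothetical data: d = 50, n = 12
Z, m = svd_pca(X, 3)
Y = Z.T @ (X - m)                   # reduced features, 3 x n
```

The columns of `Z` are orthonormal and coincide (up to sign) with the leading eigenvectors of $C_{r}$, so no $d\times d$ covariance matrix is ever formed.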

The computational complexity of SVD PCA is $14dn^{2}$ flops. Recently, A. Sharma et al. [24] proposed a numerically stable alternative based on the QR decomposition, called QR PCA, which improves on the SVD PCA algorithm in terms of computational complexity. Nevertheless, the QR PCA algorithm uses the Householder QR method to perform the QR decomposition, which is computationally more complex. In this paper, we propose to exploit the rank-revealing QR factorization instead of the Householder QR factorization. In the next section, we detail the mathematical formulation of the RRQR factorization and compare the Householder-based QR factorization with the economic RRQR factorization in terms of computational complexity. Moreover, we develop a new version of kernel PCA based on a new kernel function, called the RBF $L_{p}$-norm kernel.

3. Proposed algorithm

3.1 Principal component analysis using RRQR factorization

The rank-revealing QR (RRQR) factorization of a $d\times n$ matrix $A$ is defined by:

$\displaystyle AP=QR=\left[\begin{array}[]{cc}Q_{1}&Q_{2}\\ \end{array}\right]\left[\begin{array}[]{cc}R_{11}&R_{12}\\ 0&R_{22}\\ \end{array}\right]$ (3)

where $P$ is a permutation matrix, $R_{11}$ is upper triangular, and $R_{22}$ is numerically negligible [29]. The order $r$ of $R_{11}$ then reveals the numerical rank of $A$, and the first $r$ columns of $Q$ form an orthonormal basis for the range space of $A$, which is spanned numerically by the first $r$ columns of $AP$. Several rank-revealing QR algorithms have been developed; we use the fastest variant, which produces the “economy size” factorization and is computed in $O(\textit{dnr})$ flops [28]. To compare the QR factorization based on the Householder algorithm with the economic RRQR factorization in terms of computational complexity, we randomly generate data with 3000 samples and a dimension varying from 5000 to 10000. Figure 1 compares their CPU times when implemented in MATLAB. The figure shows that the economic RRQR factorization is computationally more efficient than the Householder QR factorization.
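As an illustration of how such a factorization exposes the numerical rank, the sketch below uses the column-pivoted, economy-size QR from SciPy as a stand-in for an RRQR routine. This substitution is an assumption of the example: pivoted QR reveals the rank in practice but is not guaranteed to be a strong RRQR, and the text does not say which RRQR variant the authors used. The relative tolerance `rtol` is likewise a hypothetical choice.

```python
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(0)
d, n, true_rank = 200, 60, 7
# Hypothetical low-rank test matrix A (d x n) of rank 7
A = rng.standard_normal((d, true_rank)) @ rng.standard_normal((true_rank, n))

# Column-pivoted economy QR: A[:, piv] = Q @ R, with |diag(R)| non-increasing
Q, R, piv = qr(A, mode="economic", pivoting=True)

rtol = 1e-10                                   # assumed threshold for "numerically negligible"
diag = np.abs(np.diag(R))
r = int(np.sum(diag > rtol * diag[0]))         # order of R11 = numerical rank estimate
Q1 = Q[:, :r]                                  # orthonormal basis for the range of A
```

Here `r` recovers the rank used to build `A`, and projecting `A` onto the span of `Q1` reproduces it up to rounding error.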

Figure 1.

A comparison of CPU time between Householder QR and economic RRQR factorizations.

Given the low computational complexity of the economic RRQR factorization, we propose to modify the QR PCA algorithm by using the economic RRQR factorization. The proposed RRQR PCA method is presented in Algorithm 1.

Algorithm: RRQR PCA
Input: Data matrix $X\in\mathbb{R}^{d\times n}$; $d^{\prime}$: number of features.
Output: Projection matrix $Z=\{z_{i}\}_{i=1}^{d^{\prime}}$
$m\leftarrow\textit{mean}(X)$, with $m\in\mathbb{R}^{d\times 1}$
for $i=1,2,\ldots,n$ do
    $F(:,i)\leftarrow X(:,i)-m$
end for
$r\leftarrow\textit{rank}(F)$
Compute the economic RRQR of $F$ as $F=QR$
Compute the economic SVD of $R^{T}$ as $R^{T}=\textit{UDV}^{T}$
$Z\leftarrow Q(:,1:r)V(1:r,1:d^{\prime})$
return $Z$
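A minimal Python sketch of these steps is given below. It assumes SciPy's column-pivoted QR as the rank-revealing factorization and estimates the rank from the diagonal of $R$ with a hypothetical tolerance `rtol`; neither choice is stated in the text. The function name `rrqr_pca` is illustrative.

```python
import numpy as np
from scipy.linalg import qr

def rrqr_pca(X, n_components, rtol=1e-10):
    """PCA basis from a pivoted QR of the centred data followed by a small SVD."""
    m = X.mean(axis=1, keepdims=True)
    F = X - m                                            # centred data, d x n
    Q, R, piv = qr(F, mode="economic", pivoting=True)    # F[:, piv] = Q @ R
    diag = np.abs(np.diag(R))
    r = int(np.sum(diag > rtol * diag[0]))               # numerical rank (assumed tolerance)
    Q1, R1 = Q[:, :r], R[:r, :]                          # F[:, piv] ~ Q1 @ R1
    # R1^T = U D V^T  =>  F[:, piv] = (Q1 V) D U^T, so Q1 V holds the principal directions
    U, s, Vt = np.linalg.svd(R1.T, full_matrices=False)
    Z = Q1 @ Vt.T[:, :n_components]
    return Z, m

rng = np.random.default_rng(1)
X = rng.standard_normal((40, 10))    # hypothetical data: d = 40, n = 10
Z, m = rrqr_pca(X, 3)
```

Because a column permutation does not change the left singular vectors, `Z` spans the same subspace as the leading left singular vectors of the centred data, while the SVD is applied only to the small $n\times r$ matrix $R^{T}$.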

In the next experiment, we compare our approach with QR PCA [24] in terms of execution time, using 3000 samples with a size varying from 6000 to 11000. Figure 2 compares the CPU times of the two methods. It is clear that our approach is faster than the QR PCA variant.

Figure 2.

Numerical comparison for PCA based on QR and RRQR factorizations.

3.2 Robust kernel-PCA using RRQR factorization

One weakness of PCA is that it cannot extract the nonlinear structure of the raw data [30]. To overcome this limitation, researchers have proposed an effective extension of PCA, called kernel PCA (KPCA), which exploits a nonlinear kernel function [9, 31]. This nonlinear version follows the same principle as PCA, except that the original dataset $X=\{x_{i}\}_{i=1}^{n}$, $x_{i}\in\mathbb{R}^{d}$, is first transformed into a higher-dimensional feature space $H$ [32]. The transformation is described by a nonlinear mapping $\Phi:x_{i}\rightarrow\Phi_{i}=\Phi(x_{i})$, which maps the training dataset into $H$. An important property of this feature space is that the dot product of two vectors $\Phi(x_{i})$ and $\Phi(x_{j})$, $i,j=1,\ldots,n$, can be determined as follows:

$\displaystyle K_{ij}=\Phi(x_{i})^{T}\Phi(x_{j})=k(x_{i},x_{j})$ (4)

where $K_{ij}=\langle\Phi(x_{i}),\Phi(x_{j})\rangle$ is the inner product of two vectors in $H$. The full Gram matrix $K$ is given explicitly by:

$\displaystyle K=\left[\begin{array}[]{ccc}\Phi_{1}^{T}\Phi_{1}&\cdots&\Phi_{1}^{T}\Phi_{n}\\ \vdots&\ddots&\vdots\\ \Phi_{n}^{T}\Phi_{1}&\cdots&\Phi_{n}^{T}\Phi_{n}\\ \end{array}\right]=\left[\begin{array}[]{ccc}k(x_{1},x_{1})&\cdots&k(x_{1},x_{n})\\ \vdots&\ddots&\vdots\\ k(x_{n},x_{1})&\cdots&k(x_{n},x_{n})\\ \end{array}\right]$ (5)

where $k$ is the kernel function. KPCA seeks the optimal projection basis in $H$, onto which all data vectors are projected, by maximizing the objective function in Eq. (6):

$\displaystyle F_{\textit{KPCA}}(Z)=\text{argmax}_{Z}tr(Z^{T}C_{\Phi}Z)$ (6)

where $C_{\Phi}$, the covariance matrix of the dataset in the space $H$, is defined as:

$\displaystyle C_{\Phi}=\frac{1}{n}\sum_{i=1}^{n}\Phi(x_{i})\Phi(x_{i})^{T}$ (7)

Diagonalizing the covariance matrix $C_{\Phi}$ yields non-negative eigenvalues $\lambda$ and eigenvectors $Z$ that satisfy the following equation:

$\displaystyle C_{\Phi}Z=\lambda Z$ (8)

It is clear that each eigenvector $Z$ of $C_{\Phi}$ can be expanded as the following linear combination:

$\displaystyle Z=\sum_{i=1}^{n}\alpha_{i}\Phi_{i}$ (9)

Substituting Eq. (9) into Eq. (6), we obtain a new eigenvalue problem:

$\displaystyle\left(I_{n}-\frac{1}{n}1_{n}\right)K\left(I_{n}-\frac{1}{n}1_{n}\right)^{T}\alpha=\lambda\alpha$ (10)

where $I_{n}$ is the identity matrix of size $n\times n$, $1_{n}$ is an $n\times n$ matrix whose elements are all equal to $1$, $\alpha$ is the vector of expansion coefficients of the eigenvector $Z$, and $K$ is the Gram matrix defined by Eq. (5). For more technical details on the solution of Eq. (10), the reader is referred to [33].

The Gram matrix $K$ is symmetric and positive semi-definite. We therefore propose to factorize $K$ with the Cholesky factorization, $K=\tilde{K}\tilde{K}^{T}$. Eq. (10) can then be rewritten as:

$\displaystyle\left[\left(I_{n}-\frac{1}{n}1_{n}\right)\tilde{K}\right]\left[\tilde{K}^{T}\left(I_{n}-\frac{1}{n}1_{n}\right)^{T}\right]\alpha=\lambda\alpha$ (11)

Using the matrix property $B^{T}A^{T}=(AB)^{T}$, Eq. (11) takes the form:

$\displaystyle\tilde{K}_{\textit{centred}}\tilde{K}_{\textit{centred}}^{T}\alpha=\lambda\alpha$ (12)

where $\tilde{K}_{\textit{centred}}=(I_{n}-\frac{1}{n}1_{n})\tilde{K}$. Finally, Eq. (12) can be solved by applying the steps of Algorithm 3.2 to calculate the optimal projection bases.
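The equivalence between Eq. (10) and Eq. (12) can be checked numerically with a short sketch. The data, the kernel width of 1.5, and the small diagonal jitter added before the Cholesky factorization are assumptions of this example; the jitter only guards against a Gram matrix that is positive semi-definite but not numerically positive definite.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((5, 8))                          # hypothetical data: d = 5, n = 8
sq = np.sum((X[:, :, None] - X[:, None, :]) ** 2, axis=0)    # squared L2 distances, n x n
K = np.exp(-sq / (2 * 1.5 ** 2))                         # Gaussian RBF Gram matrix (sigma = 1.5 assumed)

n = K.shape[0]
H = np.eye(n) - np.ones((n, n)) / n                      # centring matrix I_n - (1/n) 1_n
L = np.linalg.cholesky(K + 1e-10 * np.eye(n))            # K ~ L L^T (jitter is an assumption)
Kc_tilde = H @ L                                         # the centred Cholesky factor of Eq. (12)

lhs = np.linalg.eigvalsh(H @ K @ H.T)[::-1]              # eigenvalues of Eq. (10), descending
rhs = np.linalg.svd(Kc_tilde, compute_uv=False) ** 2     # squared singular values of the factor
```

The two spectra agree up to the jitter, which is why the eigenproblem can be solved from a factorization of the centred Cholesky factor instead of the centred Gram matrix itself.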

The kernel functions most commonly used in conventional KPCA algorithms are:

(1) Polynomial kernel: $k(x,y)=(1+x^{T}y)^{d}$, $d\in\mathbb{N}$

(2) Gaussian RBF kernel: $k(x,y)=\exp\left(-\frac{\|x-y\|^{2}_{2}}{2\sigma^{2}}\right)$, where $\sigma$ is the width of the Gaussian function.

In general, the Gaussian RBF kernel gives good recognition rates when its parameter is well tuned [34]. However, the classical RBF kernel is based on the $L_{2}$-norm, which is very sensitive to outliers and noise [35, 36, 37, 38]. A widely used remedy for this weakness is to replace the $L_{2}$-norm with the $L_{p}$-norm, $p<2$. We therefore propose to use the $L_{p}$-norm in the RBF kernel function instead of the $L_{2}$-norm. This new $L_{p}$-norm RBF kernel is expressed by the following equation:

$\displaystyle k_{p}(x,y)=\exp\left(-\frac{\|x-y\|_{p}}{2\sigma^{2}}\right),\quad p<2$ (13)
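A direct NumPy transcription of Eq. (13) might look as follows. The function name `lp_rbf_kernel` is hypothetical, samples are assumed to be stored as columns, and $\|\cdot\|_{p}$ is taken to be $(\sum_{j}|x_{j}|^{p})^{1/p}$, which is only a quasi-norm for $p<1$. The broadcasted difference array needs $O(dnm)$ memory, so this form is meant for illustration rather than large image sets.

```python
import numpy as np

def lp_rbf_kernel(X, Y, p=0.5, sigma=3000.0):
    """Gram matrix K[i, j] = exp(-||x_i - y_j||_p / (2 sigma^2)) for column samples."""
    diff = np.abs(X[:, :, None] - Y[:, None, :])       # d x n x m absolute differences
    dist = np.sum(diff ** p, axis=0) ** (1.0 / p)      # L_p (quasi-)norm of each difference
    return np.exp(-dist / (2.0 * sigma ** 2))

# Two 2-D points, (0, 0) and (3, 4), stored as columns
X = np.array([[0.0, 3.0],
              [0.0, 4.0]])
K1 = lp_rbf_kernel(X, X, p=1.0, sigma=1.0)             # L1 distance between the points is 7
```

With $p=1$ and $\sigma=1$ the off-diagonal entry is $\exp(-7/2)$, while the diagonal entries are always 1.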

In the rest of this paper, the parameter $p$ of the $L_{p}$-norm takes only two values, $p\in\{0.5, 1.0\}$, and the width of the Gaussian function is set to $\sigma=3000$ [39]. Algorithm 3.2 summarizes the steps of the proposed approach.

Algorithm: Robust kernel-PCA using RRQR factorization
Input: Data matrix $X=\{x_{i}\}_{i=1}^{n}\in\mathbb{R}^{d\times n}$; $d^{\prime}$: number of features; $p$.
Output: Projection matrix $Z=\{z_{i}\}_{i=1}^{d^{\prime}}$
Compute the Gram kernel matrix $K_{p}$ using Eqs (5) and (13).
Compute the matrix $\tilde{K}$ by Cholesky factorization: $K_{p}=\tilde{K}\tilde{K}^{T}$
Compute the matrix $\tilde{K}_{\textit{centred}}$ as $\tilde{K}_{\textit{centred}}=(I_{n}-\frac{1}{n}1_{n})\tilde{K}$
Perform economic RRQR on $\tilde{K}_{\textit{centred}}$: $\tilde{K}_{\textit{centred}}=QR$
Perform economic SVD on $R^{T}$: $R^{T}=\textit{UDV}^{T}$
for $i=1,2,\ldots,d^{\prime}$ do
    $\tilde{V}(:,i)\leftarrow\frac{1}{D(i,i)}V(:,i)$
end for
$Z\leftarrow Q\tilde{V}$
return $Z$
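The steps of this algorithm can be sketched in Python as below. This is an illustrative reading, not the authors' implementation: SciPy's column-pivoted QR stands in for the RRQR, a small `jitter` is added so that the Cholesky factorization succeeds, and the helper names are hypothetical. Note that `np.linalg.cholesky` raises an error if the $L_{p}$ Gram matrix is not numerically positive definite, which is not guaranteed for every $p<2$; the usage example therefore takes $p=1$ and a kernel width suited to the toy data.

```python
import numpy as np
from scipy.linalg import qr

def lp_rbf_gram(X, Y, p, sigma):
    """Gram matrix of the L_p-norm RBF kernel of Eq. (13) for column samples."""
    diff = np.abs(X[:, :, None] - Y[:, None, :])
    return np.exp(-np.sum(diff ** p, axis=0) ** (1.0 / p) / (2.0 * sigma ** 2))

def rkpca_fit(X, n_components, p=1.0, sigma=3000.0, jitter=1e-10):
    """Return the n x n_components matrix Z of scaled expansion coefficients."""
    n = X.shape[1]
    K = lp_rbf_gram(X, X, p, sigma)
    L = np.linalg.cholesky(K + jitter * np.eye(n))     # K ~ L L^T (jitter is an assumption)
    H = np.eye(n) - np.ones((n, n)) / n                # centring matrix
    Kc = H @ L                                         # centred Cholesky factor
    Q, R, _ = qr(Kc, mode="economic", pivoting=True)   # pivoted QR as the RRQR stand-in
    # R^T = U D V^T, so Kc Kc^T = (Q V) D^2 (Q V)^T and Q V holds the eigenvectors
    _, s, Vt = np.linalg.svd(R.T, full_matrices=False)
    V_scaled = Vt.T[:, :n_components] / s[:n_components]   # divide column i by D(i, i)
    return Q @ V_scaled

rng = np.random.default_rng(3)
X = rng.standard_normal((6, 10))                       # hypothetical data: d = 6, n = 10
Z = rkpca_fit(X, 3, p=1.0, sigma=2.0)
```

Dividing each column by the corresponding singular value normalizes the projection directions in feature space, so that $Z^{T}K_{c}Z$ is the identity for the centred Gram matrix $K_{c}$.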

Hereafter, the proposed algorithm is called RKPCA. To obtain the feature vectors $Y(k)$ of the training set, we project the training facial image vectors $X=\{x_{k}\}_{k=1}^{n}$ as follows:

$\displaystyle Y(k)=Z^{T}[k(x_{1},x_{k}),\ldots,k(x_{n},x_{k})]^{T}$ (14)

In the test step of the proposed algorithm, the features $T(l)$ of the test facial images $\{t_{l}\}_{l=1}^{nT}$ are obtained by the following equation:

$\displaystyle T(l)=Z^{T}[k(x_{1},t_{l}),\ldots,k(x_{n},t_{l})]^{T}$ (15)

where $nT$ is the number of test facial images and $x_{1},\ldots,x_{n}$ are the training images. Finally, we measure the similarity between the training features $Y$ and the test features $T$ using the nearest neighbor classifier with the Euclidean distance.
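The classification step can be illustrated with a small nearest neighbor sketch. The function name and the toy feature values are assumptions; in practice `Y_train` and `T_test` would hold the projected training and test features, stored as columns.

```python
import numpy as np

def nearest_neighbor_predict(Y_train, labels_train, T_test):
    """Assign each test feature (column of T_test) the label of its closest training feature."""
    # Pairwise squared Euclidean distances, shape n_test x n_train
    d2 = np.sum((T_test[:, :, None] - Y_train[:, None, :]) ** 2, axis=0)
    return labels_train[np.argmin(d2, axis=1)]

# Toy 2-D features: two training samples per class
Y_train = np.array([[0.0, 0.0, 5.0, 5.0],
                    [0.0, 1.0, 5.0, 6.0]])
labels_train = np.array([0, 0, 1, 1])
T_test = np.array([[0.2, 4.8],
                   [0.4, 5.5]])
true_labels = np.array([0, 1])

pred = nearest_neighbor_predict(Y_train, labels_train, T_test)
recognition_rate = np.mean(pred == true_labels)        # fraction of correctly labelled test samples
```

The recognition rate reported in the experiments corresponds to this fraction of correctly classified test images.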

4. Experimental results and discussion

In this section, we evaluate the recognition performance of the proposed algorithm on two face databases: ORL (Olivetti Research Laboratory) [20] and Face Recognition Technology (FERET) [25]. Our RKPCA approach is compared with related existing algorithms, including PCA [40], 2DPCA [41], PCA-L1 [42], WTPCA-L1 [43], LDA [10], and 2DLDA [44]. The experiments were run on a 2.00 GHz i7 processor with 8 GB of RAM, using MATLAB as the development environment.

Figure 3.

Some face images from the ORL database.

4.1 Experiments on ORL database

The ORL face database contains 40 individuals, each represented by 10 different images, so the database contains 400 grayscale faces with a fixed size of 112 $\times$ 92 pixels. All the images were taken against a dark homogeneous background, with the subjects in an upright, frontal position with tolerance for some side movement. All face images are resized to 56 $\times$ 46 and quantized to 256 gray levels. Figure 3 shows some facial images from this database.

In the first experiment, we select the first five face images of each individual for the training set and take the last five as test images, so the training and test sets each contain 200 face images. Table 1 shows the recognition rates of RRQR PCA and our two proposed algorithms. The first, KPCA/RRQR, uses the classic Gaussian RBF kernel; the second, RKPCA, uses the new RBF $L_{p}$-norm kernel with $p=1.0$ and $p=0.5$. According to these results, KPCA/RRQR gives recognition rates better than or equal to those of RRQR PCA for most numbers of features. Furthermore, RKPCA generally performs better with $p=0.5$. The highest recognition rate, 95.00%, is obtained when the number of features is 40.

Table 1
Recognition rates of RRQR PCA and our proposed algorithms on the ORL database

Number of features	RRQR PCA	KPCA/RRQR	RKPCA ($p=1.0$)	RKPCA ($p=0.5$)
5	71.00%	73.50%	76.50%	75.00%
10	84.00%	85.50%	85.00%	86.00%
15	84.50%	86.00%	88.00%	88.00%
20	86.50%	86.50%	88.00%	92.50%
25	87.00%	87.50%	89.00%	91.50%
30	87.50%	88.00%	90.00%	93.00%
35	88.50%	88.00%	91.50%	94.50%
40	88.50%	88.00%	91.00%	95.00%
45	89.00%	90.50%	93.00%	94.00%
50	89.00%	91.00%	93.00%	94.50%

Table 2

Average recognition rates on the ORL database under noise-free conditions and with Salt and Pepper noise (columns $10^{-4}$ to $10^{-1}$ give the noise density)

Methods	Noise-free	$10^{-4}$	$10^{-3}$	$10^{-2}$	$10^{-1}$
PCA	88.50%	88.50%	88.50%	88.30%	78.25%
2DPCA	90.50%	90.50%	90.55%	90.30%	88.95%
PCA-L1	89.75%	89.75%	89.05%	88.40%	80.50%
WTPCA-L1	92.10%	92.10%	92.05%	92.15%	88.60%
LDA	90.25%	90.25%	90.30%	88.45%	74.20%
2DLDA	93.15%	92.95%	92.70%	92.15%	89.85%
RKPCA ($p=1.0$)	91.50%	91.25%	91.25%	91.25%	90.35%
RKPCA ($p=0.5$)	95.00%	95.00%	95.00%	94.75%	94.65%

In the next experiment, we examine the robustness of our approach to noise. To do so, Salt & Pepper and Gaussian noise are added to the facial images of the ORL database. The noise density takes the values $10^{-4}$, $5\times 10^{-4}$, $10^{-3}$, $5\times 10^{-3}$, $10^{-2}$, $5\times 10^{-2}$, and $10^{-1}$. We run our approach ten times and compute the average recognition rates. The impact of Salt & Pepper and Gaussian noise on the average recognition rates for the ORL database is shown in Table 2 and Fig. 4, respectively. Table 2 shows that the recognition rates generally decrease as the noise density increases. Moreover, the proposed algorithm maintains high recognition rates across noise densities; for example, RKPCA with $p=0.5$ still reaches 94.75% at a noise density of $10^{-2}$. Fig. 4 shows the effect of added Gaussian noise on recognition accuracy for the ORL database. The recognition accuracy of PCA, 2DPCA, PCA-L1, WTPCA-L1, LDA, and 2DLDA decreases as the noise variance rises, whereas the recognition accuracy of the RKPCA algorithm remains the highest.
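The noise injection used in such a protocol can be sketched as follows. The text does not specify the exact noise model, so this example assumes a common convention: the Salt & Pepper density is the fraction of pixels set to black or white, and the Gaussian noise is zero-mean with a variance given on a $[0,1]$ intensity scale. Function names and the constant test image are hypothetical.

```python
import numpy as np

def add_salt_and_pepper(img, density, rng):
    """Set roughly a fraction `density` of pixels to 0 or 255 (half each on average)."""
    noisy = img.copy()
    u = rng.random(img.shape)
    noisy[u < density / 2] = 0                              # pepper
    noisy[(u >= density / 2) & (u < density)] = 255         # salt
    return noisy

def add_gaussian_noise(img, variance, rng):
    """Add zero-mean Gaussian noise with the given variance on a [0, 1] intensity scale."""
    x = img.astype(np.float64) / 255.0
    x = x + rng.normal(0.0, np.sqrt(variance), img.shape)
    return np.clip(np.rint(x * 255.0), 0, 255).astype(np.uint8)

rng = np.random.default_rng(4)
img = np.full((56, 46), 128, dtype=np.uint8)                # hypothetical uniform gray image
sp = add_salt_and_pepper(img, 0.1, rng)
gauss = add_gaussian_noise(img, 0.01, rng)
```

Averaging the recognition rate over several runs, as described above, would then amount to repeating the corruption with different random seeds.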

Figure 4.

Impact of Gaussian-noise on recognition accuracy for ORL database.

4.2 Experiments on FERET database

The Face Recognition Technology (FERET) database contains over 10,000 facial images, taken in different situations, with a size of 80 $\times$ 80 pixels. Other versions of this database contain color and gray facial images. We selected 50 individuals and 7 images per individual from the gray version of the FERET database. All images are in TIFF format, and some of them are displayed in Fig. 5. We select the first four face images of each individual to build the training set and take the remaining images for the test set.

Figure 5.

Some face images from the FERET database.

Table 3 lists the recognition rates of the RRQR PCA and RKPCA algorithms on the FERET database when the number of features varies from 10 to 120. Figure 6 shows some facial images from the FERET database with and without noise; the second and third rows are corrupted with Salt & Pepper noise and Gaussian noise, respectively. From the results in Table 3, it can be observed that our KPCA/RRQR algorithm gives recognition rates better than or equal to those of RRQR PCA for most numbers of features. Moreover, RKPCA generally performs better with the RBF $L_{p}$-norm kernel with $p=0.5$. The highest recognition rate, 68.00%, is obtained when the number of features is 120.

Table 3

Recognition rates of RRQR PCA and our algorithms on the FERET database

Number of features	RRQR PCA	KPCA/RRQR	RKPCA ($p=1.0$)	RKPCA ($p=0.5$)
10	58.00%	58.33%	58.00%	55.33%
20	62.66%	62.66%	62.66%	61.33%
30	62.00%	62.33%	64.00%	64.00%
40	62.00%	61.00%	62.00%	63.33%
50	60.00%	61.00%	63.33%	63.33%
60	60.66%	61.66%	62.66%	64.66%
70	60.66%	60.66%	63.33%	65.33%
80	60.66%	60.66%	62.66%	65.33%
90	60.66%	60.66%	62.66%	64.66%
100	61.33%	62.33%	63.33%	66.00%
110	60.66%	62.66%	64.00%	66.66%
120	61.33%	63.33%	64.00%	68.00%

Table 4

Average recognition rates on the FERET database under noise-free conditions and with Salt and Pepper noise (columns $10^{-4}$ to $10^{-1}$ give the noise density)

Methods	Noise-free	$10^{-4}$	$10^{-3}$	$10^{-2}$	$10^{-1}$
PCA	61.33%	60.46%	60.08%	58.40%	28.26%
2DPCA	66.00%	66.00%	66.00%	65.66%	59.63%
PCA-L1	63.33%	62.60%	62.06%	61.53%	32.73%
WTPCA-L1	64.66%	63.80%	63.53%	52.86%	59.33%
LDA	64.00%	63.40%	62.80%	57.40%	21.86%
2DLDA	65.33%	64.00%	64.06%	62.53%	61.93%
RKPCA ($p=1.0$)	64.00%	63.73%	63.40%	62.53%	46.20%
RKPCA ($p=0.5$)	68.00%	68.00%	68.13%	66.40%	62.93%

Figure 6.

Some facial images of the FERET database with and without noise. The second and third rows are corrupted with Salt and Pepper noise and Gaussian noise, respectively.

We also examine the robustness of our approach to noise on this database. Salt & Pepper and Gaussian noise are added to the facial images of the FERET database. The noise density takes the values $10^{-4}$, $5\times 10^{-4}$, $10^{-3}$, $5\times 10^{-3}$, $10^{-2}$, $5\times 10^{-2}$, and $10^{-1}$. We run our approach 10 times and compare the average results with those of other existing algorithms. The impact of Salt and Pepper and Gaussian noise on the average recognition rates for the FERET database is shown in Table 4 and Fig. 7, respectively. Table 4 shows that the recognition rates generally decrease as the noise density increases and that the proposed algorithm maintains high recognition rates across noise densities; for example, RKPCA with $p=0.5$ reaches 68.13% at a noise density of $10^{-3}$. Fig. 7 shows the effect of added Gaussian noise on the recognition accuracy of our algorithm on the FERET database. The recognition accuracy of PCA, 2DPCA, PCA-L1, WTPCA-L1, LDA, and 2DLDA decreases as the noise variance rises, whereas the recognition accuracy of the RKPCA algorithm remains the highest, which indicates the efficiency and robustness of the proposed algorithm.

Figure 7.

Impact of Gaussian-noise on recognition accuracy of our approach on FERET database.

5. Conclusion

Based on a novel RBF $L_{p}$-norm kernel, we have proposed a new robust extension of kernel principal component analysis for face recognition. Our approach not only makes use of the strong robustness of the $L_{p}$-norm to noise and outliers, but also uses the RRQR factorization to speed up the computation of the projection bases. Experiments on ORL and FERET show that the proposed algorithm is more robust than several advanced feature extraction algorithms in terms of recognition accuracy under noisy and noise-free conditions.

Acknowledgments

The authors wish to thank the anonymous reviewers, whose comments were a great help to raise the level of technical and formal correctness of this paper.

References

1. Patel V.M., Ratha N.K. and Chellappa, Cancelable biometrics: a review, IEEE Signal Processing Magazine 32(5) (2015), 54–65.
2. Zhao, Chellappa, Phillips P.J. and Rosenfeld, Face recognition: a literature survey, ACM Computing Surveys (CSUR) 35(4) (2003), 399–458.
3. Nithya A.A. and Lakshmi, Iris recognition techniques: a literature survey, International Journal of Applied Engineering Research 10(12) (2015), 32525–32546.
4. Singh S.P., Ayub and Saini, Literature survey on different type of fingerprint recognition, in: 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom), IEEE, 2016, pp. 3748–3755.
5. Klevans R.L. and Rodman R.D., Voice recognition, Artech House, Inc., 1997.
6. Obata, Uchikawa, Furuhashi and Yang, Signature recognition system, Google Patents, 1998, US Patent 5,825,906.
7. Monrose and Rubin A.D., Keystroke dynamics as a biometric for authentication, Future Generation Computer Systems 16(4) (2000), 351–359.
8. Jolliffe, Principal Component Analysis, Springer-Verlag, New York, 1986.
9. Wang and Zhang, Facial recognition based on kernel PCA, in: 2010 Third International Conference on Intelligent Networks and Intelligent Systems, IEEE, 2010, pp. 88–91.
10. Belhumeur P.N., Hespanha J.P. and Kriegman D.J., Eigenfaces vs. fisherfaces: Recognition using class specific linear projection, IEEE Transactions on Pattern Analysis & Machine Intelligence, 1997, 711–720.
11. Tang E.K., Suganthan P.N., Yao and Qin A.K., Linear dimensionality reduction using relevance weighted LDA, Pattern Recognition 38(4) (2005), 485–493.
12. Chougdali, Jedra and Zahid, Using Wavelets based Feature Extraction and Relevance Weighted LDA for Face Recognition, in: PRIS, 2007, pp. 183–188.
13. Chougdali, Jedra and Zahid, Kernel relevance weighted discriminant analysis for face recognition, Pattern Analysis and Applications 13(2) (2010), 213–221.
14. Dai D.-Q. and Yuen P.C., Face recognition by regularized discriminant analysis, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 37(4) (2007), 1080–1085.
15. Hsu C.-H. and Chen C.-C., SVD-based projection for face recognition, in: 2007 IEEE International Conference on Electro/Information Technology, IEEE, 2007, pp. 600–603.
16. Chen H.-T., Chang H.-W. and Liu T.-L., Local discriminant embedding and its variants, in: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), Vol. 2, IEEE, 2005, pp. 846–853.
17. Comon, Independent component analysis, a new concept? Signal Processing 36(3) (1994), 287–314.
18. Bartlett M.S., Independent component representations for face recognition, in: Face Image Analysis by Unsupervised Learning, Springer, 2001, pp. 39–67.
19. Maafiri and Chougdali, New fusion of SVD and relevance weighted LDA for face recognition, Procedia Computer Science 148 (2019), 380–388.
20. Samaria F.S. and Harter A.C., Parameterisation of a stochastic model for human face identification, in: Proceedings of 1994 IEEE Workshop on Applications of Computer Vision, IEEE, 1994, pp. 138–142.
21. Nefian, Georgia Tech Face Database 1999, Online: http://www.anefian.com/research/facereco.htm.
22. Abbad, Elharrouss, Abbad and Tairi, Application of MEEMD in post-processing of dimensionality reduction methods for face recognition, IET Biometrics 8(1) (2018), 59–68.
23. Qiao, Discriminative Principal Component Analysis: A REVERSE THINKING, arXiv preprint arXiv:1903.04963, 2019.
24. Sharma, Paliwal K.K., Imoto and Miyano, Principal component analysis using QR decomposition, International Journal of Machine Learning and Cybernetics 4(6) (2013), 679–683.
25. Phillips P.J., Wechsler, Huang and Rauss P.J., The FERET database and evaluation procedure for face-recognition algorithms, Image and Vision Computing 16(5) (1998), 295–306.
26. Turk and Pentland, Eigenfaces for recognition, Journal of Cognitive Neuroscience 3(1) (1991), 71–86.
27. Turk M.A. and Pentland A.P., Face recognition using eigenfaces, in: Proceedings. 1991 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, IEEE, 1991, pp. 586–591.
28. Golub and Van Loan, Matrix Computations, 4th Edition, The Johns Hopkins University Press, Baltimore, MD, 2013.
29. Hong Y.P. and Pan C.-T., Rank-revealing QR factorizations and the singular value decomposition, Mathematics of Computation 58(197) (1992), 213–232.
30. Feng, Xiao, Liu, Song, Yang and Zhang, A kernel principal component analysis-based degradation model and remaining useful life estimation for the turbofan engine, Advances in Mechanical Engineering 8(5) (2016), 1687814016650169.
31. Kim K.I., Jung and Kim H.J., Face recognition using kernel principal component analysis, IEEE Signal Processing Letters 9(2) (2002), 40–42.
32. Taouali, Jaffel, Lahdhiri, Harkat M.F. and Messaoud, New fault detection method based on reduced kernel principal component analysis (RKPCA), The International Journal of Advanced Manufacturing Technology 85(5–8) (2016), 1547–1552.
33. Mansouri, Baklouti, Harkat M.F., Nounou and Hamida A.B., Kernel generalized likelihood ratio test for fault detection of biological systems, IEEE Transactions on Nanobioscience 17(4) (2018), 498–506.
34. Cao, Chua K.S., Chong, Lee and Gu, A comparison of PCA, KPCA and ICA for dimensionality reduction in support vector machine, Neurocomputing 55(1–2) (2003), 321–336.
35. Kwak, Principal component analysis by $L_{p}$-norm maximization, IEEE Transactions on Cybernetics 5(44) (2014), 594–609.
36. J.H. and Kwak, Generalization of linear discriminant analysis using Lp-norm, Pattern Recognition Letters 34(6) (2013), 679–685.
37. Park and Kwak, Independent component analysis by lp-norm optimization, Pattern Recognition 76 (2018), 752–760.
38. Zhong, Zhang and Li, Discriminant locality preserving projections based on L1-norm maximization, IEEE Transactions on Neural Networks and Learning Systems 25(11) (2014), 2065–2074.
39. Ebied H.M., Feature extraction using PCA and Kernel-PCA for face recognition, in: 2012 8th International Conference on Informatics and Systems (INFOS), IEEE, 2012, p. MM–72.
40. Abdi and Williams L.J., Principal component analysis, Wiley Interdisciplinary Reviews: Computational Statistics 2(4) (2010), 433–459.
41. Yang, Zhang, Frangi A.F. and Yang J.-y., Two-dimensional PCA: a new approach to appearance-based face representation and recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence 26(1) (2004), 131–137.
42. Kwak, Principal component analysis based on L1-norm maximization, IEEE Transactions on Pattern Analysis and Machine Intelligence 30(9) (2008), 1672–1680.
43. Maafiri and Chougdali, Face Recognition using Wavelets based Feature Extraction and PCA-L1 norm, in: 2019 International Conference on Vision Towards Emerging Trends in Communication and Networking (ViTECoN), IEEE, 2019, pp. 1–4.
44. Yang, Zhang, Yong and Yang J.-y., Two-dimensional discriminant transform for face recognition, Pattern Recognition 38(7) (2005), 1125–1129.
