Robust Face Recognition Based on Texture Analysis

Abstract

In this paper, we present a new framework for face recognition with varying illumination based on DCT total variation minimization (DTV), a Gabor filter, a sub-micro-pattern analysis (SMP) and discriminated accumulative feature transform (DAFT). We first suppress the illumination effect by using the DCT with the help of TV as a tool for face normalization. The DTV image is then emphasized by the Gabor filter. The facial features are encoded by our proposed method - the SMP. The SMP image is then transformed to the 2D histogram using DAFT. Our system is verified with experiments on the AR and the Yale face database B.

Keywords

Discriminated Feature Transform SMP

1. Introduction

Image representation is one of the key issues in the computer vision and image processing communities. Its usual purpose is to transform the raw image to a new domain in which the recognition process is tolerant of illumination changes, rotation, scaling, pose, occlusion, etc. Several techniques have been proposed for solving the problems in face recognition by means of image representation. Among the holistic approaches, PCA and LDA [10, 3] are the most prominent methods for face recognition. The eigenface is an optimal reconstruction method in the sense of minimum mean square error: it projects an image onto the directions that maximize the total scatter across all classes of face images [10]. This means that the eigenface is not optimal in the sense of discrimination ability, which depends upon the separation between different classes rather than the spread of all classes. To address class separation, a method based on class-specific linear projection was proposed by Belhumeur et al. [3]. This method finds the eigenvectors for which the ratio of the between-class scatter to the within-class scatter is maximized. Recently, methods based on local encoding enhancement have been proposed for face recognition under variable illumination [1, 2]. Ahonen et al. [1] proposed a face representation based on a local binary pattern (LBP) histogram. They divided the face area into 49 small (7×7) windows and used the $LBP_{8,2}^{U2}$ operator to encode the local pixels of the face images. Zhang et al. [12] presented a combination of the Gabor filter and the LBP, called the local Gabor binary pattern histogram sequence (LGBPHS). They applied Gabor filters at five scales and eight orientations to the face image; the LBP was then applied to all 40 Gabor magnitude images to generate the LGBPHS. An extension of the LGBPHS was proposed in [2], where the Gabor phase is encoded and represented as the histogram of Gabor phase patterns (HGPP). Quadrant-bit codes, extended from the IrisCode [6], are first extracted from the face images based on the Gabor phase information. Global and local Gabor phase patterns (GGPP and LGPP) are then used to encode the phase variations: the GGPP captures the variations across the orientations of the Gabor wavelet at a given scale, while the LGPP encodes the local variations by using a local XOR pattern.

In this paper, we present a method for illumination normalization and facial feature amplification. We combined the discrete cosine transform and total variation minimization - which is called DTV - for suppressing the variations in illumination. The aims of our normalization process are to suppress the illumination variations while enhancing the facial structures. In general, a face normalization method should satisfy the following criteria:

Good Normalization: the illumination variations should be suppressed and the facial components must not be destroyed.

Good Preservation: the facial structures (eyes, eyebrows, nose and mouth) must be maintained and should be more prominent than the others.

We present our normalization process with the aim of satisfying these criteria. Next, we encode the pre-processed image using our method - the SMP. Finally, the 2-D SMP image is transformed into a 2-D feature space - the 2-D histogram. The χ² test statistic is used to compare the 2-D histograms.

This paper is organized as follows. Section 2 presents the face normalization and face encoding techniques. We introduce a feature transform in section 3. Extensive experimental results will be demonstrated in section 4. Finally, we conclude in section 5.

2. The Basic Principles

We give a brief overview of the basic principles which are used in this paper, including the discrete cosine transform, the total variation minimization, the Gabor filter and a sub-micro-pattern analysis.

2.1 Discrete Cosine Transform

The discrete cosine transform (DCT) is a useful tool in signal processing for frequency analysis. Its coefficients correspond to the frequency components of a signal or image, arranged from low to high frequency. The DCT represents a finite sequence of pixel values as a sum of cosine functions oscillating at different frequencies. For a 2D image of size M × N, the DCT-II is defined as:

C (a, b) = α (a) α (b) \sum_{x = 0}^{M - 1} \sum_{y = 0}^{N - 1} f (x, y) C_{x}^{M} (a) C_{y}^{N} (b),

(1)

with its inverse transform:

f (x, y) = \sum_{a = 0}^{M - 1} \sum_{b = 0}^{N - 1} α (a) α (b) C (a, b) C_{x}^{M} (a) C_{y}^{N} (b),

(2)

where $C_{k}^{P} (h) = \cos [\frac{π (2 k + 1) h}{2 P}]$ . For both equations (1) and (2), α(a) and α(b) are defined as $α (a) = 1 ∕ \sqrt{M}$ if a = 0, otherwise $α (a) = \sqrt{2 ∕ M}$ and $α (b) = 1 ∕ \sqrt{N}$ if b = 0, otherwise $α (b) = \sqrt{2 ∕ N}$ . If a = 0 and b = 0, then C(0,0) is $(1 ∕ \sqrt{M}) (1 ∕ \sqrt{N}) \sum_{x = 0}^{M - 1} \sum_{y = 0}^{N - 1} f (x, y)$ , which is proportional to the average value of the image (the DC component). Since illumination variations are generally assumed to lie mainly in the low-frequency components, the suppression is applied to the DC and the first few AC components. It is defined by a divisor matrix Q with:

\begin{matrix} Q (0, 0) = β_{0}, \\ Q (0, 1) = Q (1, 0) = β_{1}, \\ Q (0, 2) = Q (2, 0) = Q (2, 2) = β_{2} \end{matrix}

and

Q (0, 3) = Q (3, 0) = Q (2, 3) = Q (3, 2) = β_{3}

with all other Q(a,b)s set to 1, where β₀ ∈ [0.85,1.3], β₁ ∈ [0.5,0.7], β₂=β₁ + 0.1 and β₃ = β₂ + 0.1. We define the quantized DCT face image as:

D (x, y) = \mathrm{iDCT} \left( \frac{\mathrm{DCT} (f (x, y))}{Q (x, y)} \right),

(3)

where DCT and iDCT are the discrete cosine transform and its inverse as defined in equations (1) and (2). Using DCT in conjunction with TV is useful insofar as the facial features are enhanced while the illumination variations are suppressed. The quantized DCT D will be used in the next subsection.
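As an illustration of equations (1) to (3), the following minimal NumPy sketch builds the orthonormal DCT-II basis directly from the definition of $C_{k}^{P} (h)$ and α, divides the low-frequency coefficients by Q, and inverts the transform. The function names and the default values of β₀ and β₁ are assumptions chosen for illustration (the defaults lie inside the ranges stated above); this is not the authors' implementation.

```python
import numpy as np

def dct_matrix(P):
    """Orthonormal DCT-II basis: entry (h, k) is alpha(h) * cos(pi * (2k + 1) * h / (2P))."""
    k = np.arange(P)
    h = np.arange(P).reshape(-1, 1)
    T = np.cos(np.pi * (2 * k + 1) * h / (2 * P))
    T[0, :] *= 1.0 / np.sqrt(P)      # alpha(0) = 1 / sqrt(P)
    T[1:, :] *= np.sqrt(2.0 / P)     # alpha(h) = sqrt(2 / P) for h > 0
    return T

def divisor_matrix(shape, beta0, beta1):
    """Divisor matrix Q as listed in the text; every other entry is 1."""
    if min(shape) < 4:
        raise ValueError("image must be at least 4 x 4")
    beta2 = beta1 + 0.1
    beta3 = beta2 + 0.1
    Q = np.ones(shape)
    Q[0, 0] = beta0
    Q[0, 1] = Q[1, 0] = beta1
    Q[0, 2] = Q[2, 0] = Q[2, 2] = beta2
    Q[0, 3] = Q[3, 0] = Q[2, 3] = Q[3, 2] = beta3
    return Q

def quantized_dct(f, beta0=1.0, beta1=0.6):
    """Equation (3): forward DCT, element-wise division by Q, inverse DCT."""
    f = np.asarray(f, dtype=float)
    M, N = f.shape
    Tm, Tn = dct_matrix(M), dct_matrix(N)
    C = Tm @ f @ Tn.T                            # equation (1)
    C = C / divisor_matrix(f.shape, beta0, beta1)
    return Tm.T @ C @ Tn                         # equation (2)
```

Because only the DC coefficient carries the image mean, the mean of the output equals the mean of the input divided by β₀, which is a quick sanity check for an implementation along these lines.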

2.2 Total Variation Minimization

Without loss of generality, we denote the input image of the TV model by f; in fact, f is the quantized DCT image D generated in the previous subsection. Total variation (TV) minimization is a tool for image reconstruction which preserves edges and allows for sharp boundaries [4]. It can be used to separate the large-scale component of an image from the small-scale one. In particular, the TV model decomposes an input image f into a large-scale output u and a small-scale output v, where f, u and v are functions of image intensity values with f,u,v ∈ ℝ². We assume that u and v are two images, extracted from an image f, defined on Ω. A general way to obtain the decomposition is to solve the minimization problem:

\min_{u} \int_{Ω} ∣ \nabla u ∣ + λ ∥ f - u ∥_{L^{x}},

(4)

where $\int_{Ω} ∣ \nabla u ∣$ is the total variation of u over its support Ω and λ is a scalar threshold. $∥ f - u ∥_{L^{x}}$ is some measure of the closeness between u and f; the choice of the measure $∥ \cdot ∥_{L^{x}}$ depends upon the application. Thus, an image f can be decomposed as f = u + v, where u represents the large-scale image (the cartoon) and v the small-scale texture and noise. The objective of TV is to find a u that is close to f and has an edge-preserving property. The general steps for finding u are as follows.

Minimize the total variation $\int_{Ω} ∣ \nabla u (x) ∣ d x$ of u over its support Ω.

Compute the regularization measure term $∥ f - u ∥_{L^{x}}$ using the L¹-norm, the L²-norm or f - u.

The regularization measure term ‖ f -u‖_Lx generally tends to maintain u close to f and, therefore, also maintains the edges of f in u. A direct approach for solving the TV minimization is second-order cone programming (SOCP) with modern interior-point methods [4, 7] as a kernel. A standard form SOCP can be described as follows

\min c_{1}^{T} x_{1} + \dots + c_{r}^{T} x_{r}

such that $A_{1} x_{1} + \dots + A_{r} x_{r} = b$

x_{i} \in K^{n_{i}}, for i = 1, \dots, r,

(5)

where $c_{i} \in ℝ^{n_{i}}$ , $A_{i} \in ℝ^{m \times n_{i}}$ and $b \in ℝ^{m}$ . Each sub-vector $x_{i}$ must lie in an elementary second-order cone of dimension $n_{i}$ :

K^{n_{i}} \equiv \{ x_{i} = (x_{i}^{0}; {\bar{x}}_{i}) \in ℝ \times ℝ^{n_{i} - 1} : ∥ {\bar{x}}_{i} ∥ \leq x_{i}^{0} \} .

(6)

We assume that the images are represented as 2-D n × n matrices, whose elements give the grey values of corresponding pixels, i.e.:

f_{i, j} = u_{i, j} + v_{i, j} for i, j = 1, \dots, n .

(7)

Let ∂⁺ be the discrete differential operator defined by:

\partial^{+} u_{i, j} = ({(\partial_{x}^{+} u)}_{i, j}, {(\partial_{y}^{+} u)}_{i, j}),

(8)

where ${(\partial_{x}^{+} u)}_{i, j} = u_{i + 1, j} - u_{i, j}$ for i = 1,…, n − 1, j = 1, …, n, and ${(\partial_{y}^{+} u)}_{i, j} = u_{i, j + 1} - u_{i, j}$ for i = 1,…, n, j = 1, …, n − 1. These are the forward differences commonly used in image processing to approximate the first derivatives. The discrete total variation of u is defined by forward finite differences as:

\int_{Ω} | \nabla u | : = \sum_{i, j} {[{({(\partial_{x}^{+} u)}_{i, j})}^{2} + {({(\partial_{y}^{+} u)}_{i, j})}^{2}]}^{1 ∕ 2}

(9)

By introducing $t_{i, j}$ and the relation ${({(\partial_{x}^{+} u)}_{i, j})}^{2} + {({(\partial_{y}^{+} u)}_{i, j})}^{2} \leq {(t_{i, j})}^{2}$ for each pixel i,j = 1,…,n, the total variation minimization can be cast as the SOCP (SOCP-TV) problem:

\min \sum_{1 \leq i, j \leq n} t_{i, j} + λ s

such that:

\begin{matrix}
u_{i, j} + v_{i, j} = f_{i, j}, \quad \forall i, j = 1, \dots, n, \\
- {(\partial_{x}^{+} u)}_{i, j} + (u_{i + 1, j} - u_{i, j}) = 0, \quad i = 1, \dots, n - 1, \\
- {(\partial_{y}^{+} u)}_{i, j} + (u_{i, j + 1} - u_{i, j}) = 0, \quad j = 1, \dots, n - 1, \\
{(\partial_{x}^{+} u)}_{n, j} = 0, \quad {(\partial_{y}^{+} u)}_{i, n} = 0, \\
\sum_{1 \leq i, j \leq n} (f_{i, j} - u_{i, j}) \leq s, \quad \sum_{1 \leq i, j \leq n} (u_{i, j} - f_{i, j}) \leq s, \\
(t_{i, j}; {(\partial_{x}^{+} u)}_{i, j}, {(\partial_{y}^{+} u)}_{i, j}) \in K^{3}
\end{matrix}

(10)

Having solved the SOCP-TV problem for u, the normalized face image can be formulated as:

f^{'} = \frac{f}{u}

(11)

where f is the quantized DCT image defined in equation (3) and u is the large-scale image obtained by solving SOCP-TV. We call this normalization method DCT-TV or, in short, DTV.
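The discrete quantities in equations (8), (9) and (11) can be sketched in a few lines of NumPy. The sketch below only evaluates the forward differences, the discrete total variation and the quotient image for a given u; it does not solve the SOCP-TV problem, which requires a conic solver. The function names and the small constant `eps`, added to avoid division by zero, are assumptions made for illustration.

```python
import numpy as np

def forward_differences(u):
    """Equation (8): forward differences, set to zero on the last row and column."""
    u = np.asarray(u, dtype=float)
    dx = np.zeros_like(u)
    dy = np.zeros_like(u)
    dx[:-1, :] = u[1:, :] - u[:-1, :]   # (d+_x u)_{i,j} = u_{i+1,j} - u_{i,j}
    dy[:, :-1] = u[:, 1:] - u[:, :-1]   # (d+_y u)_{i,j} = u_{i,j+1} - u_{i,j}
    return dx, dy

def discrete_total_variation(u):
    """Equation (9): sum over all pixels of the gradient magnitude."""
    dx, dy = forward_differences(u)
    return float(np.sqrt(dx ** 2 + dy ** 2).sum())

def quotient_image(f, u, eps=1e-6):
    """Equation (11): f' = f / u, with eps (an assumption) guarding against u = 0."""
    f = np.asarray(f, dtype=float)
    u = np.asarray(u, dtype=float)
    return f / (u + eps)
```

A constant image has zero total variation, while an image with a single horizontal step of height 1 across n columns has total variation n, which matches the intuition that TV measures the total amount of edge content.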

2.3 The Gabor Transform

Now, the DTV face image is emphasized using the Gabor transform. Let $\vec{x} = (x_{1}, x_{2})$ and $\vec{ξ} = (ξ_{1}, ξ_{2})$ be the space coordinates of an image $f (\vec{x})$ and the nearby $f (\vec{ξ})$ , respectively. The Gabor filter can be written as [5, 6]:

W_{μ, v} (\vec{ξ}, \vec{x}) = \frac{∥ {\vec{k}}_{μ, v} ∥^{2}}{σ^{2}} \cdot \exp \left[ - \frac{∥ {\vec{k}}_{μ, v} ∥^{2} ∥ \vec{ξ} - \vec{x} ∥^{2}}{2 σ^{2}} \right] \cdot \left\{ \exp [i {\vec{k}}_{μ, v} \cdot z] - \exp [- σ^{2} ∕ 2] \right\}

(12)

where σ governs the spatial extent and bandwidth of the Gaussian function, μ and v control the orientation and scale of the Gabor kernel, and $z = (ξ_{1} - x_{1}, ξ_{2} - x_{2})$ . The wave vector ${\vec{k}}_{μ, v}$ is defined as ${\vec{k}}_{μ, v} = k_{v} \cdot \exp [i ϕ_{μ}]$ , where $k_{v} = k_{\max} ∕ Q^{v}$ , $ϕ_{μ} = π μ ∕ 8$ , $k_{\max}$ is the maximum frequency and Q controls the spacing between kernels in the frequency domain. In this paper, $σ = 2 π$ , $k_{\max} = π ∕ 2$ and $Q = \sqrt{2}$ . The Gabor transform is $O_{μ, v} (\vec{x}) = m_{μ, v} (\vec{x}) \cdot \exp [i ϕ_{μ, v} (\vec{x})]$ , and its magnitudes are combined as:

Y (\vec{x}) = \sum_{v} \sum_{μ} m_{μ, v} (\vec{x}) \cdot ω_{μ, v}^{- 1}

(13)

where $ω_{μ, v} = \sum W_{μ, v} (\vec{ξ}, \vec{x})$ is the total magnitude of the Gabor wavelet at $μ \in \{1, \dots, 8\}$ and $v \in \{0, \dots, 2\}$ .
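A minimal NumPy sketch of the Gabor stage is given below. It uses the common Gabor wavelet form with squared norms in the Gaussian envelope, takes $ω_{μ, v}$ as the sum of the kernel's absolute values, and filters by zero-padded FFT convolution; these three choices, the 31 × 31 kernel size and the function names are assumptions made for illustration. The parameter values $σ = 2 π$ , $k_{\max} = π ∕ 2$ and $Q = \sqrt{2}$ follow the text.

```python
import numpy as np

def gabor_kernel(mu, nu, size=31, sigma=2 * np.pi, k_max=np.pi / 2, Q=np.sqrt(2)):
    """Complex Gabor kernel at orientation index mu and scale index nu."""
    k = k_max / Q ** nu                      # k_v = k_max / Q^v
    phi = np.pi * mu / 8                     # phi_mu = pi * mu / 8
    kx, ky = k * np.cos(phi), k * np.sin(phi)
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    envelope = (k ** 2 / sigma ** 2) * np.exp(-(k ** 2) * (x ** 2 + y ** 2) / (2 * sigma ** 2))
    carrier = np.exp(1j * (kx * x + ky * y)) - np.exp(-sigma ** 2 / 2)  # DC-compensated wave
    return envelope * carrier

def combined_magnitude(image, mus=range(1, 9), nus=range(0, 3), size=31):
    """Equation (13): sum of Gabor magnitudes, each divided by its omega."""
    image = np.asarray(image, dtype=float)
    H, W = image.shape
    shape = (H + size - 1, W + size - 1)     # zero padding gives a linear convolution
    F = np.fft.fft2(image, s=shape)
    half = size // 2
    Y = np.zeros((H, W))
    for nu in nus:
        for mu in mus:
            kernel = gabor_kernel(mu, nu, size)
            response = np.fft.ifft2(F * np.fft.fft2(kernel, s=shape))
            response = response[half:half + H, half:half + W]   # crop to the input size
            omega = np.abs(kernel).sum()     # assumed reading of "total magnitude"
            Y += np.abs(response) / omega
    return Y
```

With this normalization each of the 24 filters contributes at most the maximum absolute image value, so the combined map Y stays within a predictable range regardless of scale.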

2.4 A Sub-Micro Pattern Encoding Operator

In general, a 2-D isotropic Gaussian function is defined as $G_{σ} (\vec{ξ}, \vec{x}) = \frac{1}{σ \sqrt{2 π}} \exp [- ∥ \vec{ξ} - \vec{x} ∥^{2} ∕ 2 σ^{2}]$ , where σ is the standard deviation of the associated probability distribution. Formally, a family of difference of offset Gaussians (DooG) functions along different orientations θ can be generated as:

\mathrm{DooG}_{σ, d, θ} (\vec{ξ}, \vec{x}) = G_{σ} ({\vec{ξ}}^{'}, {\vec{x}}^{'}) - G_{σ} ((ξ_{1}^{'}, ξ_{2}^{'} + d), {\vec{x}}^{'})

(14)

where ξ'₁ = ξ₁ cos θ + ξ₂ sin θ and ξ'₂ = −ξ₁ sin θ + ξ₂ cos θ, with $\vec{x}'$ defined similarly. It should be noted that DooG is very similar to the first derivative of the Gaussian function (GD) in the sense that both measure the difference between a positive and a negative area in a local vicinity. GD measures local changes such as edges, i.e. its magnitude is high when GD lies on an edge. In DooG, however, the distance between the positive and negative areas can be adjusted by varying the parameter d. Hence, DooG can measure differences between areas that are farther apart, such as the differences between two objects or textures. Therefore, GD may be appropriate, for example, for edge detection, while DooG is better suited to measuring texture differences. Let (σ, d, θ) be the parameter triple of the texture descriptor. The DooG coding function is defined by:

ψ_{σ, d, θ} (\vec{x}) = \int_{- \infty}^{\infty} \int_{- \infty}^{\infty} Y (\vec{ξ}) \, \mathrm{DooG}_{σ, d, θ} (\vec{ξ}, \vec{x}) \, d \vec{ξ}

(15)

ψ measures the texture variations around the centre positive Gaussian and gives the differences as positive or negative values. Therefore, we define the ζ function as:

ζ (z) = \begin{cases} 1, & z \geq 0 \\ 0, & z < 0. \end{cases}

(16)

ζ encodes the sign of the differences as 0 or 1. Consequently, the texture coding for each pixel can be written as:

C (\vec{x}) = \{ c_{1}, c_{2}, \dots, c_{n} \}

(17)

with

c_{i} = ζ (ψ_{σ, d, θ_{i}} (\vec{x})),

(18)

where θ_i is a specified orientation of the DooG, e.g., θ_i = 0°, 15°,…,345° for θ = 15°, and $n = \frac{360}{θ}$ . It is known that a pixel in an image can be encoded as a micro-pattern representation, e.g., Daugman's IrisCode [6] and LBP [9]. We propose here a rotation invariant encoding method based on a sub-micro-pattern analysis. We encode the bits from c₁ to c_n, where n is the number of texture coding bits. The sub-micro-pattern encoding operator can be formulated as:

\begin{aligned} Ξ = & \sum_{i = 1}^{n} \left[ (α c_{i} - (c_{i} \cdot c_{i - 1} \cdot c_{i + 1})) + ∣ c_{i} - c_{i + 1} ∣ \cdot ω_{1} + ∣ c_{i} - (c_{i + \frac{n}{2} - 1} \cdot c_{i + \frac{n}{2} + 1}) ∣ \cdot ω_{2} \right] \\ & + \sum_{i = 1}^{n ∕ 2} ∣ c_{i} - c_{i + \frac{n}{2}} ∣ \cdot ω_{3}, \end{aligned}

(19)

where α and ω_i are user-defined constants that determine the number of unique values, $| c_{i} - c_{i + 1} |$ is the successive absolute difference operator, $| c_{i} - (c_{i + \frac{n}{2} - 1} \cdot c_{i + \frac{n}{2} + 1}) |$ is the symmetric Y-structural absolute difference operator and $| c_{i} - c_{i + \frac{n}{2}} |$ is the opposite absolute difference operator. We call this the sub-micro-pattern (SMP) analysis. The term $c_{i} - (c_{i} \cdot c_{i - 1} \cdot c_{i + 1})$ in equation (19) is the difference between the current bit $c_{i}$ and the product of $c_{i}$ with its two neighbours $c_{i - 1}$ and $c_{i + 1}$ ; we call this the self derivative. Let $χ = α c_{i} - (c_{i} \cdot c_{i - 1} \cdot c_{i + 1})$ be the self-derivative encoding operator. Its properties are as follows: if $c_{i} = 0$ then χ = 0; if $c_{i} = c_{i - 1} = c_{i + 1} = 1$ then χ = α − 1; and if $c_{i} = 1$ and $c_{i - 1}$ or $c_{i + 1}$ is zero then χ = α. The operator $| c_{i} - c_{i + 1} |$ is the successive difference used to measure the variations in the micro-features. These variations carry useful information, so we use them to measure the pattern alternation. Our proposed method is rotation invariant without performing the circular bitwise right shift on the n bits as the LBP does [9].
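The encoding in equations (14) and (19) can be sketched as follows. The sketch assumes that the bit indices wrap around cyclically (so that $c_{0} = c_{n}$ and $c_{n + 1} = c_{1}$ ), that the ω₂ term is summed over all i, and that the DooG kernel is centred at the origin on a 15 × 15 grid; the function names and the default values of α, ω₁, ω₂ and ω₃ are likewise assumptions for illustration, not values taken from the paper.

```python
import numpy as np

def doog_kernel(sigma, d, theta_deg, size=15):
    """Equation (14): Gaussian minus a copy offset by d along the rotated second axis."""
    half = size // 2
    x2, x1 = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    t = np.deg2rad(theta_deg)
    r1 = x1 * np.cos(t) + x2 * np.sin(t)       # rotated first coordinate
    r2 = -x1 * np.sin(t) + x2 * np.cos(t)      # rotated second coordinate

    def gauss(a, b):
        return np.exp(-(a ** 2 + b ** 2) / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

    return gauss(r1, r2) - gauss(r1, r2 + d)

def smp_value(bits, alpha=2.0, w1=1.0, w2=1.0, w3=1.0):
    """Equation (19) for one pixel, with cyclic indexing of the n code bits."""
    c = np.asarray(bits, dtype=int)
    n = c.size
    if n % 2 != 0:
        raise ValueError("the number of bits n must be even")
    prev, nxt = np.roll(c, 1), np.roll(c, -1)            # c_{i-1}, c_{i+1}
    self_derivative = alpha * c - c * prev * nxt
    successive = np.abs(c - nxt)
    y_left = np.roll(c, -(n // 2 - 1))                   # c_{i + n/2 - 1}
    y_right = np.roll(c, -(n // 2 + 1))                  # c_{i + n/2 + 1}
    y_structural = np.abs(c - y_left * y_right)
    opposite = np.abs(c[: n // 2] - c[n // 2:])          # pairs (i, i + n/2)
    total = (self_derivative + w1 * successive + w2 * y_structural).sum()
    return float(total + w3 * opposite.sum())
```

Under the cyclic-index assumption every term is a sum over all positions or all opposite pairs, so cyclically shifting the bit string leaves the value unchanged, which is the rotation invariance claimed in the text.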

3. The Feature Transform

In this section, we propose a new 2-D global image descriptor which describes the characteristics of the SMP face image. As can be seen in Fig. 1 (a), the SMP image has more distinctive features. We therefore need a method for depicting the structure of the SMP face image, for which a 1-D histogram is not appropriate. We construct the global histogram for each image by accumulating the sign of the differences between SMP values. Let $SMP (\vec{x})$ be the SMP image at $\vec{x}$ ; our 2-D discriminated histogram is described in Algorithm 1.

Algorithm 1: The construction of the histogram $H (n, v_{c} (\vec{x}))$

1. Initialize the histogram H to zero.

2. For each pixel $SMP (\vec{x})$ , do the following computation:

H (n, v_{c} (\vec{x})) = H (n, v_{c} (\vec{x})) + κ (v),

(20)

with

κ (v) = \begin{cases} 1, & v > 0; \\ 0, & v = 0; \\ - 1, & v < 0, \end{cases}

where $v = v_{n}^{r} (\vec{x}) - v_{c} (\vec{x})$ , $n = 1, \dots, N_{r}$ and $r \in R = \{ r_{1}, \dots, r_{m} \}$ . Here $v_{c} (\vec{x})$ is the SMP value at the central pixel and $v_{n}^{r} (\vec{x})$ is the SMP value of the n-th sample on the ring of radius r around it. For simplicity, let us assume that the central coordinate is (0,0); the coordinates of the samples around the centre are then (x,y) = (r sin(2πn/N_r), r cos(2πn/N_r)), r ∈ R.

3. Repeat step 2 for all pixels $SMP (\vec{x})$ .

In this paper, N_r = 40 is fixed for all rings r ∈ R and R = {5,11,15,20,25} is used for all experiments. $H (n, v_{c} (\vec{x}))$ is a 2D histogram which characterizes the SMP image. Because the SMP value is robust against the illumination variations of the face images, we use it as the index of the histogram. This 2D histogram is called the Discriminated Accumulative Feature Transform (DAFT). Please note that, before constructing the DAFT, the SMP image is normalized so that its values lie in the range [0,1]. Fig. 1 demonstrates the construction of the DAFT descriptor with 5 rings. The 2D histogram H is rich in discriminative features, which are transformed from the 2D image to the 2D feature space. Hence, classification becomes simple and efficient, since the dimensions of the histogram H are fixed irrespective of the resolution of the face images.
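Algorithm 1 can be sketched in NumPy as below. Several details are not fixed by the text and are assumptions here: the normalized SMP values are quantized into `n_bins` equal-width bins to index the second histogram axis, all rings accumulate into the same histogram, ring samples are rounded to the nearest pixel, the x offset is applied along image rows, and samples falling outside the image are skipped. The function name and the value `n_bins=16` are hypothetical.

```python
import numpy as np

def daft_histogram(smp, rings=(5, 11, 15, 20, 25), n_r=40, n_bins=16):
    """Accumulate kappa(v) into H(n, v_c) for every pixel, ring and angular sample."""
    smp = np.asarray(smp, dtype=float)
    span = smp.max() - smp.min()
    # Normalize the SMP image to [0, 1] before building the histogram.
    norm = (smp - smp.min()) / span if span > 0 else np.zeros_like(smp)
    bins = np.minimum((norm * n_bins).astype(int), n_bins - 1)   # bin index of v_c
    H = np.zeros((n_r, n_bins))
    rows, cols = smp.shape
    ii, jj = np.mgrid[0:rows, 0:cols]
    for r in rings:
        for n in range(1, n_r + 1):
            angle = 2 * np.pi * n / n_r
            di = int(round(r * np.sin(angle)))    # x offset, taken along rows (assumption)
            dj = int(round(r * np.cos(angle)))    # y offset, taken along columns
            ni, nj = ii + di, jj + dj
            valid = (ni >= 0) & (ni < rows) & (nj >= 0) & (nj < cols)
            v = norm[ni[valid], nj[valid]] - norm[ii[valid], jj[valid]]
            # np.sign implements kappa(v); add.at handles repeated bin indices.
            np.add.at(H[n - 1], bins[ii[valid], jj[valid]], np.sign(v))
    return H
```

The output shape depends only on `n_r` and `n_bins`, not on the image size, which reflects the fixed-dimension property noted above.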

Figure 1.

The construction of the histogram DAFT descriptor.

Having constructed the global histogram H, we compare two histograms using the χ² test statistic:

R_{i j} = \frac{1}{2} \sum_{a} \sum_{b} \frac{{[H_{i} (a, b) - H_{j} (a, b)]}^{2}}{H_{i} (a, b) + H_{j} (a, b)}

(21)

where H_i and H_j represent the DAFT histograms.
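Equation (21) translates directly into a few lines of NumPy. The text does not say how bins with a zero denominator are handled; skipping them, as done below, is an assumption, as is the function name.

```python
import numpy as np

def chi_square_distance(h_i, h_j):
    """Equation (21): half the sum of squared differences over sums, bin by bin."""
    h_i = np.asarray(h_i, dtype=float)
    h_j = np.asarray(h_j, dtype=float)
    numerator = (h_i - h_j) ** 2
    denominator = h_i + h_j
    mask = denominator != 0          # assumption: bins with a zero denominator are skipped
    return 0.5 * float(np.sum(numerator[mask] / denominator[mask]))
```

A probe can then be assigned the identity of the gallery histogram with the smallest distance; identical histograms give a distance of zero.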

4. Experimental Results

To assess face recognition performance, we evaluated our approach on the Yale face database B (YaleB) [7] and the AR face database (ARDB) [8]. Between them, the two databases contain variations in illumination, facial expression and occlusion. YaleB [7] contains 10 people viewed under 64 different illumination conditions. It has been used as a de facto standard for the evaluation of face recognition under variable lighting. For this database, we mainly dealt with the illumination problem. The images are divided into five subsets according to the illumination directions: Subset 1 (0° to 12°), Subset 2 (13° to 25°), Subset 3 (26° to 50°), Subset 4 (51° to 77°) and Subset 5 (above 78°). Fig. 2 shows the results of face normalization on YaleB under variable illumination. The first row shows the original images, ranging from neutral to extreme light sources, and the second row shows the corresponding normalization results. With the DTV, the illumination variations are suppressed while the facial structures are maintained. In addition, the facial components (eyes, eyebrows, nose and mouth) are more prominent than the cheeks and forehead, and the facial structures remain the same across all illumination conditions.

Figure 2.

Example results of our face normalization method on YaleB. The columns give images from subsets 1 to 5, respectively. First row: original images; second row: DTV images.

Another benefit of the proposed approach is demonstrated in Fig. 3. The original and normalized versions of the face images are shown in Figs. 3 (a) and (b). The LBP face image is shown in Fig. 3 (c). The discriminative features of the SMP and DAFT are demonstrated in Figs. 3 (d) and (e). The SMP and DAFT provide rich features that describe the facial images by their discriminative characteristics. In addition, the SMP images shown in Fig. 3 (d) can serve as ‘faceprints’ which can help us recognize human faces better. We believe that the SMP image gives a more prominent representation than the LBP one.

Figure 3.

Examples of our descriptors. (a) Original images (b) DTV images (c) LBP images with DTV (d) SMP images with DTV and the Gabor filter and (e) the histogram DAFT.

4.1 Yale Face Database B

A comparison of different methods on YaleB is shown in Table 1. In this experiment, the neutral light source images were used as the gallery and all other images in the subsets were used as the probes. The most difficult subsets of YaleB are subsets 4 and 5. On these two subsets, we achieved the best performance (99.67% and 98.23%), followed by LTV (98.2% and 97.9%). This stems from the fact that our proposed method suppresses lighting variations while enhancing the facial structures in the face images.

Table 1.

Recognition performance (%) on YaleB using subsets 1 to 5 as probes.

Methods/Subsets	1	2	3	4	5
DCT	83.3	82.4	81.9	73.8	61.5
SQI [11]	95.2	94.9	88.4	80.5	74.3
LBP [9]	96.1	95.3	81.2	66.5	49.2
LTV [4]	100	100	99.69	98.2	97.9
Our	100	100	100	99.67	98.23

4.2 AR Face Database

For ARDB, the first three images of each subject were used as the training set while the rest were used as the test set. To demonstrate the additional usefulness of the proposed illumination normalization process, we provide the results of face normalization on the ARDB in Fig. 4. The images in the first row are the original images under different conditions, e.g., changes of expression, screaming, left or right light on, wearing a scarf and wearing sunglasses. In the second row, the images were normalized using the LTV technique [4]. The results of our proposed method are shown in the third row. Our proposed method normalizes the illumination variations while keeping the facial components more prominent. This enhancement of the facial features allows more accurate face recognition. In order to suppress the illumination variations and enhance the facial components, the parameters for DTV were set as shown in Table 2. As mentioned in subsection 2.1, the first component of the DCT is proportional to the average value of the image. Hence, β₀ was used to normalize the overall illumination of the face images, with β₀ > 1 making the image darker and β₀ < 1 making it brighter. The parameters β₁ to β₃ were used to enhance the facial components so that they are more prominent than the other regions.

Figure 4.

Face normalization methods. First row: original images; second row: the LTV images [4]; third row: the DTV images.

Table 2.

The parameter settings for DCT and TV. Note: once β₁ is set, β₂ = β₁ + 0.1 and β₃ = β₂ + 0.1.

Methods	Parameters	Values
DCT	β₀	0.85–1.3
DCT	β₁	0.5–0.7
TV	λ	0.2

Table 3 compares the different approaches for face recognition in terms of recognition rate. To evaluate the face recognition results, we counted the correct recognitions that fell within ranks 1, 5, 10 and 15. At rank 1, the proposed approach achieved the best performance (73.02%), followed by the LTV (61.61%) and LBP methods (57.8%). Within rank 15, our approach still achieved the best performance with 95.67%, while the recognition rate of LTV was 93.22% and that of LBP was 68.9%. The proposed method therefore outperforms the compared methods at every rank, even under the extreme conditions of this database.

Table 3.

Comparison of the face recognition rate (%) for different approaches tested on the AR face database.

methods/ranks	1	5	10	15
DCT	39.6	46.8	53.3	59.2
SQI [11]	52.2	56.3	58.2	62.1
LBP [9]	57.8	63.3	66.4	68.9
LTV [4]	61.61	80.64	89.35	93.22
Our	73.02	84.57	92.33	95.67
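The rank-based rates in Table 3 can be computed from a probe-by-gallery distance matrix as sketched below. The sketch assumes that a probe counts as correctly recognized within rank k when a gallery entry with the same identity appears among its k nearest gallery entries; the function name and the use of integer identity labels are assumptions for illustration.

```python
import numpy as np

def rank_k_rates(dist, probe_labels, gallery_labels, ranks=(1, 5, 10, 15)):
    """Percentage of probes whose first correct match lies within each rank."""
    dist = np.asarray(dist, dtype=float)
    probe_labels = np.asarray(probe_labels)
    gallery_labels = np.asarray(gallery_labels)
    order = np.argsort(dist, axis=1)                 # gallery indices, nearest first
    hits = gallery_labels[order] == probe_labels[:, None]
    # Position of the first correct gallery entry; the gallery size if there is none.
    first_hit = np.where(hits.any(axis=1), hits.argmax(axis=1), dist.shape[1])
    return {k: 100.0 * float(np.mean(first_hit < k)) for k in ranks}
```

The rates are non-decreasing in k by construction, consistent with the rows of the table.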

5. Conclusions

We have presented a new framework for face recognition under varying illumination based on DTV, SMP and DAFT. The illumination variations were first suppressed using the DCT with TV. The facial features were then amplified by the Gabor filter followed by our proposed SMP encoding. The DAFT method was used to transform the 2D image into the 2D feature space. Finally, the χ² test statistic was used for histogram matching. Our system was verified with experiments on the ARDB and YaleB databases, where it achieved the best recognition rates among the compared methods: at least 98.23% on every YaleB subset and 73.02% at rank 1 on ARDB.

References

1. Ahonen, Hadid and Pietikainen. Face Description with Local Binary Patterns: Application to Face Recognition. IEEE Trans. PAMI, 28(12):2037–2041, 2006.

2. Zhang, Shan, Chen and Gao. Histogram of Gabor Phase Patterns (HGPP): A Novel Object Representation Approach for Face Recognition. IEEE Trans. IP, 16(1):57–68, 2007.

3. Belhumeur P.N., Hespanha J.P. and Kriegman D.J. Eigenfaces vs. Fisherfaces: Recognition using Class Specific Linear Projection. IEEE Trans. PAMI, 19(7):711–720, 1997.

4. Chen, Yin, Zhou, Comaniciu and Huang. Total Variation Models for Variable Lighting Face Recognition. IEEE Trans. PAMI, 28(9):1519–1524, 2006.

5. Daugman J.G. Complete Discrete 2-D Gabor Transforms by Neural Networks for Image Analysis and Compression. IEEE Trans. ASSP, 36(7):1169–1179, 1988.

6. Daugman J.G. High confidence visual recognition of persons by a test of statistical independence. IEEE Trans. PAMI, 15(11):1148–1161, 1993.

7. Georghiades A.S., Belhumeur P.N. and Kriegman D.J. From Few to Many: Illumination Cone Models for Face Recognition under Variable Lighting and Pose. IEEE Trans. PAMI, 23(6):643–660, 2001.

8. Martinez A.M. and Benavente. The AR Face Database. CVC Technical Report #24, 1998.

9. Ojala, Pietikainen and Maenpaa. Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns. IEEE Trans. PAMI, 24(7):971–987, 2002.

10. Turk and Pentland. Eigenfaces for Recognition. J. of Cog. Neuro., 3(1):71–86, 1991.

11. Wang, Li and Wang. Face Recognition under Varying Lighting Conditions using Self Quotient Image. Proc. of AFGR, pages 819–824, 2004.

12. Zhang, Shan, Gao, Chen and Zhang. Local Gabor binary pattern histogram sequence (LGBPHS): A novel non-statistical model for face representation and recognition. ICCV, pages 786–791, 2008.