Face occlusion removal for face recognition using the related face by structural similarity index measure and principal component analysis

Abstract

Facial occlusions such as sunglasses, masks, and caps severely complicate the reconstruction of the partially occluded regions of a facial image. This paper proposes a novel hybrid machine learning approach for occlusion removal based on the Structural Similarity Index Measure (SSIM) and Principal Component Analysis (PCA), called SSIM_PCA. The proposed system comprises two stages. In the first stage, a Face Similarity Matrix (FSM) guided by SSIM is generated to provide the information needed to recover the lost regions of the face image. The FSM yields Related Face (RF) images similar to the probe image. In the second stage, these RF images are used as input data to generate eigenspaces using PCA and to reconstruct the occluded face region, exploiting the relationship between the occluded region and the related face images, which contain relevant data for recovering the occluded area. Experimental results on five standard datasets, viz. Caspeal-R1, IMFDB, FEI, RMFRD, and FGNET, show that the proposed method works well under illumination changes and occlusion of facial images.

Keywords

Face recognition, SSIM, eigenspaces, PCA, FSM, related face

1 Introduction

Within computer vision, face recognition has become a very active research area, especially when dealing with occlusion. Occlusion can take many forms, including eyeglasses, sunglasses, masks, and many others. With the world combating COVID-19, people wear masks at all times and in all places. As a result, their faces are obscured, making even familiar faces difficult to recognize. Face recognition is now used in real time in a variety of settings, including airports, businesses, and mobile phone applications for locking and unlocking devices. Face occlusion removal is essential in these cases to identify the subjects behind the masks for fast and unobtrusive services. Many authors have proposed algorithms for reconstructing and restoring occluded regions, including PCA, the Fisherface algorithm, Linear Discriminant Analysis, and hidden Markov models. However, these approaches do not perform reasonably well when tested with arbitrary data that differ substantially from the training data.

To overcome this limitation, this research harnesses the structural information of facial images for occlusion detection and reconstruction. To locate the appropriate person, a face similarity calculation is used. The term “similarity” refers to the degree of structural similarity between two images. Similarity measures such as Euclidean distance, Minkowski distances, and cosine-based distances are generally used to identify a specific person. SSIM is a similarity measure that is also used in face recognition. The primary advantage of SSIM is that it is a perception-based measure that captures structural information among pixels, particularly spatially close ones, whereas PSNR and MSE measure absolute errors.

Briefly, the contributions of the proposed work are as follows.

FSM is proposed as an effective method for reconstructing occluded faces. The FSM (F_i) provides additional information to fill occluded regions by identifying faces in the gallery face dataset that are similar to the probe image, using the Similarity Computation Matrix (SCM). These similar faces are then used in the PCA-based reconstruction process for face recognition.

Instead of a massive set of images, only similar (related) face images are used to reconstruct the probe image, which keeps the computational time small.

The rest of this paper is organized as follows. Section 2 gives a brief account of related work, and Section 3 describes the proposed work and explains the restoration of occluded regions using SSIM_PCA. Image classification is presented in Section 4, experimental results with five datasets (CAS-PEAL-R1, IMFDB, FEI, RMFRD and FGNET) are discussed in Section 5, and the paper is concluded in Section 6.

2 Related work

Several algorithms for the de-occlusion of face images have been proposed. In [1], a face in-painting network is used as the foundation for Weighted Face Similarity (WFS-Net) to produce an improved restoration. The author of [2] used effective pixels to detect and restore occluded pixels when reconstructing occluded face images. Sonu Agrawala et al. [3] used local texture descriptors to reduce the PCA dimension and age function model. Markus Storer et al. [4] proposed a two-stage approach for detecting outliers: in the first stage, a large number of small subspaces are created to detect outliers, while in the second stage robust least-squares fitting is used to detect outliers. In [5], the authors employ a hypothesize-and-test paradigm to determine the coefficients of eigenimages from a subset of image points. The competing hypotheses are subjected to the Minimum Description Length principle to eliminate outliers in occlusion detection, and multiple eigenimage classes are employed. In [6], the authors address the single-sample-per-person problem, handling variations in face images and using similarity measures for recognition; a combination of Traditional and Deep Learning (TDL) techniques is used. In [7], an inverse Euclidean distance similarity calculation is used for facial recognition with geometrical approximated PCA (gaPCA). Face recognition based on perceptual hash, which is used for feature extraction and preprocessing, was proposed in [8]. The authors employ the Discrete Wavelet Transform (DWT) and a graph-oriented technique known as the Quintet Triple Binary Pattern (QTBP). This approach uses K-Nearest Neighbors (KNN) and Support Vector Machine (SVM) for face classification. In [9], the face is described using a Local Gradient Number Pattern (LGNP), and the local gradient information and grey position are extracted using the Sobel operator.
Further, Fuzzy Convex-Concave Partition (FCCP) was employed to capture gray transitions in the local neighborhood and to extract additional local and global information for face recognition. In [10], the authors combine PCA, Local Binary Pattern, and pyramid pooling in a three-layer system with one convolutional layer, one nonlinear layer, and one pooling layer, which implies that high-end hardware is required for face recognition. In [11], a feature-based similarity measure that combines the best features of SSIM and the Feature Similarity Index Measure (FSIM) is used to overcome the limitations of feature and structural similarity measures in detecting similar and dissimilar facial images. The author of [12] makes three contributions: first, the application of cosine similarity following discriminant analysis is discussed; second, the inadequacy problem of cosine similarity is examined; and third, a new similarity measure called Face Recognition Grand Challenge (FRGC) is introduced to improve pattern recognition by integrating the absolute value of the angular measure and the lp norm. Image quality assessment based on structural information was first proposed by Zhou Wang et al. [13]. Under this philosophy, image degradations are viewed as perceived changes in structural information. As a result, SSIM measures image similarity effectively by considering aspects such as object structure, luminance, and contrast. Jim Nilsson [14] explained the mathematical concepts underlying the SSIM operating principle. The authors examine the mathematical factors of SSIM and demonstrate that it can produce unexpected, sometimes undefined, and nonintuitive results in both synthetic and realistic use cases. SSIM evaluates image quality using luminance, contrast, and structure.
In [15], the discrete wavelet transform combined with principal component analysis and singular value decomposition (DWT-PCA/SVD) was evaluated as a preprocessing mechanism on a reconstructed face image database. The half-face images were reconstructed using the bilateral symmetry of frontal faces.

For modular face recognition systems, Mehmet Koc [16] presented a method to detect and use the non-occluded areas of the face image using three coefficients: (i) image entropy, (ii) image correlation, and (iii) root-mean-square error.

By combining a cropping-based strategy with the Convolutional Block Attention Module (CBAM), the author of [17] proposes a new method for masked face identification. Two special application scenarios are considered: training on unmasked faces to recognize masked faces, and training on masked faces to recognize unmasked faces.

Meixiang Zhao et al. [18] proposed a novel Two-Dimensional Principal Component Analysis (2DPCA) based on a new ridge regression model. R2DPCA (Rigid 2DPCA) produces a weighting vector based on label information and maximises a relaxed threshold using an optimal algorithm to obtain the key features.

The feature-based method for 2D face images was presented by the authors [19]. For feature extraction, Speeded Up Robust Features (SURF) and Scale Invariant Feature Transform (SIFT) are used.

The author of [20] concentrates on facial occlusions and proposes an enhancement method that uses Principal Component Analysis with Singular Value Decomposition and the Fast Fourier Transform (FFT-PCA/SVD) as a preprocessing step for face recognition on face images with missingness and on an augmented face image database.

The author [21] of this study devised a look-up table-based method as well as a novel efficient restoration strategy. The benefits of spectrum independence of the reference image and pixel missing region were leveraged by the author. Using a reference image obtained from several sensors, significant heterogeneity zones were employed to recreate the pixel missing image.

Although considerable research has been done on face recognition based on occlusion detection, very few significant works are based on structural and feature similarity.

3 Proposed work

This section presents the implementation of the proposed occlusion detection system which is implemented in four phases. Generally, computer vision based problems require a broad dataset for training and testing machine learning models. However, large datasets demand sophisticated hardware requirements for effective computations, feature extraction, training and testing. Contrary to this, the proposed system is implemented with a small set of images and minimal hardware requirements demonstrating good performances with smaller computational times and lower error rates.

3.1 Pre-processing

In this research, for a given probe image, the images in the gallery face dataset are masked according to the occluded portion of a probe image. The masking process is described as below and the algorithm is given in Table 1.

Table 1
Face Masking Algorithm

Input	GF = [GF₁, GF₂, ⋯ , GF_i, ⋯ , GF_N]
Procedure	1. Locate the occluded region of the probe face image.
	2. Create a mask for the occluded region.
	FMM = CreateMask (e) (1)
	where e is the imellipse representation of the Region Of Interest object.
	3. Build a Masked Face (MF) dataset by applying FMM to each face in the gallery face dataset.
	MF_i = GF_i ⊗ FMM (2)
	where GF_i denotes the i^th face image of the Gallery Face (GF) dataset.
	4. FMM is the Face Masking Matrix and MF_i is the masked face of the corresponding i^th gallery face image. Finally, the Masked Face (MF) dataset is produced: MF = [MF₁, MF₂, ⋯ , MF_i, ⋯ , MF_N], where N is the total number of images.
Output	MF = [MF₁, MF₂, ⋯ , MF_i, ⋯ , MF_N]

Let GF_i be the d-dimensional vector of the i^th image, and let GF = [GF₁, GF₂, ⋯ , GF_i, ⋯ , GF_N] be the Gallery Face (GF) dataset of all N images. Table 1 shows the face masking algorithm, which uses the Face Masking Matrix (FMM) to produce the Masked Face (MF) dataset.

To compare the gallery and probe images, each gallery face image is multiplied element-wise by the FMM. The FMM is formed by detecting the occluded region of the probe image. In each masked gallery face of the MF dataset, the pixels corresponding to the occluded region have a value of zero. The FMM is very useful in similarity computation because only the known parts of the occluded probe image are compared with the corresponding regions of the gallery face image. The schematic of the proposed masking and Face Similarity Matrix (FSM) system is depicted in Fig. 1.
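The masking step of Table 1 can be sketched in a few lines of Python. This is only an illustrative sketch, assuming that each face is a flattened list of pixel values and that the FMM is a 0/1 list in which 0 marks the occluded region; the function names `apply_mask` and `build_masked_dataset` and the toy data are hypothetical and do not come from the original implementation.

```python
def apply_mask(face, fmm):
    """Element-wise product of one flattened face with the 0/1 mask, as in Equation (2)."""
    if len(face) != len(fmm):
        raise ValueError("face and mask must have the same number of pixels")
    return [pixel * keep for pixel, keep in zip(face, fmm)]


def build_masked_dataset(gallery, fmm):
    """Apply the same mask to every gallery face, giving MF = [MF_1, ..., MF_N]."""
    return [apply_mask(face, fmm) for face in gallery]


# Toy data (assumed): three faces of 4 pixels each; the last two pixels
# are taken to be occluded in the probe image, so the mask is 0 there.
gallery = [[10, 20, 30, 40], [50, 60, 70, 80], [5, 15, 25, 35]]
fmm = [1, 1, 0, 0]
masked = build_masked_dataset(gallery, fmm)
```

With this convention, the occluded pixels of every masked gallery face are zero, so a later similarity score depends only on the regions that are visible in the probe image.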

Fig. 1

Proposed pre-processing and SCM architecture.

3.2 Generation of similarity computation matrix

SSIM is used to compute the similarity of each image in the MF dataset and the probe image. SCM is calculated using the following equation:

$SCM (P_{f}, {MF}_{i}) = \frac{(2 μ_{P_{f}} μ_{{MF}_{i}} + C_{1}) (2 σ_{P_{f} {MF}_{i}} + C_{2})}{(μ_{P_{f}}^{2} + μ_{{MF}_{i}}^{2} + C_{1}) (σ_{P_{f}}^{2} + σ_{{MF}_{i}}^{2} + C_{2})}$ (3)

Here $μ_{P_{f}}$ and $μ_{{MF}_{i}}$ are the mean intensities of the probe image P_f and the masked face MF_i, $σ_{P_{f}}^{2}$ and $σ_{{MF}_{i}}^{2}$ are their variances, and $σ_{P_{f} {MF}_{i}}$ is the covariance of the two face images. C₁ and C₂ are constants that prevent division by zero. For each candidate image, an FSM is generated, which consists of the Related Face matrix for the corresponding face; SCMs are constructed for all images in the MF dataset.

${RF}_{i} = [R_{1}, R_{2}, \dots, R_{i}, \dots, R_{n}]$ (4)

Related Face (RF) images are chosen from the GF dataset using the FSM. Each FSM is of size four, that is, four related faces are selected for each probe image. These faces are used in the SSIM_PCA reconstruction process to reconstruct the probe image. Figure 2 depicts the use of SSIM to create Related Face images.
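The SCM computation and the choice of related faces can be illustrated with a minimal sketch. It assumes a single global SSIM over flattened images, a probe that has been masked with the same FMM as the gallery, and the commonly used constants C₁ = (0.01·255)² and C₂ = (0.03·255)²; the source does not state which constants or window were used. The function names `global_ssim` and `related_faces` are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def global_ssim(x, y, c1=6.5025, c2=58.5225):
    """Global SSIM of two equally sized flattened images, following Equation (3).

    c1 and c2 are assumed defaults for 8-bit images, not values from the paper.
    """
    mu_x, mu_y = mean(x), mean(y)
    var_x = mean([(a - mu_x) ** 2 for a in x])
    var_y = mean([(b - mu_y) ** 2 for b in y])
    cov_xy = mean([(a - mu_x) * (b - mu_y) for a, b in zip(x, y)])
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def related_faces(probe, masked_gallery, n=4):
    """Return the indices of the n masked gallery faces with the highest score."""
    scores = [global_ssim(probe, face) for face in masked_gallery]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:n], scores


# Toy data (assumed): the second gallery face equals the probe on the visible pixels.
probe = [10, 20, 0, 0]
masked_gallery = [[50, 60, 0, 0], [10, 20, 0, 0], [5, 15, 0, 0]]
top, scores = related_faces(probe, masked_gallery, n=2)
```

In this toy run the identical face receives a score of 1 and is ranked first; the remaining indices returned by `related_faces` would play the role of the other related faces R_i in Equation (4).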

Fig. 2

Using SSIM to identify related faces from the Gallery Face dataset.

3.3 Reconstruction of the occluded region using SSIM_PCA

PCA is a machine learning technique for reducing the dimensionality of image data. It is a well-known method for eigenface-based face recognition and feature extraction. When dealing with an occluded facial image, PCA has certain disadvantages that degrade performance on some images. In general, the principal components constructed from the images are less interpretable than the original features, and retaining only a small number of principal components results in a loss of fine details. To overcome these limitations, several authors have employed kernel PCA to capture non-linear relationships between the features.

The prime goal of the proposed system is to reconstruct occluded face images with fewer images and less processing time. The SCM is determined using SSIM, and the FSM (F_i) is produced from it. The FSM is then used to choose the Related Face (RF) images, which are used for reconstruction.

Initially, the eigenfaces are constructed from the selected RF images. The eigenfaces are represented as in Equation (5).

$EF = ({ef}_{1}, {ef}_{2}, \dots, {ef}_{i}, \dots, {ef}_{m}) = \begin{bmatrix} {ef}_{11} & \cdots & {ef}_{m 1} \\ \vdots & \ddots & \vdots \\ {ef}_{1 d} & \cdots & {ef}_{md} \end{bmatrix}$ (5)

where ef_i is the i^th eigenface, d is the dimension, and m is the number of eigenfaces, usually with m < N. Eigenfaces (eigenvectors) are used to solve the face recognition problem. In the high-dimensional vector space, the eigenvectors are computed from the covariance matrix. The eigenfaces serve as a basis for all images in the gallery face dataset, which reduces the dimensionality of the original face images. Classification can be accomplished by examining how faces are represented in this basis. Principal component scores are generated using the eigenfaces, and the reconstruction process is carried out.

In [2], the authors claim that eigenspace projection of the probe image yields a low principal component score. Instead of projecting the raw input image into eigenspace, as is common, this approach projects the normalized input image into eigenspace. The principal component (PC) score is computed as in Equation (6), where PC represents the principal component, W represents the weight matrix, and y is the normalized probe image.

$PC = \sum_{i = 1}^{m} {ef}_{i} W_{i} y_{i}$ (6)

The schematic of the reconstruction process and the corresponding algorithm are shown in Fig. 3 and Table 2 respectively. The pipeline of the reconstruction process depicting the intermediate images generated in each stage is shown in Fig. 4.

Fig. 3

Architecture of reconstruction process.

Table 2

SSIM_PCA algorithm for reconstruction of occluded face images

Steps for Reconstruction:

1. Create the eigenfaces ef_i.

2. Set the weight W to 1 at the start.

3. Compute the PC score for the image.

4. Using SSIM_PCA, obtain a reconstructed image with Equation (7).

$\hat{R} = μ_{RF} + {ef}_{i} \otimes PC$ (7)

where $\hat{R}$ stands for the reconstructed face image, $μ_{RF}$ denotes the mean of the RF images, and ⊗ denotes element-wise multiplication.

5. Revise the weight W by comparing the probe face image (P_f) with the reconstructed image, where θ is a threshold.

$W_{i} = \begin{cases} 1 & \text{if } | \hat{R} - P_{f} | < θ \\ 0 & \text{otherwise} \end{cases}$ (8)

6. Steps 2 to 5 should be repeated until the reconstructed images are generated.
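One pass of steps 3 to 5 can be sketched as follows. This is a hedged illustration rather than the authors' implementation: it assumes the eigenfaces are already available as orthonormal flattened vectors, reads Equation (6) as a per-pixel weighted projection, and uses the standard PCA back-projection (mean face plus score-weighted sum of eigenfaces) in place of the element-wise form of Equation (7). All function names, the toy basis, and the threshold value are hypothetical.

```python
def pc_scores(eigenfaces, weights, y):
    """Weighted projection in the spirit of Equation (6): one score per eigenface."""
    return [sum(e * w * v for e, w, v in zip(ef, weights, y)) for ef in eigenfaces]


def reconstruct(mean_face, eigenfaces, scores):
    """Mean face plus the score-weighted sum of eigenfaces (standard PCA back-projection)."""
    recon = list(mean_face)
    for ef, score in zip(eigenfaces, scores):
        for j, e in enumerate(ef):
            recon[j] += score * e
    return recon


def update_weights(recon, probe, theta):
    """Equation (8): weight 1 where the reconstruction is within theta of the probe."""
    return [1 if abs(r - p) < theta else 0 for r, p in zip(recon, probe)]


# Toy data (assumed): a 4-pixel face space with two orthonormal eigenfaces.
mean_face = [0, 0, 0, 0]
eigenfaces = [[1, 0, 0, 0], [0, 1, 0, 0]]
probe = [3, 4, 5, 6]

y = [p - m for p, m in zip(probe, mean_face)]   # centred probe image
weights = [1, 1, 1, 1]                          # step 2: all weights start at 1
scores = pc_scores(eigenfaces, weights, y)      # step 3
recon = reconstruct(mean_face, eigenfaces, scores)        # step 4
weights = update_weights(recon, probe, theta=1)           # step 5
```

In this toy case the last two pixels cannot be explained by the eigenfaces, so their weights drop to 0 and they would be excluded from the projection in the next iteration.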

Fig. 4

Reconstruction process pipeline.

3.4 Restoring occluded parts of a probe face image

SSIM_PCA reconstructs the entire probe face image and yields $\hat{R}$ as a flattened image. Using the reconstructed image $\hat{R}$, the occluded pixels of the probe image can be restored; the result is referred to as the reshaped image. The restoration rule is given in Equation (9),

$\bar{P} = \begin{cases} P_{f} & \text{if } W_{i} = 1 \\ \hat{R} & \text{otherwise} \end{cases}$ (9)

Thus $\bar{P}$ is obtained by restoring the occluded region of the probe image. If desired, in-painting can be used to improve the reshaped image $\bar{P}$ by reducing the blurring introduced by the reconstruction, as in Equation (10).

$P^{''} = FMM \otimes P_{f} + (1 - FMM) \otimes \hat{R}$ (10)
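The two restoration rules can be written per pixel as in the following sketch. It assumes flattened images, 0/1 weights W from Equation (8), and a 0/1 FMM in which 1 marks the visible region; the function names and the toy values are illustrative only.

```python
def restore_by_weights(probe, recon, weights):
    """Equation (9): keep the probe pixel where W_i = 1, else use the reconstruction."""
    return [p if w == 1 else r for p, r, w in zip(probe, recon, weights)]


def restore_by_mask(probe, recon, fmm):
    """Equation (10): blend probe and reconstruction with the 0/1 face mask FMM."""
    return [m * p + (1 - m) * r for p, r, m in zip(probe, recon, fmm)]


# Toy data (assumed): the last two probe pixels are occluded.
probe = [3, 4, 0, 0]
recon = [2, 5, 7, 8]
weights = [1, 1, 0, 0]
fmm = [1, 1, 0, 0]

restored_w = restore_by_weights(probe, recon, weights)
restored_m = restore_by_mask(probe, recon, fmm)
```

When the weights and the mask agree, as they do here, both rules keep the visible probe pixels and fill the occluded ones from the reconstruction.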

The algorithm for determining the accuracy of reconstruction is given in Table 3. It uses Euclidean distance to compare the reshaped image with the original image and then classifies it against the gallery images.

Table 3

The algorithm for determining accuracy employs Euclidean distance

1. Determine the Euclidean distance between the original gallery image and the reshaped image with Equations (11) and (12).

$d ({GF}_{i}, \bar{P}) = ∥ {GF}_{i} - \bar{P} ∥$ (11)

Q = repmat (Q, 1, size (X, 2)) (12)

2. Measure the Euclidean distance between all of the gallery images and the reshaped image with Equation (13), where Q denotes the reshaped image vector and X denotes the vectors of all gallery images.

$E_{dist} = \sqrt{\sum_{i = 1}^{N} (Q - X)^{2}}$ (13)

3. Find the minimum of E_dist.

4. The gallery image with the minimum distance is taken as the original image.

5. Evaluate the accuracy with Equation (14).

$ACC = \frac{\text{Correctly Predicted Images}}{\text{Number of Images}} \times 100$ (14)
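The matching and accuracy steps of Table 3 can be sketched in plain Python. The sketch assumes flattened images and integer identity labels; the helper names `euclidean`, `nearest_gallery_index`, and `accuracy`, as well as the toy data, are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two flattened images, as in Equations (11) and (13)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_gallery_index(restored, gallery):
    """Steps 2 to 4: index of the gallery image with the minimum distance."""
    distances = [euclidean(restored, face) for face in gallery]
    best = min(range(len(distances)), key=lambda i: distances[i])
    return best, distances


def accuracy(predicted, truth):
    """Equation (14): percentage of images whose predicted identity is correct."""
    correct = sum(1 for p, t in zip(predicted, truth) if p == t)
    return correct / len(truth) * 100


# Toy data (assumed): three 2-pixel gallery images and one restored probe.
gallery = [[0, 0], [3, 4], [10, 10]]
best, distances = nearest_gallery_index([3, 5], gallery)
acc = accuracy([1, 0, 2, 2], [1, 0, 2, 1])
```

Here the restored image is closest to the second gallery image, and three of the four toy predictions are correct, giving an accuracy of 75 percent.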

4 Classification

After occlusion removal and reshaping, the reconstructed images are matched in one of two ways:

Image with Image matching and

Image with Class matching.

In Image with Image matching, the probe image is matched against a dataset whose images are subject to different kinds of occlusion, and the most similar image is identified. In Image with Class matching, the probe image is compared with the images within an occlusion class to find a matching image.

4.1 Image with Image matching

In the Image with Image matching, the reshaped image is compared with the set of all gallery images, and the image for which Euclidean distance is minimum is identified to be the closest matching image. This process is illustrated in Fig. 5.

Fig. 5

Caspeal-R1 dataset comparison with three different categories of eyeglass occlusion.

4.2 Image with Class matching

The first step in Image with Class matching is to determine the class to which the probe image belongs. After the class of the probe image has been determined, the reconstruction process begins, and the image with the highest SSIM similarity is taken as the corresponding original image. The schematic of the Image with Class matching method is depicted in Fig. 6.

Fig. 6

An example of an experimental flow of image classification using the IMFDB dataset’s Class Matching.

5 Experiments and discussion of the results

This section presents quantitative and visual experimental results with three datasets viz. Caspeal-R1, IMFDB, and FEI; additionally it compares the experimental flow on RMFRD and FGNET datasets and presents their interpretations for a clear understanding of the behavior of the proposed system.

The occluded face recognition process has two components. First, the occluded section of the probe image is used to mask the images in the gallery face dataset. FMM is particularly useful in similarity calculation because, after masking, only the known areas of the occluded probe image are compared with the corresponding regions of the gallery face images. By calculating SCM and FSM, a minimum number of similar images is selected for the probe image to recover the occluded region. Second, the occluded face is reconstructed using SSIM and PCA to restore the original face.

5.1 Experiments with data from the CASPEAL-R1 dataset

The Caspeal-R1 dataset [22] includes several frontal image variations: Normal, Aging, Accessory, Distance, Expression, Background, and Lighting. Normal images are used as gallery images, whereas Aging, Accessory, and Distance images are used as probe images. In addition, additive Gaussian white noise is applied to the Aging images to form a further probe set. The Aging category contains 66 images in total (51 male, 15 female); 33 randomly chosen images were used for training and the remaining 33 for testing. The same split is used for the additive Gaussian white noise probe set. The Distance (D2) category contains 294 images (161 male, 133 female), of which 50 percent are used for training and 50 percent for testing. For the Accessory variation, eyeglass images are selected, and experiments are carried out on each of three frame types. Category frame 1 (Full Rim Frame-Black Colored) contains 200 images (162 female, 38 male). Category frame 2 (Full Rim-Silver Colored) contains 300 images (160 female, 140 male). Category frame 3 (Rimless frame) contains 230 images (170 female, 60 male). For each eyeglass category, 50 percent of the images are used for training and 50 percent for testing. Figure 7 depicts some Caspeal-R1 images. Figure 8(a) shows experimental results on the Caspeal-R1 eyeglass category frame 1 with various existing methods, and Fig. 8(b) shows the experiment flow for additive Gaussian white noise on the Caspeal-R1 dataset.

Fig. 7

Caspeal-R1 dataset example images.

Fig. 8(a)

Comparison of state-of-the-art methods on the CAS-PEAL-R1 dataset: (a) PCA, (b) FW-PCA, and (c) the proposed method SSIM_PCA.

Fig. 8(b)

Experiment flow of additive Gaussian White noise of the Caspeal-R1 dataset.

5.2 Experiments with data from the IMFDB dataset

The Indian Movie Face database (IMFDB) [23] is a large, unrestricted face database consisting of 34512 images of 100 Indian actors culled from more than 100 videos. Each actor's or actress's face images were collected from a minimum of three different movies. IMFDB therefore contains many variations, such as occlusion, illumination, pose, and makeup. In the IMFDB dataset, the images of a single actor or actress vary in the following respects.

Age,

Expression,

Occlusion,

Pose and so on.

As a result, even a single actor has a large number of variations. Actor Amirkhan’s age, expression, occlusion, and pose are all shown in Fig. 9.

Fig. 9

Examples of Amirkhan’s images with different variations: occlusion by hand and eyeglass, expression, and pose.

Images with occlusion and pose variations of the same style are chosen for our experiments on these datasets. Fig. 10 compares the occlusion removal procedure with various state-of-the-art methods on the IMFDB and FEI datasets.

Because the dataset is built from movie clips, it is difficult to find neutral images of an actor or actress, so each actor or actress is treated as a class to be identified on the IMFDB dataset. IMFDB is an expression database in general. There are a total of 34,512 images of all of the actors (64 male actors and 36 female actresses). Experiments are performed on 30 of the 100 actors (classes), each with 10 images with various variations from different movies, so a total of 300 images are chosen at random from the actors' and actresses' various films. Each actor has a different hairstyle, makeup, voice, age, and pose in each film. The proposed work therefore aims to locate, among all gallery images, the image belonging to the class of the probe image's actor or actress, and to compare it with the reshaped image. At random, 150 images were used for training and 150 for testing.

5.3 Experiments with data from the FEI dataset

The FEI face database [24] is a face database of Brazilians. It has 14 images of each of 200 individuals, for a total of 2800 images. Frontal images from the FEI face dataset are chosen to test the proposed method, and the 200 neutral images are used in two different ways. In the first, 200 smiling images are used as the test set while the 200 neutral images are used as the training set. In the second, the 200 neutral images are artificially occluded in various forms of the same scale and then tested against the 200 neutral images. Figure 10 shows the results of these two experiments on the FEI dataset, together with the results on the IMFDB dataset, for various state-of-the-art methods and the proposed method. The artificially occluded face images from the FEI dataset are shown in Table 4. Figure 11 depicts some FEI dataset examples.

Fig. 10

Comparison of results on the IMFDB and FEI datasets using the proposed method SSIM_PCA.

Table 4

Different ways of artificially occluding parts of the facial region

Fig. 11

Example images from the FEI dataset.

5.4 Experiments with data from the RMFRD and FGNET dataset

RMFRD [25] includes 5000 masked faces from 525 people and 90,000 normal faces. It is a sizable dataset. Out of 525 people, the first ten were used, each with five images. So a total of 50 images were used, with 25 serving as training and the remaining 25 serving as testing at random. To begin, the proposed work seeks to identify a person’s specific class based on the input image. Second, based on the class identification, it generates the matched image for the input image. Figure 12 depicts the experimental results for the RMFRD and FGNET datasets. Figure 13 depicts a comparison of the proposed method with the CBAM and GSO methods.

Fig. 12

Shows the experimental results for the (a) RMFRD and (b) FGNET datasets.

Fig. 13

Experimental results on the RMFRD and FGNET datasets compared with the state-of-the-art CBAM and GSO methods.

Figure 13 depicts a comparison of various methods on different datasets. The occluded image was reconstructed for the input image, and their corresponding original image was generated, indicating that the proposed methods work well.

5.5 Experiments with data from the FGNET dataset

FG-NET [26] is made up of 1002 images of 82 people ranging in age from 0 to 69. It is a face ageing dataset with significant variations such as pose, illumination, and expression. Each person differs from the next based on their age. As a result, each individual is regarded as a class. First, the proposed work aims to identify the person’s specific class based on the input image. Second, it generates the matched image for the input image based on the class identification. Only 30 of the 82 people in the dataset are used, each with 5 images. So, in total, 150 images are used to put our proposed method to the test. Out of 150 images, 30 are used for testing and 50 are used for training at random.

5.6 Calculation of Root Mean Square Error (RMSE)

The Root Mean Square Error (RMSE) estimates the difference between the original image and the reconstructed image. It is the square root of the arithmetic mean of the squared differences, as in Equation (15).

$RMSE = \sqrt{\frac{1}{N} \sum_{i = 1}^{N} (\bar{P} - {GF}_{i})^{2}}$ (15)

where N is the number of images, $\bar{P}$ is the reconstructed image, and GF_i is the gallery image.
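As a small worked example of the RMSE computation, the following sketch takes the mean over the pixels of a single image pair; this is an assumption made for illustration, and averaging over N image pairs as in Equation (15) follows the same pattern. The function name `rmse` is hypothetical.

```python
import math


def rmse(reconstructed, original):
    """Square root of the mean squared difference between two flattened images."""
    if len(reconstructed) != len(original):
        raise ValueError("images must have the same number of pixels")
    squared = [(r - o) ** 2 for r, o in zip(reconstructed, original)]
    return math.sqrt(sum(squared) / len(squared))


# Identical images give zero error; differences of 3 and 4 give sqrt(12.5).
same = rmse([1, 2, 3], [1, 2, 3])
diff = rmse([0, 0], [3, 4])
```

A lower value indicates that the reconstructed image is closer to the gallery image.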

5.7 Calculation of SSIM

The structural similarity index measure is used to find similarities between two images in comparing luminance, contrast, and structure as in Equation (16).

$S (X, Y) = \frac{(2 μ_{Y} μ_{X} + C_{1}) (2 σ_{YX} + C_{2})}{(μ_{Y}^{2} + μ_{X}^{2} + C_{1}) (σ_{Y}^{2} + σ_{X}^{2} + C_{2})}$ (16)

SSIM calculates the value of the similarity index for image Y using X as the reference image. In accordance with the experiments, Y is the test image (or) reconstructed image and X is the gallery image.

SSIM is computed over the local neighbourhood of every pixel in image Y. The SSIM value ranges from –1 to 1. A value from 0.94 to 1 denotes a match between two images. The more the two images differ, the closer the SSIM value falls towards zero.

Tables 7(a) to 7(c) report the SSIM value (SSIM_val) between the gallery and reconstructed images and between the gallery and test images, together with the similarity map (SSIM_Map) that SSIM generates for comparing test and reference images.

In the SSIM_Map, high SSIM values appear as bright pixels and low SSIM values appear as dark pixels; dark regions indicate where the two images differ.

The range of Euclidean distance and SSIM value is represented in Table 5. The range denotes the degree of similarity between two images, whether they are similar or dissimilar.

Table 5

Calculate the similarity between the original image and the probe image with the following observation of different datasets

Measure	Observation	Measure	Observation
Euclidean Distance	$E_{dist} = \begin{cases} 0 & \text{if similar image} \\ > 0 & \text{if not similar} \end{cases}$	SSIM	$SSIM = \begin{cases} \text{similar} & \text{if } - 1 < SSIM\_val ⩽ 1 \\ \text{not similar} & \text{otherwise} \end{cases}$

Tables 6(a) and 6(b) compare the proposed method with various existing methods on different datasets. On the IMFDB dataset, a total of 300 images, including male and female images, were tested. On the FEI dataset, the smiling faces and the artificially occluded faces were each tested on 400 images. On the Caspeal-R1 dataset, experiments were carried out on the Eyeglass categories, Aging, and Distance.

Table 6(a)

Comparison on the IMFDB and FEI datasets

Method	Comparison of the proposed method with related work on different datasets
	IMFDB Dataset	FEI Dataset
	(Total Images 300)	Smiling (Total Images 400)	Artificially Occluded Face (Total Images 400)
PCA	63.3	52.5	65
FW-PCA [2]	70	62.5	75
GSO [3]	83.3	89.7	92.25
DWT-PCA/SVD [15]	66.6	85	72.5
CBAM [17]	80	75	87.5
R2DPCA [18]	70	75	85
SURF and SIFT [19]	66.6	70	77.5
FFT-PCA/SVD [20]	70	80	82
SSIM_PCA (Proposed)	96.6	96.25	97.5

Table 6(b)

Comparison on the Caspeal-R1 dataset

Method	Comparison of the proposed method with related work on the Caspeal-R1 dataset
	Caspeal-R1 Dataset
	Eyeglass			Aging
	Type 1 (Total Images 200)	Type 2 (Total Images 300)	Type 3 (Total Images 230)	Aging (Total Images 66)	Additive Gaussian Noise (Total Images 66)	Distance (Total Images 294)
PCA	55	43.3	59.1	66.6	62.1	57.8
FW-PCA [2]	60	62	55.6	75.7	77.2	64.6
GSO [3]	80	70	80.4	83.3	84.8	81.6
DWT-PCA/SVD [15]	60	70	82	84	77.2	73.1
CBAM [17]	80	83.3	82.6	83.3	80.3	85.03
R2DPCA [18]	75	83.3	78.26	69.6	68.1	74.8
SURF and SIFT [19]	55	63.3	69.5	60.6	62.1	71.4
FFT-PCA/SVD [20]	75	66.6	73.9	75.7	78.7	79.9
SSIM_PCA (Proposed)	96	97	95.2	92.4	93.9	95.2

Tables 7(a) to 7(c) present the results of the Caspeal-R1, IMFDB and FEI dataset experiments.

Table 7(a)

Comparison of the original image with the reconstructed image, with its SSIM value and map, for the Caspeal-R1 dataset

Table 7(b)

Comparison of the original image with the test image, with its SSIM value and map, for the Caspeal-R1 dataset

Table 7(c)

Comparison of the original image with the test image and the reconstructed image, with its SSIM value and map, for the IMFDB and FEI datasets

Table 7(a) compares the original image with the image reconstructed by the proposed method on the Caspeal-R1 dataset, giving the SSIM value (SSIM_val) and the corresponding similarity map (SSIM_Map). The map shows similarity as larger, brighter regions.

Table 7(b) gives the corresponding results for the original image and the test image of the Caspeal-R1 dataset, with the SSIM value and its similarity map. The map shows dissimilarity as dark regions.

Table 7(c) gives the corresponding results for the original image against the test image and the reconstructed image of the IMFDB and FEI datasets, with the SSIM value and its similarity map.

5.8 System requirements

The software and hardware used for the experiments were as follows. CPU: AMD E2-9010 RADEON R2, 4 Compute Cores 2C+2G; RAM: 8 GB; Operating System: Windows 10 64-bit; Matlab 2017a. Table 8 shows the execution time on this system. Figure 14 depicts how similar the original image and the reshaped image are. That is, the proposed method successfully reconstructed the occluded face image, and the reshaped image lies close to the original image. The distance is measured using Euclidean distance.

Table 8
Comparison of execution time with existing techniques on the FEI dataset

Method	Image Size	Execution Time (sec)
PCA	64x64	500
FW-PCA	64x64	400
GSO	64x64	350
DWT-PCA/SVD	64x64	300
CBAM	64x64	450
R2DPCA	64x64	400
SURF and SIFT	64x64	400
FFT-PCA/SVD	64x64	300
SSIM_PCA	64x64	200
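To put the timings in Table 8 in relative terms, the short sketch below divides each reported execution time by that of SSIM_PCA. The figures are copied from Table 8. The dictionary and variable names are illustrative only.

```python
# Execution times in seconds, copied from Table 8 (64x64 images, FEI dataset).
times_sec = {
    "PCA": 500,
    "FW-PCA": 400,
    "GSO": 350,
    "DWT-PCA/SVD": 300,
    "CBAM": 450,
    "R2DPCA": 400,
    "SURF and SIFT": 400,
    "FFT-PCA/SVD": 300,
    "SSIM_PCA": 200,
}

reference = times_sec["SSIM_PCA"]
# Ratio > 1 means the method takes that many times longer than SSIM_PCA.
relative_time = {
    method: t / reference
    for method, t in times_sec.items()
    if method != "SSIM_PCA"
}
slowest = max(relative_time, key=relative_time.get)
smallest_ratio = min(relative_time.values())
```

On these figures, plain PCA takes 2.5 times as long as SSIM_PCA, and even the fastest of the compared methods take 1.5 times as long.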

Fig. 14

Euclidean distance between the original image and the reshaped image of the CAS-PEAL dataset, with the accessory eyeglass frame 1 category as the occluded input.

Figure 14 highlights the similarity between the original and reshaped images. That is, the proposed technique effectively recovered the occluded facial image, reducing the gap between the reshaped and original images. For this comparison, one of the Caspeal-R1 dataset images from the eyeglass category (frame 1) was used as the occluded input. The distance between the reshaped image and the original image was calculated using Euclidean distance, and it indicated a closer match.

6 Conclusion

Because of the COVID-19 outbreak, individuals wear masks when they go out. Many existing face recognition algorithms have been built to distinguish occluded faces, but these algorithms are unable to detect masks and other occluded regions. The proposed system addresses occluded face recognition in two stages. The occluded section of a probe image is used to mask the images in the gallery face dataset. Because it examines only the similarity between the known areas of the occluded probe image and the corresponding regions of the gallery face image after masking, FMM is particularly useful in similarity calculation. By calculating SCM and FSM, a minimum number of similar images can be generated for the probe image to recover the occluded region. The occluded face is then reconstructed based on SSIM and PCA to restore the original face. From the point of view of system requirements, the proposed approach also works well. Recognition accuracy under heavy occlusion will be considered in future work. As a further improvement, the algorithm should be extended to handle images with side poses, and it will also be used with deep learning.
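The reconstruction stage summarized above can be illustrated with a minimal sketch, assuming a masked least-squares projection onto a PCA eigenspace built from the related faces. This is a standard way to fill missing pixels with PCA and is not necessarily the exact procedure of the proposed system. The function name `reconstruct_occluded`, its parameters, and the synthetic data are all hypothetical.

```python
import numpy as np

def reconstruct_occluded(probe, related_faces, occluded, n_components=5):
    """Fill occluded pixels of a probe using an eigenspace of related faces.

    probe:          (d,) flattened probe image
    related_faces:  (n, d) flattened related-face images
    occluded:       (d,) boolean mask, True where the probe is occluded
    """
    mean_face = related_faces.mean(axis=0)
    centered = related_faces - mean_face
    # Rows of vt are the principal directions (eigenfaces) of the related faces.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]  # shape (k, d)
    known = ~occluded
    # Estimate eigenspace coefficients from the visible pixels only.
    coeffs, *_ = np.linalg.lstsq(
        basis[:, known].T, probe[known] - mean_face[known], rcond=None
    )
    estimate = mean_face + coeffs @ basis
    restored = probe.copy()
    restored[occluded] = estimate[occluded]  # keep visible pixels untouched
    return restored

# Synthetic data: "faces" that lie exactly in a 3-dimensional subspace.
rng = np.random.default_rng(0)
d, n, k = 64, 20, 3
latent = rng.normal(size=(k, d))
mean_true = rng.normal(size=d)
related_faces = mean_true + rng.normal(size=(n, k)) @ latent
true_face = mean_true + rng.normal(size=k) @ latent

occluded = np.zeros(d, dtype=bool)
occluded[24:40] = True
probe = true_face.copy()
probe[occluded] = 0.0  # simulate the occlusion

restored = reconstruct_occluded(probe, related_faces, occluded, n_components=k)
```

In this idealized setting the occluded pixels are recovered almost exactly, because the probe lies in the span of the related faces. With real images the quality of the result depends on how closely the related faces resemble the probe, which is the role the FSM plays in the first stage.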

Acknowledgments

The research in this paper uses the CAS-PEAL-R1 face database collected under the sponsorship of the Chinese National Hi-Tech Program and ISVISION Tech. Co. Ltd. We sincerely thank them for allowing us to use the above-mentioned dataset.

We also thank the authors of [23–25] for allowing us to use their datasets.

References

1. Qin, Bai and Zhao, Face inpainting network for large missing regions based on weighted facial similarity, Neurocomputing 386 (2020), 54–62. https://dx.doi.org/10.1016/j.neucom.2019.12.079.

2. Hosoi, Nagashima, et al., Restoring Occluded Regions Using FW-PCA for Face Recognition, 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2012. doi: 10.1109/CVPRW.2012.6239211.

3. Agrawal, Kumar and Thomas, A novel Robust feature extraction with GSO-optimized extreme learning for age-invariant face recognition, The Imaging Science Journal 67(6) (2019), 319–329. doi: 10.1080/13682199.2019.1658914.

4. Storer, Roth P.M., et al., Fast-Robust PCA, SCIA 2009, LNCS 5575, pp. 430–439, Springer, 2009. https://dx.doi.org/10.1007/978-3-642-02230-2_44.

5. Leonardis and Bischof, Robust Recognition Using Eigenimages, Computer Vision and Image Understanding 78(1) (1999), 99–118. https://dx.doi.org/10.1006/cviu.1999.0830.

6. Zeng, Zhao, Gan, Mai, Zhai and Wang, Deep Convolutional Neural Network Used in Single Sample per Person Face Recognition, Hindawi, Computational Intelligence and Neuroscience, Volume 2018, Article ID 9861697, 11 pages, published 23 August 2018. https://dx.doi.org/10.1155/2018/3803627.

7. Machidon A.L., Machidon O.M. and Ogrutan P.L., Face Recognition Using Eigenfaces, Geometrical PCA Approximation and Neural Networks, 2019 42nd International Conference on Telecommunications and Signal Processing (TSP). doi: 10.1109/tsp.2019.8768864.

8. Tuncer, Dogan, Abdar and Pławiak, A novel facial image recognition method based on perceptual hash using quintet triple binary pattern, 12 August 2020, Multimedia Tools and Applications 79 (2020), 29573–29593. https://dx.doi.org/10.1007/s11042-020-09439-8.

9. Sun, Lv, Tang, Sima and Wu, Face Recognition Based on Local Gradient Number Pattern and Fuzzy Convex-Concave Partition, IEEE Access 8 (2020), 35777–35791. doi: 10.1109/ACCESS.2020.2975312.

10. Alahmadi, Hussain, Aboalsamh H.A. and Zuair, PCAPooL: unsupervised feature learning for face recognition using PCA, LBP, and pyramid pooling, 25 March 2019, Pattern Analysis and Applications 23 (2020), 673–682. https://dx.doi.org/10.1007/s10044-019-00818-y.

11. Shnain N.A., Hussain Z.M. and Lu S.F., A Feature-Based Structural Measure: An Image Similarity Measure for Face Recognition, Applied Sciences 7(8) (2017), 786. https://dx.doi.org/10.3390/app7080786.

12. Liu, Discriminant analysis and similarity measure, Pattern Recognition 47(1) (2014), 359–367. https://dx.doi.org/10.1016/j.patcog.2013.06.023.

13. Wang, Bovik A.C., Sheikh H.R. and Simoncelli E.P., Image quality assessment: From error visibility to structural similarity, IEEE Transactions on Image Processing 13(4) (2004), 600–612. doi: 10.1109/TIP.2003.819861.

14. Nilsson and Akenine-Moller, Understanding SSIM, June 2020.

15. Asiedu, Essah B.O., et al., Evaluation of the DWT-PCA/SVD Recognition Algorithm on Reconstructed Frontal Face Images, Hindawi, Journal of Applied Mathematics, Volume 2021, Article ID 5541522, 8 pages. https://dx.doi.org/10.1155/2021/5541522.

16. Koc, A novel partition selection method for modular face recognition approaches on occlusion problem, Machine Vision and Applications 32 (2021), Article number 35, 11 pages. https://dx.doi.org/10.1007/s00138-020-01156-4.

17. , Guo, et al., Cropping and attention based approach for masked face recognition, Applied Intelligence 51 (2021), 3012–3025. https://dx.doi.org/10.1007/s10489-020-02100-9.

18. Zhao, et al., Advanced variations of two-dimensional principal component analysis for face recognition, Neurocomputing 452 (2021), 653–664. https://dx.doi.org/10.1016/j.neucom.2020.08.083.

19. Gupta, et al., 2D-human face recognition using SIFT and SURF descriptors of face’s feature regions, The Visual Computer 37 (2021), 447–456. https://dx.doi.org/10.1007/s00371-020-01814-8.

20. Ayiah-Mensah, et al., Recognition of Augmented Frontal Face Images Using FFT-PCA/SVD Algorithm, Hindawi, Computing, Volume 2021, Article ID 6686759, 9 pages. https://dx.doi.org/10.1155/2021/6686759.

21. Fan, et al., A Pixel Missing Patch Inpainting Method for Remote Sensing Image, 2011 19th International Conference on Geoinformatics, IEEE. doi: 10.1109/geoinformatics.2011.5980782.

22. Gao, Cao, Shan, Chen, Zhou, Zhang and Zhao, The CAS-PEAL Large-Scale Chinese Face Database and Baseline Evaluations, IEEE Transactions on Systems, Man, and Cybernetics (Part A) 38(1) (2008), 149–161. doi: 10.1109/TSMCA.2007.909557.

23. Setty, Husain, Beham, Gudavalli, Kandasamy, Vaddi, Hemadri, Karure J.C., Raju, Rajan, V. Kumar and C.V. Jawahar, Indian Movie Face Database: A Benchmark for Face Recognition Under Wide Variations.

24. FEI dataset https://fei.edu.br/ cet/facedatabase.html

25. https://www.kaggle.com/muhammeddalkran/masked-facerecognition

26. http://yanweifu.github.io/FG_NET_data/FGNET.zip