Abstract
Masked face recognition has attracted growing interest among researchers seeking better algorithms to improve the performance of face recognition applications, especially during the recent Covid-19 pandemic. This paper introduces a masked face recognition method called Principal Random Forest Convolutional Neural Network (PRFCNN). The method combines Principal Component Analysis (PCA) and the Random Forest algorithm with a Convolutional Neural Network to learn masked face features. PRFCNN is designed to extract more salient features and to prevent overfitting. Experiments are conducted on two benchmark datasets, RMFD (Real-World Masked Face Dataset) and the LFW Simulated Masked Face Dataset, using various parameter settings. The experimental results, with a minimum recognition accuracy of 90%, indicate the effectiveness of the proposed PRFCNN compared with other state-of-the-art methods.
Introduction
In recent years, the Covid-19 pandemic has had a massive impact on many countries in the form of economic, financial and social disruption. In this situation, many industries have shifted to virtual working environments for online business activities. However, many jobs cannot be performed virtually, which poses a challenge and a risk for outdoor workers. Moreover, to minimize the risk of Covid-19 infection, people started to wear masks to avoid inhaling the virus through the nose and mouth. Research indicates that wearing masks reduces the transmission of the Covid-19 virus. Unfortunately, wearing a mask during the pandemic has two main consequences:
Criminals might exploit masks to conceal their identity during unlawful activities.
Because part of the face is covered by the mask, less information can be extracted, which reduces the effectiveness of existing face recognition methods on masked face images.
Nowadays, face recognition systems are widely deployed in many industries for attendance, authorization, identification, crime prevention, etc. The Covid-19 pandemic limits the usability of most face recognition systems, which show a poor True Acceptance Rate on masked faces. A robust masked face recognition system is needed to overcome this problem.
Recently, a few researchers have proposed various masked face recognition methods such as masked face recognition using CNN with MLP [1], masked face recognition with MTCNN with SVM [2], masked face recognition using ResNet-50 [3] and masked face recognition using K-Nearest Neighbors with Facenet [4].
In this paper, a masked face recognition method based on a deep-learning approach is proposed and evaluated on existing public benchmark datasets, with the aim of reducing the error rate and thus strengthening security systems. The proposed Principal Random Forest Convolutional Neural Network (PRFCNN) method consists of Principal Component Analysis (PCA), which acts as the dimension reduction algorithm, a CNN as the deep-learning algorithm, and a random forest for face classification. In this context, PCA reduces high feature dimensions and thus speeds up the overall pipeline. The PCA algorithm transforms a high-dimensional dataset into a lower-dimensional one without discarding important information from the data. A lower-dimensional representation can also benefit accuracy, since the retained components are ordered by the variance they explain. The CNN is used to learn features from the data, and PCA helps smooth the processing of a large dataset. Lastly, the image classifier used in this experiment is the Random Forest classifier. This classifier helps handle missing data or insufficient training data after the CNN stage and improves the True Acceptance Rate.
After a large dataset has been trained with a CNN, overfitting or underfitting may occur. The ensemble of trees in a random forest helps prevent overfitting and improves recognition accuracy. Increasing the number of trees generally improves prediction accuracy, which suits large datasets. Two public datasets are used in this experiment: the Real-World Masked Face Dataset (RMFD) and the Labeled Faces in the Wild Simulated Masked Face Recognition Dataset (LFW-SMFRD). Before the experiment, a Test-Dataset (TD) is extracted from the RMFD and LFW-SMFRD datasets. TD is a subset of the data used to test the appropriateness of the method on the constructed datasets. Both experimental results are evaluated using the True Acceptance Rate and compared with each other.
Section 2 of the paper reviews the literature on masked face recognition. Next, Section 3 describes the proposed PRFCNN method and its algorithmic steps in detail. Section 4 presents the experimental analysis, which reports the practical results of this research and compares them with other existing masked face recognition methods. Finally, Section 5 concludes the work.
Literature review
In this section, related research in masked face recognition is reviewed to investigate the three main processes in masked face recognition. The relevant existing methods are also investigated. In masked face recognition, feature extraction is the most crucial process for enhancing the entire recognition system.
Feature extraction extracts valuable features from the original facial image for matching purposes. It expedites the deep learning process and improves the system's accuracy. Several well-established feature extraction methods and deep learning methods exist in the literature.
Existing masked face recognition
In the paper by [1], the authors proposed a masked face recognition method to cope with the ongoing coronavirus pandemic by recognizing masked faces in an unconstrained environment. The method discards the mask region and extracts deep learning features. The pre-trained deep Convolutional Neural Network (CNN) VGG16 is used as the feature extractor for the eyes and forehead region. Next, Bag-of-features (BoF) is applied to the feature maps of the last convolutional layer. Finally, a Multilayer Perceptron (MLP) is used for classification. The method was tested on the Real-World-Masked-Face-Dataset and reached a good accuracy of 91.3%. The authors state that the proposed method is more suitable for real-time situations than approaches aimed at unmasking the masked face. The authors also noticed that some experiment settings, such as RGF neurons, give the proposed method a higher recognition rate.
In the study Masked Face Recognition Algorithm for a Contactless Distribution Cabinet [5], the authors designed a contactless cabinet for collecting courier deliveries that recognizes masked facial images and reduces the COVID-19 transmission rate. The paper uses Local Constrained Dictionary Learning (LCDL) to separate the facial images. Next, dilated convolution is applied to limit the resolution reduction in the subsampling process. To overcome the loss caused during subsampling, an attention mechanism is applied to produce a better training model. The authors use a convolutional neural network to further reduce the information loss in the subsampling process and improve accuracy. The RMFRD and SMFRD databases were tested in this experiment, achieving 98.38% accuracy on normal face images and 95.31% accuracy on masked face images.
In the paper by [2], the authors proposed a masked face recognition method to counter the degradation of face recognition performance caused by facial masks in challenging or unconstrained environments. The proposed method implements face detection using the Multi-Task Cascaded Convolutional Neural Network (MTCNN). After the facial region has been detected, features are extracted with the pre-trained convolutional neural network Google FaceNet. The authors selected the Support Vector Machine (SVM) for classification. The Masked Face Dataset (MFD) was used to test the accuracy of the experiment, with a best result of 99.96% training accuracy and 98.50% testing accuracy. The authors conclude that the FaceNet pre-trained model effectively improves the recognition rate of masked face images. Nevertheless, the method is not suitable for recognizing masked face images with different types of masks, and further work is needed to enhance it.
In Masked Face Recognition using ResNet-50 [3], the authors proposed a deep learning-based model to improve the recognition of masked face images. The proposed method uses a ResNet-50 transfer learning model to predict masked face images, with several hyperparameter settings tuned to find the best result. The first run of the experiment used the unmasked face images in the dataset, and the second run used the masked face images. The primary purpose of this experiment is to classify masked images with a model trained on unmasked face images. The dataset used is the Real-World Masked Face Recognition Dataset (RMFRD), on which the method achieved 89.33% accuracy on unmasked images and 46.13% on masked images. The results show that the technique is less accurate when recognizing masked face images. Finally, the authors mention that a 3D learning mask cropping technique, which crops out the masked region of the masked face image, together with other CNN architectures, could increase the recognition rate.
In the research by [4], the authors applied transfer learning to retrain the FaceNet model using the ResNet v1 and ResNet50 architectures for masked face recognition. To address overfitting on the validation sets, the authors tuned several hyperparameters in the experiment setting. The FaceNet model is retrained with the deep convolutional architectures ResNet v1 and ResNet50 on datasets of unmasked and masked face images. The model produces embeddings for the train, validation and test sets. K-Nearest Neighbors is then used for classification. The datasets used in this experiment are the LFW (Labeled Faces in the Wild) dataset and the Simulated Masked Face Recognition Dataset (SMFRD), on which the method achieves 99.98% accuracy.
In the paper by [6], the authors proposed a face recognition method that recognizes masked and unmasked facial images for smartphone security. The experiment was conducted on a self-created dataset containing 6 classes, each with face images of a different person. An object detection algorithm, YOLO (You Only Look Once), was used to identify the masked and unmasked faces according to each label. Several versions of the YOLO algorithm were tested, including YOLOv3, YOLOv4, YOLOv3 TINY and YOLOv4 TINY. The best result, 84% accuracy, was attained with the YOLOv4 algorithm.
In the paper [7], the authors reviewed masked face recognition, covering the datasets, deep learning approaches, feature extraction, mask detection, face restoration and classification available in the literature. The authors summarize and compare the results of the transfer learning approaches used in the current masked face recognition field. They highlight several factors that could affect the performance of masked face recognition, including feature extraction, image preprocessing, face detection, face unmasking and restoration, localization, verification and matching [7], and conclude that devoting more effort to deep learning could increase the performance of masked face recognition systems.
The paper by Hang [8] discussed a masked face recognition method to overcome the problems of NIR-VIS training data and testing, and proposed a novel heterogeneous training method that maximizes the shared information. The proposed method provides a domain-invariant face representation that is robust to the area occluded by masks. Lastly, the authors employed 3D face reconstruction to synthesize masked face data and overcome the shortage of masked face data. The experiment was conducted with three datasets: CASIA NIR-VIS 2.0, Oulu-CASIA NIR-VIS and BUAA-VisNir. The experiment achieved a good result of 98% accuracy using the HSST (Triplet) method on the CASIA NIR-VIS 2.0 dataset.
Ding [9] used the Latent Part Detection (LPD) model to detect the latent facial region of a mask-wearing face; the detected latent region is then used in the subsequent feature extraction process. The LPD model is trained in an end-to-end manner and specifically utilizes training data. The experiment was conducted on the MFI, MFV and synthetic masked LFW datasets and achieved good accuracy: 97.94% on the MFV dataset, 94.34% on the MFI dataset and 95.70% on the synthesized LFW dataset.
In the paper [10], the authors adopted the open source tool MaskTheFace to generate a masked facial dataset for experimental evaluation. The authors used the FaceNet model in the experiment. The results show that FaceNet could improve the True Positive Rate (TPR) by around 38% when using a model retrained on 3 simulated datasets, VGG Face2-mini, VGGFace2-mini-SM 1 and LFW-SM (combined), and 1 real-world dataset, MFR2. The best result in the experiment is 97.25% accuracy on the LFW-SM dataset.
Feature extraction
Several feature extraction methods have recently been widely used by researchers. In addition to the existing work described above, these feature extraction methods are worth investigating: for example, Principal Component Analysis (PCA) [11], Independent Component Analysis (ICA) [12], Linear Discriminant Analysis (LDA) [13], Locally Linear Embedding (LLE) [14] and T-Distributed Stochastic Neighbor Embedding (t-SNE) [15]. These methods have been enhanced or generalized to improve framework performance.
Deep learning algorithm
Deep learning is a branch of machine learning in artificial intelligence (AI) that mimics how humans acquire new knowledge. With this technology, large datasets and unstructured data can be processed that humans could not handle manually in a short amount of time. Today, many existing deep-learning methods are used in currency prediction [16], natural language processing [17], autonomous vehicles [18], computer vision [19] and robotics [20].
ConvNet [16] is a type of CNN architecture that receives an input image, assigns weights and biases to various aspects of the objects in the image and distinguishes them from one another. This method is widely used for image or object classification with promising results. With data augmentation in a CNN, the size of the training set can be increased to prevent overfitting problems when using large datasets.
Recurrent Neural Networks (RNNs) [21] are artificial neural networks (ANN) used for time series and sequential data. This approach is widely used to solve ordinal problems, for example speech recognition, language translation and natural language processing. RNNs are distinguished by their “memory”, as they take information from prior inputs to influence the current input and output.
Generative Adversarial Networks (GANs) [22] are a generative modeling approach based on deep learning. They belong to unsupervised learning in machine learning. This method can discover and learn patterns from the input data on its own. The model can then be used to generate new examples that resemble the input data.
Self-Organizing Maps (SOM) [23] are another type of ANN trained with unsupervised learning to produce a low-dimensional, discretized representation of the input space of the training samples, which is the “map” in this approach. Competitive learning is applied in this method instead of error-correction learning, and a neighborhood function is used to preserve the topology of the input space.
Classification algorithm
A classification algorithm predicts the class of given input data, such as a target, category or label. Researchers use many classification algorithms in various machine learning fields such as target marketing, medical diagnosis and credit approval.
Logistic Regression [24] is a classification algorithm used to predict a categorical variable from a set of discrete classes. It is widely used where the output is binary, such as spam or not spam for email, or fraud or not fraud for online transactions. In its basic binary form, this approach does not directly perform multiclass classification, such as distinguishing dogs, cats and mice.
Naïve Bayes [25] is a classification algorithm based on Bayes’ Theorem. The classifier assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature. This method is suitable for large datasets and can handle complicated classification problems.
Stochastic Gradient Descent (SGD) [26] is an optimization-based method used in classification to find the parameter settings that best match the predictions to the expected outputs. It is used chiefly in machine learning, where it is typically combined with backpropagation.
K-Nearest Neighbors (KNN) [27] is a non-parametric classification algorithm. The structure of the model is determined by the input dataset. It is suitable for real-world classification problems. However, this method needs more time in the testing phase because KNN compares each test sample against all the training data.
Random Forest classifier [28] is an algorithm that randomly selects subsets of the training set to create decision trees. It combines the votes of all the decision trees to predict the class of the input data. Random Forest can become more accurate as the number of trees increases and helps reduce overfitting. It can be used in both binary and multiclass classification.
Support Vector Machine (SVM) [29] is a supervised machine learning model for two-group classification problems. With well-labeled training data in each category, SVM can classify the data more reliably. SVM does not natively support multiclass classification. Instead, the multiclass problem is broken down into several binary classification subproblems.
Motivation and contribution
The face recognition system is widely used for effective user authentication. Unfortunately, the Covid-19 pandemic has had a significant impact on this biometric system, making it relatively hard to recognize and verify faces, since they are partially covered by masks worn to avoid infection with the coronavirus. The use of face masks that cover half the face has led to a low true acceptance rate in existing face recognition systems, which fail to recognize the correct faces. Therefore, this research aims to design a method that can handle unconstrained masked faces for personal recognition.
In this research, a masked face recognition method using deep learning is proposed, termed Principal Random Forest Convolutional Neural Network (PRFCNN). The contribution of this research is to improve the reliability of the face recognition system by customizing the deep-learning architecture so that the system can accurately recognize masked faces. The idea behind this method is to split the dataset into two categories (training and testing) and use the VGG16 convolutional neural network architecture together with the random forest classifier for classification. VGG16 acts as the feature extractor of the dataset for subsequent processing. The input image is passed through the VGG16 layers to extract features from the original data (the full architecture has 16 weight layers, 13 convolutional and 3 fully connected). Meanwhile, principal component analysis is added to sort and reduce the size of the extracted features for classification. With this implementation, the proposed PRFCNN method provides an effective solution that combines a deep-learning method and a classification method for a more reliable masked face recognition system.
The benchmark framework, Masked Face Recognition Using Neural Convolutional Network, uses Multi-Task Cascaded Convolutional Networks (MTCNN) for face detection, FaceNet feature extraction, which has 22 convolutional layers, and a Support Vector Machine (SVM) as the classification layer. The proposed method uses VGG16, which has 16 weight layers, for feature extraction and a random forest algorithm as the classification layer. PCA is added as part of the process to improve the system's performance. Compared to the benchmark framework, the proposed method adopts a different architecture in the feature extraction and dimension reduction processes, with a lightweight and efficient design. Notably, the benchmark method uses MTCNN for face cropping, while in the proposed method the input image is normalized separately by greyscaling and resizing. Moreover, the random forest is well suited to multiclass classification problems. In addition, the random forest reduces overfitting and helps strengthen the system.
The contributions of this paper are listed as follows:
The weaknesses of the unconstrained masked face image problem are addressed by using the CNN method.
The proposed method is able to recognize facial images and masked facial images with the RMFD and LFW-SMFRD datasets.
The masked image classification problem is solved by using the VGG16 deep neural network and Random Forest classification.
The proposed method solves the overfitting problem by combining the transfer learning method with the Random Forest classification algorithm.
PCA is applied after VGG16 feature extraction, with the Random Forest algorithm as the classification layer, to enhance the performance of face recognition/classification. The dimension reduction process of PCA works well with the VGG16 deep neural network architecture.
Proposed solution
This section discusses the phases of the overall methodology, the detailed processes of each stage, and their implementation. In the benchmark method, FaceNet is used for feature extraction with SVM classification. This work inspired us to design a better model by combining PCA, RF and CNN in the proposed method to recognize different types of masked face images with varying label categories. The benchmark process uses the Support Vector Machine (SVM) as the classifier. The research in [30] found that the Random Forest algorithm is much faster than the SVM algorithm and can achieve better accuracy. Therefore, a Random Forest classifier is adopted to take full advantage of these strengths in the classification task. In addition, the VGG16 model is used for feature extraction in place of the FaceNet model used by the benchmark method, because the research in [31] found that VGG16 performs better than FaceNet in terms of accuracy. Both are pre-trained models, and they are similar in that their architectures are based on convolutional networks.
In this paper, the Principal Random Forest Convolutional Neural Network (PRFCNN) is proposed to overcome the low true acceptance rate of existing face recognition. The method combines feature extraction, dimension reduction, deep learning and a classification mechanism to build a trustworthy and reliable masked face recognition system. Figure 1 shows the overview diagram of the experiment.

Figure 1. Overall process of PRFCNN.
First, an input image of 160×160 pixels is selected from one of the benchmark datasets, the RMFD dataset. The images are first separated into training and testing sets, and both sets are normalized before the feature extraction process. Each input image is normalized by resizing it to 224×224 pixels, converting it to greyscale (grey_img) and aligning it. Equation (1) shows the resize and greyscale processes, where x represents the height of the resized image and y represents the width of the resized image. The pixel values are then scaled using min-max normalization. After all the images are normalized, each image is labeled with its category as given in Equation (1).
The R, G and B in Equation (2) represent the red, green and blue pixel values, respectively, and x, y and z are fixed weights of 0.299, 0.587 and 0.114. Next, the label of each category Cl is encoded as a discrete number $X_n$ representing that category. The equation below shows the process of encoding the labels of the datasets.
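The greyscale weighting and label encoding described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names `to_greyscale` and `encode_labels` and the sample data are hypothetical, and the labels are assumed to be encoded as integers in sorted order.

```python
# Illustrative sketch only: helper names and sample data are hypothetical.

GREY_WEIGHTS = (0.299, 0.587, 0.114)  # fixed weights for R, G, B quoted in the text


def to_greyscale(rgb_image):
    """Convert a nested list of (R, G, B) pixels into a nested list of grey values."""
    wr, wg, wb = GREY_WEIGHTS
    return [[wr * r + wg * g + wb * b for (r, g, b) in row] for row in rgb_image]


def encode_labels(labels):
    """Map each distinct category name to an integer (assumed: sorted order)."""
    mapping = {name: index for index, name in enumerate(sorted(set(labels)))}
    return [mapping[name] for name in labels], mapping


# A tiny 2x2 RGB image: white, black, pure red, pure blue.
image = [[(255, 255, 255), (0, 0, 0)],
         [(255, 0, 0), (0, 0, 255)]]
grey = to_greyscale(image)

# Category names become discrete numbers, one per distinct label.
codes, mapping = encode_labels(["bob", "alice", "bob", "carol"])
```

Because the three weights sum to 1, a white pixel keeps its full intensity after conversion, and each category name is always mapped to the same integer.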
Next, the training and testing sets are fitted to the training and testing data of the experiment, respectively. Let the training and testing sets be $S_{train}$ and $S_{test}$. Both $S_{train}$ and $S_{test}$ go through the min-max normalization $N_d$: dividing by the maximum RGB value of 255 scales all the data to between 0 and 1, as in Equation (5). Here $x_i$ represents the data, min(x) represents the minimum RGB value and max(x) represents the maximum RGB value.
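As a rough sketch of this step, and assuming Equation (5) takes the usual min-max form (x_i - min(x)) / (max(x) - min(x)), the scaling can be written as below. The function name `min_max_normalize` and the sample pixel values are hypothetical.

```python
# Illustrative sketch only: assumes the standard min-max formula.

def min_max_normalize(values):
    """Scale values linearly so the minimum maps to 0 and the maximum to 1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant input has no spread; return zeros to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(value - lowest) / (highest - lowest) for value in values]


# With 8-bit pixel data spanning 0..255, this equals dividing by 255.
pixels = [0, 51, 102, 255]
normalized = min_max_normalize(pixels)
```

When the data already spans the full 0 to 255 range, min(x) is 0 and max(x) is 255, so the formula reduces to the division by 255 mentioned in the text.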
Thereafter, VGG16 is used as the feature extractor for the input data, and a random forest is set as the classification layer because of its low computational cost and the varied classifications its decision trees provide. $N_d$ is the normalized data prepared for further processing. The input layer of VGG16 takes an image of size 224×224×3, which passes through the convolutional layers with 3×3 filters, stride 1 and padding, and through max-pooling layers with 2×2 windows and stride 2. Three fully connected layers of different depths follow, with the softmax function as the output. Equation (6) shows the Convolutional Neural Network equation.
In Equation (6), P(x, y) is the pixel value of the original image at coordinate (x, y), and M(x, y) is the convolution kernel of size ω × m.
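To make the role of the image P and the kernel M concrete, the sketch below applies a kernel to a small image in the way convolutional layers typically do (sliding-window sum of products, which deep learning libraries call convolution). It is an assumption-laden illustration: it uses stride 1 and no padding, and the function name `conv2d_valid` and the sample values are hypothetical.

```python
# Illustrative sketch only: stride 1, no padding, single channel.

def conv2d_valid(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for a in range(kernel_h):
                for b in range(kernel_w):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        output.append(row)
    return output


# A 4x4 image whose values grow by 1 from left to right.
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
# A 3x3 kernel that subtracts the right column from the left column.
kernel = [[1, 0, -1],
          [1, 0, -1],
          [1, 0, -1]]
feature_map = conv2d_valid(image, kernel)
```

A 3×3 kernel over a 4×4 image without padding yields a 2×2 feature map; VGG16 adds padding so that its 3×3 convolutions preserve the spatial size.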
Several parameters of VGG16 are set in the experiment: the ImageNet weights are used, the input shape is 224×224 to match the resized images, and the trainable parameter is set to false. This prevents the model from spending a long time on retraining, since it is only used as the feature extractor in the experiment. After the model has been prepared, VGG16 is applied to the data: the normalized data $N_d$ goes through the VGG16 feature extraction process.
$F_e$ is the feature extracted by the VGG16 feature extractor. The next step of the experiment is to apply the PCA dimension reduction process to the extracted feature. The PCA equation is written as Equation (8).
M (I×J) is the data matrix, S (I×R) the scores, L (J×R) the loadings and E (I×J) the residuals, and M is described by R principal components. Before applying Principal Component Analysis, the number of principal components is determined using the cumulative variance. By trial and error, the number of components is chosen so that the cumulative variance is close to 1. The variance and cumulative variance CV equations are shown in Equations (9) and (10).
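The selection of the number of components from the cumulative variance can be sketched as below. This assumes Equations (9) and (10) take the usual form, where the variance ratio of a component is its eigenvalue divided by the sum of all eigenvalues and CV is the running sum of those ratios. The function names, the eigenvalues and the 0.85 threshold are hypothetical choices for illustration.

```python
# Illustrative sketch only: eigenvalues and threshold are made-up examples.

def cumulative_variance(eigenvalues):
    """Return the explained-variance ratios and their running (cumulative) sums."""
    total = sum(eigenvalues)
    ratios = [value / total for value in eigenvalues]
    cumulative = []
    running = 0.0
    for ratio in ratios:
        running += ratio
        cumulative.append(running)
    return ratios, cumulative


def components_for_threshold(eigenvalues, threshold):
    """Smallest number of leading components whose cumulative variance reaches the threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    _, cumulative = cumulative_variance(ordered)
    for count, value in enumerate(cumulative, start=1):
        if value >= threshold:
            return count
    return len(ordered)


eigenvalues = [4.0, 2.0, 1.0, 1.0]
ratios, cumulative = cumulative_variance(eigenvalues)
n_components = components_for_threshold(eigenvalues, 0.85)
```

Raising the threshold toward 1 keeps more components, which mirrors the trial-and-error choice described above: more retained variance at the cost of a larger feature size.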
The extracted feature $F_e$ is fitted to the PCA to reduce the feature size of the training data. This process improves the speed of the training phase. $P_f$ is the data whose feature size has been reduced by the PCA process.
Afterward, the Random Forest classifier is used to train on and classify the PCA features obtained after VGG16 extraction, and to measure the accuracy of the experiment. The parameters set in the random forest are the random state and the number of estimators, which is the number of trees in the forest. In these experiments, a higher number of estimators (trees) generally gives better accuracy. The Random Forest classifier is described by Equation (12), the classification equation $RFfi_i$, Equation (13), the Gini impurity, and Equation (14), the entropy calculation.
where P(+) and P(−) are the percentages of the positive and negative classes, respectively.
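For reference, the two split criteria and the voting step of a random forest can be sketched as follows. This assumes Equations (13) and (14) are the standard Gini impurity, 1 minus the sum of squared class proportions, and the standard entropy, the negative sum of p·log2(p). The function names and sample labels are hypothetical, and the sketch does not build actual trees.

```python
# Illustrative sketch only: standard definitions, hypothetical helper names.
import math
from collections import Counter


def class_proportions(labels):
    """Fraction of samples belonging to each class, e.g. P(+) and P(-)."""
    counts = Counter(labels)
    total = len(labels)
    return [count / total for count in counts.values()]


def gini_impurity(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


def entropy(labels):
    """Shannon entropy in bits: -sum of p * log2(p)."""
    return -sum(p * math.log2(p) for p in class_proportions(labels) if p > 0)


def majority_vote(tree_predictions):
    """Combine the votes of individual trees into one predicted class."""
    return Counter(tree_predictions).most_common(1)[0][0]


balanced = ["+", "+", "-", "-"]   # P(+) = P(-) = 0.5
pure = ["+", "+", "+", "+"]       # a node containing a single class

gini_balanced = gini_impurity(balanced)
entropy_balanced = entropy(balanced)
predicted = majority_vote(["alice", "bob", "alice"])
```

Both measures are 0 for a pure node and largest when the classes are evenly mixed, which is why a tree prefers splits that lower them.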
After the PCA features are trained and classified, the result is reported as the verification rate or accuracy. The proposed PRFCNN is given in Algorithm 1.
The proposed method is evaluated on two benchmark datasets and one test subset drawn from both: the RMFD (Real-World Masked Face Dataset), the LFW Simulated Masked Face Dataset (LFW-SMFD) and the TD (TestDataset), respectively. The images in TD are taken from both benchmark datasets. The RMFD dataset consists of 243 categories with 1996 facial image samples. LFW-SMFD has 2271 categories and 5442 facial images. Lastly, the TD dataset consists of 36 categories and 251 facial images. Figure 2 shows the testing results for images from the two datasets.

Figure 2. Testing result from both benchmarked datasets (a) LFW Simulated Masked Face Dataset (b) RMFD.
The accuracy of this masked face recognition experiment is evaluated on both datasets. During the investigation, there are several combinations of experiment settings, and different parameter settings are tested on both benchmarked datasets to find out the difference and the best accuracy of this masked face recognition. The results are recorded and compared with other state-of-the-art methods. There are several categories of experiments separated by different datasets.
Table 1 shows the experimental results for the TD dataset, which is used to test the appropriateness of the method on both benchmark datasets. First, the default CNN setting, experiments 001–004 with 10 epochs, is tested on the TD dataset to observe its performance. The best result, 78% accuracy with the default CNN, is obtained by combining the ReLU activation function with the Adam optimizer. After that, the Random Forest algorithm is added in experiments 005 and 006, with 10 estimators. The activation functions used in 005 and 006 are ReLU and sigmoid, respectively. Both experiments achieve a high accuracy of 99%. Lastly, experiment 017 is the proposed method, which adds the PCA algorithm to the VGG16 CNN architecture. The PRFCNN method is tested with 5, 10 and 15 estimators; the higher the number of estimators, the better the accuracy.
Table 1. Different parameter setting experiment results for the TD dataset
Table 2 shows the experimental results for the RMFD dataset with different parameter settings. Experiments 007–011 use the default CNN setting in this experiment category.
Table 2. Different parameter setting experiment results for the RMFD dataset
As with the experiments in Table 1, the ReLU activation function and the Adam optimizer are the best combination of CNN parameter settings, so they were applied in experiment 008, which achieved 42.40% accuracy, the highest among the CNN experiments. In experiments 012 and 013, the Random Forest algorithm is added and different activation functions are used. The sigmoid function, at 99.29% accuracy, is slightly below the ReLU function, at 99.33% accuracy. Experiment 018, the proposed method, reaches the highest accuracy of 99.77% with 10 trees.
Table 3 shows the experimental results for the benchmark dataset LFW-SMFD with different parameter settings. Experiment 014 is the default CNN experiment with the ReLU and sigmoid functions, which achieves only 0.09% accuracy. Experiments 015 and 016, which add the Random Forest algorithm, improve performance substantially, reaching between 99.72% and 99.79% accuracy. The proposed PRFCNN method achieves 99.96% accuracy, the highest of all the experiments. Moreover, in Tables 1, 2 and 3 the number of epochs for the CNN and the number of estimators for the CNN with Random Forest are fixed at 10. This is because the result stabilizes after the 10th epoch or estimator: accuracy does not increase with an 11th epoch or estimator. The 10th epoch or estimator is therefore taken as the optimal limit for these methods. The proposed PRFCNN method, however, continues to improve beyond 10 estimators, up to 15, after which the accuracy no longer increases. We therefore conclude that the limit for the proposed method is 15 estimators, which leads to the highest accuracy in the experiment.
Experimental results for the LFW-SMFD dataset under different parameter settings
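The stopping rule described for Tables 1, 2 and 3 (fix the number of epochs or estimators at the point where accuracy stops increasing) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `plateau_point`, the `min_gain` tolerance, and the accuracy values are hypothetical placeholders and are not taken from the tables.

```python
def plateau_point(accuracies, min_gain=0.0):
    """Return the 1-based index of the last epoch/estimator that improved accuracy.

    accuracies[i] is the accuracy measured with i + 1 epochs or estimators.
    An entry counts as an improvement only if it beats the best value seen
    so far by more than min_gain (an assumed tolerance, default 0).
    """
    if not accuracies:
        raise ValueError("accuracies must not be empty")
    best = accuracies[0]
    best_index = 1
    for index, value in enumerate(accuracies[1:], start=2):
        if value - best > min_gain:
            best = value
            best_index = index
    return best_index


# Hypothetical accuracy curve (NOT results from the paper): it rises until
# the 10th estimator and stays flat afterwards.
illustrative_curve = [0.50, 0.62, 0.71, 0.78, 0.83, 0.87,
                      0.90, 0.92, 0.93, 0.94, 0.94, 0.94]
limit = plateau_point(illustrative_curve)
print(limit)
```

With a curve of this shape the function returns 10, which mirrors the reasoning used to fix the CNN and CNN with Random Forest experiments at 10 epochs or estimators.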
Fig. 3 shows the performance of each experiment over various numbers of epochs or estimators on the TD dataset. Experiment 017 (PRFCNN) has the highest accuracy, 100% at the 15th estimator, and experiment 002 has the lowest accuracy, 27.51%.

Graphical representation of the experimental results from Table 1.
Fig. 4 shows the performance of each experiment over various numbers of epochs or estimators on the RMFD dataset. Experiment 018 (PRFCNN) has the highest accuracy, 99.38% at the 15th estimator, and experiment 010 has the lowest accuracy, 2.68%.

Graphical representation of the experimental results from Table 2.
Fig. 5 shows the performance of each experiment over various numbers of epochs or estimators on the LFW-SMFD dataset. Experiment 019 (PRFCNN) has the highest accuracy, 99.96% at the 15th estimator, and experiment 014 has the lowest accuracy, 9.69%.

Graphical representation of the experimental results from Table 3.
To visualize the performance of the classification algorithm, the Receiver Operating Characteristic (ROC) curve is presented. The ROC curve is commonly used to measure and compare the performance of classification algorithms. It plots the true positive rate (Y-axis) against the false positive rate (X-axis) as the decision threshold varies. The Area Under the Curve (AUC) measures separability, that is, how well the model distinguishes between the labels. The higher the AUC, the better the performance of the classification algorithm.
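As a minimal sketch of how such a curve and its AUC can be computed, the following example sweeps a threshold over classifier scores and integrates the resulting curve with the trapezoidal rule. The function names and the toy labels and scores are illustrative assumptions, not data or code from the experiments; for a multi-class recognizer, a curve of this kind would typically be computed per class in a one-vs-rest fashion, which is also an assumption here.

```python
def roc_curve_points(labels, scores):
    """Return (fpr, tpr) lists obtained by sweeping a threshold over the scores.

    labels: 1 for the positive class, 0 for the negative class.
    scores: higher values mean "more likely positive".
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")

    # Sort by descending score; samples with tied scores are processed together.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / negatives)
        tpr.append(tp / positives)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve, computed with the trapezoidal rule."""
    area = 0.0
    for k in range(1, len(fpr)):
        area += (fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2.0
    return area


# Toy binary example (made-up values, not from the paper).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
fpr, tpr = roc_curve_points(labels, scores)
toy_auc = auc(fpr, tpr)
print(round(toy_auc, 3))
```

In this toy example, 8 of the 9 positive/negative pairs are ranked correctly, so the AUC is 8/9 (about 0.889). A classifier that ranks every positive above every negative would give an AUC of 1.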
Figures 6 and 7 show the ROC curves of the highest-performing experiments on the two benchmark datasets, RMFD (Real-World Masked Face Dataset) and the LFW Simulated Masked Face Dataset, with different parameter settings.

ROC curves for experiment ID 018 with (a) 5 estimators, (b) 10 estimators, and (c) 15 estimators.

ROC curves for experiment ID 014 with (a) 5 estimators, (b) 10 estimators, and (c) 15 estimators.
Table 4 shows the computational time of the proposed method on the two benchmark datasets, RMFD and LFW-SMFD. Training takes 45 minutes on RMFD and 2 hours (120 minutes) on LFW-SMFD. The testing time is identical for both datasets: 0.01 seconds per image. This shows that the proposed method is computationally inexpensive at test time, as it needs only 0.01 seconds to classify one image.
Computational time of the proposed method on the RMFD and LFW-SMFD datasets
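A per-image testing time such as the one reported in Table 4 can be measured as sketched below. This is only an illustration under stated assumptions: `dummy_predict` is a hypothetical stand-in for a trained model's prediction call, the images are placeholder lists, and the helper names are not from the paper.

```python
import time


def mean_seconds_per_image(predict_one, images):
    """Average wall-clock time of predict_one over the given images."""
    if not images:
        raise ValueError("images must not be empty")
    start = time.perf_counter()
    for image in images:
        predict_one(image)
    elapsed = time.perf_counter() - start
    return elapsed / len(images)


def throughput_images_per_second(seconds_per_image):
    """Convert a per-image latency into images processed per second."""
    return 1.0 / seconds_per_image


# Hypothetical stand-in classifier: a real model's predict call would go here.
def dummy_predict(image):
    return sum(image) % 2


# Placeholder "images" (short lists of integers), not real face data.
fake_images = [[i, i + 1, i + 2] for i in range(1000)]
latency = mean_seconds_per_image(dummy_predict, fake_images)
print(f"{latency:.6f} s per image")

# A testing time of 0.01 s per image corresponds to about 100 images per second.
print(throughput_images_per_second(0.01))
```

Averaging over many images, rather than timing a single call, reduces the influence of timer resolution and one-off start-up costs on the reported figure.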
Table 5 compares the proposed method with other state-of-the-art methods. The lowest accuracy, 47%, is obtained by masked face recognition using ResNet-50 [3]. The proposed method improves on this result; it uses VGG16, which performs better than ResNet-50 for feature extraction with a transfer learning model [26]. The highest accuracy, 99%, is achieved by Masked Face Recognition Using Deep Learning [24] in 2021, and the proposed PRFCNN method reaches the same accuracy. Whereas [24] experiments on the full frontal face, including cropped eyes, nose, and forehead regions, using the transfer learning models FaceNet and VGG, the proposed method works on facial images without cropping them by region, and a Random Forest is added to stabilize the image classification performance. Next, the MFCosface method [25], which recognizes images based on a large margin loss, achieves a good accuracy of 92%. The proposed method uses a transfer learning model, VGG16, for feature extraction rather than the Multi-Task Cascaded Convolutional Network (MTCNN). The second and fourth methods in the comparison [2, 5] both achieve 98% accuracy. The experiment in [2] uses FaceNet as the feature extractor and an SVM as the image classifier. According to [24], VGG16 performs better than FaceNet, so VGG16 is chosen as the feature extractor in the proposed method. In summary, the proposed method matches or improves on the accuracy of recent masked face recognition methods.
Result compared to other state-of-the-art methods
In conclusion, masked face recognition is worth exploring, and this research can help improve the accuracy and efficiency of face recognition systems. The proposed PRFCNN method achieves an accuracy of up to 99% on both benchmark datasets (RMFD and LFW-SMFD). We conclude that the combination of Random Forest with the PCA algorithm is suitable for masked face recognition and performs well in the experiments. In addition, PRFCNN addresses the computational cost of the deep learning approach by speeding up training on large datasets and by addressing overfitting problems during training. A limitation is that the proposed method involves no parameter or hyperparameter tuning. In other words, when the method is tested on another dataset, the performance might drop and there is no tuning step to improve the result. In future work, it is necessary to extend this study by testing on different datasets and by implementing other deep learning and classification approaches.
Acknowledgments
This work was supported by the Internal Research Fund (IR Fund) MMUI/210034, 2021 and MMUI/220025, 2022 from Multimedia University.
