Extraction of human face features from color images

Abstract

Face detection has been widely studied by researchers. The detection and extraction of individual human face features is equally important, as it plays a vital role in a variety of applications involving automated face processing. This article focuses on extracting face parts such as eyes, nose, lips, mustache, and beard from images of Indian people, for which we have prepared our own face dataset containing a variety of faces from both urban and rural areas. The study focuses on how a detected face part becomes useful in detecting other face parts. We implement our approaches to detecting face parts and evaluate them on our dataset. Our approaches to face part detection and extraction exploit the YCbCr color model, the Viola Jones technique, landmark detection, and the level set evolution technique. We found that our approaches are effective in extracting the face boundary, eyes, nose, and lips, and that they provide results comparable to those of existing methods.

Keywords

Image processing, facial features, human face boundary extraction, eye extraction, nose and lip extraction, beard and mustache extraction

1. Introduction

Computer-based automatic extraction of human facial features is an important research topic because of its widespread applications, such as face recognition, face grouping, and facial expression recognition. Extracting facial features [1] involves finding the exact locations of different face parts, such as eyes, nose, mouth, eyebrows, beard, mustache, and chin, and then separating these portions for further processing. Extraction of human face parts plays an important role in human face analysis [2], visual interpretation, and human face recognition [3, 4]. Face detection has attracted interest for a long time and has progressed considerably over the past few decades [5, 6, 7]; however, detection of human face parts is also of prime importance in a wide variety of applications, such as computer vision, facial animation, face recognition, facial expression detection, and face image database management. A human being can identify almost a thousand faces in a lifetime and can recognize faces without any trouble. For computer programs, however, it is a challenging task.

Many researchers, e.g., in [8], have attempted face detection and have shown its applications in various systems [9, 10, 11, 12]. Others have studied detection and extraction of individual face parts. Berbar et al. [13] carried out detection of faces in color images. Oravec et al. [14] used skin-color-based segmentation to extract the face and proposed methods for eye localization and mouth localization. Mahoor et al. [15] improved the Active Shape Model (ASM), which can be used for extraction of facial features. Shih et al. [16] detected face candidates, segmented using a Gaussian skin-color model, by their size and shape. They [16] built an ellipse model to roughly locate eyes and mouth and then used an SVM to classify them. All these studies are limited to two or three face parts. Moreover, accurate distance measures among face parts are also essential for practical use, e.g., in a query-driven face retrieval system.

Extracting face features involves two major steps: (1) locating the approximate position of the desired face part and (2) extracting the precise shape of that face part. Many image processing operations are available for color images. However, identifying exact face parts such as eyes and lips depends on choosing an appropriate image processing technique, which requires a practical study on a variety of face images. In this article, we study and try to improve the extraction of useful face parts. Furthermore, our study is on faces of Indian people, whose skin color differs from that of the faces already studied in the literature.

This article studies available, applicable techniques for extraction of face parts on our face dataset of 221 frontal face images, comprising 163 male and 58 female faces (13 of the 221 are children). Various researchers have studied extraction of individual face parts. Our study, however, focuses on how to use a detected face part in detecting and extracting other face parts, e.g., using the nose to locate the lips. In addition to the face boundary, we aim to extract five face parts: (1) eyes, (2) nose, (3) lips, (4) beard, and (5) mustache. We also detect eye points, nose tip, and lip center to measure the distances between the two eyes, from nose to eyes, from eyes to lips, and from nose to lips. These five face parts and the distance measures can provide substantial feature information for face classification, clustering, and query-based face retrieval, which might be useful to crime departments, various government organizations, matrimonial organizations, etc.

2. Background and related work

2.1 Background

Today, large amounts of human face image data need to be maintained and used. Organizations such as crime departments and government departments need to manage human image data efficiently and retrieve images according to their requirements. A smart system could manage and retrieve human images using specified face features. A human face contains many face features or face parts, including eyes, eye color, eyebrows, chin, nose, lips, beard, mustache, hair, hair color, skin color, spectacles, etc. Extracting such face parts from a face image requires image processing operations. Generally, in image processing [17], an image passes through stages such as image acquisition, image pre-processing, image segmentation, and feature extraction. Extraction of face features involves two important operations: face detection and face part detection. Face detection is the process of detecting a face in an image. Techniques such as Viola Jones [18], skin color detection, and morphological operations are used for detecting the face region.

Face part detection is the process of detecting facial parts within the face region. Different techniques are available for extracting different face parts. Generally, an image processing technique falls into one of two domains: the spatial domain or the transform domain. Spatial domain techniques deal directly with the image pixels. Spatial domain techniques used for facial feature extraction include Local Derivative Patterns (LDP), Local Binary Patterns (LBP), edge detection, smoothing for pre-processing, and gray scale manipulation. For face part extraction, the following spatial domain techniques are used: Principal Component Analysis (PCA), edge detection, skin color segmentation, Active Shape Models (ASM), and Gabor features.

Transform (frequency) domain techniques manipulate an orthogonal transform of the image rather than the image itself, and they are suited for processing the image according to its frequency content. Frequently used techniques for face feature detection are the Discrete Cosine Transform (DCT), Discrete Fourier Transform (DFT), Discrete Wavelet Transform (DWT), 2D-DCT, DT-DWT, and the Hough transform. The circular Hough transform is used for detecting the iris within the eye window: it finds circular shapes in the eye region and thereby locates the iris. The circular Hough transform gives accurate results for iris detection.

Figure 1.

Proposed work for extraction of human face features.

2.2 Related work

Khan et al. [19] proposed an algorithm that performs multiclass semantic segmentation of face parts using Conditional Random Fields (CRF). They segment six regions (hair, eyes, nose, mouth, skin, and background) using position, HSV color, and shape information in the CRF model. Happy and Routray [20] automatically recognized six universal facial expressions using features of salient facial patches. They localize the face as well as the facial landmark points and extract all major active regions of the face. Their work employed an SVM with the one-against-one classification method to classify an expression as Anger, Fear, Disgust, Happiness, Sadness, or Surprise. They exploited the localization of facial landmarks such as eyes, nose, lip corners, and eyebrow corners.

In recent years, researchers have also applied deep learning to the detection of faces and facial parts. For example, Ranjan et al. [21] proposed an algorithm using deep convolutional neural networks (CNN) that allows multi-task learning. Their single CNN model, called HyperFace, performs face detection, landmark localization, pose estimation, and gender recognition simultaneously. Similarly, Yang et al. [22] proposed a deep CNN, called Faceness-Net, that can detect faces even under severe occlusion and unconstrained pose variations. Their work employs supervision based on local facial parts and computes a faceness score from the spatial arrangement of face parts. However, for face detection, the works of Ranjan et al. [21] and Yang et al. [22] do not focus on extracting the exact face boundary. Zhang et al. [23] proposed the Tasks-Constrained Deep Convolutional Network (TCDCN), which uses auxiliary information such as head pose estimation, gender classification, age estimation, and facial expression recognition to optimize facial landmark detection. Their work focuses on landmarking the eyes, mouth, and nose.

Researchers have also tried to improve 3D facial recognition; however, facial recognition in 2D is more challenging than in 3D. Boukamcha et al. [24] presented a method using landmark point detection for 3D frontal faces. Their work exploits face segmentation and surface curvature information. Dhahri and Belaid [25] attempted to remove the beard from a 3D human face model. Their work segments the face to locate the beard area and then estimates whether a beard is present based on their proposed measure, which is computed from the angle between face normal vectors (SANV). Once the beard is located, it is removed by applying Taubin smoothing with regression of SANV.

In our work, we consider extraction of five face parts in total: eyes, nose, mouth, beard, and mustache. Prior to this work, we carried out a survey of techniques for extracting different facial features [26]. The survey covered the same five face parts and four approaches to face part detection: the geometry-based approach, the appearance-based approach, the color-based approach, and the template-based approach. We also compared the techniques on criteria such as the dataset used, the approach to feature detection, and the accuracy of results. The detailed survey and analysis are presented in [26].

From the literature survey [26], we conclude that geometry-based and color-based approaches are the more useful ones for finding face features. For image pre-processing, techniques such as image normalization, noise removal, and background removal are used. Face detection techniques such as Viola Jones, skin color detection, and morphological operations are used to detect a face in an image. For eye detection, the circular Hough transform, the YCbCr color model, the Gabor filter, and the Viola Jones technique are used. For mouth detection, techniques such as Snake models, ASM, the Viola Jones technique, and the multi-state mouth model are used to detect the mouth region and shape. For nose detection, Viola Jones, edge detection, and geometric templates are used. Active shape models and skin color information are used to detect beard and mustache.

Among the various methods, the Viola Jones (VJ) algorithm detects the face and face parts with high accuracy and low computational cost on frontal-view images. Face detection with skin color information gives good results, but it sometimes treats other skin-colored portions, such as neck and hands, as part of the face region. PCA, neural networks, and eigenfaces give very good results, but all of these are appearance-based methods that are computationally expensive and need a lot of training data. Thus, we consider the VJ algorithm preferable to the other techniques for face detection and face part detection because of its low computational cost and accurate results on frontal-view color images.

3. Proposed work to extract human face features

Figure 1 shows the architecture of our proposed work to extract face features from frontal face images. Furthermore, we demonstrate how a detected face part becomes useful in detecting other face parts. The detailed methodology and the detailed steps to extract the various face parts are discussed in Section 4. This section discusses the higher-level steps, which are as follows: image acquisition, image pre-processing, face region detection, detection of each face part, and feature extraction of each face part.

3.1 Image acquisition

There are various types of human images, such as posed and un-posed images, frontal and non-frontal view images, and images with uniform and non-uniform background. For this research, we use human face images that show a frontal view against a uniform background. As we aimed at extracting face features of Indian people, we collected passport-size photographs from a local photographer (from Kalol city of Gujarat, India). For face recognition and facial feature extraction, researchers have used datasets such as CVL [27], AR [28], LFW [29], and FERET [30]; we used some images from the CVL dataset to maintain variation in skin color in our database. We also selected some images available on the Internet. For detection of face parts, we have 221 images in total with a frontal face and uniform background under different lighting conditions. Our dataset contains 163 male and 58 female images (13 of the 221 are children). The images in our dataset are in .jpg or .png format.

3.2 Image pre-processing

Image pre-processing is required to get better results in the subsequent steps. In our work, images were pre-processed by resizing (normalizing) them to a common size. We used the built-in imresize() and imcrop() functions of Matlab [31] for resizing and cropping images. We also used a Lighting Compensation (LC) technique, such as the one used in [32], to deal with different lighting conditions.

3.3 Face region detection and face parts detection

Different algorithms are available for detecting a human face, such as the Viola Jones (VJ) algorithm [18], edge detection techniques like Canny and Sobel edge detection, skin color information, morphological operations, and geometric templates. However, VJ gives only a bounding box, and skin color detection, although it detects the face region, also includes the neck portion in the skin region. To overcome this problem, we propose to combine skin color detection and landmark detection to get an exact face boundary. After getting the face boundary from an image, the next step is to find the face parts such as eyes, nose, lips, beard, and mustache. To locate eyes, nose, and mouth, we used the VJ technique. To locate mustache and beard, we propose our own simple approach.

Figure 2.

Results of Lighting Compensation (LC) technique.

Figure 3.

Block diagram of skin detection process.

3.4 Extraction of face parts

Different techniques can be applied to extract the features of each face part. For eye point detection, we used a method based on color segmentation in the YCbCr color space. The nose region is detected using the Viola-Jones technique, which provides a bounding box for it; the nose tip is taken to be at the center of the bounding box. To detect the lip point from the mouth region, we apply face landmark detection. However, in many images, the technique fails to provide the correct boundary of the lips. To improve the detection accuracy, we use the bounding box obtained by applying Viola Jones, so that the lip landmarks are localized within this bounding box. To detect the mustache area, we select the area that lies between the detected mouth and the nose tip. Next, we search for skin-colored pixels and treat the remaining pixels as mustache or beard candidates. Based on this non-skin color, we can extract any mustache from a face. To detect the beard, we set a rule that a beard can only occupy the area below the mouth. Therefore, any non-skin area below the mouth corresponds to beard. We also apply a filter that removes any blobs smaller than a fixed threshold.

Figure 4.

Result of skin detection from a sample image.

4. Methodology and results

In this section, we detail our methodology and present results on our dataset. We implemented our approaches to face and face part detection in Matlab.

4.1 Image pre-processing

4.1.1 Image resizing and cropping

We used imresize() and imcrop() functions of Matlab to normalize the image and to crop the image, respectively.

Figure 5.

Result of landmark detection and face boundary extraction.

4.1.2 Lighting compensation (LC)

The images in our dataset have varying lighting conditions. The lighting compensation (LC) algorithm is effective in enhancing and restoring natural colors in images taken in dark or varying lighting conditions. Lighting compensation has therefore been used in skin and face detection, and it is considered indispensable for robust skin-tone color detection [33]. Figure 2 shows the result of the LC technique on a sample image.

Figure 6.

Results of face boundary extraction on sample images.

Figure 7.

Process of detecting eye centers, adapted from [36].

Figure 8.

Results of eye point detection and distance between two eyes.

4.2 Extraction of face boundary

To get the exact boundary of the face from an image, we used a combination of two techniques: 1) Skin detection and 2) Landmark detection.

4.2.1 Skin detection

Human skin segmentation aims to locate skin regions in an unconstrained input image. Most existing skin segmentation approaches use skin color as the basis of segmentation. In frontal face images, a substantial portion of the image is skin-colored. Therefore, many authors have used skin-color-based methods for face detection because of their suitability and speed. The technique helps to detect faces under different environmental variations. Figure 3 shows the steps to detect skin color.

Figure 9.

Diagram of process of nose tip detection.

We use color-based segmentation for skin detection. In this research, we used the YCbCr color space as the basis for detecting human skin, as in [34]. We apply a skin-color map to the chrominance components of the input image and use it to detect pixels that fall in the range of skin color. In the YCbCr color space, we use the following ranges of Cb and Cr, shown in Eq. (1) and taken from [34], which are representative of skin:

$\displaystyle\textit{skin}=(77\leqslant Cb\leqslant 127\ \&\ 133\leqslant Cr\leqslant 173)$ (1)

We also add $Y$ threshold, shown in Eq. (2), to improve the segmentation result:

$\displaystyle\textit{skin}=(90\leqslant Y\leqslant 180)$ (2)
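The thresholds in Eqs. (1) and (2) can be sketched as a per-pixel test. This is a minimal illustration and not the authors' Matlab code: it assumes the ITU-R BT.601 studio-range RGB-to-YCbCr conversion, and it assumes that the two equations are combined with a logical AND. The function names are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to YCbCr (ITU-R BT.601, studio range; an assumption here)."""
    y = 16.0 + (65.481 * r + 128.553 * g + 24.966 * b) / 255.0
    cb = 128.0 + (-37.797 * r - 74.203 * g + 112.0 * b) / 255.0
    cr = 128.0 + (112.0 * r - 93.786 * g - 18.214 * b) / 255.0
    return y, cb, cr


def is_skin(r, g, b):
    """Apply the Cb/Cr ranges of Eq. (1) together with the Y range of Eq. (2)."""
    y, cb, cr = rgb_to_ycbcr(r, g, b)
    # Combining both equations with AND is an assumption of this sketch.
    return 77 <= cb <= 127 and 133 <= cr <= 173 and 90 <= y <= 180


def skin_mask(image):
    """image: rows of (R, G, B) tuples; returns rows of 1 (skin) / 0 (non-skin)."""
    return [[1 if is_skin(*pixel) else 0 for pixel in row] for row in image]
```

For example, `is_skin(200, 150, 120)` is true for this light skin tone, while pure white and pure blue are rejected by the Y and chrominance ranges, respectively.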

The detected skin is then converted to a binary image in which white pixels indicate skin and black pixels indicate non-skin. Morphological operations such as dilation and erosion are applied next, after which we remove small blobs using a filter. Finally, we use the biggest blob as a mask on the original image to get the skin region. Figure 4 shows the result of skin detection on a sample image.
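The "biggest blob" step can be sketched as a connected-component search over the binary skin image. The sketch below assumes 4-connectivity and omits the dilation and erosion steps; the function name is hypothetical.

```python
from collections import deque


def largest_blob(mask):
    """Return a mask that keeps only the largest 4-connected blob of 1-pixels."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    best = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill collects one blob.
            blob = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(blob) > len(best):
                best = blob
    out = [[0] * cols for _ in range(rows)]
    for y, x in best:
        out[y][x] = 1
    return out
```

Keeping only the largest blob discards isolated skin-colored patches in one pass, so the separate small-blob filter matters mainly when more than one region is retained.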

4.2.2 Landmark detection

After skin detection, we applied a landmark detection technique [35] on the face region obtained from skin detection. The landmark detection method gives 66 face landmarks that outline the lower face boundary and the facial features such as eyes, nose, mouth, and eyebrows. We found that landmarks are not detected successfully on the original images. To improve the result, we feed only the skin portion detected in the previous step to this method. Since the neck region, which may have a similar skin color, can appear only at the bottom of the face, we use only the bottom landmarks of the face to create the face boundary.
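One way to picture how the bottom landmarks remove the neck is to treat them as a polyline and clear every skin pixel that lies below it. The sketch below is an illustration under assumptions: landmarks are given as (x, y) pixel coordinates with y increasing downward, the jawline is interpolated linearly between landmarks, and columns outside the jaw span are left untouched. The function names are hypothetical.

```python
def jaw_y_at(x, jaw_points):
    """Linearly interpolate the jawline row at column x; None outside the jaw span."""
    pts = sorted(jaw_points)
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= x <= x1:
            if x1 == x0:
                return max(y0, y1)
            t = (x - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)
    return None


def cut_below_jaw(mask, jaw_points):
    """Clear skin pixels that lie below the jawline polyline (likely neck)."""
    out = [row[:] for row in mask]
    for y, row in enumerate(out):
        for x in range(len(row)):
            jaw_y = jaw_y_at(x, jaw_points)
            if jaw_y is not None and y > jaw_y:
                row[x] = 0
    return out
```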

Figure 5 shows result of landmark detection and face boundary extraction. Figure 6 shows results of face boundary extraction on sample images of our database.

4.3 Feature extraction

We extracted facial features such as eye centers, distance between two eyes, height and width of a face, nose, lip shape, distances between nose to eyes, distances between eyes to mouth, beard, and mustache region.

Figure 10.

Results of nose detection.

4.3.1 Eye point detection

To detect eye centers, we adapt the method of Nasiri et al. [36], removing the overhead of its geometric tests, and combine it with the Viola-Jones method, which detects a bounding box of the eye pair. Their method [36] finds candidate eye pairs on the whole face and then selects the appropriate one based on four geometric tests: (1) eye-pair distance, (2) eye-center distance, (3) eye-angle test, and (4) eye-shape test. Figure 7 shows our way of detecting eye centers. We first detect a bounding box of the eyes using the Viola-Jones method. We then divide this bounding box into two parts: the right eye and left eye portions. For each eye portion, we follow the steps indicated in [36] with our modifications. Each portion is converted from RGB to YCbCr color space. Next, as in [36], we build two eye maps, EyeMapC (from the chrominance components) and EyeMapL (from the luminance component), and merge them with an AND operation to obtain the final map, EyeMap. Details and formulas are omitted for simplicity; interested readers can refer to [36]. Because we detect the eye bounding box directly, we never obtain multiple candidate eye pairs and therefore do not need any of the geometric tests used in [36]. Next, we apply morphological operations such as dilation on the binary image to get the exact locations of the eye centers. Figure 8 shows results of eye point detection and the distance between the two eyes.
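The last part of this procedure, splitting the eye-pair box and locating one center per half, can be sketched as follows. The sketch assumes that a binary EyeMap for the eye-pair bounding box is already available (the EyeMap construction of [36] is not reproduced), that the box is split at its horizontal midpoint, and that each eye center is taken as the centroid of the 1-pixels in its half. The function names are hypothetical.

```python
import math


def centroid(mask, x_start, x_end):
    """Centroid (x, y) of 1-pixels in columns [x_start, x_end); None if there are none."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(mask):
        for x in range(x_start, x_end):
            if row[x]:
                sum_x += x
                sum_y += y
                count += 1
    return (sum_x / count, sum_y / count) if count else None


def eye_centers(eye_map):
    """Split the eye-pair map at its midpoint; return (image-left, image-right) centers."""
    width = len(eye_map[0])
    mid = width // 2
    return centroid(eye_map, 0, mid), centroid(eye_map, mid, width)


def eye_distance(center_a, center_b):
    """Euclidean distance in pixels between two eye centers."""
    return math.hypot(center_a[0] - center_b[0], center_a[1] - center_b[1])
```

The coordinates are relative to the eye-pair bounding box, so the box offset has to be added to express the centers in full-image coordinates.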

Figure 11.

Flowchart of process of lip shape extraction.

4.3.2 Nose tip detection

The nose tip is detected using Viola-Jones alone, without any additional method. In many cases, this method detects the nose area with high accuracy as long as it is fed an appropriate input. In this research, we use the image portion below the eye centers as input. Figure 9 shows the process of nose tip detection, and Fig. 10 shows the result of nose tip detection, which is used to measure the distances between the eyes and the nose tip.
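The geometry of this step is simple enough to sketch. The code below assumes bounding boxes in (x, y, width, height) form with y increasing downward, takes the search window to start at the lower of the two eye centers, and places the nose tip at the center of the detected nose box, as described in Section 3.4. The detector itself is not included, and the function names are hypothetical.

```python
def region_below_eyes(image_height, image_width, left_eye, right_eye):
    """Search window (x, y, w, h) covering the image rows below both eye centers."""
    top = int(max(left_eye[1], right_eye[1]))
    return (0, top, image_width, image_height - top)


def nose_tip(search_window, nose_box):
    """Nose tip as the center of the nose box, in full-image coordinates.

    nose_box is (x, y, w, h) relative to the search window.
    """
    window_x, window_y, _, _ = search_window
    x, y, w, h = nose_box
    return (window_x + x + w / 2.0, window_y + y + h / 2.0)
```

For a 160 x 200 image with eye centers at (50, 80) and (110, 82), the window starts at row 82, and a nose box (60, 20, 40, 30) inside that window gives a nose tip at (80.0, 117.0).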

4.3.3 Lip point detection and shape extraction

Figure 11 shows major steps of lip extraction.

When locating the mouth, the existing landmark detection method applied to the whole face image fails to give the correct lip boundary in many images. To improve the detection accuracy, we assume that the mouth lies below the nose tip and therefore take the image portion below the nose. We then apply the Viola-Jones method on this image portion to detect the mouth. Next, we apply face landmark detection on the mouth bounding box to get the lip boundary. The landmark detector fits several lip models to the face and selects the one that best suits the mouth shape. Finally, we find the centroid of the mouth using the regionprops() function, then find the mouth corners and calculate the height and width of the mouth.
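The final measurements can be sketched on a binary mouth mask. The sketch below is a rough stand-in for what Matlab's regionprops() reports as the centroid and bounding box of a region; it assumes that the mask contains a single mouth region and that the mouth corners are the leftmost and rightmost region pixels. The function name is hypothetical.

```python
def mouth_properties(mask):
    """Centroid, corners, width, and height of the 1-pixels in a binary mouth mask."""
    pixels = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not pixels:
        return None
    xs = [p[0] for p in pixels]
    ys = [p[1] for p in pixels]
    return {
        "centroid": (sum(xs) / len(xs), sum(ys) / len(ys)),
        "left_corner": min(pixels),   # smallest x (ties: smallest y)
        "right_corner": max(pixels),  # largest x (ties: largest y)
        "width": max(xs) - min(xs) + 1,
        "height": max(ys) - min(ys) + 1,
    }
```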

Figure 12 shows the result of lip point detection, lip shape extraction, and distance between different face features on a sample image.

4.3.4 Mustache region extraction

Figure 13 shows the steps of mustache region extraction. The detection of the mustache is based on the method available in [37], which is a segmentation method: it segments the image according to the initial input given to it. If we input a skin area, the method searches for pixels whose color is similar to skin. At the end, we get the areas that are similar to skin and the areas that are not. In this research, we automatically generate the input for the method from a skin image, namely an area on the nose. The program then searches for skin color, puts all such pixels in the skin portion, and puts all other pixels, such as mustache, beard, and hair, in the non-skin portion. From the non-skin portion, we can easily extract any mustache or beard from the face. To select the mustache area, we take the non-skin area that lies between the detected mouth and the nose tip. Figure 14 shows the results of mustache region extraction.
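The selection rule at the end of this procedure can be sketched as a row-band filter on the non-skin mask. The sketch assumes that the skin/non-skin segmentation of [37] has already been computed, and it restricts only the rows: the text does not state a horizontal limit, so none is applied here. The function name and parameters are hypothetical.

```python
def mustache_region(non_skin, nose_tip_row, mouth_top_row):
    """Keep non-skin pixels strictly below the nose tip and above the mouth."""
    out = [[0] * len(row) for row in non_skin]
    first = max(nose_tip_row + 1, 0)
    last = min(mouth_top_row, len(non_skin))
    for y in range(first, last):
        for x, value in enumerate(non_skin[y]):
            if value:
                out[y][x] = 1
    return out
```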

Figure 12.

Results of lip point detection and lip shape extraction on a sample image.

4.3.5 Mouth region refinement

In some cases, the mustache detected in the previous steps overlaps the previously detected mouth portion. The mustache then affects the measurement of mouth properties such as width, height, and center. Therefore, we refine the mouth area based on the detected mustache region by setting a new rule: the mouth cannot overlap the mustache region, so we lower the upper boundary of the mouth to the lower edge of the mustache.
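This refinement rule can be sketched on a mouth bounding box and a binary mustache mask. The sketch assumes boxes in (x, y, width, height) form with y increasing downward, and it considers only mustache pixels within the mouth's column range. The function name is hypothetical.

```python
def refine_mouth_box(mouth_box, mustache_mask):
    """Lower the top of the mouth box to just below the overlapping mustache rows."""
    x, y, w, h = mouth_box
    # Rows that contain mustache pixels within the mouth's columns.
    rows = [r for r, row in enumerate(mustache_mask) if any(row[x:x + w])]
    if not rows:
        return mouth_box
    new_top = max(rows) + 1
    if new_top <= y:
        return mouth_box  # mustache ends above the mouth: no overlap
    bottom = y + h
    new_top = min(new_top, bottom)
    return (x, new_top, w, bottom - new_top)
```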

Figure 13.

Steps of mustache region extraction.

Figure 14.

Results of mustache region detection on a few sample images.

Figure 15.

Steps of beard detection.

4.3.6 Beard region extraction

When we apply the method to detect the mustache, we also get the beard area at the same time. Therefore, we use the same approach to get the beard area. We set a rule that a beard can only occupy the area below the mouth, so any non-skin area below the mouth corresponds to beard. However, we also add a rule that a beard region must be more than 15 pixels high, based on observations of beard images in our dataset. In addition, we apply a morphological filter that removes any blobs (non-skin areas) smaller than a threshold area. Figure 15 shows the steps of beard region detection, and Fig. 16 shows its results on two sample images.
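These rules can be sketched as a filter over connected non-skin blobs below the mouth. The sketch makes several assumptions: the height rule is applied per blob, blobs are 4-connected, and the area threshold `min_area` is a hypothetical value, since the text does not state the threshold used.

```python
from collections import deque


def beard_region(non_skin, mouth_bottom_row, min_height=15, min_area=50):
    """Keep non-skin blobs below the mouth that are tall and large enough."""
    rows = len(non_skin)
    cols = len(non_skin[0]) if rows else 0
    start = max(mouth_bottom_row + 1, 0)  # only rows below the mouth count
    seen = [[False] * cols for _ in range(rows)]
    out = [[0] * cols for _ in range(rows)]
    for r in range(start, rows):
        for c in range(cols):
            if not non_skin[r][c] or seen[r][c]:
                continue
            blob = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (start <= ny < rows and 0 <= nx < cols
                            and non_skin[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blob_rows = [y for y, _ in blob]
            height = max(blob_rows) - min(blob_rows) + 1
            # Beard rule: more than min_height pixels high, at least min_area pixels.
            if height > min_height and len(blob) >= min_area:
                for y, x in blob:
                    out[y][x] = 1
    return out
```

Raising `min_height` or `min_area` trades missed sparse beards for fewer false positives from shadows or collars below the chin.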

5. Evaluation

Figure 16.

Results of beard region detection.

Figure 17.

Results of detection of all facial features.

This section presents quantitative results on the accuracy of detection and extraction of the different face parts. We performed all the experiments using MATLAB 2015a. Figure 17 shows images of different people with all extracted features, such as eyes, nose, lips, beard, mustache, and distances between face parts. Table 1 shows the accuracies of detection of the face parts. Our results are comparable with those of other researchers; the comparison is given in Table 4. For face extraction, our dataset contains 270 images with frontal human faces. The face boundary was extracted accurately on 221 of these images, giving an accuracy of 81.85% for face boundary detection. For extraction of face parts, we therefore use the 221 faces with an extracted face boundary and perform detection of face parts on them as per the methodology presented in Section 4.

Table 1

Accuracy of detection of different face features

Face features	Feature detected ratio (correct results/total images)	Accuracy (%)
Eye center	209/221	94.57
Nose tip	217/221	98.99
Mouth points	213/221	96.38
Lip shape	178/221	80.54

Table 2

Different features (various distances) of a few face images

IMG	LE to RE	RE to NTip	LE to NTip	RE to MT	LE to MT	NTip to MT	MT-W	MT-H
1.jpg	83	65	71	95	97	33	84	41
2.jpg	86	62	60	95	97	43	71	22
3.jpg	97	82	86	134	135	56	97	55
4.jpg	81	61	60	101	104	49	66	26
5.jpg	91	66	72	99	101	37	84	39
6.jpg	86	69	72	107	106	42	77	28
7.jpg	88	66	73	100	102	37	104	23
8.jpg	96	63	68	100	101	44	78	33
9.jpg	107	84	82	123	121	46	91	33

Figure 18.

Faces that we considered as without mustache.

Figure 19.

Faces for which we get False Positive (FP) beard.

Table 2 shows the extracted features of face parts. We use the following notation in the table: LE stands for Left Eye, RE stands for Right Eye, NTip stands for Nose Tip, MT stands for mouth, W stands for width, and H stands for height. The first column indicates the name of an image in our dataset, and all other columns represent various distances, in terms of number of pixels.
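The six distance columns of Table 2 can be sketched as pairwise distances between the four detected points. The sketch assumes Euclidean distance between (x, y) pixel coordinates, which the text does not state explicitly; the function name is hypothetical.

```python
import math


def face_distances(le, re, ntip, mt):
    """Pairwise pixel distances between eye centers, nose tip, and mouth center."""
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    return {
        "LE to RE": dist(le, re),
        "RE to NTip": dist(re, ntip),
        "LE to NTip": dist(le, ntip),
        "RE to MT": dist(re, mt),
        "LE to MT": dist(le, mt),
        "NTip to MT": dist(ntip, mt),
    }
```

For instance, eye centers at (0, 0) and (6, 0) with a nose tip at (3, 4) give an eye-to-eye distance of 6 pixels and eye-to-nose distances of 5 pixels each.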

Our database includes a variety of people from both rural and urban areas. In our experience, people from rural areas are less often clean-shaven. Whether to count short stubble in the mustache region as a mustache was therefore a question for us, and its answer is subjective. Figure 18 shows a few faces that we considered to be without mustache. We faced the same problem for beards.

Table 3 shows the results of beard and mustache detection and extraction. For mustache detection, we obtained comparable results; for beard detection, however, we obtained only 73.30% accuracy. The lower accuracy for beard is due to a high false positive rate. Figure 19 shows a few faces for which we got a False Positive beard.

Table 4 compares our results with other researchers' work. For eye detection, the authors of [38] used the circular Hough transform with unsupervised k-means for eyeball detection. They used 415 eye images from the FERET database and got 90% accuracy on testing. Vukadinovic and Pantic [39] used the VJ method to detect the eye region and a Gabor filter for locating the eye box. They used the Cohn-Kanade database and got 93% accuracy. A recent work by Yang et al. [22] used a convolutional neural network (CNN) for eye detection and achieved 95.87% accuracy on the cropped dataset and 97.19% on the uncropped dataset. Khan et al. in their recent work [19] used conditional random fields (CRF) for eye detection and got 91.87% and 84.56% accuracy on the FASSEG V2 and FASSEG V4 databases, respectively. We used 221 color images of human faces and achieved 94.57% accuracy on eyeball and eye center detection.

Table 3

Performance measure of beard and mustache detection

Face features	TP	TN	FP	FN	Accuracy (%)
Mustache	63	141	02	15	92.30
Beard	31	131	51	08	73.30
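The accuracies in Table 3 are consistent with the usual definition, accuracy = (TP + TN) / (TP + TN + FP + FN), and the counts also make the false positive problem for beard explicit. The sketch below recomputes both quantities from the table; the function names are hypothetical.

```python
def accuracy(tp, tn, fp, fn):
    """Share of correct decisions, in percent."""
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)


def false_positive_rate(tp, tn, fp, fn):
    """Share of actual negatives that were flagged as positive, in percent."""
    return 100.0 * fp / (fp + tn)


# Confusion counts (TP, TN, FP, FN) taken from Table 3.
mustache = (63, 141, 2, 15)
beard = (31, 131, 51, 8)

mustache_accuracy = accuracy(*mustache)      # 204 correct out of 221
beard_accuracy = accuracy(*beard)            # 162 correct out of 221
beard_fpr = false_positive_rate(*beard)      # 51 false positives among 182 negatives
mustache_fpr = false_positive_rate(*mustache)
```

With these counts, 51 of the 182 faces without a beard are flagged as bearded, against 2 of 143 for mustache, which is where most of the gap between the two accuracies comes from.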

Table 4

Comparison between results of our approaches with others’ methods for eye detection, nose detection, mouth/lip detection, mustache detection, and beard detection

Face part	Method	Accuracy
Eye detection	Circular Hough transform [38]	90.00%
	VJ and gabor filter [39]	93.00%
	Faceness-Net: deep convolutional neural network (CNN) [22]	95.87% (Cropped dataset)
		97.19% (Uncropped dataset)
	Conditional Random Fields (CRF) [19]	91.87% (FASSEG V2 database)
		84.56% (FASSEG V4 database)
	Our approach	94.57%
Nose detection	VJ and gabor filter [39]	93.00%
	Geometric template and region growing method [40]	93.33%
	Faceness-Net: deep convolutional neural network (CNN) [22]	92.09% (Cropped dataset)
		91.25% (Uncropped dataset)
	Conditional Random Fields (CRF) [19]	68.97% (FASSEG V2 database)
		59.23% (FASSEG V4 database)
	Our approach	98.99%
Mouth/lip detection	Active contour model [41]	90.00%
	Gabor filter and VJ [39]	93.00%
	Faceness-Net: deep convolutional neural network (CNN) [22]	94.17% (Cropped dataset)
		93.55% (Uncropped dataset)
	Conditional Random Fields (CRF) [19]	84.17% (FASSEG V2 database)
		74.53% (FASSEG V4 database)
	Our approach	96.38%
Mustache detection	Modified Active Shape Model [42]	98.80% (MBGC dataset)
		97.00% (FERET dataset)
	Geometric template and image binarization [43]	89.00%
	Our approach	92.30%
Beard detection	Modified Active Shape Model [42]	96.20% (MBGC dataset)
		95.80% (FERET dataset)
	Geometric template and image binarization [43]	89.00%
	Our approach	73.30%

For nose detection, Yin and Basu [40] used a geometric template and region growing method on 270 frames from real video sequences and got 93.33% accuracy. We used the VJ technique to detect the nose and the midpoint of its bounding box to locate the nose tip, and got 98.99% accuracy. For lip detection, Le and Savvides [41] used an active contour method on the MBGC dataset and obtained 90% accuracy. The use of a Gabor filter and VJ for mouth/lip detection in [39] achieved 93.00% accuracy. Recently, for mouth detection, Yang et al. [22] used a CNN and achieved 94.17% and 93.55% accuracy on the cropped and uncropped datasets, respectively. We used landmark detection and the VJ method to detect lip shape and points and got 96.38% accuracy.

For beard and mustache detection, Le et al. [42] used a modified active shape model on two datasets: MBGC and FERET. For mustache detection, they achieved 98.80% and 97.00% accuracy on the MBGC and FERET databases, respectively. For beard detection, they achieved 96.20% and 95.80% accuracy on the MBGC and FERET databases, respectively. Wang and Yau [43] used beard detection to recognize gender. They used a geometric template and image binarization and achieved 89% accuracy on the FERET dataset. We used the level set evolution method to differentiate skin and non-skin regions and morphological operations to detect the beard and mustache regions. We could achieve 92.30% accuracy for beard detection, but for mustache detection, accuracy was comparatively low, 73.30%.
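As an illustration of the connected component step applied to a non-skin mask, the sketch below labels 4-connected regions in a binary mask and keeps the largest one. Keeping only the largest region is an assumption made for this example, not a rule taken from this article, and the function name is hypothetical. The level set evolution and morphological steps are not reproduced here.

```python
from collections import deque
from typing import List, Tuple


def largest_component(mask: List[List[bool]]) -> List[List[bool]]:
    """Keep only the largest 4-connected region of True cells in a binary mask."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    best: List[Tuple[int, int]] = []

    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first search collects one connected region.
            seen[r][c] = True
            queue = deque([(r, c)])
            component = []
            while queue:
                cr, cc = queue.popleft()
                component.append((cr, cc))
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if 0 <= nr < rows and 0 <= nc < cols and mask[nr][nc] and not seen[nr][nc]:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if len(component) > len(best):
                best = component

    out = [[False] * cols for _ in range(rows)]
    for r, c in best:
        out[r][c] = True
    return out


# Toy non-skin mask: a 3-pixel region on the left and a 2-pixel region on the right.
toy = [
    [True, False, False, False],
    [True, True, False, True],
    [False, False, False, True],
]
cleaned = largest_component(toy)
```

In this toy mask, the smaller right-hand region is discarded and only the three connected cells on the left remain set.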

From the extracted face parts, features such as face color, face shape, eyeball color, nose shape, and lip shape can be built, which are useful in query-driven face retrieval. Such features can be used to design a system in which a user enters a query describing the expected facial features and the system retrieves the matching faces. This can be useful in a variety of applications, e.g., in a crime department. We are currently experimenting with ways to improve mustache and beard detection and extraction.

6. Conclusion

In this research, we first studied the different methods used for extracting each facial feature: eyes, nose, mouth, beard, and mustache. After analyzing the available techniques for face part detection, we concluded that techniques from the geometry-based and color-based approaches can give accurate results for frontal-face images. We implemented the proposed work in Matlab. We prepared our own face dataset containing a total of 270 color images of human faces. For face boundary extraction, we used a combination of skin color detection and landmark detection. Skin color detection finds the skin-colored region, and the landmark points along the bottom of the face help to remove the neck portion that skin color detection also picks up.
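The face boundary step can be sketched as follows. This is a simplified illustration and not the Matlab implementation used in this work: the RGB to YCbCr conversion uses the full-range BT.601 coefficients, the Cb and Cr skin ranges are commonly quoted values chosen here as an assumption, and neck removal is reduced to clearing every mask row below a single hypothetical chin landmark row (`chin_row`).

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255


def rgb_to_ycbcr(r: int, g: int, b: int) -> Tuple[float, float, float]:
    """Convert one RGB pixel to full-range YCbCr (BT.601 coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def is_skin(pixel: Pixel,
            cb_range: Tuple[float, float] = (77.0, 127.0),    # assumed threshold
            cr_range: Tuple[float, float] = (133.0, 173.0)) -> bool:  # assumed threshold
    """Classify a pixel as skin if its chrominance falls inside both ranges."""
    _, cb, cr = rgb_to_ycbcr(*pixel)
    return cb_range[0] <= cb <= cb_range[1] and cr_range[0] <= cr <= cr_range[1]


def skin_mask(image: List[List[Pixel]]) -> List[List[bool]]:
    """Build a binary skin mask for an image stored as rows of RGB pixels."""
    return [[is_skin(p) for p in row] for row in image]


def remove_below_chin(mask: List[List[bool]], chin_row: int) -> List[List[bool]]:
    """Clear all mask rows below the chin row (simplified neck removal)."""
    return [row[:] if i <= chin_row else [False] * len(row)
            for i, row in enumerate(mask)]


# Toy example: one skin-toned pixel and one pure blue pixel.
print(skin_mask([[(200, 150, 120), (0, 0, 255)]]))  # [[True, False]]
```

Thresholding only the chrominance channels (Cb, Cr) and ignoring luminance (Y) is what makes this kind of skin detection relatively tolerant of brightness changes, which is the usual motivation for working in YCbCr rather than RGB.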

We extracted the face boundary correctly for 221 of the 270 images. To detect further features, we used the Viola Jones method to obtain the regions of the eyes, mouth, and nose. We used the YCbCr color space and morphological operations to detect eye centers. We used the median to detect the nose tip. For lip shape detection, we used a combination of landmark points and the VJ technique. For mustache and beard detection, we used the level set evolution technique to detect the non-skin region of the face, and then applied morphological operations and the connected component technique. We achieved 81.85% accuracy on face boundary extraction, and 94.57%, 98.99%, 96.38%, and 80.54% accuracy for eye, nose, lip, and lip shape detection, respectively. Thus, we demonstrated that a detected face part can provide substantial input for detecting other face parts accurately. In the future, we plan to study image retrieval based on face features, which is useful in query-driven image retrieval.
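The face boundary accuracy follows directly from the counts reported above. A minimal sketch of the computation, assuming accuracy is the percentage of images for which the part was extracted correctly (the function name is hypothetical):

```python
def accuracy_percent(correct: int, total: int) -> float:
    """Return the percentage of correctly processed images."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * correct / total


# 221 correctly extracted face boundaries out of 270 images.
print(f"{accuracy_percent(221, 270):.2f}%")  # 81.85%
```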

References

1. Wang, Yen, Hsiao. Facial feature extraction and applications: A review. In: Asian Conference on Intelligent Information and Database Systems 2012 Mar 19; (pp. 228-238). Springer, Berlin, Heidelberg.

2. Samal, Iyengar. Automatic recognition and analysis of human faces and facial expressions: A survey. Pattern Recognition 1992 Jan 1; 25(1): 65-77.

3. Chellappa, Wilson, Sirohey. Human and machine recognition of faces: A survey. Proceedings of the IEEE 1995 May; 83(5): 705-741.

4. Zhao, Chellappa, Phillips, Rosenfeld. Face recognition: A literature survey. ACM Computing Surveys (CSUR) 2003 Dec 1; 35(4): 399-458.

5. Sakai, Nagao, Kanade. Computer analysis and classification of photographs of human faces. Kyoto University; 1972 Oct.

6. Craw, Ellis, Lishman. Automatic extraction of face-features. Pattern Recognition Letters 1987 Feb 1; 5(2): 183-187.

7. Yuille, Hallinan, Cohen. Feature extraction from faces using deformable templates. International Journal of Computer Vision 1992 Aug 1; 8(2): 99-111.

8. Zhou. A survey of face detection, extraction and recognition. Computing and Informatics 2012 Feb 20; 22(2): 163-195.

9. Acosta, Torres, Albiol, Delp. An automatic face detection and recognition system for video indexing applications. Acoustics, Speech, and Signal Processing (ICASSP), 2002; IEEE International Conference on 2002 May 13; (4, pp. IV-3644). IEEE.

10. Lee, Kim. Video summarization and retrieval system using face recognition and mpeg-7 descriptors. In: International Conference on Image and Video Retrieval 2004 Jul 21; (pp. 170-178). Springer, Berlin, Heidelberg.

11. Kim. Intelligent immigration control system by using passport recognition and face verification. In: International Symposium on Neural Networks 2005 May 30; (pp. 147-156). Springer, Berlin, Heidelberg.

12. Liu, Wang, Feng. iBotGuard: An Internet-based intelligent robot security system using invariant face recognition against intruder. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) 2005 Feb; 35(1): 97-105.

13. Berbar, Kelash, Kandeel. Faces and facial features detection in color images. In: Geometric Modeling and Imaging – New Trends 2006; (pp. 209-214). IEEE.

14. Oravec, Kristof, Kolarik, Pavlovicova. Extraction of facial features from color images. Radioengineering 2008 Sep 1; 17(3): 115-120.

15. Mahoor, Abdel-Mottaleb, Ansari. Improved active shape model for facial feature extraction in color images. Journal of Multimedia 2006 Jul; 1(4): 21-28.

16. Shih, Cheng, Chuang, Wang. Extracting faces and facial features from color images. International Journal of Pattern Recognition and Artificial Intelligence 2008 May; 22(03): 515-534.

17. Jain. Fundamentals of digital image processing. Englewood Cliffs, NJ: Prentice Hall; 1989.

18. Viola, Jones. Robust real-time face detection. International Journal of Computer Vision 2004 May 1; 57(2): 137-154.

19. Khan, Ahmad, Ullah, Din. Multiclass semantic segmentation of faces using CRFs. Turkish Journal of Electrical Engineering and Computer Sciences 2017 Jul 30; 25(4): 3164-3174.

20. Happy, Routray. Automatic facial expression recognition using features of salient facial patches. IEEE Transactions on Affective Computing 2015 Jan 1; 6(1): 1-12.

21. Ranjan, Patel, Chellappa. Hyperface: A deep multi-task learning framework for face detection, landmark localization, pose estimation, and gender recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 2019 Jan 1; 41(1): 121-135.

22. Yang, Luo, Loy, Tang. Faceness-net: Face detection through deep facial part responses. IEEE Transactions on Pattern Analysis and Machine Intelligence 2018 Aug 1; 40(8): 1845-1859.

23. Zhang, Luo, Loy, Tang. Learning deep representation for face alignment with auxiliary attributes. IEEE Transactions on Pattern Analysis and Machine Intelligence 2016 May 1; 38(5): 918-930.

24. Boukamcha, Hallek, Smach, Atri. Automatic landmark detection and 3D face data extraction. Journal of Computational Science 2017 Jul 1; 21: 340-348.

25. Dhahri, Belaid. A new method to detect and remove a beard from 3D human face model. International Journal of Operational Research 2016; 27(1-2): 201-211.

26. Brahmbhatt, Prajapati, Dabhi. Survey and analysis of extraction of human face features. In: Power and Advanced Computing Technologies (i-PACT), 2017 Innovations in 2017 Apr 21; (pp. 1-8). IEEE.

27. Solina, Peer, Batagelj, Juvan, Kovač. Color-based face detection in the “15 seconds of fame” art installation. In: Mirage, Conf Computer Vision/Computer Graphics Collaboration for Model-Based Imaging, Rendering, Image Analysis and Graphical Special Effects 2003 Mar; (pp. 38-47). Inria.

28. Martinez. The AR face database. CVC Technical Report. 1998.

29. Huang, Mattar, Lee, Learned-Miller. Learning to align from scratch. In: Advances in Neural Information Processing Systems 2012; (pp. 764-772).

30. Phillips, Wechsler, Huang, Rauss. The FERET database and evaluation procedure for face-recognition algorithms. Image and Vision Computing 1998 Apr 27; 16(5): 295-306.

31. Solomon, Breckon. Fundamentals of Digital Image Processing: A Practical Approach with Examples in Matlab. John Wiley and Sons; 2011 Jul 5.

32. Hsu, Abdel-Mottaleb, Jain. Face detection in color images. IEEE Transactions on Pattern Analysis and Machine Intelligence 2002 May; 24(5): 696-706.

33. Rahman, Afrin. Human face detection in color images with complex background using triangular approach. Global Journal of Computer Science and Technology 2013 May 31.

34. Patravali, Wayakule, Katre. Skin segmentation using YCBCR and RGB color models. International Journal of Advanced Research in Computer Science and Software Engineering 2014 Jul; 4(7).

35. Huang, Zhang, Yan, Metaxas. Pose-free facial landmark fitting via optimized part mixtures and cascaded deformable shape model. In: Proceedings of the IEEE International Conference on Computer Vision 2013; (pp. 1944-1951).

36. Nasiri, Khanchi, Pourreza. Eye detection algorithm on facial color images. In: Modeling and Simulation, 2008 AICMS 08. Second Asia International Conference on 2008 May 13; (pp. 344-349). IEEE.

37. Gui, Fox. Level set evolution without re-initialization: A new variational formulation. In: Computer Vision and Pattern Recognition, 2005 CVPR 2005. IEEE Computer Society Conference on 2005 Jun 20; (1, pp. 430-436). IEEE.

38. van Huan, Binh, Kim. Eye feature extraction using K-means clustering for low illumination and iris color variety. In: Control Automation Robotics and Vision (ICARCV), 2010 11th International Conference on 2010 Dec 7; (pp. 633-637). IEEE.

39. Vukadinovic, Pantic. Fully automatic facial feature point detection using Gabor feature based boosted classifiers. In: Systems, Man and Cybernetics, 2005 IEEE International Conference on 2005 Oct 10; (2, pp. 1692-1698). IEEE.

40. Yin, Basu. Nose shape estimation and tracking for model-based coding. In: Acoustics, Speech, and Signal Processing, 2001 Proceedings (ICASSP’01). 2001 IEEE International Conference on 2001; (3, pp. 1477-1480). IEEE.

41. Le, Savvides. A novel shape constrained feature-based active contour model for lips/mouth segmentation in the wild. Pattern Recognition 2016 Jun 1; 54: 23-33.

42. Le, Luu, Seshadri, Savvides. Beard and mustache segmentation using sparse classifiers on self-quotient images. In: Image Processing (ICIP), 2012 19th IEEE International Conference on 2012 Sep 30; (pp. 165-168). IEEE.

43. Wang, Yau. Real-time beard detection by combining image decolorization and texture detection with applications to facial gender recognition. In: Computational Intelligence in Biometrics and Identity Management (CIBIM), 2013 IEEE Workshop on 2013 Apr 16; (pp. 58-65). IEEE.