Impact of self adaptive-elephant herding optimization towards neural network for facial emotion recognition

Abstract

Facial expression is one of the most efficient, universal and fundamental indicators of human emotions and intentions. Many studies have addressed automatic Facial Emotion Recognition (FER) owing to its practical significance in medical diagnosis, driver stress monitoring, sociable robots, and other human-computer interface devices. The proposed framework consists of two processes: (i) feature extraction and (ii) classification. The major novelty lies in the initial phase (i.e. the feature extraction phase), where Proposed Local Vector Pattern (Proposed-LVP) based features are extracted. In addition to the Proposed-LVP, Discrete Wavelet Transform (DWT) and Gray Level Co-occurrence Matrix (GLCM) based features are also extracted. The Principal Component Analysis (PCA) method is then used to reduce the dimension of the features. The reduced features are subjected to the classification process, where an optimized Neural Network (NN) is used. More particularly, an improved Elephant Herding Optimization (EHO) model termed Self Adaptive-EHO (SA-EHO) is used to train the NN model by selecting the optimal weights. Finally, the performance of the proposed work is compared with that of traditional systems with respect to positive measures (accuracy, sensitivity, specificity and precision), negative measures (False Positive Rate (FPR), False Negative Rate (FNR) and False Discovery Rate (FDR)), and other measures (Negative Predictive Value (NPV), F1-score and Matthews Correlation Coefficient (MCC)).

Keywords

Facial expressions; proposed LVP; neural network; SA-EHO based optimization

Nomenclature

FER

Facial Emotion Recognition

LVP

Local Vector Pattern

EHO

Elephant Herding Optimization

SA-EHO

Self Adaptive – EHO

NN

Neural Network

HCI

Human-Computer Interaction

MF-JFF

Mean Fitness Oriented JA+FF

CNN

Convolutional Neural Network

FPR

False Positive Rate

MCC

Matthews correlation coefficient

FNR

False Negative Rate

RAF-DB

Real-world Affective Face Database

NPV

Negative Predictive Value

FDR

False Discovery Rate

LBP

Local Binary Pattern

DLP

Deep Locality-Preserving

STRNN

Spatial-Temporal Recurrent NN

C-RNN

Convolutional RNN

FFNN

Feed Forward NN

WMDNN

Weighted Mixture Deep NN

WOA

Whale Optimization Algorithm

GWO

Grey Wolf Optimization

FF

Firefly

JA

Jaya Algorithm

DWT

Discrete Wavelet Transform

GLCM

Gray Level Co-occurrence Matrix

PCA

Principal Component Analysis

1. Introduction

“Facial expressions can be defined as the facial changes in response to a person’s internal emotional state, intentions, or social communication” [15]. FER plays a main role in identifying human emotional conditions such as pleasure, sorrow and annoyance from facial appearance and voice [23,35]. Human face detection is widely used in real-time applications such as security systems, the medical field, artificial intelligence, machine learning and research [3,34,43]. In recent years, automatic FER systems have been applied in areas such as biometrics, data-driven animation, clinical monitoring, crowd analytics, interactive games, human-robot communication, and HCI systems [13,15,20,30].

The three main steps of an automatic facial analysis system are face detection or alignment, feature extraction and classification [52]. Recent approaches have shown impressive performance in highly controlled environments; however, automatic FER remains an extremely challenging task in real-world scenarios [41]. Generally, conventional FER approaches follow two main methods: (i) geometric-based methods and (ii) appearance-based methods [2,5,27,42]. Geometric-based methods use deformations, curvatures, distance calculations and other geometric parameters that describe the face geometry [10,39]. Appearance-based methods measure the variation in local texture. Emotion analysis poses further difficulties owing to its low tangibility in practical applications [22]. To overcome these difficulties, researchers use various electronic devices to acquire emotional information from external signals [16,21,50].

Recent achievements in deep learning include CNNs for object recognition and detection, which remain complex tasks in FER [40,48]. A CNN can automatically learn multiple levels of data representation, including higher-level representations of facial emotions [1,1,33]. Deep learning has become feasible for two reasons: (1) most applications can now hold more data, and (2) GPU technology has recently improved [4,26,32]. The first factor is exploited in deep architectures for NN training and helps to avoid over-fitting [31]. This in turn supports the numerical computations that are essential for the training process [11,12]. However, deep learning cannot be readily deployed in the FER field due to the insufficiency of datasets.

The main contributions of the presented model are listed below:

Introduces a novel LVP-based feature extraction model, where the features are extracted in dense spiral form.

Establishes an optimized NN, where the weights are optimally tuned using a new improved EHO algorithm.

For optimization, a Self-Adaptive EHO (SA-EHO) is introduced, which aims at better recognition outputs.

Finally, the performance of the proposed SA-EHO method is analysed with respect to specificity, accuracy, precision, sensitivity, FNR, FPR, NPV, F1-score, FDR and MCC.

The remainder of the paper is arranged as follows: Section 2 reviews the literature on FER. Section 3 presents the development of the novel FER model. Section 4 describes the proposed LVP framework for feature extraction. Section 5 describes the weight-optimized NN for classification via the SA-EHO model. Section 6 presents the experimental results and discussion. Section 7 concludes the work.

2. Literature review

2.1. Related works

In 2019, Zhang et al. [51] proposed a novel FER method based on image edge detection and a CNN algorithm. The facial emotion features were extracted by convolution, edges were extracted by max pooling, and classification was carried out by a softmax classifier. The network weights were updated during training, and the CNN was optimized by back propagation. The proposed algorithm offered a higher recognition rate than the traditional R-CNN and FRR-CNN algorithms.

In 2019, Kim et al. [19] formulated a new FER technique based on hierarchical deep learning. The proposed algorithms are classified into a classical feature extraction method and a deep learning-based method. Using an autoencoder technique, facial images with neutral emotion were generated without requiring additional data. Moreover, the proposed method combined a CNN with LBP features, and facial expressions were extracted from the geometrical changes of the images. Finally, the combination of static appearance features and dynamic geometric features provided more accurate and efficient output.

In 2019, S. Li et al. [25] introduced a new DLP-CNN technique that improves the deep features by considering inter-class scatter and locality closeness. The work also presented a novel facial expression database, RAF-DB, which includes many facial images with different emotions, races and ages, each labeled by several annotators. The proposed method showed improved performance over other existing methods.

In 2019, T. Zhang et al. [53] proposed STRNN for face image-based human emotion recognition. The proposed STRNN includes a multidirectional RNN layer that identifies the co-occurring patterns in human emotions and collects long-range contextual cues by traversing the spatial regions of each time frame in various directions. The implemented approach achieved better accuracy than the existing traditional approaches.

In 2018, B. Yang et al. [49] suggested WMDNN for recognizing facial emotions. WMDNN includes two channels for facial images: LBP images and facial grayscale images. The outputs of both channels are combined in a weighted manner. A partial VGG16 network was used to extract facial characteristics from the grayscale images. The proposed network obtained higher performance than multi-channel deep networks.

In 2018, P. M. Ferreira et al. [13] formulated an end-to-end NN architecture with a dedicated loss function. Moreover, the learning procedure normalized the loss function, and the proposed network provided clear information regarding expression features. The proposed NN included three components: a facial-parts component, a representation component, and a classification component. The experimental outcome was promising when compared with other methods.

In 2018, Neha Jain et al. [18] introduced a hybrid C-RNN method that consists of a CNN and an RNN model. The CNN model is mainly used for feature extraction, with the regression layer removed. In the proposed method, two different signal sources were joined in a spatial-temporal dependency model, and the facial emotion in each image was identified. The experimental outputs showed higher efficiency and better performance than the traditional methods.

In 2018, Shui-Hua Wang et al. [47] proposed an FER model to overcome the problems of current FER systems. The stationary wavelet entropy method extracted the features, and a single-hidden-layer FFNN was used as the classifier. Moreover, JA was deployed to prevent the classifier training from being trapped at local optima. The proposed method achieved the highest accuracy among the compared traditional algorithms.

2.2. Review

Table 1 summarizes the reviewed FER systems. CNN was deployed in [51]; it offers a higher recognition rate and increased training speed, but the complexity of the network structure has to be overcome. DNN was exploited in [19]; it offers better performance and an increased accuracy rate, but it needs further consideration of the training data. DLP-CNN was introduced in [25]; it offers better performance and learns more discriminative features, but the quantity and diversity of the database are not considered. Likewise, STRNN was exploited in [53]; it offers high performance and high accuracy, but issues with the loss function have to be overcome. WMDNN was deployed in [49]; it offers better efficiency and improved recognition ability compared to other traditional methods, but the algorithm needs to be sped up and the fusion network improved. DNN was exploited in [13]; it offers improved efficiency, better performance and quite promising results, but training strategies for small datasets need to be considered. DNN was suggested in [18]; it offers better efficiency and high accuracy and reduces false detection, but it requires a large amount of data. Finally, JA was implemented in [47]; it offers increased accuracy, but its performance testing needs more attention.

Table 1
Reviews on conventional FER methods: features and challenges

Author [citation]	Adopted scheme	Features	Challenges
H. Zhang et al. [51]	CNN	♦ Higher recognition rate. ♦ Increased training speed.	♦ Have to overcome the complexity of network structure.
J. Kim et al. [19]	DNN	♦ Increases the accuracy rate. ♦ Better performance.	♦ Needs consideration on training data.
S. Li et al. [25]	DLP-CNN	♦ Learn more discriminative features. ♦ Better performance.	♦ Quantity and diversity of database is not considered.
T. Zhang et al. [53]	STRNN	♦ Higher accuracy. ♦ High performance.	♦ Have to overcome the issues of loss function.
B. Yang et al. [49]	WMDNN	♦ Improve the recognition ability. ♦ High accuracy.	♦ To speed up the algorithm and to improve the fusion network.
P. M. Ferreira et al. [13]	DNN	♦ Provides quite promising results. ♦ More effective and better performance.	♦ Have to consider training strategies on small datasets.
Neha Jain et al. [18]	DNN	♦ High accuracy and better performance. ♦ Reduces the false detection.	♦ Requires large amount of data.
Shui-Hua Wang et al. [47]	JA	♦ Higher accuracy.	♦ Have to concentrate on performance test.

3. Development of a novel face emotion recognition model

3.1. Proposed method

Facial expressions reflect human emotions and communicative intent. Various experiments have been performed on automatic FER systems owing to their functional significance in driver stress monitoring, medical diagnosis, sociable robots and other HCI systems. The proposed framework consists of two processes: (i) feature extraction and (ii) classification. The input image is first given to the feature extraction process, in which the Proposed-LVP based features are extracted in dense spiral form. These LVP-based features, along with the DWT and GLCM features, are then subjected to PCA, after which they enter the classification phase, for which an NN is exploited. Furthermore, to attain more precise classification, the weights of the NN are optimally tuned using an improved EHO model termed the SA-EHO algorithm. EHO is a swarm-based meta-heuristic search approach inspired by the herding behavior of elephant groups. Figure 1 illustrates the architecture of the proposed method.

Fig. 1.

Architectural representation of the proposed method.

4. Feature extraction: Proposed LVP framework

LVP [17] is a local pattern descriptor in the high-order derivative space for recognizing facial emotions. The vector representation of every pixel is created by calculating the values between the reference pixel and its neighboring pixels for a range of directions at varying distances. For the reference pixel, the vector representation captures the one-dimensional structure of the micro-patterns. LVP thus decreases the feature length by means of a comparative space transform that encodes the different spatial relationships between the reference pixel and its neighboring pixels. The concatenated LVPs are then compressed to create distinctive features. To capture the detailed information of a specific sub-region, the nth-order LVP is computed from the local derivatives in different directions in the $(n - 1)$th-order derivative space.

Hence, the presented work introduces a novel LVP framework, where the features are extracted in dense spiral form. The word dense indicates that the neighbors are averaged. Usually, a center pixel in an image has 8 neighboring pixels. Here, let us consider the proposed LVP framework with $x_{1}$ as the center pixel, $x_{0}$ as one of the neighboring pixels, and $\alpha$ as the index angle of a given direction.

For the center pixel $x_{1}$, the pixel value at $\alpha = 45^{\circ}$ is computed by Eq. (1):

$$v_{\alpha}(x_{1}) = \frac{1}{s} \sum_{k = 1}^{s} (x_{k} - x_{1}) \tag{1}$$

Here, $x_{k}$ indicates the kth spiral neighbor. Its pixel intensity is expressed in Eq. (2), where $m_{k}$ and $n_{k}$ indicate the coordinates of the spiral neighbor:

$$x_{k} = I(m_{k}, n_{k}) \tag{2}$$

Accordingly, the coordinates $m_{k}$ and $n_{k}$ of the spiral neighbors are evaluated as per Eqs. (3) and (4). When $\alpha > 0$, it is considered as the spiral angle. In the equations below, $k = 1, \dots, s$, where $s$ refers to the total number of spiral neighbors.

$$m_{k} = k \cos((k - 1) \alpha) \tag{3}$$

$$n_{k} = k \sin((k - 1) \alpha) \tag{4}$$

$$s \leq \frac{2 \pi}{\alpha} + 1 \tag{5}$$

Equation (5) limits the number of neighbors to those within 360° (i.e., one turn of the spiral).

If $s > \frac{2 \pi}{\alpha} + 1$, the additional points fall beyond the first turn of the spiral; they are not treated as spiral neighbors, since they would add complexity without relevant information. Similarly, the same formulas are applied for $\alpha + 45^{\circ} = 90^{\circ}$ to obtain the required pixel values.
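Equations (1) to (5) can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names `spiral_offsets` and `v_alpha`, the rounding of $(m_{k}, n_{k})$ to integer pixel offsets relative to the center, the mapping of $m_{k}$ to rows and $n_{k}$ to columns, and the clipping at image borders are all assumptions that the text does not specify.

```python
import numpy as np

def spiral_offsets(alpha_deg, s=None):
    """Integer offsets (m_k, n_k) of the spiral neighbors, Eqs. (3)-(5)."""
    # Eq. (5): at most 2*pi/alpha + 1 neighbors, i.e. one turn of the spiral.
    s_max = int(np.floor(360.0 / alpha_deg)) + 1
    s = s_max if s is None else min(s, s_max)
    k = np.arange(1, s + 1)
    angle = np.deg2rad((k - 1) * alpha_deg)
    m = k * np.cos(angle)  # Eq. (3)
    n = k * np.sin(angle)  # Eq. (4)
    # Assumption: round to the nearest pixel so the offsets can index an image.
    return np.rint(m).astype(int), np.rint(n).astype(int)

def v_alpha(image, row, col, alpha_deg, s=None):
    """Mean difference between spiral neighbors and the center pixel, Eq. (1)."""
    m, n = spiral_offsets(alpha_deg, s)
    # Assumption: neighbors that fall outside the image are clipped to the border.
    rows = np.clip(row + m, 0, image.shape[0] - 1)
    cols = np.clip(col + n, 0, image.shape[1] - 1)
    neighbors = image[rows, cols].astype(float)  # Eq. (2)
    return float(np.mean(neighbors - float(image[row, col])))

# Usage on a synthetic 21 x 21 image (hypothetical data).
image = np.arange(441, dtype=float).reshape(21, 21)
print(v_alpha(image, 10, 10, 45))
```

For $\alpha = 45^{\circ}$, Eq. (5) gives at most nine spiral neighbors, so the sketch averages nine differences per center pixel.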

Consequently, $u_{2} (\cdot, \cdot)$ can be formally defined as in Eq. (6):

$$u_{2}\left(v_{\alpha}(x_{1}), v_{\alpha + 45^{\circ}}(x_{1}), v_{\alpha}(x_{0}), v_{\alpha + 45^{\circ}}(x_{0})\right) = \begin{cases} 1, & \text{if } v_{\alpha + 45^{\circ}}(x_{1}) - \left(\frac{v_{\alpha + 45^{\circ}}(x_{0})}{v_{\alpha}(x_{0})} \times v_{\alpha}(x_{1})\right) \geq 0 \\ 0, & \text{else} \end{cases} \tag{6}$$

From Eq. (6), the values of the eight neighbor pixels are determined in binary form (either 1 or 0). These binary values are then converted to decimal form, and the result replaces the center pixel. The process is repeated with every pixel of the image taken in turn as the center pixel, which generates the LVP features.
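The comparison in Eq. (6) and the binary-to-decimal step can be illustrated with a short Python sketch. The names `u2` and `lvp_code` are hypothetical, and two details are assumptions because the text does not fix them: the guard against a zero denominator $v_{\alpha}(x_{0})$, and the bit order (first neighbor as the most significant bit).

```python
def u2(v_a_center, v_b_center, v_a_neighbor, v_b_neighbor, eps=1e-12):
    """Binary comparison of Eq. (6).

    v_a_* are the values at angle alpha, v_b_* those at alpha + 45 degrees,
    for the center pixel x1 and a neighbor pixel x0.
    """
    # Assumption: treat the ratio as 0 when v_alpha(x0) is (numerically) zero.
    if abs(v_a_neighbor) > eps:
        ratio = v_b_neighbor / v_a_neighbor
    else:
        ratio = 0.0
    return 1 if v_b_center - ratio * v_a_center >= 0 else 0

def lvp_code(bits):
    """Convert the eight binary values to the decimal code of the center pixel."""
    code = 0
    for bit in bits:  # assumption: first bit is the most significant
        code = (code << 1) | int(bit)
    return code

# Usage with hypothetical values for one neighbor and one bit pattern.
print(u2(1.0, 2.0, 1.0, 1.0))
print(lvp_code([1, 0, 0, 0, 0, 0, 0, 1]))
```

With eight neighbors, the resulting code lies between 0 and 255, so the LVP map can be stored as an 8-bit image.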

In addition, the DWT and GLCM [7] features of the image are also extracted. The final feature set extracted via the proposed LVP along with DWT and GLCM is denoted by $Fe$, whose dimension is then reduced using PCA [8].
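The PCA step can be sketched as below. This is a generic PCA via singular value decomposition, given only as an illustration; the function name `pca_reduce`, the random stand-in for the feature matrix $Fe$, and the number of retained components are assumptions, since the paper does not report them.

```python
import numpy as np

def pca_reduce(features, n_components):
    """Project row-wise feature vectors onto the leading principal components."""
    X = np.asarray(features, dtype=float)
    X_centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ vt[:n_components].T

# Hypothetical feature matrix: 20 images, 6 concatenated LVP/DWT/GLCM features.
rng = np.random.default_rng(0)
Fe = rng.normal(size=(20, 6))
Fe_reduced = pca_reduce(Fe, 3)
print(Fe_reduced.shape)
```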

5. Weight optimized neural network for classification: Introduction to SA-EHO model

5.1. Optimized neural network

The NN [31] takes the features $Fe$ as input, as represented in Eq. (7), where $nu$ indicates the total number of features.

$$Fe = \{Fe_{1}, Fe_{2}, \dots, Fe_{nu}\} \tag{7}$$

The model contains input, hidden, and output layers. The output of the hidden layer $e^{(H)}$ is given in Eq. (8), where $mf$ denotes the activation function, $ui$ and $j$ index the neurons in the hidden layer and the input layer respectively, $w_{(D ui)}^{(L)}$ indicates the bias weight of the $ui$th hidden neuron, $n_{i}$ symbolizes the input neuron count, $In$ denotes the input features, and $w_{(j ui)}^{(L)}$ represents the weight from the $j$th input neuron to the $ui$th hidden neuron. The output of the network $\hat{R}_{\hat{o}}$ is given in Eq. (9), where $\hat{o}$ indexes the output neurons, $n_{h}$ indicates the number of hidden neurons, $w_{(D \hat{o})}^{(R)}$ refers to the output bias weight of the $\hat{o}$th output neuron, and $w_{(ui \hat{o})}^{(R)}$ indicates the weight from the $ui$th hidden neuron to the $\hat{o}$th output neuron. The error between the predicted and actual values, which should be minimized, is determined as per Eq. (10), where $n_{G}$ indicates the output neuron count, and $R_{\hat{o}}$ and $\hat{R}_{\hat{o}}$ denote the actual and predicted outputs, respectively. Figure 2 depicts the neural network model.

$$e^{(H)} = mf\left(w_{(D ui)}^{(L)} + \sum_{j = 1}^{n_{i}} w_{(j ui)}^{(L)} \, In\right) \tag{8}$$

$$\hat{R}_{\hat{o}} = mf\left(w_{(D \hat{o})}^{(R)} + \sum_{ui = 1}^{n_{h}} w_{(ui \hat{o})}^{(R)} \, e^{(H)}\right) \tag{9}$$

$$Er^{*} = \arg\min_{\{w_{(D ui)}^{(L)}, w_{(j ui)}^{(L)}, w_{(D \hat{o})}^{(R)}, w_{(ui \hat{o})}^{(R)}\}} \sum_{\hat{o} = 1}^{n_{G}} \left| R_{\hat{o}} - \hat{R}_{\hat{o}} \right| \tag{10}$$

Fig. 2.

Neural network model.

In the presented work, the NN model is trained by optimizing the weights $w = \{w_{(D ui)}^{(L)}, w_{(j ui)}^{(L)}, w_{(D \hat{o})}^{(R)}, w_{(ui \hat{o})}^{(R)}\}$ using a new algorithm termed SA-EHO.
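To make the optimization target concrete, the sketch below evaluates Eqs. (8) to (10) for a flat weight vector, which is the form a population-based optimizer such as SA-EHO would manipulate. It is an illustration under assumptions: a sigmoid is used for $mf$ (the paper does not name the activation), the layout of the flat vector is arbitrary, and the error of Eq. (10) is additionally summed over all samples of a batch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def n_weights(n_i, n_h, n_o):
    """Length of the flat vector: both weight matrices plus both bias vectors."""
    return n_h * (n_i + 1) + n_o * (n_h + 1)

def forward(w, X, n_i, n_h, n_o):
    """Network output for inputs X (one row per sample), Eqs. (8) and (9)."""
    w = np.asarray(w, dtype=float)
    a = n_i * n_h
    b = a + n_h
    c = b + n_h * n_o
    W_L = w[:a].reshape(n_i, n_h)   # input-to-hidden weights
    b_L = w[a:b]                    # hidden bias weights
    W_R = w[b:c].reshape(n_h, n_o)  # hidden-to-output weights
    b_R = w[c:c + n_o]              # output bias weights
    hidden = sigmoid(b_L + X @ W_L)     # Eq. (8)
    return sigmoid(b_R + hidden @ W_R)  # Eq. (9)

def error(w, X, R, n_i, n_h, n_o):
    """Sum of absolute differences between actual and predicted outputs, Eq. (10)."""
    return float(np.sum(np.abs(R - forward(w, X, n_i, n_h, n_o))))

# Usage with hypothetical sizes: 4 inputs, 3 hidden neurons, 2 outputs.
X = np.zeros((5, 4))
R = np.ones((5, 2))
w0 = np.zeros(n_weights(4, 3, 2))
print(error(w0, X, R, 4, 3, 2))
```

An optimizer only needs `error` as its fitness function; each candidate solution is one vector of length `n_weights(n_i, n_h, n_o)`.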

5.2. Proposed SA-EHO model

EHO [11,24] helps in finding optimal or near-optimal function values. Even though the existing EHO provides good performance, it has a few drawbacks; in particular, it does not utilize the available information to guide the current and future searches. Therefore, to overcome these drawbacks, some improvements are made in the proposed SA-EHO algorithm. Self-improvement is proven to be promising in traditional optimization algorithms [14,36–38,44]. Elephants naturally live in social groups that consist of a number of clans, and each clan lives under the leadership of a female matriarch. Moreover, male elephants leave their clans when growing up and live apart from these groups [24]. To imitate the characteristics of elephants, the EHO algorithm is summarised into three major rules.

The population consists of a number of clans, and every clan has a fixed number of female and male elephants.

Some male elephants live apart from their clans.

The leader of each clan is a female matriarch.

The elephant population is generated randomly, divided into a certain number of clans, and then sorted according to fitness. Every clan is updated separately. Clan updating operator: the matriarch of clan $c$ influences the next position of each elephant in that clan.

The position of elephant $j$ in clan $c$ is updated as follows:

$$S_{n, c, j} = S_{c, j} + \alpha \cdot r \cdot (S_{best, c} - S_{c, j}) \tag{11}$$

where $S_{n, c, j}$ is the newly updated position and $S_{c, j}$ is the earlier position of elephant $j$ in clan $c$, $\alpha \in [0, 1]$ is the scaling factor that indicates the effect of the matriarch of clan $c$ on $S_{c, j}$, $S_{best, c}$ denotes the matriarch of clan $c$, i.e. the fittest elephant individual of that clan, and $r \in [0, 1]$ is a random number.

The procedure of the proposed SA-EHO is as follows. In the traditional method, the fittest elephant in every clan cannot be updated by Eq. (11), because the difference term vanishes when $S_{c, j} = S_{best, c}$. Therefore, the proposed method introduces a new update rule, given in Eq. (12). Conventionally, this update is based on the clan center; in the implemented method, it is based on the best solution in the current population.

$$S_{n, c, j} = \frac{S_{n, c, j}^{*} + B_{p}}{2} \tag{12}$$

Here, $S_{n, c, j}^{*}$ represents the new position in the current solution and $B_{p}$ refers to the best solution of the population.
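One possible reading of the clan update in Eqs. (11) and (12) is sketched below. It is not the authors' code: the function name `sa_eho_clan_update` is hypothetical, fitness is assumed to be minimized (as with the error of Eq. (10)), one random number $r$ is drawn per elephant, and Eq. (12) is applied only to the fittest elephant of the clan. The text would also admit applying Eq. (12) to every elephant.

```python
import numpy as np

def sa_eho_clan_update(clan, fitness, best_overall, alpha=0.5, rng=None):
    """Update one clan (rows are elephant positions) with Eqs. (11) and (12)."""
    rng = np.random.default_rng() if rng is None else rng
    clan = np.asarray(clan, dtype=float)
    best_idx = int(np.argmin(fitness))  # assumption: lower fitness is better
    matriarch = clan[best_idx]
    r = rng.random((clan.shape[0], 1))  # assumption: one r per elephant
    # Eq. (11): move every elephant towards the matriarch of its clan.
    updated = clan + alpha * r * (matriarch - clan)
    # Eq. (12): average the matriarch's new position with the best solution B_p.
    updated[best_idx] = (updated[best_idx] + np.asarray(best_overall, dtype=float)) / 2.0
    return updated

# Usage with a hypothetical clan of three elephants in two dimensions.
clan = np.array([[0.0, 0.0], [2.0, 2.0], [4.0, 4.0]])
fitness = np.array([3.0, 1.0, 2.0])
new_clan = sa_eho_clan_update(clan, fitness, best_overall=[1.0, 1.0],
                              rng=np.random.default_rng(1))
print(new_clan)
```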

Separating operator: As mentioned earlier, male elephants leave their clans and live alone as they reach maturity. In each generation, this separation is modeled by an operator that replaces the elephant with the worst fitness in each clan, as given in Eq. (13):

$$S_{worst, c} = S_{min} + r \cdot (S_{max} - S_{min} + 1) \tag{13}$$

where $S_{min}$ and $S_{max}$ are the minimum and maximum bounds of an individual elephant's position, respectively, $S_{worst, c}$ is the worst elephant individual of clan $c$, and $r \in [0, 1]$ is a random number.

The pseudo-code of the SA-EHO model is given in Algorithm 1 and the flow chart is illustrated in Fig. 3.
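The separating operator of Eq. (13) can be sketched as follows. The function name `separate_worst` is hypothetical, fitness is again assumed to be minimized, and one random number is drawn per dimension. Because the term $S_{max} - S_{min} + 1$ can push the new position beyond $S_{max}$, the sketch clips the result to the bounds; this clipping is an assumption and is not stated in the text.

```python
import numpy as np

def separate_worst(clan, fitness, s_min, s_max, rng=None):
    """Replace the worst elephant of a clan by a random position, Eq. (13)."""
    rng = np.random.default_rng() if rng is None else rng
    clan = np.array(clan, dtype=float)   # work on a copy
    worst_idx = int(np.argmax(fitness))  # assumption: higher fitness is worse
    r = rng.random(clan.shape[1])
    new_position = s_min + r * (s_max - s_min + 1)  # Eq. (13)
    # Assumption: keep the new position inside [s_min, s_max].
    clan[worst_idx] = np.clip(new_position, s_min, s_max)
    return clan

# Usage with a hypothetical clan and bounds of [-1, 1].
clan = np.array([[0.1, 0.2], [0.9, -0.9], [0.3, 0.4]])
fitness = np.array([1.0, 5.0, 2.0])
separated = separate_worst(clan, fitness, -1.0, 1.0, rng=np.random.default_rng(2))
print(separated)
```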

Algorithm 1

Proposed SA-EHO method

Fig. 3.

Flowchart of proposed SA-EHO algorithm.

6. Results and discussion

6.1. Simulation procedure

The proposed SA-EHO algorithm-based FER model was implemented in MATLAB, and the resultant outcomes of each analysis were observed. The proposed SA-EHO based FER model was evaluated by comparing it with traditional models such as GWO-NN [29], WOA-NN [28], JA-NN [45], FF-NN [46] and MF-JFF-NN [36] with respect to accuracy, sensitivity, specificity, precision, FPR, FNR, NPV, FDR, F1-score and MCC. The performance was evaluated for hidden neuron counts ranging from 10 to 30.
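For reference, the ten measures listed above follow from the entries of a confusion matrix. The sketch below uses the standard textbook definitions in Python (not the MATLAB code of the study) and assumes a binary, one-vs-rest view of the classification; the function name `binary_metrics` and the example counts are hypothetical and are not reported in the paper.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Standard measures computed from confusion-matrix counts."""
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "FPR": fp / (fp + tn),
        "FNR": fn / (fn + tp),
        "NPV": tn / (tn + fn),
        "FDR": fp / (fp + tp),
        "F1-score": 2 * tp / (2 * tp + fp + fn),
        "MCC": (tp * tn - fp * fn) / mcc_den,
    }

# Hypothetical counts chosen only for illustration.
scores = binary_metrics(tp=47, fp=2, tn=418, fn=23)
for name, value in scores.items():
    print(f"{name}: {value:.5f}")
```

Note that FPR, FNR and FDR are the complements of specificity, sensitivity and precision, respectively, so the positive and negative measures carry the same information.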

6.2. Performance analysis

The performance of the FER method using SA-EHO relative to the conventional models, with respect to accuracy, sensitivity, specificity, precision, FPR, FNR, NPV, FDR, F1-score and MCC, is shown in Fig. 4, Fig. 5 and Fig. 6. This section compares the proposed SA-EHO method with the conventional methods GWO-NN, WOA-NN, JA-NN, FF-NN and MF-JFF-NN. Figure 4 presents the analysis of the positive measures. In Fig. 4(a), the accuracy of the proposed SA-EHO method with 30 hidden neurons is 8.76%, 3.13%, 5.63%, 6.88% and 1.87% higher than that of GWO-NN, WOA-NN, JA-NN, FF-NN and MF-JFF-NN, respectively. In Fig. 4(b), the sensitivity of the proposed SA-EHO algorithm with 30 hidden neurons is 4.51%, 9.67%, 10.96% and 18.70% better than the traditional models like WOA-NN, JA-NN, FF-NN, MF-JFF-NN and EHO, correspondingly. Similarly, the specificity of the proposed SA-EHO method with 30 hidden neurons, shown in Fig. 4(c), is 9.73%, 2.63%, 4.66% and 6.18% superior to GWO-NN, WOA-NN, JA-NN and FF-NN, respectively. In Fig. 4(d), with 25 hidden neurons, the precision of the proposed SA-EHO model is 21.87%, 25%, 37.5%, 19.79% and 13.5% higher than that of GWO-NN, WOA-NN, JA-NN, FF-NN and MF-JFF-NN, respectively.

Fig. 4.

Performance analysis of the proposed SA-EHO scheme over existing models for positive measures: (a) accuracy, (b) sensitivity, (c) specificity, (d) precision.

The analysis of the negative measures is shown in Fig. 5. For better performance, the FPR in Fig. 5(a) should be minimized. The presented SA-EHO method attains a value of 1 when the number of hidden neurons is 25, whereas the traditional methods GWO-NN, WOA-NN, JA-NN, FF-NN and MF-JFF-NN attain values of 4, 4.8, 8, 3 and 2, correspondingly. Moreover, in Fig. 5(b), the FNR of the proposed SA-EHO method with 30 hidden neurons attains a minimal value of 23, whereas the existing WOA-NN, JA-NN, FF-NN and MF-JFF-NN attain FNR values of 26, 30, 32 and 37. In Fig. 5(c), the FDR of the proposed SA-EHO method is 4 for 25 hidden neurons, whereas the traditional methods GWO-NN, WOA-NN, JA-NN, FF-NN and MF-JFF-NN attain values of 25, 28, 40, 23 and 15, correspondingly.

The analysis of the other measures is shown in Fig. 6. In terms of NPV with 25 hidden neurons (Fig. 6(a)), the proposed SA-EHO method obtains an improvement of 3.5%, 4%, 7.53%, 2.51% and 1.5% over the existing models GWO-NN, WOA-NN, JA-NN, FF-NN and MF-JFF-NN, respectively. In Fig. 6(b), the F1-score of the proposed SA-EHO method shows an enhancement of 23.8%, 11.9%, 19.04%, 22.61% and 10.7% over GWO-NN, WOA-NN, JA-NN, FF-NN and MF-JFF-NN, respectively, when the count of hidden neurons is 30. In addition, in Fig. 6(c), the MCC of the SA-EHO model with 30 hidden neurons is 30.48%, 14.63%, 24.39%, 29.26% and 10.97% better than that of GWO-NN, WOA-NN, JA-NN, FF-NN and MF-JFF-NN, correspondingly. From the attained outputs, the proposed SA-EHO method performs better than the other traditional methods on these performance measures.

Fig. 5.

Performance analysis of the proposed SA-EHO scheme over existing models for negative measures: (a) FPR, (b) FNR, (c) FDR.

Fig. 6.

Performance analysis of proposed SA-EHO scheme over existing models for other measures such as (a) NPV, (b) F1-score, (c) MCC.

6.3. Analysis of overall performance

Table 2 presents the overall performance of the proposed SA-EHO method and the existing methods for the different performance measures. Compared with the traditional methods, the proposed SA-EHO method shows higher values for most positive measures and lower values for most negative measures, as desired for an optimal model. In terms of accuracy, the proposed SA-EHO method is 3%, 3.8%, 6.23%, 3% and 2.15% better than the traditional models GWO-NN, WOA-NN, JA-NN, FF-NN and MF-JFF-NN, correspondingly. The sensitivity of the SA-EHO model (0.67143) equals that of WOA-NN and exceeds that of FF-NN and MF-JFF-NN (0.62857), although Table 2 shows it to be 2.12% and 8.5% below GWO-NN and JA-NN, correspondingly. For the specificity measure, the proposed SA-EHO is 3.58%, 4.3%, 7.89%, 2.63% and 1.67% superior to the GWO-NN, WOA-NN, JA-NN, FF-NN and MF-JFF-NN models, respectively. The precision of the proposed SA-EHO method is 23%, 26.86%, 38.17%, 19.52% and 13.4% superior to that of GWO-NN, WOA-NN, JA-NN, FF-NN and MF-JFF-NN, respectively. Moreover, the FPR of the proposed SA-EHO model is 0.0047619, which is better than the values of 0.040476, 0.047619, 0.083333, 0.030952 and 0.021429 for the GWO-NN, WOA-NN, JA-NN, FF-NN and MF-JFF-NN models, respectively. For FNR, the proposed SA-EHO model (0.32857) equals WOA-NN and is 13% better than FF-NN and MF-JFF-NN (0.37143), whereas GWO-NN and JA-NN attain FNR values that are 4.34% and 17.39% better than that of SA-EHO. Further, the proposed SA-EHO method outperforms the conventional methods GWO-NN, WOA-NN, JA-NN, FF-NN and MF-JFF-NN by 3.58%, 4.3%, 7.89%, 2.63% and 1.67% for NPV. In addition, the FDR of the existing methods is 0.26154, 0.29851, 0.40698, 0.22807 and 0.16981 for GWO-NN, WOA-NN, JA-NN, FF-NN and MF-JFF-NN, whereas the proposed SA-EHO attains the lowest value of 0.040816.
In addition, the proposed SA-EHO technique is 9.97%, 13.13%, 17.22%, 12.28% and 9.42% superior to GWO-NN, WOA-NN, JA-NN, FF-NN and MF-JFF-NN in F1-score. Finally, the MCC of the proposed SA-EHO method is 14.4% superior to GWO-NN, 18.29% better than WOA-NN, 23.67% better than JA-NN, 16.12% superior to FF-NN and 12% better than MF-JFF-NN. Thus, the overall performance of the proposed SA-EHO method was found to be better than that of the traditional models.
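The percentages quoted for accuracy appear consistent with the difference between the SA-EHO value and a baseline value in Table 2, divided by the SA-EHO value. This normalisation is inferred from the reported numbers rather than stated in the paper, so the Python sketch below (with the hypothetical helper `relative_gain`) should be read as a consistency check under that assumption.

```python
def relative_gain(proposed, baseline):
    """Percentage gain of the proposed value, normalised by the proposed value."""
    return 100.0 * (proposed - baseline) / proposed

# Accuracy row of Table 2.
accuracy = {
    "GWO-NN": 0.92041,
    "WOA-NN": 0.91224,
    "JA-NN": 0.8898,
    "FF-NN": 0.92041,
    "MF-JFF-NN": 0.92857,
}
sa_eho_accuracy = 0.94898

gains = {name: relative_gain(sa_eho_accuracy, value)
         for name, value in accuracy.items()}
for name, gain in gains.items():
    print(f"{name}: {gain:.2f}%")
```

The same helper applied to the sensitivity or FNR rows yields negative gains against some baselines, which is why those rows are discussed separately.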

Table 2
Overall performance analysis in proposed SA-EHO method with the existing models

Metrics	GWO-NN [29]	WOA-NN [28]	JA-NN [45]	FF-NN [46]	MF-JFF-NN [36]	SA-EHO
Accuracy	0.92041	0.91224	0.8898	0.92041	0.92857	0.94898
Sensitivity	0.68571	0.67143	0.72857	0.62857	0.62857	0.67143
Specificity	0.95952	0.95238	0.91667	0.96905	0.97857	0.99524
Precision	0.73846	0.70149	0.59302	0.77193	0.83019	0.95918
FPR	0.040476	0.047619	0.083333	0.030952	0.021429	0.004762
FNR	0.31429	0.32857	0.27143	0.37143	0.37143	0.32857
NPV	0.95952	0.95238	0.91667	0.96905	0.97857	0.99524
FDR	0.26154	0.29851	0.40698	0.22807	0.16981	0.040816
F1-score	0.71111	0.68613	0.65385	0.69291	0.71545	0.78992
MCC	0.66564	0.63536	0.59355	0.65225	0.68405	0.77762
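
The percentage gains quoted for Table 2 can be reproduced, to within rounding, by dividing the gap between SA-EHO and a baseline by the SA-EHO value. This convention is inferred from the reported figures rather than stated in the text, so the following Python sketch should be read as an assumption; the names TABLE2, relative_gain and gains are illustrative only.

```python
# Accuracy and MCC rows copied from Table 2.
TABLE2 = {
    "accuracy": {
        "GWO-NN": 0.92041, "WOA-NN": 0.91224, "JA-NN": 0.8898,
        "FF-NN": 0.92041, "MF-JFF-NN": 0.92857, "SA-EHO": 0.94898,
    },
    "mcc": {
        "GWO-NN": 0.66564, "WOA-NN": 0.63536, "JA-NN": 0.59355,
        "FF-NN": 0.65225, "MF-JFF-NN": 0.68405, "SA-EHO": 0.77762,
    },
}


def relative_gain(proposed, baseline):
    """Gain in percent, measured relative to the proposed value.

    Assumed convention: 100 * (proposed - baseline) / proposed.
    """
    return 100.0 * (proposed - baseline) / proposed


def gains(metric):
    """Return the gain of SA-EHO over every baseline for one Table 2 row."""
    row = TABLE2[metric]
    proposed = row["SA-EHO"]
    return {
        name: round(relative_gain(proposed, value), 2)
        for name, value in row.items()
        if name != "SA-EHO"
    }


if __name__ == "__main__":
    for metric in TABLE2:
        print(metric, gains(metric))
```

The same function applies to any "higher is better" row of the table; for rows where lower is better, the sign of the result has to be interpreted accordingly.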

6.4. Analysis on classifier

To showcase the effectiveness of the proposed SA-EHO algorithm, an analysis is carried out against deep learning methods such as DNN [19], CNN [51], and STRNN [53]. The results attained for the existing and proposed models are given in Table 3. The accuracy of the proposed SA-EHO model is 10.97%, 9.46%, and 4.51% superior to that of DNN, CNN, and STRNN, respectively. In terms of sensitivity, the adopted method is 97.8%, 95.7%, and 42.5% higher than DNN, CNN, and STRNN, respectively. In terms of precision, the proposed SA-EHO model is 86.96%, 30.4%, and 6.16% superior to DNN, CNN, and STRNN, respectively. Thus, the results indicate that the proposed SA-EHO method performs considerably better than the state-of-the-art models.

Table 3
Performance analysis of proposed model with existing classifiers

Metrics	NN [34]	CNN [43]	RNN [33]	SA-EHO
	DNN	CNN	STRNN	
Accuracy	0.8449	0.85918	0.90612	0.94898
Sensitivity	0.014286	0.028571	0.38571	0.67143
Specificity	0.98333	0.99762	0.99286	0.99524
Precision	0.125	0.66667	0.9	0.95918
FPR	0.016667	0.002381	0.0071429	0.0047619
FNR	0.98571	0.97143	0.61429	0.32857
NPV	0.98333	0.99762	0.99286	0.99524
FDR	0.875	0.33333	0.1	0.040816
F1-score	0.025641	0.054795	0.54	0.78992
MCC	−0.0065744	0.11749	0.55256	0.77762
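
The measures in Tables 2 and 3 follow from the standard confusion-matrix definitions. The Python sketch below is a minimal illustration of those definitions. The counts (TP = 47, FN = 23, FP = 2, TN = 418) are not given in the paper; they are hypothetical values chosen because they reproduce the SA-EHO column to the reported precision. The function name confusion_metrics is likewise an assumption.

```python
import math


def confusion_metrics(tp, fn, fp, tn):
    """Standard binary-classification measures from confusion-matrix counts."""
    total = tp + fn + fp + tn
    # Denominator of Matthews correlation coefficient.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),    # true positive rate
        "specificity": tn / (tn + fp),    # true negative rate
        "precision": tp / (tp + fp),
        "fpr": fp / (fp + tn),            # equals 1 - specificity
        "fnr": fn / (fn + tp),            # equals 1 - sensitivity
        "npv": tn / (tn + fn),
        "fdr": fp / (fp + tp),            # equals 1 - precision
        "f1": 2 * tp / (2 * tp + fp + fn),
        "mcc": (tp * tn - fp * fn) / mcc_den,
    }


if __name__ == "__main__":
    # Hypothetical counts (assumption, not taken from the paper).
    metrics = confusion_metrics(tp=47, fn=23, fp=2, tn=418)
    for name, value in metrics.items():
        print(f"{name}: {value:.5f}")
```

Under these assumed counts the standard NPV, TN/(TN + FN), evaluates to about 0.948, whereas the tabulated NPV coincides with the specificity; the definition used for that row is not stated.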

7. Conclusion

The proposed SA-EHO framework consists of two processes, namely feature extraction and classification. Initially, the proposed LVP-based features, along with DWT and GLCM features, were extracted in the feature extraction phase, and PCA was deployed to reduce the dimension of the features. The reduced features were then subjected to a classification process, for which an optimized NN was used. Further, a new improved EHO model, termed SA-EHO, was exploited for training the NN model by selecting the optimal weights. The proposed SA-EHO provided better overall performance than the traditional models GWO-NN, WOA-NN, JA-NN, FF-NN and MF-JFF-NN. In terms of accuracy, the proposed SA-EHO method is 3%, 3.8%, 6.23%, 3% and 2.15% better than GWO-NN, WOA-NN, JA-NN, FF-NN and MF-JFF-NN, respectively. Finally, the MCC of the proposed SA-EHO method was 14.4% superior to GWO-NN, 18.29% better than WOA-NN, 23.67% better than JA-NN, 16.12% superior to FF-NN and 12% better than MF-JFF-NN. Thus, the enhancement offered by the proposed SA-EHO method was validated.

References

1. V.H. Arul, V.G. Sivakumar, Marimuthu and Chakraborty, An approach for speech enhancement using deep convolutional neural network, Multimedia Research 2(1) (2019), 37–44.

2. M.M. Assed, T.C. Khafif, G.O. Belizario, Fatorelli, C.C. de Almeida Rocca and de Pádua Serafim, Facial emotion recognition in maltreated children: A systematic review, Journal of Child and Family Studies 29(5) (2020), 1493–1509. doi:10.1007/s10826-019-01636-w.

3. Bollero, Rocco, Gianfreda, Gualtieri, Miranda and Barlattani, Epidemiology, etiopathogenesis, treatment and prognosis of oral thermal burns from food and drinks, Dental Hypotheses 10(3) (2019), 80–81. doi:10.4103/denthyp.denthyp_56_19.

4. K.A. Bonfils, Ventura, K.L. Subotnik and K.H. Nuechterlein, Affective prosody and facial emotion recognition in first-episode schizophrenia: Associations with functioning & symptoms, Schizophrenia Research: Cognition 18 (2019), 100153.

5. Cooper, C.W. Hobson and S.H. van Goozen, Facial emotion recognition in children with externalising behaviours: A systematic review, Clinical Child Psychology and Psychiatry 25(4) (2020), 1068–1085. doi:10.1177/1359104520945390.

6. Cristin, V.C. Raj and Marimuthu, Face image forgery detection by weight optimized neural network model, Multimedia Research 2(2) (2019), 19–27.

7. Devi, An innovative facial emotion recognition model enabled by optimal feature selection using firefly plus Jaya algorithm, In communication.

8. Devi and M.M.S.J. Preetha, Automatic face emotion recognition with the aid of probability-based bird swarm-trained neural network, International Journal of Swarm Intelligence Research (IJSIR) 12(4) (2021), 1–24. doi:10.4018/IJSIR.2021100101.

9. Di Mauro and Longo, Skype traffic detection: A decision theory based tool, in: 2014 International Carnahan Conference on Security Technology (ICCST), IEEE, 2014, pp. 1–6.

10. Ding, Zhao, Li and Yuan, Facial expression recognition from image sequence based on LBP and Taylor expansion, IEEE Access 5 (2017), 19409–19419. doi:10.1109/ACCESS.2017.2737821.

11. M.A. Elhosseini, R.A. El Sehiemy, Y.I. Rashwan and X.Z. Gao, On the performance improvement of elephant herding optimization algorithm, Knowledge-Based Systems 166(15) (2019), 58–70. doi:10.1016/j.knosys.2018.12.012.

12. Fan and Hung, A novel local pattern descriptor – local vector pattern in high-order derivative space for face recognition, IEEE Transactions on Image Processing 23(7) (2014), 2877–2891. doi:10.1109/TIP.2014.2321495.

13. P.M. Ferreira, Marques, J.S. Cardoso and Rebelo, Physiological inspired deep neural networks for emotion recognition, IEEE Access 6 (2018), 53930–53943. doi:10.1109/ACCESS.2018.2870063.

14. George and B.R. Rajakumar, APOGA: An adaptive population pool size based genetic algorithm, in: AASRI Procedia – 2013 AASRI Conference on Intelligent Systems and Control (ISC 2013), Vol. 4, 2013, pp. 288–296.

15. Guo et al., Dominant and complementary emotion recognition from still images of faces, IEEE Access 6 (2018), 26391–26403. doi:10.1109/ACCESS.2018.2831927.

16. Hao, W.-H. Cao, Z.-T. Liu, Wu and Xiao, Visual-audio emotion recognition based on multi-task and ensemble learning with multiple features, Neurocomputing. In press, corrected proof, available online 20 Jan. 2020.

17. T.-Y. Hung and K.-C. Fan, Local vector pattern in high-order derivative space for face recognition, in: 2014 IEEE International Conference on Image Processing (ICIP), IEEE, 2014, pp. 239–243. doi:10.1109/ICIP.2014.7025047.

18. Jain, Kumar, Shamsolmoali and Zareapoor, Hybrid deep neural networks for face emotion recognition, Pattern Recognition Letters 115 (2018), 101–106. doi:10.1016/j.patrec.2018.04.010.

19. Kim, P.P. Roy and Jeong, Efficient facial expression recognition algorithm based on hierarchical deep neural network structure, IEEE Access 7 (2019), 41273–41285. doi:10.1109/ACCESS.2019.2907327.

20. Kim and E.M. Provost, ISLA: Temporal segmentation and labeling for audio-visual emotion recognition, IEEE Transactions on Affective Computing 10(2) (2019), 196–208.

21. Koulierakis, Siolas, Efthimiou, Fotinea and A.-G. Stafylopatis, Recognition of static features in sign language using key-points, in: Proceedings of the LREC2020 9th Workshop on the Representation and Processing of Sign Languages: Sign Language Resources in the Service of the Language Community, Technological Challenges and Application Perspectives, 2020, pp. 123–126.

22. Lenzoni, Bozzoni, Burgio, de Gelder and Semenza, Recognition of emotions conveyed by facial expression and body postures in myotonic dystrophy (DM), Cortex (2020). In press, uncorrected proof, available online 19.

23. et al., The fusion of electroencephalography and facial expression for continuous emotion recognition, IEEE Access 7 (2019), 155724–155736. doi:10.1109/ACCESS.2019.2949707.

24. Li, Lei, A.H. Alavi and G.G. Wang, Elephant herding optimization: Variants, hybrids, and applications, Mathematics 8(9) (2020), 1415. doi:10.3390/math8091415.

25. Li and Deng, Reliable crowdsourcing and deep locality-preserving learning for unconstrained facial expression recognition, IEEE Transactions on Image Processing 28(1) (2019), 356–370. doi:10.1109/TIP.2018.2868382.

26. Mehdizadehfar, Ghassemi, Fallah and Pouretemad, EEG study of facial emotion recognition in the fathers of autistic children, Biomedical Signal Processing and Control 56 (2020), 101721. doi:10.1016/j.bspc.2019.101721.

27. Mehendale, Facial emotion recognition using convolutional neural networks (FERC), SN Applied Sciences 2(3) (2020), 1–8.

28. Mirjalili and Lewis, The whale optimization algorithm, Advances in Engineering Software 95 (2016), 51–67. doi:10.1016/j.advengsoft.2016.01.008.

29. Mirjalili, S.M. Mirjalili and Lewis, Grey wolf optimizer, Advances in Engineering Software 69 (2014), 46–61. doi:10.1016/j.advengsoft.2013.12.007.

30. Mistry, Zhang, S.C. Neoh, C.P. Lim and Fielding, A micro-GA embedded PSO feature selection approach to intelligent facial emotion recognition, IEEE Transactions on Cybernetics 47(6) (2017), 1496–1509. doi:10.1109/TCYB.2016.2549639.

31. Mohan, S.S. Chee, D.K.P. Xin and L.P. Foong, Artificial neural network for classification of depressive and normal in EEG, in: 2016 IEEE EMBS Conference on Biomedical Engineering and Sciences (IECBES), 2016.

32. Passardi, Peyk, Rufer, T.S.H. Wingenbach and M.C. Pfaltz, Facial mimicry, facial emotion recognition and alexithymia in post-traumatic stress disorder, Behaviour Research and Therapy 122 (2019), 103436. doi:10.1016/j.brat.2019.103436.

33. R.A. Potamias, Siolas and Stafylopatis, A robust deep ensemble classifier for figurative language detection, in: International Conference on Engineering Applications of Neural Networks, Springer, Cham, 2019, pp. 164–175. doi:10.1007/978-3-030-20257-6_14.

34. Precenzano, Parisi, Lanzara, Vetri, F.F. Operto, G.M.G. Pastorino, Ruberto, Messina, M.C. Risoleo, Santoro and Bitetti, Electroencephalographic abnormalities in autism spectrum disorder: Characteristics and therapeutic implications, Medicina 56(9) (2020), 419. doi:10.3390/medicina56090419.

35. et al., Facial expressions recognition based on cognition and mapped binary patterns, IEEE Access 6 (2018), 18795–18803. doi:10.1109/ACCESS.2018.2816044.

36. B.R. Rajakumar, Impact of static and adaptive mutation techniques on genetic algorithm, International Journal of Hybrid Intelligent Systems 10(1) (2013), 11–22. doi:10.3233/HIS-120161.

37. B.R. Rajakumar, Static and adaptive mutation techniques for genetic algorithm: A systematic comparative analysis, International Journal of Computational Science and Engineering 8(2) (2013), 180–193. doi:10.1504/IJCSE.2013.053087.

38. B.R. Rajakumar and George, A new adaptive mutation technique for genetic algorithm, in: Proceedings of IEEE International Conference on Computational Intelligence and Computing Research (ICCIC), 2012, pp. 1–7.

39. L.M. Rappaport, D.M. Carney, Verhulst, M.C. Neale and Roberson-Nay, A developmental twin study of emotion recognition and its negative affective clinical correlates, Journal of the American Academy of Child & Adolescent Psychiatry 57(12) (2018), 925–933. doi:10.1016/j.jaac.2018.05.028.

40. Reed and Steed, The effects of concurrent cognitive task load on recognising faces displaying emotion, Acta Psychologica 193 (2019), 153–159. doi:10.1016/j.actpsy.2019.01.001.

41. Ryu, A.R. Rivera, Kim and Chae, Local directional ternary pattern for facial expression recognition, IEEE Transactions on Image Processing 26(12) (2017), 6006–6018. doi:10.1109/TIP.2017.2726010.

42. Sarvakar, Senkamalavalli, Raghavendra, J.S. Kumar, Manjunath and Jaiswal, Facial emotion recognition using convolutional neural networks, Materials Today: Proceedings (2021).

43. Soni and Y.S. Rawal, A comparative assessment of emotional intelligence of Gen-X and Gen-Y professionals in Udaipur district.

44. S.M. Swamy, B.R. Rajakumar and I.R. Valarmathi, Design of hybrid wind and photovoltaic power system using opposition-based genetic algorithm with Cauchy mutation, in: IET Chennai Fourth International Conference on Sustainable Energy and Intelligent Systems (SEISCON 2013), Chennai, India, 2013.

45. Venkata Rao, Jaya: A simple and new optimization algorithm for solving constrained and unconstrained optimization problems, International Journal of Industrial Engineering Computations 7 (2016), 19–34. doi:10.5267/j.ijiec.2015.8.004.

46. Wang, Zhou, Sun and Cui, Firefly algorithm with neighborhood attraction, Information Sciences 382–383 (2017), 374–387. doi:10.1016/j.ins.2016.12.024.

47. S.-H. Wang, Phillips, Z.-C. Dong and Y.-D. Zhang, Intelligent facial emotion recognition based on stationary wavelet entropy and Jaya algorithm, Neurocomputing 272 (2018), 668–676. doi:10.1016/j.neucom.2017.08.015.

48. M.J. West, A.J. Angwin, D.A. Copland, W.L. Arnott and N.L. Nelson, Cross-modal emotion recognition and autism-like traits in typically developing children, Journal of Experimental Child Psychology 191 (2020), 104737. doi:10.1016/j.jecp.2019.104737.

49. Yang, Cao, Ni and Zhang, Facial expression recognition using weighted mixture deep neural network based on double-channel facial images, IEEE Access 6 (2018), 4630–4640. doi:10.1109/ACCESS.2017.2784096.

50. Yitzhak, Gurevich, Inbar, Lecker and Aviezer, Recognition of emotion from subtle and non-stereotypical dynamic facial expressions in Huntington’s disease, Cortex (2020). In press, uncorrected proof, available online 7.

51. Zhang, Jolfaei and Alazab, A face emotion recognition method using convolutional neural network and image edge computing, IEEE Access 7 (2019), 159081–159089. doi:10.1109/ACCESS.2019.2949741.

52. Zhang, Huang, Du and Wang, Facial expression recognition based on deep evolutional spatial-temporal networks, IEEE Transactions on Image Processing 26(9) (2017), 4193–4203. doi:10.1109/TIP.2017.2689999.

53. Zhang, Zheng, Cui, Zong and Li, Spatial–temporal recurrent neural network for emotion recognition, IEEE Transactions on Cybernetics 49(3) (2019), 839–847. doi:10.1109/TCYB.2017.2788081.