Ensemble of pre-learned deep learning model and an optimized LSTM for Alopecia Areata classification

Abstract

Alopecia Areata (AA) is one of the most widespread diseases and is commonly classified and diagnosed with Computer-Aided Diagnosis (CAD) models. Although CAD improves AA diagnosis, it has limited interoperability and still needs skilled radiologists to interpret medical images. This problem can be addressed by combining Deep Learning (DL) models with CAD to diagnose AA patients accurately. Many studies rely on a single DL model, such as a Convolutional Neural Network (CNN), which yields inconsistent results across datasets and involves many parameters, limiting generalizability. To overcome this limitation, this work proposes an Ensemble Pre-Learned DL and Optimized Long Short-Term Memory (EPL-OLSTM) model for AA classification. First, healthy and AA scalp hair images are separately fed to pre-learned CNN structures, i.e., AlexNet, ResNet, and InceptionNet, to extract deep features. These features are then passed to the OLSTM, in which the Battle Royale Optimization (BRO) algorithm optimizes the LSTM’s hyperparameters. The output of the LSTM is classified by the fuzzy-softmax layer into the associated AA classes: mild, moderate, and severe. The model thereby increases the accuracy of differentiating between healthy and multiple AA scalp hair classes. Finally, extensive experiments on the Figaro1k (healthy scalp hair images) and DermNet (AA scalp hair images) datasets show that the EPL-OLSTM achieves 93.1% accuracy, which exceeds that of the state-of-the-art DL models considered.

Keywords

Alopecia areata, computer-aided diagnosis, deep learning, pre-learned CNN, LSTM, battle royale optimizer, fuzzy-softmax

1 Introduction

Hair is an essential feature of a person’s physical appearance. The keratin layer of hair becomes brittle and splits under environmental factors, like temperature and humidity, along with physicochemical treatments, which damages hair quality or causes hair loss. Hair loss also arises from Alopecia Areata (AA), which is an autoimmune disease involving nonscarring hair loss in well-defined patches that can spread over the whole scalp area and lead to baldness [1]. AA affects millions of individuals worldwide, particularly those with a family history of AA. It arises when the body’s immune system targets the hair follicles, impeding their regular function and preventing hair growth. According to the World Health Organization (WHO), an estimated 1 in 1000 individuals are affected by AA. The lifetime risk of developing AA is nearly 2% [2]. AA data exhibit many distinct characteristics compared to other kinds of data, covering hair loss patterns, clinical features, treatment options, disease progression, psychological impact, genetic factors, and research and clinical trials. Such data help in understanding the clinical profile of affected individuals, as well as their response to treatment and management strategies. In past decades, trichoscopy and biopsies have mostly been required to classify and diagnose AA [3]. However, a disadvantage of these diagnostic methods is that the number of tests needed for a proper diagnosis is unpredictable.

As a result, there is a huge opportunity to develop novel models based on Artificial Intelligence (AI) algorithms for classifying and diagnosing AA [4–6]. Machine learning models, including Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and decision trees, have shown effective performance in the classification and diagnosis of multiple diseases. These models adopt various computer algorithms that can learn and adapt. In dermatology, effective classification and diagnosis have been accomplished by various machine learning models [7]. For instance, SVM, KNN, and decision trees have been applied to analyze and categorize scalp images, which assists in classifying scalp conditions like dandruff and AA. However, these models do not perform well on multi-class tasks, are sensitive to parameters such as the kernel function, and do not learn the correlation between samples.

To tackle these issues, DL models have been employed in recent medical diagnosis systems. In dermatology, a few researchers have developed different CNN models to classify and diagnose scalp hair problems such as dandruff, AA, allergies, folliculitis, and oily scalp. These models can also predict the different levels of hair loss from human scalp or skin images [8, 9]. However, they perform differently on different datasets because of the variation in the number of samples. This restricts their generalizability and makes it challenging to set appropriate parameters for network learning across a variety of datasets.

Hence, to address these problems, in this manuscript, an EPL-OLSTM model is proposed for AA classification from both human healthy scalp hair and AA scalp hair images. Compared to the previous works in AA classification, the proposed model can extract deep features from the scalp hair images using pre-learned CNN structures and classify them into corresponding AA classes using the OLSTM network in an automated way. This alleviates the manual extraction of features and reduces the computational complexity. The main contributions of this model are the following:

First, healthy and AA scalp hair images of various individuals are independently given to the AlexNet, ResNet, and InceptionNet models for deep feature extraction.

Second, the extracted deep features are passed to the OLSTM network, wherein the LSTM’s hyperparameters are optimized by the BRO algorithm.

Finally, the fuzzy-softmax function is applied to classify the resultant features from the LSTM network into the associated AA classes, including mild, moderate, and severe.

Based on this model, the accuracy of classifying and diagnosing AA can be improved significantly. It also enhances the model generalization for different kinds of medical images. The findings reveal the future potential of the ensemble DL model to differentiate AA classes and diagnose patients suitably.

The remaining article is prepared as follows: Section 2 discusses the works related to the classification and diagnosis of AA/scalp hair problems. Section 3 explains the EPL-OLSTM model and Section 4 illustrates its performance compared to the existing models. Section 5 summarizes the study and presents its future enhancement.

2 Literature survey

Nabahhin et al. [10] developed an expert system that suggests treatment for various probable hair loss disorders at different levels by asking individuals yes-or-no questions. First, it asks the user to choose the proper answer on each screen. At the end of the dialog session, the treatment and suggestions for the disorder are provided to the user. But more characteristics related to hair loss were needed to improve the diagnosis.

Wang et al. [11] applied the DL models to hairy scalp images to identify the different scalp conditions. In this model, the ImageNet-VGG-f structure Bag-Of-Words (BOW) was executed with an SVM classifier and Histogram-Of-Gradients (HOG) or Pyramid HOG (PHOG) with an SVM classifier. But the number of scalp images for training was inadequate and the accuracy was limited to the small datasets.

Lee et al. [12] identified the topographic phenotypes of AA using cluster analysis and designed a grading model to stratify diagnosis. At first, clinical images of patients with AA were collected. Then, topographic phenotypes of AA were detected by hierarchical clustering with Ward’s method. Also, variances in clinical features and diagnosis were compared across the different clusters. But the statistical efficiency was degraded because of the limited number of patients with severe AA.

Seo and Park [13] presented a scheme to prevent hair loss and diagnose the scalp by capturing Alopecia Features (AFs) from the scalp image. First, the scalp images were preprocessed to fine-tune the contrast of the microscopy input and reduce light reflection. Then, AFs like the number of hairs, follicles, and density were extracted from the preprocessed images by gridline selection and eigenvalues to compute the growth level of alopecia. But it needs a massive quantity of scalp images and an AI model that automatically extracts several kinds of AFs to increase efficiency.

Fatima et al. [14] investigated clinical, dermoscopic, and histopathological findings in patients with AA. In this investigation, 50 consecutive patients attending the dermatology outpatient department of a tertiary care hospital over 2 years with clinical features suggestive of AA were chosen. After that, a clinical analysis was conducted by dermoscopy and a skin biopsy taken from the margin of an active lesion. Moreover, the data were evaluated by determining the mean and standard deviation. However, it needs an automated model to identify and diagnose AA appropriately.

Ibrahim et al. [15] presented an analysis of the pre-trained categorization of scalp conditions with the help of image processing methods. At first, the scalp images were collected and preprocessed. Then, various characteristics like shape, color, and texture were obtained from all images to determine the Region-Of-Interest (ROI). The values of the pre-trained features were utilized as a reference during the categorization. The SVM was used to categorize the scalp conditions. But it takes more time and complexity due to the independent feature extraction and classification processes.

Zhang et al. [16] developed a rapid and simple technique to identify the level of hair damage based on the lightweight CNN model called Hair Diagnosis MobileNet (HDM-Net). In this technique, the HDM-Net was utilized to obtain and choose the features. Such features were then fed to the SVM to categorize hair damage images. Though it reduces the number of parameters, its accuracy was not effective.

Shakeel et al. [17] developed a model for the categorization of healthy hairs and AA. First, hair images of healthy and AA conditions were collected and preprocessed for partition. Then, various features such as texture, shape, and color were extracted from each segment. Moreover, SVM and KNN classifiers were employed to classify those features into healthy and AA. But these classifiers have a high computational complexity while using more images.

Gao et al. [18] presented a deep learning model for automated trichoscopy scan evaluation and a quantitative framework to categorize male androgenetic alopecia. First, trichoscopy scans were obtained, and a deep learner was constructed based on a Fully Convolutional Network (FCN). Then, the relationships between fundamental and detailed categorization were examined, and a quantitative framework was applied to predict fundamental and detailed categorization through multiple ordinal logistic regressions. But its performance was limited to the number of samples.

Jeong et al. [19] developed a deep learning-based intelligent scalp diagnosis and classification system called AI-ScalpGrader using EfficientNet to diagnose and categorize scalp conditions. But it achieved accuracy values of 87.3 to 91.3%. Roy and Protity [20] presented the 2D CNN model to predict different kinds of hair loss and scalp-related diseases. But the drawback of this framework was the unavailability of a proper dataset and the lack of variety among the images distributed over the internet.

Ying and Lin [21] developed a new self-learning fuzzy automaton with input and output fuzzy sets for system modeling, which can be used to solve issues in medical applications. Xing et al. [22] developed an efficient federated distillation learning system for multi-task time-series classification. It can be used for medical systems to analyze time-series data.

From the literature, it is observed that current studies focus on machine learning and DL models for scalp hair problem classification. However, such studies face many challenges in AA classification, such as limited and heterogeneous data, difficult scalp hair image analysis, inter- and intra-observer variability, lack of standardized classification criteria, and limited generalizability to diverse populations. These challenges reduce the accuracy of classification models and hinder their generalizability. Therefore, this study develops an ensemble DL model for classifying and diagnosing AA in humans using both healthy scalp and AA scalp hair images.

3 Proposed methodology

In this section, the EPL-OLSTM model is described briefly for classifying and diagnosing AA. Figure 1 depicts the overview of this study. First, the deep features are extracted from both healthy scalp hair and AA scalp hair images using the pre-learned CNN structures. After that, the extracted features are given to the OLSTM network, followed by the fuzzy-softmax layer for AA classification.

Fig. 1

Block diagram of the study.

3.1 Image acquisition

In this study, two different publicly available databases are acquired and they are:

Figaro1k database: It is an open database comprising 1050 healthy scalp hair images, equally distributed in various classes like straight, wavy, and curly [23]. Of these, 350 images of normal hair are considered for this study.

Dermnet database: It is an open database accessible on Dermnet, containing 23 classes of dermatological disorders, including AA [24]. Overall, 1050 images (350 from each AA type) are obtained for three distinct AA types: mild, moderate, and severe.

The healthy scalp hair and AA scalp hair images from these databases are processed by the EPL-OLSTM model for AA classification and diagnosis.

3.2 Deep feature extraction using pre-learned CNN model

In this study, three distinct pre-learned CNN structures are considered for deep feature extraction: AlexNet, InceptionNet-V1, and Residual Network (ResNet). The InceptionNet-V1 and ResNet structures have a single Fully Connected (FC) layer. The AlexNet structure has 3 distinct FC layers (FC6, FC7, and FC8), each of which has distinctive characteristics and a different efficiency. The separate efficiencies of these layers are determined, and the best-performing layer is found to be the FC6 layer. Table 1 presents the characteristics of the pre-learned CNN structures. Figures 2–4 illustrate the structures of the pre-learned CNN models.

Table 1
Details of different pre-learned CNN structures

Network	Depth	Size (MB)	Variables (millions)	Image dimension
AlexNet	8	227	61	227×227
InceptionNet-V1	20	27	7	224×224
ResNet	150	77	20	224×224

Fig. 2

Structure of AlexNet.

Fig. 3

Structure of InceptionNet-V1.

Fig. 4

Structure of ResNet.

So, these pre-learned CNN structures are separately used for extracting the deep features from both healthy scalp hair and AA scalp hair images. After completing the deep feature extraction, the extracted features are fed to the OLSTM network for further processing.

3.3 Optimized LSTM network model

The LSTM network includes three gates: the forget, input, and output gates. These gates regulate which data each LSTM unit retains, which efficiently mitigates the issue of gradient explosion and vanishing. Its architecture is depicted in Fig. 5.

Fig. 5

Architecture of LSTM network.

The forget gate computes how much of the data carried over from the preceding LSTM unit is forgotten, as in Equation (1):

$f_{t} = σ (W_{f} \cdot [h_{t - 1}, x_{t}] + b_{f})$ (1)

In Equation (1), W_f and b_f are the weight vector and bias value of the forget gate, respectively, σ is the sigmoid activation function, x_t is the input feature, f_t is the forget gate, and h_{t-1} is the result of the previous hidden state.

The role of the input gate is to estimate how much of the present data is included in the cell state, as in Equations (2) and (3):

$i_{t} = σ (W_{i} \cdot [h_{t - 1}, x_{t}] + b_{i})$ (2)

${\tilde{C}}_{t} = tanh (W_{C} \cdot [h_{t - 1}, x_{t}] + b_{C})$ (3)

In Equations (2) and (3), W_i and W_C are the weight vectors of the input gate and the neuron condition vector, respectively, and b_i and b_C are the corresponding bias values. tanh is the hyperbolic tangent activation function, and $i_{t}$ and ${\tilde{C}}_{t}$ are the input gate and the updated new cell state, respectively.

Once the data pass through the input and forget gates, the LSTM updates its cell state and passes it to the subsequent LSTM unit, as in Equation (4):

$C_{t} = f_{t} * C_{t - 1} + i_{t} * {\tilde{C}}_{t}$ (4)

In Equation (4), C_t is the current cell state, and C_{t-1} is the old cell state. The output gate merges the present input and the LSTM unit state to compute the result of the present LSTM unit, as in Equations (5) and (6):

$o_{t} = σ (W_{o} \cdot [h_{t - 1}, x_{t}] + b_{o})$ (5)

$h_{t} = o_{t} * tanh (C_{t})$ (6)

In Equations (5) and (6), h_t represents the hidden state that serves as the output of the block at step t, o_t is the output gate, and W_o and b_o are the weight vector and bias value of the output gate, respectively.
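As an illustration, Equations (1)–(6) can be sketched as a single LSTM step in NumPy. This is a minimal sketch and not the MATLAB implementation used in the experiments: the function names `lstm_step` and `init_params`, the parameter dictionary layout, the random initialization, and the toy sizes (8 input features, 4 hidden units) are all assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step following Equations (1)-(6)."""
    concat = np.concatenate([h_prev, x_t])                      # [h_{t-1}, x_t]
    f_t = sigmoid(params["W_f"] @ concat + params["b_f"])       # Eq. (1): forget gate
    i_t = sigmoid(params["W_i"] @ concat + params["b_i"])       # Eq. (2): input gate
    c_tilde = np.tanh(params["W_C"] @ concat + params["b_C"])   # Eq. (3): candidate state
    c_t = f_t * c_prev + i_t * c_tilde                          # Eq. (4): new cell state
    o_t = sigmoid(params["W_o"] @ concat + params["b_o"])       # Eq. (5): output gate
    h_t = o_t * np.tanh(c_t)                                    # Eq. (6): hidden state
    return h_t, c_t

def init_params(input_size, hidden_size, seed=0):
    """Hypothetical random initialization, used only to make the sketch runnable."""
    rng = np.random.default_rng(seed)
    params = {}
    for gate in ("f", "i", "C", "o"):
        params[f"W_{gate}"] = rng.normal(0.0, 0.1, (hidden_size, hidden_size + input_size))
        params[f"b_{gate}"] = np.zeros(hidden_size)
    return params

# Run five steps over a toy sequence of 8-dimensional feature vectors.
params = init_params(input_size=8, hidden_size=4)
h, c = np.zeros(4), np.zeros(4)
for x_t in np.random.default_rng(1).normal(size=(5, 8)):
    h, c = lstm_step(x_t, h, c, params)
```

Because h_t is the product of a sigmoid output and a tanh output, every entry of the hidden state stays strictly inside (-1, 1), whatever the input scale.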

The LSTM network is trained by the Adam optimizer with an initial training rate of 0.001, an epoch number of 100, and a batch size of 15.

On the other hand, the major problem is tuning of LSTM network’s hyperparameters such as the number of hidden layers, number of hidden nodes in each layer, batch size, number of epochs, training rate, weight, and bias values. To solve this problem and optimize the LSTM network’s hyperparameters, the BRO algorithm is adopted in this study.

The BRO algorithm is motivated by the kind of digital games such as battle royale. The BRO is a population-based algorithm, where all individuals are defined by the warrior (different set of LSTM network’s hyperparameters) who wants to relocate to the safest (best hyperparameter set) location and stay living.

The BRO starts with a random population that is evenly dispersed over the search area. Then, each warrior fires at the warrior closest to it in an attempt to kill it, so warriors in stronger locations injure their closest neighbors. Each time a warrior is wounded by another, its injury level rises by 1, i.e., x_i . injury = x_i . injury + 1, where x_i . injury defines the injury level of the i^th warrior in the population. Additionally, warriors seek to switch locations as soon as they are injured to hit enemies from a different angle. As a result, to concentrate on exploitation, the injured warrior moves toward a location between its original location and the safest location found thus far (the leader). This move is determined according to:

$x_{inj, d} = x_{inj, d} + r (x_{opt, d} - x_{inj, d})$ (7)

In Equation (7), r denotes a random number uniformly distributed between 0 and 1, x_inj,d represents the location of the injured warrior in dimension d, and x_opt,d indicates the location of the optimal result obtained thus far. Also, when an injured warrior injures its enemy in the following iteration, x_i . injury is reset to 0. To concentrate on exploration, when the injury level of a warrior exceeds the fixed threshold value, the warrior dies and respawns randomly within the feasible search area, and x_i . injury is reset to 0. Based on trial and error, the threshold value is set to 3. This mechanism prevents early convergence and offers a better search. The warrior returning to the search area after being killed is given by:

$x_{inj, d} = r ({ul}_{d} - {ll}_{d}) + {ll}_{d}$ (8)

In Equation (8), ll_d and ul_d are the lower and upper limits of dimension d in the search area, respectively. Further, every Δ iterations, the feasible search area of the problem shrinks toward the optimal result. The initial value is Δ = log₁₀ (MaxCircle); afterwards, $Δ = Δ + round (\frac{Δ}{2})$ , where MaxCircle defines the maximum number of generations. This schedule aids in both exploration and exploitation. Therefore, the lower and upper limits are modified by

${ll}_{d} = x_{opt, d} - SD ({\tilde{x}}_{d}), \quad {ul}_{d} = x_{opt, d} + SD ({\tilde{x}}_{d})$ (9)

In Equation (9), $SD ({\tilde{x}}_{d})$ denotes the standard deviation of the entire population in dimension d. When ll_d or ul_d exceeds the original lower or upper limit, it is reset to the original ll_d or ul_d. Moreover, to preserve elitism, the best warrior discovered in each iteration is retained and termed the leader.

Also, the computational complexity of BRO depends on the population size and the maximum number of iterations. Because every result must be compared with every other result to determine their Euclidean distances, the cost for a population of size n is O (n²) per iteration. So, for m iterations, the computational complexity of BRO is O (m · n²), which becomes O (n³) when m is of the same order as n. Figure 6 illustrates the OLSTM for AA classification.
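The quadratic cost per iteration comes from the nearest-opponent search. A minimal NumPy sketch follows, in which the helper name `nearest_opponents` and the four toy 2-D warriors are assumptions for illustration. It builds the full n × n distance matrix that this step requires.

```python
import numpy as np

def nearest_opponents(population):
    """Return, for each warrior, the index of its closest other warrior."""
    diff = population[:, None, :] - population[None, :, :]   # shape (n, n, d)
    dist = np.sqrt((diff ** 2).sum(axis=-1))                 # n x n Euclidean distances
    np.fill_diagonal(dist, np.inf)                           # a warrior is not its own opponent
    return dist.argmin(axis=1)

# Four toy warriors forming two close pairs.
population = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.0, 0.8]])
nearest = nearest_opponents(population)
```

Here each warrior is paired with the other member of its cluster, and the n × n matrix `dist` is what makes one pass over the population cost O (n²).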

Fig. 6

Flow diagram of OLSTM network model for AA classification.

Algorithm 1: OLSTM using BRO algorithm

Input: Set of LSTM hyperparameters (i.e., training rate, epoch number, batch size, weight, bias, etc.)

Output: Optimal set of hyperparameters

Begin

Arbitrarily initialize a population (set of hyperparameters);

Initialize the maximum iteration (Itr_max);

Initialize Shrink = ceil (log₁₀ (MaxCircle));

$Δ = round (\log_{10} (\frac{MaxCircle}{Shrink}))$ ;

Itr = 1;

while (Itr ≤ Itr_max)

for (i = 1 : population size)

Evaluate the fitness function f (classification accuracy) of the i^th warrior and of the closest one (j^th);

inj = j;

win = i;

if (f (x_i) < f (x_j))

inj = i;

win = j;

end if

if (x_inj . injury < Threshold)

for (d = 1 : dimension)

Modify the location of the injured warrior depending on Equation (7):

x_inj,d = x_inj,d + r (x_opt,d - x_inj,d);

end for

x_inj . injury = x_inj . injury + 1;

x_win . injury = 0;

else

for (d = 1 : dimension)

x_inj,d = r (ul_d - ll_d) + ll_d;

end for

x_inj . injury = 0;

end if

Modify f (x_inj);

end for

if (Itr ≥ Δ)

Modify ll and ul depending on Equation (9);

When ll_d or ul_d surpasses the actual minimum/maximum limit, it is assigned to the actual ll_d or ul_d;

Δ = Δ + round (Δ / 2);

end if

Itr = Itr + 1;

end while

Choose the best warrior (optimal hyperparameters) as the result;

End
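A compact, runnable sketch of this loop is given below. It is an illustration under stated assumptions, not the implementation used in this study: the function `bro_maximize`, the toy objective `toy_fitness` (standing in for the classification accuracy of an LSTM trained with a candidate hyperparameter set), the population size, and the box bounds are all hypothetical. The initial shrink interval follows the prose description, Δ = log₁₀ (MaxCircle), and the upper limit is shrunk symmetrically to the lower one.

```python
import math
import numpy as np

def bro_maximize(fitness, lower, upper, pop_size=20, max_iter=100,
                 threshold=3, seed=0):
    """Battle Royale Optimization sketch that maximizes `fitness` inside box bounds."""
    rng = np.random.default_rng(seed)
    lower0 = np.asarray(lower, dtype=float)
    upper0 = np.asarray(upper, dtype=float)
    lb, ub = lower0.copy(), upper0.copy()
    dim = lb.size

    pop = rng.uniform(lb, ub, size=(pop_size, dim))     # random initial warriors
    fit = np.array([fitness(x) for x in pop])
    injury = np.zeros(pop_size, dtype=int)
    leader = int(fit.argmax())
    best_x, best_f = pop[leader].copy(), float(fit[leader])

    # Assumption: initial shrink interval taken from the prose, log10(max_iter).
    delta = max(1, round(math.log10(max_iter)))

    for itr in range(1, max_iter + 1):
        for i in range(pop_size):
            dist = np.linalg.norm(pop - pop[i], axis=1)
            dist[i] = np.inf                             # exclude the warrior itself
            j = int(dist.argmin())                       # closest opponent
            # Higher fitness wins; the loser is the injured warrior.
            inj, win = (i, j) if fit[i] < fit[j] else (j, i)
            if injury[inj] < threshold:
                r = rng.random(dim)
                pop[inj] = pop[inj] + r * (best_x - pop[inj])   # Eq. (7)
                injury[inj] += 1
                injury[win] = 0
            else:
                pop[inj] = rng.random(dim) * (ub - lb) + lb     # Eq. (8): respawn
                injury[inj] = 0
            fit[inj] = fitness(pop[inj])                 # re-evaluate the moved warrior
            if fit[inj] > best_f:
                best_x, best_f = pop[inj].copy(), float(fit[inj])
        if itr >= delta:
            sd = pop.std(axis=0)
            lb = np.maximum(best_x - sd, lower0)         # Eq. (9), clamped to original limits
            ub = np.minimum(best_x + sd, upper0)
            delta += max(1, round(delta / 2))            # guard keeps the interval growing
    return best_x, best_f

def toy_fitness(x):
    """Hypothetical objective with its maximum (0) at x = (0.3, 0.3)."""
    return -float(np.sum((x - 0.3) ** 2))

best_x, best_f = bro_maximize(toy_fitness, lower=[-1.0, -1.0], upper=[1.0, 1.0])
```

In an actual hyperparameter search, each call to `fitness` would train and validate an LSTM, so the number of fitness evaluations (one per warrior per iteration) dominates the running time.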

According to this BRO, the optimal hyperparameters utilized in the LSTM network model are chosen for model training. Moreover, the output of the LSTM network (FV) is provided to the FC layer followed by the fuzzy-softmax layer for AA classification.

3.4 Fuzzy-softmax classifier

If the LSTM network provides n features, then the LSTM network layer has the output in the form:

$FV = (x_{1}, x_{2}, \dots, x_{n})$ (10)

The output of the LSTM layer is passed through the softmax layer, which converts the raw output to class probabilities. The LSTM layer provides a vector with 4 scores, each score is associated with a different scalp hair condition. By the softmax classifier, the final result of the scalp hair conditions is calculated by

$\hat{y} = \underset{L_{j} \in L}{argmax} (\frac{\exp (T_{L_{j}} (FV))}{\sum_{L_{k} \in L} \exp (T_{L_{k}} (FV))})$ (11)

In Equation (11), T_{L_j} (FV) is the score for choosing the L_j class as the scalp hair condition (i.e., healthy, mild AA, moderate AA, and severe AA), and the sum in the denominator runs over all classes in L.

Yu [25] showed that combining the softmax function with fuzzy inference can increase the discrimination ability of this classification function. So, a new fuzzy-softmax function is applied that utilizes intuitionistic fuzzy sets. This function considers both the membership and non-membership values of the LSTM output features x with respect to each class.

In this fuzzy-softmax classifier, T_{L_j} (FV) is calculated according to the fuzzy membership and fuzzy non-membership degrees of the FV input vector with respect to the associated classes.

$T_{L_{j}} (FV) = \sum_{i = 1}^{n} (s_{L_{j}} (x_{i}) f (x_{i} \times v_{L_{j}} (x_{i})) + x_{i})$ (12)

In Equation (12), s_{L_j} (x_i) is the fuzzy significance of the x_i feature associated with the L_j output class, v_{L_j} (x_i) is the weight value between the x_i feature and the L_j output class, and f (x) is an activation function. The value of s_{L_j} (x_i) is determined according to the fuzzy membership value μ_{L_j} (x_i) and the fuzzy non-membership value ϑ_{L_j} (x_i) of the x_i feature and the L_j output class as:

$s_{L_{j}} (x_{i}) = \frac{1 + μ_{L_{j}} (x_{i}) - ϑ_{L_{j}} {(x_{i})}^{λ}}{2}$ (13)

The values of the factors μ_{L_j} (x_i) and ϑ_{L_j} (x_i) are calculated depending on the weight vector U connecting each LSTM feature to the appropriate class, where u_nj is the weight between the n^th feature in the LSTM layer and the L_j output class.

$U = (u_{11}, u_{12}, \dots, u_{nj})$ (14)

$μ_{L_{j}} (x_{i}) = (u_{ij})$ (15)

$ϑ_{L_{j}} (x_{i}) = \frac{\sum_{o = 1, o \neq j}^{n} μ_{L_{o}} (x_{i})}{(n - 1)}$ (16)

The degree of significance for the non-membership of the fuzzy significance value in Equation (13) is controlled by the variable λ, which is equal to 0.7.

Thus, the fuzzy-softmax layer classifies the features from the healthy scalp hair and AA scalp hair images into different classes.
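The fuzzy-softmax computation in Equations (11)–(16) can be sketched in NumPy as follows. Several details are assumptions made for the example rather than statements of the method: the function name `fuzzy_softmax`, the choice of tanh for the activation f, membership weights U drawn from [0, 1] so that the power λ is well defined, the reading of the sum in Equation (16) as running over the other classes, and the toy sizes (6 features, 4 classes).

```python
import numpy as np

def fuzzy_softmax(fv, U, V, lam=0.7, f=np.tanh):
    """Fuzzy-softmax sketch.

    fv : (n,) feature vector from the LSTM, Eq. (10)
    U  : (n, c) membership weights in [0, 1], Eqs. (14)-(15)
    V  : (n, c) feature-to-class weights v_{L_j}(x_i)
    Returns the class probabilities and the index of the predicted class.
    """
    n, c = U.shape
    mu = U                                                   # Eq. (15): membership
    nu = (mu.sum(axis=1, keepdims=True) - mu) / (c - 1)      # Eq. (16): mean over other classes
    s = (1.0 + mu - nu ** lam) / 2.0                         # Eq. (13): fuzzy significance
    x = fv[:, None]                                          # broadcast features over classes
    T = (s * f(x * V) + x).sum(axis=0)                       # Eq. (12): one score per class
    e = np.exp(T - T.max())                                  # shift for numerical stability
    probs = e / e.sum()                                      # softmax inside Eq. (11)
    return probs, int(probs.argmax())

# Toy example with random weights (hypothetical values).
rng = np.random.default_rng(0)
n_features, n_classes = 6, 4
fv = rng.normal(size=n_features)
U = rng.uniform(0.0, 1.0, size=(n_features, n_classes))
V = rng.normal(size=(n_features, n_classes))
probs, label = fuzzy_softmax(fv, U, V)
```

Note that the additive x_i term in Equation (12) is identical for every class, so it shifts all scores equally and does not change the softmax output; the class separation comes from the fuzzy significance term.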

4 Results and discussion

This section investigates the performance of the EPL-OLSTM model by executing it in MATLAB 2017b using the Figaro1k and DermNet databases (discussed in Section 3.1). In this experiment, a total of 1400 photos (350 from Figaro1k and 1050 from DermNet) are used. Of these, 1120 photos (280 from Figaro1k (i.e., normal hair class) and 840 from DermNet (i.e., 280 mild, 280 moderate, and 280 severe AA)) are applied for training. Similarly, 280 photos (70 from Figaro1k (i.e., normal hair) and 210 from DermNet (70 mild, 70 moderate, and 70 severe AA)) are applied for testing. Figure 7 shows some sample scalp hair images from the considered databases for various classes.
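The split sizes above correspond to an 80/20 division of each class. The short sketch below reproduces the arithmetic; the dictionary names are assumptions for illustration, and the 4/5 ratio is inferred from 280 of 350 images per class.

```python
# Images per class: 350 healthy (Figaro1k) and 350 per AA type (DermNet).
per_class = {"healthy": 350, "mild AA": 350, "moderate AA": 350, "severe AA": 350}

# 4/5 of each class for training (280 of 350), the rest for testing.
train = {name: count * 4 // 5 for name, count in per_class.items()}
test = {name: count - train[name] for name, count in per_class.items()}

total_train = sum(train.values())
total_test = sum(test.values())
```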

Fig. 7

Healthy scalp hair and different kinds of AA scalp images.

The existing models, including HDM-Net [16], KNN [17], FCN [18], and CNN [20], are also tested on the considered datasets to assess the effectiveness of the proposed EPL-OLSTM model. The performance evaluation metrics are defined as:

Accuracy: It is the proportion of correctly identified images over the total images analyzed. It is calculated by Equation (17).

$Accuracy = \frac{True Positive (TP) + True Negative (TN)}{TP + TN + False Positive (FP) + False Negative (FN)}$ (17)

In Equation (17), the number of healthy pictures correctly identified as healthy is TP, while the number of AA pictures correctly identified as AA is TN. In addition, FP is the number of AA pictures identified as healthy, whereas FN is the number of healthy pictures identified as AA.

Precision: It is determined by Equation (18).

$Precision = \frac{TP}{TP + FP}$ (18)

Recall: It is determined by Equation (19).

$Recall = \frac{TP}{TP + FN}$ (19)

F-measure: It is calculated by Equation (20).

$F - measure = \frac{2 \times Precision \times Recall}{Precision + Recall}$ (20)

Table 2 presents the confusion matrices for the existing and proposed models on the considered test images.

Table 2

Confusion matrix for existing and proposed AA classification and diagnosis models during testing phase

Models	Classified class	Actual class 1	Actual class 2	Actual class 3	Actual class 4
KNN	1	57	6	3	4
	2	5	52	8	5
	3	5	5	55	5
	4	3	7	4	56
HDM-Net	1	59	4	3	4
	2	3	56	6	5
	3	5	5	56	4
	4	3	5	5	57
FCN	1	59	4	3	4
	2	3	58	7	2
	3	5	5	56	4
	4	3	3	4	60
CNN	1	63	2	2	3
	2	3	60	5	2
	3	2	5	60	3
	4	2	3	3	62
Proposed (EPL-OLSTM)	1	64	1	3	2
	2	1	66	1	2
	3	1	2	66	1
	4	4	1	0	65

*Note: 1 – Healthy; 2 – Mild AA; 3 – Moderate AA; 4 – Severe AA.
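As a worked example, the metrics of Equations (17)–(20) can be computed from the EPL-OLSTM confusion matrix in Table 2. The one-vs-rest treatment of each class and the macro-averaging of precision and recall are assumptions of this sketch, since the averaging scheme is not stated; the resulting values may therefore differ slightly from those reported in Table 3.

```python
import numpy as np

# EPL-OLSTM confusion matrix from Table 2.
# Rows: classified class, columns: actual class (1 healthy, 2 mild, 3 moderate, 4 severe).
cm = np.array([[64, 1, 3, 2],
               [1, 66, 1, 2],
               [1, 2, 66, 1],
               [4, 1, 0, 65]])

tp = np.diag(cm).astype(float)
fp = cm.sum(axis=1) - tp    # classified as class k but actually another class
fn = cm.sum(axis=0) - tp    # actually class k but classified as another class

accuracy = tp.sum() / cm.sum()                       # correctly classified / all test images
precision = (tp / (tp + fp)).mean()                  # macro-averaged Eq. (18)
recall = (tp / (tp + fn)).mean()                     # macro-averaged Eq. (19)
f_measure = 2 * precision * recall / (precision + recall)   # Eq. (20)
```

The diagonal holds 261 of the 280 test images, which gives an accuracy of about 93.2%.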

Table 3 shows the performance values for existing and proposed AA classification and diagnosis models during the testing phase.

Table 3

Performance analysis of existing and proposed AA classification and diagnosis models

Models	Accuracy (%)	Precision (%)	Recall (%)	F-measure (%)
KNN	78.52	77.96	78.31	78.14
HDM-Net	81.36	81.10	81.24	81.17
FCN	83.47	82.49	83.05	82.77
CNN	87.68	87.82	87.24	87.53
Proposed (EPL-OLSTM)	93.1	92.84	92.97	92.91

Figure 8 illustrates the values of the performance metrics for both existing and proposed AA classification and diagnosis models. It is noticed that the proposed EPL-OLSTM model achieves higher performance than the other existing models. In relative terms, the accuracy of the EPL-OLSTM is 18.57%, 14.43%, 11.54%, and 6.18% higher than that of the KNN, HDM-Net, FCN, and CNN models, respectively. The precision of the EPL-OLSTM is 19.09%, 14.48%, 12.55%, and 5.72% higher than that of the KNN, HDM-Net, FCN, and CNN models, respectively. The recall of the EPL-OLSTM is 18.72%, 14.44%, 11.94%, and 6.57% higher than that of the KNN, HDM-Net, FCN, and CNN models, respectively. Also, the f-measure of the EPL-OLSTM is 18.9%, 14.46%, 12.25%, and 6.15% higher than that of the KNN, HDM-Net, FCN, and CNN models, respectively.

Fig. 8

Comparison of proposed and existing AA classification and diagnosis models.

This reveals that the EPL-OLSTM can classify both healthy scalp hair and AA scalp hair images efficiently, in contrast with the other existing models.
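The improvement percentages quoted for accuracy are relative gains over each baseline, computed from the Table 3 values. A minimal sketch of this arithmetic follows, where the helper name `relative_gain` is an assumption for illustration.

```python
# Accuracy values (%) taken from Table 3.
proposed_accuracy = 93.1
baseline_accuracy = {"KNN": 78.52, "HDM-Net": 81.36, "FCN": 83.47, "CNN": 87.68}

def relative_gain(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100.0

gains = {name: round(relative_gain(proposed_accuracy, acc), 2)
         for name, acc in baseline_accuracy.items()}
```

For example, the gain over KNN is (93.1 - 78.52) / 78.52 × 100 ≈ 18.57%, whereas the absolute difference is 14.58 percentage points.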

Table 4 shows the computational complexity of existing and proposed AA classification and diagnosis models.

Table 4

Computational complexity of existing and proposed AA classification and diagnosis models

Models	Computational complexity
KNN	O (n log k)
HDM-Net	O (k · n · d)
FCN	O (k · n · d²)
CNN	O (k · n² · d²)
Proposed (EPL-OLSTM)	O (n · d)

*n: number of training data; k: number of nearest neighbors (for KNN) or convolution kernel size (for HDM-Net, FCN, and CNN); d: input dimension.

4.1 Limitations of the proposed study

The limitations of the proposed study include: (i) the deep-learning model needs a huge quantity of training samples, but this study considers limited samples, (ii) the limited availability of well-annotated images representing different stages of AA can impact the model’s generalizability, and (iii) the pre-learned CNN models may not capture sufficiently discriminative features from scalp hair images, which may require additional preprocessing steps like segmentation.

5 Conclusion

In this study, the EPL-OLSTM model was designed to classify healthy and AA scalp hair images. First, AlexNet, ResNet, and InceptionNet-V1 were applied for deep feature extraction. Then, the OLSTM network with the fuzzy-softmax classifier was developed for classification. Finally, the test results showed that the EPL-OLSTM model achieves an accuracy of 93.1% on the Figaro1k and DermNet datasets, exceeding the existing models. As a result, it can support physicians in diagnosing patients who suffer from AA earlier. Future work will acquire more images from various sources for model training, validate the approach using other pre-learned CNN models, and develop advanced image segmentation models for improved feature extraction.

References

1. Lintzeri D.A., Constantinou, Hillmann, Ghoreschi, Vogt and Blume-Peytavi, Alopecia Areata–Current Understanding and Management, Journal der Deutschen Dermatologischen Gesellschaft 20(1) (2022), 59–90.

2. Pratt C.H., King L.E., Messenger A.G., Christiano A.M. and Sundberg J.P., Alopecia Areata, Nature Reviews Disease Primers 3(1) (2017), 1–17.

3. Ocampo-Garza and Tosti, Trichoscopy of Dark Scalp, Skin Appendage Disorders 5(1) (2019), 1–8.

4. Gupta A.K., Ivanova I.A. and Renaud H.J., How Good is Artificial Intelligence (AI) at Solving Hairy Problems? A Review of AI Applications in Hair Restoration and Hair Disorders, Dermatologic Therapy 34(2) (2021), 1–9.

5. Daniels, Tamburic, Benini, Randall, Sanderson and Savardi, Artificial Intelligence in Hair Research: A Proof-of-concept Study on Evaluating Hair Assembly Features, International Journal of Cosmetic Science 43(4) (2021), 405–418.

6. Jhong S.Y., Yang P.Y. and Hsia C.H., An Expert Smart Scalp Inspection System Using Deep Learning, Sensors and Materials 34(4) (2022), 1265–1274.

7. Alarcón-Soldevilla, Hernandez-Gómez F.J., García-Carmona J.A., Carreño C.C., Grimalt, Vañó-Galvan and Arcas-Tunez, Use of Artificial Intelligence as a Predictor of the Response to Treatment in Alopecia Areata, Iproceedings 7(1) (2021), 1–2.

8. Chang W.J., Chen M.C., Chen L.B., Chiu Y.C., Hsu C.H., Ou Y.K. and Chen, A Mobile Device-Based Hairy Scalp Diagnosis System Using Deep Learning Techniques, in: IEEE 2nd Global Conference on Life Sciences and Technologies (2020), pp. 145–146.

9. Chang W.J., Chen L.B., Chen M.C., Chiu Y.C. and Lin J.Y., ScalpEye: A Deep Learning-Based Scalp Hair Inspection and Diagnosis System for Scalp Health, IEEE Access 8 (2020), 134826–134837.

10. Nabahhin, Aloun A.A. and Almurshidi S.H., Hair Loss Diagnosis and Treatment Expert System, International Journal of Engineering and Information Systems 1(4) (2017), 160–169.

11. Wang W.C., Chen L.B. and Chang W.J., Development and Experimental Evaluation of Machine-Learning Techniques for an Intelligent Hairy Scalp Detection System, Applied Sciences 8(6) (2018), 1–28.

12. Lee, Kim B.J., Lee C.H. and Lee W.S., Topographic Phenotypes of Alopecia Areata and Development of a Prognostic Prediction Model and Grading System: A Cluster Analysis, JAMA Dermatology 155(5) (2019), 564–571.

13. Seo and Park, Trichoscopy of Alopecia Areata: Hair Loss Feature Extraction and Computation Using Grid Line Selection and Eigenvalue, Computational and Mathematical Methods in Medicine 2020 (2020), 1–9.

14. Fatima, Arif and Shivakumar, Clinical, Dermoscopic and Histopathological Assessment in Patients of Alopecia Areata: A Hospital Based Cross-Sectional Study, Journal of Pakistan Association of Dermatologists 30(2) (2020), 256–260.

15. Ibrahim, Noor Azmy Z.A., Abu Mangshor N.N., Sabri, Ahmad Fadzil A.F. and Ahmad, Pre-trained Classification of Scalp Conditions Using Image Processing, Indonesian Journal of Electrical Engineering and Computer Science 20(1) (2020), 138–144.

16. Zhang, Man and Cho Y.I., Deep-Learning-Based Hair Damage Diagnosis Method Applying Scanning Electron Microscopy Images, Diagnostics 11(10) (2021), 1–12.

17. Shakeel C.S., Khan S.J., Chaudhry, Aijaz S.F. and Hassan, Classification Framework for Healthy Hairs and Alopecia Areata: A Machine Learning (ML) Approach, Computational and Mathematical Methods in Medicine 2021 (2021), 1–10.

18. Gao, Wang, Xu, Yang, Nie and Jiang, Deep Learning-Based Trichoscopic Image Analysis and Quantitative Model for Predicting Basic and Specific Classification in Male Androgenic Alopecia, Acta Dermato-Venereologica 102 (2021), 1–6.

19. Jeong J.I., Park D.S., Koo J.E., Song W.S., Pae D.J. and Choi H.J., Artificial Intelligence (AI) Based System for the Diagnosis and Classification of Scalp Health: AI-ScalpGrader, Instrumentation Science & Technology (2022), 1–11.

20. Roy and Protity A.T., Hair and Scalp Disease Detection Using Machine Learning and Image Processing, European Journal of Information Technologies and Computer Science 10(11) (2022), 1–7.

21. Ying and Lin, Self-Learning Fuzzy Automaton With Input and Output Fuzzy Sets for System Modelling, IEEE Transactions on Emerging Topics in Computational Intelligence 7(2) (2022), 500–512.

22. Xing, Xiao, Qu, Zhu and Zhao, An Efficient Federated Distillation Learning System for Multitask Time Series Classification, IEEE Transactions on Instrumentation and Measurement 71 (2022), 1–12.

23. Figaro 1K. Figaro 1K | share Your Project, (n.d.). Retrieved March 6, 2023, from http://projects.i-ctm.eu/it/progetto/figaro-1k

24. Image library, DermNet. (n.d.). Retrieved March 6, 2023, from https://dermnetnz.org/image-library

25. Softmax Function Based Intuitionistic Fuzzy Multi-Criteria Decision Making and Applications, Operational Research 16 (2016), 327–348.

Table 1
Details of different pre-learned CNN structures

Network	Depth	Size (MB)	Variables (millions)	Image dimension
AlexNet	8	227	61	227×227
InceptionNet-V1	20	27	7	224×224
ResNet	150	77	20	224×224