Abstract
This article presents a methodological approach to classifying malignant melanoma in dermoscopy images. Early treatment of skin cancer increases the patient's survival rate, so dermatologists must classify melanoma at an early stage in order to treat the patient appropriately. Diagnosis is time-consuming, however, because melanoma and benign lesions closely resemble each other. In this paper, a deep learning based Computer-Aided Diagnosis (CAD) system is developed to classify skin lesions accurately. A new architecture is built on the Inception v3 model as the baseline. The features extracted by the Inception network are flattened and passed to a DenseNet block, which extracts more fine-grained features of the lesion. The International Skin Imaging Collaboration (ISIC) archive dataset used here contains 3307 dermoscopy images, including both benign and malignant lesions. The proposed architecture is trained on these images with a learning rate of 0.0001 and a batch size of 64, using various optimizers. The performance of the proposed model is evaluated using a confusion matrix and ROC-AUC curves. The experimental results show that the proposed model attains a highest accuracy rate of 91.29%, compared with other state-of-the-art methods such as ResNet, VGG-16, DenseNet, and MobileNet. The classification accuracy, sensitivity, specificity, testing accuracy, and AUC values obtained were 90.33%, 82.87%, 91.29%, 87.12%, and 87.40%, respectively.
Introduction
Melanoma is a malignant cancer and the third most threatening skin cancer worldwide [1]. Recent research shows that the number of people suffering from melanoma is increasing day by day, and that it is a fast-spreading type of skin cancer [2]. Advances in non-invasive computer-assisted detection of melanoma based on lesion features have recently encouraged further research in this field [3]. The World Health Organization (WHO) has reported that melanoma is the fastest growing skin cancer among white populations worldwide. According to research statistics, 3 million Americans are diagnosed each year with non-melanocytic skin cancers, which are classified as basal cell carcinoma and squamous cell carcinoma. In 2021, the number of new melanoma cases was estimated at 106,110 (62,260 men and 43,850 women), with a total of 7,180 deaths (4,600 men and 2,580 women) [4].
Microsoft developed the ResNet convolutional network, and Google developed the Inception v3 network. An Inception module processes its input with filters of several sizes, combines the results, and then repeats. ResNet consists of simple, single-scale processing units with data passing over skip connections. ResNet generates 2,048 features per image compared to 1,536 for Inception. The features extracted by the Inception network are very similar in character to those of the ResNet network, although Inception extracts them slightly more robustly [5]. These observations indicate that the selection of the training dataset is more significant than the selection of the convolutional neural network (CNN). The performance of the Inception v3 model [6, 7] was improved by balancing the network's depth and width at a lower computational cost and, for deeper networks, by using an auxiliary classifier that serves as a regularizer. Increasing the size of a deep neural network is the simplest way to enhance its performance. The regularity of the structure, the abundance of filters, and a larger batch size make efficient dense computation possible. The main advantage of this architecture is that it enables substantial increases in the number of units at each stage without an uncontrolled growth in computational complexity.
In this work, the Inception v3 model is used for the following reasons. Its processing time is significantly lower than that of the ResNet 50, ResNet 150, and ResNet 152 models. In the performance comparison, the accuracy of Inception v3 was superior to that of the other models, and it reached the expected level.
Early detection of skin cancer allows it to be cured with a simple incision, which increases the chances of survival. However, making an accurate diagnosis is a challenging task for non-specialists and experienced dermatologists alike [8]. Many dermatologists use a variety of heuristic techniques to diagnose and classify lesions, for example the seven-point checklist, the CASH algorithm, pattern analysis, the ABCD rule, and the Menzies method [9, 10]. Even experienced dermatologists find it difficult to diagnose skin lesions accurately using these traditional methods. Because the visual characteristics of skin lesions are multifaceted (hair, veins, lighting, shapes, and borders), diagnosis takes a long time and is prone to error. These traditional methods of diagnosing skin lesions are laborious and time-consuming [11].
The diagnosis made by dermatologists is somewhat subjective, so it is less accurate when distinguishing benign from melanoma skin lesions. Computer-assisted detection systems do not have this subjective bias when analyzing dermoscopy images. These systems assist physicians in tasks such as lesion border detection, measurement of diagnostic features, and lesion classification [12, 13]. Dermoscopy has two major constraints: it is subjective, and it requires extensive training. Researchers and dermatologists have been striving to overcome these issues through the advancement of CAD systems [14]. Classifying dermoscopic images through CAD has attracted significant research attention, particularly for remote areas with few dermatologists. Further, automated analysis has great potential to diagnose and treat patients in a timely manner [15, 16].
Organization
The organization of this work is as follows. Section 2 reviews the literature related to the proposed model. Section 3 describes the techniques used for the proposed method, such as data preprocessing, feature extraction, and classifiers. Section 4 describes the transfer learning approach with our proposed model. Section 5 provides the configuration of the Inception v3 and the modified Inception v3 model. Section 6 provides implementation details and evaluates the performance of the proposed model using the confusion matrix. Section 7 compares the results obtained from the new model with those found in the literature. Section 8 concludes the work.
Literature review
Deep learning techniques are now applied to a variety of clinical image analysis problems, such as image recognition [17], lesion detection [18], image segmentation [19, 20], and image classification [21]. A deep learning model is a powerful and general model with many layers that transform the training images into features using convolution filters [19, 22]. For example, a double matching system based on convolutional neural networks (CNNs) was established to detect lung nodules [23]. A deficiency of CNNs is that the overlap between input patches and their neighbouring areas increases the number of features and the computation time. This is rectified by reframing the fully connected layers as convolutions, which allows training on a large number of images. Most sophisticated image segmentation methods use CNNs that have been modified to use convolutional layers at the end of the network instead of fully connected layers.
Convolutional Neural Networks
A CNN attempts to imitate the way the brain processes visual information to recognize images. Feature extraction is used in machine learning tasks to obtain better outcomes in image classification [24]. The Convolutional Neural Network is a specialized neural network designed to learn visual structures in images. Currently, CNNs are the most effective deep learning method for image classification tasks. A CNN combines three types of layers: convolutional, pooling, and fully connected. The convolutional layer is the most important, performing most of the computation by applying kernels, which are groups of weights. During the training phase, visual features are learnt from the input images. Each kernel is slid across the whole image to generate a feature map, and the output of this layer is passed to the pooling layer, which reduces the size of the feature map [25, 26]. Consequently, the number of trainable parameters for the next layer is reduced, which helps to control over-fitting while preserving the essential features of the input image. The fully connected layer classifies the input images from the feature map provided by the previous layer of the network.
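The convolution and pooling steps described above can be illustrated with a minimal pure-Python sketch. The 5 × 5 image and the 2 × 2 kernel are hypothetical values chosen for illustration only, and the sliding-window operation is implemented as cross-correlation without padding, as is usual in deep learning libraries.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; divides each spatial dimension by `size`."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical 5x5 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical 2x2 kernel that responds to a dark-to-bright transition.
kernel = [[-1, 1], [-1, 1]]

feature_map = conv2d_valid(image, kernel)  # 4x4 map, non-zero only at the edge
pooled = max_pool2d(feature_map, size=2)   # 2x2 map, edge response preserved
```

The pooled map has a quarter of the entries of the feature map but still records where the edge was found, which is the size reduction that the text attributes to the pooling layer.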
Different CNN architectures
Numerous CNN architectures have been presented for diagnosing skin cancer and distinguishing benign melanocytic lesions from malignant melanoma. However, in medical tasks, gathering the large amounts of data needed for CNN training can be challenging. To overcome this difficulty, transfer learning is used: a model is trained on a source task and then reused, to some extent, for a new purpose [27]. Accordingly, the models are initialized with weights learned on the ImageNet dataset [28] and then fine-tuned on the target dataset.
Deep CNN based methods
Deep learning techniques can automatically extract features through hierarchical network structures, in place of conventional techniques like the ABCD rule, Menzies scoring, and the 7-point checklist [9, 10]. These deep learning methods have been found to be extremely successful in natural image recognition tasks such as image classification. Since Krizhevsky et al. [28] won the ImageNet challenge in 2012, the success of deep learning has been incontrovertible, and convolutional neural networks (CNNs) in particular have become successful techniques for many computer vision problems. For medical imaging tasks, deep CNN techniques have been adopted to analyse or segment tissues and structures in medical images, with an ever-increasing number of applications [22]. With the advancement of deep learning in clinical imaging systems, there is growing concern about the future of the human radiologist. This approach shows significant accuracy in image classification tasks.
Inception V3 model
Wang et al. [29] proposed a lung image classification system that relies on transfer learning with the GoogleNet [7] framework and the ImageNet [30] dataset, and compared the results with a DCNN. This method is used to classify benign and malignant pulmonary nodules more accurately. Joshi et al. classified different sports activities with 96.64% accuracy using neural network classifiers and the Inception v3 model [31]. A pretrained model is reused in the newly proposed model to reduce training time and enhance performance on a smaller dataset. Furthermore, a fully connected classification layer is used to improve classification accuracy. Al Husaini et al. introduced the use of deep learning to classify breast disorders [32]. A DCNN-based Inception v3 system had not previously been used to diagnose cervical lymphadenitis in cytological images [33]. Li and Liu developed a model based on Inception v3 with an SVM classifier [34]. It is suitable for image quality classification and performs well compared to a simple CNN. It uses an Inception v3 network pretrained on the ImageNet dataset for image description, and the classifier's inputs are the normalized extracted features.
Contribution
The main contributions of the proposed work are as follows. A new deep learning model is proposed for diagnosing skin cancer at an early stage. To avoid overfitting, the dataset images are augmented with rotation and translation. Inception v3 is taken as the baseline architecture for its improved feature extraction. A DenseNet block is incorporated into the proposed model to extract fine-grained features. To avoid the vanishing gradient problem, the RMSprop optimizer is used in the network, which gives better results in detecting skin lesions than the other optimizers. During training, different packages are used and explored to verify effective and efficient learning with the same limited training dataset.
Materials and methods
Data preprocessing
Preprocessing is the initial step of preparing raw data and making it more suitable for the machine learning model. Image size and integrity have an important effect on the separability of image features. The structural details of the images are an important basis for distinguishing the features of skin lesions, and most cropping damages them. In addition, the training time varies when images of different sizes are used as the training set [35]. Therefore, all images in the dataset are resized from their various original sizes to 224 × 224 × 3, which is one of the most common preprocessing steps. This considerably reduces the input feature map size for the network architecture [36]. Data reduction, data normalisation, and feature extraction are then carried out, and finally the data labels are converted from strings to numerical values [37].
The procedure of reducing the number of images in a medical imaging database is known as data reduction [37, 38]. Data reduction is a key step towards a high classification rate when the dataset contains noisy or blurred images, images with artifacts such as hair and air bubbles, and low-contrast images. It reduces overfitting and the computing burden. Moreover, the performance of a system on these datasets can be correlated with its performance on downstream tasks [39].
In this paper, each RGB input image is normalized. Normalization is the process of converting pixel values to a range between 0 and 1. Image normalization also involves subtracting the average pixel value, evaluated over the entire database, from each pixel value [17]. Data normalization is a process of organizing a database to eliminate useless attributes such as insertion, enhancement, and irregular shapes, thereby minimizing data redundancy and facilitating data integration [37].
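The two normalization steps mentioned above can be sketched as follows. This is a minimal pure-Python illustration, assuming 8-bit RGB pixels (maximum value 255) and a per-channel mean computed over all images; the tiny two-pixel images and the function names are hypothetical.

```python
def scale_to_unit_range(image, max_value=255.0):
    """Map 8-bit pixel values to the range [0, 1]."""
    return [[[v / max_value for v in pixel] for pixel in row] for row in image]


def subtract_channel_means(images):
    """Subtract the per-channel mean computed over the whole set of images."""
    n_channels = len(images[0][0][0])
    sums = [0.0] * n_channels
    count = 0
    for image in images:
        for row in image:
            for pixel in row:
                for k, v in enumerate(pixel):
                    sums[k] += v
                count += 1
    means = [s / count for s in sums]
    centred = [
        [[[v - means[k] for k, v in enumerate(pixel)] for pixel in row]
         for row in image]
        for image in images
    ]
    return centred, means


# Two hypothetical 1x2 RGB images standing in for the dataset.
img_a = [[[0, 128, 255], [255, 128, 0]]]
img_b = [[[255, 255, 255], [0, 0, 0]]]

scaled = [scale_to_unit_range(img) for img in (img_a, img_b)]
centred, means = subtract_channel_means(scaled)
```

After the first step every value lies in [0, 1]; after the second, each colour channel has zero mean over the dataset.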
Normally, ground truth is obtained from clinical images and image labels, which are short notes provided by radiologists. Accurate image labels are very useful when images are examined by a large number of radiologists. The ground truth data is publicly available at https://www.isic-archive.com/. In medical imaging, ground truth is considered only for specific diseases, including bleeding into the skull, bone fractures, kidney stones, and aortic dissection. It is difficult to know the label of an image without surgical, pathological, genetic, or clinical observation [40].
Dropout is a regularization technique in deep learning that effectively trains many neural networks with different topologies simultaneously. It avoids overfitting and improves the validation accuracy, thus increasing the generalization power [41]. It randomly drops neurons in each iteration during the training of the network. Overfitting can be detected by inspecting validation metrics such as accuracy and loss. When a training instance is presented to the network during the training phase, each hidden neuron is dropped with a probability of 0.2 to 0.4. At test time, all hidden neurons are used and their outputs are scaled down to compensate for the larger number of active neurons; in [28], where the dropout probability is 0.5, the outputs are multiplied by 0.5 because twice as many neurons are active. The result is a better regularizing effect, which significantly reduces over-fitting [28].
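The train-time and test-time behaviour of dropout described above can be sketched in a few lines. This is an illustrative pure-Python version, not the implementation used in the experiments; the function names, the seed, and the all-ones activations are assumptions made for the example.

```python
import random


def dropout_train(activations, drop_prob, rng):
    """Training: each unit is zeroed independently with probability `drop_prob`."""
    return [0.0 if rng.random() < drop_prob else a for a in activations]


def dropout_test(activations, drop_prob):
    """Testing: every unit is kept and scaled by the keep probability."""
    keep_prob = 1.0 - drop_prob
    return [a * keep_prob for a in activations]


rng = random.Random(0)    # fixed seed so the sketch is repeatable
hidden = [1.0] * 10000    # hypothetical hidden-layer activations
drop_prob = 0.4           # upper end of the 0.2 to 0.4 range in the text

train_out = dropout_train(hidden, drop_prob, rng)
test_out = dropout_test(hidden, drop_prob)
kept_fraction = sum(1 for a in train_out if a != 0.0) / len(train_out)
```

Scaling at test time keeps the expected activation the same in both phases. Many current frameworks, Keras included, instead use the equivalent "inverted" form, which scales the surviving units by 1 / (1 - rate) during training and leaves test-time outputs unchanged.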
Batch normalization adds extra layers to a deep network to make training faster and more reliable. For this purpose, batch normalization layers are applied to the output of every convolutional and de-convolutional layer, within the same feature map. The extra layer takes the output of the previous layer and standardizes and normalizes it before it is passed on as input to the next layer [42]. Deep neural networks are challenging to train because the distribution of the inputs from previous layers changes after each weight update. Some architectures place the batch normalization layer before the nonlinearity. Batch normalization speeds up training, in certain cases halving the number of epochs or better, provides some regularization, and reduces the generalization error [14]. A Batch Normalization (BN)-CNN based system gives acceptable results for skin cancer image detection compared to CNN models in terms of accuracy, loss, precision, recall, and F1 score [36].
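As a minimal sketch of the standardization step, the following pure-Python function normalizes one feature across a mini-batch and then applies a learnable scale and shift. The parameter names `gamma`, `beta`, and `eps` follow common usage and are assumptions here, as are the example activations; the running statistics used at inference time are omitted.

```python
import math


def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature over a mini-batch, then scale by gamma and shift by beta."""
    mean = sum(batch) / len(batch)
    variance = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [gamma * (x - mean) / math.sqrt(variance + eps) + beta for x in batch]


activations = [2.0, 4.0, 6.0, 8.0]   # hypothetical outputs of one unit for 4 samples
normalised = batch_norm(activations)

out_mean = sum(normalised) / len(normalised)
out_var = sum((x - out_mean) ** 2 for x in normalised) / len(normalised)
```

With the default `gamma` and `beta`, the output has approximately zero mean and unit variance regardless of the scale of the incoming activations, which is what keeps the input distribution of the next layer stable across weight updates.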
Data augmentation plays a significant role in skin cancer image classification, and appropriate augmentation can directly affect the outcome. It is important because it effectively increases the quantity of training data, which reduces overfitting. Further, it significantly increases the model's generalisation capability on the limited skin cancer image dataset [42]. Augmentation is performed by applying transformations that preserve the labels, such as rotation, scaling, and intensity shifts. Furthermore, a suitable amount of noise is added to improve the model's robustness [29, 43].
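A minimal sketch of label-preserving augmentation is given below, using 90-degree rotation and a one-pixel translation on a tiny grid. The specific transformations, their parameters, and the helper names are assumptions chosen to keep the example dependency-free; they are not the settings used in the experiments.

```python
def rotate90(image):
    """Rotate a 2-D grid by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def translate(image, dx, dy, fill=0):
    """Shift a 2-D grid by dx columns and dy rows, filling exposed pixels with `fill`."""
    height, width = len(image), len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            nr, nc = r + dy, c + dx
            if 0 <= nr < height and 0 <= nc < width:
                shifted[nr][nc] = image[r][c]
    return shifted


def augment(image, label):
    """Return the original image plus transformed copies, all with the same label."""
    variants = [
        image,
        rotate90(image),
        rotate90(rotate90(image)),
        translate(image, dx=1, dy=0),
    ]
    return [(variant, label) for variant in variants]


sample = [[1, 2], [3, 4]]                 # hypothetical 2x2 grey-level image
augmented = augment(sample, label=1)      # label 1 = malignant in this paper's coding
```

One labelled image becomes four training samples, and the label is carried over unchanged because rotating or shifting a lesion does not alter its diagnosis.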
Transfer learning approach with our proposed model
In medical image classification and recognition tasks, transfer learning has been widely used for tumor classification [44], retinal disease diagnosis [45], and skin lesion or cancer classification [2]. Recent research on transfer learning for medical imaging observes that the standard large networks pretrained on ImageNet often have a very large number of parameters, so they may not be the best solution for medical image diagnosis [45]. The disadvantages of this approach are that the models are heavy and require many parameters and substantial computational resources, which makes them inefficient for clinical image analysis [11].
Training a deep network from scratch demands a huge number of images and high computing power. In transfer learning, a well-known deep learning technique, trained model parameters are transferred to new tasks to reduce the computing time. On their own, deep neural networks are not well suited to the classification of skin lesions because of the small number of images in the dataset, but transfer learning is a good choice to overcome this problem. With the rising popularity of deep learning, the importance of data has also increased, which has stimulated the development of transfer learning approaches [2, 29].
Ablation studies assess the performance of an artificial intelligence system by removing specific parts to determine how those parts affect the system as a whole. They provide insight into the relative contribution of different architectural and regularization components to the performance of deep learning models. Examples of ablatable components include dataset features and model elements, but any design decision or system module can be considered in an ablation study. Ablation studies are classified into two types, model ablation studies and feature ablation studies, depending on whether model components or dataset features are removed [46].
Transfer learning is a deep learning technique that uses a model created for one task as the starting point for a model for another task. The transfer learning process here has three phases. In phase I, a pretrained Inception v3 model was selected and trained on the ISIC 2019 dataset. In this work, the fully connected layer and the SoftMax layer are the ablatable model components. In phase II, the ablated part at the end of the Inception v3 model is replaced by new components, namely a flatten layer, four dense layers, three batch normalisation layers, and three dropout layers, to form the new model. In phase III, the layers of the model are updated and adapted to the new task by fine-tuning. Typically, transfer learning is performed by taking a standard neural network architecture together with its weights pretrained on large-scale datasets.
The proposed approach
The strategy of the proposed model is as follows. In the first stage, the dataset is preprocessed. In the second stage, Inception v3 is used to extract features. In the third stage, a SoftMax classifier performs the image classification task. Finally, after training and validation, the model is evaluated using performance measurements.
The strategy of the proposed model is shown in Fig. 5.1.

Strategy of the suggested model.
Structure of Inception v3 model
In this research work, a pretrained Inception v3 model is used to classify the dermoscopy images. The major building blocks of the basic Inception v3 are convolutions, max pooling, average pooling, concatenations, dropout, and fully connected layers [32, 43].
Batch normalization is also used throughout the model and is applied to the inputs of the activation functions (ReLU). The Inception v3 model architecture is shown in Fig. 5.2. The Inception v3 model has 312 layers and comprises several convolutional and pooling layers. It uses a convolutional neural network to extract the features of the image used for recognition and classification [34]. The classification of skin cancer images is then performed with fully connected layers and a SoftMax classifier.

Structure of Inception v3 model.
The Inception v3 model has 323 layers, and the outputs of convolutional filters of different sizes are concatenated into a single output vector that forms the input to the next stage. This helps to reduce the number of parameters to be trained and the computational cost.
The default input image size for Inception v3 is 299 × 299 pixels, whereas the images of the publicly available dataset used in this work are 224 × 224 pixels. During training and testing of the basic and modified Inception v3 models, the images are not resized to 299 × 299 pixels. The number of channels remains the same during training; only the size of the generated feature maps changes, and the outcome was satisfactory. However, the size of the input image has a great impact on the performance of a deep CNN: as the input size increases, the accuracy and sensitivity improve and the efficiency also increases [47, 48]. The feature map after the convolutional layers, inception modules, and average pooling is 5 × 5 with 2,048 channels. After average pooling, in the modified Inception v3 model, the final three layers of the pretrained model are replaced by a flatten layer, four dense layers, three dropout layers, and three batch normalization layers. The flatten layer converts the multi-dimensional input tensor into a single dimension at the end of the model.
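The 5 × 5 spatial size quoted above can be checked with a short calculation. The sketch below assumes the layer layout of the published Inception v3 as commonly implemented (for example in Keras): only the unpadded ("valid") convolutions, max-pooling layers, and the two grid-reduction blocks shrink the feature map, and same-padded layers are omitted because they leave the size unchanged. The stage names are descriptive labels, not identifiers from any library.

```python
def valid_output_size(size, kernel, stride):
    """Output width of an unpadded convolution or pooling layer."""
    return (size - kernel) // stride + 1


# (description, kernel size, stride) for every size-reducing stage, in order.
SHRINKING_STAGES = [
    ("stem convolution 3x3, stride 2", 3, 2),
    ("stem convolution 3x3, stride 1", 3, 1),
    ("stem max pooling 3x3, stride 2", 3, 2),
    ("stem convolution 3x3, stride 1", 3, 1),
    ("stem max pooling 3x3, stride 2", 3, 2),
    ("first grid-reduction block", 3, 2),
    ("second grid-reduction block", 3, 2),
]


def final_feature_map_size(input_size):
    """Trace the spatial size of the feature map through the network."""
    size = input_size
    for _description, kernel, stride in SHRINKING_STAGES:
        size = valid_output_size(size, kernel, stride)
    return size


size_224 = final_feature_map_size(224)   # input size used in this work
size_299 = final_feature_map_size(299)   # default Inception v3 input size
flattened_224 = size_224 * size_224 * 2048
```

Under these assumptions a 224 × 224 input yields a 5 × 5 map and a 299 × 299 input an 8 × 8 map, so flattening the 5 × 5 × 2,048 tensor gives a vector of 51,200 values.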
A dense layer is fully connected to the preceding layer, which means that every neuron in the dense layer receives input from all neurons in the previous layer. It is used to classify the clinical images based on the output of the convolutional and pooling layers [48]. In this work, a dropout layer with a dropout rate of 40% was used during training. Batch normalization stabilizes the learning process and reduces the number of training epochs required. Instead of the original classification layer, a SoftMax layer is used as the classifier; it generates a probability for each class and selects the class with the highest probability as the predicted class [49]. The output of the original Inception v3 network has 1,000 classes, whereas the modified Inception v3 model has 2 classes, namely benign (0) and malignant (1). Therefore, the number of output channels of the last layer was changed from 1,000 to 2. During training, the dropout layer randomly sets input units to 0 at each step with a frequency equal to the dropout rate, which helps to prevent overfitting. In the entire experimental process, 3307 images were used [50], with a batch size of 64, and the model was trained for 30 epochs. For training and validation, 2647 skin lesion images were used, including 1450 benign and 1197 melanoma images. Of these, 2647 × 80% = 2118 images were used for training and 2647 × 20% = 529 for validation. Finally, the trained data are sent to the classifier. In the testing process, 660 images were used for the classification task, including 360 benign and 300 melanoma images, and the confusion matrix is used to examine the quality of the training [50]. The modified Inception v3 architecture is depicted in Fig. 5.3.

Structure of proposed model.
The Inception v3 model used an input format of 224 × 224 × 3 with the ISIC dataset. Four dense layers are added in the classification phase. The first has 512 neurons with the ReLU activation function and is followed by a dropout layer with a rate of 0.4 and a batch normalisation layer. The second and third dense layers contain 256 and 128 neurons, respectively, also with ReLU activation, and are followed by dropout and batch normalisation layers in the same way as the first. The final dense layer has a single unit that provides the output for the task. The Adam, RMSprop, and SGD optimizers are considered, and they are momentum-based optimizers. Binary cross-entropy is used as the loss function when measuring the model's accuracy and loss. The Python interface for the proposed model is depicted in Fig. 5.4.
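To give a sense of the size of the classification head described above, the following sketch counts its parameters. It assumes that the flatten layer receives the 5 × 5 × 2,048 feature map, that the final layer has a single unit, and that each batch normalisation layer stores four values per feature (scale, shift, moving mean, moving variance), as Keras does; dropout layers have no parameters. The layer names are illustrative.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units


def batch_norm_params(n_features):
    """Scale and shift (trainable) plus moving mean and variance (non-trainable)."""
    return 4 * n_features


flattened = 5 * 5 * 2048   # assumed input to the first dense layer

head = [
    ("dense_512", dense_params(flattened, 512)),
    ("batch_norm_512", batch_norm_params(512)),
    ("dense_256", dense_params(512, 256)),
    ("batch_norm_256", batch_norm_params(256)),
    ("dense_128", dense_params(256, 128)),
    ("batch_norm_128", batch_norm_params(128)),
    ("dense_output", dense_params(128, 1)),
]

total_head_params = sum(count for _name, count in head)
share_of_first_dense = head[0][1] / total_head_params
```

Under these assumptions the head holds about 26.4 million parameters, more than 99% of them in the first dense layer, so the width of that layer largely determines the cost of the added head.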

Python interface for modified Inception v3 model.
Implementation details
The experimental setup was implemented in Google Colab on a TPU machine. The experiments were performed on an HP computer equipped with a 4-core Intel i7-6700 @ 4.00 GHz, 32 GB of DDR4 RAM, and Windows 10, as well as an NVIDIA Jetson Nano V2 developer kit with an 8 GB graphics card as the GPU. Two software packages, TensorFlow 2 and Keras, were used. The Jetson Nano is a compact developer kit that can run neural networks in parallel for image classification, object detection, segmentation, and speech processing.
Datasets
Skin images are collected from the publicly available ISIC database and split into training, validation, and test data. The ISIC dataset, which comprises 3307 images, has been used for training, validating, and testing the classification model [51]. Figure 6.1 shows sample images of different types of skin cancers.

Sample images of different kinds of skin cancer.
Performance measurements for the classification task include sensitivity, specificity, F1 score, accuracy, and AUC [52]. These measures are computed from Equations (1) to (7).
The result of a patient's disease test may be positive or negative, but it does not always match the individual's actual status. A 2 × 2 confusion matrix is employed to measure the performance of the classification model, where 2 is the number of outcome classes [52].
The confusion matrix compares the actual and predicted values to analyse the model's performance. Its four entries are as follows.
True Positives (TP): sick individuals correctly diagnosed as sick.
False Positives (FP): healthy individuals misdiagnosed as sick.
True Negatives (TN): healthy individuals correctly diagnosed as healthy.
False Negatives (FN): sick individuals misdiagnosed as healthy.
The performance metrics are computed from the counts of true positives, false positives, true negatives, and false negatives.
Sensitivity, Recall, or True Positive Rate (TPR): the proportion of actual positives that are correctly identified.
Specificity, Selectivity, or True Negative Rate (TNR): the proportion of actual negatives that are correctly identified.
Precision or Positive Predictive Value (PPV): the ratio of the number of individuals correctly diagnosed as positive (true positives) to the total number of individuals classified as positive.
Accuracy: the ratio of correctly diagnosed observations to the total number of observations.
F1 Score: the harmonic mean of precision and sensitivity. It takes both false positives and false negatives into account, so it performs well on an imbalanced dataset [53].
False Positive Rate (FPR): the proportion of actual negatives that are incorrectly classified as positive.
ROC Curve: the performance of the model is measured through the ROC curve, a graph of the true positive rate against the false positive rate [52].
AUC: the area under the ROC curve, a key determining factor in performance appraisal.
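In terms of the confusion-matrix counts TP, FP, TN, and FN, the measures listed above can be written as follows. These are the standard textbook forms; they are assumed, rather than confirmed, to correspond to the equations referred to in the text.

```latex
\begin{align}
\text{Sensitivity (TPR)} &= \frac{TP}{TP + FN} \\
\text{Specificity (TNR)} &= \frac{TN}{TN + FP} \\
\text{Precision (PPV)} &= \frac{TP}{TP + FP} \\
\text{Accuracy} &= \frac{TP + TN}{TP + TN + FP + FN} \\
F_1 &= \frac{2 \cdot \text{Precision} \cdot \text{Sensitivity}}{\text{Precision} + \text{Sensitivity}}
     = \frac{2\,TP}{2\,TP + FP + FN} \\
\text{FPR} &= \frac{FP}{FP + TN} = 1 - \text{Specificity} \\
\text{AUC} &= \int_{0}^{1} \text{TPR} \; d(\text{FPR})
\end{align}
```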
In this experiment, the parameters of the proposed model were fine-tuned using transfer learning for skin cancer image classification, with the SoftMax classifier. The recall, specificity, and accuracy test results are compared with those of other existing approaches. Figs. 7.1, 7.2, and 7.3 show the proposed model's confusion matrices for the RMSprop, Adam, and SGD optimizers, respectively. Each confusion matrix tabulates the model's outputs for assessing its performance.

Confusion matrix of recommended model with RMSprop optimizer.

Confusion matrix of recommended model with Adam optimizer.

Confusion matrix of recommended model with SGD optimizer.
In the training and testing process, the 3307 images in the dataset were used with a batch size of 64. Of these, 2647 were used for training and validation and 660 for testing. The proposed Inception v3 model with transfer learning was compared across the various optimizers using 80% of the 2647 images for training, 20% for validation, and the 660 images for testing.
The classification reports for the RMSprop, Adam, and SGD optimizers are given in Tables 7.1, 7.2, and 7.3, respectively. Performance criteria such as accuracy, recall, F1-score, training accuracy, and test accuracy are used to assess the classification system, and the results are compared with existing methods such as ResNet, VGG-16, DenseNet, and MobileNet [54–57].
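The split sizes quoted in this work can be reproduced with a few lines of arithmetic. The sketch below uses only the counts given in the text; rounding the 80% share to the nearest whole image is an assumption that happens to match the reported figures.

```python
def split_counts(total, train_fraction):
    """Split `total` items into training and validation counts."""
    train = round(total * train_fraction)
    return train, total - train


TOTAL_IMAGES = 3307
TEST_BENIGN, TEST_MELANOMA = 360, 300
DEV_BENIGN, DEV_MELANOMA = 1450, 1197

test_images = TEST_BENIGN + TEST_MELANOMA          # held-out test set
development_images = DEV_BENIGN + DEV_MELANOMA     # training + validation pool
train_images, validation_images = split_counts(development_images, 0.80)
```

The three subsets add up to the full dataset: 2118 training, 529 validation, and 660 test images.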
Classification reports of suggested model with RMSprop optimizer
Classification reports of suggested model with Adam optimizer
Classification reports of suggested model with SGD optimizer
The classification results are comparable with existing approaches, with a melanoma classification accuracy (recall) of 90.33%, precision of 82.87%, F1-score of 86.44%, training accuracy of 97.52%, and test accuracy of 87.12%. Table 7.4 shows the findings for the various optimizers studied using the ISIC archive database. The experiments and the detailed results concentrate on comparing classification models for the two classes, malignant and benign. The following analysis is based on the confusion matrix. In this experiment, 300 melanoma images were used, of which 271 were correctly predicted as melanoma, a classification accuracy of 90.33%. The test also included 360 benign images, of which 304 were correctly diagnosed as benign, an accuracy of 84.44%.
Results of the recommended model compared with existing models, using 80% of the data for training and 20% for validation
The outcomes of the different optimizers on the testing dataset for skin lesion classification are shown in Table 7.5. We compare three optimizers, RMSprop, Adam, and SGD, for the modified Inception v3 model. The F1 measure obtained with the SGD optimizer is slightly lower than that of RMSprop, Adam, and other methods available in the literature. The classification performance of SGD may be improved by suitable tuning of hyperparameters such as the learning rate and batch size. The recommended model attains excellent AUC performance, together with good precision, accuracy, sensitivity, and specificity, which demonstrates the effectiveness of the optimization algorithm. The overall accuracy on the test set was computed as the average of the area under the curve corresponding to the melanoma classification findings. Table 7.6 compares common performance metrics such as accuracy, sensitivity, and specificity with existing studies described in the literature [58].
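The test-set counts given above (271 of 300 melanoma images and 304 of 360 benign images correctly classified) are enough to recompute the standard metrics. The sketch below treats melanoma as the positive class, which is an assumption consistent with the benign = 0, malignant = 1 coding used in this work; the function name is illustrative.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard binary metrics, as fractions, from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Counts taken from the text, with melanoma as the positive class.
TP = 271          # melanoma correctly predicted as melanoma
FN = 300 - 271    # melanoma predicted as benign
TN = 304          # benign correctly predicted as benign
FP = 360 - 304    # benign predicted as melanoma

metrics = classification_metrics(TP, FN, TN, FP)
percentages = {name: round(100 * value, 2) for name, value in metrics.items()}
```

With these counts the sensitivity is 90.33%, the specificity 84.44%, the precision 82.87%, the F1 score 86.44%, and the accuracy over all 660 test images 87.12%.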
Outcomes of skin cancer classification of different optimizers
Comparison of experimental classification results on the test set with the existing methods
Accuracy and loss are evaluated first as performance metrics for classification. The accuracy and loss curves for the various optimizers are shown in Fig. 7.4(a), Fig. 7.4(b), Fig. 7.5(a), and Fig. 7.5(b). With the RMSprop and Adam optimizers, overall accuracies of 97.52% and 93.04% and training losses of 6.79% and 6.00% are obtained, respectively. By contrast, the SGD optimizer yields lower overall accuracy and higher training loss than the other two optimizers in this work.

Accuracy curve of suggested model with RMSprop optimizer.

Loss curve of suggested model with RMSprop optimizer.

Accuracy curve of suggested model with Adam optimizer.

Loss curve of suggested model with Adam optimizer.
The RMSprop optimizer achieved the highest classification accuracy, followed by the Adam optimizer and then the SGD optimizer. The ROC curve for the suggested model with the RMSprop, Adam, and SGD optimizers is shown in Fig. 7.6. After several rounds of fine-tuning, the best results were obtained with 30 epochs.

ROC curve for suggested model with RMSprop, Adam, and SGD optimizer.
As a result, AUC values of 87.40%, 84.97%, and 82.40% are obtained with the RMSprop, Adam, and SGD optimizers, respectively, as shown in Fig. 7.6. A better evaluation of the classification task was obtained using a binary classifier with a confusion matrix. With the RMSprop optimizer, the overall accuracy of this model is 97.52%, with 87% for both the macro and weighted averages. With the Adam optimizer, the overall accuracy is 93.04%, with 86% for both the macro and weighted averages. These results, obtained on the testing dataset, indicate accurate diagnosis of melanoma and benign images. Different models, such as ResNet 50, ResNet 150, and Inception v3, were trained with the same dataset, number of epochs, optimizer, and batch size. Compared with the other models, Inception v3 shows satisfactory execution time and validation accuracy.
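To make the AUC figures easier to interpret, the following minimal sketch computes the area under the ROC curve from its rank-based definition: the probability that a randomly chosen positive (melanoma) image receives a higher score than a randomly chosen negative (benign) image, with ties counted as one half. The function name `roc_auc` and the labels and scores are hypothetical illustrations, not data or code from this study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))

# Hypothetical predicted melanoma probabilities for eight images
# (1 = melanoma, 0 = benign); these are not results from the paper.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}")
```

An AUC of 1.0 corresponds to perfect separation of the two classes and 0.5 to random ranking, so values in the 82% to 87% range indicate that most, but not all, melanoma images are scored above benign ones.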
Hyperparameters are parameters whose values are fixed before the training procedure starts. The chosen values affect how quickly the model learns, its training speed, and its performance, and they can be tuned automatically. Nevertheless, choosing a suitable learning-rate range and schedule can be challenging. In a deep network, the learning rate and the optimizer are the main hyperparameters [11]. Suitable tests were conducted for hyperparameter tuning to achieve a substantial impact on the classification performance of the suggested system. The ultimate purpose of this tuning is to reduce overfitting and improve the predictions of the classification system. These parameters are varied as follows:
Optimizers adjust the attributes of the neural network, such as its weights and learning rate, in order to minimize the loss. For each optimizer, all hyperparameter values are varied [11]. Many optimization algorithms are available; in the present research, three optimizers were investigated and fine-tuned.
In general, these algorithms differ in training speed and in their effectiveness on the classification task. Their performance was compared after tuning their respective hyperparameters: RMSprop gave the best performance, followed by Adam and then SGD.
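For readers unfamiliar with how the three optimizers differ, the sketch below implements their textbook update rules on a one-dimensional toy loss f(w) = (w - 3)^2. It is an illustration only: the toy loss, the function names, and the default decay constants (rho, beta1, beta2, eps) are assumptions chosen for the example, not the settings used in this study, and the toy problem is not expected to reproduce the ranking reported above.

```python
import math

def grad(w):
    # Gradient of the toy loss f(w) = (w - 3)^2, minimized at w = 3
    return 2.0 * (w - 3.0)

def run_sgd(w, lr, steps):
    # Plain gradient descent: step proportional to the raw gradient
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def run_rmsprop(w, lr, steps, rho=0.9, eps=1e-8):
    # RMSprop: scale each step by a running RMS of recent gradients
    avg_sq = 0.0
    for _ in range(steps):
        g = grad(w)
        avg_sq = rho * avg_sq + (1.0 - rho) * g * g
        w -= lr * g / (math.sqrt(avg_sq) + eps)
    return w

def run_adam(w, lr, steps, beta1=0.9, beta2=0.999, eps=1e-8):
    # Adam: momentum (first moment) plus RMS scaling (second moment),
    # both corrected for their initialization at zero
    m = 0.0
    v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

results = {
    "SGD": run_sgd(0.0, lr=0.01, steps=2000),
    "RMSprop": run_rmsprop(0.0, lr=0.01, steps=2000),
    "Adam": run_adam(0.0, lr=0.01, steps=2000),
}
for name, w_final in results.items():
    print(f"{name}: final w = {w_final:.4f} (optimum is 3.0)")
```

The key design difference is that RMSprop and Adam normalize each step by the recent gradient magnitude, so their step size depends much less on the scale of the gradient than it does in plain SGD.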
Learning rates between 0.001 and 0.00001 were investigated. A very small learning rate requires a lengthy training process and reduces computational efficiency, whereas a large learning rate results in an unstable training process. With a learning rate of 0.0001, the Adam and RMSprop optimizers achieved better computational performance. This improved the generalization accuracy of the proposed model.
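The trade-off described above can be seen on a toy problem. The sketch below runs plain gradient descent on f(w) = w^2 with three learning rates; the loss, the function name `final_distance`, and the specific rates are hypothetical and chosen only to show the three regimes, so they are not comparable in scale to the 0.001 to 0.00001 range explored for the network.

```python
def final_distance(lr, steps=50, w=1.0):
    """Run gradient descent on f(w) = w^2 and return |w| after `steps` updates."""
    for _ in range(steps):
        w -= lr * 2.0 * w    # gradient of w^2 is 2w
    return abs(w)

# Too small: barely moves. Moderate: converges. Too large: diverges.
for lr in (0.001, 0.1, 1.1):
    print(f"lr = {lr}: distance from optimum after 50 steps = {final_distance(lr):.3g}")
```

Each update multiplies w by (1 - 2 * lr), so a tiny rate shrinks the error very slowly, while any rate above 1.0 makes the magnitude of that factor exceed one and the iterates grow without bound.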
Dropout is a regularization technique that randomly drops, or deactivates, neurons in the neural network during training. It is used to prevent overfitting and to improve validation accuracy, thus increasing the generalization power [59]. In the present experiment, a dropout rate of 40% was used.
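A minimal sketch of dropout at the 40% rate is given below. It uses the common "inverted dropout" convention, in which surviving activations are rescaled by 1 / (1 - rate) during training so that no rescaling is needed at inference; whether the original implementation follows this convention is an assumption, and the function name and input values are illustrative.

```python
import random

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate` during training
    and rescale the survivors so the expected activation is unchanged."""
    if not training or rate == 0.0:
        return list(activations)    # inference: pass activations through
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)              # fixed seed for a repeatable illustration
x = [1.0] * 10000                   # hypothetical layer activations
y = dropout(x, rate=0.4, rng=rng)

dropped_fraction = y.count(0.0) / len(y)
mean_activation = sum(y) / len(y)
print(f"dropped: {dropped_fraction:.3f}, mean activation: {mean_activation:.3f}")
```

Roughly 40% of the units are zeroed on each training pass, while the mean activation stays close to its original value of 1.0 because of the rescaling.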
Batch size is the number of training examples used in one iteration. With a larger batch size, the quality of the model may degrade and it may fail to generalize well. Batch size also affects the model’s training speed [60]. A batch size of 64 yielded the best results in this research.
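The effect of batch size on training speed can be illustrated by counting weight updates. The sketch below assumes a hypothetical training set of 2647 images (the 3307 ISIC images minus the 660 test images discussed earlier, ignoring augmentation) and the 30 epochs reported for the best run; the training-set size is an assumption for illustration, not a figure stated by the authors.

```python
import math

def iterations_per_epoch(num_examples, batch_size):
    # The last batch may be smaller, so round up
    return math.ceil(num_examples / batch_size)

num_train = 2647    # assumption: 3307 images minus 660 test images, no augmentation
epochs = 30         # number of epochs reported for the best results

for batch_size in (16, 32, 64, 128):
    its = iterations_per_epoch(num_train, batch_size)
    print(f"batch size {batch_size}: {its} iterations/epoch, {its * epochs} updates in total")

total_updates = iterations_per_epoch(num_train, 64) * epochs
```

Under these assumptions, a batch size of 64 gives 42 iterations per epoch and 1260 weight updates over 30 epochs; halving the batch size roughly doubles the number of updates per epoch.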
Conclusion
In this work, the proposed model with a transfer learning technique was used to automatically classify skin lesion images as benign or melanoma. It performs the classification by combining the output activation layer with a SoftMax classifier for binary classification. The proposed model was evaluated on the 3307 images in the ISIC archive dataset, with training and testing for accuracy. Good experimental results were obtained with the RMSprop optimizer, with an overall accuracy of 97.52% and an AUC score of 87.40%. The dataset was balanced and the number of images was increased by augmentation techniques, which improved classification accuracy. The AUC score of 87.40% obtained with the RMSprop optimizer indicates the ability of the proposed model to distinguish between melanoma and benign skin lesions, and the model can assist dermatologists in making decisions. Moreover, the results obtained with the CAD system support the effectiveness of the suggested method in diagnosing malignancy in melanocytic skin lesions. In this paper, the classification performance report and AUC scores were obtained using the confusion matrix, and hyperparameter values were fine-tuned with different optimizers. These are presented in tables and figures, which can help distinguish between benign and melanoma images. Skin lesions were successfully classified as benign or malignant using a CAD system based on the Inception v3 model modified with a transfer learning technique, which would enable dermatologists to begin treatment at an early stage, reducing mortality. The suggested model provided better accuracy, based on different performance metrics used in the medical field, than the existing methods available in the literature.
Acknowledgments
We thank the International Skin Imaging Collaboration (ISIC) for their efforts in gathering and sharing the skin lesion classification dataset for the assessment of Computer Aided Diagnosis (CAD) to categorize dermoscopy skin lesion images.
