Abstract
Cervical cancer is one of the most frequent and fatal malignancies among women worldwide. If the tumor is detected and treated early enough, the complications it causes can be minimized. Deep learning has demonstrated significant promise when applied to biomedical problems such as medical image processing and disease prognosis. In this paper, an automatic cervical cell classification approach named IR-PapNet is developed based on Inception-ResNet, an optimized version of Inception. The model’s conventional ReLu activation is replaced with the parametric rectified linear unit (PReLu) to avoid the nullification of negative values and the dying ReLu problem. The model loss function is minimized with the SGD optimizer by updating the parameters of the neural network. Furthermore, we present a simple but efficient noise removal technique based on the 2D Discrete Wavelet Transform (2D-DWT) for enhancing image quality. Experimental results show that the model achieves a top-1 average identification accuracy of 99.8% on the Herlev pap smear cervical dataset, which supports its effectiveness. The restructured Inception-ResNet model obtains significant improvements over most state-of-the-art models in 2-class classification, and it tolerates a high learning rate without experiencing dead nodes.
Introduction
Cervical cancer, which develops in the lower part of the uterus, is among the most prevalent and lethal tumors in women. It ranks fourth in cancer mortality [1]. According to GLOBOCAN 2020, about 123,907 individuals in India were affected by cancer of the cervix uteri, accounting for nearly 9.4% of all cancer cases [2]. The incidence and mortality rates of cervical cancer are 14.7 and 9.2 per 100,000, respectively [3]. Human papillomavirus and human immunodeficiency virus contribute to about 99% of cervical cancer cases [4–6]. Cervical cancer occurs in two main types, namely adenocarcinoma and squamous cell carcinoma. Squamous cell carcinoma progresses in the outer lining of the cervix, and adenocarcinoma progresses in the glandular cells of the cervical canal. The disease passes through five stages before reaching the malignant phase [7]. Early detection of cervical cancer using an effective screening tool could increase the survival rate [8]. Radiation treatment and chemotherapy are commonly used to treat cervical cancer following surgery if there is a significant chance of the cancer returning or if the tumor has progressed. Accurate delineation of the tumor cells aids the therapist and prevents radiation exposure to healthy cells. However, manual segmentation takes 90–160 minutes and still deviates from the ground truth [9].
These atypical cell alterations usually occur over a lengthy period and cover a wide range of abnormalities. Lugol’s iodine test [11, 12], colposcopy, cervicography, and the Pap smear test are commonly used to test the spread of cancer [13]. The Pap smear [16] is one of the most common screening procedures for cervical cancer prevention and timely diagnosis; pathologists examine the tissue under a microscope for aberrant cell growth. As shown in Fig. 1, the test result contains thousands of cells, and the pathologist studies each one carefully before making a decision [10, 15]. This is a lengthy process that is prone to subjective or skewed judgment. Over the last few decades, extensive research has been carried out to build computer-aided diagnostic (CAD) systems and computer-assisted reading systems to aid in disease detection and medical image analysis. Artificial intelligence and machine learning algorithms offer an effective way to process medical images, segmenting and classifying diseases without human intervention. The convolutional neural network (CNN) is the most promising technology for medical image classification. The notations used in this paper are presented in Table 1.

Nucleus and cytoplasm differentiation of pap smear blood cells under microscopic view.
Notations and abbreviations
PReLu hyper parameter setting
Existing methods use different CNN structures to classify the pathological condition of pap smear cervical cells. A pretrained model with customized final layers is employed in [19], while [14, 18] employ a CNN-based backbone network for segmentation and classification. In the 2-class classification task, a CNN model integrated with a machine learning-based classifier [17] achieved 99.5 percent prediction accuracy.
The goal of this work is to establish a robust system for automatically detecting cancer cells in Pap smear images using a deep learning strategy. Initially, the cervical image is denoised and normalized using the discrete wavelet transform (DWT) to remove distortion, artifacts, and noise from the microscopic cervical cell images. IR-PapNet is then designed to label the nucleus and cytoplasm of the cancerous cervical cells. The main contributions of this paper are as follows: (i) We propose the IR-PapNet model for automatically classifying cancerous cervical cells. (ii) We explore the effectiveness of the PReLu activation function, which overcomes the dying ReLu problem and helps the model tolerate a high learning rate, so that the model adapts to the problem quickly and with high prediction accuracy. (iii) We use a 2D-DWT-based noise reduction approach in preprocessing to increase image quality, and augmentation is used to improve the generalization ability and avoid overfitting. (iv) A cervical cell classification system is provided, and the Herlev pap smear dataset is used to validate the reliability and accuracy of our prediction model.
The rest of this paper is organized as follows. Section 2 briefly reviews existing techniques and lists their limitations. Section 3 elaborates the work modules, processing, system architecture, and classification process of the proposed IR-PapNet system. Section 4 covers the result analysis and the performance comparison between the proposed model and existing techniques, while Section 5 presents the conclusion and future work.
Manually differentiating anomalous cells from healthy cells is time-consuming and error-prone. Research on computer-based methodologies has produced deep learning and machine learning algorithms that automatically detect cervical cancer from Pap smear samples. This section looks at a few of those strategies in depth.
In 2020, Wang, P. et al. established an adaptive pruning-based deep convolutional neural network for cervical cell categorization. Transfer learning is leveraged to obtain a pre-trained model. The convolution layers of the CNN are updated, and several convolution kernels are pruned for optimization. A digital camera attached to a microscope is used to capture the Pap smear images. Their optimization method was time-consuming and computationally complex [20].
In 2021, Venkatesan et al. deployed an ensemble strategy in a deep learning model to diagnose cervical cancer. The dataset comprises 5679 colposcopy pictures gathered through Intel and Smartphone ODT’s cervical screening data collection. Two types of CNN model are used in this system to analyze the dataset. One is VGG6, which is used as a transfer learning technique, and the other is a novel implementation called CYGNET, which is used to classify colposcopy images. This system failed to determine the pre-cancerous stage, and the training dataset is limited [21].
In 2020, Li, Y. et al. developed a deep learning model to classify cervical cancer. Time-lapsed colposcopy images covering 7668 patient results were collected from collaborating hospitals. A CNN identifies the cervix region in the data. Then, at different time intervals, features are extracted from the cervical areas using independent feature encoding networks. Finally, a GCNN combines the retrieved information to select the patients who require a biopsy. This system fails to find lesions inside the cervical canal [22].
In 2020, Luo, Y.M. et al. implemented a multi-CNN model for the diagnosis of cervical premalignant lesions. Colposcopic images were obtained from a cooperating hospital. The images are passed to the k-means algorithm for preprocessing. In the ensemble model, a transfer-learned CNN model is fused with the XG-Boost model. This method saves time, but it is not ready for practical use due to its limited dataset [23].
In 2020 Hussain, E, et al. presented a comprehensive study to analyze the prediction performance of the deep learning model in pap smear cervical cell images. They compare the performance of six distinct learning models and an ensemble model. The best three individual models were chosen as the basic model of ensemble learning. The comparative results show that the ensemble strategy performed better than the individual model [24].
In 2020, Hussain, E. et al. developed an automatic segmentation and classification algorithm based on deep learning for cervical cell images. The standard U-Net model was updated with dense layers and residual structures, and the encoder-decoder division was designed as a fully connected structure. The overall framework achieves segmentation and classification rates of 97 percent and 98.8 percent, respectively. Cells in densely clustered regions were not detected by this approach [25].
Khamparia, A. et al. presented a cervical cancer classification model employing a deep convolutional model in 2021. An integrated framework of a CNN and a variational autoencoder was established as the prediction model. The Herlev dataset was used to evaluate the performance of the suggested technique. This model is assessed on a small image sample, so its generalization is not sufficient [26].
Jia, A.D., et al. established an intelligent system to predict aberrant cells in cervical pap smear images in 2020. The features for the deep analysis were extracted concurrently using radiomic feature extraction (Gabor and GLCM) and the deep CNN model. Finally, the features are combined and sent into the SVM classifier for detection. For 2-class classification, this model achieved 99.3 percent classification accuracy [27].
In 2020, Kuko, M., and Pourhomayoun, M. created a learning model to classify pap smear cervical cells. The obtained samples are divided into single and cluster formats, and analysis is accomplished using deep learning and ensemble learning approaches. The findings show that, when compared to machine learning ensemble techniques, the deep learning model functioned well and provided higher accuracy [28].
Haryanto, T., et al., 2020, investigated the role of padding in cervical cell classification. AlexNet is used to classify cervical cells. The prediction method is then repeated with and without padding. According to the experimental results, using a padding technique on the AlexNet structure can boost accuracy by 2.44 percent [29].
According to the literature review, some techniques use machine learning models, which makes feature engineering complex. Because certain studies use limited samples, their generalization ability is poor. Most techniques use ReLu, so there is a chance of dead ReLu units: the affected units cannot be reactivated, and the necessary attributes of the cells vanish from the feature map, resulting in a lower prediction rate. In this paper, we employ IR-PapNet and preprocessing techniques to improve the performance of an automatic cervical cell classification framework. IR-PapNet leverages the customized PReLu activation function, which overcomes the dead ReLu problem and helps to achieve better results. The augmentation method provides sufficient data to improve the generalization of our model.
Proposed methodology
The human cervix is a sensitive, thin layer of soft tissue situated at the mouth of the uterus, connecting it to the vagina. Mutation of DNA in healthy cells leads to malignancy in cervical cells. The proposed work focuses on the precise labeling of the cytoplasm and nucleus in pap smear cell images. The proposed technique has four processing phases: data collection, data augmentation, preprocessing, and classification. Figure 2 illustrates the outline of the proposed learning model framework. A detailed description of the proposed work is given in the following sections.

Outlook of Multi-tasking deep convolution neural Network Framework for 2-class classification of cervical cancer.
Pre-processing is an important step in medical image analysis, and it comprises techniques such as data augmentation, image contrast enhancement, and denoising. Noise in medical images frequently degrades image quality, influencing diagnosis and therapy. Denoising is the process of removing noise from images, and it has a considerable impact on subsequent analysis steps. In the proposed model, a 2D-DWT-based image denoising technique is used.
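As a rough illustration of wavelet-domain denoising, the following minimal sketch splits an image into one approximation and three detail sub-bands, shrinks the detail coefficients, and reconstructs the image. The Haar wavelet, the single decomposition level, the soft-threshold rule, and all function names are assumptions made for this example; the text does not state which wavelet family, level, or threshold was used.

```python
import math

def haar_dwt2(img):
    """Single-level 2D Haar DWT of an image (list of rows) with even height and width.
    Returns the approximation (A) and horizontal, vertical, diagonal detail bands."""
    A, H, V, D = [], [], [], []
    for r in range(0, len(img), 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, len(img[0]), 2):
            p, q = img[r][c], img[r][c + 1]          # top-left, top-right
            s, t = img[r + 1][c], img[r + 1][c + 1]  # bottom-left, bottom-right
            a_row.append((p + q + s + t) / 2.0)      # local average
            h_row.append((p + q - s - t) / 2.0)      # difference between rows
            v_row.append((p - q + s - t) / 2.0)      # difference between columns
            d_row.append((p - q - s + t) / 2.0)      # diagonal difference
        A.append(a_row); H.append(h_row); V.append(v_row); D.append(d_row)
    return A, H, V, D

def haar_idwt2(A, H, V, D):
    """Inverse of haar_dwt2: rebuilds the full-size image from the four sub-bands."""
    out = []
    for a_row, h_row, v_row, d_row in zip(A, H, V, D):
        top, bottom = [], []
        for a, h, v, d in zip(a_row, h_row, v_row, d_row):
            top += [(a + h + v + d) / 2.0, (a + h - v - d) / 2.0]
            bottom += [(a - h + v - d) / 2.0, (a - h - v + d) / 2.0]
        out += [top, bottom]
    return out

def soft_threshold(band, thr):
    """Shrink every coefficient toward zero by thr; small coefficients become zero."""
    return [[math.copysign(max(abs(x) - thr, 0.0), x) for x in row] for row in band]

def denoise(img, thr):
    """Threshold only the detail sub-bands and keep the approximation untouched."""
    A, H, V, D = haar_dwt2(img)
    H, V, D = (soft_threshold(band, thr) for band in (H, V, D))
    return haar_idwt2(A, H, V, D)
```

With a threshold of zero the transform reconstructs the input exactly, which is a convenient sanity check; larger thresholds remove small, noise-like detail coefficients at the cost of some fine structure.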
Two-dimensional DWT (2D-DWT) decomposes an image into four sub-bands: approximation (A), horizontal (H), vertical (V), and diagonal (D) coefficients, where H, V, and D are the detail coefficients that can be thresholded for denoising. The 2D-DWT is applied along three directions produced beforehand by the proposed re-slicing strategies (axial, sagittal, or coronal), and the outcome is a set of volumes, each containing features extracted from the original volume. MATLAB toolboxes are again used to reassemble the obtained slices into the partitioned 3D volume. The dilation equation is shown in Equation (5).
The wavelet function ψ(t) and its relationship with the scaling function φ(t) are shown in Equation (2):
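For reference, in the standard multiresolution formulation, which is assumed here because the text does not fix a particular wavelet or notation, the scaling function φ(t) satisfies a dilation (two-scale) equation and the wavelet ψ(t) is built from the same shifted and dilated copies of φ. The coefficients h(k) and g(k) denote generic low-pass and high-pass filters and are not values taken from this work.

```latex
\phi(t) = \sqrt{2}\,\sum_{k} h(k)\,\phi(2t - k),
\qquad
\psi(t) = \sqrt{2}\,\sum_{k} g(k)\,\phi(2t - k)
```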
The proposed method uses a hybrid multi-branch architecture, Inception-ResNet-v2 (PapNet), that combines residual connections with the Inception v2 design. The objective of Inception-ResNet-v2 is to increase the number of layers while lowering the number of parameters. Batch normalization is used to speed up the training of Inception-ResNet-v2. Factorization is used to reduce the filter size, which reduces both overfitting and the number of parameters. The network is 164 layers deep, which allows it to learn high-level feature representations from a variety of images. The internal modules of the network, which comprise the stem, Inception-ResNet, and reduction blocks, are depicted in Fig. 3.

The structure representation of inception-ResNet-V2.
Here z is the activation function input, and α_i is the learnable parameter that regulates the slope in the negative part of the feature map. The subscript i in α_i indicates the utilization of
Where μ represents the momentum and l depicts the learning rate.
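For reference, the standard PReLu formulation that these symbols appear to follow can be written as below. The loss symbol E and the velocity term v_i are notation introduced for this sketch, and the update is written in the usual momentum-SGD form; the exact form used in this work may differ.

```latex
f(z_i) = \max(0, z_i) + \alpha_i \min(0, z_i),
\qquad
\frac{\partial f(z_i)}{\partial \alpha_i} =
\begin{cases}
0,   & z_i > 0 \\
z_i, & z_i \le 0
\end{cases}

v_i \leftarrow \mu\, v_i - l\, \frac{\partial E}{\partial \alpha_i},
\qquad
\alpha_i \leftarrow \alpha_i + v_i
```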
Regularization is not applied when updating α_i, because it tends to push α_i toward zero, which would reduce PReLu to ReLu. Furthermore, the range of α_i is not restricted, which allows the activation function to be non-monotonic.
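The following minimal sketch illustrates why a learnable negative slope avoids dead units, and what an unregularized momentum update of the slope looks like. The function names, the scalar (single-unit) setting, and the numeric values are hypothetical and chosen only for illustration.

```python
def prelu(z, alpha):
    """PReLu forward pass for one unit: identity for z > 0, slope alpha otherwise."""
    return z if z > 0 else alpha * z

def prelu_grad_z(z, alpha):
    """Gradient with respect to the input. With alpha = 0 (plain ReLu) this is 0
    for every negative input, which is what lets a unit 'die'."""
    return 1.0 if z > 0 else alpha

def prelu_grad_alpha(z):
    """Gradient with respect to the learnable slope alpha."""
    return 0.0 if z > 0 else z

def update_alpha(alpha, velocity, grad_alpha, lr, momentum):
    """One momentum-SGD step on alpha. No regularization term is added,
    so alpha is not pulled toward zero and may take any value."""
    velocity = momentum * velocity - lr * grad_alpha
    return alpha + velocity, velocity

# Illustrative values only (not taken from the paper).
alpha, velocity = 0.25, 0.0
alpha, velocity = update_alpha(alpha, velocity, prelu_grad_alpha(-2.0),
                               lr=0.01, momentum=0.9)
```

For a negative input, the input gradient equals alpha rather than zero, so the unit keeps receiving a learning signal as long as alpha is nonzero.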
Here the convolution kernel size is 3 × 3 with a stride of 2. The dimension of the feature map is reduced by max-pooling with a stride of 2, which also reduces the parameter size of the stem output.
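The effect of a stride-2 operation on the feature-map size can be checked with the usual output-size formula. The sketch below is illustrative: the 299 × 299 input size and the intermediate 147 × 147 map are assumptions borrowed from the commonly published Inception-ResNet-v2 stem, since the input resolution is not stated here, and the function name is hypothetical.

```python
def conv_output_size(n, kernel, stride, padding=0):
    """Spatial output size of a convolution or pooling layer along one dimension."""
    return (n + 2 * padding - kernel) // stride + 1

# Assumed 299 x 299 input: a 3 x 3 convolution with stride 2 and no padding.
after_first_conv = conv_output_size(299, kernel=3, stride=2)   # 149

# Assumed 147 x 147 map entering a 3 x 3 max-pool with stride 2.
after_max_pool = conv_output_size(147, kernel=3, stride=2)     # 73
```

Each stride-2 step roughly halves the height and width, so the number of activations passed to later layers drops by about a factor of four.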
This hybrid block has three sections, Inception-ResNet-A, B, and C, which are illustrated in Fig. 4. Here the outcome of the inception module’s convolution operations is added to the block input via residual connections. A 1 × 1 convolution is applied after the multi-size convolution branches (1 × 1, 3 × 3, 5 × 5, etc.) so that the depth of the concatenated output matches the depth of the block input.
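The depth-matching role of the 1 × 1 convolution can be sketched at a single spatial position, where a 1 × 1 convolution reduces to a linear map over channels. Everything in this example (function names, the toy branches, the weights, and the optional residual scale) is hypothetical, and the activation that normally follows the addition is omitted.

```python
def conv1x1(channels, weights):
    """1 x 1 convolution at one spatial position: one weight row per output channel."""
    return [sum(w * x for w, x in zip(row, channels)) for row in weights]

def inception_residual(x, branches, proj_weights, scale=1.0):
    """Residual inception step for one position.
    x: input channel vector; branches: functions mapping x to a channel vector."""
    # Concatenate branch outputs along the channel axis.
    concat = [value for branch in branches for value in branch(x)]
    # Project back to len(x) channels so the addition is well defined.
    projected = conv1x1(concat, proj_weights)
    # Residual connection: add the (optionally scaled) projection to the input.
    return [xi + scale * pi for xi, pi in zip(x, projected)]

# Toy example: 2 input channels, branches producing 1 and 3 channels.
x = [1.0, 2.0]
branches = [
    lambda v: [v[0] + v[1]],
    lambda v: [v[0], v[1], v[0] - v[1]],
]
proj = [[1, 0, 0, 0],   # 4 concatenated channels -> 2 output channels
        [0, 0, 1, 0]]
y = inception_residual(x, branches, proj)
```

Without the projection, the four concatenated channels could not be added to the two input channels, which is why the 1 × 1 layer precedes every residual addition in this kind of block.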

The detailed layer structure representation of a) Inception-ResNet-A b) Inception-ResNet-B & c) Inception-ResNet-C.
Here z represents the input vector of K real numbers. Each output of the softmax lies in the interval (0, 1), and the outputs sum to 1.
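A minimal sketch of the softmax computation follows, assuming the standard definition in which each exponentiated score is divided by the sum over all scores. The function name and the two-element example are illustrative; subtracting the maximum score is a common numerical-stability step that does not change the result.

```python
import math

def softmax(z):
    """Map a vector of real-valued scores to probabilities that sum to 1."""
    m = max(z)                                 # shift for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

# Two-class example (scores are made up): the larger score gets the larger probability.
probs = softmax([2.0, 0.0])
```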
IR-PapNet Hyper parameter setting
Implementation details
The proposed IR-PapNet (Inception-ResNet-v2) architecture was implemented in MATLAB R2019b, and an Intel Core i9 CPU with 8 GB of memory was used for the performance and experimental evaluations. IR-PapNet is evaluated using the Herlev dataset images. The performance of the proposed IR-PapNet is assessed through accuracy, recall, sensitivity, precision, specificity, and F1 score. The performance of the suggested PapNet was also measured through a comparative analysis against state-of-the-art approaches.
Dataset description
The Herlev pap smear database is used for training and testing the IR-PapNet architecture. The database was developed by Herlev University Hospital in Denmark. The images have a resolution of 0.201 μm per pixel. The database has 917 single-cell Pap smear cervical images with segmented and labeled cytoplasm in seven classes: three normal classes (400 images) and four abnormal classes (517 images). A detailed description of the Herlev dataset is given in Table 4. Clipping, scaling, flipping, rotation, and translation are applied to each cell image in the Herlev dataset to produce a reasonably homogeneous data distribution. After augmentation, the normal and abnormal classes contain around 700 and 760 images, respectively. The proposed method used 80% of the data for the training process and 20% for the testing and validation process.
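An 80/20 split of the augmented data can be sketched as follows. The per-class (stratified) splitting, the fixed seed, the function name, and the placeholder image identifiers are assumptions for this example; the text does not say how the partition was drawn, and the class sizes of 700 and 760 are the approximate figures quoted above.

```python
import random

def stratified_split(items_by_class, train_fraction=0.8, seed=0):
    """Shuffle each class separately and split it into train and held-out parts."""
    rng = random.Random(seed)
    train, held_out = [], []
    for label, items in items_by_class.items():
        shuffled = list(items)
        rng.shuffle(shuffled)
        cut = int(round(train_fraction * len(shuffled)))
        train += [(item, label) for item in shuffled[:cut]]
        held_out += [(item, label) for item in shuffled[cut:]]
    return train, held_out

# Placeholder identifiers standing in for the augmented images.
data = {
    "normal": [f"normal_{i}" for i in range(700)],
    "abnormal": [f"abnormal_{i}" for i in range(760)],
}
train_set, held_out_set = stratified_split(data)
```

Splitting each class separately keeps the normal-to-abnormal ratio the same in both partitions, which matters when the held-out set is small.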
Sample description of the Microscopic Pap smear Blood cell data
The experimental results were evaluated with accuracy, specificity, sensitivity, precision, F1 score, and recall. The statistical definitions of these parameters are given below.
Classification accuracy measures the number of correctly predicted cervical samples divided by the total number of cervical samples.
Here true positive (TP) indicates that the model correctly predicted the abnormal class, and true negative (TN) indicates that the model correctly predicted the normal class. False positives (FP) are cases in which the model predicted a normal cell as abnormal. False negatives (FN) are cases in which the model predicted normal but the cell actually belonged to an abnormal class.
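The metrics can be computed directly from the four confusion-matrix counts using their standard definitions, as in the sketch below. The function name and the example counts are hypothetical and are not results from this work. Note that sensitivity and recall are the same quantity, TP / (TP + FN).

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard binary-classification metrics; assumes all denominators are nonzero."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # identical to sensitivity
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "sensitivity": recall,
        "specificity": specificity,
        "f1": f1,
    }

# Made-up counts for illustration only.
metrics = classification_metrics(tp=90, tn=85, fp=15, fn=10)
```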
Table 5 shows the performance measures of the proposed IR-PapNet model for both ReLu and PReLu; the high sensitivity value reflects the robustness of the model. The specificity index demonstrates that the model correctly recognizes true negative samples. Compared with ReLu, the proposed PReLu activation helps to achieve higher performance. Figure 5 shows the accuracy versus epoch curve of the proposed model; the x-axis depicts the epoch, and the y-axis represents accuracy. The model provides strong performance even at small epoch counts due to the use of PReLu instead of ReLu. The chart shows that the performance of the model improves as the number of epochs increases. Figure 6 displays the loss versus epoch curve, which shows that the model loss decreases as the number of epochs increases. These trends indicate that the predictions of the model are reliable.
Performance of the proposed IR-PapNet

Training and testing accuracy of the proposed model.

Training and testing loss curve of the proposed model.
Figure 7 depicts the confusion matrix of the proposed model for two-class categorization. The figure shows that the FN and FP rates of the proposed model are notably low. Using PReLu avoids dead ReLu units and makes it possible to use a high learning rate, which improves the performance of the model.

Confusion matrix of proposed 2 class classification.
Figure 8 illustrates the ROC curve of the proposed model, in which the true positive rate is plotted against the false positive rate. This curve indicates how well the proposed model separates the two classes of cervical cells.
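As an illustration of how such a curve is obtained, the sketch below sweeps the decision threshold over the predicted scores, records a (false positive rate, true positive rate) point at each distinct score, and integrates the curve with the trapezoidal rule. The function names and the toy scores are hypothetical and are not outputs of the proposed model.

```python
def roc_points(scores, labels):
    """ROC points for binary labels (1 = abnormal/positive, 0 = normal/negative).
    Assumes both classes are present."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Toy scores: one abnormal cell is ranked below one normal cell.
curve = roc_points([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
area = auc(curve)
```

A curve that hugs the top-left corner, with an area close to 1, corresponds to few false positives at high sensitivity.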

ROC curve of the proposed model.
Figure 9 depicts the layer representation of the proposed IR-PapNet model. Because the model has 164 layers, the representation is abbreviated: only the top, middle, and bottom layers are shown in Fig. 9. Figure 10 shows the data augmentation result for a test image. Cropping, flipping, rotation, scaling, and translation were used to increase the data size in the proposed model.
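Several of these augmentation operations can be sketched on a small image stored as a list of rows. The function names, the restriction to 90-degree rotation and non-negative horizontal shifts, and the zero fill value are simplifying assumptions for this example; the parameter ranges actually used are not stated, and scaling is omitted because it requires interpolation.

```python
def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def flip_vertical(img):
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]

def rotate90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def translate_right(img, dx, fill=0):
    """Shift the image dx >= 0 pixels to the right, padding with fill."""
    width = len(img[0])
    return [([fill] * dx + row)[:width] for row in img]

def crop(img, top, left, height, width):
    """Extract a height x width window whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]

def augment(img):
    """Return the original image together with a few transformed copies."""
    return [img, flip_horizontal(img), flip_vertical(img),
            rotate90(img), translate_right(img, 1)]
```

Each transformed copy keeps the label of its source image, which is how augmentation enlarges a class without new annotation effort.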

Layer representation of IR-PapNet.

Augmentation results of the proposed model.
Figure 11 shows the predicted results of the proposed model. The images are obtained from the Herlev dataset and split into two sets for training and testing. The results in Fig. 11 are from the test set; the agreement between the ground truth and the predicted labels demonstrates the effectiveness of the proposed model.

Predicted images from the testing Set.
Figure 12 shows the filter size visualization of the proposed model. Here the initial stem block kernel representation was provided. Figure 13 shows the implemented activation map representation of random layers for the specified input. Each kernel in the IR-PapNet is responsible for learning specific features in an image, and the performance of a learning model is entirely dependent on the details in the image’s activation map.

Visualization of filter size in IR-PapNet model.

Activation map representation of random layers in IR-PapNet.
To assess the effectiveness of the proposed IR-PapNet, performance metrics such as accuracy, specificity, precision, and recall are compared with those of existing approaches in a comparative analysis.
Table 6 presents the performance comparison between the traditional and proposed algorithms, which indicates that the proposed model outperforms the other learning models. The confusion matrices of the traditional models listed in Table 6 are depicted in Fig. 14.
Comparative analysis of traditional techniques with proposed Inception ResNet V2 for Accuracy

Confusion matrix of the traditional model utilized in the comparison.
Table 7 shows that the proposed model achieved 0.4 percent higher accuracy than the DCAVN [26] approach in Herlev 2-class categorization. The recognition accuracy of the technique used in this work was also higher than that of the models in [3, 28]. Unlike approaches based on ReLu, our model uses PReLu, which keeps a nonzero gradient for negative inputs and therefore avoids dead nodes. As a result, the model can adapt rapidly to the problem. The FN (false negative) count also decreased during testing, indicating that our technique can effectively identify cancerous single cells.
Assessment of proposed methods with existing state-of-the-art models
Figure 15 depicts a comparative performance analysis of the existing and proposed models. Compared with the others, the proposed model has a high precision value, so its positive predictions are reliable.

Performance comparison of existing methods.
Figure 16 shows the loss versus epoch curves of Inception v2, ResNet-101, and the proposed model. The other traditional models were implemented with the ReLu activation function, so they are exposed to the dead ReLu problem: once the gradient of a unit becomes zero, the unit can never be activated again. In the proposed model, PReLu is used instead, which improves system stability while also increasing performance. The robustness of the proposed method is demonstrated by this comparison with existing methods.

Performance curve between different networks.
In this paper, a deep learning framework is designed to diagnose cervical cancer using pap smear images. First, image noise is removed by the 2D-DWT algorithm. Afterward, the novel IR-PapNet is applied to classify cervical cells. The learning model combines the advantages of residual connections and the inception model, so it substantially mitigates the vanishing gradient and overfitting challenges. To address the limitations of the traditional activation function, it is replaced with PReLu, which increases performance and decreases time consumption. Finally, an SGD optimizer is used to minimize the loss function of the learning model. The proposed model achieved 99.8% classification accuracy for 2-class classification. The analysis of the results shows that the proposed framework outperforms the other existing learning models considered. In the future, the model will be trained for 7-class classification in order to accurately predict cancer stages and improve the accuracy of diagnoses. To make the model more practical, we will also combine multitask learning with other imaging techniques, such as colposcopy and cervical biopsy samples.
Footnotes
Acknowledgments
The authors thank their supervisor, with a deep sense of gratitude, for his guidance and constant support during this research.
Funding statement
The authors received no specific funding for this study.
Conflicts of interest
The authors declare that they have no conflicts of interest to report regarding the present study.
