Automated carcinoma classification using efficient nuclei-based patch selection and deep learning techniques

Abstract

Breast cancer is regarded as a fatal disease among women, yet it can be treated successfully if diagnosed at an early stage. Digitized histopathology slide images are the gold standard for tumor diagnosis. However, manual diagnosis remains tedious because of the structural complexity of these images. With the advent of computer-aided diagnosis, the time-intensive manual procedure can be supported by an automated classification system. Feature extraction and classification are challenging because the images contain complex structures and overlapping nuclei. A novel nuclei-based patch extraction method is proposed to extract non-overlapping nuclei patches from a breast tumor dataset. An ensemble of pre-trained models is used to extract discriminative features from the identified and augmented non-overlapping nuclei patches. The discriminative features are fused using the p-norm pooling technique and classified using a LightGBM classifier with 10-fold cross-validation. The results show an increase in overall performance in terms of accuracy, sensitivity, specificity, and precision. The proposed framework yielded an accuracy of 98.3% for binary classification and 95.1% for multi-class classification on the ICIAR 2018 dataset.

Keywords

Breast cancer, histopathology, nuclei-based patches, nuclei feature fusion, LightGBM

1 Introduction

In India, the National Cancer Registry Programme (NCRP) projected a surge in the number of cases from 13.9 lakh in 2020 to 15.7 lakh by 2025. Despite the surge, one-third of these cancers are preventable and curable. The Global Breast Cancer Initiative (GBCI), established by the World Health Organization (WHO), aims to reduce global breast cancer mortality by 2.5% per year, averting nearly 2.5 million breast cancer deaths between 2020 and 2040 and reducing breast cancer deaths by 25% by 2030 and 40% by 2040 [1].

Though the initial screening of a breast tumor is done using non-invasive imaging modalities [2], the type of the tumor is determined by biopsy, an invasive procedure for tissue extraction [3]. To visualize the nuclei and cytoplasm, the tissues are stained with Hematoxylin and Eosin (H&E). In digital pathology, manual assessment of the images is strenuous because they are complex and heterogeneous. Hence, Computer-Assisted Diagnosis systems (CADs) are used by pathologists as decision support systems for the automatic detection and diagnosis of tumors [4]. Recently, advances in high-performance computing and deep models such as Convolutional Neural Networks (CNNs) have been integrated with CADs and have shown remarkable performance compared to conventional methods for localization, segmentation, feature extraction and classification tasks [5–10]. However, employing CNNs directly on histopathology images has certain shortcomings that lead to poor classification performance. Among several factors, variations due to the staining process lead to improper classification. Hence, the images must be prepared and standardized such that the nuclei are made evident in order to attain better classification accuracy. The preprocessing involves two phases, stain normalization and patch extraction. Stain normalization has gathered considerable interest among researchers for the past few decades as it has shown a noticeable increase in classification performance [6]. Among the significant approaches, the normalization technique proposed in [11] is employed in the majority of research contributions [8–10, 12]. Histopathological images contain cell structures that are intricate and overlapping. In the above research works, it is observed that stain normalization alone does not guarantee that the edges of the nuclei are captured clearly enough for the extraction of discriminative features. Better visualization of the nuclei and their structure can be obtained by using image enhancement techniques along with stain normalization.

In general, histopathology images are large and computationally intensive to process. However, resizing the input to fit a CNN leads to loss of crucial information. Hence, patch extraction is employed to extract significant patches from histopathology images that contain the essential discriminative information for classification. The prevailing method for patch extraction is the sliding window approach, which is used in the majority of research works [8–10, 12–18]. This method is limited because the discriminative nuclei are present only in certain areas of the tissue. In [12], the patch sizes considered for extraction are 400×400 and 650×650, where most of the patches contain irrelevant information. A patch size of 512×512 is used in [8–9, 17] and 128×128 is used in [8–10, 19]. However, these patches are further resized to 32×32, and Principal Component Analysis [8] and Autoencoders [9] are applied prior to K-means clustering to cluster the patches based on their phenotypes. This process is both computation and time intensive. On the other hand, some research works focus on patch extraction through a segmentation-guided approach [7, 16], where the informative regions are targeted for a less computation-intensive process. However, many of the extracted patches contain fewer nuclei and more cytoplasm, leading to degradation in classification performance [8]. This issue can be addressed by a method that focuses on extracting nuclei features, which motivated the proposed nuclei-based patch extraction approach, where each patch contains a non-overlapping individual nucleus.

Algorithm 1: Nuclei-Highlighting Algorithm
Input: Hematoxylin & Eosin-stained breast histopathology RGB dataset (D_BH) with images of size m×n, where m = 2040, n = 1536
Parameters: α = 1, β = 0.15, I_o = 255, Hematoxylin & Eosin reference matrix (HE_ref) and reference for maximum stain concentrations of H&E (HE_cmax)
Output: Normalized stained image (I_norm), Eosin-stained image (I_e) and Hematoxylin-stained image (I_h)
begin
  Read images from D_BH, where D_BH = {I₁, I₂, …, I_N} and N = 400
  for i = 1 to N do
    Convert the RGB pixel values (P₁, P₂, …, P_{m×n}) of the image to OD space using Eq. (1)
    if (P_i > β) /* i = 1 to m×n */
      Retain the pixels (P_i) in OD space
    else
      Remove the trivial pixels
    end if
    Apply singular value decomposition on the OD tuples to obtain the eigenvectors (E_vec) and eigenvalues (E_val)
    Project the OD tuples onto the plane spanned by the E_vec corresponding to the two largest E_val
    Determine the angles φ_min and φ_max with respect to the first E_vec direction and calculate the robust extreme values
    /* Robust extreme values are the α-th and (100−α)-th percentiles */
    Find the vectors V_min, V_max for φ_min, φ_max and project back to OD space
    /* The vector corresponding to hematoxylin is assigned first and the eosin vector second */
    if (V_min > V_max)
      HE = [V_min V_max] /* V_min is hematoxylin, V_max is eosin */
    else
      HE = [V_max V_min] /* V_max is hematoxylin, V_min is eosin */
    end if
    Calculate the individual stain concentrations C = HE \ Y
    /* least-squares solution of Y = HE × C, where Y is the OD matrix with rows as channels and columns as pixels */
    Regenerate the normalized image I_norm = I_o × exp(−HE_ref × C)
    Separate I_e and I_h from I_norm
    /* Consider the I_h images for gamma correction as they contain the nuclei components */
    Convert I_h to grayscale
    Apply gamma correction using Eq. (2) to obtain I_gc
    /* where I_gc is the gamma corrected image */
    return I_gc
  end for
end

The renaissance of deep learning, along with high-performance computation, has achieved significant success in the medical imaging domain. Deep learning techniques have made substantial progress in several tasks such as medical image reconstruction, enhancement, registration, segmentation, and disease diagnosis. Recent progress in deep learning involves several strategies, such as deeper networks, which increase discriminative capacity. Adversarial and attention models assist in the automated identification of "where" and "what" for comprehensive decision making. For channel attention mechanisms, squeeze-and-excitation networks are incorporated. Architecture design can be automated using Neural Architecture Search (NAS) for better performance. The limited availability of data in the medical imaging domain has also prompted the exploration of techniques to scale up the dataset. This is achieved through data augmentation using the conventional approach of affine transformations, which comprises rotation, scaling, translation, flipping and other geometric transformations [20, 21]. However, these affine transformations occupy additional memory, and some operations such as cropping or translation need extra care to check for unintended alteration of the images. In the medical imaging domain in particular, affine transformation is limited because the differences between the training and the testing data are highly complex in terms of positional and translational variance. Researchers have therefore turned to augmenting images with deep learning approaches such as neural style transfer, adversarial training, and Generative Adversarial Networks [21] to obtain synthetic images. The synthetic data obtained through deep learning approaches represent a solution to the medical image data scarcity challenge.
The other challenge addressed by deep learning methods is the segmentation of Regions of Interest (RoIs) in medical images. The tissues that contribute most to the disease are identified and marked as the RoIs. Several traditional approaches exist, such as edge-based, threshold-based, region-based, watershed-based and clustering-based methods, whereas the deep methods include CNNs that typically use an encoder-decoder architecture, such as SegNet, UNet, DeepLab and Mask-RCNN.

For classification tasks, training deep models from scratch is usually highly intensive and needs considerable expertise. Modern CNNs are capable of producing excellent classification results, but at the cost of high memory and computing usage. To overcome this, transfer learning approaches are used for feature extraction and classification [8–10, 13–18]. Despite the contributions of researchers, there is still a demand for an efficient algorithm that can be used on multiple datasets for better classification performance [22]. Pretrained models can be easily fine-tuned and are simple to use for inference and for extracting high-level features from the extracted patches. In the majority of research contributions, VGG [23], ResNet [24] and Inception [25] are used as the promising feature extraction approaches. Nonetheless, some approaches yield an efficient result on one dataset but an average result on another, depending on the features. This challenge can be addressed by combining pretrained models or using an ensemble of the best models for better identification of the class labels [26–29].

In particular, training an ensemble of pretrained networks on the nuclei patches helps to extract more discriminative features that contribute towards enhanced classification. From the literature, it is evident that existing grid-based patch extraction has many limitations, as most of the extracted patches include irrelevant information, leading to a suboptimal training process. Hence, the proposed model aims to extract appropriate features for an optimal classification of breast tumors through feature fusion and decision trees.

1.1 Key contributions

Rather than the traditional approaches to patch extraction, such as random sampling or grid sampling, the proposed method of direct extraction of nuclei patches streamlines the feature extraction process. Among several deep learning models, the well-established pretrained models adopted in the medical imaging domain for feature extraction are the VGG16, Inception, and ResNet models, whose features are fused to obtain a fused model. Further, the classification of the extracted features is achieved using a LightGBM model.

Algorithm 2: Highlighted Nuclei Patch Extraction
Input: Gamma corrected hematoxylin-stained images (I_gc)
Parameters: k = 1.414, σ = 2, t = 0.02, l = 10
Output: Nuclei patches of size 33×33
begin
  Read gamma corrected images
  for i = 1 to n do /* n = 400 gamma corrected images */
    Normalize the image I_gc
    for j = 1 to l do /* where l is the number of levels in scale space */
      σ₁ = σ × k^j
      Create the LoG filter using Equation (3)
      I_conv ← Convolve the image with the LoG filter
      I_pad ← Pad I_conv with 1's
      I_final ← Square the response of I_pad
      /* To locate the local maxima among the 26 neighbours in scale space (3×3×3) */
      Find the maximum peaks of pixels in the image
      if (pixels > t)
        Find the coordinates (C) of the pixels
        for C_i = 1 to y do /* y is the total number of coordinates in the image */
          Ignore the overlapping coordinates within 25 pixels of C_i
          Extract patches of size 33×33 with the coordinate of the nuclei in the center
        end for
        return extracted patches of 33×33
      end if
    end for
  end for
end

The key contributions of this paper are summarized as follows:

The nuclei whose features contribute more to tumor detection are highlighted using a combination of two image pre-processing techniques, stain normalization and gamma correction.

A novel nuclei-based patch extraction algorithm is proposed for the selection of non-overlapping nuclei, which eradicates the redundancy in feature extraction and results in the extraction of better features.

An ensemble of transfer learning algorithms is used for an efficient discriminative feature extraction from the nuclei-based patches.

LightGBM, a robust algorithm based on the Gradient Boosting Decision Tree (GBDT), is employed on the obtained features for classification.

2 Materials and methods

In this section, the materials and methods used in this work are discussed in detail. Figure 1 illustrates the schematic representation of the proposed system.

Fig. 1

Schematic representation of the proposed system.

2.1 Nuclei based patch extraction

As histopathology images are prone to variations due to staining procedures, there is a need to eliminate the differences and highlight the nuclei for better classification. The nuclei in the histopathology images are extracted using the approach illustrated in Algorithm 1 and Fig. 2. In this work, the normalization strategy of [11] is employed to normalize the stained images and separate the hematoxylin (I_h) and eosin (I_e) components. Since there is a nonlinear relationship between light intensity and stain concentration, the RGB intensity data cannot be used directly for stain separation. Hence, the RGB color intensities of the H&E-stained images are transformed to optical density (OD) space by a logarithmic transformation using Eq. (1), where I represents a vectorized matrix of the image and I₀ is the incident light intensity, set to 255 as each channel has 8 bits [6]. In OD space, each pixel is a linear combination of the stain vectors, weighted by the saturation of each stain. $Optical Density = - {log}_{10} (\frac{I}{I_{0}})$ (1)
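As a minimal sketch of Eq. (1), assuming NumPy and an 8-bit RGB array, the conversion to OD space and the removal of near-transparent pixels with β = 0.15 (Algorithm 1) could look as follows. The function names `rgb_to_od` and `filter_od` are hypothetical, zero-valued pixels are clipped to 1 to avoid log(0), and keeping a pixel only when all three channels exceed β is an assumption about how the rule P_i > β is applied.

```python
import numpy as np

def rgb_to_od(image, i0=255.0):
    """Convert an 8-bit RGB image of shape (H, W, 3) to optical density, Eq. (1)."""
    pixels = image.reshape(-1, 3).astype(np.float64)
    pixels = np.clip(pixels, 1.0, i0)  # assumption: clip to avoid log10(0)
    return -np.log10(pixels / i0)

def filter_od(od, beta=0.15):
    """Keep OD tuples whose three channels all exceed beta (assumed rule)."""
    return od[np.all(od > beta, axis=1)]

# One white (background) pixel and one stained pixel, purely illustrative.
rgb = np.array([[[255, 255, 255], [120, 60, 150]]], dtype=np.uint8)
od = rgb_to_od(rgb)
kept = filter_od(od)
```

The white pixel maps to an OD of zero in every channel and is discarded, while the stained pixel is retained for the subsequent stain vector estimation.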

Fig. 2

Feature Extraction and classification process.

The geodesic path is the shortest route between any two unit-norm color vectors. The OD-transformed pixels are projected onto the plane that contains this geodesic direction between the stain vectors. The plane is spanned by the vectors that correspond to the two largest singular values obtained from the singular value decomposition of the OD-converted pixels. Thus, the optimal stain vectors are identified, and their individual stain concentrations are calculated. The normalized image I_norm is then obtained, from which the hematoxylin component I_h and the eosin component I_e are separated, as illustrated in Fig. 4.
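The stain vector estimation described above can be sketched as follows, assuming NumPy. The function name `estimate_stain_vectors` is hypothetical. The plane is obtained here from the eigen-decomposition of the OD covariance matrix, and the two vectors are ordered by comparing their first components; both choices follow common implementations of the method in [11] and are assumptions rather than details stated in this paper. The synthetic stain matrix and concentrations are illustrative only.

```python
import numpy as np

def estimate_stain_vectors(od, alpha=1.0):
    """Estimate a 3x2 stain matrix from OD pixels of shape (N, 3)."""
    # eigh returns eigenvalues in ascending order, so the last two columns
    # span the plane of largest variance.
    _, eig_vecs = np.linalg.eigh(np.cov(od.T))
    plane = eig_vecs[:, 1:3]
    projected = od @ plane                      # (N, 2) coordinates in the plane
    phi = np.arctan2(projected[:, 1], projected[:, 0])
    phi_min = np.percentile(phi, alpha)         # robust extreme angles
    phi_max = np.percentile(phi, 100 - alpha)
    v_min = plane @ np.array([np.cos(phi_min), np.sin(phi_min)])
    v_max = plane @ np.array([np.cos(phi_max), np.sin(phi_max)])
    # Assumed ordering heuristic: the vector with the larger first component
    # is placed first (hematoxylin), the other second (eosin).
    if v_min[0] > v_max[0]:
        return np.column_stack([v_min, v_max])
    return np.column_stack([v_max, v_min])

rng = np.random.default_rng(0)
true_stains = np.array([[0.65, 0.70, 0.29], [0.07, 0.99, 0.11]])  # illustrative
conc = rng.uniform(0.2, 1.5, size=(500, 2))
od_pixels = conc @ true_stains
he = estimate_stain_vectors(od_pixels)
```

Each column of `he` is a unit-norm vector in OD space, because it is a rotation within the orthonormal basis of the plane.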

Fig. 4

Results of stain normalization of the H&E-stained images.

The key markers for the diagnosis of tumors include variations in the morphological and textural features of nuclei. As hematoxylin binds with nuclei-based components and eosin highlights the other components, the hematoxylin-stained image I_h is separated from I_norm for a better selection of features, as depicted in Fig. 4(d). Consequently, using non-overlapping nucleus patches directly, rather than grid-sampled patches, can assist the pretrained CNNs in extracting the appropriate features pertaining to the nuclei. Hence, the I_h images are further subjected to gamma correction in order to highlight the nuclei edges and remove noise. The gamma corrected output image is obtained using Equation (2), where I_gc is the gamma corrected image and γ is the gamma value, chosen in the range 0.9 < γ < 1. A sample of the gamma corrected image is depicted in Fig. 5(a).

Fig. 5

(a) Gamma corrected images, (b) nuclei blobs identified using LoG, (c) sample of non-overlapping nuclei, (d) sample of overlapping nuclei.

$I_{gc} = (I_{h})^{1 / γ}$ (2)
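As a brief illustration of Eq. (2), assuming NumPy and a grayscale hematoxylin image scaled to [0, 1], gamma correction reduces to an element-wise power. The function name `gamma_correct` and the value γ = 0.95 are hypothetical; the paper only states that 0.9 < γ < 1.

```python
import numpy as np

def gamma_correct(image_h, gamma=0.95):
    """Apply Eq. (2) to a grayscale image with intensities in [0, 1]."""
    image_h = np.clip(np.asarray(image_h, dtype=np.float64), 0.0, 1.0)
    return image_h ** (1.0 / gamma)

gray = np.array([[0.0, 0.25], [0.5, 1.0]])
corrected = gamma_correct(gray, gamma=0.95)
```

Because γ < 1 makes the exponent 1/γ slightly larger than one, intermediate intensities are mapped to slightly lower values, while 0 and 1 are left unchanged.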

The nuclei in the I_h image are highlighted relative to the background. Due to the inconsistent staining process, intensity discontinuities are observed at the boundary of the nucleus, and in certain cases the intensity distribution is multi-modal. However, with an appropriate Gaussian blurring effect, these intensity distributions become unimodal. Thus, the profile of the nucleus appears as a ridge with a smooth change. In order to obtain a blob, the ridge pattern can be rotated around its central axis. By rotating the second derivative of a Gaussian around its axis, the nucleus can be modelled as a blob along with some additive Gaussian noise. This model is the Laplacian of Gaussian (LoG) filter. Thus, the LoG blob detection method [30] is used to extract the potential nuclei. The preprocessed image is convolved with the Gaussian filter at different scales and the extrema are captured in the resultant scale space. For the given image at a point (x, y), the Gaussian filter is generated using Equation (3). $G (x, y, σ) = \frac{1}{2 π σ^{2}} e^{- (\frac{x^{2} + y^{2}}{2 σ^{2}})}$ (3)

The variable σ is the standard deviation of the kernel, which defines the scale of the filter. Typically, the value corresponds to the size of the nuclei to be detected. In this scenario, we consider each nucleus as a blob and tune the parameters of the Laplacian of Gaussian (LoG) in order to detect the blobs at a particular scale. Nuclei vary in size between 5 and 10 μm; hence, the initial value of σ is set to 2 to identify the smallest nuclei, and the scale levels are chosen such that the final sigma detects the largest blob in the given image. The scale used at each level of the convolution is termed the scale sigma, σ₁. However, an increased σ₁ value leads to a decreased response. Hence, a scale-normalized LoG is applied by multiplying the LoG by σ² to find the nucleus. A multiplying scale factor of 1.414, i.e. $\sqrt{2}$, and 10 scale levels are set such that the final sigma detects the maximum size of blob in the image. A threshold (t) is considered for the selection of blobs. If the filter response is larger than the threshold t, the point is considered as the coordinate of a nucleus. A set of values from 0.01 to 0.07 is evaluated. While analyzing the impact of the threshold values empirically, it is observed that if the threshold t is set to a value larger or smaller than 0.02, the number of detected blobs decreases. So, it is determined that a threshold of 0.02 covers most of the blobs that contribute to the nuclei selection. Figure 5(b) illustrates the detected nuclei blobs.
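To make the scale-space parameters concrete, the following sketch, assuming NumPy, lists the scales σ × k^j for j = 1 to 10 as written in Algorithm 2 and builds a scale-normalized LoG kernel from the Gaussian of Eq. (3). The function names and the kernel radius of 3σ are hypothetical choices, and the kernel uses the standard analytic Laplacian of the Gaussian.

```python
import numpy as np

def scale_levels(sigma=2.0, k=1.414, levels=10):
    """Scales sigma * k**j for j = 1..levels, following Algorithm 2."""
    return [sigma * k ** j for j in range(1, levels + 1)]

def gaussian(x, y, sigma):
    """2-D Gaussian of Eq. (3)."""
    return np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)

def normalized_log_kernel(sigma, radius_factor=3.0):
    """Scale-normalized LoG: sigma^2 times the Laplacian of the Gaussian."""
    radius = int(np.ceil(radius_factor * sigma))  # assumed truncation radius
    y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1].astype(np.float64)
    laplacian = (x ** 2 + y ** 2 - 2 * sigma ** 2) / sigma ** 4 * gaussian(x, y, sigma)
    return sigma ** 2 * laplacian

sigmas = scale_levels()
kernel = normalized_log_kernel(sigmas[0])
```

With these parameters the scales grow geometrically from about 2.8 to about 64 pixels. The kernel is negative at its center, so a bright blob produces a negative response, which is why the response is squared before thresholding in Algorithm 2.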

Further, the local maxima are detected in the scale space using non-maximum suppression, which compares each pixel with its 26 scale-space neighbors [30]. The centroids of the nuclei are approximated, and the neighbors of a recognized pixel within a distance of less than 25 pixels are ignored so that only non-overlapping nuclei are considered. If the blobs are extracted as such, with their intricate edges, the model is prone to overfitting. Hence, the patches are selected in such a way that all the nuclei can be accommodated. Once the centroids of the blobs are detected, a nucleus-centered square patch of 33×33 is extracted. The square patches of size 33×33 obtained using Algorithm 2 cover most of the non-overlapping nuclei, as depicted in Fig. 5.
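The selection of non-overlapping centroids and the extraction of 33×33 patches can be sketched as below, assuming NumPy. The function names are hypothetical. Keeping the strongest response first and skipping centroids too close to the image border are assumptions, since the text only specifies the 25-pixel exclusion distance and the patch size.

```python
import numpy as np

def select_non_overlapping(centers, responses, min_dist=25):
    """Greedily keep the strongest centers that are at least min_dist pixels apart."""
    order = np.argsort(responses)[::-1]  # strongest response first (assumption)
    kept = []
    for idx in order:
        r, c = centers[idx]
        if all((r - kr) ** 2 + (c - kc) ** 2 >= min_dist ** 2 for kr, kc in kept):
            kept.append((int(r), int(c)))
    return kept

def extract_patches(image, centers, size=33):
    """Cut size x size patches centered on each kept coordinate."""
    half = size // 2
    patches = []
    for r, c in centers:
        # Assumption: centers closer than half a patch to the border are skipped.
        if half <= r < image.shape[0] - half and half <= c < image.shape[1] - half:
            patches.append(image[r - half:r + half + 1, c - half:c + half + 1])
    return patches

image = np.zeros((100, 100))
centers = np.array([[50, 50], [55, 52], [20, 80], [5, 5]])  # illustrative
responses = np.array([0.9, 0.5, 0.4, 0.3])
kept = select_non_overlapping(centers, responses)
patches = extract_patches(image, kept)
```

In this toy example the candidate at (55, 52) is dropped because it lies within 25 pixels of the stronger candidate at (50, 50), and the candidate at (5, 5) is kept as a centroid but yields no patch because it is too close to the border.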

2.2 Nuclei feature extraction

Transfer learning, which includes fine-tuning, makes training a model quicker and easier than building a model from scratch with random weights. In our work, the pre-trained state-of-the-art (SOTA) models VGG16 [20], ResNet50 [21] and InceptionV3 [22] are used for the feature extraction process.

These pretrained networks are utilized as they have yielded good performance with both natural and medical images. VGG16 models have shown promising results when applied to medical data and are quite simple to implement. Simonyan et al. [20] proposed VGG16, composed of 13 convolutional layers, five max-pooling layers and three fully connected (dense) layers. The convolutional and dense layers involve learnable weights and are considered the learnable layers. In the convolutional layers, the model utilizes filters (kernels) of size 3×3 with Rectified Linear Units (ReLU) as the activation function. The dense layers hold trainable parameters with 4096 nodes. As a network grows deeper for better precision, it becomes tedious to optimize. ResNet50, a 50-layer CNN devised by He et al. [21] to classify the images in the ImageNet dataset, addresses this limitation through its architecture. The input size of images for ResNet is 224×224, with 64 filters of size 7×7 in the first convolutional layer and a stride of 2. This is followed by a max-pooling layer of 3×3 with a stride of 2. The blocks comprise three convolutional layers with residual connections, which add the block input to the output of the convolutional block. The architecture has 1000 neurons in its dense layer for classification. Among the several versions of Inception, InceptionV3 is an optimized model proposed by Szegedy et al. [22] that includes factorized convolutions, smaller convolutions, asymmetric convolutions, an auxiliary classifier and grid size reduction. The 1×1, 3×3 and 5×5 convolutions assist feature pooling to extract the maximum number of features from each convolution layer. However, in all the models, the dense layers are replaced with a Global Average Pooling (GAP) layer, thereby controlling overfitting by reducing the number of network parameters.
Breast histopathology images comprise a variety of textures, shapes, and histology elements including cytoplasm and nuclei. Hence, the nuclei patches are extracted from the images and, in order to extract deep representative features from the patches, an ensemble of three models is employed. The nuclei feature extraction and classification process is illustrated in Fig. 2.

2.3 Nuclei feature fusion and classification

After feature extraction, the feature vectors of all nucleus patches extracted from an image are combined to form a single vector by p-norm pooling [8, 12] using Equation (4). ${fv}_{pool} = {(\frac{1}{n} \sum_{i = 1}^{n} ({fv}_{i}^{p}))}^{\frac{1}{p}}$ (4) where n represents the total number of nuclei patches for each image and fv_i represents the feature vector associated with the i^th patch of the observed sample, with p set to 3.
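Equation (4) can be illustrated with a short NumPy sketch. The function name `p_norm_pool` is hypothetical, and the features are assumed to be non-negative (as is the case after ReLU activations followed by average pooling), so that the p-th root is well defined.

```python
import numpy as np

def p_norm_pool(features, p=3):
    """Pool patch features of shape (n_patches, n_features) into one vector, Eq. (4)."""
    features = np.asarray(features, dtype=np.float64)
    return np.mean(features ** p, axis=0) ** (1.0 / p)

# Two patches with two features each, purely illustrative values.
patch_features = np.array([[1.0, 0.0], [2.0, 4.0]])
pooled = p_norm_pool(patch_features, p=3)
mean_pooled = p_norm_pool(patch_features, p=1)
```

Setting p = 1 recovers plain average pooling, whereas larger values of p weight the strongest patch responses more heavily.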

Several pooling strategies, such as average pooling, square pooling and 3-norm pooling, are tested on the obtained features. In general, the pooling strategy helps to reduce the training complexity when employing the classifier. The pooled feature vectors are then classified using a LightGBM [31] classifier, an open-source framework based on the gradient boosting decision tree (GBDT) algorithm.

3 Experiments and results

3.1 Datasets

The ICIAR 2018 Grand Challenge on Breast Cancer Histology images provides an extended dataset [32]. The distribution of the images is tabulated in Table 1 and sample images are depicted in Fig. 3.

Fig. 3

Illustration of samples from ICIAR 2018 dataset, (a) & (b) belong to non-carcinoma and (c) & (d) to carcinoma classes.

Table 1

ICIAR 2018 dataset

Class	Sub-class	Number of Samples
Non-carcinoma	Normal	100
	Benign	100
Carcinoma	Insitu	100
	Invasive	100

The proposed model is executed on a workstation with a six-core Intel processor at 3.9 GHz, 32 GB of DDR4 RAM and an NVIDIA GeForce RTX GPU with 11 GB of memory. The proposed work is evaluated on ICIAR2018, a benchmark dataset of breast histopathological images. From the ICIAR2018 dataset, a total of 22,105 non-overlapping nuclei patches are extracted, of which 13,185 are carcinoma nuclei patches and 8,920 are non-carcinoma patches. Data augmentation is an indispensable task, in which the patches are rotated in the X-Y plane by $\frac{c π}{4}$, where π is 180° and c ranges from 1 to 7, yielding 7 augmented samples at 45° increments for each patch. Thus, the augmentation of patches increases the training data and helps to reduce overfitting.
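A minimal sketch of the rotation augmentation, assuming NumPy, is given below. It rotates a square patch by multiples of π/4 (c = 1 to 7, which gives the 7 augmented samples per patch stated above) using nearest-neighbor sampling. The function name `rotate_patch`, the nearest-neighbor interpolation and the zero filling of pixels that fall outside the source patch are assumptions; the paper does not state how these are handled.

```python
import numpy as np

def rotate_patch(patch, angle_rad):
    """Rotate a square 2-D patch about its center with nearest-neighbor sampling."""
    size = patch.shape[0]
    center = (size - 1) / 2.0
    rows, cols = np.mgrid[0:size, 0:size]
    y, x = rows - center, cols - center
    # Inverse mapping: for every output pixel, locate the source pixel.
    src_x = np.cos(angle_rad) * x + np.sin(angle_rad) * y + center
    src_y = -np.sin(angle_rad) * x + np.cos(angle_rad) * y + center
    src_r = np.rint(src_y).astype(int)
    src_c = np.rint(src_x).astype(int)
    valid = (src_r >= 0) & (src_r < size) & (src_c >= 0) & (src_c < size)
    out = np.zeros_like(patch)  # assumption: uncovered corners are zero-filled
    out[valid] = patch[src_r[valid], src_c[valid]]
    return out

patch = np.arange(33 * 33, dtype=np.float64).reshape(33, 33)
angles = [c * np.pi / 4 for c in range(1, 8)]  # 45 degrees to 315 degrees
rotated = [rotate_patch(patch, a) for a in angles]
```

Rotations by multiples of 90° are exact permutations of the pixels, whereas the 45° variants leave the patch corners empty, which is one reason such augmented patches may need visual checking.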

3.2 Performance metrics

The performance of the proposed model in classifying the breast histopathology images into four classes (normal, benign, insitu and invasive) is empirically analyzed using metrics such as accuracy, precision, sensitivity, and specificity.

The formulae for the performance metrics are given in Equations (5)–(8).

$Accuracy = \frac{T_{positive} + T_{negative}}{T_{positive} + T_{negative} + F_{positive} + F_{negative}}$ (5)

$Precision = \frac{T_{positive}}{T_{positive} + F_{positive}}$ (6)

$Sensitivity = \frac{T_{positive}}{T_{positive} + F_{negative}}$ (7)

$Specificity = \frac{T_{negative}}{T_{negative} + F_{positive}}$ (8)

where T_positive denotes the number of positive samples correctly classified as positive, T_negative denotes the number of negative samples correctly classified as negative, F_positive denotes the number of samples that are incorrectly classified as positive and F_negative denotes the number of samples that are incorrectly classified as negative.
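The four metrics can be computed directly from the confusion counts, as in the following plain Python sketch. The function name `classification_metrics` and the example counts are hypothetical and are not results from this work; specificity uses the standard definition T_negative / (T_negative + F_positive).

```python
def classification_metrics(tp, tn, fp, fn):
    """Return accuracy, precision, sensitivity and specificity as fractions."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Illustrative counts only.
metrics = classification_metrics(tp=90, tn=85, fp=15, fn=10)
```

For a multi-class problem, the same function can be applied per class in a one-versus-rest manner, treating the class of interest as positive and all remaining classes as negative.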

Finally, the confusion matrix is a contingency table of the actual and the predicted outcomes, which provides a better visualization of the model performance.

3.3 Results of proposed model

The hematoxylin stain normalized image I_h is obtained from the input image for better nuclei extraction and is depicted in Fig. 4.

The edges of the nuclei are highlighted in the obtained images using gamma correction. The nuclei are captured prominently when the images are preprocessed with gamma correction, as illustrated in Fig. 5(a). Using our nuclei blob detection algorithm with non-maximum suppression, the non-overlapping nuclei patches are extracted for feature extraction. The detected blobs are illustrated in Fig. 5(b). Samples of the non-overlapping and overlapping nuclei blobs are shown in Fig. 5(c) and 5(d), respectively.

The features of the nuclei patches are extracted using the pretrained SOTA models InceptionV3, ResNet50 and VGG16 with a transfer learning approach. The fully connected layers are removed from all pre-trained models, and global average pooling is introduced so that images of arbitrary size can be accepted and the average of each feature map is generated. It also provides a feature vector of fixed size, with a length equal to the number of channels of the feature map. Thus, InceptionV3 and ResNet50 each produce a one-dimensional feature vector of length 2048, and VGG16 one of length 1408. The obtained feature descriptors are merged to form a descriptor for each image using the p-norm pooling technique, where p is 3. The data are processed again with five different random seeds, thereby training around 150 gradient boosting models (10 folds × 5 seeds × 3 CNNs).
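The effect of global average pooling and the count of trained models can be checked with a small NumPy sketch. The function name `global_average_pool` and the spatial sizes of the feature maps are hypothetical; only the channel count of 2048 and the 10 × 5 × 3 configuration come from the text.

```python
import numpy as np
from itertools import product

def global_average_pool(feature_map):
    """Average each channel of an (H, W, C) feature map into a length-C vector."""
    return feature_map.mean(axis=(0, 1))

rng = np.random.default_rng(0)
small_map = rng.random((1, 1, 2048))  # hypothetical map from a small input
large_map = rng.random((5, 5, 2048))  # hypothetical map from a larger input
vec_small = global_average_pool(small_map)
vec_large = global_average_pool(large_map)

# One gradient boosting model per (fold, seed, CNN) combination.
runs = list(product(range(10), range(5), ["InceptionV3", "ResNet50", "VGG16"]))
```

Both feature maps yield a vector of length 2048 regardless of their spatial size, which is what allows inputs of arbitrary size, and the enumeration contains 150 combinations.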

Further, a FusedModel is obtained by combining the predictions of the three pre-trained models across all seeds, cross-validated over all 10 folds for multi-class classification. A comparative study of the binary and multi-class classification of the images using the proposed model is carried out.
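The rule used to combine the predictions is not detailed here. One common option, shown purely as an assumption, is to average the class probabilities predicted by the individual models and take the most probable class. The function name `fuse_predictions` and all probability values below are hypothetical.

```python
import numpy as np

def fuse_predictions(prob_list):
    """Average (n_images, n_classes) probability arrays and return the argmax labels."""
    mean_prob = np.mean(np.stack(prob_list), axis=0)
    return mean_prob, mean_prob.argmax(axis=1)

# Illustrative probabilities for two images and four classes.
inception = np.array([[0.6, 0.2, 0.1, 0.1], [0.1, 0.5, 0.3, 0.1]])
resnet = np.array([[0.5, 0.3, 0.1, 0.1], [0.2, 0.2, 0.5, 0.1]])
vgg = np.array([[0.4, 0.4, 0.1, 0.1], [0.1, 0.3, 0.5, 0.1]])
mean_prob, labels = fuse_predictions([inception, resnet, vgg])
```

In this toy example the second image is assigned to the third class even though one model prefers the second class, which shows how averaging can overrule a single model.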

Figures 6 & 7 showcase the trends in the performance of the binary and multi-class classification across all 10 folds.

The confusion matrices for the image-wise binary and multi-class classification are depicted in Fig. 8. The binary classification comprises the carcinoma and non-carcinoma classes, whereas the multi-class classification involves the normal, benign, insitu and invasive classes.

Fig. 6

Performance metrics of binary classification for all 10 folds using cross-validation.

Fig. 7

Performance metrics of multi-class classification for all 10 folds using cross-validation.

Fig. 8

Confusion matrix of binary class and multi-class classification.

The classification accuracy for binary classification is 98.3%. For multi-class classification, the average classification accuracy across all folds is 95.1%.

The performance metrics of the image-based classification for four classes using the fused model are tabulated in Table 2. From the table, it is inferred that the fused model achieved an accuracy of 95.1% and outperformed other state-of-the-art methods. Similarly, the results of the binary classification are tabulated in Table 3. From Tables 2 and 3, the average performance metrics of binary classification are higher than those of multi-class classification. This is due to the indistinguishable features between the normal and the benign classes.

Table 2

Performance metrics (in %) for four class image level classification using proposed method

Classes	Metrics (in %)
	Accuracy	Precision	Sensitivity	Specificity
Normal	92.5	94.6	91.3	97.5
Benign	95.9	95.1	94.6	97.3
Insitu	95.7	95	95.2	97.1
Invasive	95.9	95.3	95.3	97.3
Mean	95.1	95	94	97.3

Table 3

Performance metrics (in %) for two class image level classification using proposed method

Classes	Metrics (in %)
	Accuracy	Precision	Sensitivity	Specificity
Non-carcinoma	98.6	98.6	99	98.2
Carcinoma	98.1	98.3	97.6	98.6
Mean	98.3	98.4	97.6	98.4

It is evident from the multi-class confusion matrix that normal images were labelled as benign. Within the multi-class classification, the insitu and invasive classes showed better performance than the benign and normal classes. However, in binary classification the performance increases, as both the normal and benign classes belong to the same non-carcinoma class.

3.4 Performance comparison of individual pretrained networks and the Fused model

Experiments with the proposed nuclei-based patch extraction model are performed using three different pretrained CNNs in combination with LightGBM. The mean percentages of accuracy, precision, sensitivity, and specificity of the individual models are tabulated in Table 4 and Table 5 for multi-class and binary classification, respectively. When employing the pretrained CNNs on the extracted nuclei patches, InceptionV3 outperformed the other two models with 93.9% and 97.1% accuracy for multi-class and binary classification, respectively. VGG16, which has a simple structure, achieved significant performance in extracting low-level representations. However, with complex medical images, learning the underlying patterns and extracting high-level features needs the utmost care. InceptionV3 uses pointwise convolutions along with different filter sizes in its convolutional layers, which helps it learn the complex patterns of the nuclei patches. ResNet50 achieved better performance than VGG16 and extracted features that contributed to a better classification of the nuclei patches. Hence, the features extracted by all three pretrained models are fused together to achieve improved classification accuracy.

Table 4

Performance comparison of the different pretrained models on image level multi class classification

Models	Metrics (in %)
	Accuracy	Precision	Sensitivity	Specificity
InceptionV3	93.9	94.6	90.9	96.5
ResNet50	91.9	93.5	93.6	95.3
VGG16	90.7	92	92.2	94.1
FusedModel	95.1	95	94	97.3

Table 5

Performance comparison of the different pretrained models on image level binary class classification

Models	Metrics (in %)
	Accuracy	Precision	Sensitivity	Specificity
InceptionV3	97.5	97.9	96.6	97.5
ResNet50	96.9	97.5	95.6	96.3
VGG16	96.5	96.9	95.5	96.1
FusedModel	98.3	98.45	97.6	98.4

Table 6

Comparison of the proposed method with SOTAs on the ICIAR 2018 dataset

Method	Pre-processing	Data augmentation	Patch extraction	Accuracy in %
				Binary classification	Multi-class classification
Alexander Rakhlin et al. [12]	✓	✓	Grid based	93.8	87.5
Kaushiki Roy et al. [14]	✓	✓	Grid based	92.5	90
Awan et al. [15]	✓	✓	Grid based	83	87
Golatkar et al.[16]	✓	✓	Nuclei density based	93	85
Guo et al. [17]	✓	✓	Grid based	x	87.5
Iesmantas et al. [18]	✓	✓	Grid based	x	87
Salvi et al.[7]	✓	✓	Guided nuclei segmentation	84	x
Cao et al. [19]	x	x	Grid based	x	87.1
Proposed	✓	✓	Nuclei based	98.3	95.1

3.5 Impact of various feature pooling approaches on classification accuracy

As InceptionV3 performed better than the other models, it is used to experiment with different feature pooling strategies on the extracted nuclei patches. Figure 9 compares the accuracy of average pooling, square pooling, and 3-norm pooling. The accuracies are plotted as a bar graph, from which it is evident that 3-norm pooling achieved the best accuracy, 93.9%. Thus, p-norm pooling with p = 3 is used for feature fusion across all the experiments. In disease diagnosis, even a slight increase in accuracy is essential. When designing a classification model for breast tumor diagnosis, it is essential to avoid misdiagnosing a benign tumor as malignant or vice versa. Thus, to achieve a better result, it is worth employing feature fusion and using a fused model, despite an acceptable increase in computation time.

Fig. 9

Accuracy comparison of different pooling approaches for binary and multiclass classification.
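
The pooling strategies compared above can be viewed as members of one family. The following is a minimal sketch, assuming that p-norm pooling is the generalized mean (mean of |x|^p, followed by the p-th root) taken over a stack of feature vectors, for example the descriptors of the patches of one image. Under this assumption p = 1 gives average pooling, p = 2 is taken here to correspond to square pooling, and p = 3 gives 3-norm pooling. The function name `p_norm_pool` and the toy values are hypothetical, and whether the original implementation normalizes by the number of vectors is not stated in the text.

```python
import numpy as np

def p_norm_pool(features, p=3.0):
    """Pool a stack of feature vectors into a single vector.

    features: array of shape (n_vectors, n_dims).
    Returns (mean_i |x_i|^p)^(1/p), computed per dimension.
    """
    x = np.abs(np.asarray(features, dtype=float))
    return np.mean(x ** p, axis=0) ** (1.0 / p)

# Two toy 2-dimensional descriptors (illustrative values only).
toy = np.array([[1.0, 2.0],
                [3.0, 2.0]])
avg_pooled = p_norm_pool(toy, p=1)   # average pooling
sq_pooled = p_norm_pool(toy, p=2)    # root-mean-square ("square") pooling
p3_pooled = p_norm_pool(toy, p=3)    # 3-norm pooling
print(avg_pooled, sq_pooled, p3_pooled)
```

As p grows, the pooled value moves from the mean towards the maximum of each dimension, so p = 3 gives more weight to strongly activated features than plain averaging does.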

3.6 Comparison of nuclei-based patch selection and grid-based patch selection

For the grid-based selection method, the images are stain normalized using [11] and preprocessed using gamma correction. Grid-based sampling is then applied and patches of 512×512 are extracted from the images. Approximately 35 overlapping patches are extracted from each image. The data augmentation explained in Section 3.1 is applied to these images. The augmented images are then fed to the pretrained models and features are extracted. The extracted features are fused and classified using the techniques discussed in Section 2.3. Grid-based sampling achieved accuracies of 94.5% and 91.5% for binary and multi-class classification, whereas the nuclei-based patch selection approach using Algorithm 1 and Algorithm 2 improved on these by 3.8 and 3.6 percentage points. The precision obtained by nuclei-based patch extraction for binary and multi-class classification is 98.5% and 95%, which is 4.5 and 3.5 percentage points higher than that of the grid-based models. This shows that the features extracted from the nuclei-based patches contribute more towards the discrimination of the tumor. The processing time for the nuclei-based technique is also observed to be lower than that of the grid-based approach.
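
The grid-based sampling used as the baseline can be sketched as a sliding window. The text gives the patch size (512×512) and the approximate patch count (35) but not the stride or image size, so the sketch below assumes 2048×1536 pixel images and a stride of 256 pixels (50% overlap). Both values are assumptions, chosen because they yield 7 × 5 = 35 window positions. The function names are hypothetical.

```python
import numpy as np

def grid_patch_origins(width, height, patch=512, stride=256):
    """Top-left (x, y) corners of all full patches on a regular grid."""
    xs = range(0, width - patch + 1, stride)
    ys = range(0, height - patch + 1, stride)
    return [(x, y) for y in ys for x in xs]

def extract_grid_patches(image, patch=512, stride=256):
    """Slice an H x W x C image array into overlapping square patches."""
    height, width = image.shape[:2]
    return [image[y:y + patch, x:x + patch]
            for x, y in grid_patch_origins(width, height, patch, stride)]

# Assumed image size of 2048 x 1536 pixels; a blank array stands in for a slide image.
origins = grid_patch_origins(2048, 1536)
dummy_image = np.zeros((1536, 2048, 3), dtype=np.uint8)
patches = extract_grid_patches(dummy_image)
print(len(origins), patches[0].shape)
```

Unlike the nuclei-based selection, this window is placed without regard to image content, so patches may cover regions with few or no nuclei.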

4 Comparison of proposed method with SOTAs on ICIAR 2018 dataset

The proposed approach is compared with the existing state-of-the-art (SOTA) methods in Table 6. From the table, it is evident that the majority of the works have used grid-based patch extraction. Amongst the existing works, the highest image-level accuracy is 93.8% for binary classification [12] and 90% for multi-class classification [14]. In [7], the authors employed a smart patch technique, choosing patches with a high density of nuclei, and achieved an accuracy of 84% for binary classification. The proposed model outperforms the existing works with an accuracy of 98.3% for binary and 95.1% for multi-class classification, indicating that feature extraction from the nuclei-based patches is successful.

5 Conclusion

CAD systems play a remarkable role in the accurate and precise diagnosis of medical images. For histopathology images, accurate classification helps to identify the type of cancer. Deep learning automatically infers and learns feature representations for the classification of breast tissues. However, certain constraints have a great impact on the accuracy and on the time taken to process the tissues. To address these issues, this work proposes a novel nuclei-based patch extraction method for accurate automated classification of histopathology images. The diagnosis of cancer and its types is based purely on the characteristics of the nucleus. The highlighted nuclei patches obtained by pre-processing the histopathological images are fed to the pre-trained models for the extraction of discriminative features. LightGBM is applied to the features obtained from three different pre-trained models and achieves an accuracy of 98.3% for binary and 95.1% for multi-class classification. In future, the model can be evaluated on different tumor histopathological datasets. With the advancement of computational resources and deep learning techniques, data augmentation of the images can be achieved using Generative Adversarial Networks (GANs), and the work can be extended to study the impact of attention models on these histopathological images.

References

1. World Health Organization. Available: https://www.who.int/news-room/fact-sheets/detail/breast-cancer.

2. Mahmood, Li, Pei, Akhtar, Imran and Rehman K.U., A brief survey on breast cancer diagnostic with deep learning schemes using multi-image modalities, IEEE Access 8 (2020), 165779–165809. doi: 10.1109/ACCESS.2020.3021343.

3. Zhang Y.J., Wei, Li, Zheng Y.Q. and Li X.R., Status quo and development trend of breast biopsy technology, Gland Surgery 2(1) (2013), 15. doi: 10.3978/j.issn.2227-684X.2013.02.01.

4. Deniz, Sengür, Kadiroğlu, Guo, Bajaj and Budak, Transfer learning based histopathologic image classification for breast cancer detection, Health Information Science and Systems 6(1) (2018), 1–7. doi: 10.1007/s13755-018-0057-x.

5. Saxena and Gyanchandani, Machine learning methods for computer-aided breast cancer diagnosis using histopathology: a narrative review, Journal of Medical Imaging and Radiation Sciences 51(1) (2020), 182–193. doi: 10.1016/j.jmir.2019.11.001.

6. Salvi, Michielli and Molinari, Stain Color Adaptive Normalization (SCAN) algorithm: separation and standardization of histological stains in digital pathology, Computer Methods and Programs in Biomedicine 193 (2020), 105506. doi: 10.1016/j.cmpb.2020.105506.

7. Salvi, Molinari, Acharya U.R., Molinaro and Meiburger K.M., Impact of stain normalization and patch selection on the performance of convolutional neural networks in histological breast and prostate cancer classification, Computer Methods and Programs in Biomedicine Update (2021), 100004. doi: 10.1016/j.cmpbup.2021.100004.

8. Li, Wu and Wu, Classification of breast cancer histology images using multi-size and discriminative patches based on deep learning, IEEE Access 7 (2019), 21400–21408. doi: 10.1109/ACCESS.2019.2898044.

9. Acharya, Alsadoon, Prasad P.W.C., Abdullah and Deva, Deep convolutional network for breast cancer classification: enhanced loss function (ELF), The Journal of Supercomputing 76(11) (2020), 8548–8565. doi: 10.1007/s11227-020-03157-6.

10. Alom M.Z., Yakopcic, Nasrin, Taha T.M. and Asari V.K., Breast cancer classification from histopathological images with inception recurrent residual convolutional neural network, Journal of Digital Imaging 32(4) (2019), 605–617. doi: 10.1007/s10278-019-00182-7.

11. Macenko, Niethammer, Marron J.S., Borland, Woosley J.T., Guan and Thomas N.E., A method for normalizing histology slides for quantitative analysis. In 2009 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, IEEE (2009, June), pp. 1107–1110. doi: 10.1109/ISBI.2009.5193250.

12. Rakhlin, Shvets, Iglovikov and Kalinin A.A., Deep convolutional neural networks for breast cancer histology image analysis. In International Conference Image Analysis and Recognition, Springer, Cham (2018, June), pp. 737–744. doi: 10.1101/259911.

13. Araújo, Aresta, Castro, Rouco, Aguiar, Eloy and Campilho, Classification of breast cancer histology images using convolutional neural networks, PloS One 12(6) (2017), e0177544. doi: 10.1371/journal.pone.0177544.

14. Roy, Banik, Bhattacharjee and Nasipuri, Patch-based system for classification of breast histology images using deep learning, Computerized Medical Imaging and Graphics 71 (2019), 90–103. doi: 10.1016/j.compmedimag.2018.11.003.

15. Awan, Koohbanani N.A., Shaban, Lisowska and Rajpoot, Context-aware learning using transferable features for classification of breast cancer histology images. In International Conference Image Analysis and Recognition, Springer, Cham (2018, June), pp. 788–795. doi: 10.1007/978-3-319-93000-8_89.

16. Golatkar, Anand and Sethi, Classification of breast cancer histology using deep learning. In International Conference Image Analysis and Recognition, Springer, Cham (2018, June), pp. 837–844. doi: 10.1007/978-3-319-93000-8_95.

17. Guo, Dong, Song, Zhu and Liu, Breast cancer histology image classification based on deep neural networks. In International Conference Image Analysis and Recognition, Springer, Cham (2018, June), pp. 827–836. doi: 10.1007/978-3-319-93000-8_94.

18. Iesmantas and Alzbutas, Convolutional capsule network for classification of breast cancer histology images. In International Conference Image Analysis and Recognition, Springer, Cham (2018, June), pp. 853–860. doi: 10.1007/978-3-319-93000-8_97.

19. Cao, Bernard, Heutte and Sabourin, Improve the performance of transfer learning without fine-tuning using dissimilarity-based multi-view learning for breast cancer histology images. In International Conference Image Analysis and Recognition, Springer, Cham (2018, June), pp. 779–787. doi: 10.1007/978-3-319-93000-8_88.

20. Elmuogy, Noha Hikal and Hassan, An efficient technique for CT scan images classification of COVID-19, Journal of Intelligent & Fuzzy Systems 40(3) (2021), 5225–5238. doi: 10.3233/JIFS-201985.

21. Dhivya, Mohanavalli, Karthika, Shivani and Mageswari, GAN based data augmentation for enhanced tumor classification. In 2020 4th International Conference on Computer, Communication and Signal Processing (ICCCSP), IEEE (2020, September), pp. 1–5. doi: 10.1109/ICCCSP49186.2020.9315189.

22. Sierra-Enriquez E.E., José Valdez-Rodríguez, Edgardo Felipe-Riveró and Calvo, Classification and enhancement of invasive ductal carcinoma samples using convolutional neural networks, Journal of Intelligent & Fuzzy Systems Preprint (2022), 1–9. doi: 10.3233/JIFS-219250.

23. Simonyan and Zisserman, Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014).

24. He, Zhang, Ren and Sun, Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016), pp. 770–778.

25. Szegedy, Liu, Jia, Sermanet, Reed, Anguelov and Rabinovich, Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2015), pp. 1–9. doi: 10.1109/CVPR.2015.7298594.

26. Kumar and Batra, Classification of Invasive Ductal Carcinoma from histopathology breast cancer images using Stacked Generalized Ensemble, Journal of Intelligent & Fuzzy Systems 40(3) (2021), 4919–4934. doi: 10.3233/JIFS-201702.

27. Abbasniya M.R., Ali Sheikholeslamzadeh, Nasiri and Emami, Classification of Breast Tumors Based on Histopathology Images Using Deep Features and Ensemble of Gradient Boosting Methods, Computers and Electrical Engineering 103 (2022), 108382. doi: 10.1016/j.compeleceng.2022.108382.

28. Sethy P.K., Pandey, Khan M.R., Behera S.K., Vijaykumar and Panigrahi, A cost-effective computer-vision based breast cancer diagnosis, Journal of Intelligent & Fuzzy Systems 41(5) (2021), 5253–5263. doi: 10.3233/JIFS-189848.

29. Dhivya, Anjali R.J., Mohanavalli, Sripriya and Srinivasan, Investigations of Shallow and Deep Learning Algorithms for Tumor Detection. In 2020 IEEE-HYDCON, IEEE (2020, September), pp. 1–5. doi: 10.1109/HYDCON48903.2020.9242888.

30. Lowe D.G., Distinctive image features from scale-invariant keypoints, International Journal of Computer Vision 60(2) (2004), 91–110. doi: 10.1023/B:VISI.0000029664.99615.94.

31. Ke, Meng, Finley, Wang, Chen, Ma and Liu T.Y., LightGBM: A highly efficient gradient boosting decision tree, Advances in Neural Information Processing Systems 30 (2017).

32. Aresta, Araújo, Kwok, Chennamsetty S.S., Safwan, Alex and Aguiar, BACH: Grand challenge on breast cancer histology images, Medical Image Analysis 56 (2019), 122–139. doi: 10.1016/j.media.2019.05.010.