Deep learning the features maps for automated tumor grading of lung nodule structures using convolutional neural networks

Abstract

Accurately identifying the exact boundary of pulmonary nodules in lung cancer images is one of the most challenging tasks in computer-aided diagnosis (CADx) schemes. Detecting the boundaries of different nodule structures is difficult because the nodules and their surroundings have similar visual characteristics. This study proposes an approach for pulmonary nodule region of interest (NROI) detection and segmentation in Computed Tomography (CT) lung images. The lung nodule CT images are acquired from the Lung Image Database Consortium (LIDC) public repository, which contains 1018 cases. A methodology for automated tumor grading of pulmonary nodules is proposed using a Convolutional Neural Network (CNN). The salient features of benign and malignant nodules from different nodule structures are self-learned automatically and then classified. The stages of the methodology are: 1) pre-processing the image datasets using the discrete wavelet transform (DWT); 2) NROI segmentation; 3) NROI feature extraction using the CNN; 4) nodule classification. The CNN is trained on features self-learned from the NROI, and the nodules are classified as benign or malignant. Analyzing and segregating these extracted features plays a vital role in the correct classification of malignancy levels. The methodology is compared with conventional state-of-the-art methods and traditional hand-crafted methods. A total of 710 pulmonary nodules are used in the study, with 258 benign samples and 452 malignant samples. The CNN behaved consistently, with few false positives, and recorded a classification accuracy of 96.5%, a sensitivity of 96%, a specificity of 96.55% and a highest area under the Receiver Operating Characteristic (ROC) curve of 0.969.

Keywords

Lung cancer, computed tomography (CT), pulmonary nodules, segmentation, feature extraction, Convolutional Neural Network (CNN), discrete wavelet transform (DWT)

1. Introduction

Lung cancer is a major public health challenge worldwide, with a higher mortality rate than other types of cancer. For 2018 in the United States, the American Cancer Society estimated about 234,030 new lung cancer cases and 154,050 deaths [1]. According to these statistics, lung cancer death rates are high compared with other cancers. Early detection of lung cancer can improve the survival rate, but the disease is hard to detect early because symptoms normally appear only in the final stages. Risk factors include tobacco consumption and smoking as well as biological, chemical (exposure to radon gas) and environmental conditions (exposure to secondhand smoke).

Fighting lung cancer at an early stage helps limit cancer growth to a certain extent. One such measure is analyzing pulmonary nodules in the soft lung tissue according to their malignancy levels. Doing so on CT images [2] is challenging because nodule densities may be similar to those of other lung structures, depending on their anatomical properties [3]. Pulmonary nodules are small oval or irregularly shaped growths in the lungs with a diameter of up to 30 mm [4]. Nodules are categorized according to size, shape, location and texture. Examining these nodule structures is demanding and requires expertise to reduce false positives. Therefore, both detecting and diagnosing nodules play vital roles in early lung cancer diagnosis and in increasing survival rates.

Radiologists recommend various imaging modalities for detecting pulmonary nodules [5]. Because of its high sensitivity, Computed Tomography (CT) is the modality radiologists most commonly prefer for categorizing a pulmonary nodule as normal (non-cancerous), benign (low malignancy level) or malignant (high malignancy level). A chest CT scan is composed of a sequence of frames (slices). These images may show signs of different chest diseases such as tuberculosis, atelectasis and lung cancer, so a radiologist has to assess how critical each finding is. Detecting, diagnosing and categorizing the images is a demanding task for radiologists. Hence, computer-aided detection systems were developed to help doctors and radiologists detect and diagnose nodules efficiently and automatically [6].

The scope and application of image processing in computer-aided diagnosis have grown considerably over time in both performance and accuracy. CAD systems analyze lung CT images to detect pulmonary nodules. A typical CAD system examines the chest CT by 1) segmenting the region of interest (ROI) while eliminating the rest, 2) extracting the features of the segmented lung nodule, and 3) classifying the nodules as normal, benign or malignant, reducing false positives while maintaining good sensitivity. Some researchers add a pre-processing step before lung segmentation to enhance the image and make its features more visible in the scan.

The accuracy of pulmonary nodule segmentation directly affects the subsequent analysis. Recent CADx systems have therefore optimized their lung segmentation approaches to improve lung cancer diagnosis across the categorized malignancy levels [7, 8, 9]. Nodule segmentation is a crucial step because the boundaries of different structures are hard to identify when the nodules and their surroundings have similar visual characteristics. For example, juxta-pleural nodules (nodules attached to the chest wall) have intensity values similar to those of the chest wall, and juxta-vascular nodules are attached to blood vessels. Differentiating true nodules from false nodules by their intensity values is therefore difficult.

After the potential candidate nodules are detected, the salient features of the different nodule structures are extracted. Features extracted with traditional hand-crafted methods are categorized as texture (structural) features, morphological features, geometric (density) features and statistical (histogram) features. Existing CADx systems depend on the design of these features. However, extracting them manually is time consuming and complicated [10]. Moreover, the extracted features must be well correlated to obtain the expected performance measurements.

Recently, many researchers have made progress in training and classifying huge datasets using deep learning CNN algorithms [11, 12]. Studies show that deep learning algorithms trained on huge datasets can outperform traditional machine learning strategies [13]. This has led to significant advances in automated self-learning of feature maps with deep learning methods, which have changed the direction of medical imaging in recent years [14]. However, hyper-parameters have to be defined explicitly for the neural network to capture the salient features from assorted volumes of CT image sets with input patches of the same size.

After feature extraction, the extracted features are classified based on homogeneity/classes of interest. In this study, the features extracted from the NROI are used to train the network with input patches of the same size and are then tested for nodule classification to eliminate false positives. Other factors that contribute to the system’s performance in detecting lung abnormalities include image acquisition; identification of nodule size, shape and location; image reconstruction; and optimizing the system with a cross-validation process to reduce false positive results.

The methodology presented in this paper focuses on the nodule region of interest segmentation, feature extraction and classification stages. An end-to-end CNN is trained and tested with NROI samples, which are classified as benign or malignant based on tumor patterns. The approach is compared with state-of-the-art methods and traditional hand-crafted methods for performance evaluation.

The rest of the paper is organized as follows. Section 2 describes the related work. Section 3 describes the methodology for segmenting and classifying the candidate regions using the features extracted with the CNN and with traditional hand-crafted methods. Section 4 presents the experiments and the evaluation of the proposed method. Section 5 discusses the results obtained.

2. Related work

The proposed methodology aims to segment candidate nodules of irregular structure and analyze their extracted features for better classification according to malignancy level. Nodule classification based on features from either traditional hand-crafted methods or CNN based methods faces the challenges discussed above. In this section, previous work related to the proposed method is organized into two categories according to the nodule detection and classification methods used. Work on traditional segmentation and classification methods is summarized in Table 1, and work on deep learning based methods in Table 2.

2.1 Category 1: Works related to traditional segmentation and classification methods

Ref [15] proposed a novel approach for early lung cancer detection using Genetic K-Nearest Neighbour (K-NN) methods. Combining a genetic algorithm with K-NN classifies the pulmonary nodules quickly, with a classification accuracy of 90%.

Ref [16] implemented a growing neural gas grouping technique to extract the pulmonary structures from the lungs. SVM systems were used to classify the extracted pulmonary structures as nodules or non-nodules. The images used in the study were acquired from the LIDC dataset and comprised 29 exams with 48 nodules. The experiment reported a mean accuracy of 91%, a sensitivity of 85.93% and a specificity of 91%.

Ref [17] proposed a CAD methodology that segments the lung region from each CT slice using a greedy snake algorithm. The nodule regions of interest were extracted using region growing methods. Shape and texture features were extracted from the regions of interest and classified as cancerous or non-cancerous using a radial basis function neural network. A total of 150 test images out of 1564 slices were taken from the LIDC dataset, resulting in an accuracy of 94.44% and a sensitivity of 92.3%.

Ref [18] proposed a novel approach to CAD systems for segmenting and detecting lung nodules in CT images. Intensity thresholding was used to segment the lungs from the computed tomography images. The morphological closing operation was applied to segment the nodules attached to the lung wall (pleura connected). K-means clustering and morphological opening operation were applied to primarily detect and segment potential pulmonary nodules. The segmented nodules were categorized into six groups based on their thickness and connectivity with the lung walls. The LIDC and Mayo hospital dataset used in the study contains 133 different lung nodules identified from 1302 slices by 2 expert radiologists. The system’s sensitivity for small and large nodules is 83.33% and 93.8% respectively. Overall the system sensitivity was 91.65% and accuracy was 96.22%.

Ref [19] proposed an efficient classification model for lung cancer stage diagnosis using Fuzzy C-means (FCM) and SVM classification. A total of 70 image samples from the LIDC image dataset were used in the experiment. As a pre-processing step, a Gaussian filter was used for smoothing and a Gabor filter for enhancing the images. The lung images were segmented using FCM and classified using SVM classifiers with a classification accuracy of 93%.

Ref [20] developed a methodology for segmenting the candidate nodules on the lobes using a self-organizing maps (SOM) based neural network model. Artificial neural networks were used to classify the nodules as benign or malignant. Images were acquired in DICOM format from CT scanners at Abant Izzet Baysal University Medical Faculty. A total of 128 CT scan images from 47 patients, with 76 benign and 52 malignant samples, were used in the experiment, resulting in a classification accuracy of 90.63%, a sensitivity of 92.30% and a specificity of 89.47%.

Ref [21] proposed a novel computer-aided pipeline for early lung cancer diagnosis comprising 4 stages: 1) pre-processing with a lung volume extraction method based on the circular Hough transform; 2) segmentation using self-organizing maps (SOM); 3) reduction of the extracted features using principal component analysis (PCA); 4) classification using probabilistic neural networks. The images used in the study were taken from the LIDC dataset and consisted of 38 images with 26 malignant and 12 benign samples. The experiment achieved a classification accuracy of 95.91%, a specificity of 94.24% and a sensitivity of 97.42%.

Table 1
Works related to traditional segmentation and classification methods

Research papers	Objective	Image database	No. of images	Implementation Key difference	Results
[15]	Nodule segmentation and classification	LIDC	50–100 images	Feature extraction – Gabor filter; Genetic algorithm with KNN classification	Accuracy – 90%
[16]	Nodule classification	LIDC	29 exams; 4949 images; 48 nodules	Growing neural gas (GNG) algorithm; SVM classifier (RBF)	Sensitivity – 85.93%; Specificity – 91%; Accuracy – 91%
[17]	Nodule segmentation and classification	LIDC	1564 slices; 150 test images	Segmentation – Greedy Snake algorithm; Feature extraction – Region growing; Radial basis function neural network classifier	Sensitivity – 92.3%; Accuracy – 94.44%
[18]	Nodule segmentation and classification	LIDC	101 cases consisting of 1302 slices; 133 nodules	Segmentation – Intensity thresholding; K-means clustering; Contour refinement process	Small nodules: Sensitivity – 83.3%; Large nodules: Sensitivity – 93.8%; Accuracy – 96.22%
[19]	Nodule segmentation and classification	LIDC	70 image samples	Gaussian filter – smoothing; Gabor filter – enhancing the images; Segmentation – FCM; SVM classification	Accuracy – 93%
[20]	Nodule segmentation and classification	CT scanners	128 nodules; 76 benign; 52 malignant	Segmentation – Self-Organizing maps (SOM); Classification – Artificial Neural Networks	Accuracy – 90.63%; Sensitivity – 92.30%; Specificity – 89.47%
[21]	Nodule segmentation and classification	LIDC	34 lung nodules; 12 benign; 26 malignant	Pre-processing – LUVEM (circular Hough transform); Segmentation – Self-Organizing maps (SOM); Principal Component Analysis (PCA); Classification – Probabilistic Neural Networks	Accuracy – 95.91%; Sensitivity – 97.42%; Specificity – 94.24%

Table 2

Works related to deep learning methods

Research papers	Objective	Image database	No. of images	Implementation Key difference	Results
[22]	Nodule classification	LIDC	2618 CT slices; 880 LMNs; 495 HMNs	Multi-crop convolutional neural network (MC-CNN)	Accuracy – 87.14%; Sensitivity – 77%; Specificity – 93%; AUC – 0.93
[23]	Nodule detection	LIDC	13668 images	Three multichannel ROI based deep learning algorithms: 1) CNN; 2) DBN; 3) SDAE	AUC: CNN – 0.899; DBN – 0.884; SDAE – 0.852; Traditional CADx – 0.848
[24]	Nodule classification	LIDC	70 images; Normal – 27; Benign – 21; Malignant – 22	Optimal Deep Neural Network (ODNN); Feature extraction – Linear Discriminate Analysis; Classification – Modified gravitational search algorithm	Accuracy – 94.56%; Sensitivity – 96.2%; Specificity – 94.2%
[25]	Nodule detection	LIDC	–	Transfer learning – deep CNN; Feature extraction – VGG-16; Classification – SVM	Sensitivity – 87.2% with 0.39 FPs per scan; 85.4% with 4 FPs per scan
[26]	Nodule detection	LUNA16	888 CT scans; 223 nodules	Nodule detection – Faster region CNN; CNN for false positive reduction and candidate merging; Segmentation – customised fully CNN	Accuracy – 91.4% with 1 FP per scan; Accuracy – 94.6% with 4 FPs per scan
[27]	Nodule classification	LIDC	3286 nodules; 4294 non-nodules	CNN for false positive reduction	Sensitivity – 92.8% with 8 FPs per scan

2.2 Category 2: Works related to deep learning methods

Ref [22] proposed a nodule classification methodology using a deep structured multi-crop convolutional neural network (MC-CNN). Images were acquired from the LIDC-IDRI database; 880 low malignancy nodules and 495 high malignancy nodules (2618 images in total) were used in the study. The experiments resulted in a nodule classification accuracy of 87.14%, a specificity of 93%, a sensitivity of 77% and an AUC score of 0.93.

Ref [23] proposed an end-to-end machine learning algorithm to automatically extract the salient features from CT images for lung cancer detection. A total of 13688 samples from the LIDC datasets were used in the study. The nodules were segmented using the expert radiologists’ annotations and markups. Three multichannel ROI based deep learning algorithms were designed: 1) CNN, 2) deep belief networks (DBN) and 3) stacked denoising autoencoder (SDAE). The methodology recorded an AUC of 0.899 for CNN, 0.884 for DBN, 0.852 for SDAE and 0.848 for traditional CADx.

Ref [24] designed a nodule classification methodology using an Optimal Deep Neural Network (ODNN) and Linear Discriminate Analysis (LDA). The salient features extracted from candidate nodules were optimized using LDA and then classified using a Modified gravitational search algorithm (MGSA). The experiments were conducted on 70 images. All 70 images (Normal – 27, Benign – 21, Malignant – 22) were used for training the network, while 30 images were in turn used for testing (Normal – 8, Benign – 11, Malignant – 11). The methodology achieved an accuracy of 94.56%, a specificity of 94.2% and a sensitivity of 96.2%.

Ref [25] implemented a CADx system based on transfer learning for lung nodule detection. Images were acquired from the LIDC dataset and comprised 700 nodule and 700 non-nodule samples. The database samples were pre-processed and cropped into 224 $\times$ 244 pixel rectangles for further processing. A VGG-16 feature extractor was used to extract the salient features from the input samples, which were then classified using SVM classifiers. The system recorded a sensitivity of 87.2% with 0.39 FPs per scan and 85.4% with 4 FPs per scan.

Ref [26] designed a methodology to segment the nodule contours using an end-to-end fully automated framework. The algorithm involves 3 major phases: 1) candidate nodule detection using Faster regional CNN; 2) false-positive reduction and candidate merging using CNN; 3) candidate nodule segmentation using a customized fully CNN. Images from the LUNA16 dataset, comprising 888 CT scans with 223 nodules, were used for the experiments and resulted in a classification accuracy of 91.4% with 1 false positive per scan and 94.6% with 4 false positives per scan.

Ref [27] proposed a lung nodule detection methodology using a convolutional neural network. Raw image patches from the LIDC datasets were used to train the CNN in order to reduce the complexity of the system. Each CT image was split into several input patches with 3 nodule types and 3 non-nodule types for the experiments. The methodology resulted in a sensitivity of 92.8% with 8 false positives per scan (FPs/scan).

Figure 1.

Architecture of the proposed method.

Figure 2.

Image dataset consisting of a) Lung CT image with Malignancy level 5 nodule b) Ground truth values c) Segmented NROI image.

3. Methodology

The main goal of the study is to design and implement a segmentation process that extracts only the nodule region of interest from different lung nodule structures for automated tumor grading and classifies the nodules according to their malignancy levels. Fig. 1 shows the implementation of the proposed method in 5 phases: data acquisition, pre-processing, NROI segmentation, feature extraction and classification. From the segmented NROI, statistical and texture features are extracted for comparison. The extracted features are analyzed and segregated for correct classification of pulmonary nodules as benign or malignant candidate regions.

3.1 Data acquisition

The Lung Image Database Consortium and Image Database Resource Initiative (LIDC/IDRI) collection is one of the most popular publicly available databases for detecting and diagnosing lung cancer. The LIDC image datasets consist of clinical thoracic computed tomography (CT) scans for lung cancer screening, with lesions annotated by four radiologists in an xml file [28]. Seven academic centers and eight medical imaging companies collaborated to create the LIDC dataset, which consists of 1018 cases with scan slice thickness varying from 1.25 mm to 30 mm and malignancy levels varying from Level 1 to Level 5. The CT images are pre-processed, and the four radiologists’ marked-up annotations and ground truth (GT) values are used to segment the NROI. The annotations in each xml file categorize findings into 3 classes: nodule less than 3 mm, nodule greater than or equal to 3 mm and non-nodule greater than 3 mm. The retrieved images are of size 512 $\times$ 512 pixels. Each annotation is read along with its malignancy level, and its location is retrieved slice by slice, segmented and converted into a TIF image [29] of size 52 $\times$ 52 pixels containing the nodule section for further processing. The extracted NROI are rotated by three different angles (90°, 180° and 270°), forming a training set of 710 samples, each containing 2704 pixels. To distinguish the malignancy levels better and improve efficiency, intermediate samples with level 3 malignancy, non-nodules larger than 3 mm, nodules smaller than 3 mm and nodules with ambiguous ids are excluded from the study. Figure 2 shows an original CT scan image, its corresponding GT values and the extracted nodule ROI rectangle of 52 $\times$ 52 pixels.
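The rotation step described above can be illustrated with a minimal sketch. It assumes NumPy and uses a randomly generated array as a hypothetical stand-in for one segmented NROI patch; the function name `rotated_copies` is illustrative only, and the text does not state whether the unrotated patch is kept alongside the rotated copies.

```python
import numpy as np

PATCH_SIZE = 52  # NROI side length in pixels, as stated in the text


def rotated_copies(patch):
    """Return copies of a square patch rotated by 90, 180 and 270 degrees."""
    patch = np.asarray(patch)
    if patch.shape != (PATCH_SIZE, PATCH_SIZE):
        raise ValueError(f"expected a {PATCH_SIZE}x{PATCH_SIZE} patch, got {patch.shape}")
    # np.rot90 rotates counter-clockwise by k quarter turns.
    return [np.rot90(patch, k) for k in (1, 2, 3)]


# Hypothetical stand-in for one segmented NROI patch (random gray levels).
rng = np.random.default_rng(0)
patch = rng.integers(0, 256, size=(PATCH_SIZE, PATCH_SIZE), dtype=np.uint8)
augmented = rotated_copies(patch)
print(len(augmented), augmented[0].shape, patch.size)
```

Each rotated copy keeps the 52 $\times$ 52 shape, so every sample still contains 52 $\times$ 52 = 2704 pixels.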

Figure 3.

Image samples of 52 $\times$ 52 pixels segmented based on the Nodule region of interest.

3.2 Pre-processing

Pre-processing the CT image prior to nodule detection enhances its quality and reduces the noise artifacts introduced while capturing the image. Image processing techniques used for noise removal and contrast adjustment include wavelet transforms [30], the median filter, the Gaussian filter [31, 32], Laplacian and Sobel filters, contrast stretching (normalization), histogram equalization, CLAHE [33] and the Gabor filter [2]. In this study, the discrete wavelet transform is applied to the CT images to enhance salient features and hidden details at different scales. Low pass and high pass Daubechies filters decompose each image into 4 frequency sub-bands, as represented in Eq. (1). Daubechies filters allow perfect reconstruction of the original signal. These filters help identify sudden changes in intensity in detail with respect to the original image. Finally, all 4 frequency sub-bands are recombined using the inverse DWT, resulting in an enhanced version of the original image.
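The decomposition into 4 sub-bands and the reconstruction by the inverse transform can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: it uses the Haar wavelet (the simplest member of the Daubechies family, db1) because the filter order is not given, it is written in plain NumPy, and the `detail_gain` parameter of `enhance` is a hypothetical enhancement rule since the text does not say how the sub-bands are modified before reconstruction.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)


def haar_dwt2(image):
    """One-level 2-D Haar DWT; returns the sub-bands (LL, LH, HL, HH)."""
    x = np.asarray(image, dtype=np.float64)
    if x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("both image dimensions must be even")
    # Filter along each row: low pass = scaled sum, high pass = scaled difference.
    lo = (x[:, 0::2] + x[:, 1::2]) / SQRT2
    hi = (x[:, 0::2] - x[:, 1::2]) / SQRT2
    # Filter along each column. Naming conventions for the detail bands vary.
    ll = (lo[0::2, :] + lo[1::2, :]) / SQRT2
    lh = (lo[0::2, :] - lo[1::2, :]) / SQRT2
    hl = (hi[0::2, :] + hi[1::2, :]) / SQRT2
    hh = (hi[0::2, :] - hi[1::2, :]) / SQRT2
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuild the image from its four sub-bands."""
    rows, cols = ll.shape
    lo = np.empty((2 * rows, cols))
    hi = np.empty((2 * rows, cols))
    lo[0::2], lo[1::2] = (ll + lh) / SQRT2, (ll - lh) / SQRT2
    hi[0::2], hi[1::2] = (hl + hh) / SQRT2, (hl - hh) / SQRT2
    x = np.empty((2 * rows, 2 * cols))
    x[:, 0::2], x[:, 1::2] = (lo + hi) / SQRT2, (lo - hi) / SQRT2
    return x


def enhance(image, detail_gain=1.0):
    """ASSUMPTION: scale the three detail sub-bands by detail_gain, then reconstruct."""
    ll, lh, hl, hh = haar_dwt2(image)
    return haar_idwt2(ll, detail_gain * lh, detail_gain * hl, detail_gain * hh)


rng = np.random.default_rng(0)
image = rng.random((52, 52))  # hypothetical stand-in for a CT patch
bands = haar_dwt2(image)
restored = haar_idwt2(*bands)
print([b.shape for b in bands], np.allclose(restored, image))
```

With `detail_gain=1.0` the round trip returns the input unchanged, which illustrates the perfect reconstruction property; a gain above 1 would amplify edges and fine detail.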

$\displaystyle\textit{Coef }[a_{n}]=\delta_{a_{n}}$ (1)

3.3 Nodule region of interest (NROI) segmentation

The candidate regions of pulmonary nodules are segmented using the marked-up annotations and their corresponding ground truth values and masks. Each annotation in the xml file is matched to its image slice, and its location on the CT image is traced for candidate nodule region of interest segmentation [29]. Based on the malignancy level, the pixel values of the candidate region are retained using the mask while the rest are set to zero, forming TIFF image samples with a 52 $\times$ 52 pixel region of interest (ROI) rectangular frame, as shown in Fig. 3. All information about the nodule structure, including its shape and size, is preserved by fitting the pulmonary nodule inside the rectangular frame. If a nodule exceeds the 52 $\times$ 52 rectangle, it is down-sampled to fit. Fig. 4 shows a few NROI images and their corresponding cropped CT scans. Overall, a total of 710 pulmonary nodules are used in the study. Malignancy level 1 and 2 samples are combined to form 258 benign samples (low malignancy nodules), and level 4 and 5 samples are combined to form 452 malignant samples (high malignancy nodules).
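A minimal sketch of the masking and framing step described above is given here, assuming NumPy. The function name `extract_nroi`, the centring of the nodule in the frame and the stride-based down-sampling are all assumptions made for illustration; the text does not specify where the nodule is placed or which down-sampling method is used.

```python
import numpy as np

PATCH_SIZE = 52  # side length of the NROI frame, as stated in the text


def extract_nroi(ct_slice, mask, size=PATCH_SIZE):
    """Keep the masked pixels, zero the rest, and place the nodule in a size x size frame."""
    ct_slice = np.asarray(ct_slice)
    mask = np.asarray(mask, dtype=bool)
    if ct_slice.shape != mask.shape:
        raise ValueError("slice and mask must have the same shape")
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        raise ValueError("mask is empty")
    # Bounding box of the annotated nodule.
    r0, r1 = rows.min(), rows.max() + 1
    c0, c1 = cols.min(), cols.max() + 1
    # Retain nodule pixels; everything outside the mask becomes zero.
    crop = np.where(mask[r0:r1, c0:c1], ct_slice[r0:r1, c0:c1], 0)
    # ASSUMPTION: crude down-sampling by integer striding when the nodule is too large.
    step = max(1, -(-max(crop.shape) // size))  # ceiling division
    crop = crop[::step, ::step]
    # ASSUMPTION: centre the nodule inside the zero-padded frame.
    patch = np.zeros((size, size), dtype=ct_slice.dtype)
    top = (size - crop.shape[0]) // 2
    left = (size - crop.shape[1]) // 2
    patch[top:top + crop.shape[0], left:left + crop.shape[1]] = crop
    return patch


# Hypothetical 512 x 512 slice with a 10 x 20 pixel rectangular "nodule".
ct_slice = np.full((512, 512), 100, dtype=np.int16)
mask = np.zeros((512, 512), dtype=bool)
mask[200:210, 300:320] = True
patch = extract_nroi(ct_slice, mask)
print(patch.shape, int(np.count_nonzero(patch)))
```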

Figure 4.

NROI images and their corresponding cropped CT scans.

3.4 Feature extraction

Feature extraction reduces the dimensionality of the input data to a minimal set of features. In this study, salient features are extracted from the segmented NROI images using both traditional hand-crafted and CNN methods; these features play a prominent role in classifying the nodule images into the specified classes. A total of 23 discriminative features are extracted from the traditional hand-crafted statistical and texture descriptors.

3.4.1 Traditional based feature extraction

1.
Statistical features are histogram based features indicating the frequencies of pixels across the 0–255 gray level range of the images. They are extracted for both benign (low malignancy) and malignant (high malignancy) samples. Commonly extracted histogram features are mean, standard deviation, variance, skewness and kurtosis. They are defined as follows, where $P(i,j)$ is the gray level of the pixel at position $(i,j)$ of an $n\times m$ region.

–
Mean: the average gray level of the pixels in a particular region or segmented area. It reflects overall intensity rather than texture.

$\displaystyle EF_{1}=\mu=\frac{1}{mn}\sum\limits_{i=1}^{n}\sum\limits_{j=1}^{m}P(i,j)$ (2)
–
Variance: the fluctuation of the gray levels around the mean gray level. The variance helps distinguish low-contrast profiles by their texture.

$\displaystyle EF_{2}={\sigma}^{2}=\frac{1}{mn}\sum\limits_{i=1}^{n}\sum\limits_{j=1}^{m}{[P(i,j)-\mu]}^{2}$ (3)
–
Standard deviation: the square root of the variance, indicating the image contrast. Images with low contrast have low variance values, whereas images with high contrast have high variance values.

$\displaystyle EF_{3}=\sigma=\sqrt{\frac{1}{mn}\sum\limits_{i=1}^{n}\sum\limits_{j=1}^{m}{[P(i,j)-\mu]}^{2}}$ (4)

Figure 5.
Architecture of the CNN.

–
Skewness: a measure of the asymmetry of the gray level values around the sample mean. The skewness of a histogram may be positive, negative or zero.

$\displaystyle EF_{4}=\frac{1}{mn}\sum\limits_{i=1}^{n}\sum\limits_{j=1}^{m}\left[\frac{P(i,j)-\mu}{\sigma}\right]^{3}$ (5)
–
Kurtosis: a measure of how prone a distribution is to outliers. It describes the shape of the tails of the histogram.

$\displaystyle EF_{5}=\frac{1}{mn}\sum\limits_{i=1}^{n}\sum\limits_{j=1}^{m}\left[\frac{P(i,j)-\mu}{\sigma}\right]^{4}$ (6)
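Eqs. (2) to (6) translate directly into a few array operations. The sketch below assumes NumPy; the function name `histogram_features` and the tiny 2 $\times$ 2 example patch are illustrative only. Kurtosis is computed exactly as in Eq. (6), without subtracting 3.

```python
import numpy as np


def histogram_features(patch):
    """Mean, variance, standard deviation, skewness and kurtosis as in Eqs. (2)-(6)."""
    p = np.asarray(patch, dtype=np.float64)
    mu = p.mean()                      # Eq. (2)
    var = ((p - mu) ** 2).mean()       # Eq. (3), population variance (divide by mn)
    sigma = np.sqrt(var)               # Eq. (4)
    if sigma == 0:
        raise ValueError("constant patch: skewness and kurtosis are undefined")
    z = (p - mu) / sigma
    return {
        "mean": mu,
        "variance": var,
        "std": sigma,
        "skewness": (z ** 3).mean(),   # Eq. (5)
        "kurtosis": (z ** 4).mean(),   # Eq. (6)
    }


# Tiny symmetric example: half the pixels are 1, half are 3.
features = histogram_features([[1, 3], [1, 3]])
print(features)
```

For this symmetric example the mean is 2, the variance and standard deviation are 1, the skewness is 0 and the kurtosis is 1.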

2.
Texture features – The texture of the image is calculated using probability scores. Gray level co-occurrence matrix (GLCM) and wavelet features are extracted for better classification performance and accuracy. GLCM texture features estimate how often pairs of gray level values occur in a given spatial relationship. The GLCM features extracted in the experiment are energy, entropy, correlation, contrast, homogeneity, autocorrelation, cluster prominence, cluster shade, difference entropy, difference variance, dissimilarity, information measure of correlation 1, information measure of correlation 2, inverse difference, maximum probability, sum average, sum entropy, sum of squares variance and sum variance. Among these, energy, entropy, homogeneity, contrast, correlation, sum average and sum entropy play a prominent role in classifying images as benign or malignant. The corresponding equations are given as follows, where $P(i,j)$ denotes the normalized GLCM entry for the gray level pair $(i,j)$.

–
Energy: also known as angular second moment or uniformity. It is the sum of the squared elements of the GLCM and describes the consistency of the gray level distribution, reaching its maximum for a constant image.

$\displaystyle EF_{6}=\sum\limits_{i}\sum\limits_{j}[P(i,j)]^{2}$ (7)
–
Entropy: measures the randomness of the gray level distribution, that is, the amount of information a compression process would have to encode. Images with low entropy exhibit low contrast, whereas images with high entropy exhibit large contrast in pixel values.

$\displaystyle EF_{7}=-\sum\limits_{i}\sum\limits_{j}P(i,j)\log[P(i,j)]$ (8)
–
Homogeneity: measures how close the distribution of the elements in the GLCM is to the GLCM diagonal.
–
Contrast: measures the local variation in the GLCM, weighting each entry by the squared difference between its gray levels.

$\displaystyle EF_{8}=\sum\limits_{i,j}\mid i-j\mid^{2}P(i,j)$ (9)
–
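Correlation: measures the joint probability of occurrence of gray level pairs with linear dependency. Here $\mu_{i}$, $\mu_{j}$ and $\sigma_{i}$, $\sigma_{j}$ are the means and standard deviations of the row and column marginals of $P$.

$\displaystyle EF_{9}=\sum\limits_{i,j}\frac{(i-\mu_{i})(j-\mu_{j})P(i,j)}{\sigma_{i}\sigma_{j}}$ (10)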
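The GLCM construction and the texture measures of Eqs. (7) to (10) can be sketched in NumPy as follows. Several choices here are assumptions, since the text does not state them: a single horizontal offset of one pixel, a symmetric and normalized matrix, and the homogeneity formula $\sum_{i,j}P(i,j)/(1+|i-j|)$, which is a common definition but is not given in the paper. The function names are illustrative.

```python
import numpy as np


def glcm(image, levels, offset=(0, 1)):
    """Normalized, symmetric gray level co-occurrence matrix for one pixel offset."""
    img = np.asarray(image).astype(np.intp)
    if img.min() < 0 or img.max() >= levels:
        raise ValueError("gray levels must lie in [0, levels)")
    dr, dc = offset
    rows, cols = img.shape
    # Reference pixels and their neighbours at the given (row, column) offset.
    ref = img[max(0, -dr):rows - max(0, dr), max(0, -dc):cols - max(0, dc)]
    nbr = img[max(0, dr):rows - max(0, -dr), max(0, dc):cols - max(0, -dc)]
    counts = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(counts, (ref.ravel(), nbr.ravel()), 1.0)
    counts = counts + counts.T  # count every pair in both directions
    return counts / counts.sum()


def glcm_features(p):
    """Energy, entropy, contrast, homogeneity and correlation of a normalized GLCM."""
    i, j = np.indices(p.shape)
    mu_i, mu_j = (i * p).sum(), (j * p).sum()
    sd_i = np.sqrt((((i - mu_i) ** 2) * p).sum())
    sd_j = np.sqrt((((j - mu_j) ** 2) * p).sum())
    nz = p[p > 0]  # skip empty cells so that log(0) never occurs
    return {
        "energy": (p ** 2).sum(),                           # Eq. (7)
        "entropy": -(nz * np.log(nz)).sum(),                # Eq. (8)
        "contrast": (((i - j) ** 2) * p).sum(),             # Eq. (9)
        "homogeneity": (p / (1.0 + np.abs(i - j))).sum(),   # ASSUMED definition
        # Eq. (10); undefined (division by zero) for a constant image.
        "correlation": ((i - mu_i) * (j - mu_j) * p).sum() / (sd_i * sd_j),
    }


# Small 4-level test image.
image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 2, 2, 2],
                  [2, 2, 3, 3]])
p = glcm(image, levels=4)
features = glcm_features(p)
print(np.round(p * 24).astype(int))
print(features)
```

The code indexes gray levels from 0 rather than 1; contrast, homogeneity and correlation depend only on differences, so the shift does not change their values.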

Figure 6.
Learning curve of the proposed methodology using ‘Adam’ optimizer.

3.4.2 CNN based feature extraction

Figure 5 depicts the architecture of the proposed CNN. The full CT scans of 512 $\times$ 512 pixels from the LIDC datasets are too complex for the neural network to be trained on directly. Since the diameter of the candidate nodules ranges from 3 to 30 mm, the candidate nodule regions alone are sufficient for extracting the salient feature maps, and the whole image need not be considered. Hence, a template was created for segmenting only the solitary candidate nodules of varying sizes and shapes. Images from the LIDC datasets are re-sampled to create a 52 $\times$ 52 pixel nodule region of interest (NROI) rectangular frame. This reduces processing time and storage requirements and yields relevant feature maps that make the characteristics captured by the CNN easier to understand. The segmented NROI rectangles are passed as input to the CNN.

The first convolution layer has 12 filters of size 5 $\times$ 5, the second has 10 filters of size 3 $\times$ 3, the third has 8 filters of size 5 $\times$ 5 and the fourth has 6 filters of size 3 $\times$ 3. Each convolution layer is followed by ReLU, max pooling and batch normalization. A stride of 2 $\times$ 2 is used in the experiments to decrease the size of the feature maps and the number of weights in the network. The last layer before the output layer is a fully connected layer that reduces its input to 2 output neurons, to which the softmax non-linearity is applied. Finally, the output layer provides the strength of the network prediction for each class. Each output neuron represents one class, benign or malignant. Dropout is used in the architecture to prevent overfitting. The batch size was set to 100, the learning rate to 0.1 and the number of epochs to 100. The sub-sampling rate was set to 2. To improve training performance, the adaptive moment estimation (Adam) optimizer was used; the corresponding learning curve is depicted in Fig. 6. The stochastic gradient descent with momentum (SGDM) optimizer was also tried in the experiments, but Adam gave better results.
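To make the layer description concrete, the following sketch traces the feature map size and the number of trainable parameters through the four convolution layers. It is a back-of-the-envelope calculation under assumptions the text does not confirm: a single-channel 52 $\times$ 52 input, 'same' padding in every convolution, and the stride of 2 applied in a 2 $\times$ 2 max pooling step after each convolution. Batch normalization and dropout are left out of the parameter count, and the function name `trace_network` is illustrative.

```python
# (kernel size, number of filters) for the four convolution layers, from the text.
CONV_LAYERS = [(5, 12), (3, 10), (5, 8), (3, 6)]


def trace_network(side=52, channels=1, num_classes=2):
    """Return per-layer (kernel, filters, output side, parameters), FC parameters and the total."""
    layers = []
    total = 0
    for kernel, filters in CONV_LAYERS:
        # Weights plus one bias per filter.
        params = kernel * kernel * channels * filters + filters
        # ASSUMPTION: 'same' padding keeps the side length through the convolution;
        # 2x2 max pooling with stride 2 then halves it (rounding down).
        side //= 2
        channels = filters
        total += params
        layers.append((kernel, filters, side, params))
    # Fully connected layer mapping the flattened feature maps to the class scores.
    fc_params = side * side * channels * num_classes + num_classes
    total += fc_params
    return layers, fc_params, total


layers, fc_params, total = trace_network()
for kernel, filters, side, params in layers:
    print(f"conv {kernel}x{kernel}, {filters} filters -> {side}x{side}x{filters}, {params} parameters")
print(f"fully connected -> 2 outputs, {fc_params} parameters; total {total}")
```

Under these assumptions the feature maps shrink from 52 to 26, 13, 6 and finally 3 pixels per side, which shows how the stride keeps the network small. With unpadded convolutions the maps would vanish before the fourth layer, which is why 'same' padding is assumed here.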

Table 3
Segregation of input samples based on the malignancy levels

Testing	Phase 1	Phase 2	Phase 3	Phase 4
No. of images nodule $\geqslant$ 3 mm	5 images	18 images	5 images	12 images	8 images	22 images
Images Moved from	Malign 2	Malign 4	Malign 5	Malign 5	Malign 1	Malign 2
Images Moved to	Malign 4	Malign 2	Malign 2	Malign 2	Malign 4	Malign 4
Result	93.46	94.66	95.01	96.45

Figure 7.

Nodule diagnosis by 4 expert radiologists.

3.5 Classification

For the traditional feature extraction methods, support vector machine (SVM) classifiers are used to distinguish the nodules as benign or malignant based on their malignancy suspiciousness levels. SVMs are straightforward supervised learning algorithms for classification. The re-sampled 52 $\times$ 52 rectangle images from the LIDC dataset were analyzed with the existing SVM classifiers of the Matlab tool. The SVM classifier uses a radial basis function (RBF) kernel, given in Eq. (11), and standardizes the input data for classifying the lung nodules as benign or malignant. Based on the classifier, the images were classified into levels. Malignancy level 1 and 2 samples are combined to form the benign samples, labeled ‘Nodules greater than 3 mm with malignancy level 1 and 2’. Malignancy level 4 and 5 samples are combined to form the malignant samples, labeled ‘Nodules greater than 3 mm with malignancy level 4 and 5’. A range of values was calculated from the benign and malignant features for classifying the images according to levels of malignancy based on their corresponding probability scores. Using the posterior probability scores, the receiver operating characteristic (ROC) curve is plotted and the area under it (AUC) is computed. The results exhibit high accuracy with few false positives in categorizing cancer levels, which supports better treatment procedures.

$\displaystyle K(p,p^{\prime})=\exp\left(-\frac{\parallel p-p^{\prime}\parallel^{2}}{2\sigma^{2}}\right)$ (11)

where $\parallel p-p^{\prime}\parallel^{2}$ is the squared Euclidean distance between two pixel values.
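As a small worked illustration of Eq. (11), the kernel can be evaluated directly on two vectors. The function name `rbf_kernel` and the example values are illustrative only; the value of $\sigma$ used in the study is not stated, so it is left as a parameter.

```python
import math


def rbf_kernel(p, q, sigma=1.0):
    """Radial basis function kernel of Eq. (11) for two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(p, q))  # squared Euclidean distance
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


# Identical inputs give the maximum similarity of 1.
print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))
# Distance 5 with sigma = 5 gives exp(-25 / 50) = exp(-0.5).
print(rbf_kernel([0.0, 0.0], [3.0, 4.0], sigma=5.0))
```

The kernel decays from 1 towards 0 as the two inputs move apart, and a larger $\sigma$ makes that decay slower.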

Table 4

DCNN architecture performance measure using Adam optimizer

# of layers	Architecture	Alpha	Kernel size	Accuracy
				Epoch 20	Epoch 30	Epoch 40	Epoch 50	Epoch 100
8	8, 6, 4	0.1	5, 5, 5	0.913	0.919	0.919	0.925	0.934
10	12, 8, 6, 4	0.1	5, 5, 5, 5	0.940	0.941	0.945	0.949	0.948
10	12, 8, 6, 4	0.1	5, 3, 5, 3	0.950	0.952	0.958	0.965	0.9648
12	16, 12, 8, 6, 4	0.1	5, 5, 5, 5, 5	0.934	0.938	0.938	0.934	0.939
12	16, 12, 8, 6, 4	0.1	5, 5, 5, 3, 3	0.939	0.940	0.939	0.938	0.942

Table 5

DCNN architecture performance measure using Sgdm optimizer

# of layers	Architecture	Alpha	Kernel size	Accuracy
				Epoch 20	Epoch 30	Epoch 40	Epoch 50	Epoch 100
8	8, 6, 4	0.1	5, 5, 5	0.881	0.890	0.896	0.885	0.889
10	12, 8, 6, 4	0.1	5, 5, 5, 5	0.930	0.921	0.926	0.933	0.933
10	12, 8, 6, 4	0.1	5, 3, 5, 3	0.919	0.931	0.924	0.935	0.939
12	16, 12, 8, 6, 4	0.1	5, 5, 5, 5, 5	0.915	0.905	0.912	0.904	0.925
12	16, 12, 8, 6, 4	0.1	5, 5, 5, 3, 3	0.921	0.914	0.923	0.918	0.928

Figure 8.

Images false positively accepted as high malignancy nodules.

Figure 9.

Images false positively accepted as low malignancy nodules.

4. Experiments and evaluation

The proposed methodology was evaluated on the segmented NROI samples, resulting in a classification accuracy of 93.46%. To improve on this, we investigated the inconsistencies that arose in the nodule classification. The hand-crafted statistical and texture features were observed to have overlapping values. A study was carried out to identify the inconsistencies among candidate regions with different malignancy suspiciousness. As the candidate nodules are segmented using the marked-up annotations and their corresponding ground truth values and masks, each XML file associated with the CT scans was parsed for annotations and analyzed to trace the inconsistencies. In some cases, a particular candidate region was assigned different malignancy levels (3, 4, 5) by four different radiologists for the same CT, as shown in Fig. 7. For example, one radiologist rates the candidate region as malignancy level 3 while the other radiologists rate the same candidate region as malignancy level 4 or 5. Because of such inconsistent malignancy suspiciousness ratings, segmented images were wrongly categorized as benign or malignant samples, which lowered overall performance and produced false positive classifications of candidate nodule regions.
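The kind of disagreement described above can be flagged automatically before any manual review. The sketch below is only an illustration: it uses the benign (levels 1 and 2) and malignant (levels 4 and 5) grouping from Section 3.5, treats level 3 as indeterminate (an assumption), and the names `label_from_level` and `review_ratings` are hypothetical. It does not reproduce the authors' manual regrouping procedure.

```python
def label_from_level(level):
    """Map a malignancy level (1-5) to a binary class, or None for level 3."""
    if level in (1, 2):
        return "benign"
    if level in (4, 5):
        return "malignant"
    if level == 3:
        return None  # assumption: level 3 is treated as indeterminate
    raise ValueError(f"malignancy level must be 1-5, got {level}")


def review_ratings(ratings):
    """Return the agreed class, or flag the region when radiologists disagree."""
    labels = {label_from_level(r) for r in ratings}
    consistent = len(labels) == 1 and None not in labels
    return {"label": labels.pop() if consistent else None,
            "needs_review": not consistent}


# The case from the text: one reader gives level 3, the others give 4 or 5.
print(review_ratings([3, 4, 5, 5]))
print(review_ratings([4, 5, 4, 5]))
```

Regions flagged with `needs_review` are the candidates whose annotations would then be traced back and re-checked by hand.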

From a detailed study, the images with overlapping feature values were identified and traced back to re-check the corresponding radiologists’ evaluations. Some candidate regions were falsely accepted as nodules by radiologists (for example, images with ribs, vessels or diaphragm). A number of candidate nodule regions were accepted as high malignancy samples although they should have been considered low malignancy samples, as depicted in Fig. 8. The extracted features of these images have values that fall within the range characteristic of benign samples. Similarly, Fig. 9 shows some candidate regions accepted as low malignancy samples that should, with high probability, have been considered high malignancy samples. Likewise, the extracted features of these images have values that fall within the range characteristic of malignant samples. By manually segregating the samples according to their malignancy levels, a total of 70 samples were identified as wrongly categorized because of their ambiguous malignancy suspiciousness, and were tested in phases as shown in Table 3. These candidate nodules greater than 3 mm with incorrect malignancy levels were manually regrouped by placing each candidate nodule into its corresponding folder, forming a nodule grouped dataset.

Figure 10.

Proposed methodology tested using ‘Adam’ and ‘Sgdm’ optimizer.

Table 6

Comparison of proposed CNN with traditional CADx

Methodology	Actual ROI images	Nodule grouped images
Proposed CNN	93.46%	96.48%
Traditional hand-crafted method	91.10%	93.9%

Table 7

Comparison of results with works related to traditional methods

Works	No. of images	No. of nodules	Accuracy	Specificity	Sensitivity
[16]	29 exams	48 nodules	91%	91%	85.93%
	4949 images
[17]	1564 slices	–		–	92.3%
	150 test images
[15]	50–100 images	–	90%	–	–
[18]	101 cases	133 nodules	96.22%	–	For small nodules $\geqslant$ 83.3%
	1302 slices				For large nodules $\geqslant$ 93.8%
[19]	70 images	–	93%	–	–
[20]	–	128 nodules	90.63%	89.47%	92.30%
		76 benign
		52 malignant
[21]	–	34 lung nodules	95.91%	94.24%	97.42%
		12 benign
		26 malignant
Proposed methodology	710 images	258 benign	96.48%	96.55%	96%
		452 malignant

Table 8

Comparison of results with works related to state-of-art methods

Works	Dataset	No. of nodules	Accuracy	Specificity	Sensitivity	ROC
[22]	LIDC	2618 CT slices	87.14%	93%	77%	0.93
		880 – LMNs
		495 – HMNs
[23]	LIDC	13668 images	89.9%	–	–	–
[24]	LIDC	70 images	94.56%	94.2%	96.2%	–
		Normal – 27
		Benign – 21
		Malignant – 22
[25]	LIDC	–	–	–	87.2% with 0.39 FPs	–
					85.4% with 4 FPs ps
[26]	LUNA 16	888 CT scans	91.4 with 1 FP’s	–	–	–
		223 nodules	94.6 with 4 FP’s
[27]	LIDC	3286 – nodules	–	–	92.8% with 8 FP	–
		4294 – non-nodules
Proposed methodology	LIDC	258 benign	96.48%	96.55%	96%	0.969
		452 malignant

The CNN architecture was tested with different configurations of convolution layers, filter sizes, and epochs for performance evaluation. The CNN was trained on the nodule grouped dataset with the Adam optimizer and varying input parameters, as shown in Table 4, reaching a highest classification accuracy of 96.48%. The experiments were then repeated with the Sgdm optimizer; the results are given in Table 5, with a maximum classification accuracy of 93.9%. The Adam optimizer exhibited better performance than the Sgdm optimizer, as shown in Fig. 10. Finally, based on the experiments, we fixed the hidden layers’ neurons as [12, 8, 6, 4] with kernel sizes of [5, 3, 5, 3] due to their relative stability while continuing our experiments.
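For readers unfamiliar with the two optimizers compared here, the sketch below shows one parameter update of each in their standard textbook form for a single scalar weight. Only the learning rate of 0.1 comes from the paper. The momentum of 0.9 and the Adam constants (0.9, 0.999, 1e-8) are commonly used defaults and are assumptions, as are the function names.

```python
import math


def sgdm_step(w, grad, velocity, lr=0.1, momentum=0.9):
    """One update of stochastic gradient descent with momentum."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity


def adam_step(w, grad, m, v, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step count used for bias correction."""
    m = beta1 * m + (1 - beta1) * grad        # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)              # bias-corrected second moment
    return w - lr * m_hat / (math.sqrt(v_hat) + eps), m, v


# First step on f(w) = w^2 from w = 1.0, where the gradient is 2.0.
w_sgdm, vel = sgdm_step(1.0, 2.0, 0.0)
w_adam, m, v = adam_step(1.0, 2.0, 0.0, 0.0, t=1)
print(w_sgdm, w_adam)
```

On the first step Adam moves by roughly the learning rate regardless of the gradient magnitude, whereas the sgdm step scales with the gradient. This per-parameter normalization is one commonly cited reason for Adam's more stable behaviour, although the paper itself reports only the empirical comparison.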

5. Results and discussion

A consistent improvement was observed with the proposed methodology on the nodule grouped dataset, which gave the highest classification accuracy of 96.48%, compared with 93.46% on the initial dataset. Table 6 shows the performance of the CNN and the traditional hand-crafted methods on both datasets. The methodology performs better than the traditional hand-crafted methods. In addition, the algorithm outperformed the traditional segmentation and classification methods and the conventional state-of-the-art methods, as shown in Tables 7 and 8 respectively.

Figure 11.

Area under the Receiver Operating Characteristic curve (AUC).

Figure 12.

Confusion matrix of the proposed methodology.

The proposed approach used 710 candidate nodule images, proportionally separated into training and testing sets. The statistical and texture features were extracted from the segmented solitary nodules and classified using an SVM classifier. In addition, end-to-end automated feature learning was employed using the CNN. In general, a few candidate regions may be falsely classified as true nodules if the morphological appearance of irregularly shaped structures is similar to the features of actual nodule structures. Hence identifying and designing a set of features relevant to a particular image dataset for classification is really challenging. The feature design procedure is time-consuming and may not guarantee good results if correlations between features are not properly considered. With a minimal set of extracted feature maps, the approach was efficient in analyzing lung cancers by distinguishing different nodule structures, classified as benign or malignant according to their malignancy levels. The methodology exhibits good results for lung nodule classification compared to the previous works, with an area under the receiver operating characteristic curve (AUC) of 0.969, classification accuracy of 96.5%, sensitivity of 96% and specificity of 96.55%. Figures 11 and 12 depict the corresponding ROC curve and confusion matrix respectively.
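The reported measures follow from the confusion matrix and the posterior scores in the usual way. The sketch below shows the standard definitions. The counts and scores in the example are made-up illustrative values, not the study's data, and the function names are hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),   # true positive rate (malignant recall)
        "specificity": tn / (tn + fp),   # true negative rate (benign recall)
    }


def auc_from_scores(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    labels are 1 (malignant) or 0 (benign); ties between scores count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical counts and posterior scores, for illustration only.
print(classification_metrics(tp=9, fn=1, tn=8, fp=2))
print(auc_from_scores([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

The pairwise form of the AUC is equivalent to the area under the plotted ROC curve and avoids choosing explicit thresholds.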

Generalizing a template for segmenting only the solitary nodules, which vary in size and are irregularly shaped, was also difficult. Therefore all the images in the database were re-sampled to create a 52 $\times$ 52 pixel region of interest (ROI) rectangular frame such that the solitary nodule fits into it, with the pixel values of the candidate region retained using the masks and the rest padded with zeros. If the candidate nodule region exceeds the size of the frame, down-sampling is applied to ensure the nodule fits the frame. Not all candidate nodule regions are down-sampled: small nodules are retained without down-sampling, since it may incur information loss. Hence the NROI rectangle of 52 $\times$ 52 pixels containing the candidate region helps predict the size and shape of tumors, which are essential for lung cancer diagnosis. The approach also improves the ability to identify tumors in internal structure locations (soft tissues, fluids, juxta-pleural, juxta-vascular) with varying sizes and shapes, predicting pulmonary nodules based on their malignancy levels ranging from 1 to 5. Overall, the nodule size, shape and malignancy level must be taken into consideration as important features for classification. The training and testing ratios were varied to check whether the algorithm behaves consistently, as shown in Table 9. The proposed approach shows good performance and is applicable to large datasets. It can also be applied to other datasets in medical imaging for better computational efficiency.

Table 9

Performance evaluation of the proposed method with varying testing datasets

Testing set	No of images	Accuracy
65%	462	95.32%
70%	497	95.61%
75%	532	95.40%
80%	568	96.48%
85%	604	95.80%
90%	639	95.07%

All the above-described methods were run in Matlab 2018b on a desktop machine with 8 GB of memory, a 12-compute-core (4 CPU and 8 GPU) AMD A10 processor and an Nvidia GeForce GTX 960 GPU to analyze the results of the proposed method.

6. Conclusion

The proposed methodology demonstrates consistent results for classifying lung nodules as benign (low malignancy level) or malignant (high malignancy level) samples based on their malignancy levels. The methodology efficiently detects and segments the solitary nodule regions of interest from lung CT images. The NROI rectangle of 52 $\times$ 52 pixels containing the candidate region helps predict the varying shapes and sizes of pulmonary nodule structures efficiently. With end-to-end automated feature learning, the approach was successful in analyzing lung cancers by distinguishing different nodule structures for classification. The initial inconsistencies in the nodule classification discussed in Section 4 were resolved by segregating the samples according to their malignancy levels. The training and testing sets of segregated image samples are classified, generating their corresponding posterior probability scores. Using the posterior probability scores, the receiver operating characteristic (ROC) curve is plotted and the area under it (AUC) is computed. The method exhibits good results in handling the challenging problem of malignancy classification at early stages, with the highest AUC of 0.969 and a classification accuracy of 96.48% among the compared methods, with few false positives.

Future work

Although the current results are encouraging, increasing the number of deep learning layers may improve the performance of detailed volumetric analysis of cancer nodules in medical image analysis.

A detailed investigation of the optimal size of input patches and filter sizes is to be carried out.

References

1. Siegel, Miller, Jemal. Cancer statistics, 2018. CA: A Cancer Journal for Clinicians. 2018; 68(1): 7–30. Available from: doi: 10.3322/caac.21442.

2. Makaju, Prasad, Alsadoon, Singh, Elchouemi. Lung cancer detection using CT scan images. Procedia Computer Science. 2018; 125: 107–114.

3. Xiuhua, Tao, Zhigang, et al. Prediction Models for Malignant Pulmonary Nodules Based-on Texture Features of CT Image. In: Theory and Applications of CT Imaging and Analysis. InTech; 2011.

4. Hansell, Bankier, MacMahon, McLoud, Muller, Remy. Fleischner society: Glossary of terms for thoracic imaging. Radiology. 2008; 246(3): 697–722.

5. Zhao. Early detection of lung cancer: Low-dose computed tomography screening in China. Thoracic Cancer. 2015; 6(4): 385–389.

6. Manikandan, Bharathi. A survey on computer-aided diagnosis systems for lung cancer detection. Int Res J Eng Technol. 2016; 3(5): 1562–70.

7. Lassen, Jacobs, Kuhnigk, van Ginneken, van Rikxoort. Robust semi-automatic segmentation of pulmonary subsolid nodules in chest computed tomography scans. Physics in Medicine & Biology. 2015; 60(3): 1307.

8. Kalpathy-Cramer, Zhao, Goldgof, Wang, Yang, et al. A comparison of lung nodule segmentation algorithms: Methods and results from a multi-institutional study. Journal of Digital Imaging. 2016; 29(4): 476–487.

9. ur Rehman, Javaid, Shah SIA, Gilani, Jamil, Butt. An appraisal of nodules detection techniques for lung cancer in CT images. Biomedical Signal Processing and Control. 2018; 41: 140–151.

10. Roth, Liu, Yao, Seff, Cherry, et al. Improving computer-aided detection using convolutional neural networks and random view aggregation. IEEE Transactions on Medical Imaging. 2016; 35(5): 1170–1181.

11. Tan, Huo, Liang. Apply Convolutional Neural Network to Lung Nodule Detection: Recent Progress and Challenges. In: International Conference on Smart Health. Springer; 2017. pp. 214–222.

12. Zhang, Dong, Chen, Jia, Muhammad, et al. Image based fruit category classification by 13-layer deep convolutional neural network and data augmentation. Multimedia Tools and Applications. 2019; 78(3): 3613–3632.

13. Yapeng, Hongyuan. Benign and Malignant Solitary Pulmonary Nodules Classification Based on CNN and SVM. In: Proceedings of the International Conference on Machine Vision and Applications. ACM; 2018. pp. 46–50.

14. Tajbakhsh, Suzuki. Comparing two classes of end-to-end machine-learning models in lung nodule detection and classification: MTANNs vs. CNNs. Pattern Recognition. 2017; 63: 476–486.

15. Bhuvaneswari, Therese. Detection of cancer in lung with k-nn classification using genetic algorithm. Procedia Materials Science. 2015; 10: 433–440.

16. Netto SMB, Silva, Nunes, Gattass. Automatic segmentation of lung nodules with growing neural gas and support vector machine. Computers in Biology and Medicine. 2012; 42(11): 1110–1121.

17. Elizabeth, Nehemiah, Raj, Kannan. Computer-aided diagnosis of lung cancer based on analysis of the significant slice of chest computed tomography image. IET Image Processing. 2012; 6(6): 697–705.

18. Javaid, Javid, Rehman MZU, Shah SIA. A novel approach to CAD system for the detection of lung nodules in CT images. Computer Methods and Programs in Biomedicine. 2016; 135: 125–139.

19. Kavitha, Shanthini, Sabitha. ECM-CSD: An efficient classification model for cancer stage diagnosis in CT lung images using FCM and SVM techniques. Journal of Medical Systems. 2019; 43(3): 73.

20. Dandıl, Çakiroglu, Ekşi, Özkan, Kurt ÖK, Canan. Artificial neural network-based classification system for lung nodules on computed tomography scans. In: 2014 6th International Conference of Soft Computing and Pattern Recognition (SoCPaR). IEEE; 2014. pp. 382–386.

21. Dandıl. A Computer-Aided Pipeline for Automatic Lung Cancer Classification on Computed Tomography Scans. Journal of Healthcare Engineering. 2018; 2018.

22. Shen, Zhou, Yang, Dong, Yang, et al. Multi-crop convolutional neural networks for lung nodule malignancy suspiciousness classification. Pattern Recognition. 2017; 61: 663–673.

23. Sun, Zheng, Qian. Automatic feature learning using multichannel ROI based on deep structured algorithms for computerized lung cancer diagnosis. Computers in Biology and Medicine. 2017; 89: 530–539.

24. Lakshmanaprabu, Mohanty, Shankar, Arunkumar, Ramirez. Optimal deep learning model for classification of lung cancer on CT images. Future Generation Computer Systems. 2019; 92: 374–382.

25. Shi, Hao, Zhao, Feng, Wang, et al. A deep CNN based transfer learning method for false positive reduction. Multimedia Tools and Applications. 2019; 78(1): 1017–1033.

26. Huang, Sun, Tseng TLB, Qian. Fast and Fully-Automated Detection and Segmentation of Pulmonary Nodules in Thoracic CT Scans Using Deep Convolutional Neural Networks. Computerized Medical Imaging and Graphics. 2019.

27. Wang, Shen, Huang, Sheng. Lung Nodule Detection in CT Images Using a Raw Patch-Based Convolutional Neural Network. Journal of Digital Imaging. 2019; 1–9.

28. Clark, Vendt, Smith, Freymann, Kirby, Koppel, et al. The Cancer Imaging Archive (TCIA): maintaining and operating a public information repository. Journal of Digital Imaging. 2013; 26(6): 1045–1057.

29. Lampert, Stumpf, Gançarski. An empirical study into annotator agreement, ground truth estimation, and algorithm evaluation. IEEE Transactions on Image Processing. 2016; 25(6): 2557–2572.

30. Abbas. Segmentation of differential structures on computed tomography images for diagnosis lung-related diseases. Biomedical Signal Processing and Control. 2017; 33: 325–334.

31. Dai, Dong, Zhang, Chen. A novel approach of lung segmentation on chest CT images using graph cuts. Neurocomputing. 2015; 168: 799–807.

32. Chen, Yao, Chen. A parameterized logarithmic image processing method with Laplacian of Gaussian filtering for lung nodule enhancement in chest radiographs. Medical & Biological Engineering & Computing. 2016; 54(11): 1793–1806.

33. Dhaware, Pise. Lung cancer detection using bayasein classifier and FCM segmentation. In: 2016 International Conference on Automatic Control and Dynamic Optimization Techniques (ICACDOT). IEEE; 2016. pp. 170–174.
