Lung cancer detection based on computed tomography image using convolutional neural networks

Abstract

BACKGROUND:

Lung cancer is the most common type of cancer, accounting for 12.8% of cancer cases worldwide. Because its initial symptoms are non-specific, it is difficult to diagnose in the early stages.

OBJECTIVE:

Image processing techniques developed using machine learning methods have played a crucial role in the development of decision support systems. This study aimed to classify benign and malignant lung lesions with a deep learning approach and convolutional neural networks (CNNs).

METHODS:

The image dataset includes 4459 computed tomography (CT) scans (benign, 2242; malignant, 2217). The study was a retrospective case-control analysis. A method based on the GoogLeNet architecture, one of the deep learning approaches, was used to maximize inference from the images and minimize manual control.

RESULTS:

The dataset used to develop the CNN model was divided into training (3567) and testing (892) datasets. The model’s highest accuracy rate in the training phase was 0.98. Among the accuracy, sensitivity, specificity, positive predictive value, and negative predictive value computed on the testing data, the highest was the positive predictive value, at 0.984.

CONCLUSION:

Deep learning methods are beneficial in the diagnosis and classification of lung cancer through computed tomography images.

Keywords

Lung cancer, deep learning, convolutional neural network, GoogLeNet

1. Introduction

1.1 Lung cancer

Lung cancer is the most common type of cancer, accounting for 12.8% of cancer cases and 17.8% of cancer deaths worldwide [1]. Since lung cancers initially show non-specific symptoms such as fatigue and cough, they are difficult to diagnose in the early stages, and only 10% of patients can be diagnosed at this stage. Since most patients are diagnosed at stages III and IV, the 5-year survival is $<$ 5% [2]. Evaluation of all diagnosed patients showed an average survival of 1 year [3]. Lung cancers are histologically classified into two types: small cell lung cancer and non-small cell lung cancer (NSCLC). NSCLCs constitute 85% of cases, and the most common subtypes are adenocarcinoma (AC) and squamous cell carcinoma (SCC) [4]. Individual and tumor-dependent variables determine survival and treatment plans [5]. TNM staging, which considers tumor size (T), lymph node metastasis (N), and distant metastasis (M), is the most critical prognostic factor in NSCLC cases [5]. In addition, advanced age, comorbid disease, and high lactate dehydrogenase (LDH) are other factors known to negatively affect prognosis [6]. Lung cancers can metastasize by lymphatic and hematogenous routes. The primary metastatic sites are the brain, bone tissue, and adrenal glands. Other organ metastases usually occur at later stages. Metastasis in lung cancer is one of the most crucial poor prognostic factors affecting survival [5].

Diagnostic methods in lung cancer are divided into invasive and noninvasive. Among noninvasive diagnostic methods, sputum cytology is the most easily available examination for a patient with suspected lung cancer; however, its sensitivity in the diagnosis of lung cancer was reported as 58% and its specificity as 98% [7]. Among the imaging methods, chest radiography is the first method of choice. Thoracic computed tomography (CT), used for further examination, is a crucial method in the staging and diagnosis of lung cancer. It shows the metastases of the tumor to the mediastinum and other organs. Its use in combination with positron emission tomography increases the rate of tumor detection and the success rate in distinguishing tumors from atelectasis or consolidation [8]. Positron emission tomography-computed tomography (PET-CT) is a method based on the uptake of radioactive 18F-labeled fluoro-deoxy-glucose by tumor cells. Its value in detecting lung cancer and its sensitivity to distant metastasis foci are high [9]. Among invasive diagnostic methods, bronchoscopy can determine the extent, localization, and staging of the tumor in patients with lung cancer. Pathological examination of the samples taken determines the tumor histology and subtypes. In patients who cannot be diagnosed through bronchoscopy, the diagnosis is made by endobronchial ultrasonography, endoscopic ultrasonography, transthoracic needle biopsy, mediastinoscopy, or surgical biopsy, depending on the lesion site.

In lung cancer, the histopathological type of the tumor; the patient’s age, gender, medical history, and comorbid diseases; pulmonary function; imaging; and laboratory evaluation are the factors that guide treatment. Apart from these, the most crucial factor to consider is staging. As the stage increases, the prognosis worsens. CT is a noninvasive method used in the detection and diagnosis of lung cancer [10]. Thus, rapid and accurate interpretation of thorax CT is important for early diagnosis of the disease, for directing the detected patients to biopsy, and for determining the biopsy site.

Image processing techniques developed using machine learning methods have played a critical role in the development of medical decision support systems. Computer-aided diagnostic methods (CADx) help clinicians make decisions in the healthcare field [11, 12, 13, 14, 15, 16]. Deep learning, which is a machine learning method, is an effective and fast method that involves feature selection, pattern recognition, classification, and regression in big data as a whole [12, 17, 18].

This study aimed to classify benign and malignant lung lesions from CT images with high accuracy through a deep learning approach using the Keras library and convolutional neural networks (CNNs).

2. Materials and methods

This is an observational, quantitative study with a retrospective case-control design. The study falls within the scope of big data analysis. Because the modeling is based on artificial intelligence and statistical significance is not tested, power analysis is not required. R Studio (version 1.1.453) was used for the classification of thorax CT lesions in this study [19]. The Keras library allows users to develop CNN models easily and shields them from the complexity of the underlying low-level libraries. NVIDIA DGX™ Systems, developed to meet CUDA-supported artificial intelligence and analytics demands, were used as GPU hardware. The steps of the proposed method are indicated in Fig. 1.

Figure 1.

The steps of the proposed method.

2.1 Dataset

The image dataset includes 4459 CT scans (benign, 2242; malignant, 2217) taken from 40 patients. In the dataset, lung cancer classification was performed based on different patient groups, including treatment history, smoking habits, age, etc., and thoracic CT screening protocol was applied. To provide high-quality images for screening, a high-resolution and powerful CT scanner was used. CT scans of the patients included in the study were performed with a 16-slice multidetector scanner (Toshiba Alexion™/Advance, Toshiba Medical Systems Corporation Nashu, Japan). The thoracic scan included a broad area covering the lungs and mediastinum, from the lower cervical spine to the upper diaphragm. Typically, thin sections ranging from 1 to 3 mm were used to obtain higher-resolution images. Low-dose CT protocols were employed to minimize patient exposure. The sample dataset is shown in Fig. 2. Relevant scans were obtained from the Chest Diseases Department of Recep Tayyip Erdogan University Training and Research Hospital.

Figure 2.

Sample benign and malignant images.

2.1.1 Data labeling

The process of assigning labels to CT images to determine whether they are cancerous or non-cancerous was carried out under the supervision of specialist doctors in the chest diseases department. The quality, clarity and other features of the images were evaluated and the labelling process was carried out. The selected CT images were divided into two groups, cancerous and non-cancerous. After the control process, cancerous images were labeled “1” and non-cancerous images were labeled “0”.

2.1.2 Data split

Before the data split process, the class imbalance issue was checked to ensure a balanced distribution. The data was then divided into two subsets for training and testing. The dataset was split into 80% for training and 20% for testing. During the training process, model learning and weight updates are performed, while in the testing process, performance measurements on previously unseen data, hyperparameter tuning, and overfitting control of the model are examined.
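The 80/20 split described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `split_dataset`, the fixed seed, and the use of a plain shuffled split are assumptions. Only the class counts (2242 benign, 2217 malignant), the label coding, and the 80/20 ratio come from the text.

```python
import random

def split_dataset(labels, test_fraction=0.2, seed=42):
    """Shuffle sample indices and split them into training and testing subsets."""
    indices = list(range(len(labels)))
    random.Random(seed).shuffle(indices)   # reproducible shuffle (the seed is an assumption)
    n_test = round(len(labels) * test_fraction)
    test_idx = indices[:n_test]
    train_idx = indices[n_test:]
    return train_idx, test_idx

# Label coding from Section 2.1.1: 0 = non-cancerous, 1 = cancerous.
labels = [0] * 2242 + [1] * 2217
train_idx, test_idx = split_dataset(labels)
print(len(train_idx), len(test_idx))
```

With 4459 images, this yields 3567 training and 892 testing samples, the subset sizes reported in the Results section.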

2.2 Data preprocessing

Each image has an original resolution of 1920 $\times$ 1080 and a matrix size of 512 $\times$ 512. The pixel size used in thoracic CT images can vary between 0.5 mm and 1.0 mm. Image augmentation improves the model’s generalization ability by increasing the diversity of samples in the dataset. Since working with as many images as possible in deep learning-based image processing increases the accuracy of the model, modeling was carried out with the maximum number of images. Linear smoothing is the most common category of filtering and is performed by linear filters. A linear filter replaces each pixel with a linear combination of its neighbors, and a convolution kernel defines the weights of that combination. Linear filtering of a signal can be expressed as the convolution

$y(t)=\int_{-\infty}^{\infty}h(r)\,x(t-r)\,dr$ of the input signal $x(t)$ with the impulse response $h(t)$ of the given filter, i.e., the filter response arising from the input of an ideal Dirac impulse. The fastest blur algorithm is the square-kernel linear filter, in which all kernel coefficients are equal, also known as the moving average. The formulation $S[i,j]=\sum_{k=-r}^{r}C[i,j+k]$ shows that the sum $S$ of the elements in a rectangular window can be decomposed into the column sums $C$ of that window [20]. The pseudo-code of the moving average filtering method is given in Fig. 3 [21].

Figure 3.

Pseudo-code flowchart of the moving average filtering method [21].
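The moving average filter with the column-sum decomposition mentioned above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' preprocessing code: the function name `moving_average_filter`, the window radius `r`, the toy 4 $\times$ 4 input, and the choice to compute only the "valid" region (no border padding) are all hypothetical.

```python
def moving_average_filter(image, r=1):
    """Box (moving average) filter over the valid region of a 2D list of numbers.

    Each output pixel is the mean of a (2r+1) x (2r+1) window. The window sum is
    built from column sums C, i.e. S[i, j] = sum of C[i, j+k] for k = -r..r.
    """
    height, width = len(image), len(image[0])
    size = 2 * r + 1
    output = []
    for i in range(r, height - r):
        # C[j]: sum of column j over the rows covered by a window centred on row i.
        col_sums = [sum(image[i + d][j] for d in range(-r, r + 1)) for j in range(width)]
        row = []
        for j in range(r, width - r):
            window_sum = sum(col_sums[j + k] for k in range(-r, r + 1))
            row.append(window_sum / (size * size))
        output.append(row)
    return output

# Toy 4 x 4 "image" (hypothetical values, for illustration only).
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
smoothed = moving_average_filter(image, r=1)
print(smoothed)  # [[6.0, 7.0], [10.0, 11.0]]
```

Reusing the column sums means each window sum needs only 2r+1 additions once the columns are known, which is why the box filter is fast.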

In the dataset, segmentation has not been performed. Instead of directly performing segmentation on the given images, lung cancer detection and classification were carried out by processing the labeled full-sized input images.

2.3 Ethics committee approval

This study was approved by the Non-Interventional Clinical Research Ethics Committee of Recep Tayyip Erdogan University Faculty of Medicine, Rize, Turkey (approval date: August 18, 2022, number: 438) and carried out in accord with the Declaration of Helsinki.

2.4 Object finding/feature extraction

Transfer learning means using a deep learning model that has been pre-trained in another domain in the target domain (lung cancer detection). The pre-trained model is usually trained on a large dataset and has learnt general features. This model can be adjusted or fine-tuned appropriately for lung cancer detection. In this way, it may be possible to successfully extract features of lung lesions with less data.

In this study, GoogLeNet, one of the deep learning methods, was used. It is a multi-layered network inspired by the biological processes of the human brain [22]. GoogLeNet is a classic deep learning framework proposed by Szegedy et al. [23, 24]. Simply making a network deeper to achieve better training performance brings negative effects such as overfitting, vanishing gradients, and exploding gradients. GoogLeNet instead improves training results by using computing resources more efficiently, that is, by extracting more features for the same amount of computation [23, 24]. The GoogLeNet model consists of one or more convolution, subsampling, and feedforward layers [25]. In the study, a pre-trained GoogLeNet model (TensorFlow and Keras) was loaded into the system and appropriate hyperparameters were used. The learning rate of the model was set to 0.001, the batch size to 32, and the number of epochs to 50. ReLU and softmax activation functions were implemented in the convolution and dense layers, respectively. The Adamax optimizer was used as the optimization function [26], with the hyperparameters learning rate (0.001), beta_1 (0.9), and beta_2 (0.999). Average pooling was used in the pooling layers of GoogLeNet. The GoogLeNet system architecture used in the study is indicated in Fig. 4.
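To make the role of the reported optimizer hyperparameters concrete, the Adamax update rule can be written out for a single scalar parameter. This is a sketch for illustration only, not the training code of the study: the function name `adamax_step`, the small constant `eps`, and the toy objective are assumptions, while the learning rate, beta_1, and beta_2 are the values stated above.

```python
def adamax_step(theta, grad, m, u, t, lr=0.001, beta_1=0.9, beta_2=0.999, eps=1e-7):
    """One Adamax update for a scalar parameter (infinity-norm variant of Adam)."""
    m = beta_1 * m + (1.0 - beta_1) * grad   # exponential moving average of gradients
    u = max(beta_2 * u, abs(grad))           # exponentially weighted infinity norm
    # Bias-corrected step; eps (an assumption) guards against division by zero.
    theta = theta - (lr / (1.0 - beta_1 ** t)) * m / (u + eps)
    return theta, m, u

# Toy objective f(theta) = theta^2 with gradient 2 * theta (illustration only).
theta, m, u = 1.0, 0.0, 0.0
for t in range(1, 51):
    theta, m, u = adamax_step(theta, 2.0 * theta, m, u, t)
print(theta)
```

Because the step is normalized by the running maximum of gradient magnitudes, each update moves the parameter by roughly the learning rate at most, which keeps step sizes bounded regardless of the gradient scale.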

Figure 4.

Detailed summary representation of the GoogLeNet architecture.

2.4.1 Convolutional neural network architecture

A convolutional neural network (CNN) is a class of artificial neural networks that uses convolutional layers to filter inputs and extract useful information. The convolution operation combines input data (feature maps) with a convolution kernel to create a transformed feature map. The filters (kernels) in the convolutional layers are learned parameters, adjusted during training to extract the most useful information for a specific task. Applications of CNNs include image processing tasks such as image recognition, image classification, and video tagging, as well as speech and text processing tasks such as speech recognition, natural language processing, and text classification. They are also utilized in state-of-the-art artificial intelligence systems. CNNs consist of an input layer, an output layer, and one or more hidden layers. They are a subclass of neural networks that leverage the spatial structure of inputs and have a standard structure composed of alternating convolutional and pooling layers (usually each pooling layer is placed after a convolutional layer) [27]. The architecture of the CNN is shown in Fig. 5 [28].

Figure 5.

The architecture of the CNN.

The convolutional layer consists of a series of learnable kernels or filters that aim to extract local features from the input. Each kernel is used to compute a feature map. The units of a feature map are connected only to a small region of the input, referred to as the receptive field. A new feature map is typically created by sliding a filter over the input and computing the dot product at each position (the convolution operation), followed by a non-linear activation function that introduces non-linearity into the model. All units within a feature map share the same weights (filter). The advantage of weight sharing is the reduced number of parameters and the ability to detect the same feature regardless of its location in the input [29]. Several non-linear activation functions are available, such as sigmoid, tanh, and ReLU. The size of the output feature map depends on the filter size and the stride: given an input image of size (H $\times$ H), a filter of size (F $\times$ F), and a stride S, the output size (W $\times$ W) is given by [30]:

$\displaystyle W=\lfloor(H-F)/S\rfloor+1$ (1)
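The sliding-window operation and the output size in Eq. (1) can be checked with a small sketch. It assumes a square input and a square kernel, as in the text, and no padding; the function name `conv2d_relu` and the toy 5 $\times$ 5 input and 3 $\times$ 3 kernel are hypothetical and chosen only for illustration.

```python
def conv2d_relu(image, kernel, stride=1):
    """Valid 2D convolution (implemented as cross-correlation) followed by ReLU."""
    H, F = len(image), len(kernel)
    W = (H - F) // stride + 1                      # Eq. (1): output size
    feature_map = []
    for out_i in range(W):
        row = []
        for out_j in range(W):
            i, j = out_i * stride, out_j * stride  # top-left corner of the receptive field
            dot = sum(
                image[i + a][j + b] * kernel[a][b]
                for a in range(F)
                for b in range(F)
            )
            row.append(max(0.0, dot))              # ReLU non-linearity
        feature_map.append(row)
    return feature_map

image = [[float(r * 5 + c) for c in range(5)] for r in range(5)]  # toy 5 x 5 input
kernel = [[-1.0, 0.0, 1.0]] * 3                                   # toy 3 x 3 horizontal-gradient filter
fmap = conv2d_relu(image, kernel, stride=2)
print(len(fmap), len(fmap[0]))  # 2 2
print(fmap)                     # [[6.0, 6.0], [6.0, 6.0]]
```

For H = 5, F = 3, and S = 2, Eq. (1) gives W = 2, which matches the size of the computed feature map. The same kernel weights are reused at every position, which is the weight sharing described above.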

A pooling (downsampling) layer reduces the resolution of the previous feature maps. Pooling provides invariance to small transformations and distortions. It divides the input into disjoint regions of size (R $\times$ R) and produces one output from each region. Pooling can be max-based or average-based. If an input of size (W $\times$ W) is fed into a pooling layer, the output size P is obtained as follows [31]:

$\displaystyle P=\lfloor W/R\rfloor$ (2)
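Pooling over disjoint regions and the output size in Eq. (2) can be illustrated in the same way. The function name `pool2d`, its `mode` argument, and the toy feature map are assumptions made for this sketch; rows and columns that do not fill a complete region are dropped, in line with the floor in Eq. (2).

```python
def pool2d(feature_map, R, mode="max"):
    """Pool disjoint R x R regions of a square feature map (max or average)."""
    W = len(feature_map)
    P = W // R                                     # Eq. (2): output size
    pooled = []
    for out_i in range(P):
        row = []
        for out_j in range(P):
            region = [
                feature_map[out_i * R + a][out_j * R + b]
                for a in range(R)
                for b in range(R)
            ]
            row.append(max(region) if mode == "max" else sum(region) / len(region))
        pooled.append(row)
    return pooled

toy_map = [[float(r * 5 + c) for c in range(5)] for r in range(5)]  # toy 5 x 5 feature map
max_pooled = pool2d(toy_map, R=2, mode="max")
avg_pooled = pool2d(toy_map, R=2, mode="average")
print(max_pooled)  # [[6.0, 8.0], [16.0, 18.0]]
print(avg_pooled)  # [[3.0, 5.0], [13.0, 15.0]]
```

With W = 5 and R = 2, Eq. (2) gives P = 2, so the last row and column of the toy feature map do not contribute to the output.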

The top layers of CNNs are one or more fully connected layers, similar to a feedforward neural network, that aim to extract high-level features from the inputs. The units in these layers are connected to all the hidden units in the previous layer. The last layer is a softmax classifier that predicts the posterior probability of each class label over the K classes, as shown in Eq. (3) [30, 32]:

$\displaystyle y_{i}=\frac{\exp(z_{i})}{\sum_{j=1}^{K}\exp(z_{j})}$ (3)
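As a small worked example, the standard softmax, which divides the exponential of each class score by the sum of exponentials over all K classes, can be computed as below. The function name `softmax` and the two example scores are hypothetical; subtracting the maximum score is a common numerical-stability step and does not change the result.

```python
import math

def softmax(z):
    """Return class probabilities exp(z_i) / sum_j exp(z_j) for a list of scores."""
    z_max = max(z)                            # subtract the max for numerical stability
    exps = [math.exp(v - z_max) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 0.5])                   # hypothetical scores for two classes
print(probs, sum(probs))
```

The outputs are non-negative and sum to 1, so they can be read as posterior probabilities for the benign and malignant classes.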

Figure 6.

The detailed workflow diagram of the study architecture.

2.4.2 Performance metrics

Classification performance on the patient images during the training and testing phases is reported, with 95% confidence intervals, in terms of accuracy, sensitivity, specificity, positive predictive value, and negative predictive value. The following formulas are used to calculate the performance metrics: Accuracy $=$ (TP $+$ TN)/(TP $+$ TN $+$ FP $+$ FN), Sensitivity $=$ TP/(FN $+$ TP), Specificity $=$ TN/(FP $+$ TN), Positive Predictive Value $=$ TP/(TP $+$ FP), and Negative Predictive Value $=$ TN/(TN $+$ FN). TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively. The detailed workflow diagram of the study architecture is presented in Fig. 6.
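The five formulas above translate directly into code. In the sketch below, the function name `classification_metrics` and the confusion-matrix counts are hypothetical and serve only to illustrate the arithmetic; they are not the counts obtained in this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the five metrics defined above from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Hypothetical counts for illustration only; these are not the study's results.
metrics = classification_metrics(tp=90, tn=80, fp=10, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these example counts the accuracy is 0.850, the positive predictive value is 0.900, and the negative predictive value is 0.800.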

3. Results

The dataset used to develop the GoogLeNet model was split into training (3567) and testing (892) datasets. The classification accuracy and loss performance graphs for each epoch created during the training phase of the model with the GoogLeNet algorithm are shown in Fig. 7. The highest accuracy rate of the model in the training phase was 0.98.

Figure 7.

The classification accuracy and loss performance graphs for each epoch.

Figure 8.

Confusion matrix and performance metrics of model.

The confusion matrix and performance metrics of the classification obtained with the test data and the GoogLeNet model are given in Fig. 8. Among the accuracy, sensitivity, specificity, positive predictive value, and negative predictive value, the highest was the positive predictive value, at 0.984.

4. Discussion

This study aimed to classify lung lesions as benign and malignant with high accuracy using a structured GoogLeNet model based on thorax CT images. The use of machine learning methods and image processing techniques in the healthcare field has been gradually increasing. These methods should be continuously developed and updated because large amounts of data significantly affect classification performance. Deep learning approaches such as multi-layer neural networks provide higher classification performance from large data sizes and give better results than classical machine learning methods.

Among recent deep learning studies based on thorax CT images, Tao et al., in a retrospective cohort study that modeled lung nodule growth by predicting future images from follow-up CT scans, evaluated performance in distinguishing lesions by applying a CNN to 246 images of 313 lung lesions with at least one follow-up CT scan. They obtained areas under the ROC curve of 0.857 and 0.843 [33]. In another study, Zhu et al. used a different method based on super-pixel and level set segmentation and showed that the proposed algorithm has high accuracy for lung cancer detection in CT images [15].

Rustam et al. developed a CNN classification model from 400 scanned images of 150 healthy individuals and 250 patients. The model achieved a classification performance of 98.5% [34].

Anthimopoulos et al. used the CNNs model for the classification of interstitial lung diseases in their study; 85.5% classification performance was achieved using a dataset of 14696 image patches from 120 CT scans from different scanners and hospitals [35].

Dansana et al. used VGG on a dataset of 360 images, 295 of which are images, X-ray, and CT scan images. They used a CNN-based method for binary classification of pneumonia with VGG-19, Inception_V2, and decision tree models and achieved 91% classification performance [36].

The distinguishing feature of this study is the classification of thorax CT images with the GoogLeNet architecture. The results showed that the GoogLeNet algorithm achieves high classification performance for benign and malignant lung lesions on thorax CT images. Therefore, the algorithm is recommended for the classification of benign and malignant lung lesions. Additionally, a system was created in which the model can be trained with more patient images and evaluated with various performance metrics.

5. Conclusion

In this study of lung cancer diagnosis and classification based on computed tomography images, we performed analyses using CNNs, one of the deep learning approaches. We passed a dataset of 2242 benign and 2217 malignant images from 40 patients of the Recep Tayyip Erdogan University Chest Diseases clinic through the layers of the CNN algorithm. ReLU and softmax activation functions were implemented in the convolution and dense layers, respectively. The Adamax optimizer was used as the optimization function. We used the moving average method, which is one of the linear smoothing filtering methods. Experimental results with our proposed method showed an accuracy of 0.974, a sensitivity of 0.975, a specificity of 0.973, a positive predictive value of 0.984, and a negative predictive value of 0.959. The results indicate that deep learning methods are beneficial in the diagnosis and classification of lung cancer through computed tomography images and that similar studies can be conducted in the future.

In conclusion, given the successful results obtained with the CNN algorithm in classifying lung lesions (benign/malignant) on thorax CT images, the use of such systems in clinical decision support processes is recommended.

Footnotes

Acknowledgments

The authors have no acknowledgments.

Conflict of interest

The authors have no competing interests to declare.

Funding

None to report.

References

1. Parkin, Pisani, Ferlay. Global cancer statistics. CA: A Cancer Journal for Clinicians. 1999; 49(1): 33-64.

2. Çelik, Çakır, Gülbağcı, Demirci, Varım, Bilir. General characteristic features of our patients with lung cancer diagnosed: Sakarya university medical oncology clinic, 2017–2018 lung cancer statistics. Journal of Human Rhythm. 6(1): 8-14.

3. Mahesh, Archana, Jayaraj, Patil, Chaya, Shashidhar, et al. Factors affecting 30-month survival in lung cancer patients. Indian J Med Res. 2012; 136(4): 614-21.

4. Salehi-Rad, Paul, Dubinett, Liu. The biology of lung cancer: Development of more effective methods for prevention, diagnosis, and treatment. Clinics in Chest Medicine. 2020; 41(1): 25-38.

5. Brundage, Davies, Mackillop. Prognostic factors in non-small cell lung cancer: A decade of progress. Chest. 2002; 122(3): 1037-57.

6. Kaçan, Babacan, Yücel, Kılıçkap, Akkaş, Şeker, et al. The factors effecting survival of stage IV non-small cell lung cancer patients. Cumhuriyet Medical Journal. 2013; 35(3): 332-8.

7. Felten, Knoll, Schikowsky, Das, Feldhaus, Hering, et al. Is it useful to combine sputum cytology and low-dose spiral computed tomography for early detection of lung cancer in formerly asbestos-exposed power industry workers? Journal of Occupational Medicine and Toxicology. 2014; 9(1): 1-9.

8. Ceylan, Dogan, Kocaçelebi, Savas, Çakan, Çagrici. Contrast enhanced CT versus integrated PET-CT in pre-operative nodal staging of non-small cell lung cancer. Diagnostic and Interventional Radiology. 2012; 18(5): 435.

9. Karakaş, Kalemci, Demir, Karakaş, Yilmaz. Comparison of PET/CT and CT in preoperative grading of non-small cell lung cancer. Kocatepe Tıp Dergisi. 2011; 12(1): 23-31.

10. Chaudhry, Gul, Chaudhry. Utility of computed tomography lung cancer screening and the management of computed tomography screen-detected findings. J Thorac Dis. 2018; 10(3): 1352.

11. Halalli, Makandar. Computer aided diagnosis-medical image analysis techniques. Breast imaging. 2018; 85.

12. Hafizović, Čaušević, Deumić, Bećirović, Pokvić, Badnjević, editors. The Use of Artificial Intelligence in Diagnostic Medical Imaging: Systematic Literature Review. 2021 IEEE 21st International Conference on Bioinformatics and Bioengineering (BIBE); IEEE. 2021.

13. Choudhury. Predicting cancer using supervised machine learning: Mesothelioma. Technology and Health Care. 2021; 29: 45-58.

14. Huang, Yue, Wang. Optimal three-dimensional reconstruction for lung cancer tissues. Technol Health Care. 2017; 25(S1): 423-34.

15. Zhu, Pak, Song, Dou, Zhao, Cao, et al. A novel lung cancer detection algorithm for CADs based on SSP and Level Set. Technol Health Care. 2017; 25(S1): 345-55.

16. Badnjevic, Cifrek, editors. Classification of asthma utilizing integrated software suite. 6th European Conference of the International Federation for Medical and Biological Engineering: MBEC 2014, 7–11 September 2014, Dubrovnik, Croatia; Springer. 2015.

17. LeCun, Bengio, Hinton. Deep learning. Nature. 2015; 521(7553): 436-44.

18. Mujkić, Baralić, Ombašić, Bećirović, Pokvić, Badnjević, editors. Machine Intelligence in Biomedical Data Modeling, Processing, and Analysis. 2022 11th Mediterranean Conference on Embedded Computing (MECO); IEEE. 2022.

19. Chang, Cheng, Allaire, Xie, McPherson. shiny: Web Application Framework for R. R package version 1.4.0.2. 2020. URL https://CRAN.R-project.org/package=shiny.

20. Chandel, Gupta. Image filtering algorithms and techniques: A review. International Journal of Advanced Research in Computer Science and Software Engineering. 2013; 3(10).

21. Maleika. Moving average optimization in digital terrain model generation based on test multibeam echosounder data. Geo-Marine Letters. 2015; 35(1): 61-8.

22. Yang, Feng, Chi, Duan, Liu, et al. Deep learning aided decision support for pulmonary nodules diagnosing: A review. J Thorac Dis. 2018; 10(Suppl 7): S867.

23. Szegedy, Ioffe, Vanhoucke, Alemi, editors. Inception-v4, inception-resnet and the impact of residual connections on learning. Thirty-first AAAI conference on artificial intelligence. 2017.

24. Szegedy, Vanhoucke, Ioffe, Shlens, Wojna, editors. Rethinking the inception architecture for computer vision. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.

25. Huang, Wei, Zhang. Deep convolutional neural networks for hyperspectral image classification. Journal of Sensors. 2015; 2015.

26. Kingma. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980. 2014.

27. LeCun, Bottou, Bengio, Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE. 1998; 86(11): 2278-324.

28. Albelwi, Mahmood. A framework for designing the architectures of deep convolutional neural networks. Entropy. 2017; 19(6): 242.

29. Chen, Wang. A gloss composition and context clustering based distributed word sense representation model. Entropy. 2015; 17(9): 6007-24.

30. Nair, Hinton, editors. Rectified linear units improve restricted boltzmann machines. Proceedings of the 27th International Conference on Machine Learning (ICML-10). 2010.

31. Aldhaheri, Lee, editors. Event detection on large social media using temporal analysis. 2017 IEEE 7th Annual Computing and Communication Workshop and Conference (CCWC); IEEE. 2017.

32. Kohavi, editor. A study of cross-validation and bootstrap for accuracy estimation and model selection. Ijcai; 1995: Montreal, Canada.

33. Tao, Zhu, Chen, Yin, Yang, et al. Prediction of future imagery of lung nodule as growth modeling with follow-up computed tomography scans using deep learning: A retrospective cohort study. Translational Lung Cancer Research. 2022; 11(2): 250.

34. Rustam, Hartini, Pratama, Yunus, Hidayat. Analysis of architecture combining convolutional neural network (CNN) and kernel K-means clustering for lung cancer diagnosis. Int J Adv Sci Eng Inf Technol. 2020; 10(3): 1200-6.

35. Anthimopoulos, Christodoulidis, Ebner, Christe, Mougiakakou. Lung pattern classification for interstitial lung diseases using a deep convolutional neural network. IEEE Transactions on Medical Imaging. 2016; 35(5): 1207-16.

36. Dansana, Kumar, Bhattacharjee, Hemanth, Gupta, Khanna, et al. Early diagnosis of COVID-19-affected patients based on X-ray and computed tomography images using deep learning algorithm. Soft Computing. 2020: 1-9.