Abstract
Pneumonia is a disease caused by viruses (such as influenza or respiratory syncytial virus) or bacteria. It can be fatal if not diagnosed and treated at an early stage. Chest X-rays are widely used to diagnose such abnormalities with high accuracy and play a primary role in supporting the real-world diagnostic process. The limited availability of authentic data and of benchmark-based approaches and studies complicates the comparison of methods and the identification of the most reliable recognition method. In this paper, a Dual Deterministic Model (DDM) based on a deep neural network is proposed that identifies Pneumonia from chest X-rays and distinguishes between viral and bacterial infection at an efficiency comparable to that of a practicing radiologist. Automating this task requires a computer-aided system, and the proposed algorithm incorporates deep learning techniques to interpret radiographic images. Evaluation of the proposed algorithm shows that it distinguishes chests infected with Pneumonia from those of healthy individuals with approximately 97.45% accuracy and distinguishes between viral and bacterial infection with an accuracy of 88.41%. With an improved image dataset, the proposed algorithm can help doctors with diagnosis.
Keywords
Introduction
Pneumonia is one of the most frightening and widespread infectious diseases. It can be triggered by bacteria, viruses, or fungi entering the human body and requires a precise diagnosis at an early stage for suitable treatment [1]. In severe cases, it can cause death. To diagnose Pneumonia, doctors and health care specialists use chest X-rays, which have proved to be an excellent imaging modality. Pneumonia can also be detected using other non-contact procedures such as Computed Tomography (CT) and High-Resolution Computed Tomography (HRCT) [2]. However, precise assessment of X-ray images depends on the proficiency and capability of the radiologist [3]. According to an estimate by the World Health Organization (WHO), approximately two-thirds of the world's population lacks access to a radiologist for precise diagnosis [4]. An automated computerized system that evaluates X-ray images to help doctors and healthcare providers with diagnosis can therefore prove very useful. To this end, researchers have adopted image processing algorithms to detect lung abnormalities [5]. For the classification of Pneumonia, a chest X-ray is employed as the imaging modality, and the disease is identified by a trained radiologist. A recent study illustrates the use of a deep learning framework for automated recognition of Pneumonia and other diseases on X-ray images to support health consultants [6]. Furthermore, an immediate diagnosis of the Pneumonia type, along with proper medication, may help save the patient's life. Machine learning-based multi-part solutions have also been proposed for extracting useful features to automate the classification process. As medical science continues to progress, chest X-ray imaging remains the best method for diagnosing Pneumonia in patients [7].
In most cases, X-ray images produced by various means are not very clear and provide noticeably little information, so they are sometimes misdiagnosed; the same applies to Pneumonia categories such as viral or bacterial Pneumonia. As a result of misdiagnosis, the wrong medication may be given to the patient, worsening their condition [8]. Advances in methods allow researchers to evaluate results more precisely. Medical personnel obtain a more detailed view of the image when such algorithms are applied, which influences the correctness of diagnostic outcomes. In third-world countries, many irregularities in the diagnosis of Pneumonia are reported due to the shortage of skilled radiologists [9]. A system that can classify Pneumonia and its type is therefore essential, as it may help radiologists promptly diagnose the type of Pneumonia once an X-ray image has been acquired.
Deep learning models such as CNNs have attracted many researchers to the analysis of medical records because they are well suited to large-scale data analysis. Deep learning models learn features extracted from an image database using multiple layers of filters [10]. Since deep learning was introduced in the medical field, it has become more robust and more widely adopted among researchers because of its automatic diagnostic capability and precision [11]. This is evident from earlier work by Cernazanu and Holban, who used a convolutional neural network to segment chest X-rays into bone and non-bone tissue [12]. The primary purpose of their study was to build an automatic method that separates the bone tissue from the rest of the chest X-ray image using a convolutional neural network. The CNN analyzes the pixels of the image, which are then categorized as bone or non-bone tissue [10, 12]. Biomedical applications such as breast cancer detection, artery clogging, and heart disease also use Artificial Intelligence for diagnosis [13, 14]. Srivastav et al. presented a technique based on a Generative Adversarial Network (GAN) to generate a synthetic image dataset for augmentation and applied transfer learning to improve accuracy on the modified dataset [15]. Researchers have also used the concept of Region of Interest (ROI) to categorize Pneumonia and its type using a chest X-ray image dataset [16]. A similar technique has been applied to the classification of lung cancer, where an image processing technique [17] was introduced to determine whether a cancer is malignant or still at its initial stage.
Deep learning algorithms require an enormous amount of data to learn, and such data are readily accessible for training deep learning models in various applications [18, 19]. Many researchers developing deep learning algorithms for the classification of Pneumonia type have experimented with varying the deep learning parameters; in some cases, pre-trained classifiers are also adapted and applied to classify Pneumonia and its types [20, 21]. Through such advances, properties such as diameter, perimeter, and irregularity were recognized automatically during categorization. Machine learning techniques based on CNNs have been developed to advance image-based diagnosis [22]. To some degree, machine learning algorithms remain vulnerable, because attackers can at any time produce samples that the models have not been trained to handle [23]. Since COVID-19 became a global phenomenon, a few studies have applied Artificial Intelligence to detect and classify it. Pre-trained models such as ResNet50, ResNet101, ResNet152, and Inception-ResNet-V2 are extensively used for COVID-19 detection [24]; that study used four classes, with 341 ‘COVID’, 2800 ‘Normal’, 1493 ‘Viral’, and 2800 ‘Bacterial’ images.
Vikash et al. used transfer learning with models pre-trained on ImageNet to classify Pneumonia and achieved an accuracy of 96.39% [42]. Similarly, Saraiva, Arata Andrade used an Artificial Neural Network to classify Pneumonia and attained an accuracy of 94.5% [46]. Khatri et al. presented a similar technique for identifying the Pneumonia type (viral or bacterial) by incorporating the EMD model and achieved an accuracy of 83.3% [47]. Gu et al. used a custom VGG16 model and divided their work into two parts: a Fully Convolutional Network (FCN) to identify the affected lung region and a Deep Convolutional Neural Network (DCNN) to classify the category of Pneumonia [48].
In this paper, a Dual Deterministic Model (DDM) is proposed to classify the Pneumonia type in two levels. Level 1 uses a Deep Convolutional Neural Network (DCNN) to classify an input X-ray image as either Normal or Pneumonia and achieves an accuracy of 97.45%. Level 2 classification is performed only when Level 1 classifies an X-ray image as Pneumonia. It identifies the Pneumonia type (viral or bacterial) using a custom DenseNet201 model and achieves an accuracy of 88.41%. The model is divided into two levels to achieve better accuracy than the methods of Khatri et al. [47] and Gu et al. [48], in which the final layer of the model has three output classes (Normal, Viral, and Bacterial). The proposed model instead has two classes at each level, and it performed better at both levels.
Deep learning models
Deep learning is a process through which a machine learns and retains a particular kind of information in a way similar to a human being. Deep learning algorithms are used for predictive analysis, which they can automate.
Convolutional Neural Network (CNN)
CNN architecture.
A Convolutional Neural Network (CNN) is a well-known deep learning algorithm that recognizes and classifies elements in images for computer vision [25]. For working with images, a convolutional neural network is the most suitable option. The design is inspired by the visual cortex, in the way the human brain processes an image after seeing it [26]. The brain processes an enormous volume of information as soon as the visual cortex views anything. Each neuron responds to its own receptive field and communicates with other neurons to pass on its specific information, so that together they cover the whole field of view [27, 28]. Neurons in a CNN resemble the neurons in the brain, and both operate on their respective fields of view. Information processing covers the entire image through the nodes, as shown in Fig. 1, which presents the CNN architecture. Each layer is organized to identify patterns such as edges, curves, and peaks using its own kernel matrix [17, 29]. A CNN architecture usually uses three types of layers: convolutional layers, pooling layers, and a fully connected layer. A CNN model may contain several convolutional layers and multiple pooling layers, and the output of each layer is forwarded to the next adjacent layer. Since it is a multilayered model, the convolutional and pooling layers are mainly hidden layers, and the Fully Connected Layer (FCL) performs the classification [16].
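The convolution and pooling steps described above can be illustrated with a minimal pure-Python sketch. It is not the implementation used in this work: the toy image, the 2x2 edge kernel, and the function names are assumptions chosen only to make the arithmetic visible. As in most deep learning libraries, the kernel is applied without flipping.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the
    map of dot products between the kernel and each image window."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Dot product of the kernel with the window whose top-left corner is (r, c).
            total = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            row.append(total)
        output.append(row)
    return output


def relu(feature_map):
    """Rectified Linear Unit: negative responses are set to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    pooled_map = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            row.append(max(
                feature_map[r + i][c + j]
                for i in range(size)
                for j in range(size)
            ))
        pooled_map.append(row)
    return pooled_map


# Hypothetical 5x5 "image" with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical 2x2 kernel that responds to a dark-to-bright vertical edge.
kernel = [[-1, 1],
          [-1, 1]]

features = relu(conv2d_valid(image, kernel))  # 4x4 feature map
pooled = max_pool(features, size=2)           # 2x2 map after pooling
```

In this toy case the feature map is non-zero only in the column where the edge lies, and pooling keeps that response while halving the spatial resolution.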
Feature extraction using kernel matrix in convolutional layer.
The convolutional layer serves as the fundamental building block of the CNN architecture and bears the vast majority of the computational load inside the network. CNNs are feedforward neural networks subject to two constraints: neurons within a receptive field connect to similar features to reduce complexity, and the spatial structure and its weights are preserved. The principal function of the convolutional layer is to calculate the Dot Product (DP) of the kernel matrix [21] and a specific portion of the image field, as shown in Fig. 2, and this portion is denoted as a window of
The CNN is based on a multi-layer neural network developed to evaluate visual inputs and perform tasks such as image classification, segmentation, and object detection, which benefit applications such as autonomous vehicles [30]. In summary, enhancing images through histogram equalization, deep neural networks, and convolutional neural networks helps achieve a better response rate and error; nevertheless, it requires a long training time [31].
where,
To validate the proposed DDM algorithm, three State of the Art (SOTA) networks are used. AlexNet [32] and ResNet [37] are the two pre-trained networks used. A short overview of these two pre-trained networks is given in Sections 2.2.1 and 2.2.2.
AlexNet
AlexNet architecture.
AlexNet is among the most well-known pre-trained classifiers available for classification [33]. It has a class vector of 1000, meaning it can classify 1000 different classes. AlexNet requires a three-channel input image, that is, a Red Green Blue (RGB) image, with a resolution of 227
Residual Network (ResNet) was introduced to address two issues: the degradation problem and the vanishing gradient [36, 37, 38]. As more convolutional layers with activation functions are added, the gradient of the loss function approaches zero, which makes the network particularly difficult to train [38]. ResNet18 is not the only variant of its kind; ResNet50 and ResNet101 are two others. The names of all three are based on the number of layers they contain [39]. ResNet variants are widely used in medical imaging [40, 41] and related classifiers. In this paper, ResNet18 is used to identify Pneumonia for a fair analysis of the proposed model. ResNet learns the residual between a block's input and its desired output rather than learning the feature mapping directly. Figure 4 shows the architecture of ResNet18.
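The residual idea can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: `transform` is a hypothetical stand-in for the stacked convolutional layers of a block, the data are a flat list of numbers, and the activation that ResNet applies after the addition is omitted.

```python
def residual_block(x, transform):
    """Return transform(x) + x, element-wise.

    `transform` stands in for the learned layers F(x); the identity
    shortcut adds the block input back to their output.
    """
    fx = transform(x)
    if len(fx) != len(x):
        raise ValueError("the identity shortcut requires matching shapes")
    return [f + xi for f, xi in zip(fx, x)]


def small_correction(x):
    """Hypothetical stand-in for learned layers: a small change to the input."""
    return [0.5 * v for v in x]


x = [1.0, -2.0, 3.0]
y = residual_block(x, small_correction)

# If the learned layers output zeros, the block reduces to the identity
# mapping, so adding such a block cannot make the representation worse.
identity_out = residual_block(x, lambda v: [0.0] * len(v))
```

Because the shortcut passes the input through unchanged, the gradient also has a direct path back to earlier layers, which is how this design eases the vanishing gradient problem.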
ResNet18 architecture.
A well-trained model gives precise results depending on the data fed to its learning system. In most cases, the collected dataset is not up to the mark and does not capture enough new information for the samples corresponding to each annotation. A few pre-processing steps were therefore included in training, applying transformations such as shifting, zooming, and noise suppression. In this way, the sample information available to the model could be enriched.
Augmented dataset for Pneumonia (row 1) and Normal (row 2) categories.
The rotation operation used for image augmentation rotates the given image clockwise by an angle in the range of 0 to 360 degrees. It moves the image pixels and fills the regions where pixels are missing after the rotation. The scaling operation, another augmentation technique, enlarges or reduces the size of the image; in total, a scaling of 10% is applied. Image translation is performed by shifting the image horizontally, vertically, or in both directions. A few samples of the augmented dataset derived from the same image are displayed in Fig. 5. Table 1 details the X-ray image dataset, in which some images were randomly augmented to enhance their quality, as displayed in Fig. 5. The dataset is divided into four categories for the two classification levels: Normal, Pneumonia, Viral Pneumonia, and Bacterial Pneumonia.
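The three augmentation operations described above (rotation, scaling, and translation) can be sketched with nearest-neighbour resampling in plain Python. This is a minimal illustration, not the augmentation pipeline used in this work; the function names, the fill value of 0 for vacated pixels, and the 3x3 toy image are assumptions.

```python
import math


def _resample(image, inverse_map, fill=0):
    """Build an output image by looking up, for every output pixel, the
    source pixel given by `inverse_map` (coordinates relative to the centre)."""
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    out = [[fill] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            src_x, src_y = inverse_map(c - cx, r - cy)
            src_r, src_c = round(cy + src_y), round(cx + src_x)
            # Pixels that map outside the source image keep the fill value.
            if 0 <= src_r < h and 0 <= src_c < w:
                out[r][c] = image[src_r][src_c]
    return out


def rotate(image, degrees, fill=0):
    """Rotate clockwise about the image centre."""
    t = math.radians(degrees)
    cos_t, sin_t = math.cos(t), math.sin(t)
    return _resample(
        image,
        lambda x, y: (x * cos_t + y * sin_t, -x * sin_t + y * cos_t),
        fill,
    )


def zoom(image, factor, fill=0):
    """Scale about the centre; factor=1.1 corresponds to a 10% enlargement."""
    return _resample(image, lambda x, y: (x / factor, y / factor), fill)


def shift(image, dx=0, dy=0, fill=0):
    """Translate by dx pixels to the right and dy pixels downward."""
    return _resample(image, lambda x, y: (x - dx, y - dy), fill)


# Hypothetical 3x3 image used only to make the pixel movement visible.
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]

rotated = rotate(image, 90)   # quarter turn clockwise
shifted = shift(image, dx=1)  # one pixel to the right, left column filled
unchanged = zoom(image, 1.0)  # a factor of 1 leaves the image as it is
```

In practice, the angle, zoom factor, and shift would be drawn at random for each training image so that the model sees a slightly different version of the same X-ray on every pass.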
Total number of X-ray images used in the dataset
Detecting Pneumonia from chest X-rays with a deep learning algorithm offers a workable approach. The model learns all the fundamental features by itself. Gradient descent is used as the optimization method, adjusting the weights of the neurons (nodes) until a local minimum of the loss function is found. The proposed DDM algorithm is executed on an Intel Core i7 machine with 32 GB of Random-Access Memory (RAM) and an Nvidia GeForce 1660 Super Graphics Processing Unit (GPU). The algorithm is implemented in Python using the Keras library. The DDM model has two levels, each trained for a separate identification task. The first level identifies whether a case is positive or negative for Pneumonia; only if the case is positive does the process move to the second level, which classifies it as Viral or Bacterial Pneumonia. The process of the DDM model is shown as a control flow in Fig. 6. The two classifications are kept separate to gain additional accuracy.
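The two-level control flow can be sketched as follows. This is an illustrative outline only: `level1` and `level2` are placeholders for the trained classifiers, they are assumed here to return a single probability each, and the 0.5 decision threshold and the stub scoring functions are assumptions, not details taken from the trained models.

```python
def ddm_predict(image, level1, level2, threshold=0.5):
    """Two-level decision: Level 2 runs only when Level 1 reports Pneumonia.

    Assumed interfaces (hypothetical):
      level1(image) -> probability that the image shows Pneumonia
      level2(image) -> probability that the Pneumonia is Bacterial
    """
    if level1(image) < threshold:
        return "Normal"
    if level2(image) >= threshold:
        return "Bacterial Pneumonia"
    return "Viral Pneumonia"


# Stub classifiers for demonstration; a real system would call the trained
# Level 1 network and the customized DenseNet201 here instead.
level2_calls = []


def stub_level1(image):
    return sum(image) / len(image)  # fake "Pneumonia score"


def stub_level2(image):
    level2_calls.append(image)      # record that Level 2 was reached
    return image[0]                 # fake "Bacterial score"


normal_case = ddm_predict([0.1, 0.2], stub_level1, stub_level2)
calls_after_normal = len(level2_calls)  # Level 2 was skipped for the Normal case
bacterial_case = ddm_predict([0.9, 0.7], stub_level1, stub_level2)
viral_case = ddm_predict([0.2, 0.9], stub_level1, stub_level2)
```

Keeping the two decisions in separate models means each classifier only has to separate two classes, and the Level 2 model is never evaluated on images that Level 1 considers Normal.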
Flow control DDM for testing.
After the system initialization, an X-ray image input is acquired and transferred for pre-processing. Since the proposed DDM requires an input array of 224
Table 2 presents the Level 1 classifier, which classifies an image as Pneumonia or Normal. It has three convolutional layers with ReLU activation and three pooling layers. The network has 498,882 trainable parameters in total. Since the variation within the dataset between Pneumonia and Normal is quite large, a small model such as the Level 1 classifier is sufficient to achieve approximately 97.5% validation accuracy.
Figure 7 presents the Level 2 classifier, which classifies an image already predicted as Pneumonia into Viral or Bacterial. For Level 2 classification, a customized DenseNet201 model is employed with 558,434 trainable parameters in total, as displayed in Table 3, attaining a validation accuracy of approximately 88.41%.
Level 1 classification (Normal or Pneumonia)
FCL: Fully Connected Layer; Conv2D: 2D-Convolutional layer; ReLU: Rectified Linear Unit.
Level 2 classification trainable and non-trainable parameters
Level 2 classification using customized DenseNet201 (viral or Bacterial Pneumonia).
To validate the results of the trained model, four performance parameters were used: Accuracy, Sensitivity/Recall, Precision, and F1 score. Two pre-trained models (AlexNet and ResNet18) were also trained on the acquired dataset to serve as references for comparison with the proposed model. For Level 2 classification, a regular DenseNet201 model is compared with the customized DenseNet201 model. The comparison is based on the specific dataset used to train the models; the performance of the proposed model may differ for other applications or datasets. The equations of the above performance parameters are expressed in Eqs (2)–(5).
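For reference, the standard definitions of these four parameters in terms of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) are given below. They are assumed to correspond to the performance equations referred to above, whose original form and ordering are not reproduced here.

```latex
\begin{align*}
\text{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{Recall}    &= \frac{TP}{TP + FN} \\
\text{Precision} &= \frac{TP}{TP + FP} \\
\text{F1 score}  &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
\end{align*}
```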
The performance of the algorithm in training and testing is summarized by the confusion matrices in Table 4. The performance of the DDM model on the four performance parameters is given in Table 5. For Level 1 (Normal or Pneumonia) classification, the proposed DDM model outperformed the two pre-trained classifiers, producing the highest accuracy of 97.45%. For Level 2 (Viral or Bacterial) classification, a pre-trained DenseNet201 classifier is customized for the specific task to obtain better performance than the regular DenseNet201 model. DenseNet201 was selected for customization because it produced the best performance among the pre-trained classifiers for Level 2 classification, with an accuracy of 85.977%. The customized version of the DenseNet201 model attains 88.41%, as displayed in Table 5.
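As an illustration of how the four performance parameters follow from a two-class confusion matrix, a minimal sketch is given below. The counts are hypothetical and are not the values of Table 4; the function name and the choice of Pneumonia as the positive class are assumptions.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, recall, precision, and F1 from confusion-matrix counts.

    Assumes at least one actual positive and one predicted positive,
    so that none of the denominators is zero.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    recall = tp / (tp + fn)        # sensitivity
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts with Pneumonia as the positive class:
# 90 Pneumonia images detected, 5 missed, 10 Normal images flagged, 95 cleared.
metrics = binary_metrics(tp=90, fp=10, fn=5, tn=95)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

For these example counts the accuracy is 92.5% and the precision is 90%, while recall is higher because only 5 of the 95 Pneumonia images are missed.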
Confusion matrix of AlexNet, ResNet18 and DDM along with their respective accuracies
N: Normal; P: Pneumonia; V: Viral Pneumonia; B: Bacterial Pneumonia; DDM: Dual deterministic model.
Performance parameters of DDM model (validation/testing accuracy) vs pre-trained classifiers
Figure 8 shows the accuracy and loss for Level 1 classification. A custom callback is used to retain the best model, which attains an accuracy of 97.45%. Figure 9 displays test results of the proposed DDM Level 1 classification model, which identifies all six images without error. Figures 10 and 11 show the testing performance of the DDM Level 2 classification model. In Fig. 11, the DDM Level 2 classifier predicted Bacterial for the image in row 1, column 1, while the actual label was Viral.
Comparative analysis with similar research
Training and validation accuracy of level 1 classification (Normal or Pneumonia).
Sample prediction of level 1 (Normal or Pneumonia) classifier.
Training and validation accuracy of level 2 classification (Viral or Bacterial Pneumonia).
Sample prediction of level 2 (Viral or Bacterial Pneumonia) classifier using customized DenseNet201.
Table 6 compares the performance of the proposed DDM model with similar recent work. Rajaraman et al. [43] customized the VGG16 model and reported an accuracy of 96.2% for the classification of Normal or Pneumonia. Ayan et al. [45] employed the VGG16 model to attain an accuracy of 84.5% and were outperformed by the customized VGG16 model of Rajaraman et al. [43]. For the classification of Pneumonia type (Viral or Bacterial), Gu et al. [48] proposed a Deep CNN-based model with three output classes that attained an accuracy of 80.4%. In the same scenario, Khatri et al. [47] employed the EMD model and outperformed the CNN model of Gu et al. [48], achieving 83.3% accuracy. The proposed model was divided into two levels to attain the maximum possible accuracy. Both levels outperformed the other models, attaining 97.45% accuracy for Level 1 classification and 88.41% for Level 2 classification.
Many people worldwide lose their lives every year to Pneumonia, a potentially deadly disease if it is not treated properly and at an early stage. Timely and accurate diagnosis combined with appropriate treatment may help save many lives. In third-world countries, where health services are not up to the mark, many patients may be waiting in outpatient and emergency departments for a diagnosis, and it becomes very difficult for doctors to manage such a large number of patients. In such cases, Computer-Aided Diagnosis (CAD) is a worthwhile option for speeding up the diagnostic procedure. In this work, a deep neural network model (DDM) is proposed to identify the Pneumonia type (Viral or Bacterial). Two levels were introduced for this task: Level 1 classifies a chest X-ray image as Normal or Pneumonia, and if it is classified as Pneumonia, the image is passed to Level 2 for classification as Viral or Bacterial Pneumonia. The quality parameters used to assess its viability are Recall/Sensitivity, Precision, Accuracy, and F1-Score. For Level 1 classification, the proposed model improved on the existing algorithms and attained an Accuracy, Recall, Precision, and F1-Score of 97.41, 97.9, 97.55, and 97.731, respectively. For Level 2 classification, the Accuracy, Recall, Precision, and F1-Score are 88.41, 87.91, 87.836, and 88.242, respectively. Since the regular DenseNet201 produced an excellent response on the chest X-ray image dataset, a customized version of it is used in this work for Level 2 classification. The proposed CAD-based DDM model can serve as a tool that helps radiologists diagnose Pneumonia and its type promptly, and it can be improved further with larger image datasets.
