Abstract
Coronavirus disease (COVID-19), which is highly contagious, began to spread globally in late 2019. Because of its exceptionally rapid spread and high mortality rate, it has not yet been possible to eradicate it. To halt the spread of COVID-19, there is a pressing need for effective screening of infected patients and immediate medical intervention. The absence of rapid and accurate methods for identifying infected patients has created a need for a model that supports early diagnosis of patients with confirmed or suspected COVID-19 and reduces the probability of missed diagnosis and misdiagnosis. Modern automatic image recognition techniques are an important diagnostic method for COVID-19. The aim of this paper is to propose a deep learning technique for the automatic diagnosis and recognition of COVID-19 on X-ray images using a transfer learning approach. A new dataset containing COVID-19 images was created by merging two publicly available datasets. This dataset includes 912 COVID-19 images, 4273 pneumonia images, and 1583 normal chest X-ray images. We used this dataset to train and test the deep learning algorithm: two pre-trained models (Xception and ResNetRS50) were trained and validated using transfer learning. On the 3-class task (pneumonia vs. COVID-19 vs. normal), the two models achieved validation accuracies of 90% and 97.21%, respectively. These results suggest that the proposed approach can be applied to diagnosing patients with lung diseases. In this study, we found the ResNetRS50 model to be superior.
Introduction
Coronavirus disease (COVID-19), a highly contagious disease, was declared a public health emergency of international concern by the World Health Organization (WHO) on January 30, 2020, and a pandemic on March 11, 2020. As the pandemic continues to spread, governments and healthcare workers are paying increasing attention to epidemic prevention and control. The disease is caused by a single-stranded RNA virus known as Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2); the disease itself was named COVID-19 by the WHO in February 2020. This viral infection poses a significant threat to human health because of its high variability, morbidity, and mortality. COVID-19 spreads rapidly and has come as a shock not just to one country but to the entire world. It poses a huge threat to human health worldwide, especially to the elderly and children, as infection can lead to multiple organ failure, various complications, and death. Whereas most infected individuals experience mild to moderate respiratory disease, others develop fatal pneumonia, and more people lose their lives each day.
With the COVID-19 pandemic, humanity is facing a great challenge: every national healthcare system is deeply affected, and the economic life of people in all countries is being hit from all sides [1]. To halt the spread of COVID-19, there is a pressing need for effective screening of infected patients and an immediate medical response. RT-PCR is the gold standard tool for COVID-19 detection; however, this technique is not always accurate, and it is time-consuming and labor-intensive to perform [2]. Thus, any technical tool that helps healthcare professionals screen quickly for COVID-19 infection can play a significant role in medical assistance. Modern automatic image recognition is one such essential diagnostic method. Chest CT and chest X-ray (i.e., radiography) are among the most valuable and popular imaging techniques used in hospitals today. Compared with chest CT, chest X-ray is more accessible and less expensive, and is therefore a good screening option for COVID-19. However, X-ray images of COVID-19 share many features with those of other infections, such as bacterial and other viral pneumonias. As a result, it is sometimes difficult for physicians to determine from images alone whether a case of pneumonia is due to COVID-19, particularly when workloads are heavy; the potential for misdiagnosis is then high, and treatment may be delayed [3].
In recent years, deep learning has been used extensively in industrial, medical, and other fields. Over the past three years of research on automatic COVID-19 diagnosis, deep learning has been one of the essential techniques alongside traditional machine learning. A deep learning model learns from numerous chest X-ray images, extracts feature information from them, and classifies each image based on those features. The image features extracted by deep learning now support classification that in some respects goes beyond what people can distinguish by eye. In particular, deep learning uses convolutional neural networks (CNNs), which acquire their feature extraction capability by training on a dataset. Because training a model from scratch consumes a lot of computing power, transfer learning is often used instead: a model already trained on a similar task is reused and adapted to a specific dataset. Transfer learning for training models on X-ray image datasets is becoming an important application.
The aim of this paper is to review previous COVID-19 recognition techniques and to propose a deep learning model that can be applied in medical practice. A new dataset containing COVID-19 images was constructed from two publicly available datasets. Using this dataset, models were trained and evaluated on the X-ray images using transfer learning. Following previous research, Xception and ResNetRS50 were used as pre-trained models in this study. On the 3-class classification dataset, the validation accuracies of the two models were 90% and 97.21%, respectively.
Related works
With the development of automatic medical image recognition technology, the diagnostic accuracy of such systems on image data for various diseases has been reported to surpass that of general practitioners. These modern techniques can help doctors diagnose diseases better. From 2020 to the present (January 2023), researchers have constructed various public datasets of COVID-19 X-ray images and used machine learning and deep learning to train models that detect COVID-19 in them. An analysis of these papers suggests that convolutional neural networks trained with transfer learning perform better than the alternatives.
Based on traditional machine learning
Several studies combine traditional machine learning with other techniques for COVID-19 diagnosis. In [4], Federated Learning (FL) and Differential Privacy Generative Adversarial Networks (DP-GAN) were used to protect data privacy: models were trained on a three-class X-ray image dataset without sharing the original hospital data, and the FedDPGAN-based ResNet reached 94.45% accuracy under independent and identically distributed data. Rasheed, J. et al. in [5] constructed a set of 500 X-ray images and applied logistic regression (LR) combined with principal component analysis (PCA) for dimensionality reduction, achieving an accuracy of 97.6%. A study was also conducted using convolutional neural networks and principal component analysis. Singh, R.K. et al. in [6] first performed image augmentation and image segmentation on a three-class dataset, then extracted image features with convolutional neural networks, and finally used a Pruned Naïve approach for classification, obtaining an accuracy of 98.67%. T. Ozcan in [7] proposed a feature fusion-based approach: the images were pre-processed, and the features were extracted using AlexNet and then fused. Finally, an SVM was used for classification, and accuracies of 99.52% and 87.64% were obtained for binary and three-class classification, respectively. S. Lu et al. in [8] used k-nearest neighbors to compute relationships between image-level representation (ILR) vectors of convolutional neural networks and constructed a neighboring-aware graph neural network (NAGNN) architecture that can distinguish COVID-19. M. R. Islam et al. in [9] first applied a convolutional neural network to 2,482 CT images to extract their features. The features were then mapped using traditional machine learning algorithms such as Support Vector Machines (SVM), Gaussian Naive Bayes (GNB), Random Forest (RF), Logistic Regression (LR), and Decision Trees (DT). Finally, an ensemble learning model for classifying COVID-19 CT images was formed, with an accuracy of 99.73%.
Based on deep learning
In 2020, several researchers used deep learning to diagnose COVID-19, which is briefly described below.
Chowdhury, M.E.H. et al. in [3] constructed a dataset containing 423 COVID-19, 1485 viral pneumonia, and 1579 normal chest X-ray images from publicly available datasets and trained the DenseNet201 model on two-class and three-class versions of it, reaching 99.7% and 97.9% accuracy, respectively. Ozturk, T. et al. in [10] used the DarkNet model to train and test detection on binary and three-class datasets and achieved accuracies of 98.08% and 87.02%, respectively. S. Minaee et al. in [11] used four different convolutional neural networks, such as DenseNet-121, training on two-fifths of a publicly available dataset of 5000 X-ray images and using the remaining three-fifths for model evaluation. The final models all achieved almost 90% specificity. M. J. Horry et al. in [12] applied various convolutional neural network models to common medical image datasets (ultrasound, X-ray, CT) and trained models for COVID-19 diagnosis on each of the three classification datasets. The trained VGG19 model obtained relatively good detection results, with 86% precision on the X-ray dataset. L. D. Wang et al. in [13] proposed the COVID-Net model, which obtained an accuracy of 93.3% on the publicly available COVIDx test set. A. A. Ardakani et al. in [14] obtained CT images from 184 patients and used 10 convolutional neural networks to train binary classification models to diagnose COVID-19; the ResNet-101 and Xception models had the highest accuracies, 99.51% and 99.02%, respectively. A. I. Khan et al. in [15] constructed the CoroNet model based on Xception. The model was trained and evaluated on two different publicly available datasets and obtained four-class, three-class, and binary classification accuracies of 89.6%, 95%, and 99%, respectively. An overview of some 2020 studies using deep learning to diagnose COVID-19 automatically is shown in Table 1.
Some 2020 deep learning studies for classification in X-ray images
Some 2021 deep learning studies for classification in X-ray images
In 2021, many researchers used the latest deep learning models to diagnose COVID-19, which are briefly described below.
V. N. M. Aradhya et al. in [16] proposed a one-shot learning algorithm combining a cluster-based probabilistic neural network (PNN) and a generalized regression neural network (GRNN) to perform well with few samples; the model was trained on a four-class dataset and proved effective on a publicly available dataset of 306 images. A. Umar Ibrahim et al. in [17] used a pre-trained AlexNet model, trained and evaluated on a public dataset of 5,853 images for binary classification, and found that the model worked better when the dataset was divided 7:3 into training and test sets, with an accuracy of 98.73%. Ç. Polat et al. in [18] trained, tested, and validated three different transfer learning models for two-class image recognition on a dataset of 1821 images, of which 299 were COVID-19 images. DenseNet-161 performed best, with an accuracy of 97.1%. A. K. Rangarajan et al. in [19] applied image augmentation and generative adversarial networks to enlarge the dataset and then trained different models on the results. The Xception model performed relatively well on the datasets generated in both ways, with accuracies of 98% and 98.1%, respectively. A. Narin et al. in [20] trained a variety of models for binary recognition on a four-class dataset and found that ResNet50 had the best recognition performance. E. F. Ohata et al. in [21] constructed a dataset of 388 images, 194 of which were COVID-19 positive, combined pre-trained models with machine learning classifiers to train a two-class diagnostic model, and found that DenseNet201 with an MLP performed best, achieving 95.6% accuracy. E. M. El-Kenawy et al. in [22], using publicly available datasets, performed feature learning and extraction with ResNet-50, followed by classification with ASSOA and MLP, and achieved a relatively good accuracy (99.26%). P. S. A. Babu et al. in [23] obtained an accuracy of 92.25% with ResNet-50 on a publicly available dataset of 2905 images, of which 219 were COVID-19 images. Y. El Idrissi El-Bouzaidi et al. in [24] performed data augmentation on the image dataset and then trained the DenseNet121 model for two-class and three-class recognition, obtaining 99% and 96.52% accuracy, respectively. S. H. Kassania et al. in [25] applied various pre-trained models to a publicly available COVID-19 X-ray image dataset and a CT image dataset. The model combining DenseNet121 and a Bagging tree had the best classification performance in the experiment, with an accuracy of 99%. A. Castiglione et al. in [26], using the ADECO-CNN model, obtained an accuracy of 99.99% on a CT image dataset. R. Jain et al. in [27] trained various pre-trained models on the Kaggle public dataset and found that the Xception model had the highest recognition accuracy (97.97%). A. Gupta et al. in [28], using the InstaCovNet-19 model, achieved accuracies of 99.08% and 99.53% on the 3-class and 2-class datasets, respectively. A. K. Das et al. in [29] trained ResNet50V2, DenseNet201, and InceptionV3 separately and then combined their predictions, reaching a recognition accuracy of 91.62%. An overview of some 2021 studies using deep learning to diagnose COVID-19 automatically is shown in Table 2.
In 2022, some researchers also used the latest deep learning models to diagnose COVID-19. N. Jalu et al. in [30] used data augmentation techniques to augment a two-class dataset and then trained four learning models. The CNN model obtained the best accuracy, 95%. G. Bargshady et al. in [31] used various data augmentation techniques, specifically CycleGAN, to augment the dataset. Various models were then trained, among which Inception-CycleGAN reached an accuracy of 94.2%. N. N. Das et al. in [32] used the Xception model for training and development.
This section describes a method for building a new dataset containing COVID-19 images from two online public datasets and, on this basis, explores the core features of the two pre-trained models used for feature extraction. Finally, the models are evaluated, and the experimental results are compared according to the experimental procedure.
The methodology of this study consists of several stages, and the flowchart of the overall system framework for COVID-19 classification is illustrated in Fig. 1. The stages are dataset collection, image preprocessing, model training, and evaluation of the classification results.

Block diagram of the overall system with the 3-class classification for COVID-19.
This study starts with the collection of the dataset and then proceeds to preprocessing, which has two phases. First, each image is scaled to 224×224×3. Second, the dataset is split into training, validation, and test sets. After preprocessing, the pre-trained model is trained for classification using transfer learning. During training, validation, and testing, the model parameters are trained and adjusted, and the model is checked for underfitting or overfitting. Evaluation metrics such as accuracy, precision, recall, and F1-score are computed to compare and evaluate the models.
The first dataset is the Kaggle dataset [33], a commonly used public research dataset that is constantly updated. As used in this paper, it has 5856 X-ray images: 4273 images of pneumonia (2780 of bacterial pneumonia and 1493 of viral pneumonia) and 1583 normal images. The second dataset is the augmented COVID-19 X-ray image dataset [34], which is also publicly available and is used to detect COVID-19 in X-ray images. It contains 912 different COVID-19 images and two folders that store the augmented and non-augmented images, respectively. In this study, the augmented images were selected for analysis.
For this study, we combined the augmented COVID-19 images from the second dataset with images selected from the Kaggle dataset to create a new dataset. This new dataset included a total of 6768 chest X-ray images (4273 images of pneumonia, 912 images of COVID-19, and 1583 normal images). Table 3 shows the composition of the dataset.
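The composition just described can be checked with a few lines of Python. This is only an illustrative sketch: the dictionary keys and the helper name `class_shares` are made up for the example, while the three counts are the ones reported in the text. The counts sum to 6768, and the shares make the class imbalance explicit.

```python
# Class counts as reported in the text for the merged dataset.
CLASS_COUNTS = {"pneumonia": 4273, "covid19": 912, "normal": 1583}


def class_shares(counts):
    """Return the total image count and each class's share of that total."""
    total = sum(counts.values())
    shares = {name: n / total for name, n in counts.items()}
    return total, shares


total, shares = class_shares(CLASS_COUNTS)
print(f"total images: {total}")
for name, share in shares.items():
    # Pneumonia is the majority class; COVID-19 is the smallest class.
    print(f"{name:>10}: {share:.1%}")
```

Because pneumonia images make up well over half of the data, per-class precision and recall are more informative than accuracy alone when comparing models on this dataset.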
Composition of the dataset
Before training, each image in the dataset was scaled to 224×224 for better and faster training. Figure 2 shows a sample of the three types of images in the dataset.

Samples of X-ray images in the Dataset: COVID-19 (A), normal (B), and pneumonia (C).
In this study, we selected two pre-trained models with good performance. Using transfer learning, each model learns the features of the dataset. The two models are then compared in the experiments in terms of various evaluation metrics.
Xception
Xception is based on the hypothesis that the mapping of cross-channel correlations and spatial correlations in the feature maps of CNNs can be completely decoupled, and it is built entirely on depthwise separable convolutional layers. It replaces the Inception modules, which apply multiple convolutional kernels at different scales, with depthwise separable convolution layers. Its parameter count is similar to that of Inception, but it performs better. Briefly, the architecture of Xception is a linear stack of depthwise separable convolutional layers with residual connections [35].
ResNet-RS
ResNet-RS is an extension of the ResNet architecture that combines an improved training approach with new scaling strategies. Training and regularization methods may be more important than architectural improvements for boosting model performance. The scaling strategy can be summarized as follows: scale the model depth in training settings where overfitting can occur (otherwise, scaling the width is preferable), and scale the image resolution at a slower rate. This framework has been validated and found to be effective: the ResNet-RS model trains much faster and performs better than previous deep residual networks, as the experimental results below also indicate [36].
Transfer learning
Large amounts of data and computing resources are scarce and expensive to obtain, especially in healthcare. A large amount of high-quality labeled data is critical for a deep learning system that aids the diagnosis of medical images. Transfer learning offers an effective way to deal with this problem. It is a learning process that takes advantage of similarities between data, tasks, or models and applies models that have been learned or trained in an old domain to a new one.
In the transfer learning approach used in this study, a pre-trained model network is used as a feature extractor for a new task and then connected to an MLP classifier for training, validation, and testing. Algorithm 1 represents the pseudo-code of the 3-class Classification Model for COVID-19 using transfer learning.
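The feature-extractor-plus-classifier arrangement described above can be sketched in Keras as follows. This is a minimal sketch under stated assumptions, not the study's actual implementation: the use of `tf.keras`, the ImageNet weights, the hidden layer width of 256, the dropout rate of 0.5, and the function name `build_transfer_model` are all assumed for illustration. Only the 224×224×3 input, the three output classes, the Adam optimizer, the cross-entropy loss, and the 1e-3 learning rate come from the text. Xception is shown; ResNetRS50 could be substituted where the installed Keras version provides it.

```python
NUM_CLASSES = 3              # COVID-19, normal, pneumonia
INPUT_SHAPE = (224, 224, 3)  # image size stated in the text


def build_transfer_model(weights="imagenet", hidden_units=256, dropout=0.5):
    """Frozen Xception feature extractor followed by a small MLP classifier.

    hidden_units and dropout are assumed values, not taken from the paper.
    """
    # Imported lazily so this module can be loaded without TensorFlow installed.
    import tensorflow as tf

    base = tf.keras.applications.Xception(
        weights=weights,
        include_top=False,        # drop the original ImageNet classifier
        input_shape=INPUT_SHAPE,
        pooling="avg",            # global average pooling -> feature vector
    )
    base.trainable = False        # use the backbone purely as a feature extractor

    inputs = tf.keras.Input(shape=INPUT_SHAPE)
    # Xception expects inputs scaled to [-1, 1]; raw pixels are assumed in [0, 255].
    x = tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1.0)(inputs)
    x = base(x, training=False)   # keep batch-norm layers in inference mode
    x = tf.keras.layers.Dense(hidden_units, activation="relu")(x)
    x = tf.keras.layers.Dropout(dropout)(x)
    outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
        loss="categorical_crossentropy",   # expects one-hot encoded labels
        metrics=["accuracy"],
    )
    return model
```

Calling `build_transfer_model()` returns a compiled model whose only trainable weights are in the two dense layers; passing `weights=None` instead builds the same architecture with random initialization.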
Experiments
Python was used for data reading, transfer learning training, development of evaluation models, and visualization of experimental results. All Python code was run on a Cluster server with Linux as the operating system. Table 4 displays Cluster Information.
Cluster information
The weight parameters of the CNN models (Xception and ResNetRS50) were randomly initialized and then trained by minimizing the cross-entropy loss with the Adam optimizer (decay = learning rate / number of epochs). During the experiments, the dataset was randomly split into two independent sets for training and testing in a 9:1 ratio. The batch size, learning rate, and number of epochs were fixed at 8, 1e-3, and 40, respectively.
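The split and the decay setting above can be illustrated with plain Python. This is a hedged sketch: the helper names `split_indices` and `decayed_learning_rate` are hypothetical, the seed is arbitrary, the dataset size of 6768 is the sum of the three class counts, and the decay formula assumes the legacy Keras-style time-based schedule, in which the rate at a given update step is the initial rate divided by (1 + decay × step). The text does not say which schedule was actually used.

```python
import random

# Hyperparameters stated in the text.
EPOCHS, BATCH_SIZE, LEARNING_RATE = 40, 8, 1e-3
DECAY = LEARNING_RATE / EPOCHS  # "decay = learning rate / number of epochs"


def split_indices(n_samples, test_fraction=0.1, seed=0):
    """Randomly split sample indices into disjoint train and test lists (9:1)."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)      # seeded for reproducibility
    n_test = round(n_samples * test_fraction)
    return idx[n_test:], idx[:n_test]


def decayed_learning_rate(initial_lr, decay, step):
    """Time-based decay (assumed schedule): lr = lr0 / (1 + decay * step)."""
    return initial_lr / (1.0 + decay * step)


train_idx, test_idx = split_indices(6768)
steps_per_epoch = -(-len(train_idx) // BATCH_SIZE)  # ceiling division
final_lr = decayed_learning_rate(LEARNING_RATE, DECAY, steps_per_epoch * EPOCHS)
print(len(train_idx), len(test_idx), steps_per_epoch, final_lr)
```

Under these assumptions the learning rate shrinks smoothly over training rather than dropping at fixed epochs, which is one common reason for tying the decay constant to the epoch count.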
Evaluating model performance is a key focus of this research. A deep learning model is commonly assessed with five metrics: accuracy, precision, recall, specificity, and F1 score, defined in Equations (1) to (5) [27].
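Assuming the standard confusion-matrix definitions of these metrics, and assuming the numbering follows the order in which they are listed, Equations (1) to (5) can be written as:

```latex
\begin{align}
\text{Accuracy}    &= \frac{TP + TN}{TP + TN + FP + FN} \tag{1}\\
\text{Precision}   &= \frac{TP}{TP + FP} \tag{2}\\
\text{Recall}      &= \frac{TP}{TP + FN} \tag{3}\\
\text{Specificity} &= \frac{TN}{TN + FP} \tag{4}\\
\text{F1}          &= 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{5}
\end{align}
```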
TP, FP, TN, and FN in Equations (1) to (5) denote the numbers of true positives, false positives, true negatives, and false negatives, respectively. For a given model and test set, TP is the number of COVID-19 positive cases that the model correctly labels as positive; FP is the number of negative cases mislabeled as positive; TN is the number of negative cases correctly labeled as negative; and FN is the number of positive cases mislabeled as negative [20].
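As a worked illustration of how the five metrics follow from these four counts, the short Python sketch below computes them for one class treated as positive. The function name `binary_metrics` and the counts in the example are hypothetical and are not results from this study.

```python
def _ratio(numerator, denominator):
    """Divide safely, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, specificity, and F1 from counts."""
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": _ratio(tn, tn + fp),
        "f1": _ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical counts for illustration only.
metrics = binary_metrics(tp=90, fp=5, tn=570, fn=10)
for name, value in metrics.items():
    print(f"{name:>11}: {value:.4f}")
```

With these example counts, recall is 90/100 and precision is 90/95, which shows how a model can miss positives (lower recall) while still raising few false alarms (higher precision).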
In this part, we conduct experiments with 2 different models for 3-class classification and compare them based on the results obtained from our experiments.
Comparison of the model training process
During the experiment, the running time per epoch of the two pre-trained models, Xception and ResNetRS50, was observed (see Figs. 3 and 4); it varied between the models and even between epochs of the same model. In this experiment, 90% of the data was used to train the model, and the remaining 10% was used to test it. On this dataset, the training time per epoch was 145–152 seconds for Xception and 112–121 seconds for ResNetRS50.

Training time for each epoch in the Xception model.

Training time for each epoch in the ResNetRS50 model.
Observing the training process on the dataset, we find that each evaluation metric of the Xception model changes with the number of epochs, as shown in Fig. 5. The recognition accuracy rises slowly as the number of epochs increases, while the loss keeps decreasing. At the end of training, the training accuracy is 0.9699, the training loss is 0.0604, the validation accuracy is 0.9000, and the validation loss is 0.3427.

Training Loss and Accuracy of Xception with successive epochs.
The evaluation metrics of the ResNetRS50 model likewise change with the number of epochs, as shown in Fig. 6. The recognition accuracy slowly increases and approaches 1 as the number of epochs increases, while the loss decreases continuously and tends to 0. At the end of training, the training accuracy is 1, the training loss is 0.0002, the validation accuracy is 0.9721, and the validation loss is 0.0688.

Training Loss and Accuracy of ResNetRS50 with successive epochs.
Figs. 5 and 6 show the accuracy and loss values of both models during training. The ResNetRS50 model has higher accuracy and lower loss and, to some extent, outperforms the Xception model.
After testing both models, the better-performing model was selected by comparing the experimental results. The generalization performance of a learning model is usually evaluated using various standard evaluation metrics. In this study, three metrics were used in addition to accuracy, so the metrics calculated automatically are accuracy, recall, precision, and F1-score [37]. In short, accuracy is the rate at which the model predicts correctly. Precision is the ratio of correctly predicted positive samples to all samples predicted as positive. Recall is the ratio of correctly identified positive samples to the total number of positive samples. F1-score is the harmonic mean of precision and recall.
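For the 3-class task, these per-class values are obtained by treating each class in turn as the positive class. The sketch below shows one way to do this in plain Python; it is an illustrative stand-in rather than the tool used in the study, and the function name `classification_report`, the class labels, and the ten-sample example are all hypothetical.

```python
from collections import Counter

CLASSES = ("covid19", "normal", "pneumonia")


def classification_report(y_true, y_pred, classes=CLASSES):
    """Per-class precision, recall, and F1 (one-vs-rest) plus overall accuracy."""
    pairs = Counter(zip(y_true, y_pred))  # (true label, predicted label) counts
    report = {}
    for c in classes:
        tp = pairs[(c, c)]
        fp = sum(n for (t, p), n in pairs.items() if p == c and t != c)
        fn = sum(n for (t, p), n in pairs.items() if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[c] = {"precision": precision, "recall": recall, "f1": f1}
    accuracy = sum(pairs[(c, c)] for c in classes) / len(y_true)
    return report, accuracy


# Hypothetical labels for illustration only (not data from this study).
y_true = ["covid19"] * 3 + ["normal"] * 3 + ["pneumonia"] * 4
y_pred = ["covid19", "covid19", "pneumonia",
          "normal", "normal", "normal",
          "pneumonia", "pneumonia", "pneumonia", "normal"]
report, accuracy = classification_report(y_true, y_pred)
print(accuracy, report["covid19"])
```

In this toy example one COVID-19 image is predicted as pneumonia, which lowers COVID-19 recall and pneumonia precision at the same time; a per-class report makes that kind of confusion visible where a single accuracy figure would not.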
As shown in Table 5, the Xception model reported 92% accuracy, with 96% precision, 97% recall, and a 96% F1-score for the COVID-19 class, 92% average precision for the normal class, and 92% average precision for the pneumonia class. Table 6 shows that the ResNetRS50 model reported 96% accuracy, with 99% precision, 100% recall, and a 99% F1-score for COVID-19, 90% for normal, and 97% for pneumonia. Comparing the two models on the basis of Tables 5 and 6, the precision, recall, F1-score, and accuracy are lower for the Xception model. Low precision means many false positives, so patients can easily be misclassified and given the wrong treatment. Low recall means that some patients go undiagnosed and untreated. In this study, we found the ResNetRS50 model to be superior.
Xception classification report
ResNetRS50 classification report
The COVID-19 pandemic is still ongoing. As the number of cases continues to increase, hospitals need to detect infected cases quickly. Automated diagnosis from X-ray images is very helpful for patient care and for preventing further spread of the disease. Because X-ray COVID-19 image datasets are limited in size, transfer learning is the preferred method for training. In this project, we experimented with two CNN models for diagnostic classification of patient X-ray images. We conclude that, of the two models evaluated, ResNetRS50 performs better for COVID-19 diagnosis on X-ray images and is competitive with the deep networks reported in other studies.
To further develop this study, we propose to increase the dataset size by adding more X-ray images of COVID-19 patients; once these images are available, the model proposed in this paper could be validated on a large chest X-ray dataset. In addition, we will attempt to improve the existing algorithm, without changing the current algorithmic framework, to improve detection efficiency and accuracy. We also intend to integrate our approach into a free application for image classification.
