Abstract
Pharyngitis is an inflammation of the mucous membranes of the oropharynx, typically brought on by a bacterial illness. The rapid growth of recent technologies has created a need for remote detection of diseases such as pharyngitis from throat images taken with a smart camera. In recent years, research on classifying pharyngitis has advanced with the help of deep learning. However, deep learning models require at least one hour of training and a considerably large data set to reach good accuracy. In this paper, we focus on this time constraint and propose a novel approach, PFDP, which classifies pharyngitis by detecting potential features from the doctor’s perspective. We extract the tiny portions of an image that a doctor would regard as infected, count how often these portions occur, and pass the counts to custom-made decision rules. The classification results show a significant improvement in the time taken to reach an average accuracy of 70%: it took only 5 minutes to extract the counts of infected patterns and 1 more minute to obtain classification results from the if-then-else decision rules. We conducted the experiment on a set of 800 images. Although the accuracy is lower than that of other works, the time taken to extract features is significantly lower than in previous works. In addition, our approach does not require training and can be applied where data sets are scarce. We believe that our approach is a new direction of research and can compete with state-of-the-art works in the future.
Introduction
The latest technological developments highlighted the essential role of remote healthcare. These solutions offer patients safe access to medical services from their homes, mitigating the risk of viral spread and ensuring ongoing care, especially for those with throat related diseases. With the widespread use of smartphones, leveraging their capabilities for capturing throat or mouth images is feasible. These snapshots can be input into specialized processing systems or software, enabling the classification of conditions like pharyngitis, marking a promising approach to accessible healthcare diagnostics.
The contagious nature of COVID-19, coupled with the vulnerability of ill patients, necessitates innovative approaches to minimize risks. Remote healthcare, facilitated by advancements in technology, offers a crucial solution. In the case of diseases like pharyngitis, where symptoms may overlap with COVID-19, remote diagnosis and treatment play a pivotal role. Pharyngitis, often associated with respiratory infections, shares symptoms with COVID-19, such as sore throat and difficulty swallowing. Using remote health solutions, individuals can capture images of their throats with smartphones, allowing healthcare providers to assess and diagnose pharyngitis symptoms without direct physical contact.
This not only minimizes the risk of viral spread but also addresses the challenges posed by symptom overlap between respiratory illnesses. Remote healthcare proves crucial in promptly identifying and treating pharyngitis, preventing its progression, and reducing the burden on healthcare facilities. Advanced technologies, including machine learning algorithms, enhance diagnostic accuracy, making remote pharyngitis diagnosis more efficient and reliable. Overall, in the COVID-19 context, remote healthcare acts as a vital tool for both preventing and managing pharyngitis, ensuring timely intervention and safeguarding public health.
Pharyngitis manifests with symptoms such as sore throat, difficulty swallowing (dysphagia), and inflammation of the tonsils and the back of the throat. It is commonly caused by viral infections like the flu or the common cold, though bacterial infections, particularly by Streptococcus pyogenes, can also be responsible. The condition may lead to discomfort, pain, and irritation in the throat, impacting daily activities. In more severe cases, untreated bacterial pharyngitis can result in complications such as the spread of infection to neighboring areas, like the middle ear or sinuses.
According to the Health Report of U.S. National Statistics, streptococcal pharyngitis, commonly known as strep throat, is a significant driver of patient admissions to hospital emergency departments in the United States [1]. This bacterial infection, primarily attributed to Group A beta-hemolytic streptococcus, poses a substantial health risk for both children and adults [2, 3, 4]. In 2010, 1.814 million people with pharyngitis, including 692,000 children under 15, visited the emergency room. The majority of pharyngitis cases occur in children under the age of five.
Delayed diagnosis of strep throat elevates the potential for rheumatic fever, a precursor to chronic rheumatic heart disease, resulting in approximately 320,000 global fatalities annually [5, 6, 7]. Consequently, early identification of strep throat assumes paramount importance, particularly in medically underserved remote areas, as it serves as a pivotal preventive measure against fatalities stemming from rheumatic heart disease. Furthermore, an erroneous diagnosis of strep throat may precipitate unwarranted antibiotic treatments, contributing to the peril of bacterial resistance [8, 9].
Overall architecture of pharyngitis detection.
Chronic pharyngitis, characterized by persistent inflammation, may contribute to the development of more serious conditions, including tonsillitis, or raise the risk of respiratory issues. Regular medical assessment and appropriate treatment are essential to manage symptoms effectively and prevent potential complications associated with pharyngitis. The overall architecture is depicted in Fig. 1. The remaining sections of this paper are arranged as follows. Section 2 presents the literature survey. Section 3 discusses the proposed methodology. Section 4 outlines the results of the proposed approach. Section 5 concludes the paper.
The growing need for remote healthcare, fuelled by the vulnerability of ill patients and the recent challenges posed by the COVID-19 pandemic, has spurred research into efficient methods for diagnosing and treating diseases remotely. In the context of pharyngitis, a common upper respiratory tract infection affecting millions, leveraging smartphones for remote diagnostics has gained attention.
The prevailing diagnostic approach involves clinical decision-making employing the Centor score, derived from a comprehensive set of criteria encompassing parameters such as coughing and fever [2, 3, 5, 7, 8, 10]. An alternative clinical diagnostic method for detecting streptococcal pharyngitis is throat culture, wherein a sample of throat cells is introduced into a medium conducive to bacterial growth, facilitating disease identification [9, 11, 12, 13, 14, 15, 16]. A positive bacterial growth signifies a bacterial infection with a diagnostic accuracy of 98%, while the absence of growth indicates the absence of a bacterial infection [15]. Touch spray ionization mass spectrometry has also been employed for strep throat diagnosis [14]. Nevertheless, these diagnostic modalities necessitate the expertise of trained physicians or specialists, presenting an enduring challenge in achieving timely and universally accessible diagnoses for all patients.
Research endeavours have harnessed color intensity values to discern a spectrum of maladies, ranging from diabetes [17, 18] and internal-organ disorders [19, 20, 21] to afflictions of the heart and kidneys [17, 18, 22, 23, 24, 25, 26, 27, 28, 29]. These methodologies, predicated on color intensity values, have been intricately amalgamated with machine learning paradigms, including naive Bayes, Bayes net, and sequential minimal optimization (SMO) [30, 31, 32]. In these investigations, 21 distinct properties were derived from tongue color intensity values to effectively diagnose a diverse array of 23 diseases. Despite the discerning capacity exhibited in diagnosing various afflictions through tongue color features, certain limitations persist in the identification of syndromes, discrimination of color features, and accurate disease classification [17, 22, 23, 24]. Evidently, factors such as disparate lighting conditions, color spaces, and devices, as emphasized by Zhang and Kim et al., introduce variability that compromises the reliability of these aforementioned methods in the accurate diagnosis of corresponding diseases [17, 33, 34].
A previous study captured throat images by combining smartphones with supplementary equipment and used the k-nearest neighbor algorithm in the color component distribution space to categorize streptococcal tonsil images [35]. Streptococcus pyogenes is the leading bacterial cause of pharyngitis, accounting for 20–30% of sore throat cases [36, 38]. In another study [3], CycleGAN-generated synthetic images exhibited realistic pharyngitis characteristics, and a deep learning model trained with these synthetic throat images notably improved pharyngitis diagnosis accuracy. In the evaluation of pharyngitis identification on the test dataset, the ResNet50 model with GAN-based augmentation achieved the highest ROC-AUC of 0.988. Additionally, using the ResNet50 model in a four-fold cross-validation approach resulted in a maximum classification accuracy of 0.953 and a ROC-AUC of 0.992.
In contemporary pursuits within the realm of image classification or categorization, there has been a notable integration of features rooted in the Bag of Visual Words (BoVW) paradigm, as evidenced by various scholarly endeavours [40, 41, 42, 43]. A subset of these initiatives has strategically employed the principles of transfer learning. Transfer learning, in the context of machine learning, denotes a paradigmatic approach wherein a pre-existing model, having undergone prior training, is harnessed to enhance performance on a cognate yet distinct task [44, 45, 46].
Proposed Methodology of PFDP.
The framework of the proposed methodology, PFDP, is given in Fig. 2. First, we identify 5 potential candidate images of pharyngitis. We then extract small infected portions from these images and call them potential feature templates (step-1 of Fig. 2). Next, we find the minimum and maximum pixel values for each feature template (step-2). In step-3, we extract a specific portion of the test image, denoted ImageP. In step-4, we check whether the pixel values of ImageP fall in the range from the minimum to the maximum pixel value of a particular feature template; if so, the match count for that feature template is incremented. In step-5, we finalize the match counts for each feature template. These match counts are then given to simple if-then-else rules for classification. As shown in Fig. 3(e), the potential feature templates are extracted by snipping a square portion from each of the 5 candidate images for pharyngitis. In this fashion, all 5 feature templates of the pharyngitis class were extracted from the 5 potential candidate images and are shown in Fig. 4. All these feature templates were of size 13
Potential candidate images for Pharyngitis.
Potential feature template images for Pharyngitis.
Consider the sample image given in Fig. 5. Because the color of the teeth matches the color of the white infected portions, we snip the image, leaving 20 columns as the left margin and 96 columns as the right margin, and 20 rows as the top margin and 96 rows as the bottom margin, as shown in Fig. 5. This is the portion ImageP discussed earlier.
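The cropping step can be illustrated with a minimal sketch. It assumes a 256 x 256 image stored as a NumPy array with rows first and columns second, so that the 20-pixel and 96-pixel margins correspond to the slice [20:160] in each direction. The function name crop_image_p and the zero-filled stand-in image are hypothetical and are not taken from the paper's algorithms.

```python
import numpy as np

def crop_image_p(image, top=20, left=20, bottom=160, right=160):
    """Return rows top..bottom-1 and columns left..right-1 of the image."""
    return image[top:bottom, left:right]

# Hypothetical 256 x 256 three-channel image standing in for a throat photo.
image = np.zeros((256, 256, 3), dtype=np.uint8)
image_p = crop_image_p(image)

# 256 - 20 - 96 = 140 rows and 140 columns remain after removing the margins.
print(image_p.shape)
```

With these margins the cropped region excludes the image borders, where teeth are more likely to appear.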
The proposed work comprises three algorithms. The process of step-1 of Fig. 2 is given in Algorithm-1, named find_min_max_values_PFDP. In Algorithm-1, the minimum and maximum pixel values of the blue, green and red color channels of each feature template are calculated. In the 1st and 2nd lines, the arrays minvaluesbgr and maxvaluesbgr are initialized to zeros. In the 3rd line, the list of file names is extracted from the path where the feature templates are stored. In lines 4 to 16, we read each feature template image and extract its blue, green and red components into templateb, templateg and templater respectively. Next, we extract the max and min values from each of these channel-specific template matrices and store them in the maxvaluesbgr and minvaluesbgr arrays.
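As an illustration of the min/max computation described for Algorithm-1, the following is a minimal sketch rather than the authors' listing. It assumes the templates are already loaded as NumPy arrays in blue, green, red channel order, so reading files from disk is omitted. The function name find_min_max_values and the tiny example template are hypothetical; the array names minvaluesbgr and maxvaluesbgr follow the text.

```python
import numpy as np

def find_min_max_values(templates):
    """Compute per-template, per-channel minimum and maximum pixel values.

    templates: list of H x W x 3 arrays in blue, green, red channel order.
    Returns two arrays of shape (number of templates, 3).
    """
    n = len(templates)
    minvaluesbgr = np.zeros((n, 3), dtype=np.int64)
    maxvaluesbgr = np.zeros((n, 3), dtype=np.int64)
    for i, template in enumerate(templates):
        for c in range(3):  # 0 = blue, 1 = green, 2 = red
            channel = template[:, :, c]
            minvaluesbgr[i, c] = channel.min()
            maxvaluesbgr[i, c] = channel.max()
    return minvaluesbgr, maxvaluesbgr

# Hypothetical 1 x 2 template with two BGR pixels, for illustration only.
example_template = np.array([[[10, 20, 30], [40, 50, 60]]], dtype=np.uint8)
min_bgr, max_bgr = find_min_max_values([example_template])
```

Each row of the returned arrays describes one feature template, and each column one color channel.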
Selecting only the particular portion of source image.
Algorithm-2, named find_num_matches_phary_nonphary_PFDP, comprises the steps that extract the final counts of cases where feature templates are matched inside an image. The first step reads the file names of all images in a train or test set of pharyngitis or non-pharyngitis images. Each image is then read into array format; if the image read is empty, it is reported as empty. The “classnum” parameter gives the class label, 1 for pharyngitis and 0 for non-pharyngitis, and the labels variable for the ith image is set accordingly. Lines 7, 8 and 9 extract the blue, green and red components of the ith image over the portion [20:160][20:160] for rows and columns respectively. In lines 10 to 17, for each feature template, we take a portion of the given width and height, retrieve its pixel values, and check whether they fall within the range from the minimum to the maximum value of the feature template. This is done for all three color channels (blue, green and red). We count one occurrence of feature matching when all pixel values in the selected portion of the image are within the minimum and maximum values of the feature template in all three color channels.
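The matching rule described above can be sketched as follows. This is an illustrative reading, not the authors' Algorithm-2: the text does not state how the window moves across ImageP, so the sketch assumes square windows that advance by a configurable step (defaulting to the window size, giving non-overlapping tiles). The function name count_matches, the step parameter and the example data are hypothetical.

```python
import numpy as np

def count_matches(image_p, min_bgr, max_bgr, size=2, step=None):
    """Count windows whose pixels all lie within each template's BGR range.

    image_p: H x W x 3 array in blue, green, red order.
    min_bgr, max_bgr: arrays of shape (T, 3), one row per feature template.
    Returns a length-T array of match counts.
    """
    step = size if step is None else step  # assumption: non-overlapping tiles
    min_bgr = np.asarray(min_bgr)
    max_bgr = np.asarray(max_bgr)
    counts = np.zeros(len(min_bgr), dtype=np.int64)
    height, width = image_p.shape[:2]
    for t in range(len(min_bgr)):
        # True where a pixel is within range on all three channels.
        in_range = np.all((image_p >= min_bgr[t]) & (image_p <= max_bgr[t]), axis=2)
        for r in range(0, height - size + 1, step):
            for c in range(0, width - size + 1, step):
                if in_range[r:r + size, c:c + size].all():
                    counts[t] += 1
    return counts

# Hypothetical 4 x 4 image: one bright 2 x 2 patch, the rest black.
image_p = np.zeros((4, 4, 3), dtype=np.uint8)
image_p[0:2, 0:2] = [200, 200, 200]
counts = count_matches(image_p, [[190, 190, 190], [0, 0, 0]],
                       [[210, 210, 210], [10, 10, 10]], size=2)
```

In this toy case the first template matches the single bright tile and the second matches the three dark tiles.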
Before proceeding to the main function, we assume that find_min_max_values_PFDP has been called with the necessary arguments and has calculated the min and max pixel values of the feature templates. The main function, named Mainfunction_PFDP, is given in Algorithm 3. The first two lines of Algorithm 3 initialize the vectors holding feature counts. The 3rd and 4th lines initialize the labels. The 5th and 6th lines hold the paths to the pharyngitis and non-pharyngitis images. The 7th and 8th lines call find_num_matches_phary_nonphary_PFDP() with the specified arguments. The 9th line concatenates the features of the pharyngitis and non-pharyngitis images, and the 10th line concatenates the labels of the respective images. The 11th line initializes the vector that holds the labels produced by the proposed technique. Lines 12 to 21 set the label of the test image to 1 (pharyngitis) if the count for any potential feature is greater than zero, and to 0 (non-pharyngitis) otherwise. Line 22 initializes the count of correctly classified images to 0. Lines 23 to 27 compare the true label with the classified label and, if they are equal, increment the count of correctly classified images. Line 28 calculates the percentage of correctly classified images out of the 1600 images used in the experiment. Line 29 prints the accuracy.
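The decision rule and accuracy computation described for Algorithm 3 can be sketched in a few lines. This is a minimal illustration under stated assumptions: the match counts are taken as already computed, and the function names and the four-image example are hypothetical rather than drawn from the paper's data.

```python
def classify_by_counts(feature_counts):
    """Label an image 1 (pharyngitis) if any template matched at least once, else 0."""
    return [1 if any(count > 0 for count in counts) else 0
            for counts in feature_counts]

def accuracy_percent(true_labels, predicted_labels):
    """Percentage of images whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return 100.0 * correct / len(true_labels)

# Hypothetical match counts for four images and five feature templates.
feature_counts = [
    [0, 0, 0, 0, 0],
    [3, 0, 1, 0, 0],
    [0, 0, 0, 0, 2],
    [0, 0, 0, 0, 0],
]
true_labels = [0, 1, 1, 1]

predicted = classify_by_counts(feature_counts)
accuracy = accuracy_percent(true_labels, predicted)
```

Here the last image is a pharyngitis case with no matched template, so the rule misclassifies it and the accuracy on this toy set is 75%.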
The selection of particular width and height of image is not taken into consideration inside the experiment. We simply assumed it to be 256
Experimental results
The experiment was conducted in the Google Colab environment with CPU resources and implemented in open-source Python; details are given in Table 2. The proposed work does not need GPU resources; CPU resources are enough. As the potential feature templates involve a combination of red, green and blue components, we chose to use RGB information rather than grey scale.
Pharyngitis dataset accumulation
Simulation parameters of experiment
Model applicability on scarcity of dataset
Table 4 shows the accuracy for each size of the template matching area. The accuracy was calculated by testing on 1600 images, 800 for each of the two classes, while varying the width and height in Algorithm 2. The highest accuracy of 72.56% was achieved when the width and height are set to 2. The standard deviation of the size of the template matching area is taken as 2. If the size is smaller, the image is scanned more finely for potential features. On the other hand, if the size is larger, the occurrence of relatively weak features may affect the accuracy. For this reason, we limited the maximum size of template matching area to 10
Accuracies of classification over width and height of template matching area (PFDP)
Accuracies of classification over starting row number of test Image (PFDP)
Accuracy over width or height of template matching area (PFDP).
Accuracy over starting row of portion of test image (PFDP).
Accuracies of classification over feature combination (PFDP)
Introducing weights for the feature templates may make the model’s performance sensitive: the accuracy may vary as the weights are adjusted. An exhaustive study of image features is required before weights can be employed, and in our current work no weights are assigned to the feature templates.
The width and height is varied starting from 2
Table 6 shows the classification accuracy for each feature combination. We also plotted the contribution of each feature towards accuracy in Fig. 8. Because pharyngitis detection is inherently difficult on this data set, the choice of feature combination plays a vital role and yields different accuracies, which motivated us to test the combinations. As the number of feature templates is limited, so is the number of feature combinations. We took 9 combinations of features. The first 5 combinations are the five feature templates taken alone. The 6th combines features 1 and 2, the 7th combines features 3 and 4, and the 8th combines features 3, 4 and 5. The 9th combines all 5 features.
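The evaluation over the nine feature combinations can be sketched as below. The sketch assumes that the same "any count greater than zero" rule is applied, restricted to the templates in each combination; the text does not spell this out. Feature indices are zero-based, and the function name and example data are hypothetical.

```python
# The nine combinations from the text, with features 1..5 written as indices 0..4.
COMBINATIONS = [(0,), (1,), (2,), (3,), (4,),
                (0, 1), (2, 3), (2, 3, 4), (0, 1, 2, 3, 4)]

def accuracy_for_combination(feature_counts, labels, combination):
    """Accuracy (percent) when only the templates in `combination` are used."""
    correct = 0
    for counts, label in zip(feature_counts, labels):
        predicted = 1 if any(counts[i] > 0 for i in combination) else 0
        correct += int(predicted == label)
    return 100.0 * correct / len(labels)

# Hypothetical match counts and labels for four images.
feature_counts = [
    [2, 0, 0, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
]
labels = [1, 1, 0, 0]

results = {combo: accuracy_for_combination(feature_counts, labels, combo)
           for combo in COMBINATIONS}
```

Comparing the entries of results shows how a template that fires on non-pharyngitis images, such as index 1 in this toy data, can lower the accuracy of any combination that includes it.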
Time taken for training in various works
Accuracy over feature combination (PFDP).
Comparative table for accuracy, precision and recall of various works
Training time versus model graph.
Accuracy versus model.
The time taken to train the model on the data sets in various works is given in Table 7. Training took around 1 hour in work [35] and around 12 h for 1000 epochs in work [38]. Our proposed approach took 2 to 3 minutes, which is very small in comparison. Image classification based on potential features does not require training a model; it only detects potential features on test images. PFDP therefore involves no training on images, and the classification rules are applied directly to the test data set. The proposed method can also be applied to other domains where data sets are scarce. The graph of training time versus model is given in Fig. 9.
The accuracy, precision and recall of various works on pharyngitis detection are given in Table 8. Askarian [35] achieved 93.75% accuracy, 88% precision and 87.5% recall. Yoo [38] achieved 92.5% accuracy, 94.2% precision and 89.8% recall. In the proposed work, we achieved 72.56% accuracy, 72.51% precision and 71.49% recall.
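For reference, the three reported metrics can be computed from true and predicted labels as in the sketch below, using the standard definitions with pharyngitis (label 1) as the positive class. The function name and the eight-image example are hypothetical; the confusion counts behind the figures in Table 8 are not given in the text and are not reproduced here.

```python
def classification_metrics(true_labels, predicted_labels, positive=1):
    """Return (accuracy, precision, recall) as fractions between 0 and 1."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when no positives are predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Hypothetical labels for eight images.
true_labels = [1, 1, 1, 0, 0, 0, 1, 0]
predicted_labels = [1, 0, 1, 0, 1, 0, 1, 0]
accuracy, precision, recall = classification_metrics(true_labels, predicted_labels)
```

In this toy example there are 3 true positives, 1 false positive, 1 false negative and 3 true negatives, so all three metrics equal 0.75.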
The graph of accuracy versus model is given in Fig. 10. Although the accuracy, precision and recall are lower than those of other works, our proposed approach saves a large amount of training time while achieving reasonably good accuracy. Future work in this direction may improve accuracy and the other metrics with little training time, and the approach can be applied even where data sets are scarce. The extracted potential features could also be given to a CNN instead of custom-made if-then-else rules. However, as a CNN takes comparatively more time than custom-made if-then-else rules, the custom-made rules suffice for good model performance in this scenario.
In this paper, we have devised a new approach to classifying pharyngitis based on matching potential feature templates in test images. We varied the size of the image portion and the size of the template matching area in our experiments. A match is counted when the pixel values of the test image fall within the limits of the min and max pixel values of a feature template in all three color channels (blue, green and red). We plotted graphs for varying sizes of the template matching area and of the portion of the test image considered. The highest accuracy of 72.56% is achieved with size of 2
