Abstract
Problem statement
While CT (Computed Tomography) is commonly used, its diagnostic accuracy for chronic sinusitis remains uncertain. Moreover, the high cost of CT examinations limits its use as a routine diagnostic method. There is an urgent need to develop an AI-assisted diagnostic model for sinusitis.
Objective
The primary aim of this study is to develop an AI-assisted diagnostic model for sinusitis that can improve diagnostic accuracy and accessibility compared to traditional CT methods.
Methodology
This retrospective study included patients diagnosed with chronic sinusitis via CT and normal patients admitted to the People's Hospital between January 2018 and January 2019. All cases underwent T (targeted) coronal plain scans in the hospital's CT room, and all CT images were complete. To construct the deep learning classification model for chronic sinusitis, a total of 5000 soft-tissue-window sinus CT images were collected: 1000 images for each of the four groups diagnosed with sphenoid sinusitis, frontal sinusitis, ethmoid sinusitis, and maxillary sinusitis, along with 1000 images from normal cases (250 images per group). The sigmoid function replaced the softmax function, and the binary cross-entropy function was used to assess the model's predictive accuracy.
Results
The model achieved an accuracy of 85.8%, outperforming doctors with low (71.7%), medium (78.4%), and senior (73.4%) qualifications. The model demonstrated high accuracy, superior feature extraction, and resolution capabilities.
Keywords
Introduction
The clinical diagnosis and treatment of chronic sinusitis are based on the patient's symptoms, such as headache, runny nose, nasal congestion, and facial discomfort. Because symptom-based diagnosis is inherently uncertain, objective examination methods such as CT, combined with nasal endoscopy and anterior rhinoscopy, are used to assess the nasal mucosa and polyps. Some doctors use CT scans to assess the condition and to guide treatment and surgery; CT scan findings can also be used as part of the grading system for some chronic sinusitis conditions. CT is the gold standard for evaluating whether sinusitis requires surgery, and it can be combined with nasal endoscopy to provide patients with more objective data. A goal pursued by doctors is to improve the accuracy of diagnosis before CT is performed while also reducing costs. No correlation has been found between patients' symptoms and their CT findings in chronic sinusitis. Therefore, analysing the correlation between subjective signs and objective imaging can clarify the role of each in clinical practice. This article aims to clarify the significance of CT examination for chronic sinusitis and the correlation between CT and nasal endoscopy results and the severity of clinical symptoms, and to explore how to improve the diagnostic accuracy of chronic sinusitis while reducing costs.
Chronic sinusitis is a common and frequently occurring disease, and there are still many problems in daily diagnosis and treatment. Chen Li explored the effect of all-around care in outpatient clinics for patients with sinusitis. She selected 120 patients with chronic sinusitis who were admitted to the outpatient clinic from January to December 2021 and used the random number table method to divide them into two groups, one receiving general outpatient care and the other comprehensive care, and compared the two groups using a self-rating anxiety scale and a short-form health survey. She concluded that comprehensive outpatient care for patients with chronic sinusitis could effectively relieve the patient's mental stress, improve the patient's quality of life, and increase the patient's nursing satisfaction.1 Zhang Rong described chronic sinusitis as a chronic inflammatory disease of the nasal sinuses. Its causes and pathogenesis are very complex, and it manifests as nasal obstruction, purulent discharge, loss of smell, head swelling and pain, etc.2 Bai Jie analyzed the expression and detection value of interleukin in the tissues of elderly patients with chronic rhinosinusitis and nasal polyps.3 Yang Huarong discussed the recurrence rate of chronic sinusitis patients treated with transnasal endoscopy in Yan'an University Affiliated Hospital from 2019 to 2021 and analyzed its risk factors. He conducted a retrospective analysis of the clinical data of 315 hospital patients who underwent endoscopic sinus surgery, divided the patients into two groups according to whether they had recurrence within 1 year, and compared various clinical indicators between the two groups. He concluded that clinical practice should pay enough attention to these factors and carry out symptomatic intervention for patients with these factors to reduce recurrence after surgery.4 They used textual information in medical records to make a diagnosis, which improved the ability to diagnose chronic sinusitis, but the ability to predict chronic sinusitis was not very good.
Machine learning and deep learning both belong to artificial intelligence, and machine learning includes deep learning. With its excellent performance, especially in image processing, deep learning provides new ideas and methods for medical image analysis. Deep learning algorithms can be divided into two categories according to whether the training data are labelled: supervised learning with labelled data and unsupervised learning with unlabelled data. Supervised learning with labelled data is a commonly used medical image learning method, in which training is guided by the labels attached to the images. It is mainly used in fields such as classification, detection and recognition of medical images.
One study aimed to enhance the accuracy of image diagnosis for chronic maxillary sinusitis (CMS) by developing a deep learning convolutional neural network (CNN) and a support vector machine (SVM) model using CT image data.5 This model was evaluated on 1000 samples from 500 patients collected between January 2018 and December 2021. The results demonstrated the model's sensitivity, specificity, and accuracy for 93 CMS cases and 161 CMS cases with bone remodelling, respectively. In another study, a deep learning algorithm was developed to accurately diagnose sinusitis in the frontal, ethmoid, and maxillary sinuses using both Waters’ and Caldwell's views. It can detect and classify each paranasal sinus without manual cropping. The algorithm achieved diagnostic performance comparable to that of radiologists for ethmoid and maxillary sinusitis and demonstrated a higher AUC than models using only the Waters’ view for maxillary sinusitis.6 This advancement reinforces the value of radiography as an effective first-line imaging tool for sinusitis screening. Deep learning (DL) has also been assessed for detecting, classifying, and segmenting maxillary sinus diseases. From 1167 eligible studies, 14 DL models were trained using radiographic images.7 Accuracy ranged from 75.7% to 99.7%, with AUC values between 0.7 and 0.997. These findings indicate that DL could support students, residents, and dentists in diagnosing conditions and making informed decisions regarding implant treatments.
This article intends to use the original image data collected by sinus CT as the data source and the VGG network (Visual Geometry Group Network) as the framework to build an artificial intelligence-assisted diagnosis model of chronic sinusitis based on a deep learning algorithm, to apply it to chronic sinusitis, to evaluate its clinical application value, and to lay a theoretical and technical foundation for the further development of artificial intelligence diagnosis of chronic sinusitis. With its powerful image processing capabilities, such a model can improve the accuracy of doctors’ reading, shorten the reading time, improve doctors’ work efficiency, reduce doctors’ work intensity, and enhance doctors’ diagnostic consistency. Early diagnosis and treatment can improve the patient's quality of life and reduce the economic burden on the patient's family.
Research methods for artificial intelligence-assisted diagnosis model of sinusitis
Medical image segmentation during the diagnosis and treatment of sinusitis
Given the shortcomings of doctors’ manual target area drawing in the current diagnosis and treatment of sinusitis, as well as the continuous increase in the demand for radiotherapy, this article intends to use this as an entry point to research the key technologies of intelligent auxiliary diagnosis of sinusitis based on deep learning and realize the transformation from a manual drawing by doctors to automatic drawing by artificial intelligence.8,9 Combining the clinical limitations of CT and MRI images with information complementarity, this article will first conduct in-depth research on the registration of multimodal images to provide more useful medical information for segmentation work and help improve segmentation accuracy. Secondly, for medical images currently only applicable to a single mode, this article adopts a new algorithm for automatically segmenting medical images using convolutional networks and improves it. Finally, to better utilize the information on multimodal images and further improve the delineation accuracy of the images, this article intends to segment the multimodal images. Based on the existing research basis, the corresponding solutions are given for the key technical problems involved in the three research contents.
Multimodal sinusitis medical image registration based on dual-channel convolutional network
Due to the high specificity of multimodal images, large differences in multimodal image content, and offsets between images, traditional image registration based on grayscale or features is no longer suitable for multimodal image registration. Given the characteristics of multimodal sinusitis images, this article plans to use convolutional networks to register multimodal sinusitis images.10 To address this problem, this article plans to use a dual-modal convolutional network, on which image similarity measurement and feature extraction are performed. On this basis, this article adopts and improves a new similarity loss function: the maximum mean square error of the features is used as the similarity measure of the matching image and is then combined with a smoothness term on the matching image to construct the loss function of the matching network.
Sinusitis medical image segmentation
At present, automatic segmentation of sinusitis images faces problems such as complex tissue structure, scarce annotated data, few specifications, blurred boundaries between tumours and images, and serious imbalance between foreground and background classes.11 This article adopts a lesion segmentation algorithm for sinusitis lesions based on cascaded 3D fully convolutional networks. First, based on the layered network, block sampling, rotation, flipping, and other methods are applied to the training samples. Given the spatial complexity of 3D (three-dimensional) data, random sampling is performed along three directions (axial, coronal, and sagittal) to make up for the lack of training samples, and the roughly segmented tumour probability map is used as the input of the secondary network.12,13 The method uses atrous spatial pyramid pooling built on DenseNet to enlarge the receptive field and enrich the feature vector of the image, thereby achieving fine segmentation. In addition, to overcome the imbalance between classes, a loss function based on focal loss weighted by tumour voxel distance is used to reduce the weight of easily classified samples. While obtaining smoother boundaries, it can effectively alleviate the problem of vanishing gradients.
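The class-imbalance handling described above builds on focal loss. The following is a minimal sketch of the standard binary focal loss for a single voxel. It does not include the tumour voxel distance weighting mentioned in the text, and the defaults `gamma=2.0` and `alpha=0.25` are common illustrative values, not settings reported in this study.

```python
import math

def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Standard focal loss for one voxel.

    p: predicted foreground probability, y: ground-truth label (0 or 1).
    gamma and alpha are illustrative defaults, not values from the study.
    """
    p = min(max(p, eps), 1.0 - eps)           # avoid log(0)
    p_t = p if y == 1 else 1.0 - p            # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma shrinks the loss of well-classified voxels
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

With `gamma=0` and `alpha=0.5` the expression reduces to half of the ordinary cross-entropy, which shows that the modulating factor is what down-weights easy voxels.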
This method uses a top-down approach to train features after deep learning and adjust their weights to achieve better-fitting results. Each layer of neurons will calculate the input data and pass the results of the calculations to the next layer. The calculation results of each layer are the features learned by the network, and the features can be divided into shallow features and deep features from top to bottom. Among them, deep features contain more abstract pattern information. For example, for identifying animals, deep features may include ears, noses, eyes, and other important information helpful for classification. Finally, the neural network compresses and fuses the obtained features and outputs the prediction results according to the threshold value. This process is very similar to the processing mode of the human brain.
Reliability is assessed by segmenting the same images with different methods and measuring the similarity of the results. First, an experienced imaging physician manually delineates the target area; the same sequence of MRI images is then segmented with the algorithm described in this article, and the two segmentation results are compared. Reliability is then calculated with the following formula:
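The reliability formula itself does not appear in the text. As an illustration only, one widely used measure for comparing a manual and an automatic segmentation is the Dice coefficient. The sketch below assumes that measure, which may differ from the formula the authors used, and the function name is hypothetical.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap between two binary masks given as equally sized nested lists."""
    flat_a = [v for row in mask_a for v in row]
    flat_b = [v for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must have the same size")
    intersection = sum(1 for a, b in zip(flat_a, flat_b) if a and b)
    total = sum(1 for a in flat_a if a) + sum(1 for b in flat_b if b)
    # Two empty masks agree perfectly by convention
    return 1.0 if total == 0 else 2.0 * intersection / total
```

A value of 1 means the two delineations coincide, and a value of 0 means they do not overlap at all.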
Multimodal medical image segmentation based on multi-task learning
CT images have blurry borders and poor soft-tissue resolution, while MRI images have artifacts, are insensitive to calcification and cortical bone lesions, and have poor spatial resolution. This article therefore intends to take sinusitis cells as the research object, study the useful information in each modality of multimodal images and its guiding role, and analyze new algorithms for multi-task learning in multimodal images.14 Three sub-networks can be further constructed for joint learning: a CT segmentation network, an MRI segmentation network, and an image similarity measure between CT and MRI. In addition, for tasks specific to a single channel, a task-specific attention module is learned to avoid overfitting, thereby making full use of common features and improving the recognition accuracy of the target.
The preprocessing of CT-scanned images is essential for achieving optimal performance during model training. First, each CT image is resized to a uniform dimension of 224 × 224 pixels while preserving the aspect ratio to maintain consistency across the dataset. After resizing, pixel values are normalized to a range of [0, 1] by dividing each value by 255. This helps accelerate the model's convergence during training by ensuring that input features are on a similar scale. Several data augmentation techniques are employed to diversify the training dataset and mitigate overfitting. These techniques include random rotations within a specified range (e.g., ± 20 degrees), horizontal flipping, varying zoom levels, and adjustments to brightness to mimic different lighting conditions.
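The resizing, normalization, and augmentation steps above can be sketched as follows. This is a simplified illustration using NumPy only. It resizes by nearest-neighbour sampling without preserving the aspect ratio (an image would need to be padded to a square first for that), it omits rotation and zoom, and the brightness range of 0.8 to 1.2 is an assumed value. All function names are hypothetical.

```python
import numpy as np

def resize_nearest(img, size=224):
    """Nearest-neighbour resize of a 2D array to size x size."""
    h, w = img.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]

def preprocess(img_uint8, size=224):
    """Resize an 8-bit image and scale its pixel values to [0, 1]."""
    return resize_nearest(img_uint8, size).astype(np.float32) / 255.0

def augment(img, rng):
    """Random horizontal flip and brightness change; output stays in [0, 1]."""
    if rng.random() < 0.5:
        img = img[:, ::-1]
    factor = rng.uniform(0.8, 1.2)  # assumed brightness range
    return np.clip(img * factor, 0.0, 1.0)
```

Augmentation of this kind would normally be applied to training images only, so that validation images remain unaltered.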
Moreover, contrast enhancement techniques, such as histogram equalization, are applied to improve the visibility of anatomical structures and lesions within the CT scans. Noise reduction methods, like Gaussian filtering, are then utilized to decrease artifacts and refine the details of the images, facilitating better feature extraction. To focus on relevant anatomical regions, segmentation is performed to isolate the sinus areas from the remainder of the CT images, employing techniques such as thresholding or CNNs based on the complexity of the images. Finally, the preprocessed images are divided into training and validation datasets, typically following an 80–20 split. This approach ensures that a substantial portion of the data is used for training while reserving a separate set for validating the model's performance.
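Histogram equalization, mentioned above as a contrast enhancement step, can be illustrated with a short NumPy sketch. It implements plain global equalization of an 8-bit grayscale image. Whether the study used global or adaptive equalization is not stated, so this is an assumption, and the function name is hypothetical.

```python
import numpy as np

def equalize_histogram(img_uint8):
    """Global histogram equalization of an 8-bit grayscale image."""
    hist = np.bincount(img_uint8.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # cumulative count at the darkest present level
    denom = cdf[-1] - cdf_min
    if denom == 0:                     # constant image: nothing to equalize
        return img_uint8.copy()
    # Lookup table mapping each grey level to its equalized value
    lut = np.clip(np.round((cdf - cdf_min) / denom * 255.0), 0, 255).astype(np.uint8)
    return lut[img_uint8]
```

The lookup table stretches the grey levels that actually occur in the image across the full 0 to 255 range, which is what makes low-contrast structures easier to see.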
Convolutional neural network in deep learning
Deep learning is a branch of artificial intelligence whose concept grew out of the study of the way the brain works. Researchers found that the human visual system processes information in layers; deep neural networks were therefore established to simulate how neurons in the brain process and transmit information. The depth of the network is determined by the number of hidden layers between the input layer and the output layer, so that the layers of the network can learn the underlying features of the image and progressively express them as abstract high-level features. Deep learning is applied in many areas, such as search, driverless driving, natural language processing, and image processing; the successful implementation of this work will promote the intersection of multiple disciplines and has important scientific significance.
(1) Convolutional layer
Convolutional layers are the core of convolutional neural networks. Different features are obtained depending on the convolutional layer's location: shallow convolutions capture low-level information, such as shape and texture, while deep convolutions capture more semantic information. By optimizing the loss function, backpropagation is carried out repeatedly to update the parameters in the convolution kernels, and an activation function is finally applied to obtain the output feature map. The calculation formula of the convolution layer in the artificial intelligence-assisted diagnosis of sinusitis is as follows:15,16
The height and width result in
In real life, colour images usually have three channels, R, G, and B (Red, Green, and Blue). When each image fed to the network has three channels, the convolution operation is the same as in the single-channel case, except that each convolution kernel also has three channels. The pixels of the original image are multiplied element-wise by the corresponding kernel weights, and the products are summed to obtain one pixel of the output feature map. Sliding the kernel over the entire input image yields an output feature map with a size of 4 × 4. If there is only one convolution kernel, only a single output feature map is obtained. In general, the number of channels of each convolution kernel equals the number of channels of the input image, and the number of convolution kernels equals the number of output feature maps.
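The multi-channel convolution described above can be written out directly. The sketch below assumes a 6 × 6 input with three channels and 3 × 3 kernels at stride 1 without padding, which is one configuration that yields a 4 × 4 output; the actual input and kernel sizes behind the 4 × 4 example are not given in the text.

```python
import numpy as np

def conv2d_multichannel(image, kernels, stride=1):
    """Valid convolution (cross-correlation, as used in CNNs).

    image: (C, H, W); kernels: (K, C, kh, kw); returns (K, H_out, W_out).
    """
    c, h, w = image.shape
    k, kc, kh, kw = kernels.shape
    assert c == kc, "kernel channels must match image channels"
    h_out = (h - kh) // stride + 1
    w_out = (w - kw) // stride + 1
    out = np.zeros((k, h_out, w_out))
    for n in range(k):                      # one output map per kernel
        for i in range(h_out):
            for j in range(w_out):
                patch = image[:, i * stride:i * stride + kh,
                              j * stride:j * stride + kw]
                out[n, i, j] = np.sum(patch * kernels[n])  # sum over all channels
    return out

# Assumed example: 3-channel 6 x 6 image, two 3 x 3 kernels
feature_maps = conv2d_multichannel(np.ones((3, 6, 6)), np.ones((2, 3, 3, 3)))
```

Here two kernels produce two 4 × 4 feature maps, matching the rule that the number of kernels sets the number of output channels.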
(2) Fully connected layer
The last few layers of the network model are usually fully connected layers, whose function is to weight the features extracted by the preceding convolutional layers and the dimensionality-reduced features of the pooling layers to complete the classification. First, a preliminary neural network including two levels of convolution and pooling is constructed and combined with the network to be identified. The number of neurons in the last layer is then kept consistent with the number of categories of the recognition task, and the outputs of these neurons are converted into probabilities for each category, giving the predicted value of each category and completing the recognition task.
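The abstract states that the sigmoid function replaced softmax and that binary cross-entropy was used. A minimal sketch of how a single output value could be turned into a probability, a loss, and a normal/abnormal decision is shown below. The 0.5 decision threshold and the function names are assumptions for illustration.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def binary_cross_entropy(probs, labels, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)     # avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)

def predict(logits, threshold=0.5):
    """Label 1 (abnormal) when the sigmoid output reaches the assumed threshold."""
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]
```

For example, `predict([2.0, -2.0])` labels the first image abnormal and the second normal, and the loss is small when confident predictions agree with the labels.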
Assuming that the space step is h and the time step is t, the numerical solution of the chronic sinusitis level set function can be obtained:18,19
The VGG model, developed by the Visual Geometry Group at the University of Oxford, is renowned for its straightforward and consistent architecture. It features a sequence of convolutional layers followed by fully connected layers, which enables efficient multi-level feature extraction, making it highly effective for image analysis tasks. With configurations like VGG16 and VGG19, which contain 16 and 19 weight layers, the model's depth allows it to learn complex patterns and features from images, improving classification and detection performance. Deeper networks such as VGG can attain higher accuracy in various image recognition tasks. A key benefit of the VGG model is its compatibility with transfer learning. Pre-trained VGG models, typically trained on large datasets like ImageNet, can be adapted for specific tasks with relatively small datasets. This is particularly advantageous in medical imaging, where annotated data is often limited. By utilizing the VGG model's pre-learned features, researchers can achieve strong performance with less training data. Additionally, the VGG model has proven robust to variations in image quality, orientation, and scale, qualities that are critical in medical imaging, where images may differ due to varying imaging protocols or patient conditions. This adaptability enhances its reliability for diagnostic applications.
The VGG model has set a high standard in various image classification benchmarks. It has succeeded in competitions like the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), solidifying its reputation as a benchmark model. Its performance in these settings also boosts confidence in its suitability for medical image analysis. Moreover, the architecture allows for effective visualization of learned features, providing insights into the model's decision-making process. This transparency is especially valuable in medical settings, where understanding the rationale behind diagnoses can increase trust and acceptance among clinicians. It also integrates well with advanced techniques such as data augmentation, ensemble methods, and attention mechanisms, enhancing its effectiveness in image analysis. This flexibility allows researchers to adapt the model to specific needs, improving diagnostic accuracy. The VGG model's structural advantages, depth, transfer learning compatibility, robustness, and demonstrated success in image analysis make it a valuable asset in medical imaging applications. The VGG model stacks small convolutions layer upon layer, which improves the nonlinear mapping ability of the model while keeping the receptive field constant. The receptive field obtained by stacking two 3 × 3 convolutions is equivalent to that of a single 5 × 5 convolution; more layers can be stacked to enlarge the receptive field further, while the computational complexity and the number of parameters are effectively reduced. The tuning parameters used for the deep learning VGG model are represented in Table 1.
Tuning hyperparameters for deep learning VGG model.
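The claim that two stacked 3 × 3 convolutions cover the same receptive field as one 5 × 5 convolution with fewer parameters can be checked with a short calculation. The sketch assumes stride-1 convolutions with the same number of input and output channels and ignores biases; the channel count of 64 is an illustrative value.

```python
def stacked_receptive_field(kernel_sizes):
    """Receptive field of stride-1 convolutions applied in sequence."""
    rf = 1
    for k in kernel_sizes:
        rf += k - 1          # each layer widens the field by (k - 1) pixels
    return rf

def conv_weights(kernel_size, channels):
    """Weights (no biases) of one layer with `channels` inputs and outputs."""
    return kernel_size * kernel_size * channels * channels

channels = 64                                # illustrative channel count
two_3x3 = 2 * conv_weights(3, channels)      # 73728 weights
one_5x5 = conv_weights(5, channels)          # 102400 weights
```

Both options see a 5 × 5 neighbourhood, but the stacked version uses 18C² weights instead of 25C² and inserts an extra nonlinearity between the two layers.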
Compared with traditional medical image processing, deep learning technology has significant advantages in the auxiliary diagnosis of diseases. Given the characteristics of nasal endoscopy, the research on the auxiliary diagnosis method of nasal endoscopy based on deep learning is of great significance. On this basis, this article adopts a new technique that combines target recognition and target tracking. The improved VGG is used to identify the nasal lesion area. The recognition result is used to initialize the multi-scale feature fusion target tracking network to realize comprehensive intelligent identification and positioning of the nasal lesion area, helping doctors to complete the patient's location search, identification and positioning in real-time, and has good clinical application prospects.
With the advancement of science and technology and the development of medicine, medical imaging technology has also been further developed. It has been deeply studied in common diseases and image-based diagnosis, especially in pulmonary nodule detection, breast cancer diagnosis, etc., where it plays a great role. Currently, MRI, computed tomography, positron emission tomography and other technologies are mainly used clinically for imaging diagnosis of nasopharyngeal cancer. Various imaging methods can reflect the subject's medical information from different perspectives, and doctors often choose appropriate imaging methods based on the subject's characteristics. Among them, MRI technology is widely used to visualize tumours and other human tissues and functions and can conduct detailed imaging of various body parts at any level. At the same time, MRI has a high resolution for soft tissues and, unlike X-rays and CT, does not expose the human body to ionizing radiation. Therefore, MRI technology has significant advantages in the research of nasopharyngeal cancer.
In diagnosing nasopharyngeal cancer, magnetic resonance imaging (MRI) must collect images from the patient's transverse, sagittal, coronal and other directions. Each level can be divided into two categories, the longitudinal relaxation sequence and the transverse relaxation sequence, each with more than 40 images. The diagnosis of nasopharyngeal cancer is mainly based on observing abnormalities in multi-directional and multi-sequence MRI. The clinical manifestations of nasopharyngeal cancer are relatively complex and demand extensive clinical experience from doctors. The huge amount of brain MRI sequence image data brings doctors a heavy workload and a long diagnostic cycle. Moreover, fatigue during long diagnostic sessions and differences in clinical experience among doctors can easily lead to misdiagnosis and missed diagnosis. In recent years, with the development of computer technology, computer-aided diagnosis has become an important research direction of medical imaging. Its main purpose is to process and analyze the patient's medical images by computer in order to discover lesions, determine the nature of the lesions, and match and compare them with the patient's comprehensive pathology data, thereby obtaining a preliminary result that doctors can use for diagnosis.
With the continuous development of computer-intelligent diagnosis technology, computer-aided diagnosis technology has been widely used in medical imaging. Computer-aided diagnosis technology based on medical images roughly follows these steps:
Medical image preprocessing: Preprocessing mainly includes image enhancement, denoising, etc.
Medical image segmentation: The goal of segmentation is to extract and display the observed organs, tissues, etc.
Classification recognition: In this stage, a classifier is designed, a training sample is used to train the classifier, and finally, the obtained features are classified.
The contrast of brain imaging is low, and the tissues and organs of the brain are complex and diverse. Differences between people and the blurred boundaries of different soft tissues and mucous membranes make image segmentation very difficult. Based on the previous research, this project plans to take nasopharyngeal cancer as the research object, use image feature extraction technology to extract the nasopharyngeal mucosal area from the image features, and conduct quantitative and qualitative analysis on it to realize intelligent diagnosis of nasopharyngeal cancer patients, reduce the workload of doctors, improve their work efficiency, and reduce the physical exertion of doctors and the resulting misdiagnosis and missed diagnosis, thereby improving the accuracy of their diagnosis.
Nasal lesion is a disease that seriously endangers people's health. Its diseased site is relatively special and is closely related to people's lives. The sinuses, nasal cavity, and surrounding tissues are deep, highly complex, and susceptible to pathogenic bacterial infection, which makes clinicians’ diagnoses very difficult. 20 Even otolaryngologists with decades of experience have difficulty making a comprehensive, accurate, and proactive diagnosis of suppuration or infection.21,22
Construction of artificial intelligence-assisted diagnosis model for sinusitis
Research object
This article adopts a retrospective method. Patients with chronic sinusitis confirmed by CT examination and normal subjects seen at the People's Hospital from January 2018 to January 2019 were selected as the research objects, and a total of 5000 sinus CT images were collected.
(1) Inclusion criteria
One of the symptoms of nasal congestion and runny nose must appear at the same time, or symptoms such as swelling and pain of the head and face and hyposmia must occur at the same time;
CT examination can detect damage to the sinus or meatus complex and/or nasal mucosa; nasal endoscopy can be used to observe whether there are nasal polyps and mucopus in the nasal cavity and whether there is edema; it was pathologically diagnosed as chronic sinusitis syndrome; 23 the subjects were able to actively accept relevant examinations and treatments.
(2) Exclusion criteria
Patients with asthma and other upper respiratory tract infections; history of nasal and sinus surgery;24,25
A history of topical medication in the nasal cavity in the past three months; 26 other benign and malignant tumours in the nasal cavity;27,28
Those with incomplete case data.
(3) Collection of basic clinical data
Basic demographic characteristics: gender, age, previous surgical history;
Clinical manifestations: having a history of allergic rhinitis, asthma and other comorbidities;
Laboratory items: serum immunoglobulin E, allergens, blood routine, serum antibodies;
Nasal polypoma was observed using the paraffin method.
Allergen Testing: Patients must undergo a skin prick test for the following allergies: tree pollen, grass pollen, dolphin pollen, dust mites, cat, dog, mold, or cockroach. In one or more of the above items, the prick test result of one or more items is greater than (++) and positive; ImmunoCAP can also be used to detect specific IgE (Immunoglobulin E) for common inhaled allergens. If it exceeds 0.35 IU/mL, it is a mild allergy or suspected, and IgE above 0.7 IU/mL is positive.
Symptom statistics: after a detailed questioning of the patient by a professional otolaryngologist, recording the patient's current medical history, including whether the patient has nasal congestion, runny nose, hyposmia, head and face symptoms, sneezing, coughing, ear symptoms, sleep symptom. The patient's consultation data were incomplete during the acute phase, and current medical history data were excluded.
CT scanning method
All cases underwent a T-coronal plain scan in the CT room of the hospital, and the CT images of all cases were complete. During the CT scanning process, the patient's sinuses were scanned in 3D using the PHILIPS MxLite View software, which displayed the anatomical structure of the patient's sinuses and the scope and degree of lesions. Before each patient's operation, the images were analyzed with the Lund-Mackay scoring method at a window width of 360 and a window level of 60 (scoring rules: sinus: 0 = no abnormality, 1 = partial opacity, 2 = total opacity; ostiomeatal complex: 0 = no obstruction, 2 = obstruction; the total score is 0–12 for one side and 0–24 for both sides). The Lund-Mackay score of chronic sinusitis is shown in Table 2. At a window width of 2000 and a window level of 300, the GOSS bone inflammation scoring method was used for scoring.
Chronic sinusitis Lund-Mackay scale.
The frontal sinus-like structure must be consistent with the vertical part of the frontal bone to achieve a complete sinus-like structure. The maxillary sinus should be directed downwards below the nasal floor area and cut transversely to the zygomatic process. The end of the ethmoid bone should be the cribriform plate, while the end of the sphenoid bone should be vertical, extending to the pillars of the optic nerve. Rudimentary sinuses have their corresponding opacification score multiplied by 0.5; thus, a grade of underdevelopment halves the opacification score for that particular sinus cavity. Sinuses that did not appear were excluded from the overall calculation. For scans with missing sinuses, the total score was multiplied by a correction factor to compensate for this error. The correction factor was calculated by dividing the maximum possible score of this CT scan by 24.
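The basic Lund-Mackay rules given above (each sinus scored 0, 1, or 2; the ostiomeatal complex scored 0 or 2; 0–12 per side and 0–24 in total) can be sketched as follows. The five sinus names follow the standard Lund-Mackay system and are assumed to match Table 2. The adjustments for rudimentary or absent sinuses are deliberately left out, and the function names are hypothetical.

```python
SINUSES = ("maxillary", "anterior_ethmoid", "posterior_ethmoid",
           "sphenoid", "frontal")

def lund_mackay_side(sinus_scores, omc_obstructed):
    """Score one side: each sinus 0, 1 or 2; ostiomeatal complex 0 or 2."""
    total = 0
    for name in SINUSES:
        score = sinus_scores[name]
        if score not in (0, 1, 2):
            raise ValueError(f"invalid score for {name}: {score}")
        total += score
    return total + (2 if omc_obstructed else 0)

def lund_mackay_total(left, right):
    """left and right are (sinus_scores, omc_obstructed) pairs; result is 0-24."""
    return lund_mackay_side(*left) + lund_mackay_side(*right)
```

A side with every sinus fully opacified and an obstructed ostiomeatal complex scores 12, so two such sides give the maximum of 24.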
(1) Sinus CT image data set
A comparison of sinusitis and normal sinus CT images in the four groups of sphenoid sinusitis, frontal sinusitis, ethmoid sinusitis, and maxillary sinusitis is shown in Figure 1. A total of 5000 horizontal soft-tissue-window sinus CT images were collected: 1000 images for each of the four groups diagnosed with sphenoid sinusitis, frontal sinusitis, ethmoid sinusitis, and maxillary sinusitis, and 1000 normal images in four groups of 250 each. The images were classified using traditional Chinese medicine imaging diagnostic standards as a reference, with two rhinologists consulted and with comprehensive medical history, nasal endoscopy, and sinus CT data integrated. Because the number of samples in the data set is small, the data set was divided into a training set and a test set in an 8:2 ratio to prevent overfitting, and a binary label y was constructed for each image, with normal labelled 0 and abnormal labelled 1. This framework aimed to enhance the diagnostic accuracy of chronic sinusitis by combining traditional diagnostic methods with advanced imaging techniques.
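The 8:2 split with binary labels can be sketched as below. The text does not say whether the split was stratified by label or grouped by patient; the sketch assumes a stratified split so that the training and test sets keep the same normal-to-abnormal ratio. The function name and the seed are hypothetical.

```python
import random

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with each label split in the same proportion."""
    rng = random.Random(seed)
    by_label = {}
    for idx, y in enumerate(labels):
        by_label.setdefault(y, []).append(idx)
    train_idx, test_idx = [], []
    for indices in by_label.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)

# 4000 abnormal images (label 1) and 1000 normal images (label 0), as in the text
labels = [1] * 4000 + [0] * 1000
train_idx, test_idx = stratified_split(labels)
```

With these counts the test set holds 800 abnormal and 200 normal images. If several images come from the same patient, splitting by patient rather than by image would avoid the same patient appearing in both sets.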

Comparison of sinusitis and normal sinus CT images in the four groups: sphenoid sinusitis, frontal sinusitis, ethmoid sinusitis, and maxillary sinusitis.
(2) Image preprocessing
Image preprocessing is an important part of the training process. It begins with rescaling and normalization. Since CT images vary in resolution and pixel intensity, each image is resized to a uniform dimension of 224 × 224 × 3, matching the model's input requirements. Pixel values are then normalized to a standardized range, typically between 0 and 1 or between −1 and 1, which keeps brightness and contrast consistent across images and improves feature stability. Noise reduction is applied next, as CT scans often contain inherent noise and artifacts. Techniques such as Gaussian or median filtering suppress these disturbances, helping the model concentrate on relevant anatomical features without interference from extraneous variations. Contrast enhancement follows, addressing the naturally low contrast of sinus CT images. Methods such as histogram equalization enhance the visibility of structural details within the sinuses, allowing the model to distinguish tissue boundaries and sinus cavities more effectively.
To mitigate overfitting and simulate diverse clinical conditions, data augmentation is employed. This involves rotations, flips, scaling, and brightness adjustments, which expand the dataset's variability and improve the model's generalizability, particularly when it is tested with new data. Another critical step is segmentation and region of interest (ROI) extraction, where automatic segmentation techniques isolate the sinus regions from adjacent tissues, concentrating on the areas crucial for diagnosing sinusitis. Approaches such as thresholding or a 3D convolutional network allow for accurate ROI identification. In cases where multimodal images (e.g., CT and MRI) are used, alignment and registration techniques bring the images into a common coordinate system, which facilitates cross-referential accuracy in feature extraction and is particularly valuable when fusing data from different imaging modalities.
This preprocessing pipeline ensures that the CT images used in model training are optimized for accurate feature extraction and classification, ultimately improving the AI model's performance in diagnosing sinusitis.
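Several of the preprocessing steps named above (normalization, median filtering, histogram equalization, and a flip augmentation) can be illustrated on a tiny grayscale image stored as nested lists. This is a sketch under stated assumptions rather than the authors' pipeline: resizing, segmentation, and registration are omitted, the border handling (edge replication) is an assumption, and the order of the steps in the example is chosen so that equalization receives integer intensities.

```python
from statistics import median

def min_max_normalize(img):
    """Scale pixel values linearly to the range [0, 1]."""
    flat = [p for row in img for p in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1  # avoid division by zero for a constant image
    return [[(p - lo) / span for p in row] for row in img]

def median_filter3(img):
    """3x3 median filter; the border is handled by edge replication (assumed)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[min(max(j, 0), h - 1)][min(max(i, 0), w - 1)]
                      for j in range(y - 1, y + 2)
                      for i in range(x - 1, x + 2)]
            row.append(median(window))  # 9 values, so the median is a pixel value
        out.append(row)
    return out

def equalize_histogram(img, levels=256):
    """Histogram equalization for integer intensities in [0, levels)."""
    flat = [p for row in img for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    denom = (len(flat) - cdf_min) or 1
    return [[round((cdf[p] - cdf_min) / denom * (levels - 1)) for p in row]
            for row in img]

def horizontal_flip(img):
    """Simple augmentation: mirror the image left to right."""
    return [row[::-1] for row in img]

# Toy 3x3 image with one bright noise spike in the centre (illustrative values).
toy = [[10, 20, 30],
       [10, 255, 30],
       [10, 20, 30]]
denoised = median_filter3(toy)
enhanced = equalize_histogram(denoised)
normalized = min_max_normalize(enhanced)
augmented = horizontal_flip(normalized)
print(denoised[1][1])  # 20: the spike is replaced by the neighbourhood median
```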
(3) Model construction
The model in this article is based on VGG. It uses 3 × 3 convolution kernels and 2 × 2 max pooling for downsampling. After four cycles of convolution and downsampling, the fully connected layer is reached, where all feature maps containing local information (including the height, width, and number of channels of the feature map) are mapped to 4096 dimensions. The network structure design rules are as follows:
Input: a 224 × 224 image with 3 channels enters the convolutional layer.
Image preprocessing: mean preprocessing is applied by subtracting the RGB average calculated on the training set from each pixel, and the data are standardized to a distribution with a mean of 0 and a variance of 1.
Convolutional layer: the network stacks consecutive small (3 × 3) convolution kernels. After each convolution, the resolution of the image is unchanged.
Pooling layer: pooling follows the convolutions and reduces the image's resolution. Spatial pooling is implemented through 5 max-pooling layers, mainly using a 2 × 2 pixel window with a stride of 2.
Feature extraction: the convolution and pooling operations are cycled 4 times; when all of them are completed, the feature information of the sinus CT image is obtained.
Fully connected layer: all feature maps containing local information (including the height, width, and number of channels of the feature map) are mapped to 4096 dimensions.
Output: the convolutional layers are followed by 3 fully connected layers, the first and second of which contain 4096 channels each.
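The design rules above imply a simple bookkeeping of spatial resolution, sketched below. The padding of 1 and stride of 1 for the 3 × 3 convolution are assumptions chosen to match the statement that convolution leaves the resolution unchanged. Because the text mentions both 4 convolution-pooling cycles and 5 max-pooling layers, the sketch reports both cases rather than choosing between them.

```python
def conv_output_size(size, kernel=3, stride=1, padding=1):
    """Output side length of a convolution (padding=1 is an assumption)."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_output_size(size, window=2, stride=2):
    """Output side length of a max-pooling layer."""
    return (size - window) // stride + 1

def spatial_size_after(blocks, size=224):
    """Side length after `blocks` rounds of 3x3 convolution + 2x2 pooling."""
    for _ in range(blocks):
        size = conv_output_size(size)  # 3x3 convolution: resolution unchanged
        size = pool_output_size(size)  # 2x2 pooling, stride 2: resolution halved
    return size

for blocks in (4, 5):
    print(blocks, spatial_size_after(blocks))  # 4 -> 14, 5 -> 7
```

Starting from 224 × 224, four pooling stages give 14 × 14 feature maps and five give 7 × 7; these maps are what the fully connected layer flattens and maps to 4096 dimensions.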
This article divides sinus CT images into the sphenoid sinus area, ethmoid sinus area, frontal sinus area, maxillary sinus area, and normal cases. The sigmoid function replaces the softmax function, and the binary cross-entropy function is used to evaluate the model's prediction results.
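For reference, the sigmoid output and the binary cross-entropy loss mentioned above can be written in a few lines. This is a generic sketch of the two standard functions, not the authors' MATLAB code; the logits and labels in the example are made-up values, and the clipping constant `eps` is an assumption added for numerical safety.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function, mapping a logit to (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Made-up logits and labels (0 = normal, 1 = abnormal), for illustration only.
logits = [2.0, -1.0, 0.0]
labels = [1, 0, 1]
probs = [sigmoid(z) for z in logits]
loss = binary_cross_entropy(labels, probs)
print([round(p, 3) for p in probs], round(loss, 3))
```

Unlike softmax, the sigmoid scores each output independently, so the outputs do not have to sum to 1.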
The deep learning model for diagnosing chronic sinusitis is primarily built on CNNs, which are highly effective for image analysis tasks. It starts with an input layer that accepts sinus CT images, which are preprocessed to maintain consistent size and format, essential for reliable training. The core of the architecture consists of multiple convolutional layers, where each layer applies a series of filters to the input images. This enables the model to learn hierarchical features; shallow layers detect basic patterns like edges and textures, while deeper layers capture more complex structures and patterns pertinent to sinusitis diagnosis.
After each convolution, a Rectified Linear Unit (ReLU) activation function introduces non-linearity, allowing the model to learn more intricate relationships in the data. Pooling layers, often using max pooling, follow the convolutional layers to reduce the spatial dimensions of the feature maps. This down-sampling reduces the computational load and helps prevent overfitting by introducing a degree of translation invariance. After several convolutional and pooling layers, the model transitions to fully connected layers. These layers take the high-level features extracted by the convolutional layers and combine them to form the final prediction about the presence or absence of sinusitis. The output layer provides probability scores for each class (normal vs. abnormal sinus CT), with the class having the highest probability chosen as the model's final prediction. Such an output layer is typically a softmax layer; in this model it is replaced by the sigmoid function.
The model learns to extract these features by training on a labelled dataset of sinus CT images and adjusting the filters’ weights through backpropagation to minimize the loss function. The architecture leveraging CNNs captures the intricate features within sinus CT images, enabling accurate chronic sinusitis diagnosis. This deep learning approach offers a more nuanced understanding of the data than traditional image analysis techniques, leading to better diagnostic accuracy.
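The convolution, ReLU, and max-pooling sequence described above can be traced by hand on a tiny example. The sketch below is illustrative only: the 4 × 4 image and the vertical-edge kernel are hand-picked rather than learned, and zero padding at the border is an assumption made so that the convolution preserves resolution.

```python
def conv2d_same(img, kernel):
    """Cross-correlate img with an odd-sized kernel, zero-padding the border."""
    h, w = len(img), len(img[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            acc = 0
            for j in range(kh):
                for i in range(kw):
                    yy, xx = y + j - ph, x + i - pw
                    if 0 <= yy < h and 0 <= xx < w:  # outside pixels count as 0
                        acc += img[yy][xx] * kernel[j][i]
            row.append(acc)
        out.append(row)
    return out

def relu(fmap):
    """Replace negative responses with zero."""
    return [[max(0, v) for v in row] for row in fmap]

def max_pool2x2(fmap):
    """2x2 max pooling with stride 2 (odd trailing rows/columns are dropped)."""
    h, w = len(fmap) // 2 * 2, len(fmap[0]) // 2 * 2
    return [[max(fmap[y][x], fmap[y][x + 1], fmap[y + 1][x], fmap[y + 1][x + 1])
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]

# A dark-to-bright vertical edge and a hand-picked vertical-edge kernel.
image = [[0, 0, 1, 1] for _ in range(4)]
kernel = [[-1, 0, 1] for _ in range(3)]

features = conv2d_same(image, kernel)  # same 4x4 resolution as the input
pooled = max_pool2x2(relu(features))   # halved to 2x2
print(pooled)  # [[3, 3], [3, 3]]
```

In a trained network the kernel weights are not fixed by hand but adjusted through backpropagation, as the text describes.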
Sinusitis artificial intelligence-assisted diagnosis model simulation experiment
Experimental environment
Application environment: MATLAB R2021a, Windows 10 operating system. Hardware configuration: Intel(R) Core(TM) i5-7200U CPU @ 2.50 GHz (2.70 GHz), 10 GB memory, 500 GB hard disk.
Simulation experiment
Sinus CT images were divided into normal sinus CT and abnormal sinus CT (maxillary sinusitis, frontal sinusitis, ethmoid sinusitis, sphenoid sinusitis). After the training phase, the test set was used for testing, and the confusion matrices of the above five groups were obtained. Based on the confusion matrices, the validity of the deep learning-based computer-aided diagnosis model for chronic sinusitis was evaluated using six indicators: accuracy, precision, sensitivity, specificity, interpretation time, and the area under the ROC curve (AUC).
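The four count-based indicators can be derived from a confusion matrix as sketched below, treating one group at a time as the positive class (one-vs-rest). The 3 × 3 matrix is a made-up toy example, not the study's data, and the row-as-true-class convention is an assumption.

```python
def one_vs_rest_counts(matrix, k):
    """Return (tp, fp, tn, fn) for class k; rows = true class, columns = predicted."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                  # class k images predicted as another class
    fp = sum(row[k] for row in matrix) - tp   # other images predicted as class k
    tn = total - tp - fn - fp
    return tp, fp, tn, fn

def ratio(numerator, denominator):
    """Guarded division so that an empty class yields 0.0 instead of an error."""
    return numerator / denominator if denominator else 0.0

def binary_metrics(tp, fp, tn, fn):
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
    }

# Hypothetical 3-class confusion matrix, for illustration only.
toy_matrix = [[8, 1, 1],
              [2, 7, 1],
              [0, 2, 8]]
counts = one_vs_rest_counts(toy_matrix, 0)
metrics = binary_metrics(*counts)
print(counts)  # (8, 2, 18, 2)
print({name: round(value, 3) for name, value in metrics.items()})
```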
The sinus soft tissue showed obvious density inhomogeneity, and the difference in CT value between the high-density area and the surrounding tissue was more than 70 HU. The reference standard for hyperostosis and sclerosis of the adjacent sinus wall was visible thickening of the bony wall relative to the contralateral and adjacent uninvolved sinuses, with the thickness of the sinus wall at the corresponding site on the same level greater than 1 mm. The criterion for bone destruction was a partial bony defect or bony discontinuity, with the unaffected contralateral sinus and adjacent sinus wall serving as the control.
For the binary evaluation, sinus CT images were divided into two categories: normal and abnormal. On this basis, the method was evaluated from six aspects: sensitivity, specificity, precision, accuracy, interpretation time, and the area under the ROC curve (AUC).
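The AUC named among the evaluation aspects can be computed directly from predicted scores using its rank interpretation: it is the probability that a randomly chosen abnormal image receives a higher score than a randomly chosen normal one, with ties counted as one half. The sketch below uses made-up scores and is not tied to the authors' software.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))

# Made-up labels (0 = normal, 1 = abnormal) and model scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 0.75
```

The pairwise loop is quadratic in the number of images, which is acceptable for a test set of this size; a sort-based version would be preferable for much larger sets.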
Comparative experiment
The comparison experiment process is shown in Figure 2. To evaluate the effectiveness of the model, doctors at three qualification levels served as controls: low-qualified doctors (2 residents), medium-qualified doctors (2 attending doctors), and senior-qualified doctors (1 deputy senior doctor and 1 director). A total of 200 frames were selected from the sinus CT images (40 frames each for sphenoid sinusitis, frontal sinusitis, ethmoid sinusitis, maxillary sinusitis, and normal sinuses) to obtain classification confusion matrices for the doctors at each qualification level.

Comparative experimental process.
The senior doctors reviewed the films simultaneously; where the two doctors disagreed, they reached a consensus through discussion. Every reader recorded the interpretation time for each image. Five evaluation measures were used: sensitivity, specificity, precision, accuracy, and interpretation time.
SPSS 25 statistical software was used to analyze the data. Quantitative data were described using descriptive statistics. The chi-square test was used for comparisons between groups, and Fisher's exact test was used when the theoretical (expected) count of a cell was less than 5.
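The rule of switching from the chi-square test to Fisher's exact test when an expected count falls below 5 can be illustrated for a 2 × 2 table as follows. This is a hand-written sketch of the standard formulas, not the SPSS procedure used in the study; the tables are made-up examples, and the two-sided p-value follows the common convention of summing all tables no more probable than the observed one.

```python
from math import comb

def expected_counts(table):
    """Expected cell counts of a 2x2 table under independence."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    return [[rows[i] * cols[j] / n for j in range(2)] for i in range(2)]

def chi_square_statistic(table):
    """Pearson chi-square statistic (no continuity correction)."""
    exp = expected_counts(table)
    return sum((table[i][j] - exp[i][j]) ** 2 / exp[i][j]
               for i in range(2) for j in range(2))

def fisher_exact_two_sided(table):
    """Two-sided Fisher exact p-value from the hypergeometric distribution."""
    (a, b), (c, d) = table
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

def choose_test(table, threshold=5):
    """Use Fisher's exact test when any expected count is below the threshold."""
    smallest = min(min(row) for row in expected_counts(table))
    return "fisher" if smallest < threshold else "chi-square"

small = [[3, 1], [1, 3]]      # made-up table with small counts
large = [[30, 10], [20, 20]]  # made-up table with larger counts
print(choose_test(small), round(fisher_exact_two_sided(small), 4))  # fisher 0.4857
print(choose_test(large), round(chi_square_statistic(large), 3))    # chi-square 5.333
```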
Research results of artificial intelligence-assisted diagnosis model for sinusitis
Model simulation experiment results and analysis
Sinus CT images were divided into two categories: normal and abnormal. The confusion matrix results are shown in Table 3. The values in each horizontal row of the matrix sum to 1000.
Exploration results of the confusion matrix.
The method was evaluated from six aspects: sensitivity, specificity, precision, accuracy, interpretation time, and the area under the ROC curve (AUC). The results for sensitivity, specificity, precision, accuracy, and interpretation time are shown in Table 4. The overall accuracy was 85.22%; by group, accuracy was 96.32% for sphenoid sinusitis, 91.11% for frontal sinusitis, 86.65% for ethmoid sinusitis, 90.45% for maxillary sinusitis, and 71.55% for normal images. The sensitivity for sphenoid sinusitis was 61.21%.
Comparison of recognition accuracy.
The sensitivity and specificity were analyzed from a clinical point of view. The model's accuracy is high for all four groups of sinusitis (sphenoid, frontal, ethmoid, and maxillary), which shows that its classification effect is good, while its accuracy for normal images is lower. The model's specificity for the four types of sinusitis is very high, indicating that the probability of misdiagnosis is very low. In terms of sensitivity, maxillary sinusitis and frontal sinusitis are detected more sensitively than sphenoid sinusitis and ethmoid sinusitis. A possible reason is that the maxillary and frontal sinuses occupy independent positions in the horizontal CT image, whereas the sphenoid and ethmoid sinuses overlap more, so their sensitivity is lower and missed diagnoses are more likely. Overall, the model has a high detection rate for chronic sinusitis but is prone to confusion when distinguishing the sphenoid and ethmoid sinuses.

Comparison of recognition accuracy.
Evaluation results of accuracy.

Recognition accuracy comparison.
A comparison of recognition accuracy between the AI model and doctors grouped by qualification, low (residents), medium (attending doctors), and senior (senior attending doctors and directors), is shown in Figure 3 and Table 5. In the first test, the model achieved an accuracy of 85.8%, outperforming all doctor groups: low-qualification doctors reached 71.7%, medium-qualification doctors 78.4%, and senior-qualification doctors 73.4%. In the second test, the model's accuracy decreased slightly to 84.4% but remained higher than that of the doctors, with low- and medium-qualification doctors both reaching 76.6% and senior-qualification doctors improving to 80.4%. Across both tests, the model consistently led in accuracy regardless of the doctors' experience level, indicating its potential as a reliable diagnostic tool to support doctors in clinical settings and to enhance the precision and consistency of sinusitis diagnoses on CT scans.
The diagnostic accuracy of an AI-assisted model for sinusitis is comparable to that of doctors at various qualification levels across two tests, as shown in Figure 4 and Table 6. In the first test, the AI model achieved an accuracy of 80.5%, while low-qualified doctors scored 78.1%, mid-qualified doctors reached 81.8%, and senior-qualified doctors recorded 81.3%. Mid- and senior-qualified doctors slightly outperformed the AI model, with the low-qualified group trailing behind. In the second test, the AI model's accuracy improved to 83.7%, while the accuracy of low-qualified doctors declined to 70.4%. Mid-qualified doctors scored 81%, and senior-qualified doctors saw a small decrease to 78.8%. In this test, the AI model demonstrated superior accuracy to all doctor groups, particularly outperforming the low-qualified doctors. These results highlight the model's potential as a consistent and reliable diagnostic tool. Notably, the AI model excelled in the second test, surpassing even the senior-qualified doctors, indicating its potential to support a higher standard of diagnostic accuracy and reduce variability across doctors with different experience levels.
Recognition accuracy comparison.
The recognition sensitivity comparison is shown in Figure 5. The AI-assisted diagnostic model for sinusitis achieved a sensitivity of 84% in the first test, outperforming both low- and mid-qualification doctors. In the second test, the model's sensitivity decreased slightly to 81.6% but maintained a notable advantage over the doctors' sensitivity levels. This consistency across both tests highlights the model's robustness in identifying cases of sinusitis. These findings indicate that the AI model can reduce the risk of misdiagnosis, leading to more precise and timely treatment for patients with chronic sinusitis. Its ability to sustain high sensitivity, even compared to senior doctors, suggests that it could be a valuable addition to clinical practice, enhancing healthcare professionals' diagnostic capabilities and contributing to better patient outcomes. By integrating deep learning into the diagnostic process for sinusitis, the model improves sensitivity and offers a reliable alternative to traditional diagnostic methods. Such advancements may reshape the approach to diagnosing and managing chronic sinusitis.

Recognition sensitivity comparison.
Table 7 shows the interpretation time comparison. In the first test, the interpretation time was 0.01 s for the model, 3.59 s for low-qualification doctors, 2.47 s for medium-qualification doctors, and 0.93 s for senior-qualification doctors. In the second test, the interpretation time was 0.14 s for the model, 2.43 s for low-qualification doctors, 1.64 s for medium-qualification doctors, and 2.11 s for senior-qualification doctors.
Interpretation time comparison.
The results of the interpretation time comparison are shown in Figure 6. To describe the performance of the algorithm, the nasal mucosa area was segmented on cross-sectional CT images of patients with chronic sinusitis and of normal people, both by the algorithm and by doctors, and the reliability, true positive rate, false positive rate, and false negative rate of the segmentation algorithm were then calculated. The results are shown in Figure 7 and Table 8. In group 1, the reliability was 0.914, the true positive rate 0.731, the false positive rate 0.244, and the false negative rate 0.059. In group 2, the reliability was 0.748, the true positive rate 0.708, the false positive rate 0.253, and the false negative rate 0.061. In group 3, the reliability was 0.831, the true positive rate 0.732, the false positive rate 0.189, and the false negative rate 0.063. In group 4, the reliability was 0.829, the true positive rate 0.826, the false positive rate 0.159, and the false negative rate 0.058.

Interpretation time comparison.

Statistical results of the segmentation algorithm.
The generalizability of the AI-assisted diagnostic model for sinusitis is crucial, especially considering its applicability to other datasets and performance under different clinical conditions. Trained on a dataset of 5000 CT images representing normal and abnormal sinus cases, the model needs to be validated on external datasets covering a wider range of demographics, imaging protocols, and clinical presentations. This broader testing will help assess whether the model retains its accuracy and sensitivity across various populations. Additionally, since imaging techniques can vary, such as differing CT machine types and settings, it is essential to evaluate the model's robustness to variations in image quality, resolution, and contrast, as these factors may impact diagnostic outcomes.
The model's performance should also be assessed across a variety of clinical situations, such as sinusitis phases (acute vs. chronic), concurrent illnesses (e.g., allergies, nasal polyps), and patient demographics (age, gender, and comorbidities). This will provide information on the model's adaptability to real-world circumstances. Cross-validation approaches, which repeatedly train and test the model on multiple data subsets, are useful for analyzing its dependability and consistency, identifying potential biases, and ensuring it is not overfitted to the training data. Furthermore, longitudinal studies evaluating the model's performance over time and across different patient cohorts might shed light on its flexibility and long-term efficacy in clinical contexts. Finally, the model's generalizability is linked to its integration into clinical workflows. It is essential to understand how it complements healthcare professionals and how it affects decision-making processes. While initial results are promising, further validation across diverse datasets and clinical conditions is needed to confirm the model's generalizability and effectiveness in practical applications. Ongoing evaluation and adjustment are key to maximizing its utility in diagnosing sinusitis and potentially other related conditions.
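The cross-validation idea mentioned above, repeatedly training and testing on different data subsets, can be sketched as a k-fold index generator. The number of folds, the shuffling seed, and the plain (non-stratified) fold assignment are assumptions for illustration; the study itself reports a single 8:2 split.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # assumed seed, for reproducibility
    folds = [indices[i::k] for i in range(k)]  # k nearly equal, disjoint folds
    for i in range(k):
        test_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx

# Each sample appears in exactly one test fold across the k rounds.
for train_idx, test_idx in k_fold_indices(10, k=5):
    print(len(train_idx), len(test_idx))  # 8 2 on every round
```

Averaging a metric such as accuracy over the k rounds gives a steadier estimate than a single split, at the cost of training the model k times.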
Practical implications
Integrating an artificial intelligence (AI) tool for diagnosing sinusitis into clinical workflows presents several challenges. Data quality and standardization are crucial, as AI models require high-quality, standardized data for training and validation. Variability in imaging protocols, equipment, and patient demographics can lead to inconsistencies in the data, making it essential to train the AI tool on a diverse and representative dataset. Interoperability is also essential, as the AI tool must be compatible with existing electronic health record (EHR) systems and imaging software used in clinical settings.
User acceptance and training are critical, as clinicians may be hesitant to adopt AI tools due to concerns about reliability, accuracy, and the potential for technology to replace human judgment. Comprehensive training programmes are necessary to familiarize healthcare professionals with the AI tool, its capabilities, and how to interpret its outputs. Regulatory and compliance issues must be navigated, as AI tools in healthcare must comply with various regulations and standards, such as those set by the Food and Drug Administration (FDA). Clinical workflow integration is another challenge, as the AI tool must fit seamlessly into existing workflows without causing disruptions. Cost and resource allocation also play a role, as implementing AI technology can require a significant financial investment.
Ethical considerations are paramount, as the use of AI in healthcare raises questions regarding patient privacy, data security, and the potential for bias in AI algorithms. Ensuring that patient data is handled securely and the AI tool is free from biases that could affect diagnostic accuracy is critical for building trust among patients and healthcare providers. Once the AI tool is integrated, continuous monitoring and improvement are necessary, requiring ongoing assessment of its performance and effectiveness in real-world clinical settings.
Limitations
The deep learning-based AI model for sinusitis diagnosis faces limitations due to data restrictions and generalizability issues. The model uses a dataset of 5000 CT images from a single institution, which may limit its applicability across diverse populations due to variations in imaging protocols, equipment, and patient demographics. It also faces diagnostic challenges with sinusitis subtypes that present overlapping symptoms and CT features, particularly sphenoid and ethmoid sinusitis. The model's dependence on high-performance computational resources limits its deployment in smaller or resource-limited clinics. Additionally, the intrinsic resolution constraints and potential artifacts of CT imaging may restrict the model's ability to detect subtle anatomical details relevant to sinusitis diagnosis. Future work aims to expand and diversify the dataset; gather longitudinal data on sinusitis; incorporate multimodal imaging data; integrate patient symptom profiles, nasal endoscopy findings, and other clinical data; and develop real-time applications with reduced computational requirements. Transfer learning techniques could minimize the need for large, labelled sinusitis datasets, and clinical trials are needed to validate the model's effectiveness in real-world settings.
Conclusion
This article presents a deep learning-based method for classifying and diagnosing chronic sinusitis and applies it to clinical practice. The resulting artificial intelligence diagnostic method can clinically reach the level of senior doctors and is clinically superior to mid- to senior-level doctors and low-level doctors. Compared with experienced doctors, it also has a clear advantage in reading time. Although CT is becoming increasingly common, not all outpatients can use CT as their routine examination. Therefore, the deep learning-based artificial intelligence-assisted diagnosis model of sinusitis is simple and convenient, can be performed in outpatient clinics, costs less than CT, and is easier for people to accept. In future studies, if clinical manifestations and nasal endoscopy indicate chronic sinusitis, a further CT examination should be used to confirm the diagnosis and provide a basis for further treatment.
Footnotes
Code availability
Not applicable.
Authors’ contributions
Dianyi Wang (The First Affiliated Hospital of Heilongjiang University of Chinese Medicine) was responsible for designing the framework, analyzing the performance, validating the results, and writing the article. Wentao Li and Jingyi Wang were responsible for collecting the information required for the framework, provision of software, critical review, and administering the process.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the project Clinical observation of Shuanghuanglian (freeze-dried) nasal irrigation for injection in treatment of chronic sinusitis (20220707011078); the project Study on the mechanism of immune factors in treating chronic rhinosinusitis with YiqitongQiaoyin (2022–QNRC1–19); and the seventh batch of national traditional Chinese medicine experts academic experience inheritance work.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement
No datasets were generated or analyzed during the current study.
