Abstract
BACKGROUND:
Dental panoramic imaging plays a pivotal role in dentistry for diagnosis and treatment planning. However, correctly positioning patients can be challenging for technicians due to the complexity of the imaging equipment and variations in patient anatomy, leading to positioning errors. These errors can compromise image quality and potentially result in misdiagnoses.
OBJECTIVE:
This research aims to develop and validate a deep learning model capable of accurately and efficiently identifying multiple positioning errors in dental panoramic imaging.
METHODS AND MATERIALS:
This retrospective study used 552 panoramic images selected from a hospital Picture Archiving and Communication System (PACS). We defined six types of errors (E1-E6), namely (1) slumped position, (2) chin tipped low, (3) open lip, (4) head turned to one side, (5) head tilted to one side, and (6) tongue against the palate. First, six Convolutional Neural Network (CNN) models, trained with transfer learning, were employed to extract image features, which were then fused. Next, a Support Vector Machine (SVM) was applied to create a classifier for multiple positioning errors, using the fused image features. Finally, classifier performance was evaluated using three indices: precision, recall rate, and accuracy.
RESULTS:
Experimental results show that the fusion of image features with six binary SVM classifiers yielded high accuracy, recall rates, and precision. Specifically, the classifier achieved an accuracy of 0.832 for identifying multiple positioning errors.
CONCLUSIONS:
This study demonstrates that six SVM classifiers effectively identify multiple positioning errors in dental panoramic imaging. The fusion of extracted image features and the employment of SVM classifiers improve diagnostic precision, suggesting potential enhancements in dental imaging efficiency and diagnostic accuracy. Future research should consider larger datasets and explore real-time clinical application.
Introduction
Dental panoramic imaging is an essential tool in oral health diagnosis. It provides a broad view of the entire mouth, including the teeth, upper and lower jaws, and surrounding structures and tissues. However, the accuracy of these images depends heavily on the correct positioning of the patient during the imaging process. Even slight deviations can distort the image and lead to diagnostic inaccuracies. Traditional methods to correct positioning errors require significant time and expertise and may not be effective in all cases. With the advent of artificial intelligence (AI) and machine learning technologies, there is potential to streamline this process, but comprehensive solutions have been lacking.
Previous studies have made noteworthy contributions to the domain of dental panoramic radiography. They tackle the issue of patient positioning, a crucial factor that influences image quality in dental panoramic radiography, as their authors underscore. A convolutional neural network (CNN) has been used to estimate and correct positioning errors of the dental arch in the panoramic image, which effectively reduces the blur in images [1–4]. The literature also highlights the benefits of panoramic radiography, such as improved image quality, lower radiation exposure, and cost-effectiveness, while stressing the challenges inherent to the technique. It acknowledges the complexity of accurate patient positioning due to the diversity of patient anatomy and the intricacies of the maxillofacial bone structure [5]. Moreover, Kitai N et al. presented their findings in a clear and concise manner. Their results indicate that 3D panoramic radiography provides significantly smaller measurement errors compared to conventional panoramic radiography and can be effectively used for the quantitative evaluation of anterior tooth length via cone beam computed tomography [6]. A retrospective study analyzed a substantial number of panoramic radiographs, 1904 in total, which substantiates its findings and reinforces the reliability of the study. The inclusive nature of the patient sample, which captured everyone who underwent panoramic examinations in 2011, is a strong point of this study [7–10]. However, the article could be more explicit about the severity and the precise effect of severe errors on diagnostics, only 3% of which were deemed to make correct diagnosis impossible. The impact of digital techniques in mitigating these errors could also have been discussed more thoroughly. The result that 79% of the radiographs exhibited errors highlights the prevalent issue of incorrect patient positioning during panoramic imaging.
The revelation that the most common error—tongue not being in contact with the hard palate—did not significantly impact diagnostic utility underscores the necessity for further research on which errors are most detrimental to diagnostic accuracy.
Additionally, Dhillon M, Raju SM, et al. present a comprehensive evaluation of positioning errors in panoramic radiography. The study, comprising 1,782 panoramic radiographs, provides a robust sample size, lending credibility to the results. One of the most striking findings of the research is that 89% of the analyzed radiographs contained positioning errors, indicating a significant issue in current practice. The analysis of the different types of errors, categorized into nine groups, provides valuable insights into which mistakes are most common. The most frequent error was the failure to position the tongue against the palate, which accounted for more than half of the errors. However, the specific impact of each error on image quality was not discussed in the study [11, 12]. Lingam AS, Koppolu P, et al. aim to evaluate the frequency of positioning errors in panoramic radiographs and assess the overall quality of these images. The sample of 2629 patients, selected out of 3900 new patients, offers an ample range for a robust analysis. The study reveals that 77.2% of the examined radiographs had at least one positioning error, a finding which is alarming and highlights the need for improvement in patient positioning during the imaging procedure. According to the study, the most common error was failure to place the tongue close to the palate. However, it would be beneficial if the study elaborated on the specifics of all the errors observed and their frequency [13]. Choi BR, Choi DH, et al. focus on assessing the clinical image quality of panoramic radiographs from Korean dental clinics and identify the parameters affecting the image quality. Regarding the causes of imaging errors, the study indicates that patient positioning and issues in image processing were the primary factors, along with a smaller number of errors from the radiographic unit and from anatomical abnormalities.
The prominence of patient positioning and image processing errors aligns with findings from previous studies and highlights the need for greater care in these areas [14–16]. Khator AM, Motwani MB, et al. explore the positioning errors in digital panoramic radiography, their relative frequency, and the ones directly affecting the diagnostic image quality. The study emphasizes the importance of careful patient positioning in producing good-quality radiographs, suggesting that spending more time on patient positioning could reduce the repetition of radiographs and limit unnecessary patient exposure. This paper provides valuable insights into the common errors in digital panoramic radiography and highlights the need for improved practices in patient positioning [17]. In the evaluated radiographs, only 5% were without errors, while 95% exhibited one or more positioning errors. This suggests that positioning errors are a prevalent issue in digital panoramic radiography, leading to lower image quality. The most frequent error was the head turned to one side (33.8%), whereas patient movement during exposure was the least common error (1.8%).
Positioning errors in radiographic imaging, specifically in panoramic radiography, can therefore occur for several reasons. The primary causes are as follows [18–22].
Incorrect patient positioning: This is the most common cause of positioning errors. If the patient’s head is not properly aligned with the machine, it can lead to distortions or omissions in the image. Examples of incorrect positioning include the head being turned to one side, chin tipped too high or too low, or the patient positioned too far forward or backward.
Movement during exposure: If a patient moves during the exposure, it can blur the image or cause other distortions. This is particularly an issue with younger patients or those who have difficulty remaining still.
Failure to position the tongue against the palate: This can cause a radiolucent shadow that obscures the upper teeth and surrounding structures.
Patient posture: A slumped position can lead to a distorted image, particularly a “flattened” image of the dental arches.
Inadequate patient instruction or understanding: If the patient doesn’t understand how they need to be positioned, or the operator does not adequately explain it, positioning errors can occur.
Operator error: Finally, if the operator does not properly align the machine with the patient or select the correct settings, this can result in positioning errors.
These errors not only affect the quality of the images but can also lead to misdiagnosis or the need for repeated exposures, which increases the patient’s exposure to radiation. Therefore, proper patient positioning and operator training are critical in obtaining high-quality panoramic radiographs. The objective of this research is to develop and test a deep learning model that can accurately and efficiently identify multiple positioning errors in dental panoramic imaging.
Materials and methods
Image data
The data used in this research were obtained from the Hualien Armed Forces General Hospital under IRB approval (IRB Approval No: A202205095). A total of 552 2D dental panoramic images, one per patient, were collected between 2017 and 2021. The image resolution is 64×1536 pixels, and the field of view is 5×149 mm². Six positioning errors were defined, including slumped position (E1), chin tipped low (E2), open lip (E3), head turned (E4), tilted to one side (E5), and tongue against the palate (E6) (Fig. 1).

Fig. 1. The six defined positioning errors with their sample sizes.
The categorization of positioning errors was performed by three independent experienced dental radiologists (physicians). These experts reviewed each dental panoramic image in our dataset and assigned appropriate error types based on the criteria outlined (Table 1). Their assessments were then cross-verified to ensure consistency and reliability. The criteria for determining each type of positioning error are based on established clinical guidelines as well as consultations with experienced dental radiologists (physicians).
Table 1. Manifestations of common positioning errors in dental panoramic radiographs.
The statistical counts of positioning errors per image are listed in Table 2. Multiple positioning errors in one image arise when several different aspects of the patient’s positioning are incorrect during radiographic imaging. These can include the patient’s chin being tipped too high or too low, the patient leaning forward or backward, failure to position the tongue against the palate, the head being tilted or turned to one side, and the patient moving during the exposure. According to Table 2, it is common to see multiple positioning errors in a single radiographic image. For instance, 32.4% of images (179 out of 552) had 3 positioning errors, and 27.9% of images (154 out of 552) had 4 positioning errors. Only 5.4% of images (30 out of 552) had a single positioning error, indicating that positioning errors tend to co-occur, which can significantly impact the diagnostic quality of the radiographs. These errors can lead to distortion or obscuration of anatomical structures, potentially hindering accurate diagnosis. Consequently, it is important to identify these errors and implement strategies to mitigate them. This might involve further training for radiologic technologists, better patient instructions, or improvements in the design and features of radiographic equipment to make correct positioning easier to achieve. The presence of multiple errors in one image underlines the complexity of proper patient positioning in radiography and emphasizes the need for careful attention to detail during this process.
Table 2. The frequency of positioning errors per image.
In this study, each step contributes to building a model that can effectively learn from the data, generalize to unseen instances, and provide reliable performance metrics. These steps collectively represent a rigorous and systematic approach to machine learning model development, leading to more robust and reliable outcomes. The flowchart of this study is shown in Fig. 2.
This research workflow involves several steps that integrate image analysis with machine learning to recognize positioning errors in dental images. Each step is summarized and explained below.
Fig. 2. The flowchart of this research.
Step 1: Loading Images. The first step is gathering and loading the dataset containing images with six different types of positioning errors in dental radiography. It is important that these images are appropriately labeled to identify the type of error they represent.
Step 2: Image Pre-processing. This involves preparing the images for subsequent analysis. Pre-processing steps can include colorization (converting the image to a specific color space that may enhance certain features) and resizing (altering the dimensions of the image to ensure consistency across the dataset).
Step 3: Training Selected CNNs with Transfer Learning. Here, we use CNNs to classify images based on the positioning errors. We use several pre-trained models (vgg19, xception, mobilenetv2, inceptionv3, resnet50, resnet101) and apply transfer learning, which allows these models to be fine-tuned on our specific dataset. We set hyperparameters such as batch size, epochs, and learning rates, and we use Stochastic Gradient Descent with Momentum (SGDM) and ADAM as the optimizers. We train these models on 90% of the whole data and test them on the remaining 10%.
Step 4: Fusion of Extracted Features. At this stage, the features generated by the trained CNNs are combined, or “fused”. The purpose of this fusion process is to consolidate the extracted features and to enhance the robustness of the SVM-based model.
Step 5: Training SVM Models for Every Positioning Error. The fused features from the previous step are used to train SVM models, a type of supervised machine learning model, one for each type of positioning error.
Step 6: Evaluation of Performance. The performance of the models is evaluated using various metrics such as accuracy, recall rate, precision, and Area Under the Curve (AUC). These metrics show how well the models perform in identifying positioning errors.
Step 7: Results. This is where we analyze the output of the presented models, interpret the implications of the findings, and consider potential improvements or future directions for this work. This might involve comparing the performance of different models, discussing why certain models performed better than others, and suggesting possible real-world applications of these findings.
The selected CNNs (vgg19, resnet50, resnet101, inceptionv3, xception, and mobilenetv2) are used because they are widely recognized and established architectures in the field of deep learning (Table 3). They have been proven to deliver high performance across a range of image classification tasks.
Table 3. The brief properties of the investigated CNN models.
VGG19: Known for its simplicity and uniform architecture, VGG19 is often used as a baseline for comparison with other models. Despite its depth, it has a very homogeneous and straightforward architecture that’s easy to understand [23].
ResNet50 and ResNet101: ResNet models are popular due to their “residual learning” approach to the vanishing gradient problem, which allows much deeper networks to be trained efficiently. The 50- and 101-layer variants provide a good balance between complexity and performance [24].
InceptionV3: The Inception models, including InceptionV3, offer an alternative method of deepening networks while maintaining computational efficiency, making them useful for large-scale and resource-constrained applications [25].
Xception: This architecture uses depthwise separable convolutions, a modification to the standard convolution operation that can lead to significant computational savings and improved model performance [26].
MobileNetV2: As an efficient architecture designed for mobile devices, MobileNetV2 provides an interesting point of comparison to assess how much efficiency can be gained without significant loss of accuracy [27].
By comparing different models, the research aims to identify the best model or combination of models for the task of detecting positioning errors in dental images. This comparison may also provide insight into the kind of architectural features and techniques that are most effective for this specific task.
Stochastic Gradient Descent with Momentum (SGDM) and Adaptive Moment Estimation (ADAM) are two popular optimization algorithms for training CNNs due to their effectiveness and efficiency. SGDM accelerates SGD in the relevant direction and dampens oscillations by adding a proportion of the update vector of the past time step to the current update vector. In other words, it uses past gradients to smooth the update, which can help the optimizer escape local minima or saddle points in the error surface and can lead to faster convergence. ADAM is an extension of stochastic gradient descent that is well suited to training deep neural networks. ADAM stores an exponentially decaying average of past squared gradients, like RMSprop, and also keeps an exponentially decaying average of past gradients, like momentum. These two characteristics make ADAM an effective and well-balanced optimizer for CNNs. It computes an adaptive learning rate for each weight in the network individually. Such adaptive methods are especially useful in large-scale, high-dimensional settings, and when the data has noisy or sparse gradients. In this study, the investigated parameters are listed in Table 4. A total of 470 images were used for training the CNN models, and 52 images were used for testing these models.
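The two update rules described above can be illustrated with a minimal sketch. The function names, default hyperparameter values, and the toy quadratic objective are assumptions made for illustration only; they are not the settings listed in Table 4.

```python
import math

def sgdm_step(w, grad, velocity, lr=0.01, momentum=0.9):
    """One SGD-with-momentum update: reuse a fraction of the previous update."""
    new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grad)]
    new_w = [wi + vi for wi, vi in zip(w, new_velocity)]
    return new_w, new_velocity

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step index used for bias correction."""
    # Decaying average of past gradients (first moment, like momentum).
    m = [beta1 * mi + (1 - beta1) * g for mi, g in zip(m, grad)]
    # Decaying average of past squared gradients (second moment).
    v = [beta2 * vi + (1 - beta2) * g * g for vi, g in zip(v, grad)]
    m_hat = [mi / (1 - beta1 ** t) for mi in m]
    v_hat = [vi / (1 - beta2 ** t) for vi in v]
    # Each weight gets its own effective learning rate lr / sqrt(v_hat).
    w = [wi - lr * mh / (math.sqrt(vh) + eps)
         for wi, mh, vh in zip(w, m_hat, v_hat)]
    return w, m, v

def quadratic_grad(w):
    """Gradient of the toy objective f(w) = sum(w_i ** 2)."""
    return [2.0 * wi for wi in w]

if __name__ == "__main__":
    w, vel = [3.0, -2.0], [0.0, 0.0]
    for _ in range(300):
        w, vel = sgdm_step(w, quadratic_grad(w), vel)
    print("SGDM result:", w)
```

On the first Adam step the bias-corrected ratio equals the sign of the gradient, so each weight moves by roughly the learning rate regardless of the gradient magnitude, which is the per-weight adaptation described in the text.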
Table 4. The investigated parameters of the presented methods.
In our study, feature extraction plays a critical role in identifying multiple positioning errors in dental panoramic images. The features are extracted using six CNN models from a dataset of 552 cases (images). For feature extraction, we focus on the Fully Connected Layer (FCL) of the network. The FCL takes the output of the last convolutional layer and flattens it into a single vector. This vector is then multiplied by a weight matrix and added to a bias vector. The output undergoes an activation function to produce the final features. The features obtained from each of the six CNN models, also referred to as FCL-features, were aggregated into a matrix of dimensions 552×6. This matrix was then used to train the SVM model for the task at hand. The choice of using the FCL for feature extraction is motivated by the layer’s capability to capture high-level abstractions of the input data. The FCL aggregates spatial hierarchies built up by previous layers, thereby encapsulating both low-level and high-level features that are crucial for identifying multiple positioning errors. To expedite the training process and enhance performance, our CNN model is initialized with pre-trained weights. We then fine-tune the model on our specific dental panoramic imaging dataset. This approach ensures that our model benefits from the generic feature extraction capabilities of the pre-trained CNN while adapting to the unique characteristics of dental panoramic images.
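The fully connected computation described above (flatten, multiply by a weight matrix, add a bias, apply an activation) can be sketched as follows. The ReLU activation and the toy dimensions are assumptions for illustration; the text does not state which activation the networks use at this layer.

```python
def flatten(feature_maps):
    """Flatten a list of 2D feature maps (channels x rows x cols) into one vector."""
    return [x for fmap in feature_maps for row in fmap for x in row]

def relu(z):
    """Assumed activation for this sketch."""
    return [max(0.0, zi) for zi in z]

def fully_connected(x, weights, bias, activation=relu):
    """Compute activation(W x + b) for one input vector x."""
    z = [sum(w * xi for w, xi in zip(row, x)) + b
         for row, b in zip(weights, bias)]
    return activation(z)

if __name__ == "__main__":
    maps = [[[1.0, 2.0], [3.0, 4.0]]]          # one 2x2 feature map
    weights = [[1.0, 0.0, 0.0, 0.0],           # two output units
               [0.0, -1.0, 0.0, 0.0]]
    bias = [0.5, 0.0]
    print(fully_connected(flatten(maps), weights, bias))
```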
Fusion of extracted features and training SVM models
In this stage, the features extracted from the different CNNs are combined or “fused” to create a comprehensive feature set that captures a wide range of information about the 552 images. Each CNN may identify and highlight different aspects of the images due to differences in their architecture and training. By fusing these diverse features, the model can benefit from the unique strengths of each individual CNN. The fusion of features can be done in several ways, such as concatenation or more sophisticated techniques that take into account the relationships between features. This comprehensive feature set provides a richer representation of the images, increasing the potential for accurate classification.
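Fusion by concatenation can be sketched as below, assuming each of the six CNNs contributes one feature value per image, which matches the 552×6 matrix reported in this study. The function name and the dictionary layout are hypothetical.

```python
MODEL_NAMES = ["vgg19", "xception", "mobilenetv2",
               "inceptionv3", "resnet50", "resnet101"]

def fuse_features(features_by_model, model_names=MODEL_NAMES):
    """Concatenate per-model features column-wise into an images x models matrix.

    features_by_model maps a model name to a list with one value per image.
    """
    n_images = len(features_by_model[model_names[0]])
    for name in model_names:
        if len(features_by_model[name]) != n_images:
            raise ValueError(f"{name} has a different number of images")
    # Row i holds the six features of image i, in a fixed model order.
    return [[features_by_model[name][i] for name in model_names]
            for i in range(n_images)]

if __name__ == "__main__":
    # Placeholder values only; real features would come from the trained CNNs.
    dummy = {name: [0.0] * 552 for name in MODEL_NAMES}
    fused = fuse_features(dummy)
    print(len(fused), "x", len(fused[0]))
```

Keeping a fixed model order matters because the downstream classifier treats each column as a distinct feature.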
Following the fusion of the extracted features, an SVM model with automatically optimized hyperparameters is trained for each type of positioning error. This optimization process is an essential part of the training, as it ensures that the SVM models can effectively handle the high-dimensional space generated by the fused features. The hyperparameters that are optimized include the Box Constraint and the Kernel Scale. The Box Constraint controls the penalty for misclassification, determining the trade-off between classifier complexity and misclassification rate. The Kernel Scale affects the flexibility of the decision boundary, influencing the model’s bias-variance trade-off. The optimization process utilizes a Bayesian optimization algorithm, a sequential model-based method that is efficient for optimizing expensive black-box functions. The optimal values of the Box Constraint and the Kernel Scale were sought using a grid search over a logarithmically spaced range of values between 0.001 and 1000. The grid search was performed in conjunction with k-fold cross-validation to assess the model’s performance on unseen data effectively. The combination of hyperparameters that yielded the highest cross-validation accuracy was chosen as the best set for our model. The specific acquisition function used in this process is ‘expected-improvement-plus’, which is designed to handle noisy objective functions and make the optimization process more efficient. The SVM models employ a radial basis function (RBF) as their kernel. This type of kernel can handle nonlinear data effectively, making it suitable for the complex, high-dimensional data in this study. The RBF kernel transforms the data into a higher dimensional space where a linear decision boundary can be found. Finally, to speed up the training process and make full use of available resources, parallel computation is used. 
This allows multiple SVM models to be trained simultaneously, greatly reducing the time required to complete this step. The features extracted from each of the six CNN models, commonly referred to as FCL-features, were aggregated into a matrix with dimensions of 552×6 (i.e., 552 images and features extracted from 6 CNN models). This matrix was then used for training the SVM model, employing a ten-fold cross-validation method.
This combination of automatic hyperparameter optimization, a Bayesian Optimization algorithm, a suitable kernel function, and parallel computation ensures that the SVM models are both effective in classifying the positioning errors and efficient to train.
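The search procedure can be sketched in outline: a logarithmically spaced grid between 0.001 and 1000, k-fold index splits, and selection of the pair with the highest cross-validation accuracy. The `cv_accuracy` callable is a hypothetical placeholder for training and scoring an RBF-kernel SVM on the folds; the number of grid points is also an assumption.

```python
import itertools
import math

def log_spaced(low, high, num):
    """Return num values spaced evenly on a log10 scale from low to high."""
    lo, hi = math.log10(low), math.log10(high)
    step = (hi - lo) / (num - 1)
    return [10 ** (lo + i * step) for i in range(num)]

def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

def grid_search(cv_accuracy, box_constraints, kernel_scales):
    """Return (best_accuracy, box_constraint, kernel_scale) over the grid.

    cv_accuracy(box_constraint, kernel_scale) is assumed to return the mean
    cross-validation accuracy of an SVM trained with those settings.
    """
    best = None
    for c, s in itertools.product(box_constraints, kernel_scales):
        acc = cv_accuracy(c, s)
        if best is None or acc > best[0]:
            best = (acc, c, s)
    return best

if __name__ == "__main__":
    grid = log_spaced(0.001, 1000, 7)
    folds = list(k_fold_indices(552, 10))
    print(grid)
    print([len(test) for _, test in folds])
```

In practice the images would usually be shuffled before splitting so that each fold contains a mix of error types; the contiguous folds here keep the sketch short.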
Performance index for classification
The performance of the classification models is evaluated using four key metrics: accuracy, recall rate, precision, and Area Under the Curve (AUC).
Accuracy measures the proportion of the total predictions that are correct. It is computed as the number of correct predictions divided by the total number of predictions. Accuracy is a straightforward indicator of a model’s overall performance. However, in cases where the classes are imbalanced, accuracy may not give the whole picture, as a model could achieve a high accuracy by simply predicting the majority class.
Recall Rate measures the proportion of actual positive class instances that are correctly identified by the model. It is calculated as the number of true positives (TP) divided by the sum of true positives and false negatives (FN), i.e., TP / (TP + FN). This metric is particularly important when the cost of failing to detect a positive instance is high.
Precision calculates the proportion of positive class predictions that are actually correct. It is computed as the number of true positives divided by the sum of true positives and false positives (FP), i.e., TP / (TP + FP). High precision indicates that the model accurately predicts positive instances and has a low rate of false positive errors.
Area Under the Curve (AUC) refers to the area under the Receiver Operating Characteristic (ROC) curve. The ROC curve is a graphical plot that illustrates the performance of a binary classifier as the discrimination threshold is varied. The AUC gives a single numeric value summarizing the overall performance of the classifier, with a value of 1.0 indicating perfect classification and a value of 0.5 indicating random chance. A higher AUC implies a better classifier.
These metrics provide a comprehensive assessment of the classification model’s performance. It’s important to consider all these metrics together, as they each provide different insights into the model’s performance. For example, a model may have high accuracy but low recall, which might be problematic in certain applications. By considering all four metrics, we can get a more complete picture of the model’s performance.
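The four metrics can be computed directly from labels and classifier scores, as in the following minimal sketch. The function names are hypothetical, and the AUC is computed with the pairwise ranking definition (the fraction of positive-negative pairs in which the positive image receives the higher score, with ties counted as one half).

```python
def _ratio(num, den):
    """Safe division that returns NaN when the denominator is zero."""
    return num / den if den else float("nan")

def classification_metrics(y_true, y_pred):
    """Accuracy plus per-class recall and precision for binary labels 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "recall_1": _ratio(tp, tp + fn),
        "recall_0": _ratio(tn, tn + fp),
        "precision_1": _ratio(tp, tp + fp),
        "precision_0": _ratio(tn, tn + fn),
    }

def auc_from_scores(y_true, scores):
    """AUC as the probability that a positive outranks a negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))

if __name__ == "__main__":
    y_true = [1, 1, 1, 0, 0, 1, 0, 1]
    y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
    scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.2, 0.95]
    print(classification_metrics(y_true, y_pred))
    print(auc_from_scores(y_true, scores))
```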
Results
Figure 3 and Table 5 show the performance indices for the six Support Vector Machine (SVM) models, each trained to identify a different type of positioning error. The recall for class 0 (no positioning error) ranges from 0.873 to 0.990, suggesting that the models are relatively good at identifying true negatives. However, recall for class 0 differs noticeably between the models, with the SVM for Error 6 achieving the lowest value at 0.873. The recall for class 1 (presence of a positioning error) is consistently high, reaching a maximum of 1.000 in the SVMs for Error 1 and Error 6. This indicates that these models are particularly good at identifying images with the specific positioning error they were trained to recognize.

Fig. 3. The ROC curves and performance indices of the six SVM models for each positioning error.
Table 5. The performance metrics of the six SVM models for each positioning error.
Note: Recall (1) is TP/(TP + FN), Recall (0) is TN/(TN + FP), Precision (1) is TP/(TP + FP), Precision (0) is TN/(TN + FN).
Precision, which is the measure of how many selected items are relevant, ranges from 0.929 to 1.000 for class 0 and 0.952 to 0.988 for class 1. This indicates that the models have low false-positive rates, meaning they do not often incorrectly classify an image as having a positioning error when it does not.
The accuracy of the models, which represents the proportion of true results in the data, ranges from 0.955 to 0.989, indicating good overall performance of the models. Meanwhile, the AUC, which measures the area under the Receiver Operating Characteristic curve and can be interpreted as the probability that the classifier will rank a randomly chosen positive instance higher than a randomly chosen negative one, ranges from 0.965 to 0.998. This indicates that all models have a strong performance and can discriminate between classes effectively. The model for Error 2 stands out with an exceptionally high AUC of 0.998.
Hence, the SVM models show strong performance across all metrics in identifying the six types of positioning errors. The model for Error 2 is the strongest performer, with the highest AUC and among the highest scores in recall, precision, and accuracy.
In this study, the accuracy for multiple classes (positioning errors) is calculated based on six binary encodings (0 or 1). Each possible combination of positioning errors is assigned a unique identifier: uID for the true labels and pID for the predicted labels. The uID (pID) is calculated as in Equations (1) and (2), where E_i is an indicator variable for whether the i-th positioning error is present.
Each positioning error, if present, contributes a power of 2 to the sum, effectively creating a unique binary representation for each possible combination of positioning errors. The pID for the SVM predictions is calculated in the same manner. The accuracy of the multiple class classification is then evaluated by comparing the uID and the pID for each sample. If the pID equals the uID, the prediction is considered correct; otherwise, it is a misidentification.
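One encoding consistent with this description is sketched below. The exponent offset i − 1 and the hat notation for the predicted indicators are assumptions; any assignment of distinct powers of 2 to the six errors yields the same set of matches and therefore the same accuracy. N denotes the number of images.

```latex
\mathrm{uID} = \sum_{i=1}^{6} E_i \, 2^{\,i-1},
\qquad
\mathrm{pID} = \sum_{i=1}^{6} \hat{E}_i \, 2^{\,i-1},
\qquad
\text{Accuracy} = \frac{1}{N} \sum_{n=1}^{N} \mathbf{1}\!\left[\mathrm{pID}_n = \mathrm{uID}_n\right]
```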
Finally, the accuracy for multiple classes is defined as the total number of correct predictions divided by the overall number of samples. This provides a measure of how often the SVM correctly identifies the exact combination of positioning errors in the dental images. By defining accuracy in this way, it is possible to measure the performance of the SVM not just on individual error types, but on its ability to correctly identify all present errors in each image. It is a more demanding measure than looking at each error type in isolation, as it requires the SVM to correctly identify all errors in an image for the prediction to count as correct. The accuracy of multiple class classification is 0.832 in this study, meaning that the SVM models correctly identified the exact combination of positioning errors in approximately 83.2% of the dental images (Table 6). This is a substantial achievement, considering the complexity of multiple class classification tasks, especially in medical imaging.
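The exact-match accuracy can be sketched as follows. The bit order (E1 as the lowest bit) and the function names are assumptions for illustration; the resulting accuracy does not depend on the bit order.

```python
def combination_id(error_flags):
    """Encode six 0/1 error indicators (E1..E6) as one integer identifier."""
    return sum(flag << i for i, flag in enumerate(error_flags))

def exact_match_accuracy(true_flags, pred_flags):
    """Fraction of images whose predicted ID equals the true ID."""
    matches = sum(1 for t, p in zip(true_flags, pred_flags)
                  if combination_id(t) == combination_id(p))
    return matches / len(true_flags)

if __name__ == "__main__":
    truth = [[1, 0, 1, 0, 0, 1], [0, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 0]]
    preds = [[1, 0, 1, 0, 0, 1], [0, 0, 0, 0, 0, 1], [1, 0, 0, 0, 0, 0]]
    print([combination_id(t) for t in truth])
    print(exact_match_accuracy(truth, preds))
```

A single wrong indicator changes the identifier, so an image with five of six errors predicted correctly still counts as a miss.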
Table 6. Demonstration of how to calculate the accuracy for multiple classes.
Note: The number of samples with pID equal to uID is 459. Hence, the accuracy is 0.832 (i.e., 459/552).
This study leverages machine learning, specifically CNNs and SVMs, to improve the detection of positioning errors in dental imaging. Our findings highlight the potential of these methodologies in this context, but also illuminate the complexity and challenges inherent in multi-class classification tasks, especially when working with medical images. The selection of six different CNN architectures (Vgg19, Xception, Mobilenetv2, Inceptionv3, Resnet50, Resnet101) for feature extraction, and the use of SVMs for classification, was found to be a powerful combination. We observed notable performances across a range of metrics, including recall, precision, accuracy, and the Area Under the Curve (AUC) in ROC plots. The best performance was observed in Error 1, which had an accuracy of 0.989 and an AUC of 0.995. This robust performance underscores the potential of these models in the identification of this particular error.
However, there were some variations in the performances for different error types. For instance, the class 0 recall for Error 6 (0.873) was lower than for the other error types, which suggests that the model had some difficulty correctly identifying images without this error. This indicates that the CNN and SVM combination might struggle with some types of errors, and that further research is needed to enhance their performance across all error types. Our method for measuring accuracy in multiple class classification, based on encoding the combination of errors and comparing the predicted and true labels, showed an overall accuracy of 0.832. This level of accuracy is encouraging, given the complexity of the task, and suggests that our methodology can effectively handle multi-class classification problems.
Data augmentation in the CNN model
Data augmentation was used to overcome the challenge of having a small dataset, a common problem in medical imaging research. By rotating, shifting, zooming, and mirroring the images, we were able to increase the size of the dataset and hence improve the robustness and generalizability of our models. This study has a number of implications for future research. Firstly, the variation in performance across error types suggests that further work is needed to optimize models for all types of errors. This could include fine-tuning hyperparameters, experimenting with different combinations of CNN architectures and classifiers, or potentially incorporating additional data to better train the models. Secondly, the use of transfer learning demonstrated how pre-trained models could be effectively repurposed for the task of dental image analysis. This opens up the possibility for similar approaches to be applied in other domains of medical imaging, where datasets are often limited.
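The four augmentation operations mentioned above (rotation, shifting, zooming, and mirroring) can be sketched without any dependencies on a small grayscale image stored as a list of rows. This is only an illustrative sketch: the function names, the nearest-neighbour resampling, and all parameter values are assumptions, since the augmentation settings and software used in the study are not stated here. In practice an image library would normally be used.

```python
import math


def mirror(img):
    """Flip the image left to right."""
    return [row[::-1] for row in img]


def shift(img, dx, dy, fill=0):
    """Translate by dx columns and dy rows; uncovered pixels get `fill`."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = img[sy][sx]
    return out


def rotate_zoom(img, angle_deg=0.0, zoom=1.0, fill=0):
    """Rotate about the image centre and scale by `zoom` (nearest neighbour)."""
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: locate the source pixel for each output pixel.
            ux, uy = (x - cx) / zoom, (y - cy) / zoom
            sx = int(round(cos_a * ux + sin_a * uy + cx))
            sy = int(round(-sin_a * ux + cos_a * uy + cy))
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = img[sy][sx]
    return out


def augment(img):
    """Return a few augmented variants of one image (hypothetical settings)."""
    return [
        mirror(img),
        shift(img, dx=1, dy=0),
        rotate_zoom(img, angle_deg=10.0),
        rotate_zoom(img, zoom=1.2),
    ]


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
variants = augment(image)
print(len(variants))
```

Each original image yields several variants that keep the same error labels, which is how augmentation enlarges a small training set.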
While our study provides promising results, it also highlights the need for further investigation. As advancements continue in machine learning techniques, we believe the accuracy and precision of automated analysis in dental imaging will only improve. With these improvements, the goal of fully automated and highly accurate identification of positioning errors in dental imaging becomes an increasingly tangible reality.
Achieving high accuracy in multi-class tasks using SVMs often requires a strategic and careful approach to various aspects of the machine learning process. Here are some strategies that can help improve the accuracy.

Automatically tuning hyperparameters such as the box constraint, kernel scale, and kernel function.
By employing these strategies, it may be possible to enhance the performance of SVMs on multi-class tasks. Nonetheless, it’s essential to remember that machine learning is an iterative process, and each task may require a unique combination of techniques to achieve the highest level of accuracy.
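As a hedged illustration of automatic hyperparameter tuning, the sketch below runs a cross-validated grid search over an SVM using scikit-learn. This is not the tuning procedure used in the study: the library, the synthetic data, and the grid values are all assumptions. In scikit-learn the parameter C corresponds to the box constraint, and for the RBF kernel gamma plays a role analogous to an inverse kernel scale.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for fused CNN features (assumption: not study data).
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

# Standardize features, then fit an SVM whose settings are searched below.
pipeline = make_pipeline(StandardScaler(), SVC())

# Hypothetical search grid: kernel function, box constraint (C), and gamma.
param_grid = {
    "svc__kernel": ["linear", "rbf"],
    "svc__C": [0.1, 1, 10],
    "svc__gamma": ["scale", 0.01, 0.1],
}

search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

For a binary classifier per error type, such a search would be repeated once for each of the six classifiers. The gamma values are ignored by the linear kernel, so some grid points are redundant; a narrower, kernel-specific grid would avoid that at the cost of a slightly longer configuration.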
Table 7 summarizes research studies published between 2022 and 2023 that employ various deep learning algorithms for dental imaging tasks [28–31]. Sample sizes range from as low as 84 in Başaran et al.’s study to as high as 156,965 in Park et al.’s work. The methods used for analysis differ across studies, from R-CNN and VGG-Net to more general deep learning approaches and, uniquely in this work, fusion of features extracted from CNNs. In terms of accuracy, Park et al. lead with 87.9%, likely benefiting from a large dataset. This work, despite a more moderate sample size of 552, achieves an accuracy of 83.2% for detecting multiple positioning errors, making it the second most accurate method listed. Meanwhile, the works of Başaran et al. and Kohinata et al. show the lowest accuracies, around 69.0%, possibly limited by either their methodology or smaller datasets. Chaurasia et al., with an accuracy of 70.8%, perform moderately despite a sizeable sample. Overall, this work demonstrates a promising balance of methodological innovation and high accuracy, making it a competitive contribution to the field of deep learning in dental imaging.
Comparative summary of deep learning approaches in dental imaging studies
To the best of our knowledge, no prior studies have focused specifically on the fusion of extracted features from CNNs for detecting multiple positioning errors in dental panoramic radiography. Therefore, our study introduces a novel method and contributes new insights to the field. We believe that this novelty strengthens the value of our research, as it addresses a complex issue that has not been tackled previously. The points above describe the comparative performance of our method and its uniqueness in addressing dental imaging errors, thereby underscoring its significance in the broader research context.
The main contributions are as follows:
a) Development of a comprehensive deep learning model that can identify multiple positioning errors in dental panoramic images. This is a novel application of AI in the field of dentistry, particularly in the area of imaging.
b) Validation of the model on a dataset of 552 dental panoramic images. The results demonstrate a high degree of accuracy in error identification, surpassing traditional methods.
c) Creation of a systematic process to integrate the deep learning model into dental imaging procedures. This helps in automating the error detection process and improving the efficiency of dental imaging practices.
d) Enhancing the understanding of how AI can be used in improving dental care. The research contributes to the body of knowledge on AI’s practical applications in healthcare and particularly in dentistry.
In this study, we presented a comprehensive approach to identify and classify dental image positioning errors using pre-trained CNNs and SVMs. Our investigation confirmed the feasibility and effectiveness of such an approach for diagnosing positioning errors in dental images, thus addressing a critical need in the realm of dental imaging. The key to our success was the fusion of deep learning methodologies with classical machine learning techniques. The adoption of transfer learning from CNNs allowed us to leverage powerful, pre-trained models (Vgg19, Xception, Mobilenetv2, Inceptionv3, Resnet50, Resnet101) for effective feature extraction. These features, in turn, informed the SVMs, leading to robust error classifications. Furthermore, hyperparameter optimization, notably with automatic tuning and parallel computation, bolstered the performance of the SVMs. We also innovatively addressed multi-class classification by defining unique identifiers for each positioning error combination. This approach, combined with metrics such as recall, precision, accuracy, and AUC, provided a nuanced view of our model’s performance. Our results showed promising classification performance, with the accuracy of individual positioning error detection being high. However, multi-class classification accuracy, while reasonable, signifies an area for further improvement. This study contributes to the burgeoning field of AI-enhanced medical diagnostics, where there is much room for research and development. Moving forward, we recommend exploring strategies such as ensemble methods, class balancing, and advanced feature selection techniques to further enhance multi-class classification performance. This, along with continued iterations and improvements on our existing methods, may potentially lead to an even more accurate and reliable system for dental image error identification.
The methods and findings of this study highlight the potential of machine learning technologies in advancing medical diagnostics and patient care. Through continual advancements and applications of these technologies, we look forward to contributing to a future where machine learning models are integral components of healthcare systems, driving efficiency and accuracy in diagnostics. We aim to continue this line of research in several directions, as follows. (1) Extend the model to identify additional types of positioning errors, thereby making it more comprehensive. (2) Investigate the potential integration of the model into existing radiological software used in clinical settings. (3) Evaluate the model’s performance with larger and more diverse datasets to ensure its robustness and reliability. We believe that the model holds great promise for implementation in real-world clinical settings to assist radiologists and improve the accuracy of dental panoramic imaging.
Author contributions
Conceptualization, T.-B.C. and H.-Y.S.; methodology, T.-B.C. and H.-Y.S.; software, T.-B.C. and S.-Y.H.; validation, S.-T.H., K.-Y.L., T.-B.C. and N.-H.L.; formal analysis, K.-Z.T., S.-Y.H. and Y.-H.H.; investigation, S.-T.H., C.-Y.W., Y.-W.W. and Y.-L.W.; data curation, S.-T.H. and Y.-L.W.; writing—original draft preparation, H.-Y.S. and T.-B.C.; writing—review and editing, S.-T.H., H.-Y.S., T.-B.C. and N.-H.L.; project administration, T.-B.C.; funding acquisition, T.-B.C., H.-Y.S. and Y.-H.H. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional review board statement
In accordance with ethical standards and guidelines, all participants in this study provided informed consent prior to participation. This research project received ethical approval from the Institutional Review Board (IRB) (Approval No: A202205095).
Informed consent statement
In accordance with ethical standards and guidelines, all participants in this study provided informed consent prior to participation.
Data availability statement
The data are not publicly available due to privacy or ethical issues. The data presented in this study are available on request from the corresponding authors.
Footnotes
Acknowledgments
The authors would like to thank the Ministry of Science and Technology in Taiwan for partially supporting this research financially under Contracts MOST111-2118-M-214-001, MOST111-2221-E-214-005, HAFGH-D-112007 and NSTC112-2221-E-214-008.
Conflicts of interest
The authors declare no conflict of interest.
