Abstract
Objectives
Burning mouth syndrome (BMS) is defined as a chronic intraoral burning sensation occurring in the absence of identifiable local or systemic causes. Several classification systems have proposed diagnostic criteria for BMS, including the International Classification of Headache Disorders (ICHD-3), the International Classification of Diseases (ICD-11), the International Classification of Orofacial Pain (ICOP-1), and the World Congress of Oral Medicine (WCOM). However, none of these have been validated. This study aims to evaluate the diagnostic performance of ICHD-3, ICD-11, ICOP-1, and WCOM criteria for BMS and to suggest optimized diagnostic criteria.
Methods
We assessed 76 consecutive patients referred for burning oral pain. Of these, 34 (28 women) were diagnosed with BMS according to the reference standard, and 42 (37 women) had other oral mucosal pain conditions. Two control groups were also recruited: 31 TMD patients (28 women) and 30 pain-free participants (26 women). Assessment involved self-report questionnaires and comprehensive clinical examinations. All patients with burning oral pain underwent additional laboratory tests. Consensus-based diagnosis constituted the reference standard, and a blinded examiner applied each set of diagnostic criteria as an index test. We calculated sensitivity, specificity, positive and negative predictive values (PPV and NPV), positive and negative likelihood ratios (LR+ and LR-), and the area under the ROC curve (AUC) for each index test. Because some ICHD-3 criteria were ambiguously defined, we created operational definitions and tested three versions of the criteria.
Results
No significant group differences were found in age, sex, or smoking status. None of the criteria exhibited both high sensitivity and specificity. WCOM showed the highest sensitivity (91.2%), NPV (99.2%), LR- (0.14), and AUC (0.717). In contrast, ICHD-3 Definition 3 showed the highest specificity (93.2%), PPV (8.2%), and LR+ (5.2), but the lowest sensitivity (35.3%). Based on these findings, we developed an optimized version of the ICOP criteria. The new criteria showed the highest sensitivity (94.1%), NPV (99.9%), LR- (0.07), and AUC (0.874), while maintaining acceptable specificity (78.6%), PPV (7.1%), and LR+ (4.4).
Conclusion
Substantial variation exists in the diagnostic performance of current BMS criteria, with each set showing either high sensitivity or specificity. This study provides the first data-driven proposal for modified diagnostic criteria for BMS, offering a foundation for improving future versions of ICHD and ICOP.
Introduction
Burning Mouth Syndrome (BMS) manifests as a burning sensation of the oral mucosa, and its pathophysiology is complex and multifactorial. 1 This involves psychological factors, peripheral fiber degeneration, increased peripheral density of the capsaicin-activated and heat-sensitive Transient Receptor Potential Vanilloid-1 ion channel in the oral mucosa, and dysfunction of central pain-modulating pathways.2,3 BMS is associated with substantial suffering and markedly reduced quality of life. 4
Individuals with burning oral sensations are diagnosed with Oral Mucosal Pain (OMP) when a local or systemic cause is identified, and with BMS when no underlying pathology is observed. These categories correspond to the former terms secondary and primary BMS. 5 Recently, it has also been proposed to rename BMS to BMD (burning mouth disorder). 6 BMS remains a diagnosis of exclusion. Confirming or ruling out BMS requires identifying affected individuals through a thorough medical history, clinical examination, and appropriate laboratory testing to eliminate alternative causes of oral mucosal pain. Several diagnostic systems include criteria for BMS, for example the International Classification of Headache Disorders (ICHD-3), 7 the World Congress of Oral Medicine (WCOM), 6 the International Classification of Diseases (ICD-11), 8 and the International Classification of Orofacial Pain (ICOP-1). 9
Recently, a structured diagnostic protocol, the Research Diagnostic Criteria for BMS (RDC/BMS), was developed using a Delphi process. 10 Anchored in the biopsychosocial model of chronic pain, the RDC/BMS incorporates symptom assessment, clinical examination findings, self-reported psychosocial health, and biomarkers. However, it remains a beta version, and none of these proposed diagnostic criteria have yet been validated. Field testing of diagnostic criteria, such as the ICHD-3, has been strongly advocated to improve diagnostic classification. 7 There is a clear need for more clinically useful and validated diagnostic criteria for BMS.
We hypothesized that current diagnostic criteria exhibit high sensitivity but low specificity, and that targeted optimization can improve their overall diagnostic performance.
The aims of this study were to:
Field-test and compare the diagnostic criteria of ICOP-1, ICHD-3, ICD-11, and WCOM for burning mouth syndrome (BMS) using a consensus-based reference standard and control groups (OMP, temporomandibular disorders, and non-pain individuals).
Develop optimized diagnostic criteria for BMS within the ICOP-1 framework by integrating factors that enhance diagnostic performance.
Materials and methods
Study design
This prospective case-control study collected all interview, survey, and clinical examination data before administering the index tests. Consensus diagnoses were established thereafter (see below). The study followed the Standards for Reporting of Diagnostic Accuracy (STARD) guidelines, 11 and the 30-item STARD checklist was completed.
Participants
Study sample, setting, and locations
Participants were recruited from three clinical sites in southern Sweden:
the Department of Oral and Maxillofacial Surgery and Oral Medicine, and the Department of Orofacial Pain and Jaw Function at Malmö University; the Department of Oral and Maxillofacial Surgery, Skåne University Hospital, Lund; and the Department of Orofacial Pain, Maxillofacial Surgery, Orofacial Medicine, and Oral Radiology at Blekinge County Hospital, Karlskrona.
The study period was from January 2023 to May 2025. Ethical approval was granted by the Swedish Ethical Review Authority (Application No. 2022-05895-01). All participants provided written informed consent and received 300 SEK compensation. Clinical examinations were performed at Malmö University or Blekinge County Hospital by the same three examiners.
Participant recruitment
A total of 137 individuals participated in the study: 76 patients with undiagnosed intraoral mucosal pain, 31 with TMD, and 30 pain-free controls. Consecutive patients referred for chronic intraoral burning pain (OMP/BMS) were recruited from the participating clinics. An age- and sex-matched convenience sample of TMD patients was recruited from specialist orofacial pain clinics at Malmö University and Blekinge County Hospital. Pain-free participants were recruited via public advertisements and were matched to the BMS group on age and sex.
Inclusion criteria
Participants were men and women aged 18–80 years (all groups). Additional criteria were oral mucosal pain lasting more than three months (BMS/OMP groups); a DC/TMD-based TMD pain diagnosis of more than three months’ duration (TMD group); and no orofacial pain within the last three months (non-pain group).
Exclusion criteria
Participants were excluded if they had communication barriers that impeded questionnaire comprehension (as identified in the referral or during the initial visit), a diagnosis of fibromyalgia, pregnancy or lactation, or ongoing dental treatment.
Procedure and data collection
All participants completed standardized questionnaires assessing general health, pain-related variables, and psychosocial factors, followed by a clinical examination. The examination involved the inspection and assessment of the masticatory muscles and the oral mucosa. For participants with undiagnosed intraoral mucosal pain, additional procedures were performed: candida testing, somatosensory testing, salivary flow measurement, and gustatory testing. Hematological analyses for patients with undiagnosed intraoral mucosal pain were conducted through public healthcare clinics.
To establish the reference standard for patients with undiagnosed intraoral mucosal pain, evaluations were conducted on the same day by an oral surgeon/oral medicine specialist (AT) and an orofacial pain specialist (TL). A consensus diagnosis was reached based on case history, clinical examination, and findings from candida testing, hematologic assessments, and salivary function assessments. In cases of diagnostic uncertainty, a third clinician (JPG) provided adjudication. TMD patients and healthy pain-free participants were examined solely by the orofacial pain specialist. 12
An independent examiner (ME), blinded to the consensus diagnoses and group allocation, evaluated all participants using the ICOP-1, ICHD-3, ICD-11, and WCOM diagnostic criteria before the reference standard examination. Index and reference examinations were conducted on the same day for most participants. However, because the reference examination required additional data (hematological, serological, and candida testing), the final consensus diagnosis was confirmed several weeks later.
All examiners underwent a joint training session to harmonize examination procedures and diagnostic criteria. TL and AT conducted additional discussions to align and document the diagnostic consensus procedures. All examiners were certified in DC/TMD procedures through the DC/TMD Training and Reliability Centre at Malmö University, Malmö, Sweden.
Protocol for clinical examination and questionnaire
Evaluation of patients with oral mucosal burning pain (BMS/OMP) followed RDC/BMS recommendations, including symptom self-reporting, clinical examination, psychosocial assessment, and optional biomarker testing.
The intraoral examination assessed oral mucosa color and morphology, the presence of lesions, erosions, or ulcerations, moisture level, and tongue characteristics – deep fissures, coatings, and papillary features – as well as salivary glands and general dental status. Procedures followed the RDC/BMS guidelines, 10 with modifications that included assessment of tongue parafunctional signs, occlusal relationships, periodontal status, and a modified Qualitative Somatosensory Test (QualST). The extraoral examination evaluated the masticatory system using the DC/TMD protocol. 13
Hematologic, serologic and candida testing
Patients with undiagnosed intraoral mucosal pain were referred to their primary healthcare provider to undergo laboratory evaluation according to the RDC/BMS protocol. 10 The evaluation included complete blood count (CBC/FBC), hemoglobin (Hb), mean corpuscular volume (MCV), mean corpuscular hemoglobin (MCH), white cell count (WCC), serum iron, folate, vitamin B12, zinc, HbA1c, thyroid function tests (T3, T4, TSH), and C-reactive protein (CRP). When clinically indicated, additional testing included liver enzymes (ALT, AST, GGT, ALP), albumin, autoantibodies (ANA, Anti-Ro, Anti-La, ENA), and homocysteine.
The clinical examination of patients with undiagnosed intraoral mucosal pain included testing for the presence of Candida. Smear samples were collected from the bilateral buccal vestibule and dorsal tongue using a sterile instrument. Samples were applied to glass slides, fixed in alcohol, and analyzed at the Department of Oral Pathology, Malmö University, using periodic acid–Schiff staining to detect Candida hyphae. If the results were positive, antifungal treatment was initiated before further investigation. Nystatin oral suspension (100,000 IU/mL; 1 mL four times daily) was prescribed, to be used preferably after meals and retained in the mouth as long as possible before swallowing. Treatment duration was four weeks.
Salivary function
Unstimulated and stimulated whole saliva volumes were measured using a standardized method. The collection time was 10 min for unstimulated saliva and 5 min for stimulated saliva. 14 Low salivary flow was defined as ≤0.1 mL/min for unstimulated whole saliva and ≤0.7 mL/min for chewing-stimulated saliva. 15
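The flow-rate thresholds above can be applied with a short calculation. The following is a minimal sketch, assuming the collected volume is recorded in mL; the function names and example volumes are illustrative and are not taken from the study protocol.

```python
# Thresholds quoted in the text (mL/min); all names here are illustrative.
UNSTIMULATED_LOW = 0.1   # unstimulated whole saliva
STIMULATED_LOW = 0.7     # chewing-stimulated whole saliva

# Collection times quoted in the text (minutes).
UNSTIMULATED_MINUTES = 10
STIMULATED_MINUTES = 5


def flow_rate(volume_ml: float, minutes: float) -> float:
    """Convert a collected saliva volume into a flow rate in mL/min."""
    if minutes <= 0:
        raise ValueError("collection time must be positive")
    return volume_ml / minutes


def is_low_flow(volume_ml: float, stimulated: bool) -> bool:
    """Return True when the flow rate is at or below the relevant threshold."""
    minutes = STIMULATED_MINUTES if stimulated else UNSTIMULATED_MINUTES
    threshold = STIMULATED_LOW if stimulated else UNSTIMULATED_LOW
    return flow_rate(volume_ml, minutes) <= threshold


# Hypothetical volumes: 0.8 mL over 10 min is 0.08 mL/min (low),
# 4.0 mL over 5 min is 0.8 mL/min (not low).
low_unstimulated = is_low_flow(0.8, stimulated=False)
low_stimulated = is_low_flow(4.0, stimulated=True)
```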
Qualitative somatosensory testing (QualST)
Testing was conducted on the side with the highest pain intensity for patients with BMS, OMP, or TMD, or on the right side when pain was equal bilaterally. For pain-free participants, all tests were performed on the right side. The tested regions were the dorsum of the hand (extra-trigeminal), cheek (extraoral), tongue, hard palate, and lower lip. Participants rated the perceived intensity of touch (using cotton swabs), cold (using metal spatulas cooled in an ice water bath), and heat (using metal spatulas warmed in a 50 °C water bath) on a 0–50–100 numerical rating scale, where 0 indicated no sensation, 50 barely painful, and 100 the worst imaginable pain. 16 Wind-up (WU) testing was performed at the tongue tip using a single mechanical stimulus applied with a pressure-calibrated manual periodontal probe (0.2 N), followed by ten consecutive stimuli delivered at 1 Hz. Participants rated pain intensity of the first and last repeated stimuli on a 0–10 numerical rating scale (0 = no pain; 10 = worst imaginable pain).
Questionnaires
Data collection included a semi-structured interview and validated self-report questionnaires assessing general health, pain-related variables, and psychosocial factors.
Sensory screening was performed using 0–10 numeric rating scales assessing olfactory perception, gustatory perception, and subjective mouth dryness. Specifications and results of clinical examinations, self-reported pain characteristics, and psychosocial measures are presented in a separate manuscript.
Reference standard: BMS
The diagnostic criteria for the BMS reference standard are presented in Table 1. These criteria were applied using responses from the self-report questionnaires, clinical evaluations, biomarker analyses, and, when applicable, treatment responses. Classification followed the RDC/BMS protocol 10 and relevant recommendations in the literature.3,5,25,26 The clinicians responsible for establishing the reference diagnosis (AT and TL) were blinded to all index-test results.
Table 1. The diagnostic criteria systems for burning mouth syndrome as they were applied for the reference standard and index tests.
Abbreviations: ICOP: International Classification of Orofacial Pain, WCOM: World Congress of Oral Medicine, ICD-11: International Classification of Diseases, 11th edition, ICHD-3: International Classification of Headache Disorders, 3rd edition.
For OMP diagnosis, AT and TL reached consensus using the ICOP criteria, 9 supplemented by the requirement of a ≥30% reduction in pain intensity following treatment of the identified etiological factor. Etiological factors were confirmed through specific tests (e.g., Candida culture, salivary flow measurement, and laboratory blood analysis) and through the response to targeted interventions such as antifungal treatment, vitamin supplementation, corticosteroid therapy, salivary substitutes, oral appliances, and, when possible, a change of medication. We expanded the diagnostic protocol to include two additional OMP subgroups: hyposalivation and parafunctional habits. This approach is consistent with previous classifications in which BMS associated with these factors was categorized as secondary BMS, based on the assumption that such cases may respond to etiology-directed therapy. 5 The hyposalivation subgroup was defined as patients with an unstimulated saliva flow ≤0.1 mL/min who experienced a ≥30% pain reduction after using a saliva substitute. Parafunctional habits were identified based on clinical signs (tongue indentations and linea alba) and a >30% pain reduction following use of an oral appliance in the form of an Essix splint. At delivery, patients were instructed to relax the jaw and tongue whenever they noticed themselves pressing against the appliance, so that the appliance served as a form of biofeedback.
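The treatment-response requirement amounts to a relative change in pain intensity. The sketch below assumes that pain is rated on the same numerical scale before and after treatment; the function names, the default threshold parameter, and the example ratings are hypothetical and serve only to illustrate the arithmetic.

```python
def percent_pain_reduction(before: float, after: float) -> float:
    """Relative reduction in pain intensity, in percent of the baseline rating."""
    if before <= 0:
        raise ValueError("baseline pain intensity must be positive")
    return 100.0 * (before - after) / before


def responds_to_treatment(before: float, after: float,
                          threshold_pct: float = 30.0) -> bool:
    """True when the reduction reaches the threshold (>= 30% by default)."""
    return percent_pain_reduction(before, after) >= threshold_pct


# Hypothetical ratings: a drop from 6 to 4 is a 33% reduction (responder),
# a drop from 6 to 5 is a 17% reduction (non-responder).
responder = responds_to_treatment(6, 4)
non_responder = responds_to_treatment(6, 5)
```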
Index test: diagnostic criteria and rationale
The diagnostic criteria for ICOP-1, WCOM, ICD-11, and ICHD-3 are presented in Table 1. For the index tests, criteria requiring exclusion of local or systemic factors were not applied, as the aim of the study was to evaluate each system based solely on symptom descriptions and clinical signs.
Several criteria required operational definitions because they were not explicitly specified within the original classification systems. For ICD-11, the diagnostic criteria do not define what constitutes significant emotional distress or functional disability. In this study, emotional distress was defined as the presence of at least one of the following: mildly elevated depressive symptoms (PHQ-9 > 5), anxiety (GAD-7 > 5), stress (PSS-10 > 13), or pain catastrophizing (PCS > 30). 18–21 Functional disability was assessed by asking participants whether their intraoral pain caused any functional limitation, with responses recorded as yes or no.
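The ICD-11 operational definition can be expressed as a simple rule. The sketch below uses the cut-offs quoted above; the class and function names are illustrative, and treating the criterion as met by either distress or a reported functional limitation is a reading of the text rather than a published algorithm.

```python
from dataclasses import dataclass


@dataclass
class QuestionnaireScores:
    """Total scores on the four questionnaires (illustrative container)."""
    phq9: int    # depressive symptoms
    gad7: int    # anxiety
    pss10: int   # perceived stress
    pcs: int     # pain catastrophizing


def emotional_distress(scores: QuestionnaireScores) -> bool:
    """At least one score above the cut-offs quoted in the text."""
    return (scores.phq9 > 5 or scores.gad7 > 5
            or scores.pss10 > 13 or scores.pcs > 30)


def meets_distress_or_disability(scores: QuestionnaireScores,
                                 functional_limitation: bool) -> bool:
    """Criterion met by emotional distress OR a self-reported limitation."""
    return emotional_distress(scores) or functional_limitation


# Hypothetical participant exactly at every cut-off: not above any of them.
at_cutoffs = QuestionnaireScores(phq9=5, gad7=5, pss10=13, pcs=30)
result_without_limitation = meets_distress_or_disability(at_cutoffs, False)
```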
In ICHD-3 criterion D, normal somatosensory testing and function are required, but no definitions or operational guidelines are provided. Using data from the QualST protocol, we developed three operational definitions of sensory changes that would disqualify a diagnosis of BMS. These definitions incorporated temporal summation and the presence of allodynia to touch, cold, or heat stimulation of the tongue during the QualST. Abnormal temporal summation was defined as a WU difference > 3.
Level 1: WU difference > 3 AND allodynia to ALL stimuli
Level 2: WU difference > 3 AND allodynia to ANY stimulus
Level 3: WU difference > 3 OR allodynia to ANY stimulus
The ICHD-3 diagnostic criteria incorporating these operational definitions were designated as ICHD-3 Definition 1, ICHD-3 Definition 2, and ICHD-3 Definition 3. A blinded examiner (ME) applied all diagnostic criteria 6–9 to each participant, and the resulting diagnoses were subsequently compared with those established using the reference standard.
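The three levels of sensory change can be sketched as boolean rules. This illustration assumes that the WU difference is the rating of the last repeated stimulus minus the rating of the first, and that allodynia to each stimulus has already been judged as present or absent; neither detail is specified in the text, and all names are hypothetical.

```python
def wu_difference(first_rating: float, last_rating: float) -> float:
    """Wind-up difference on the 0-10 scale (assumed: last minus first)."""
    return last_rating - first_rating


def sensory_change(level: int, first_rating: float, last_rating: float,
                   touch: bool, cold: bool, heat: bool) -> bool:
    """True when the sensory findings would disqualify a BMS diagnosis
    under the given level (1, 2, or 3)."""
    abnormal_wu = wu_difference(first_rating, last_rating) > 3
    allodynia_all = touch and cold and heat
    allodynia_any = touch or cold or heat
    if level == 1:
        return abnormal_wu and allodynia_all
    if level == 2:
        return abnormal_wu and allodynia_any
    if level == 3:
        return abnormal_wu or allodynia_any
    raise ValueError("level must be 1, 2, or 3")


# Hypothetical participant: ratings rise from 2 to 7 (difference 5),
# with allodynia to touch only.
findings = dict(first_rating=2, last_rating=7, touch=True, cold=False, heat=False)
by_level = {level: sensory_change(level, **findings) for level in (1, 2, 3)}
```

Under these assumptions the hypothetical participant is retained at Level 1 but excluded at Levels 2 and 3, which illustrates why the stricter levels remove progressively more patients.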
Statistical methods and data processing
Power calculation: Given the similarities between the index tests, we estimated a 10% difference in sensitivity or specificity. To achieve 80% power with 95% confidence, we calculated that 122 participants were required.
Sensitivity, specificity, and likelihood ratios were calculated for each diagnostic criterion using our consensus-based reference standard. In this field test, patients diagnosed with BMS according to our reference standard constituted true positives, and the remaining patient groups constituted true negatives. We calculated the area under the receiver operating characteristic curve (ROC curve) for each index test. Positive and negative predictive values (PPV and NPV) were estimated using a BMS prevalence of 1.7%, based on the most recently published data. 27 Data were processed and analyzed using IBM SPSS Statistics version 30.0.0.0 (IBM, Armonk, NY, USA).
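The accuracy measures listed above follow directly from a 2×2 table. The sketch below is illustrative only: the analyses in this study were run in SPSS, and the example counts are back-calculated from the sensitivity (35.3%) and specificity (93.2%) reported for ICHD-3 Definition 3 with 34 BMS and 103 non-BMS participants, so they are an assumption rather than tabulated data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Index test result against the reference standard."""
    tp: int  # BMS by reference, index test positive
    fn: int  # BMS by reference, index test negative
    tn: int  # non-BMS by reference, index test negative
    fp: int  # non-BMS by reference, index test positive


def sensitivity(t: TwoByTwo) -> float:
    return t.tp / (t.tp + t.fn)


def specificity(t: TwoByTwo) -> float:
    return t.tn / (t.tn + t.fp)


def lr_positive(t: TwoByTwo) -> float:
    false_positive_rate = 1 - specificity(t)
    # LR+ is unbounded when there are no false positives.
    if false_positive_rate == 0:
        return float("inf")
    return sensitivity(t) / false_positive_rate


def lr_negative(t: TwoByTwo) -> float:
    return (1 - sensitivity(t)) / specificity(t)


def ppv(t: TwoByTwo, prevalence: float) -> float:
    """Positive predictive value at an assumed population prevalence."""
    true_pos = sensitivity(t) * prevalence
    false_pos = (1 - specificity(t)) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def npv(t: TwoByTwo, prevalence: float) -> float:
    """Negative predictive value at an assumed population prevalence."""
    true_neg = specificity(t) * (1 - prevalence)
    false_neg = (1 - sensitivity(t)) * prevalence
    return true_neg / (true_neg + false_neg)


# Assumed counts (back-calculated, see lead-in) and the 1.7% prevalence.
example = TwoByTwo(tp=12, fn=22, tn=96, fp=7)
PREVALENCE = 0.017
summary = {
    "sensitivity_pct": round(100 * sensitivity(example), 1),
    "specificity_pct": round(100 * specificity(example), 1),
    "lr_positive": round(lr_positive(example), 1),
    "ppv_pct": round(100 * ppv(example, PREVALENCE), 1),
}
```

With these assumed counts the summary comes out at 35.3%, 93.2%, 5.2, and 8.2%, in line with the values reported for ICHD-3 Definition 3. Because the predictive values are computed from the assumed prevalence rather than from the case mix of the sample, the PPV stays low even when the LR+ is comparatively high.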
Results
A total of 145 participants were enrolled in the study; of these, 137 were included and 8 were excluded (Figure 1). Among the 137 individuals, 34 (28 women) were diagnosed with BMS, 31 (28 women) with TMD, and 30 (26 women) were non-pain participants. Thirty-seven (33 women) patients were diagnosed with various OMP conditions. Five patients (4 women) were diagnosed with neuropathic pain (NP) using ICOP diagnoses. 9 The distribution of OMP and NP conditions is presented in Table 2. In Tables 3 and 4, and in the following section, OMP and NP patients are presented as one group under the term other oral mucosal pain (OOMP). No significant age differences were observed among patients with BMS (median 59, IQR 51–67), OOMP (58, 47–71), TMD (56, 42–65), or non-pain participants (48, 50–66). Sex distribution and smoking status also did not differ between the groups.

Figure 1. Flow diagram explaining the enrollment process, order of tests taken, the number of participants undergoing the tests, and the outcome of the reference standard.
Table 2. Distribution of study participants in the diagnostic groups.
Table 3. Pain duration, characteristic pain intensity, and pain-related disability in patients with burning mouth syndrome, patients with other oral mucosal pain conditions, and patients with painful temporomandibular disorders.
Characteristic pain intensity and pain-related disability were obtained from the corresponding subscales of the Graded Chronic Pain Scale.
Table 4. Number of positive and negative results in the various index tests in the different participant groups according to the reference standard.
Abbreviations: BMS: Burning mouth syndrome, OOMP: Other Oral Mucosal Pain, TMD: Temporomandibular disorder, ICOP: International Classification of Orofacial Pain, WCOM: World Congress of Oral Medicine, ICD-11: International Classification of Diseases, 11th edition, ICHD-3: International Classification of Headache Disorders, 3rd edition.
Pain characteristics and pain-related disability are summarized in Table 3. No significant differences were observed among the BMS, OOMP, and TMD groups in characteristic pain intensity, average pain intensity, or pain-related disability.
Table 4 shows the distribution of positive and negative index test results (BMS vs. non-BMS) across the diagnostic criteria and reference standard groups. WCOM yielded the highest number of positive BMS diagnoses (n = 67), whereas ICHD-3 Definition 3 produced the highest number of non-BMS diagnoses (n = 118). The numbers of true positives, false negatives, true negatives, and false positives, along with sensitivity, specificity, PPV, NPV, likelihood ratios, and the area under the ROC curve (AUC) for all index tests, are shown in Table 5.
Table 5. The number of true positives, false positives, true negatives, false negatives, sensitivity, specificity, positive predictive value, and negative predictive value in the various diagnostic criteria for burning mouth syndrome.
Abbreviations: ICOP: International Classification of Orofacial Pain, WCOM: World Congress of Oral Medicine, ICD-11: International Classification of Diseases, 11th edition, ICHD-3: International Classification of Headache Disorders, 3rd edition, PPV: positive predictive value, NPV: negative predictive value, LR+: likelihood ratio of a positive test, LR-: likelihood ratio of a negative test, AUC: Area under the receiver operating characteristic curve. *: based on a prevalence of 1.7%.
WCOM demonstrated the highest sensitivity and NPV, while ICHD-3 Definition 3 showed the highest specificity and PPV. Definition 3 also yielded the highest likelihood ratio for a positive test (LR+ 5.2), while WCOM showed the lowest likelihood ratio for a negative test (LR- 0.14) among the index tests. WCOM exhibited the highest AUC (0.771), whereas ICD-11 showed the lowest (0.623).
Based on the collected data, we proposed optimized diagnostic criteria for BMS within ICOP, as shown in Table 6. Using the optimized ICOP criteria, sensitivity increased to 94.1% (95% CI: 80.3–99.3), while specificity was 78.6% (69.5–86.1). The PPV was 7.1% (5.0–10.0), and the NPV was 99.9% (99.50–99.97). The optimized ICOP criteria showed the second-highest likelihood ratio for a positive test (LR+ 4.4, 3.0–6.4) and the lowest likelihood ratio for a negative test (LR- 0.07, 0.02–0.29). In the ROC analysis, these optimized ICOP criteria demonstrated the highest area under the ROC curve in this study (0.874).
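The text does not state how the confidence intervals were derived. As a sketch, an exact binomial (Clopper-Pearson) interval, which is one common choice and is an assumption here, can be computed with the standard library alone. The count of 32 of 34 BMS patients testing positive is back-calculated from the reported sensitivity of 94.1% and is likewise an assumption.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def _bisect(f, lo: float = 0.0, hi: float = 1.0, iterations: int = 100) -> float:
    """Root of a monotone function whose sign differs at lo and hi."""
    f_lo = f(lo)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return (lo + hi) / 2


def clopper_pearson(k: int, n: int, alpha: float = 0.05):
    """Exact two-sided (1 - alpha) confidence interval for a proportion k/n."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _bisect(
        lambda p: 1 - binom_cdf(k - 1, n, p) - alpha / 2)
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _bisect(
        lambda p: binom_cdf(k, n, p) - alpha / 2)
    return lower, upper


# Assumed count: 32 of 34 BMS patients positive (94.1% sensitivity).
ci_low, ci_high = clopper_pearson(32, 34)
```

Under these assumptions the interval is approximately 0.803 to 0.993, close to the reported 80.3–99.3%. The width of the interval reflects the limited number of BMS patients in the sample.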
Table 6. The diagnostic criteria for burning mouth syndrome according to the suggested optimized ICOP criteria.
Notes: 1: Alternative wording is acceptable provided it describes an equivalent to a burning sensation.
Abbreviations: ICOP: International Classification of Orofacial Pain.
Only conditions A–C were included in the diagnostic accuracy calculations to ensure comparability with analyses of the other diagnostic criteria.
Discussion
This study demonstrates for the first time that data-driven diagnostic criteria for BMS may improve diagnostic performance compared with existing ICHD and ICOP criteria. Commonly used diagnostic criteria show limited overall accuracy, typically achieving either acceptable sensitivity or acceptable specificity, but not both. This indicates that none of the current diagnostic criteria can accurately identify BMS patients while excluding patients with OMP. Given the diagnostic complexity of BMS,25,28 this limitation is expected and highlights the need for data-driven criteria, such as those provided in this study. Inclusive criteria such as WCOM showed high sensitivity (91%) but low specificity (65%) due to a substantial overlap with OMP, whereas restrictive criteria such as ICD-11 demonstrated high specificity but low sensitivity. In contrast, the optimized ICOP criteria improved overall diagnostic accuracy, with 94% sensitivity and 79% specificity.
Field-testing has been applied in several studies to assess the diagnostic validity of clinical criteria as it reflects real-world settings outside the laboratory.13,29 In our study, we enrolled consecutive participants from a large region in southern Sweden, encompassing both urban and rural areas, who had been referred to three different clinical settings for evaluation of intraoral burning pain. The age and sex distribution, as well as pain characteristics, were consistent with those reported in previous studies of BMS.30–32 Therefore, we consider our results to be representative of the broader BMS patient population.
All index tests distinguished BMS from TMD and non-pain participants; however, their diagnostic accuracy varied in relation to OMP. The main differences between the diagnostic criteria are as follows: ICHD-3 requires normal somatosensory testing but does not define what constitutes somatosensory changes or specify how such changes should be determined. 7 We presented three versions of the criteria based on three different thresholds for sensory change. The first definition excluded patients only if a WU difference >3 was accompanied by allodynia to all three stimuli (touch, cold, and heat). The second excluded patients if a pathological WU difference was accompanied by allodynia to any stimulus. The third excluded patients who showed either a pathological WU difference or any allodynia. The choice of a WU difference >3 as the threshold for pathological temporal summation is open to discussion, although using the WU difference rather than the ratio has been reported to yield higher reliability. 33 In this study, a substantial subset of BMS patients demonstrated somatosensory changes, which is expected based on the literature.2,3,34–37 This resulted in several BMS patients being excluded at ICHD-3 Definition 2, and most being excluded at Definition 3. Although a previous study showed that BMS patients diagnosed using ICHD-3 and ICOP were similar in clinical profile, 30 the requirement of normal somatosensory function is poorly defined and substantially reduces the sensitivity of the criteria.
In the ICD-11, BMS requires the presence of significant emotional distress or functional limitations. These specific criteria are, however, undefined. For emotional distress, we used participants’ questionnaire data on depression, anxiety, stress, and catastrophizing.18–21 For functional limitations, we asked participants whether their intraoral pain caused any functional impairment. Patients who scored above the normal range on any of the questionnaires or who reported pain-related functional limitations were considered to meet the criterion. Nevertheless, these requirements led to the exclusion of many BMS patients, resulting in relatively low sensitivity. At the same time, the criteria excluded many OMP patients, giving a relatively high specificity.
ICHD-3 and ICOP require normal oral mucosa for a BMS diagnosis. This led to the exclusion of several BMS patients who presented with mucosal lesions. However, no causal relationship could be established between these lesions and the intraoral pain. This is supported by previous findings showing that oral lesions and BMS are not mutually exclusive. 31 The ICOP classification also allows for subclassification of BMS into cases with or without somatosensory changes. In our study, we could have applied the same operational approach as for ICHD-3, which would allow BMS patients to be subclassified rather than excluded. However, doing so would not alter the diagnostic performance of the main criteria.
Based on the data collected in this study, we aimed to propose new, improved diagnostic criteria for BMS for future versions of ICOP and ICHD. The resulting Optimized ICOP criteria were constructed to maximize the number of correctly identified BMS cases (true positives) while excluding the largest possible number of non-BMS participants (true negatives). As noted above, the substantial similarities between BMS and OMP patients in symptom characteristics and other clinical aspects made this task challenging. However, we attempted to highlight the few differences between the groups when constructing the criteria.
In this study, all BMS patients reported continuous, bilateral pain, consistent with earlier studies. 32 Although cases of unilateral BMS have been reported, such cases often correspond more closely to post-traumatic trigeminal neuropathy and typically exhibit pain characteristics and associated symptoms that differ from those of bilateral BMS.38,39
Most BMS patients described their primary symptom as burning pain. However, a subset described discomfort, dysesthesia, or tingling with a burning quality, consistent with previous findings. 30 Because the ICHD-3, ICOP, and ICD-11 criteria specify only burning pain as the defining descriptor, patients using alternative but related descriptors risk being excluded. To address this, we allowed discomfort or dysesthesia with a burning quality as acceptable descriptors.
All but two BMS patients reported daily pain, and most experienced symptoms throughout waking hours. Our criterion requiring daily pain for ≥6 h per day produced the highest specificity while maintaining very high sensitivity.
Criterion C in the Optimized ICOP does not require normal oral mucosa for a BMS diagnosis, unlike ICHD-3 and ICOP. However, the criterion still requires exclusion of local and systemic factors that could account for the pain.
Acceptable diagnostic performance for orofacial pain conditions is typically defined as sensitivity >70% and specificity >95%. 13 None of the diagnostic criteria in this study demonstrated sufficiently high specificity. ICHD-3 Definition 3 showed the highest specificity, but at the same time exhibited a very low sensitivity. This likely reflects the substantial overlap in most clinical characteristics between BMS and OMP patients, 40 as all criteria successfully excluded all TMD patients and non-pain participants. However, in real-life clinical practice, systemic and local factors should be investigated and excluded, which would enhance the specificity of the diagnostic process. Thus, it is likely that the various sets of criteria have higher specificity in real-life settings. Because our aim was to examine and improve the remaining conditions of the criteria, this exclusion step was not incorporated into our calculations, including those for the optimized ICOP.
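The benchmark above can be checked against the values reported in this study. The sketch below uses the sensitivity and specificity percentages stated in the text for three of the criteria sets; the function and dictionary names are illustrative.

```python
MIN_SENSITIVITY_PCT = 70.0
MIN_SPECIFICITY_PCT = 95.0


def is_acceptable(sensitivity_pct: float, specificity_pct: float) -> bool:
    """Both thresholds must be exceeded for acceptable performance."""
    return (sensitivity_pct > MIN_SENSITIVITY_PCT
            and specificity_pct > MIN_SPECIFICITY_PCT)


# (sensitivity %, specificity %) as reported in the text.
reported = {
    "WCOM": (91.2, 65.0),
    "ICHD-3 Definition 3": (35.3, 93.2),
    "Optimized ICOP": (94.1, 78.6),
}

verdicts = {name: is_acceptable(*values) for name, values in reported.items()}
```

None of the three sets passes, and in each case specificity is below the 95% threshold; ICHD-3 Definition 3 additionally falls short on sensitivity.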
The overall diagnostic performance of a test can be measured using the area under the ROC curve (AUC). The Optimized ICOP exhibited the highest AUC. However, the ROC curve is ideally suited for continuous variables and may not be the most appropriate measure of diagnostic performance in our study, given the dichotomous nature of the index tests. 41
Another useful measure of diagnostic performance for dichotomous tests is the likelihood ratio. Together with the pre-test probability of a diagnosis, in this case the prevalence, likelihood ratios can be used to estimate the post-test probability, where a greater shift from pre- to post-test probability indicates a higher diagnostic performance. 42 For a positive test result, ICHD-3 Definition 3 showed the greatest increase from pre- to post-test probability. However, its very low sensitivity would result in many missed BMS cases. The Optimized ICOP produced strong diagnostic evidence, yielding the second-highest LR+ and offering a more balanced and clinically appropriate compromise.
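The conversion from pre-test to post-test probability works through odds. The following minimal sketch uses the 1.7% prevalence as the pre-test probability and the rounded likelihood ratios reported for the Optimized ICOP (LR+ 4.4, LR- 0.07); the function names are illustrative.

```python
def probability_to_odds(probability: float) -> float:
    return probability / (1 - probability)


def odds_to_probability(odds: float) -> float:
    return odds / (1 + odds)


def post_test_probability(pre_test_probability: float,
                          likelihood_ratio: float) -> float:
    """Post-test odds = pre-test odds x likelihood ratio."""
    post_odds = probability_to_odds(pre_test_probability) * likelihood_ratio
    return odds_to_probability(post_odds)


PREVALENCE = 0.017  # pre-test probability used in this study

# Probability of BMS after a positive and after a negative result.
after_positive = post_test_probability(PREVALENCE, 4.4)
after_negative = post_test_probability(PREVALENCE, 0.07)
```

With these inputs the probability of BMS rises from 1.7% to about 7.1% after a positive result and falls to about 0.1% after a negative one, matching the reported PPV of 7.1% and NPV of 99.9%. The low pre-test probability explains why even a likelihood ratio above 4 leaves the post-test probability modest.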
Methodological considerations
Although our sample size provided 80% power, a larger sample might have revealed additional differences between the BMS and OMP groups, potentially allowing further optimization of our proposed diagnostic criteria. It is also worth noting that the size of the subgroups remains limited, which may affect the precision of our calculations. Another limitation is that the TMD group and controls received only a brief intraoral screening as part of their assessment by the orofacial pain specialist. Thus, some more subtle intraoral mucosal lesions may have gone undetected.
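To illustrate how subgroup size affects precision, the sketch below computes a 95% Wilson score interval for a sensitivity estimated from 34 BMS patients. The count of 32 true positives is an assumption inferred from the reported sensitivity of 94.1% (32/34), and the Wilson method is used here only as an example; it is not necessarily the interval method applied in this study.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score confidence interval for a binomial proportion.

    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    z2 = z * z
    denom = 1.0 + z2 / total
    center = (p + z2 / (2.0 * total)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / total + z2 / (4.0 * total * total)) / denom
    return center - half_width, center + half_width


# Assumed counts: 32 of 34 BMS patients test positive (inferred from 94.1%)
low, high = wilson_interval(successes=32, total=34)
print(f"Sensitivity 32/34 = {32 / 34:.3f}, 95% CI {low:.3f} to {high:.3f}")

# Same proportion with a hypothetical tenfold larger subgroup, for comparison
low_large, high_large = wilson_interval(successes=320, total=340)
print(f"Sensitivity 320/340, 95% CI {low_large:.3f} to {high_large:.3f}")
```

The interval narrows markedly in the larger hypothetical subgroup, which is the sense in which limited subgroup size reduces the precision of the estimates.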
The external validity of this study is somewhat limited by several factors. The study population may not be representative of real-life settings, as we excluded patients with widespread pain conditions such as fibromyalgia to avoid possible confounding effects on some of our examinations, especially the QualST.
The population in the present study is limited to patients in southern Sweden, which may limit the generalizability of our findings to other populations. At the same time, there is little to indicate that this population differs from the general population.
Another factor that may limit external validity is how we applied the reference standard. Although necessary because of the absence of a gold standard, it includes features that overlap with those in the index tests. We tried to mitigate this limitation by using a blinded operator to apply the index tests. However, the reference standard is still inherently subjective and thus affects the validity of the study. Moreover, the use of a novel, unvalidated method of QualST and thresholds for somatosensory changes further limits the external validity of our findings and would need further testing.
OMP diagnosis was supplemented by the requirement of a ≥30% reduction in pain intensity following treatment of the identified etiological factor. When the initiated treatment did not achieve this ≥30% reduction, patients were classified as BMS in accordance with the recommendations of the RDC/BMS protocol. 10
The strength of this study lies in its prospective design, in which patients with intraoral burning pain were recruited, data were collected, and index tests were performed before establishing a diagnosis according to the reference standard. Furthermore, ME, who performed all index tests, was blinded to all participants’ diagnoses. Consensus decisions were reached by two independent, experienced clinicians/researchers in oral medicine and orofacial pain. Overall, this study represents a novel field-testing approach for evaluating and redefining diagnostic criteria for BMS.
Conclusions
This innovative clinical study indicates that the proposed data-driven diagnostic criteria for BMS will improve diagnostic performance compared with existing ICHD and ICOP frameworks. Current and commonly used diagnostic criteria show limited accuracy, generally achieving either acceptable sensitivity or acceptable specificity – but not both.
Article highlights
A novel field-testing approach was applied to evaluate diagnostic criteria for BMS.
Commonly used diagnostic criteria for BMS show substantial variation in diagnostic performance, typically achieving either acceptable sensitivity or specificity.
This study provides the first data-driven diagnostic criteria for BMS.
The proposed criteria show superior diagnostic performance compared with established criteria systems.
Footnotes
Acknowledgements
ORCID iDs
Ethical considerations
Ethical approval was granted by the Swedish Ethical Review Authority (Application No. 2022-05895-01).
Consent to participate
All participants provided written informed consent and received 300 SEK compensation.
Consent for publication
All authors consent to the publication of this work in Cephalalgia.
Author contributions
This study was designed by TL, PA, AT, PS, and ME. Data were collected by ME, TL, and AT. Diagnoses were obtained through consensus between TL, AT, and JPG. The data were analyzed by ME and PA, and the results were critically examined by all authors. ME and TL had a primary role in drafting the manuscript, which was subsequently edited by PA and ME, with input from AT, JPG, and PS. All authors approved the final version and agreed to be accountable for all aspects of the work.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by Region Skåne (grant number OFRS993154).
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement
All data are available upon reasonable request to the corresponding author, contingent upon the approval of a Data Transfer Agreement by the Swedish Ethical Review Authority.
Open practices
Not applicable
