Abstract
Background:
Multiple ultrasound-based risk stratification systems (RSSs) for thyroid nodules are used worldwide. Variations in structure, performance, and recommendations are confusing for physicians and patients and complicate management decisions. The goal of this study was to determine the factors that are associated with choice of RSS and barriers to RSS use. These results are intended to inform development of a universal international thyroid ultrasound RSS.
Methods:
An online survey with questions about usage of RSSs, ultrasound practice and volumes, training, specialty, practice type, and geographic region was made available to members of five professional societies via email. Subgroup analysis was performed to identify the factors that governed use of one or more of five leading RSSs: American Association of Clinical Endocrinology (AACE), American College of Endocrinology (ACE), and Associazione Medici Endocrinologi (AME) Medical Guidelines, American College of Radiology Thyroid Imaging Reporting and Data System (ACR TI-RADS), American Thyroid Association (ATA) guidelines, European Thyroid Association TIRADS (EU-TIRADS), and Korean Society of Thyroid Radiology/Korean Thyroid Association TIRADS (K-TIRADS).
Results:
There were 875 respondents from 52 countries (response rate not estimated due to overlapping society membership). More than 7 specialties were represented, with most (538; 61.5%) in endocrinology. The choice of RSS was strongly associated with medical specialty and geographic region. Of 692 respondents who indicated that their practice used an RSS, 213 (30.8%) used more than one. The specialties that were more likely to use multiple RSSs were surgery and others (40%), followed by endocrinology (33.0%), and radiology or nuclear medicine (17%) (p < 0.001). Of 271 (31.0%) respondents who indicated that they do not personally use an RSS, the majority (168; 62%) preferred to describe the specific sonographic characteristics/features that they believe are most relevant in a nodule.
Conclusions:
Almost one third of respondents indicated use of more than one RSS in their practice, potentially leading to confusion, and a similar proportion reported not using an RSS for various reasons. A unified international system that addresses their concerns and simplifies risk classification of thyroid nodules may benefit practitioners and patients. This is particularly important as newer thyroid nodule management options gain acceptance.
Introduction
Over the past two decades, many organizations have developed ultrasound-based risk stratification systems and management guidelines for thyroid nodules, hereinafter termed RSSs. These include the American College of Radiology Thyroid Imaging Reporting and Data System (ACR TI-RADS), the American Thyroid Association (ATA) guidelines, the Chinese Medical Association (C-TIRADS), the European Thyroid Association (ETA, EU-TIRADS), the Korean Society of Thyroid Radiology/Korean Thyroid Association (KSThR/KTA, K-TIRADS), the Society of Radiologists in Ultrasound (SRU Consensus Conference Statement), and the American Association of Clinical Endocrinology (AACE), the American College of Endocrinology (ACE), and the Associazione Medici Endocrinologi (AME), the latter three societies having collaborated to produce a unified document (AACE/ACE/AME Medical Guidelines for Clinical Practice for the Diagnosis and Management of Thyroid Nodules) (1–7). As well, several groups of investigators have developed RSSs that are not associated with a specific professional organization (8,9).
Multiple studies have compared RSSs to determine if some perform better than others, but the results have been inconsistent, probably reflecting differing patient populations, inclusion criteria, analytic methods, and other factors. For example, a meta-analysis of 8 studies that evaluated the performance of 5 RSSs found that the ACR TI-RADS resulted in fewer unnecessary biopsies than the ATA guidelines, K-TIRADS, EU-TIRADS, or the Kwak TIRADS (TIRADS proposed by Kwak) (10). However, other research has produced different results. For example, in one recent study, the ATA guidelines outperformed the ACR TI-RADS (11). Reported variations in performance make it difficult for individual practitioners and groups to know which RSS to employ for their patients.
In addition to performance, the choice of which RSS to use is influenced by other considerations (Table 1). Notably, the RSSs take somewhat different approaches to assigning nodules to risk categories based on ultrasound features. For example, the ATA guidelines include pattern matching, while the ACR TI-RADS awards points in five categories, which are then summed to produce a cancer risk level. Other systems, such as the EU-TIRADS and K-TIRADS, use an intermediate approach in which classification is based on the presence of key suspicious features in combination with others. The malignancy likelihood associated with each risk category also differs considerably between RSSs. Disparities between RSSs are further characterized by differing size cutoffs for fine-needle aspiration biopsy (FNAB) and recommendations for follow-up. Additionally, practitioners may be more likely to adopt an RSS that is endorsed by a professional society to which they belong, which may in turn relate to their geographic location.
Characteristics of Five Thyroid Ultrasound Risk Stratification Systems
AACE/ACE/AME, American Association of Clinical Endocrinology, American College of Endocrinology, Associazione Medici Endocrinologi; ACR TI-RADS, American College of Radiology Thyroid Imaging Reporting and Data System; ATA, American Thyroid Association; EU-TIRADS, European Thyroid Association Thyroid Imaging Reporting and Data System; K-TIRADS, Korean Society of Thyroid Radiology/Korean Thyroid Association Thyroid Imaging Reporting and Data System; RSS, risk stratification system.
Not surprisingly, these differences have led to confusion for medical professionals and patients. To help resolve this conundrum, a grassroots initiative managed by the steering committee of the International Thyroid Nodule Ultrasound Working Group (ITNUWG) is striving to develop an international RSS, tentatively termed I-TIRADS, that harmonizes the leading RSSs (12). To help guide this effort, which at this writing is nearing the end of its first phase, the ITNUWG conducted an international survey on RSS use to determine the factors that govern RSS choice and investigate why some physicians do not use an RSS at all. In this article, we summarize the results of this survey with the hope of fostering support from practitioners who deal with thyroid nodules.
Materials and Methods
Survey design and participants
We performed an online survey, intended as a needs assessment to inform development of a future international thyroid ultrasound RSS. The institutional review board of the University of Alabama at Birmingham determined that approval was not needed for secondary review of survey data. The survey was developed by four members of the ITNUWG steering committee (F.N.T., L.H., C.D., and E.P.) and was built using an online platform from
As noted, our goal was to gain a better understanding of how and why RSSs are used to guide the development of I-TIRADS. Five professional organizations (ATA, AME, ETA, KSThR, SRU) that we felt were most likely to garner responses from their members solicited participation via email. We decided against including the ACR because most members do not interpret thyroid ultrasound in their practice and because we felt that the minority who do might receive and respond to the SRU solicitation. As well, the ACR does not typically conduct surveys of this type. No monetary or other reward was provided to participants.
The 22-question survey (Supplementary Data), which was designed and tested to be completed in less than 15 minutes, asked about respondents' opinions, choice, and usage of RSSs, demographics, geographic region, specialty, level of training, relevant experience, practice type, and volume of thyroid ultrasound and FNAB. Participation was voluntary. The survey website was open from August through December 2020.
Analysis of survey results
Summary statistics were prepared for responses to each question. Because not every participant answered all the questions, the percentage of respondents providing a given answer was calculated individually for each question, using the number of respondents to that question as the denominator.
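The per-question denominator rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' actual workflow: the function name `question_percentages`, the use of `None` to mark a skipped question, and the toy answer list are all assumptions introduced here. The final line simply re-derives one percentage from counts reported in this article (213 of 692 respondents).

```python
from collections import Counter


def question_percentages(answers):
    """Return ({answer: percent}, n_answered) for one survey question.

    `answers` holds one entry per participant; None (a hypothetical
    convention for this sketch) marks a skipped question and is left
    out of the denominator.
    """
    answered = [a for a in answers if a is not None]
    n_answered = len(answered)
    counts = Counter(answered)
    percentages = {
        answer: round(100 * count / n_answered, 1)
        for answer, count in counts.items()
    }
    return percentages, n_answered


# Toy data (hypothetical): four participants, one of whom skipped the question.
toy_answers = ["single", "single", "multiple", None]
percentages, n_answered = question_percentages(toy_answers)

# Counts taken from the article: 213 of 692 practices used more than one RSS.
reported = round(100 * 213 / 692, 1)
```

Because the denominator is recomputed for each question, percentages for different questions are not directly comparable unless the number of respondents to each is also reported.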
Subgroup analysis was performed for the factors that determined the use of five leading RSSs: AACE/ACE/AME Medical Guidelines, ACR TI-RADS, ATA guidelines, EU-TIRADS, and K-TIRADS. The factors used for analysis were specialty, geographic region, training level, practice type, experience based on practice volumes and years in practice, and FNAB volume. Three categories were employed for analysis of medical specialty: (i) endocrinology, (ii) radiology or nuclear medicine, and (iii) surgery or other specialty. Three categories were also used for analysis of geographic region: (i) Americas (North America, Central America, South America, and the Caribbean); (ii) Europe, Africa, and the Middle East; and (iii) Asia and Oceania.
Survey results were exported into a spreadsheet (Excel; Microsoft, Redmond, WA), and statistical analyses were performed using SPSS Statistics Version 27 (SPSS, Inc., Chicago, IL). Chi-square tests were used to compare factors associated with the use of RSSs and choice of RSSs. A p value of 0.05 was used as the threshold for statistical significance.
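For readers unfamiliar with the chi-square test of independence, the following Python sketch shows how the Pearson statistic is formed from a contingency table. It is not the SPSS procedure used in this study, and the article does not state whether a continuity correction or exact test was applied, so the sketch is not expected to reproduce the reported p values. The function names and the 2 x 2 counts are hypothetical, and the p value helper covers only 1 or 2 degrees of freedom, where simple closed forms are available.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi_square_p_value(stat, df):
    """Upper-tail p value; this sketch handles only df = 1 and df = 2."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("this sketch only covers 1 or 2 degrees of freedom")


# Hypothetical counts, not survey data: rows are two groups,
# columns are "uses an RSS" and "does not use an RSS".
toy_table = [[30, 10], [20, 20]]
stat, df = chi_square_independence(toy_table)
p_value = chi_square_p_value(stat, df)
significant = p_value < 0.05  # threshold stated in the Methods
```

A 3 x 2 table, such as three geographic regions by RSS use, would have 2 degrees of freedom and is handled by the same two functions.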
Results
Study group
Table 2 shows the characteristics of the 875 respondents, who spanned 52 countries. The response rate could not be estimated due to overlapping society membership. Three hundred ninety-one (54.0%) respondents were from Europe and 205 (28.3%) from North America. The majority were attending level physicians (632; 72.2%) and in academic practice (500; 57.1%). There were more than 7 specialties represented, with most (538; 61.5%) in endocrinology. The second and third most common specialties were radiology (180; 20.6%) and surgery (100; 11.4%).
Characteristics of Survey Respondents
Use and choice of RSS
A total of 724 respondents answered questions about RSS awareness, value, and use. Almost all (685; 94.6%) were at least somewhat familiar with an RSS (Table 3). The majority either strongly agreed or agreed (659; 91.0%) that there was value in RSS usage. There was wide variation in the use of RSS by practice. Six hundred ninety-two respondents indicated that at least one RSS was used in their practice (Table 4). Notably, almost one third of them (213; 30.8%) specified that more than one RSS was used. For the practices that used only one RSS, the EU-TIRADS (126; 18.2%) and ACR TI-RADS (115; 16.6%) were the most common. Overall, RSS usage alone or in combination was as follows, in descending order: ATA (235; 34%), ACR TI-RADS (233; 33.7%), EU-TIRADS (205; 29.6%), AACE (142; 20.5%), K-TIRADS (101; 14.6%), other (32; 4.6%), none (36; 5.0%).
Opinions on Value and Structure of Thyroid Ultrasound Risk Stratification Systems
Use of Single Versus Multiple Thyroid Ultrasound Risk Stratification Systems in Respondent's Practice
Determinants of RSS use
The use of an RSS was high in all regions but was greatest in Asia/Oceania (97.2%), followed by the Americas (95.8%), and Europe, Africa, and the Middle East (89.8%) (p < 0.002). The specialty, training level, practice type, clinical practice volumes, and FNAB experience did not influence the use of an RSS. The use of multiple RSSs was highest in the Americas (40.2%), followed by Europe, Africa, and the Middle East (26.2%), and Asia/Oceania (19.3%) (p < 0.001).
The specialties that were more likely to use multiple RSSs were surgery and others (40%), followed by endocrinology (33.0%), and radiology or nuclear medicine (17%) (p < 0.001).
Few specialties indicated no RSS use: 4% for endocrinology, 2% for radiology or nuclear medicine, and 7% for surgery and others.
Attendings and trainees (fellows and residents) used RSSs differently. RSSs were used by 504/533 (94.6%) attendings compared with 167/191 (87.4%) trainees (p < 0.001). Trainees were less likely to use multiple RSSs (47/191; 24.6%) compared with attendings (166/533; 31.1%) (p < 0.001). There were no significant differences in the use of RSS based on age (p = 0.081) or gender (p = 0.127).
Determinants of RSS choice
The choice of RSS was dependent on specialty and region (Table 5) (all p < 0.001). Endocrinologists were more likely to use the AACE/ACE/AME Medical Guidelines (31.7%). Radiologists and nuclear medicine physicians tended to use the ACR TI-RADS (52.6%). Surgeons and others were more likely to use the ATA guidelines (51.9%). Respondents from Europe, Africa, and the Middle East mostly used the EU-TIRADS (35.1%) and the AACE/ACE/AME Medical Guidelines (35.1%). Respondents from the Americas more often used the ACR TI-RADS (78.7%) and the ATA guidelines (55.1%). Respondents from Asia/Oceania mostly employed the K-TIRADS (88.1%).
Use of Thyroid Ultrasound Risk Stratification Systems by Specialty and Region
All p-values are <0.001.
Total using RSSs includes people who used one or more RSSs, including those who reported use of an RSS other than the five listed RSSs.
Percentages refer to the proportion of the specialty and geographic region that use each of the five RSSs.
NM, Nuclear Medicine.
Reasons for RSS nonuse and suggestions for improvement
Although almost all survey respondents (94.6%) claimed at least some familiarity with RSSs (Table 3), 271 indicated that they, personally, did not use an RSS. Most of them (168; 62%) responded that they preferred to describe the specific sonographic characteristics/features that they think are most relevant in a nodule. Other reasons included the lack of a requirement for RSS use by their institution (53; 19.6%), the multiplicity of available RSSs (49; 18.1%), a perception that RSSs add little to observation of suspicious features (40; 14.8%), a contention that expertise is at least as effective as any RSS (35; 12.9%), insufficient evidence for RSS utility (22; 8.1%), RSS complexity (20; 7.4%), and concern that RSS use would increase the likelihood of being sued for malpractice (6; 2.2%).
Of 724 respondents, 449 (62%) indicated that a universal lexicon paired with illustrative images of ultrasound features would reduce interobserver variability. Respondents also expressed support for a comprehensive online atlas of ultrasound images and/or video clips (391; 54%) and for training on a universal lexicon endorsed by a professional organization (325; 44.9%). Regarding RSS structure (Table 3), the largest group (285; 39.4%) preferred a points-based system, followed by a pattern approach (206; 28.5%) and the presence of one or more suspicious features (161; 22.2%). Finally, the majority (689; 95.2%) indicated a preference for no more than 5 risk categories.
Use of ablation procedures and elastography
Of 777 respondents who answered questions about these techniques, 598 (77%) indicated that they do not perform ultrasound-guided thyroid ablation, and 549 (70.7%) responded that they do not use ultrasound-based elastography to classify thyroid nodules. However, 168 of them (21.6%) indicated that they would like to start using the latter technique.
Discussion
In this international survey of self-selected practitioners with an interest in thyroid nodule risk stratification, 91% of respondents agreed or strongly agreed that there was value in RSS use. Not unexpectedly, we found that the choice of RSS was closely associated with medical specialty and geographic region. For example, the K-TIRADS is likely to be favored in the Republic of Korea, while the EU-TIRADS is bound to be more familiar to and adopted by European physicians. However, a surprising finding was that almost 31% of respondents reported use of more than one RSS in their practice. This suggests that in addition to variability in the choice of RSS related to location and specialty, there is also inconsistency within practices.
We speculate that this may occur when the physicians who comprise the practice prefer a particular RSS but referring physicians request that a different system be used for their patients. It is also possible an individual's RSS use relates to their training experience, and their choice may be different from that of their colleagues. Of the respondents using multiple RSSs, the most common combination was the ACR TI-RADS and ATA, particularly in American institutions. For example, radiologists in the United States may categorize nodules with the ACR TI-RADS as recommended by the ACR while simultaneously reporting them using the ATA guidelines at the request of endocrinologists and surgeons, who understandably want to avoid re-classifying them based on reported ultrasound findings.
More than half of the respondents expressed a desire for a universal lexicon to reduce interobserver variability, an issue that has also been highlighted by several investigators. A multicenter study found that interobserver variability for six features ranged from poor to substantial, with most characterized as moderate (13). Similarly, Hoang et al. found poor interobserver agreement for margin and punctate echogenic foci, leading them to recommend targeted education (14). Respondents also indicated support for an online atlas of ultrasound images and/or video clips to illustrate features.
The reasons given by the 271 respondents who indicated that they did not personally use an RSS were particularly instructive. Addressing their concerns will be critical to encourage adoption of a universal RSS. Notably, more than 60% preferred to use their own description of the ultrasound features they consider most relevant, rather than apply an RSS. However, this further contributes to confusion for patients and physicians caused by the use of multiple systems and guidelines, as well as leading to reduced efficiency and errors caused by re-classification.
It also complicates research that seeks to refine risk assessment based on grayscale ultrasound and other techniques, such as elastography. This is supported by a study by Solymosi et al. that compared FNAB recommendations based on personal expertise with a composite score derived from the five RSSs in our study (15). These investigators found that personal experience provided a more accurate diagnosis for thyroid malignancies overall, missing a lower number of small thyroid cancers, while TIRADS exhibited similar accuracy for clinically relevant or potentially aggressive lesions.
Even more concerning are variable recommendations for biopsy and follow-up imaging between RSSs. This issue has become particularly acute in recent years as patients increasingly access their own reports and wonder why a nodule might receive divergent management advice. This will be increasingly important as active surveillance and minimally invasive therapy for thyroid cancer management gain acceptance (16–18).
Although we do not believe that a universal RSS will eliminate variations in management, which will always be guided by local practice and other considerations, we believe that it could serve as a foundation upon which such decisions can be made. Even more importantly, a widely endorsed system produced by an international panel of experts could substantially reduce the collective effort currently expended by multiple professional organizations to develop or revise their own guidelines. Additionally, a universal RSS could be revised more frequently to better keep pace with techniques such as ultrasound elastography that are not widely used now but could be adopted in the future.
Our study had several limitations. First, although we attempted to canvass a diverse group of physicians, these results may not be generalizable because most respondents were academic practitioners and/or endocrinologists. Second, we were unable to determine the response rate for the relevant population of practitioners since most of the participating societies offered the survey to all their members, many of whom may not see patients with thyroid nodules. As well, many organizations whose members are interested in thyroid nodules, such as the AACE and the American Association of Endocrine Surgeons, were not surveyed.
It was not possible to determine the response rate, as some respondents indicated membership in more than one society, and we did not ask to which organization's solicitation they were replying. Physicians with a greater interest in RSSs were more likely to reply to a survey titled “Risk Stratification Systems: Usage and Needs,” potentially biasing the results. However, our purpose was to explore the practice patterns and opinions of a group of stakeholders, rather than conduct formal hypothesis-driven research.
Conclusions
Despite these shortcomings, we believe that the results of this survey, particularly the concerns expressed by practitioners who do not use an RSS, will be valuable to inform the ongoing work of the ITNUWG. Respondents indicated strong support for a universal lexicon and image atlas, the development of which is part of the I-TIRADS initiative. As well, we hope the finding that more than one RSS is used in nearly one third of practices will foster interest in adopting a unified system that has the potential to reduce diversity in thyroid nodule classification, benefiting practitioners and patients alike.
Acknowledgments
The authors thank Drs. Susan Mandel and William Middleton for their assistance in soliciting survey participation by members of the American Thyroid Association and the Society of Radiologists in Ultrasound, respectively.
Authors' Contributions
All authors confirm that they have contributed significantly to all aspects of this work and approved the final version of the article.
Author Disclosure Statement
No competing financial interests exist.
Funding Information
No funding was received for this article.
Supplementary Material
Supplementary Data
