Abstract
Background:
The use of thyroid ultrasound increases yearly, adding to costs and overdetection of clinically irrelevant nodules. We investigated which indications most commonly prompt referral for thyroid ultrasound and the diagnostic utility by indication.
Methods:
We performed a retrospective observational cohort study of adults (≥18 years) undergoing an initial dedicated thyroid ultrasound between 2017 and 2019 at a tertiary academic center. Indicated reasons for referral were categorized into suspected palpable nodule (SPN), compressive symptoms (CS), metabolic symptoms (MS), screening due to high-risk factors, follow-up of incidental finding on other imaging, and combination of factors. Percentage of ultrasounds with an identifiable nodule and with a nodule recommended for biopsy was compared by indication. Separate logistic regression models were used to identify factors associated with finding any nodule and a biopsy-recommended nodule.
Results:
Among the 1739 patients included, the most common indication for thyroid ultrasound was SPN (40%), followed by incidental imaging (28%), CS (13%), combination (11%), MS (6%), and high-risk factors (2%). Overall, 62% of ultrasounds identified a nodule. Ultrasounds performed for incidental findings had the highest rate of nodule identification (94%), compared with 55%, 39%, and 43%, for SPN, CS, and MS, respectively (p < 0.05). Only 27% of ultrasounds identified a biopsy-recommended nodule. Nodules found incidentally had the highest rate of biopsy-recommended nodules at 55%. Rates of biopsy-recommended nodules for SPN, CS, and MS were 21%, 6%, and 10%, respectively. Logistic regression demonstrated that compared with patients referred for an SPN, those with incidental nodules were 10 times more likely to have a nodule found on ultrasound (odds ratio [OR] = 10.6 [CI 7.0–16.0]), while those referred for CS were half as likely to have a nodule (OR = 0.5 [CI 0.4–0.7]). Similar factors were associated with identification of biopsy-recommended nodules.
Conclusions:
Of all new dedicated thyroid ultrasounds, only a quarter find biopsy-recommended nodules, and nearly 40% do not identify a nodule at all. Notably, only 55% of ultrasounds done for SPN found a nodule. Ultrasound for CS and MS had the lowest rates of detecting nodules. Providing clear guidance on when to order thyroid ultrasounds can help reduce unnecessary health care utilization and potential overtreatment.
Introduction
The number of thyroid ultrasounds performed in the United States has increased fivefold since 2002. 1 This substantial increase produces a significant strain on health care resources and leads to overdetection and overtreatment of benign thyroid nodules and small indolent cancers with questionable clinical relevance. 1 –4 In addition, the discovery of clinically irrelevant nodules can result in undue patient anxiety, requires long-term surveillance, and risks overtreatment. 5
Several factors have contributed to the increase in thyroid ultrasound use, including the ease of accessibility to ultrasound and providers' increased reliance on diagnostic imaging, in general. 6 –8 Moreover, there is little guidance on referral practice for thyroid ultrasounds, as current guidelines focus on the management of nodules once found. 9,10 While certain societies offer guidance on ordering a thyroid ultrasound in specific situations, such as nodules found incidentally on other imaging (American College of Radiology [ACR]) or for workup of abnormal thyroid function tests (Choosing Wisely), there are no comprehensive guidelines covering appropriate indications for thyroid ultrasound. 11,12
Furthermore, controversy exists regarding whether to order ultrasounds in certain situations, such as high-risk individuals with familial diseases or strong family history, or compressive symptoms (CS). 9,12 –15 Consistent with this, a recent meta-analysis evaluating the appropriateness of thyroid ultrasounds demonstrated that each of the seven studies had its own definition of what constitutes an appropriate ultrasound. 16
Determining the upstream factors that lead to utilization of thyroid ultrasounds is crucial to enhancing patient care, reducing unnecessary testing, and improving guidance on when thyroid ultrasounds are useful. 17 In this study, data from a large tertiary health care system are used to better understand which indications prompt providers to refer for thyroid ultrasound and the diagnostic utility by indication.
Methods
The retrospective observational cohort study was approved by the University of Wisconsin Health Sciences Institutional Review Board (ID: 2020-0128). We followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guidelines.
Study population
All adults (≥18 years) who were referred for a new dedicated thyroid ultrasound at a tertiary academic medical center between the years 2017 and 2019 were identified in the electronic medical record (Epic Clarity; Epic Systems Corporation) by a data scientist (A.M.M.). Eligible patients were required to have at least two outpatient encounters in the year preceding the ultrasound.
The 1-year lookback and the two encounter requirement were used to identify and review the clinic notes that describe the ultrasound referral indication as well as capture patient comorbidities. Two encounters were used as the accuracy of the Charlson Comorbidity Index (CCI) is substantially enhanced when two outpatient encounters are used to capture comorbidities. 18 Patients with a documented history of thyroid cancer, any previously recorded thyroid ultrasounds, or who were incarcerated were excluded.
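The eligibility rules described above can be written as a simple filter. This is a minimal sketch for illustration, not the query actually run against the electronic medical record; the `Referral` record, its field names, and `is_eligible` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Referral:
    # All field names are hypothetical; they are not the study's schema.
    age_years: int
    outpatient_encounters_prior_year: int
    history_of_thyroid_cancer: bool
    prior_thyroid_ultrasound: bool
    incarcerated: bool


def is_eligible(r: Referral) -> bool:
    """Apply the inclusion and exclusion criteria stated in the text."""
    # Inclusion: adult with at least two outpatient encounters in the
    # year preceding the ultrasound.
    meets_inclusion = (
        r.age_years >= 18
        and r.outpatient_encounters_prior_year >= 2
    )
    # Exclusion: thyroid cancer history, any previously recorded thyroid
    # ultrasound, or incarceration.
    excluded = (
        r.history_of_thyroid_cancer
        or r.prior_thyroid_ultrasound
        or r.incarcerated
    )
    return meets_inclusion and not excluded
```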
Data extraction
Trained research staff reviewed referring provider notes, ultrasound report, fine-needle aspiration biopsy (FNAB) report, and relevant encounters leading up to the ultrasound for thyroid ultrasound referral indications. The extraction form was developed by senior authors (D.O.F., L.M.G., and S.F.T.). The extraction team (E.K., C.C., Y.Q., Y.Z., A.A., K.B., and D.S.) reviewed the first 10 charts together using the extraction form to establish consensus on how best to extract fields from the notes to ensure reliability and consistency.
Thereafter, reliability was checked in several additional stages. Quality checks were performed after 100 charts and again after 700 charts (∼20%) had been completed: an independent member of the study personnel (Y.Z.) rereviewed each completed chart, and discrepancies were discussed and resolved among the extraction team. In addition, the team met bimonthly to discuss extraction issues and questions. This process resulted in a high degree of extraction consistency within the team.
Extractors categorized referral reasons into (1) suspected palpable nodule (SPN) on physical examination; (2) CS (e.g., globus sensation and dysphagia); (3) metabolic symptoms (MS; e.g., fatigue, weight change, and heat intolerance); (4) screening due to high risk for thyroid cancer (e.g., multiple endocrine neoplasia, Pendred syndrome, and family history of thyroid cancer); (5) follow-up of incidental thyroid nodule detected on another imaging study; or (6) combination of aforementioned factors. Suspected cases of goiter were included in the physical examination findings of a palpable thyroid nodule or mass.
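The six-way categorization above can be sketched as a small function. This is an illustrative assumption about how a set of documented reasons might be collapsed into one category, not the extraction form used in the study; the `Indication` enum and `categorize` are hypothetical names.

```python
from enum import Enum
from typing import Set


class Indication(Enum):
    SPN = "suspected palpable nodule"  # includes suspected goiter, per the text
    CS = "compressive symptoms"
    MS = "metabolic symptoms"
    HIGH_RISK = "screening due to high risk for thyroid cancer"
    INCIDENTAL = "incidental nodule on another imaging study"
    COMBINATION = "combination of factors"


def categorize(reasons: Set[Indication]) -> Indication:
    """Collapse the documented reasons for one referral into one category."""
    # COMBINATION is a derived category, so ignore it if passed in.
    single = reasons - {Indication.COMBINATION}
    if not single:
        raise ValueError("at least one referral reason is required")
    # More than one distinct reason maps to the combination category.
    if len(single) > 1:
        return Indication.COMBINATION
    return next(iter(single))
```

For example, a patient presenting with CS and an SPN on examination would fall under `Indication.COMBINATION`.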
Additional variables extracted included the date of ultrasound, maximum diameter of largest nodule identified, if a biopsy was recommended by the interpreting radiologist, and if FNAB was performed. Demographic variables extracted included patient age, sex, race/ethnicity (White, Black, Asian, and Hispanic), insurance status (Medicaid, Medicare, private insurance, and uninsured), and CCI (0, 1, and 2+).
Outcomes
The primary outcome of interest was the percentage of thyroid ultrasounds performed that had an identifiable nodule for each referral indication. The secondary outcome evaluated “biopsy-recommended” nodules specifically, which were defined as nodules that the radiologist recommended for biopsy. Radiologists at our institution routinely utilize the ACR Thyroid Imaging Reporting and Data System (TI-RADS), and recommendations for biopsy are based on nodule size and TI-RADS category; however, specific TI-RADS scores were not abstracted. The percentage of biopsy-recommended nodules among all ultrasounds performed was compared among each referral indication.
Statistical analysis
Differences between groups for categorical variables were calculated utilizing chi-squared tests and continuous variables (e.g., nodule size) were compared using analysis of variance (ANOVA). Two separate multivariable logistic regression analyses were performed to analyze the association of referral indications with (1) the identification of a nodule on ultrasound and (2) identification of a biopsy-recommended nodule on ultrasound. Both logistic regression models controlled for factors known to affect ultrasound utilization: race/ethnicity, insurance type, CCI, sex, and age. 19 All statistical analysis and data visualizations were performed using SAS (version 9.4, Cary, NC).
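To make the categorical comparison concrete, the Pearson chi-squared statistic for a table of counts can be computed as below. The analysis in this study was run in SAS; this Python version is only a sketch of the underlying calculation, the function name is hypothetical, and the counts are illustrative and are not study data.

```python
def chi_squared(table):
    """Return (statistic, degrees_of_freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Illustrative counts only (NOT study data): rows are two referral
# indications, columns are "nodule found" and "no nodule found".
example = [[10, 20], [20, 10]]
stat, dof = chi_squared(example)
```

In this illustrative table every expected count is 15, so the statistic is 4 × 25/15 ≈ 6.67 with 1 degree of freedom, which exceeds the 0.05 critical value of 3.84.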
Results
Demographic data
In all, 3459 discrete neck ultrasounds were identified between 2017 and 2019. Of these, 1739 (50%) met inclusion criteria (Fig. 1). The majority of patients were female (76%), White (86%), privately insured (62%), and healthy (76% with CCI 0); the mean age was 53 years (standard deviation [SD] 17; Table 1).

Figure 1. Flow diagram of patient inclusion into study.
Table 1. Demographics by Ultrasound Indication
CCI, Charlson Comorbidity Index.
Indications
The most common indication for thyroid ultrasound referral was an SPN on physical examination (40% of all ultrasounds performed; Table 1). The next leading referral indication was follow-up of a nodule identified on incidental imaging (28%). Referral for CS and MS made up 13% and 6% of ultrasounds, respectively. Table 2 shows the distribution of specific CS and MS. Only 2% of patients were referred for thyroid ultrasound due to high-risk factors. Patients referred for combined indications accounted for 11% of thyroid ultrasounds. The combined indication was most often (45%) due to a patient presenting with CS and an SPN on examination (Supplementary Table S1).
Table 2. Reported Symptoms Among Patients Referred for Thyroid Ultrasound for Symptom Workup
Total of 223 patients with local symptoms only. Two patients had unspecified symptoms. Percentages add up to more than 100% as patients could report multiple symptoms.
Thyroid nodule identification by indication
Among all ultrasounds performed, 62% identified a thyroid nodule. Patients referred for incidental findings had the highest percentage of ultrasounds with thyroid nodules present at 94% (Fig. 2). Several possible reasons exist for the 6% of incidental findings that lacked a thyroid ultrasound correlate. For example, some prior imaging was only “suggestive” of a nodule rather than definitive. These findings typically derived from imaging that was not optimized for the thyroid (e.g., chest or cervical spine computed tomography scan and nuclear imaging), and the suggested nodule was ultimately not detectable on dedicated thyroid ultrasound.

Figure 2. Percentage of ultrasounds with an identified nodule (dark gray) and percentage of ultrasounds with an identified biopsy-recommended nodule (light gray), by ultrasound indication.
In comparison, ultrasounds for patients referred because of an SPN on examination and for CS identified nodules 55% and 39% of the time, respectively. Patients with MS had a nodule on ultrasound 43% of the time. Among those referred for high-risk factors, 57% had a nodule present. Patients referred for combined indications had nodules on 42% of ultrasounds. ANOVA indicated the difference in nodule detection rate between indications was statistically significant (p < 0.05).
Only 27% of ultrasounds identified a thyroid nodule that was recommended for biopsy. Those referred from incidental imaging had the highest percentage of biopsy-recommended nodules (55%; Fig. 2). This was followed, in order, by patients referred for physical examination of a palpable thyroid nodule (21%), high-risk factors (20%), combined indications (16%), MS (10%), and CS (6%).
Nodule size
The overall mean maximum diameter of identified thyroid nodules was 1.9 cm (SD 1.3 cm) (Table 3). Of note, 32 radiology reports identified a thyroid nodule but did not record size and were, therefore, excluded from this specific analysis. Patients referred for incidentally found thyroid nodules had the largest mean nodule size at 2.4 cm (SD 1.2). All other groups had a mean nodule size between 1.2 and 1.8 cm. The difference in average size of nodule was statistically significant (p < 0.05).
Table 3. Mean, Median, and Maximum Thyroid Nodule Size by Ultrasound Referral Indication
SD, standard deviation; US, ultrasound.
Factors associated with detecting a nodule on ultrasound
A logistic regression model evaluated the association of patient age, sex, ethnicity, CCI, insurance status, and referral indication with identification of a nodule on ultrasound (Table 4). Compared with patients referred for an SPN on examination, those with incidental nodules were over 10 times more likely to have a nodule found on ultrasound (odds ratio [OR] = 10.6 [confidence interval {CI} 7.0–16.0]). Conversely, patients referred for CS were half as likely to have an identifiable nodule compared with those referred for physical examination findings (OR = 0.5 [CI 0.4–0.7]). In addition, the odds of finding a nodule increased with age, especially for those aged 65 years and above (vs. age <45 OR = 3.6 [CI 2.2–5.9]). Finally, females were twice as likely to have a nodule found on thyroid ultrasound (OR = 2.0 [CI 1.5–2.6]).
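As a rough sanity check on these estimates, crude (unadjusted) odds ratios can be computed directly from the detection percentages reported earlier (94% for incidental findings, 55% for SPN, 39% for CS). This is a sketch only: the helper names are hypothetical, the inputs are rounded percentages, and the result is not the adjusted model output in Table 4.

```python
def odds(p: float) -> float:
    """Convert a probability into odds."""
    return p / (1.0 - p)


def crude_odds_ratio(p_group: float, p_reference: float) -> float:
    """Unadjusted odds ratio of a group versus the reference group."""
    return odds(p_group) / odds(p_reference)


# Reported nodule detection rates; SPN is the reference group.
incidental_vs_spn = crude_odds_ratio(0.94, 0.55)  # roughly 12.8
cs_vs_spn = crude_odds_ratio(0.39, 0.55)          # roughly 0.52

# For contrast, the ratio of the two detection probabilities themselves.
risk_ratio_incidental = 0.94 / 0.55               # roughly 1.7
```

The crude values (about 12.8 and 0.52) are close to the adjusted estimates of 10.6 and 0.5; some difference is expected because the model adjusts for age, sex, race/ethnicity, insurance, and CCI, and because the inputs here are rounded. The contrast with the ratio of probabilities (about 1.7) is a reminder that an odds ratio describes odds, which diverge from probabilities when the outcome is common.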
Table 4. Multivariable Logistic Regression of Variables Associated with the Identification of a Nodule on Ultrasound and with the Identification of a Biopsy-Recommended Nodule on Ultrasound
OR, odds ratio.
Results were similar when evaluating only biopsy-recommended nodules (Table 4). Compared with patients referred for an SPN on examination, those referred for incidental findings were significantly more likely to have a biopsy-recommended nodule (OR = 4.7 [CI 3.5–6.3]), while those referred for CS were significantly less likely to have a biopsy-recommended nodule on ultrasound (OR = 0.3 [CI 0.1–0.5]). There was no difference in identifying biopsy-recommended nodules between sexes (female vs. male OR = 1.2 [CI 0.9–1.6]).
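Odds ratios from a logistic regression are exponentiated coefficients, and their confidence intervals are usually symmetric on the log scale. The sketch below back-calculates the implied coefficient and standard error from the reported OR of 4.7 (CI 3.5–6.3) and rebuilds the interval. It assumes a 95% Wald interval, which the text does not state explicitly, and the function names are hypothetical.

```python
import math

Z_95 = 1.96  # assumption: the reported interval is a 95% Wald interval


def wald_from_reported(or_point, ci_low, ci_high):
    """Recover the log-odds coefficient and standard error from an OR and CI."""
    beta = math.log(or_point)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return beta, se


def wald_interval(beta, se):
    """Rebuild the confidence interval on the odds-ratio scale."""
    return math.exp(beta - Z_95 * se), math.exp(beta + Z_95 * se)


# Incidental findings vs. SPN, biopsy-recommended nodule (reported values).
beta, se = wald_from_reported(4.7, 3.5, 6.3)
low, high = wald_interval(beta, se)
```

The rebuilt interval matches the reported 3.5–6.3 after rounding, which is consistent with the interval being symmetric around log(4.7) and not with a symmetric interval on the OR scale.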
Discussion
Of all thyroid ultrasounds performed over a 3-year period, only a quarter found a nodule recommended for biopsy, and nearly 40% did not identify a nodule at all. This high rate of negative findings points to an area of potential medical overuse. The reliance on thyroid ultrasound is understandable, as it is widely regarded as safe, efficient, and an important part of identifying treatable thyroid cancers. 20 However, ultrasonography also contributes significantly to the increasing costs of thyroid pathology management, which are projected to exceed $3.5 billion by 2030. 11 Moreover, many clinically insignificant nodules are surveilled after the initial test, adding surplus costs and resource use. 3,4 In light of high cumulative costs and low rates of clinically relevant nodules, it is important to re-evaluate the rationale for ultrasound use to reduce unnecessary testing.
Ultrasound referrals for CS had the lowest clinical yield, with fewer than half discovering a thyroid nodule, and only 6% finding a biopsy-recommended nodule. Although thyroid nodules can be a source of symptoms such as dysphagia, dysphonia, globus sensation, and dyspnea, these symptoms are far more likely to be from other etiologies. 13 Our data suggest that using these symptoms as an indication for thyroid ultrasound is of low value. Furthermore, nodules found during workup of symptoms were often small (the mean size was roughly a centimeter), further calling into question whether these nodules can be accurately attributed to the patient's symptoms (Table 3).
Guidelines from the American Thyroid Association, American Head and Neck Society, and American Association of Endocrine Surgeons do not make specific recommendations on using ultrasound to work up local CS; however, the American Association of Clinical Endocrinologists states that an ultrasound is indicated without the presence of a palpable nodule if there is “persistent dysphonia, dysphagia, or dyspnea.” 9,10,13
It is unclear if this suggests an ultrasound should be performed after the evaluation of other causes of symptoms or what duration is implied by “persistent.” Regardless, our data suggest that guidelines should consider limiting the use of thyroid ultrasound in the workup of symptomatology, as this could help reduce unnecessary tests while also expediting the workup of other, more likely etiologies of a patient's symptoms. In particular, symptoms such as globus pharyngeus, hoarseness, and dysphagia are more appropriately evaluated with tests such as flexible laryngoscopy, manometry, and swallow studies. 21,22
Ultrasound referral due to SPN also had a very low rate of nodule detection. In this group, a nodule was present only 55% of the time, which is particularly notable in light of autopsy studies showing the incidence of thyroid nodules to be 40–55%. 23 This reinforces the idea that the physical examination added little overall to the pretest probability of identifying a nodule. The low diagnostic yield of the thyroid examination is consistent with published research demonstrating the questionable quality of physical examinations related to the thyroid.
One study that compared thyroid ultrasound findings with thyroid physical examinations performed by two independent physicians demonstrated that 63.7% of cases were either missed or misinterpreted. 14 Although physical examination remains an essential part of the physician's toolbox, thyroid palpation has repeatedly demonstrated a high rate of missed nodules and a significant false-positive rate, as seen in our study and others. 15 Clinical examination skills, in general, have been noted to be increasingly less accurate, which has been attributed to an increased reliance on technology along with ever-increasing time pressures on providers that leave less time for thorough examinations. 24
Considering that physical examination findings were the largest driver of thyroid ultrasounds, the utility of the physical examination in the diagnostic pathway of thyroid pathology needs to be critically re-evaluated. To be clear, the thyroid examination is still an important part of a physician's evaluation, but it is essential to recognize our own limitations and not rely solely on the physical examination to guide decision-making. This also presents a need for further evaluation into how provider factors, such as specialty, time in practice, and training, impact diagnostic utility of the physical examination.
The low rates of nodule detection for certain indications point to a potential area of medical waste. In addition, when nodules are detected, many are ultimately clinically irrelevant yet still incur significant costs and impart an emotional toll on patients. 3,5 In light of this, consensus guidelines on when it is appropriate to order a thyroid ultrasound are needed. These would not only help guide providers but could also be useful in educating patients, as patient requests are a frequently cited driver of unnecessary thyroid ultrasounds. 14
This study had a number of limitations. First, these findings are from a single health care system and race/ethnicity, socioeconomic differences, and practice patterns may not be representative of national or global diversity. In addition, data on thyroid function tests, more granular criteria on rationale for recommending biopsy (ACR TI-RADS score), and biopsy results were not abstracted, which limited the detail of our analysis.
Data limitations also restricted our ability to determine whether the ultrasounds ordered were appropriate and to analyze their respective diagnostic yields on biopsy, as fine-needle aspiration results were not collected. Furthermore, only the presence and size of nodules were analyzed, and the presence of thyromegaly without nodules could influence an indication, particularly physical examination findings and symptoms. Calculation of full thyroid volume, and its relation to symptoms and palpability, would help present a more complete picture and is an area that needs further study.
In addition, we cannot discern the relevance of MS because we do not have laboratory data to indicate if these ultrasounds were appropriately ordered. Finally, this study captured only thyroid ultrasounds that were formally referred to radiology. It is unclear how often point-of-care or in-office thyroid ultrasound was utilized, which may affect the frequency of ultrasounds performed for each indication and potentially diagnostic yield. Future research should expand to multiple sites and compare additional characteristics of each indication, including practice setting and provider specialty training.
Conclusions
Only a quarter of dedicated thyroid ultrasounds yielded a nodule recommended for biopsy, and 39% did not have a nodule present at all. Moreover, only half of ultrasounds performed to evaluate a suspected nodule on physical examination actually identified one. Ultrasound referrals for symptoms had the lowest diagnostic yield, with fewer than 10% identifying a biopsy-recommended nodule. These findings suggest that re-evaluation of guidelines on thyroid ultrasound referral may be necessary to decrease the overutilization of thyroid ultrasound and the downstream consequences of detection of small potentially irrelevant nodules.
Authors' Contribution
Data curation (equal), investigation (equal), and writing—original draft (equal) by E.K. Data curation (equal), investigation (equal), supervision (equal), project administration (equal), and writing—review and editing (equal) by Y.Z. Data curation (equal) by Y.Q., C.C., A.A., K.B., and D.S. Conceptualization (equal) by L.M.G. Conceptualization (equal), supervision (equal), project administration (lead), methodology (equal), and writing—review and editing (equal) by N.A. Data curation (equal) and investigation (equal) by A.M.M. Conceptualization (equal), supervision (equal), project administration (equal), methodology (equal), and writing—review and editing (equal) by S.F.T. Conceptualization (lead), resources (equal), and writing—review and editing (equal) by D.O.F. Conceptualization (equal), methodology (equal), formal analysis (equal), and writing—original draft (equal) by A.S.C.
Footnotes
Author Disclosure Statement
No competing financial interests exist.
Funding Information
E.K., Y.Z., Y.Q., C.C., A.A., K.B., L.M.G., A.M.M., and A.S.C. have nothing to disclose. N.A., S.F.T., and D.O.F. were supported by National Institutes of Health (NIH), National Cancer Institute (Grant Nos. R01CA251566 and R01CA251566-02S1).
Supplementary Material
Supplementary Table S1
