Abstract
Background:
The American Thyroid Association Sonographic Pattern System (ATASPS) depicts five levels of suspicion for malignancy based on the sonographic appearance of a thyroid nodule. However, 3–37% of nodules are non-classifiable when the combination of grayscale findings is not depicted by the ATASPS. The only calcifications included in the ATASPS are in solid hypoechoic high suspicion (HS) nodules and include both microcalcifications and peripheral interrupted calcifications with soft tissue extrusion. Non-hypoechoic nodules with these and other calcification patterns, which we defined as non-high suspicion calcifications (NHSC), are not classifiable by ATASPS. We assessed the effect of assigning an ATASPS risk level to nodules with NHSC based on analysis of their other grayscale features.
Methods:
A retrospective review of 728 consecutively biopsied nodules was performed. Nodules were classified by ATASPS as HS, intermediate suspicion (IS), low suspicion (LS), or very low suspicion (VLS); other nodules with patterns not described by ATASPS were non-classifiable (NC). If NC was due to NHSC, the nodule was assigned an ATASPS by analysis of grayscale features alone. Cytology and pathology results were correlated with assigned ATASPS level.
Results:
An NC pattern was observed in 144 of the 728 nodules (20%). Of these, 101/144 (70%) had NHSC and the assigned ATASPS was IS (n = 18), LS (n = 62) and VLS (n = 21). The distribution of cytology diagnoses within this group was similar to that of classifiable nodules (IS p = 0.13, LS p = 0.55, VLS p = 0.44). The majority of NHSC (n = 92, 91%) were macrocalcifications (large central or linear dystrophic calcifications); however, 9 LS pattern nodules had punctate echogenic foci, possibly representing microcalcifications, with an estimated cancer prevalence of 19% (vs. 10% for total LS group, p = 0.24). The remaining NC nodules (43/144, 30%) included solid nodules with heterogeneous echogenicity (n = 30) or presence of a complete circumferential rim calcification, limiting further sonographic assessment (n = 13). Malignancy was identified in 11 out of 43 (26%) of these [9/30 (30%) heterogeneous solid and 2/13 (15%) with complete rim calcifications].
Conclusions:
Macrocalcifications accounted for the majority of NHSC and these did not alter the expected ATASPS malignancy risk based on grayscale features.
Introduction
Over 600,000
To this end, the American Thyroid Association (ATA) 2015 thyroid nodule guidelines (12) described and illustrated five levels of suspicion for malignancy based on the sonographic appearance of a thyroid nodule: benign, very low suspicion (VLS), low suspicion (LS), intermediate suspicion (IS), and high suspicion (HS), each with a corresponding estimated range of malignancy risk (3). Purely cystic nodules are considered benign and do not require FNA. The VLS nodules are mixed cystic solid or spongiform nodules and are reported to have a risk of cancer below 3%. The LS nodules are non-calcified, smoothly marginated solid, and iso- or hyperechoic nodules or mixed nodules with an eccentric solid component and have a malignancy risk of 5–10%. Solid and hypoechoic nodules with no other high-risk features have a 10–20% chance of cancer and are considered IS. Finally, HS pattern nodules, with a reported malignancy rate of 70–90%, are described as hypoechoic solid nodules with any of the following: microcalcifications (small nonshadowing bright spots), taller than wide shape, irregular margins, interrupted peripheral calcifications with an extrusive soft tissue component, extrathyroidal extension, or suspicious lymph nodes. Size thresholds for FNA vary based on the American Thyroid Association Sonographic Pattern System (ATASPS), with observation alone being appropriate for any size of VLS pattern nodules.
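The pattern definitions above can be read as a small decision rule. The following Python sketch is illustrative only and is not part of the ATA guidelines or of this study's methods: the `Nodule` fields and the function name `atasps_pattern` are hypothetical, the list of HS features is collapsed into a single flag, and any calcification outside the HS pattern is simply returned as non-classifiable (NC), ignoring the nonshadowing echogenic foci that the system allows in VLS nodules.

```python
from dataclasses import dataclass


@dataclass
class Nodule:
    """Hypothetical, simplified description of one nodule."""
    composition: str                    # "cystic", "mixed", "spongiform", or "solid"
    echogenicity: str = "n/a"           # solid nodules: "hypo", "iso", "hyper", "heterogeneous"
    hs_feature: bool = False            # any high suspicion feature listed in the text
    other_calcification: bool = False   # calcification not covered by the HS pattern
    eccentric_solid: bool = False       # mixed nodule with an eccentric solid component


def atasps_pattern(n: Nodule) -> str:
    """Return "benign", "VLS", "LS", "IS", "HS", or "NC" (non-classifiable)."""
    if n.composition == "cystic":
        return "benign"
    if n.composition == "solid" and n.echogenicity == "hypo" and n.hs_feature:
        return "HS"
    # Assumption: suspicious features or calcifications outside the HS
    # pattern are not described by the system, so the nodule is NC.
    if n.other_calcification or n.hs_feature:
        return "NC"
    if n.composition == "solid":
        if n.echogenicity == "hypo":
            return "IS"
        if n.echogenicity in ("iso", "hyper"):
            return "LS"
        return "NC"  # for example, heterogeneous echogenicity
    if n.composition == "mixed" and n.eccentric_solid:
        return "LS"
    if n.composition in ("mixed", "spongiform"):
        return "VLS"
    return "NC"
```

For example, `atasps_pattern(Nodule("solid", "iso", other_calcification=True))` returns "NC" under these assumptions.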
An estimated 3–37% of nodules are NC by ATASPS (3–6). The presence of calcifications in non-HS nodules is not described by the ATASPS. This includes both macrocalcifications with distal acoustic shadowing and nonshadowing punctate echogenic foci (PEF), termed microcalcifications by the ATASPS. However, the positive predictive value of PEF for pathologically proven psammomatous or microcalcifications is only 45–48%; in other instances, PEF reflect thick colloid or nonshadowing macrocalcifications (7). Therefore, we decided to adopt the term PEF when describing their presence.
The aim of our study was to assess whether non-high suspicion calcifications (NHSC), defined as macrocalcifications and/or PEF in non-hypoechoic solid nodules, modified cancer risk as compared with the ATASPS risk that would have been assigned based on the grayscale ultrasound features alone. We also examined malignancy risk in noncalcified nodules with other features that precluded categorization into ATASPS, such as heterogeneous echogenicity.
Materials and Methods
Data collection
After obtaining Institutional Review Board approval, we retrospectively reviewed thyroid nodule ultrasound patterns and thyroid FNA cytology results from 686 consecutive subjects with 728 nodules biopsied by the Division of Endocrinology, Diabetes and Metabolism at the Hospital of the University of Pennsylvania between January 2014 and December 2015. This time period was selected as VLS pattern nodules with and without calcifications were still being routinely biopsied in accordance with prior guidelines (8). All biopsies were performed under ultrasound guidance by an endocrinologist using a 27- or 25-gauge needle with two or more passes. On-site cytopathology was performed to ensure adequacy of the specimen. Cytology results were classified by cytopathologists at the University of Pennsylvania per the Bethesda classification (9) as follows: Bethesda VI-malignant (B6), Bethesda V-suspicious for malignancy (B5), Bethesda IV-follicular neoplasm/suspicious for follicular neoplasm (B4), Bethesda III-atypia/follicular lesion of undetermined significance (B3), Bethesda II-benign (B2), or Bethesda I-non-diagnostic (B1). In cases where nodules with a B1/B3 cytology underwent repeat FNA, the second FNA result was used.
In subjects with indeterminate cytopathology (B3 or B4), a first-generation Afirma Gene Expression Classifier (GEC) result (Veracyte, Inc., San Francisco, CA) was recorded if available. At our institution at the time of the study period, Afirma was only performed on select B3/B4 nodules, as all B5 nodules were referred for surgery. When surgery was performed, surgical pathology results from surgeries completed at the University of Pennsylvania were reviewed to determine the final outcome of each nodule.
Image interpretation
Thyroid ultrasound was performed on a high-resolution ultrasound machine by using a 12 MHz linear-array transducer (iU22 or Epiq; Philips Healthcare, USA) in the radiology department, Ultrasound Division. Each biopsied nodule was retrospectively classified into one of the five ATASPS patterns or was categorized as NC. Images were reviewed by one of three endocrinologists (K.K., C.S.K., S.J.M.) who had a minimum of 5 years of thyroid ultrasound experience in an academic practice. Reviewers were blinded to the cytology/surgical outcomes. If there was uncertainty about classification, at least two reviewers reached a consensus. The presence of NHSC was recorded. As previously discussed, NHSC were defined as shadowing macrocalcifications and PEF in nodules that were not solid and hypoechoic, excluding VLS pattern nodules (mixed cystic solid/spongiform) where the ATASPS allows for the presence of nonshadowing echogenic foci. All nodules with NHSC were analyzed and assigned an ATASPS risk category based on analysis of their other grayscale features such as echogenicity and consistency, or considered NC when they were still unable to fit within the ATASPS.
Statistical analysis
Descriptive statistics were calculated in Microsoft Excel. Interobserver variability was assessed in a subset of subjects by using a free marginal kappa statistic. The Fisher's exact test was used to compare the frequency of cytology results within each ATASPS for nodules with and without calcifications. A p-value <0.05 indicated statistical significance.
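The two statistics named above can be illustrated with a short standard-library Python sketch. It is not the authors' analysis code (descriptive statistics were calculated in Excel), and it simplifies in two ways that should be treated as assumptions: the kappa function takes the number of categories as a parameter because the value used in the study is not stated, and the Fisher's exact test is shown only for a 2×2 table, whereas the study compared the full distribution of cytology categories. Function names are hypothetical.

```python
from collections import Counter
from math import comb


def free_marginal_kappa(ratings, n_categories):
    """Free-marginal kappa: (Po - 1/k) / (1 - 1/k).

    ratings: one list per nodule holding the label given by each rater.
    n_categories: k, the number of labels a rater could choose from.
    """
    agreement = 0.0
    for labels in ratings:
        n = len(labels)
        counts = Counter(labels)
        # Proportion of agreeing rater pairs for this nodule.
        agreement += sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))
    p_observed = agreement / len(ratings)
    p_chance = 1.0 / n_categories
    return (p_observed - p_chance) / (1.0 - p_chance)


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))
```

With made-up ratings from two reviewers on four nodules, `free_marginal_kappa([["LS", "LS"], ["LS", "IS"], ["HS", "HS"], ["VLS", "VLS"]], 6)` gives 0.7; these ratings are illustrative and are not study data.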
Determining malignancy rate with each ATASPS
Surgical pathology results were the gold standard for determining malignancy rate for each category of the ATASPS when available (n = 198, 27%). Many subjects did not warrant surgical resection of their nodule(s) based on benign cytopathology or molecular testing, while others declined surgery or were lost to follow-up. When surgical resection was not performed, malignancy rates were estimated by using institutional cancer rates based on Bethesda cytology and Afirma GEC molecular testing estimates (9,10). No estimate of malignancy could be made for non-operated nodules with non-diagnostic (B1) cytology (n = 7). These comprised no more than 2% of any ATA category, and none of these nodules were considered malignant in the final estimate. Subjects who underwent surgery and were found to have the noninvasive follicular thyroid neoplasm with papillary-like nuclear features (NIFTP) (11) were considered in the malignant group for our analysis (n = 12).
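The estimation approach described above, in which surgical pathology is used when available and a cytology-based rate is used otherwise, can be sketched as follows. This is a minimal illustration rather than the study's actual calculation: the dictionary keys and function name are hypothetical, the adjustment for Afirma GEC results is omitted, and any rates passed in are placeholders supplied by the caller, not the institutional values used in the study.

```python
def estimated_malignancy_rate(nodules, cytology_rate):
    """Estimate the malignancy rate for one group of nodules.

    nodules: list of dicts with key "bethesda" (1 to 6) and an optional
        key "surgery" set to "malignant" or "benign". NIFTP would be
        entered as "malignant", following the convention in the text.
    cytology_rate: dict mapping Bethesda category to an assumed
        malignancy rate for non-operated nodules (placeholder values).
    """
    expected_cancers = 0.0
    for nodule in nodules:
        surgery = nodule.get("surgery")
        if surgery is not None:
            # Surgical pathology is the gold standard when available.
            expected_cancers += 1.0 if surgery == "malignant" else 0.0
        else:
            # Non-operated nodules contribute their cytology-based rate;
            # categories without a rate (such as B1) contribute zero.
            expected_cancers += cytology_rate.get(nodule["bethesda"], 0.0)
    return expected_cancers / len(nodules)
```

For four hypothetical nodules (one non-operated B2, one resected malignant B6, one non-operated B1, one resected benign B4) and placeholder rates `{2: 0.1, 6: 1.0}`, the function returns (0.1 + 1 + 0 + 0) / 4 = 0.275.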
Results
We evaluated 728 nodules from 686 subjects, of whom 559 (81%) were female. The mean age of subjects was 54 years, and the mean thyrotropin level was 1.8 μU/mL (n = 680). The mean nodule size was 25 mm (range 7–94 mm). There was a family history of thyroid cancer in a first-degree relative in 34 subjects (5%), and 11 subjects (2%) had a history of childhood radiation exposure.
The majority of nodules (n = 584, 80%) were classifiable by ATASPS. The estimated malignancy rates were 75%, 15%, 10%, and 1% for the ATASPS categories HS, IS, LS, and VLS, respectively, within the ranges predicted by the ATASPS (12). NC patterns were identified in 144 nodules (20%) (Fig. 1). Of these, 101 nodules (14% of all nodules) were NC due to NHSC (Figs. 2 and 3). Complete rim calcification obscured the underlying grayscale pattern in 13 nodules (2%); these were excluded from the NHSC analyses (Fig. 4). Another 30 nodules (4%) could not be classified, as the solid component demonstrated heterogeneous echogenicity (Fig. 5). The κ value for interobserver variability was 0.72, which was consistent with good interobserver agreement.
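The breakdown reported above can be checked with a few lines of Python. The counts are taken directly from the text; the variable names are illustrative, and all percentages use the full set of 728 nodules as the denominator.

```python
# Counts reported in the Results; keys are descriptive labels only.
counts = {
    "classifiable": 584,
    "NC: NHSC": 101,
    "NC: complete rim calcification": 13,
    "NC: heterogeneous echogenicity": 30,
}

total = sum(counts.values())
non_classifiable = total - counts["classifiable"]

# Percentage of all nodules in each group, rounded to whole numbers.
percent = {name: round(100 * n / total) for name, n in counts.items()}
percent_nc = round(100 * non_classifiable / total)
```

The three NC subgroups sum to 144 nodules, and the rounded percentages reproduce the 80%, 14%, 2%, 4%, and 20% quoted in the text.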

Fig. 1. Classification of 728 thyroid nodules into ATASPS or non-classifiable. ATASPS, American Thyroid Association Sonographic Pattern System; NHSC, non-high suspicion calcifications.

Fig. 2. Solid hypoechoic nodule with a central linear macrocalcification in sagittal and transverse views; cytology showed a follicular neoplasm (Bethesda IV), and surgical pathology showed a follicular adenoma.

Fig. 3. Sagittal view of a solid isoechoic nodule with large echogenic foci with shadowing as well as non-shadowing punctate echogenic foci. Cytology was benign (Bethesda II).

Fig. 4. Peripherally calcified thyroid nodule where the underlying grayscale pattern was completely obscured.

Fig. 5. Longitudinal and transverse views of a solid thyroid nodule with mixed echogenicity; cytology showed papillary thyroid carcinoma (Bethesda VI).
Based on their remaining grayscale features, the 101 nodules with NHSC were assigned an ATASPS designation of IS (n = 18, 17%), LS (n = 62, 61%), or VLS (n = 21, 21%). The observed cytology outcomes for these nodules and for noncalcified nodules that could be classified by using traditional ATASPS were not significantly different (IS p = 0.13, LS p = 0.55, VLS p = 0.44) (Table 1). The estimated cancer prevalence of these NHSC nodules was 7%, versus 9% in the total group of IS/LS/VLS nodules (not significant). Macrocalcifications made up the majority of observed NHSC (n = 92, 91%), including coarse calcifications, central and peripheral linear and curvilinear calcifications, and non-interrupted peripheral rim calcifications. The remaining 9 NHSC nodules demonstrated PEF rather than macrocalcifications. All were predominantly solid and isoechoic and were therefore assigned to the LS category, with an estimated cancer prevalence of 19% (vs. 10% for the total LS group, p = 0.24).
Table 1. Cytology Distribution of Non-Calcified Nodules Versus Nodules with Non-High Suspicion Calcifications in Intermediate Suspicion, Low Suspicion, and Very Low Suspicion Patterns
IS p = 0.13, LS p = 0.55, VLS p = 0.44, all exclusive of Bethesda I nodules.
ATASPS, American Thyroid Association Sonographic Pattern System; IS, intermediate suspicion; LS, low suspicion; NHSC, non-high suspicion calcifications; VLS, very low suspicion.
Of the remaining 43 (6%) nodules with NC ATA sonographic patterns, 13 had complete rim calcification obscuring grayscale features and 30 were solid with heterogeneous echogenicity. Only 13 of the 30 solid nodules with heterogeneous echogenicity (43%) had benign cytology, whereas the majority (11/13, 85%) of completely rim calcified nodules had benign cytology.
Discussion
We were able to validate the application of ATASPS in our patient population as a risk stratification tool. Our observed cytology results and corresponding predicted malignancy rates aligned well with the percentages proposed by the ATASPS. Other prospective and retrospective studies have similarly validated ATASPS (4,13–15); a notable exception is the Yoon study, which obtained a 58% malignancy rate in HS nodules (16). Our study differs from these in that it includes results of cytopathology, histopathology, and molecular testing and it has a more detailed assessment of calcifications. Further, to our knowledge, our study contains the largest detailed analysis of nodules that are not classifiable by ATASPS (3–6).
Several sonographic risk stratification systems are utilized in clinical practice to select appropriate thyroid nodules for FNA, and calcifications are handled differently in these classifications. The American College of Radiology Thyroid Imaging Reporting and Data System (ACR TI-RADS) assigns one risk point for macrocalcifications, two points for peripheral (rim) calcifications, and three points for PEF (17). Korean TIRADS and the European Union TIRADS assign risk to microcalcifications (sonographic appearance as PEF) alone (18,19). These guidelines reflect the heterogeneity in the literature surrounding the risk of certain calcification patterns, which is why we sought to further examine calcifications not included in the ATASPS, which we defined as NHSC.
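The ACR TI-RADS point values quoted above can be expressed as a small lookup. This Python sketch is illustrative only: the key names and function name are hypothetical, it covers only the calcification component of the score, and summing the points when more than one type is present is an assumption about how the system combines these findings rather than something stated in this article.

```python
# Points per calcification type, as quoted in the text (reference 17).
CALCIFICATION_POINTS = {
    "macrocalcification": 1,
    "peripheral_rim": 2,
    "punctate_echogenic_foci": 3,
}


def calcification_points(findings):
    """Sum the points for each distinct calcification type present.

    Assumption: points are additive across types; duplicates of the
    same type are counted once.
    """
    return sum(CALCIFICATION_POINTS[f] for f in set(findings))
```

Under these assumptions, a nodule with only a macrocalcification scores 1 point for this component, while one with both macrocalcifications and PEF scores 4.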
Of the NHSC nodules, the large majority (>90%) were large central or linear dystrophic calcifications. To our knowledge, no other investigators have assessed whether macrocalcifications alter risk based on grayscale imaging assignment within the ATASPS. The ACR TI-RADS authors acknowledge that data for increased malignancy risk with macrocalcifications are variable, particularly in the absence of other moderate to HS grayscale features (17). Data surrounding macrocalcifications are heterogeneous; and although increased malignancy risks have been reported, some studies do not report the underlying grayscale pattern of the nodule (20,21). When the underlying grayscale pattern is assessed, macrocalcifications are more likely to be associated with malignancy if the other grayscale features confer higher risk (22,23). Gwon et al. reported a rate of PTC of 23% in nodules with isolated macrocalcifications; however, of the six nodules that underwent surgical resection, 4 (67%) were ATA HS pattern, 1 (17%) was ATA IS pattern, and 1 (17%) was ATA LS pattern, suggesting that these calcifications may not have increased risk beyond that expected in the underlying ATASPS (23). A recent paper assessing the use of artificial intelligence to analyze thyroid nodules using the ACR TI-RADS risk stratification system found that eliminating macrocalcifications from the point scale did not alter diagnostic performance (24). Our analysis likewise suggests that the majority of NHSC are macrocalcifications that do not alter risk of malignancy within ATASPS; therefore, it is likely reasonable to select nodules for FNA based on the underlying ATASPS proposed in the ATA guidelines regardless of the presence or absence of macrocalcifications.
In analyzing NHSC, we also assessed PEF in non-hypoechoic solid nodules. Although other classification systems have addressed cancer risk when PEF are found in mixed or solid iso- or hyperechoic nodules (3–6), the ATASPS cannot classify this pattern. In our small sample of nine nodules with PEF, all were ATA LS pattern nodules based on remaining grayscale features. The estimated malignancy risk in these nodules was 19%, higher than that predicted by the underlying ATASPS of LS (estimated 5–10% risk), although the difference was not statistically significant, perhaps due to the small sample. This is potentially clinically significant and aligns with other studies that have reported that PEF are intermediate risk in iso- and hyperechoic solid nodules or partially cystic nodules, but high risk in hypoechoic solid nodules (22). The association of PEF with malignancy has been well established (25,26) and the ACR TI-RADS, EU-TIRADS, and Korean-TIRADS guidelines all have a lower threshold for biopsy when PEF are present even in non-hypoechoic solid nodules (17–19).
We also assessed completely rim calcified nodules with obscuration of the underlying grayscale pattern. In our small sample, 85% of these nodules yielded benign cytology results, in contrast with recent findings from Malhi et al., where nearly one-third of peripherally calcified nodules were malignant regardless of whether the underlying sonographic pattern was obscured or the calcifications were continuous or interrupted (27). Therefore, our findings should not be generalized to include nodules where calcifications completely obscure the sonographic features of the underlying nodule.
A final finding in our study was an apparent increased malignancy rate in solid nodules with heterogeneous echogenicity, also NC by the ATASPS. Valderrabano et al. (5) assessed ATA sonographic patterns of 463 cytologically indeterminate (Bethesda III/IV) thyroid nodules with subsequent resection and histology. Thirty-seven percent were NC by ATASPS due to either heterogeneous echogenicity or iso/hyperechoic nodules, with one or more suspicious features (5). A 36% malignancy rate (inclusive of NIFTP) and an odds ratio for malignancy of 2.35 relative to LS/IS pattern nodules was seen. In our study, the majority of NC nodules due to mixed echogenicity were follicular-derived lesions, which was also the case in Valderrabano et al. (5). More investigation into this heterogeneous solid pattern is warranted so that the malignancy risk and tumor types can be appropriately categorized; however, this pattern is likely under-recognized as a feature of concern.
Limitations of this study, which was performed at an academic medical center with a high-volume thyroid center, include the generalizability of these findings to routine endocrine practice, where physicians may have less experience interpreting thyroid ultrasound. However, one benefit of the ATA's method of defining nodules in terms of recognizable patterns is that such a system is more reproducible (28–30) and less dependent on physician experience (28). It is important to reiterate that the ATA guidelines state that spongiform/mixed cystic and solid nodules may exhibit bright reflections on ultrasonography caused by colloid crystals or posterior acoustic enhancement of the back wall of a microcystic area, which may be confused with PEF by less proficient sonographers (12,31). We examined these types of nodules carefully to avoid “overcalling” PEF, which would dilute the malignancy risk in this small sample of nodules. Another limitation is that this study was retrospective, which may have introduced selection bias. Although reviewers were blinded to cytology outcomes, the knowledge that these were biopsied nodules in conjunction with being able to see nodule size on the imaging could have biased interpretation of the imaging in select cases.
In conclusion, our data demonstrate that most NHSC are macrocalcifications that do not alter cancer risk from that predicted by the underlying grayscale ATA sonographic pattern. Within the classification systems used in the United States, these findings may be most clinically applicable in mixed cystic and solid nodules with macrocalcifications, which ACR TI-RADS would assign a “mildly suspicious” (TR3) categorization where biopsy might otherwise be avoided if NHSC did not increase assessed risk. Our data also support that the presence of PEF in non-hypoechoic solid nodules should be considered an IS pattern, a recommendation that would align with ACR TI-RADS, EU TIRADS, and Korean TIRADS. Finally, we have shown that some solid nodules not classifiable by ATASPS because of a heterogeneous echotexture should warrant a heightened degree of suspicion for FNA, perhaps similar to the ATA IS pattern. Further study is warranted to continue to modify the classification schemes, increase alignment among thyroid nodule FNA guideline systems, better understand less common ultrasound patterns, and most appropriately select patients for thyroid FNA.
Footnotes
Authors' Contributions
All authors contributed to the article by substantial contributions to the design, analysis or interpretation of data, drafting/revising the work, approval of the version to be published, and agreement to be accountable for all aspects of the accuracy and integrity of the research.
Author Disclosure Statement
None of the authors have any disclosures.
Funding Information
This work was supported by the Thyroid Cancer Patient Research and Education Fund of the University of Pennsylvania.
