Abstract
Background:
Recommendations for subcentimeter thyroid nodules that need fine-needle aspiration biopsy are renewed in the revised American Thyroid Association (ATA) guidelines published in 2009. We applied these recommendations to analyze the diagnostic performance of the ATA guidelines and compared it to that of other modified guidelines.
Methods:
We evaluated 1054 nodules with sizes of 6–10 mm in 991 patients. A total of 713 nodules were included in the study population by excluding nodules with insufficient results for deciding whether they had a benign or malignant cytology. Frequencies of ultrasonographic features in benign and malignant nodules were compared, and odds ratios of suspicious ultrasonographic features were obtained with univariate and multivariate analysis. Seven modified guidelines were made based on the revised ATA guidelines and from multivariate analysis results. Diagnostic performances of the guidelines were compared by sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and the area under the receiver operating characteristic curve (Az) value.
Results:
In addition to hypoechogenicity, infiltrative margin, microcalcification, and taller-than-wide shape that were suggested by the ATA guidelines, solid composition and macrocalcification were significantly associated with malignancy on multivariate analysis (p=0.001 and 0.003, respectively). Increased vascularity, however, was not significantly associated with malignant nodules (odds ratio 0.729, p=0.212). Among the eight guidelines, the ATA guidelines showed the lowest diagnostic performance (Az=0.616). Excluding increased vascularity from the suspicious ultrasonographic features of the ATA guidelines and adding solid composition, with or without macrocalcification, improved sensitivity (96.6% vs. 97.0%), specificity (26.6% vs. 42.9%), PPV (48.3% vs. 54.7%), and NPV (91.7% vs. 95.2%), thereby resulting in the highest Az value (Az=0.699, p<0.001).
Conclusions:
This study suggests that excluding increased vascularity and adding solid composition to the suspicious ultrasonographic features of the ATA guidelines would significantly improve the diagnostic performance in subcentimeter nodules for the identification of malignant lesions.
Introduction
The suspicious US features suggested in the revised ATA guidelines include microcalcification, hypoechogenicity, increased nodular vascularity, infiltrative margin, and taller-than-wide shape on transverse view; thyroid nodules with any of the above US features are recommended for FNAB (9). To our knowledge, this combination of US features for subcentimeter nodules has not yet been applied to clinical settings. Furthermore, whether the suggested combination of suspicious features has a higher diagnostic performance compared to other combinations has not been studied. Therefore, the purpose of this study was to analyze the diagnostic performance of ultrasound for 5 to 10 mm thyroid nodules as suggested by the revised ATA guidelines and compare it to the diagnostic performances of our seven modified versions of the ATA guidelines.
Method and Materials
Patients
From September 2007 through March 2008, 1054 nodules with a size of 6 to 10 mm in 991 patients were biopsied under ultrasound guidance in our institution. The insufficiency rate of FNAB was 141 of 1054 (13.4%). Among 435 nodules with nondiagnostic (n=141), indeterminate (n=1), suspicious (n=58), and malignant (n=234) cytological results, only those that were operated on (n=304) were included. Among nodules with benign cytology (n=620), nodules that were operated on or that showed no interval change for at least 1 year of follow-up were included. Therefore, a total of 713 nodules were included in the study population (Fig. 1).

Fig. 1. Study population. FNAB, fine-needle aspiration biopsy.
US evaluation and US-guided FNAB
US images were obtained using 5–12 MHz linear transducers (HDI 5000 and IU-22, respectively; Philips, Bothell, WA). Real-time ultrasound was performed by seven radiologists (four faculty members with experience ranging between 5–13 years and three fellows). US features of all thyroid nodules that underwent ultrasound-guided FNAB (US-FNAB) were prospectively recorded according to internal component, echogenicity, margin, calcification, shape, and vascularity at the time of US-FNAB (10). The internal composition was classified as solid, predominantly solid (solid portion ≥50%), predominantly cystic (solid portion <50%), spongiform appearance, and entirely cystic. The echogenicity of the nodule was classified as hyperechogenic (defined as hyperechogenic compared to the normal thyroid gland), isoechogenic (defined as isoechogenic compared to the normal thyroid gland), hypoechogenic (defined as hypoechogenic compared to the normal thyroid gland but hyperechogenic to the surrounding strap muscle), and marked hypoechogenic (defined as hypoechogenic compared to the surrounding strap muscle). The prospectively recorded echogenic features were reclassified into three categories (hyperechogenic, isoechogenic, and hypoechogenic) by combining hypoechogenic and marked hypoechogenic into one category according to the ATA guidelines. The margin of nodules was classified as well-circumscribed, microlobulated, and irregular. Following the ATA guidelines, margins were reclassified into two categories, well-circumscribed and infiltrative, by combining microlobulated and irregular margins and defining them as infiltrative margins. Calcification was classified as presence of microcalcification (defined as calcification <1 mm), macrocalcification (defined as calcification ≥1 mm), eggshell calcification, and no calcification. If there was both microcalcification and macrocalcification, it was determined as microcalcification. 
Microcalcification is considered to be a suspicious feature according to the ATA guidelines. The shape was classified as wider-than-tall (defined as a ratio of the anteroposterior diameter to the transverse diameter that was <1) and taller-than-wide (defined as a ratio of the anteroposterior diameter to the transverse diameter that was ≥1). The shape was evaluated on both transverse and longitudinal views, and if there was a taller-than-wide shape on either plane, the nodule was determined as having the taller-than-wide shape (11). The vascularity was classified as peripheral, central, both peripheral and central, and no vascularity. Increased nodular vascularity according to the ATA guidelines was assumed when there was central or both peripheral and central vascularity on the Doppler ultrasound.
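The reclassification rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' software: the class name `RecordedNodule`, the category strings, and the two boolean shape flags are hypothetical, and taller-than-wide is assumed to have been decided per plane before this step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordedNodule:
    """US features as recorded at the time of US-FNAB (labels are illustrative)."""
    composition: str          # "solid", "predominantly solid", "predominantly cystic", "spongiform", "cystic"
    echogenicity: str         # "hyper", "iso", "hypo", "marked hypo"
    margin: str               # "well-circumscribed", "microlobulated", "irregular"
    calcification: str        # "micro", "macro", "eggshell", "none" (micro wins when both are present)
    taller_transverse: bool   # taller-than-wide on the transverse view
    taller_longitudinal: bool # taller-than-wide on the longitudinal view
    vascularity: str          # "peripheral", "central", "both", "none"


def suspicious_features(n: RecordedNodule) -> set:
    """Collapse the recorded categories into the binary features used by the guidelines."""
    found = set()
    if n.composition == "solid":
        found.add("solid")
    # Hypoechogenic and marked hypoechogenic are merged into one category.
    if n.echogenicity in {"hypo", "marked hypo"}:
        found.add("hypoechogenicity")
    # Microlobulated and irregular margins are merged into "infiltrative".
    if n.margin in {"microlobulated", "irregular"}:
        found.add("infiltrative margin")
    if n.calcification == "micro":
        found.add("microcalcification")
    elif n.calcification == "macro":
        found.add("macrocalcification")  # eggshell calcification is deliberately not counted
    # Taller-than-wide on either plane counts.
    if n.taller_transverse or n.taller_longitudinal:
        found.add("taller-than-wide")
    # Central or combined central and peripheral flow counts as increased vascularity.
    if n.vascularity in {"central", "both"}:
        found.add("increased vascularity")
    return found
```

Returning a set of feature names keeps the later guideline comparison simple, because each guideline then reduces to a membership test on that set.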
US-FNAB was performed with a 23-gauge needle attached to either a 2 mL or 20 mL disposable plastic syringe. Aspiration was done at least twice in each nodule, and aspirated material was expelled onto glass slides and smeared. For Papanicolaou staining, all smeared glass slides were placed in 95% alcohol. For a cell block, the remaining material was rinsed in saline solution. Additional special staining was done on a case-by-case basis at the cytopathologists' request.
Comparison of guidelines
The ATA guidelines criteria and seven other modified ATA guidelines, including modified guidelines 1–7, were compared. The seven modified guidelines had various combinations of suspicious US features of malignancy (Table 1). For the combination of suspicious features, we took into account multivariate analysis results, thereby including solid composition and macrocalcification and excluding increased vascularity.
Table 1 notes: Solid composition is a requisite feature when included in the guideline.
A nodule showing at least one of the suspicious ultrasonographic features is recommended to undergo fine-needle aspiration biopsy.
ATA, American Thyroid Association; ◯, included in the guideline; X, not included in the guideline.
ATA guidelines
According to the ATA guidelines (9), FNAB on subcentimeter nodules should be done in nodules with the following suspicious US features: microcalcification, hypoechogenicity, increased nodular vascularity, infiltrative margin, and taller-than-wide shape. A nodule with at least one of the above suspicious US features was determined as a suspicious nodule that needs FNAB.
Modified guideline 1
The modified guideline 1 excluded “increased nodular vascularity” from the malignant US features of the ATA guidelines and included all the other characteristics: microcalcification, hypoechogenicity, infiltrative margin, and taller-than-wide shape. A nodule with at least one of the above suspicious US features was classified as a suspicious nodule that needed FNAB.
Modified guideline 2
The modified guideline 2 added a composition criterion: only solid nodules that had at least one of the malignant US features of the ATA guidelines were considered suspicious for malignancy. Therefore, suspicious features of the modified guideline 2 included solid composition and at least one of the following: microcalcification, hypoechogenicity, increased nodular vascularity, infiltrative margin, and taller-than-wide shape.
Modified guideline 3
The modified guideline 3 excluded the increased nodular vascularity feature from the modified guideline 2. Therefore, suspicious features included solid composition and at least one of the following: microcalcification, hypoechogenicity, infiltrative margin, and taller-than-wide shape.
Modified guideline 4
The modified guideline 4 added macrocalcification to the suspicious US features of the ATA guidelines. Therefore, nodules with microcalcification or macrocalcification, hypoechogenicity, increased nodular vascularity, infiltrative margin, or taller-than-wide shape were considered to be suspicious for malignancy.
Modified guideline 5
The modified guideline 5 added macrocalcification to the suspicious US features of the modified guideline 1.
Modified guideline 6
The modified guideline 6 added macrocalcification to the suspicious US features of the modified guideline 2.
Modified guideline 7
The modified guideline 7 added macrocalcification to the suspicious US features of the modified guideline 3.
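The eight guidelines differ along only three switches: whether solid composition is required, whether increased vascularity counts, and whether macrocalcification counts. The following Python sketch encodes them that way. The function and dictionary names are hypothetical, the feature labels are illustrative strings, and modified guideline 4 is taken to keep increased vascularity because it is described as the ATA features plus macrocalcification.

```python
# The five suspicious US features named by the revised ATA guidelines.
ATA_FEATURES = frozenset({
    "microcalcification",
    "hypoechogenicity",
    "increased vascularity",
    "infiltrative margin",
    "taller-than-wide",
})


def make_guideline(require_solid, use_vascularity, use_macrocalcification):
    """Build a rule that returns True when a nodule would be recommended for FNAB."""
    triggers = set(ATA_FEATURES)
    if not use_vascularity:
        triggers.discard("increased vascularity")
    if use_macrocalcification:
        triggers.add("macrocalcification")

    def recommends_fnab(features):
        # Solid composition is a requisite feature when the guideline includes it.
        if require_solid and "solid" not in features:
            return False
        # Otherwise one suspicious feature is enough.
        return bool(triggers & set(features))

    return recommends_fnab


# Arguments: (require_solid, use_vascularity, use_macrocalcification)
GUIDELINES = {
    "ATA":        make_guideline(False, True,  False),
    "modified 1": make_guideline(False, False, False),
    "modified 2": make_guideline(True,  True,  False),
    "modified 3": make_guideline(True,  False, False),
    "modified 4": make_guideline(False, True,  True),
    "modified 5": make_guideline(False, False, True),
    "modified 6": make_guideline(True,  True,  True),
    "modified 7": make_guideline(True,  False, True),
}
```

For example, a solid nodule whose only other finding is macrocalcification is flagged by the modified guidelines 4 to 7 but not by the ATA guidelines or the modified guidelines 1 to 3.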
Statistical analysis
Statistical analysis was performed using SAS version 9.2 (SAS Institute Inc., Cary, NC). The χ2 test was used to compare the ratio of malignant and benign nodules with each US feature. Two-sample t-tests were used to compare the age difference between the malignant and benign nodule groups. The odds ratio of malignancy for each US feature was calculated with univariate and multivariate logistic regression analysis. Sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) for each US characteristic suspicious for malignancy were calculated. The diagnostic accuracy of predictions of malignancy was calculated with receiver operating characteristic (ROC) analysis, and the area under the ROC curve (Az) of each guideline was obtained. The adjusted p value for multiple comparisons of Az values of the four guidelines was calculated by the Bonferroni correction method. For all results, a two-sided p<0.05 was considered to be the level of statistical significance.
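As an illustration of the performance measures and the Bonferroni adjustment named above, the following Python sketch computes them from a 2×2 table. It is not the SAS code used in the study, and the function names are hypothetical. The counts in the usage note are an inference: they were back-calculated from the reported ATA percentages and the 296 malignant and 417 benign nodules, and they are not figures stated in the paper.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, and NPV from the four cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # malignant nodules flagged by the guideline
        "specificity": tn / (tn + fp),  # benign nodules not flagged
        "ppv": tp / (tp + fp),          # flagged nodules that are malignant
        "npv": tn / (tn + fn),          # unflagged nodules that are benign
    }


def bonferroni(p_values):
    """Bonferroni-adjusted p values: multiply by the number of comparisons, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Assumed counts (inferred, not reported): 286 of 296 malignant nodules flagged,
# 111 of 417 benign nodules not flagged.
ata = diagnostic_metrics(tp=286, fp=306, fn=10, tn=111)
```

With these assumed counts the four measures round to 96.6%, 26.6%, 48.3%, and 91.7%, which matches the values reported for the ATA guidelines.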
Results
A total of 713 nodules from 686 patients (women, 599; men, 87; mean age, 49.7 years) were included in the study. There were 417 benign (58.5%) and 296 malignant (41.5%) nodules. All malignant nodules were papillary carcinomas. Surgery was performed in 13 patients with benign cytology because of malignant cytology on another thyroid nodule (n=5), suspicious findings on follow-up cytology (n=5), suspicious cytology results for neck node metastasis (n=1), suspicious US features (n=1), and an ipsilateral parathyroid adenoma (n=1). The pathologic diagnosis from thyroidectomy of benign nodules included papillary carcinoma (n=6), adenomatous hyperplasia (n=6), and lymphocytic thyroiditis (n=1). The mean sizes of both the benign and malignant nodules were the same, 7.9 mm (range, 6–10 mm).
Patients with malignant nodules were significantly younger than patients with benign nodules (mean age, 48.3 vs. 50.6 years; p=0.006). US characteristics of the benign and malignant nodules are compared in Table 2. US features of solid composition, hypoechogenicity, infiltrative margin, microcalcification, macrocalcification, taller-than-wide shape, and no vascularity were significantly more frequent in malignant nodules than in benign nodules. On the other hand, increased nodular vascularity was more frequently observed in benign nodules than in malignant nodules. All suspicious US features except for increased nodular vascularity were significantly related to malignancy on univariate and multivariate analysis (Table 3).
Table 3 abbreviations: OR, odds ratio; CI, confidence interval.
Diagnostic performances of the ATA guidelines and the other seven modified guidelines are summarized in Table 4. The Az value of the ATA guidelines was 0.616. The Az differences between the ATA guidelines and the other guidelines were statistically significant (p<0.001) except for the modified guideline 4 (0.616 vs. 0.620, p=0.412). The Az values of the modified guidelines 1 and 2 did not differ significantly. Similarly, none of the guidelines extended by adding macrocalcification differed significantly from its native guideline (e.g., modified guideline 2, 0.669 vs. modified guideline 6, 0.673; p=0.410). Among the eight guidelines, the modified guideline 7 showed the highest Az value (0.699), which was significantly higher than those of the six other guidelines (p<0.001) but not that of its native guideline (i.e., modified guideline 3; 0.697, p=0.059).
Table 4 abbreviations: PPV, positive predictive value; NPV, negative predictive value; Az, area under receiver operating characteristic curve.
Discussion
With the introduction of high-resolution ultrasound, the detection of thyroid nodules has increased (1). Conventionally, when impalpable nodules are incidentally found by ultrasound, two imaging criteria have been used to determine whether further workup such as FNAB is needed: size and US features (12). The Society of Radiologists in Ultrasound (SRU) guideline published in 2005, as well as the ATA guidelines published in 2006, recommended FNAB for nodules larger than 1 to 1.5 cm (5,13). On the other hand, there were studies showing that subcentimeter nodules are not associated with a lower incidence of cancer and that they can be associated with a relatively high risk of malignancy (6,7). Still, whether or not FNAB is effective for subcentimeter nodules has been under debate (6,7,13–16). First, most subcentimeter nodules show an indolent course, a low malignancy rate, and good prognosis (16,17). Ito et al. divided patients with papillary microcarcinoma into an observation group and a surgical group and showed that observation was a possible choice for these patients: 70% of the patients in the observation group showed no change in nodule size and no lymph node metastasis during observation, suggesting that papillary microcarcinomas typically have a stable clinical course (18). However, we still cannot predict whether a nodule will follow an aggressive course or not. Second, as nodule size decreases, there is a higher possibility of an inadequate FNAB result (17,19). Contrary to these opinions, some data suggest that subcentimeter nodules can also have an aggressive course (8), and can be accurately and safely biopsied with a high diagnostic rate as well (6,19). Still, Mazzaferri et al. were opposed to recommending FNAB for 5 mm or smaller nodules (14).
In our study, 141 of 1054 subcentimeter nodules showed nondiagnostic results (13.4%) and this is consistent with previous reports of subcentimeter nodule cytology (10.2% to 17.8%) (6,7,19).
In the revised ATA guidelines, 5 to 10 mm nodules with suspicious US features are recommended for FNAB if the patient has a high-risk history (9). Among these suspicious US features suggested by the revised ATA guidelines, increased nodular vascularity did not show significant association with malignancy on multivariate analysis. The increased vascularity can be related to the cellular proliferation in a neoplastic condition (20) and some suggest that the Doppler US feature of intranodular vascularity is useful for differentiating benign and malignant thyroid nodules (21,22). Still, benign nodules with hyperplastic follicular proliferation or granulation tissues can also show increased vascularity (23). Therefore, removing increased vascularity from the suspicious features suggested by the ATA guidelines (i.e., modified guideline 1) resulted in higher specificity, PPV, and NPV than the original ATA guidelines. This is consistent with a previous report showing that vascularity itself or a combination of vascularity with other US features was not as helpful as other suspicious US features for predicting malignancy (24).
There is no description of composition among the suspicious US features in the revised ATA guidelines. On the other hand, the SRU criteria applied composition criteria to select nodules; a completely or almost solid nodule of 1.5 cm or larger and a mixed solid and cystic nodule of 2.0 cm or larger would be selected to undergo FNAB (13). The suspicious US features suggested by Kim et al. were evaluated only in nodules with solid composition, and that alone showed a high sensitivity of 93.8% (10). Solid thyroid nodules are more likely to grow than nodules with a cystic component and have a substantially higher risk of malignancy (25,26). However, some report that applying predominantly solid composition to suspicious US features is not recommended, as it appears with high frequency in both malignant and benign nodules (27). In fact, 77% of subcentimeter benign nodules showed a solid composition in our study, which was consistent with previous reports of nodules of all sizes (60% to 87%) (27–29). On the other hand, some reports showed that a solid composition has a relatively high odds ratio in predicting malignancy (30,31). In our study, solid composition had the highest odds ratio of all the suspicious US features. There were only five malignant nodules that were not solid, and among those, only one nodule showed no other suspicious US features, thereby falling into the false-negative results with all the applied guidelines. The other four nodules showed at least one suspicious US feature other than solid composition and would be true-positive results with the ATA guidelines, whereas they would fall into the false-negative results with the modified guideline 2. Although sensitivity was slightly decreased, specificity, PPV, and NPV values were increased along with diagnostic performance by adding solid composition to the ATA guidelines.
Taller-than-wide shape is one of the suspicious US features first described by Kim et al. (10). According to the revised ATA guidelines, taller-than-wide shape is defined only in transverse view. However, a recent report recommended determining taller-than-wide shape in either transverse or longitudinal view as it was more accurate and sensitive than determining the shape by only one view (11). Our data were prospectively recorded describing the shape as taller-than-wide if it was shown as such in either plane; therefore, it may not exactly match the ATA guidelines' diagnostic performance. However, since this feature was applied to the other seven combinations of guidelines as well, it would not have affected our comparison of diagnostic performances.
All the other suspicious US features suggested by the revised ATA guidelines, except for increased nodular vascularity, showed significant association with malignancy. Berker et al. also showed association of hypoechogenicity, microcalcification, and taller-than-wide shape with malignancy in subcentimeter nodules (7). Microcalcification showed a relatively low sensitivity of 30.4% but this was consistent with previous studies on subcentimeter nodules with 20% to 36.6% (27,31). The sensitivity of microcalcification in subcentimeter nodules was lower than that of other sizes (26% to 59%) (10,27,32), which might be due to the lower frequency of microcalcifications in subcentimeter nodules. The odds ratio of macrocalcification (2.290) was similar to that of microcalcification (3.704). We did not include eggshell calcifications in macrocalcifications and this might have increased the odds ratio of macrocalcification itself since eggshell calcification did not show significant association with malignancy in univariate analysis.
In addition to the ATA guidelines' suspicious US features, macrocalcifications were significantly more frequent in malignant nodules than in benign nodules and could predict malignancy with an odds ratio of 2.393 on multivariate analysis (p=0.003). This is consistent with a previous study in which subcentimeter nodules showed an odds ratio of 3.43 (31). Extended guidelines with macrocalcification included as a suspicious US feature showed higher sensitivity but lower specificity compared to their native guidelines; therefore, this did not increase diagnostic performance significantly. Still, the modified guideline 7 showed the highest Az value among the eight guidelines.
Comparing diagnostic performances of the eight guidelines showed that the ATA guidelines had the lowest diagnostic performance (Az=0.616). On the other hand, excluding increased vascularity from, and adding solid composition to, the suspicious US features of the ATA guidelines showed a higher diagnostic performance (Az=0.697). This modification improved specificity from 26.6% to 44%, while decreasing sensitivity to a relatively small degree, from 96.6% to 95.3%. Additionally, when the presence of macrocalcification was added to the suspicious US features, sensitivity increased to 97.0% while specificity decreased slightly to 42.9%. In contrast, sensitivity and NPV, both essential to predict and avoid FNAB in subcentimeter nodules, increased only slightly by modifying the ATA guidelines (96.6% to 97.0% and 91.7% to 95.2%). Therefore, they may not influence the improved diagnostic performance as much as specificity and PPV. However, this might be because the ATA guidelines themselves have a fairly high sensitivity and NPV compared to specificity and PPV.
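The link between these percentages and the Az values can be made explicit. For a dichotomous rule such as "recommend FNAB or not", the empirical ROC curve has a single operating point, and the trapezoidal area under it equals the mean of sensitivity and specificity. Whether the Az values in this study were obtained in exactly this way is an assumption. The Python sketch below, with a hypothetical function name, applies the identity to the sensitivity and specificity values quoted above.

```python
def az_binary(sensitivity, specificity):
    """Trapezoidal area under a one-point ROC curve.

    The curve runs (0, 0) -> (1 - specificity, sensitivity) -> (1, 1),
    and its area simplifies to (sensitivity + specificity) / 2.
    """
    return (sensitivity + specificity) / 2


# (sensitivity, specificity) as quoted in the text.
reported = {
    "ATA":        (0.966, 0.266),  # reported Az 0.616
    "modified 3": (0.953, 0.440),  # reported Az 0.697
    "modified 7": (0.970, 0.429),  # reported Az 0.699
}

for name, (sens, spec) in reported.items():
    print(f"{name}: Az = {az_binary(sens, spec):.4f}")
```

The computed values agree with the reported Az values to within the rounding of the published percentages, which shows why a gain of roughly 17 percentage points in specificity outweighs a loss of about 1 point in sensitivity.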
There are some limitations to our study. First, unlike nodules with other FNAB results, benign nodules on FNAB that had not been operated on were included in the study population. This could have led to an increased number of false-negative results. However, to minimize this limitation, we followed the nodules for at least 1 year and only those that had no change in US features were included in the study population. Second, the interpretation of US features could have differed between the ultrasound performers; because US features were prospectively recorded by seven radiologists, some interobserver variation is inevitable. Still, studies on interobserver variability at our institution have shown a more than moderate degree of agreement on US feature interpretation between experienced radiologists with 1 to 10 years of experience (33). Although three fellows with less than 1 year of experience were included in this study, one-on-one training conducted in our institution could have reduced interobserver variability (34). Third, the ATA guidelines suggested FNAB for subcentimeter nodules in patients with a high-risk history, but we did not have detailed clinical information of patients available and were not able to consider history when performing FNAB. Further studies on subcentimeter nodules in patients with a high-risk history are needed. Fourth, there was a relatively high rate of malignant cytology and a low rate of indeterminate or suspicious cytology in our results. This is partially due to the sharply increasing incidence of thyroid cancer in Korea (25.4% annually), which is thought to be partially due to the advent of ultrasound and FNAB (35). Additionally, there is an inevitable selection bias in our study group as our institution is a referral hospital and many referred patients are already suspected to have thyroid cancer.
In conclusion, when applied in a real clinical setting, the ATA guidelines showed the lowest diagnostic performance among the eight compared guidelines. Excluding vascularity from, and adding solid composition to, the suspicious US features of the ATA guidelines (modified guideline 3 and modified guideline 7) showed the highest diagnostic performance.
Footnotes
Author Disclosure Statement
The authors have nothing to disclose.
