Abstract
Background:
In patients undergoing active surveillance of papillary thyroid microcarcinoma, definitive therapy (usually preceded by a definitive diagnostic procedure) is not recommended until evidence of disease progression is obtained, as stated in the American Thyroid Association guidelines. This is because deferring definitive diagnosis and therapy until disease progression has no impact on disease-specific survival. This study evaluated the malignancy rate and probability of thyroid nodules with suspicious findings on ultrasonography (US), stratified by a size cutoff of 1 cm, by using various malignancy risk-stratification systems.
Methods:
The data were retrospectively collected between January 2003 and June 2003 from nine university hospitals that had previously participated in the Korean Society of Thyroid Radiology multicenter study on the ultrasonographic differentiation between benign and malignant thyroid nodules. In total, 829 thyroid nodules from 711 patients (620 women, 91 men; M age = 48.7 years; range 6–98 years; 351 malignant and 478 benign nodules) were included. The probability of malignancy of thyroid nodules, stratified by size, was calculated by using four different malignancy risk-stratification systems. The factors that could differentiate benign from malignant nodules were assessed using the chi-square test.
Results:
In the suspicious thyroid nodules <1 cm on US, the malignancy probability ranged from 77.4% to 82.8%; the lowest rate was found in the Korean Society of Thyroid Radiology multicenter study, whereas the highest rate was noted in the Web-based system. Thus, the probability of benign nodules among suspicious thyroid nodules <1 cm on US was 17.2–22.6%.
Conclusion:
A biopsy should be considered before active surveillance to exclude benign nodules with suspicious US features, and could thus prevent unnecessary active surveillance and patient anxiety.
Introduction
A
Thyroid US has been widely used to differentiate benign from malignant nodules, and to guide FNA cytology for nodules suspected of being malignant (7,8). The American Thyroid Association (ATA) guidelines recommend against performing US-guided FNA or core-needle biopsy (CNB) for thyroid nodules <1 cm, regardless of the US features of the nodules, as long as the nodules are confined to the thyroid (9). The guidelines seek to avoid a definitive cytological diagnosis or the surgical treatment of PTCs <1 cm. In contrast, in Japan and Korea, the guidelines state that the confirmation of a cytological diagnosis is more appropriate in order to achieve effective active surveillance (10,11).
Although compelling data have indicated that an active surveillance approach to PTMC is a safe and effective alternative to immediate surgery (5,6), controversy remains regarding whether papillary carcinoma should be pathologically proven at an early stage or whether biopsy should be omitted and all thyroid nodules with suspicious US features simply followed. Various US risk-stratification systems have been recently described, with differences in the scoring of suspicious US features. The present study aimed to evaluate the reliability of the Thyroid Imaging Reporting and Data System (TIRADS) for thyroid nodules with suspicious features in terms of the malignancy rate and probability, as well as the need for a definitive cytological diagnosis prior to performing active surveillance for nodules <1 cm.
Materials and Methods
Patient selection
This retrospective study was approved by the Institutional Review Board. Informed consent was waived for the evaluation of data, although written informed consent was obtained from all patients before each thyroid US examination and US-guided FNA or CNB (12). Data were retrospectively collected between January 2003 and June 2003 from nine university hospitals that had previously participated in the Korean Society of Thyroid Radiology multicenter study on the ultrasonographic differentiation between benign and malignant thyroid nodules (13). Data from this cohort were used because that study remains one of the most frequently cited articles on the US findings or biopsy of thyroid nodules. Moreover, the data have been validated by many studies, and can be readily applied to clinical practice in Korea (14,15). Accordingly, a series of 8024 consecutive patients with thyroid nodules who had undergone thyroid US at nine Korean hospitals were considered for inclusion.
Among these patients, the cases that met the following criteria were included: (i) patients who underwent surgery or CNB; (ii) patients who underwent FNA at least twice for benign thyroid lesions (Bethesda category II) (16); and (iii) patients who underwent initial FNA and US at follow-up (>12 months) for benign thyroid nodules. Cases of pure cysts and follicular neoplasms that did not receive a final diagnosis via surgery were excluded. Finally, a total of 829 thyroid nodules from 711 patients (620 women, 91 men; M age = 48.7 years; range 6–98 years; 351 malignant and 478 benign nodules) were included (Fig. 1).

Fig. 1. Flow chart of patients in this study.
Analysis of US findings and malignancy rate
US images were obtained for the evaluation of thyroid nodules by using either an HDI 5000 (ATL Ultrasound, Bothell, WA) or Sequoia (Acuson, Mountain View, CA) instrument equipped with a 5–12 MHz or 8–15 MHz linear-array transducer. US included both the transverse and longitudinal real-time imaging of thyroid nodules. When analyzing the US images, the radiologists assessed the thyroid nodules using criteria obtained from published reports (7,9,11,13,14), including size (≥1 cm or <1 cm), internal content, shape, margin, echogenicity of the solid portion, and calcification (none, microcalcification, macrocalcification, and/or rim calcification).
The internal content of a nodule was categorized according to the ratio of the cystic to the solid portion within a nodule, that is, solid (≤10% cystic), predominantly solid (>10% cystic and ≤50% cystic), predominantly cystic (>50% cystic), cystic (no obvious solid content), and spongiform appearance. Only 45 samples with predominantly cystic patterns were included. Spongiform appearance was defined as the aggregation of multiple, microcystic components comprising >50% of the total nodule volume (13).
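The cystic-to-solid thresholds above can be written as a small lookup. This is only a sketch: the function name, the use of a fraction between 0 and 1, and the choice to model "no obvious solid content" as a fraction of exactly 1.0 are assumptions made for illustration, not part of the study protocol.

```python
def internal_content(cystic_fraction: float, spongiform: bool = False) -> str:
    """Map the cystic fraction of a nodule to the categories described in the text.

    Assumptions (not from the source): the fraction is expressed on a 0-1 scale,
    spongiform appearance is passed in as a separate reader judgment, and
    "no obvious solid content" is modeled as a fraction of exactly 1.0.
    """
    if not 0.0 <= cystic_fraction <= 1.0:
        raise ValueError("cystic_fraction must lie between 0 and 1")
    if spongiform:
        return "spongiform"
    if cystic_fraction >= 1.0:
        return "cystic"
    if cystic_fraction > 0.5:
        return "predominantly cystic"
    if cystic_fraction > 0.1:
        return "predominantly solid"
    return "solid"
```

Under these assumptions, a nodule that is exactly 10% cystic is labeled solid and one that is exactly 50% cystic is labeled predominantly solid, which matches the inclusive bounds given in the text.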
Echogenicity of the solid portion was classified as hyper- or isoechogenicity, hypoechogenicity, or marked hypoechogenicity. When the echogenicity of the nodules was similar to that of the surrounding thyroid parenchyma, this was labeled as isoechogenicity. Hypoechogenicity was defined as a decrease in echogenicity compared to the thyroid parenchyma. Marked hypoechogenicity was defined as a decrease in echogenicity compared to that of the strap muscles (17). Nodule shape was categorized as follows: ovoid to round (when the anteroposterior diameter of the nodule was equal to or less than its transverse diameter on a transverse or longitudinal plane); taller-than-wide (when the anteroposterior diameter of a nodule was longer than its transverse diameter on a transverse or longitudinal plane); or irregular (when a nodule was neither ovoid to round nor taller-than-wide). Margins were classified as well-defined smooth, microlobulated or spiculated, or ill-defined (13).
Calcifications were categorized as microcalcifications, macrocalcifications, rim calcifications, or none. Microcalcifications were defined as calcifications ≤1 mm in diameter that were visualized as tiny, punctate, hyperechoic foci either with or without acoustic shadowing. The presence of tiny, brighter reflectors with a clear-cut, comet-tail artifact on conventional US was considered to represent a colloid appearance. Macrocalcifications were defined as hyperechoic foci >1 mm, whereas rim calcifications were defined as peripheral curvilinear or eggshell calcifications (14,18). When a nodule had more than one type of calcification (that is, macrocalcifications, including rim calcifications, intermingled with microcalcifications), the nodule was considered to have microcalcifications. The suspicious US features suggested by the Korean Society of Thyroid Radiology (KSThR) and Moon et al. are as follows: taller-than-wide shape, spiculated/microlobulated margin, marked hypoechogenicity, and presence of microcalcifications (13,14). A thyroid nodule with at least one of these suspicious US findings was classified as a suspicious nodule.
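The "at least one suspicious finding" rule can be sketched as follows. The class and field names are hypothetical and chosen only for illustration; the four features are those listed in the text, and a nodule with mixed calcifications is assumed to have already been recorded as having microcalcifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NoduleFindings:
    """Hypothetical record of the four suspicious US features for one nodule."""
    taller_than_wide: bool = False
    spiculated_or_microlobulated_margin: bool = False
    marked_hypoechogenicity: bool = False
    microcalcifications: bool = False  # also True when mixed with macro/rim calcifications


def is_suspicious(findings: NoduleFindings) -> bool:
    """A nodule is suspicious if at least one of the four features is present."""
    return any((
        findings.taller_than_wide,
        findings.spiculated_or_microlobulated_margin,
        findings.marked_hypoechogenicity,
        findings.microcalcifications,
    ))
```

Because a single feature is enough, this rule is deliberately inclusive, which is one reason some nodules flagged as suspicious can turn out to be benign.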
The probability of malignancy was calculated using various malignant risk systems, such as web-based TIRADS (
Statistical analysis and reference standard
Statistical analysis was performed using IBM SPSS Statistics for Windows v19.0 (IBM Corp., Armonk, NY). Each of the US characteristics was analyzed to determine its association with a benign or malignant diagnosis. The diagnostic accuracy of the US findings for 829 nodules according to size was compared by calculating the sensitivity, specificity, negative predictive value, positive predictive value, and accuracy. A chi-square test was used for the comparison of categorical variables. Student's t-test was used for the comparison of quantitative variables. Two-tailed p-values of <0.05 were considered to be statistically significant.
For each thyroid nodule, the final diagnosis was determined via either histopathological examination or radiological follow-up. For malignant nodules, the pathologic diagnosis was confirmed by surgery or CNB, whereas for benign nodules, the pathologic diagnosis was confirmed by surgery or CNB, FNA repeated at least twice with benign results, or a benign result on FNA and no change or reduced size on follow-up US (>12 months).
Results
US characteristics of all nodules in the study series
The size of the nodules in the whole series ranged from 3 to 41.6 mm (M = 23 mm). The proportion of nodules with a diameter of <1 cm was 25.8% (214/829), and those with a diameter of ≥1 cm was 74.2% (615/829). A significant difference was found for all the features representing benign and malignant nodules, regardless of size. In particular, malignant nodules showed a taller-than-wide shape, spiculated/microlobulated margin, marked hypoechogenicity, and microcalcification. The frequency of malignant nodules with a spiculated/microlobulated margin was 62.2% (p < 0.002) and with marked hypoechogenicity was 78.4% (p < 0.001) in the subgroup of nodules with a smaller diameter (≤5 mm).
The malignancy rate was significantly higher in the thyroid nodules with a diameter <1 cm compared to nodules with a diameter of ≥1 cm (65.9% vs. 34.1%; p < 0.001; Table 1). However, suspicious US features were also observed in thyroid nodules that were finally considered to be benign, including those with taller-than-wide shape (4.2%), spiculated/microlobulated margin (9.0%), marked hypoechogenicity (12.1%), and microcalcification (11.5%). Moreover, benign nodules with a diameter <1 cm more frequently showed a taller-than-wide shape (9.6% vs. 3.2%), spiculated/microlobulated margin (21.9% vs. 6.7%), marked hypoechogenicity (24.7% vs. 9.9%), and microcalcifications (17.8% vs. 10.4%) compared to nodules with a larger diameter (Table 2 and Supplementary Table S1; Supplementary Data are available online at
Data indicate the number of lesions. Numbers in parentheses indicate percentages.
US, ultrasonography.
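The size comparison of malignancy rates can be illustrated with a Pearson chi-square test on a 2×2 table. The sketch below makes several assumptions: the cell counts (141 malignant and 73 benign nodules <1 cm; 210 malignant and 405 benign nodules ≥1 cm) are reconstructed from the reported totals and percentages rather than taken from a table, no continuity correction is applied, and the function name is illustrative. The study itself used SPSS, whose exact settings are not stated.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and the two-sided p-value for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Reconstructed counts (assumption): rows are <1 cm and >=1 cm,
# columns are malignant and benign.
chi2, p_value = chi_square_2x2(141, 73, 210, 405)
print(f"chi-square = {chi2:.1f}, p < 0.001: {p_value < 0.001}")
```

With these reconstructed counts, the malignant share is 141/214 in the smaller group and 210/615 in the larger group, and the test gives a p-value well below 0.001.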
For malignant nodules (n = 351), a pathologic diagnosis was confirmed at surgery (n = 330; 94%) or CNB (n = 21; 6%), whereas for benign nodules (n = 478), a pathologic diagnosis was confirmed at surgery (n = 157; 32.8%), CNB (n = 68; 14.2%), FNA repeated at least twice (n = 180; 37.7%), or FNA and follow-up US (>12 months; n = 73; 15.3%). The diagnoses of malignancy at histologic examination included papillary carcinoma (n = 334), follicular variant papillary carcinoma (n = 1), follicular carcinoma (n = 10), medullary carcinoma (n = 4), Hürthle cell carcinoma (n = 1), and diffuse large B-cell lymphoma (n = 1). The diagnoses of benign lesions at histologic examination included nodular hyperplasia (n = 373), adenomatous goiter (n = 45), follicular proliferating lesion (n = 3), follicular adenoma (n = 44), and thyroiditis (n = 13).
Comparison of the malignant rate using various malignant risk systems
Based on the KSThR and Moon et al. criteria (13,14), 412 patients (354 women and 58 men; M age = 48.9 years; range 6–97 years) with 461 nodules (M size = 18 mm; range 3–33.1 mm) had suspicious findings. Of these 461 suspicious nodules on US, 297 (64.4%) were finally confirmed as being malignant based on histopathologic readings of surgical (n = 278) or repeat CNB (n = 19) specimens. The pathologic results included 288 (97.1%) PTCs, four (1.3%) follicular thyroid carcinomas, three (1%) medullary carcinomas, one (0.3%) diffuse large B-cell lymphoma, and one (0.3%) Hürthle cell carcinoma.
Table 3 demonstrates the malignancy probability of thyroid nodules with suspicious US features according to various malignant risk-stratification systems. Among the various TIRADS, the malignancy probability ranged from 64.4% to 83.5%, wherein the Web-based TIRADS exhibited the highest sensitivity. Thus, 16.5–35.6% of thyroid nodules with suspicious findings were finally found to be benign.
Data indicate the number of lesions. Numbers in confidence interval indicate percentages.
TIRADS, Thyroid Image Reporting and Data System; ATA, American Thyroid Association; CI, confidence interval.
A cutoff diameter of 10 mm was chosen, which has been used as a criterion for microcarcinoma on pathologic diagnosis (21). In thyroid nodules <1 cm with suspicious features, the malignancy probability ranged from 77.4% to 82.8%. Hence, 17.2–22.6% of the nodules <1 cm with suspicious features were actually benign. Overall, the probability of benign lesions among the suspicious thyroid nodules on US ranged from 16.1% to 42.2% (Table 3).
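The benign shares quoted above are simply the complements of the reported malignancy probabilities. A minimal sketch, with an illustrative function name and rounding to one decimal place as in the text:

```python
def benign_share(malignancy_pct: float) -> float:
    """Percentage of suspicious nodules that are benign, given the malignancy percentage."""
    return round(100.0 - malignancy_pct, 1)


# Malignancy probabilities reported for suspicious nodules <1 cm.
lowest, highest = 77.4, 82.8
print(benign_share(highest), benign_share(lowest))
```

The highest malignancy probability yields the lower bound of the benign range and the lowest yields the upper bound, which is why the order of the range flips.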
The diagnostic accuracy of the US features of the 829 nodules was evaluated using nodule size cutoffs of 1 cm and 5 mm (Table 4). In general, the suspicious US features showed both higher specificity and higher sensitivity for larger nodules in differentiating benign from malignant nodules. However, marked hypoechogenicity and taller-than-wide shape had lower sensitivity but still higher specificity in larger nodules (specificity of 75.3% and 90.4% in nodules <1 cm vs. 90.1% and 96.8% in nodules ≥1 cm, respectively). The diagnostic value of microcalcifications, one of the most important suspicious features for cancer diagnosis, was higher for larger nodules (89.6% specificity and 61% sensitivity) than for smaller nodules (82.2% specificity and 53.9% sensitivity). The diagnostic accuracy was highest for nodules with marked hypoechogenicity in both groups. The accuracy of all the US features was lower in smaller lesions (≤5 mm) than in larger lesions.
Raw data are presented as percentages. Numbers in confidence interval indicate percentages.
Bold indicates two sonographic features with lower sensitivity but higher specificity in size ≥1 cm.
NPV, negative predictive value; PPV, positive predictive value.
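The diagnostic measures in Table 4 all derive from a 2×2 table of US feature against final diagnosis. The sketch below shows the calculation for microcalcifications; the cell counts are not reported in the text and are reconstructed here, as an assumption, from the reported percentages and group sizes (141 malignant and 73 benign nodules <1 cm; 210 malignant and 405 benign nodules ≥1 cm).

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Standard diagnostic measures from a 2x2 table, returned as proportions."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Reconstructed counts for the microcalcification feature (assumption, see above).
small = diagnostic_metrics(tp=76, fp=13, fn=65, tn=60)     # nodules <1 cm
large = diagnostic_metrics(tp=128, fp=42, fn=82, tn=363)   # nodules >=1 cm

for label, metrics in (("<1 cm", small), (">=1 cm", large)):
    summary = ", ".join(f"{name} {value * 100:.1f}%" for name, value in metrics.items())
    print(f"{label}: {summary}")
```

These reconstructed counts reproduce the reported sensitivity, specificity, and accuracy for both size groups, but they remain an inference and should not be read as the published table values.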
Discussion
The present study found that the malignancy probability of suspicious nodules <1 cm ranged from 77.4% to 82.8% on various TIRADS. Thus, 17.2–22.6% of nodules were finally confirmed as benign, despite the presence of suspicious US features. Moreover, it is more difficult to interpret US features of smaller nodules, as benign cystic nodules (22–27) and even benign solid nodules can mimic malignancy after size reduction on follow-up (25,28). In fact, the current study findings demonstrate higher specificity of the taller-than-wide feature in larger nodules (90.4% in nodules <1 cm vs. 96.8% in nodules ≥1 cm; p = 0.021), lower diagnostic accuracy of the “microcalcification” feature in smaller nodules (63.6% vs. 79.8%; p < 0.001), and an even lower diagnostic accuracy (53.8%) in nodules ≤5 mm, consistent with the findings of a previous study (29). Therefore, biopsy should be considered before active surveillance in such cases to avoid unnecessary active surveillance and patient anxiety.
The reliability of the US risk-stratification system for the detection of cancers among suspicious thyroid nodules was also evaluated. With regard to sensitivity, the Web-based TIRADS showed excellent performance, with 82.8% sensitivity, in nodules <1 cm. Hence, the adoption of these scoring systems can increase the efficacy of cytological diagnosis, and therefore true low-risk PTMC could serve as candidates for active surveillance.
The most recent ATA guidelines state that thyroid nodules <1 cm do not need to be diagnosed based on cytology, unless evidence of aggressive features such as lymph node metastases, distant metastases, and apparent extrathyroidal extension are found, in order to avoid immediate admittance for surgical treatment (9). However, different opinions have been reported in the literature. In Japan, it is considered to be better to prove low-risk PTMC cytologically and provide patients with an incentive to visit the clinic regularly for active surveillance and prevent loss to follow-up (10). Similarly, in Korea, the K-TIRADS recommends the use of FNA for subcentimeter nodules >5 mm, with highly suspicious US patterns (solid hypoechoic nodule with any of the following three suspicious features: non-parallel orientation, spiculated/microlobulated margin, or microcalcification). Moreover, Gweon et al. (30) investigated the predictive factors associated with malignancy and aggressive biological behavior to determine the appropriate candidate factors for active surveillance. They suggested that active surveillance would not be the best choice for thyroid nodules <1 cm in male patients aged <45 years with evidence of microcalcification and a taller-than-wide shape on US, considering the high malignancy rate and risk of aggressive biological behavior among PTMCs with these features.
Biopsy of thyroid nodules <1 cm before active surveillance has certain advantages and disadvantages. If patients whose nodules are actually benign but have suspicious findings on US undergo active surveillance without biopsy, they may be more anxious because the status of the thyroid nodule remains unclear. This may present an obstacle for successful long-term surveillance and may decrease the quality of life, particularly in young patients. With the omission of FNA, a significant proportion of patients (about 16.1–42.2% according to this study) would be unnecessarily submitted to long-term active surveillance with the psychological stress of having a presumptive diagnosis of a nodule with a high probability of cancer. Similarly, Rosario et al. recently reported that about 30% of patients had benign findings on repeated cytology results of highly suspicious nodules ≤1 cm on US, and they recommended the use of the widely available, low-cost, and safe FNA procedure before deciding on active surveillance of nodules ≤1 cm with highly suspicious US features restricted to the thyroid (31). According to the present results, the malignancy proportion was significantly higher in the thyroid nodules with suspicious features with a diameter <1 cm (34.1% benign vs. 65.9% malignant; p < 0.001), and it is prudent in these cases to inform the patient of the diagnosis of malignancy. Moreover, a recent observational study (4) reported that 22.5% of young patients show tumor progression to clinical disease, and that 16.1% show lymph node metastasis over a 10-year observation period. Young age (≤40 years) is a recognized independent predictor of PTMC progression (10), and therefore, for active surveillance of PTMC with good compliance, thyroid nodules should be diagnosed early and the management strategy should depend on the pathologic result.
Recently, various TIRADS for the diagnosis of thyroid nodules have been reported from several institutions and societies (9,11,13,14,19,20). To generalize the present results, they were applied to various TIRADS, and a similar probability of malignancy was achieved (78.3–82.8%). The various TIRADS use different criteria for the diagnosis of malignancy. Accordingly, the results of the ATA guidelines, K-TIRADS, French-TIRADS, and web-based malignancy risk estimation were compared (9,11,14,19,20). In the ATA guidelines, highly suspicious nodules were defined as a solid hypoechoic nodule or the solid hypoechoic component of a partially cystic nodule with one or more of the following features: irregular margins (infiltrative, microlobulated), microcalcifications, taller-than-wide shape, rim calcifications with small extrusive soft tissue component, and evidence of extrathyroidal extension. In the K-TIRADS guidelines, a solid hypoechoic nodule with any of the suspicious US features, including microcalcification, nonparallel orientation, and spiculated/microlobulated margin, was categorized as a highly suspicious nodule. For the French-TIRADS, those classified in categories 4B and 5 were considered as highly suspicious nodules. Compared to the K-TIRADS, the French-TIRADS included high stiffness on elastography as a suspicious feature. Furthermore, the Web-based malignancy risk estimation is a 14-point risk scoring system based on US features, wherein a score of ≥8 was considered to represent a highly suspicious nodule. Moon et al. (13) previously reported that the false-positive rate for the detection of malignant nodules may be greater in smaller nodules, as microcalcification is not a major predictor of malignancy in nodules <1 cm, consistent with the findings of the present study. Moreover, marked hypoechogenicity, taller-than-wide shape, and spiculated margin were frequently observed in both small benign and malignant nodules (13). 
Thus, in the present study, nodules ≥1 cm had relatively higher specificity, sensitivity, and accuracy for malignant features than smaller nodules. This risk-stratification model can help increase the efficacy of biopsy in nodules <1 cm, avoid unnecessary procedures, and provide supplementary information on thyroid nodules after FNA. In addition, the model is linked to an online risk calculator (
There are more aggressive variants of PTC with unfavorable outcomes, including tall cell (prevalence 5–10%), columnar cell (<1%), hobnail variants (<1%), and oncocytic (Hürthle cell) follicular carcinoma (<1%) (32–36). In the present study, a Hürthle cell carcinoma <1 cm was diagnosed in one case. The pathological category provides important information in such cases, and immediate surgery rather than active surveillance is mandatory for variants of PTC with a poor prognosis. Moreover, certain molecular markers can predict the prognosis of PTCs, including presence of a BRAF and/or a TERT mutation (37). However, no useful predictors of rapid and life-threatening growth requiring immediate surgery have been identified thus far. The collection of pathological results and accumulation of evidence on molecular markers can ensure more effective active surveillance. In the future, cytology results combined with molecular markers may make it possible to identify aggressive variants of PTC or PTCs with poor prognosis and thus exclude them from active surveillance, which would not be appropriate for these tumors.
Besides active surveillance and surgery, image-guided ablation therapies such as radiofrequency ablation (38), laser ablation (39,40), and microwave ablation (41) have been suggested as other management options. These studies reported no local tumor recurrence during the follow-up period and no metastatic lymph nodes in the neck at short-term follow-up. Moreover, Kim et al. (42) reported similar results at four years of follow-up, without any life-threatening complications. However, Valcavi et al. (40) reported the results of three patients who underwent total thyroidectomy after laser ablation of PTMC. Laser ablation was effective for the primary cancer, but tiny multiple thyroid cancers and microscopic metastasis in the lymph nodes were detected on surgery. The thermal ablation of thyroid cancer is effective for the management of the primary cancer itself but has limitations for the control of regional microscopic metastasis or tiny multifocal cancers. Hence, it is unclear whether thermal ablation should be used for PTMC (43). With the current trend for a more conservative approach, wherein a complete explanation of the disadvantages and benefits of immediate surgery, active surveillance, or nonsurgical management is provided, the preference for the cytological diagnosis of thyroid nodules is inevitable.
This study has some notable limitations. First, the retrospective design of the present study is a major limitation, which may have led to selection bias. Not all nodules <1 cm with intermediate and low suspicious US features were included, which may cause selection bias regarding FNA and subsequent surgery or follow-up imaging. For instance, patients with benign findings at US such as follicular neoplasm may not undergo biopsy or surgery. Second, although the study population has been validated by several studies (44,45) and may be more reproducible, the cause and approximate number of lesions that were excluded were not provided. Due to the retrospective design, it was not possible to identify how many of the 492 exclusions were due to findings of cystic lesions, a follicular neoplasm, loss to follow-up, or an inconclusive biopsy result. These limitations should be overcome by large-scale prospective studies in the future.
In conclusion, this study demonstrates that the probability of malignancy of suspicious nodules <1 cm ranges from 77.4% to 82.8% in various TIRADS. Thus, 17.2–22.6% of nodules were finally confirmed as being benign, despite having suspicious US features. Therefore, biopsy should be considered prior to active surveillance to prevent unnecessary active surveillance and patient anxiety.
Footnotes
Author Disclosure Statement
The authors declare no conflicts of interest.
