Abstract
Background:
The malignancy risk of a cytology diagnosis may depend on the ultrasonography (US) patterns of thyroid nodules, and management should be determined by the combined malignancy risk of fine-needle aspiration (FNA) cytology and US patterns. This study was performed to develop a clinically applicable cytology-ultrasonography (CU) scoring system for malignancy risk stratification based on FNA cytology and US patterns, according to the Korean-Thyroid Imaging Reporting and Data System (K-TIRADS).
Methods:
This retrospective Institutional Review Board–approved study included 1651 thyroid nodules (≥1 cm) with final diagnoses. The malignancy risk of each combined result of FNA cytology and the K-TIRADS was assessed to develop the CU system. The interaction between FNA cytology and US pattern (K-TIRADS) in the malignancy risk of nodules was investigated using a binomial test.
Results:
The malignancy risk of nodules could be stratified into four CU scores (very low risk, <3%; low risk, ≥3%, <30%; high risk, ≥30%, <90%; very high risk, ≥90%). In nodules with non-diagnostic, benign, and atypia of undetermined significance/follicular lesion of undetermined significance cytology results, low-suspicion US pattern (K-TIRADS 3) significantly decreased the malignancy risk of nodules (p = 0.003, 0.013, and 0.027, respectively), and a high-suspicion US pattern (K-TIRADS 5) significantly increased the malignancy risk of nodules (p ≤ 0.001). A Bethesda 1 or 4 cytology result did not significantly change the malignancy risk of any K-TIRADS (p ≥ 0.518 and p ≥ 0.137, respectively). A Bethesda 2 cytology result decreased and a Bethesda 5 or 6 cytology result increased the malignancy risk of K-TIRADS 3, 4, and 5 (p ≤ 0.001). A Bethesda 3 cytology result increased the malignancy risk of K-TIRADS 3 and 4 (p < 0.001 and p = 0.024, respectively).
Conclusion:
The malignancy risk of thyroid nodules can be stratified by the CU risk-stratification system, based on FNA cytology and the K-TIRADS. The proposed CU scoring system may be helpful in the management of thyroid nodules after FNA.
Introduction
The malignancy risk of thyroid nodules depends on US features, and within a given FNA result it may differ according to those features, as reported for non-diagnostic (2), benign (3–6), and indeterminate (7–12) FNA results. US malignancy risk-stratification systems have been suggested by several thyroid societies (13–16). Recently, the Korean Society of Thyroid Radiology suggested a US malignancy risk-stratification system for thyroid nodules, the Korean-Thyroid Imaging Reporting and Data System (K-TIRADS), in which the malignancy risk is stratified by US patterns that integrate solidity, echogenicity, and suspicious US features (17–19).
Recent guidelines (1,15,16) recommend considering US features in the management of nodules after FNA. However, an integrated risk-stratification system based on a combination of FNA results and US patterns has rarely been investigated in thyroid nodules (20,21). This study was performed to develop a clinically applicable cytology-ultrasonography (CU) scoring system for malignancy risk stratification based on FNA cytology and US patterns, according to the K-TIRADS.
Materials and Methods
The Institutional Review Board approved this retrospective study. The requirement to obtain informed consent was waived.
Study population
In total, 1651 consecutive thyroid nodules from 1457 patients (1126 women, 331 men; mean age = 51 ± 12.1 years) who underwent FNA were enrolled. The enrolled patients were a subset of those enrolled in the Thyroid Imaging Reporting and Data System (TIRADS) study, which included consecutive patients with thyroid nodules (≥1 cm) who underwent FNA or core-needle biopsies (CNB) from January 2010 to May 2011 (17).
Final diagnoses of malignant tumors and benign neoplasms were determined by surgery. Final diagnoses of benign nodules were determined by (i) pathological results of surgical resections, (ii) benign cytology results of FNA or CNB that was repeated at least twice, or (iii) an initial benign result from FNA or CNB together with decreased or stable nodule size at US follow-up of >12 months.
US exam and image analysis
A high-resolution US scan using a 10–12 MHz or 5–14 MHz linear-array transducer (AplioXG; Toshiba, Otawarashi, Japan; iU22; Philips Medical Systems, Bothell, WA) was performed. US images were retrospectively reviewed by one of three experienced radiologists (D.G.N., J.H.B., and J.Y.S., who had 19, 16, and 12 years of experience performing thyroid US and interventional procedures, respectively). All of the reviewers, who had no previous knowledge of FNA results or final diagnoses, assessed the following US features of the thyroid nodules: internal content, echogenicity, margin, shape, calcification, nodule vascularity, spongiform appearance, and comet-tail artifact. The thyroid nodules were classified into four categories (benign, low suspicion, intermediate suspicion, and high suspicion) using the K-TIRADS, a malignancy risk-stratification system based on solidity, echogenicity, and suspicious US features in thyroid nodules (17–19). K-TIRADS 5 (high suspicion) nodules include solid hypoechoic nodules with any suspicious US feature (microcalcification, non-parallel orientation, spiculated/microlobulated margin). K-TIRADS 4 (intermediate suspicion) nodules include solid hypoechoic nodules with no suspicious US feature and partially cystic or iso- or hyperechoic nodules with any suspicious US feature. K-TIRADS 3 (low suspicion) nodules include partially cystic or iso- or hyperechoic nodules with no suspicious US feature. K-TIRADS 2 (benign) nodules include pure cysts, partially cystic nodules with comet-tail artifacts, and spongiform nodules.
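The category definitions above can be read as a small decision rule. The following is a minimal sketch of that rule, not the authors' software: the function name, argument names, and string labels are hypothetical, and treating any spongiform nodule as K-TIRADS 2 regardless of other features is a simplifying assumption.

```python
def k_tirads_category(composition, echogenicity=None, suspicious_features=0,
                      comet_tail=False, spongiform=False):
    """Return a K-TIRADS category (2-5) from the pattern definitions in the text.

    composition: "cystic", "partially_cystic", or "solid" (hypothetical labels)
    echogenicity: "hypoechoic", "isoechoic", or "hyperechoic"
    suspicious_features: count of microcalcification, non-parallel orientation,
        and spiculated/microlobulated margin findings
    """
    if composition not in {"cystic", "partially_cystic", "solid"}:
        raise ValueError(f"unknown composition: {composition!r}")

    # K-TIRADS 2: pure cyst, partially cystic with comet-tail artifact, spongiform.
    # Assumption: spongiform appearance overrides other findings.
    if composition == "cystic" or spongiform:
        return 2
    if composition == "partially_cystic" and comet_tail:
        return 2

    has_suspicious = suspicious_features > 0

    # Solid hypoechoic: K-TIRADS 5 with any suspicious feature, otherwise 4.
    if composition == "solid" and echogenicity == "hypoechoic":
        return 5 if has_suspicious else 4

    # Partially cystic, or solid iso-/hyperechoic:
    # K-TIRADS 4 with any suspicious feature, otherwise 3.
    return 4 if has_suspicious else 3
```

For example, `k_tirads_category("solid", "isoechoic", suspicious_features=1)` follows the intermediate-suspicion branch, because only solid hypoechoic nodules can reach the high-suspicion category under these definitions.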
US-guided FNA procedure
FNA was performed using a conventional method, and at least two samplings were performed for each nodule (22). FNA was routinely performed for thyroid nodules >1 cm, with the exception of pure cystic nodules, partially cystic nodules with comet-tail artifacts, and spongiform nodules (23). The interpretation of FNA was based on The Bethesda System for Reporting Thyroid Cytopathology (1).
Data analysis and statistics
To develop a CU risk-stratification system, the malignancy risk was determined for each combined CU result, defined as the combination of an FNA cytology result (the six categories of the Bethesda system) and a US pattern (the four categories of the K-TIRADS). The malignancy risk of thyroid nodules was stratified and categorized according to the calculated malignancy risk of each combined CU result. The rate of neoplasms, including malignant tumors and follicular neoplasms, was also calculated for each combined CU result. The four CU scores were determined with consideration for possible clinical management of nodules based on the malignancy risk of nodules.
The chi-square test or Fisher's exact test was used to compare the frequency of each K-TIRADS category among malignant tumors and to compare the malignancy risk of each K-TIRADS category among nodules with the same cytology results. The binomial test was used to test whether the malignancy risk of each FNA result was significantly changed by the K-TIRADS category and whether the malignancy risk of each K-TIRADS category was significantly changed by the FNA result. An increase or decrease in the malignancy risk of a combined CU result among nodules with the same FNA result or K-TIRADS category was recorded when the malignancy risk of that combined CU result differed significantly from the overall malignancy risk of the same FNA cytology result or K-TIRADS category.
Statistical analyses were performed with the IBM SPSS Statistics for Windows v20.0 (IBM Corp., Armonk, NY). A p-value of <0.05 was considered to indicate statistical significance.
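The binomial comparison described above can be illustrated with a short sketch. This is an assumption-laden illustration rather than the SPSS procedure used in the study: the function names are hypothetical, the two-sided p-value is computed by summing all outcomes no more likely than the observed one (one common convention for the exact test), and the counts in the usage note are invented for illustration only.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k events in n trials with event probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_test_two_sided(k, n, p0):
    """Exact two-sided binomial test of an observed rate k/n against p0.

    k:  malignant nodules in one combined CU cell
    n:  all nodules in that cell
    p0: overall malignancy risk of the reference group (same FNA result
        or same K-TIRADS category)
    """
    observed = binomial_pmf(k, n, p0)
    # Small relative tolerance so that ties with the observed outcome are included.
    threshold = observed * (1 + 1e-7)
    total = 0.0
    for i in range(n + 1):
        prob = binomial_pmf(i, n, p0)
        if prob <= threshold:
            total += prob
    return min(1.0, total)
```

As a usage example with hypothetical numbers, `binomial_test_two_sided(10, 16, 0.30)` asks whether 10 malignancies among 16 nodules in one cell is compatible with an overall risk of 30% in the reference group; a p-value below 0.05 would be read as a significant change under the threshold stated in the text.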
Results
Demographic data
The maximum size of nodules ranged from 10 to 100 mm (mean = 19.1 ± 10.7 mm; median size = 15 mm). Final diagnoses of the 1651 nodules were 1340 (81.2%) benign nodules (1297 benign non-neoplastic nodules, 43 benign tumors) and 311 (18.8%) malignant nodules. Final diagnoses were determined by surgical resections in 187 (14%) of 1340 benign nodules, which included 131 nodular hyperplasias, 41 follicular adenomas, 13 cases of thyroiditis, and two other benign tumors. Final diagnoses of 311 malignant tumors were made after surgical resections, and there were 283 (91%) papillary thyroid carcinomas (PTC), including 27 follicular-variant PTC, 20 (6.4%) follicular carcinomas, five (1.6%) medullary carcinomas, two (0.6%) anaplastic carcinomas, and one (0.3%) lymphoma.
The low-suspicion US pattern (K-TIRADS 3) was found more frequently in follicular-variant PTC (40.7%; 11/27) and in follicular carcinomas (45%; 9/20) than in conventional PTC (11.7%; 30/256; each p < 0.001). The high-suspicion US pattern (K-TIRADS 5) was found more frequently in conventional PTC (63.7%; 163/256) than in follicular-variant PTC (22.2%; 6/27) or follicular carcinomas (5%; 1/20; each p < 0.001). Although follicular-variant PTC and follicular carcinomas accounted for 20/50 (40%) malignant tumors with the low-suspicion US pattern (K-TIRADS 3), they accounted for only 7/175 (4%) malignant tumors with the high-suspicion US pattern (K-TIRADS 5).
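The frequency comparisons above are 2×2 comparisons of proportions. A minimal sketch of a two-sided Fisher's exact test is shown below, using only the counts reported in the text; the function name is hypothetical, and the study does not state which of the chi-square or Fisher's exact test produced each individual p-value, so this is one plausible reading rather than a reproduction of the analysis.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are the two tumor groups; columns are pattern present / absent.
    The p-value sums every table with the same margins that is no more
    likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def pmf(x):
        # Hypergeometric probability of x "pattern present" cases in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    threshold = pmf(a) * (1 + 1e-7)  # tolerance so ties are included
    p_value = sum(pmf(x) for x in range(lowest, highest + 1) if pmf(x) <= threshold)
    return min(1.0, p_value)


# Counts from the text: K-TIRADS 3 pattern in follicular-variant PTC (11 of 27)
# versus conventional PTC (30 of 256).
p_fvptc_vs_conventional = fisher_exact_two_sided(11, 27 - 11, 30, 256 - 30)
```

Under this sketch, the follicular-variant versus conventional PTC comparison gives a p-value below 0.001, in line with the reported result.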
Malignancy risk of nodules by combined CU results
Table 1 shows the malignancy risk of each combined CU result of FNA cytology and the K-TIRADS. In nodules with non-diagnostic or atypia of undetermined significance/follicular lesion of undetermined significance (AUS/FLUS) cytology results, the malignancy risk of each combined CU result increased as the K-TIRADS category increased. In nodules with non-diagnostic results, the malignancy risk of each K-TIRADS category was similar to the overall malignancy rate of the corresponding K-TIRADS category. The malignancy risk of K-TIRADS 3 nodules was lower than those of K-TIRADS 4 and 5 nodules (p = 0.066 and p < 0.001, respectively), and the malignancy risk of K-TIRADS 4 nodules was lower than that of K-TIRADS 5 nodules (p < 0.001). In nodules with AUS/FLUS cytology results, the malignancy risks of K-TIRADS 3 and 4 nodules were 20.2% and 34%, respectively. The malignancy risk of K-TIRADS 4 nodules was marginally higher than that of K-TIRADS 3 nodules (p = 0.062), and the malignancy risk of K-TIRADS 5 nodules was significantly higher than that of K-TIRADS 3 and 4 nodules (p < 0.001 and p = 0.019, respectively). In nodules with benign cytology results, the malignancy risks of K-TIRADS 2, 3, and 4 nodules were <3% and significantly lower than the malignancy risk (12.5%) of K-TIRADS 5 nodules (p ≤ 0.013). In nodules with follicular neoplasm/suspicious for follicular neoplasm (FN/SFN) cytology results, the malignancy risks of K-TIRADS 3 and 4 nodules were 14.3% and 35.3%, respectively. The malignancy risk of K-TIRADS 4 nodules was higher than that of K-TIRADS 3 nodules, but the difference was not statistically significant (p = 0.240). In nodules with suspicious for malignancy or malignant cytology results, the malignancy rate was very high (>90%), regardless of the K-TIRADS category.
FNA, fine-needle aspiration; K-TIRADS, Korean-Thyroid Imaging Reporting and Data System; AUS/FLUS, atypia of undetermined significance/follicular lesion of undetermined significance; FN/SFN, follicular neoplasm/suspicious for follicular neoplasm; NA, not applicable.
The rates of neoplasms, including malignant tumor and follicular neoplasm, were 9.4% and 25% higher than the rates of malignant tumors in nodules with AUS/FLUS and FN/SFN cytology results, respectively. The rate of neoplasms was only slightly higher (<4%) than the rate of malignant tumors in each category of nodules with other FNA cytology results. The rate of neoplasms was similar to that of malignant tumors (<2% difference) in K-TIRADS 2, 3, and 5 nodules, and it was 5.7% higher than the rate of malignant tumors in K-TIRADS 4 nodules. In nodules with FN/SFN cytology results, the neoplasm risk of K-TIRADS 4 nodules was significantly higher than that of K-TIRADS 3 nodules (70.6% vs. 28.6%; p = 0.002).
Changes in malignancy risk for each FNA cytology result according to the K-TIRADS category
Table 2 shows the changes in malignancy risk for each FNA cytology result according to the K-TIRADS category. In nodules with non-diagnostic, benign, and AUS/FLUS cytology results, a low-suspicion US pattern (K-TIRADS 3) significantly decreased the malignancy risk of nodules (p = 0.003, 0.013, and 0.027, respectively) and a high-suspicion US pattern (K-TIRADS 5) significantly increased the malignancy risk of nodules (p < 0.001, p < 0.001, and p = 0.001, respectively) compared to the overall malignancy risk of each FNA cytology result. An intermediate-suspicion US pattern (K-TIRADS 4) did not cause any significant change in the malignancy risk, regardless of FNA cytology results (p ≥ 0.078). In nodules with FN/SFN, suspicious for malignancy, and malignant FNA cytology results, the K-TIRADS category did not cause a significant change in the malignancy risk of each FNA cytology result (p ≥ 0.235).
Effect of K-TIRADS category on the malignancy risk of each FNA cytology result.
Binomial test for the difference of malignancy risk between a combined cytology-ultrasonography (CU) result and the overall malignancy risk of the same FNA cytology result.
Changes in malignancy risk for each K-TIRADS category according to FNA cytology results
Table 3 shows the changes in malignancy risk for each K-TIRADS category according to the FNA cytology results. In nodules with K-TIRADS 3 or 4, a benign cytology result decreased the malignancy risk of nodules significantly compared to the overall malignancy risk of K-TIRADS 3 or 4. AUS/FLUS, suspicious for malignancy, and malignant cytology results significantly increased the malignancy risk of nodules (p < 0.001 for K-TIRADS 3 and p ≤ 0.024 for K-TIRADS 4). Non-diagnostic or FN/SFN cytology results did not cause any significant change in the malignancy risk in K-TIRADS 3 or 4 nodules (p ≥ 0.137).
Effect of FNA cytology results on the malignancy risk of each K-TIRADS category.
Binomial test for the difference of malignancy risk between a combined CU result and the overall malignancy risk of the same K-TIRADS category.
In K-TIRADS 5 nodules, benign cytology results significantly decreased the malignancy risk (p < 0.001), and suspicious for malignancy and malignant cytology results significantly increased the malignancy risk (p < 0.001). Non-diagnostic or AUS/FLUS cytology results did not cause any significant change in the malignancy risk in K-TIRADS 5 nodules (p = 0.592 and p = 0.055, respectively).
Proposed CU scoring system for malignancy risk stratification
The malignancy risk of each combined CU result could be stratified into four simplified scores: CU 1 (very low risk), CU 2 (low risk), CU 3 (high risk), and CU 4 (very high risk). Table 4 shows the CU scores of thyroid nodules according to FNA results and the K-TIRADS categories. Nodules with non-diagnostic or AUS/FLUS cytology results have three CU scores (1, 2, and 3). Nodules with benign cytology results have CU 1 or 2 scores, and nodules with FN/SFN cytology results have CU 2 or 3 scores. Nodules with suspicious for malignancy or malignant cytology results have only CU 4 scores, regardless of the K-TIRADS category.
Management decisions may be modified by other factors including nodule size, presence of aggressive cancer behaviors, clinical risk factors, and individual patient factors.
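The four risk bands reported for the CU scores (<3%, ≥3% to <30%, ≥30% to <90%, and ≥90%) can be written as a simple threshold function. The sketch below is illustrative only: the function name is hypothetical, and because the authors state that the scores were assigned with clinical management in mind, the published score for a given cell may not follow purely from its point estimate.

```python
def cu_score_from_risk(malignancy_risk):
    """Map a malignancy risk (fraction between 0 and 1) to a CU score of 1-4.

    Bands follow the thresholds stated in the text:
    CU 1 very low (<3%), CU 2 low (>=3% and <30%),
    CU 3 high (>=30% and <90%), CU 4 very high (>=90%).
    """
    if not 0.0 <= malignancy_risk <= 1.0:
        raise ValueError("malignancy_risk must be a fraction between 0 and 1")
    if malignancy_risk < 0.03:
        return 1
    if malignancy_risk < 0.30:
        return 2
    if malignancy_risk < 0.90:
        return 3
    return 4


# Risks reported in the text for selected combined CU results.
reported_risks = {
    ("benign", 5): 0.125,
    ("AUS/FLUS", 3): 0.202,
    ("AUS/FLUS", 4): 0.34,
    ("AUS/FLUS", 5): 0.625,
    ("FN/SFN", 3): 0.143,
    ("FN/SFN", 4): 0.353,
}
scores = {cell: cu_score_from_risk(risk) for cell, risk in reported_risks.items()}
```

Applied to the risks quoted in the Results, this banding is consistent with the narrative that benign cytology spans CU 1 or 2 and FN/SFN cytology spans CU 2 or 3.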
Discussion
The data demonstrate that thyroid nodules could be stratified into the four categories of the CU scoring system according to the malignancy risk, based on combined results of FNA cytology and the K-TIRADS. The CU scoring system can stratify the malignancy risk into three categories (CU 1, 2, and 3) for nodules with non-diagnostic or AUS/FLUS cytology results, two categories for nodules with benign cytology results (CU 1 and 2) or FN/SFN cytology results (CU 2 and 3), and one category (CU 4) for nodules with suspicious for malignancy or malignant cytology results.
Nodules with a CU 1 score include K-TIRADS 2 nodules and K-TIRADS 3 or 4 nodules with benign cytology results. Nodules with CU 1 scores may be observed without repeated biopsies. Nodules with a CU 2 score include subsets of nodules with indeterminate cytology results, including non-diagnostic, AUS/FLUS, and FN/SFN cytology results, and these nodules may require repeated biopsies for optimal management decisions.
A previous study (2) suggested that non-diagnostic thyroid nodules without suspicious US features (solidity, hypoechogenicity or marked hypoechogenicity, microlobulated or irregular margin, microcalcification, and taller-than-wide shape) or those with one suspicious US feature could be followed up with US in lieu of a repeated biopsy, considering the calculated low malignancy risk. The K-TIRADS subcategorizes nodules without suspicious US features into benign (K-TIRADS 2) and low-suspicion nodules (K-TIRADS 3). The data suggest that non-diagnostic nodules with a low-suspicion US pattern (K-TIRADS 3) should be categorized as CU 2 and may require a repeated biopsy, for three reasons. First, a non-diagnostic FNA result did not significantly reduce the malignancy risk of low-suspicion nodules. Second, FNA for low-suspicion nodules is usually indicated clinically for progressively enlarging nodules >1.5–2 cm (15,16,18) to screen for malignant tumors, among which follicular neoplasms and follicular-variant PTC account for relatively large proportions. Therefore, a non-diagnostic FNA result does not eliminate or reduce the clinical need for screening for malignancy in patients with large low-suspicion nodules if the malignancy risk is not in fact reduced significantly by a non-diagnostic FNA result. Third, the calculated malignancy rate of low-suspicion nodules in a cohort study depends on the proportion of follicular carcinoma and follicular-variant PTC because these tumors are present in relatively large proportions among malignant tumors showing low-suspicion US patterns.
AUS/FLUS nodules showed various CU scores, which may be related to the heterogeneous histopathology of thyroid nodules diagnosed as AUS/FLUS (24). Presently, repeated biopsy seems to be a reasonable management strategy for nodules initially diagnosed as AUS/FLUS because it results in a benign cytology diagnosis in >30–50% of patients, and unnecessary diagnostic surgery can be avoided for these nodules (24–26). However, the efficacy of repeated FNA is controversial in AUS/FLUS nodules because repeat FNA also reproduces a significant rate of inconclusive results (24,25,27), and AUS/FLUS nodules may have a higher risk of malignancy than previously estimated (25). Other possible strategies for AUS/FLUS nodules include making management decisions based on a combination of US features and subcategory diagnoses of AUS/FLUS (9), use of CNB as an alternative to repeat FNA (26), and application of molecular tests (28,29).
Several studies (30,31) have suggested that AUS/FLUS nodules with suspicious US features should undergo thyroidectomies without a repeated FNA because these nodules have a high malignancy risk. In indeterminate nodules, the decision to perform diagnostic surgery needs to be based on the efficacy and benefit of the alternative nonsurgical methods for reducing unnecessary surgery, including repeated biopsy (FNA or CNB) or molecular studies, as well as on the estimated malignancy risk of a nodule. Although recent guidelines recommend repeated biopsy in lieu of immediate surgery for nodules with AUS/FLUS FNA results (15,16), the diagnostic efficacy and benefit of repeated biopsy have not been determined according to US patterns in AUS/FLUS nodules. For the management of high-risk AUS/FLUS nodules with a high-suspicion US pattern (K-TIRADS 5), diagnostic surgery instead of repeated biopsy may be chosen with consideration of nodule size, clinical risk factors, and patient factors because the malignancy risk is high (62.5%) and the diagnostic efficacy and benefit of repeated biopsy are not proven in these nodules and might be lower than in low-risk AUS/FLUS nodules (CU 2). However, repeated biopsy has the potential benefit of reducing unnecessary surgery. Repeated biopsy (FNA or CNB) is also a safe procedure that is rarely associated with major complications and does not result in a surgical scar or potential postsurgical complications such as hypothyroidism (10.9–48.8%) induced by hemithyroidectomy (32). Therefore, the first-line use of biopsy instead of immediate diagnostic surgery is a reasonable choice, and thyroid surgery can be considered if an indeterminate result is reproduced by repeated biopsies. High-risk AUS/FLUS nodules (CU 3) may also be considered for molecular tests such as ThyroSeq or Afirma in order to avoid unnecessary diagnostic surgery, although the negative predictive value of a rule-out test decreases as the pretest malignancy risk increases.
The optimal strategy for the application of repeated biopsies and molecular testing according to the CU score needs to be established in AUS/FLUS nodules.
This study showed that K-TIRADS 3 nodules have a slightly lower malignancy risk (CU 2) and a significantly lower neoplasm risk (<30%) than K-TIRADS 4 nodules (CU 3) among nodules with FN/SFN cytology results. Based on these results, repeated biopsies may be considered for K-TIRADS 3 nodules with FN/SFN cytology results (CU 2), and CNB may be more effective than FNA for an accurate diagnosis and reducing unnecessary surgery (32–34). However, the benefit of repeated biopsies for these nodules has not been confirmed and remains to be further investigated. Diagnostic surgery may be recommended for CU 3 nodules with K-TIRADS 4 and FN/SFN cytology results because the neoplasm risk is high and the non-neoplastic nodules included in this category may be mostly hypercellular hyperplastic nodules, which may have a high probability of being diagnosed as FN/SFN or AUS/FLUS rather than benign follicular nodules by repeated biopsy.
A previous study (35) reported that the malignancy risk of thyroid nodules without suspicious US features is lower than that of nodules with suspicious US features among nodules with cytology results suspicious for PTC. However, no benign nodule was found among K-TIRADS 3 nodules with suspicious for malignancy cytology results in this study. This may be related to the very high malignancy risk of nodules with suspicious for malignancy cytology results in this study. The potential benefit of repeated biopsies in these cases needs to be further investigated in institutions showing relatively lower malignancy risk (approximately 70–80%) of nodules with suspicious for malignancy cytology results.
Although the management of thyroid nodules may be determined primarily by the malignancy risk based on FNA cytology results and US patterns, management decisions may need to be modified by other factors, including nodule size, presence of features suggestive for aggressive cancer such as extrathyroidal invasion or lymph node metastasis, various clinical risk factors, and individual patient factors, such as age, comorbidities, and patient preferences. The proposed management recommendations (Table 4) have not been tested in the present study and require further investigation to establish the management strategy according to CU score.
The very low malignancy risk of the CU 1 score was determined with consideration for observation management, and the suggested malignancy risk is the same as that of a benign FNA result (1). The low malignancy risk of the CU 2 category was determined with consideration for repeated biopsy as a management strategy. The reported malignancy risk of AUS/FLUS nodules ranges from approximately 27% to 34% according to systematic review and meta-analysis studies (36,37). The very high malignancy risk of the CU 4 score was determined considering definite surgical management. The CU 3 score was determined for nodules with high malignancy risk (30–90%), in which surgery or repeated biopsy may be chosen for the management of nodules.
This study has several limitations. First, its retrospective design may have induced selection bias. Further investigation through a larger prospective study is required to validate the utility of the proposed CU system. Second, there were no substantial differences in malignancy risk between suspicious for malignancy and malignant cytology results in this study. The CU score and management strategy for nodules with suspicious for malignancy cytology results may differ in institutions where the malignancy risk for such results is not very high. Third, further investigation is required to determine the diagnostic efficacy and benefit of repeated biopsy according to US patterns in nodules with indeterminate cytology results. Fourth, the inter-observer variability for US patterns (K-TIRADS) was not assessed in the present study.
In conclusion, the malignancy risk of thyroid nodules can be stratified into four scores by the CU risk-stratification scoring system. The proposed CU scoring system may be helpful in the management of thyroid nodules after FNA.
Footnotes
Author Disclosure Statement
The authors have nothing to disclose.
