Abstract
Background:
Bethesda System category III (atypia of undetermined significance/follicular lesion of undetermined significance [AUS/FLUS]) creates a dilemma because of heterogeneity. The aim of this study was to assess whether ultrasonography (US) contributes to differentiating AUS from FLUS and may suggest characteristics of malignancy within category III.
Methods:
From April 2011 to April 2012, 433 thyroid nodules that had been interpreted as nuclear atypia (AUS group; n = 322) or microfollicular architecture (FLUS group; n = 111) were included in this retrospective study. Final diagnoses were acquired in 327 nodules after surgery or clinico-radiological follow-up. The AUS group and FLUS group were compared in terms of US features (composition, echogenicity, margin, shape, and calcifications), US diagnosis (probably benign or suggestive for malignancy), malignancy rate, and final malignant histology.
Results:
In univariate and multivariate analyses, the AUS group more frequently had noncircumscribed margins, a taller-than-wide shape, and a US diagnosis suggestive for malignancy than the FLUS group did. The risk of malignancy was significantly higher in the AUS group than in the FLUS group (51.0% vs. 8.1%; p < 0.001). The presence of a BRAF mutation differed significantly between the AUS group and the FLUS group (47.6% vs. 4.2%; p < 0.001). Of the patients with malignancy, papillary thyroid carcinoma was found more frequently in the AUS group than in the FLUS group (97.7% vs. 66.7%; p = 0.004). The incidence of follicular thyroid carcinoma was significantly higher in the FLUS group than in the AUS group (33.3% vs. 1.6%; p = 0.004).
Conclusion:
Bethesda System category III subcategorization into AUS and FLUS can be supported by US features. In Bethesda III nodules, US features may further help in predicting a malignant histology.
Introduction
To resolve the heterogeneity of this category, several studies have proposed that AUS/FLUS should be split into two subcategories: (i) cases with nuclear features not characteristic of benign lesions (AUS), and (ii) cases with low cellularity with predominant microfollicular architecture and no or minimal colloid (FLUS) (13–16). Recent studies have suggested that, on the basis of risk stratification for malignancy, patients classified in the AUS subcategory should have a more aggressive follow-up, and repeat procedures should be performed sooner than in patients in the FLUS subcategory (13,14). In view of the availability of ultrasonography (US) findings for thyroid nodules, knowledge of US predictors for distinguishing AUS from FLUS is important to aid clinical decision making and provide further management guidelines.
This study divided AUS/FLUS cases into two subcategories in terms of the predominance of nuclear atypia or architectural atypia. Thus, the purpose of this study was to assess whether ultrasonography (US) contributes to differentiating AUS from FLUS and may suggest characteristics of malignancy within category III.
Materials and Methods
Study population
This retrospective study had Institutional Review Board approval, and the requirement for informed consent was waived. A computer search identified cytological diagnoses for 6402 thyroid nodules evaluated by US-guided FNA between April 2011 and April 2012 at the authors’ institution. Among them, 551 (8.6%) were diagnosed as BSRTC category III (AUS/FLUS). Ninety-seven nodules ≤0.5 cm in size, 16 nodules without US images, and four nodules that had no lesion matching with US results were excluded. A total of 433 thyroid nodules diagnosed as category III were included in this study. Of these, 327 thyroid nodules underwent surgery or US follow-up for at least 12 months and were included in the statistical analysis.
US examination and US-FNA procedure
US was performed with 5–12 MHz linear array transducers (iU22; Philips Medical Systems, Bothell, WA). Real-time US examinations were performed by one of seven radiologists with 5–20 years of experience in thyroid imaging. US-guided FNA was conducted subsequently by the same radiologist performing the US examination.
US-guided FNA was performed for the nodule presenting suspicious US features or, when none showed suspicious US features, for the largest of multiple nodules. US-guided FNA was performed with a 23-gauge needle attached to a 2 mL disposable plastic syringe, without the use of an aspirator (17). Aspirated material was expelled onto glass slides, smeared, and immediately fixed in 95% ethyl alcohol for Papanicolaou staining. The remainder of the material within the syringe was rinsed with ethyl alcohol for cell block processing. BRAF V600E mutational analysis from FNA specimens was performed in some cases at the request of clinicians. Slides were sent to the pathology department for analysis.
Cytological analysis
FNA cytology was interpreted by one of seven pathologists. According to the BSRTC recommendations (1), cytologic diagnoses were classified into one of six categories: (i) nondiagnostic or unsatisfactory, (ii) benign, (iii) AUS/FLUS, (iv) follicular neoplasm or suspicious for a follicular neoplasm, (v) suspicious for malignancy, and (vi) malignant. The AUS/FLUS category was further classified into two subcategories (4,13–16,18,19). The AUS group included cases that demonstrated features of nuclear atypia, such as the presence of occasional nuclear pseudoinclusions, grooves, abnormal chromatin pattern, or nuclear overlapping and crowding, but did not meet the criteria for a diagnosis of suspicious for malignancy. The FLUS group consisted of cases with features of architectural atypia, such as the presence of a prominent population of microfollicles or Hürthle cells with scant colloid, that were not sufficient for a diagnosis of follicular neoplasm or suspicious for a follicular neoplasm. For category III cases that had not been subclassified initially, subclassification was achieved through a careful retrospective review of the cytological slides by one cytopathologist (Y.L.O.) specializing in thyroid pathology (15).
Analysis of US findings
Two radiologists (J.H.S. and S.L.) retrospectively and independently reviewed the US images. They were blinded to each patient's clinical history, previous imaging results, and histologic results. Discrepancies in interpretation between the two reviewers were resolved by consensus.
Nodule size was measured as the largest diameter on US images. The following US features of each thyroid nodule were documented: internal composition, echogenicity, margin, calcifications, and shape (20,21). Internal composition was classified as solid, predominantly solid (solid contents consisting of >50% of the nodule), or predominantly cystic (solid contents consisting of <50% of the nodule). Echogenicity was divided into hyper- or isoechoic (nodules showing hyper- to isoechogenicity compared with the surrounding thyroid parenchyma), hypoechoic (nodules showing hypoechogenicity compared with the surrounding thyroid parenchyma), or markedly hypoechoic (nodules showing hypoechogenicity compared with the surrounding strap muscle). Margins were categorized as circumscribed or noncircumscribed (i.e., microlobulated or spiculated margins). Calcifications were designated as absent, microcalcifications, macrocalcifications, or rim calcifications. Shape was classified as taller-than-wide (when the anteroposterior diameter of a nodule was longer than its transverse diameter on a transverse or longitudinal plane) or wider-than-tall. Of these features, marked hypoechogenicity, noncircumscribed margin, microcalcifications, and taller-than-wide shape were considered features suggestive for malignancy according to previously published criteria (20,21). A US diagnosis suggestive for malignancy was made when a nodule had at least one of these features. A probably benign US diagnosis was made when a thyroid nodule had none of the features suggestive for malignancy.
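The US diagnosis rule described above (at least one of four suspicious features) can be illustrated with a minimal Python sketch. The class name, field names, and string labels are hypothetical encodings chosen for this illustration; they are not part of the study's data format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NoduleUS:
    """US features of one nodule; the string labels are hypothetical encodings."""
    echogenicity: str   # "hyper_or_iso", "hypoechoic", or "marked_hypoechoic"
    margin: str         # "circumscribed" or "noncircumscribed"
    calcification: str  # "absent", "micro", "macro", or "rim"
    shape: str          # "taller_than_wide" or "wider_than_tall"


def suspicious_features(nodule: NoduleUS) -> List[str]:
    """Return the features suggestive for malignancy that the nodule shows."""
    checks = [
        (nodule.echogenicity == "marked_hypoechoic", "marked hypoechogenicity"),
        (nodule.margin == "noncircumscribed", "noncircumscribed margin"),
        (nodule.calcification == "micro", "microcalcifications"),
        (nodule.shape == "taller_than_wide", "taller-than-wide shape"),
    ]
    return [name for present, name in checks if present]


def us_diagnosis(nodule: NoduleUS) -> str:
    """At least one suspicious feature gives a diagnosis suggestive for malignancy."""
    if suspicious_features(nodule):
        return "suggestive for malignancy"
    return "probably benign"
```

Under this encoding, a hypoechoic nodule with a circumscribed margin, macrocalcifications, and a wider-than-tall shape is labeled probably benign, because plain hypoechogenicity and macrocalcifications are not among the four suspicious features.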
Statistical analysis
Continuous data were evaluated by using two-sample t-tests or Mann–Whitney U-tests according to the normality assumption. Categorical variables were analyzed by using chi-square tests or Fisher's exact tests. In the cases where final diagnoses were obtained after histologic confirmation or clinico-radiological follow-up, logistic regression for univariate and multivariate (stepwise selection) analysis was used to compare predictive US factors between AUS and FLUS. Statistical analysis was performed by using SAS v9.4 software (SAS Institute, Cary, NC). A p-value of <0.05 was considered statistically significant.
Results
Of the 433 thyroid nodules classified as BSRTC category III by FNA cytology, 322 (74.4%) were diagnosed as AUS and 111 (25.6%) were diagnosed as FLUS. The clinical factors and US features of the AUS and FLUS groups are presented in Table 1. Nodule size was larger in the FLUS group (M ± SD = 1.85 ± 1.21 cm; range 0.6–7.0 cm) than in the AUS group (M ± SD = 1.23 ± 0.77 cm; range 0.6–5.9 cm; p < 0.001). Age (p = 0.781) and sex (p = 0.720) were not significantly different between the two groups.
Notes to Table 1: Unless otherwise indicated, data in parentheses are percentages. Data are means ± standard deviation. US, ultrasonography; AUS, atypia of undetermined significance; FLUS, follicular lesion of undetermined significance.
Final diagnoses were acquired in 327 nodules (AUS, n = 253; FLUS, n = 74) after surgery or clinico-radiological follow-up (Fig. 1). The malignancy risk of BSRTC category III (AUS/FLUS) was 41.3% (135/327), and that of the AUS group (51.0%, 129/253) was significantly greater than that of the FLUS group (8.1%, 6/74; p < 0.001). Of the patients with malignancy, papillary thyroid carcinoma (PTC) was found more frequently in the AUS group than in the FLUS group (97.7%, 126/129 vs. 66.7%, 4/6; p = 0.004). In surgically confirmed malignant histologies, the AUS group included classic PTC in 114 cases, follicular variant of PTC in 11 cases, follicular carcinoma in two cases, an oncocytic variant of PTC in one case, and medullary thyroid carcinoma in one case. On the other hand, the FLUS group included classic PTC in three cases, follicular carcinoma in two cases, and follicular variant of PTC in two cases. The incidence of follicular thyroid carcinoma was significantly higher in the FLUS group (33.3%, 2/6) than in the AUS group (1.6%, 2/129; p = 0.004). In contrast, benign results were significantly more frequent in the FLUS group than in the AUS group (91.9%, 68/74 vs. 49.0%, 124/253; p < 0.001).
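As an illustration of how the malignancy risks above relate to the reported counts, the following Python sketch recomputes the percentages and applies a Pearson chi-square test to the 2 × 2 table of malignant versus benign nodules. The choice of an uncorrected Pearson chi-square is an assumption made for this sketch; the text states only that chi-square or Fisher's exact tests were used, and the original analysis was run in SAS.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square without continuity correction for [[a, b], [c, d]].

    Returns (statistic, p_value). With 1 degree of freedom the upper-tail
    probability of the chi-square distribution is erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Counts reported in the text: malignant / total with a final diagnosis.
aus_malignant, aus_total = 129, 253
flus_malignant, flus_total = 6, 74

aus_risk = 100.0 * aus_malignant / aus_total
flus_risk = 100.0 * flus_malignant / flus_total
overall_risk = 100.0 * (aus_malignant + flus_malignant) / (aus_total + flus_total)

stat, p_value = chi_square_2x2(
    aus_malignant, aus_total - aus_malignant,
    flus_malignant, flus_total - flus_malignant,
)
print(f"AUS {aus_risk:.1f}%, FLUS {flus_risk:.1f}%, overall {overall_risk:.1f}%")
print(f"chi-square = {stat:.2f}, p = {p_value:.2e}")
```

The recomputed risks round to the reported 51.0%, 8.1%, and 41.3%, and the resulting p-value is well below 0.001, in line with the reported significance level.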

Fig. 1. Flow chart of the study group. FNA, fine-needle aspiration; AUS, atypia of undetermined significance; FLUS, follicular lesion of undetermined significance; CNB, core-needle biopsy; F/U, follow-up; US, ultrasound.
Among the BSRTC category III nodules, 211 samples had adequate BRAF mutation analyses. Of the 211 cases, 187 (88.6%) were diagnosed as AUS, and 89 (47.6%) of these 187 cases had BRAF mutations. In contrast, only one (4.2%) of the 24 FLUS cases had a BRAF mutation (p < 0.001).
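Because only one FLUS case carried a BRAF mutation, an exact test is a natural fit for this comparison. The sketch below implements a two-sided Fisher's exact test from the hypergeometric distribution using only the Python standard library and applies it to the reported counts. Whether the authors used Fisher's exact test or a chi-square test for this particular comparison is not stated, so the choice here is an assumption.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    low, high = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(
        prob(x) for x in range(low, high + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Reported counts: BRAF-mutated / tested nodules in each group.
aus_mutated, aus_tested = 89, 187
flus_mutated, flus_tested = 1, 24

braf_p = fisher_exact_two_sided(
    aus_mutated, aus_tested - aus_mutated,
    flus_mutated, flus_tested - flus_mutated,
)
print(f"AUS {100 * aus_mutated / aus_tested:.1f}% vs "
      f"FLUS {100 * flus_mutated / flus_tested:.1f}%, p = {braf_p:.2e}")
```

With these counts the exact p-value falls below 0.001, consistent with the significance level reported in the text.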
In the analysis of US features, a US diagnosis suggestive for malignancy was significantly more frequent in the AUS group than in the FLUS group (66.0%, 167/253 vs. 12.2%, 9/74; p < 0.001) (Figs. 2 and 3). Moreover, there were statistically significant differences between the AUS and FLUS groups in individual US features including echogenicity, margin, calcifications, and shape (p < 0.05).

In univariate analyses of US predictors for differentiation between the AUS and FLUS groups, smaller nodule size, hypoechogenicity, marked hypoechogenicity, noncircumscribed margin, microcalcifications, taller-than-wide shape, and a US diagnosis suggestive for malignancy were significantly more common in the AUS group than in the FLUS group (p < 0.05). In multivariate analysis, margin (odds ratio [OR] 18.519; confidence interval [CI] 3.774–90.909; p < 0.001), shape (OR 7.685; CI 1.500–39.373; p = 0.014), and a US diagnosis suggestive for malignancy (OR 7.634; CI 2.188–26.316; p = 0.001) were independent predictors for distinguishing between AUS and FLUS (Table 2).
Notes to Table 2: OR, odds ratio; CI, confidence interval.
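To make the odds ratio concrete, the following sketch computes a crude (unadjusted) odds ratio with a Wald confidence interval from the reported counts of nodules with a US diagnosis suggestive for malignancy (167/253 in the AUS group, 9/74 in the FLUS group). This is not the study's multivariate model: the 95% level and the Wald method are assumptions made for illustration, and an unadjusted estimate is expected to differ from the adjusted OR reported from the stepwise logistic regression.

```python
import math


def crude_odds_ratio(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Unadjusted odds ratio for [[a, b], [c, d]] with a Wald interval.

    a, b: exposed / unexposed counts in group 1; c, d: the same in group 2.
    z = 1.96 corresponds to an assumed 95% confidence level.
    Returns (odds_ratio, lower, upper).
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se_log), math.exp(log_or + z * se_log)


# Reported counts: suspicious US diagnosis / total with a final diagnosis.
aus_suspicious, aus_n = 167, 253
flus_suspicious, flus_n = 9, 74

or_crude, ci_low, ci_high = crude_odds_ratio(
    aus_suspicious, aus_n - aus_suspicious,
    flus_suspicious, flus_n - flus_suspicious,
)
print(f"crude OR = {or_crude:.2f} (Wald CI {ci_low:.2f} to {ci_high:.2f})")
```

The crude estimate is about 14, roughly twice the adjusted OR of 7.634 reported for the US diagnosis; a gap of this kind is what one would expect when margin and shape, which contribute to the US diagnosis, enter the same model.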
Discussion
The present study found significant differences in US findings between the AUS and FLUS groups. In univariate analyses, certain US findings (smaller nodule size, hypoechogenicity, marked hypoechogenicity, noncircumscribed margin, microcalcifications, and taller-than-wide shape) and a US diagnosis suggestive for malignancy were significantly more frequent in the AUS group than in the FLUS group. According to multivariate analysis, margin, shape, and a US diagnosis suggestive for malignancy were predictive factors for differentiating between the AUS and FLUS groups. A previous study by Rosario et al. (16) showed similar results in that AUS presented a higher frequency of suspicious malignant US findings compared with FLUS. However, these authors did not systematically evaluate each US finding of the included nodules. A recent study by Choi et al. (14) on patients who underwent US-guided core needle biopsy on the basis of previous AUS/FLUS FNA results showed that smaller nodule size, spiculated margin, marked hypoechogenicity, micro- or macrocalcifications, and US findings suggestive for malignancy were significantly more common in AUS than in FLUS. However, these authors did not evaluate independent predictors for differentiating between these two groups in terms of individual US findings.
Recent studies that have further classified the AUS/FLUS category into subcategories have consistently indicated that the malignancy risk of the nuclear atypia group (32.6–65%) is significantly higher than that of the microfollicular architecture group (7–34%) (10,13,14,19,22–24). The present study is comparable to these previous studies: a significantly higher malignancy risk was found in the AUS group (51.0%) than in the FLUS group (8.1%). In addition, PTC was found more frequently in the AUS group, and the rate of follicular thyroid carcinoma was significantly higher in the FLUS group. These differences in malignancy risk and malignant histology between AUS and FLUS could support the necessity for subcategorization within BSRTC category III. Furthermore, the present results indicate that US can serve as a differentiating tool for the subcategorization of category III.
The different histologies were associated with a significant difference in the presence of BRAF mutations between the AUS group and FLUS group (47.6% vs. 4.2%). Other molecular testing such as Afirma, ThyroSeq, or ThyGenX, among others, was not used in the clinical setting. Given the overwhelming incidence of PTC (97%) in the United States, primary testing for BRAF mutation seems most practical because of the high prevalence of BRAF mutations (80.6–84.0%), economic considerations, and a limit set by health insurance companies (25–27).
The practice of subcategorizing AUS versus FLUS cytology is not yet accepted by all pathologists. US features suggestive for malignancy can serve as predictive factors for differentiating between AUS and FLUS. Moreover, the identification of independent US predictors is valuable for guiding the adequate management of patients. It is suggested that thyroid nodules with US features favoring AUS, such as noncircumscribed margin, taller-than-wide shape, and a US diagnosis suggestive for malignancy, should be followed up more closely, and that further intervention using repeat FNA or core needle biopsy is needed promptly in such cases. On the other hand, in nodules showing US findings favoring a benign lesion (larger size, circumscribed margin, wider-than-tall shape, solid, without calcifications), which are more likely associated with FLUS and a relatively low risk of malignancy, US follow-up may be appropriate for patient management unless the nodule size increases or the US features change. The present study has clinical significance in that independent US predictors differentiating between AUS and FLUS may provide physicians with more detailed management recommendations.
A detailed knowledge of the US features may be helpful for pathologists in formulating a recommendation. In the case of discordant findings between cytology and US diagnosis, further evaluation may be necessary. In nodules with US features suggestive for malignancy and FLUS cytology findings, repeat FNA or biopsy should be considered. At the authors' institution, pathologists compare the cytology findings with the US diagnosis, which helps in making management recommendations.
The present study has several limitations. First, selection bias might have been introduced due to the retrospective single-center nature of the study. Second, a relatively large number of patients were lost to follow-up (21.4% of the AUS group and 33.3% of the FLUS group). However, it is expected that this would have had little influence on malignancy rate because the compliance of patients for follow-up of nodules with US findings suggestive for malignancy is relatively good at tertiary hospitals. Third, the diagnostic performance of US findings for differentiating between the two groups was not evaluated. Thus, further investigations are required.
In conclusion, margin, shape, and a US diagnosis suggestive for malignancy (a nodule with at least one malignant US feature among marked hypoechogenicity, noncircumscribed margin, microcalcifications, and taller-than-wide shape) can be predictive factors for differentiating between AUS and FLUS. Therefore, Bethesda System category III subcategorization into AUS and FLUS can be supported by US features. In Bethesda III nodules, US features may further help in predicting a malignant histology.
Footnotes
Author Disclosure Statement
S.L. disclosed no relevant relationships. J.H.S. disclosed no relevant relationships. S.Y.H. disclosed no relevant relationships. Y.L.O. disclosed no relevant relationships.
