Abstract
Background:
The recently introduced Afirma gene expression classifier (AGEC) provides binary results (benign or suspicious) to guide management of cytologically indeterminate thyroid nodules. The AGEC is intended to reduce unnecessary surgeries for benign nodules, and management algorithms favor surgery for suspicious results. Limited data are available on the performance of this test for Hürthle cell nodules (HCNs). This study hypothesized that a predominance of Hürthle cells leads to an increased rate of suspicious AGEC results with a potential for overtreatment, despite a relatively low risk of malignancy.
Methods:
The pathology databases from three tertiary care facilities were queried from 2010 to 2014 for fine-needle aspirates (FNAs) diagnosed as suspicious for Hürthle cell neoplasm (SHCN) or atypia of undetermined significance/follicular lesion of undetermined significance concerning for Hürthle cell neoplasm (AFHCN). Cytology diagnoses were rendered internally prior to AGEC testing. The patient demographics, FNA diagnosis, AGEC result, surgical procedure, and pathologic outcomes were recorded.
Results:
The cohort consisted of 134 patients with HCNs. Prior to AGEC availability, 62 patients were evaluated: 81% (50/62) of patients had surgery, and 34% (17/50) of the resected index nodules were malignant. After introduction of the AGEC, 72 patients underwent AGEC testing: 65% (47/72) of patients had surgery, and 13% (6/46) of the resected nodules were malignant. Thirty-two percent (23/72) of patients had a benign AGEC result and did not undergo surgery, and 4% (3/72) had surgery despite a benign AGEC result, all with benign final pathology. Sixty-three percent (45/72) of patients had suspicious AGEC results; 96% of these patients (43/45) underwent surgery, and 14% (6/43) of these index nodules were malignant.
Conclusions:
While 32% of tested patients avoided surgery based on a benign AGEC result, 86% of patients with suspicious AGEC findings had unnecessary surgery, reflecting a substantially lower rate of malignancy than was previously reported for all indeterminate nodules. Given the approximate pretest malignancy risk of 25–35% for an FNA diagnosis of SHCN or AFHCN, a suspicious AGEC diagnosis does not increase the probability of malignancy in an HCN, and patients should be counseled accordingly.
Introduction
Based on these test characteristics, AGEC results are being incorporated into surgical decision making for patients with cytologically indeterminate thyroid nodules. The forthcoming American Thyroid Association's updated clinical practice guidelines for thyroid nodule management are expected to include molecular testing as an acceptable adjunct to thyroid FNA. Similarly, the National Comprehensive Cancer Network's 2014 guidelines for thyroid carcinoma integrate molecular testing for optional triage of indeterminate FNAs, with observation being acceptable if the predicted risk of malignancy is equivalent to that of benign cytology (≤5%) (7). In routine practice, patients with indeterminate cytology and benign AGEC results have experienced a dramatic reduction in surgery rates, so that their management now resembles that of cytologically benign cases (8,9).
Following the integration of AGEC testing into clinical practice, further studies have indicated that the AGEC's performance for Hürthle cell (oncocytic) nodules (HCNs) may differ from the published data for other AUS/FLUS and SFN nodules (10). The initial validation data for the AGEC suggested a tendency for benign HCNs to yield suspicious AGEC results (1). These data are not surprising, given previous literature on the clustering of benign and malignant Hürthle cell tumors using various gene expression profiling methods (11–15). Follow-up studies by several groups have also identified a trend of suspicious AGEC results in FNAs of nodules with Hürthle cell features (16–18).
To further investigate and characterize the performance of the AGEC in thyroid nodules with Hürthle cell features, this study evaluated a multi-institutional cohort of 134 patients. This study is the largest series reported to date of HCNs that have undergone AGEC testing with detailed surgical and pathologic follow-up. The study aimed to describe the management and outcome of these HCNs before and after introduction of the AGEC and characterize the test's performance in this tumor subtype.
Materials and Methods
Study design
Following Institutional Review Board approval, the pathology databases of three tertiary care facilities were retrospectively queried from 2010 to 2014 at Massachusetts General Hospital (MGH) and Brigham and Women's Hospital (BWH), and from 2013 to 2014 at Beth Israel Deaconess Medical Center (BIDMC). Thyroid FNAs diagnosed as suspicious for a Hürthle cell neoplasm (SHCN) or AUS/FLUS with a predominance of Hürthle cells (AFHCN) were identified and further investigated.
Patient demographics, risk factors, clinical presentation (symptomatic vs. incidentally discovered nodule), nodule size by ultrasound, FNA diagnosis, AGEC result, surgical procedure, and final pathologic outcome were recorded. Incidentally discovered papillary thyroid microcarcinomas and malignancies identified in a different nodule were recorded but were not counted as a malignancy in the index nodule for prevalence calculations and performance determination.
Based on the implementation of AGEC testing at each institution, a one-year reference population of FNAs performed prior to the availability of the AGEC from BWH (2010) and MGH (2012) was identified. No statistically significant differences in demographic characteristics or nodule size on ultrasound were observed between patients at BWH and MGH. This reference population facilitated determination of the baseline rate of malignancy in this specific setting. This population was compared with patients who were subsequently managed using AGEC testing (2011–2014 at BWH, 2013–2014 at MGH and BIDMC). The frequency of diagnoses in each category of The Bethesda System for Reporting Thyroid Cytopathology (TBSRTC), as well as the percentage of total thyroid FNAs diagnosed as HCNs, were compared across the three institutions in the study and were found to be similar and stable throughout the study period (data not shown).
Patient selection
Patients aged 21 years or older who received clinical care at MGH, BWH, or BIDMC were included in the study when they were first diagnosed with SHCN or AFHCN cytology, regardless of previous FNA results for the same nodule. For patients who had more than one indeterminate nodule with a cytology rich in Hürthle cells, the nodule with larger size, associated symptoms, or higher clinical suspicion was included. Patients were excluded from the study if they were lost to follow-up, lacked critical data to evaluate outcomes, or if AGEC or final pathology results were pending at the time of data acquisition, as detailed in the results below.
Institutional protocol
At each of the participating institutions, the protocol for management of a thyroid nodule with indeterminate cytology is similar and calls for a repeat FNA with an additional two dedicated needle passes collected for AGEC testing. This AGEC specimen is held while the repeat FNA cytology is reviewed. If the second aspirate yields an AUS/FLUS, SFN, or SHCN diagnosis (at MGH and BWH) or any diagnosis other than suspicious for malignancy or malignant (at BIDMC), the AGEC sample is sent to Veracyte for testing. All cytology diagnoses are rendered internally at the institution prior to AGEC testing. The academic institutions are designated “Enabled Centers” and are therefore not required to send aspirates to Veracyte's affiliated Thyroid Cytopathology Partners (Austin, TX) to access the AGEC.
Statistical analysis
For continuous variable analysis, a two-tailed Student's t-test was used for pairwise comparisons, and analysis of variance was used for multiple comparisons. Categorical variables were analyzed with Fisher's exact test, and percentages between groups were assessed by Pearson's chi-square test. Sensitivity and specificity were calculated by standard formulas, and positive predictive value (PPV) and negative predictive value (NPV) were determined using Bayes' formula. Statistical analysis was performed using IBM SPSS Statistics for Windows v21.0 (IBM Corp., Armonk, NY). Two-tailed p-values of <0.05 were considered statistically significant.
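For reference, the standard definitions and the Bayes' formula relationships mentioned above can be written as follows. The notation is introduced here for illustration only and is an assumption, not taken from the study: TP, FP, TN, and FN denote true-positive, false-positive, true-negative, and false-negative counts, Se and Sp denote sensitivity and specificity, and p denotes the pretest probability (prevalence) of malignancy.

```latex
\begin{align}
\mathrm{Se} &= \frac{TP}{TP + FN}, &
\mathrm{Sp} &= \frac{TN}{TN + FP}, \\
\mathrm{PPV} &= \frac{\mathrm{Se}\,p}{\mathrm{Se}\,p + (1-\mathrm{Sp})(1-p)}, &
\mathrm{NPV} &= \frac{\mathrm{Sp}\,(1-p)}{\mathrm{Sp}\,(1-p) + (1-\mathrm{Se})\,p}.
\end{align}
```

Written this way, PPV and NPV depend on the prevalence p as well as on the test's sensitivity and specificity, which is why predictive values can differ between populations even when the test itself is unchanged.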
Results
Before the AGEC became available (during reference periods 2010 at BWH and 2012 at MGH), 66 patients were diagnosed with SHCN or AFHCN. Four (6%) patients were excluded due to being lost to follow-up (n=2), lack of data after the cytology result (n=1), or a diagnosis of suspicious for malignancy in another nodule, which dictated treatment decisions (n=1), leaving a reference population of 62 patients for further evaluation (Table 1). As illustrated in Figure 1A, 19% (n=12) of patients in this cohort were followed clinically, with documented reasons including repeat FNA diagnosed as benign or lacking oncocytic features (n=5), patient preference (n=3), and a separate malignancy in another organ (n=1), with no documented reason in three cases. Eighty-one percent (n=50) of patients underwent surgery, with 34% (n=17) of resections yielding a malignancy in the index nodule (9 Hürthle cell carcinoma, 4 papillary carcinoma, 3 follicular carcinoma, and 1 medullary carcinoma). Of the patients who had surgery, 50% (n=25) underwent hemithyroidectomy (HT), and 50% (n=25) underwent total thyroidectomy (TT) or near-TT.

Figure 1. Study population characterized from three tertiary care facilities from 2010 to 2014.
Table 1 notes: Including 19 patients from BIDMC. Age and ultrasound size are presented as mean±standard deviation (SD). AGEC, Afirma gene expression classifier; BIDMC, Beth Israel Deaconess Medical Center.
After the introduction of the AGEC, 171 patients were diagnosed with SHCN or AFHCN cytology (2011–2014 at BWH and 2013–2014 at MGH). Twenty-one (12%) patients were excluded due to being lost to follow-up (n=6), awaiting surgery at the time of data acquisition (n=5), a lack of a documented treatment plan following cytology or AGEC results (n=4), awaiting repeat FNA for AGEC testing (n=3), a diagnosis of suspicious for malignancy in another nodule, which dictated treatment decisions (n=2), and aborted surgical procedure due to complications (anaphylaxis, n=1). The excluded patients were 81% female (n=17, p=0.76) with a mean age of 62 years, similar to the reference and AGEC populations (p=0.11).
Ninety-seven patients were clinically managed without using AGEC testing, despite molecular testing being available. For these patients, the decision for surgery or clinical follow-up was based on clinical and sonographic features alone. Statistical analysis did not reveal any difference in demographics or nodule size on ultrasound between this group and either the pre-AGEC cohort or the AGEC cohort (Table 1), although detailed reports of ultrasound characteristics that may have played a role in management decisions were not available. Eighty percent (n=78) of these patients had surgery, and 29% (n=23) were diagnosed with a malignant index nodule. Thus, the rates of surgical intervention and malignancy in the index nodule did not change significantly after AGEC implementation (p=0.9 and p=0.6, respectively), suggesting a relatively constant malignancy rate of 31% throughout the study period. Documented reasons for electing clinical follow-up instead of surgery (n=19, 20%) in this cohort included repeat FNA with benign cytology (n=10), a separate malignancy in another organ (n=6), prior neck surgery limiting future surgical approaches (n=1), a physician's decision to treat the cytology as if benign (n=1), and unknown (n=1).
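As a check on the arithmetic, the per-cohort and combined malignancy rates can be recomputed from the counts reported above. This is a minimal sketch: the dictionary keys are labels chosen here for illustration, and pooling the two cohorts is an assumption about how the figure of roughly 31% was obtained.

```python
# Resected index nodules and index-nodule malignancies, as reported in the Results.
cohorts = {
    "pre_agec_reference": {"resected": 50, "malignant": 17},
    "agec_available_not_tested": {"resected": 78, "malignant": 23},
}


def malignancy_rate(resected: int, malignant: int) -> float:
    """Fraction of resected index nodules that were malignant."""
    return malignant / resected


# Per-cohort rates.
rates = {name: malignancy_rate(**counts) for name, counts in cohorts.items()}

# Pooled rate across both cohorts (assumed derivation of the ~31% figure).
total_malignant = sum(c["malignant"] for c in cohorts.values())
total_resected = sum(c["resected"] for c in cohorts.values())
pooled = total_malignant / total_resected

for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")
print(f"pooled: {pooled:.1%}")
```

Pooling weights each cohort by its number of resections, so the combined value sits between the two cohort rates rather than at their simple average.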
From 2011 to 2014, 72 patients with an AFHCN or SHCN diagnosis underwent AGEC testing (Fig. 1B). Sixty-three percent (n=45) of AGEC results were classified as suspicious for malignancy, 36% (n=26) were classified as benign, and 1% (n=1) was non-diagnostic.
Surgery was performed in 96% (43/45) of patients with suspicious AGEC results. Two patients were followed clinically despite suspicious AGEC results, one because of metastatic lung cancer and the other because of refusal of surgical intervention. Of the 43 resected index nodules, 14% (n=6) were malignant (Table 2). Examining the 43 surgical cases by TBSRTC category, 18 patients had a cytologic diagnosis of AFHCN, 11% (n=2) of whom were diagnosed with a malignancy (1 papillary carcinoma, 1 Hürthle cell carcinoma), while 25 were cytologically SHCN, 16% (n=4) of whom were diagnosed with a malignancy (3 Hürthle cell carcinoma, 1 papillary carcinoma). Table 3 summarizes the AGEC's performance in light of the final pathology for each TBSRTC category in our study.
Table 2 and Table 3 notes: Including 19 patients from BIDMC. Percentage of resected nodules. HCNs, Hürthle cell nodules; SHCN, suspicious for Hürthle cell neoplasm; AFHCN, atypia of undetermined significance/follicular lesion of undetermined significance concerning for Hürthle cell neoplasm.
AGEC performance among resected nodules: all HCNs, sensitivity=100%, specificity=7.5%; SHCN, sensitivity=100%, specificity=4.5%; AFHCN, sensitivity=100%, specificity=11%.
Benign AGEC results were associated with a decision for clinical follow-up in 88% of patients (23/26 benign AGEC cases), while 12% (n=3) of patients had surgery, all with benign final pathology results. The documented reasons for electing surgery in the setting of benign AGEC results included nodule growth during follow-up (n=1), the presence of two conflicting AGEC and FNA results for the same nodule (n=1), and a symptomatic nodule (n=1).
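The overall sensitivity and specificity can be reproduced from the resected cases described above. The following is a minimal sketch using only the reported counts; the variable names are chosen here for illustration, and nodules that were not resected (including the 23 AGEC-benign patients who were followed clinically) are necessarily left out because no final pathology is available for them.

```python
# Resected nodules with suspicious AGEC results: 43, of which 6 were malignant.
true_positive = 6
false_positive = 43 - 6

# Resected nodules with benign AGEC results: 3, all benign on final pathology.
true_negative = 3
false_negative = 0

sensitivity = true_positive / (true_positive + false_negative)
specificity = true_negative / (true_negative + false_positive)

print(f"sensitivity = {sensitivity:.0%}")
print(f"specificity = {specificity:.1%}")
```

Because only three AGEC-benign nodules were resected, the specificity estimate rests on a small number of true negatives and should be read with that limitation in mind.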
As the predictive value of a test in clinical practice depends heavily on the prevalence of the abnormality in the population being tested (19), a graph was constructed showing the relationship between PPV and NPV over a range of possible pretest probabilities. Sensitivity and specificity of the AGEC in this specific cytologic setting were extracted using data gathered from four previous investigations and the current study (Table 4). A sensitivity of 95% and a specificity of 10% were assumed for the AGEC, and using Bayes' formula, a PPV was estimated across a range of pretest probabilities (Fig. 2). The prevalence of malignancy in the subset of AGEC patients, assuming that the two AGEC-suspicious patients who did not undergo surgery had malignant nodules, was 18%, corresponding to a PPV of 19%. The minimal possible prevalence based on only known resected malignancies was 14%, corresponding to a PPV of 15%. In order to reflect the true prevalence of malignancy for patients undergoing AGEC testing, a 22% prevalence rate was calculated using the combined average of the present study and the other four published studies, which corresponds to a PPV of 23% (Fig. 2).

Figure 2. The relationship between positive predictive value (PPV) and negative predictive value (NPV) over a range of possible pretest probabilities. A sensitivity and specificity of 95% and 10%, respectively, were used. The dotted line represents 18% prevalence as reported in our current study. The dashed line represents a 22% prevalence of malignancy, averaged from data gathered from four previous studies and the current study.
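The PPV values quoted in the text follow directly from Bayes' formula with the assumed sensitivity of 95% and specificity of 10%. The sketch below is illustrative rather than the authors' SPSS workflow; the function names are chosen here, and the 30% prevalence is the pretest probability considered in the Discussion.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from Bayes' formula."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def npv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Negative predictive value from Bayes' formula."""
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


SENSITIVITY = 0.95  # assumed in the text
SPECIFICITY = 0.10  # assumed in the text

# Pretest probabilities discussed in the text.
for prevalence in (0.14, 0.18, 0.22, 0.30):
    p = ppv(SENSITIVITY, SPECIFICITY, prevalence)
    n = npv(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {p:.0%}, NPV {n:.0%}")
```

With a specificity this low, the PPV stays within about one percentage point of the pretest probability at every prevalence examined, which is the numerical basis for the statement that a suspicious result adds little information in this setting.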
Prior to the introduction of the AGEC, surgical cases were evenly divided between HT and TT or near-TT. Similarly, for AGEC suspicious cases, 53% (n=23) of patients underwent HT, and 47% (n=20) underwent TT. These two groups were further analyzed for index nodule size, the presence of contralateral nodules >1 cm, multinodular goiter, and the presence of clinical Hashimoto thyroiditis or hypothyroidism (Table 5). Patients in the TT group were more likely to have clinical Hashimoto thyroiditis or hypothyroidism (p=0.04), and tended to have a slightly larger nodule size, although this trend was not statistically significant (p=0.53). No correlation was found between the cytologic diagnosis and the type of surgery that was elected. In addition, papillary thyroid microcarcinoma (<1 cm) was incidentally discovered at the same rate in the TT and HT groups.
Age and nodule size are presented as mean±SD.
Discussion
To facilitate surgical decision making, it is important for clinicians to have accurate data regarding the AGEC's performance for HCNs. The baseline malignancy rate in resected nodules of 34% prior to the introduction of the AGEC is consistent with the prevalence of malignancy in the validation study (25%; p=0.62), facilitating comparisons of test performance and PPV (1).
Thirty-two percent of patients in the present study avoided surgery based on a benign AGEC result. While this rate is lower than the previously reported 41% for all indeterminate nodules in the original AGEC study (1), it still supports the alteration of HCN management based on the AGEC, deferring surgery in a significant number of patients. These results are consistent with prior reports, which have demonstrated an AGEC benign rate ranging from 28% to 53% (8,16–18).
The current findings confirm a trend identified by earlier studies (Table 4). The original validation set contained 21 benign Hürthle cell adenomas, of which 81% yielded suspicious AGEC results (1). An independent follow-up study reported that 12/13 SHCN FNAs had suspicious AGEC results, but only 17% (2/12) of AGEC-suspicious nodules were malignant on resection (16). Similarly, Lastra et al. found that 68% of SHCN FNAs yielded suspicious AGEC results, but only 15% (2/13) harbored malignancy (17). Harrell et al. noted that 53% of suspicious AGEC cases had prominent Hürthle cell features, and 90% of AFHCN and SHCN FNAs yielded suspicious AGEC results (18). Together, these data demonstrate an increased rate of suspicious AGEC results in HCNs, despite a relatively low risk of malignancy.
While the low malignancy rate in the present AGEC cohort compared to the baseline malignancy rate raises concern that this may be a selected population with low clinical suspicion, the consistency between the current study and previous reports confirms the reproducibility of this finding. Since the majority of AGEC benign patients did not undergo surgery, this study cannot entirely exclude the possibility that some of these thyroid nodules harbored malignancies, accounting for the decrease in malignancy rate in AGEC suspicious cases. However, the present study reveals that in a sizable group of patients, clinicians are proceeding directly to surgery, even when AGEC testing is available, and that these patients have a similar rate of malignancy to individuals treated before the availability of the AGEC but a higher rate of malignancy on final pathology compared with the AGEC suspicious group (29% vs. 18%).
While the data show no statistically significant differences between the groups in the analyzed clinical and radiographic parameters, unfortunately due to incompletely recorded ultrasound data, it was not possible to study ultrasound characteristics such as central vascularity or irregular borders known to be associated with a higher malignancy rate in indeterminate nodules (20,21). It is plausible that clinicians opting for their patients to go directly to surgery may have had access to sonographic imaging/reports showing ultrasound findings that influenced this decision. Thus, while those patients who underwent AGEC testing may not be fully representative of the entire population of patients with HCNs, the substantial proportion of suspicious AGEC results despite a decreased malignancy rate suggests altered test performance in HCNs relative to other indeterminate nodules.
Compared with the pre-AGEC cohort in which 66% of resected HCNs were benign, 86% of resected nodules were benign in the AGEC suspicious cohort. If the two patients with suspicious AGEC results who did not undergo surgery are assumed to harbor malignancies, the maximum prevalence of malignancy would be 18%. These results demonstrate that applying clinical judgment before using the AGEC with a pretest probability of 18% will yield a PPV of 19%, suggesting decreased specificity of the AGEC when applied to HCNs. Moreover, if the AGEC had been implemented on all AFHCN/SHCN patients, a suspicious AGEC result would not increase the likelihood of malignancy for an index nodule (30% pretest probability, corresponding to 31% PPV).
The reason for the AGEC's tendency to classify oncocytic nodules as suspicious is unclear. Given the rarity of HCNs, the AGEC may not be sufficiently trained to differentiate benign from malignant samples. Although not confirmed, it has been suggested that accumulation of mitochondria in oncocytic thyroid cells is indirectly responsible for a distinct miRNA profile, potentially increasing the AGEC's false-positive rate (22). A review of the proprietary Veracyte algorithm, described briefly in the validation study's supplementary material, reveals an alternate possibility (1). The algorithm first analyzes data from six rare subtype classifiers comprising 25 genes designed to identify melanoma, renal cell carcinoma, breast carcinoma, parathyroid tissue, medullary thyroid carcinoma, and Hürthle cell adenoma/carcinoma. Each of these independent classifiers is analyzed sequentially, and a suspicious result from any individual classifier halts progression through the remaining analysis, including the 142-gene benign versus suspicious classifier. Thus, it is possible that some AFHCN and SHCN FNAs are identified as suspicious by the initial rare subtype classifier's Hürthle cell cassette and are not analyzed by the main 142-gene classifier. However, the proprietary nature of the Veracyte algorithm precludes a more rigorous analysis of the workflow to confirm this hypothesis.
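The control flow hypothesized above can be illustrated with a toy cascade. This sketch is purely illustrative and is not the Veracyte implementation, which is proprietary: the function `run_cascade`, the score names, and the 0.5 cut-offs are all invented here to show how an early cassette could return a suspicious call before the main classifier is ever consulted.

```python
from typing import Callable, Dict, List, Tuple

Sample = Dict[str, float]                        # hypothetical score name -> value
Cassette = Tuple[str, Callable[[Sample], bool]]  # (label, True if flagged suspicious)


def run_cascade(sample: Sample,
                cassettes: List[Cassette],
                main_classifier: Callable[[Sample], bool]) -> Tuple[str, str]:
    """Return (result, stage that produced the result)."""
    for label, is_suspicious in cassettes:
        if is_suspicious(sample):
            # Early exit: later cassettes and the main classifier never run.
            return "suspicious", label
    result = "suspicious" if main_classifier(sample) else "benign"
    return result, "main classifier"


# Toy stand-ins; score names and thresholds are invented for illustration.
main_calls: List[Sample] = []


def toy_main(sample: Sample) -> bool:
    main_calls.append(sample)  # record that the main classifier was reached
    return sample.get("main_score", 0.0) > 0.5


toy_cassettes: List[Cassette] = [
    ("medullary cassette", lambda s: s.get("medullary_score", 0.0) > 0.5),
    ("Hurthle cell cassette", lambda s: s.get("hurthle_score", 0.0) > 0.5),
]

oncocytic = {"hurthle_score": 0.9, "main_score": 0.1}
non_oncocytic = {"hurthle_score": 0.1, "main_score": 0.1}

print(run_cascade(oncocytic, toy_cassettes, toy_main))
print(run_cascade(non_oncocytic, toy_cassettes, toy_main))
print("main classifier calls:", len(main_calls))
```

In this toy setup the oncocytic sample is called suspicious by the cassette even though its main-classifier score would have been benign, which mirrors the scenario the hypothesis describes without confirming that the actual algorithm behaves this way.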
These observations regarding the unique nature of HCNs are supported by recent advances in the understanding of the molecular biology of Hürthle cell tumors. Unlike papillary thyroid carcinomas, which often have BRAF activating mutations or RET/PTC gene rearrangements, these molecular alterations are either not associated with Hürthle cell tumors (23,24) or are only seen in a minority of cases (25). Similarly, while RAS mutations and PAX8/PPARγ gene rearrangements most often characterize follicular carcinomas, a minority of Hürthle cell tumors harbor such genetic changes (24–27). Instead, widely invasive Hürthle cell carcinomas demonstrate changes in the β-catenin and mTOR pathways with large regions of amplification in chromosomes 5, 7, 12, and 17 (24). Other studies have identified further targets that appear unique to Hürthle cell carcinoma, including PVALB (28). Although these results await confirmation and further study, they suggest that Hürthle cell neoplasms constitute a distinct molecular class of tumors, where genetic differentiation of benign from malignant appears not to be well described at this time.
As the molecular profile of thyroid neoplasms continues to be defined, the possibility of incorporating other gene panels in addition to the AGEC may improve specificity and facilitate clinical management of HCNs. Kloos et al. demonstrated that the addition of a single-gene test with low sensitivity failed to contribute diagnostic information due to inability to exclude carcinoma in AGEC-suspicious cases (29). Thus, multi-gene panels appear to be needed to increase specificity. Clinical validation data for a competing multi-gene panel using next-generation sequencing have recently been reported (30). It remains to be seen how this panel's performance will compare to the AGEC in multisite trials and in distinct tumor subtypes such as HCNs.
Similar to other post-validation studies, the present study was limited in determining the number of true-negative AGEC cases due to lack of surgical resection of AGEC-benign nodules. Additionally, the retrospective study design limits the ability to capture data fully regarding the decision-making process as it unfolds between the patient and physician. For example, Aragon Han et al. found that, in general, molecular testing did not contribute meaningfully to surgical decision making in a majority of thyroid FNAs, since the combination of clinical and sonographic features was sufficient to determine the appropriate treatment when reviewed retrospectively in light of the final pathology (31). Future prospective studies that examine factors driving a decision for surgery versus clinical follow-up are needed to determine when additional molecular information can optimally contribute to the decision-making process and improve patient outcomes. In this regard, when no other clinical parameters leading to an indication for TT are present, and as long as the false-positive rate of AGEC for HCNs remains high, it appears reasonable to pursue HT for AGEC-suspicious HCN patients.
As data regarding the most clinically relevant approach to using the AGEC accumulate, evaluation of the test's performance in biologically distinct subsets of thyroid lesions such as HCNs is essential to maximize the benefit of the AGEC. This study demonstrates an increased rate of suspicious AGEC results in HCNs compared to other indeterminate thyroid nodules, with a correspondingly low PPV. Thus, the findings confirm previously published reports indicating that a suspicious AGEC result does not warrant a more aggressive surgical approach. The decreased number of benign AGEC results limits the utility of the assay to spare many patients from surgery, though a significant number of patients may still avoid surgery when the AGEC is utilized as one tool in the evaluation of HCNs. This study reports the largest series of HCNs subjected to AGEC testing with surgical and pathologic follow-up, and it confirms that while a third of patients may be spared surgery, those with HCNs and suspicious AGEC findings need to be counseled accordingly, given the low malignancy rate on final pathology.
Acknowledgments
Eran Brauner is sponsored by a grant from the Clair and Emanuel G. Rosenblatt Fund and American Healthcare Professionals and Friends for Medicine in Israel (APF). This work was presented in abstract form at the United States and Canadian Academy of Pathology 2015 Annual Meeting in Boston, MA.
Author Disclosure Statement
James V. Hennessey served as a one-time consultant to Veracyte in 2014. No competing financial interests exist for the other authors.
