Afirma Gene Sequencing Classifier Compared with Gene Expression Classifier in Indeterminate Thyroid Nodules

Abstract

Background:

The Afirma Gene Expression Classifier (GEC) has been used to further characterize cytologically indeterminate (cyto-I) thyroid nodules into either benign or suspicious categories. However, its relatively low positive predictive value (PPV) limited its use as a classifier for patients with suspicious results. The Afirma Gene Sequencing Classifier (GSC) was developed to improve PPV while maintaining a high negative predictive value (NPV), yet real-world assessment of its performance is lacking.

Methods:

We analyzed all patients who had cyto-I nodules and molecular testing with either GEC or GSC between 2011 and 2018 at a single academic medical center. Clinical information was obtained for 343 GEC-tested nodules and 164 GSC-tested nodules.

Results:

The GSC had a statistically significant higher benign call rate (76.2% vs. 48.1%, p < 0.001), PPV (60.0% vs. 33.3%, p = 0.01), and specificity (94.3% vs. 61.4%, p < 0.001) than the GEC. Improvement was statistically significant in both Bethesda III and Bethesda IV nodules. In particular, the benign call rate of GSC was significantly higher in nodules with Hürthle cell changes (88.8% vs. 25.7%, p < 0.01). The rate of surgical intervention in the indeterminate nodule cohort has decreased by 66.4% since switching to the GSC; 52.5% of indeterminate nodules went to surgery while using the GEC compared with 17.6% with the GSC (p < 0.001). This reduction was statistically significant in nodules with Bethesda III diagnoses, demonstrating a 70.9% decrease (GEC 51.3% vs. GSC 14.9%, p < 0.001), and in nodules with Bethesda IV cytology, a 39.2% decrease was noted (GEC 54.8% vs. GSC 33.3%, p = 0.003).

Conclusions:

Data from a single academic tertiary center show an improved specificity and PPV while maintaining high sensitivity and NPV for GSC compared with GEC. A statistically significant increase in benign call rates was observed in GSC compared with GEC, likely indicating fewer false positive results. After implementation of GSC, surgical interventions have been reduced by 68%.

Introduction

Ultrasound-guided fine needle aspiration (FNA) biopsy has remained the diagnostic tool of choice for evaluation of thyroid nodules. Although the introduction of FNA has reduced diagnostic thyroidectomy (1), challenges remain since 15–30% of nodules are categorized as cytologically “indeterminate” (cyto-I) based on widely implemented Bethesda criteria (2). Indeterminate cytopathology includes two categories: Bethesda category III, comprising atypia of undetermined significance (AUS) or follicular lesion of undetermined significance (FLUS), and Bethesda category IV, comprising suspicious for follicular neoplasm (SFN) or Hürthle cell neoplasm (HCN). Approximately 30% of these are ultimately malignant on surgical pathology, although ranges are broad (3–5), which poses a diagnostic dilemma, often resulting in repeat FNA and/or diagnostic surgery (6). Better diagnostic strategies in these circumstances are needed since surgical intervention carries a risk of hypothyroidism, temporary or permanent primary hypoparathyroidism, voice alteration with or without recurrent laryngeal nerve damage, cosmetic issues, bleeding, infection, and rarely death (7). To address this diagnostic challenge, several molecular tests have been developed to better inform clinical management of patients with Bethesda III/IV thyroid nodules, including the Afirma Gene Expression Classifier (GEC; Veracyte, Inc., South San Francisco, CA) that used a microarray measurement of messenger RNA expression of 167 genes (3). This test was developed with a strategy to reduce unnecessary thyroid surgeries for benign cyto-I nodules with a goal to achieve a negative predictive value (NPV) similar to a benign cytological result. Indeed, this test achieved a high NPV of 94–95% among Bethesda III and IV nodules at a cancer prevalence of 24–25% (3,8). Based on these data, GEC became commercially available in 2011, with subsequent reports of 10–70% fewer surgical interventions with its implementation (9–13).
However, for patients with cyto-I nodules with suspicious GEC results, its relatively low positive predictive value (PPV) of ∼38% and specificity of 50% (3) limited its role as a positive predictor of malignancy. The Gene Sequencing Classifier (GSC) was then developed to improve specificity and PPV while maintaining a high NPV. The GSC incorporates nuclear and mitochondrial RNA transcriptome gene expression, RNA sequencing, and genomic copy number analysis (14). A blinded GSC clinical validation performance was previously reported using samples reported for the GEC classifier (14). The goal of this study was to determine the “real world” clinical performance of the GSC versus prior clinical experience with the GEC at our center.

Methods

Tissue samples

This is a retrospective analysis of patients who had one or more cyto-I thyroid nodules (Bethesda category III/IV) on thyroid FNA biopsy at The Ohio State University Medical Center. The decision of whether to proceed with FNA for each thyroid nodule was made according to the clinical judgment of the treating physicians and patient preference. The FNA samples were obtained with a 25- or 27-gauge needle under ultrasound guidance. Two dedicated needle insertions were obtained for molecular testing at the time of the initial FNA. The specimens were stored in a −75°C freezer. Samples with cyto-I cytology results were eligible for additional testing with GEC (February 2, 2011–July 11, 2017) or GSC (July 11, 2017–December 19, 2018). All samples were shipped at a temperature of 2–25°C to Veracyte, Inc. for testing. Thus, all cytology readings rendered by our team of academic cytopathology specialists in head and neck cancer occurred independent of the subsequent molecular test. The decision whether to proceed with surgery (and to what extent) or to monitor was made by the treating physician and the patient. In the case of surgical intervention, the histologic diagnosis was rendered by academic pathologists specializing in head and neck pathology. The data collection was approved by our institutional review board (IRB No.: 2017H0464).

Statistical analysis

Patient and nodule characteristics were summarized by GEC/GSC test result (benign or suspicious) separately for nodules undergoing GEC and GSC testing. Continuous variables were compared between benign/suspicious molecular result groups by t-test or Wilcoxon rank-sum test and categorical variables were compared by chi-square tests. For both GEC and GSC, the diagnostic classification measures of sensitivity, specificity, PPV, and NPV were calculated both for the overall sample and separately for Bethesda III and Bethesda IV nodules. For statistical purposes, “true” status for each nodule was determined under two scenarios, with classification measures calculated for each: (1) “true” status determined by surgical pathology, with nodules that had a benign GEC/GSC test and no surgical pathology available considered truly benign by the expert treating physician based on ultrasound/clinical characteristics, and (2) “true” status determined by surgical pathology only. Noninvasive Follicular Thyroid Neoplasm with Papillary-Like Nuclear Features (NIFTP) was classified as “malignant” due to the current recommendations for management with hemithyroidectomy (15–17). For each diagnostic classification measure, exact binomial 95% confidence intervals (CIs) were calculated using the Clopper–Pearson method. Benign call rate, PPV, specificity, and surgical intervention rates were compared between GEC and GSC by Fisher's exact tests. To assess for an association between a variable with two categories and an ordinal variable with k categories, the Cochran–Armitage Trend Test was used. All analyses were performed using SAS software version 9.4 (SAS Institute, Cary, NC).
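As an illustration of the Clopper–Pearson method named above, the following is a minimal Python sketch that finds the exact binomial limits by bisection on the binomial distribution function. It is not the SAS code used for the study; the function names (binom_cdf, clopper_pearson) are hypothetical, and the example counts are taken from Table 2.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) CI for a binomial proportion x/n."""

    def solve(f, target):
        # f(p) decreases as p grows on [0, 1]; bisect until f(p) == target.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid  # p is still too small
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower limit solves P(X >= x | p) = alpha / 2.
    lower = 0.0 if x == 0 else solve(lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    # Upper limit solves P(X <= x | p) = alpha / 2.
    upper = 1.0 if x == n else solve(lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper


# GSC sensitivity in Table 2: 15 true positives, 0 false negatives.
gsc_sens_ci = clopper_pearson(15, 15)
# GEC specificity in Table 2: 162 true negatives out of 264 histologically/clinically benign nodules.
gec_spec_ci = clopper_pearson(162, 264)
```

With these inputs the limits should round to the intervals reported in Table 2 for GSC sensitivity and GEC specificity.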

Results

Nodules with GEC testing

The GEC was performed on 428 thyroid nodule samples from February 2011 to July 2017. Of these, 343 nodules from 317 unique patients were included. Eighty-five nodules were excluded: 38 due to missing patient information, 24 due to nondiagnostic GEC results, 17 due to nondiagnostic cytology results, 3 due to benign cytology results, and 3 due to molecular results indicating parathyroid tissue (Fig. 1). Of the 343 included cases, 228 (66.5%) were categorized on cytology as Bethesda III and 115 (33.5%) as Bethesda IV. For the Bethesda category III, 95 cases were read as AUS, 130 cases were read as FLUS, and 3 cases were reported as “atypical follicular cells.” Cytological diagnoses of Bethesda IV were SFN in 86 cases and suspicious for HCN in 29 cases.

FIG. 1.

Exclusions and samples of study patients in GEC cohort. FNA, fine needle aspiration; GEC, Gene Expression Classifier; PTMC, papillary thyroid microcarcinoma.

GEC was suspicious in 178 cases (51.9%). Patient and nodule characteristics by GEC status are presented in Table 1. Mean size of the nodule was 2.31 cm (range 0.5–6.0 cm) in suspicious cases and 2.15 cm (0.5–7.0 cm) in benign cases (p = 0.22). Thirty-six percent (36.5%) of suspicious nodules were categorized as Bethesda IV compared with 30.3% of benign nodules (p = 0.25).

Table 1.

Baseline Characteristics of Nodules with Gene Expression Classifier Testing (n = 343 Nodules)

	GEC benign nodules (n = 165)	GEC suspicious nodules (n = 178)
Sex, female, n (%)	129 (78.2)	137 (77.0)
Age, mean (SD)	53.0 (14.3)	51.4 (14.9)
Nodule size (cm), mean (SD)	2.15 (0.99)	2.31 (1.18)
Bethesda III nodules, n (%)	115 (69.7)	113 (63.5)
AUS	48 (29.1)	47 (26.4)
FLUS	66 (40.0)	64 (36.0)
Atypical cell	1 (0.6)	2 (1.1)
Bethesda IV nodules, n (%)	50 (30.3)	65 (36.5)
SFN	45 (27.3)	41 (23.0)
HCN	5 (3.0)	24 (13.5)

AUS, atypia of undetermined significance; FLUS, follicular lesion of undetermined significance; GEC, Gene Expression Classifier; HCN, Hürthle cell neoplasm; SD, standard deviation; SFN, suspicious for follicular neoplasm.

Diagnostic accuracy of GEC

The benign call rate [95% CI] for GEC was 48.1% (43–54%). Of the 343 GEC cases, 180 (52.5%) had a surgical intervention. Assuming all nonsurgical GEC benign cases (n = 138) were benign based on pathology and clinical criteria, sensitivity, specificity, PPV, and NPV were 94.4%, 61.4%, 33.3%, and 98.2%, respectively, with a disease prevalence of 15.7% (see Table 2 for CIs). Specificity and PPV were 61.4% and 26.8% in Bethesda III nodules and 61.3% and 44.6% in Bethesda IV nodules, respectively (Table 2). Specificity did not differ significantly between Bethesda III and Bethesda IV nodules, but PPV was significantly higher in Bethesda IV nodules (26.8% vs. 44.6%, p = 0.03). Restricted to surgical cases, sensitivity, specificity, PPV, and NPV were 94.4%, 19.1%, 33.3%, and 88.9%, respectively (Table 3). Specificity and PPV were 20.2% and 26.8% in Bethesda III nodules and 16.2% and 44.6% in Bethesda IV nodules, respectively (Table 3).

Table 2.

Classification Measures Including Operated and Unoperated Benign Gene Expression Classifier/Gene Sequencing Classifier Nodules

Test	Sensitivity	Specificity	NPV	PPV	n	TP	FP	TN	FN
All nodules
GEC	94% [85–99%]	61% [55–67%]^*	98% [95–100%]	33% [26–41%]^*	318	51	102	162	3
GSC	100% [78–100%]	93% [87–96%]^*	100% [97–100%]	60% [39–79%]^*	150	15	10	125	0
Bethesda III nodules
GEC	93% [76–99%]	61% [54–68%]^*	98% [94–100%]	27% [18–37%]^*	212	26	71	113	2
GSC	100% [63–100%]	94% [88–98%]^*	100% [96–100%]	57% [29–82%]^*	114	8	6	100	0
Bethesda IV nodules
GEC	96% [80–100%]	61% [50–72%]^*	98% [89–100%]	45% [31–59%]	106	25	31	49	1
GSC	100% [59–100%]	86% [68–96%]^*	100% [86–100%]	64% [31–89%]	36	7	4	25	0

Unoperated Afirma suspicious nodules are excluded. Values in square brackets are 95% CIs.

*p < 0.05.

CIs, confidence intervals; FN, false negative; FP, false positive; NPV, negative predictive value; PPV, positive predictive value; TN, true negative; TP, true positive.
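The point estimates in Table 2 follow directly from the TP/FP/TN/FN counts in each row. The short Python sketch below recomputes them for the "All nodules" rows; it is an illustrative check only, and the function name classification_measures is hypothetical.

```python
def classification_measures(tp, fp, tn, fn):
    """Return sensitivity, specificity, PPV, and NPV from 2x2 counts.

    A measure is None when its denominator is zero (reported as n/a in the tables).
    """

    def ratio(num, den):
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# "All nodules" rows of Table 2 (unoperated benign-test nodules counted as true negatives).
gec = classification_measures(tp=51, fp=102, tn=162, fn=3)
gsc = classification_measures(tp=15, fp=10, tn=125, fn=0)

for name, m in (("GEC", gec), ("GSC", gsc)):
    print(name, {k: f"{100 * v:.1f}%" for k, v in m.items()})
```

The same function applies to the Bethesda III and Bethesda IV rows by substituting the corresponding counts.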

Table 3.

Classification Measures Excluding Unoperated Benign Gene Expression Classifier/Gene Sequencing Classifier Nodules

Test	Sensitivity	Specificity	NPV	PPV	n	TP	FP	TN	FN
All nodules
GEC	94% [85–99%]	19% [13–27%]	89% [71–98%]	33% [26–41%]	180	51	102	24	3
GSC	100% [78–100%]	17% [2–48%]	100% [16–100%]	60% [39–79%]	27	15	10	4	0
Bethesda III nodules
GEC	93% [76–99%]	20% [12–30%]	90% [68–99%]	27% [18–37%]	117	26	71	18	2
GSC	100% [63–100%]	25% [3–65%]	100% [16–100%]	57% [29–82%]	17	8	6	3	0
Bethesda IV nodules
GEC	96% [80–100%]	16% [6–32%]	86% [42–100%]	45% [31–59%]	63	25	31	6	1
GSC	100% [59–100%]	0% [0–60%]	n/a, no cases	64% [31–89%]	12	7	4	1	0

Unoperated Afirma suspicious nodules are excluded. Values in square brackets are 95% CIs.

*p < 0.05.

n/a, not available.

Table 4.

Baseline Characteristics of Nodules with Gene Sequencing Classifier Testing (n = 164 Nodules)

	GSC benign nodules (n = 125)	GSC suspicious nodules (n = 39)
Sex, female, n (%)	94 (75.2)	29 (74.3)
Age, mean (SD)	55.9 (13.6)	51.3 (15.2)
Nodule size (cm), mean (SD)	2.31 (1.14)	2.34 (1.52)
Bethesda III nodules, n (%)	100 (80.0)	24 (61.5)
AUS	68 (54.4)	16 (41.0)
FLUS	30 (24.0)	8 (20.5)
Atypical cell	2 (1.6)	0 (0)
Bethesda IV nodules, n (%)	25 (20.0)	15 (38.5)
SFN	23 (18.4)	14 (35.9)
HCN	2 (1.6)	1 (2.6)

GSC, Gene Sequencing Classifier.

Surgical pathology among GEC-tested nodules

Of the GEC-suspicious cases, 153 (86.0%) had a surgical intervention. Histopathology revealed cancer or NIFTP in a total of 51 cases: papillary thyroid carcinoma (PTC) in 19 cases (6 cases <1 cm), follicular-variant papillary thyroid carcinoma (FVPTC) in 5 cases, follicular thyroid carcinoma (FTC) in 11 cases, Hürthle cell carcinoma (HCC) in 10 cases, and NIFTP in 6 cases. The remaining 102 cases were benign on histopathology: Hürthle cell adenoma (HCA) in 8 cases and benign nodule in 94 cases. The benign call rate [95% CI] was 50.5% (40–61%) in AUS nodules and 50.8% (42–60%) in FLUS nodules (p > 0.99). Of the 51 GEC-suspicious/malignant histology cases, 26 had a preoperative cytological diagnosis of Bethesda III (9 AUS, 17 FLUS) and 25 had a diagnosis of Bethesda IV (16 SFN, 9 HCN). Of the 102 GEC-suspicious/benign histology cases, 71 had a preoperative cytological diagnosis of Bethesda III (28 AUS, 41 FLUS, 2 atypical follicular cell) and 31 had a diagnosis of Bethesda IV (22 SFN, 9 HCN). Fifty-four patients (35.3%) underwent hemithyroidectomy, while 99 cases (64.7%) underwent total thyroidectomy (TT).

Twenty-seven GEC benign nodules had a surgical intervention and 24 (88.9%) of these nodules were histologically benign. There was a preoperative cytological diagnosis of Bethesda III in 18 cases (AUS 10, FLUS 8) and Bethesda IV in 6 cases (3 HCN, 3 SFN). Mean duration from initial FNA to surgery was 8.5 months (0–27 months). The reasons for surgery in GEC benign nodules with benign pathology were interval enlargement of the nodule (n = 7), compressive symptoms (n = 6), cancer in the contralateral lobe (n = 4), patient concern (n = 3), suspicious GEC result (final pathology benign) in the contralateral lobe (n = 2), repeat suspicious GEC result (n = 1), and physician recommendation due to FNA demonstrating Hürthle cell (HC) changes (n = 1). Four cases (16.7%) underwent hemithyroidectomy and 20 cases (83.3%) underwent TT. In all three false negative cases, the final surgical pathology showed papillary thyroid microcarcinoma in the originally aspirated nodule; however, in all cases the largest ultrasound measurement was ≥1 cm. Two had cytological diagnoses of Bethesda III (both FLUS) and one had a diagnosis of Bethesda IV (SFN). Case 1 had a repeat suspicious GEC result on the same nodule 1 year after the initial biopsy; case 2 underwent surgery within 4 months because the contralateral lobe contained PTC; and case 3 presented with a level IV lymph node metastasis 3 years after the FNA. All cases underwent TT. Case 3 underwent TT with lateral neck dissection; radioactive iodine therapy was held due to patient preference. None has had structural recurrence of disease to date with a median follow-up of 7.5 months.

Nodules with GSC testing

The GSC was performed on 166 thyroid nodule samples from July 2017 to December 2018. One case was excluded due to triggering the parathyroid classifier and one case was excluded due to nondiagnostic GSC results due to insufficient RNA quantity (Fig. 2). Thus, the analyzed cohort consisted of 164 nodules from 153 unique patients (11 patients had 2 nodules included). Of these 164 nodules, 124 (75.6%) were categorized as Bethesda III and 40 (24.4%) as Bethesda IV on cytology. For the Bethesda III nodules, 84 were AUS, 38 were FLUS, and 2 were reported as “atypical cells.” The cytological diagnoses of 40 Bethesda IV nodules were SFN in 37 cases and suspicious for HCN in 3 cases.

FIG. 2.

Exclusions and samples of study patients in GSC cohort. GSC, Gene Sequencing Classifier.

The GSC was suspicious in 39 of 164 nodules (23.8%). Patient and nodule characteristics by GSC status are presented in Table 4. Mean size of the nodule was 2.3 cm (range 0.8–8.9 cm) in suspicious cases and 2.3 cm (range 0.65–8.9 cm) in GSC benign cases (p = 0.90). Thirty-eight percent (38.5%) of GSC-suspicious nodules were categorized as Bethesda IV compared with 20.0% of benign nodules (p = 0.02).

Diagnostic accuracy of GSC

The benign call rate [95% CI] for GSC was 76.2% (69–83%). Of the 164 GSC nodules, 29 (17.6%) underwent thyroid surgery. Assuming all nonsurgical GSC benign cases (n = 123) were truly benign, the sensitivity, specificity, PPV, and NPV were 100%, 92.6%, 60.0%, and 100%, respectively, with a disease prevalence of 9.2% (Table 2). Specificity and PPV were 94.3% and 57.1% in Bethesda category III nodules and 86.2% and 63.6% in Bethesda category IV nodules, respectively (Table 2). The benign call rate [95% CI] was 81.0% (71–89%) in AUS nodules and 78.9% (63–90%) in FLUS nodules (p = 0.81). Restricted to cases with surgical pathology, sensitivity, specificity, PPV, and NPV among GSC-tested nodules were 100%, 16.7%, 60.0%, and 100%, respectively (Table 3). Specificity and PPV were 25.0% and 57.1% in Bethesda III nodules and 0% and 63.6% in Bethesda IV nodules, respectively. Predicted performance of GSC based on different prevalence is shown in Figure 3.

FIG. 3.

Predicted performance of GSC based on disease prevalence. Predicted performance of NPV and PPV (solid lines) with 95% CIs (dotted lines) based on sensitivity and specificity. CIs, confidence intervals; NPV, negative predictive value; PPV, positive predictive value.
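Curves of this kind are conventionally obtained from Bayes' rule with sensitivity and specificity held fixed while prevalence varies. The Python sketch below shows that calculation for the point estimates only; it is an assumption that Figure 3 was generated this way, the function name predictive_values is hypothetical, and the confidence bands in the figure are not reproduced.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """PPV and NPV at a given disease prevalence (0 < prevalence < 1)."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# GSC point estimates from the "All nodules" row of Table 2.
GSC_SENSITIVITY = 15 / 15
GSC_SPECIFICITY = 125 / 135

for prevalence in (0.05, 0.10, 0.15, 0.25, 0.35):
    ppv, npv = predictive_values(GSC_SENSITIVITY, GSC_SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

At the prevalence implied by the Table 2 sample (15 of 150 nodules), this reproduces the observed PPV of 60%; because the observed sensitivity is 100%, the predicted NPV stays at 100% for every prevalence.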

Surgical pathology among GSC nodules

Of the 39 GSC-suspicious cases, 25 (64.1%) underwent surgery by the end of the data collection. Histopathology revealed cancer or NIFTP in 15 cases, including PTC (n = 11) (classic variant 4 cases, PTC with mostly follicular pattern not meeting criteria for FVPTC 4 cases, oncocytic variant 1 case), FVPTC (n = 1), FTC (n = 2), and NIFTP (n = 1). One case was positive for BRAF^V600E on GSC testing; the final pathology showed classic variant PTC (pT1aNxMx based on AJCC staging), negative for lymphovascular invasion and extrathyroidal extension. Of the 10 GSC-suspicious cases with benign surgical pathology, nine were benign adenomatous nodules and one was HCA (cytology SFN without Hürthle change). Of the 15 GSC-suspicious/malignant histology cases, 8 had a preoperative cytological diagnosis of Bethesda III (6 AUS, 2 FLUS) and 7 had a diagnosis of Bethesda IV (6 SFN, 1 HCN). Of the 10 GSC-suspicious/benign cases, 6 had a preoperative cytological diagnosis of Bethesda III (4 AUS, 2 FLUS) and 4 had a diagnosis of Bethesda IV (4 SFN). Seventeen cases (68%) underwent hemithyroidectomy and eight cases (32%) underwent TT. One patient underwent a two-step surgery (hemithyroidectomy followed by completion thyroidectomy). Four of the 125 GSC benign nodules (cytology: 3 AUS and 1 SFN) underwent thyroid surgery for local compressive symptoms. All four were benign on histopathology.

Comparison of GEC and GSC benign call rate, specificity, PPV, and clinical care

The benign call rate was higher for GSC than for GEC among all cyto-I cases (GSC 76.2% vs. GEC 48.1%, p < 0.001) and among subgroups of Bethesda III nodules (GSC 80.7% vs. GEC 50.4%, p < 0.001) and Bethesda IV nodules (GSC 62.5% vs. GEC 43.5%, p = 0.04). When considering all benign cases to be truly benign for both tests, the GSC had higher specificity and PPV than the GEC (specificity: GSC 92.6% vs. GEC 61.4%, p < 0.001; PPV: GSC 60.0% vs. GEC 33.3%, p = 0.01). Similar findings were shown among Bethesda III nodules (specificity: GSC 94.3% vs. GEC 61.4%, p < 0.001; PPV: GSC 57.1% vs. GEC 26.8%, p = 0.03). In Bethesda IV nodules, the GSC had higher specificity (GSC 86.2% vs. GEC 61.3%, p = 0.02) but did not differ in PPV (GSC 63.6% vs. GEC 44.6%, p = 0.33).
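For readers who wish to check the headline comparison, the following Python sketch applies a two-sided Fisher's exact test to the benign call counts (GSC 125 of 164 vs. GEC 165 of 343). It is a standard-library illustration, not the SAS procedure used for the study; the function name fisher_exact_two_sided is hypothetical, and the two-sided p-value is defined here as the sum of all table probabilities no larger than that of the observed table.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))


# Rows: GSC, GEC. Columns: benign result, suspicious result.
p_benign_call = fisher_exact_two_sided(125, 39, 165, 178)
print(f"benign call rate, GSC vs. GEC: p = {p_benign_call:.2e}")
```

The resulting p-value is far below 0.001, in line with the comparison reported above.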

When considering only surgical cases, GSC and GEC did not differ in specificity among all nodules (GSC 16.7% vs. GEC 19.1%, p > 0.99) or within Bethesda III (GSC 25.0% vs. GEC 20.2%, p = 0.67) or Bethesda IV (GSC 0% vs. GEC 16.2%, p > 0.99) subgroups (Table 3). There was no difference in specificity and PPV by excluding HCA from both GSC and GEC.

To analyze the possibility of “diagnostic drift” caused by the addition of molecular testing to preoperative nodule evaluation (i.e., more nodules previously classified as benign now being diagnosed Bethesda III/IV), we evaluated the institutional cytologic diagnostic rates of thyroid nodules in three different time periods. The first period spans August 2010 to December 2010, after the implementation of the Bethesda classification system but before the use of molecular testing. The second period spans March 2013 to July 2013 when GEC was used, and the third spans August 2017 to December 2017 when GSC was used. Excluding malignant and unsatisfactory nodules, the percentage of nodules classified as indeterminate was 8.4% (20/237) in period 1, 12.6% (31/246) in period 2, and 17.3% (18/104) in period 3. The diagnosis of indeterminate nodules did increase significantly over time (p = 0.02) by Cochran–Armitage Trend Test. We then analyzed whether the increased benign call rate in GSC versus GEC can be explained by an increase in the percentage of nodules classified as indeterminate. If we assume the 12.6% and 17.3% rates are representative of the rates for the periods in which GEC and GSC were used, respectively, we can estimate the potential impact of “diagnostic drift” on the benign call rate in GSC and calculate an adjusted benign call rate for GSC. The benign call rate for our full inclusion sample was 48.1% (165/343) for GEC and 76.2% (125/164) for GSC. Based on the diagnosis percentages, we estimate that 27.2% of the nodules in the GSC sample may have been classified as benign rather than indeterminate in the GEC/2013 time period [calculation: 100% × (17.3% − 12.6%)/17.3% = 27.2%]. To estimate the maximum effect on benign call rate, we applied a conservative scenario in which all 27.2% are assumed to have benign GSC results. Removing these nodules (27.2% of 164, or 45 nodules) from both the benign count and the total gives an adjusted benign call rate of 67.2% (80/119), which would still be a statistically significant increase over the 48.1% benign call rate for GEC (p < 0.001).
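The two calculations in this paragraph can be sketched in Python as follows. This is an illustrative reconstruction rather than the SAS code used for the study: the function names are hypothetical, the trend test uses equally spaced scores (0, 1, 2) for the three periods with a normal approximation, and these choices are assumptions that may differ in detail from the SAS defaults.

```python
from math import erfc, sqrt


def cochran_armitage(events, totals, scores=None):
    """Two-sided Cochran-Armitage trend test; returns (z, p)."""
    scores = list(range(len(events))) if scores is None else scores
    n = sum(totals)
    p_bar = sum(events) / n
    # Score-weighted sum of observed minus expected events.
    t = sum(s * (r - m * p_bar) for s, r, m in zip(scores, events, totals))
    s1 = sum(s * m for s, m in zip(scores, totals))
    s2 = sum(s * s * m for s, m in zip(scores, totals))
    var = p_bar * (1 - p_bar) * (s2 - s1 * s1 / n)
    z = t / sqrt(var)
    return z, erfc(abs(z) / sqrt(2))


def drift_adjusted_benign_call_rate(benign, total, rate_before, rate_after):
    """Remove the 'drifted' share of nodules, assuming all had benign results."""
    drift_fraction = (rate_after - rate_before) / rate_after
    removed = round(drift_fraction * total)
    return drift_fraction, removed, (benign - removed) / (total - removed)


# Indeterminate nodules / evaluable nodules in periods 1-3.
z, p_trend = cochran_armitage([20, 31, 18], [237, 246, 104])

# GSC cohort: 125 benign calls among 164 nodules; rates in percent as quoted in the text.
fraction, removed, adjusted = drift_adjusted_benign_call_rate(125, 164, 12.6, 17.3)
print(f"trend p = {p_trend:.3f}; drift {fraction:.1%}; removed {removed}; adjusted rate {adjusted:.1%}")
```

Under these assumptions the trend p-value rounds to 0.02, and the adjustment reproduces the 27.2% drift estimate, the 45 removed nodules, and the 67.2% adjusted benign call rate.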

The rate of surgical intervention in the indeterminate nodule cohort has decreased by 66.4% since switching to the GSC; 52.5% of indeterminate nodules went to surgery while using the GEC compared with 17.6% with the GSC (p < 0.001). This reduction was statistically significant in nodules with Bethesda III diagnoses, demonstrating a 70.9% decrease (GEC 51.3% vs. GSC 14.9%, p < 0.001), and in nodules with Bethesda IV cytology, a 39.2% decrease (GEC 54.8% vs. GSC 33.3%, p = 0.003). When comparing the extent of the surgery, TT decreased by 36% while hemithyroidectomy increased in the GSC-suspicious cohort compared with the GEC-suspicious cohort (TT: GEC 63.9% vs. GSC 40.7%, p = 0.03). However, it is important to recognize that recommendations for extent of surgery changed over that time frame, a factor that may also contribute to this difference.
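The percentage decreases quoted in this paragraph are relative reductions, that is, the difference between the two rates divided by the GEC rate. A minimal Python sketch of that arithmetic, using the rounded percentages given in the text (the function name relative_reduction is hypothetical):

```python
def relative_reduction(rate_before, rate_after):
    """Relative decrease from rate_before to rate_after, as a fraction."""
    return (rate_before - rate_after) / rate_before


# (GEC rate, GSC rate) in percent, as quoted in the text.
comparisons = {
    "All cyto-I nodules, surgery": (52.5, 17.6),
    "Bethesda III, surgery": (51.3, 14.9),
    "Bethesda IV, surgery": (54.8, 33.3),
    "Suspicious cohort, TT": (63.9, 40.7),
}

reductions = {label: relative_reduction(*rates) for label, rates in comparisons.items()}
for label, value in reductions.items():
    print(f"{label}: {value:.1%} decrease")
```

Because the inputs are already rounded, the recomputed values agree with the reported decreases only to within about 0.1 percentage point.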

Comparison of GEC and GSC in thyroid nodules with HC changes

We analyzed the diagnostic performance of GSC and GEC based on the presence of HC on the cytology report (Table 5). A recent study showed that GSC has a higher benign call rate than GEC in nodules with HC (18). Indeed, the benign call rate in GSC with HC (n = 17) was significantly higher than in GEC (n = 58) (GEC 25.7% vs. GSC 88.8%, p < 0.01). Within GSC, the benign call rate was not statistically different in nodules with HC versus without HC (HC 88.8% vs. non-HC 74.7%, p = 0.25). Sensitivity, specificity, PPV, and NPV among all GEC-tested nodules with HC were 100%, 42.5%, 43.9%, and 100%, respectively. For GSC, sensitivity, specificity, PPV, and NPV were 100%, 100%, 100%, and 100%, respectively. Specificity was higher in GSC than in GEC (GEC 42.5% vs. GSC 100%, p < 0.01); however, no differences in PPV were identified in our cohort (GEC 43.9% vs. GSC 100%, p = 0.45). The percentage of nodules with HC in GEC and GSC groups was not significantly different, 18.2% versus 10.3% (p = 0.10), respectively. By subclassifying nodules to Bethesda categories III and IV, there was no statistically significant difference in specificity and PPV between the GSC and GEC except for those with a cytologic diagnosis of SFN with HC, which showed improved specificity (GEC 30% vs. GSC 100%, p < 0.003).

Table 5.

Classification Measures Divided by Hürthle Cell Changes Including Operated and Unoperated Benign Gene Expression Classifier/Gene Sequencing Classifier

Test		Sensitivity	Specificity	NPV	PPV	n	TP	FP	TN	FN
All Hürthle change nodules
GEC		100% [81–100%]	43% [27–59%]^*	100% [80–100%]	44% [28–60%]	58	18	23	17	0
GSC		100% [3–100%]	100% [79–100%]^*	100% [79–100%]	100% [3–100%]	17	1	0	16	0
Bethesda III nodules
GEC	HC−	91% [72–99%]	62% [54–69%]^*	98% [93–100%]	25% [16–35%]^*	191	21	64	104	2
GEC	HC+	100% [48–100%]	56% [30–80%]	100% [66–100%]	42% [15–72%]	21	5	7	9	0
GSC	HC−	100% [63–100%]	94% [88–98%]^*	100% [96–100%]	57% [29–82%]^*	109	8	6	95	0
GSC	HC+	n/a, no cases	100% [48–100%]	100% [48–100%]	n/a, no cases	5	0	0	5	0
Bethesda IV nodules
GEC	SFN-HC−	92% [64–100%]	73% [60–84%]	98% [87–100%]	44% [25–65%]	69	12	15	41	1
	SFN-HC+	100% [40–100%]	30% [7–65%]^*	100% [29–100%]	36% [11–69%]	14	4	7	3	0
	HCN	100% [66–100%]	36% [13–65%]	100% [48–100%]	50% [26–74%]	23	9	9	5	0
GSC	SFN-HC−	100% [54–100%]	78% [52–94%]	100% [77–100%]	60% [26–88%]	24	6	4	14	0
	SFN-HC+	n/a, no cases	100% [66–100%]^*	100% [66–100%]	n/a, no cases	9	0	0	9	0
	HCN	100% [3–100%]	100% [16–100%]	100% [16–100%]	100% [3–100%]	3	1	0	2	0

Values in square brackets are 95% CIs.

*p < 0.05.

HC, Hürthle cell.

Comparison of GEC and GSC in thyroid nodules <1 cm in size

In this cohort, there were eight indeterminate nodules measuring <1 cm (mean size 0.8 cm) that were tested with the GEC. Cytology showed AUS in 2 cases, FLUS in 4 cases, and HCN in 2 cases. GEC was benign in 5 cases and suspicious in 3 cases (2 AUS and 1 FLUS). Surgical intervention in 1 GEC-suspicious case (cytology FLUS) revealed a benign nodule, and in 2 GEC benign cases (cytology HCN), histology was benign. GSC testing was performed on 5 indeterminate nodules measuring <1 cm (mean size 0.79 cm); cytology showed AUS in 2 cases, FLUS in 2 cases, and SFN with HC in 1 case. GSC was benign in 3 cases and suspicious in 2 cases (cytology AUS and SFN with HC).

The indication for FNA was patient preference in three cases (one with history of thyroid cancer [TC]), a sonographically suspicious nodule located at the posterior lobe close to the recurrent laryngeal nerve in one case, and to exclude the possibility of metastasis of another primary malignancy in one case. The specificity and NPV were 66.6% and 100%, respectively, in the GEC group. None of the patients with a nodule in the GSC group underwent surgery by the end of the study. Owing to the small number of nodules <1 cm, we did not perform statistical tests for differences.

Xpression Atlas

The Xpression Atlas (XA) is a feature available for suspicious GSC (but not in GEC) in which specific genetic driver mutations identified in the RNA sequencing can be reported (19,20). XA was ordered in 9 cases, with 4 positive results. Of the 4 positive cases, 2 cases showed a BRAF^K601E (c.1801A>G) variant. Cytology showed AUS and SFN, respectively. Both patients underwent hemithyroidectomy and surgical pathology showed FVPTC in both cases. Those were negative for lymphovascular invasion, perineural invasion, and extrathyroidal extension. Two cases showed HRAS p.Q61R (c.182A>G). Cytology showed “atypical cells” and SFN, respectively. Surgical pathology is pending in both cases as of this writing.

Discussion

Management of cyto-I thyroid nodules remains a challenge. Several molecular tests are commercially available with the goal to improve the accuracy of preoperative diagnosis (21–23). In this study, we focused on the GEC and GSC tests. Although the high sensitivity and NPV of the GEC as a “rule-out test” have reduced the rate of surgery with a benign GEC (9), its relatively low PPV and specificity still resulted in a high rate of benign pathology with GEC-suspicious results (3,9,18,24,25). To improve the specificity, the GSC was developed and established in 2017. The GSC utilizes RNA sequencing that also enables detection of specific oncogenes, a feature common to other commercially available tests (21–23). Published data using the same samples as used in the initial GEC validation study suggested that the GSC platform resulted in an improved PPV while maintaining high NPV (14). Our short-term “real life” experience demonstrates a statistically significant improvement in specificity from 61% to 94% for the GSC versus GEC. We have also found a statistically significant higher benign call rate, from 48% with the GEC to 76% with the GSC, largely driven by a higher benign call rate in Bethesda III nodules. The benign call rate did not differ between AUS and FLUS nodules for either GEC or GSC.

Another major modification made from GEC to GSC was the incorporation of an algorithm for nodules with HC changes. HCs, also known as oncocytic cells, are follicular-derived cells with acidophilic cytoplasm (26). While most cyto-I nodules with HC are benign on surgical pathology, 10–40% can represent malignancy, particularly HCC, which tends to be resistant to radioactive iodine treatment compared with other differentiated thyroid carcinomas (DTCs) (27). Nodules with HC have been a major challenge to accurately classify by GEC, and were generally classified as suspicious on this test, resulting in a high NPV but a lower PPV (28). This is reflected by clinical data by Brauner et al. (24) who reported that only 13% of GEC-suspicious nodules with HC were malignant on surgical pathology. Insights into the genomic landscape of HCC have evolved since the initial implementation of the GEC. Ganly et al. (29) and Gopal et al. (30), as well as others, have reported that HCC has distinct genomic alterations compared with other DTCs, characterized by frequent loss of heterozygosity and a high number of mitochondrial mutations, enabling the possibility of improved molecular diagnosis for these tumors (28).

Harrell et al. (18) recently reported that the benign call rate for GSC was higher in nodules with HC on FNA, suggesting that the diagnostic improvement of GSC is related to its improvement in performance with HC. Indeed, the benign call rate and specificity of nodules with HC in our institution were also significantly higher in GSC than in GEC. The same trend was not observed with PPV but the number of nodules in our study may not have been sufficient for this subanalysis. Overall, in our cohort, GSC performance in nodules with and without HC did not differ in specificity and PPV. While our findings are promising and consistent with those of Harrell et al. (18), the data need to be interpreted with caution as the number of nodules with HC in our GSC cohort was small (N = 17, 10.3%).

The higher benign call rate with the GSC has impacted our clinical care; it has resulted in a 66.4% reduction in surgical intervention for cyto-I nodules since the switch from GEC to GSC. There was also a 36% reduction in the percentage of patients with cyto-I nodules treated with TT in the GSC-suspicious cohort compared with the GEC-suspicious cohort. It is not clear whether this is related to the switch in assays, changes in clinical practice reflecting new data and guidelines (6), or a combination of both. Ongoing clinical follow-up will be needed to clarify the long-term clinical impact of the new assay.
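The relative reductions in surgery quoted in this paper follow from the absolute surgery rates given in the Abstract. As a minimal sketch of that arithmetic (the function name and the dictionary layout are illustrative and not part of the study's analysis):

```python
def relative_reduction(before_pct: float, after_pct: float) -> float:
    """Relative reduction, in percent, when a rate falls from before_pct to after_pct."""
    return (before_pct - after_pct) / before_pct * 100.0


# Surgery rates (%) as reported in the Abstract: (GEC, GSC)
surgery_rates = {
    "All cyto-I nodules": (52.5, 17.6),
    "Bethesda III": (51.3, 14.9),
    "Bethesda IV": (54.8, 33.3),
}

for group, (gec, gsc) in surgery_rates.items():
    print(f"{group}: {relative_reduction(gec, gsc):.1f}% relative reduction")
```

Because the input rates are already rounded to one decimal place, the recomputed values can differ from the reported ones in the last digit.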

Our current study has several limitations. Given the recent implementation of GSC, the follow-up period is short, and the sample size is smaller than that of the GEC cohort. We cannot exclude the possibility that a higher benign call rate actually reflects an increase in false negative results, and thus missed diagnoses of TC. Based on the blinded validation study performed on FNA samples, of which 384 had corresponding histopathological specimens, this likelihood is not high, but it cannot be excluded (3,14). In addition, selection bias cannot be avoided in this situation, as operated molecularly benign nodules likely differ from those triaged to clinical follow-up. The reduction in surgical intervention seen in the GSC cohort could also be due to shorter follow-up, as surgical intervention for benign nodules may increase over time once they have grown to the point of causing compressive symptoms or requiring repeat biopsy. In addition to the GSC or GEC benign nodules, 35% of suspicious cases have not undergone surgical intervention. We plan to continue collecting clinical data for these patients to assess long-term outcomes. Also, because this is a retrospective analysis, there is the potential for referral and sample bias. Cyto-I nodules with either clinically or sonographically worrisome features were more likely to go to surgery without molecular testing and, therefore, would be excluded from our study. This may have skewed our cohort to include more “benign” cyto-I nodules. Finally, these data are limited to the GEC and GSC tests. We have not systematically analyzed other commercially available molecular diagnostic tests.
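The concern that a higher benign call rate could reflect either better specificity or more false negatives can be illustrated with the standard relationships among sensitivity, specificity, and prevalence. The sketch below is illustrative only and is not a reanalysis of our data: the specificities are those reported in the Abstract, whereas the sensitivity values and the malignancy prevalence are hypothetical assumptions chosen for the example.

```python
def classifier_summary(sensitivity: float, specificity: float, prevalence: float) -> dict:
    """Expected PPV, NPV, and benign call rate for a binary rule-out test.

    All arguments are fractions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return {
        "ppv": true_pos / (true_pos + false_pos),
        "npv": true_neg / (true_neg + false_neg),
        # Every nodule called benign is either a true or a false negative.
        "benign_call_rate": true_neg + false_neg,
    }


PREVALENCE = 0.25    # hypothetical malignancy prevalence, not from this study
SENSITIVITY = 0.90   # hypothetical sensitivity, not from this study

gec_like = classifier_summary(SENSITIVITY, 0.614, PREVALENCE)  # GEC specificity (Abstract)
gsc_like = classifier_summary(SENSITIVITY, 0.943, PREVALENCE)  # GSC specificity (Abstract)

# Same specificity as gsc_like, but with a hypothetical drop in sensitivity.
less_sensitive = classifier_summary(0.70, 0.943, PREVALENCE)

for name, result in [("GEC-like", gec_like), ("GSC-like", gsc_like),
                     ("lower sensitivity", less_sensitive)]:
    print(name, {key: round(value, 3) for key, value in result.items()})
```

Under these assumptions, raising specificity alone increases both the benign call rate and the PPV without lowering the NPV, whereas lowering sensitivity also increases the benign call rate but lowers the NPV. The benign call rate by itself therefore cannot distinguish the two scenarios, which is why histopathological follow-up remains necessary.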

There are several important strengths of our cohort: our data set includes the highest number of GSC cases published to date. There was a low nondiagnostic rate on the molecular testing, and histopathology results were available in 70% of molecularly suspicious cases. All cytology results were read by academic pathologists highly specialized in thyroid pathology who were blinded to the results of the molecular testing since those were obtained subsequent to the cytology reading.

In summary, our early experience from a single academic tertiary center shows a statistically significant increase in benign call rates for GSC compared with GEC, with improvement in specificity and PPV while maintaining high sensitivity and NPV. The benign call rate in thyroid nodules with HC showed a particularly dramatic improvement. Surgical intervention has been reduced by 66% since the switch from GEC to GSC. Longer term follow-up and larger studies will be required to fully determine the clinical accuracy of GSC in TC.

Footnotes

Author Disclosure Statement

No competing financial interests exist.

References

1. Cibas, Baloch, Fellegara, LiVolsi, Raab, Rosai, Diggans, Friedman, Kennedy, Kloos, Lanman, Mandel, Sindy, Steward, Zeiger, Haugen, Alexander. 2013. A prospective assessment defining the limitations of thyroid nodule pathologic evaluation. Ann Intern Med, 159:325–332.

2. Cibas, Ali; NCI Thyroid FNA State of the Science Conference. 2009. The Bethesda system for reporting thyroid cytopathology. Am J Clin Pathol, 132:658–665.

3. Alexander, Kennedy, Baloch, Cibas, Chudova, Diggans, Friedman, Kloos, LiVolsi, Mandel, Raab, Rosai, Steward, Walsh, Wilde, Zeiger, Lanman, Haugen. 2012. Preoperative diagnosis of benign thyroid nodules with indeterminate cytology. N Engl J Med, 367:705–715.

4. Angell, Frates, Medici, Liu, Kwong, Cibas, Kim, Marqusee. 2015. Afirma benign thyroid nodules show similar growth to cytologically benign nodules during follow-up. J Clin Endocrinol Metab, 100:E1477–E1483.

5. Cibas, Ali. 2017. The 2017 Bethesda system for reporting thyroid cytopathology. Thyroid, 27:1341–1346.

6. Haugen, Sawka, Alexander, Bible, Caturegli, Doherty, Mandel, Morris, Nassar, Pacini, Schlumberger, Schuff, Sherman, Somerset, Sosa, Steward, Wartofsky, Williams. 2017. American Thyroid Association Guidelines on the management of thyroid nodules and differentiated thyroid cancer task force review and recommendation on the proposed renaming of encapsulated follicular variant papillary thyroid carcinoma without invasion to noninvasive follicular thyroid neoplasm with papillary-like nuclear features. Thyroid, 27:481–483.

7. Giordano, Valcavi, Thompson, Pedroni, Renna, Gradoni, Barbieri. 2012. Complications of central neck dissection in patients with papillary thyroid carcinoma: results of a study on 1087 patients and review of the literature. Thyroid, 22:911–917.

8. Kloos. 2017. Molecular profiling of thyroid nodules: current role for the Afirma Gene Expression Classifier on clinical decision making. Mol Imaging Radionucl Ther, 26:36–49.

9. Alexander, Schorr, Klopper, Kim, Sipos, Nabhan, Parker, Steward, Mandel, Haugen. 2014. Multicenter clinical experience with the Afirma Gene Expression Classifier. J Clin Endocrinol Metab, 99:119–125.

10. Yang, Sullivan, Zhang, Govind, Levin, Rao, Moatamed. 2016. Has Afirma Gene Expression Classifier testing refined the indeterminate thyroid category in cytology? Cancer Cytopathol, 124:100–109.

11. Duick, Klopper, Diggans, Friedman, Kennedy, Lanman, McIver. 2012. The impact of benign Gene Expression Classifier test results on the endocrinologist-patient decision to operate on patients with thyroid nodules with indeterminate fine-needle aspiration cytopathology. Thyroid, 22:996–1001.

12. Noureldine, Najafian, Aragon Han, Olson, Genther, Schneider, Prescott, Agrawal, Mathur, Zeiger, Tufano. 2016. Evaluation of the effect of diagnostic molecular testing on the surgical decision-making process for patients with thyroid nodules. JAMA Otolaryngol Head Neck Surg, 142:676–682.

13. Sipos, Blevins, Shea, Duick, Lakhian, Michael, Thomas, Sosa. 2016. Long-term nonoperative rate of thyroid nodules with benign results on the Afirma Gene Expression Classifier. Endocr Pract, 22:666–672.

14. Patel, Angell, Babiarz, Barth, Blevins, Duh, Ghossein, Harrell, Huang, Kennedy, Kim, Kloos, LiVolsi, Randolph, Sadow, Shanik, Sosa, Traweek, Walsh, Whitney, Yeh, Ladenson. 2018. Performance of a genomic sequencing classifier for the preoperative diagnosis of cytologically indeterminate thyroid nodules. JAMA Surg, 153:817–824.

15. Strickland, Vivero, Jo, Lowe, Hollowell, Qian, Wieczorek, French, Teot, Sadow, Alexander, Cibas, Barletta, Krane. 2016. Preoperative cytologic diagnosis of noninvasive follicular thyroid neoplasm with papillary-like nuclear features: a prospective analysis. Thyroid, 26:1466–1471.

16. Baloch, Harrell, Brett, Randolph, Garber; AACE Endocrine Surgery Scientific Committee and Thyroid Scientific Committee. 2017. American Association of Clinical Endocrinologists and American College of Endocrinology Disease State Commentary: managing thyroid tumors diagnosed as noninvasive follicular thyroid neoplasm with papillary-like nuclear features. Endocr Pract, 23:1150–1155.

17. Baloch, Harrell, Brett, Randolph, Garber; AACE Endocrine Surgery Scientific Committee and Thyroid Scientific Committee. 2017. Managing thyroid tumors diagnosed as non-invasive follicular tumor with papillary like nuclear features (NIFTP). Endocr Pract, 2017 Jul 13 [Epub ahead of print]; DOI:10.4158/EP171940.DSC.

18. Harrell, Eyerly-Webb, Golding, Edwards, Bimston. 2018. Statistical comparison of Afirma GSC and Afirma GEC outcomes in a community endocrine surgical practice: early findings. Endocr Pract, 25:161–164.

19. Sadow, Barbiarz, Kennedy, Barbiarz Kennedy. 2018. Detecting Expressed Variants and Fusions in RNASeq Data from Thyroid FNAs. AACE Late Breaking Abstract No. 1227, May 11–May 16, Boston, MA.

20. Cancer Genome Atlas Research Network. 2014. Integrated genomic characterization of papillary thyroid carcinoma. Cell, 159:676–690.

21. Zhang, Lin. 2016. Molecular testing of thyroid nodules: a review of current available tests for fine-needle aspiration specimens. Arch Pathol Lab Med, 140:1338–1344.

22. Nikiforov. 2017. Role of molecular markers in thyroid nodule management: then and now. Endocr Pract, 23:979–988.

23. Sahli, Smith, Umbricht, Zeiger. 2018. Preoperative molecular markers in thyroid nodules. Front Endocrinol (Lausanne), 9:179.

24. Brauner, Holmes, Krane, Nishino, Zurakowski, Hennessey, Faquin, Parangi. 2015. Performance of the Afirma Gene Expression Classifier in Hurthle cell thyroid nodules differs from other indeterminate thyroid nodules. Thyroid, 25:789–796.

25. Wong, Angell, Strickland, Alexander, Cibas, Krane, Barletta. 2016. Noninvasive follicular variant of papillary thyroid carcinoma and the Afirma Gene-Expression Classifier. Thyroid, 26:911–915.

26. Maximo, Lima, Prazeres, Soares, Sobrinho-Simoes. 2012. The biology and the genetics of Hurthle cell tumors of the thyroid. Endocr Relat Cancer, 19:R131–R147.

27. Haddad, Nasr, Bischoff, Busaidy, Byrd, Callender, Dickson, Duh, Ehya, Goldner, Haymart, Hoh, Hunt, Iagaru, Kandeel, Kopp, Lamonica, McIver, Raeburn, Ridge, Ringel, Scheri, Shah, Sippel, Smallridge, Sturgeon, Wang, Wirth, Wong, Johnson-Chilla, Hoffmann, Gurski. 2018. NCCN guidelines insights: thyroid carcinoma, version 2.2018. J Natl Compr Canc Netw, 16:1429–1440.

28. Hao, Duh, Kloos, Babiarz, Harrell, Traweek, Kim, Fedorowicz, Walsh, Sadow, Huang, Kennedy. 2019. Identification of Hurthle cell cancers: solving a clinical challenge with genomic sequencing and a trio of machine learning algorithms. BMC Syst Biol, 13:27.

29. Ganly, Makarov, Deraje, Dong, Reznik, Seshan, Nanjangud, Eng, Bose, Kuo, Morris LGT, Landa, Carrillo Albornoz, Riaz, Nikiforov, Patel, Umbricht, Zeiger, Kebebew, Sherman, Ghossein, Fagin, Chan. 2018. Integrated genomic analysis of Hurthle cell cancer reveals oncogenic drivers, recurrent mitochondrial mutations, and unique chromosomal landscapes. Cancer Cell, 34:256–270.e5.

30. Gopal, Kubler, Calvo, Polak, Livitz, Rosebrock, Sadow, Campbell, Donovan, Amin, Gigliotti, Grabarek, Hess, Stewart, Braunstein, Arndt, Mordecai, Shih, Chaves, Zhan, Lubitz, Kim, Iafrate, Wirth, Parangi, Leshchiner, Daniels, Mootha, Dias-Santagata, Getz, McFadden. 2018. Widespread chromosomal losses and mitochondrial DNA alterations as genetic drivers in Hurthle cell carcinoma. Cancer Cell, 34:242–255.e5.