Abstract
Background:
Although the current gold standard for diagnosing thyroid nodule malignancy is ultrasound-guided fine-needle aspiration (FNA) cytology, about 20–25% of cytological evaluations are considered indeterminate for malignancy. This limitation has led to the emergence of next-generation sequencing panels, for example, ThyroSeq v3 (TSv3), which recognize highly diagnostic genetic mutations of common thyroid carcinomas in FNA samples and classify them as test-negative or test-positive, helping optimize treatment for indeterminate thyroid nodules (ITNs). Our goals were to evaluate the benign call rate (BCR) of TSv3 and assess its diagnostic performance and clinical utility while highlighting the points of consideration for a public Canadian institution.
Methods:
This is a single-center study conducted at the Royal Victoria Hospital (McGill University Health Centre) in Montreal, Canada, between January and February 2019. Patients were offered TSv3 following the McGill algorithm for ITN workup, a novel protocol developed at our institution to select only diagnostic surgery candidates to minimize waste of public resources, considering the single-payer health care system. Patient demographics, cytopathology results, TSv3 data, treatment plan, and final histopathology result were reviewed.
Results:
A total of 50 ITNs underwent TSv3 testing; molecular analysis yielded 20 (40%) “positive” results and 24 (48%) “negative” results. Six (12%) results were classified as “currently negative” or “negative but limited.” “Currently negative” results indicate a low-risk mutation that alone is insufficient for development of a malignant lesion. “Negative but limited” results indicate a sample that is nondiagnostic for malignancy due to low cell count. BCR was calculated as (“negative” and “currently negative”)/(total excluding “negative but limited”), resulting in a BCR of 58%. Twenty-three (46%) patients were scheduled for surgery and 27 (54%) patients continued with surveillance. Ninety-one percent (20 of 22) of the resected target nodules were malignant on final pathology.
Conclusions:
TSv3 proved beneficial in classifying ITNs as positive or negative, avoiding surgery in the latter cases. We found a lower reduction rate in surgery and BCR than the previously published studies, which is attributable to the criteria of the McGill algorithm. In the Canadian public health care system, preventing unnecessary surgery represents significant cost savings for the provincial government while also improving patient quality of life.
Introduction
The incidence of thyroid cancer has steadily increased over the past several decades, likely due to the concurrent progress of technologically advanced imaging modalities that have facilitated detection of thyroid nodules, and in turn, thyroid cancer (1,2). The current gold standard for diagnosing thyroid nodule malignancy is still ultrasound-guided fine-needle aspiration (USFNA) cytology. The relevance of fine-needle aspiration (FNA) cytology as the first diagnostic tool for the evaluation of thyroid lesions has been clearly demonstrated and validated across multiple studies in recent decades (3,4). It is the most commonly used approach for thyroid nodules because of its simplicity, safety, cost-effectiveness, and relatively high diagnostic accuracy (4–6). Despite this, one of its greatest limitations is that 20–25% of cytological evaluations are classified as indeterminate thyroid nodules (ITNs), corresponding to Bethesda III [“atypia (or follicular lesion) of undetermined significance” (AUS/FLUS)] and Bethesda IV [“follicular neoplasm”/“suspicious for follicular neoplasm” (FN/SFN)] diagnostic categories of The Bethesda System for Reporting Thyroid Cytopathology (TBSRTC) (7–9).
Nodules classified as Bethesda III or Bethesda IV have a relatively low, but non-negligible risk of malignancy (ROM), ranging from 5% to 15% for AUS/FLUS and from 15% to 40% for FN/SFN (7,10). Historically, surveillance by repeat FNA was an option for nodules classified as AUS/FLUS with diagnostic lobectomy recommended for nodules that remained cytologically indeterminate on repeat FNA and/or otherwise showed worrisome clinical or sonographic features. Similarly, diagnostic lobectomy has traditionally been recommended for nodules classified as FN/SFN (11). However, up to 75% of ITN diagnostic surgery is unnecessary, as the majority of AUS/FLUS and FN/SFN nodules that undergo surgical resection are ultimately found to be histologically benign (8,12). For these cases, surgery may be justified for diagnostic purposes (e.g., distinction between follicular adenoma and follicular carcinoma based on capsular/vascular invasion) but considered suboptimal from a therapeutic standpoint. Furthermore, ITNs with malignant pathology results on their initial lobectomy must at times undergo a second surgery, a completion thyroidectomy. The indeterminate cancer risk prediction for Bethesda III and IV patients results in suboptimal surgical planning, especially for those who undergo unnecessary surgery for benign lesions or those who require a second completion thyroidectomy, contributing to higher health care costs and morbidity (13).
These limitations in cytopathology have led to the emergence of molecular tests (MTs), whose role in improving diagnosis and guiding the treatment course of thyroid nodules has become increasingly important, especially over the past decade (8,14,15). According to the 2015 American Thyroid Association (ATA) guidelines, after consideration of clinical and sonographic features, MT may be used to supplement malignancy risk assessment for Bethesda III and IV patients instead of proceeding directly with the traditional strategy (e.g., repeat FNA, surveillance or diagnostic surgery) and may also be considered for Bethesda V and VI lesions, if the results are expected to alter the management (e.g., extent of surgery) (16). The three MTs that are currently offered by commercial laboratories for ITNs are all nucleic acid-based tests: Afirma Gene Sequencing Classifier™ (Veracyte, Inc., South San Francisco, CA), ThyGenX/ThyraMIR™ (Interpace Diagnostics, Parsippany, NJ), and ThyroSeq™ (University of Pittsburgh Medical Center, Pittsburgh, PA, and CBLPath, Inc., Rye Brook, NY). These tests can be categorized by their general testing approach: expression profiling for a panel of genes (messenger RNAs or microRNAs), genotyping for tumor-associated driver mutations and gene fusions, or a combination of these methodologies.
ThyroSeq, now in its third version, uses targeted next-generation sequencing to assay for a broad panel of point mutations, insertions/deletions, gene fusions, copy number alterations (CNA), and gene expression alterations associated with thyroid neoplasia. To determine the likelihood of benignity and malignancy, a genomic classifier (GC) scheme is applied to assign a value (0–2) to each sample with a detected genetic alteration based on the strength of its association with malignancy: 0 (no association with cancer), 1 (low cancer probability), or 2 (high cancer probability). Every sample has a GC score that is a sum of individual values of all detected alterations, with GC scores 0 and 1 considered test-negative (a test score of 1 is commercially reported as “currently negative”) and scores 1.5 and above test-positive (12).
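The scoring rule described above can be sketched in a few lines. This is only an illustration of the rule as stated in this paragraph, not the commercial classifier: the function name `classify_gc_score` is hypothetical, and the per-alteration values are assumed to be supplied by the caller, since the assignment of a value to a specific alteration is not described here.

```python
def classify_gc_score(alteration_values):
    """Sum per-alteration values (each 0, 1, or 2) and map the total to a category.

    Hypothetical helper: the mapping follows the text (0 -> "negative",
    1 -> "currently negative", 1.5 and above -> "positive").
    """
    for value in alteration_values:
        if value not in (0, 1, 2):
            raise ValueError(f"each alteration value must be 0, 1, or 2, got {value!r}")
    score = sum(alteration_values)  # GC score = sum over all detected alterations
    if score >= 1.5:
        return score, "positive"
    if score == 1:
        return score, "currently negative"
    return score, "negative"


# A sample with no detected alterations scores 0 and is reported as negative.
print(classify_gc_score([]))
# A single low-probability alteration is reported as currently negative.
print(classify_gc_score([1]))
# Two low-probability alterations sum to 2, which crosses the positive cutoff.
print(classify_gc_score([1, 1]))
```

Because the individual values are whole numbers, the sum is also a whole number, so a cutoff of 1.5 behaves the same as requiring a score of 2 or more.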
Test-negative results are divided further into subcategories of “negative” and “currently negative.” “Negative” test results indicate samples with a 3–4% ROM as there are no detected gene alterations associated with thyroid cancer. “Currently negative” results indicate samples that contain a low-risk mutation that alone is not sufficient for the development of a malignant lesion (ROM <10%). “Currently negative” samples may be benign at the time of sampling but may undergo clonal expansion or acquire additional mutations; therefore, ThyroSeq v3 (TSv3) suggests more active surveillance compared with samples categorized as “negative.” The last category of “negative but limited” indicates samples that have a low number of thyroid epithelial cells (ROM cannot be determined) (9). Test-positive results indicate an ROM of 40–100% depending on the mutation identified. TSv3 test performance, according to a multicenter clinical validation study, is 94% sensitivity, 82% specificity, 97% negative predictive value (NPV), and 66% positive predictive value (PPV), and leads to the reduction of a significant number of diagnostic thyroid surgeries (9). Herein, we sought to assess the diagnostic performance and clinical utility of TSv3 in the Canadian public health care setting.
Materials and Methods
Study population: the McGill algorithm
Canada's health care system is composed of publicly funded health insurance plans that provide coverage to all Canadian citizens. Due to this single-payer system, social responsibility is an important factor to consider in any provision of health care services, especially new diagnostic procedures such as MT. In an effort to minimize wasting of public resources, we created an algorithm for the evaluation and management of patients with ITNs, coined the McGill algorithm (Fig. 1). The McGill algorithm aims to use public health care resources efficiently by imposing strict selection criteria for MT, restricting the number of MTs offered for thyroid nodules unless it is truly deemed to be the most appropriate diagnostic modality. The ultimate goal of this algorithm is to identify only Bethesda III or Bethesda IV patients who would have been recommended diagnostic surgery if MT was not available.

McGill algorithm for workup of indeterminate thyroid nodules. AUS, atypia of undetermined significance; ETE, extrathyroidal extension; FLUS, follicular lesion of undetermined significance; FN, follicular neoplasm; HCLUS, Hurthle cell lesion of undetermined significance; LN, lymph node; ThyCa, thyroid cancer; TI-RADS, Thyroid Imaging Reporting and Data System; U/S, ultrasound; USFNA, ultrasound-guided fine-needle aspiration.
This novel protocol was developed at our institution through an interdisciplinary effort between thyroid cancer specialists at McGill University in the following disciplines: endocrinology, otolaryngology-head and neck surgery, general surgery, pathology, and medical biochemistry. The algorithm is based on an amalgamation of multidisciplinary evidence-based literature on thyroid cancer, notably the 2015 ATA guidelines for thyroid nodule management (16), as well as clinical expertise. Patients in the current study were thus selected according to the McGill algorithm for ITN workup (Fig. 1), which marks the algorithm's first application in clinical practice.
Patients aged 18 years or older were eligible for MT through the algorithm if they had a Bethesda III/IV nodule with a Thyroid Imaging Reporting and Data System (17) score of 4 or 5 and consented to USFNA. If the USFNA sample pathology was reported as ITN but had low or scant cellularity, then the USFNA was repeated. Patients were presented different management options: surgery, active surveillance, or MT along with detailed counseling on the limitations and advantages of each option. Patients who preferred MT and were approved by the publicly funded insurance plan were included in the study.
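The selection criteria stated in this paragraph can be written as a small decision function. This is a minimal sketch covering only the criteria named in the text (age, Bethesda category, TI-RADS score, sample cellularity); it does not reproduce the full McGill algorithm in Fig. 1, and the names `Candidate` and `next_step` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record holding only the fields named in the text."""
    age_years: int
    bethesda_category: int      # 3 for Bethesda III, 4 for Bethesda IV
    ti_rads_score: int
    low_cellularity: bool = False


def next_step(candidate):
    """Return the next workup step under the criteria stated in the text."""
    eligible = (
        candidate.age_years >= 18
        and candidate.bethesda_category in (3, 4)
        and candidate.ti_rads_score in (4, 5)
    )
    if not eligible:
        return "not eligible for MT under the stated criteria"
    if candidate.low_cellularity:
        # An indeterminate result with low or scant cellularity triggers a repeat USFNA.
        return "repeat USFNA"
    return "counsel on surgery, active surveillance, or MT"


print(next_step(Candidate(age_years=45, bethesda_category=3, ti_rads_score=4)))
```

Patient preference and approval by the publicly funded insurance plan come after counseling and are deliberately left outside this function.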
Study design and molecular analysis
The study was conducted at the Royal Victoria Hospital (McGill University Health Centre) in Montreal, Quebec, between January and February 2019. Fifty-one patients selected using the McGill algorithm underwent TSv3 MT. Ethics approval was obtained from the McGill University Health Centre (MUHC) Research Ethics Board in Montreal, Quebec. Consent for MT was obtained through a general written consent form used at the MUHC for all surgery, anesthesia, diagnostic, or therapeutic procedures, which we specified as TSv3. Fifty-one collected samples yielded 50 adequate molecular analysis results (Fig. 2). Samples were sent for TSv3 analysis at a commercial laboratory at the University of Pittsburgh Medical Center.

Study design. One of the 23 surgical patients was pregnant and chose to postpone her surgery outside this project's time frame and therefore was not included in the final surgical analysis.
Pathology
All cytology and pathology specimens were interpreted by board-certified head and neck fellowship-trained cytopathologists in the Department of Pathology at the McGill University Health Centre. TBSRTC (7) was used to classify all USFNAs.
Data collection
For each patient, the following was reviewed: patient demographics, cytopathology results, TSv3 results, surgical procedure (if applicable), and final histopathology result.
Study outcomes
The primary outcome was the benign call rate (BCR) of TSv3, defined by Ohori et al. (18) as the percentage of ITNs with benign or negative molecular results. Because the “negative but limited” category reflects an inadequate sample cell count, we considered it separate from the “negative” category and “currently negative” subcategory and excluded it from the calculation; the BCR was therefore calculated as (“negative” and “currently negative”)/(total excluding “negative but limited”). Our goal was to analyze the percent reduction in diagnostic surgery, assess TSv3's clinical decision-making implications in a Canadian setting, and compare our results under the McGill algorithm with existing literature.
Statistical analysis
Statistical analysis was performed using SPSS statistical software version 24 (19). To compare the performance of TSv3 in correctly classifying ITNs as benign and malignant, we used Fisher's exact test, and statistical significance was defined as p < 0.05.
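For readers without access to SPSS, a two-sided Fisher's exact test on a 2 × 2 table can be reproduced with the Python standard library. The sketch below is not the authors' SPSS procedure; it uses the common convention of summing the probabilities of all tables that are no more likely than the observed one, and the function name `fisher_exact_two_sided` and the example counts are illustrative assumptions rather than study data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2 x 2 table [[a, b], [c, d]].

    With row and column totals held fixed, the top-left cell follows a
    hypergeometric distribution; the p-value sums the probabilities of all
    tables that are no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given the fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # A small relative tolerance guards against floating-point ties.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Illustrative counts only (not taken from this study).
print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))
```

In the setting of this study, the rows would correspond to the TSv3 result (test-positive or test-negative) and the columns to final histopathology (malignant or benign).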
Results
A total of 51 patients underwent TSv3 testing. One sample was excluded due to an inadequate count of thyroid cells in the tested sample. The characteristics of the 50 eligible samples and patients are summarized in Table 1.
Population Demographics and Nodule Characteristics
F, female; M, male; max, maximum; min, minimum; SD, standard deviation; TSv3, ThyroSeq v3.
Table 2 summarizes the TSv3 results of the tested nodules and their associated surgical or nonsurgical outcome. Molecular analysis yielded 20 “positive,” 24 “negative,” 4 “currently negative,” and 2 “negative but limited” results. As defined by TSv3, “negative” results indicate a sample with no genetic mutations associated with thyroid cancer (ROM 3–4%). A “currently negative” result indicates a sample that contains a low-risk mutation that alone is not sufficient for the development of a malignant lesion, but that may undergo clonal expansion and acquire additional mutations that could increase the malignancy risk (ROM <10%). “Negative but limited” results indicate a sample that has a low thyroid epithelial cell count that is nondiagnostic for malignancy. All 20 “positive” patients underwent surgery, as did 2 of the 4 “currently negative” patients and 1 of the 2 patients with a “negative but limited” result. All 24 “negative” patients, the remaining 2 “currently negative” patients, and the other patient with a “negative but limited” result continued with surveillance. In total, 23 (46%) patients underwent surgery and 27 (54%) patients were followed with conservative management. The 2 out of 50 results that were “negative but limited” were excluded from the calculation of the BCR due to their unknown ROM, as similarly done by Ohori et al. (18). The BCR was 71% and 48% in Bethesda III and IV nodules, respectively, and 58% overall.
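The overall BCR follows directly from these counts. As a short worked sketch (the function name `benign_call_rate` is hypothetical): of the 50 results, 20 were “positive” and 2 were “negative but limited,” leaving 28 benign calls (“negative” plus “currently negative”) out of 48 results with a determinable ROM.

```python
def benign_call_rate(benign_calls, total_results, excluded=0):
    """Benign calls divided by the results that remain after exclusions."""
    denominator = total_results - excluded
    if denominator <= 0:
        raise ValueError("no results remain after exclusions")
    if not 0 <= benign_calls <= denominator:
        raise ValueError("benign_calls must lie between 0 and the denominator")
    return benign_calls / denominator


# 28 benign calls, 50 results, 2 "negative but limited" results excluded.
overall_bcr = benign_call_rate(benign_calls=28, total_results=50, excluded=2)
print(f"Overall BCR: {overall_bcr:.0%}")  # Overall BCR: 58%

# Share of all 50 patients who continued with surveillance.
print(f"Surveillance: {27 / 50:.0%}")  # Surveillance: 54%
```

Without the exclusion, the same 28 benign calls over all 50 results would give 56%, which is why the handling of “negative but limited” results needs to be stated alongside the BCR.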
ThyroSeq v3 Results by Thyroid Nodule Classification and Treatment Plan
BCR, benign call rate.
Of the 23 surgical patients, 1 test-positive Bethesda III patient was pregnant and chose to postpone her surgery outside this project's time frame and therefore was not included in the final surgical analysis. Among the 22 patients who completed surgery, 21 patients underwent hemithyroidectomy and 1 patient underwent total thyroidectomy. Table 3 summarizes the performance of TSv3 for these patients. All (n = 19) test-positive nodules resected were malignant on final pathology and all (n = 2) test-negative nodules resected were benign on final pathology; thus, the test correctly classified the nodules in 100% of cases (p < 0.001). The two benign lesions both had a Bethesda III cytology, TSv3-category “currently negative,” and were follicular adenomas on histopathological analysis, as summarized in Table 4. Among the malignant pathologies, 18 (90%) lesions were papillary carcinomas, 1 (5%) was a Hurthle cell carcinoma, and 1 (5%) was a follicular carcinoma. By papillary carcinoma subtype, expressed as a share of all 20 malignant lesions, 75% (n = 15) were follicular variants, 10% (n = 2) were oncocytic variants, and 5% (n = 1) were solid variants.
Performance of the ThyroSeq v3 Test in Patients Who Underwent Surgery
Histopathology Analysis of Patients Who Underwent Surgery
In the 50 thyroid nodules assayed, TSv3 recorded at least 1 molecular alteration in 30 of the nodules (Table 5). This includes the 24 “negative” nodules of which 6 samples had a TSHR mutation, 20 “positive” nodules of which all 20 samples had a mutation [NRAS (n = 8), HRAS (n = 5), BRAFR462I (n = 1), PTEN (n = 1), THADA (n = 1), IGF2BP3 (n = 1), CNA (n = 3)], and the 4 “currently negative” samples of which all 4 had a mutation as well [NRAS (n = 1), TSHR (n = 1), PTEN (n = 1), CNA (n = 1)]. In total, we observed 47 molecular alterations among all nodules. The most common mutations involved the RAS genes, either HRAS (n = 5), which was associated with a 100% rate of malignancy, or NRAS (n = 9), which was associated with an 89% rate of malignancy. Of the 47 alterations, RAS gene mutations accounted for 30% (n = 14), gene expression alterations for 21% (n = 10), and CNA for 19% (n = 9). No BRAFV600E mutations or noninvasive follicular thyroid neoplasms with papillary-like nuclear features (20,21) were identified in our cohort. One patient, who underwent a total thyroidectomy, had an NRAS mutation together with an EIF1AX mutation in the lesion of interest. Of the two follicular adenomas in this series, one harbored an NRAS mutation, and one had a positive but low number of CNA.
Molecular Alteration Profile of Thyroid Nodules and Final Outcome
One of the three HRAS, GEP patients scheduled for surgery did not undergo surgery within the time frame of this project due to her pregnancy, and therefore, final pathology data were not collected.
CNA, copy number alterations; GEP, gene expression profile.
Discussion
Over the past decade, the diagnostic ability of MT has improved rapidly, and MT has become a valid tool in the routine workup of cytologically indeterminate nodules. TSv3 has evolved over the last few years to risk-stratify lesions with a 66% PPV and 97% NPV, positioning itself as a powerful rule-in and rule-out test that reduces the need for diagnostic surgery in patients with ITNs (16).
In our 50 samples from patients with Bethesda III/IV pathology, 20 results were “positive,” 24 results were “negative,” 4 results were “currently negative,” and 2 results were “negative but limited.” Although we obtained a BCR of 58%, only 54% of patients continued with surveillance. This 4% difference is attributed to patients who were in the categories of “currently negative” and “negative but limited” that had variable surgical outcomes. When comparing our BCR to the single-center study of Ohori et al. (18) that reported a BCR of 74%, we suspect that our lower BCR is due to stricter patient inclusion criteria in the McGill algorithm. Furthermore, we found a higher rate of benign calls in Bethesda III nodules than in Bethesda IV: 71% versus 48%. This is likely due to the inherent higher ROM in Bethesda IV nodules, which falls in the range of 15–40%, compared with Bethesda III nodules, which have an ROM of 5–15% (7). Consequently, we expect the BCR to be higher in the Bethesda III group given the higher likelihood of benignity; this difference in BCR between Bethesda III/IV is also reflected in other studies (9,18). Due to our selection of high-risk Bethesda III and IV patients, we also expected a smaller sample size with selection bias toward malignant cases, and thus TSv3-positive results, when compared with studies that included all Bethesda III and IV lesions (9,12,18,22). We have an increased overall pretest probability of cancer of 40% as 20 of 50 nodules assayed were malignant on final pathology, which is higher than the 23–25% that is typically reported in ITNs and likely contributes to our high PPV of 100% (23). Despite these factors, our study represents the first study analyzing the utility of TSv3 with such a protocol and we expect reproducibility of these results in other clinical settings that adopt similar protocols.
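The dependence of PPV on pretest probability discussed above can be illustrated with Bayes' rule. The sketch below combines the sensitivity (94%) and specificity (82%) reported in the multicenter validation study with two pretest probabilities: the 40% observed in this cohort and 24%, taken as the midpoint of the 23–25% typically reported for ITNs. Applying the validation-study sensitivity and specificity to this cohort is an assumption made purely for illustration, and the function name `predictive_values` is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given pretest probability."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Pretest probability of this cohort (40%) versus a typical ITN population (24%).
for prevalence in (0.40, 0.24):
    ppv, npv = predictive_values(0.94, 0.82, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.2f}, NPV {npv:.2f}")
# prevalence 40%: PPV 0.78, NPV 0.95
# prevalence 24%: PPV 0.62, NPV 0.98
```

With the test characteristics held fixed, raising the pretest probability from 24% to 40% raises the expected PPV and slightly lowers the expected NPV, which is consistent with the direction of the effect described above.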
Overall, we obtained 27 “negative” results that continued with conservative management, thereby resulting in a 54% decrease in diagnostic surgery. Again, the McGill algorithm for ITN workup may explain why our finding of a 54% reduction in diagnostic surgery was lower than the 61% reduction reported by Steward et al. (9) in their multicenter clinical validation study, which imposed less strict patient selection criteria. The 27 patients with “negative” results were ultimately reclassified into the benign category and counseled to have a longitudinal ultrasound follow-up. One of the two benign resections was an NRAS-only follicular adenoma that was classified as “currently negative” by TSv3, which indicates a low but non-negligible ROM of <10%. Patel et al. (24) recently found that, over an 8-year period, up to 81% of cytologically indeterminate nodules with a preoperative isolated RAS mutation were malignant on final pathology. Typically, surgical intervention is recommended for such nodules, as they have a high lifetime ROM despite initially being benign (25). Thus, TSv3's “currently negative” classification of an NRAS-only ITN that is benign at present may appropriately reflect the beginning of a step-wise progression of NRAS-driven thyroid cancer.
To date, there have been few publications on TSv3: one large prospective clinical validation study (9), one analytical study (12), as well as a few independent studies (22,26), all conducted in the United States. To our knowledge, no published study has reviewed the performance of this test in the Canadian health care system. Our study aimed to show that using TSv3 significantly reduces the number of diagnostic thyroid surgeries performed for ITNs.
As surgery poses a significant cost burden to the health care system (pre-, intra-, and postoperative costs, lifelong medication, potential complications, etc.), a 54% decrease in surgery may translate to important cost savings for the provincial health care system. Previous cost-analyses done for TSv1 and TSv2 demonstrated a 30% and 36% respective decrease in cost per patient in the U.S. setting (27,28). More recently, in a cost-comparison between MTs and diagnostic lobectomy, Nicholson et al. (29) reported TSv3 as the most cost-effective option in ITN diagnosis, resulting in a decrease of $24,131 per correct diagnosis when compared with diagnostic surgery. Although this study was conducted in the United States and thus cost savings reported by Nicholson et al. (29) may not reflect the reality of the Canadian system, it nonetheless implies a major economic impact on a large-scale level. This less-invasive alternative to diagnostic surgery provides more benefit to the patient, obviating the risks of surgery and lifelong medication use. Finally, by using the McGill algorithm to restrict the number of MT candidates in the context of a public health care system, our study data more realistically reflect the reality of TSv3 used in daily practice in a Canadian institution.
A limitation to our study is that we did not evaluate the complete performance (test specificity, sensitivity, NPV) of TSv3 as surgery was not performed on test-negative patients. The long-term outcome of test-negative nodules is currently unknown; longer studies are required to evaluate if these nodules are truly benign. It is anticipated that ∼4% of these patients will eventually undergo surgery for malignancy, based on clinical expertise. Although another limitation to our study design is that pathologists were not blinded to molecular results, they were all fellowship-trained in head and neck pathology and MT information was primarily used to reconcile morphologic and molecular findings. Finally, as this was the first-time application of the McGill algorithm in a clinical study, a longer and larger prospective cohort study is required to validate the McGill algorithm as a sustainable routine protocol for MT.
Although the algorithm may seem like heavy selection criteria compared with studies that included all Bethesda III–V patients (9), it reflects more realistically the population of most quaternary care centers in clinical practice where we would likely not operate on all ITNs. Overall, despite our small sample size due to the project's short time frame, we believe the results in the present study will guide future surgical decision-making in broader patient populations. Health outcomes must also be measured in longer follow-up examinations to see if the McGill algorithm can truly distinguish between patients who do and do not benefit from surgery and MT. With a more in-depth estimate of risk versus benefit, we will more definitively be able to assess its value as a triage protocol.
Conclusion
This study demonstrates TSv3's utility in classifying thyroid cytology-indeterminate nodules as positive or negative, preventing unnecessary diagnostic surgery in the majority of patients. We found a lower reduction rate in surgery and a lower BCR than previously published studies, which we attribute to the application of the McGill algorithm for ITN workup, developed by our institution to restrict MT in the Canadian public health care system. Although there is a cost associated with the test, avoiding unnecessary surgery represents significant cost savings to the health care system in the long term and improves patient quality of life by sparing patients lifelong medication and the complications associated with surgery.
Footnotes
Acknowledgment
We thank Dr. Sabrina Wurzba whose statistical expertise was invaluable during the analysis and interpretation of the data collected.
Author Disclosure Statement
No competing financial interests exist.
Funding Information
No funding was received.
