Abstract
Objective:
To present clinical outcomes of the prospective implementation of the 2015 American Thyroid Association (ATA) guidelines for the management of thyroid nodules and differentiated thyroid cancer (DTC) using the modified ATA recurrence risk (RR) stratification system.
Methods:
We prospectively analyzed 612 patients with DTC treated between April 2017 and December 2021 in Calgary, Alberta. Each patient was prospectively assigned a modified ATA RR and an American Joint Committee on Cancer 8th edition stage. Initial risk stratification and consideration of the 2015 ATA guidelines guided surgical management as well as the indication for and dose of radioiodine (RAI) and other adjuvant therapies. Patients were assessed for their response to treatment (RTT) at 2 years postoperatively.
Results:
There were 479 patients who had 2-year follow-up data and were included in the study. Of these patients, there were 253 (53%) low-, 129 (27%) intermediate-, and 97 (20%) high-RR patients. Of these, 227 patients (47%) underwent total thyroidectomy (TTX) plus RAI, 178 (37%) underwent TTX only, and 74 (16%) underwent lobectomy. The RTT at 2 years was excellent for 89% (66) of patients with lobectomy, 84% (149) for TTX only, and 53% (121) for TTX plus RAI. Among 253 patients who were deemed low RR, 85% (216) had excellent RTT, 13% (32) indeterminate RTT, 2% (4) biochemical incomplete RTT, and 1 patient had structural incomplete RTT. The intermediate RR group had the following RTT outcomes: 64% (83) excellent, 23% (30) indeterminate, 6% (7) biochemical incomplete, and 7% (9) structural incomplete. The high RR group had the worst RTT outcomes, with 38% (37) excellent, 19% (18) indeterminate, 10% (10) biochemical incomplete, and 33% (32) structural incomplete RTT.
Conclusions:
The 2015 ATA RR stratification system is useful for predicting disease status at 2 years post-treatment in patients with DTC. The 2015 ATA guidelines and modified ATA RR stratification treatment recommendations may reduce thyroid cancer overtreatment by including lobectomy as a definitive treatment option for low-risk thyroid cancers and selective use of RAI for intermediate- and high-risk patients.
Introduction
The 2015
Since publication of the ATA guidelines, retrospective cohort studies have validated the 2015 revised ATA RR stratification system. 6 –11 A recent large prospective multicenter study from Italy evaluated the extent to which the performance of the ATA RRA is affected by the treatment center itself. 12 The participating centers were not provided guidance or restrictions for patient management, and the study included patients treated since 2013.
Therefore, all evaluations of the 2015 revised ATA RRA system for patients with DTC were based on outcome assessments for patients whose initial treatment was mostly guided by the 2009 iteration of the ATA guidelines. 13 However, the 2015 guideline update introduced several recommendations to reduce thyroid cancer overtreatment. Most importantly, these changes included (a) the option to either omit RAI therapy or use reduced 131I activities for postoperative adjuvant therapy of select patients and (b) the option to choose lobectomy as a definitive treatment for low-risk thyroid cancers, which was used for only 3.5% of the patients evaluated in the most recent study by Grani et al. 2,12,14
We present our findings on the prospective implementation of the 2015 ATA guidelines for the management of thyroid nodules and DTC using the modified ATA RR stratification system in a Canadian tertiary care referral setting.
Patients and Methods
Since April 2017, the University of Calgary Division of Endocrinology has implemented the 2015 ATA guidelines and modified ATA RR stratification system. It has prospectively assessed all new patients with thyroid cancer in the Calgary and Southern Alberta health care regions, which have more than 1.5 million inhabitants, for their ATA RR, TNM/American Joint Committee on Cancer (AJCC) staging, metastasis (distant), age (at presentation), completeness of excision, invasion, size (MACIS) score, and their indication for and dose of RAI treatment within 3 months after surgery. All patients diagnosed with DTC in this health care region are referred to the Calgary Division of Endocrinology thyroid cancer group, which meets biweekly to review new referrals as described above, performs RAI treatments, and follows patients long-term.
The 612 patients evaluated for this study were identified in the prospective web-based RedCap Calgary thyroid cancer database, which contains ethics approved (Ethics ID:

Patient flow chart.
As per ATA guidelines, initial follow-up assessment occurred between 6 and 12 months after surgery and was classified as the 1-year follow-up, and the subsequent visit that occurred between 18 and 24 months was classified as the 2-year follow-up. Response to treatment (RTT) for patients who underwent total thyroidectomy (TTX) plus RAI remnant ablation was classified according to the 2015 ATA guidelines. 1 For patients with TTX only or lobectomy, the ATA guidelines do not give specific criteria for RTT.
We, therefore, followed the criteria reported by Momesso et al. The RTT for lobectomy was classified as follows: excellent RTT had stable nonstimulated thyroglobulin (Tg) <30 ng/mL with undetectable anti-Tg antibodies (TgAbs) and negative imaging. Indeterminate RTT had stable/declining TgAb levels. Biochemical incomplete RTT had nonstimulated Tg >30 ng/mL or increasing TgAb levels. Finally, structural incomplete RTT had any structural or functional evidence of disease. The RTT for TTX only was classified as follows: excellent RTT had nonstimulated Tg <0.2 ng/mL with undetectable TgAbs and negative imaging.
Indeterminate RTT had nonstimulated Tg level of 0.2 to 5 ng/mL, or stable/declining TgAb levels. Biochemical incomplete RTT had nonstimulated Tg >5 ng/mL or increasing TgAb levels. Finally, structural incomplete RTT had any structural or functional evidence of disease regardless of Tg or TgAb level. 7 Patients who underwent TTX plus RAI versus TTX alone versus lobectomy were categorized for RTT by their biochemical markers and follow-up ultrasound (US). The criteria for excellent RTT, indeterminate RTT, and biochemical incomplete RTT are different depending on the treatment modality offered, as described in the ATA guidelines.
Structural incomplete response is defined as any structural or functional evidence of disease with any Tg level, with or without anti-Tg antibodies. Patient data were collected and entered into our database within 1 month of each patient's first postoperative visit with their respective endocrinologist, except for 70 patients whose data were entered outside of the 1-month period. Follow-up data were entered within 2 months of each follow-up visit.
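The nonstimulated-Tg criteria described above for lobectomy and TTX only patients can be sketched as a small decision function. This is a minimal illustration, not the study's actual procedure: the function name, the category labels for TgAb trend, and the handling of boundary values (for example, a Tg of exactly 30 ng/mL after lobectomy, which the criteria do not address) are assumptions made here, and the requirement that Tg be "stable" after lobectomy is not modeled.

```python
from typing import Literal

Treatment = Literal["lobectomy", "ttx_only"]
TgAbTrend = Literal["undetectable", "stable_or_declining", "rising"]

# Nonstimulated Tg cut-offs in ng/mL, taken from the criteria in the text:
# (excellent if Tg is below, biochemical incomplete if Tg is above).
TG_CUTOFFS = {
    "lobectomy": (30.0, 30.0),
    "ttx_only": (0.2, 5.0),
}


def classify_rtt(
    treatment: Treatment,
    tg_ng_ml: float,
    tgab: TgAbTrend,
    structural_disease: bool,
) -> str:
    """Return the RTT category for a lobectomy or TTX only patient.

    Hypothetical helper: structural_disease=False stands in for
    "negative imaging" in the criteria.
    """
    # Any structural or functional evidence of disease overrides Tg/TgAb.
    if structural_disease:
        return "structural incomplete"

    excellent_below, incomplete_above = TG_CUTOFFS[treatment]

    # Rising TgAb or Tg above the upper cut-off: biochemical incomplete.
    if tgab == "rising" or tg_ng_ml > incomplete_above:
        return "biochemical incomplete"

    # Low Tg with undetectable TgAb and negative imaging: excellent.
    if tgab == "undetectable" and tg_ng_ml < excellent_below:
        return "excellent"

    # Everything else (intermediate Tg, or stable/declining TgAb) is
    # treated as indeterminate; this catch-all is an assumption.
    return "indeterminate"
```

For example, `classify_rtt("ttx_only", 1.0, "undetectable", False)` falls in the 0.2 to 5 ng/mL band and is therefore labeled indeterminate.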
Initial surgery of 84% of our 612 patients was performed by one of our higher volume thyroid surgeons. Standardized surgical and histology reports were introduced within our health care region before 2016. In January 2018, we introduced postoperative neck ultrasound assessments according to the European Thyroid Association (ETA) guidelines to standardize the RTT seen by neck ultrasound. 15
Statistical analyses were performed using the R statistical software package (R Core Team, 2017). 16 Continuous variables were expressed as medians and ranges, while nominal variables were expressed as frequency counts and percentages. We used the Welch t-test and a cumulative link model from the R package ordinal. 17 A p-value of <0.05 was considered statistically significant.
Results
Clinical and demographic features of 479 patients with 2-year follow-up are described in Table 1.
Clinical and Demographic Features of the 479 Differentiated Thyroid Cancer Patients with 2-Year Follow-Up
Total thyroidectomy includes two-step thyroidectomies.
Four PTC other type, 5 PTC clear cell, 2 PTC Warthin-like.
Three PTC other type, 3 PTC clear cell, 2 PTC Warthin-like.
One PTC other type, 2 PTC clear cell.
AJCC, American Joint Committee on Cancer; ATA, American Thyroid Association; PTC, papillary thyroid cancer; RAI, radioiodine; TTX, total thyroidectomy.
Among our 479 patients, 93% (445) were diagnosed with papillary thyroid cancer (PTC), including 28% (132) classic variants, 10% (49) aggressive variants, 14% (69) micro-PTC (miPTC), and 41% (195) other PTC variants. The majority had multifocal malignancy and were AJCC stage I. Among 69 miPTC patients, 41 had microcarcinoma targeted by fine needle aspiration and underwent therapeutic surgery, and 21 patients had microcarcinoma incidentally found on pathology reports. More details are in Supplementary Table S1.
Of the 41 miPTC patients who underwent therapeutic surgery, 22 had max target lesions ≥1 cm on preoperative US, and 19 had max lesions <1 cm (0.6–0.9 cm) on preoperative US. The reasons for biopsies of the 19 lesions <1 cm are described in Supplementary Table S2. Among these 41 miPTC patients, 29 were treated by TTX. Reasons for TTX are described in Supplementary Tables S3 and S4. The distribution of ATA RR across all patients was ATA high RR for 97 (20%), intermediate for 129 (27%), and low for 253 (53%). The prevalence of pathologic characteristics is detailed in Table 2.
Pathologic Characteristics of the 479 Differentiated Thyroid Cancer Patients with 2-Year Follow-Up, Classified by Risk of Recurrence
Tumor size <1.
Four PTC other type, 5 PTC clear cell, 2 PTC Warthin-like.
Four PTC other type, 2 PTC clear cell, 1 PTC Warthin-like.
Two PTC clear cell.
One PTC clear cell, 1 PTC Warthin-like.
Nine microcarcinoma, all have <5 LN and LN metastasis ≤0.2 cm.
Three follicular variant PTC, all have <5 LN (0.03–0.4 cm).
Seven follicular thyroid carcinoma with <4 foci vascular invasion.
Three micro extrathyroidal extension into perithyroidal fat only.
Seventeen patients had known persistent disease at baseline.
LN, lymph node.
For the low-risk group, 40 of 253 patients had N1 disease but were still classified as low RR according to ATA guidelines because they had <5 positive lymph node (LN) metastases or a maximum metastasis size <0.2 cm. Patients classified as intermediate or high risk could have multiple pathologic characteristics that meet the ATA guideline criteria.
The RTT evaluation for the 479 patients classified by treatment type and ATA RR is included in Tables 3 and 4. There were 227 (47%) patients treated with TTX and RAI, 178 (37%) with TTX only, and 74 (16%) with lobectomy. Of the 74 patients who underwent lobectomy, 42 underwent a therapeutic lobectomy, 31 underwent a diagnostic lobectomy, and 1 was unspecified (Supplementary Table S12). The criteria for offering lobectomy are described in Supplementary Table S13.
Response to Treatment Evaluation for 479 Patients at 2-Year Follow-Up, Classified by Treatment Type
RTT, response to treatment.
Response to Treatment Evaluation for 479 Patients at 2-Year Follow-Up, Classified by Risk of Recurrence
Small suspicious lymph node on postop ultrasound.
Excellent RTT was seen in 89% (66) of lobectomy patients, 84% (149) of TTX only, and 53% (121) of TTX plus RAI (p < 0.0001 and p = 0.1 when comparing lobectomy with TTX plus RAI and TTX only, respectively). Indeterminate RTT was seen in 7% (5) of patients with lobectomy, 15% (27) with TTX only, and 21% (48) with TTX plus RAI. Biochemical incomplete RTT was seen in 3% (2) of patients with lobectomy, 0.5% (1) with TTX only, and 8% (18) with TTX plus RAI. Finally, structural incomplete RTT was seen in 1% (1) of patients with lobectomy, 0.5% (1) with TTX only, and 18% (40) with TTX plus RAI (Table 3).
The distribution of initial postoperative ATA RR for these 479 patients was 97 (20%) with high RR, 129 (27%) with intermediate RR, and 253 (53%) with low RR. Excellent RTT was seen in 85% (216) of low-risk patients, 64% (83) of intermediate-risk, and 38% (37) of high-risk (p < 0.0001 when comparing low RR with either the intermediate or high RR). Indeterminate RTT was seen in 13% (32) of low-risk, 23% (30) of intermediate-risk, and 19% (18) of high-risk patients. Biochemical incomplete RTT was seen in 2% (4) of low-risk, 6% (7) of intermediate risk, and 10% (10) of high-risk patients.
Structural incomplete RTT was recorded in 7% (9) of patients with intermediate RR, 33% (32) of high RR, and only recorded in 1 low RR patient (p < 0.0001 when comparing high RR with either low or intermediate RR) (Table 4 and Fig. 2). Detailed data of RTT evaluation classified by ATA RR for lobectomy, TTX only, and TTX plus RAI patients are presented in Supplementary Table S5.
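The within-group percentages quoted above follow directly from the reported counts. As a cross-check, the short sketch below recomputes them; the dictionary layout and function name are illustrative choices made here, and only the counts come from the text. Recomputed values can differ from the published percentages by about a point because of rounding.

```python
# RTT counts at 2-year follow-up by ATA recurrence risk, as reported in the text.
RTT_COUNTS = {
    "low": {
        "excellent": 216,
        "indeterminate": 32,
        "biochemical incomplete": 4,
        "structural incomplete": 1,
    },
    "intermediate": {
        "excellent": 83,
        "indeterminate": 30,
        "biochemical incomplete": 7,
        "structural incomplete": 9,
    },
    "high": {
        "excellent": 37,
        "indeterminate": 18,
        "biochemical incomplete": 10,
        "structural incomplete": 32,
    },
}


def row_percentages(row: dict[str, int]) -> dict[str, float]:
    """Convert one risk group's counts into percentages of that group."""
    total = sum(row.values())
    return {category: 100.0 * n / total for category, n in row.items()}


if __name__ == "__main__":
    for risk, row in RTT_COUNTS.items():
        pct = row_percentages(row)
        summary = ", ".join(f"{cat}: {p:.1f}%" for cat, p in pct.items())
        print(f"{risk} RR (n={sum(row.values())}): {summary}")
```

The group totals recovered this way (253, 129, and 97) sum to the 479 patients with 2-year follow-up.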

Response to treatment at the 2-year follow-up for all 479 patients and for 445 PTC patients, according to the American Thyroid Association recurrence risk group. PTC, papillary thyroid cancer.
The odds ratio for structural incomplete RTT versus excellent RTT at 2-year follow-up was 186.81 (95% CI: 24.76–1409.19) and 23.42 (95% CI: 2.92–187.75) for high risk and intermediate risk when using low risk as reference level, respectively (p < 0.0001 and <0.005, respectively). The OR for less-than-excellent RTT was 13.26 (95% CI: 7.78–22.59) and 3.27 (95% CI: 1.99–5.35) for high risk and intermediate risk when using low risk as reference level, respectively (p < 0.0001 for both; Table 5).
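The odds ratios for structural incomplete versus excellent RTT can be approximated from the reported counts with a plain 2x2 calculation and a Wald confidence interval on the log odds ratio. This is a sketch under that assumption, not the study's own R code, and the function name is hypothetical. The ORs for less-than-excellent RTT need not match a 2x2 calculation, presumably because they come from the cumulative link model.

```python
import math
from statistics import NormalDist


def odds_ratio_wald(a: int, b: int, c: int, d: int, level: float = 0.95):
    """Odds ratio with a Wald confidence interval from a 2x2 table.

    a: events in the comparison group      (e.g., structural incomplete, high RR)
    b: non-events in the comparison group  (e.g., excellent, high RR)
    c: events in the reference group       (e.g., structural incomplete, low RR)
    d: non-events in the reference group   (e.g., excellent, low RR)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    # Two-sided normal quantile, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se), math.exp(log_or + z * se)


# Counts from the text: structural incomplete vs. excellent RTT, low RR as reference.
high_vs_low = odds_ratio_wald(32, 37, 1, 216)
intermediate_vs_low = odds_ratio_wald(9, 83, 1, 216)

if __name__ == "__main__":
    for label, (or_, lo, hi) in [
        ("high vs. low", high_vs_low),
        ("intermediate vs. low", intermediate_vs_low),
    ]:
        print(f"{label}: OR {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

The very wide intervals reflect the single structural incomplete case in the low RR reference group, which contributes a term of 1/1 to the variance of the log odds ratio.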
Likelihood of Structural Incomplete Response to Treatment Versus Excellent Response to Treatment, and Likelihood of Less-Than-Excellent Response to Treatment at 2-Year Follow-Up
CI, confidence interval; OR, odds ratio; SE, standard error.
The number and distribution of intermediate/high-risk pathologic characteristics for TTX plus RAI or TTX only patients determined to have indeterminate RTT, biochemical incomplete RTT, or structural incomplete RTT at 2-year follow-up are shown in Supplementary Tables S6 and S7, and details of how patients' RTT were categorized are provided in Supplementary Table S8. We observed that higher RR was associated with higher baseline post-surgical Tg (Supplementary Table S9).
Of the 227 patients who received RAI, 120 (53%) received 30 mCi, most commonly for vascular invasion and microscopic extrathyroidal extension, and a few had gross extrathyroidal extension or incomplete tumor resection. Among the patients who received 100 mCi, there was a higher prevalence of lymphatic spread, gross extrathyroidal extension, or incomplete tumor resection. Finally, those receiving 150 to 250 mCi had the highest prevalence of incomplete tumor resection or distant metastases. The characteristics of patients who received RAI are in Supplementary Table S10. Data for suppressed TSH of 479 patients with at least 2 years of follow-up are provided in Supplementary Table S11.
Discussion
Our prospective data demonstrate the prognostic utility of the ATA RRA in predicting postoperative outcomes at the 2-year follow-up. There is an increase in suboptimal RTT with increasing ATA RRA category, despite the intermediate- and high-risk patients receiving more intensive treatments (Table 4). When we divided the patients by RR as well as the type of treatment they underwent (Supplementary Table S5), we found that for low RR patients, more aggressive treatment was associated with a lower likelihood of excellent RTT.
This may be confounded by the fact that patients with high ATA RR had multiple higher risk features that worsen outcomes. Altogether, these findings support the concept that low RR patients will have favorable outcomes, even with less aggressive treatment such as lobectomy and less aggressive follow-up schedules. In contrast, it seems that despite more aggressive treatments, such as high dose RAI, patients with high RR will have a high probability of biochemical or structural incomplete RTT. The benefits of this RR adapted approach include lower treatment-related morbidity, better quality of life, and lower health care costs. 18 However, those who are ATA intermediate- or high-RR must have their benefits weighed against the risk of biochemical or structural incomplete RTT and a higher risk for DTC recurrence.
When we looked at baseline Tg levels immediately after TTX plus RAI or TTX only, patients who had excellent RTT at 2 years were more likely to have undetectable levels of Tg when compared with patients with indeterminate, biochemical incomplete, or structural incomplete RTT (p < 0.0001 for any other group). Those with indeterminate RTT were more likely to have Tg levels between 0.2 and 0.99 (p < 0.001 vs. any other group).
Finally, those with biochemical or structural incomplete RTT were more likely to have Tg ≥1 when compared with patients with excellent and indeterminate RTT with p < 0.0001. We did not observe any specific pattern of Tg-antibody levels across the different RTT groups (Supplementary Table S9).
We note that the TTX plus RAI group has more intermediate or high RR characteristics, even with the same RTT (Supplementary Tables S6 and S7). Of the patients who underwent TTX plus RAI, 40 had structural incomplete RTT at the 2-year follow-up, 25 of whom were classified based on distant metastases. However, 17 had distant metastases before surgery (Supplementary Table S6), which would skew the distribution toward structural incomplete RTT. We also acknowledge that these findings may be confounded by other clinical factors not captured by the ATA RR, such as more rapid tumor growth rate, multiple foci of malignancy, or lack of TSH suppression.
Although 2 years is not a long duration of follow-up, our data already show a separation in the RTT by ATA RR. Durante et al previously noted in their study of 1020 PTC patients across eight hospitals in Italy that recurrence tends to occur within the first 6 to 12 months after treatment. 19 Medas et al looked at 579 patients with DTC in 2019 and showed that 10 out of 36 cases of recurrence occurred within the first year, and the remaining 26 patients had a mean time to recurrence of 21.8 months. 20
Since these studies suggest that most cases of persistent and recurrent disease occur within the first 2 years after surgery, our 2-year follow-up data are expected to reflect the pattern of longer-term data and can be used to substantiate and inform our use of the current ATA guidelines.
Considerable interinstitutional and interobserver variabilities are documented for the diagnosis of histological subtypes of thyroid cancer, 21 –24 detection and quantification of extrathyroidal extension, postoperative neck ultrasonographic examination, indication for and dose of RAI treatment, and method of TSH stimulation. 25 –27 Moreover, outcomes also vary depending on the number of thyroidectomies performed by the respective surgeons. 28,29
One of the strengths of our prospective study is the reduced influence of these variables on outcomes by the prospective determination of ATA RRA, TNM staging, and MACIS staging, which guided standardized and consistent risk-stratified recommendations for or against postoperative RAI treatment. In addition, other strategies include ATA-guideline based follow-up evaluations, introduction of ETA guideline-based postoperative neck ultrasound for assessing RTT, standardization of surgery and histology reports, and defining and adopting the criteria for offering lobectomy for the definitive treatment of thyroid cancer.
This evidence-based, risk-stratified clinical care pathway, which was applied to all new thyroid cancer referrals in our health care region, has reduced treatment variability, allowing us to accurately and prospectively assess the ability of the ATA RRA system to prognosticate outcomes for patients with DTC. This contrasts with the recent prospective multicenter study from Italy by Grani et al, which did not provide guidance or restrictions for patient management. 12 Considerable variability in management can be expected in that cohort, considering that it spanned a timeline before and after introduction of the modified ATA RRA system and 2015 ATA guidelines for the management of thyroid cancer.
Grani et al reported a similar rate of ATA low RR of 53.6% versus our 53% (p > 0.05), a higher rate of ATA intermediate RR of 38.4% versus our 27% (p < 0.05), and a lower rate of ATA high RR of 8.0% versus our 20% (p < 0.005). However, despite lower rates of high RR, the treatments offered to patients in that study were more aggressive, with higher rates of TTX plus RAI (57.5% vs. 47%, p < 0.0001), and of neck dissections (38% vs. 33%, p < 0.05). In addition, Grani et al had lower rates of lobectomies as definitive treatment of thyroid cancer (3.5% vs. 16%, p < 0.0001).
This result is consistent with a previous study based on the National Surgery Quality Improvement Program database that looked at 35,291 patients undergoing thyroid surgery between 2009 and 2017, and it reported a significant increase in lobectomies from 17% to 22% after the release of the 2015 ATA guideline. 30
The results of our prospective study demonstrate that ATA RR stratification is a good predictor of 2-year RTT (Table 5). Sapuppo et al indicated that persistent disease is a much more relevant marker of unfavorable outcomes than recurrent disease in patients with DTC. 31 In addition, a study by Bates et al defined persistent disease as the presence of less-than-excellent RTT within 1 year of initial treatment, and recurrent disease as cases in which patients had excellent RTT throughout the first year post initial treatment. Using this definition, over a period of 16 years, they found that 97% of re-operations on pathology proven PTC were due to persistence rather than recurrence.
Similarly, we found that only 15% of our patients with less-than-excellent RTT at the 2-year follow-up showed excellent RTT at the 1-year follow-up. Together, these findings indicate that most patients with less-than-excellent RTT have persistent disease (biochemical or radiographic abnormalities within 1 year of treatment) rather than recurrent disease (no biochemical or radiographic abnormalities within 1 year of treatment). 32 As most recurrences are identified within 2 years and almost all within 5 years, 19 further follow-up of our prospective cohort is necessary to assess longer-term recurrence rates.
Poorly differentiated follicular thyroid cancers, Hurthle cell cancers, and some of the other variants of PTC are known to behave differently from classic PTC and more cases are required to assess the impact of variants on RR. 21
The 2015 ATA guidelines and the classification criteria published by Momesso et al are meant to be used on patients who underwent TTX plus RAI or TTX only, respectively. 1,7 They differ in their cut-off Tg values when classifying patients' responses to account for the difference in intervention. When we assessed RTT for our 178 TTX only patients using either the 2015 ATA guidelines or the criteria by Momesso et al, we found that only three patients had a difference in their RTT category. This suggests that either TTX only patients can be assessed using the 2015 ATA guidelines, or that if there is a significant difference in outcomes between the Tg cut-off values of ATA versus Momesso et al, our study was not powered to capture it.
Our study is subject to some limitations. A potential source of bias includes patients who have worse disease at baseline being selected for more aggressive therapies. This is noted among our 25 patients with distant metastases at the 2-year follow-up, of whom 17 had metastatic disease at baseline (Supplementary Table S6). However, this only constitutes a small portion of the 227 patients who underwent TTX plus RAI.
Missing data are noted in the relevant tables, and details for two patients who were lost to follow-up are included in the patient flow chart (Fig. 1). Another limitation is that we did not distinguish outcomes for persistent or recurrent disease. We also acknowledge that 2 years is a relatively short follow-up period for the study of thyroid cancer outcomes.
Conclusion
The 2015 ATA RR stratification is a reliable tool to predict 2-year RTT and its implementation with the 2015 ATA DTC guidelines may help to reduce overtreatment in patients with DTC. This is enabled by including lobectomy as definitive treatment for low-RR thyroid cancers and selective use of postoperative RAI treatment. Further follow-up of our prospective cohort is necessary to assess longer-term outcomes for the respective ATA RR.
Footnotes
Authors' Contributions
Study conception and design: J.W. and R.P. Drafting of article: J.W., X.Y.H., and R.P. Data analysis and interpretation: J.W., X.Y.H., S.G., and R.P. Contribution of patient data: S.G., S.K., C.J.S., P.G., V.M.P., P.S., D.L., M.K., M.H., S.P.C., J.L.P., A.H., J.W., R.H., M.D., D.R.R., and R.P. Review and approval of manuscript: all authors.
Author Disclosure Statement
All authors have no potential conflicts of interest or disclosures.
Funding Information
Funding for this study was obtained from: Cardiometabolic fund University of Calgary; Sanofi; Bayer; Eli Lilly; EFW Radiology and a Canadian Society of Endocrinology and Metabolism/Thyroid Foundation of Canada Research Award.
Supplementary Material
Supplementary Table S1
Supplementary Table S2
Supplementary Table S3
Supplementary Table S4
Supplementary Table S5
Supplementary Table S6
Supplementary Table S7
Supplementary Table S8
Supplementary Table S9
Supplementary Table S10
Supplementary Table S11
Supplementary Table S12
Supplementary Table S13
