Abstract
Objectives:
The aims of this study were to validate the proposed Latin American Thyroid Society (LATS) risk of recurrence stratification system and to compare the findings with those of the American Thyroid Association (ATA) risk of recurrence stratification system.
Subjects and Methods:
This study is a retrospective review of papillary thyroid cancer patients treated with total thyroidectomy and radioactive iodine at a single experienced thyroid cancer center and followed according to the LATS management guidelines. Each patient was risk-stratified using both the LATS and ATA staging systems. The primary endpoints were (i) the best response to initial therapy defined as either remission (stimulated thyroglobulin [Tg] <1 ng/mL, negative ultrasonography) or persistent disease (biochemical and/or structural), and (ii) clinical status at final follow-up defined as no evidence of disease (suppressed Tg <1 ng/mL, negative ultrasonography), biochemical persistent disease (suppressed Tg >1 ng/mL in the absence of structural disease), structural persistent disease (locoregional or distant metastases), or recurrence (biochemical or structural disease identified after a period of no evidence of disease).
Results:
One hundred seventy-one papillary thyroid cancer patients were included (mean age 45±16 years, followed for a median of 4 years after initial treatment). Both the ATA and LATS risk stratification systems provided clinically meaningful graded estimates with regard to (i) the likelihood of achieving remission in response to initial therapy, (ii) the likelihood of having persistent structural disease in response to initial therapy and at final follow-up, (iii) the likely locations of the persistent structural disease (locoregional vs. distant metastases), (iv) the likelihood of recurrence, and (v) the likelihood of having no evidence of disease at final follow-up. The likelihood of having persistent biochemical evidence of disease was not significantly different across the staging categories.
Conclusions:
Both the ATA and LATS risk of recurrence systems effectively risk-stratify patients with regard to multiple important clinical outcomes. When used in conjunction with a staging system that predicts disease-specific mortality, either of these systems can be used to guide risk-adapted individualized initial management recommendations.
Introduction
Although these novel risk stratification systems were originally designed as tools to estimate the risk of recurrence, several studies have now demonstrated that they are predictive of multiple important clinical outcomes, including the likelihood of going into remission with initial therapy (also known as having an excellent response to initial therapy), the likelihood of having persistent biochemical or structural evidence of disease after initial therapy, the likelihood of having disease recurrence after achieving remission, and the likelihood of being disease-free at final follow-up (often described as no evidence of disease [NED]) (6–9).
Although the proposed risk of recurrence stratification systems from the ATA, ETA, and LATS all rely on a common set of clinicopathologic features (3–5), there are some differences in the specific definitions of the risk categories between the systems (Table 1). Unlike the ATA system, the LATS and ETA systems define a very low-risk group characterized by papillary microcarcinomas confined to the thyroid (T1a N0 M0). All three organizations define a low-risk group, which comprises intrathyroidal 1–4 cm tumors in the LATS and ETA systems, and intrathyroidal primary tumors of any size in the ATA system, provided there is no aggressive histology, extrathyroidal extension, vascular invasion, or lymph node metastases. While each of the systems classifies distant metastases and gross residual disease as high-risk disease, the ETA and LATS classify all N1 disease as high risk, whereas the ATA classifies N1 disease as intermediate risk. Extrathyroidal extension, vascular invasion, and aggressive histologies are classified as intermediate-risk features in the ATA system and as high-risk features in the LATS system. Therefore, although there are many similarities between the proposed staging systems, differences exist that could result in significantly different risk of recurrence, persistent disease, and final outcomes in patients classified as either “low risk” or “high risk” by the different systems (10,11).
Table 1 abbreviations: LATS, Latin American Thyroid Society; ETA, European Thyroid Association; ATA, American Thyroid Association; ETE, extrathyroidal extension; TB, thyroid bed; Tg, thyroglobulin.
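The category definitions described in the preceding paragraph can be written as a small decision function. The following Python sketch is an illustration only: it encodes just the criteria stated in this section, the `Tumor` fields and function names are hypothetical, the handling of intrathyroidal tumors larger than 4 cm as LATS high risk is an assumption not stated in the text, and it is not a substitute for the full ATA or LATS guideline criteria.

```python
from dataclasses import dataclass


@dataclass
class Tumor:
    """Hypothetical, simplified summary of the features named in the text."""
    size_cm: float
    extrathyroidal_extension: bool = False
    vascular_invasion: bool = False
    aggressive_histology: bool = False
    lymph_node_metastases: bool = False   # N1 disease
    distant_metastases: bool = False      # M1 disease
    gross_residual_disease: bool = False


def _high_in_both(t: Tumor) -> bool:
    # Distant metastases and gross residual disease are high risk in each system.
    return t.distant_metastases or t.gross_residual_disease


def _adverse_features(t: Tumor) -> bool:
    # Intermediate-risk features in the ATA system, high-risk features in LATS.
    return (t.extrathyroidal_extension
            or t.vascular_invasion
            or t.aggressive_histology)


def ata_risk(t: Tumor) -> str:
    if _high_in_both(t):
        return "high"
    if t.lymph_node_metastases or _adverse_features(t):
        return "intermediate"
    return "low"  # intrathyroidal tumor of any size without adverse features


def lats_risk(t: Tumor) -> str:
    if _high_in_both(t) or t.lymph_node_metastases or _adverse_features(t):
        return "high"
    if t.size_cm <= 1.0:
        return "very low"  # microcarcinoma confined to the thyroid (T1a N0 M0)
    if t.size_cm <= 4.0:
        return "low"       # intrathyroidal 1-4 cm tumor
    return "high"          # ASSUMPTION: >4 cm is not defined in the text above


# The same N1 patient falls into different categories in the two systems.
n1_patient = Tumor(size_cm=2.0, lymph_node_metastases=True)
print(ata_risk(n1_patient), "/", lats_risk(n1_patient))
```

The example patient illustrates the main source of disagreement between the systems: node-positive disease is intermediate risk under the ATA rules and high risk under the LATS rules.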
Therefore, the aim of the present study was to describe both early and late clinical outcomes (12) in the same cohort of differentiated thyroid cancer (DTC) patients risk-stratified according to the ATA and LATS risk of recurrence classification systems in order to better understand the clinical implications of being classified as very low, low, intermediate, or high risk of recurrence.
Materials and Methods
We retrospectively reviewed our database containing the records of 535 patients with DTC who had been followed up from January 2001 to December 2011. To be included in the analysis, patients were required to have undergone total thyroidectomy with or without lymph node dissection and to have received remnant ablation with radioactive iodine (RAI) after thyroid hormone withdrawal (THW). After this initial approach, patients were followed up according to the LATS guidelines (4). Of the 535 DTC patients evaluated at our center, 156 were excluded because the follow-up was less than 1 year, 107 because RAI ablation was done using recombinant human thyrotropin (TSH), 73 because of the presence of detectable anti-thyroglobulin antibodies (TgAb), 16 because they were more than 80 years old at diagnosis, and 12 because they were treated with hemithyroidectomy without RAI remnant ablation. After these exclusions, 171 DTC patients were included in the study.
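The cohort selection above is a simple subtraction, and it can be checked in a few lines. The Python sketch below uses only the counts reported in this paragraph; the dictionary keys are paraphrased labels, and the calculation assumes that each excluded patient is counted under exactly one reason, as the reported totals imply.

```python
# Counts as reported in the text; labels are paraphrased for readability.
screened = 535
exclusions = {
    "follow-up shorter than 1 year": 156,
    "ablation with recombinant human TSH": 107,
    "detectable anti-thyroglobulin antibodies": 73,
    "older than 80 years at diagnosis": 16,
    "hemithyroidectomy without RAI ablation": 12,
}

excluded = sum(exclusions.values())
included = screened - excluded

print(f"excluded: {excluded}, included: {included}")  # excluded: 364, included: 171
```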
Ablation protocol
Our ablation protocol used fixed RAI activities based on the extent of initial disease. Patients typically received 3.70 GBq (100 mCi) 131I for low-risk (ATA) disease, 5.55 GBq (150 mCi) for intermediate-risk (ATA) disease, and 7.40 GBq (200 mCi) for T4 and/or M1 patients. A low-iodine diet was prescribed from one week before RAI administration through two days afterward. THW comprised at least three weeks without thyroid hormone, starting from thyroidectomy or THW for the diagnostic studies. RAI was administered following that interval, in all cases with TSH levels above 50 mIU/L. A post-therapy whole body scan (WBS) was performed five to seven days after therapeutic RAI administration.
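The paired activity units in this protocol follow from the definition of the curie (1 mCi = 37 MBq). As a minimal sketch, the following Python snippet reproduces the GBq values quoted above from the mCi values; the function name is hypothetical, and the snippet is a unit check only, not dosing guidance.

```python
MBQ_PER_MCI = 37.0  # 1 mCi = 37 MBq by definition of the curie


def mci_to_gbq(mci: float) -> float:
    """Convert an activity in millicuries to gigabecquerels."""
    return mci * MBQ_PER_MCI / 1000.0


# Fixed activities named in the ablation protocol.
for mci in (100, 150, 200):
    print(f"{mci} mCi = {mci_to_gbq(mci):.2f} GBq")
```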
Thyroglobulin/thyroglobulin antibody measurement
Samples for Tg and TgAb measurement were taken on the day of ablative RAI administration. Tg and TgAb levels were assessed in one of two reference laboratories from Argentina using either of two commercial immunometric assays; the same laboratory and assay were used throughout a patient's follow-up. Tg assays comprised the Elecsys Tg Electrochemiluminescence Immunoassay (Roche Diagnostics GmbH, Mannheim, Germany), which has a 0.5 μg/L detection limit, or the Immulite 2000 Tg Chemiluminescence Assay (Siemens, Los Angeles, CA), with a 0.9 μg/L functional sensitivity. TgAb assays comprised the Elecsys Anti-Tg Electrochemiluminescence Immunoassay (RSR Ltd., Pentwyn, Cardiff, United Kingdom), or the Immulite 2000 Anti-TGAb chemiluminescent immunometric assay method (Siemens). For both TgAb assays, values >20 IU/mL were considered to be positive, and to render Tg measurements uninterpretable. These patients were excluded from the study.
Clinical management during follow-up
The clinical status in response to initial therapy was assessed using THW stimulated Tg testing and neck ultrasonography (US) in all patients augmented with diagnostic WBS in high-risk patients (150 MBq [4 mCi] activity) performed 6–12 (mean 9±3) months after ablation. Neck US using an 11 MHz linear array transducer was performed every 6 months after ablation. Patients with measurable stimulated or unstimulated Tg, suspicious neck US findings, or both during follow-up underwent morphological or functional imaging or both, including computed tomography (n=59 [35%]) or 18-fluorodeoxyglucose positron emission tomography (n=26 [15%]). All ultrasonographically suspicious nodules ≥1 cm in diameter underwent fine-needle aspiration with measurement of Tg in the aspirate.
After ablation, all patients were kept on a suppressed TSH level until January 2008, when all patients started thyroid hormone therapy according to the LATS recommendations for each risk of recurrence group (target TSH: <0.01 mIU/L high risk, 0.4–1 mIU/L low risk, and thyroid hormone replacement for very low-risk LATS classification) (4).
Clinical outcome definitions
The primary endpoint of the study was the best response to initial therapy (surgery plus RAI ablation) assessed at the 1 year (±2 months) follow-up visit based on stimulated Tg values, neck US, diagnostic whole body scanning, and risk-appropriate additional functional and cross-sectional imaging (6,7,12). Remission (also known as excellent response to therapy) was defined as a stimulated Tg <1 μg/L in the absence of TgAb, plus absent or <0.1% thyroid bed uptake on diagnostic WBS (if done), with a normal postoperative neck US. Patients demonstrating a stimulated Tg value >1 μg/L without structural evidence of disease were classified as having biochemical persistent disease. Patients with structural evidence of disease (with or without abnormal Tg values) were classified as having structural persistent disease.
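The three response categories defined above can be expressed as a short decision rule. The Python sketch below is a simplification under stated assumptions: all imaging findings (neck US, diagnostic WBS, cross-sectional imaging) are collapsed into a single hypothetical `structural_disease` flag, a stimulated Tg of exactly 1 μg/L (which the definitions leave unspecified) is treated as biochemical persistent disease, and the function name is illustrative.

```python
def best_response(stimulated_tg: float,
                  structural_disease: bool,
                  tgab_positive: bool = False) -> str:
    """Classify the best response to initial therapy (simplified sketch)."""
    if tgab_positive:
        # Positive TgAb renders Tg uninterpretable; such patients were excluded.
        raise ValueError("Tg is uninterpretable when TgAb is positive")
    if structural_disease:
        # Structural disease takes precedence, with or without abnormal Tg.
        return "structural persistent disease"
    if stimulated_tg < 1.0:
        return "remission"
    # ASSUMPTION: Tg of exactly 1 ug/L is grouped with values above 1 ug/L.
    return "biochemical persistent disease"


print(best_response(0.4, structural_disease=False))  # remission
print(best_response(3.2, structural_disease=False))  # biochemical persistent disease
print(best_response(0.4, structural_disease=True))   # structural persistent disease
```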
The second endpoint of the study was the clinical status at the time of final follow-up (6,12). Patients were classified as having NED if at the time of final follow-up the suppressed Tg was <1 μg/L, Tg antibodies were negative, neck US was free of suspicious signs, and there were no pathological findings on any other imaging studies performed for clinically indicated reasons (WBS, radiography, computed tomography, 18-fluorodeoxyglucose positron emission tomography, or any other modality) or in any biopsy specimen. Patients with persistent disease at the time of final follow-up were classified as having either biochemical or structural persistent disease using the same definitions used in the evaluation of response to initial therapy. Patients who had structural or biochemical evidence of disease identified following a period of NED were classified as having recurrent disease. Disease sites were classified as local (thyroid bed), lymph node (metastasis confirmed by fine-needle aspiration biopsy with positive cytology), and/or distant (metastasis confirmed by biopsy and/or imaging).
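The distinction between persistent and recurrent disease depends on the order of events: disease counts as recurrent only if it appears after a period of NED. One way to sketch this in Python is shown below. The input format (a chronological list of per-visit labels) and the function name are hypothetical, and the rule that a recurrence takes precedence over a later return to NED is an assumption based on the mutually exclusive final categories reported in this study.

```python
VALID = {"NED", "biochemical", "structural"}

PERSISTENT_LABEL = {
    "NED": "NED",
    "biochemical": "biochemical persistent disease",
    "structural": "structural persistent disease",
}


def final_status(visits: list[str]) -> str:
    """Derive clinical status at final follow-up from chronological visits."""
    if not visits:
        raise ValueError("at least one assessment is required")
    unknown = set(visits) - VALID
    if unknown:
        raise ValueError(f"unknown status labels: {sorted(unknown)}")

    seen_ned = False
    for status in visits:
        if status == "NED":
            seen_ned = True
        elif seen_ned:
            # Any evidence of disease after a period of NED is a recurrence.
            return "recurrence"
    # No NED-then-disease transition: classify by the last assessment.
    return PERSISTENT_LABEL[visits[-1]]


print(final_status(["biochemical", "NED", "NED"]))     # NED
print(final_status(["NED", "NED", "structural"]))      # recurrence
print(final_status(["structural", "structural"]))      # structural persistent disease
```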
Statistical analysis
Data are expressed as mean±SD unless otherwise noted. Categorical comparisons were made using chi-square testing, with Fisher's exact test when appropriate. For each category of best response to initial therapy, a Mantel–Haenszel weighted estimator was calculated to compare the crude rate with the risk in both classification systems. Analysis was performed using SPSS software (version 15.0.0: SPSS, Inc., Chicago, IL). p-Values ≤0.05 were considered to be statistically significant.
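For readers who want to see what Fisher's exact test computes on a 2×2 table, a minimal pure-Python sketch follows. It is not the SPSS procedure used in the study; the function name is hypothetical, the two-sided p-value is taken as the sum of the probabilities of all tables no more likely than the observed one (one common convention), and the example counts are a textbook-style illustration, not study data.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is at most as probable as the observed one;
    # the small tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lowest, highest + 1)
                if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical counts for illustration only (not taken from this study).
p = fisher_exact_two_sided(3, 1, 1, 3)
print(f"two-sided p = {p:.4f}")
```

With these illustrative counts the result is well above the 0.05 threshold used in this study, so such a table would not be considered a statistically significant association.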
Results
Each of the 171 DTC patients underwent total thyroidectomy in a specialized center with subsequent RAI remnant ablation after traditional THW. The mean follow-up in the whole cohort was 64±48 months (range 12–276 months, median 4 years). As can be seen in Table 2, the majority of patients had classic papillary thyroid cancer (PTC) (82%), 85% were female, and 58% were AJCC stage I.
Table 2 abbreviations: AJCC, American Joint Committee on Cancer; LN, lymph node; M1, systemic metastatic disease; RA, remnant ablation; NED, no evidence of disease.
Lymph node dissections were performed in 57%. In 28 of the 97 patients (29% of the subgroup), central neck dissection had been performed when intra-operative frozen section analysis verified lymph node metastasis. In the remaining 69 of the 97 (71% of the subgroup), central neck (level VI) dissection had mostly been indicated after T3 tumor status confirmation, when suspicious lymph nodes were noted during surgery, or when both conditions pertained. Therefore, of the 97 patients undergoing lymph node dissection, 84 (i.e., 87%) had ultimately confirmed nodal involvement. It was N1a only for 30 patients and N1a+N1b for the remaining 54 subjects.
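The subgroup percentages in this paragraph can be reproduced from the reported counts. The short Python check below uses only numbers stated in the text (171 patients in the cohort, 97 dissections, and the listed subgroups); the variable names and the `pct` helper are illustrative.

```python
cohort = 171
dissected = 97
frozen_section_positive = 28
other_indications = 69
node_positive = 84
n1a_only, n1a_plus_n1b = 30, 54


def pct(part: int, whole: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


print(pct(dissected, cohort))                   # 57
print(pct(frozen_section_positive, dissected))  # 29
print(pct(other_indications, dissected))        # 71
print(pct(node_positive, dissected))            # 87

# The subgroups partition their parent groups exactly.
print(frozen_section_positive + other_indications == dissected)  # True
print(n1a_only + n1a_plus_n1b == node_positive)                  # True
```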
While only 17% of patients were classified as high risk in the ATA system, 71% were classified as high risk by the LATS system (Table 2). Conversely, 35% were classified as ATA low risk, while 16% were classified as LATS low risk, with 13% classified as LATS very low risk.
For the entire cohort, the best response to initial therapy was remission in 75%, structural persistent disease in 17.5%, and biochemical persistent disease in 7.5%. At the time of final follow-up, 59% were classified as NED, 15% as having had recurrent disease, 14% as having structural persistent disease, and 12% as having biochemical persistent disease.
The primary endpoint defined as the best response to initial therapy is given in Table 3 for the ATA and LATS classification systems. With regard to predicting the likelihood of achieving remission in response to initial therapy, the ATA system provides a broad spectrum of estimates ranging from 90% in the low-risk patients to 76% in the intermediate-risk and 41% in the high-risk patients (Table 3). The LATS system also shows a decreasing likelihood of achieving remission in response to initial therapy across the categories ranging from 91% in the very low-risk, to 86% in the low-risk and 70% in the high-risk categories. While the likelihood of achieving remission was similar in the ATA low risk (90%) and the LATS very low risk (91%), the persistent disease in the LATS very low risk was exclusively biochemical persistent disease, while a few cases of structural persistent disease (3%) were identified in the ATA low-risk group.
Furthermore, the percentage of patients with structural persistent disease rises from 3% in the ATA low-risk to 17% in the ATA intermediate-risk and 48% in the ATA high-risk patients. Likewise, an increasing risk of structural persistent disease is seen across the LATS categories ranging from 0% in the LATS very low-risk, to 7% in the LATS low-risk, to 23% in the LATS high-risk patients. In both systems, the highest likelihood of having persistent structural disease after initial therapy was seen in the high-risk groups (48% in ATA high risk, and 23% in LATS high risk). The likelihood of having persistent disease (biochemical and/or structural persistent disease) after initial therapy increases across the spectrum of risk categories in both systems ranging from 9% in the LATS very low-risk, and 10% in the ATA low-risk patients to 59% in the ATA high-risk patients.
All of the persistent structural disease in the ATA low-risk, ATA intermediate-risk, and LATS low-risk patients consisted of cervical lymph node metastases (no distant metastases). However, the persistent structural disease in the ATA high-risk patients included 13 patients with lung metastases, and 1 patient with bone metastases. Similarly, the LATS high-risk group included 14 patients with cervical lymph node metastases, 13 patients with pulmonary metastases, and 1 patient with bone metastases. Therefore, both risk stratification systems predict the likelihood of having persistent disease after initial therapy, the likelihood that the persistent disease will be structural (vs. biochemical persistent disease), and the likelihood that the persistent disease will manifest as distant metastases (vs. locoregional disease in the neck).
The secondary endpoint defined as the clinical status at final follow-up is given in Table 4 for both classification systems. Once again, both risk stratification systems show a gradation of risk across the categories with regard to likelihood of being NED at final follow-up, likelihood of having persistent structural disease at final follow-up, and likelihood of having structural or biochemical evidence of recurrent disease during follow-up. The highest likelihood of being NED at final follow-up was seen in the ATA low-risk group (78%) and LATS very low-risk group (77%). Conversely, the highest recurrence rates were seen in the ATA high-risk group (20%) and the LATS high-risk group (22%). The likelihood of having persistent structural disease ranged from 3% (in low risk) to 42% (in high risk) in the ATA system, and from 4.5% (in very low risk) to 18% (in high risk) in the LATS system.
The likelihood of having persistent or recurrent disease at the time of final follow-up increases across the spectrum of risk categories in both systems ranging from 20% in the ATA low-risk to 66% in the ATA high-risk patients. The likelihood of having persistent biochemical disease was remarkably similar across the various risk categories without a clearly evident gradation across the categories (7–15% of the patients within each category). However, the risk of having disease recurrence (defined as biochemical or structural evidence of disease following a period of NED) steadily increased across the risk categories rising from a 5–7% risk in the very low-risk and low-risk categories to 21% in the ATA intermediate-risk category, 19% in the LATS high-risk, and 17% in the ATA high-risk patients.
In patients who achieved remission as their best response to initial therapy, there was no statistically significant difference between the classification systems with respect to their ability to predict NED as disease status at final follow-up, biochemical persistent disease status at final follow-up, and recurrent status at final follow-up (Mantel–Haenszel weighted estimator). In patients who achieved biochemical persistent disease as their best response to initial therapy, there was no statistically significant difference between the classification systems with respect to their ability to predict NED as disease status at final follow-up. In patients who achieved structural persistent disease as their best response to initial therapy, there was no statistically significant difference between the classification systems with respect to their ability to predict the presence of both biochemical persistent disease status at final follow-up and structural or system disease status at final follow-up (Mantel–Haenszel weighted estimator).
Discussion
By using the ATA and LATS risk of recurrence prognostic systems to risk-stratify the same cohort of 171 DTC patients treated with total thyroidectomy and RAI ablation at a single thyroid cancer specialty center, we have confirmed the utility of the ATA system (6,8,13) and, for the first time, demonstrated the clinical utility of the LATS system. Thus, the ATA risk stratification system has now been validated in cohorts of DTC patients in Argentina (our data), Brazil (13), Italy (8), and New York (6), confirming its clinical applicability across a wide spectrum of patients and healthcare systems.
Our data demonstrate that both the ATA and LATS risk stratification systems effectively stratify patients with regard to a broad spectrum of clinical outcomes. As would be expected, the precise estimates for remission, persistent disease, and recurrence will vary between risk categories based on the specific criteria used to define “low risk” and “high risk.” Therefore, it is important to clearly specify what is meant when describing a patient as either “high risk” or “low risk.” Sometimes these terms refer to AJCC staging, in which case they refer to the risk of death from thyroid cancer. At other times, these terms refer to risk of recurrence but without specifying whether we are referring to ATA high risk, LATS high risk, or ETA high risk. For clear communication between clinicians and in research reports, it is important to be very specific with regard to what risk is being referred to and what classification system (specific definitions) is being used.
With regard to predicting response to initial therapy, both the LATS and ATA systems demonstrated decreasing likelihood of achieving remission across the risk categories. As can be seen in Table 3, the LATS system classified a larger number of patients as high risk (n=121) than the ATA system (n=29). This is primarily related to the difference in classification of N1 patients as high risk in the LATS system, and intermediate risk in the ATA system. It appears that inclusion of the N1 patients in the LATS high-risk group resulted in higher remission rates (70%) than those seen in the ATA high-risk group, which included a smaller number of patients who were classified as ATA high risk on the basis of gross extrathyroidal extension or distant metastases (41%).
Both the ATA and LATS risk stratification systems consider that all N1 disease is equivalent and is grouped together as either intermediate risk in the ATA system or high risk in the LATS system. This leads to inappropriate upstaging of many patients with small-volume microscopic lymph node metastases. A recent review has demonstrated that the risk of structural disease recurrence can range from 4% in patients with fewer than 5 metastatic lymph nodes, to 5% if all involved lymph nodes are <0.2 cm, to 19% if more than 5 lymph nodes are involved, to 21% if more than 10 lymph nodes are involved, to 22% if macroscopic lymph node metastases are clinically evident (clinical N1 disease), and 27% if any metastatic lymph node is greater than 3 cm (14). When examined in the light of the 7% risk of structural recurrence in the ATA low-risk category, and the 4.5% risk of structural recurrence in the LATS very low-risk category, it is clear that patients with very small-volume metastatic disease in cervical lymph nodes have a risk of recurrence (4–5%) that is much closer to low-risk patients than high-risk patients. Conversely, patients with clinically evident macroscopic lymph node metastases have a risk of recurrence (22–27%) that is more similar to high-risk patients. Further studies are needed to determine how best to re-classify N1 patients into appropriate risk categories based on size and number of involved lymph nodes.
Both the ATA and LATS systems also provided a spectrum of estimates across the risk categories with regard to the likelihood of having persistent structural disease in response to initial therapy as well as having persistent structural disease at the time of final follow-up. Furthermore, the location of the structural disease also differed across the risk groups with all of the persistent structural disease in the very low-, low-, and intermediate-risk categories being locoregional lymph node metastases, while the high-risk categories included all of the patients with distant metastases. We have previously demonstrated that patients with structural disease remaining after initial therapy have far worse clinical outcomes than patients who either enter remission or have persistent biochemical evidence of disease (9). In addition, the highest risk of disease-specific mortality was seen in patients with persistent structural distant metastases, which conveyed a significantly worse prognosis than persistent locoregional disease (9). Therefore, early identification of patients at risk for persistent structural disease is an important component of risk-adapted thyroid cancer management.
The likelihood of achieving remission in response to initial therapy was very similar in the ATA low-risk and the LATS very low-risk patients (90% and 91%, respectively), as was the risk of recurrence (7% vs. 4.5%, respectively). These data are consistent with previous studies that have demonstrated that the risk of structural disease recurrence can range from 1–2% in unifocal papillary microcarcinomas, to 4–6% in multifocal papillary microcarcinomas (15,16), to 5–6% in 2–4 cm intrathyroidal PTC (17), and 8–10% in intrathyroidal PTC >4 cm (17). Therefore, while unifocal papillary microcarcinomas confined to the thyroid (LATS very low risk) can be expected to have slightly lower recurrence rates (1–2%) than multifocal papillary microcarcinomas (4–6%) or larger intrathyroidal papillary cancers up to 4 cm (5–6%), the absolute difference in risk is so small that it is unlikely to result in substantial differences in clinical management. Therefore, it is likely that clinical management recommendations with regard to issues such as the optimal degree of TSH suppression, and frequency and type of follow-up studies required will be very similar in patients initially classified as either LATS very low risk or ATA low risk.
Although both the ATA and LATS systems provided gradations of the likelihood of achieving remission in response to initial therapy, likelihood of having persistent disease in response to initial therapy, likelihood of having persistent structural disease in response to initial therapy and at final follow-up, the likelihood of having recurrence, and the likelihood of being NED at final follow-up, neither of the systems provided a meaningful risk stratification with regard to the likelihood of having persistent biochemical evidence of disease in the absence of structurally identifiable disease. Similar rates of persistent biochemical disease as the best response to initial therapy (mean 8%) and as the clinical status at final follow-up (mean 11%) were seen across all of the categories in each system. This finding is consistent with our previous observations where we found that 11–19% of ATA low-risk patients, 21–22% of ATA intermediate-risk patients, and 16–18% of ATA high-risk patients had biochemical persistent disease during follow-up (7,13). The clinical significance of stable or declining serum Tg values in the absence of structural disease is probably minimal as we have previously demonstrated a 100% five-year survival in this cohort with spontaneous resolution of the Tg values over time in 34% of the cohort (9). These observations are consistent with several previous studies that have demonstrated that abnormal serum Tg values often gradually decline over time without additional RAI or surgical therapy, in the absence of structurally identifiable disease (18–24).
In summary, our data demonstrate the clinical utility of the proposed LATS risk of recurrence classification system and provide additional validation of the ATA risk of recurrence system in a cohort of DTC patients evaluated and treated outside the United States healthcare system. Both the ATA and LATS risk stratification systems provide clinically meaningful graded estimates with regard to the likelihood of achieving remission in response to initial therapy, the likelihood of having persistent structural disease in response to initial therapy and at final follow-up, the likely locations of the persistent structural disease (locoregional vs. distant metastases), the likelihood of recurrence, and the likelihood of being NED at final follow-up. In keeping with both the ATA and LATS thyroid cancer guidelines, we agree that risk of recurrence classification systems should be used in conjunction with the AJCC/TNM system to provide initial risk estimates that can be used to guide an individualized risk-adapted follow-up management strategy.
Footnotes
Author Disclosure Statement
R.M.T. is a consultant for Genzyme. The remaining authors report no competing financial interests.
