Abstract
Background:
Active surveillance has been proposed as an appropriate management strategy for low-risk differentiated thyroid cancer (DTC), due to the typically favorable prognosis of this condition. This systematic review examines the benefits and harms of active surveillance vs. immediate surgery for DTC, to inform the updated American Thyroid Association guidelines.
Methods:
A search on Ovid MEDLINE, Embase, and Cochrane Central was conducted in July 2021 for studies on active surveillance vs. immediate surgery. Studies of surgery vs. no surgery for DTC were assessed separately to evaluate relevance to active surveillance. Quality assessment was performed, and evidence was synthesized narratively.
Results:
Seven studies (five cohort studies [N = 5432] and two cross-sectional studies [N = 538]) of active surveillance vs. immediate surgery, and seven uncontrolled treatment series of active surveillance (N = 1219) were included. One cross-sectional study was rated fair quality, and the remainder were rated poor quality. In patients with low-risk (primarily papillary), small (primarily ≤1 cm) DTC, active surveillance and immediate surgery were associated with similar, low risk of all-cause or cancer-specific mortality, distant metastasis, and recurrence after surgery. Uncontrolled treatment series reported no cases of mortality in low-risk DTC managed with active surveillance. Among patients managed with active surveillance, rates of tumor growth were low; rates of subsequent surgery varied, and surgery occurred primarily because of patient preference rather than tumor progression. Four cohort studies (N = 88,654) found that surgery was associated with improved all-cause or thyroid cancer mortality compared with nonsurgical management, but findings were potentially influenced by patient age and tumor risk category and were highly susceptible to confounding by indication; eligibility for, and receipt of, active surveillance and the timing of surgery were unclear.
Conclusions:
In patients with small low-risk (primarily papillary) DTC, active surveillance and immediate surgery may be associated with similar mortality, risk of recurrence, and other outcomes, but methodological limitations preclude strong conclusions. Studies of no surgery vs. surgery are difficult to interpret due to clinical heterogeneity and potential confounding factors and are unsuitable for assessing the utility of active surveillance. Research is needed to clarify the benefits and harms of active surveillance and determine outcomes in nonpapillary DTC, larger (>1 cm) cancers, and older patients.
Introduction
Thyroid cancer accounts for more than 90% of endocrine system malignancies, with an estimated 44,280 cases in 2021 (1). It is the most common cancer among adolescents and young adults (2,3) and the seventh-most common among women overall (1). More than 95% of thyroid cancers are classified as differentiated thyroid cancer (DTC), primarily of a papillary (70–90%) or follicular (10–20%) subtype (4). Localized DTC is associated with a highly favorable prognosis, with a 5-year survival rate of nearly 100% for papillary and 98% for follicular cancers (5). Worldwide, the incidence of thyroid cancer nearly tripled from 1975 to 2009 (6). Although some studies indicate stable thyroid cancer mortality (suggesting increased identification of subclinical indolent cancers) (7,8), other data indicate increased mortality (9).
The standard primary treatment for DTC has been surgery (total thyroidectomy or lobectomy). However, surgery is associated with potential morbidity, including the need for thyroid hormone treatment, hypoparathyroidism, and recurrent laryngeal nerve injury. To avoid surgical morbidity and potential overtreatment, active surveillance has been proposed as an alternative to immediate surgery for small low-risk DTC (8,10). Active surveillance refers to close monitoring of the primary cancer without performing initial surgery or other more intensive treatments (8). In active surveillance, patients may be offered surgery with curative intent when progression occurs. This differs from watchful waiting, which usually involves less intensive observation and symptom management in persons typically not candidates for curative treatment.
A 2015 American Thyroid Association guideline stated that surgery is “generally recommended” for DTC, but noted active surveillance as an alternative for very low-risk tumors (e.g., small papillary microcarcinoma without evidence of metastases or local invasion and favorable cytology), high surgical risk, short life expectancy, or other significant health conditions (11). Given the availability of new evidence, the American Thyroid Association commissioned a systematic review on active surveillance for DTC, to inform updated guidelines.
Methods
This review was conducted using a prespecified protocol and followed published methods for conducting effectiveness and comparative effectiveness reviews (12). In conjunction with the American Thyroid Association Differentiated Adult Thyroid Cancer Guidelines Task Force, we developed the Key Questions for this review. Patient representatives were not involved in the development of the Key Questions, although we sought to address important health outcomes as well as patient-reported outcomes, including impacts on quality of life.
Key Question 1: In adult patients with DTC, what are the effects of active surveillance versus thyroid surgery on risk of recurrence, mortality (all-cause or thyroid cancer), and other outcomes (subsequent surgery, lymph node or distant metastasis, quality of life, and harms [e.g., vocal cord paralysis, hypoparathyroidism, receipt of thyroid hormone replacement])?
Key Question 2: In adult patients with DTC, what are the effects of no surgery versus surgery on risk of recurrence, mortality, and other outcomes?
Key Question 1 addresses studies that directly compared active surveillance versus immediate thyroid surgery in patients with DTC. In these studies, active surveillance involved close monitoring for cancer progression and symptoms. Key Question 2 was a secondary comparison addressing studies of no surgery versus surgery. In these studies, there was no clear active surveillance protocol, and reasons for not undergoing surgery or timing of surgery were unreported. However, these studies were addressed as a secondary Key Question to assess the relevance and limitations for informing outcomes of active surveillance. For both Key Questions, we examined how outcomes varied in groups defined by patient age and tumor size.
Search strategies
We searched the Cochrane Central Register of Controlled Trials, Elsevier Embase®, and Ovid MEDLINE® (through July 2021). Search strategies are shown in Supplementary Appendix SA1. Searches were supplemented by reference list review of relevant articles.
Study selection
Abstracts and full-text articles were evaluated using prespecified eligibility criteria. The population was adults with DTC of any size. The main comparison (Key Question 1) was active surveillance versus immediate thyroid surgery (lobectomy or total thyroidectomy). Active surveillance was defined as close monitoring without surgery in patients eligible for surgery with curative intent. Because we anticipated few studies of active surveillance versus immediate surgery, we also included uncontrolled treatment series of patients undergoing active surveillance.
As a secondary comparison (Key Question 2), we also included cohort studies of no surgery versus surgery. Such studies lack information regarding eligibility for, or receipt of, active surveillance, reasons for not performing surgery, and intended initial treatment, with high potential for confounding by indication. Therefore, the populations and interventions are distinct from those evaluated in Key Question 1.
For both Key Questions, primary outcomes were thyroid cancer recurrence and all-cause or thyroid cancer-specific mortality; secondary outcomes were tumor growth, subsequent thyroid surgery (in active surveillance patients, crossover to surgery; in immediate surgery patients, repeat surgery), lymph node or distant metastasis, quality of life or function, and harms. Subgroups of interest were based on tumor size, tumor type, and patient age. Studies had to have at least one year of follow-up. Inclusion was restricted to English-language studies. Studies published only as conference abstracts were excluded because they provide insufficient information to fully assess quality and results, because results may change between the abstract and the full publication, and because excluding conference abstracts from systematic reviews usually does not affect findings (13).
Data abstraction
Data on study characteristics, patient and tumor characteristics, and results were extracted by one investigator and verified by a second. To avoid overweighting data, we treated multiple reports from an institution of the same or significantly overlapping populations as a single study (14).
Assessing methodological quality of individual studies
The quality (risk of bias) of each study was rated as “good,” “fair,” or “poor” using predefined study design-specific criteria adapted from the U.S. Preventive Services Task Force (Supplementary Appendix SA2 and SA3). Study ratings require interpretation within the context of the study design utilized. For example, a well-conducted uncontrolled study is of a lower quality than a well-conducted cohort study.
Synthesizing the evidence
The evidence was synthesized narratively; meta-analysis was not performed due to the absence of randomized trials and limitations in the observational studies. The overall quality of evidence for each comparison was assessed separately using the Grading of Recommendations Assessment, Development and Evaluation (GRADE) methods, based on risk of bias, consistency, directness, precision, and reporting bias (15). The quality of evidence was graded “high,” “moderate,” or “low,” indicating the confidence in the findings (16); and evidence too limited to permit conclusions was graded “insufficient.”
Results
Literature search
Database searches resulted in 721 potentially relevant articles (Fig. 1). After dual review of abstracts and titles, 64 articles were selected for full-text dual review. Of these, 18 studies (in 27 publications) met the inclusion criteria (17–43); there were no randomized trials. Fourteen studies addressed Key Question 1 (active surveillance vs. immediate surgery) and 4 studies addressed Key Question 2 (no surgery vs. surgery).
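The counts in the literature search can be cross-checked with simple arithmetic. The following is a minimal sketch; the variable names are illustrative and not taken from the review, and the split of Key Question 1 into seven comparative studies and seven uncontrolled series comes from the Key Question 1 results.

```python
# Counts as reported in the literature search and Key Question 1 results.
records_identified = 721   # potentially relevant articles from database searches
full_text_reviewed = 64    # articles selected after dual review of titles/abstracts
included_studies = 18      # studies meeting inclusion criteria (27 publications)
kq1_comparative = 7        # active surveillance vs. immediate surgery
kq1_uncontrolled = 7       # uncontrolled treatment series of active surveillance
kq1_studies = 14           # all Key Question 1 studies
kq2_studies = 4            # no surgery vs. surgery

# Articles excluded at the title/abstract stage.
excluded_at_abstract = records_identified - full_text_reviewed

# The 64 full-text items are articles while the 18 included items are studies,
# so a full-text exclusion count is deliberately not derived here.
assert kq1_comparative + kq1_uncontrolled == kq1_studies
assert kq1_studies + kq2_studies == included_studies

print(f"Excluded at title/abstract review: {excluded_at_abstract}")
```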

Fig. 1. Literature flow diagram.
Key Question 1: active surveillance versus immediate surgery
Seven studies [five cohort studies reported in 12 publications (17,19–22,25–29,31–33) and two cross-sectional studies (34,35)] addressed active surveillance versus immediate surgery in patients with DTC; there were also seven uncontrolled treatment series of active surveillance (36–41,43). There was potential overlap between a cross-sectional study (35) of active surveillance versus immediate surgery conducted at Kuma Hospital, Japan, and cohort studies (19,20,33) from the same institution, as well as partial overlap between the cohort studies conducted at Kuma Hospital (one enrolled patients diagnosed between 1993 and 2011, and the other enrolled patients diagnosed between 2005 and 2017).
Patients undergoing active surveillance were monitored and underwent surgery if there were signs of tumor progression or for other reasons (e.g., patient choice, management of other conditions). Follow-up protocols usually included clinical follow-up and ultrasonography every 6 to 12 months, although some details were lacking.
Cohort studies
Among the five active surveillance cohort studies (N = 5432, Tables 1 and 2), three were conducted in Japan (two studies conducted at Kuma Hospital had partial overlap) (17,19–21,26,28,29,33), one in South Korea (22,25,31), and one in Brazil (27). Tumor size for inclusion was ≤1 cm in three studies (19–22,25,26,31,33,42), ≤1.2 cm in one study (27), and <2 cm in one study (28). All of the cohort studies were restricted to papillary cancers without high-risk features (e.g., nodal or distant metastasis, extrathyroidal extension, high-grade cytology, evidence of progression, or location on the posterior surface of the thyroid gland).
Table 1. Study Characteristics, Active Surveillance vs. Immediate Surgery
AS, active surveillance; FNAB, fine needle aspiration biopsy; IQR, interquartile range; SD, standard deviation.
Table 2. Results, Active Surveillance vs. Immediate Surgery
TSH, thyrotropin.
One cohort study (22,25,31) focused on quality of life, and the others reported mortality, local recurrence, or other oncologic outcomes (e.g., metastasis). Mean duration of follow-up ranged from two to seven years in all of the cohort studies except for one (27), in which follow-up ranged from six months to three years. Three studies (17,19–21,26,28,29,35) reported the surgical procedures performed (lobectomy, total thyroidectomy, or near-total thyroidectomy); in two studies, details regarding surgical procedures were not provided. Mean or median age ranged from 49 to 57 years in patients undergoing active surveillance. In all studies, patients were predominantly female (72% to 92% in the active surveillance groups).
All of the cohort studies of active surveillance versus immediate surgery were rated poor quality (Supplementary Appendix SA2). No study controlled for confounders or reported attrition or missing data. Other methodological limitations included unclear methods for selecting patients and baseline differences between groups, and no study reported masking of outcome assessors or data analysts.
Cross-sectional studies
One fair-quality (n = 191) (34) and one poor-quality cross-sectional study (n = 347) (35) compared quality of life in patients with papillary thyroid carcinoma ≤1 cm who underwent active surveillance versus immediate lobectomy (Tables 1, 2, and Supplementary Appendix SA2). Because of the cross-sectional design, follow-up protocols were not described. One study was conducted in South Korea and the other in Japan. The duration since immediate surgery was 38 months in one study and 84 months in the other.
Uncontrolled treatment series
Seven uncontrolled treatment series (N = 1219) evaluated patients with DTC managed with active surveillance (Tables 3 and 4) (36–41,43). The duration of follow-up ranged from 13.3 to 32.5 months. One study was conducted in the United States (41), one in Italy (38), two in Colombia (40,43), and three in South Korea (36,37,39). Two of the South Korean studies had potentially overlapping patient populations (37,39). Although the two Colombian studies were performed at the same institution, the populations and enrollment dates did not overlap. The median or mean patient age ranged from 44 to 52 years, and the proportion of females ranged from 75% to 84%.
Table 3. Study Characteristics, Uncontrolled Treatment Series of Active Surveillance
Prospective, retrospective, or unclear.
Table 4. Results, Uncontrolled Treatment Series of Active Surveillance
CI, confidence interval; HR, hazard ratio.
Two treatment series (39,41) were rated fair quality, and five (36–38,40,43) were rated poor quality (Supplementary Appendix SA3). Methodological shortcomings included failure to report attrition or missing data and unclear patient selection methods; in addition, no study blinded outcome assessors or data analysts.
Recurrence, progression, mortality, and subsequent surgery
Five poor-quality cohort studies (N = 5432) compared active surveillance versus immediate surgery for low-risk DTC (Tables 1 and 2) (17,19–22,25–29,31–33). Tumor size was ≤1.2 cm in four studies and <2 cm in one study (28). Three studies (17,26–29,32,33) reported all-cause mortality (N = 2982), although no cases were reported in two (17,27–29) of the studies. In the third study (26), there were few deaths and no difference in all-cause mortality after 47 months (0.3% [3/1179] vs. 0.5% [5/974]). Four studies (N = 4377) (17,19–21,26–29,32,33) reported thyroid cancer mortality, with no cases in three studies (17,26–29).
In the fourth study (19–21), there were few deaths and no difference in thyroid cancer mortality after a median of 76 months (0% [0/340] for active surveillance vs. 0.2% [2/1055] for immediate surgery). Among patients undergoing active surveillance, the rates of tumor growth ≥3 mm ranged from 1.4% to 7.5% (four studies, N = 2026) (17,20,26–29,32), and the proportion of patients who underwent subsequent surgery ranged from 2.6% to 32% (four studies, N = 2160) (17,19–21,26–29,32). The most common reason for surgery in patients undergoing active surveillance was patient preference rather than tumor growth.
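The mortality percentages quoted above follow directly from the event counts given in brackets. A minimal sketch of that arithmetic is shown below; the function and dictionary names are illustrative, and the assumption that the first all-cause figure (3/1179) refers to active surveillance is inferred from the order of the comparison rather than stated explicitly. The sketch only reproduces the reported proportions and does not test for differences between groups.

```python
def percent(events, total):
    """Return events/total as a percentage rounded to one decimal place."""
    return round(100 * events / total, 1)

# (events, patients) as reported in the text.
# Assumption: the first all-cause figure is the active surveillance group.
all_cause = {"active surveillance": (3, 1179), "immediate surgery": (5, 974)}
thyroid_cancer = {"active surveillance": (0, 340), "immediate surgery": (2, 1055)}

for outcome, groups in (("all-cause", all_cause), ("thyroid cancer", thyroid_cancer)):
    for arm, (events, total) in groups.items():
        print(f"{outcome} mortality, {arm}: {events}/{total} = {percent(events, total)}%")
```

With such small numbers of events, a single additional death would visibly change the reported percentage, which is one reason the estimates are imprecise.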
Three studies (N = 2574) reported similar local recurrence rates following active surveillance with subsequent surgery versus immediate surgery (0% vs. 3.0%, 1.1% vs. 0.5%, and 0% vs. 2.4%) (17,19–21,26,28,29). In three studies (N = 2982), the proportion of patients with lymph node metastasis during follow-up was similar with active surveillance and immediate surgery (26–28); in one (27) of the studies there was only one case of lymph node metastasis.
In four studies (N = 4377), no cases of distant metastasis were reported in patients undergoing active surveillance (17,19–21,26–29); in a fifth study, one case of distant metastasis (32) was reported following immediate surgery in a stage T1b patient. Harms were poorly reported; one study (n = 1179) found that surgery was associated with increased risk of temporary (but not permanent) vocal cord paralysis (0.6% with active surveillance vs. 4.1% with surgery) and hypoparathyroidism (2.8% vs. 16.7%), and increased likelihood of receiving thyroid hormone replacement or supplementation (20.7% vs. 66.1%) (26).
Quality of life
One fair-quality cross-sectional study (n = 191) (34), one poor-quality cross-sectional study (n = 347) (35), and one poor-quality cohort study (n = 395) (22,25) of patients with papillary thyroid microcarcinoma (≤1 cm) evaluated quality of life. The fair-quality study found no differences between active surveillance and immediate surgery in Short-Form-12 or Thyroid Cancer-Quality-of-Life scores (34). The poor-quality studies found that active surveillance was associated with higher (better) scores on the Quality of Life in Thyroid Cancer Patient Questionnaire for overall, psychological, and physical health (22,25,35).
Although results were statistically significant, differences were small (<1 point on a 0- to 10-point scale). One poor-quality study found that active surveillance was associated with better Hospital Anxiety and Depression Scale scores, but the differences were small (1 to 2 points on a 0- to 42-point scale) (35).
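One way to put these score differences in context is to express them as a share of each scale's full range. The sketch below does this under stated assumptions: the helper name is illustrative, and a percentage of the scale range is only a rough descriptive device, not a validated threshold for a clinically important difference.

```python
def percent_of_scale(points, scale_min, scale_max):
    """Express a score difference as a percentage of the scale's full range."""
    return 100 * points / (scale_max - scale_min)

# Thyroid cancer quality-of-life questionnaire: differences under 1 point on a 0-10 scale.
qol_upper_bound = percent_of_scale(1, 0, 10)

# Hospital Anxiety and Depression Scale: differences of 1 to 2 points on a 0-42 scale.
hads_low = percent_of_scale(1, 0, 42)
hads_high = percent_of_scale(2, 0, 42)

print(f"Quality-of-life difference: under {qol_upper_bound:.0f}% of the scale range")
print(f"Anxiety/depression difference: {hads_low:.1f}% to {hads_high:.1f}% of the scale range")
```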
Uncontrolled studies
Seven uncontrolled treatment series of patients with small (≤1.5 cm) DTC managed by active surveillance (N = 1219) reported results consistent with the cohort studies (Tables 3 and 4) (36–41,43). Across all studies, no cases of thyroid cancer mortality or all-cause mortality were reported. The proportion of patients with tumor growth >3 mm ranged from 2.1% to 10% (6 studies, N = 996) (36,38–41,43). The proportion of active surveillance patients who underwent subsequent surgery ranged from 3.5% to 23% (7 studies, N = 1240) (36–41,43), with no cases of recurrence after surgery in three studies (N = 80) (38,39,41).
In most studies, surgery was more commonly performed for patient anxiety or preference than for tumor progression (e.g., tumor enlargement). Five studies (N = 1004) reported that the proportion of patients with lymph node metastasis ranged from 0% to 2.9% (37,38,41–43). Four studies (N = 946) reported no cases of distant metastasis (37–39,41).
Key Question 2: no surgery versus surgery
Four cohort studies (N = 88,654) compared outcomes of patients with DTC who underwent no surgery versus surgery (Tables 5 and 6) (18,23,24,30). All studies were restricted to papillary carcinoma except for one (93% papillary and 4.2% follicular) (23). In these studies, patients were analyzed according to whether they underwent surgery or not, regardless of the initial intended treatment; the proportion of patients who crossed over from nonsurgical management to surgery was unknown. In addition, it was unclear if nonsurgical patients were eligible for active surveillance or did not undergo surgery for other reasons (e.g., high surgical risk, limited life expectancy). Details regarding surgical procedures were not provided, and follow-up protocols in patients who did not undergo surgery were not reported.
Table 5. Study Characteristics, No Surgery vs. Surgery
SEER, Surveillance, Epidemiology, and End Results.
Table 6. Results, No Surgery vs. Surgery
RR, relative risk.
One study restricted tumor size for inclusion to ≤1 cm (24), and three studies (18,23,30) did not apply a tumor size restriction. Two studies (23,30) were not restricted to low-risk tumors but conducted subgroup analyses by risk category, and two studies (18,24) did not report outcomes by tumor risk category. All four studies were conducted in the United States. Two studies (18,24) were based on analyses of Surveillance, Epidemiology, and End Results (SEER) registry data, one study (23) was based on the California Cancer Registry, and one study (30) was based on a health system administrative database.
The studies primarily focused on mortality (all-cause or thyroid cancer), with one study (30) reporting on metastasis. Median duration of follow-up ranged from 4.2 to 5.3 years in three studies; one study (24) did not report follow-up duration. Mean or median patient age ranged from 55 to 61 years, with the exception of one study that restricted inclusion to patients 65 years of age or older (mean 72 years) (24). In all studies, patients were predominantly female (67% to 73% in nonsurgical groups).
Three studies (18,24,30) were rated fair quality, and one study (23) poor quality (Supplementary Appendix SA4). Two studies (24,30) adequately controlled for confounders, and one study (18) partially controlled for confounders; the poor-quality study (23) did not control for confounders. No study reported attrition or missing data; other methodological limitations included unclear methods for selecting patients and baseline differences between groups. No study reported masking of outcome assessors or data analysts. As noted above, studies of surgery versus no surgery conducted analyses based on whether patients underwent surgery, not according to receipt of active surveillance.
Mortality, progression
One fair-quality cohort study evaluated 2323 patients in the U.S. SEER registry with papillary thyroid carcinoma ≤1 cm (not limited by tumor risk category) (24). Inclusion was limited to patients 65 years of age or older (mean age 73.3 and 71.4 years in the nonsurgical and surgical groups, respectively). In a propensity score-adjusted analysis, surgery was associated with a decreased risk of all-cause mortality versus no surgery (adjusted hazard ratio [HR] 0.11, confidence interval [CI] [0.09–0.13]). The model for the propensity score included patient age, sex, race, marital status, tumor multifocality, extrathyroidal extension, and region (state).
Another fair-quality study, also based on the SEER registry data, compared nonsurgical versus surgical treatment in 56,171 propensity-matched patients with papillary thyroid carcinomas ≤10 cm (18). Mean age was 55.4 years in the nonsurgical group and 50.3 years in the surgery group. Surgery was associated with improved 10-year thyroid cancer mortality in the entire sample, after adjusting for patient age, marital status, and tumor size (adjusted HR 0.56, CI [0.36–0.85]). There appeared to be an interaction between older age and decreased risk of thyroid cancer mortality with surgery. Among patients >75 years of age, surgery was associated with a higher 10-year thyroid cancer survival for tumors of all sizes, with the largest relative difference for tumors >6 cm (91% vs. 48%, p < 0.001). There was no difference in thyroid cancer mortality between no surgery versus surgery among patients 14 to 55 years of age.
A smaller (n = 180) fair-quality study based on a health system administrative database compared nonsurgical versus surgical treatment (within 1 year of diagnosis) in propensity-matched patients with papillary thyroid carcinoma (no size restriction) (30). Mean age was 57.1 and 55.5 years in the nonsurgical and surgery groups, respectively. Among patients who did not undergo surgery, TNM stage was I in 61% and IV in 32%. Nonsurgical treatment was associated with increased risk of all-cause mortality (adjusted HR 4.1, CI [1.8–9.4]) and thyroid cancer mortality (adjusted HR 10.2, CI [2.9–36.4]) at 10 years. In a stratified analysis, there was no increased risk of all-cause mortality among low-risk patients (adjusted HR 0.7, CI [0.07–6.4]), but increased risk was observed among high-risk patients (adjusted HR 4.8, CI [1.8–12.4]). Estimates for lymph node and distant metastasis were imprecise.
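The hazard ratios in this study are expressed for nonsurgical versus surgical treatment, whereas the two SEER-based studies above are expressed for surgery versus no surgery. The following is a minimal sketch of how the estimates could be put in a common direction and checked against the null value of 1. The helper names are illustrative, the confidence level of the intervals is not stated in the text, and the reciprocals are computed from rounded published values, so they are approximate.

```python
def excludes_one(lower, upper):
    """True when a ratio's confidence interval does not contain 1 (no association)."""
    return lower > 1 or upper < 1

def invert(hr, lower, upper):
    """Re-express a ratio for the opposite comparison (A vs. B becomes B vs. A).

    Taking reciprocals swaps the interval bounds: the old upper bound
    becomes the new lower bound and vice versa.
    """
    return 1 / hr, 1 / upper, 1 / lower

# Adjusted HRs for nonsurgical vs. surgical treatment, as reported in the text.
estimates = {
    "all-cause mortality, overall": (4.1, 1.8, 9.4),
    "thyroid cancer mortality, overall": (10.2, 2.9, 36.4),
    "all-cause mortality, low risk": (0.7, 0.07, 6.4),
    "all-cause mortality, high risk": (4.8, 1.8, 12.4),
}

for label, (hr, lower, upper) in estimates.items():
    inv_hr, inv_lower, inv_upper = invert(hr, lower, upper)
    verdict = "excludes 1" if excludes_one(lower, upper) else "includes 1"
    print(f"{label}: HR {hr} [{lower}-{upper}] {verdict}; "
          f"surgery vs. no surgery ~{inv_hr:.2f} [{inv_lower:.2f}-{inv_upper:.2f}]")
```

Read this way, the low-risk estimate is the only one whose interval spans 1, and its width (0.07 to 6.4) shows how little information the low-risk subgroup contributes.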
A poor-quality study also compared outcomes of nonoperative versus operative management in patients with DTC (93% papillary, no size restriction, mean age 61.2 with nonoperative management and 49.3 years with operative management) (23). Among patients (n = 10,634) with low-risk tumors (<4 cm without extrathyroidal extension, nodal involvement, or distant metastasis; local summary stage; and no prior chemotherapy or radiation treatment), the incidence of thyroid cancer mortality was very low in both groups (0% [0/161] for nonoperative and 0.1% [10/10,473] for operative treatment).
Discussion
The main findings of this review, including quality-of-evidence ratings, are summarized in Table 7. For active surveillance versus surgery, the quality of evidence was assessed as low for all outcomes, being based on observational studies with high risk of bias. In younger (mean age in the 50s to low 60s) adults with small (≤1 cm), low-risk, papillary thyroid cancer, cohort studies found that active surveillance and immediate surgery were associated with similar low risk of all-cause or cancer-specific mortality, distant metastasis, and recurrence after surgery. Uncontrolled studies with more than 1000 patients with low-risk DTC managed with active surveillance reported no cases of mortality, reflecting the highly favorable prognosis. In patients managed by active surveillance, tumor growth rates were low. Rates of subsequent surgery among patients managed by active surveillance varied and resulted more from patient preferences than tumor progression.
Table 7. Quality of Evidence
Downgraded due to serious indirectness for evaluation of active surveillance.
Data on harms were extremely limited but identified temporary vocal cord paralysis and hypoparathyroidism as surgery complications. Evidence about quality of life or functional outcomes was limited, but indicated small or no differences. Although this review focused on oncological outcomes, a systematic review on costs (44) found that most studies favored active surveillance. However, cost analyses had methodological limitations, and costs may differ based on follow-up duration (due to up-front costs of surgery versus costs of long-term active surveillance), impact of living with an untreated cancer on quality of life, and age.
Cohort studies of no surgery versus surgery tended to find that surgery is associated with improved all-cause or thyroid cancer mortality, but the quality of evidence was also assessed as low due to being based on observational studies with moderate risk of bias. In addition, there was some inconsistency in results, potentially related to thyroid cancer risk category and age. One study that found surgery was associated with better outcomes focused on older patients (mean age >70 years, compared with the 50s to early 60s in the other studies) (24), and another study that enrolled a mixed population of younger and older patients found that benefits of surgery were restricted to older patients (18).
However, a systematic review (14) found that risk of DTC progression decreases with age, suggesting that active surveillance may be an appropriate strategy in older patients. The observed finding of worse outcomes with no surgery in older patients may reflect confounding by other prognostic factors associated with the decision not to perform surgery or inclusion of higher risk patients not suitable for active surveillance (20). One study (30) found that surgery was associated with decreased risk of all-cause or thyroid cancer mortality in higher, but not lower, risk patients, and another study (23) restricted to low-risk tumors reported very low rates of thyroid cancer mortality.
Importantly, it was impossible to determine if patients who did not undergo surgery were potential surgical candidates or had risk factors that placed them at risk for adverse surgical or thyroid cancer-related outcomes. Information was not available on how patients were selected for surgery or nonsurgical treatment, the degree to which patients who did not undergo surgery were actively monitored, or timing of surgery with regard to DTC diagnosis. Therefore, these studies are highly susceptible to confounding by indication, have very low applicability to active surveillance, and are not suitable for assessing the utility of active surveillance.
A major limitation of the evidence is the presence of methodological shortcomings in all studies. There were no randomized trials, and no observational study was rated good quality, with most rated poor quality. Methodological limitations included failure to control for confounders, unclear methods for selecting patients, poor reporting of attrition and missing data, and unclear masking of outcome assessors. The outcome of subsequent surgery was difficult to interpret because criteria for undergoing curative surgery were not strictly defined, and patients often underwent surgery for reasons other than tumor progression (e.g., patient choice).
A number of active surveillance studies were conducted in Asia, although findings are likely applicable to other settings with similar epidemiology and management of DTC. Uncontrolled series of patients undergoing active surveillance reported low rates of mortality or progression, but the lack of a surgery control group limits interpretation. A limitation of the review process is that the protocol was not registered before initiating the review. However, the scope and methods were developed before conducting the review, and no protocol changes occurred, except to place the comparisons (active surveillance versus immediate surgery and no surgery versus surgery) into separate Key Questions, to more clearly distinguish these distinct bodies of evidence.
Our findings are consistent with prior systematic reviews that also reported low rates of metastasis and tumor growth and very low rates of mortality among patients with low-risk DTC who underwent active surveillance (45,46). One of the prior reviews reported a pooled estimate for thyroid cancer mortality in patients who underwent active surveillance (0.03%) that indicated at least one thyroid cancer death; however, we identified no thyroid cancer deaths among active surveillance patients in the studies included in our review.
Strengths of our review are the inclusion of additional active surveillance studies and studies of no surgery versus surgery, expanding the evidence base and assessment of methodological limitations. In addition, we avoided overweighting by treating multiple publications of the same or overlapping populations as a single study (47). Unlike prior reviews, we did not perform a meta-analysis, because methodological limitations and heterogeneity among the studies could produce misleading pooled results.
Research is needed to clarify outcomes of active surveillance versus immediate surgery and to further define (and perhaps expand) populations appropriate for active surveillance based on tumor size, tumor type, age, or other factors. Data on outcomes of active surveillance in pregnancy are limited but suggest (48) that pregnancy does not negatively impact outcomes of patients with low-risk, small papillary thyroid carcinomas who undergo active surveillance.
Although randomized trials would be ideal for minimizing confounding and other potential biases, we identified only one small (n = 40) in-progress randomized trial (49). Well-conducted, prospective cohort studies that measure and control well for confounders (e.g., age, comorbid conditions, and tumor characteristics) could also help inform this issue; a number of cohort studies are in progress (50–53). Research is also needed to better understand patient preferences, comparative costs, and decision-making regarding treatments for low-risk DTC.
One study found that ∼70% of patients with small, low-risk papillary thyroid cancer would select active surveillance over surgery (47). Increased use of minimally invasive treatments as an option for some low-risk DTCs could impact preferences regarding management (54). Despite recommendations to consider active surveillance for small, low-risk DTCs, implementation has been limited (55). In one survey, fewer than half of the surgeons and endocrinologists treating thyroid cancer used active surveillance, although most (76%) believed it was an appropriate option (56).
In conclusion, active surveillance and immediate surgery may be associated with similar mortality, risk of recurrence, and other outcomes in patients with small, low-risk DTCs, but methodological limitations preclude strong conclusions. Studies of no surgery versus surgery are difficult to interpret due to clinical heterogeneity and potential confounding. Research is needed to clarify the benefits and harms of active surveillance and identify populations in whom it is an appropriate strategy.
Footnotes
Author Disclosure Statement
R.P.T.: Consultant for RGS Healthcare, Pulse BioSciences, and Medtronic. J.A.S.: Member of the Data Monitoring Committee of the Medullary Thyroid Cancer Consortium Registry supported by GlaxoSmithKline, Novo Nordisk, Astra Zeneca, and Eli Lilly. Institutional research funding was received from Exelixis and Eli Lilly.
Funding Information
This study was supported by the American Thyroid Association.
Supplementary Material
Supplementary Appendix SA1
Supplementary Appendix SA2
Supplementary Appendix SA3
Supplementary Appendix SA4
