Abstract
Background:
The rapid increase in the incidence of small papillary thyroid carcinoma (PTC) appears to be caused by the increased detection of small thyroid cancers. Active surveillance (AS) was therefore suggested to overcome this problem. As the results were favorable with low rates of size enlargement and lymph node metastasis, the 2015 American Thyroid Association Management Guidelines endorsed AS as an alternative to immediate surgery. As the clinical value of AS is a subject of ongoing active discussions and surveys, we considered a systematic review and meta-analysis to be timely and necessary.
Methods:
Ovid-MEDLINE and EMBASE databases were searched up to January 5, 2019, for studies reporting patients who were followed up with AS for PTC. Data extraction and methodological quality assessment were performed independently by two radiologists. The primary outcomes were to identify the annual pooled proportions of size enlargement of 3 mm or more and the detection of lymph node metastases at a 5-year follow-up period. These were calculated using an inverse-variance weighting model. An additional outcome was evaluation of the reasons for surgery during AS.
Results:
The pooled proportion of size enlargement occurring at 5 years was 5.3% [95% confidence interval (CI), 4.4–6.4%], and the pooled proportion of 5-year lymph node metastasis was 1.6% [CI, 1.1–2.4%]. In many subjects undergoing delayed operations, the reasons for operation were often other than those of size enlargement or lymph node metastasis.
Conclusions:
AS is effective for the management of small PTC, with a low proportion of size enlargement or lymph node metastasis occurring at 5 years. However, a substantial proportion of the causes of delayed surgery were other than size enlargement or lymph node metastasis, and these situations need to be optimally managed.
Introduction
Recently, the incidence of thyroid cancer, especially papillary thyroid carcinoma (PTC) of less than 1 cm, has increased in many countries. This increase is primarily a result of the increased detection of small PTC due to unnecessary evaluations of small thyroid nodules using high-resolution ultrasound (US). However, despite the increased incidence of PTC, the mortality rates from thyroid cancer have remained largely stable. Therefore, it has been suggested that small PTCs may be overdiagnosed and overtreated (1–3).
In Japan, the Kuma Hospital group in Kobe led by Dr. Miyauchi and the Cancer Institute Hospital in Tokyo led by Dr. Sugitani initiated active surveillance (AS) management (4,5), and then reported the natural course of patients during AS (6,7). A report from the Kuma Hospital in 2014 showed rates of 8% tumor size enlargement and 3.8% nodal metastasis among 1235 patients at a 10-year observation point (8). In a report published in 2010, the Cancer Institute Hospital found that 7% of cases had increased in size and 1% had developed apparent lymph node metastasis during a mean AS period of 5 years (5). Subsequently, the 2015 American Thyroid Association (ATA) Management Guidelines endorsed AS as an alternative to immediate surgery (9).
For patients with tumor growth or node metastasis during AS, delayed rescue surgery achieved excellent outcomes (10). AS has also been reported to cost 4.1 times less than immediate surgery (11). However, although many advantages of AS have been demonstrated, there are ongoing discussions and surveys over anxiety about or resistance to AS among patients (and even among physicians), and over the optimization of inclusion/exclusion criteria (12–15). In our clinical practice, considerable numbers of patients have received delayed surgery during AS, particularly because of their anxiety levels (12,14). Considering the clinical value of AS, its systematic evaluation is timely and necessary. However, to our knowledge, no systematic review or meta-analysis has addressed this issue.
Therefore, we performed a systematic evaluation of tumor growth, nodal metastasis, and delayed surgery rates during AS.
Materials and Methods
This systematic review and meta-analysis were performed according to the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines (16).
Literature search
A search of MEDLINE and EMBASE databases was performed to find original literature reporting patients who were followed up with AS for PTC. The following search terms were used: ([“active surveillance”] OR [observation]) AND ([PTC] OR [papillary thyroid microcarcinoma]) OR ([papillary microcarcinoma of the thyroid] OR [thyroid microcarcinoma]). No beginning search date was set, with the literature search being updated until January 5, 2019. The search was limited to English language publications. The bibliographies of relevant articles were searched to identify any other appropriate articles.
Inclusion criteria
Studies satisfying the following criteria were included: (1) included patients with PTC; (2) included observation data during AS without surgery for PTC; and (3) contained information on reference standards based on cytological tests.
Exclusion criteria
Studies or subsets of studies were excluded if any of the following criteria were met: (1) case reports or case series including fewer than 20 patients; (2) letters, editorials, conference abstracts, systematic reviews or meta-analyses, consensus statements, guidelines, and review articles; (3) articles not focusing on AS for proven PTC; (4) articles with, or with suspicion of, overlapping populations; (5) articles without observation data during AS for PTC; and (6) articles without reference standards based on cytological tests.
The literature search and selection were independently performed by two radiologists (S.J.C. and C.H.S., both with 5 years of experience in thyroid imaging), followed by establishing consensus in case of discrepancies.
Data extraction
The following data were extracted using standardized forms according to the PRISMA guidelines (16): (1) article characteristics: institution, country of origin, authors, year of publication, duration of patient recruitment, classification of included tumor, patient numbers, mean patient age, male to female ratio, study design (prospective vs. retrospective, consecutive or not), and exclusion criteria for AS; (2) details of AS: size information of included tumors, criteria for tumor enlargement during AS, follow-up methods during AS, and number of patients undergoing delayed surgery during AS (total, and numbers due to size enlargement, lymph node metastasis, or any reason for surgery other than size enlargement or lymph node metastasis); and (3) observational data during AS for PTC: the yearly percentage of size enlargement of 3 mm or more up to 5 years, and the percentage of lymph node metastases up to 5 years.
Quality assessment
Two reviewers independently performed data extraction and quality assessment using the risk of bias for nonrandomized studies (RoBANS) tool for nonrandomized controlled trials (17).
Data synthesis and analyses
The primary outcomes of the current systematic review and meta-analysis were identification of the annual pooled rates of tumor size enlargement of 3 mm or more and lymph node metastases occurring during AS of up to 5 years. An additional outcome was the description of the reasons for performing surgery during AS.
The pooled proportions were calculated using an inverse-variance weighting model (18). Random effects meta-analysis of proportions was utilized to calculate the overall proportions. Study heterogeneity was evaluated using the Higgins inconsistency index (I²), with substantial heterogeneity being indicated by an I² value greater than 50% (19). All statistical analyses were conducted by one author (C.H.S., with 6 years of experience in conducting systematic reviews and meta-analysis) using the “meta” package in R, version 3.4.1.
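The pooling described above can be illustrated with a minimal Python sketch. It is not the authors' analysis, which was run with the “meta” package in R. The logit transformation, the 0.5 continuity correction, and the DerSimonian-Laird estimator of between-study variance are assumptions made for illustration, since the text does not state which options were used. The function name `pool_proportions` and the example counts are hypothetical and are not data from the included studies.

```python
import math


def pool_proportions(events, totals, z=1.96):
    """Random-effects pooling of proportions with inverse-variance weights.

    Assumptions (not stated in the article): logit transformation and the
    DerSimonian-Laird estimator for the between-study variance (tau^2).
    Returns the pooled proportion, its confidence interval, I^2 (%), and tau^2.
    """
    y, v = [], []
    for e, n in zip(events, totals):
        # 0.5 continuity correction when a study has no events or only events
        if e == 0 or e == n:
            e, n = e + 0.5, n + 1.0
        y.append(math.log(e / (n - e)))          # logit of the proportion
        v.append(1.0 / e + 1.0 / (n - e))        # approximate variance of the logit

    w = [1.0 / vi for vi in v]                   # inverse-variance weights
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

    # Cochran's Q and the Higgins inconsistency index I^2
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird between-study variance
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance
    w_re = [1.0 / (vi + tau2) for vi in v]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    def inv_logit(x):
        return 1.0 / (1.0 + math.exp(-x))

    return {
        "pooled": inv_logit(pooled),
        "ci": (inv_logit(pooled - z * se), inv_logit(pooled + z * se)),
        "i2": i2,
        "tau2": tau2,
    }


# Hypothetical counts for three studies (illustration only, not study data)
result = pool_proportions(events=[10, 20, 30], totals=[200, 400, 600])
print(result)
```

With identical study proportions, as in the hypothetical counts above, the estimated between-study variance is zero and the random-effects result coincides with the fixed-effect result. An I² above 50% would be read as substantial heterogeneity under the threshold used in this review.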
Results
Literature search
The article selection process is described in detail in Figure 1. The initial systematic literature search identified 642 articles. A total of 353 records were excluded in the MEDLINE and EMBASE database classification systems because they were case reports or case series, letters, editorials, conference abstracts, or review articles. After removing 21 duplicates, screening of the remaining 268 titles and abstracts yielded 28 potentially eligible articles. No additional article was identified in the searches of the bibliographies of these articles. After full-text reviews of the 28 provisionally eligible articles (4–8,10,11,20–41), 22 were excluded because they were not in the field of interest (5,6,10,27,28,30,33,35,37,39,41) or had populations that were overlapping or had a suspicion of overlapping (4,7,11,26,29,31,32,34,36,38,40). Despite using overlapping data from the same center, two articles were included to gain additional information specific to the discussion (20,24). Finally, six studies from four groups (two Japanese, one Korean multicenter, and one North American) were included in the qualitative systematic review (8,20–22,24,25), and three studies from three groups were included in the quantitative meta-analysis (with no data overlapping) (8,21,22).
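The counts in the selection process can be cross-checked with simple arithmetic. The short Python sketch below does this using only the numbers given in the paragraph above. The function name `prisma_flow` and the labels are hypothetical, and the two full-text exclusion counts (11 and 11) are taken from the number of citations listed for each exclusion reason.

```python
def prisma_flow(identified, excluded_by_type, duplicates,
                eligible_full_text, excluded_full_text):
    """Return (records screened, studies included) for a selection flow."""
    # Records left for title/abstract screening
    screened = identified - excluded_by_type - duplicates
    # Studies remaining after full-text exclusions
    included = eligible_full_text - sum(excluded_full_text.values())
    return screened, included


screened, included = prisma_flow(
    identified=642,            # initial search results
    excluded_by_type=353,      # excluded by publication type
    duplicates=21,             # duplicates removed
    eligible_full_text=28,     # articles reviewed in full text
    excluded_full_text={
        "not in the field of interest": 11,
        "overlapping or possibly overlapping population": 11,
    },
)
print(screened, included)
```

The sketch yields 268 records screened and 6 studies included, which agrees with the numbers reported in the text.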

Figure 1. Flow diagram of the study selection process.
Characteristics of the included studies
The detailed characteristics of the included studies are reported in Table 1. One Korean study was a multicenter design (21), but the others were from single centers. Five studies from three groups included low-risk papillary thyroid microcarcinoma (8,20–22,24), while another group included lesions with proven Bethesda categories of V or VI with a size smaller than 1.5 cm (25). The sizes of the study populations ranged from 291 to 1235 patients, with the patients having mean ages of 51–54.4 years. All studies contained more female than male patients (8,20–22,24,25). Four studies from two Japanese groups presented results for mean follow-up periods of about or more than 5 years (8,20,22,24), while the remaining two groups had mean follow-up periods of less than 5 years (21,25). Four studies from two Japanese groups were prospective (8,20,22,24), while the other studies were retrospective (21,25). The exclusion criteria for AS are described in Table 1, with the exclusion criteria common across six studies from four groups being lymph node metastasis and extrathyroidal extension (8,20–22,24,25). Four of these studies from three groups had an additional exclusion criterion of distant metastasis (21,22,24,25). Further detailed characteristics of the AS of the included studies are summarized in Table 2. The mean maximum diameters of the included nodules were larger in one group (the study that included proven Bethesda categories V and VI) (25) than in the others (8,20–22,24). The criterion for size enlargement was an increase of ≥3 mm over the initial nodular size in all studies (8,20–22,24,25). In terms of the follow-up method, one group used periodic US (25), two studies used periodic US and fine needle aspiration biopsy (FNAB) for suspicious lymph nodes (8,20), one group used US, FNAB, and washout thyroglobulin for suspicious lymph nodes (21), and two studies used periodic palpation, US, chest radiography, or computed tomography (22,24).
Table 1. Characteristics of the Included Studies
The mean age was not clarified in the study; instead, the report classified age into three groups: 496 (40%) patients, >60 years; 570 (46%) patients, 40–59 years; 169 (14%) patients, <40 years.
We extracted data for only low-risk PTMC (T1a; size, <1 cm) for the current systematic review and meta-analysis, despite both T1a (<1 cm) and T1b (≥1 cm) tumors being enrolled in the study.
Asan Medical Center, Samsung Medical Center, and Seoul St. Mary's Hospital.
Other causes of exclusion in the study: patients with an autonomously functioning thyroid nodule (1 patient), Graves' disease (10 patients), or exogenous supplementation with thyroxine (4 patients) were excluded. During the initial 2 years, patients were also excluded if they elected surgery regardless of unchanged tumor status (3 patients), underwent surgery due to an increase in tumor size (1 patient), or died from unrelated disease (2 patients).
CNB, core needle biopsy; FNAB, fine needle aspiration biopsy; F-U, follow-up; IQR, interquartile range; MSKCC, Memorial Sloan-Kettering Cancer Center; NA, not available; PTMC, papillary thyroid microcarcinoma; US, ultrasound.
Table 2. Details of the Active Surveillance of the Included Studies
Mean of 7.8 ± 0.1 mm for 390 patients without change in size of nodule, and mean of 7.6 ± 0.5 mm for 25 patients with increase in size of nodule.
Asan Medical Center, Samsung Medical Center, and Seoul St. Mary's Hospital.
AS, active surveillance; CT, computed tomography; LN, lymph node; postop., postoperative.
Tumor size enlargement and lymph node metastasis
The yearly (until 5 years of follow-up) cumulative proportions of size enlargement of 3 mm or more over initial nodular size for papillary thyroid microcarcinoma are listed for three groups (two Japanese and one Korean) (8,21,22) in Table 3, along with the pooled cumulative proportions according to the meta-analysis. The pooled proportions of 1-, 2-, 3-, 4-, and 5-year size enlargement were 0.1% [95% confidence interval (CI), 0.0–0.5%], 0.3% [CI, 0.1–1.0%], 2.7% [CI, 2.0–3.6%], 5.1% [CI, 4.1–6.2%], and 5.3% [CI, 4.4–6.4%], respectively. Figure 2 shows the pooled cumulative proportions of size enlargement for the three groups. The increase in size enlargement tended to be steeper after 2 years of AS. No substantial study heterogeneity was observed in the size enlargement at follow-up years 1, 2, 3, 4, and 5 (I² = 31.4%).
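The observation that size enlargement became steeper after 2 years can be made concrete by differencing the pooled cumulative proportions reported above. The Python sketch below does only that. Note that the difference between two separately pooled estimates is a rough descriptive figure, not a pooled yearly incidence, and the function name `yearly_increments` is hypothetical.

```python
# Pooled cumulative proportions (%) of size enlargement by follow-up year,
# as reported in the text above
pooled_cumulative = {1: 0.1, 2: 0.3, 3: 2.7, 4: 5.1, 5: 5.3}


def yearly_increments(cumulative):
    """Difference consecutive cumulative percentages (rounded to 0.1)."""
    increments = {}
    previous = 0.0
    for year in sorted(cumulative):
        increments[year] = round(cumulative[year] - previous, 1)
        previous = cumulative[year]
    return increments


increments = yearly_increments(pooled_cumulative)
print(increments)
```

Under this rough reading, most of the 5-year cumulative proportion accrues in the third and fourth years of follow-up.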

Figure 2. The cumulative pattern of size enlargement by 3 mm or more.
Table 3. Cumulative Data (%) for Size Enlargement or Lymph Node Metastasis During Follow-Up Period
Asan Medical Center, Samsung Medical Center, and Seoul St. Mary's Hospital.
The original data were provided by the first author of the article (Oh et al. [21]).
The yearly cumulative proportions of lymph node metastasis of two groups (one Japanese, one Korean) (8,21) and the pooled cumulative proportions according to the meta-analysis until 5 years are also summarized in Table 3. The pooled proportion of 5-year lymph node metastasis was 1.6% [CI, 1.1–2.4%]. There was no study heterogeneity (I² = 0%).
Total number of delayed operations performed during AS
As there was no detailed information about the reasons for surgery during AS in the two recent large and longest follow-up studies from the Japanese groups (8,22), we extracted the corresponding data from two other studies published previously by the same groups (which constituted the longest and largest studies up until the previously mentioned two studies) (20,24). The causes were heterogeneous across the four groups. In the Kuma Hospital and Korean groups, about two-thirds of the delayed operations were performed for reasons other than a size increase or the detection of lymph node metastasis: 66.1% in Kuma Hospital and 69% in the Korean group (20,21). Although the number of patients undergoing delayed operations was small, 50% of the delayed operations in the North American group were not due to an increase in size or the detection of lymph node metastasis (25). In the Cancer Institute Hospital, 32% of the delayed operations were not due to an increase in size or the detection of lymph node metastasis (24).
Quality assessment of the studies
The quality of the included studies was assessed according to the RoBANS criteria and is presented in Figure 3. All studies showed a low risk of bias in the comparability of patient groups, selection of patient group, measurement of exposure to intervention, outcome assessment, incomplete outcome data, and selective reporting domains. However, the studies showed an unclear risk of bias in the blinding of participants and personnel domain, as they did not make clear statements in this regard (8,20–22,24,25). In the confounder domain, one study showed an unclear risk of bias due to unclear information on statistical adjustment for the observation periods (22).

Figure 3. The quality assessment of the included studies according to the RoBANS. RoBANS, risk of bias for nonrandomized studies.
Discussion
The current systematic review and meta-analysis showed, as the primary outcomes, a pooled proportion of 5-year tumor size enlargement of 5.3% [CI, 4.4–6.4%] and a pooled proportion of 5-year lymph node metastasis of 1.6% [CI, 1.1–2.4%]. In addition, many delayed surgeries were performed for reasons other than size increase or lymph node metastasis. These results suggest that AS is effective for the management of PTC, with a low proportion of tumor size increase and lymph node metastasis occurring at the 5-year follow-up. However, the reasons for delayed surgeries other than size enlargement or lymph node metastasis need to be further evaluated.
The worldwide incidence of papillary thyroid cancers has markedly increased during the past 20 years, especially in the United States and South Korea. The main cause of this phenomenon is detection of small PTCs during screening, or incidental detection on imaging studies (1–3). As these small PTCs are generally indolent, the Kuma Hospital group in Kobe led by Dr. Miyauchi and the Cancer Institute Hospital in Tokyo led by Dr. Sugitani suggested AS as an alternative to surgery (4,5). As time has progressed, several research groups from Korea, the United States, and Latin America have added their results on AS (8,20–25). The greatest advantage of AS is the reduction of unnecessary immediate surgery. In a study comparing AS and immediate surgery, the oncological outcomes of the immediate surgery and AS groups were similar, but the occurrence of unfavorable events, including vocal cord paralysis, hypoparathyroidism, and postsurgical complications, was higher in the immediate surgery group (11). On the basis of this background, the 2015 ATA guidelines suggested AS as an alternative to immediate surgery for low-risk PTC of 1 cm or less (42). To our knowledge, the present study is the first systematic review of the results of AS, evaluating the rates of tumor size enlargement and lymph node metastasis with pooled data. Moreover, we also evaluated the reasons for delayed surgery.
Although AS has become gradually more accepted in clinical practice, several issues remain to be resolved. First, there is hesitancy and resistance among patients and/or physicians (12,14). Inevitably, patients and physicians have to confront the choice between the possible unfavorable results of immediate surgery and living with cancer. In a recent study by Davies et al. using survey and interview responses collected at Kuma Hospital, the concerns about AS were initially comparable with those about immediate surgery, although the concern regarding AS decreased over time (12). A multicenter collaboration has developed a tool called thyroid cancer choice (TCC) for supporting decision-making by the patient and medical team (14). In a pilot trial, patients using TCC chose AS more readily than patients who did not use TCC [relative risk, 1.16; CI, 1.04–1.29] (14). This approach offers a model to decrease or solve the anxiety issue. Second, age younger than 40 years is reported to be a predictor of tumor progression (8). Considering their long life expectancy and increased chance of annual tumor size increase, young patients have a higher chance of requiring delayed surgery. Therefore, younger age should be carefully considered for optimal risk stratification when choosing between AS and immediate surgery (13). Third, in cases of size enlargement or lymph node metastasis, delayed surgery may require more extensive intervention; however, it is unclear whether lymph node metastasis was already present at initial diagnosis or whether it developed during the time of AS follow-up (43).
A strict study design is important for achieving reasonable results with AS. A recent review article suggested that stricter inclusion criteria, closer monitoring protocols, and earlier initiation of intervention will increase the acceptance rate of AS (15). This suggestion was based on knowledge gained from similar experiences of accepting AS in prostate cancer. As prostate cancer shows higher disease progression rates, distant metastasis rates, cancer-specific mortality, and lower overall survival than thyroid cancer, AS should be an even more effective and a relevant management model for thyroid cancer than for prostate cancer (15). The Memorial Sloan-Kettering Cancer Center (MSKCC) and Kuma Hospital recently developed a clinical framework for risk stratification in decision-making for AS for small PTCs (13). They categorized candidates into three groups (ideal, appropriate, and inappropriate) on the basis of various imaging findings, patient characteristics, and medical team characteristics. This approach tried to involve the characteristics of patients and medical teams used in decision-making, with the intention of raising the reliability of decision-making and achieving better results.
There are limitations to the current study. First, only a small number of relevant studies have been published, and the pooled data are based on this low number of studies. We included two articles with overlapping populations from the Japanese groups to gain additional information specific to the discussion. Second, the characteristics of the enrolled small PTC cases were not exactly the same across the studies. One of these characteristics is the definition of size enlargement. Although we defined size enlargement as an increase in maximal diameter of 3 mm or more according to the majority of enrolled studies, there is a suggestion that tumor volume measurement is more relevant than maximum diameter (25).
In conclusion, this systematic review and meta-analysis show that AS is effective for the management of PTC, with a low proportion of size enlargement or lymph node metastasis occurring in the 5-year follow-up. However, the substantial proportion of patients who undergo delayed surgeries for reasons other than size enlargement or lymph node metastasis needs to be further characterized.
Footnotes
Author Disclosure Statement
None, except J.H.B. Financial activities related to the present article: none to disclose. Financial activities not related to the present article: patent holder of unidirectional ablation electrode. Other relationships: Consultant of two radiofrequency companies, STARmed and RF medical since 2017.
Funding Information
No funding was received for this article.
