Abstract
Background:
In many risk-stratification systems, the decision to biopsy thyroid nodules is determined by their sonographic features and size. Nevertheless, even low-suspicion nodules are often biopsied at small size thresholds because it is assumed that larger malignant nodules are associated with poorer outcomes. The aim of this study was to quantify the effect of thyroid cancer tumor size on survival and risk of T4 stage, nodal disease, and distant metastases.
Methods:
The Surveillance, Epidemiology, and End Results 18 database was queried to obtain tumor size, staging information, and survival data for cases of differentiated thyroid cancer (DTC) and non-DTC reported between 2004 and 2014. Observed probabilities of tumor extent at diagnosis, including regional nodal disease and distant metastases, as a function of size and tumor histology were estimated for thyroid cancers measuring between 1 and 150 mm. A multivariate Cox regression model was used to describe all-cause mortality as a function of patient and tumor characteristics, and the functional dependence of mortality on size was computed.
Results:
A total of 112,128 patients were analyzed, with 67% having thyroid cancers ≥1 cm, and 29% ≥2.5 cm. For DTC tumors <4 cm, the risk of local invasion, nodal metastases, or distant metastases was low, and there was no size threshold associated with a sharp rise in adverse outcomes. For DTC tumors <4 cm, the probability of distant metastases was <3%. Older age, male sex, non-DTC histology, T4 stage, and regional and distant metastatic disease increased the all-cause mortality rate. Tumor size did not increase the mortality rate above baseline until tumors were >2.5 cm.
Conclusion:
Increasing tumor size does not affect survival until a threshold of 2.5 cm. Since the dimension of nodules on ultrasound has been shown to be larger than their size at gross pathology, these findings suggest that recommended size thresholds to biopsy low-suspicion thyroid nodules can be increased.
Introduction
Most risk-stratification systems for thyroid nodules base management decisions on sonographic features and nodule size, with lower size thresholds for recommending biopsy for nodules with more suspicious findings. In general, 1 cm is considered a practical lower limit for routine biopsy, even for highly suspicious nodules, because papillary microcarcinomas are widely accepted to have an indolent clinical course (4). The size thresholds for nodules with low or mild suspicion for malignancy are more variable. The American Thyroid Association and Korean Thyroid Imaging Reporting and Data System (TIRADS) systems use 1.5 cm for mildly suspicious nodules (4,5), the French and European TIRADS use 2 cm (6), and the ACR TIRADS specifies a threshold of 2.5 cm for low-suspicion nodules (7). The threshold of 1.5 cm in some systems is partially based on a single-institution retrospective study by Machens et al. published in 2005 (8), which showed that the cumulative risk of distant metastases from papillary thyroid carcinoma (PTC) or follicular thyroid carcinoma (FTC) began to increase at a tumor size of 2 cm. Although distant metastases usually confer a poor prognosis (9), survival was not explicitly evaluated in that study. Importantly, applying this threshold to ultrasound may result in overestimation of risk, as sonographic dimensions have been reported to be larger than the size at gross pathology by 5 mm on average (10,11).
The American Joint Committee on Cancer tumor staging system for thyroid cancer uses size thresholds of 2 cm and 4 cm for T-staging, implying that larger sizes are associated with poorer survival. However, there are few large, population-based studies examining the prognostic significance of tumor size, particularly <4 cm. Analysis of data from large-scale cancer databases, such as the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute, affords an opportunity to ascertain the relationships between tumor size and other variables related to disease extent and survival more accurately. The aim of the current study was to quantify the effects of tumor size of PTC, FTC, and other thyroid cancers on survival and on other measures of disease extent, including T4 stage, nodal disease, and distant metastases.
Methods
Study population
Using SEER*Stat 8.3.4, the SEER 18 database (12) was queried to identify cases in the research database meeting the following inclusion criteria: thyroid carcinoma, year of diagnosis from 2004 through 2014, known age, known tumor size between 1 and 150 mm, and malignant behavior. The geographic areas encompassed approximately 28% of the U.S. population.
Cases were subdivided by histology based on the International Classification of Diseases for Oncology, third edition (ICD-O-3) codes (13) for papillary carcinoma (8050, 8260, 8340–8344, 8350, and 8450–8460) and follicular carcinoma (8290 and 8330–8335). The remainder of the thyroid cancer cases consisted primarily of medullary and anaplastic thyroid carcinomas. Details about tumor size and stage at diagnosis were obtained from the Collaborative Staging extent of disease variables. Cases were subdivided by tumor dimension based on their reported size in millimeters. Cases without a recorded numerical size in millimeters (e.g., cases coded with approximate size ranges or cases coded as microscopic tumor only without a recorded size) were excluded. Although tumor sizes as large as 98.9 cm can be recorded in the database, a maximum of 15 cm was set for this analysis because larger tumors are rarely encountered in clinical practice, which limits the accuracy of the data in that range. In the majority of cases, tumor sizes in SEER were derived from surgical pathology reports, although in a very small minority of cases they were obtained from imaging or other noninvasive clinical evidence. Status of advanced tumor stage (T4), regional nodal disease, and distant metastases at the time of diagnosis were also recorded. Cases with N1 status were further subdivided into N1a, N1b, or N1 status not otherwise specified (NOS). Similarly, cases with M1 status were subdivided into cases with distant lymph node metastases only, distant metastases to non–lymph node organs, distant metastases to both, or distant metastases NOS. Vital status data, including number of months of survival, were also recorded.
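The histology grouping and size eligibility rules described above can be sketched in a few lines of Python. This is only an illustration of the stated criteria, not the study's actual query: the function and constant names are hypothetical, and the input is assumed to be an integer ICD-O-3 histology code and a size in millimeters (or None when no numerical size was recorded).

```python
# ICD-O-3 histology codes as listed in the Methods. All names below are
# illustrative assumptions, not taken from the study's own code.
PAPILLARY_CODES = {8050, 8260, 8350} | set(range(8340, 8345)) | set(range(8450, 8461))
FOLLICULAR_CODES = {8290} | set(range(8330, 8336))


def classify_histology(icd_o_3_code: int) -> str:
    """Map an ICD-O-3 histology code to the PTC / FTC / other grouping."""
    if icd_o_3_code in PAPILLARY_CODES:
        return "PTC"
    if icd_o_3_code in FOLLICULAR_CODES:
        return "FTC"
    return "other"


def size_is_eligible(size_mm) -> bool:
    """Keep only cases with a recorded numerical size from 1 to 150 mm."""
    return size_mm is not None and 1 <= size_mm <= 150
```

Under this sketch, any code outside the two listed sets falls into the "other" group, which mirrors the description of the remaining cases as primarily medullary and anaplastic carcinomas.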
Statistical analysis
Based on the SEER data, the observed probability of T4 status, regional nodal disease (N1 status), and distant metastases (M1 status) at diagnosis were computed as a function of size and tumor histology for all thyroid cancers ranging in size between 1 and 150 mm. The probabilities were estimated by fitting a logistic regression with a nonlinear term for size to the data. For M1 status, cumulative probabilities for size (defined for a given tumor size x as the number of cancer cases with size ≤x with M1 status divided by the total number of cancer cases with size ≤x) were also generated in a manner analogous to that used by Machens et al. (8).
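The cumulative probability defined above (cases with size ≤x and M1 status, divided by all cases with size ≤x) can be written out as a short Python sketch. The function name and the toy records are hypothetical and invented purely for illustration; they are not SEER data and do not reproduce any result of the study.

```python
def cumulative_m1_probability(cases, size_mm):
    """Fraction of cases with size <= size_mm that have M1 status.

    `cases` is an iterable of (size_in_mm, is_m1) pairs. Returns None
    when no case is at or below the requested size.
    """
    at_or_below = [is_m1 for size, is_m1 in cases if size <= size_mm]
    if not at_or_below:
        return None
    # True counts as 1, so the sum is the number of M1 cases.
    return sum(at_or_below) / len(at_or_below)


# Made-up toy records (size in mm, M1 flag); NOT SEER data.
toy_cases = [(5, False), (8, False), (12, False), (20, True),
             (25, False), (40, True), (60, True), (90, False)]

print(cumulative_m1_probability(toy_cases, 20))  # 1 of 4 cases -> 0.25
```

Because every case at or below x contributes to the denominator, this cumulative measure at a given size is influenced by the many small tumors below it, which is why it differs from the size-specific probability estimated by logistic regression.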
A multivariate Cox regression model (14) was used to describe all-cause mortality as a function of patient age, patient sex, and tumor characteristics (histology, tumor size, and other tumor extent variables). Cases with exact tumor sizes in millimeters ranging from 1 to 150 mm were included in the model. A Cox regression model was fit of the form:
$$h_i(t, X_i) = \lambda_0(t)\,\exp\left(\beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_K X_{iK}\right)$$
where $h_i(t, X_i)$ is the hazard rate of mortality at time $t$ with a vector of covariates $X_i = (X_{i1}, X_{i2}, \ldots, X_{iK})$. The model produces an estimate for a hazard ratio by approximating this hazard relative to a baseline hazard $\lambda_0(t)$, which represents a 50-year-old female with papillary thyroid cancer, with a 1 mm primary tumor size, no distant metastases, no regional lymph node involvement, and non-T4 stage. The effect of the covariates is multiplicative, with $\beta_k$ representing the log hazard ratio due to covariate $X_k$.
The hazard ratio is the change in the instantaneous risk of death for a subject, per unit increase in the predictor (e.g., size) relative to the corresponding risk in the baseline subject. The instantaneous risk of death for a subject at any given time represents the (small) chance that the subject would die in the next small interval of time (e.g., the next second), given they were alive just previously. Under the proportional hazards assumption used in Cox regression, the hazard ratio remains unchanged over time. Hence, it is applicable to all subjects, irrespective of the length of their survival. A hazard ratio <1 indicates a lower instantaneous risk of death, while a hazard ratio >1 indicates a higher instantaneous risk of death compared to the baseline subject.
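The time independence of the hazard ratio follows directly from the proportional hazards form. As a brief worked step, assume the standard Cox form with baseline hazard $\lambda_0(t)$ and coefficients $\beta_k$; the covariate vectors $X$ and $X'$ below are hypothetical subjects introduced only for this illustration.

```latex
\frac{h(t, X)}{h(t, X')}
  = \frac{\lambda_0(t)\,\exp\left(\sum_{k=1}^{K} \beta_k X_k\right)}
         {\lambda_0(t)\,\exp\left(\sum_{k=1}^{K} \beta_k X'_k\right)}
  = \exp\left(\sum_{k=1}^{K} \beta_k \,(X_k - X'_k)\right)
```

The baseline hazard cancels, so the ratio does not depend on $t$. If the two subjects differ by one unit in a single covariate $j$ and are otherwise identical, the ratio reduces to $\exp(\beta_j)$, which is the hazard ratio per unit increase in that predictor.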
To estimate the functional dependence of the hazard ratio on log(size), the size predictor was chosen to be a penalized spline with four degrees of freedom for the fit. The spline is a piecewise polynomial function, whose fit is optimized to best fit the observed survival distribution. The log function tends to emphasize smaller values relative to larger ones, thereby focusing on the size range where most of the data are present. Statistical computations were performed using R v3.3.0. Statistical significance was assessed based on p-values <0.05.
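The effect of the log transform on the size axis can be illustrated with a small Python sketch. It only compares where a given size falls within the 1 to 150 mm range on a linear versus a logarithmic axis; it does not reproduce the penalized spline fit, and the function names are hypothetical. The fraction computed on the log axis is the same for any logarithm base.

```python
import math

MIN_SIZE_MM = 1.0
MAX_SIZE_MM = 150.0


def log_scale_fraction(size_mm: float) -> float:
    """Relative position (0 to 1) of size_mm in the 1-150 mm range on a log axis."""
    return (math.log(size_mm) - math.log(MIN_SIZE_MM)) / (
        math.log(MAX_SIZE_MM) - math.log(MIN_SIZE_MM))


def linear_scale_fraction(size_mm: float) -> float:
    """Relative position (0 to 1) of size_mm in the 1-150 mm range on a linear axis."""
    return (size_mm - MIN_SIZE_MM) / (MAX_SIZE_MM - MIN_SIZE_MM)


# A 25 mm tumor sits about 16% of the way along the linear axis,
# but about 64% of the way along the log axis.
print(round(linear_scale_fraction(25), 2), round(log_scale_fraction(25), 2))
```

In other words, sizes up to 25 mm occupy roughly two-thirds of the log axis, so a spline with a fixed number of degrees of freedom spends most of its flexibility in the range where most tumors lie.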
Results
Study population
There were 112,128 patients in the SEER database who met the inclusion criteria. The median age was 50 years (range 0 to >85 years; interquartile range 39–61 years). A total of 24% (26,680) were male, and 82% (91,504) were white, 11% (11,854) other (American Indian/Alaska native or Asian/Pacific Islander), 7% (7432) black, and 1% (1338) unknown race/ethnicity. PTC was the tumor histology in 89% (99,584) of cases, FTC in 8% (8532) and other in 4% (4012). The distribution of tumor sizes in the SEER population is shown in Figure 1. Approximately 67% of thyroid cancers in the studied SEER population were ≥1 cm, and 29% of thyroid cancers were ≥2.5 cm.

Figure 1. Histogram of tumor size in the Surveillance, Epidemiology, and End Results population of thyroid cancers.
Primary tumor size and risk of invasive and metastatic disease
Figure 2 depicts the risk of T4 stage (advanced locally invasive cancer) at diagnosis for PTC, FTC, and other thyroid cancer histologies. For PTC, as primary tumor size increases, the risk of T4 stage increases linearly, without a threshold effect. For FTC, the probability of T4 stage increases in a near-linear fashion once the size exceeds a threshold of approximately 4 cm. However, the risks of T4 stage for PTC and FTC are both low. At a size of <2.5 cm, the probability of local invasion is <5% for PTC and <1% for FTC.

Figure 2. Probability of T4 stage at diagnosis as a function of tumor size and histology. Dashed lines represent limits of 95% confidence intervals (CIs).
Risk curves for regional nodal disease (N1 status) at diagnosis are shown in Figure 3. PTC and non-DTC cancers show a linear increase in regional nodal disease risk for primary tumor sizes from 0 to 2 cm, with a plateau effect above this size range. For PTC, the maximum probability of N1 status is slightly <40%. FTC has an extremely low risk of nodal metastases for primary tumor sizes <5 cm. Above 5 cm, N1 risk increases slightly, but remains <20%, and is substantially lower than for PTC.

Figure 3. Probability of regional nodal disease (N1 status) at diagnosis as a function of tumor size and histology. Dashed lines represent limits of CIs.
The risks of distant metastases (M1 status) at diagnosis as a function of primary tumor size are shown in Figure 4A. For the histological groupings examined, no size threshold effect is seen, with PTC and FTC both showing similar low, gradual increases in the probability of M1 status as tumor size increases. At a size <4 cm, the probability of distant metastases for DTC is well below 3%. The probability of M1 status increases much more rapidly with size increases for non-DTC histologies. Cumulative risks of M1 status at diagnosis, as shown in Figure 4B, are consistently higher in non-DTC cancers than in DTC and are also higher for FTC than PTC.

Tumor size and survival
The multivariate Cox regression model shows the effect of tumor and patient characteristics on survival (Table 1). Age and male sex both demonstrate statistically significant contributions to the hazard function, which is based on all-cause mortality. Compared to the baseline scenario of PTC, FTC does not show a statistically significant elevated hazard ratio (1.04; p = 0.34), but non-DTC histologies show an elevated hazard ratio (3.25; p < 0.000001). Regional nodal disease confers statistically significantly elevated hazard ratios (1.1–1.4). Hazard ratios in patients with distant metastases are also elevated, ranging from a ratio of 1.63 in patients with distant lymph node metastases only (p = 0.01) to 6.17 in patients with both distant nodal and non-nodal metastases (p < 0.000001).
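Because covariate effects in the model are multiplicative, hazard ratios for several covariates combine by multiplication (equivalently, their log hazard ratios add). The Python sketch below illustrates this with three hazard ratios quoted in the text. It assumes no interaction terms and all other covariates at their baseline values; the dictionary keys and function name are hypothetical, and the output is an arithmetic illustration rather than a clinical estimate.

```python
import math

# Hazard ratios quoted in the text, relative to the baseline scenario.
REPORTED_HR = {
    "non_dtc_histology": 3.25,
    "distant_ln_only": 1.63,
    "distant_ln_and_non_ln": 6.17,
}


def combined_hazard_ratio(present):
    """Combine hazard ratios for the named covariates by summing log hazard ratios."""
    log_hr = sum(math.log(REPORTED_HR[name]) for name in present)
    return math.exp(log_hr)


# Non-DTC histology together with distant nodal and non-nodal metastases:
# 3.25 * 6.17 = 20.0525 relative to baseline, under the stated assumptions.
print(round(combined_hazard_ratio(["non_dtc_histology", "distant_ln_and_non_ln"]), 4))
```

An empty list returns 1.0, which corresponds to the baseline scenario itself.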
Table 1 abbreviations: SE, standard error; LNs, lymph nodes; NOS, not otherwise specified.
The contribution of size to all-cause mortality is shown in Figure 5. The baseline scenario was a female PTC patient aged 50 years (the median age for the population) with no T4 status and no regional or distant metastases. The all-cause mortality risk attributable to size does not exceed that of the baseline scenario until a pathological tumor size of 2.5 cm. Beyond this threshold, partial hazard ratios increase linearly with log size. Partial hazards 10% and 20% higher than baseline are reached at tumor sizes of 3.1 cm and 3.7 cm, respectively.

Figure 5. Partial hazard ratio for all-cause mortality attributable to size in a multivariate Cox regression model, which includes patient age, patient sex, tumor histology, T4 status, N stage, and M stage as covariates. The y axis refers to the ratio of the hazard function relative to the baseline scenario, namely a female papillary thyroid carcinoma patient aged 50 years (the median age for the population) with no T4 status and no regional or distant metastases. Dashed lines represent limits of CIs.
Discussion
Existing risk-stratification systems for thyroid nodules base recommendations for biopsy on the presence of suspicious ultrasound features and size. For low-suspicion nodules, size thresholds vary from 1.5 to 2.5 cm. The lower threshold is partially based on a study by Machens et al. that showed an increased risk of distant metastases at a threshold of 2 cm (8). However, the data presented here show no threshold effect of tumor size on the risk of distant metastases, nodal metastases, or locally invasive disease for tumors <4 cm.
Large discrepancies were also found between the absolute values for risk compared to Machens et al. For example, the risks of extrathyroidal disease reported by these authors, which approached 90% for PTC and 50% for FTC at the upper end of their tumor size range (8 cm), are much higher than the 20% rate observed at this size in the present study. One possible explanation for this discrepancy is that some cases of extrathyroidal extension, if sufficiently mild, may have been T3 stage instead of T4, and in the present study, minimal extrathyroidal extension was not specifically evaluated due to its relatively limited impact on prognosis. However, a more likely explanation is that Machens et al. only examined patients who had surgery, with most undergoing reoperation as opposed to primary surgery. Two-thirds of the reoperations were completion thyroidectomies, and the remainder were interventions for tumor recurrence. Therefore, they may have comprised a selected population with a higher proportion of advanced disease or aggressive tumors.
This study correlated primary tumor size with all-cause mortality and found very little change in mortality until tumor size was >2.5 cm. The relatively high size threshold for increased cancer-related deaths is similar to previously published data on the effects of tumor size on survival (15–17). The lack of a clear-cut size threshold effect on survival below 2.5 cm suggests that increasing the size cutoff for biopsy to 2.5 cm is unlikely to result in an abrupt increase in cancer-related deaths. This is larger than what some guidelines recommend but corresponds to the threshold advocated for nodules classified as mildly suspicious in the ACR TIRADS (7). Moreover, tumor sizes in the study by Machens et al. (and in most of the SEER database) were obtained from surgical pathology. Published reports have shown that ultrasound overestimates thyroid tumor size by an average of 5 mm compared to pathology (10,11), which further justifies the use of higher size thresholds for fine-needle aspiration based on ultrasound measurements. Just as the age cutoff used in staging was revised following a SEER study (18) and a follow-up study (19), the present data, in combination with future studies evaluating outcomes, may have implications for re-evaluating size criteria in biopsy guidelines.
This study has several limitations. First, SEER variables for extent of disease are based on staging at or shortly after diagnosis and therefore do not permit evaluation of tumor recurrence or progression. The data from this study therefore cannot predict how likely a tumor of a given size is to progress eventually to metastatic disease. Nonetheless, for the purpose of setting size criteria for management of low-suspicion nodules, predicting initial extent of disease or survival based on tumor size is still helpful. Another limitation is that it is difficult to make clinical decisions based on size data alone. Clinical management of thyroid nodules requires a combination of size and numerous sonographic and other imaging characteristics that are not available in SEER.
In summary, this study provides useful quantitative information on disease extent and survival for thyroid cancer as a function of primary tumor size. The study shows that there is no size threshold within the size ranges typically used to guide management of thyroid nodules that carries a higher risk of nodal or distant metastases, and that all-cause mortality is elevated above baseline only for thyroid tumor sizes >2.5 cm. This represents an advance over existing literature because the data are derived from a large number of patients, and the study design minimized selection biases that can occur when examining only surgical referrals. Although more studies with outcome data should be performed to determine the clinical impact of changing size thresholds for recommending biopsy of thyroid nodules, this study suggests that these size thresholds may be increased without a significantly increased risk of morbidity and mortality.
Footnotes
Author Disclosure Statement
No competing financial interests exist.
