Abstract
Background:
Over the last decade, the utilization of molecular testing (MT) for the evaluation of thyroid nodules has increased. Rates and patterns of adoption of MT and its effect on thyroidectomy rates nationally are unknown. Varying rates of MT adoption at the state level provide an opportunity to study the effects of MT on thyroidectomy rates using a quasiexperimental study design.
Methods:
We performed a retrospective analysis of American adult patients in the Merative™ MarketScan® Research Databases who underwent thyroid fine-needle aspiration (FNA) from 2011 to 2021. MT included commercially available DNA and RNA platforms and traditional targeted mutational analysis. Interrupted time series analysis was used to evaluate the inflection of MT adoption and thyroidectomy rates after 2015. Difference-in-differences (DID) analysis was used to causally analyze the effect of MT adoption on thyroidectomy rates in high-adoption (at least a 10% increase in MT utilization) versus low-adoption states (no more than 5% increase in MT utilization) from 2015 to 2021.
Results:
We identified 471,364 patients who underwent thyroid FNA. The utilization of MT increased over the study period from 0.01% [confidence interval, CI: 0.00% to 0.02%] in 2011 to 10.1% [CI: 9.7% to 10.5%] in 2021, with an immediate (β2 = 1.61, p = 0.002) and steeper (β3 = 0.60, p < 0.001) increase in MT adoption after 2015. Utilization of MT was lower in black patients, elderly patients, patients in rural areas, and patients with Medicaid (p < 0.05). Thyroidectomy rates were inversely correlated with MT utilization (r = −0.98, p < 0.0001). From 2015 to 2021, the average MT utilization rate increased from 2.4% to 15.3% in high-adoption states and from 1.6% to 5.6% in low-adoption states. In low-adoption states, thyroidectomy rates decreased more but to similar levels (18.5–13.2%) compared with high-adoption states (15.9–13.4%), with an adjusted DID rate of −3.2% [CI −5.6% to −0.8%].
Conclusions:
The acceleration in adoption of MT after 2015 likely coincides with the publication of American Thyroid Association guidelines. Black, elderly, and rural patients are less likely to receive MT. Although thyroidectomy rates were inversely correlated with MT utilization, our study suggests that this correlation is not causal. The effect of MT on thyroidectomy rates may be overshadowed by decreasing aggressiveness of thyroid nodule evaluation.
Introduction
Thyroid nodules affect more than 60% of the population worldwide. 1,2 These common nodules are often asymptomatic and benign: fine-needle aspiration (FNA) classifies 55–74% of thyroid nodules as benign, while only 2–8% are malignant. 2,3 Notably, 15–30% of thyroid nodules as determined by FNA are cytologically indeterminate. 1,3 –5 Indeterminate nodules have historically been referred for diagnostic thyroid surgery. 3,6,7 However, because indeterminate nodules are often benign, many patients may be exposed to the risks of thyroid surgery with only a diagnostic rather than a therapeutic benefit.
Molecular testing (MT) has evolved into a tool to rule out malignancy in indeterminate thyroid nodules and reduce unnecessary thyroid surgery. A variety of commercial MT platforms based on somatic mutational analysis, RNA/DNA gene expression classifiers, and microRNA evaluation are available, and validation studies show excellent diagnostic performance. 1 –3,8,9 Multiple events have influenced the increasing adoption of MT over the last decade, as commercial platforms such as Afirma (Gene Expression Classifier, 2012; Genomic Sequencing Classifier, 2018) and ThyroSeq (v1, 2013; v2, 2015; v3, 2018) have undergone iterative improvements. Furthermore, in 2015, several significant events with respect to MT in the United States occurred, including the release of ThyroSeq v2, the approval of the Afirma Current Procedural Terminology (CPT) code, and incorporation of MT into the American Thyroid Association (ATA) guidelines for thyroid nodules and differentiated thyroid cancer. 7
While evidence supporting the efficacy of MT for indeterminate thyroid nodules has grown, rates of utilization and factors influencing MT adoption are unknown. Furthermore, while the primary purpose of MT is currently to reduce unnecessary thyroid surgery, it has yet to be demonstrated that MT causes an observable decrease in thyroidectomy rates at a national level. We hypothesized that converging events in 2015 led to heterogeneous state-level MT adoption in the United States, and that quasiexperimental methods could demonstrate whether variable MT adoption at the state level has led to a greater decrease in thyroidectomy rates in high-adoption states than in low-adoption states.
Materials and Methods
Data source
We performed a retrospective cohort study using the Merative™ MarketScan® Research Databases from 2008 to 2021. 10 The MarketScan Research Databases contain individual-level, deidentified, health care claims information from employers, health plans, hospitals, and Medicare and Medicaid programs (government-funded insurance programs) in the United States. The Commercial Claims and Encounters Database and Medicare Supplemental and Coordination of Benefits databases include more than 273 million unique patients since 1995. These data represent the national health care experience (both inpatient and outpatient setting) of insured employees and their dependents for active employees, early retirees, and Medicare-eligible retirees with employer-provided Medicare Supplemental plans. The ability to capture patients in both inpatient and outpatient settings is critical to studies that concurrently analyze outpatient evaluation and thyroidectomy, which can be either an inpatient or outpatient procedure.
The annual databases include private-sector health data from more than 350 unique payers in all 50 states. The Multi-State Medicaid Database reflects the health care service use of beneficiaries covered by Medicaid programs in numerous geographically dispersed states. This study uses deidentified data and was determined to be exempt by the Columbia University Institutional Review Board.
Study population
We identified patients who underwent thyroid FNA by selecting patients with CPT and International Classification of Diseases (ICD) procedure codes for FNA in combination with diagnosis codes for thyroid nodules or goiter (Supplementary Table S1). If a patient had multiple FNAs, the most recent FNA date was used as the index FNA date. Patients were eligible for the analytical cohort if they were 18 years or older and had continuous health insurance enrollment from 1 month before to 3 months after the index FNA date.
Outcomes and covariates
Patients were deemed to have undergone MT if they had CPT codes for molecular tests from 1 month before to 3 months after the index FNA date. Molecular tests included commercially available DNA and RNA platforms in addition to targeted mutational analysis (TMA) of BRAF, RET/PTC rearrangements, PAX8/PPARG, TERT, or other targeted genomic sequence analysis panels (Supplementary Table S1). Thyroidectomy was identified using ICD and CPT procedure codes and categorized into three groups: lobectomy, total thyroidectomy, and unspecified thyroidectomy (Supplementary Table S1). Because cytology results are not available in MarketScan, thyroidectomy rates were defined per FNA rather than per Bethesda III or IV result.
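The MT exposure window described above can be sketched as a simple date check. This is only an illustration: the function name and the approximation of a month as 30 days are assumptions, since the text does not state how month boundaries were computed.

```python
from datetime import date, timedelta

# Assumption: 1 month is approximated as 30 days and 3 months as 90 days.
DAYS_BEFORE = 30
DAYS_AFTER = 90


def had_mt_in_window(index_fna: date, mt_claim_dates: list[date]) -> bool:
    """Return True if any molecular-test claim date falls within the
    window from 30 days before to 90 days after the index FNA date."""
    start = index_fna - timedelta(days=DAYS_BEFORE)
    end = index_fna + timedelta(days=DAYS_AFTER)
    return any(start <= claim <= end for claim in mt_claim_dates)


# Hypothetical patient: index FNA on 1 June 2018, one MT claim six weeks later.
print(had_mt_in_window(date(2018, 6, 1), [date(2018, 7, 15)]))
```

Both window endpoints are treated as inclusive here, which is also an assumption.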
Additional patient demographic data included health insurance (Medicaid, commercial insurance with/without supplemental Medicare), age at FNA, self-reported gender (male, female), race (white, African American, Hispanic, other/unknown), metropolitan statistical area (MSA), and insurance type. An MSA is a region with a high population density, in contrast to rural regions. Information on race was only available in the Medicaid data set, while geographic data, including states where patients resided, were only available for commercially insured patients with/without supplemental Medicare.
Statistical analyses
The study period was divided into pre- and postintervention periods coinciding with the publication of 2015 ATA Management Guidelines for Adult Patients with Thyroid Nodules and Differentiated Thyroid Cancer on January 12, 2016. 7 Descriptive statistics were used to summarize patients' characteristics depending on their date of FNA (preintervention from 2011 to 2015 vs. postintervention from 2016 to 2021) and MT status (yes vs. no). Standardized mean differences (SMDs, the difference in means or proportions divided by the pooled standard deviation) were used to compare the demographics of the pre- and post-2015 study populations. We considered an SMD >0.1 to indicate a clinically significant difference between two periods. Chi-square tests were used to determine differences in MT rates by patient characteristics. Pearson's correlation coefficient was used to examine the correlation between MT utilization and thyroidectomy rates at both annual and state levels.
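As an illustration of the SMD comparison, the sketch below uses the conventional form in which the difference is divided by a pooled standard deviation, with the binomial variance p(1 − p) for proportions. The function names and input values are hypothetical and are not study data.

```python
import math


def smd_means(mean_a: float, sd_a: float, mean_b: float, sd_b: float) -> float:
    """SMD for a continuous covariate: difference in means over the pooled SD."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


def smd_proportions(p_a: float, p_b: float) -> float:
    """SMD for a binary covariate, using the binomial variance p * (1 - p)."""
    pooled_sd = math.sqrt((p_a * (1 - p_a) + p_b * (1 - p_b)) / 2)
    return (p_a - p_b) / pooled_sd


# Hypothetical example: proportion of female patients in two periods.
smd = smd_proportions(0.84, 0.83)
print(f"SMD = {smd:.3f}; exceeds the 0.1 threshold: {abs(smd) > 0.1}")
```

Because the SMD is scaled by the spread of the covariate rather than by sample size, it stays small for trivial differences even in very large cohorts, where a p-value would often be significant.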
Quasiexperimental methods were used to examine the effect of ATA guideline publication on rates of MT and thyroidectomy. Interrupted time series (ITS) analysis evaluates a baseline outcome (β0) and preintervention trend (β1) and determines whether the intervention leads to immediate changes in outcome (β2) or changes in postintervention trends (β3). 11,12 In our study, unadjusted ITS analysis was used to quantify changes in MT and thyroidectomy rates pre-2015 and post-2015. ITS analyses were accomplished using ordinary least squares (OLS) regression with Newey–West autocorrelation-adjusted standard errors (SEs). An a priori defined lag of 1 year assumed that the annual mean MT rate and mean thyroidectomy rate were correlated with the previous year's rates.
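The segmented-regression form of a single-group ITS model can be sketched as follows. This is a minimal pure-Python illustration of the point estimates only: it does not compute Newey–West SEs, the data are synthetic, and the coding of the time variables (time since study start, time since the first postintervention year) is an assumption rather than the authors' SAS specification.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_its(years, rates, first_post_year):
    """Return OLS estimates [b0, b1, b2, b3] for
    rate = b0 + b1*time + b2*post + b3*time_since_intervention."""
    rows = []
    for y in years:
        t = float(y - years[0])                      # time since study start
        post = 1.0 if y >= first_post_year else 0.0  # postintervention indicator
        since = (y - first_post_year) * post         # years since intervention
        rows.append([1.0, t, post, since])
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, rates)) for i in range(k)]
    return solve(xtx, xty)  # normal equations: (X'X) b = X'y


# Synthetic, noise-free series built from hypothetical coefficients.
years = list(range(2011, 2022))
b0, b1, b2, b3 = 0.5, 0.3, 1.5, 0.6
rates = [
    b0 + b1 * (y - 2011) + (b2 + b3 * (y - 2016) if y >= 2016 else 0.0)
    for y in years
]
print([round(b, 3) for b in fit_its(years, rates, first_post_year=2016)])
```

In this coding, b2 is the level change in the first postintervention year and b3 is the change in slope relative to the preintervention trend.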
Difference-in-differences (DID) analysis is a quasiexperimental method that evaluates causality when randomized trials are not feasible. DID analysis defines an intervention effect as the difference between the expected difference in outcomes had an intervention not occurred and the actual difference in outcomes after the intervention took place. 13 We applied DID analysis to quantify the effect of MT adoption on thyroidectomy rates in low- versus high-adoption states. In other words, assuming that thyroidectomy rates in the two groups of states would have followed parallel trends had MT not been adopted, differences in the change in thyroidectomy rates between the groups can be causally attributed to variable rates of MT adoption. As geographic data were not available in the MarketScan Medicaid database, the DID analysis was restricted to commercially insured patients with known state of residence in 2015 or 2021.
The utilization rates of MT and thyroidectomy were calculated for each state in 2015 and in 2021. States where MT utilization increased <5% from 2015 to 2021 were defined as low-adoption states, while states where MT utilization increased by ≥10% were defined as high-adoption states. Each state had to contribute at least 20 cases of FNA in 2015 and 2021 to be included in the DID analysis. The differences in thyroidectomy rates between 2015 and 2021 and their confidence intervals (CIs) were calculated for low-adoption states and high-adoption states. A marginal binomial regression model based on generalized estimating equations with an identity link was fitted at the patient level to calculate the adjusted DID of thyroidectomy rates. The model included indicators for pre- or postperiod and for high- or low-adoption state, the interaction between these two indicators, and adjustment for patients' age, gender, MSA, and region.
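The state grouping rule above can be sketched as a small classifier. Two points here are assumptions rather than statements from the text: the thresholds are read as percentage-point changes, and states with an intermediate increase (5% to <10%) are treated as belonging to neither group. The function name is hypothetical.

```python
def classify_state(mt_2015, mt_2021, n_fna_2015, n_fna_2021, min_fna=20):
    """Classify a state by its change in MT utilization (rates in percent)."""
    if n_fna_2015 < min_fna or n_fna_2021 < min_fna:
        return "excluded"       # fewer than 20 FNAs in either year
    change = mt_2021 - mt_2015  # assumed to be a percentage-point change
    if change >= 10:
        return "high"
    if change < 5:
        return "low"
    return "excluded"           # intermediate adopters fall in neither group


# Hypothetical states, not study data.
print(classify_state(2.0, 14.0, 150, 180))  # large increase
print(classify_state(1.5, 4.0, 60, 75))     # small increase
print(classify_state(2.0, 9.0, 200, 210))   # intermediate increase
```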
If our hypothesis that increasing MT adoption causes decreasing thyroidectomy rates were true, we would observe a greater decrease in thyroidectomy rate in high-adoption states than in low-adoption states, with a positive point estimate for the delta (difference) between groups in the change from pre- to postperiod thyroidectomy rates and CIs not crossing zero.
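The sign convention described above can be made concrete with a small unadjusted calculation. The ordering used here (change in low-adoption states minus change in high-adoption states) is inferred from the statement that a positive delta would support the hypothesis, and the rates are hypothetical rather than study data.

```python
def did(low_pre, low_post, high_pre, high_post):
    """Unadjusted DID: change in low-adoption states minus change in
    high-adoption states (all rates in percent)."""
    return (low_post - low_pre) - (high_post - high_pre)


# Hypothetical scenario in which high-adoption states decrease more.
supports = did(low_pre=18.0, low_post=16.0, high_pre=18.0, high_post=12.0)

# Hypothetical scenario in which low-adoption states decrease more.
contradicts = did(low_pre=18.0, low_post=12.0, high_pre=18.0, high_post=16.0)

print(supports, contradicts)
```

Under this convention a positive value is consistent with MT adoption reducing thyroidectomy, while a negative value indicates a larger decrease where adoption was lower. The adjusted estimate in the study comes from a regression interaction term, which this arithmetic does not reproduce.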
All hypothesis tests were two-sided. A p-value of <0.05 was considered statistically significant. All analyses were conducted using SAS version 9.4 (SAS Institute, Inc., Cary, NC). The study adhered to the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) guidelines.
Results
We identified 807,877 patients who underwent thyroid FNA from 2008 to 2021 (Fig. 1). After excluding patients with incomplete enrollment, patients with age <18 years, and patients from 2008 to 2010 where the MT rate was 0%, our final cohort included 471,364 patients who underwent thyroid FNA from 2011 to 2021. Of these, 66,515 (14.1%) underwent multiple FNAs. A total of 83.7% of patients were female and the median age at FNA was 52 years (Table 1).

Cohort selection. FNA, fine-needle aspiration.
Patients' Demographics by Pre- and Postintervention (Guidance Released)
The information on race was only available in the Medicaid data set, and geographic data were only available for commercially insured patients and Medicare beneficiaries. SMD >0.1 indicates a clinically significant difference between the pre- and postintervention groups.
FNA, fine-needle aspiration; IQR, interquartile range; MSA, metropolitan statistical area; SMD, standardized mean difference.
Overall, 16,218 (3.4%) patients underwent MT and 71,640 (15.2%) underwent thyroidectomy. From 2011 to 2021, the utilization of MT as a proportion of patients undergoing thyroid FNA increased from 0.01% [CI: 0.00% to 0.02%] to 10.1% [CI: 9.7% to 10.5%], while the thyroidectomy rate decreased from 17.5% [CI: 17.2% to 17.8%] to 12.5% [CI: 12.1% to 13.0%] (Fig. 2A). Afirma claims were most common (N = 8964), followed by TMA (N = 7721), ThyroSeq (N = 1427), and ThyGeNEXT/ThyraMIR (N = 612) (Fig. 2B). Thyroidectomy rates by MT subtype were comparable, although an increase in thyroidectomy rate with TMA compared with the commercially available MTs was observed from 2018 to 2021 (Fig. 2C). Adoption of the various subtypes of MT was similar across regions, with the exception of ThyroSeq, which was predominantly used in the northeast (Fig. 3).

Annual rates of molecular testing utilization and thyroidectomy.

Adoption of molecular tests by region.
A negative correlation between MT and thyroidectomy rates was observed at both the annual level (r = −0.98, p < 0.001) for the overall study population and at the state level (r = −0.28, p = 0.03) for commercially insured patients (Fig. 4). In examining characteristics associated with MT, we found that patients who received MT were younger (median age 50 [interquartile range, IQR 41–58] years vs. 52 [IQR 42–59] years) and that MT was more frequent in male patients (4.2% vs. 3.3% female), Hispanic patients (4.7% vs. 2.2% white vs. 1.4% black), commercially insured patients (3.5% vs. 2.1% Medicaid), and MSA residents (3.5% vs. 2.4% non-MSA) (Table 2).

Pearson correlation of molecular testing utilization with thyroidectomy rates at the annual and state levels.
Patients' Demographics by Molecular Testing
NA, not applicable; TMA, targeted mutational analysis.
A total of 85.1% (13,808 of 16,218) of the MTs in our study took place after 2015, coinciding with the publication of ATA guidelines (Table 1). ITS analysis demonstrated an immediate increase in MT utilization after 2015 (β2 = 1.61, SE 0.32, p = 0.002) (Table 3 and Fig. 5A). Furthermore, the increasing trend of MT utilization rate post-2015 exceeded that of the pre-2015 era (β3 = 0.60, SE 0.09, p < 0.001). In contrast, thyroidectomy rates did not immediately change post-2015 (β2 = −0.50, SE 0.30, p = 0.138) and continued to decrease at a similar slope as the pre-2015 trend (β3 = −0.025, SE 0.128, p = 0.851) (Table 3 and Fig. 5B).

Single interrupted time series analysis for molecular testing and thyroidectomy.
Interrupted Time Series Analysis Examining the Association of the Guidance Release with Utilization of Molecular Testing and Thyroidectomy at State Level
Interrupted time series analyses were accomplished using OLS regression with Newey–West autocorrelation-adjusted SEs. An a priori defined lag of 1 year assumed that the annual mean molecular testing rate and mean thyroidectomy rate were correlated with the rates of the previous year.
OLS, ordinary least squares; SE, standard error.
We identified 17 high-adoption states (N = 16,273) where MT utilization increased by at least 10% and 8 low-adoption states (N = 5756) where MT utilization increased by no more than 5% (Table 4). From 2015 to 2021, the average MT utilization rate increased from 2.4% to 15.3% in high-adoption states and from 1.6% to 5.6% in low-adoption states (Fig. 6). In contrast to our hypothesis that thyroidectomy rates would decrease more in high-adoption states, we observed that the decrease in thyroidectomy rates in low-adoption states (−5.3% [CI −3.5 to −1.9]) exceeded that of high-adoption states (−2.5% [CI −3.6 to −1.3]). The unadjusted DID was −2.6% [CI −3.5 to −1.9]. After controlling for age, sex, MSA, and region, we observed an adjusted DID of −3.2% [CI −5.6 to −0.8]. However, the 2021 thyroidectomy rates in high-adoption (13.4%) and low-adoption (13.2%) states were comparable.

Molecular testing and surgery rates in high utilizers vs. low utilizers from 2015 to 2021 (N = 22,029, high-adoption state N = 16,273, low-adoption state N = 5756). DID, difference-in-differences; MSA, metropolitan statistical area.
Comparisons of Patients' Demographics Within High-Adoption and Low-Adoption States by Pre- and Postintervention
SMD >0.1 indicates a clinically significant difference between the pre- and postintervention groups.
Discussion
This retrospective national cohort study outlined trends in MT adoption over the last decade. In addition to identifying several populations where MT is less frequently utilized, we demonstrated a steady increase in MT utilization and a decrease in thyroidectomy rates since 2011, with a specific inflection point denoting increased MT adoption around 2015. However, in contrast to our hypothesis, we observed greater decreases in thyroidectomy rates in low-adoption states compared with high-adoption states, suggesting a correlative rather than causal relationship between MT utilization and thyroidectomy.
Literature surrounding MT has largely focused on performance and cost-effectiveness, while data on its utilization and adoption are lacking. We believe our study is the first to show the rate at which MT utilization has increased in the United States, from 0% to ∼10% nationally over the last decade. However, given that rates of indeterminate cytology are even higher, ranging from 15% to 30%, our study shows that MT technology has yet to fully saturate clinical practice on a national level. 1,3,4,7,14,15
There are several possible reasons why MT utilization has not fully saturated clinical practice in the United States. Our study suggests that factors often associated with limited access to health care also hinder access to MT. We observed that MT is less frequently utilized by underinsured, elderly, rural, and black patients. Commercial MT platforms cost in excess of $3000 per test, and insurance coverage can be variable. Although Hispanic patients more frequently received MT, race data were only available for Medicaid patients, and therefore, race observations may not be generalizable to the entire population.
Disparities have been similarly demonstrated across other aspects of thyroid cancer care in the United States, from diagnosis to treatment. Minority race and lower socioeconomic status have been associated with advanced thyroid cancer stage at presentation. 16 –18 Rates of postoperative complications after thyroidectomy are higher in racial/ethnic minority groups, 16,19 and patients with lower educational attainment are more likely to encounter inadequate thyroid cancer care. 16,20 MT therefore represents an additional area, with unique challenges due to its financial cost and variability in insurance coverage, where conscious effort must be made to decrease disparities in care.
The lack of full MT adoption may also be explained by skepticism regarding the practical impact of MT on surgical decision-making. Several single-institution investigations have challenged the clinical utility of molecular profiling for indeterminate thyroid nodules. 15,21,22 A 2016 study from Noureldine et al. found that MT altered surgical decision-making in only 7.9% of patients who underwent MT; in other words, other factors such as imaging characteristics and patient preference were concordant with MT results such that the added value of MT was small. 15 Specific situations in which MT was found to be redundant included highly suspicious sonographic features suggesting malignancy, large nodules causing compressive symptoms, and patient preference for surgery and against surveillance. Similarly, Huang et al. determined that MT, in relation to its preceding diagnostic steps (i.e., patient interview, sonographic evaluation, and FNA biopsy), did not significantly improve upon the ability to distinguish malignant thyroid nodules. 21
Another study of a high-volume thyroid center noted unchanged thyroidectomy rates even after adoption of MT, challenging the utility of MT. 22 Therefore, while the performance of MT has been validated and the negative predictive value sufficiently high to rule out malignancy, there may be a discordance between the actual and perceived performance of MT such that a significant amount of surgical decision-making nationally may be occurring independently of MT results.
Furthermore, it is important to understand the utility of MT in the context of its cost. Cost-effectiveness analyses (CEAs) calculate an incremental cost-effectiveness ratio (ICER), which is the quotient of incremental changes in costs over incremental changes in outcome, typically quality-adjusted life years. An assumption of CEAs analyzing MT is that given an intervention of MT or no MT, all differences in outcome are attributed to MT when in fact, MT may only alter decision-making in a fraction of these cases, as demonstrated by Noureldine et al. 15,23 –28
In other words, the benefit in outcome attributed to MT may be overestimated, and when the incremental benefit provided by MT is considered, the ICER may become less favorable for MT. In a comparison of reflexive versus selective strategies of MT, selective MT was the more cost-effective strategy if the costs of MT exceeded $1050, a threshold that current costs of MTs in the United States do exceed. 29
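The ICER argument above can be illustrated with a short calculation. This is a deliberately simplified sketch: the function names, the QALY gain, and the idea of scaling the incremental benefit by the fraction of patients whose management changes are illustrative assumptions, not a model taken from the cited analyses.

```python
def icer(incremental_cost, incremental_effect):
    """Incremental cost-effectiveness ratio: cost per unit of outcome gained."""
    if incremental_effect == 0:
        raise ValueError("ICER is undefined when the incremental effect is zero")
    return incremental_cost / incremental_effect


def diluted_icer(incremental_cost, incremental_effect, fraction_changed):
    """ICER when only a fraction of tested patients have their management
    changed, so the realized benefit is scaled down (simplifying assumption:
    every tested patient still incurs the full incremental cost)."""
    return icer(incremental_cost, incremental_effect * fraction_changed)


# Hypothetical inputs: $3000 added cost per patient, 0.5 QALY gained.
print(icer(3000, 0.5))                # benefit credited to every tested patient
print(diluted_icer(3000, 0.5, 0.25))  # benefit realized in one quarter of patients
```

Under these assumptions the ratio grows in inverse proportion to the fraction of patients whose decisions actually change, which is the sense in which the ICER may become less favorable for MT.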
Despite these obstacles, our study shows that MT rates have increased nationally in the United States after the publication of the first large, multicenter validation study of MT in 2012. 8 Although adoption subsequently increased, our study suggests an additional inflection point occurred in 2015, likely representing the composite effect of the 2015 ATA guidelines, approval of the Afirma CPT code, and release of ThyroSeq v2 on clinical practice. 7 Previous studies have also documented the impact of the 2015 ATA guidelines on practice patterns. For instance, implementation of the more conservative 2015 ATA guideline-concordant treatment for low-risk patients with well-differentiated thyroid carcinomas resulted in decreased rates of upfront total thyroidectomies and completion thyroidectomies. 14,30 Use of radioactive iodine therapy after total thyroidectomies also decreased following publication of the guidelines. 14
Notably, although our study demonstrated an immediate and sustained increase in MT utilization, thyroidectomy rates remained in stable decline without a change in the trend post-2015. This stable decline may be attributable to a general decrease in aggressiveness in the US approach to the evaluation of thyroid nodules, as evidenced by increasing sonographic thresholds for FNA, lobectomy versus total thyroidectomy, and active surveillance for papillary microcarcinoma.
Discordance between the post-2015 acceleration of MT adoption and the lack of an equivalent post-2015 acceleration in the decline of thyroidectomy rates was also demonstrated in our DID analysis. We paradoxically demonstrated that low-adoption states had a larger decrease in thyroidectomy rates compared with high-adoption states. This suggests that the association we observed between declining thyroidectomy rates and increasing MT utilization may be correlative rather than causal and could be explained by other factors. However, absolute thyroidectomy rates in low- and high-adoption states in 2021 were comparable, suggesting that an overall decrease in aggressiveness toward pursuing thyroid surgery perhaps overshadowed this effect between 2015 and 2021.
Several limitations to this study require acknowledgment. Administrative data sets are generally limited by missing data, coding errors, and inadequate granularity. Particularly relevant is the lack of cytology results, requiring our analysis to evaluate MT and thyroidectomy rates as a proportion of FNAs as a whole rather than the subset with indeterminate cytology. However, given that rates of indeterminate cytology may be increasing, even approaching 50% at an academic medical center, the number of total FNAs may be a more stable denominator with which to normalize MT utilization rates. 31
MT utilization was derived from claims data, and because CPT codes for the different MT subtypes may vary by payer and by the CPT codes available at the time, there may be mixing and migration between MT subtypes within our cohort, which is why these subtypes were analyzed collectively rather than individually. MarketScan is largely limited to commercial insurers, which limits the applicability of our results across the range of insurers in the United States. Because MT adoption is heterogeneous within states, our state-level analysis may have obscured effects that would have been observable at an institutional level. However, codes to identify individual treating facilities are not captured in MarketScan, so an institution-level analysis was not possible, which may have diluted the capture of a real effect of MT on thyroidectomy rates.
In summary, there has been an increase in MT utilization and a decrease in thyroidectomies over the past decade nationwide. The MT utilization inflection point at 2015 likely exemplifies how the recent ATA guidelines have directly influenced clinical practice patterns. The growing adoption of MT leads us to confront the true efficacy and cost-effectiveness of MT for the betterment of patient care and the health care system more broadly. As we determine the conditions in which MT is best utilized, there also must be awareness and action against disparities in access to MT. Future studies of the real-world cost of MT, as well as analysis of the effects of MT on an institutional level rather than a state level, may improve our understanding of MT's true impact on thyroid nodule evaluation.
Footnotes
Acknowledgment
This work was presented as a poster at the ATA Annual Meeting 2023 in Washington, DC.
Authors' Contributions
Y.H.: formal analysis, methodology, and writing—review and editing; S.J.C.: writing—original draft; J.D.W.: writing—review and editing; J.H.K.: conceptualization and writing—review and editing; C.M.M.: writing—review and editing; J.A.L.: writing—review and editing; E.J.K.: conceptualization, methodology, writing—original draft, and writing—review and editing.
Author Disclosure Statement
The authors have no financial conflicts of interest to disclose.
Funding Information
This research did not receive any specific grant from funding agencies in the public, commercial, or nonprofit sectors.
Supplementary Material
Supplementary Table S1
