Abstract
Background:
Little is known about the application of the Bethesda System for Reporting Thyroid Cytopathology (TBSRTC) in pediatric thyroid nodules. This meta-analysis aimed to investigate the use of TBSRTC in the pediatric population.
Methods:
We searched PubMed and Web of Science for relevant articles. Pooled proportions and their 95% confidence intervals (CIs) were computed using the random-effects model. We used subgroup analyses and meta-regression to explore the sources of heterogeneity. Egger's regression test and funnel plot visualization were used to examine publication bias.
Results:
We included 17 articles comprising 3687 pediatric thyroid nodules for meta-analyses. TBSRTC outputs, including frequency and risk of malignancy (ROM), for the majority of categories were not statistically different from those of a recently published meta-analysis of 145,066 thyroid nodules in adult patients. The resection rate (RR) in the pediatric group was significantly higher in most of the categories compared with published adult data: benign, 23.2% [CI = 18.6–27.9] vs. 13.0% [CI = 9.5–16.5]; atypia of undetermined significance/follicular lesion of undetermined significance, 62.6% [CI = 50.3–74.9] vs. 36.2% [CI = 29.9–42.5]; follicular neoplasm/suspicious for follicular neoplasm, 84.3% [CI = 75.2–93.4] vs. 60.5% [CI = 54.5–66.5]; and suspicious for malignancy, 93.8% [CI = 90.1–97.6] vs. 69.7% [CI = 64.0–75.5].
Conclusion:
TBSRTC is a valuable tool for making clinical decisions for pediatric patients with thyroid nodules. Pediatric patients with benign and indeterminate thyroid nodules had a higher RR than their adult counterparts, but the ROM of these categories in adults and children was not statistically different, suggesting a potential risk of overtreatment in pediatric patients. Determining the best treatment guidelines and additional tools for risk stratification must be a top priority to precisely identify the target patient groups for surgical intervention.
Introduction
Fine-needle aspiration (FNA) is a useful tool to guide therapeutic management for patients with thyroid nodules (1,2). Currently, most thyroid FNAs are classified based on the Bethesda System for Reporting Thyroid Cytopathology (TBSRTC), which includes six diagnostic categories: nondiagnostic (ND), benign, atypia of undetermined significance/follicular lesion of undetermined significance (AUS/FLUS), follicular neoplasm/suspicious for follicular neoplasm (FN/SFN), suspicious for malignancy (SM), and malignant (1,3). TBSRTC aimed to implement uniform terminology and to provide several numerical outputs, such as the frequency of each diagnostic category, resection rate (RR), and risk of malignancy (ROM) (4–6). While ROM is a major decision driver, indicating the need for surgery, the frequency of diagnostic categories and RR serve as quality control measures. For example, an unusually high ND rate may prompt an audit of the whole thyroid FNA workflow in an institution. Since the implementation of TBSRTC in 2009, its utility has been widely acknowledged (7).
Pediatric thyroid nodules are uncommon and account for only 1–2% of all thyroid nodules (8). The American Thyroid Association (ATA) has published a separate management guideline for pediatric thyroid nodules (9) because these nodules were shown to have different demographic, clinical, prognostic, and molecular characteristics when compared with adult cases (10–14). Although the analytical performance of FNA in children is considered similar to that in adults (9), several institutional series reported that RR and ROM of pediatric thyroid nodules may differ from those of adult patients (6,15–19).
Therefore, while there is evidence that the application of TBSRTC in pediatric thyroid FNA is an accurate tool to tailor the management of patients (17,19), the ATA pediatric task force recognized that available data are limited at the moment (9). We performed this meta-analysis to determine the utility of TBSRTC in the pediatric population.
Materials and Methods
Search strategy and study identification
We identified relevant articles in PubMed and Web of Science from January 2007 to June 2020 using the search term “thyroid AND (FNA OR fine-needle aspiration) AND (children OR childhood OR adolescence OR pediatric).” We also reviewed the reference lists of selected articles to search for additional data. We followed the recommendations of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement while conducting this meta-analysis (20).
Selection criteria and abstract screening
After searching the electronic databases, search results were imported into EndNote (Clarivate Analytics, Philadelphia, PA). Two reviewers independently screened the titles and abstracts for potential articles using predetermined criteria. Studies were included if they fulfilled all the following criteria: (i) reporting FNA results of thyroid nodules using TBSRTC 2009 or other national reporting systems convertible to TBSRTC [e.g., The Italian Working Group classification (21)]; (ii) baseline data of a pediatric population; and (iii) providing adequate data regarding the number of FNAs, the number of cases undergoing resection, and the number of malignancies following surgery for at least one category. The exclusion criteria were (i) review articles; (ii) proceeding articles, posters, or theses; or (iii) studies with a duplicated population. If potentially overlapping data were found (i.e., same institution and the same study period), we selected the study with the largest number of FNAs. Discrepancies between the two reviewers were resolved by discussion and consensus.
Full-text screening and data extraction
The full texts of included studies were downloaded and screened independently by two reviewers. A standardized data extraction form was used to extract the following information: institution, country, year of publication, study design, the number of patients, age, sex, the number or proportion of ultrasound (US)-guided FNAs, the number of FNAs, the number of surgeries, the number of malignancies following histopathologic examination for each TBSRTC category, and the number of histological diagnoses. Discrepant results between the two reviewers were resolved by discussion and consensus.
Data analysis
RR in each TBSRTC category was identified by dividing the number of cases undergoing surgery by the total number of FNAs in the corresponding category. ROM was considered the proportion of malignant cases confirmed by histopathologic examination among the resected cases. To avoid bias, cases of noninvasive follicular thyroid neoplasms with papillary-like nuclear features (NIFTP), if present, were classified as malignant.
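As a minimal illustration of these two definitions, the sketch below computes RR and ROM from raw counts. The function names are hypothetical; the counts are the overall totals reported in the Results (3687 FNAs, 1426 resections, 683 malignancies). These are crude proportions across all categories, not the pooled random-effects estimates.

```python
def resection_rate(n_resected: int, n_fna: int) -> float:
    """RR: resected cases divided by all FNAs in the category."""
    if n_fna <= 0:
        raise ValueError("n_fna must be positive")
    return n_resected / n_fna


def risk_of_malignancy(n_malignant: int, n_resected: int) -> float:
    """ROM: histologically malignant cases divided by resected cases."""
    if n_resected <= 0:
        raise ValueError("n_resected must be positive")
    return n_malignant / n_resected


# Overall counts from the Results, all TBSRTC categories combined.
total_fna, total_resected, total_malignant = 3687, 1426, 683

rr = resection_rate(total_resected, total_fna)
rom = risk_of_malignancy(total_malignant, total_resected)
print(f"RR = {rr:.1%}, ROM = {rom:.1%}")  # RR = 38.7%, ROM = 47.9%
```

Because ROM uses resected cases as its denominator, it reflects only nodules that reached surgery.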
We used independent-samples t-tests (Student's and Welch's) to compare the TBSRTC outputs of pediatric thyroid nodules with those of adult nodules; the adult data were extracted from the most recent meta-analysis of 145,066 thyroid nodules (6). However, the statistical comparisons of the two groups were based mainly on the 95% confidence intervals (CIs). If the CIs of the two groups did not overlap, the difference was considered statistically significant; otherwise, it was not (22,23). The p-value from the independent-samples t-test is provided only to aid interpretation of the presented data. Because the p-value calculation does not take into account the weight of each study, interpreting the results solely on the basis of the p-value might introduce bias.
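The CI-based decision rule can be sketched as follows. The `Estimate` type and the `intervals_overlap` helper are hypothetical names introduced for illustration; the RR values (in %) are the pediatric and adult estimates quoted in the Abstract.

```python
from typing import NamedTuple


class Estimate(NamedTuple):
    point: float
    lower: float  # lower bound of the 95% CI
    upper: float  # upper bound of the 95% CI


def intervals_overlap(a: Estimate, b: Estimate) -> bool:
    """True when the two confidence intervals share at least one value."""
    return a.lower <= b.upper and b.lower <= a.upper


# RR (%) per category: (pediatric, adult), as quoted in the Abstract.
rr_pairs = {
    "benign": (Estimate(23.2, 18.6, 27.9), Estimate(13.0, 9.5, 16.5)),
    "AUS/FLUS": (Estimate(62.6, 50.3, 74.9), Estimate(36.2, 29.9, 42.5)),
    "FN/SFN": (Estimate(84.3, 75.2, 93.4), Estimate(60.5, 54.5, 66.5)),
    "SM": (Estimate(93.8, 90.1, 97.6), Estimate(69.7, 64.0, 75.5)),
}

# Non-overlapping CIs are treated as a statistically significant difference.
significant = {
    category: not intervals_overlap(pediatric, adult)
    for category, (pediatric, adult) in rr_pairs.items()
}
for category, flag in significant.items():
    print(f"{category}: significant = {flag}")
```

Non-overlap is a conservative criterion: two estimates whose 95% CIs overlap slightly can still differ at the 0.05 level under a formal test, so this rule is less likely to produce false positives than to miss modest differences.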
We used JAMOVI and Comprehensive Meta-Analysis (Englewood, NJ) for statistical analyses. Proportions and their CIs were pooled using the random-effects model.
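For readers unfamiliar with random-effects pooling of proportions, the sketch below shows one common approach. It rests on several assumptions that the text does not confirm: the DerSimonian-Laird estimator for the between-study variance, untransformed (raw) proportions, and a normal-approximation CI. The study counts are hypothetical, and the function name is introduced only for illustration; the software packages named above may use different estimators or transformations.

```python
import math
from typing import Sequence, Tuple


def pool_random_effects(
    events: Sequence[int], totals: Sequence[int]
) -> Tuple[float, float, float, float]:
    """Pool raw proportions with a DerSimonian-Laird random-effects model.

    Returns (pooled, ci_lower, ci_upper, tau2). Assumes every study
    proportion lies strictly between 0 and 1; real software typically
    applies a transformation or continuity correction for 0% or 100%.
    """
    if len(events) < 2:
        raise ValueError("at least two studies are required")
    props = [e / n for e, n in zip(events, totals)]
    variances = [p * (1 - p) / n for p, n in zip(props, totals)]

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1 / v for v in variances]
    fixed = sum(wi * pi for wi, pi in zip(w, props)) / sum(w)
    q = sum(wi * (pi - fixed) ** 2 for wi, pi in zip(w, props))

    # Between-study variance (tau^2), truncated at zero.
    df = len(props) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * pi for wi, pi in zip(w_star, props)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se, tau2


# Hypothetical data: resected cases and total FNAs in four studies.
events = [12, 30, 8, 21]
totals = [50, 90, 40, 60]
pooled, lower, upper, tau2 = pool_random_effects(events, totals)
print(f"pooled = {pooled:.3f} [CI = {lower:.3f}-{upper:.3f}], tau2 = {tau2:.4f}")
```

Compared with a fixed-effect model, the added between-study variance widens the CI and gives smaller studies relatively more weight.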
We quantified heterogeneity across the studies using the I² statistic (24). Heterogeneity was classified into three degrees as “low,” “moderate,” and “high” as previously described (25). The origins of heterogeneity were further examined using meta-regression and subgroup analysis.
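The heterogeneity statistic can be illustrated with a short sketch that derives it from Cochran's Q. The function names and input values are hypothetical. The cutoffs used to label heterogeneity (50% and 75%) are an assumption for illustration only, since the text defers to reference 25 without stating them.

```python
from typing import Sequence


def i_squared(effects: Sequence[float], variances: Sequence[float]) -> float:
    """Return the I-squared statistic (in %) computed from Cochran's Q."""
    weights = [1 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0:
        return 0.0
    # Share of total variability not explained by sampling error alone.
    return max(0.0, (q - df) / q) * 100


def classify_heterogeneity(
    i2: float, moderate_from: float = 50.0, high_from: float = 75.0
) -> str:
    """Label heterogeneity; the default cutoffs are assumed, not sourced."""
    if i2 >= high_from:
        return "high"
    if i2 >= moderate_from:
        return "moderate"
    return "low"


# Hypothetical study proportions with equal within-study variances.
homogeneous = i_squared([0.2, 0.2, 0.2], [0.001, 0.001, 0.001])
heterogeneous = i_squared([0.1, 0.5, 0.9], [0.001, 0.001, 0.001])
print(homogeneous, classify_heterogeneity(homogeneous))
print(round(heterogeneous, 1), classify_heterogeneity(heterogeneous))
```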
Risk of bias and publication bias assessment
The methodological quality of the included studies was independently evaluated using the National Heart, Lung, and Blood Institute (NHLBI) assessment tool (26). Egger's regression test and funnel plots of estimates were used to assess publication bias. A p-value of <0.05 was considered statistically significant for publication bias.
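A minimal sketch of Egger's regression test is given below, assuming its classic formulation: the standardized effect (estimate divided by its standard error) is regressed on precision (the reciprocal of the standard error), and an intercept far from zero suggests funnel plot asymmetry. The function name and the input values are hypothetical. The sketch stops at the t statistic; a two-sided p-value would come from a t distribution with n − 2 degrees of freedom, for example via a statistics library.

```python
import math
from typing import Sequence, Tuple


def egger_intercept(
    effects: Sequence[float], std_errors: Sequence[float]
) -> Tuple[float, float, float, int]:
    """Return (intercept, se_intercept, t_statistic, degrees_of_freedom)."""
    x = [1 / se for se in std_errors]                   # precision
    y = [e / se for e, se in zip(effects, std_errors)]  # standardized effect
    n = len(x)
    if n < 3:
        raise ValueError("at least three studies are required")

    # Ordinary least squares fit of y = intercept + slope * x.
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Standard error of the intercept from the residual variance.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r * r for r in residuals) / (n - 2)
    se_intercept = math.sqrt(s2 * (1 / n + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else math.nan
    return intercept, se_intercept, t_stat, n - 2


# Hypothetical data in which smaller studies (larger SE) report larger effects.
effects = [0.30, 0.35, 0.42, 0.55, 0.60]
std_errors = [0.03, 0.05, 0.08, 0.12, 0.15]
intercept, se_intercept, t_stat, dof = egger_intercept(effects, std_errors)
print(f"intercept = {intercept:.2f}, t = {t_stat:.2f}, df = {dof}")
```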
Results
After removing duplicates from PubMed and Web of Science, we identified a total of 1005 studies for abstract screening, of which 50 were included for full-text evaluation. We excluded a further 33 and finally selected 17 studies with 3687 nodules for meta-analysis (Fig. 1) (16–19,27–39).

Fig. 1. Flowchart for study selection.
Among these nodules, 1426 (38.7%) were selected for surgery and 683 of them turned out to be malignant (47.9% of resected; 18.5% of all) after histopathologic examination. There were two studies (17,28) in which the authors did not report data for AUS/FLUS and FN/SFN categories separately but only provided the combined data for these two categories. All the included studies were retrospective. Most studies used the TBSRTC 2009 to classify thyroid FNA, and only one study used the Italian Working Group classification (17). Table 1 shows the characteristics of the 17 included studies. Using the NHLBI assessment tool, the majority of included studies were of fair to good quality (Supplementary Table S1).
Table 1. Baseline Characteristics of Included Studies
F:M, female to male; FNA, fine-needle aspiration; NA, not accessible.
The frequency, RR, and ROM of each TBSRTC category in the pediatric population
Similar to adults (6), the benign category was most commonly diagnosed (62.3% [CI = 58.2–66.5]) in pediatric thyroid FNA. The RR was highest in the malignant category (92.0% [CI = 87.8–96.2]) and lowest in the ND category (21.0% [CI = 14.2–27.9]). The ROM was highest in the malignant category (98.9% [CI = 97.8–99.9]) and SM (90.5% [CI = 85.0–95.9]). We found a significant level of heterogeneity in most of the analyses for RR (I² ≥ 50%). The ROM results were more robust among the institutions (I² < 25%), except in the indeterminate categories.
We observed a tendency of increased ROM in pediatric patients aged ≤18 years, particularly in the ND and indeterminate nodules, compared with the whole pediatric population. However, there were no statistically significant differences between the two age groups (Supplementary Table S2).
Among the resected nodules, 743 (52.1%) and 683 (47.9%) nodules were benign and malignant, respectively. Papillary thyroid carcinoma (PTC) was the most common type of malignancy (88.4%), followed by follicular thyroid carcinoma (7.3%). There was only one NIFTP case reported in the included studies (16).
The TBSRTC outputs of all categories are shown in Table 2. Compared with adult data (6), there were no statistically significant differences in frequency and ROM, with the only exception being that the ROM of the SM category in the pediatric population was significantly higher than that in the adult group. For pediatric nodules, the RR of most categories was significantly higher than that of adult nodules, most notably in the indeterminate categories (AUS/FLUS, FN/SFN, and SM) (Table 2 and Figs. 2 and 3). Pooled TBSRTC outputs of Western and Asian countries are shown in Supplementary Table S3. There were no significant differences between these two geographic regions.

Fig. 2. Resection rate of pediatric thyroid nodules in AUS/FLUS (

Fig. 3. Resection rate and malignancy risk of six Bethesda system categories in the pediatric and adult populations. ROM, risk of malignancy; RR, resection rate.
Table 2. Comparison of Adult and Pediatric Thyroid Fine-Needle Aspiration Using the Bethesda System
Data were extracted from the Vuong et al. study (6).
The statistical comparisons of the two groups were based on the CI values; bold values suggest a significant difference between two groups since the CIs of these two groups do not overlap with each other.
AUS/FLUS, atypia of undetermined significance/follicular lesion of undetermined significance; CI, 95% confidence interval; FN/SFN, follicular neoplasm/suspicious for follicular neoplasm; ND, nondiagnostic; SM, suspicious for malignancy.
Sensitivity analysis and publication bias
In the leave-one-out analysis, no single study significantly altered the overall meta-analysis results, and a considerable amount of heterogeneity remained. To search for publication bias, we examined funnel plots and performed Egger's regression test. Funnel plots showed no evidence of publication bias among the included studies, and Egger's regression test further confirmed the absence of publication bias (Supplementary Figs. S1 and S2).
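The leave-one-out procedure can be sketched generically: each study is omitted in turn and the remaining studies are re-pooled. To keep the example short, a plain inverse-variance weighted mean stands in for the random-effects model used in this meta-analysis; that substitution, the function names, and the study values are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

Study = Tuple[str, float, float]  # (label, effect estimate, variance)


def inverse_variance_pool(studies: List[Study]) -> float:
    """Stand-in pooling rule: inverse-variance weighted mean."""
    weights = [1 / variance for _, _, variance in studies]
    weighted_sum = sum(w * effect for w, (_, effect, _) in zip(weights, studies))
    return weighted_sum / sum(weights)


def leave_one_out(
    studies: List[Study], pool: Callable[[List[Study]], float]
) -> Dict[str, float]:
    """Re-pool the data with each study omitted in turn."""
    return {
        label: pool(studies[:i] + studies[i + 1:])
        for i, (label, _, _) in enumerate(studies)
    }


# Hypothetical studies; study D is an apparent outlier.
studies = [
    ("A", 0.20, 0.002),
    ("B", 0.25, 0.003),
    ("C", 0.22, 0.002),
    ("D", 0.60, 0.004),
]
full = inverse_variance_pool(studies)
loo = leave_one_out(studies, inverse_variance_pool)
for label, estimate in loo.items():
    print(f"omit {label}: pooled = {estimate:.3f} (all studies: {full:.3f})")
```

A study is usually flagged as influential when omitting it moves the pooled estimate outside the CI of the full analysis; any pooling function with the same signature can be passed in place of the stand-in.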
Discussion
TBSRTC has been extensively used in classifying thyroid nodules (1), but much less is known about its utility in the pediatric population, mostly because of the rarity of thyroid nodules in children. This study provides insights into the use of TBSRTC in pediatric patients with thyroid nodules. To the best of our knowledge, this is the first meta-analysis to summarize the significance of TBSRTC in pediatric nodules. The study by Vuong et al. (6) was used as a comparison because it is the most recent meta-analysis of TBSRTC outputs in the adult population. Furthermore, it included a fair amount of data from Asian institutions, which were mostly neglected by other existing meta-analyses (4,5).
There was no consistency in the age threshold for pediatric patients among the included studies, which applied upper limits ranging from 17 to 21 years (Table 1). The ATA has recommended the age of 18 as the cutoff age for the pediatric group (9). At the same time, the American Academy of Pediatrics has identified the upper age limit of pediatrics as 21 years (40). When we analyzed the subgroup of studies restricted to patients aged ≤18 years, we found no statistically significant differences in TBSRTC outputs in comparison with the whole series (Supplementary Table S2).
Compared with the adult cases, the RRs of pediatric patients were significantly higher in all TBSRTC categories, with the only exception being the ND category (Table 2). The largest absolute differences were observed in the indeterminate categories (Fig. 3), suggesting that clinicians dealing with pediatric nodules more frequently opted for surgery. One explanation for this variation is that the current guidelines (e.g., ATA) recommend definitive thyroidectomies or lobectomies for children with indeterminate thyroid nodules because of the existing knowledge of high ROM in these categories in the pediatric population (9). Furthermore, surgical resection may be indicated in children without malignant FNA cytology if the thyroid nodules increase in size during follow-up or exceed 4 cm in size. Pediatric patients are more likely to present with palpable nodules, whereas most thyroid nodules in adults are incidentally detected by imaging, and there is evidence that sporadic pediatric thyroid cancers are larger and more likely to present with more advanced disease than adult ones (41–45), which could be another factor contributing to surgical intervention. As learned from other childhood cancers (46,47), an uncertain diagnosis (e.g., indeterminate thyroid nodule) in pediatric patients may place a significant psychological burden on their families; therefore, parents may be more willing to accept diagnostic surgery. Additionally, active surveillance has been recommended for indeterminate cytology and low-risk thyroid cancer in adult patients (48,49), but such studies have not been performed in the pediatric population.
The ATA pediatric guidelines stated that indeterminate FNA categories accounted for 35% of pediatric FNAs based on the limited data available (9). Our meta-analysis found that the actual frequency of indeterminate pediatric nodules, including AUS/FLUS and FN/SFN, was roughly half that figure (16.6%). In their institutional series, Cherella et al. reported that AUS/FLUS nodules were less likely to be malignant than FN/SFN nodules (44% vs. 71%, respectively), advocating that different management strategies may need to be considered for these two categories (18). However, our study could not confirm such a considerable gap in ROM between the two indeterminate categories. At the same time, our results indicate that pediatric SM nodules should not be considered indeterminate because of their very high ROM (90.5%), which is close to that of the malignant category.
A noteworthy result from this study is that the ROMs in indeterminate categories were not statistically different between pediatric and adult series despite a significantly higher RR in the pediatric group. Interestingly, such a trend was also observed in ND and benign nodules, which collectively accounted for two-thirds of all pediatric nodules and ∼40% of surgically resected nodules. Many previous studies focused on indeterminate nodules, and this finding has been overlooked. The combination of a higher RR and a low ROM is well illustrated by the case of benign nodules, which were operated on much more frequently in children (23.2% vs. 13.0%), although the ROM was lower than that in adults (4.6% vs. 8.0%). The findings of our meta-analysis raise the question of whether pediatric patients with nonmalignant nodules are being overtreated. Postoperative complications may have a distressing effect on children since they can affect these patients both physically and psychosocially. Management protocols for pediatric thyroid nodules should be carefully optimized to reduce the rate of overtreatment in pediatric patients with benign or indeterminate FNA results.
Most of the resected nodules were benign, and PTC was the most common malignant histological type, which is consistent with published data (41,50,51). NIFTP, a newly established entity, has been reported as exceedingly rare in children, accounting for about 2% of pediatric nodules, which is less common than in the adult population (52,53). We identified only one NIFTP case (1/1426) in this series. This extremely low prevalence might, however, underestimate the real rate of NIFTP in our study because most of the included studies were published before the NIFTP diagnosis was established. The ATA and TBSRTC recently proposed molecular testing for indeterminate nodules (3,54), which is aimed at reducing the rate of unnecessary diagnostic surgeries. Preoperative molecular workup, particularly multigene panel testing, is now widely used in North America, but it is not readily available elsewhere (55). Unlike adult cases, pediatric nodules with a genetic alteration are most likely associated with a diagnosis of cancer (56), making the combination of FNA cytology and molecular testing a valuable tool to discriminate benign from malignant nodules. In this meta-analysis, we found only two series employing FNA cytology combined with preoperative molecular testing (33,57). Interestingly, all cases with genetic alterations were malignant on histopathologic examination. Somatic RAS mutations and RET/PTC rearrangements are the most common genetic alterations in sporadic pediatric thyroid cancer (33,57). In radiation-induced pediatric thyroid cancer, BRAF mutations were the most prevalent genetic event in Fukushima PTC patients following the nuclear power plant accident (58), which is different from post-Chernobyl accident-associated PTC cases (59). Post-Chernobyl thyroid cancer is associated with a solid growth pattern and increased aggressiveness (60). Following the Fukushima power plant accident, however, the characteristics of PTC did not change significantly over the years (61).
Although this analysis provides useful information about the value of TBSRTC in pediatric thyroid nodules, this meta-analysis has several limitations, mainly due to the level of heterogeneity. First, the difference in age cutoff values might contribute to the significant level of heterogeneity, although we found no differences in TBSRTC outputs between the age groups stratified by a cutoff of 18 years. Next, only a few studies reported the mean/median diameter of the nodules (29,31,41,62). Meta-regression could have helped determine whether differences in diameter contributed to the heterogeneity, but the data were insufficient for such an analysis. To avoid bias caused by missing NIFTP data in studies published before the NIFTP era, we classified all noninvasive encapsulated follicular variant PTCs (potential NIFTP diagnoses) as malignant. Given the reported low rate of NIFTP in pediatric thyroid nodules (52), this is not likely to affect the ROM compared with the adult group. Other factors that could contribute to the significant level of heterogeneity include geographic variation in pediatric thyroid cancer prevalence and differences in diagnosis among individual institutions. It should also be emphasized that TBSRTC itself has other well-recognized limitations, which are associated with the differential diagnoses of follicular-patterned hyperplastic/adenomatous nodules, follicular adenoma/carcinoma, and some cases of follicular variant of PTC (63).
The statistics in this meta-analysis and in the adult series were mainly based on summary-level data, which might limit the findings compared with a meta-analysis using individual patient data (IPD). However, a meta-analysis of IPD would require substantial time and effort to contact the authors of included studies, obtain their IPD, and generate a consistent data set across all included studies. Given the large number of included studies and the ethical and confidentiality concerns that arise when using IPD, it is very difficult to perform such a study on a scale comparable to this meta-analysis. Recently, it has been recommended that TBSRTC outputs be established at each individual institution, which can inform management decisions at the local level (64). However, such an approach is difficult to apply to pediatric nodules due to their rarity, which underscores the value of our meta-analysis.
Ideally, it would be more precise and informative to obtain data on adult and pediatric patients from the same institutions. Given the rarity of pediatric thyroid nodules and the nature of meta-analysis, it is not practical to perform such an analysis. There is established evidence that the management of thyroid nodules differs among geographic regions (6,65). As a result, the comparisons between adult and pediatric series could be profoundly affected if the included data are skewed toward a specific geographic region. In both our previous meta-analysis (6) and this study, studies from Western countries, particularly from the United States, were the most frequent; this similar geographic composition might decrease the risk of bias. Although we found certain significant differences between Western and Asian adult series (6), TBSRTC outputs in the pediatric populations of these two areas were similar. This may reflect more uniform management guidelines in Western and Asian practice because of the high cancer rate in pediatric thyroid nodules (41,66,67).
In conclusion, this study showed that the six-tier TBSRTC is a valuable tool to predict ROM and, therefore, to tailor clinical decisions in pediatric thyroid nodules. In most cytological categories, thyroid nodules in children were operated on more often than in adults. Pediatric patients with benign and indeterminate thyroid nodules were more likely to have surgery, but the ROM in adults and children was not statistically different, which may indicate a risk of overtreatment. Determining the best treatment guidelines and additional tools for risk stratification must be a top priority to avoid unnecessary surgeries for pediatric patients with nonmalignant cytology.
Statement of Ethics
The article is exempt from ethical committee approval since this is a systematic review and meta-analysis.
Footnotes
Authors' Contributions
H.G.V.: conceptualization, data curation, formal analysis, investigation, methodology, software, validation, writing original draft, and editing. D.G.B.C.: data curation, formal analysis, investigation, methodology, validation, writing review, and editing. L.M.N. and T.Q.B.: data curation, formal analysis, investigation, writing review, and editing. L.H. and C.K.J.: investigation, methodology, software, validation, writing review, and editing. K.K.: investigation, methodology, validation, writing review, and editing. A.B.: conceptualization, investigation, methodology, software, validation, writing original draft, and editing.
Author Disclosure Statement
No competing financial interests exist.
Funding Information
No funding was received for this article.
Supplementary Material
Supplementary Figure S1
Supplementary Figure S2
Supplementary Table S1
Supplementary Table S2
Supplementary Table S3
