Abstract
Background:
The 2015 American Thyroid Association (ATA) thyroid nodule guidelines recommend selecting nodules for biopsy based on a sonographic pattern classification. These patterns were developed based on features of differentiated thyroid cancer. This study aimed to evaluate the performance and the inter-observer agreement of this classification system in medullary thyroid carcinoma (MTC).
Methods:
The medical records of all patients with MTC evaluated at the authors' institution between 1998 and 2014 were retrospectively reviewed. Only patients with presurgical thyroid ultrasound available for review were included in the study. Five independent reviewers assessed the stored ultrasound images for composition, echogenicity, margins, presence of calcifications, and extrathyroidal extension for each nodule. The presence of suspicious lymph nodes was also evaluated when presurgical lateral neck ultrasound was available for review. Each nodule was classified according to the ATA sonographic patterns. Inter-observer agreement was calculated for each sonographic feature and for the sonographic patterns. To validate the findings, a systematic review of the literature and meta-analysis on the sonographic features of MTC was conducted.
Results:
In this institutional cohort, the inter-observer agreement for individual sonographic features was moderate to good (κ = 0.45–0.71), and for the ATA classification it was good (κ = 0.72). Ninety-seven percent (29/30) of the MTCs were classified in the intermediate or high suspicion patterns. A total of 249 MTCs were included in the meta-analysis. Based on pooled frequencies for solid composition and hypoechogenicity, >95% of MTCs would be classified at least in the intermediate suspicion pattern, warranting the lowest-size threshold for biopsy (≥1 cm).
Conclusions:
The sonographic patterns proposed by the ATA perform well in MTC, and inter-observer agreement is good to very good.
Introduction
MTC represents around 5% of all thyroid cancers, but its early diagnosis has important prognostic implications (17). Ideally, the newly proposed ATA system should classify MTC in the categories with the lowest threshold for biopsy (≥1 cm for both the intermediate and high suspicion pattern), and inter-observer agreement should be high. This study aimed to determine how MTC would be classified by the ATA system and to measure the inter-observer agreement of the ATA classification for MTC.
Patients and Methods
In this Institutional Review Board–approved retrospective study, the records of all patients with MTC evaluated at the authors’ institution between 1998 and 2014 were reviewed. Presurgical thyroid ultrasound images were available for review in 30/93 (32%) patients with MTC evaluated at the institution, and these patients constitute the study cohort. The oldest ultrasound was performed in 2005, and the remainder after 2007. The mean age of the institutional study cohort was 55 years (range 26–82 years), 67% were women, and 87% were white. The mean maximum diameter of the MTCs was 2.4 cm (range 0.4–5.5 cm). RET germline analysis was available for 23 (77%) patients, and a mutation was identified in three (13%) of them. Other baseline characteristics of the patients are shown in Table 1. Tumor staging was based on the seventh edition of the American Joint Committee on Cancer classification for MTC (18). Five reviewers with expertise in thyroid ultrasound (P.V., D.L.K., J.B.T., H.S.L., and B.M.) assessed the images independently, although all were aware that the nodules had ultimately been proven to be MTC. Each reviewer assessed composition, echogenicity, margins, calcifications, extrathyroidal extension, and presence of suspicious lymph nodes (when cervical ultrasound was available). Nodule composition was classified as solid, solid with areas of cystic degeneration (<20% cystic component), mixed (>20% cystic component), spongiform, or cystic. Echogenicity was classified as hypoechoic, isoechoic, hyperechoic, or heteroechoic with respect to the normal thyroid parenchyma. Margins were classified as regular or irregular. Calcifications were classified as none, microcalcifications, macrocalcifications, or rim calcification. Extrathyroidal extension and suspicious lymph nodes were rated as present or absent. The shape was considered taller than wide in the transverse view when the anterior-posterior diameter exceeded the medial-lateral diameter by ≥2 mm.
Hypoechogenicity, irregular margins, microcalcifications, presence of extrathyroidal extension, presence of suspicious lymph nodes, and shape taller than wide in the transverse view were considered suspicious sonographic features.
Stages of MTC according to the seventh edition of the American Joint Committee on Cancer classification.
Information was available in 23 patients (77%).
Information available in 17 patients (57%).
CEA, carcinoembryonic antigen.
The ATA considers five different sonographic patterns: (i) “high” suspicion for hypoechoic solid or partly cystic nodules with at least one other suspicious sonographic feature; (ii) “intermediate” suspicion for hypoechoic solid nodules without any other suspicious sonographic features; (iii) “low” suspicion for iso- or hyperechoic solid, or partly cystic nodules that have solid eccentric areas, without any other suspicious sonographic features; (iv) “very low” suspicion for spongiform, or partly cystic nodules without solid eccentric areas, without any other suspicious sonographic features; and (v) “benign” for pure cystic nodules. Before initiating the imaging review, some sonographic patterns were identified that are not clearly specified in the ATA classification, such as heteroechoic nodules with or without any other suspicious sonographic features and nodules that were iso- or hyperechoic and had at least one other suspicious sonographic feature. In these scenarios, it was predetermined that solid or partly cystic heteroechoic nodules without any other suspicious sonographic features would be considered “low” suspicion and that any solid or partly solid nodule with at least one other suspicious sonographic feature would be classified as “high” suspicion pattern, regardless of the echogenicity of the solid area.
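The classification rules described above, including the two predetermined extensions for heteroechoic and iso- or hyperechoic nodules, can be sketched as a small decision function. This is an illustrative sketch only, not the procedure the reviewers followed: the `Nodule` fields, the label strings, and the simplified composition categories are assumptions introduced here, and `other_suspicious` counts suspicious features other than hypoechogenicity. The text does not say how a spongiform nodule with suspicious features would be handled, so the sketch simply leaves it as "very low".

```python
from dataclasses import dataclass


@dataclass
class Nodule:
    composition: str            # "solid", "partly_cystic", "spongiform", or "cystic" (assumed labels)
    echogenicity: str           # "hypoechoic", "isoechoic", "hyperechoic", or "heteroechoic"
    other_suspicious: int = 0   # suspicious features besides hypoechogenicity
    eccentric_solid: bool = False  # only meaningful for partly cystic nodules


def ata_pattern(n: Nodule) -> str:
    """Map a nodule description to an ATA sonographic pattern (sketch)."""
    if n.composition == "cystic":
        return "benign"
    if n.composition not in ("solid", "partly_cystic", "spongiform"):
        raise ValueError(f"unknown composition: {n.composition}")
    if n.composition == "spongiform":
        # Assumption: not specified in the text when suspicious features coexist.
        return "very low"
    # Predetermined rule: any solid or partly solid nodule with at least one
    # other suspicious feature is "high", regardless of echogenicity.
    if n.other_suspicious > 0:
        return "high"
    if n.composition == "partly_cystic":
        return "low" if n.eccentric_solid else "very low"
    # Solid nodule without other suspicious features.
    if n.echogenicity == "hypoechoic":
        return "intermediate"
    # Iso-, hyper-, or (by the predetermined rule) heteroechoic.
    return "low"


print(ata_pattern(Nodule("solid", "hypoechoic", other_suspicious=2)))
print(ata_pattern(Nodule("solid", "heteroechoic")))
```

Keeping the "high" rule ahead of the echogenicity checks mirrors the predetermined decision that suspicious features override the echogenicity of the solid area.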
The percent of overall agreement and Randolph's free-marginal multi-rater kappa for the five raters were calculated. Confidence intervals (CI) were calculated for kappa by non-parametric bootstrap using a first-order normal approximation (19,20). The kappa statistic ranges from 0 (no agreement) to 1 (full agreement). The inter-observer agreement is considered poor from 0.0 to 0.2, fair from 0.2 to 0.4, moderate from 0.4 to 0.6, good from 0.6 to 0.8, and very good if >0.8.
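For readers unfamiliar with the statistic, the following is a minimal sketch of how overall agreement, Randolph's free-marginal multi-rater kappa, and a bootstrap confidence interval with a normal approximation might be computed. It is not the authors' code (the original analysis was run in R). The function names, the choice to resample cases, the number of bootstrap replicates, and the example ratings are all hypothetical.

```python
import random
import statistics
from collections import Counter


def free_marginal_kappa(ratings, n_categories):
    """Return (overall agreement, free-marginal kappa).

    ratings: one list per case, holding the category chosen by each rater.
    Overall agreement is the mean proportion of agreeing rater pairs per case;
    chance agreement is fixed at 1 / n_categories (free marginals).
    """
    per_case = []
    for case in ratings:
        m = len(case)
        counts = Counter(case)
        agreeing_pairs = sum(c * (c - 1) for c in counts.values())
        per_case.append(agreeing_pairs / (m * (m - 1)))
    p_o = sum(per_case) / len(per_case)
    p_e = 1.0 / n_categories
    return p_o, (p_o - p_e) / (1.0 - p_e)


def bootstrap_ci(ratings, n_categories, n_boot=2000, z=1.96, seed=0):
    """Normal-approximation CI: kappa +/- z * bootstrap standard error.

    Assumption: cases (nodules) are resampled with replacement.
    """
    rng = random.Random(seed)
    kappa = free_marginal_kappa(ratings, n_categories)[1]
    reps = []
    for _ in range(n_boot):
        sample = [rng.choice(ratings) for _ in ratings]
        reps.append(free_marginal_kappa(sample, n_categories)[1])
    se = statistics.stdev(reps)
    return kappa - z * se, kappa + z * se


# Hypothetical ratings for four nodules by five raters (not study data).
ratings = [
    ["high"] * 5,
    ["high", "high", "high", "high", "intermediate"],
    ["intermediate"] * 5,
    ["high", "high", "low", "intermediate", "high"],
]
print(free_marginal_kappa(ratings, n_categories=5))
print(bootstrap_ci(ratings, n_categories=5))
```

Because chance agreement depends only on the number of categories, merging two categories changes both the observed agreement and the chance term.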
Systematic review and meta-analysis
Because MTC is relatively uncommon, any single-center series is limited and unlikely to provide sufficiently robust information to generalize into routine practice. For this reason, a systematic review of the literature was conducted, searching for other studies that had provided a detailed description of the sonographic appearance of MTC. The PubMed database was searched on March 3, 2016, for the terms “medullary thyroid carcinoma” and “ultrasound” or “ultrasonography.” The search was filtered for English/Spanish language, and reviews and case reports were excluded. The inclusion criteria for the meta-analysis were publications that provided a detailed description of the sonographic features of the primary tumor. The systematic review was conducted by a single author (P.V.). Nine studies met the inclusion criteria (11–14,21–25). See the flow diagram in Figure 1. One prior meta-analysis was identified on this topic, including articles from January 2000 to May 2013 (26). Nevertheless, a decision was made to conduct a new meta-analysis because the cases of the authors’ study and three other studies not included in the prior meta-analysis comprised one third of all reported cases. Four of the studies included in the meta-analysis compared the sonographic features of MTC and PTC. The overall prevalence of each suspicious sonographic feature in those PTCs was calculated to allow for a more robust comparison with the results of the current MTC meta-analysis (11–14).

Figure 1. Flow diagram of systematic review.
All analyses were performed using the statistical software package R v3.1.3. For the percentages of features in the current study, exact binomial confidence intervals were constructed. Inverse variance weighted proportions and confidence intervals were used in the meta-analysis. Proportions were calculated over all studies and stratified by time to explore a possible time trend.
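The two interval methods named above can be illustrated with a short sketch. The original analysis was run in R, so this Python version is only an illustration under stated assumptions: the exact (Clopper-Pearson) interval is found by bisection on the binomial distribution, and the pooled estimate uses fixed-effect inverse-variance weights on raw proportions, which is one common choice that the text does not confirm. The function names are hypothetical, and the study counts passed to `pooled_proportion` are invented for illustration.

```python
from math import comb, sqrt


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(g, target, tol=1e-12):
    """Bisection on [0, 1] for g(p) = target, with g increasing in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_ci(x, n, alpha=0.05):
    """Clopper-Pearson interval for x successes out of n."""
    lower = 0.0 if x == 0 else _solve_increasing(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    upper = 1.0 if x == n else _solve_increasing(
        lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper


def pooled_proportion(studies, z=1.96):
    """Inverse-variance weighted proportion with a normal-approximation CI.

    studies: list of (events, total). Assumption: raw proportions, fixed effect.
    """
    weights, props = [], []
    for events, n in studies:
        if not 0 < events < n:
            raise ValueError("raw-proportion weights are undefined at 0% or 100%")
        p = events / n
        weights.append(n / (p * (1 - p)))  # 1 / Var(p)
        props.append(p)
    total = sum(weights)
    pooled = sum(w * p for w, p in zip(weights, props)) / total
    se = sqrt(1 / total)
    return pooled, pooled - z * se, pooled + z * se


# 29/30 is the cohort proportion classified as intermediate or high suspicion.
print(exact_binomial_ci(29, 30))
# Hypothetical study counts, for illustration only.
print(pooled_proportion([(25, 30), (40, 46), (17, 19)]))
```

Raw-proportion weights break down when a study reports 0% or 100%, which is why published meta-analyses of proportions often apply a transformation first; the sketch raises an error instead of guessing one.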
Results
According to most observers, 93% of the MTCs were solid, whereas the other 7% were solid with small areas of cystic degeneration; 83% were hypoechoic; 70% had irregular margins; 33% were taller than wide in the transverse view; 20% had microcalcifications; 23% had macrocalcifications; 7% (n = 2) had ultrasonographic evidence of extrathyroidal extension; and 30% (n = 9) had suspicious cervical lymph nodes (50% of the 18 patients with lateral neck ultrasound available for review). Extrathyroidal extension was present on histology in 12 tumors, but only in one of two MTCs in which this feature was characterized by most observers. Lymph node metastases were confirmed in nine patients, including eight of the nine with suspicious lateral neck ultrasound and in one additional patient who had small (<5 mm) metastatic deposits in the central neck only. One (3%) MTC exhibited no suspicious sonographic features, whereas the remaining MTCs exhibited one (3%; n = 1), two (23%; n = 7), three (20%; n = 6), four (40%; n = 12), or five (10%; n = 3) suspicious features.
Inter-observer agreement was moderate to good for each of the individual sonographic features, with the kappa coefficient ranging from 0.45 to 0.71 (Table 2). Those features were (from lowest to highest concordance): irregular margins, solid composition, hypoechogenicity, microcalcifications, extrathyroidal extension, and presence of suspicious lymph nodes. All observers classified 90–100% of the MTCs as “intermediate” or “high” suspicion sonographic pattern. As considered by most observers, 80% (n = 24), 17% (n = 5), and 3% (n = 1) were classified in the “high,” “intermediate,” and “low” suspicion sonographic patterns, respectively. Nonetheless, 6/30 nodules evaluated were classified as “low” suspicion pattern by at least one of the observers. The percentage of overall agreement for the ATA classification was 78%, with a kappa coefficient of 0.72 ([CI 0.59–0.85]; good agreement). As the “intermediate” and “high” suspicion patterns have the same size threshold for biopsy (≥1 cm), it is reasonable to combine them for analysis. The overall agreement combining these two categories increased to 91%, with a kappa coefficient of 0.88 ([CI 0.80–0.97]; very good agreement). Representative images of each sonographic pattern are shown in Figure 2.
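As a consistency check, the reported kappa values can be recovered from the overall agreement using the free-marginal formula, in which chance agreement is 1/k for k categories. Taking k = 5 for the full ATA classification and k = 4 after merging the “intermediate” and “high” patterns is an assumption made here; the text does not state the number of categories used.

```latex
\kappa_{\text{free}} = \frac{P_o - 1/k}{1 - 1/k}
\qquad
k = 5:\ \frac{0.78 - 0.20}{1 - 0.20} = 0.725
\qquad
k = 4:\ \frac{0.91 - 0.25}{1 - 0.25} = 0.88
```

The first value agrees with the reported 0.72 once rounding of the overall agreement is taken into account, and the second matches the reported 0.88.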

Figure 2. Representative images of the different American Thyroid Association (ATA) sonographic patterns identified.
Randolph's free-marginal multi-rater kappa.
Assessed only in 18 patients with lateral neck ultrasound available for review.
CI, confidence interval; MTC, medullary thyroid carcinoma.
Data included in the meta-analysis are summarized in Table 3. All of the included series had a retrospective design, and together they encompassed a total of 249 tumors in 225 patients, 64% of them women, with a mean age of 50 years. The mean size of the tumors was 2.0 cm. The pooled frequencies of the sonographic features were 93% for solid composition, 96% for hypoechogenicity, 39% for irregular margins, 11% for shape taller than wide in the transverse view, 32% for microcalcifications, and 26% for macrocalcifications. Extrathyroidal extension was only reported in one other series, in which it was detected in 42% (8/19) of the tumors (22). The presence of lymph node metastasis was also reported in only one other series, in which it was identified in 50% (3/6) of the tumors (21). When the studies published before 2014 were compared with those published after 2014, there was a significant increase in the percentage of nodules reported to show irregular margins (33.8% vs. 51.9%) or a shape taller than wide in the transverse view (7.7% vs. 30.3%). The confidence intervals for the other sonographic features, including solid composition and hypoechogenicity, overlapped between the two periods, so those differences were not statistically significant.
Hypoechoic includes deep/marked hypoechogenicity. Trimboli et al. provided echogenicity on solid nodules only (8/12).
Irregular margins includes spiculated and ill-defined margins.
Studies included in prior meta-analysis (26).
Compared with PTCs, MTCs are less likely to have irregular borders (39% vs. 78%), microcalcifications (32% vs. 47%), and a shape taller than wide in the transverse view (11% vs. 46%). Conversely, MTCs are hypoechoic (96% vs. 90%) and have macrocalcifications (26% vs. 6%) more often than PTCs (Table 4).
Hypoechoic includes deep/marked hypoechogenicity. Irregular margins includes spiculated and ill-defined margins. Macrocalcifications includes rim calcifications.
Weighted percentage (inverse variance method). Taller than wide shape calculation based on 260 PTCs with these data available.
PTC, papillary thyroid carcinoma.
Discussion
This study describes the sonographic appearance of MTC. The findings suggest that the ATA sonographic patterns perform well in MTC, warranting the lowest-size threshold for biopsy in almost all cases. It was also found that the inter-observer agreement for the ATA sonographic patterns in this cohort of MTCs is better than that for individual sonographic features, being good to very good overall.
This study has several limitations, including the use of a retrospective analysis of ultrasound images by observers not blinded to the final diagnosis of the nodule. However, all observers were blinded to the details of the final pathology, including extrathyroidal extension and the presence of lymph node metastases. Each observer independently reviewed the electronic images on a different computer, so differences in image resolution between equipment may have affected image interpretation and reduced the concordance between observers. Finally, thyroid ultrasound images were available for review in only 33% (30/91) of the MTCs evaluated at the authors’ institution, and neck ultrasound was available for review only in 60% (18/30) of them. These findings reflect the institution's previous practice of using computed tomography imaging rather than ultrasound for preoperative evaluation of thyroid cancer.
To strengthen these results and buttress the conclusions, a systematic review and meta-analysis was undertaken of published studies detailing the sonographic appearance of MTC. That analysis confirms that the sonographic features of the MTC cohort are similar to those of previously described series (11–14,21–25). The rate of nodules with a shape taller than wide in the transverse view in the present series (33%) is the second highest observed, after one other publication that included only six MTCs (25). In this study, the definition of this feature was common to all observers: an anterior-posterior diameter at least 2 mm greater than the medial-lateral diameter. Whereas this feature is usually considered present if the ratio anterior-posterior diameter/medial-lateral diameter is >1, a stricter criterion was used to allow for small differences attributable to measurement technique. A recent study demonstrated that this feature was more likely present in MTCs <1 cm than it was in larger tumors (71% vs. 13%) (14). However, only five nodules in the current series were ≤1 cm in maximum diameter, none of which met the criterion for being considered taller than wide in the transverse view, and only one had a ratio of >1. Of the 10 nodules that met this criterion in the present series, the difference was 2 mm in two nodules, 3 mm in four nodules, and ≥4 mm in the other four nodules. All nodules with a shape taller than wide exhibited at least one other suspicious sonographic feature.
All MTCs in the present cohort were solid, or predominantly solid (<20% cystic component), and only two were described by most observers as having areas of cystic degeneration. The inter-observer agreement was moderate for solid composition (κ = 0.48), with small foci of cystic degeneration providing the primary source of disagreement. No nodules were classified as either mixed (>20% cystic component), spongiform, or cystic by any of the observers. This is in agreement with prior publications, with only five predominantly cystic MTCs (12,23) and two spongiform MTCs previously reported (13), representing <3% of all MTCs in this meta-analysis.
The new ATA classification does not distinguish between hypoechogenicity with respect to the normal thyroid parenchyma and marked hypoechogenicity with respect to the strap musculature. Therefore, any degree of hypoechogenicity was classified under the same category (1). Heteroechogenicity is also more frequently found in malignant than benign nodules, with a positive likelihood ratio similar to that of “solid composition” or “solitary nodule” (3). However, this parameter is not included in the new ATA classification, and has not been reported in most studies describing the sonographic features of MTC (11,13,14,22–24). In two studies, however, heterogeneous echotexture was analyzed as a separate variable from echogenicity, and was found in 41% (19/46) and 83% (5/6) of the MTCs evaluated, respectively (12,25). In the present study, heteroechogenicity was recorded as a type of echogenicity rather than as a separate variable, which is likely the reason this study has one of the lowest rates of hypoechogenicity. Of the five nodules not classified as hypoechoic, four were classified as heteroechoic by at least one observer, and this was the most common description in one of the nodules.
The inter-observer agreement for individual suspicious features was moderate, as has been observed in other studies not focused on MTC (27–29). Irregular margins had the lowest inter-observer agreement, with a kappa coefficient of 0.45. This feature has also been reported by others to have a lower inter- and intra-observer agreement than most other suspicious features (27–29). In contrast, the interpretation of lateral neck lymph node involvement achieved the highest inter-observer agreement, with a kappa coefficient of 0.71.
In the present cohort, 97% of patients had a sonographic pattern classified as “intermediate” or “high” suspicion by most observers, with good to very good inter-observer agreement. However, six of the MTCs were characterized as having a “low” suspicion pattern by at least one observer. Nonetheless, applying the proposed size threshold for biopsy, fine-needle aspiration would have been performed in five of these “low” suspicion nodules due to their size being >1.5 cm (n = 3), or the clinical history (known MEN2 syndrome and elevated plasma calcitonin; n = 2). Biopsy might have been delayed in only one nodule (1.3 cm) by one of the observers. Moreover, only three of these six nodules were classified in the “low” suspicion pattern following the criteria specified in the ATA classification; the other three nodules were heteroechoic without any other suspicious features. A decision was made a priori to include these nodules in the “low” suspicion category. There are no data to suggest the real prevalence of malignancy of this sonographic pattern. If these heteroechoic nodules had been classified in the “intermediate” suspicion category, the inter-observer agreement for the sonographic pattern would be significantly higher, and no nodule would have been missed for biopsy.
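The effect of the pattern on the biopsy decision can be made concrete with a short sketch. The thresholds for the “high,” “intermediate,” and “low” suspicion patterns follow the values used in this article; the values for “very low” (≥2 cm) and “benign” (no biopsy) are taken from the 2015 ATA guidelines rather than from the text above. The dictionary and function names are hypothetical, and the sketch ignores clinical factors such as known MEN2 syndrome or elevated calcitonin, which prompted biopsy in two of the nodules discussed.

```python
# Size thresholds (cm) for fine-needle aspiration by sonographic pattern.
# None means biopsy is not recommended on size criteria alone.
FNA_THRESHOLD_CM = {
    "high": 1.0,
    "intermediate": 1.0,
    "low": 1.5,
    "very low": 2.0,   # from the 2015 ATA guidelines, not from this article
    "benign": None,    # from the 2015 ATA guidelines, not from this article
}


def meets_size_threshold(pattern: str, size_cm: float) -> bool:
    """Return True if the nodule reaches the size threshold for its pattern."""
    threshold = FNA_THRESHOLD_CM[pattern]
    return threshold is not None and size_cm >= threshold


# The 1.3 cm nodule discussed above: biopsy is indicated if it is rated
# "intermediate" but deferred if it is rated "low".
print(meets_size_threshold("intermediate", 1.3))
print(meets_size_threshold("low", 1.3))
```

This shows why disagreement between the “low” and “intermediate” patterns matters only for nodules between 1.0 and 1.5 cm.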
Two previous studies, including 77 and 134 MTCs, respectively, have reported the sonographic pattern of MTC using the Kuma Hospital classification system (15,16). That classification also defines five sonographic patterns: classes 1 to 5, with intermediate steps of 0.5. Nodules classified in class 3.5 or higher are deemed malignant, with a reported 97% positive predictive value (5). In both studies, two thirds of all MTCs were identified as having a malignant sonographic pattern (class ≥3.5) that would correspond to the “high” suspicion pattern of the ATA classification. Also, in both studies, class 3, equivalent to the “intermediate” suspicion pattern of the ATA classification, was grouped with the other “benign” sonographic patterns for analysis purposes, preventing any comparisons at this level with the ATA classification. In neither study were the specific sonographic features described, and therefore these studies could not be included in the present meta-analysis. It is very likely that most of these “non-malignant looking” MTCs were hypoechoic nodules (96% prevalence in the meta-analysis) without any other suspicious sonographic features (ATA “intermediate” suspicion sonographic pattern). A higher proportion of MTCs than PTCs is expected to fall under this sonographic pattern, as all suspicious sonographic features are significantly less often present in MTCs, with the exception of hypoechogenicity.
It has been suggested that the sonographic appearance might also be associated with tumor aggressiveness or extent (15,16). Remarkably, 74% of the nodules in the present cohort were found to have at least two suspicious sonographic features, as characterized by most observers. This is a significantly greater proportion than has been reported previously (13,16), suggesting that the current MTC cohort includes more aggressive variants of MTC. The presurgical serum calcitonin and carcinoembryonic antigen levels were high in most of these patients, findings known to be associated with high tumor burden, further supporting this possibility (30,31). In addition, the proportion of patients presenting with compressive symptoms (13%), lateral lymph node metastasis (47%), and distant metastasis (13%) is higher than in another series describing the sonographic features and providing clinical information (16), though similar to reports of other clinical MTC series (32–34). The new ATA classification performance and the inter-observer agreement may be better for these more aggressive tumors. However, it is remarkable that >95% of the MTCs reported to date had solid/predominantly solid composition and were hypoechoic. This suggests that the lowest threshold for biopsy will be met for most MTCs, with only very rare exceptions in which delaying the biopsy might, in any case, have less severe consequences (15,16).
The 2015 ATA thyroid nodule guidelines recommend against biopsy of thyroid nodules <1 cm, even in the presence of a “high” suspicion sonographic pattern, based on the evidence that small PTCs are unlikely to become clinically relevant (1). However, MTCs <1 cm may be indistinguishable from PTCs <1 cm (14,24). As a consequence, the diagnosis of small MTCs may be delayed, decreasing their chance of achieving biochemical cure (35). To detect these tumors early, it might be worth measuring basal serum calcitonin in patients with sonographically suspicious nodules <1 cm for whom biopsy is not recommended.
In conclusion, the sonographic patterns proposed in the recently revised ATA guidelines for the management of patients with thyroid nodules and differentiated thyroid cancer perform well in MTC, and inter-observer agreement is good overall.
Acknowledgments
We thank Susan Sharpe, Library Supervisor of the Moffitt Biomedical Library, for assistance with conducting the literature search. Dr. Pablo Valderrabano gratefully acknowledges the financial support from the Alfonso Martín Escudero Foundation (Spain).
Author Disclosure Statement
No competing financial interests exist.
