Abstract
Background:
Work-up of thyroid nodules remains challenging. Fine-needle aspiration (FNA) has been shown to be the most cost-effective way to select patients for surgery with sensitivities of 54%–90% and specificities of 60%–96% for the detection of malignant lesions. Ultrasound-based real-time elastography (RTE) enables the determination of tissue elasticity and has shown promising results for the differentiation of thyroid nodules. A meta-analysis was performed to assess the overall performance of RTE for the differentiation of thyroid nodules.
Methods:
Literature databases were searched. The inclusion criteria for studies were the use of FNA cytology or histopathology of surgical specimens as the diagnostic reference standard and assessment of the sensitivity and specificity of RTE. The meta-analysis was performed using an inverse variance method and the DerSimonian and Laird random-effects estimator in case of established heterogeneity.
Results:
Eight studies that included a total of 639 thyroid nodules were analyzed. The overall mean sensitivity and specificity of RTE for the diagnosis of malignant thyroid nodules in the eight studies were 92% (95% confidence interval 88–96) and 90% (95% confidence interval 85–95), respectively. Significant heterogeneity was found for the specificity of the different studies.
Conclusions:
RTE has a high sensitivity and specificity in the evaluation of thyroid nodules. This technique might be useful in conjunction with or even instead of FNA to select patients with thyroid nodules for surgery.
Introduction
Methods
Real-time elastography
RTE is an imaging technique that directly reveals a physical property of tissue, its elasticity, using conventional ultrasound probes. The tissue elasticity distribution is calculated from the strain and stress of the examined tissue. The calculation is performed in real time, and the results are displayed as color-coded images superimposed on the conventional B-mode image (blue = hard tissue, red and green = soft tissue). Details have been described in previous studies (14,15).
RTE of the thyroid gland is performed with a high-frequency (7.5–13 MHz) probe. The probe is placed on the neck and light pressure is applied for measurement. The freehand compression applied to the neck region while performing RTE was standardized by a real-time measurement displayed on a numerical scale of 1–5 or 1–6. Only images obtained with an intermediate level of pressure were used in the different studies. The region of interest for the elastography examination is selected by the operator and includes the nodule and surrounding normal thyroid tissue. Video clips or single images are stored. The entire examination lasts ∼5–10 minutes per patient.
Qualitative (16–21) and/or quantitative (22,23) criteria are used for the interpretation of RTE. Qualitative classifications differentiating four to six patterns have been reported for RTE. The most frequently used classification differentiates four RTE patterns (17–19). In this scheme, RTE pattern 1 is a nodule that displays homogeneously in green (soft), RTE pattern 2 is a nodule that displays predominantly in green with a few blue areas or spots, RTE pattern 3 is a nodule that displays predominantly in blue with a few green areas or spots, and RTE pattern 4 is a nodule that displays completely in blue (hard). RTE patterns 3 and 4 are read as a malignant thyroid nodule, and RTE patterns 1 and 2 are read as a benign thyroid nodule.
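The four-pattern reading rule described above can be written as a small lookup. The following is a minimal illustrative sketch, not code from any of the included studies; the function name `read_rte_pattern` is hypothetical, and only the four-pattern scheme (17–19) is covered.

```python
def read_rte_pattern(pattern: int) -> str:
    """Map a four-pattern RTE score to the reading described in the text.

    Patterns 1 and 2 (homogeneously or predominantly green, soft) are read
    as benign; patterns 3 and 4 (predominantly or completely blue, hard)
    are read as malignant.
    """
    if pattern not in (1, 2, 3, 4):
        raise ValueError("the four-pattern scheme only defines patterns 1 to 4")
    return "malignant" if pattern >= 3 else "benign"
```

Classifications with five or six patterns would need their own cut-off, which is not specified here.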
Literature search
A PubMed/Medline search for publications up to December 2009 was performed with the keywords “elastography” and “thyroid” to evaluate the performance of RTE for the differentiation of benign from malignant thyroid nodules. Studies were included in the analysis if they were original complete publications (no reviews, letters, abstracts, or editorials) that evaluated RTE of the thyroid gland, used appropriate cytology acquired by FNA biopsy or histology from surgery as the reference standard for the diagnosis of malignancy, and assessed the sensitivity and specificity of RTE for the differentiation of benign and malignant thyroid nodules.
Data analysis
The meta-analysis was performed using the R package meta, version 1.1-4, by Guido Schwarzer (
To assess the quality of the studies included in the meta-analysis the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) questionnaire was used (see Supplementary Table S1; Supplementary Data are available online at
Results
A literature search with the predefined keywords “elastography” and “thyroid” yielded 28 publications. Of these, eight publications met the criteria for inclusion in our meta-analysis (16–23). A total of 639 thyroid nodules were evaluated in the eight studies.
Table 1 presents patient characteristics and study results. There was some variation of the analyses and classifications of RTE among the studies. Six studies performed qualitative analysis, one study a quantitative analysis, and one study both a qualitative and quantitative analysis. The ultrasound machines were from Hitachi (LOGOS, HV-5550, EUB-8500, and EUB-900; Tokyo, Japan) in seven studies, and from Siemens (Sonoline Elegra, Erlangen, Germany) in one study (22).
FNA, fine-needle aspiration; NPV, negative predictive value; PPV, positive predictive value; quant, quantitative measurement; sens., sensitivity; spec., specificity.
Three studies (16,20,22) included only patients who were referred for surgery, and five studies (17–19,21,23) included patients who were referred for FNA. In this latter group of studies, surgery was performed only in patients with suspicious or malignant FNA results. Therefore, FNA was used as the reference standard for the diagnosis of benign nodules if the patients did not have thyroid surgery, and histopathology was used if the patients had thyroid surgery. In 59.6% (381/639) of thyroid nodules, histopathology was used as the reference standard. All patients with a diagnosis of malignancy on FNA underwent thyroid surgery. Of the 639 nodules, 153 (24%) were malignant. There were 135 papillary carcinomas (88%), 9 follicular carcinomas (6%), 6 medullary carcinomas (4%), 2 poorly differentiated metastatic adenocarcinomas, and 1 non-Hodgkin lymphoma.
Results for differentiation between benign and malignant thyroid nodules
The sensitivity and specificity of the studies included in the analysis are shown in Figure 1 and Table 1. The mean sensitivity and specificity for the diagnosis of malignant thyroid nodules were 92% (95% confidence interval 88–96) and 90% (95% confidence interval 85–95), respectively. Significant heterogeneity was found for the specificities of the studies (p < 0.0001), whereas significant heterogeneity was not observed for their sensitivities (p > 0.2). A rank-based test gave no indication of publication bias (p > 0.2 and p = 0.14 for publication bias with respect to the sensitivities and specificities, respectively).
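For readers unfamiliar with the pooling approach named in the Methods, the following is a minimal sketch of inverse-variance weighting with the DerSimonian and Laird estimate of between-study variance. It is written in Python rather than with the R package used for the analysis, it pools estimates on whatever scale they are supplied (the scale or transformation actually used for the proportions is not stated here, so raw proportions are an assumption), and the function name `dersimonian_laird` is hypothetical.

```python
import math
from typing import Sequence, Tuple


def dersimonian_laird(
    estimates: Sequence[float], variances: Sequence[float]
) -> Tuple[float, float, float, float]:
    """Return (pooled estimate, standard error, tau^2, Cochran's Q).

    estimates: per-study effect estimates (e.g., sensitivities).
    variances: per-study within-study variances (must be positive).
    """
    k = len(estimates)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")
    if any(v <= 0 for v in variances):
        raise ValueError("within-study variances must be positive")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w

    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))

    # DerSimonian-Laird moment estimator of between-study variance.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum_w_star
    se = math.sqrt(1.0 / sum_w_star)
    return pooled, se, tau2, q
```

As a purely hypothetical input, two studies with estimates 0.6 and 1.0 and variances of 0.01 each give Q = 8 on 1 degree of freedom, a between-study variance of 0.07, and a pooled estimate of 0.8 with a standard error of 0.2. When Q does not exceed its degrees of freedom, the between-study variance is set to zero and the result coincides with the fixed-effect estimate.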

Figure 1. Forest plot from the meta-analysis of sensitivity (left) and specificity (right) using a random-effects model for the differentiation of benign and malignant thyroid nodules. Studies are shown in random order. The squares represent the mean of the single studies, with their size varying according to their weight in the calculation of the overall mean. The length of the horizontal line represents the 95% confidence interval. The diamonds show the overall mean sensitivity and specificity with 95% confidence bounds.
Overall 2.5% (16/639) false-negative results were observed. These were noted for 4 follicular carcinomas, 10 papillary carcinomas, and 2 poorly differentiated metastatic adenocarcinomas. Therefore, overall 7% (10/135) of the papillary carcinomas, 44% (4/9) of the follicular carcinomas, and all (2/2) of the poorly differentiated metastatic adenocarcinomas were not diagnosed as being malignant by RTE.
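The percentages above follow directly from the reported counts, and the same counts give a crude sensitivity across all malignant nodules. The sketch below recomputes them. The zero false-negative counts for medullary carcinoma and lymphoma are inferred from the total of 16 false negatives rather than stated explicitly, and the crude figure is a simple proportion over all nodules, not the weighted meta-analytic estimate.

```python
from typing import Dict, Tuple

# Malignant nodules by histology as (total, missed by RTE).
MALIGNANT_NODULES: Dict[str, Tuple[int, int]] = {
    "papillary carcinoma": (135, 10),
    "follicular carcinoma": (9, 4),
    "medullary carcinoma": (6, 0),  # zero inferred from the total of 16
    "metastatic adenocarcinoma": (2, 2),
    "non-Hodgkin lymphoma": (1, 0),  # zero inferred from the total of 16
}


def miss_rates(counts: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Fraction of each histological type read as benign by RTE."""
    return {name: missed / total for name, (total, missed) in counts.items()}


def crude_sensitivity(counts: Dict[str, Tuple[int, int]]) -> float:
    """Detected malignant nodules divided by all malignant nodules."""
    total = sum(t for t, _ in counts.values())
    missed = sum(m for _, m in counts.values())
    return (total - missed) / total
```

With these counts, 137 of 153 malignant nodules were detected, a crude sensitivity of about 89.5%. This is expected to differ somewhat from the pooled estimate of 92%, which weights each study rather than simply summing nodules.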
Quality assessment using the QUADAS questionnaire
Detailed information on the rating of items for each study can be found in Table 2. The overall quality of the studies was excellent, with most studies having a “yes” rating for nearly all items. Item 1, concerning whether the spectrum of patients was representative of the patients who will receive the test in practice, was rated “no” for the three studies whose patients were all referred for surgery. Item 13, concerning results that could not be interpreted, was difficult to rate. All of the studies reported their results in great detail, and most of them did not report any results that could not be interpreted. Therefore, they were given a “yes” rating. Item 14 (withdrawals) was always given a “yes” rating.
Discussion
Our meta-analysis indicated that RTE, a noninvasive technique, can be used with high sensitivity and good specificity to detect malignant thyroid nodules. This is important because the limitations of current approaches to thyroid nodules may require changes in clinical practice.
Several criteria on B-mode ultrasound have been evaluated for their ability to diagnose thyroid cancer (25). No single criterion or even combination of criteria, however, is entirely accurate in diagnosing a malignant thyroid nodule. A combination of several criteria has increased sensitivity and specificity for the prediction of malignancy. Criteria most associated with malignancy are hypoechogenicity, blurred margins, microcalcifications, absence of a halo, a deeper than wide shape, and increased vascularization on power-Doppler ultrasound (12,26,27). Despite this, ultrasound imaging is not sensitive or specific enough to diagnose thyroid malignancy. Therefore, American, European, and German thyroid or endocrine societies recommend FNA as the major diagnostic procedure to evaluate thyroid nodules of >1 cm that are not hyperfunctioning, and for nodules <1 cm, ultrasound followed by FNA if the nodule is suspicious on ultrasound (25).
This approach has several limitations, however. The prevalence of thyroid nodules is quite high, especially in countries with inadequate iodine supply. Therefore, the currently recommended approach requires an enormous number of somewhat complicated and invasive procedures. Although FNA is minimally invasive, some patients prefer not to have any such procedure. Moreover, reliable FNA depends on the technical expertise of the operator and the experience and training of the pathologist. These may be some of the reasons why there is considerable variation in the reported sensitivity and specificity of FNA, which could miss up to a third of all thyroid malignancies (6). The problem is more severe for small nodules, for more dorsally located nodules, and when there are many nodules, as in multinodular goiter and fibrotic goiter.
In populations where thyroid nodules are very common but the prevalence of malignancy is <5% (28,29), high sensitivity and specificity are essential for differentiating between benign and malignant thyroid nodules. In this respect, RTE, with a mean sensitivity of 92%, is promising, with the added advantages that it is a noninvasive procedure and is able to reveal nodules with a diameter of a few millimeters even in the case of multinodular goiters. Therefore, RTE could be integrated into the diagnostic work-up of thyroid nodules, either in a diagnostic algorithm before FNA or even instead of FNA. RTE might also be useful to select suspicious nodules for FNA. In addition, in patients with multinodular goiter in whom surgery is planned, the ability to identify those nodules most likely to be malignant before surgery could be important.
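To illustrate why both sensitivity and specificity matter at low prevalence, predictive values can be derived from Bayes' rule. The sketch below is illustrative only: it plugs in the pooled sensitivity (92%) and specificity (90%) and assumes a malignancy prevalence of exactly 5%, the upper bound mentioned above. The function name `predictive_values` is hypothetical.

```python
from typing import Tuple


def predictive_values(
    sensitivity: float, specificity: float, prevalence: float
) -> Tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    for value in (sensitivity, specificity, prevalence):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all inputs must be proportions between 0 and 1")

    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence

    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Pooled estimates from this meta-analysis at an assumed 5% prevalence.
ppv, npv = predictive_values(0.92, 0.90, 0.05)
```

Under these assumptions the positive predictive value is roughly 33% and the negative predictive value is above 99%, so a benign RTE reading would be far more informative than a malignant one in such a population. In the included studies, where 24% of nodules were malignant, the same calculation gives a positive predictive value of about 74%.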
A major limitation of RTE is that it is not part of most ultrasound devices. Rather, it is restricted to the more expensive, so-called high-end ultrasound devices. These may become less costly in the future. In addition, RTE is not reliable in thyroid nodules with coarse calcification. Notably, nodules with coarse calcification were excluded from some studies used in our meta-analysis (17,18,20,22), and it is possible that they were excluded in others.
As is evident from our meta-analysis and the literature, neither RTE nor FNA is suitable for the diagnosis of follicular carcinoma (4,16,17,22). In the eight studies included in the present meta-analysis, nine follicular carcinomas were reported overall, but four of them were not diagnosed as malignant using RTE. This may be because the gross anatomy and cellular patterns of follicular carcinoma overlap with those of benign follicular adenoma. Thus, follicular carcinoma can be differentiated from benign follicular adenoma only by obtaining evidence of capsular or vascular invasion, and this requires histological examination (22). Friedrich-Rust et al. (17) reported that the follicular carcinoma that was falsely read as benign on RTE had a power-Doppler perfusion pattern of 3 and follicular cell formation on FNA. A possible approach to this problem would be to consider FNA in patients with RTE readings of 1 or 2 but suspicious hypoechogenic nodules with Doppler pattern 3 or 4, and then to refer patients for surgery if FNA reveals follicular cells. However, with this approach most of these patients would still be sent to surgery primarily for diagnostic rather than therapeutic reasons, so the correct diagnosis of follicular carcinoma before surgery remains an unresolved clinical problem.
Several molecular markers, such as galectin-3, that might distinguish follicular carcinomas from adenomas on FNA have been reported; however, most of them have not been confirmed to be beneficial for clinical use (31). Further studies including more follicular carcinomas are necessary to evaluate the best approach to patients with suspected follicular thyroid carcinoma.
Heterogeneity was found for the specificities of the studies used in the present meta-analysis. This could be explained by spectrum bias: three studies included only patients referred for surgery, with a high percentage of malignant thyroid nodules, whereas five studies included patients referred for FNA. The latter group is more representative of the population that will receive the test in clinical practice. Another explanation could be the relatively small number of studies included in the present meta-analysis, which limits statistical power. Nevertheless, sensitivity is more important for the work-up of thyroid nodules, and no significant heterogeneity was found for sensitivity. In addition, the meta-analysis included two studies (22,23) using off-line quantitative analysis of thyroid nodules. This is more time consuming and has to be performed after the real-time examination; an advantage over qualitative analysis has not yet been shown.
In summary, RTE can be used with high sensitivity in the work-up of thyroid nodules and might be a useful method in addition to or even instead of FNA to select patients for surgery. Large prospective international multicenter studies in various regions, including those with different degrees of iodine supply and thyroid cancer risk, are necessary to further evaluate the potential of RTE.
Footnotes
Disclosure Statement
The authors declare that no competing financial interest or conflict of interest exists for any of the authors.
