Abstract

We appreciate the interest of Treglia et al. in our recent review (1), in which we estimated the prevalence of thyroid incidentalomas detected by 18F-FDG-PET, and the risk of malignancy in such lesions. Our calculations were based on data from 22 carefully selected studies comprising a total of 125,754 investigated subjects. Our main finding was that unexpected focal hypermetabolic activity in the thyroid gland was seen in 1.6% of the subjects, of whom 34.8% harbored thyroid malignancy (1). Treglia et al. question our arithmetic approach, suggesting that a meta-analysis would lead to more robust estimates of the parameters in question. The authors deserve much credit for presenting results from such a meta-analysis.
Meta-analyses are widely used within medical research. A meta-analysis is a mathematical tool used to increase statistical power by pooling data from a range of studies that focus on a well-defined topic. Meta-analyses have become very popular, and since many researchers consider the results of a meta-analysis to be more solid than the results of the individual studies, such articles are often quoted in the scientific literature. In our opinion, meta-analyses have a role primarily in a context where an intervention has taken place, for example, in situations where a series of controlled trials failed to resolve a specific issue due to small sample sizes, or due to negative or, even more important, divergent results. However, in studies of an observational character, such as our recent review (1), it is questionable whether meta-analyses have any major role. Indeed, meta-analyses should not be used uncritically since there are pitfalls with this method, especially pertaining to heterogeneity between studies (2). While statistical benefits can be obtained by a meta-analysis when employed properly, this method can also be used wrongly, or even be misused, leading to misinterpretation of results and lack of transparency for the uninitiated reader with limited insight into how these results were obtained.
As already emphasized in our previous article (1), it is beyond question that the 22 studies on which our review was based are very heterogeneous in terms of the study populations, the PET-scan protocols, and the clinical evaluation of the patients. The high degree of heterogeneity across studies is reflected by an I² value above 90% according to the calculations based on a random-effects model as carried out by Treglia et al. Although heterogeneity can be corrected for statistically to some extent, it is an inherent limitation of meta-analyses, and it can be questioned whether a meta-analysis is justified when the included studies show a high degree of heterogeneity (2). The results from the meta-analysis performed by Treglia et al. are very close to our results (1), both with respect to the prevalence of thyroid incidentalomas and the risk of malignancy. This is not surprising since the authors included exactly the same studies as in our review (1). The mere fact that a set of data has been through the statistical mill of a meta-analysis provides no guarantee that the final results are closer to the truth (which in this case, by the way, remains to be settled) than a simpler calculation based on the raw data, as done in our study (1). It is therefore open to discussion whether including data from a range of previous, and highly heterogeneous, studies in a meta-analysis, as performed by Treglia et al., is a scientifically more correct way to estimate the prevalence of thyroid incidentalomas, and their risk of harboring malignancy, when detected by 18F-FDG-PET.
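To make the distinction between the two approaches concrete, the following is a minimal sketch of a simple arithmetic pooling of raw counts alongside a random-effects pooling with the I² heterogeneity statistic. It rests on several assumptions: the study counts are purely hypothetical and are not taken from the 22 studies in our review (1); the DerSimonian-Laird estimator and the use of untransformed proportions are illustrative choices, and we do not claim that they correspond to the exact procedure used by Treglia et al.; and the function names are our own.

```python
def pooled_proportion(events, totals):
    """Simple arithmetic pooling: all events divided by all subjects."""
    return sum(events) / sum(totals)


def random_effects_proportion(events, totals):
    """DerSimonian-Laird random-effects pooling of raw proportions.

    Returns (pooled estimate, I^2 in percent). Assumes every study has
    a proportion strictly between 0 and 1, so no variance is zero.
    """
    p = [e / n for e, n in zip(events, totals)]
    # Within-study variance of a raw proportion: p(1 - p) / n
    v = [pi * (1 - pi) / n for pi, n in zip(p, totals)]
    w = [1 / vi for vi in v]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * pi for wi, pi in zip(w, p)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (pi - fixed) ** 2 for wi, pi in zip(w, p))
    df = len(p) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-study variance, floored at 0
    w_re = [1 / (vi + tau2) for vi in v]  # random-effects weights
    pooled = sum(wi * pi for wi, pi in zip(w_re, p)) / sum(w_re)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, i2


# Hypothetical counts for five studies (NOT data from the review)
events = [30, 45, 12, 160, 20]
totals = [1500, 4000, 600, 5000, 2500]

simple = pooled_proportion(events, totals)
re_estimate, i2 = random_effects_proportion(events, totals)
print(f"simple pooled proportion: {simple:.4f}")
print(f"random-effects estimate:  {re_estimate:.4f}")
print(f"I^2: {i2:.1f}%")
```

In this sketch the simple pooling weights each study by its size, whereas the random-effects weights become more nearly equal across studies as the between-study variance grows. Both approaches nevertheless summarize the same heterogeneous input with a single number.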
Regarding the standardized uptake values (SUV) in thyroid lesions detected by 18F-FDG-PET, these were calculated according to different protocols and procedures in the various studies. Comparison of SUV between individual studies therefore makes little sense, and we are fully in agreement with Treglia et al. regarding this point. Nevertheless, we found it of interest to investigate whether an overall difference exists between SUV measured in benign and malignant thyroid nodules. Therefore, we pooled SUVs from all studies, and a significant difference between the two groups was indeed found, with higher values in malignant nodules (1). We are well aware that there are methodological limitations with the use of SUV, as discussed by Treglia et al., and earlier pointed out by some of the authors (3). However, when comparing SUV across studies, we believe that any potential bias inherent to this method should affect benign and malignant nodules largely to the same extent. As we already emphasized in our article (1), the low accuracy of the method, reflected by the considerable overlap in SUV between benign and malignant nodules, excludes the use of this parameter alone as a basis for any clinical decision.
The main point, independent of which study is in focus, is that a considerable risk of malignancy exists if a focal lesion in the thyroid gland is detected incidentally by 18F-FDG-PET, as further underlined by the recent study by Bertagna et al. (4). The present evidence consistently supports that such patients should be thoroughly evaluated. In this respect, the meta-analysis provided by Treglia et al. confirms our findings, but adds little to answering the questions that remain. More importantly, there are no data on the cost-effectiveness of such a strategy, or on whether the natural history of thyroid malignancy diagnosed in this manner differs from that of malignancy diagnosed by investigating, for example, the patient with clinically evident nodular thyroid disease. These are the areas where our joint forces need to focus attention.
Footnotes
Disclosure Statement
None of the authors have received any financial or other type of compensation related to the subject of this article. There are no competing financial interests.
