Abstract
Background:
Thyroid nodules found incidentally on 18F-fluorodeoxyglucose–positron emission tomography (FDG-PET) have been shown to be malignant in 30%–50% of cases. The American Thyroid Association recommends performing fine needle aspiration cytology (FNAC) for thyroid nodules showing FDG uptake. On the other hand, the role of FDG-PET in characterizing thyroid nodules with indeterminate cytology before surgery is not clear. The goal of this study was to evaluate the role of FDG-PET/computed tomography (CT) in predicting malignancy of thyroid nodules with indeterminate FNAC and to correlate FDG uptake with pathological and ultrasonographic (US) features.
Methods:
Between November 2006 and October 2009, 55 patients (42 women, mean age: 50 years) planned for surgery for 56 thyroid nodules with indeterminate FNAC were prospectively included and considered for analysis. All patients underwent presurgical FDG-PET/CT (Siemens Biograph, mean FDG injected activity: 165 MBq) and neck US. Pathology of the corresponding surgical specimen was the gold standard for statistical analysis.
Results:
At pathology 34 nodules were benign, 10 were malignant (7 papillary and 3 follicular carcinomas), and 12 were tumors of uncertain malignant potential (TUMP). The median size of the thyroid nodules was 21 mm (range: 10–57). Sensitivity, specificity, positive (PPV), and negative predictive (NPV) values of FDG-PET in detecting cancer/TUMP were 77%, 62%, 57%, and 81%, respectively. In multivariate analysis, cellular atypia was the only factor predictive of FDG uptake (p<0.001). Hurthle cells and poorly differentiated components were independent predictive factors of high (≥5) SUV Max (p=0.02 and p=0.02). Sensitivity, specificity, PPV, and NPV of US in detecting cancer/TUMP were 82%, 47%, 50%, and 80%, respectively. In multivariate analysis, hypervascularization was correlated with malignancy/TUMP (p=0.007) and cystic features were correlated with benignity (p=0.03).
Conclusion:
Adding FDG-PET findings to neck US provided no diagnostic benefit. The sensitivity and specificity of FDG-PET in the presurgical evaluation of indeterminate thyroid nodules are too low to recommend FDG-PET routinely.
Introduction
18F-Fluorodeoxyglucose (FDG) positron emission tomography (PET) is widely used to detect malignancy in many other tumor types, and FDG-avid thyroid nodules found incidentally on FDG-PET scans performed for nonthyroid disease carry a risk of malignancy ranging between 14% and 50% in most studies (11–18). The American Thyroid Association recommends performing a “prompt evaluation” by FNAC of all thyroid nodules showing FDG uptake, even in the absence of suspicious US findings (4). On the other hand, the systematic use of this technique in the presurgical evaluation of thyroid nodules is currently not recommended. This is due to discordant results among studies, which report a sensitivity and specificity of FDG-PET in detecting malignancy ranging from 60% to 100% and from 40% to 90%, respectively (19–26). These discrepancies seem to be related to the small samples studied and/or to differences in the inclusion criteria. Most studies reported a low positive predictive value (PPV, around 30%), essentially due to a high number of false positive results represented by benign nodules with FDG uptake. The reported negative predictive value (NPV) is usually high (80%–100%), essentially due to the small proportion of thyroid cancers among nodules, suggesting that FDG-PET could be useful to avoid unnecessary surgery for benign nodules (21). The aim of the current study was to validate these results on a larger number of selected patients with cytologically indeterminate follicular lesions and to search for relationships between FDG uptake, US, and histological features.
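The dependence of predictive values on the proportion of cancers among the nodules studied follows from Bayes' rule. The following is a minimal Python sketch. The function name `predictive_values`, the sensitivity of 0.80, the specificity of 0.60, and the three prevalence values are illustrative assumptions chosen within the ranges quoted above; they are not data from any cited study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical test characteristics, for illustration only.
SENSITIVITY = 0.80
SPECIFICITY = 0.60

for prevalence in (0.05, 0.15, 0.40):
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

With sensitivity and specificity held fixed, the NPV rises and the PPV falls as the prevalence decreases, which is the pattern described for the earlier reports.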
Materials and Methods
Patients
Between November 2006 and October 2009, 60 consecutive patients (47 women, 13 men, median age: 51 years, range: 18–83 years) scheduled for surgery at the Institut Gustave Roussy and harboring a total of 62 thyroid nodules classified at cytology as indeterminate follicular lesions were initially considered for this study and evaluated by FDG-PET before surgery. Other inclusion criteria were as follows: patient age above 18 years, thyroid nodule with a diameter ≥1 cm to avoid any partial volume effect on the FDG-PET scan, a normal serum TSH level (0.5–4 mU/L), and absence of known diabetes. Clinical data, laboratory results, and neck US were available in all cases. Our institutional review board approved the study and all patients gave their written informed consent.
FNAC
All of the thyroid nodules initially included in this study were defined at cytology as “indeterminate follicular lesions.” FNAC under US guidance was performed at the Institut Gustave Roussy in 35 cases and in other centers in 27 cases. The FNAC samples obtained at the Institut Gustave Roussy were retrospectively reviewed by an expert cytologist (P.V.) in order to classify cytologic results according to “The Bethesda System for Reporting Thyroid Cytopathology,” available since 2009 (5). The cytologist was blinded to histology and FDG-PET findings at the time of the review. The FNAC samples from outside centers did not undergo a central review. For the final analysis, we took into account the nodules with indeterminate cytology diagnosed in other centers without central review (n=27) and the nodules analyzed in our center that were reclassified in categories III (atypia of undetermined significance or follicular lesion of undetermined significance) (n=6) and IV (follicular neoplasm or suspicious for a follicular neoplasm) (n=23) of the Bethesda classification. Nodules classified in categories I (nondiagnostic or unsatisfactory), II (consistent with a benign follicular nodule), V (suspicious for malignancy), and VI (consistent with malignancy) were excluded from the final analysis (n=6).
US characteristics
Neck US studies were performed in the Department of Radiology of the Institut Gustave Roussy in 46 cases with a high-energy linear probe of 8–14 MHz (Aplio® Toshiba), or in other centers in 14 cases. All the studies were retrospectively reviewed by an expert radiologist in thyroid imaging (L.C.) blinded to the histology and FDG-PET findings. For each nodule the following US features were evaluated: size (antero-posterior, longitudinal, and transverse diameters), echo structure (hypoechoic, isoechoic, or hyperechoic), margins (regular, irregular), internal structure (solid or cystic), microcalcifications, hypervascularization (peripheral, internal, mixed, and regular/irregular), and hypoechoic halo (present/absent). The internal structure of each nodule was defined as predominantly solid or predominantly cystic on the basis of the percentage of its cystic component (<50% or >50%, respectively), or as a mixed pattern when the solid and cystic components were equally represented. Microcalcifications were defined on the basis of their diameter (<1 mm). Vascularization was determined by color Doppler technique. On the basis of these features, a final judgment on neck US was given: a malignant US pattern was defined by the presence of microcalcifications or of at least two suspicious characteristics among the following: hypoechoic solid structure, hypervascularization, or irregular margins. A benign US pattern was defined by the presence of an iso- or hyperechoic structure with regular margins and/or a hypoechoic halo. Any other US pattern was considered a suspicious US pattern.
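The three-way US judgment described above can be written as a small decision rule. The following Python sketch is one possible reading of that rule, not the radiologist's actual procedure. The class `NoduleUS`, its field names, and the function `us_pattern` are hypothetical. Two points are assumptions: the malignant criteria are checked before the benign ones, and "and/or" is read as an iso- or hyperechoic structure accompanied by regular margins, a hypoechoic halo, or both.

```python
from dataclasses import dataclass


@dataclass
class NoduleUS:
    """Hypothetical container for the US features of one nodule."""
    echogenicity: str            # "hypoechoic", "isoechoic", or "hyperechoic"
    predominantly_solid: bool    # cystic component < 50%
    irregular_margins: bool
    microcalcifications: bool    # calcifications < 1 mm in diameter
    hypervascularization: bool   # assessed by color Doppler
    hypoechoic_halo: bool


def us_pattern(nodule: NoduleUS) -> str:
    """Return "malignant", "benign", or "suspicious" for one nodule."""
    suspicious_features = [
        nodule.echogenicity == "hypoechoic" and nodule.predominantly_solid,
        nodule.hypervascularization,
        nodule.irregular_margins,
    ]
    # Malignant pattern: microcalcifications, or at least two suspicious features.
    if nodule.microcalcifications or sum(suspicious_features) >= 2:
        return "malignant"
    # Benign pattern (assumed reading of "and/or"): iso- or hyperechoic
    # structure with regular margins and/or a hypoechoic halo.
    iso_or_hyper = nodule.echogenicity in ("isoechoic", "hyperechoic")
    if iso_or_hyper and (not nodule.irregular_margins or nodule.hypoechoic_halo):
        return "benign"
    # Everything else is considered a suspicious pattern.
    return "suspicious"


example = NoduleUS("hypoechoic", True, False, False, True, False)
print(us_pattern(example))
```

The example nodule is solid, hypoechoic, and hypervascular, so it carries two suspicious characteristics and falls in the malignant pattern under this reading.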
FDG-PET/CT protocol
A presurgical low-dose FDG-PET/CT (Siemens Biograph) was performed at the Institut Gustave Roussy in all cases with a median interval of 16 days before surgery (range 3–158 days). All patients, after fasting for at least 6 hours, received a low FDG activity (mean: 165 MBq; range: 118–189 MBq). Images were acquired 60 minutes after FDG injection, with a median of 3 bed positions of 10 minutes each in the head and neck region. A three-dimensional mode was used for PET image acquisition. PET data were reconstructed on a 128×128 matrix, using an iterative algorithm (FORE and AWOSEM) with two iterations, eight subsets, and a 5-mm FWHM Gaussian postfilter. PET images were analyzed on an e.soft workstation (Siemens). A visual analysis was performed by two expert nuclear medicine physicians (D.D. and J.L.) blinded to histological results and to each other's interpretations. The nodule location was known at the time of interpretation, however. The presence or absence of focal FDG uptake in the thyroid nodule was analyzed first. A focal uptake was considered significant when uptake was higher than the normal thyroid background. A consensus between the two experts was reached in case of discordant findings. Any other abnormal uptake in the neck was also evaluated. When focal uptake was present, the maximum standardized uptake value normalized for body weight (SUV Max) was measured by drawing a three-dimensional region of interest.
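For reference, the body-weight-normalized SUV is conventionally defined as the ratio of the tissue activity concentration to the injected activity per unit body weight, and SUV Max is the highest voxel value within the region of interest. The symbols below are notation introduced here for illustration and do not appear in the original protocol: $C_{\mathrm{tissue}}$ is the decay-corrected activity concentration in a voxel $v$, $A_{\mathrm{inj}}$ the injected activity, and $W$ the body weight.

```latex
\mathrm{SUV}_{\mathrm{bw}}(v) = \frac{C_{\mathrm{tissue}}(v)}{A_{\mathrm{inj}} / W},
\qquad
\mathrm{SUV}_{\max} = \max_{v \,\in\, \mathrm{ROI}} \mathrm{SUV}_{\mathrm{bw}}(v)
```

With $C_{\mathrm{tissue}}$ in kBq/mL, $A_{\mathrm{inj}}$ in kBq, and $W$ in g, the value is dimensionless if tissue density is taken as 1 g/mL.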
Histopathological characteristics
All histological specimens were separately reviewed by two experts in thyroid pathology (B.C. and A.A.G., Institut Gustave Roussy) blinded to PET findings. On the basis of histological analysis, tumors were classified according to WHO recommendations as benign tumors (adenomas), papillary carcinoma (classical or variant), follicular carcinoma (minimally or widely invasive), poorly differentiated thyroid cancer, or tumor of uncertain malignant potential (TUMP) (27). Discrepancies in the diagnosis were resolved by consensus between the two experts. Adenomas were classified as follicular or oncocytic: an adenoma was considered oncocytic if it was composed of ≥50% oncocytic cells, while adenomas with <50% oncocytic cells were considered to show oncocytic metaplasia. TUMP was defined as an encapsulated tumor composed of well-differentiated follicular cells with no blood vessel invasion, and with questionable PTC-type nuclear changes and/or questionable capsular invasion (28).
Statistical analysis
Histological findings were taken as the gold standard. FDG-PET and neck US sensitivity, specificity, PPV, and NPV in detecting malignant and potentially malignant nodules (TUMP) were calculated. The Kruskal-Wallis nonparametric test was used to test the relationships between quantitative variables (SUV Max value and nodule diameter) and histological diagnosis. The chi-square test or Fisher's exact test was used to evaluate relationships between FDG uptake (visual qualitative analysis) and each US and histological parameter in univariate analyses. Multivariate analyses were performed by logistic regression. All variables with a p-value<0.20 in univariate analysis were tested in the multivariate analysis. Since only one nodule was biopsied and considered for the statistical analyses in all patients except one, we did not take patient clustering into account. All p-values were two-sided. Analyses were performed using SAS statistical software V9.1 (SAS Institute).
Results
Patients
A total of 56 nodules (55 patients, 42 women, mean age=50.7 years, range=24–82) were finally considered for analysis (Table 1). Thirty-six of the 55 patients (65%) underwent a total thyroidectomy because of nodular goiter (17 cases) or because the frozen section analysis suggested malignancy (19 cases). The remaining 19 patients (35%) underwent a lobectomy.
(1, 2=nodule 1, nodule 2 in the same patient)
TUMP, tumors of uncertain malignant potential; FV-PTC, follicular variant of papillary carcinoma; PD FC, poorly differentiated carcinoma; IM FC, minimally invasive follicular carcinoma; PTC, papillary thyroid cancer; US, ultrasonographic; FDG, fluorodeoxyglucose.
Histology
At histology, 34 (61%) nodules were benign (adenomas), 10 (18%) were malignant (carcinomas), and 12 (21%) were TUMP. There were no discrepancies between the two pathologists. Among the 34 adenomas, 11 were oncocytic adenomas. Among the 10 carcinomas, 7 were papillary carcinomas, including 3 classical, 2 tall cell, 1 follicular, and 1 Warthin-like variants, and 3 were minimally invasive follicular carcinomas.
Hurthle cells were found in 29/56 nodules (52%) (15 adenomas, 10 TUMP, 2 follicular carcinomas, and 2 papillary carcinomas). Lymphocyte and macrophage infiltration was present in 15/56 (27%) nodules, all adenomas. Cellular atypia was present in 23/56 (41%) nodules (6 oncocytic adenomas, 12 TUMP, 3 follicular carcinomas, 1 follicular variant of papillary carcinoma, and 1 classical papillary carcinoma). Focal necrosis was found in 4/56 (7%) nodules (3 oncocytic adenomas and 1 follicular carcinoma). Mitoses were present in 7/56 (12.5%) nodules (2 adenomas, 4 TUMP, and 1 minimally invasive follicular cancer with poorly differentiated structures). Vascular invasion was found in 3/56 (5%) nodules (2 classical papillary carcinomas and 1 minimally invasive follicular carcinoma). Fifty of the 56 (89%) nodules were encapsulated, with capsule infiltration present in all malignant tumors. Limited poorly differentiated areas (≤50%) were present in 6/56 (11%) nodules, all TUMP. Large poorly differentiated components (>50%) were found in 2/56 (4%) nodules, both minimally invasive follicular carcinomas.
US findings
US features of benign and malignant or potentially malignant nodules are reported in Table 2. For the detection of malignant and potentially malignant nodules, neck US had a sensitivity, specificity, PPV, and NPV of 82% (18/22), 47% (16/34), 50% (18/36), and 80% (16/20), respectively. The median diameter of thyroid nodules was 21 mm (range: 10–57). The median diameter was 21 mm (range: 10–43) for benign nodules, 27 mm (range: 11–57) for TUMP, and 20 mm (range: 10–35) for malignant nodules (Kruskal-Wallis test p=0.18).
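The four accuracy figures above follow directly from the 2×2 counts implied by the reported fractions. The following Python sketch recomputes them. The function name `diagnostic_metrics` is hypothetical, and the counts are read off the fractions in this paragraph: 18 true positives, 4 false negatives, 16 true negatives, and 18 false positives.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, PPV, and NPV from 2x2 counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Neck US versus histology (malignant or TUMP counted as positive).
us_metrics = diagnostic_metrics(tp=18, fn=4, tn=16, fp=18)
for name, value in us_metrics.items():
    print(f"{name}: {100 * value:.0f}%")
```

The same function applies to any test evaluated against histology, provided the four counts are available.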
Univariate analysis showed that among US features, a solid hypoechoic structure and hypervascularization were the only predictive patterns of malignancy/TUMP (p=0.02 and 0.001, respectively). Regular margins and a cystic structure were significantly correlated with benign lesions (p=0.046 and 0.01). In multivariate analysis, hypervascularization was the only independent predictive factor of malignancy/TUMP (p=0.007) and a cystic structure the only independent predictive factor for benign lesions (p=0.03).
FDG-PET/CT findings
On the basis of visual analysis of the PET scan, 30/56 (54%) nodules showed FDG uptake (Table 3): 13 adenomas, 9 TUMP, and 8 carcinomas. Among the 13 adenomas with FDG uptake, 7 were oncocytic adenomas. Among the 8 carcinomas with FDG uptake, 3 were minimally invasive follicular carcinomas (with poorly differentiated structures in 2 cases), and 5 were papillary carcinomas (1 follicular variant, 1 classical, 1 Warthin-like, and 2 tall cell variants). Among the 26/56 (46%) nodules without FDG uptake, there were 21 adenomas (5/21 classified as oncocytic adenomas), 3 TUMP, and 2 carcinomas (both classical papillary carcinomas). No statistical difference was found between the diameters of nodules with and without FDG uptake (median 20.5 mm [range 10–57] and 21.0 mm [range 10–43], respectively, Kruskal-Wallis p=0.93). The diameters of the 2 malignant nodules without FDG uptake were 10 mm and 20 mm. The median SUV Max value for the 30 nodules with FDG uptake was significantly lower (Kruskal-Wallis test p=0.03) in benign than in malignant/potentially malignant nodules: 3.1 (range: 2.2–11.2) for adenomas, 5.6 (range: 2.2–33) for carcinomas, and 10 (range: 3–55) for TUMP lesions (Fig. 1). An SUV Max value ≥5 (n=13 nodules) was observed more frequently in malignant/potentially malignant nodules (10/22, 45%) than in adenomas (3/34, 9%) (p=0.002).
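The comparison of high SUV Max frequencies between the two groups is a test on a 2×2 table. The following Python sketch applies a two-sided Fisher's exact test to the counts given above (10 of 22 malignant or TUMP nodules and 3 of 34 adenomas with SUV Max ≥5). The text does not state which test produced the reported p-value, so the choice of Fisher's exact test here is an assumption and the result need not match the reported figure exactly. The function name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n_tables = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / n_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * tolerance
    )


# Rows: malignant/TUMP, adenomas. Columns: SUV Max >= 5, SUV Max < 5 or no uptake.
p_value = fisher_exact_two_sided(10, 12, 3, 31)
print(f"two-sided p = {p_value:.4f}")
```

An exact test is a natural choice here because one expected cell count is close to 5, where the chi-square approximation becomes less reliable.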

Fig. 1. Distribution of SUV Max value in adenomas, carcinomas, and TUMP (p=0.03). PET, positron emission tomography.
The sensitivity, specificity, PPV, and NPV of FDG-PET/CT for the diagnosis of malignant and potentially malignant nodules were 77% (17/22), 62% (21/34), 57% (17/30), and 81% (21/26), respectively, with a significant correlation between the presence of FDG uptake and malignancy/TUMP (p=0.004).
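The association between FDG uptake and malignancy/TUMP can be checked from the same 2×2 counts. The following Python sketch computes a Pearson chi-square statistic without continuity correction and its p-value for one degree of freedom. The use of the uncorrected chi-square test is an assumption, since the text does not say which test produced the reported p-value, and the function name `chi_square_2x2` is hypothetical.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, p_value). For one degree of freedom the upper-tail
    probability of the chi-square distribution equals erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    statistic = numerator / denominator
    return statistic, erfc(sqrt(statistic / 2))


# Rows: malignant/TUMP, adenomas. Columns: FDG uptake, no FDG uptake.
statistic, p_value = chi_square_2x2(17, 5, 13, 21)
print(f"chi-square = {statistic:.2f}, p = {p_value:.4f}")
```

Under this assumption the statistic is about 8.2, which gives a p-value that rounds to the reported 0.004.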
In univariate analysis, FDG uptake was associated with the presence of oncocytic cells (p=0.003), cellular atypia (p<0.0001), tumor capsule invasion (p=0.003), and poorly differentiated areas (p=0.005) (Table 4). Among nodules showing FDG uptake, oncocytic cells were present in 10/13 benign nodules, 7/9 TUMP, and 4/8 carcinomas. Cellular atypia was present in 5/13 benign nodules, 7/8 carcinomas, and in all cases of TUMP (9/9). Tumor capsule invasion and poorly differentiated areas were present only in carcinomas. In multivariate analysis, cellular atypia was the only feature associated with FDG uptake (p<0.0001). A high SUV Max (SUV Max ≥5) in univariate analysis was correlated with the presence of oncocytic cells (p=0.0008), cellular atypia (p=0.0003), poorly differentiated component (p=0.001), mitosis (p=0.04), necrosis (p=0.04), and capsule invasion (p=0.02). In multivariate analysis, the presence of oncocytic cells (p=0.02) and a poorly differentiated component (p=0.02) were the only independent factors predicting a high SUV Max.
The combination of a malignant/suspicious US pattern and a positive PET scan had a higher specificity than neck US alone (71% vs. 47%) but a lower sensitivity (64% vs. 82%) (Table 5).
CI, confidence interval.
Discussion
FDG-avid thyroid nodules detected on PET scan are at a high risk of malignancy, but the utility of FDG-PET in the presurgical characterization of thyroid nodules with indeterminate cytology is controversial, due to discordant results in the literature. The most recent American Thyroid Association (ATA) guidelines recommend exploring any FDG-avid nodule by FNAC (recommendation A1: Strongly recommends) but do not recommend the routine presurgical use of FDG-PET to detect malignancy (recommendation E: Recommends against) (4). In our study including 55 patients, FDG-PET had a sensitivity of 77% and an NPV of 81% in detecting malignant and potentially malignant thyroid nodules. The low sensitivity and the low NPV are due to the absence of FDG uptake in a proportion of malignant nodules (2/10) and TUMP (3/12). Our data do not support previous studies of FDG-PET that found a sensitivity and NPV of 100% (21,23,25,26). The discordance between our results and previous reports may be related to the definition of “indeterminate cytology” itself. Since 2009, the availability of “The Bethesda System for Reporting Thyroid Cytopathology” has made it possible to define indeterminate follicular lesions clearly and to distinguish them from nodules with cytological features consistent with benign lesions or suspicious for malignancy. Before 2009, the term “indeterminate follicular neoplasm” included a heterogeneous population of lesions ranging from adenomas to follicular carcinomas or follicular variants of papillary thyroid carcinomas. In our study we reviewed and reclassified the 35 lesions analyzed in our center on the basis of the Bethesda System and we excluded all the nodules with cytology consistent with benign lesions or suspicious for malignancy, in order to reduce our selection bias. Unfortunately, the remaining 27 thyroid nodules included in our study did not undergo a central review. We consider this the main limitation of our study.
The discordance among previous studies could also be related to the heterogeneity of the pathology of the thyroid tumors included in the studies. While high FDG uptake has been reported in aggressive tumors such as anaplastic cancer or Hurthle cell carcinoma, the absence of FDG uptake in thyroid carcinomas is often related to well-differentiated tumor histology or to small tumor size (<10 mm), as previously reported by Mitchell et al. and Traugott et al. (20,26). In our study, most of the malignant nodules analyzed were well-differentiated cancers and the two carcinomas that did not show FDG uptake were classical papillary carcinomas. Their small size (1 with a diameter of 10 mm) and their differentiation could be a reason for the low sensitivity reported in our study. On the other hand, the high number of TUMP cannot explain the low sensitivity of FDG-PET in our study, since most of these tumors showed FDG uptake. The unexpectedly high number of lesions with undetermined malignant potential at final histology is an unusual finding, and our two pathologists agreed in all cases. This may be explained by the selection of only patients with indeterminate cytology for our study, whose nodules generally represent a small proportion of the nodules operated on in our center. Whether TUMP should be grouped with benign or malignant nodules is a matter of debate. Since TUMP cannot be defined as benign, we chose to combine TUMP with the malignant tumors in our study.
Our study also shows a high rate of false positive PET results, with 13/34 (38%) benign nodules showing FDG uptake. Thus, the specificity and PPV were low (62% and 57%, respectively). The specificity found in our study is comparable, however, to the values reported previously, which range between 40% and 60% (19–23). Ours is the first study to analyze specific histological and US features in relation to FDG uptake in thyroid nodules, and in particular in benign nodules. The only independent predictive factor for FDG uptake was cellular atypia (p<0.0001), which can be seen in both malignant and benign nodules and which is frequently linked to the presence of oncocytic cells. These two correlated factors may partially explain the low specificity of FDG-PET and confirm the previously reported link between FDG uptake and oncocytic cells in thyroid adenomas (19). At histology, an aggressive pattern, such as a poorly differentiated component for follicular carcinomas and a tall cell variant for papillary cancer, was found among malignant nodules showing FDG uptake. In particular, the SUV Max was significantly higher in malignant/potentially malignant nodules than in adenomas in general, but it was even higher (>5) in malignant tumors with a poorly differentiated component (p=0.02). Unfortunately, this does not mean that a high SUV Max can be used as a specific feature of tumor aggressiveness, because benign nodules can also show a high SUV, which is related to the presence of oncocytic cells (25,29). However, in case of malignancy and in the absence of oncocytic metaplasia, a high SUV Max could be an indicator of an aggressive tumor (22,25). The added value of FDG-PET findings to US in our study was low, the two techniques having a similar sensitivity (77% for FDG-PET vs. 82% for US) and PPV (57% for FDG-PET vs. 50% for US).
Another limitation of our study is that not all US examinations were performed at our center, but they were all centrally reviewed by the same radiologist in order to obtain a reliable and homogeneous evaluation of the US criteria. Our results still need to be confirmed in further studies, although a previous study comparing FDG-PET and US in thyroid incidentaloma agrees with our data (30). Our study may also be biased in that all of the thyroid nodules analyzed were initially diagnosed using US and may have been selected for FNAC on the basis of more suspicious US findings; it is well known that nodules with criteria suspicious for malignancy, such as a hypoechoic solid structure and hypervascularization, undergo FNAC more often (31,32). On the basis of our results, however, FDG-PET is not useful in the presurgical characterization of thyroid nodules with indeterminate FNAC. In our study, nearly one-quarter of the cancers and TUMP (5/22) would have been missed if the surgical decision had been based on FDG-PET results.
In conclusion, the sensitivity and specificity of FDG-PET in the presurgical evaluation of thyroid nodules with indeterminate FNAC are low and FDG-PET should not be used in routine practice to select patients for surgery, confirming current ATA recommendations. The role of FDG-PET in detecting aggressive tumor variants still remains to be evaluated.
Footnotes
Disclosure Statement
The authors declare that no competing financial interests exist.
