Abstract
Background:
The CUT score is a thyroid nodule scoring system that has recently become available as a smartphone application. It was created on the basis of a meta-analysis of suspicious clinical (C) and ultrasonographic (U) thyroid nodule features to help clinicians with the preoperative malignancy risk assessment of thyroid nodules. The aim of the present study was to analyze the C + U sum of the CUT score for cytologically indeterminate TIR3A and TIR3B thyroid nodules, comparing the results obtained from the two groups.
Methods:
The CUT score was applied to 201 cytologically indeterminate thyroid nodules, 78 categorized as TIR3A and 123 as TIR3B. The Mann–Whitney test was applied to compare the C + U score values of the two groups, and a receiver operating characteristic (ROC) curve analysis was performed to validate the C + U score as a diagnostic test.
Results:
In both groups, the median C + U value was significantly higher in malignant (4.37 TIR3A, 4.50 TIR3B) than in benign nodules (2.75 TIR3A, 3.00 TIR3B). Through ROC curve analysis within the TIR3A group, a C + U value ≥4.00 was determined as the diagnostic cutoff for the detection of malignant nodules (56% sensitivity, 77% specificity, area under the curve [AUC] = 0.714); for the TIR3B group, a C + U cutoff value of ≥3.75 was identified (65% sensitivity, 78% specificity, AUC = 0.744).
Conclusion:
The CUT score could represent a valid aid for the clinician in the management of indeterminate nodules with follicular proliferation.
Introduction
The management of thyroid nodules with a cytologically follicular pattern is challenging because of the impossibility of performing a histo-architectural analysis in the preoperative phase, which hampers a definitive diagnostic evaluation regarding the invasiveness of the lesion (1,2).
At the histological level, a cytologically microfollicular pattern can reflect a broad spectrum of lesions, including hyperplastic adenomatous nodules, follicular adenomas or carcinomas, and the follicular variant of papillary thyroid carcinoma (3).
The latest Italian consensus for thyroid cytology from 2014, drafted by the Italian Society of Anatomic Pathology and Diagnostic Cytology (SIAPEC-IAP), together with experts representing the Italian Endocrinology societies (Italian Thyroid Association [AIT], Italian Society of Endocrinology [SIE], and Association of Medical Endocrinologists [AME]), divides the indeterminate follicular category, TIR3, into two main subclasses with different risks of malignancy (4).
A previous classification from 2007 (5) recommended surgical intervention in case of TIR3, which was associated with a rate of malignancy of about 20%. In the new classification (4), the TIR3 category is divided into two subclasses: TIR3A corresponds approximately to the atypia of undetermined significance/follicular lesion of undetermined significance and Thy3a categories, and TIR3B corresponds to the follicular neoplasm/suspicious for follicular neoplasm and Thy3f categories of the Bethesda Reporting System for Thyroid Cytology (6) and the British Royal College of Pathologists Classification (7).
This discrimination aims to reduce the number of patients included in the TIR3 category undergoing surgery for benign pathologies (8).
Nodules that fall within the TIR3A cytology category (low-risk indeterminate lesion) are characterized by increased cellularity, poor colloid content, and microfollicular structures that are numerous but insufficient to establish the diagnosis of a “follicular neoplasia.” In these cases, the expected risk of malignancy is lower than 10%, suggesting that follow-up with a possible repeat fine-needle aspiration (FNA) within six months is adequate.
Thyroid nodules showing a TIR3B cytology (high-risk indeterminate lesion) are characterized by high cellularity arranged in monotonous microfollicular/trabecular structures, with poor/absent colloid (a picture suggestive of “follicular neoplasia”). Although architectural atypia is the most important morphological feature that allows the distinction between low- and high-risk lesions (i.e., TIR3A from TIR3B), a significant degree of nuclear atypia warrants the inclusion of the lesion within the TIR3B subcategory. For TIR3B nodules, the risk of malignancy is between 15% and 30%, and surgical excision is considered the first-line interventional strategy.
Given the low percentage of malignancy among nodules with an indeterminate follicular pattern, any additional effort to identify further factors allowing a more accurate selection of nodules that should be surgically removed remains of interest. Immunocytochemical and molecular markers that can improve this decision have been identified, but further studies are needed to obtain tools that can be applied rapidly and inexpensively in the clinical context.
The clinical–ultrasound evaluation is a key part of the characterization of thyroid nodules. For an overall and prompt patient evaluation, we recently introduced a free smartphone app of our meta-analysis-derived scoring system, the CUT score, which is available on iPhone and Android. The CUT score is based on a comprehensive meta-analysis of published studies on the clinical and ultrasonographic (US) features of thyroid nodules associated with higher risk of malignancy. To build a 10-point score represented by the C + U sum (ranging from 0 to 10), each clinical and US feature received a matching value directly proportional to the effect size computed in the meta-analysis. The application allows the user to assign a weight to each suspicious clinical (C) and/or US (U) thyroid nodule feature and links the C + U sum to the cytological category (represented by T) of the biopsied nodule (Table 1) (9,10).
Table 1. The CUT Score Consisting of the Amount of the Suspicious Clinical and Ultrasonographic Features (C + U) Occurring in the Thyroid Nodule and the Result of the Thyroid Nodule Cytology (T) Corresponding to the Previous Italian Cytological Classification*
*Fadda et al. (5).
In a previous study of our group, the CUT score was applied to 705 thyroid nodules, showing a reliable preoperative malignancy risk stratification (10).
The aim of the present study was to analyze the C + U sum of the CUT score for TIR3A and TIR3B thyroid nodules, comparing the results obtained from the two groups. In each group of nodules, we then attempted to identify a cutoff value that would guarantee better sensitivity and specificity in predicting the malignancy risk.
Materials and Methods
A retrospective analysis was performed for 201 cytologically indeterminate thyroid nodules, of which 78 were TIR3A nodules and 123 were TIR3B nodules. All nodules were surgically resected and evaluated by histopathology. The nodules were selected from a series of 4778 consecutive thyroid nodules in 5059 outpatients, referred for FNA according to the AME/AACE/ETA guidelines (11), in a timeframe between May 1, 2014, and July 31, 2017, at the Center for Thyroid Disease of the Catholic University of Sacred Heart, Foundation “A. Gemelli” University Hospital-Rome.
All patients provided signed informed consent; subsequently, a clinical history assessment, neck US examination, and FNA of the selected nodules were performed. Surgical intervention was recommended based on the cytological result and the clinical and US context.
The color-Doppler US examinations and the US-guided FNA procedures were performed by endocrinologists and surgeons of our Center, using a Toshiba Aplio 400 (Japan) equipped with a VF13–5 linear transducer (12 MHz). All FNAs were performed with 23–25–27 G needles under US guidance with two to three consecutive passes for each lesion (12). After the aspiration, the needle was rinsed in CytoLyt® solution, and the sample was processed using the ThinPrep 2000 method (HologicCytyc Co., Marlborough, MA), fixed with 95% ethyl alcohol, and stained with Papanicolaou stain. The cytological diagnoses were made based on the 2014 Italian Reporting System for Thyroid Cytology (4). For surgically removed thyroid nodules, all surgical specimens were fixed in 10% buffered formaldehyde, and paraffin-embedded 5-μm-thick microtomic sections were stained with hematoxylin and eosin. All the cytological and histological sections were analyzed and reported by two different thyroid cytopathology specialists.
Two skilled sonographers, both endocrinologists with 6–12 years of experience in performing FNA and >10 years in US, retrospectively applied the C + U sum of the CUT score to the thyroid nodules included in the study, unaware of the respective histology. For the clinical component of the C + U score, the following elements were included: sex, history of head and/or neck irradiation, family history of thyroid carcinoma, and/or familial syndromes associated with a higher incidence of thyroid carcinomas. For the US component of the C + U score, each thyroid nodule was carefully assessed for the following features: single or multiple nodule (single if no other nodule was detected on US examination, accurately excluding pseudo-nodules), size (the maximum value recorded in the three orthogonal dimensions), shape (considering a taller than wide shape if the anteroposterior diameter was greater than the transverse diameter), margins (irregular or lobulated vs. well circumscribed), the presence and characteristics of the peripheral halo (incomplete or absent vs. regular halo sign), echostructure (entirely solid vs. mixed or purely cystic), echogenicity (hypoechoic vs. hyperechoic or isoechoic with respect to normal thyroid parenchyma; in mixed thyroid nodules, the internal solid components were analyzed), the presence and morphology of microcalcifications (tiny, punctate hyperechoic foci less than 2 mm, excluding those presenting a comet-tail or ring-down artifact) with or without macrocalcifications, and intranodular vascularization found on the color-flow Doppler pattern (defined as type 1, the absence of flow signals; type 2, prevalent peripheral flow; type 3, prevalent intranodular flow) (13). The US examination of the thyroid nodules was performed by the two sonographers, who reached a consensus regarding each US characteristic.
Statistical analysis
The C + U sum of the CUT score was calculated for each thyroid nodule included in the study, by using the specific application software created for Apple and Android (Cut score, Arianna srl).
The normality of data distribution for each group was determined by the Shapiro–Wilk test. The Mann–Whitney test was applied to compare the C + U sum values of the TIR3A and the TIR3B groups and, within each group, to compare the values belonging to the benign and malignant nodules.
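As an illustration of the group comparison described above, the following is a minimal sketch of a two-sided Mann–Whitney test using only the Python standard library. It relies on the tie-corrected normal approximation without continuity correction, which is an assumption and not necessarily the computation performed by MedCalc. The function name `mann_whitney_u` and the score lists are hypothetical and are not study data.

```python
import math
from itertools import groupby


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney test via the tie-corrected normal approximation.

    Returns (U statistic for x, z score, two-sided p-value).
    No continuity correction is applied; at least two distinct values are required.
    """
    n1, n2 = len(x), len(y)
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    pos = 0  # number of pooled values already ranked
    for _, group in groupby(pooled, key=lambda pair: pair[0]):
        group = list(group)
        t = len(group)
        midrank = pos + (t + 1) / 2  # average of ranks pos+1 .. pos+t
        rank_sum_x += midrank * sum(1 for _, g in group if g == 0)
        tie_term += t ** 3 - t
        pos += t
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u_x - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return u_x, z, p


# Hypothetical C + U sums for two small groups (illustrative only).
group_a = [1.75, 2.50, 2.75, 3.00, 3.25, 4.00]
group_b = [2.50, 3.00, 3.50, 3.75, 4.50, 5.25]
u, z, p = mann_whitney_u(group_a, group_b)
print(f"U = {u}, z = {z:.3f}, p = {p:.4f}")
```

For small samples an exact test would be preferable to the normal approximation used in this sketch.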
The sensitivity, specificity, positive predictive value, negative predictive value, and accuracy were calculated independently for the TIR3A and the TIR3B nodule groups for each C + U sum value. For both the TIR3A and the TIR3B groups, a receiver operating characteristic (ROC) curve was obtained to validate the C + U sum as a diagnostic test and to identify a cutoff value for each cytological subcategory. The area under the curve (AUC) and the p-value for both curves were calculated. A value of p < 0.05 was considered statistically significant. The data analysis was performed using MedCalc Statistical Software version 18.2.1 (MedCalc Software bvba, Ostend, Belgium).
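The ROC procedure described above can be sketched as follows. This is a minimal standard-library example and not the MedCalc implementation. The criterion used to pick the cutoff is not stated in the text, so the Youden index (sensitivity + specificity − 1) is used here as an assumption. The function names and the example scores are hypothetical.

```python
def roc_points(scores, labels):
    """Return (cutoff, sensitivity, specificity) for each rule 'score >= cutoff'.

    labels: 1 = malignant on histology, 0 = benign.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= cutoff)
        tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s < cutoff)
        points.append((cutoff, tp / n_pos, tn / n_neg))
    return points


def auc(scores, labels):
    """AUC as the probability that a malignant nodule scores higher than a
    benign one, counting ties as one half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores, labels):
    """Cutoff maximizing sensitivity + specificity - 1 (assumed criterion)."""
    return max(roc_points(scores, labels), key=lambda pt: pt[1] + pt[2] - 1)


# Hypothetical C + U sums and histology labels (illustrative only).
scores = [1.5, 2.0, 2.75, 3.0, 3.5, 4.0, 4.5, 5.0]
labels = [0, 0, 0, 1, 0, 1, 0, 1]
cutoff, sens, spec = youden_cutoff(scores, labels)
print(f"AUC = {auc(scores, labels):.3f}")
print(f"cutoff >= {cutoff}: sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

The pairwise definition of the AUC used here is equivalent to the Mann–Whitney U statistic divided by the product of the two group sizes, which is why the two analyses are closely related.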
Results
Of 4778 biopsied thyroid nodules, 325 (6.8%) yielded TIR3A and 215 (4.5%) TIR3B cytologies.
Following a diagnosis of TIR3A on an initial thyroid FNA, a repeat aspiration was performed in 95/325 (29%) to provide a more definitive risk stratification. All indeterminate thyroid nodules not surgically resected and/or managed elsewhere were excluded (Fig. 1). Therefore, 78 TIR3A nodules (of which, 38 had an identical reading on the second FNA) and 123 TIR3B nodules, all histologically verified, were included in the study. Of 201 indeterminate nodules, 155 (77%) were benign and diagnosed as nodular hyperplasia, follicular adenomas, and oxyphilic adenomas. Among 78 TIR3A nodules, 9 (11.5%) showed a malignant histology. Among 123 TIR3B nodules, 40 (32.5%) showed a malignant histology. On histopathological diagnosis, 1 case of noninvasive follicular thyroid neoplasm with papillary-like nuclear features (NIFTP) (2%) (which we included in the malignant category), 43 papillary carcinomas (87%), and 5 follicular carcinomas (11%) were found (Table 2).

Fig. 1. Flow chart of all biopsied thyroid nodules.
Table 2. Histopathological Diagnosis of Malignant TIR3A and TIR3B Nodules
FVPTC, follicular variant of papillary thyroid carcinoma; NIFTP, noninvasive follicular thyroid neoplasm with papillary-like nuclear features; PTC, papillary thyroid carcinoma.
The M:F ratio and median age were similar between the TIR3A and the TIR3B groups (respectively, M:F 21/57 vs. 34/89 and median age ± SD: 53 ± 14 vs. 56 ± 14). The median nodule size was significantly larger in the TIR3A group than in the TIR3B group (median nodule size [range]: 30 [6–80] mm vs. 20 [4–79] mm).
In the TIR3A group, the median C + U value of all nodules was 2.75 (95% confidence interval [CI] = 2.50–3.25; interquartile range [IQR] = 1.75–4.43). In the TIR3B group, the median C + U value of all nodules was 3.50 (CI = 3.00–3.75; IQR = 2.50–4.68) (Fig. 2a).

Fig. 2. The box plots indicate the interquartile range and the median value of the C + U score.
In the TIR3A group, the median C + U value was significantly higher in malignant (4.37 [CI = 2.75–5.40]; IQR = 2.75–5.50) than in benign (2.75 [CI = 2.50–3.00]; IQR = 1.75–3.81) nodules (p = 0.0035) (Fig. 2b).
The difference between the median C + U values found in the malignant (4.50 [CI = 3.83–5.25]; IQR = 3.12–5.87) and benign (3.00 [CI = 2.75–3.50]; IQR = 2.06–3.75) nodules within the TIR3B group was also significant (p < 0.0001) (Fig. 2c).
Through the ROC curve analysis within the TIR3A group, a C + U value ≥4.00 was identified as the cutoff, showing 56% sensitivity and 77% specificity in detecting malignant nodules (AUC = 0.714 [CI = 0.63–0.78]; p < 0.001) (Fig. 3a and Table 3). Within the TIR3B group, a ROC cutoff of ≥3.75 was found for the C + U value, with 65% sensitivity and 78% specificity in detecting malignant nodules (AUC = 0.744 [CI = 0.65–0.80]; p < 0.001) (Fig. 3b and Table 3).

Fig. 3. Receiver operating characteristic curves applied to the C + U sum of the CUT score for the TIR3A (a) and TIR3B (b) groups.
Table 3. Sensitivity, Specificity, Positive Predictive Value, Negative Predictive Value, and Accuracy in the TIR3A and TIR3B Groups at Established Cutoffs (see Fig. 3)
CI, 95% confidence interval; NPV, negative predictive value; PPV, positive predictive value.
Discussion
To date, the management of follicular thyroid nodules with indeterminate FNA cytology remains a challenging task for clinicians. The introduction of the two subgroups A and B in the TIR3 category into the new Italian cytological classification of 2014 (4) aimed at attributing a more refined prediction of the risk of malignancy, with the goal of reducing unnecessary thyroid surgery.
At the “Fondazione Policlinico A. Gemelli IRCCS” University Hospital of Rome, the overall incidence of diagnosed thyroid cancers in the first three years following the introduction of the two subclasses was 11.5% for the TIR3A and 32.5% for the TIR3B category. These percentages are slightly higher than expected for both categories, since the respective incidence is estimated to be less than 10% for TIR3A and between 20% and 30% for TIR3B. This result may stem from the biased sample selection for the present study, as only the surgically treated nodules were included. The selection bias could also explain the increased number of TIR3A nodules compared with TIR3B lesions, although TIR3A lesions can usually be managed more conservatively. Of note, the TIR3A nodules referred for surgical excision were on average larger than the TIR3B nodules, and in most cases, the surgical indication was due to the presence of large multinodular goiters. Therefore, it is reasonable to speculate that the percentage of malignancy in the two subclasses would be lower in a prospective randomized study, thus better aligning with the expected percentage.
Regarding the clinical and US context, the results of the present study suggest that the CUT score may be useful in daily practice for the evaluation of cytologically indeterminate thyroid nodules. The sum of clinical and US features included and calculated in the C + U score was slightly higher in the TIR3B than in the TIR3A thyroid nodules (p = 0.02). This CUT score grading therefore reflects the TIR3 subclassification into A and B, and it can be considered a preliminary result that encourages more efforts to identify the indeterminate nodules with the lowest risk of malignancy to spare them from surgery. Most importantly, the indeterminate thyroid nodules in both cytological subcategories that were found to be malignant on histology in this study showed C + U sums that were significantly higher than the ones found in nodules that displayed a benign histology. This result is in line with previous findings (10), where the CUT score was applied to 705 nodules classified with the previous Italian cytological classification system of 2007 (5). The category of follicular/indeterminate lesions (TIR3) included 32 nodules subjected to surgical resection. Of these, the malignant nodules showed an average C + U value that was significantly higher than the value in benign nodules (4.7 vs. 2.4). In our previous study (10), a C + U sum >2.5 was identified as a cutoff for malignancy in indeterminate thyroid nodules. In the present study, the higher number of samples and the subdivision into the two subclasses allowed the identification of a more precise cutoff for each subclass, A and B, of the new TIR3 category. Specifically, for TIR3A nodules, a C + U sum <4 allowed the identification of 51 benign nodules out of 69 cases (74%), with a negative predictive value of 92.7% and an overall risk of malignancy of 7.2% (Table 4).
For the TIR3B nodules, a C + U sum ≥3.75 allowed the identification of 28 malignant tumors out of a total of 40, with a risk of malignancy equal to 47.5% (Table 5). Thus, we reached a diagnostic accuracy of 71.8% [CI = 61.8–81.8] in the TIR3A group and of 65% [CI = 56.6–73.4] in the TIR3B group.
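The predictive values and accuracy quoted above follow from a 2 × 2 table of cutoff result versus histology. The sketch below shows the arithmetic for the TIR3A group. The four cell counts are not taken from a table in this article; they are reconstructed, as an assumption, from figures quoted in this Discussion and in the Results (9 malignant and 69 benign nodules, 51 benign nodules below the cutoff, and a negative predictive value of 92.7%, which implies 4 malignant nodules below the cutoff). The confidence interval uses a simple Wald (normal approximation) formula, which is also an assumption about how the reported intervals were obtained. The function names are hypothetical.

```python
import math


def diagnostic_metrics(tp, fn, fp, tn):
    """Standard diagnostic test measures from the four cells of a 2 x 2 table."""
    total = tp + fn + fp + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / total,
    }


def wald_ci(p, n, z=1.96):
    """Wald (normal approximation) 95% confidence interval for a proportion."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# Reconstructed (assumed) TIR3A counts at the C + U >= 4.00 cutoff.
tir3a = diagnostic_metrics(tp=5, fn=4, fp=18, tn=51)
low, high = wald_ci(tir3a["accuracy"], n=78)
for name, value in tir3a.items():
    print(f"{name}: {100 * value:.1f}%")
print(f"accuracy 95% CI: {100 * low:.1f}% to {100 * high:.1f}%")
```

Under these assumptions, the sketch yields the negative predictive value and the accuracy interval quoted for the TIR3A group; the same two functions can be applied to the TIR3B counts.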
Ianni et al. (10).
Therefore, this study demonstrates a high negative predictive value for TIR3A nodules with a C + U sum <4, which suggests that a conservative approach (follow-up and repeat FNA) is appropriate for these lesions. In contrast, for TIR3B nodules, a C + U sum ≥3.75 appears to be the appropriate cutoff for recommending surgery. However, the results of this study need to be tested in prospective multicenter studies to confirm their clinical utility.
In recent years, several studies have investigated the role of the most widespread US thyroid nodule classification systems, such as the various Thyroid Imaging Reporting and Data Systems (TIRADS) (14–17) and the American Thyroid Association (ATA) classification (18), in the risk stratification of nodules with indeterminate cytology. These systems include five suspicious US features (solid echostructure, hypoechogenicity, irregular margins/halo, microcalcifications, and “taller than wide” shape), all of which are represented in the CUT score. A recent study by Grani et al. (19) applied the TIRADS proposed by Kwak et al. (16) and the 2015 ATA risk stratification (18) to a group of indeterminate thyroid nodules classified as TIR3 (5). Using category 4c of the TIRADS by Kwak et al. (accuracy 73% [CI = 60–84]) and the intermediate suspicion pattern of the ATA guidelines (accuracy 76% [CI = 62–85]) as cutoffs, the authors reached diagnostic accuracies comparable with those obtained in our study. Ulisse et al. applied the TIRADS by Kwak et al. to 50 cytologically indeterminate nodules (23 TIR3A and 27 TIR3B) and also determined category 4c as the cutoff, suggesting an intermediate risk of malignancy for TIR3A combined with the TIRADS scores 4c and 5, and a high risk of malignancy for TIR3B combined with the same TIRADS scores (20). Similar results were reported by Chng et al. using the BTA-RCPath system (21) and including 144 Thy3F nodules. They found that nodules with Thy3F cytology results and a TIRADS score of 4c or 5 had a very high risk of malignancy. In that study, the presence of at least three suspicious US features, including solid component, marked hypoechogenicity, microlobulated or irregular margins, microcalcifications, and taller than wide shape, was linked to a high malignancy rate in lesions classified as follicular neoplasms on cytology. The main limitation of the TIRADS by Kwak et al. is, in our opinion, that each US feature is given the same weight.
In contrast, we introduced different weights for each suspicious US feature in the CUT score, which can be easily calculated on a smartphone.
Over the past 5–10 years, great efforts have been made in clinical translational research for thyroid nodules with indeterminate cytology. Molecular biomarkers and techniques, such as immunocytochemistry, gene mutation panels, gene or microRNA expression profiles, and sequencing techniques, have improved the characterization of thyroid nodules and the overall diagnostic accuracy. A recent review by de Koster et al. (22) suggested that the best rule-out tests for malignancy are the Afirma® gene expression classifier and fluorodeoxyglucose positron emission tomography; the most accurate rule-in test was BRAF mutation analysis. However, their limited global availability, high costs, and low probability of cost-effectiveness currently limit their widespread clinical implementation. The value of the scoring system presented here lies in its lower cost and wide availability, which may justify its lower accuracy compared with the tests listed above.
In conclusion, the CUT score is a potentially valid aid for the clinician in the management of indeterminate nodules with a follicular cytology. Since the application software is available on smartphones, it is easily accessible. Nevertheless, further studies are needed to validate the “CUT” score as a clinical–ultrasound evaluation system.
Footnotes
Acknowledgments
We wish to pay special recognition and tribute to Dr. Paolo Campanella, whose work and research were fundamental for the success of this project. Dr. Campanella's passion and dedication to the field of medicine were inspirational for those who had the fortune to work with him. His contributions as a doctor of medicine and as a human being will be deeply missed.
Author Disclosure Statement
No competing financial interests exist.
Funding Information
This research did not receive any specific grant from sponsors or funders.
