Abstract
Background:
Two types of risk-stratification systems, a qualitative grading system and a quantitative scoring system, have been used for the effective management of thyroid nodules on ultrasonography.
Summary:
The concept of the Thyroid Imaging Reporting and Data System (TI-RADS), based on the quantitative scoring system, was introduced in the late 2000s, and its format has been constantly evolving and developing. Understanding the role and appropriate utilization of risk-stratification systems including TI-RADS could facilitate the effective interpretation and communication of thyroid ultrasonography findings among referring physicians and cytopathologists.
Conclusion:
This comprehensive review provides a developmental overview of the use of risk-stratification systems in thyroid nodules, including TI-RADS proposals, and describes the future developmental direction of TI-RADS for the personalized and optimized management of thyroid nodules.
Introduction
Risk-stratification systems for thyroid nodules on ultrasonography (US) have been developed by many different authors and societies for the effective management of thyroid nodules (7–14). Since these systems are based on US features that have changed little over the past decade, the challenge lies in how to incorporate them into a system that is both practical and accurate. Accurately estimating the risk of malignancy on US could help to select those nodules with a high risk of cancer. Conversely, numerous fine-needle aspirations (FNAs) may also be avoided by identifying those nodules with an acceptably low incidence of malignancy.
Risk-stratification systems for thyroid nodules were initially introduced by classifying thyroid nodules that showed any suspicious US features as malignant (7–11). The qualities of a nodule are purely descriptive in such systems, known as qualitative grading systems. However, since the malignancy risk is not determined by a single US predictor, assessment using a combination of US features has been suggested as a better method of risk stratification. In this approach, the qualities of a nodule are assigned a number, and the malignancy risk of the thyroid nodule is scored by calculating the number of suspicious US features, categorizing the US patterns, or using web-based malignancy risk estimation (15–22). Such a system can therefore be considered a quantitative scoring system. Thyroid Imaging Reporting and Data Systems (TI-RADS), based on quantitative scoring, have recently been proposed to stratify the risk of malignancy in thyroid nodules and to standardize the reporting system in thyroid US. Since its introduction by Horvath et al. in 2009, TI-RADS has comprised a series of separate systems proposed by various authors and societies (15–21).
This comprehensive review provides a developmental overview of the various risk-stratification systems for thyroid nodules, including TI-RADS proposals, along with their advantages and limitations in terms of their application in clinical practice. Based on these historical backgrounds, the future developmental direction of TI-RADS for the personalized and optimized management of thyroid nodules is described.
History of Risk-Stratification Systems and the Development of TI-RADS
Qualitative grading system
Risk-stratification systems were initially developed as qualitative methods. In 2002, Kim et al. proposed sonographic criteria for the recommendation of FNA in a study of 155 non-palpable solid nodules (8). In this report, nodules with one of the suspicious features on US evaluation, including microcalcifications, a taller-than-wide shape, irregular borders, and marked hypoechogenicity, were considered as potentially malignant. The overall sensitivity and specificity using this system were 93.8% and 66.0%, respectively, for detecting thyroid cancers. In 2005, the Society of Radiologists in Ultrasound suggested US features with a threshold for FNA (11). US features such as microcalcifications, solidity, and coarse calcifications were strongly considered as criteria for FNA, depending on the nodule size. However, the diagnostic accuracy was relatively low in the prediction of malignancy (23,24). The overall sensitivity and specificity using this system were 35.5% and 54.3%, respectively. In 2008, Moon et al. proposed five US features using a multicenter retrospective study of 849 nodules as predictors of malignancy. US features including microcalcifications, macrocalcifications, a taller-than-wide shape, a spiculated margin, and marked hypoechogenicity were accepted, with a diagnostic sensitivity of 83.3%, and the Korean Society of Thyroid Radiology (KSThR) guideline was published in 2011 based on these study results (7,13). Two major guidelines for the management of thyroid nodules, the American Thyroid Association (ATA) in 2009 and the American Association of Clinical Endocrinologists, Associazione Medici Endocrinologi, and the European Thyroid Association (AACE/AME/ETA) in 2010, added hypoechogenicity and increased nodular vascularity to the list of suspicious US features in consideration of the clinical risk factors and nodule size for FNA (9,10). 
However, since the malignancy risk of a nodule with any suspicious US features can differ according to the combined US features, that is, the malignancy risk associated with microcalcification in hypoechoic or solid nodules is significantly higher than that in isoechoic or partially cystic nodules (19,25), the qualitative grading system using single features is unable to provide accurate information.
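The single-feature rule described above can be sketched in a few lines. This is an illustrative sketch only, not code from any of the cited studies: the feature identifiers are assumed names for the four features listed by Kim et al. (8), and the helper for sensitivity and specificity simply applies the standard definitions to confusion-matrix counts.

```python
# Suspicious US features listed by Kim et al. (8); identifiers are illustrative.
SUSPICIOUS_FEATURES = frozenset({
    "microcalcifications",
    "taller_than_wide",
    "irregular_borders",
    "marked_hypoechogenicity",
})


def is_potentially_malignant(findings):
    """Qualitative rule: any single suspicious feature flags the nodule."""
    return bool(SUSPICIOUS_FEATURES & set(findings))


def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    return tp / (tp + fn), tn / (tn + fp)


# The rule ignores context: both nodules below are flagged identically,
# although their malignancy risks are reported to differ (19,25).
hypoechoic_solid = {"microcalcifications", "hypoechoic", "solid"}
isoechoic_cystic = {"microcalcifications", "isoechoic", "partially_cystic"}
```

Because the rule returns the same answer for both example nodules, it cannot express the difference in risk that depends on the accompanying echogenicity and composition, which is the limitation noted in the text.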
The analysis of US findings from several combined-feature categories was suggested by a number of authors. In 2005, Reading et al. described a different approach based on the “classic pattern” of appearances of benign and malignant thyroid nodules (26). They presented eight classic patterns commonly encountered in benign and malignant nodules for use in deciding whether to perform FNA. In 2007, Ito et al. similarly reported a pattern-based classification system, which consisted of five classes from 1 to 5, with intermediate steps of 0.5 for classes 2–5 (27). Nodules classified with a score of ≥3.5 were considered as having a high risk of malignancy, with a high positive predictive value of 95.8%. Both studies by Reading et al. and Ito et al. showed a conceptual change in the risk-stratification system from the analysis of individual US features to a categorization of US patterns. However, these systems were descriptive, encompassed a primitive form of analysis of combined US features, and remained in essence qualitative in terms of evaluating the risk of malignancy.
Quantitative scoring system
Most recently, a quantitative scoring system was proposed, which estimates the risk of malignancy by scoring the combined US features using one of the following methods: categorizing the US patterns, calculating the number of suspicious US features, or summing the US risk scores (Table 1). The concept of TI-RADS, based on the quantitative scoring system, was initially proposed by Horvath et al. in 2009, and six categories based on 10 sonographic patterns were suggested, with an estimated malignancy risk in each category (15,28). In this system, the observer takes the constellation of features present in a nodule and matches them to a particular pattern to determine the malignancy risk. The advantage of pattern-based classification systems is that they can take differential weighting into account. However, this system was thought to be less practical, as there were too many US features to be converted into the 10 stereotypic US patterns, and it did not cover all of the patterns associated with the nodules. In 2009, Park et al. proposed a different approach for TI-RADS: five categories based on 12 US features (16). They generated a mathematical equation to predict the probability of malignancy and developed categories ranging from the lowest to highest probability of malignancy. However, as a limitation, it was difficult to assign every thyroid nodule into the proposed equation in clinical practice. Therefore, to overcome the complexity of the two proposed forms of TI-RADS, in 2011, Kwak et al. proposed five TI-RADS categories simply by using the number of suspicious US features (17). However, this system was criticized because each suspicious US feature was assigned the same weight, despite having a different probability of malignancy, and the system did not allow easy perception of a visualized US pattern in a nodule. Similar to Kwak et al., in 2013, Russ et al. proposed five TI-RADS categories by defining the number of suspicious US features (18). 
A high degree of stiffness with elastography on US was included as one of the suspicious US features. In contrast, in 2016, Na et al. and the KSThR proposed a four-tier risk-stratification system, again using the categorization of US patterns (5,19). They demonstrated that the malignancy risk or US features predicting malignancy may differ, according to the solidity and echogenicity of the thyroid nodules. Therefore, TI-RADS categories were suggested according to the US patterns by combining solidity, echogenicity, and suspicious US features. This new version of TI-RADS, based on solidity and echogenicity, was validated in a prospective multicenter study by Ha et al. (25).
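The count-based approach proposed by Kwak et al. (17) can be illustrated with a short sketch. The feature list, category labels, and cut-offs below are assumptions made for illustration and should be checked against the original proposal; the point of the sketch is only that every feature contributes exactly one count.

```python
# Features counted in this sketch; names are illustrative identifiers
# and the list is an assumption, not taken from the text above.
SUSPICIOUS = (
    "solid_component",
    "hypoechogenicity",
    "irregular_margin",
    "microcalcifications",
    "taller_than_wide",
)


def count_based_category(findings):
    """Map the number of suspicious features to an assumed category label."""
    n = sum(1 for feature in SUSPICIOUS if feature in findings)
    if n == 0:
        return "3"
    if n == 1:
        return "4a"
    if n == 2:
        return "4b"
    if n <= 4:
        return "4c"
    return "5"
```

Under this scheme a nodule with only a solid component and a nodule with only microcalcifications receive the same category, which reflects the equal-weighting criticism described above.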
Table 1 abbreviations: ACR, American College of Radiology; TI-RADS, Thyroid Imaging Reporting and Data System; US, ultrasound.
However, although the pattern-based TI-RADS can be easily applied in practice, some investigators suggested that a more segmented risk-stratification system was needed for the personalized and optimal management of thyroid nodules. In 2013, the KSThR suggested a risk-scoring system composed of 15 categories, with a malignancy risk ranging from 7.3% to 95.2% (20). Each suspicious US feature was assigned a different risk score according to its odds ratio for detecting thyroid malignancy, and the risk of malignancy was calculated simply as the sum of the scores. However, risk estimation using the 15 categories was too complex to be applied in practice, and nodules with the lowest score still had a 7.3% risk of malignancy, while those with a higher score needed to be followed. In 2015, Choi et al. proposed an advanced risk-scoring system, the so-called web-based malignancy risk-estimation system, by using a developmental and validation data set from a multicenter retrospective study (21). A 14-point risk-scoring system was developed using the sum of the individual scores, similar to the proposal by the KSThR in 2013, and the malignancy risk was more segmented, ranging from 3.8% to 97.4%. The authors proposed a freely available Web-based program to address the complexity of the segmented risk-scoring system, so that the risk could be obtained simply by clicking on each US feature on the Web site. However, clicking on each US feature on the Web site may be inconvenient for US practitioners. In addition, this Web-based, automatically calculated malignancy risk depends on the data used for the program. Thus, further studies based on different data sets are required to validate this program.
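The summed risk-score approach can be sketched as follows. The weights are hypothetical values chosen only to show unequal weighting; they are not the scores published by the KSThR (20) or by Choi et al. (21), and no mapping from total score to malignancy risk is attempted because the text gives only the overall ranges.

```python
# HYPOTHETICAL weights for illustration only; not the published scores.
HYPOTHETICAL_SCORES = {
    "solid": 2,
    "hypoechogenicity": 2,
    "microcalcifications": 3,
    "taller_than_wide": 3,
    "spiculated_margin": 4,
    "marked_hypoechogenicity": 4,
}


def risk_score(findings):
    """Sum the per-feature scores; findings without a score contribute 0."""
    return sum(HYPOTHETICAL_SCORES.get(feature, 0) for feature in set(findings))
```

With unequal weights, one strongly weighted feature can contribute as much as two weakly weighted ones, which a simple feature count cannot represent. The number of distinct totals also grows with the number of weights, which illustrates why such systems become harder to apply by hand.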
Risk-Stratification Systems in Current Guidelines
Risk-stratification systems based on US patterns have been incorporated into major guidelines. In 2014, the British Thyroid Association guideline proposed a five-tier risk-stratification system called the U score based on several combined-feature categories (12). However, this system was qualitative, and the risk of malignancy was not estimated in all of the categories. The 2014 National Comprehensive Cancer Network guideline also presented a conventional qualitative risk-stratification system, emphasizing the solidity of the nodule (14). On the other hand, the 2015 ATA and 2016 AACE/ACE/AME guidelines proposed five- and three-tier risk-stratification systems, respectively, with an estimated risk of malignancy (3,4). However, the US pattern of an isoechoic or hyperechoic solid nodule with any suspicious US features is not categorized in the ATA risk-stratification system (29). In 2017, the American College of Radiology (ACR) proposed a different approach for TI-RADS: the five-tier ACR TI-RADS based on a point-scoring system (30). The ACR was not in favor of the pattern-based approach used by the ATA because of the results of a study by Yoon et al., who showed that the ATA guidelines were unable to classify 3.4% of 1293 nodules, of which 18.2% were malignant. Instead, 0–3 points were assigned based on one feature in each of five basic US finding categories, with more points assigned for a higher malignancy risk associated with that feature (29,30). The ACR TI-RADS was recently validated by a multi-institutional analysis using 3422 nodules (31).
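The point-scoring idea behind the ACR TI-RADS can be sketched as a lookup followed by a sum. The sketch takes one feature per category, as described above. The individual point values and tier cut-offs are not given in this review; those shown reflect the published ACR chart as understood here and should be verified against the original white paper (30). The identifiers are illustrative.

```python
# Point values per category; to be verified against the ACR chart (30).
ACR_POINTS = {
    "composition": {"cystic": 0, "spongiform": 0, "mixed": 1, "solid": 2},
    "echogenicity": {"anechoic": 0, "hyperechoic": 1, "isoechoic": 1,
                     "hypoechoic": 2, "very_hypoechoic": 3},
    "shape": {"wider_than_tall": 0, "taller_than_wide": 3},
    "margin": {"smooth": 0, "ill_defined": 0,
               "lobulated_or_irregular": 2, "extrathyroidal_extension": 3},
    "echogenic_foci": {"none_or_comet_tail": 0, "macrocalcifications": 1,
                       "peripheral_calcifications": 2, "punctate_foci": 3},
}


def acr_points(nodule):
    """Sum the points for the one feature chosen in each of the five categories."""
    return sum(ACR_POINTS[category][nodule[category]] for category in ACR_POINTS)


def acr_level(points):
    """Map total points to a tier; cut-offs are assumptions to be verified."""
    if points >= 7:
        return "TR5"
    if points >= 4:
        return "TR4"
    if points == 3:
        return "TR3"
    if points == 2:
        return "TR2"
    return "TR1"  # 0 points; a total of 1 is grouped here as an assumption


example = {
    "composition": "solid",
    "echogenicity": "hypoechoic",
    "shape": "taller_than_wide",
    "margin": "lobulated_or_irregular",
    "echogenic_foci": "punctate_foci",
}
```

Because every nodule receives a value in every category, every combination of features yields a total and therefore a tier, in contrast to a pattern-based scheme in which some combinations may match no predefined pattern.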
Future Developmental Direction of TI-RADS
TI-RADS is currently evolving from the initial proposal by Horvath et al. in 2009 to the recent Web-based risk-scoring system by Choi et al. in 2015. However, despite these efforts, none of the TI-RADS classifications have been widely adopted worldwide, and there are still conflicting recommendations from several societies. Therefore, there have been many attempts to develop a practical, standardized classification for the risk stratification of thyroid nodules and to provide consistent management in practice. Standardization of the US lexicon, along with efforts to provide an evidence-based risk-stratification system, is necessary for the proper management of thyroid nodules (22,32). It is important to validate the applicability of each risk-stratification system in different clinical settings and to compare the diagnostic performances of various risk-stratification systems.
The management paradigm for thyroid nodules has recently moved toward a conservative approach as an alternative to FNA (3,4). TI-RADS also tends to be geared toward more personalized management, along with minimizing unnecessary FNAs. In this respect, the current risk-stratification systems using three to five categories may need more segmentation according to the combination of US characteristics. However, the more segmented a risk-scoring system becomes, the less clinically feasible it may be because of its complexity. Thyroid computer-aided diagnosis (CAD) using artificial intelligence might be one option to solve this problem. In 2016, Chang et al. reported that the use of thyroid CAD to differentiate malignant lesions from benign ones showed accuracy similar to that obtained via visual inspection performed by radiologists (33). Also in 2016, Choi et al. integrated the thyroid CAD system into a US machine (S-Detect for Thyroid) (34). S-Detect demonstrated a sensitivity for thyroid cancer detection similar to that obtained by an experienced radiologist. However, the radiologist achieved a greater specificity. The combination of thyroid CAD with a risk-scoring system, implemented on a commercial US machine, could decrease operator dependency in image interpretation and assist with real-time interpretation for assessing the risk of malignancy and FNA decisions in patients with thyroid nodules.
Combined categorical reporting systems between clinical and US features or between cytology and US features should be evaluated to gain a proper indication for FNA. Ianni et al. recently reported a new cancer risk score, the so-called CUT score, derived from clinical and US features, along with the five-tiered TI-RADS score for the preoperative assessment of thyroid nodules (35). In this study, the combined score represented the cytologic result of the FNA. Lee et al. demonstrated that the relative risk ratios of the initial FNA category of “atypia of undetermined significance” and thyroid US categories of US 3, US 4, and US 5 revealed a high malignancy rate (36). Although there are still only a limited number of studies in the literature, these findings might be useful for suggesting indications for initial and repeat US-guided FNA, which may ultimately lead to the effective management of thyroid nodules. The various diagnostic and prognostic molecular markers of thyroid cancer can also be applied to categorical systems to achieve more personalized management of patients (37). In the future, development of the CAD system and the combination of US features with clinical and cytopathologic/genetic information could enhance the diagnostic performance and minimize unnecessary biopsies and/or surgeries.
Conclusion
The proper use of risk-stratification systems, including TI-RADS, provides numerous advantages: it facilitates the standardized assessment of thyroid nodules, supports consistent management recommendations, and enables effective communication between referring physicians and cytopathologists. An understanding of the current status and future developmental direction of TI-RADS will be of great help to physicians in the management of thyroid nodules.
Footnotes
Author Disclosure Statement
No competing financial interests exist.
