Abstract
OBJECTIVE:
This study was conducted to investigate the diagnostic performance of Virtual Touch Tissue Imaging and Quantification (VTIQ), combined with the American College of Radiology Thyroid Imaging Reporting and Data System (ACR TI-RADS) in differentiating malignant and benign thyroid nodules.
METHODS:
A total of 130 thyroid nodules in 128 patients were included. The diagnostic performance of conventional ultrasound (US), VTIQ, and the combination of the two techniques was calculated and compared in terms of the area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, and accuracy.
RESULTS:
The sensitivity and specificity of the ACR TI-RADS were 98.6% (72/73) and 24.6% (14/57), respectively. Inter-observer agreement for the ACR TI-RADS categories of thyroid nodules was strong (all ICCs > 0.60). With an optimal cutoff value of 2.46 m/s, the sensitivity and specificity of the minimal shear wave velocity (SWVmin) were 87.7% (64/73) and 70.2% (40/57), respectively. When this value was applied to downgrade or upgrade the ACR TI-RADS category, the specificity increased significantly from 24.6% (14/57) to 47.4% (27/57; P < 0.05), while the sensitivity remained at 98.6% (72/73).
CONCLUSIONS:
VTIQ combined with ACR TI-RADS could improve the specificity of the differential diagnosis of thyroid nodules without a loss of sensitivity.
Introduction
Thyroid nodules are a common finding in clinical practice. In the adult population, the incidence of thyroid nodules is approximately 33–68%, of which about 5–15% are malignant [1]. Epidemiological studies in the United States have indicated that the incidence of thyroid carcinoma has been increasing at an average annual rate of 3.6% and that the most common histological type is papillary thyroid carcinoma [2]. The detection rate of thyroid nodules can reach 19–67% with high-resolution ultrasound (US) [3], which is the preferred non-invasive imaging method for differentiating benign from malignant thyroid nodules. However, the accuracy of distinguishing benign from malignant lesions based on a single US feature is relatively low [4]. In 1992, the Breast Imaging Reporting and Data System (BI-RADS) was established by the American College of Radiology (ACR) to describe mammography and US findings, and their correlation with malignancy, in a standard form [5]. Based on BI-RADS, the Thyroid Imaging Reporting and Data System (TI-RADS) was first proposed by Horvath et al. in 2009 [6]. Subsequently, Park et al. [7] proposed an equation with which to classify thyroid nodules into five types based on 12 features, such as shape and halo. In 2011, a comparatively simple TI-RADS was devised by Kwak et al. [8] on the basis of five malignant characteristics. Recently, the ACR developed a new TI-RADS based on five categories of US features: composition, echogenicity, shape, margin, and echogenic foci. Unlike the previous classifications, the ACR TI-RADS assigns points to each US feature according to its malignancy risk. The TI-RADS level is then determined by adding the point values from all the categories [9].
US findings overlap between benign and malignant thyroid nodules, and inter-observer consistency is poor [10, 11]. To further improve diagnostic performance for thyroid nodules, elastography has been introduced into clinical practice. As a noninvasive imaging modality, elastography was originally proposed by Ophir et al. [12] in 1991 to evaluate tissue stiffness and has been widely used in recent years. US elastography mainly comprises strain elastography and shear wave elastography (SWE), which can qualitatively and quantitatively assess tissue stiffness on the basis of conventional US [13]. Strain elastography, also known as real-time elastography (RTE), measures tissue stiffness qualitatively. It requires external manual compression and displays the elastic deformation based on the displacement of tissue. However, strain elastography has proven to be highly operator-dependent, susceptible to external pressure, and poorly reproducible [14]. In the last few years, SWE has been widely adopted to provide a quantitative measurement of tissue stiffness. SWE can be divided into two types: point SWE and two-dimensional SWE. Compared with point SWE, Virtual Touch Tissue Imaging and Quantification (VTIQ; Siemens Medical Solutions, Mountain View, CA), a type of two-dimensional SWE, generates a color-coded 2D map of the shear wave velocity (SWV) distribution and uses a smaller region of interest (ROI). Thus, VTIQ might provide more precise stiffness information.
To our knowledge, many previous studies have obtained promising results in discriminating thyroid nodules based on the combination of TI-RADS and various elastography techniques. However, to date, no study has explored the diagnostic performance of VTIQ combined with the newest 2017 ACR TI-RADS in differentiating benign from malignant thyroid nodules. Therefore, the main objective of this study was to investigate the diagnostic performance of VTIQ combined with the ACR TI-RADS for thyroid nodules.
Materials and methods
This retrospective study was approved by the local Ethics Committee and the approval number was KY2018-296. The need to obtain written, informed consent was waived due to the retrospective character of the study.
Patients
From December 2016 to March 2018, a total of 402 consecutive patients with 416 thyroid nodules underwent conventional US and VTIQ examinations before fine needle aspiration (FNA) or surgery. The inclusion criteria were as follows: (1) sufficient thyroid tissue around the nodule at the same depth; and (2) cytological or pathological results were obtained by FNA or surgery. For patients with multiple nodules, we selected the nodule with the most suspicious US features or the largest nodule when there were no suspicious US features. The exclusion criteria were as follows: (1) purely cystic nodules or mixed nodules (cystic portion > 75%) (n = 23); (2) incompleteness of clinical data and images (n = 216); (3) a history of invasive therapy, such as ablation (n = 20); and (4) non-diagnostic cytological findings (n = 27). Finally, 128 patients with 130 thyroid nodules were enrolled in this study (Fig. 1). A single nodule in 126 patients, and two nodules in two patients, were selected for evaluation.

Flowchart of thyroid nodules selection.
Conventional US and VTIQ examinations were performed on the same Siemens S3000 US scanner (Siemens Medical Solutions, Mountain View, CA, USA) equipped with an 18L6 linear array transducer (frequency range, 7–17 MHz) and a 9L4 linear array transducer (frequency range, 4–9 MHz). Both examinations were performed by one of four radiologists who had more than five years of experience in thyroid US and three years in thyroid elastography. The patient was asked to lie on the bed in a supine position with slight dorsal flexion of the head. After carefully scanning the thyroid and adjacent tissue transversely and longitudinally, the conventional US images were manually saved on the machine hard disk for further analyses.
After conventional US, VTIQ was performed by the same radiologist in the longitudinal section of the nodule. The transducer was gently placed perpendicularly to the skin with sufficient couplant. While the images were obtained, patients were asked to hold their breath and not to swallow for a few seconds. VTIQ quality mode was first acquired, in which green represented high quality, yellow intermediate quality, and red low quality. Afterward, the image mode was switched to VTIQ velocity mode, in which red, yellow, and blue represented SWV from high to low, respectively. The SWV values were calculated in meters per second (m/s) and ranged from 0.5 to 10 m/s. SWV was adjusted slowly from high to low to obtain the final VTIQ velocity mode image with the proviso that the surrounding tissue of the nodule displayed a uniform light blue or green color and the nodule displayed a red or yellow color. When placing SWV-ROIs (fixed size: 1×1 mm), cystic and calcified areas on conventional US, and areas corresponding to low quality on the quality mode in the target nodule, were avoided. Seven SWV-ROIs were placed in the target nodule based on the previous literature [14, 15]. When the velocity image presented a homogeneous distribution of SWV, all seven SWV-ROIs were placed randomly; otherwise, two SWV-ROIs were placed in the hardest and softest parts, respectively, and the rest were placed at random. The maximal SWV value (SWVmax) and minimal SWV value (SWVmin) were chosen and recorded, and the mean SWV value (SWVmean) and median SWV value (SWVmedian) were computed. The VTIQ images were also saved for subsequent analyses.
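The derivation of the four summary values from the seven SWV-ROI readings can be sketched as follows. This is a minimal illustration only: the function name `summarize_swv` and the seven readings are hypothetical and are not study data.

```python
from statistics import mean, median


def summarize_swv(roi_values):
    """Return SWVmax, SWVmin, SWVmean and SWVmedian (m/s) for one nodule."""
    if len(roi_values) != 7:
        # The protocol described above places seven SWV-ROIs per nodule.
        raise ValueError("expected seven SWV-ROI readings")
    return {
        "SWVmax": max(roi_values),
        "SWVmin": min(roi_values),
        "SWVmean": mean(roi_values),
        "SWVmedian": median(roi_values),
    }


# Hypothetical readings in m/s, for illustration only (not study data).
example_readings = [2.35, 2.80, 3.10, 3.40, 3.65, 3.90, 4.22]
summary = summarize_swv(example_readings)
print(summary)
```

With an odd number of ROIs the median is simply the fourth-ranked reading, so no interpolation is involved.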
Image interpretation and analysis
The conventional US features of thyroid nodules include composition, echogenicity, shape, margin, and echogenic foci. These features are divided into categories in the ACR lexicon. Nodule composition is categorized as cystic or almost completely cystic, spongiform, mixed cystic and solid, and solid or almost completely solid. Echogenicity is categorized as anechoic, hyperechoic or isoechoic, hypoechoic, and very hypoechoic. Shape is categorized as wider-than-tall or taller-than-wide. Margin is categorized as smooth, ill-defined, lobulated or irregular, and extra-thyroidal extension. Echogenic foci are categorized as none or large comet-tail artifacts, macrocalcifications, peripheral calcifications, and punctate echogenic foci. In the ACR TI-RADS, every US feature is awarded 0–3 points according to its association with malignancy. When evaluating a nodule, a reviewer selected one feature from each of the first four categories (composition, echogenicity, shape, and margin) and all applicable features from the echogenic foci category [9]. Points from all categories were added to determine the TI-RADS level: TR1, 0 points; TR2, 2 points; TR3, 3 points; TR4, 4–6 points; and TR5, 7 or more points. Recommendations for FNA or follow-up were based on the ACR TI-RADS level and the size of the nodule.
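The point-summing scheme described above can be sketched in a few lines. Two assumptions should be noted: the per-feature point values below are taken from the ACR TI-RADS white paper cited as [9], since the text here gives only the 0–3 range, and a total of 1 point, which the text does not cover, is mapped to TR1. The function names are hypothetical.

```python
# Per-feature points: assumed from the ACR TI-RADS white paper [9]; verify before use.
POINTS = {
    "composition": {
        "cystic or almost completely cystic": 0,
        "spongiform": 0,
        "mixed cystic and solid": 1,
        "solid or almost completely solid": 2,
    },
    "echogenicity": {
        "anechoic": 0,
        "hyperechoic or isoechoic": 1,
        "hypoechoic": 2,
        "very hypoechoic": 3,
    },
    "shape": {"wider-than-tall": 0, "taller-than-wide": 3},
    "margin": {
        "smooth": 0,
        "ill-defined": 0,
        "lobulated or irregular": 2,
        "extra-thyroidal extension": 3,
    },
}
FOCI_POINTS = {
    "none or large comet-tail artifacts": 0,
    "macrocalcifications": 1,
    "peripheral calcifications": 2,
    "punctate echogenic foci": 3,
}


def tirads_points(composition, echogenicity, shape, margin, foci=()):
    """Sum one feature per category plus every echogenic-foci feature present."""
    total = (
        POINTS["composition"][composition]
        + POINTS["echogenicity"][echogenicity]
        + POINTS["shape"][shape]
        + POINTS["margin"][margin]
    )
    return total + sum(FOCI_POINTS[f] for f in foci)


def tirads_level(points):
    """Map a point total to a TR level using the thresholds given in the text."""
    if points >= 7:
        return 5
    if points >= 4:
        return 4
    if points == 3:
        return 3
    if points == 2:
        return 2
    return 1  # 0 points per the text; a total of 1 is also mapped here (assumption)


# The nodule described in the Fig. 3 caption: solid, hypoechoic, wider-than-tall,
# smooth, with macrocalcifications.
fig3_points = tirads_points(
    "solid or almost completely solid",
    "hypoechoic",
    "wider-than-tall",
    "smooth",
    foci=["macrocalcifications"],
)
fig3_level = tirads_level(fig3_points)
```

For the Fig. 3 nodule this yields 5 points and TR4, which agrees with the total score and category reported in that caption.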
Inter-observer agreement of ACR TI-RADS
In order to test whether operator experience affected the results of TI-RADS classifications, we conducted an inter-observer agreement test. Retrospective assessment of US images was conducted by two experienced radiologists (six years of experience in thyroid US) and two inexperienced radiologists (each with two years of experience in thyroid US). All radiologists were blinded to the clinical data, pathologic findings, and each other’s interpretation.
Statistical analysis
The reference standards were cytologic results from FNA or histopathologic results from surgery. Categorical data were expressed as frequency tables while continuous data were expressed as means±standard deviations. Categorical variables were compared using the χ2 test or Fisher’s exact test and continuous variables were compared by independent t test. A receiver operating characteristic curve was constructed to acquire the area under the curve (AUC), cut-off values, sensitivity, specificity, accuracy, positive predictive value (PPV), and negative predictive value (NPV). For statistical analysis, nodules with TR2 or TR3 were considered benign and nodules with TR4 or TR5 were considered malignant.
The intraclass correlation coefficient (ICC) was used to evaluate the inter-observer agreement with regard to ACR TI-RADS. The ICC values were as follows: 0–0.20, poor agreement; 0.20–0.40, fair agreement; 0.40–0.60, moderate agreement; 0.60–0.80, strong agreement; >0.80, excellent agreement.
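The agreement bands listed above can be expressed as a small lookup. Because adjacent bands in the text share their boundary values, this sketch assumes that each upper bound is inclusive (so an ICC of exactly 0.80 counts as strong); the function name `icc_agreement` is hypothetical.

```python
def icc_agreement(icc):
    """Return the agreement label for an ICC value between 0 and 1."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("this sketch only covers ICC values from 0 to 1")
    # Upper bounds are treated as inclusive (assumption, see lead-in).
    bands = ((0.20, "poor"), (0.40, "fair"), (0.60, "moderate"), (0.80, "strong"))
    for upper, label in bands:
        if icc <= upper:
            return label
    return "excellent"


print(icc_agreement(0.65))
```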
All statistical analyses were performed with the SPSS 13.0 software (IBM Corporation, Armonk, NY). A two-tailed p value less than 0.05 was considered statistically significant.
Results
Demographic and pathologic characteristics
The 128 patients included 27 men and 101 women. Their mean age was 47.8±10.5 years (range, 17–68 years). The average size of the thyroid nodules was 14.6±10.0 mm (range, 5.0–49.0 mm). Among the 130 nodules, 57 were benign and 73 were malignant. Benign nodules included nodular goiter (n = 52), adenoma (n = 3), subacute thyroiditis (n = 1), and IgG4-related sclerosing adenosis (n = 1). All benign nodules were confirmed by pathological results from surgery. Malignant nodules included papillary thyroid carcinoma (PTC) (n = 71), medullary thyroid carcinoma (MTC) (n = 1), and follicular thyroid carcinoma (FTC) (n = 1). Four of the 73 malignant nodules were confirmed by cytological results from FNA, and the remaining 69 were confirmed by pathological results from surgery. The basic demographic data are summarized in Table 1. There was no significant difference in gender between patients with benign and malignant nodules (P = 0.094). The average size of the benign nodules was 17.8±10.2 mm, which was significantly larger than that of the malignant nodules (12.2±9.0 mm; P = 0.001). Patients with benign nodules were significantly older than patients with malignant nodules (P = 0.001).
Basic demographic characteristics of patients and nodules
Data are numbers of nodules unless otherwise indicated. †Data are mean±standard deviation with ranges in parentheses. *Indicates significant difference.
Table 2 lists a summary of the US features of benign and malignant thyroid nodules. US features, such as solid or almost completely solid, hypoechoic, taller-than-wide, lobulated or irregular, and extra-thyroidal extension, were more commonly found in malignant nodules (all P < 0.05). According to the TI-RADS guidelines [9], echogenic foci were subdivided into four categories and all these categories were applied to evaluate a nodule. Therefore, the differences between benign and malignant nodules were calculated separately in the four categories. While a statistically significant difference was seen in punctate echogenic foci (P = 0.001), there were no significant differences in the other three categories (all P > 0.05).
Summary of conventional US features of thyroid nodules
Data are numbers of nodules. *Indicates significant difference.
Table 3 shows the malignancy rates of TI-RADS categories 2, 3, 4, and 5, which were 0% (0 of 3), 8.33% (1 of 12), 17.24% (5 of 29), and 77.91% (67 of 86), respectively. The risk of malignancy increased progressively from TR2 to TR5.
Malignancy rates by TI-RADS category
Data are numbers of nodules.
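The sensitivity, specificity, and accuracy of ACR TI-RADS alone follow directly from these per-category counts once TR4 and TR5 are treated as test-positive, as stated in the statistical analysis. The short sketch below reproduces that arithmetic; the function name `diagnostic_metrics` is hypothetical.

```python
# (malignant nodules, total nodules) per ACR TI-RADS level, as reported in the text.
counts = {2: (0, 3), 3: (1, 12), 4: (5, 29), 5: (67, 86)}


def diagnostic_metrics(counts, positive_levels=(4, 5)):
    """Build the 2x2 table from per-level counts and derive summary metrics."""
    tp = fp = fn = tn = 0
    for level, (malignant, total) in counts.items():
        benign = total - malignant
        if level in positive_levels:
            tp += malignant
            fp += benign
        else:
            fn += malignant
            tn += benign
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


metrics = diagnostic_metrics(counts)
for name in ("sensitivity", "specificity", "accuracy"):
    print(f"{name}: {metrics[name]:.1%}")
```

The counts give 72/73, 14/57, and 86/130, that is, the 98.6% sensitivity, 24.6% specificity, and 66.2% accuracy reported for ACR TI-RADS alone.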
Among all four radiologists, a strong agreement was obtained for composition, margin, none or large comet-tail artifacts, macrocalcifications, and punctate echogenic foci, whereas there was a moderate agreement for echogenicity, shape, and peripheral calcifications (Table 4). For TI-RADS categories of thyroid nodules, all ICCs were strong (Table 5).
Inter-observer agreement for US features among all four radiologists
Inter-observer agreement for ACR TI-RADS categories
A and B: two experienced radiologists. C and D: two inexperienced radiologists.
The diagnostic performance of VTIQ in the discrimination of benign and malignant thyroid nodules is shown in Table 6. The optimal cutoff values of SWVmax, SWVmin, SWVmean, and SWVmedian were 3.17 m/s, 2.46 m/s, 2.80 m/s, and 2.76 m/s, respectively, and the corresponding AUCs were 0.766, 0.855, 0.825, and 0.826 (Fig. 2). Among these parameters, SWVmax and SWVmean showed the highest sensitivity (93.2%), SWVmin the highest specificity (70.2%), SWVmin and SWVmean the highest accuracy (80.0%), and SWVmin the highest AUC (0.855).

Receiver operating characteristic curves for VTIQ measurements.
Diagnostic performance of VTIQ for predicting thyroid malignancy
Data are percentages with numerators and denominators in parentheses; AUC numbers in parentheses are 95% CIs. SEN = sensitivity; SPE = specificity; PPV = positive predictive value; NPV = negative predictive value; ACC = accuracy; AUC = area under the curve.
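The text does not state how the optimal cutoff values were selected from the ROC curves. One common choice, shown here purely as an assumption, is the cutoff that maximizes the Youden index (sensitivity + specificity − 1). The sketch uses synthetic SWV values rather than study data, calls a nodule positive when its value is at or above the cutoff, and uses the hypothetical function name `youden_cutoff`.

```python
def youden_cutoff(values, labels):
    """Return (cutoff, sensitivity, specificity) maximizing the Youden index.

    labels: 1 = malignant, 0 = benign. A value >= cutoff is called positive.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    best = None
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y == 1 and v >= cutoff)
        tn = sum(1 for v, y in zip(values, labels) if y == 0 and v < cutoff)
        # Integer score proportional to sensitivity + specificity, which avoids
        # floating-point ties; the lowest cutoff wins if scores are equal.
        score = tp * neg + tn * pos
        if best is None or score > best[0]:
            best = (score, cutoff, tp / pos, tn / neg)
    return best[1], best[2], best[3]


# Synthetic values in m/s, for illustration only (not study data).
swv = [1.8, 2.0, 2.2, 2.3, 3.1, 2.1, 2.5, 2.8, 3.0, 3.3]
malignant = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
cutoff, sensitivity, specificity = youden_cutoff(swv, malignant)
print(cutoff, sensitivity, specificity)
```

Other criteria, such as the point closest to the top-left corner of the ROC plot, can select a different cutoff from the same data.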
TR4 nodules with an SWVmin of less than 2.46 m/s were downgraded to TR3, and TR3 nodules with an SWVmin of more than 2.46 m/s were upgraded to TR4. With this rule, 54.2% (13/24) of benign TR4 lesions were successfully downgraded to TR3 (Table 7 and Fig. 3). The specificity increased significantly from 24.6% (14/57) to 47.4% (27/57; P < 0.05), and the accuracy rose significantly from 66.2% (86/130) to 76.2% (99/130; Table 8; P < 0.05). There was no change in sensitivity, which remained at 98.6% (72/73).
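The reclassification rule can be written as a short function. This sketch assumes that only TR3 and TR4 nodules are ever moved and that a nodule whose SWVmin equals the cutoff exactly keeps its category, since the text does not address either case; the function name `adjust_tirads` is hypothetical.

```python
SWV_MIN_CUTOFF = 2.46  # m/s, the optimal SWVmin cutoff reported in the text


def adjust_tirads(level, swv_min, cutoff=SWV_MIN_CUTOFF):
    """Downgrade soft TR4 nodules and upgrade stiff TR3 nodules by one level."""
    if level == 4 and swv_min < cutoff:
        return 3
    if level == 3 and swv_min > cutoff:
        return 4
    return level  # other levels, or SWVmin equal to the cutoff: unchanged (assumption)


# The Fig. 3 nodule: TR4 on conventional US with an SWVmin of 2.35 m/s.
print(adjust_tirads(4, 2.35))
```

The Fig. 3 nodule moves from TR4 to TR3 under this rule. Adding the 13 downgraded benign TR4 nodules to the 14 benign nodules already in TR2 or TR3 gives 27/57, the 47.4% specificity reported for the combined method.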

US images from a 54-year-old woman with a nodular goiter with calcification. a. On conventional US imaging, a 26.6-mm nodule in the right lobe of the thyroid appears almost completely solid, hypoechoic, wider-than-tall, and smooth, with macrocalcifications. b. The VTIQ quality mode shows high-quality shear wave propagation. c. The VTIQ velocity mode shows SWV values in the nodule ranging from 2.35 m/s to 4.22 m/s; the minimum of 2.35 m/s was less than the SWVmin cut-off value of 2.46 m/s. The total score of the nodule was 5. The nodule was classified as TR4 with conventional US, whereas it was successfully downgraded to TR3 with the combination of VTIQ and TI-RADS.
TI-RADS categories assigned by TI-RADS alone and in combination with VTIQ
Data are numbers with percentages in parentheses.
Comparison of the diagnostic performance of TI-RADS alone and in combination with VTIQ
Data are percentages with numerators and denominators in parentheses. *Indicates significant difference. SEN = sensitivity; SPE = specificity; ACC = accuracy.
Discussion
The present study demonstrated that VTIQ combined with ACR TI-RADS could improve the specificity for the differential diagnosis of thyroid nodules without a loss of sensitivity, compared with ACR TI-RADS applied alone. With the application of high-resolution US, the detection rate for thyroid nodules has increased substantially [16]. The most important issue is not diagnosis alone but determining which nodules are suitable for FNA or surgery, so that unnecessary FNA of benign nodules can be avoided. Although conventional US has proved to have high diagnostic accuracy [3], there is some overlap in US findings between benign and malignant thyroid nodules [17]. To obtain more diagnostic information, the development of additional noninvasive methods, such as elastography, has been promoted. The World Federation of Ultrasound in Medicine and Biology (WFUMB) guidelines point out that elastography is a recommended supplementary tool for conventional US and may be useful in selecting thyroid nodules for surgery [18]. Both VTIQ and Virtual Touch Tissue Quantification (VTQ) are quantitative elastography techniques based on Acoustic Radiation Force Impulse (ARFI). Yang et al. [19] compared the diagnostic efficiency of VTIQ and VTQ in identifying benign and malignant thyroid nodules by using the median and mean of SWV. They concluded that VTIQ presented better diagnostic performance than VTQ when using the SWVmedian. The difference might be related to the following factors. First, the size of the ROI in VTIQ is adjustable, and the smallest size is 1 mm×1 mm, which is smaller than that in VTQ. Second, SWV values range from 0.5 to 10 m/s in VTIQ but from 0 to 8.4 m/s in VTQ. Third, VTIQ has four modes: quality mode, velocity mode, displacement mode, and time mode.
The quality mode displays shear wave quality with different colors and the velocity mode generates a 2D visualization of the SWV distribution with various colors, whereas these are absent in VTQ. However, Yoon et al. [20] reported that adding elastography did not improve the diagnostic performance of conventional US in discriminating thyroid nodules, which differs from our results. This difference may have two causes: (1) Yoon et al. used a Toshiba Aplio 500 US scanner with strain elastography and SWE, whereas our study used a Siemens S3000 US scanner with VTIQ; and (2) benign nodules outnumbered malignant nodules in the study by Yoon et al., which could represent the normal distribution of thyroid disease in the population, whereas the proportion of malignant nodules was higher in our study.
For the clinical use of SWE, Xu et al. [21] proposed more specific guidelines and recommendations, which can help clinicians manage thyroid nodules with the assistance of SWE. With respect to US machines, the 2D-SWE imaging techniques most frequently used in clinical practice include VTIQ (Siemens Medical Solutions, Mountain View, CA, USA), SuperSonic Imagine (SSI; Aix en Provence, France), and Toshiba SWE (T-SWE; Toshiba Medical System, Tochigi, Japan). Several studies have compared the diagnostic performance of different 2D-SWE imaging techniques. He et al. [22] compared the diagnostic performance of VTIQ and T-SWE in 75 thyroid nodules. Their results showed that T-SWE with SWVmax performed better than VTIQ with SWVmax in terms of AUC, sensitivity, and NPV. Thus, the selection of SWVmax should be recommended in T-SWE, whereas it should be avoided in VTIQ. Zhou et al. [23] concluded that VTIQ was a capable and reproducible tool with which to differentiate thyroid nodules, and SWVmean showed the highest AUC of 0.820 with a cutoff value of 2.60 m/s. Meanwhile, when using SSI, AUC values ranged from 0.795 to 0.840, which confirmed that SSI and VTIQ had similar diagnostic utility [24–26]. Therefore, appropriate US machines should be selected based on clinical requirements.
In the present study, compared with TI-RADS applied alone, VTIQ combined with TI-RADS increased specificity from 24.6% to 47.4% without a loss of sensitivity. Mao et al. [27] obtained similar results: for thyroid nodules referred for biopsy, the specificity increased from 20.1% to 47.0% and the sensitivity remained at 98.4% for the combination of VTIQ and TI-RADS. Liu et al. [24] reported that SWE combined with conventional US achieved a higher specificity (73.9%) but a lower sensitivity (87.1%) than our study. For thyroid nodules, the importance of US is to screen out malignant lesions, and higher sensitivity means fewer missed thyroid cancers. There are some differences between our study and previous studies. Many previous studies have obtained promising results in distinguishing thyroid nodules based on a VTIQ imaging technique or the combination of TI-RADS and various elastography techniques. However, to date, our study is the first to use VTIQ combined with the newest 2017 ACR TI-RADS. In 2009, Horvath et al. [6] first proposed TI-RADS on the basis of BI-RADS. Since then, a large number of guidelines and TI-RADS versions have emerged [3, 28]. Nevertheless, none has been widely adopted. Unlike the previous classifications, ACR TI-RADS assigns points to each US characteristic according to its malignancy risk, and the points from all categories are then added to determine the TI-RADS level. It is relatively simple and convenient and is thus more easily accepted by radiologists with different levels of experience.
According to the size of the nodules and the TI-RADS categories, a specific recommendation for biopsy or follow-up is given, and the TI-RADS categories are based on the US features. Therefore, variability in interpreting US features can affect management recommendations. Hoang et al. [29] assessed inter-observer variability in assigning features in the ACR TI-RADS. They reported that agreement was fair to moderate for all features except shape (κ= 0.61) and macrocalcifications (κ= 0.73), which had substantial agreement. In contrast, the present study showed that a strong agreement was obtained for composition, margin, none or large comet-tail artifacts, macrocalcifications, and punctate echogenic foci, whereas there was a moderate agreement for echogenicity, shape, and peripheral calcifications. The differences can be explained by the number and experience of the radiologists who assessed the thyroid nodules. There were eight radiologists in the study by Hoang et al., of whom six were from private facilities, with a median of 13 years in practice, and two were from different academic centers, with 20 and 32 years in practice. Our study included four radiologists, of whom two had six years of experience in thyroid US and two had two years of experience in thyroid US. In the present study, echogenicity had the lowest agreement among all the features. There are several potential sources of the greatest variability regarding echogenicity. First, according to the ACR TI-RADS, echogenicity is subdivided into four categories: cystic or almost completely cystic, spongiform, mixed cystic and solid, and solid or almost completely solid. However, there is no specific definition of the proportion of cystic and solid with regard to cystic or almost completely cystic and solid or almost completely solid in the ACR TI-RADS lexicon [30]. 
Second, unlike on-site interpretation, off-site interpretation of the US images might confuse the reviewers’ judgment of the US characteristics of each thyroid nodule. Our results revealed that all ICCs were strong for the ACR TI-RADS categories of thyroid nodules, regardless of whether the radiologists had more or less experience. In other words, the ACR TI-RADS categories of thyroid nodules were not affected by the experience of the radiologists.
Several limitations of our study should be pointed out. First, the retrospective nature of the study may reduce the reliability of the results; prospective studies are therefore needed to confirm their validity. Second, our hospital is a tertiary referral hospital and all patients enrolled in this study were scheduled for FNA or surgery for thyroid nodules with suspicious US features or compression symptoms. Most patients refused FNA or follow-up and received surgery as the preferred treatment. The proportion of malignant nodules was therefore higher, which does not represent the normal distribution of thyroid disease in the population. Third, this was a single-center, small-sample study that included only 130 thyroid nodules, and almost all malignant nodules were PTCs, except for one medullary carcinoma and one follicular carcinoma. Most benign nodules were nodular goiters, and other pathological types were not included. Therefore, a multicenter, large-sample study is needed in the near future. Fourth, our study only conducted an inter-observer agreement test, while intra-observer variability was not evaluated. Fifth, four of the 73 malignant nodules were confirmed by cytological results from FNA, which may not be precise because a malignant FNA interpretation can yield false-positive results in 1–3% of cases [31].
Conclusion
Both VTIQ and ACR TI-RADS can identify benign and malignant thyroid nodules effectively. Our results revealed that ACR TI-RADS had strong inter-observer agreement and was not affected by the experience of radiologists. VTIQ combined with ACR TI-RADS could improve the specificity for the differential diagnosis of thyroid nodules without a loss of sensitivity. We believe this method could be applied as a simple and convenient tool with which to identify thyroid nodules accurately, ensure effective selection of malignant nodules, and avoid unnecessary biopsy for benign nodules. This method could also improve the diagnostic confidence of radiologists, and help guide clinicians in decision-making with regard to surgery, FNA, or follow-up.
