Abstract
Objective
Cervical tuberculous lymphadenitis (CTBL) and cervical lymph node metastasis (CLNM) share similar imaging characteristics, making differentiation challenging. This study aims to evaluate the clinical utility of a multimodal radiomics model combining grayscale ultrasound (GUS), elastography ultrasound (EUS), and contrast-enhanced ultrasound (CEUS) for distinguishing CTBL from CLNM.
Methods
A dataset of 203 cases (142 CTBL and 61 CLNM) was used to train and validate the radiomics models. The performance of single-modal (GUS, EUS, CEUS) and combined models was compared using the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and accuracy. An independent test set of 45 cases was included for further validation.
Results
The combined GUS + EUS + CEUS model outperformed single-modal models, achieving AUCs of 0.894, 0.832, and 0.919 in the training, validation, and test sets, respectively. Its diagnostic performance was comparable to that of a clinical model in the validation and test sets, suggesting good generalizability and robustness. All features selected for the combined model were wavelet features, which may contribute to its discriminative ability.
Conclusions
The integration of three ultrasound modalities captures multidimensional imaging features, reducing reliance on subjective interpretation. This multimodal radiomics approach provides a standardized diagnostic tool with significant clinical potential, particularly for less experienced physicians. Further validation with diverse datasets is needed to confirm its utility.
Keywords
Introduction
Cervical lymphadenopathy (CLA) is a condition that occurs across all age groups, with an annual incidence rate of 0.6%–0.7% in the general population.1,2 Common causes include reactive hyperplasia, tuberculous lymphadenitis, metastatic carcinoma lymph nodes, and lymphoma.3,4 The imaging findings of CTBL and CLNM share many similarities, but the two conditions require entirely different referral pathways and treatment strategies. Therefore, accurately identifying the specific etiology is crucial for subsequent medical management.1,5
In our clinical practice, we observed that many patients with CTBL seek medical attention solely because of the presence of a neck mass. CTBL diagnosis is frequently challenging owing to the lack of specific systemic symptoms and the low bacterial load within the lesion.6 This difficulty in diagnosis delays recognition and leads to unfavorable prognoses.7 Fistulas may form, causing considerable physical and psychological harm to the patients, particularly in the advanced disease stages.
Cervical lymph nodes serve as a metastatic pathway for many tumors in the human body. Accurate diagnosis of lymph node metastasis is crucial for tumor staging, treatment planning, and patient prognosis.8 Recent studies have developed predictive models for lymph node metastasis in papillary thyroid carcinoma9 and cervical lymph nodes using conventional ultrasound and shear wave velocity,10 demonstrating the importance of imaging-based diagnostics in clinical decision-making. Imaging methods are the primary tools for detecting, diagnosing, and monitoring lymph node abnormalities. Ultrasound examination, which offers real-time imaging, safety, and high resolution, has been widely recognized for its importance in patients with CLA.11 This method is recommended as a frontline diagnostic tool for CLA of unknown origin.12 However, the diagnostic efficacy of ultrasound is highly dependent on the clinical and professional knowledge of radiologists.13,14 Subjective image interpretation, lack of effective quantification, and ongoing intra- and interobserver variability remain major challenges for ultrasound examinations. Advanced imaging techniques, such as microvascular flow imaging and contrast-enhanced ultrasound (CEUS), have been explored to improve the accuracy of blood flow analysis in cervical lymph node lesions.15 Additionally, recent studies have shown that ultrasound-guided core needle biopsies (US-CNBs) are both effective and safe for cervical lymphadenopathy, particularly in resource-limited settings such as during the COVID-19 pandemic. This further emphasizes the need for more objective and reliable imaging methods.16
Additionally, the imaging manifestations of CTBL and CLNM frequently overlap, causing misdiagnosis or an inability to provide a definitive diagnosis.17
Radiomics is a technique that combines artificial intelligence (AI) with large-scale medical imaging data.18–20 It involves extracting image features from medical images in a high-throughput manner, transforming them into high-dimensional, mineable data, and conducting quantitative analysis. One advantage of radiomics is that it builds on the relatively intuitive images generated by traditional imaging techniques, which doctors and researchers can still assess by direct observation. Additionally, radiomic analysis has relatively modest data requirements and often does not require large amounts of annotated data for training and analysis. The professional knowledge and experience of doctors can be directly applied to image interpretation and analysis, thereby reducing the dependence on extensive annotated data.
Radiomics has demonstrated considerable value in diagnosing and prognosticating various diseases, including thyroid and breast lesions, gastrointestinal disorders, and salivary gland tumors.20–24 Notably, studies have shown that radiomics can effectively differentiate salivary gland tumors25 and detect coronavirus disease 2019 (COVID-19) pneumonia.26 However, the application of radiomics and other AI-based methods to lymph node diagnosis remains an area with limited research. Recent studies have explored the use of ultrasound-based radiomics for classifying lymph node metastases in breast and thyroid cancer patients,27,28 demonstrating advantages in non-invasive diagnostics and predictive modeling. The combination of ultrasound and CEUS has been shown to improve diagnostic accuracy for differentiating cervical tuberculous lymphadenitis from primary lymphoma,29 suggesting that multimodal imaging techniques could enhance radiomics-based classification models. Multimodal ultrasound-based radiomics, combining GUS, EUS, and CEUS, offers a promising solution to the limitations of single-modality imaging. This approach combines complementary information from different ultrasound techniques, improving diagnostic accuracy for lymph node lesions.
The main technical challenges this study addresses include limited diagnostic information, data variability, and feature extraction. Single-modality ultrasound often lacks sufficient detail for accurate diagnosis, whereas using multimodal data can provide a more comprehensive and informative picture. Additionally, differences in imaging protocols and quality can lead to variability in results, but combining multiple modalities helps to mitigate this issue. Another challenge is the extraction of meaningful features from complex medical images, which can be difficult for human observers to identify. Radiomics, however, can automate this process and reveal subtle patterns that may otherwise be overlooked.
Radiomics was chosen because it can extract quantitative features from multimodal ultrasound images, which helps improve diagnostic precision and reduce reliance on subjective interpretation. This approach is different from previous studies, which typically focus on single-modality ultrasound or lack advanced computational methods for analyzing multimodal data.
Future research should focus on expanding datasets, standardizing imaging protocols, and refining feature extraction methods to fully realize the potential of AI and radiomics in lymph node diagnosis.
Materials and methods
Patients
A database search was conducted for patients undergoing ultrasound-guided lymph node puncture or surgical excision at the Hangzhou Red Cross Hospital from September 2020 to August 2023. The Ethics Committee of Hangzhou Red Cross Hospital approved this study (approval number: [2023] Review No. (046)), and the need for individual consent for this retrospective analysis was waived. This study included 203 patients with lymph node ultrasound images. Subsequently, the cases were retrieved from the Picture Archiving and Communication System (Carestream) and were classified and organized after reviewing the medical records. The ultrasound images included 142 and 61 cases of CTBL and CLNM, respectively, with videos of GUS, EUS, and CEUS (Figure 1).

Figure 1. Patient inclusion flow chart.
To further validate the model's performance and assess its generalizability, 45 additional cases were included, forming a separate test set group. These new cases were added for independent validation of the model's ability to predict outcomes in novel cases.
The inclusion criteria included the following:
(1) Patients with GUS, EUS, and CEUS data; (2) patients who underwent biopsy or surgical excision to obtain pathological results; (3) patients with clear results of microbiological and molecular biology diagnostics; (4) patients with well-defined follow-up results.
The exclusion criteria included the following:
(1) Cases with incomplete pathological and clinical information; (2) poor image quality or incomplete visualization of the lymph nodes; (3) cases that underwent relevant treatment.
Image acquisition
In this study, ultrasound physicians with more than 10 years of diagnostic ultrasound experience examined the patients. Lymph nodes were examined with GUS, EUS, and CEUS using either a Philips IU Elite or a Mindray Resona 7S ultrasound machine, both equipped with a linear array transducer. The ultrasound settings, such as frequency and gain, were standardized across all images to ensure consistency. The GUS images were acquired at a frequency of 5–12 MHz and the EUS images at 7.5 MHz; CEUS was performed with an ultrasound contrast agent (SonoVue, Bracco) at a frequency of 2–4 MHz. The CEUS procedure was conducted in accordance with the EFSUMB guidelines, which provide recommendations on contrast agent administration, imaging acquisition, and interpretation for lymph node assessment.30
Image segmentation and feature extraction
Two ultrasound physicians, with 8 and 10 years of experience in lymph node ultrasound diagnosis, respectively, delineated the lesions. Both physicians were blinded to the pathological results, the clinical data of the patients, and each other's delineation outcomes. They manually segmented the regions of interest (ROIs) on the raw GUS, EUS, and CEUS images of each lymph node lesion (Figure 2) using 3D Slicer (v5.2.1).

Figure 2. Examples of lesion annotation. The top row (A, B, C) shows the original images of the lymph nodes: (A) a GUS image, (B) an EUS image, and (C) a CEUS image. The bottom row (a, b, c) displays the same images with the ROI annotated: (a) ROI on a GUS image, (b) ROI on an EUS image, and (c) ROI on a CEUS image.
Subsequently, multiple types of radiomic features were extracted from each image: (1) First-order statistical features, such as mean, variance, skewness, and kurtosis; (2) Shape features, such as the long-to-short diameter ratio, surface area, and volume, which reflect the geometric morphology of the lymph nodes; (3) Texture features, for which the gray-level co-occurrence matrix (GLCM), gray-level difference matrix (GLDM), and gray-level run-length matrix (GLRLM) were used to analyze the texture distribution of the images and which can reveal structural differences in tissues between different types of lesions; (4) Wavelet features, for which wavelet transforms were applied to extract multi-scale, multi-directional texture information that helps reveal subtle differences between CTBL and CLNM.
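The first-order and wavelet feature families listed above can be illustrated with a minimal sketch. It is not the extraction pipeline used in this study: the function names, the one-level Haar wavelet, the rectangular toy ROI, and the use of population moments are all assumptions made for illustration only.

```python
def first_order_features(values):
    """Return mean, variance, skewness and kurtosis of a list of gray levels."""
    n = len(values)
    mean = sum(values) / n
    # Central moments (population form, dividing by n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:  # constant region: shape statistics are undefined, report 0
        return {"mean": mean, "variance": 0.0, "skewness": 0.0, "kurtosis": 0.0}
    return {
        "mean": mean,
        "variance": m2,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


def haar_level1(image):
    """One-level 2D Haar transform of an image with even height and width.

    Returns an approximation band and three detail bands, each half the size
    of the input in both directions.
    """
    bands = {"approx": [], "detail_cols": [], "detail_rows": [], "detail_diag": []}
    for r in range(0, len(image), 2):
        rows = {name: [] for name in bands}
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            rows["approx"].append((a + b + d + e) / 2)
            rows["detail_cols"].append((a - b + d - e) / 2)  # change across columns
            rows["detail_rows"].append((a + b - d - e) / 2)  # change across rows
            rows["detail_diag"].append((a - b - d + e) / 2)  # diagonal change
        for name in bands:
            bands[name].append(rows[name])
    return bands


def wavelet_first_order(image):
    """First-order statistics of every Haar sub-band, keyed by band name."""
    return {
        name: first_order_features([v for row in band for v in row])
        for name, band in haar_level1(image).items()
    }


# Hypothetical 4 x 4 patch of gray levels standing in for an ROI.
roi = [
    [10, 12, 30, 32],
    [11, 13, 31, 33],
    [10, 12, 30, 32],
    [11, 13, 31, 33],
]
features = wavelet_first_order(roi)
```

Each sub-band yields its own set of statistics, which is one reason wavelet filtering multiplies the number of candidate features before selection.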
Intra- and interobserver consistency
To assess the intra- and interobserver reproducibility of the radiomic features, two physicians (with 8 and 10 years of experience in lymph node ultrasound diagnosis, respectively) delineated ROIs on a randomly selected set of 30 samples from the database. Reader 1 performed ROI delineation twice within 1 week following the same procedure to evaluate intraobserver reproducibility. Reader 2 independently delineated the ROIs once. Intra- and interobserver consistencies were assessed using the intraclass correlation coefficient (ICC) to evaluate the reproducibility of radiomic parameters within and between observers. An ICC of >0.75 was considered satisfactory, and strong consistency was considered to have been achieved if at least 90% of the radiomic features demonstrated satisfactory consistency. If strong consistency was attained, Reader 1 delineated the remaining samples.
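The reproducibility rule described above (ICC > 0.75 per feature, at least 90% of features passing) can be sketched as follows. The text does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-measurement form ICC(2,1) is an assumption here, as are the function names and the toy ratings.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per sample and one column per reading."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Two-way ANOVA sums of squares: samples, readings, and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    denominator = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denominator == 0:  # no variation at all: treat as not reproducible
        return 0.0
    return (ms_rows - ms_error) / denominator


def fraction_reproducible(feature_ratings, threshold=0.75):
    """Share of features whose ICC exceeds the threshold."""
    passed = sum(1 for r in feature_ratings.values() if icc_2_1(r) > threshold)
    return passed / len(feature_ratings)


# Hypothetical values of two features measured on three samples by two readings.
feature_ratings = {
    "feature_a": [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]],  # perfect agreement
    "feature_b": [[1.0, 3.0], [2.0, 1.0], [3.0, 2.0]],  # poor agreement
}
strong_consistency = fraction_reproducible(feature_ratings) >= 0.90
```

The same function serves both comparisons: two passes by Reader 1 for intraobserver agreement, or one pass each by Reader 1 and Reader 2 for interobserver agreement.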
Feature selection and radiomics model construction
The sample data included in the study underwent random cropping and standardization. They were then randomly allocated to training and validation groups at a ratio of 7:3. The training set was used for model construction and training, while the validation set was used to assess model performance. To enhance comparability across features measured on different scales, Z-score normalization was applied to ensure a consistent baseline for the dataset. In addition, to further validate the stability and generalizability of the model, 45 new cases were added to form the test set, which was used to evaluate the model's performance on new cases. Figure 3 presents the research flowchart. A preliminary screening of statistically significant features was conducted through analysis of variance based on the normalized training set data.

Figure 3. Imaging radiomics diagnostic model flow chart.
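The 7:3 allocation and Z-score normalization described above can be sketched as below. The helper names, the random seed, and the choice to estimate the mean and standard deviation on the training set only and reuse them for the other sets are assumptions; the text does not specify these details, and the exact group sizes depend on how the split is rounded or stratified.

```python
import random
import statistics


def split_cases(case_ids, train_fraction=0.7, seed=42):
    """Shuffle case identifiers and split them into training and validation lists."""
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)
    cut = round(train_fraction * len(ids))
    return ids[:cut], ids[cut:]


def fit_zscore(train_values):
    """Estimate the mean and (population) standard deviation on the training set."""
    return statistics.fmean(train_values), statistics.pstdev(train_values)


def apply_zscore(values, mean, sd):
    """Standardize values with previously fitted statistics."""
    if sd == 0:  # constant feature carries no information
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


train_ids, validation_ids = split_cases(range(203))

# Hypothetical values of one radiomic feature in each group.
train_feature = [2.0, 4.0, 6.0]
validation_feature = [8.0]
mean, sd = fit_zscore(train_feature)
train_z = apply_zscore(train_feature, mean, sd)
validation_z = apply_zscore(validation_feature, mean, sd)
```

Reusing the training statistics keeps information from the validation and test cases out of the preprocessing step.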
Feature selection and dimensionality reduction were then performed using the Least Absolute Shrinkage and Selection Operator (LASSO), with the regularization parameter λ tuned to control the strength of the penalty, so that only the features most relevant to the diagnosis were retained. Subsequently, a training model was constructed through multivariate logistic regression analysis based on the retained features. The retained features, along with their corresponding non-zero regression coefficients obtained from the training model, were then applied to the validation and test set data. This procedure yielded four radiomics models: GUS, EUS, CEUS, and GUS + EUS + CEUS.
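The way an L1 penalty drives uninformative coefficients to exactly zero can be illustrated with a small sketch. This is not the software or solver used in the study: it assumes an L1-penalized logistic loss fitted by proximal gradient descent, and the function names, learning rate, iteration count, λ values, and toy data are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly zero."""
    return math.copysign(max(abs(z) - t, 0.0), z)


def fit_l1_logistic(X, y, lam, lr=0.1, n_iter=2000):
    """L1-penalized logistic regression via proximal gradient descent.

    X: list of standardized feature rows, y: list of 0/1 labels,
    lam: regularization strength. The intercept is not penalized.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        residual = [
            sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - yi
            for row, yi in zip(X, y)
        ]
        grad_w = [sum(r * row[j] for r, row in zip(residual, X)) / n for j in range(d)]
        grad_b = sum(residual) / n
        # Gradient step followed by the soft-thresholding (proximal) step.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Toy data: the first feature separates the classes, the second is unrelated to them.
X = [[-1, 1], [-1, -1], [-1, 1], [-1, -1], [1, 1], [1, -1], [1, 1], [1, -1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

weights, bias = fit_l1_logistic(X, y, lam=0.1)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
```

In this toy setting only the informative feature keeps a non-zero coefficient, and a sufficiently large λ removes every feature. The retained indices would then feed the multivariate logistic regression described above.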
Clinical diagnostic model
Conventional ultrasound images were analyzed and a traditional logistic regression model was developed. Two sonographers (T.W. and Z.Y., with 12 and 16 years of clinical experience, respectively) empirically analyzed and recorded 14 clinical and lymph node features from the stored ultrasound images, including (a) Age, (b) Gender, (c) Long diameter, (d) Short diameter, (e) Long diameter/Short diameter (L/S) ratio, (f) Boundary, (g) Echo, (h) Hilum, (i) Annular hypoechoic, (j) Fusion, (k) Surrounding soft tissue, (l) Enhancement mode, (m) Enhancement area, and (n) Annular enhancement. The prediction model constructed from these characteristics is referred to as the clinical model.
Statistical analyses
Statistical analyses were conducted using the Statistical Package for the Social Sciences (SPSS) (version 19; IBM), MedCalc (version 18.2.1; MedCalc Software), and Python (version 3.8.1). Categorical data were presented as frequencies and percentages, with intergroup comparisons performed using the chi-square test. Continuous variables, assessed for normality, were expressed as means ± standard deviations. Calibration curve analysis and decision curve analysis (DCA) were carried out using the “rmda” package in RStudio. Receiver operating characteristic (ROC) curve analysis, including the area under the ROC curve (AUC), sensitivity, specificity, accuracy, positive predictive value (PPV), and negative predictive value (NPV), was used to assess diagnostic performance. The Hosmer-Lemeshow goodness-of-fit test (HL test) was performed using SPSS, where a P-value of ≥0.05 indicated a good fit. Differences in AUC between models were compared using the DeLong test, with P-values <0.05 considered statistically significant. To control the inflation of Type I error due to multiple comparisons, Bonferroni correction was applied to adjust the significance level.
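The diagnostic metrics named above can be computed as in the following sketch. It is an illustration rather than the SPSS or MedCalc procedure used in the study: the AUC is obtained from its rank interpretation (the probability that a positive case scores higher than a negative one), the 0.5 cut-off is arbitrary, and the scores and labels are hypothetical.

```python
def auc(scores, labels):
    """AUC as the share of positive/negative pairs ranked correctly (ties count half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def ratio(numerator, denominator):
    """Safe division that returns NaN when a metric is undefined."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(scores, labels, cutoff=0.5):
    """Sensitivity, specificity, accuracy, PPV and NPV at a given score cut-off."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, len(labels)),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical predicted probabilities and reference labels (1 = positive class).
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
model_auc = auc(scores, labels)
metrics = diagnostic_metrics(scores, labels)
```

The AUC summarizes ranking quality across all cut-offs, whereas the other five metrics depend on the chosen cut-off.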
Results
Patient population
Following our predefined inclusion and exclusion criteria, a total of 203 patients (89 men and 114 women) were enrolled, with an average age of 48.5 ± 39.5 years. The dataset included 142 cases of CTBL and 61 cases of CLNM. Additionally, 45 more cases were added later (19 cases of CTBL and 26 cases of CLNM).
The 203 enrolled patients were randomly assigned to the training (n = 143) and validation (n = 60) groups at a 7:3 ratio, and the 45 additional cases formed the test group (n = 45). Statistical analysis was performed on key variables, including gender, age, and the final clinical diagnosis, for the 248 patients. The outcomes indicated no significant differences in gender and age between the training, validation, and test groups (P > 0.05).
Intra- and interobserver agreement
The intraclass correlation coefficients (ICC) for GUS, EUS, and CEUS were computed as 0.89 and 0.94, respectively, based on the features extracted through region of interest (ROI) delineation in ultrasound images by two senior ultrasound physicians. These results signify excellent reproducibility in the extraction of features. It is noteworthy that Reader 1 successfully completed the segmentation of all samples.
Radiomics models
GUS radiomics model
GUS images were annotated, and feature extraction was performed using 3D Slicer software (as shown in Figure 4A and a). Initially, 817 features were extracted, and 234 were retained after preliminary selection. Ultimately, one stable feature was isolated. The GUS model (Model 1) was built using logistic regression, with performance metrics summarized in Table 2. In the training set, the AUC was 0.797, sensitivity was 70.30%, specificity was 80.95%, and accuracy was 73.43%. In the validation and test sets, AUC values were 0.759 and 0.872, respectively.

Figure 4. LASSO dimensionality reduction and feature screening. Feature selection using LASSO, showing the relationship between the area under the receiver operating characteristic curve (AUC) and log(λ). Vertical dotted lines were drawn at the optimal values according to the minimum criterion and the 1-SE criterion, and the most valuable subset of features was selected. (A,a) GUS training set. (B,b) EUS training set. (C,c) CEUS training set. (D,d) GUS + EUS + CEUS training set.
EUS radiomics model
Following the same method as for GUS, 407 features were initially included, and three stable features were selected (Figure 4B and b). Logistic regression was used to build the EUS model (Model 2). The AUC in the training set was 0.771, with sensitivity of 51.49%, specificity of 90.48%, and accuracy of 62.94%. The AUC values in the validation and test sets were 0.774 and 0.783, respectively (Table 2).
CEUS radiomics model
Similar to the GUS method, 407 features were initially included, and two stable features were selected (Figure 4C and c). The CEUS model (Model 3) was constructed using logistic regression. In the training set, the AUC was 0.784, with sensitivity of 87.13% and specificity of 66.67%. In the validation and test sets, the AUC values were 0.832 and 0.901, respectively (Table 2).
Table 1. Univariate and multivariate analysis of conventional ultrasound characteristics.
Coef: coefficient; S.E.: standard error; Wald Z: Wald Z-value; p: p-value; L: long diameter; S: short diameter.
Table 2. Comparison of the diagnostic performance of different models.
The P-values represent the statistical significance of the comparison between the Combined Model and each individual model (GUS, EUS, CEUS, Clinical). The P-values have been adjusted using the Bonferroni correction for multiple comparisons. Asterisks (*) indicate statistical significance after correction: *p < 0.0125 (Bonferroni corrected).
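The corrected threshold of 0.0125 quoted in this note follows from dividing the nominal level of 0.05 by the four comparisons (Combined versus GUS, EUS, CEUS, and Clinical). The sketch below reproduces that arithmetic under the assumption that unadjusted p-values are compared against the corrected threshold; the function names and the p-values are hypothetical and are not taken from Table 2.

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance level under Bonferroni correction."""
    return alpha / n_comparisons


def flag_significant(p_values, alpha=0.05):
    """Mark each comparison whose p-value falls below the corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return {name: p < threshold for name, p in p_values.items()}


# Hypothetical unadjusted p-values for four pairwise AUC comparisons.
p_values = {
    "Combined vs GUS": 0.010,
    "Combined vs EUS": 0.002,
    "Combined vs CEUS": 0.030,
    "Combined vs Clinical": 0.200,
}
threshold = bonferroni_threshold(0.05, len(p_values))
flags = flag_significant(p_values)
```

An equivalent presentation multiplies each p-value by the number of comparisons and compares the result with 0.05; the two conventions should not be mixed.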
GUS + EUS + CEUS radiomics model
Early fusion of features from GUS, EUS, and CEUS resulted in a new feature dataset. After preliminary analysis, 920 features were initially included, and 8 optimal features were selected (Figure 4D and d). The combined model (Model 4) was built using logistic regression. Performance metrics are summarized in Table 2. In the training set, the AUC was 0.894, with sensitivity of 85.15% and specificity of 83.33%. In the test set, the AUC was 0.919.
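Early fusion, as used here, means joining the per-modality feature vectors of each lesion into one vector before feature selection, rather than combining the outputs of separate models. A minimal sketch follows, in which the function name, the prefixing scheme, and the feature names and values are hypothetical.

```python
def early_fusion(gus_features, eus_features, ceus_features):
    """Concatenate per-modality feature dictionaries into one fused dictionary.

    Each feature name is prefixed with its modality so that features with the
    same name in different modalities stay distinct.
    """
    fused = {}
    for prefix, features in (
        ("GUS", gus_features),
        ("EUS", eus_features),
        ("CEUS", ceus_features),
    ):
        for name, value in features.items():
            fused[prefix + "_" + name] = value
    return fused


# Hypothetical feature values for one lymph node lesion.
lesion = early_fusion(
    {"wavelet_mean": 0.42, "glcm_contrast": 1.8},
    {"wavelet_mean": -0.10},
    {"wavelet_mean": 0.77, "glrlm_run_entropy": 3.1},
)
```

The fused vector is then passed through the same screening, LASSO selection, and logistic regression steps as the single-modality feature sets.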
Clinical characteristics
Univariate and multivariate analyses were performed to identify independent clinical predictors, as summarized in Table 1.
Performance comparison
The diagnostic performance of the different radiomics and clinical models is compared in Figure 5 and Table 2. Model 4 (GUS + EUS + CEUS) performed significantly better than the other radiomics models in the training set, with statistically significant differences in AUC for Model 4 versus Model 1, Model 2, and Model 3 (P < 0.05). In the validation and test sets, however, its performance was comparable to that of the clinical model, and no statistically significant differences were observed (P > 0.05).

Figure 5. ROC curves of the five models in (a) the training cohort, (b) the validation cohort, and (c) the test cohort.
To further evaluate the clinical utility of the models, DCA was performed for the training, validation, and test sets (Figure 6). The results indicated that the combined radiomics model (Model 4) provided the greatest net benefit compared to the other models.

Figure 6. Decision curves of the five diagnostic models in the (a) training set, (b) validation set, and (c) test set. The X-axis represents the risk threshold and the Y-axis represents the net benefit.
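The net benefit plotted on the Y-axis of a decision curve is conventionally defined as TP/n − (FP/n) × pt/(1 − pt), where pt is the risk threshold and n is the number of cases. The sketch below applies this standard definition; it is not the "rmda" implementation used in the study, and the function names, probabilities, and labels are hypothetical.

```python
def net_benefit(probabilities, labels, threshold):
    """Net benefit of treating every case whose predicted risk reaches the threshold."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the risk threshold.
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: every case is classified as positive."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


def decision_curve(probabilities, labels, thresholds):
    """Rows of (threshold, model net benefit, treat-all net benefit); treat-none is 0."""
    return [
        (t, net_benefit(probabilities, labels, t), net_benefit_treat_all(labels, t))
        for t in thresholds
    ]


# Hypothetical predicted probabilities and reference labels (1 = positive class).
probabilities = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
curve = decision_curve(probabilities, labels, [0.2, 0.5, 0.8])
```

A model is clinically useful at a given threshold when its net benefit exceeds both the treat-all and the treat-none (zero) reference lines.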
Discussion
A major strength of this study lies in its dataset composition, which was collected in a comprehensive hospital specializing in tuberculosis and includes a large number of CTBL cases, a condition that is relatively underrepresented in the existing literature. This unique dataset not only ensures a robust evaluation of the model but also provides valuable insights into a challenging clinical problem that has not been extensively studied.
In this study, a multimodal radiomics model was developed to differentiate between CTBL and CLNM by integrating GUS, EUS, and CEUS images. This represents a considerable advancement compared to previous studies,29,31,32 which have predominantly focused on single-modality or dual-modality approaches. By leveraging complementary imaging features from these three modalities, the model achieves enhanced diagnostic performance and greater stability.
Although ultrasound is a crucial tool for diagnosing cervical lymphadenopathy, its accuracy is often limited when a single technique is used alone. To overcome this limitation, our study integrates three ultrasound techniques, namely B-mode ultrasound (GUS), elastography (EUS), and CEUS, each offering distinct advantages. B-mode ultrasound provides high-resolution structural imaging, elastography assesses tissue stiffness to aid in distinguishing benign from malignant lesions, and CEUS enhances vascular visualization. The combination of these modalities not only captures multidimensional imaging features but also reduces reliance on subjective interpretation, thereby improving the robustness and reliability of the model, particularly in differentiating CTBL from CLNM, which often present with overlapping imaging characteristics. As demonstrated by Jung EM et al.,20 combining artificial intelligence-optimized B-mode, elastography, and CEUS can significantly improve diagnostic accuracy, further supporting the use of multimodal ultrasound approaches in clinical practice.
The CEUS procedure in this study was performed in accordance with the EFSUMB guidelines, which provide recommendations on contrast agent administration, imaging acquisition, and interpretation for lymph node assessment. These guidelines ensure standardized, accurate use of CEUS in clinical practice, particularly for distinguishing between benign and malignant lymph node lesions. By following these guidelines, we aligned our imaging protocols with international best practices, enhancing the clinical relevance and applicability of our findings.
Although viscosity measurement was not included in this study due to its limited clinical adoption, as noted by Jung EM et al.,33 it could serve as a valuable supplement to lymph node diagnostics. Future studies may explore the potential of viscosity measurement as an adjunct to ultrasound radiomics, especially in distinguishing complex lymph node pathologies, where viscosity could provide additional diagnostic insights.
The multimodal radiomics model demonstrated statistically significant superiority over single-modal models in the training set (Bonferroni-adjusted p-values: Combined vs. GUS, 0.0121; Combined vs. EUS, 0.0022; Combined vs. CEUS, 0.0060; Table 2). These results highlight the advantages of integrating GUS, EUS, and CEUS in capturing multidimensional imaging features. In the validation and test sets, the Combined model performed comparably to the Clinical model (AUC = 0.823 vs. 0.821 in the validation set, p = 0.9752; AUC = 0.919 vs. 0.776 in the test set, p = 0.0859). Although these differences were not statistically significant after correction, the comparable performance underscores the clinical relevance of the radiomics approach and highlights its potential utility as an adjunct diagnostic tool in clinical practice, particularly for less experienced physicians.
The high AUC observed in the test set (AUC = 0.919, Table 2) suggests that the Combined model retains its discriminative ability on unseen data and can accommodate broader variations in such data. This supports its robustness and highlights its potential for clinical generalization.
Furthermore, while the Clinical model achieved relatively satisfactory diagnostic performance (AUC = 0.808, 0.821, and 0.776 in the training, validation, and test sets, respectively), it relied on conventional clinical and visual features, such as age, gender, and hypoechoic halo, for differentiation. By contrast, the radiomics approach extracts subtle imaging patterns, such as wavelet features, that are not easily discernible by human observers. We hypothesize that wavelet features capture hidden differences between CTBL and CLNM, thereby improving the model's discriminative ability. This is consistent with the composition of the final model, in which wavelet features accounted for 100% of the features retained after LASSO-based feature selection. This highlights the potential of the Combined model to reduce reliance on subjective interpretation and provide a more standardized diagnostic framework.
Importantly, this study highlights the potential of multimodal radiomics in addressing a challenging clinical problem. The integration of three ultrasound modalities is a novel approach that provides a more comprehensive assessment of lymph node lesions compared to single-modal or dual-modal studies. While the current findings require further validation with larger and more diverse datasets, the unique strengths of this study, including the abundant CTBL cases and innovative multimodal methodology, provide meaningful insights into the field of ultrasound-based diagnostics, particularly in the context of multimodal applications for lymph node evaluation.
Conclusion
In conclusion, ultrasound radiomics shows considerable potential value in selecting meaningful ultrasound features and enhancing the efficiency of differential diagnosis between CTBL and CLNM. More datasets will be needed in the future to further improve the model's performance.
Footnotes
Ethical approval and informed consent
Ethics committee approval was granted by the local institutional ethics review board, and the requirement for written informed consent was waived. This study was retrospective in nature, and the data were collected from medical records.
Author contributions
The concept and design of the paper were proposed by Gaoyi Yang and Xiangyu Meng. Data collection and analysis were completed by Xiangyu Meng, Hongxiang Fu, Ying Wang, Ying Zhang, Peijun Chen, and Litao Sun. Xiangyu Meng, Hongxiang Fu, and Ying Wang prepared the manuscript. Gaoyi Yang, Xiangyu Meng, Hongxiang Fu, Ying Wang, Ying Zhang, Peijun Chen, and Litao Sun reviewed and edited the manuscript. All authors have read and approved the final version of the manuscript.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Zhejiang Provincial Medicine and Health Technology Project (2024KY1231) and the Hangzhou Science and Technology Plan Guidance Project (Z20230098).
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
