Abstract
Background
Tumor budding (TB) is a critical histopathological feature of rectal cancer that is strongly associated with metastasis, recurrence, and poor prognosis.
Purpose
To evaluate the diagnostic performance of radiomics models based on magnetic resonance imaging (MRI) for preoperative prediction of TB grade in rectal cancer via systematic review and meta-analysis.
Material and Methods
PubMed, Cochrane Library, Web of Science, and Embase were systematically searched for original diagnostic studies published up to 10 April 2026. Summary estimates of diagnostic accuracy were pooled using a random-effects model. Threshold effect, subgroup, and meta-regression analyses were performed to explore sources of heterogeneity.
Results
Seven studies with a total of 1846 patients were included. The pooled sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, and diagnostic odds ratio of MRI-based radiomics models were 0.82 (95% confidence interval [CI]=0.68–0.91), 0.80 (95% CI=0.65–0.89), 4.0 (95% CI=2.5–6.5), 0.22 (95% CI=0.13–0.39), and 18 (95% CI=10–32), respectively. The area under the summary receiver operating characteristic curve was 0.88 (95% CI=0.85–0.90). In subgroup analysis, multi-sequence MRI-based radiomics using T2-weighted (T2W) imaging, diffusion-weighted imaging (DWI), and contrast-enhanced T1-weighted (CE-T1W) imaging showed higher sensitivity than radiomics based on T2W imaging (87% vs. 67%; P < 0.001), DWI (87% vs. 77%; P = 0.02), or CE-T1W imaging (87% vs. 61%; P < 0.001) alone.
Conclusion
MRI-based radiomics models show promising performance for predicting TB grade in rectal cancer, with multi-sequence models outperforming single-sequence approaches. However, due to the limited number of included studies and heterogeneity, further large-scale prospective studies are warranted to confirm these results.
Introduction
Colorectal cancer (CRC) is the third most common cancer and the second leading cause of cancer-related deaths worldwide; its incidence is predicted to increase by 77% by 2030 (1,2). Accounting for approximately one-third of CRC cases, rectal cancer contributes substantially to cancer-related morbidity and mortality (3). Although current treatment strategies primarily rely on the tumor node metastasis (TNM) staging system, its clinical utility is limited by significant outcome variations among patients with identical stages, underscoring the need for additional prognostic indicators (4–6).
Tumor budding (TB) is a well-known and critical pathological feature in rectal cancer that refers to a single cell or a cluster of up to four tumor cells at the invasive front of the tumor. According to the International Tumor Budding Consensus Conference (ITBCC) guidelines, TB is graded into three categories (Bd1, Bd2, and Bd3) based on the number of buds within a standardized area of 0.785 mm² (7). TB occurs in 20%–40% of CRC cases, and higher TB grades are strongly associated with an increased risk of distant metastasis, recurrence, and poorer survival (8,9). Although histopathology remains the gold standard for TB assessment, its reliability is compromised by sampling limitations and tumor heterogeneity. Therefore, developing non-invasive methods for the preoperative assessment of TB grade is essential to support personalized treatment strategies.
As a non-invasive method, medical imaging has become indispensable for preoperative evaluation, treatment monitoring, and prognostic prediction in rectal cancer. Radiomics, a rapidly growing field, extracts high-dimensional imaging features and can provide quantitative information that reflects intratumoral heterogeneity (10). Preliminary studies have successfully employed magnetic resonance imaging (MRI)-based radiomics to predict TB grade in CRC, demonstrating its potential to improve risk stratification and clinical decision-making. Nevertheless, no meta-analysis evaluating the performance of radiomics models in predicting TB grade in rectal cancer has been published to date.
The aim of this systematic review and meta-analysis was to assess the diagnostic performance of MRI-based radiomics models for preoperative prediction of TB grade in patients with rectal cancer.
Material and Methods
This study was performed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (11). Approval from the ethics committee was not required.
Search strategy
A systematic search was conducted in PubMed, Cochrane Library, Web of Science, and Embase to identify relevant studies. The initial search was conducted up to 14 February 2025; it was subsequently updated on 10 April 2026 to include newly published studies. The combination of search terms used was as follows: “Radiomics” (OR “machine learning” OR “deep learning” OR “neural network” OR “artificial intelligence” OR “texture analysis”) AND “magnetic resonance imaging” (OR “MRI”) AND “colorectal cancer” (OR “rectal cancer” OR “colon cancer”) AND “tumor budding” (OR “tumour budding” OR “budding of tumor” OR “budding of tumour” OR “budding”). In addition, a manual search of the bibliographies from the included studies was performed to identify other relevant articles.
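The Boolean structure of the search string can be sketched as follows. This is only an illustration of how the four concept groups combine (OR within a group, AND between groups); the names `CONCEPTS` and `build_query` are hypothetical, and each database has its own field tags and syntax that are not reproduced here.

```python
# Concept groups taken from the search terms listed above.
CONCEPTS = [
    ["Radiomics", "machine learning", "deep learning", "neural network",
     "artificial intelligence", "texture analysis"],
    ["magnetic resonance imaging", "MRI"],
    ["colorectal cancer", "rectal cancer", "colon cancer"],
    ["tumor budding", "tumour budding", "budding of tumor",
     "budding of tumour", "budding"],
]


def build_query(concepts):
    """Join synonyms with OR inside each group and join the groups with AND."""
    groups = ["(" + " OR ".join(f'"{term}"' for term in terms) + ")"
              for terms in concepts]
    return " AND ".join(groups)


query = build_query(CONCEPTS)
print(query)
```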
Inclusion criteria
Articles that satisfied the following criteria were included: (i) studies using MRI-based radiomics models to preoperatively predict TB grade in patients with rectal cancer; (ii) studies with sufficient data to construct a 2 × 2 table; and (iii) human study participants.
Exclusion criteria
Articles that satisfied the following criteria were excluded: (i) studies not published in English; (ii) non-original studies including case reports, conference abstracts, review articles, letters, guidelines, and editorials; (iii) studies not focusing on our aim; and (iv) studies with insufficient data to construct a 2 × 2 table.
The titles and abstracts of the articles were screened independently by two reviewers (Xu and Liang, with 11 years and 14 years of experience in radiology, respectively). The full text of any articles that seemed relevant was reviewed. Any disagreement was resolved by consensus.
Data extraction
The following data were extracted: (i) patient characteristics: patient number, sex, age, and number of lesions (low grade [Bd1&2] and high grade [Bd3]); (ii) study characteristics: first author, publication year, institution, study period, study design, number of centers, patient recruitment method, reference standard, blinding to the reference standard, time interval between MRI and the reference standard, reader numbers, and reader experience; and (iii) radiomic characteristics: MRI field strength, imaging sequence, segmentation (region of interest [ROI] vs. volume of interest [VOI]; method), feature extraction and selection (software; feature type; feature selection methods; number of extracted features), and model construction and validation (number of selected features; validation model; classification method). True positives (TP), false positives (FP), true negatives (TN), and false negatives (FN) were also extracted.
Data extraction was conducted independently by the same two reviewers, with disagreements resolved by consensus. If a study reported more than one model, the radiomics model with the highest area under the curve (AUC) in the validation cohort was selected for inclusion in the meta-analysis. When no validation cohort was available, the results from the training cohort were included.
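The model-selection rule can be sketched as below. This is a minimal illustration, assuming each candidate model is recorded with the hypothetical fields `name`, `cohort`, and `auc`; the example values are invented and do not come from the included studies.

```python
from typing import Optional


def select_model(models: list[dict]) -> Optional[dict]:
    """Pick one model per study: highest validation AUC, else highest training AUC."""
    validation = [m for m in models if m["cohort"] == "validation"]
    # Fall back to the training cohort only when no validation result exists.
    pool = validation if validation else [m for m in models if m["cohort"] == "training"]
    if not pool:
        return None
    return max(pool, key=lambda m: m["auc"])


# Hypothetical study reporting several models (illustrative values only).
study_models = [
    {"name": "T2W", "cohort": "validation", "auc": 0.74},
    {"name": "multi-sequence", "cohort": "validation", "auc": 0.86},
    {"name": "multi-sequence", "cohort": "training", "auc": 0.93},
]
chosen = select_model(study_models)
print(chosen["name"], chosen["cohort"], chosen["auc"])
```

Note that the training-cohort AUC is ignored whenever a validation result exists, even if it is numerically higher.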
Quality assessment
The methodological quality of the included studies was independently assessed by the same two reviewers using the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) and the Radiomics Quality Score (RQS) tools. Any disagreements were resolved by consensus (12).
Statistical analysis
The 2 × 2 tables were used to build forest plots of sensitivity and specificity for each study individually. The pooled sensitivity, specificity, diagnostic odds ratio (DOR), positive likelihood ratio (PLR), and negative likelihood ratio (NLR), with 95% confidence intervals (CIs), were calculated. A summary receiver operating characteristic (SROC) curve was constructed, and its AUC was calculated to evaluate the overall diagnostic performance. Heterogeneity in sensitivity and specificity among studies was assessed using Cochran's Q test (P < 0.05 indicating significant heterogeneity) and the Higgins inconsistency index (I²; I² > 50% indicating substantial heterogeneity). Spearman’s correlation coefficient was calculated to assess the threshold effect, with P < 0.05 indicating a significant threshold effect.
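For readers less familiar with these measures, the per-study quantities derived from a 2 × 2 table, and the relation between Cochran's Q and I², can be sketched as follows. This is an illustration under stated assumptions, not the STATA routine used for pooling: the function names are hypothetical, the 0.5 continuity correction for zero cells is a common convention assumed here, and the example counts are invented.

```python
def diagnostic_metrics(tp: float, fp: float, fn: float, tn: float) -> dict:
    """Accuracy measures from one 2 x 2 table (high-grade TB treated as positive)."""
    # Assumed convention: add 0.5 to every cell when any cell is zero.
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    plr = sensitivity / (1.0 - specificity)   # positive likelihood ratio
    nlr = (1.0 - sensitivity) / specificity   # negative likelihood ratio
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "plr": plr,
        "nlr": nlr,
        "dor": plr / nlr,                     # diagnostic odds ratio
    }


def i_squared(q: float, df: int) -> float:
    """Higgins I^2 in percent from Cochran's Q, with df = number of studies - 1."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Invented counts for illustration only.
metrics = diagnostic_metrics(tp=40, fp=10, fn=10, tn=40)
print({k: round(v, 2) for k, v in metrics.items()})
print(round(i_squared(q=20.0, df=6), 1))
```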
Subgroup analyses were conducted based on MRI sequences, including T2-weighted (T2W) imaging, diffusion-weighted imaging (DWI), contrast-enhanced T1-weighted (CE-T1W) imaging, and a multi-sequence approach combining T2W imaging, DWI, and CE-T1W imaging. The pooled sensitivities and specificities were compared across these different imaging sequences. Meta-regression was performed to explore heterogeneity from the following factors: study center (single center vs. multicenter), sample size (≤200 vs. >200), interval between MRI and surgery (within 14 days vs. others), MRI field strength (only 3.0 T vs. 1.5 T and 3.0 T), segmentation method (3D vs. others), segmentation software (ITK-SNAP vs. others) and RQS percentage score (<50% vs. ≥50%). Publication bias was evaluated using Deeks’ funnel plot, with P <0.05 indicating significant publication bias (13).
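The idea behind Deeks' funnel plot asymmetry test can be sketched as a weighted regression of the log DOR on the inverse square root of the effective sample size (ESS), with the ESS as weight. The code below is a simplified, pure-Python illustration rather than the STATA implementation used in this study; all function names and the example 2 × 2 tables are hypothetical. It returns the slope and its t statistic, which would be referred to a t distribution with n − 2 degrees of freedom.

```python
import math


def effective_sample_size(n_pos: float, n_neg: float) -> float:
    """ESS = 4 * n1 * n2 / (n1 + n2), with n1 diseased and n2 non-diseased."""
    return 4.0 * n_pos * n_neg / (n_pos + n_neg)


def log_dor(tp: float, fp: float, fn: float, tn: float) -> float:
    """Natural log of the DOR; 0.5 is added to every cell if any cell is zero."""
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))
    return math.log((tp * tn) / (fp * fn))


def weighted_slope(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (b, standard error of b)."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    rss = sum(wi * (yi - a - b * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    se_b = math.sqrt(rss / (len(x) - 2) / sxx)
    return b, se_b


def deeks_test(tables):
    """Return (slope, t statistic) for a list of (TP, FP, FN, TN) tuples."""
    ess = [effective_sample_size(tp + fn, fp + tn) for tp, fp, fn, tn in tables]
    x = [1.0 / math.sqrt(e) for e in ess]
    y = [log_dor(*t) for t in tables]
    b, se_b = weighted_slope(x, y, ess)
    t_stat = b / se_b if se_b > 0 else float("nan")
    return b, t_stat


# Invented 2 x 2 tables (TP, FP, FN, TN), for illustration only.
example_tables = [(30, 12, 8, 50), (45, 20, 15, 80), (18, 5, 6, 45),
                  (60, 30, 25, 120), (25, 10, 5, 40)]
slope, t_stat = deeks_test(example_tables)
print(round(slope, 3), round(t_stat, 3))
```

A slope that differs significantly from zero suggests that smaller studies report systematically different DORs, which is the asymmetry the funnel plot is meant to reveal.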
All analyses were performed using STATA software (version 14.0). P <0.05 was considered statistically significant. Study quality was evaluated using RevMan 5.3.
Results
Literature search and article selection
The search flow diagram is presented in Fig. 1. In the initial search (up to 14 February 2025), 42 records were identified. After removing 18 duplicates, 24 articles remained for title and abstract screening. Of these, 13 were excluded (case reports, n = 2; conference abstracts, n = 10; guideline, n = 1). The remaining 11 articles underwent full-text review, of which six were excluded because they were not relevant to the study aim (n = 4), had insufficient data to construct 2 × 2 tables (n = 1), or were not published in English (n = 1). Ultimately, five studies were included from the initial search.

Fig. 1. Flow chart of the study selection process for the meta-analysis.
In the updated search (from 14 February 2025 to 10 April 2026), 12 additional records were identified. After removing nine duplicates, three articles remained for screening. Of these, one study was excluded as it was not relevant to the study aim, and two additional studies met the inclusion criteria. In total, seven studies were included in the final meta-analysis.
Characteristics of the included studies
Table 1 shows the study characteristics. There were 1846 patients in total (648 women, 1198 men; mean age range = 56–78 years). The number of lesions per study ranged from 74 to 458; the numbers of Bd1&2 and Bd3 cases ranged from 54 to 325 and from 20 to 185, respectively. The distribution of training (600 Bd1&2 cases, 358 Bd3 cases) and test (500 Bd1&2 cases, 314 Bd3 cases) datasets varied widely at the patient level. One study did not describe the size of each dataset separately (14). Only one study had a prospective design (14); the other six studies were retrospective. Two studies were set in a single institution (14,15) and five had a multi-institutional design (16–20). None of the studies explicitly reported its enrollment method. Histopathology after surgery was used as the reference standard in all included studies.
Table 1. Patient and study characteristics of the included studies.
*Values are given as mean ± SD.
EV, external validation; F, female; IV, internal validation; M, male; NA, not available; P, prospective; R, retrospective; TC, training cohort; VC, validation cohort.
Table 2 shows the detailed radiomic characteristics. Three studies used radiomics models based on 3-T MRI (14–16) and four studies used radiomics models based on both 3-T and 1.5-T MRI (17–20). VOI segmentation was used in six studies (15–20). A variety of radiomic features were extracted in the included studies and were grouped into the following classes: histogram features, first-order features, shape features, texture features, higher-order features, topology features, fractal features, and deep-learning features. Feature selection methods included variance threshold, select-K-best, stepwise regression, least absolute shrinkage and selection operator, and Pearson correlation analysis. The number of extracted features ranged from 1197 to 5760. The number of features selected for the final radiomic model ranged from 7 to 26 in most of the included studies. Both internal and external validation were performed in five studies (15,17–20). Logistic regression, K-nearest neighbor, extra trees, random forest, eXtreme gradient boosting, light gradient boosting machine, support vector machine, multilayer perceptron, radiomics signature, deep-learning, and transfer-learning models (including DenseNet 121, ResNet 18, ResNet 34, ResNet 50, ResNet 101, and Vgg11) were applied as the classification methods.
Table 2. Radiomic characteristics of the included studies.
APT, amide proton transfer; CE-T1W, contrast-enhanced T1-weighted; ET, extra trees; EV, external validation; IV, internal validation; KNN, K-nearest neighbor; LASSO, least absolute shrinkage and selection operator; LightGBM, light gradient boosting machine; LR, logistic regression; MLP, multilayer perceptron; MRI, magnetic resonance imaging; NA, not available; PCA, Pearson correlation analysis; RF, random forest; ROI, region of interest; SVM, support vector machine; VOI, volume of interest; XGBoost, eXtreme gradient boosting.
Methodologic quality assessment
Fig. 2 shows the risk of bias and applicability concerns for the seven included studies. For the patient selection domain, one study was considered to have a high risk of bias because only patients with locally advanced rectal cancer were included in the analysis (18), and the other six studies were judged to have an unclear risk of bias because it was unclear whether patient enrollment was consecutive. For the index test domain, one study was considered to have an unclear risk of bias because it did not report whether MRI results were interpreted blinded to the reference standard. For the reference standard domain, only one study was rated as having a low risk of bias (20), whereas the remaining six studies were judged as having an unclear risk of bias because they did not report whether the reference standard results were assessed independently of the index test. For the flow and timing domain, one study had an unclear risk of bias because the time between MRI and surgery was not clearly reported (18).

Fig. 2. Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) criteria for the seven included studies.
The RQS for the seven included studies is shown in Table 3. Total scores ranged from 16 to 20 points. The mean RQS total and percentage scores were 18.00 ± 1.53 and 50.00% ± 4.25%, respectively. The highest score achieved was 20 of a possible 36 points, corresponding to a percentage score of 55.56%.
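The reported percentage scores are consistent with dividing the point totals by the maximum RQS of 36 points, as the short check below shows. The constant and function names are hypothetical, introduced only for illustration.

```python
RQS_MAX_POINTS = 36  # maximum attainable total of the Radiomics Quality Score


def rqs_percentage(points: float) -> float:
    """Express an RQS point value as a percentage of the 36-point maximum."""
    return 100.0 * points / RQS_MAX_POINTS


print(round(rqs_percentage(18.00), 2))  # mean total score
print(round(rqs_percentage(1.53), 2))   # standard deviation of the total score
print(round(rqs_percentage(20), 2))     # highest-rated study
```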
Table 3. Radiomics quality score (RQS) of the included studies.
Diagnostic performance for prediction of tumor budding grade in rectal cancer
Fig. 3 shows the individual sensitivity (50%–94%) and specificity (63%–100%) of the included studies. Cochran's Q test revealed significant heterogeneity among the studies (Q = 38.08; P < 0.001). Substantial heterogeneity was observed in both sensitivity (I² = 84.32%) and specificity (I² = 79.94%). There was no significant threshold effect (P = 0.071).

Fig. 3. Coupled forest plots of pooled sensitivity and specificity.
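A threshold effect is commonly examined by correlating the logit of sensitivity with the logit of the false-positive rate (1 − specificity) across studies; a strong positive correlation suggests that differences in cut-off values drive part of the heterogeneity. The sketch below illustrates that calculation in pure Python. Treating the test this way is an assumption about the usual convention, not a description of the software output reported here; the function names and the sensitivity/specificity pairs are hypothetical. Values of exactly 0% or 100% would need a continuity correction before the logit is taken.

```python
import math


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def threshold_effect_rho(pairs):
    """pairs: one (sensitivity, specificity) tuple per study, each in (0, 1)."""
    tpr = [logit(se) for se, _ in pairs]
    fpr = [logit(1.0 - sp) for _, sp in pairs]
    return spearman_rho(tpr, fpr)


# Invented study-level estimates, for illustration only.
example_pairs = [(0.50, 0.90), (0.70, 0.85), (0.80, 0.75), (0.94, 0.63), (0.85, 0.80)]
print(round(threshold_effect_rho(example_pairs), 2))
```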
The pooled sensitivity, specificity, PLR, and NLR were 0.82 (95% CI = 0.68–0.91), 0.80 (95% CI = 0.65–0.89), 4.0 (95% CI = 2.5–6.5), and 0.22 (95% CI = 0.13–0.39), respectively. The pooled DOR was 18 (95% CI = 10–32). The SROC curve is shown in Fig. 4; the AUC was 0.88 (95% CI = 0.85–0.90). Deeks’ funnel plot showed no significant publication bias (P = 0.15) (Fig. 5).

Fig. 4. SROC curve of the diagnostic performance of radiomics models for preoperative prediction of the TB grade in rectal cancer. SROC, summary receiver operating characteristic; TB, tumor budding.

Fig. 5. Deeks’ funnel plot for testing publication bias.
Subgroup analysis
The results of the subgroup analysis based on imaging sequences are shown in Table 4. The pooled sensitivities and specificities of radiomics based on different MRI sequences for TB grade prediction were 67% (95% CI = 57%–76%) and 74% (95% CI = 57%–87%) for T2W imaging, 77% (95% CI = 63%–91%) and 63% (95% CI = 39%–87%) for DWI, 61% (95% CI = 41%–76%) and 86% (95% CI = 68%–100%) for CE-T1W imaging, and 87% (95% CI = 79%–92%) and 80% (95% CI = 63%–91%) for the multi-sequence approach using T2W imaging, DWI, and CE-T1W imaging, respectively. The pooled sensitivity of the multi-sequence approach was significantly higher than that of T2W imaging (87% vs. 67%; P < 0.001), DWI (87% vs. 77%; P = 0.02), and CE-T1W imaging (87% vs. 61%; P < 0.001).
Table 4. Pooled sensitivities and specificities of radiomics based on different MRI sequences for the prediction of tumor budding grade in rectal cancer.
Values in parentheses are 95% CIs.
CE-T1W, contrast-enhanced T1-weighted; MRI, magnetic resonance imaging.
Meta-regression
Meta-regression analysis (Table 5) showed that study center, sample size, interval between MRI and surgery, MRI field strength, segmentation method, segmentation software, and RQS percentage score were not significantly associated with diagnostic performance (P > 0.05), suggesting that other unmeasured factors may have contributed to the observed heterogeneity.
Table 5. Results of the meta-regression analysis of radiomics for the prediction of TB grade in rectal cancer.
Values in parentheses are 95% CIs.
MRI, magnetic resonance imaging; RQS, radiomics quality score; TB, tumor budding.
Clinical utility
With a pre-test probability of 20%, a positive radiomics result (PLR = 4) increased the post-test probability of high-grade TB to 50%, whereas a negative result (NLR = 0.22) reduced it to 5% (Fig. 6).

Fig. 6. Fagan plots for assessing clinical utility.
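The post-test probabilities read from a Fagan plot follow from Bayes' theorem in odds form: post-test odds = pre-test odds × likelihood ratio. The sketch below reproduces the figures reported above from the 20% pre-test probability and the pooled likelihood ratios; the function name is hypothetical.

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


pre_test = 0.20  # pre-test probability of high-grade TB used in the Fagan plot
print(round(post_test_probability(pre_test, 4.0), 2))   # positive result, PLR = 4 -> 0.5
print(round(post_test_probability(pre_test, 0.22), 2))  # negative result, NLR = 0.22 -> 0.05
```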
Discussion
To the best of our knowledge, this is the first meta-analysis to evaluate the diagnostic performance of MRI-based radiomics models for the preoperative prediction of TB grade in rectal cancer. Our findings demonstrate that MRI-based radiomics models exhibit good diagnostic accuracy, with a pooled sensitivity of 0.82 (95% CI = 0.68–0.91), specificity of 0.80 (95% CI = 0.65–0.89), and an AUC of 0.88 (95% CI = 0.85–0.90). The subgroup analysis demonstrated that multi-sequence MRI-based radiomics models using T2W, DWI and CE-T1W imaging achieved higher sensitivity compared with single-sequence models, including T2W imaging (87% vs. 67%; P < 0.001), DWI (87% vs. 77%; P = 0.02), and CE-T1W imaging (87% vs. 61%; P < 0.001). The clinical utility of radiomics models is further supported by their ability to improve post-test probabilities. In our analysis, a positive radiomics result increased the probability of high-grade TB from 20% to 50%, while a negative result reduced it to 5%. These results suggest that MRI-based radiomics models may serve as a noninvasive tool for preoperative prediction of TB grade, which could aid in personalized treatment planning and improve patient outcomes.
The detection of TB in rectal cancer is of considerable clinical importance, as it has been consistently associated with lymph node metastasis, local recurrence, distant metastasis, and reduced overall survival in postoperative patients (21,22). Although standardized criteria for TB in T2 CRC have not yet been fully established, TB assessment has already been incorporated into pathological reporting guidelines for T1 disease (21,23). Furthermore, high-grade TB has been associated with poorer prognosis in CRC patients undergoing neoadjuvant chemotherapy. Based on its well-documented prognostic value and clinical relevance, TB may represent a useful complementary factor in risk stratification and could be considered in future refinements of CRC reporting systems (6,24–26).
Radiomics is an emerging non-invasive imaging analysis method that enables comprehensive tumor characterization through quantitative feature extraction from medical images (27). This advanced approach serves as a valuable digital biopsy tool, providing in-depth analysis of tumor heterogeneity and facilitating the prediction of tumor biological characteristics (28). Radiomics-based predictive models have been successfully applied in TB prediction across multiple malignancies, including bladder cancer, cervical cancer, as well as laryngeal and pharyngeal carcinomas (29,30). For instance, Chong et al. developed four multiparametric MRI-based radiomics models, achieving AUCs in the range of 0.742–0.891 in test cohorts (29). Similarly, Granata et al. reported high performance for an MRI-based model in colorectal liver metastases, with an accuracy of 94%, a sensitivity of 86%, and a specificity of 95% (31). In addition, Li et al. demonstrated the utility of CT-based deep-learning models for TB prediction in bladder cancer, achieving AUCs of 0.882 and 0.944 in external validation cohorts (30). Chong et al. also reported the potential of 18F-FDG PET/CT radiomics, with an AUC of 0.762 for cervical cancer TB prediction (32). These findings further highlight the ability of radiomics to quantify tumor heterogeneity and reliably predict TB grade, thereby facilitating improved risk stratification and treatment decision-making.
This meta-analysis provides critical insights into the diagnostic performance of multiparametric MRI-based radiomics models. The integrated model incorporating T2W, DWI, and CE-T1W imaging sequences demonstrated improved diagnostic performance, with higher sensitivity (0.87, 95% CI = 0.79–0.92) compared with single-sequence models. These findings were consistent with a recent meta-analysis evaluating MRI-based radiomics for predicting lymphovascular space invasion (LVSI), which showed that multi-sequence models exhibited significantly superior diagnostic performance compared to single-sequence approaches in validation datasets (33). Furthermore, a meta-analysis by Yang et al., encompassing 13 MRI radiomics studies, reported that multi-sequence models (AUC = 0.84, sensitivity = 81%, specificity = 77%) achieved higher performance than T2W imaging (AUC = 0.77, sensitivity = 74%, specificity = 66%) and CE-T1W imaging models (AUC = 0.74, sensitivity = 75%, specificity = 59%) in predicting LVSI among cervical cancer patients, although these differences were not statistically significant (34). The improved performance of multiparametric models may be attributed to the complementary information provided by different MRI sequences, with each sequence capturing distinct but synergistic aspects of tumor biology. These findings strongly support incorporating multiparametric MRI protocols into routine preoperative evaluation of rectal cancer, particularly for assessing clinically significant histopathological features such as TB that critically inform treatment selection and prognostic stratification.
The present study has some limitations. First, the number of included studies was relatively small, and only one was prospective, which may introduce selection bias. Second, all included studies were conducted in Asian populations, limiting the generalizability of the findings to other ethnic groups. Third, substantial heterogeneity was observed across the included studies. Although meta-regression analysis was performed to explore potential sources of heterogeneity, no significant factors were identified, which may reflect the influence of unmeasured variables. Finally, the review protocol was not prospectively registered in a public database such as PROSPERO, which may have reduced transparency and increased the risk of potential bias. Nevertheless, all study procedures were predefined and conducted in accordance with the PRISMA guidelines to ensure methodological rigor.
In conclusion, radiomics models based on MRI demonstrate promising diagnostic performance for the preoperative prediction of TB grade in rectal cancer. Multiparametric approaches combining multiple MRI sequences may offer improved diagnostic performance compared with single-sequence models. However, these findings should be interpreted with caution given the limited number of included studies and the presence of heterogeneity. Further large-scale prospective studies are needed to validate these results and to better define the clinical utility of radiomics in this setting.
Supplemental Material
Supplemental material, sj-tif-1-acr-10.1177_02841851261459119 for Radiomics based on MRI for preoperative prediction of the tumor budding grade in rectal cancer: a systematic review and meta-analysis by Fan Xu, Zhiguo Deng, Yongjian Li, Miaoxia Chen, Qing Huang, Hongzhen Wu and Yingying Liang in Acta Radiologica
Footnotes
Data availability
The data extracted during and/or analyzed during the current study are available in the main manuscript.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by National Natural Science Foundation of China (82402377 [YYL]), Guangzhou Planned Project of Science and Technology (2024A03J1025 [HZW]), Medical Science and Technology Research Fund Project of Guangdong (B2024073 [QH]) and Guangzhou Science and Technology Project of Health (2025A03J3320 [FX]).
Supplementary material
Supplementary material for this article is available online.
References
