Abstract
Background
Urothelial bladder cancer exhibits marked molecular and clinical heterogeneity. While genomic and transcriptomic profiling of muscle-invasive bladder cancer (MIBC) has revealed recurrent alterations with therapeutic and prognostic relevance, limited access to molecular testing constrains clinical use. Computed tomography (CT), routinely performed for staging and surveillance, may serve as a noninvasive adjunct for tumor biology. Radiomics, the quantitative extraction of imaging features, offers a means to associate imaging phenotypes with molecular characteristics.
Methods
Genomic data from The Cancer Genome Atlas were integrated with CT images from the Cancer Imaging Archive for 89 patients with biopsy-proven MIBC. An in-house radiomics pipeline extracted 488 texture metrics characterizing the brightness distribution, pixel relationships, and spatial patterns of segmented tumors. Three classifiers - Random Forest, Extreme Gradient Boosting, and Elastic Net - were trained to predict DNA mutations, tumor mutational burden (TMB), and mRNA expression. Model performance was evaluated using 10-fold cross-validation.
Results
Among 15 recurrent mutations, EP300, FGFR3, and ARID1A were predicted most reliably (AUCs = 0.77, 0.76, 0.75). Models identified high-TMB tumors (AUC = 0.61), poor-prognosis transcriptomic signatures (AUC = 0.73, 0.65), expression of key cell cycle (CDKN1A, AUC = 0.78) and apoptotic (CASP3, AUC = 0.71) genes, and discriminated the luminal infiltrated molecular subtype from other variants (AUC = 0.69).
Conclusion
Our study demonstrates that CT-derived radiomics features can capture biologically and clinically relevant information in muscle-invasive bladder cancer. These findings support the potential utility of radiomics as a noninvasive, scalable adjunct to genomic profiling in MIBC.
1. Introduction
Cancers of the bladder are the most common urinary tract malignancy in both men and women, with an estimated 84,870 new cases and 17,420 deaths in 2025. 1 Bladder cancer represents a heterogeneous group of malignancies ranging from low-grade and high-grade non-muscle invasive bladder cancer (NMIBC) to muscle invasive bladder cancer (MIBC) with or without distant metastases. Bladder cancer also demonstrates notable variance at a genomic level, and the presence or absence of several gene mutations has demonstrated significant implications for treatment options and clinical outcomes. 2
The Cancer Genome Atlas (TCGA) provides a unified database of gene mutations in bladder tumors, compiling these data alongside other clinically relevant biomarkers including tumor cell mRNA expression levels, mRNA expression molecular subtypes, and tumor mutational burden (TMB). In this study, we focus on a panel of genomic alterations implicated in MIBC treatment and progression. We also investigate mRNA expression levels, both for individual genes and in two different four-gene expression panels with demonstrated prognostic value. The first panel, consisting of IL8, S100A8, S100A9 and EGFR (referenced here as four-gene panel A), has previously been shown to predict disease progression in MIBC. 3 A second panel consisting of JUN, MAP2K6, STAT3 and ICAM1 (referenced here as four-gene panel B), shown to predict disease recurrence and 5-year survival, was also analyzed. 4
Biomarkers investigated in the study and their relevance to bladder cancer.
Given the broad prognostic and therapeutic implications of omics-level data in bladder cancer, it would be clinically useful to identify imaging-based surrogate biomarkers for hallmark somatic mutations and mRNA expression patterns to assist with appropriate treatment selection. CT-based texture analysis (CTTA), a technique for the extraction of quantitative features (radiomics) from available medical images which are not readily assessed visually, is a promising tool towards achieving this aim.18,19 Features extracted by CTTA include tumor shape, repeating patterns, and heterogeneity, which are created by the underlying makeup of the tumor microenvironment and may have an impact on clinical decision making from a treatment and prognosis standpoint. 20 These techniques have been applied to various cancers including head and neck, colorectal, and lung malignancies.21–24
Although not as widely studied as other malignancies, CTTA of urothelial carcinoma is an actively growing field of research.25,26 Our group showed that CTTA of bladder cancer may differentiate urothelial cancer from the micropapillary type (a particularly aggressive form of high-grade cancer), with the latter having a more heterogeneous texture on CT. 27 Similarly, studies have demonstrated the ability of CTTA to predict grade, muscle invasive status, clinical stage, and progression free survival in MIBC.28–31 MRI and CT-based radiomic analysis has also successfully been used in conjunction with genomic data to better estimate patient prognosis.32,33
With this in mind, we hypothesize that CTTA derived from clinical imaging can be used for staging of MIBC, identification of pathological and molecular subtypes, prognostication, and to assist with the determination of an appropriate therapy. 34 The addition of quantitative imaging may also supplement qualitative assessment of clinical imaging to provide a more comprehensive representation of the patient’s disease status. Considering the negative effect of data heterogeneity on multicenter CTTA analysis, we utilize a previously published CT RRR filter to construct predictive models.35,36 The study aimed to identify and quantify associations between imaging-based features (radiomics) on CT and various hallmark genetic and molecular markers in biopsy-proven MIBC. While there is also interest in the use of MRI in this context, we focused only on CT in the current project due to the availability of data. 32
2. Materials and methods
2.1. Data collection
We integrated genomics data from The Cancer Genome Atlas (TCGA) with matched CT data from The Cancer Imaging Archive (TCIA) in 89 patients with biopsy-proven MIBC to find associations between hallmark MIBC mutations and radiomic metrics. 37 Patients with T2-T4 MIBC, any N and M staging, who had not undergone prior chemotherapy (including intravesical chemotherapy) and/or radiation therapy were included for evaluation. Tumor DNA mutations present in at least 10% of the selected patient cohort (ARID1A, ASXL2, ATM, CDKN1A, CREBBP, ELF3, EP300, ERBB2, FGFR3, KDM6A, KMT2D, PIK3CA, RB1, RHOB, and TP53) were included in the analysis. Staging, TMB, mRNA expression patterns for panels A and B, mRNA expression of key tumorigenic genes (TP53, CDKN1A, CDKN1B, BCL2L1, BIRC5, and MRE11), and molecular subtype were also evaluated for radiomic significance. Specimens for pathologic evaluation were obtained via transurethral biopsy and/or cystectomy.
2.2. Data selection
We included cases from the TCGA MIBC database with a corresponding post-contrast enhanced CT from the TCIA. Cases were excluded if the tumor was not visible on imaging and/or if the bladder was decompressed. Cases in which only non-contrast enhanced CT, MRI or PET imaging was available were not used. The included subset was examined for significant differences in demographic, genomic, and transcriptomic variables using Mann-Whitney U and Chi-Square tests for continuous and binary/categorical variables respectively.
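The categorical comparison described above can be illustrated with a minimal sketch of a Pearson chi-square test on a 2×2 table. The function name `chi_square_2x2` and the counts in the second row are hypothetical; only the 67/22 sex split of the included cohort is taken from this study, and the authors' actual implementation is not described in the text.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    table = [[a, b], [c, d]]; returns (statistic, p_value) for 1 degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # With 1 degree of freedom the chi-square survival function reduces to erfc.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Row 1: included cohort (67 male, 22 female, as reported in this study).
# Row 2: HYPOTHETICAL comparison group, for illustration only.
sex_table = [[67, 22], [237, 86]]
stat, p = chi_square_2x2(sex_table)
print(f"chi2 = {stat:.3f}, p = {p:.3f}")
```

A p-value above 0.05 in such a comparison would be read as no detectable difference between the groups for that variable; continuous variables would instead be compared with a rank-based test such as Mann-Whitney U.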
2.3. Radiomic evaluation
An example protocol for radiomic evaluation in MIBC has previously been described by our group and is shown in Figure 1. 38
First, the lesion was identified on CT imaging and segmented manually by an experienced, fellowship trained genitourinary radiologist. Next, a region of interest (ROI) was created from the segmented data. Using in-house developed MATLAB code, a CT-based radiomics panel composed of reproducible, robust, and repeatable (RRR) texture metrics was then extracted from the ROI using six different methods: histogram analysis (HIST), 2D gray-level co-occurrence matrix (GLCM), gray-level difference matrix (GLDM), gray-level run-length matrix (GLRLM), gray-level size zone method (GLSZM), and 2D fast Fourier transform (FFT) analyses.
Figure 1. Example protocol for radiomic evaluation of muscle invasive bladder cancer. The region of interest is extracted from the CT image and analyzed as an array of values corresponding to pixel brightness at the corresponding location in the image. Mathematical definitions of representative features for array-level groups are listed to the right of the figure. These features account for local relationships between pixels in the image. Histogram and Fourier analyses extract global pixel intensity distributions and patterns. (GLCM = gray level co-occurrence matrix, GLDM = gray level difference matrix, GLRLM = gray level run-length matrix, GLSZM = gray level size zone method)
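To make the array-level feature groups more concrete, the following is a minimal Python sketch of two of the six methods named above: a histogram feature (mean intensity) and two GLCM features (contrast and energy). It is not the authors' MATLAB pipeline. The function names, the single horizontal offset, the symmetric normalization, and the tiny 4×4 quantized ROI are all assumptions made for illustration.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized, symmetric gray-level co-occurrence matrix for one pixel offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1  # pair (i, j)
                counts[j][i] += 1  # symmetric counterpart (j, i)
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def glcm_contrast(p):
    """Sum of (i - j)^2 * p(i, j): large when neighboring pixels differ strongly."""
    size = len(p)
    return sum(((i - j) ** 2) * p[i][j] for i in range(size) for j in range(size))

def glcm_energy(p):
    """Sum of p(i, j)^2: large when few gray-level pairs dominate (uniform texture)."""
    return sum(v * v for row in p for v in row)

def histogram_mean(image):
    """Mean pixel intensity over the ROI (a first-order, histogram-based feature)."""
    pixels = [v for row in image for v in row]
    return sum(pixels) / len(pixels)

# Hypothetical ROI already quantized to 4 gray levels (0..3).
roi = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(roi, levels=4)
contrast = glcm_contrast(p)
energy = glcm_energy(p)
mean_intensity = histogram_mean(roi)
print(contrast, energy, mean_intensity)
```

In practice a pipeline would compute such matrices over several offsets and directions and restrict the calculation to pixels inside the segmented tumor mask.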
2.4. Outcome variables
To standardize model training and evaluation, all non-binary outcomes were converted to binary outcomes. Continuous mRNA expression data were classified as elevated (>1 standard deviation above the mean of the TCGA-BLCA dataset) or not elevated, while continuous TMB data were classified as elevated (≥7.9/MB) or not elevated based on ideal prognostic cutoff values identified by Yang et al. 39 Categorical tumor molecular subtype was handled using one-hot encoding, while T-stage and N-stage were grouped into T2 vs. T3/4 and N0 vs. N1/2/3. All tumor DNA mutations found in at least 10% of tumors in the cohort were included in the analysis. Tumor DNA mutations were classified as present or absent; no distinction was made between the type of mutation. Finally, the mRNA expression criteria described for four gene panels A and B were used to classify each tumor as good or poor prognosis.3,4
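The binarization rules above can be sketched in a few lines of Python. This is an illustrative reading of the text rather than the authors' code: the function names are hypothetical, the sample standard deviation is assumed for the expression threshold, and stage labels are assumed to be strings such as "T2b".

```python
from statistics import mean, stdev

TMB_CUTOFF = 7.9  # mutations/Mb, the cutoff stated in the text

def binarize_expression(values, reference):
    """1 if a value exceeds mean + 1 SD of the reference expression values, else 0."""
    threshold = mean(reference) + stdev(reference)  # sample SD is an assumption
    return [int(v > threshold) for v in values]

def binarize_tmb(tmb_values, cutoff=TMB_CUTOFF):
    """1 for elevated TMB (>= cutoff), 0 otherwise."""
    return [int(v >= cutoff) for v in tmb_values]

def group_t_stage(stage):
    """Map T2 -> 0 and T3/T4 -> 1; substage letters such as 'T2b' are tolerated."""
    level = next((int(ch) for ch in stage if ch.isdigit()), None)
    if level not in (2, 3, 4):
        raise ValueError(f"unexpected T-stage: {stage!r}")
    return int(level >= 3)

def one_hot(labels):
    """One binary indicator column per category (one-vs-rest outcomes)."""
    categories = sorted(set(labels))
    return {c: [int(label == c) for label in labels] for c in categories}

# Hypothetical values for illustration only.
print(binarize_tmb([3.0, 7.9, 12.5]))
print([group_t_stage(s) for s in ["T2", "T3a", "T4"]])
print(one_hot(["basal/squamous", "luminal-papillary", "basal/squamous"]))
```

Each indicator column produced by `one_hot` can then be treated as its own binary outcome, which matches the one-subtype-versus-rest results reported later.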
2.5. Statistical method
Three machine learning (ML) algorithms were used to create binary predictive models: Random Forest (RF), XGBoost (XGB) and ElasticNet (EN).40–42 Hyperparameter tuning for all models was conducted using manual grid search. Parameters searched for RF included number of estimators, max depth, minimum samples before splitting, minimum leaf samples, and maximum features per split; for XGB, number of estimators, max depth, learning rate, subsampling ratio for rows and features, and minimum instance weight per leaf; and for EN, C (regularization strength) and L1 to L2 ratio. For each of the 3 classifiers, 10-fold cross-validation was used to evaluate model performance. The full dataset was divided into 10 folds of approximately equal size. In each iteration, nine folds (90% of the data) were used for training and the remaining fold (10%) was held out for testing. The model training process was repeated 10 times, holding out a different fold each time, such that each study sample served exactly once as an independent testing case to generate a robust picture of model performance. The mean area under the receiver operating characteristic curve (AUC) with 95% confidence interval (CI) was used to assess prediction capabilities.
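The cross-validation procedure can be sketched in plain Python as follows. This is a simplified illustration under several assumptions that are not stated in the text: folds are stratified by class, a toy nearest-centroid scorer stands in for the RF, XGB, and EN models, the 95% CI is a normal approximation across fold-level AUCs, and the data are synthetic.

```python
import math
import random
from statistics import mean, stdev

def auc_score(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_folds(y, k, seed=0):
    """Assign every sample index to exactly one of k folds, spreading classes evenly."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted(set(y)):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        for position, i in enumerate(idx):
            folds[position % k].append(i)
    return folds

def centroid_scorer(X_train, y_train):
    """Toy stand-in classifier: higher score = closer to the positive-class centroid."""
    def centroid(label):
        rows = [x for x, y in zip(X_train, y_train) if y == label]
        return [mean(col) for col in zip(*rows)]
    c_pos, c_neg = centroid(1), centroid(0)
    return lambda x: math.dist(x, c_neg) - math.dist(x, c_pos)

def cross_validated_auc(X, y, fit, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, and summarize fold AUCs."""
    aucs = []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        scorer = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        y_test = [y[i] for i in test_idx]
        if len(set(y_test)) < 2:
            continue  # AUC is undefined when a fold lacks one of the classes
        aucs.append(auc_score(y_test, [scorer(X[i]) for i in test_idx]))
    m = mean(aucs)
    half_width = 1.96 * stdev(aucs) / math.sqrt(len(aucs))  # assumed CI method
    return m, (m - half_width, m + half_width)

# Synthetic cohort: 89 samples, 25 positives, 2 features (first one informative).
rng = random.Random(42)
y = [1] * 25 + [0] * 64
X = [[rng.gauss(1.0 if label else 0.0, 1.0), rng.gauss(0.0, 1.0)] for label in y]
mean_auc, ci = cross_validated_auc(X, y, centroid_scorer)
print(f"mean AUC = {mean_auc:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

With only about nine test samples per fold, fold-level AUCs vary considerably, which is one reason the confidence intervals reported for a cohort of this size are wide.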
For imbalanced outcomes, we employed Synthetic Minority Over-sampling Technique (SMOTE) to address class imbalance during model training. 43 SMOTE generates synthetic samples for the minority class by interpolating between existing minority instances, thereby enhancing the classifier’s exposure to underrepresented patterns without duplicating existing observations. This strategy is commonly used to mitigate bias toward the majority class in supervised learning settings. To prevent data leakage and more accurately assess model performance, oversampling was confined to the training portion of each fold. We applied a generally accepted interpretation of AUC summarized by Alba et al. to categorize the utility of model outputs: AUC <0.60 reflects poor discrimination; 0.60 to 0.75, possibly helpful discrimination; and more than 0.75, clearly useful discrimination. 44
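The interpolation step and the restriction of oversampling to training data can be sketched as below. This is a simplified, pure-Python illustration of the SMOTE idea, not the library implementation used in the study; the function names, the default of five neighbors, and the toy data are assumptions. The `interpret_auc` helper encodes the AUC categories quoted above.

```python
import math
import random

def smote(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority samples by interpolating toward a near neighbor."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Up to k nearest minority neighbors of the chosen sample (excluding itself).
        neighbors = sorted(
            (m for m in minority if m is not base),
            key=lambda m: math.dist(base, m),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment between base and neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic

def oversample_training_fold(X_train, y_train, seed=0):
    """Balance classes in the TRAINING data only; the test fold is never touched."""
    pos = [x for x, y in zip(X_train, y_train) if y == 1]
    neg = [x for x, y in zip(X_train, y_train) if y == 0]
    minority, label = (pos, 1) if len(pos) < len(neg) else (neg, 0)
    deficit = abs(len(pos) - len(neg))
    extra = smote(minority, deficit, seed=seed) if deficit else []
    return X_train + extra, y_train + [label] * len(extra)

def interpret_auc(auc):
    """AUC categories as quoted in the text: <0.60, 0.60 to 0.75, >0.75."""
    if auc < 0.60:
        return "poor"
    if auc <= 0.75:
        return "possibly helpful"
    return "clearly useful"

# Hypothetical training fold: 3 minority (label 1) and 5 majority (label 0) samples.
X_train = [[0, 0], [1, 0], [0, 1], [5, 5], [6, 5], [5, 6], [6, 6], [7, 7]]
y_train = [1, 1, 1, 0, 0, 0, 0, 0]
X_bal, y_bal = oversample_training_fold(X_train, y_train)
print(len(X_bal), y_bal.count(1), y_bal.count(0))
print(interpret_auc(0.77), "|", interpret_auc(0.61), "|", interpret_auc(0.57))
```

Calling `oversample_training_fold` inside each cross-validation iteration, after the held-out fold has been set aside, keeps synthetic samples derived from test cases out of the training data.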
To improve model interpretability, the influence of individual features and feature groups on XGBoost model predictions for selected outcomes was quantified using SHAP (SHapley Additive exPlanations) analysis. 45 Point-biserial correlation coefficients between the radiomics features and outcomes were also computed. We expected features with strong point-biserial correlation coefficients to be concordant with their ranking position from SHAP. The underlying radiomics analysis was conducted using the MATLAB pipeline described by Fan et al. 27 All statistical analysis and ML model training and evaluation were conducted using Python version 3.12.1 with the Scikit-learn and XGBoost libraries.
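For reference, the point-biserial coefficient between a continuous feature and a binary outcome can be computed as in the minimal sketch below, which also ranks features by absolute correlation. The function names, feature names, and values are hypothetical and serve only to illustrate the calculation.

```python
import math
from statistics import mean, pstdev

def point_biserial(feature, outcome):
    """r_pb = (M1 - M0) / s_n * sqrt(p * q), with s_n the population SD of the feature.

    Numerically identical to the Pearson correlation with a 0/1 outcome.
    """
    group1 = [f for f, o in zip(feature, outcome) if o == 1]
    group0 = [f for f, o in zip(feature, outcome) if o == 0]
    n = len(feature)
    p, q = len(group1) / n, len(group0) / n
    return (mean(group1) - mean(group0)) / pstdev(feature) * math.sqrt(p * q)

def rank_features(features, outcome):
    """Return (name, r_pb) pairs sorted by decreasing absolute correlation."""
    scored = [(name, point_biserial(vals, outcome)) for name, vals in features.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)

# Hypothetical feature values for four tumors and a binary mutation outcome.
outcome = [0, 0, 1, 1]
features = {
    "hist_mean": [1.0, 2.0, 3.0, 4.0],
    "glcm_contrast": [0.5, 0.1, 0.4, 0.3],
}
ranking = rank_features(features, outcome)
for name, r in ranking:
    print(f"{name}: r_pb = {r:+.3f}")
```

Comparing such a ranking with the SHAP ranking for the same outcome is one simple way to check the expected concordance described above.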
3. Results
3.1. Patient selection
A total of 89 patients with pathologically proven MIBC and corresponding CT imaging acquired under an identical scanning protocol were identified within the TCGA dataset. The 89-patient cohort did not significantly differ from the original 412-patient TCGA MIBC cohort in any demographic or outcome variable (p>0.05). The cohort included 67 (75.3%) males and 22 (24.7%) females. The mean age was 69.3 ± 9.7 years (range 43-87). Of the patients, 79 (88.8%) were White, 9 (10.1%) Black, and 1 (1.1%) Asian. The sex, tumor stage, TMB, mRNA subtype, tumor DNA mutations, and level of mRNA expression in selected genes for each patient are shown in Figure 2(a)–(e). A table of the demographic data is shown in Figure 2(f).
Figure 2. A) Patient demographic data, mRNA cluster, and TMB. B) DNA alteration landscape. C) mRNA expression levels of genes included in a panel predictive of MIBC progression (panel A) and D) disease recurrence/overall survival (panel B). E) mRNA expression levels for tumorigenic genes included in radiogenomic analysis. F) Summary of demographic and staging information for included TCGA-BLCA cohort patients. (T/N = Tumor/Nodal, TMB = Tumor Mutational Burden).
3.2. Tumor DNA mutations
Fifteen DNA mutations were included. Clinically useful model prediction was achieved for EP300 (AUC = 0.77, 95% CI 0.64 – 0.91), FGFR3 (AUC = 0.76, 95% CI 0.56 – 0.96), and ARID1A (AUC = 0.75, 95% CI 0.68 – 0.83) using RF, EN, and EN models respectively. Models were also able to identify mutations in ASXL2, CREBBP, and RHOB, as well as high TMB (≥7.9 mutations/Mb), though with less consistency. No useful association between radiomics metrics and mutations of KMT2D, ATM, KDM6A, PIK3CA, ERBB2, RB1, ELF3, or TP53 was identified (Figure 3).
Figure 3. A) Distribution of each binary outcome investigated in the radiogenomics analysis: favorable vs. unfavorable predicted outcome from mRNA panels3,4, high vs. baseline/low mRNA expression of selected tumorigenic genes, presence or absence of DNA mutation, favorable (≥7.9 Mut/MB) or unfavorable (0.6) and likely useful (AUC > 0.7) results. The class of the highest performing model is indicated by point and bar color.
3.3. MIBC staging
We also evaluated the role of CTTA in predicting T-stage (T2 vs. T3/4) and N-stage (N0 vs. N+). The models did not usefully discriminate T-stage (AUC < 0.5) or N-stage (AUC = 0.57, 95% CI = 0.39-0.75) (Figure 3).
3.4. mRNA expression
Of the 89 patients included, 12 fit the “poor-prognostic signature” mRNA expression cluster defined by panel A, while 11 fit the “unfavorable” mRNA expression criteria of panel B. The EN model achieved potentially useful classification for panel A (AUC = 0.73, 95% CI 0.50 – 0.96) and the XGB model achieved similar results for panel B (AUC = 0.65, 95% CI 0.52 – 0.88). Models were also able to predict high transcription levels of several key regulatory genes. XGB identified elevated CDKN1A mRNA (AUC = 0.78, 95% CI 0.57 – 0.98) and EN reached potentially useful classification for CASP3 mRNA (AUC = 0.71, 95% CI 0.46 – 0.97). Classification results for the remaining transcription levels were poor (Figure 3).
3.5. Tumor molecular subtypes
Of the 89 tumors included, 29 (32.6%) were classified as the luminal-papillary subtype, 20 (22.5%) as luminal-infiltrated, 5 (5.6%) as luminal, 33 (37.1%) as basal/squamous, 1 (1.1%) as neuronal and 1 remained unclassified. Luminal and neuronal subtypes were excluded from our analysis due to low prevalence in the dataset. The RF model demonstrated potentially useful identification of the luminal infiltrated subtype (AUC = 0.69, 95% CI = 0.51-0.87). Identification of luminal papillary (AUC = 0.57, 95% CI = 0.49 – 0.66) and basal/squamous (AUC = 0.54, 95% CI = 0.38-0.70) subtypes was poor (Figure 3).
3.6. Feature importance analysis
SHAP analysis of outcomes classified with mean AUC ≥ 0.7 identified 185 features with nonzero contribution to the XGBoost model. Of the top 10% most influential radiomics features identified by SHAP analysis, the majority (69%) were from the HIST, FFT, and GLDM feature groups (Figure 4(a)). The individual features with the largest contributions, both to individual outcomes and across all outcomes, were mean pixel intensity (HIST), gray level difference matrix correlation coefficient (GLDM), and dominant eigenvector correlation (FFT) (Figure 4(b)). Higher absolute point-biserial correlation between outcomes and all radiomic features was significantly associated with increased model prediction ability as assessed by AUC (r = 0.445, 95% CI 0.107 – 0.691, p = 0.012).
Figure 4. A) Contributions of radiomics features to XGBoost model predictions as described by SHAP analysis. Only outcomes predicted with mean AUC ≥ 0.7 and features with at least one Shapley value ≥ 0.1 are shown. Normalized mean feature contribution across all selected outcomes is provided in the bottom row of the heatmap. B) Baseline point-biserial correlations for all 488 features grouped by radiomics method with all outcomes. (SHAP = SHapley Additive exPlanations, GLCM = Gray Level Co-occurrence Matrix, GLDM = Gray Level Difference Matrix, GLRLM = Gray Level Run Length Matrix, GLSZM = Gray Level Size Zone Matrix, FFT = Fast Fourier Transform, HIST = Histogram).
4. Discussion
MIBC is a heterogeneous malignancy with highly variable clinical outcomes. Disease prognosis has long been informed by tumor-node-metastasis (TNM) staging and histological features on biopsy. In recent years, genomic profiling of tumors has allowed for more individualized prognosis and tailored treatments through molecular subtyping, gene alterations, and transcriptional expression factors. In this study, we demonstrated that radiomics can be used to predict certain clinically relevant genome-level alterations via machine learning models.
Several previous studies have demonstrated the ability of radiomics to predict muscle invasive status, 29 clinical staging, 30 and progression free survival 31 in bladder cancer. MRI and CT based radiomic analysis has also successfully been used in conjunction with genomic data to better estimate patient prognosis.32,33 The integration of imaging and molecular data is an evolving field. Significant radiogenomic correlations have been established outside of bladder cancer, especially in lung and head and neck cancers, where specific radiomic textures serve as proxies for underlying transcriptomic signatures.46,47 Furthermore, pan-cancer analyses have suggested that certain radiomic features may even reflect shared biological pathways across disparate tumor types. 48
Building on this foundation, our work is the first, to our knowledge, to demonstrate that CTTA can move beyond prognostic risk stratification to identify a range of specific, therapeutically relevant molecular targets in bladder cancer. The resultant imaging-based surrogate biomarkers paint a more detailed picture than good vs. poor prognosis alone and hold potential to assist with personalized treatment selection.
In particular, we found that the radiogenomic models were able to identify mutations in EP300, FGFR3, ARID1A, and ASXL2, and elevated expression of CDKN1A and CASP3, with moderately high performance (AUC > 0.7). These genes represent current targets (erdafitinib for treatment of tumors with susceptible FGFR3 alterations), 13 or potential targets (for example, use of EZH2 and PI3K inhibitors in ARID1A deficient tumors) 9 for personalized therapy. Moreover, the CTTA models could inform prognosis status as determined by the four-gene mRNA panel A and molecular subtype.
Various studies have reported inter-scanner, intra-scanner, and multicenter variability in CTTA results due to imaging variables and processing factors that differ across multicenter studies. This lack of standardized CTTA protocols has limited the generalizability of past radiomics pipelines beyond the institution of origin.49,50 Post-processing data harmonization using techniques such as ComBat has been used, with some success, to ameliorate scanner and protocol variability in multicenter studies while preserving biological variability. 51 However, such batch adjustment methods have limitations when used on small datasets and require stringent data distribution assumptions to be met, limiting their applicability in practice. Therefore, to increase the reproducibility of our results, we used a feature set of 488 radiomics features previously reported as robust across different CT image acquisition setups. SHAP analysis conducted on the XGBoost model suggested that an even smaller feature set of <200 features may be feasible without loss in performance.
The SHAP analysis also highlighted several features with exceptionally large contributions towards predictions, allowing for insight into how model output may be related to image characteristics. For instance, high mean pixel intensity may indicate highly vascular tumors or a dense cellular structure, while low magnitude of the dominant eigenvector derived from the Fourier transform may indicate chaotic, disorganized growth of cells within the tumor. The interpretability possible with the models used in the study represents a distinct advantage over more complex image classifiers such as neural networks or transformer-based architectures.
5. Study limitations
Our study is limited by its retrospective, correlative design, and by a small dataset of only 89 patients secondary to imaging selection criteria and the size of the TCIA and TCGA datasets. Furthermore, the lack of information regarding distant metastases in nearly 50% of the cohort limits our ability to fully assess the clinical utility of these features, particularly in their capacity to predict advanced or systemic disease status. Machine learning literature suggests optimal performance with datasets with tens or hundreds of times as many samples as features, implying that large improvements in performance should be possible with datasets containing hundreds or thousands of tumor samples and more complete staging information. 52 The multicenter nature of the data also introduces inherent variation in the CT data. Correcting for this variation with our robust feature set limits the number of applicable features and still may not completely ameliorate negative effects on model performance.
While this single-cohort pilot study supports the feasibility of CT-based radiogenomic analysis of MIBC, our findings are not externally validated. We attempted to mitigate overfitting through cross-validation and data augmentation, but in the face of the lack of external validation and the small dataset, the risk of model overfitting must be considered. Future work should focus on external validation of these models on larger, independent cohorts and prospective studies to confirm their clinical utility.
6. Conclusions
CTTA is a promising tool for the evaluation of MIBC with the potential of deriving clinically useful DNA mutation and mRNA expression information from preexisting CT imaging data at scale. The output of CTTA-based pipelines may both serve as a prognostic adjunct and provide insight into the genomic and transcriptomic profile of MIBC. Continued curation of multimodal datasets and further research to create robust, validated image-based biomarkers are essential to translate these findings into clinical practice.
Footnotes
Ethical considerations
This study used publicly available, de-identified data obtained from the Cancer Imaging Archive and The Cancer Genome Atlas. In accordance with prevailing regulations and institutional policies, analyses of these datasets do not constitute human subjects research and do not require institutional review board approval or informed consent.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article. The Cancer Imaging Archive is funded by the Cancer Imaging Program (CIP), a part of the United States National Cancer Institute.
Declaration of conflicting interests
The authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: V.D. has a consulting relationship with Radmetrix, Roche, and Deeptek. S.P.L. has funding for clinical trials from Aura Biosciences, FKD, JBL (SWOG), Genentech (SWOG), Merck (Alliance), QED Therapeutics, SURGE Therapeutics, Vaxiion, and Viventia; is a consultant/advisory board member for Aura Bioscience, BMS, Gilead, Incyte, Pfizer/EMD Serono, Protara, Surge Therapeutics, UroGen, Vaxiion, and Verity; has a patent for the TCGA classifier and received honoraria from Grand Rounds in Urology and UroToday; has stock options from Aura Biosciences; and received funds for stock options from C2I Genomics/Veracyte.
