Abstract
Background:
Minimally invasive pyeloplasty (MIP), encompassing both conventional laparoscopy and robot-assisted approaches, has become the primary treatment for pediatric ureteropelvic junction obstruction (UPJO). However, these procedures remain associated with a considerable rate of postoperative complications that can affect surgical outcomes. This study used the Clavien–Dindo classification system and interpretable machine learning (ML) algorithms to predict Clavien–Dindo grade ≥2 complications from strictly preoperative clinical and ultrasound parameters. An online calculator was also developed to assist clinicians in real-time risk stratification.
Methods:
A retrospective analysis was conducted on 533 pediatric UPJO patients admitted between January 2020 and December 2024. Feature selection was performed using recursive feature elimination. Ten ML algorithms were developed and compared to identify the optimal predictive model.
Results:
A total of 533 children undergoing unilateral minimally invasive pyeloplasty were included, among whom 127 (23.8%) developed Clavien–Dindo grade ≥2 complications. Seven preoperative clinical and imaging features were prioritized for model construction: age, affected-to-unaffected kidney volume ratio, increased calyx diameters, blood neutrophil count, cystatin C, anteroposterior pelvic diameter, and white blood cell (WBC) count. Based on comparisons of the area under the receiver operating characteristic curve and decision curve analysis in training and testing sets, the light gradient boosting machine (LightGBM) model demonstrated superior performance. SHapley Additive exPlanations (SHAP) analysis enhanced interpretability, and an online calculator was developed to improve clinical applicability.
Conclusion:
The LightGBM model can effectively predict the occurrence of Clavien–Dindo grade ≥2 complications following unilateral MIP. It may assist clinicians in evaluating postoperative risks and developing personalized follow-up strategies for early intervention.
Introduction
Ureteropelvic junction obstruction (UPJO) is the most common cause of neonatal hydronephrosis, affecting as many as 20% of infants with prenatal hydronephrosis.1 It describes a functional or anatomical blockage of urine flow from the renal pelvis to the proximal ureter. This can result in dilation of the renal pelvis and calyces, potentially causing impaired kidney function and adversely affecting long-term pediatric outcomes. For patients with symptomatic UPJO, progressive worsening of dilation, or impaired renal function, surgical intervention remains the primary treatment option.1,2 Since its introduction by Anderson and Hynes, pyeloplasty has been established as the gold standard surgical procedure for the management of UPJO in children.3 Its efficacy and safety have been widely recognized in the field.4,5
In recent years, advances in pediatric surgical techniques have led to the growing adoption of minimally invasive approaches. Laparoscopic pyeloplasty (LP) has gradually replaced open pyeloplasty as the first-line surgical intervention for UPJO.4–6 In addition, robot-assisted LP (RALP) has undergone further development and application, with previous studies reporting its excellent efficacy and safety profiles.7 However, regardless of the minimally invasive modality chosen, the limited operating space and the smaller size of the renal pelvis and ureter in children pose additional challenges. Previous studies have reported major postoperative complication rates ranging from 6.7% to 37.5%, primarily including urinary tract infections (UTIs) and reobstruction,8–10 which can affect surgical outcomes. This study employs the Clavien–Dindo classification system, which is the most widely used system for objectively categorizing postoperative complications based on the level of intervention required.11 The complications of interest in this study are grade 2 (requiring pharmacological treatment) and grade 3 or higher (requiring surgical intervention).
Machine learning (ML) provides powerful analytical capabilities by identifying relationships between patient attributes and clinical outcomes through the analysis of large datasets, thereby supporting objective outcome prediction through data integration. These techniques have been extensively used in medical applications such as diagnosis, outcome forecasting, treatment planning, and image interpretation.12,13 To ensure that these predictions provide transparent and clinically understandable reasoning, our study combined ML algorithms with the SHapley Additive exPlanations (SHAP) framework to predict Clavien–Dindo grade ≥2 complications after unilateral MIP. We also created an online calculator to enhance clinical usability. This approach is intended to help clinicians assess postoperative risks, personalize follow-up strategies, and intervene earlier.
Methods
Study population
This study was conducted with the approval of the Ethics Review Committee of Children’s Hospital of Chongqing Medical University. The study population consisted of patients who underwent unilateral MIP for UPJO at our institution between January 2020 and December 2024. In recent years, our center has fully transitioned to minimally invasive techniques for this indication; thus, no open pyeloplasty cases were performed or included during the study period. The MIP procedures in our cohort encompassed both conventional LP and RALP. Eligible participants were under 18 years of age, were undergoing primary unilateral MIP (either LP or RALP) at our hospital, and had completed at least 6 months of postoperative follow-up in the outpatient clinic. Patients were excluded if they had concomitant congenital urinary tract anomalies such as duplex kidney, underwent bilateral procedures, had received initial surgical intervention at another hospital, had incomplete medical documentation, or were lost to follow-up. The overall clinical indications for surgical intervention primarily included the presence of UPJO-related symptoms (e.g., recurrent flank or abdominal pain, recurrent UTIs, and hematuria) or progressive worsening of hydronephrosis on serial ultrasound imaging (e.g., increased anteroposterior pelvic diameter [APD] and Society for Fetal Urology [SFU] grade III or IV). A total of 533 pediatric patients met the inclusion criteria. The cohort included 430 male patients (80.7%) and 103 female patients (19.3%), with a median age of 2.8 years (IQR: 0.8–6.9 years). Regarding the surgical approach, 512 patients underwent conventional LP, and 21 patients underwent RALP.
Data collection and outcomes
Data were obtained from electronic medical records, including demographic and clinical characteristics, preoperative laboratory values, and ultrasound findings. Collected variables comprised sex, age, weight, history of prenatal hydronephrosis, laterality of obstruction, clinical symptoms, and recurrent UTIs. Preoperative laboratory values included white blood cell (WBC) count, absolute neutrophil count and percentage, serum creatinine, cystatin C (Cys C), urinalysis parameters (WBCs, red blood cells), and urine culture results. Renal ultrasound evaluation involved affected kidney volume, affected-to-unaffected kidney volume ratio, APD, cortical thickness, and perfusion status of the affected kidney. Standardized ultrasound measurements, including APD and increased calyx diameters, were recorded during the most recent preoperative evaluation to ensure clinical reproducibility. Ultrasound-derived renal volume was determined according to the simplified ellipsoid formula (V = 0.5 × L × W × H), where L, W, and H are the renal length, width, and height.14 Renal perfusion status was evaluated using color Doppler flow imaging optimized for slow cortical flow and was classified as a binary variable: adequate perfusion or inadequate perfusion.15
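The volume formula and the derived affected-to-unaffected ratio can be illustrated with a minimal sketch. The function names and the example measurements below are hypothetical and are not taken from the study data; the sketch only assumes that all three dimensions are recorded in the same unit.

```python
def ellipsoid_volume(length_cm, width_cm, height_cm):
    """Simplified ellipsoid volume, V = 0.5 x L x W x H (cm^3 for inputs in cm)."""
    for value in (length_cm, width_cm, height_cm):
        if value <= 0:
            raise ValueError("dimensions must be positive")
    return 0.5 * length_cm * width_cm * height_cm


def volume_ratio(affected_dims, unaffected_dims):
    """Affected-to-unaffected kidney volume ratio from (L, W, H) tuples."""
    return ellipsoid_volume(*affected_dims) / ellipsoid_volume(*unaffected_dims)


# Hypothetical measurements in cm, for illustration only.
affected = (8.0, 4.0, 4.0)      # volume 64.0 cm^3
unaffected = (6.0, 3.0, 3.0)    # volume 27.0 cm^3
ratio = volume_ratio(affected, unaffected)
print(f"volume ratio = {ratio:.2f}")
```

Because the same constant 0.5 appears in both volumes, it cancels in the ratio, so the ratio depends only on the measured dimensions.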
Primary outcomes included UTI, stent-related complications requiring reintervention, and reobstruction needing reoperation. Complications were graded using the Clavien–Dindo classification.
ML model development
Selection of the variables
The feature selection process was conducted in three sequential steps to identify the most robust predictors. First, a panel of experienced pediatric urologists reviewed an original pool of 32 collected variables. Variables lacking direct clinical relevance to pyeloplasty outcomes or exhibiting limited biological plausibility were excluded, yielding a refined candidate pool of 22 strictly preoperative variables. Second, to mitigate the potential impact of multicollinearity, a Spearman correlation analysis was conducted among these preselected variables. For any pair of features exhibiting a strong correlation (correlation coefficient ≥0.7), one was carefully excluded to ensure feature independence. Finally, the remaining variables were incorporated into a recursive feature elimination (RFE) process based on the random forest (RF) algorithm. RFE is a robust wrapper method that iteratively removes the least important features and evaluates the model’s performance via 10-fold cross-validation to identify the optimal feature subset. The features ultimately retained by the RFE process were selected as the final candidate variables for model construction. All analyses were implemented in R using the “caret,” “randomForest,” and “corrplot” packages.
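The correlation-filtering step can be sketched in a few lines of standard-library Python. This is an illustration only and not the authors' R code: it assumes the filter is applied to the absolute value of the Spearman coefficient and that features are kept in a caller-supplied priority order, neither of which is specified above. The toy data are invented.

```python
def ranks(values):
    """Average 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def drop_correlated(columns, priority, threshold=0.7):
    """Keep features in priority order; skip any with |rho| >= threshold
    against a feature that has already been kept (assumed rule)."""
    kept = []
    for name in priority:
        if all(abs(spearman(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Invented toy data: weight rises monotonically with age, Cys C does not.
data = {
    "age": [0.5, 1.0, 2.0, 4.0, 6.0, 9.0],
    "weight": [7.0, 9.5, 12.0, 16.0, 21.0, 30.0],
    "cys_c": [1.1, 0.8, 1.0, 0.7, 0.9, 0.75],
}
kept = drop_correlated(data, ["age", "weight", "cys_c"])
print(kept)
```

In this toy example, weight is dropped because it is perfectly rank-correlated with age, while Cys C stays below the threshold and is retained. The retained features would then be passed to the RFE step.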
Processing of data
Missing values (<20%) were handled using standard imputation methods, and the dataset was partitioned into training and testing sets (7:3 ratio).
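A 7:3 partition can be sketched as follows. The sketch assumes a class-stratified split with the training count rounded up within each class; the text does not say whether the split was stratified, so this is an assumption, and the seed and function name are hypothetical.

```python
import math
import random


def stratified_split(labels, train_fraction=0.7, seed=2024):
    """Split indices into train/test sets within each outcome class.
    The per-class training count is rounded up (assumed convention)."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_train = math.ceil(len(idx) * train_fraction)
        train.extend(idx[:n_train])
        test.extend(idx[n_train:])
    return sorted(train), sorted(test)


# Outcome counts from the cohort: 127 with and 406 without grade >=2 complications.
labels = [1] * 127 + [0] * 406
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Under these assumptions the split yields 374 training and 159 testing patients, which matches the set sizes reported in the Results. This agreement does not confirm that the authors used the same procedure.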
Model selection
Ten ML algorithms were evaluated to predict Clavien–Dindo grade ≥2 complications. Hyperparameters were tuned via grid search, and models were rigorously validated using 10-fold cross-validation. Performance was comprehensively assessed using area under the curve (AUC), calibration curves, decision curve analysis (DCA), and standard classification metrics. Detailed descriptions of the algorithms and corresponding R packages are provided in the Supplementary Table S1.
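Two of the metrics named above, the AUC and the Brier score, have simple definitions that can be written out directly. The following is a minimal standard-library sketch with invented labels and predicted probabilities; it is not the authors' evaluation code.

```python
def roc_auc(y_true, y_score):
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Invented example: 1 = grade >=2 complication, 0 = none.
y = [1, 0, 1, 0, 0, 1]
p = [0.8, 0.3, 0.6, 0.7, 0.2, 0.4]
print(f"AUC = {roc_auc(y, p):.3f}, Brier = {brier_score(y, p):.3f}")
```

The AUC measures discrimination only, whereas the Brier score also penalizes poorly calibrated probabilities, which is one reason to report both.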
SHAP-based interpretability analysis
Model interpretability was achieved using SHAP via the “shapviz” R package. SHAP summary, swarm, and waterfall plots were generated to quantify and visualize individual feature contributions to the light gradient boosting machine (LightGBM) predictions.16
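The quantity that SHAP approximates, the Shapley value of each feature for a single prediction, can be computed exactly for a small model by enumerating feature subsets. The sketch below is illustrative only: the toy risk function, its coefficients, and the baseline patient are hypothetical, and absent features are replaced by a single baseline value rather than averaged over background data as tree-based SHAP implementations typically do.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance; features outside the
    coalition take their baseline value."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                coalition = set(combo)
                phi[i] += weight * (value(coalition | {i}) - value(coalition))
    return phi


def toy_risk(z):
    """Hypothetical risk score with an age-by-APD interaction."""
    age_years, volume_ratio, apd_cm = z
    young = 1.0 if age_years < 1.0 else 0.0
    return 0.10 + 0.15 * young + 0.05 * volume_ratio + 0.02 * apd_cm * young


x = [0.5, 2.0, 3.0]          # hypothetical patient
baseline = [3.0, 1.0, 1.5]   # hypothetical reference patient
phi = shapley_values(toy_risk, x, baseline)
print([round(v, 3) for v in phi])
```

The contributions sum to the difference between the prediction for the patient and the prediction for the baseline, which is the additivity property that waterfall plots display.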
Shiny online calculator
To facilitate clinical use, we built an online calculator with the R package “Shiny,” allowing clinicians to calculate real-time, individualized risk estimates for Clavien–Dindo grade ≥2 complications. The tool is hosted on Shinyapps.io for free public access.
Statistical analysis
All statistical analyses were performed using R (version 4.5.1). Normality of continuous variables was assessed using the Shapiro–Wilk test. Normally distributed variables are expressed as mean ± standard deviation and compared using the t-test; non-normally distributed variables are presented as median with interquartile range and compared with the Mann–Whitney U test. Categorical variables are summarized as frequencies (percentages) and compared using chi-square or Fisher’s exact test. Feature selection was conducted using RFE, with ML algorithms employed to identify the optimal predictive model. Model calibration was evaluated using calibration curves, and a two-sided p-value of <0.05 was considered statistically significant.
Results
Characteristics of the study population
A total of 533 patients were included in the final analysis, with 127 (23.8%) developing Clavien–Dindo grade ≥2 complications. These comprised 89 grade 2 (16.7%) and 38 grade 3 complications (7.1%). Specific complications included UTI (89 cases, 16.7%), Double-J stent-related complications (26 cases, 4.9%), and reobstruction requiring reoperation (12 cases, 2.3%) (Table 1). Patients were randomly divided into training (n = 374) and testing (n = 159) sets at a 7:3 ratio, with no significant differences in baseline characteristics observed between the two sets. The screening and analysis process is summarized in Figure 1.
Table 1. Baseline Characteristics of Included Patients Stratified by Clavien–Dindo Grade ≥2
APD = anteroposterior pelvic diameter; Cys C = cystatin C; LP = laparoscopic pyeloplasty; N = neutrophil count; Pt-thin = parenchymal thinning; RALP = robot-assisted laparoscopic pyeloplasty; Scr = serum creatinine; Urbc = urine red blood cell; UTI = urinary tract infection; Uwbc = urine white blood cell; WBC = white blood cell.

Figure 1. Flowchart of data screening and analysis.
Variable selection and correlation analysis
To identify the optimal predictors, we initially evaluated 22 candidate preoperative variables. First, Spearman correlation analysis identified strong multicollinearity among several baseline variables (Supplementary Fig. S1a). To ensure model stability, redundant variables, including weight, serum creatinine, and neutrophil percentage, were excluded. Subsequently, the remaining variables were subjected to an RF-based RFE process with 10-fold cross-validation. To balance model parsimony with high predictive accuracy, seven key preoperative variables were ultimately prioritized for model construction: age, affected-to-unaffected kidney volume ratio, increased calyx diameters, blood neutrophil count, Cys C, APD, and blood WBC count (Supplementary Fig. S1b, c).
Model development and comparison
Ten ML models were developed to predict the risk of Clavien–Dindo grade ≥2 complications after unilateral MIP. After excluding overfitted models, the remaining models were evaluated using receiver operating characteristic analysis. The LightGBM model demonstrated superior performance, with training and testing AUCs of 0.810 and 0.747, respectively (Fig. 2a, b).

Figure 2. Receiver operating characteristic (ROC) curves of the optimal light gradient boosting machine (LightGBM) prediction model.
Comprehensive evaluation confirmed that LightGBM outperformed other candidates across multiple metrics (Tables 2 and 3). Specifically, it demonstrated balanced sensitivity and specificity in the testing set, achieving an optimal Brier score of 0.200.
Table 2. Performance Metrics of Machine Learning Models on the Training Set
This table presents the performance metrics of various machine learning models on the training set, including AUC, accuracy, sensitivity, specificity, precision, F1-score, and Brier score.
AdaBoost = adaptive boosting; CatBoost = category boosting; GBM = gradient boosting machine; KNN = k-nearest neighbor; LightGBM = light gradient boosting machine; NN = neural network; SVM = support vector machine; XGBoost = extreme gradient boosting.
Table 3. Performance Metrics of Machine Learning Models on the Testing Set
This table presents the performance metrics of various machine learning models on the testing set, including AUC, accuracy, sensitivity, specificity, precision, F1-score, and Brier score. The bold text highlights the LightGBM model, which demonstrated the optimal overall predictive performance.
Calibration curves showed good prediction reliability (Supplementary Fig. S2a, b), and DCA demonstrated clinical utility across risk thresholds (Supplementary Fig. S2c, d), supporting LightGBM’s selection as the optimal predictive model.
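Decision curve analysis compares models by net benefit, which weighs true positives against false positives at a chosen risk threshold. The sketch below uses the standard net-benefit definition with invented data; the variable names and the 0.25 threshold are illustrative assumptions and are not values from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit = TP/n - (FP/n) * pt / (1 - pt) at risk threshold pt."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: classify every patient as high risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Invented example: 1 = grade >=2 complication, 0 = none.
y = [1, 0, 1, 0, 0, 1, 0, 0]
p = [0.8, 0.3, 0.6, 0.7, 0.2, 0.4, 0.1, 0.15]
pt = 0.25
print(f"model: {net_benefit(y, p, pt):.3f}")
print(f"treat all: {net_benefit_treat_all(y, pt):.3f}")
print("treat none: 0.000")
```

A model shows clinical utility at a given threshold when its net benefit exceeds both the treat-all and treat-none strategies, and a decision curve plots this comparison across a range of thresholds.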
SHAP model interpretability and web application development
SHAP summary and bee swarm plots were used to visualize feature contributions in the LightGBM model for predicting Clavien–Dindo grade ≥2 complications. Age showed the highest SHAP values, indicating a strong predictive influence, with younger age associated with an increased risk of Clavien–Dindo grade ≥2 complications. Preoperative affected-to-unaffected kidney volume ratio and increased calyx diameters also had high SHAP values, confirming their importance. In contrast, APD and WBC showed relatively lower but complementary predictive weights in the global model output (Supplementary Fig. S3a, b).
Partial dependence plots (PDPs) illustrated nonlinear relationships for age and other selected variables (Supplementary Fig. S3c, d). SHAP waterfall plots were used to interpret individual predictions in representative cases (Supplementary Fig. S3e, f), where yellow bars indicate risk-increasing features and red bars denote protective factors. The cumulative effect of features is shown as the f(x) value.
An online calculator (Supplementary Fig. S4) was developed based on the LightGBM model to predict complication risk, available at https://complicationsafterpyeloplasty.shinyapps.io/lgbweb/.
This tool assists clinicians in individualized risk assessment, postoperative planning, and early intervention for high-risk patients.
Discussion
This study developed and internally validated a strictly preoperative LightGBM model to predict Clavien–Dindo grade ≥2 complications after pediatric MIP. By integrating seven accessible preoperative features (age, kidney volume ratio, calyx dilation, APD, Cys C, WBC, and neutrophils) with SHAP interpretability, we created an online calculator to facilitate early risk stratification and personalized surgical planning.
Pyeloplasty remains the gold standard treatment for UPJO. With advances in minimally invasive techniques, LP and RALP have been widely adopted in pediatric patients with UPJO, achieving success rates comparable to those of open pyeloplasty.7,17 However, MIP is associated with a considerable postoperative complication rate, reported to be 20% or higher.18 Major complications classified as Clavien–Dindo grade ≥2 primarily include UTIs, Double-J stent-related adverse events, and reobstruction requiring reoperation. Previous studies on risk factors for complications following LP have suggested that preoperative APD, Cys C levels, and intraoperative blood loss may influence the occurrence of reobstruction.19 In addition, weight <10 kg and complications related to intraoperative nephrostomy tube placement have been identified as risk factors for higher-grade complications according to the Clavien–Dindo classification.20
Age emerged as a prominent predictor in our model, reflecting the interplay between anatomical constraints and physiological immaturity. In infants, limited retroperitoneal space restricts the maneuverability of minimally invasive instruments, which elevates the technical difficulty of anastomosis and potentially increases tissue trauma.21 Furthermore, younger age is closely associated with postoperative UTIs, likely due to an immature immune defense system.22 Lower age and somatic metrics may indicate inadequate mucosal immunity, while smaller bladder capacities and immature voiding patterns further predispose infants to infections.23 Postoperative UTIs can trigger persistent localized inflammation and fibroplasia at the anastomotic site, theoretically elevating the risk of secondary stenosis, a phenomenon frequently observed in redo pyeloplasties associated with a history of infection.24 Identifying patients at high risk for UTIs provides a strong clinical motivation to safeguard long-term anastomotic patency. Although body mass index (BMI) and growth curve percentiles are valuable developmental indicators that naturally mitigate collinearity concerns, our analysis was limited by missing height data in the retrospective records. Prospective studies are warranted to elucidate the impact of somatic metrics on the pathological repair process.
Beyond these age-related developmental factors, the local anatomical burden of the affected kidney critically dictates surgical outcomes. Our model integrates three ultrasound parameters (APD, increased calyx diameters, and the affected-to-unaffected kidney volume ratio) as predictors of major complications. APD is a well-established metric for hydronephrosis severity; larger preoperative values significantly correlate with increased risks of persistent obstruction, infection, and reoperation in pediatric UPJO.20 Although increased calyx diameters have not previously been established as an independent predictor, calyceal dilation is a core component of the SFU grading system.20,25 We postulate that markedly increased calyx diameters reflect severe chronic obstruction and early parenchymal compromise, predisposing patients to adverse trajectories.26 Furthermore, the affected-to-unaffected kidney volume ratio emerged as a novel predictor. We hypothesize that this ratio captures three-dimensional pelvic tension and overall anatomical distortion. By using the contralateral healthy kidney as the denominator, this metric inherently standardizes physical expansion against the patient’s baseline somatic growth. An elevated ratio likely indicates long-standing obstruction and heightened surgical complexity. As a hypothesis-generating finding, its prognostic value requires further multicenter validation.
Beyond local anatomical challenges, a patient’s baseline systemic environment significantly dictates recovery. Elevated WBC and neutrophil counts suggest a subclinical pro-inflammatory state, which surgical trauma may amplify, compromising anastomotic repair and increasing susceptibility to postoperative UTIs.23,27 Concurrently, baseline physiological reserve is crucial. Cys C, a sensitive marker of glomerular filtration,28 reflects the extent of chronic parenchymal damage secondary to long-standing obstruction. Consistent with our findings, prior research has identified Cys C as a potential risk factor for postoperative complications.19
This study possesses several strengths. First, it develops an ML model specifically to predict complications following unilateral MIP in pediatric UPJO. Second, employing the Clavien–Dindo classification to define endpoints enhances clinical relevance and comparability. Third, feature selection via RFE combined with multicollinearity testing improves model robustness. Furthermore, the comprehensive evaluation of 10 ML algorithms ensured optimal model selection, while the integration of SHAP values and DCA provided both interpretability and clinical utility. Finally, the development of an online calculator facilitates real-time risk assessment and personalized clinical decision-making, allowing pediatric urologists to proactively tailor surgical planning, optimize antibiotic prophylaxis, and individualize follow-up strategies for high-risk patients prior to surgery.
Several limitations of this study warrant consideration. First, the retrospective, single-center design may introduce inherent selection bias and limit the generalizability of our findings to other institutional settings. Second, despite standardized protocols, preoperative ultrasound metrics (e.g., APD) remain susceptible to transient physiological variability, such as hydration status and bladder fullness, which can introduce unavoidable measurement variations. Third, although the cohort size was substantial for this clinical population, it is relatively modest for complex ML algorithms. While we employed rigorous dimensionality reduction and 10-fold cross-validation to mitigate the risk of overfitting, this sample size inherently limits the absolute stability and generalizability of the model. Fourth, the current model was evaluated using internal validation only; independent external validation in multicenter cohorts is essential to confirm its broad clinical utility. Furthermore, the moderate AUC values suggest that a degree of variance remains unexplained. This could be attributed to the exclusion of certain potentially relevant variables, such as BMI—which was omitted due to missing height data—or radionuclide imaging results reflecting differential renal function. Finally, as a retrospective study, we could not account for all potential confounding factors. Prospective studies incorporating multidimensional data are needed to further refine and validate these predictive findings.
Conclusion
This study developed an ML model to predict Clavien–Dindo grade ≥2 complications after unilateral MIP. Incorporating SHAP improved the interpretability of individual risk predictions, and an online calculator supports clinical use. By integrating multidimensional anatomical metrics with baseline physiological indicators, this model translates clinical heuristics into a quantifiable risk score. This approach may aid clinicians in preoperative counseling, postoperative risk assessment, and personalized follow-up planning in pediatric patients.
Ethical Approval
This work has been approved by the Ethics Committee of Children’s Hospital of Chongqing Medical University. Study conduct aligned with the protocol, the ethical principles in the Declaration of Helsinki, and International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use guidelines, as well as applicable regulations and guidelines.
Authors’ Contributions
H.P. and R.W. conceptualized the study, curated and analyzed the data, designed the methodology, and wrote the initial draft. They contributed equally to this work. M.J. and Q.H. conducted the field investigations, applied analytical software, and created visualizations of the results. J.L., X.L., T.L., and G.W. oversaw project coordination and provided critical reviews of the article. D.H. secured research resources, supervised the study execution, validated the findings, and contributed to article revision.
Footnotes
Author Disclosure Statement
No competing financial interests exist.
Funding Information
The authors received funding from the
