Abstract
Objective:
To assess the performance of the Recurrence of Kidney Stone (ROKS) nomogram in identifying first-time stone formers who will require future stone procedures.
Materials and Methods:
From January 2009 to February 2016, 2287 patients underwent surgical treatment for nephrolithiasis at our institution and 498 of them were eligible for this study. We defined recurrence as repeat surgery for symptomatic nephrolithiasis. We analyzed the performance of the nomogram with respect to discrimination, calibration, and the clinical net benefit. We also examined the performance of each individual variable from the nomogram.
Results:
Over a median follow-up of 4.8 years (mean 4.6, IQR 3.1–6.1), 88 patients (17.7%) had recurrent nephrolithiasis requiring surgical treatment. The ROKS nomogram demonstrated moderate discriminative ability (AUC 0.655 for 2 years and 0.605 for 5 years). Calibration of the ROKS nomogram-based predictions was poor and net clinical benefit was minimal. Three of 11 predictors from the nomogram were statistically significantly associated with the risk of repeat surgery, with two of them representing similar clinical scenarios, namely symptomatic and nonsymptomatic renal stones.
Conclusion:
The ROKS nomogram demonstrated limited discrimination and calibration in predicting the risk of repeat surgery for symptomatic nephrolithiasis in our cohort of first-time stone formers. This may be caused by differences between stone patients who do and do not require surgery and suggests the need for more precise prediction instruments.
Introduction
Nephrolithiasis is increasingly prevalent in the United States and other developed nations. 1 Morbidity of this disease is particularly high in patients with recurrent symptomatic stones requiring surgical interventions, who are also at higher risk of developing long-term complications. Nephrolithiasis is also associated with a growing health care cost comparable with that of many oncologic diseases. 2 Identifying patients at risk of recurrent renal stones at the time of the original presentation could help to reduce this morbidity and cost through more comprehensive monitoring, such as precise imaging and biochemical evaluation, and more aggressive preventive measures, including dietary, behavioral, and pharmacologic interventions. 3 Furthermore, first-time stone formers who are aware of their increased risk of future recurrence may be more motivated to follow recommendations aimed at stone prevention, which improve quality of life and life expectancy.
The factors associated with an increased risk of disease recurrence among patients with nephrolithiasis were not well understood 4 until the Recurrence of Kidney Stone (ROKS) nomogram was published in 2014. 5 This prediction tool is based on 2239 first-time adult kidney stone formers, at least one-third of whom had surgical treatment. The nomogram demonstrated some discriminative ability in the original cohort; however, to our knowledge, it has not been externally validated. In this study, we tested the performance of this nomogram in identifying first-time stone formers who will require future interventions in a surgical cohort at the University of Wisconsin.
Materials and Methods
From January 2009 to February 2016, 2287 patients underwent surgical treatment for symptomatic nephrolithiasis at our institution and were entered into an IRB-approved database. For the purpose of this analysis, we excluded patients who had a personal history of renal stone disease (n = 1205), had less than 2 years of documented follow-up (n = 572), or were younger than 18 years at the time of the procedure (n = 12). This resulted in a study cohort of 498 patients.
We defined recurrence as repeat surgery for symptomatic nephrolithiasis. This did not include interventions planned during the first surgery (staged procedures), surgical treatment in a contralateral kidney for stones present at the time of the original diagnosis or procedures performed for asymptomatic stones found on follow-up imaging.
Estimates of recurrence at 2 and 5 years of follow-up were calculated for each individual patient using the ROKS nomogram. We then used these predictions to analyze the performance of the nomogram with respect to discrimination, calibration, and the clinical net benefit.
Discriminative performance was assessed using the area under the receiver operating characteristic curve (AUC). This parameter ranges from 0.5 (random predictions) to 1 (perfect concordance). Calibration was evaluated graphically by plotting the observed outcomes vs predicted probabilities (calibration curve). The LOWESS smoother function was used for this analysis. The clinical net benefit was evaluated with decision curve analysis (DCA). This method measures the potential clinical benefit resulting from changes in the management of patients with different probabilities of repeat surgery for symptomatic renal stones.
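As a rough illustration of the three performance measures described above, the following Python sketch computes the AUC, a grouped calibration summary, and the net benefit at a single threshold probability. It is not the code used in this study. The function names, the predicted risks, and the outcomes are hypothetical. The sketch also makes two simplifying assumptions: censoring is ignored by treating the outcome as binary at a fixed time horizon, and calibration is summarized within risk bands rather than with a LOWESS smoother.

```python
from typing import List, Sequence, Tuple


def auc(risks: Sequence[float], events: Sequence[int]) -> float:
    """Chance that a patient with the event has a higher predicted risk than
    a patient without it; tied predictions count as one half."""
    pos = [r for r, e in zip(risks, events) if e == 1]
    neg = [r for r, e in zip(risks, events) if e == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def calibration_by_band(
    risks: Sequence[float], events: Sequence[int], edges: Sequence[float]
) -> List[Tuple[float, float, int]]:
    """For each non-empty risk band [lo, hi), return
    (mean predicted risk, observed event rate, number of patients)."""
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = [(r, e) for r, e in zip(risks, events) if lo <= r < hi]
        if band:
            mean_pred = sum(r for r, _ in band) / len(band)
            observed = sum(e for _, e in band) / len(band)
            rows.append((mean_pred, observed, len(band)))
    return rows


def net_benefit(
    risks: Sequence[float], events: Sequence[int], threshold: float
) -> float:
    """Net benefit of intervening on patients with predicted risk >= threshold:
    TP/n - (FP/n) * threshold / (1 - threshold)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(events)
    tp = sum(1 for r, e in zip(risks, events) if r >= threshold and e == 1)
    fp = sum(1 for r, e in zip(risks, events) if r >= threshold and e == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Hypothetical predicted risks and observed outcomes (1 = repeat surgery).
RISKS = [0.05, 0.10, 0.15, 0.20, 0.30, 0.40]
EVENTS = [0, 0, 1, 0, 1, 1]

if __name__ == "__main__":
    print("AUC:", auc(RISKS, EVENTS))
    print("Calibration:", calibration_by_band(RISKS, EVENTS, [0.0, 0.2, 1.0]))
    print("Net benefit at 20%:", net_benefit(RISKS, EVENTS, 0.2))
```

Evaluating `net_benefit` over a range of thresholds and comparing it with the strategies of intervening on all patients or on none yields the decision curve.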
To identify the most valuable predictors among those included in the ROKS nomogram, we used Cox proportional hazards regression to examine the performance of each individual variable in our patient population. Harrell's C-index was used to quantify the predictive performance of each variable that was significantly associated with recurrence. All analyses were conducted using Stata version 11.0 software (College Station, TX).
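For readers unfamiliar with Harrell's C-index, the minimal Python sketch below shows how it can be computed from follow-up times, event indicators, and a risk score. It is an illustration only, not the implementation used for this analysis. The function name and the example values are hypothetical, and pairs with tied follow-up times are simply skipped, which is a simplification.

```python
from typing import Sequence


def harrell_c(
    times: Sequence[float], events: Sequence[int], scores: Sequence[float]
) -> float:
    """Harrell's C-index for right-censored data.

    A pair (i, j) is comparable when patient i had an observed event before
    patient j's last follow-up time. The pair is concordant when patient i
    also had the higher risk score; tied scores count as one half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # censored patients cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up (years), outcomes (1 = repeat surgery), and scores.
TIMES = [1.0, 2.0, 3.0, 4.0, 5.0]
EVENTS = [1, 0, 1, 0, 0]
SCORES = [0.40, 0.30, 0.10, 0.20, 0.05]

if __name__ == "__main__":
    print("Harrell's C-index:", harrell_c(TIMES, EVENTS, SCORES))
```

In this toy example, 5 of the 6 comparable pairs are concordant, giving a C-index of about 0.83. As with the AUC, a value of 0.5 corresponds to random predictions and 1 to perfect concordance.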
Results
Patients' characteristics are presented in Table 1. Over a median follow-up of 4.8 years (mean 4.6, IQR 3.1–6.1), 88 patients (17.7%) had recurrent symptomatic nephrolithiasis requiring surgical treatment. Overall, 269 patients (54.0%) had more than 5 years of documented follow-up or recurred within 5 years from their first surgery and were included in the analysis of the ROKS nomogram predictions for the risk of recurrence at 5 years. The estimated cumulative recurrence rate in the entire cohort is shown in Figure 1. About 25% of patients required a second procedure by 8 years of follow-up.

Figure 1. Cumulative risk of repeat surgery for symptomatic nephrolithiasis.
Table 1. Patients' Characteristics
The ROKS nomogram demonstrated moderate discriminative ability in identifying patients at increased risk of repeat surgery, with AUCs of 0.655 and 0.605 for 2- and 5-year predictions, respectively (Fig. 2A, B). The overall Harrell's C-index for the model was 0.622 (compared with 0.647 in the original population 5).

Figure 2. Receiver operating characteristic curves for the ROKS nomogram predictions at 2 and 5 years.
Calibration of the ROKS nomogram-based predictions was poor, especially when the estimated risk exceeded 20% for recurrence at 2 years and 40% for recurrence at 5 years (Fig. 3A, B). DCA demonstrated a small net benefit for nomogram-predicted probabilities of up to 17% for the 2-year predictions, while no net benefit was seen for the 5-year predictions (data not shown).

Figure 3. Calibration plots of the ROKS nomogram predictions at 2 and 5 years.
Table 2 shows the performance of individual predictors from the ROKS nomogram. Three of 11 predictors, namely age, presence of a symptomatic stone at the renal pelvis or lower renal pole, and a concurrent asymptomatic (nonobstructing) stone, were statistically significantly associated with the risk of repeat surgery, with the latter two representing similar clinical scenarios.
Table 2. Association Between Individual Predictors from the Recurrence of Kidney Stone Nomogram and the Risk of Repeat Surgery
CI = confidence interval; C-index = Harrell's C-index; HR = hazard ratio.
Discussion
Prediction instruments are becoming an ever more important part of current medical practice in light of the focus on outcome reporting. Nomograms provide individual estimates of future clinical events by combining the effects of various parameters associated with these events, thus addressing the uncertainty faced by both patients and physicians. External validation is a critical step in evaluating the clinical value of any prediction tool. It is well established that the performance of such tools tends to deteriorate when they are tested in populations other than the one in which they were developed. In this study, we analyzed the performance of a nomogram predicting the risk of a second symptomatic stone episode in first-time stone formers in our cohort of patients who underwent surgical treatment.
Our findings suggest that the ROKS nomogram has limited value in this setting. Although the nomogram was moderately helpful in discriminating between patients with higher and lower risk of repeat surgery (AUC 0.655 and 0.605 for 2- and 5-year predictions), the accuracy of the predictions was poor, especially when the estimated risk was relatively high. Moreover, of the 11 predictors included in the nomogram, only age and the presence of symptomatic and asymptomatic renal stones were statistically significantly associated with the risk of repeat surgery in our cohort.
Theoretically, a difference in the performance of a predictive tool between internal and external validation should result either from dissimilarity between the characteristics of patients in the cohort where the tool was originally developed and the one where it was tested, or from differences in the methodology of the analysis.
The sociodemographic composition of our cohort was likely similar to that in the original study. Indeed, our institution is located in the same region of the United States and less than 200 miles from Olmsted County. Therefore, the differences in performance more likely resulted from differences in disease characteristics and treatment outcomes between the general population of first-time stone formers and those who undergo surgical interventions.
Some differences could be hypothesized as potentially responsible for the deterioration in the performance of the nomogram. Preoperatively, patients who require surgery tend to have more severe disease, as shown by the higher prevalence of multiple stones and also renal stones in our cohort. On the other hand, the significance of nonobstructing stones likely differs between patients who pass only the symptomatic calculus and those who undergo surgery, as in the latter group the nonobstructing stones in the same renal unit are likely to be removed, while in the former group such stones remain and may produce recurrent symptoms in the future.
The presence of a stone at the ureterovesical junction was a predictor of lower risk of future stone episodes in the original cohort, in which at least 48% of patients passed their stones. While in a general population of patients with nephrolithiasis this stone location is associated with smaller stones and an increased likelihood of spontaneous passage both for the current and future stones, in patients who do not pass the stone and require a surgical procedure such an association may be much weaker.
At least two decades divide the middle of recruitment periods of our and the ROKS nomogram cohorts. It is therefore possible that temporal changes in the epidemiology of nephrolithiasis such as closing the gap in the prevalence of disease among males and females 6 may be at least partially responsible for the observed differences in the effect of gender on the risk of stone recurrence. Indeed, males represented 62.5% of the original cohort compared with just 52.2% in our population. The changes in surgical and medical management of nephrolithiasis may also have played some role.
Despite the considerations presented above, it is important to note that our approach of testing the ROKS nomogram in a surgical population was valid as this tool is intended to be used in surgical patients. First, as previously mentioned, 33% of the patients in the original cohort underwent surgical treatment. It is possible that surgical therapy was also performed in some of the 20% of patients for whom stone resolution was not documented or who had resolution of the symptoms without stone passage, as they may have had treatment outside of Olmsted County. Second, the authors tested but chose not to include surgical treatment as a variable in their final model, suggesting that their nomogram's ability to predict the risk of stone recurrence was not significantly affected by the treatment modality.
Another important difference between our analysis and the original study is that we used surgery for recurrent symptomatic stone disease, as opposed to any symptomatic stone (i.e., spontaneous passage, symptoms without passage, and surgery), as evidence of recurrence. While this may have affected the performance of the nomogram and especially its calibration, we believe that our approach has a number of advantages. Surgical treatment for recurrent stone disease is a more precise outcome than passage of a symptomatic stone or just stone-related symptoms, as the latter are difficult to document accurately, especially if the patient did not seek medical care. Furthermore, the costs and morbidity associated with surgical treatment for nephrolithiasis make it the most significant outcome of the alternatives. The authors of the original study did not specify the frequency of different types of recurrences in their cohort, making it impossible for us to estimate the possible effect of the differences in the definition of recurrence on the nomogram performance.
Our analysis has a number of other limitations inherent to relatively small, single-center studies with limited follow-up. Of note, the limited follow-up did not allow us to test the nomogram's performance in predicting the risk of recurrence at 10 years. Our analysis was done in the setting of a referral center, which likely affected the duration of follow-up as many patients chose to continue their medical care locally. This may have introduced some selection bias into the study, as patients with recurrent stones may have been more likely to return for follow-up at our institution.
In conclusion, the ROKS nomogram demonstrated limited discrimination and calibration in predicting the risk of repeat surgery for symptomatic nephrolithiasis in our cohort of first-time stone formers. This may be caused by the differences between stone patients who do and do not require surgery and suggests the need for development of more precise prediction tools specifically for the surgical population of patients with nephrolithiasis.
Footnotes
Author Disclosure Statement
No competing financial interests exist.
