Abstract
Introduction:
Urolithiasis guidelines still rely on the maximum stone diameter to propose treatment strategy, although this measure is known to have many pitfalls. Stone volume (SV) could represent a more accurate measurement, helping to plan the treatment or follow-up. Various methods to measure SV have been proposed. We aimed to compare different methods to estimate SV.
Methods:
Fifteen stones (human and artificial) were assessed. Real SV was measured using the water displacement method. Volume estimation included three diameter-based formulas (Ackerman, 4/3 × π × r³, and r³/2) and two 3D segmentation methods (Horos and Kidney Stone Calculator [KSC]). All measurements were done by a single operator. Spearman correlation test and comparative analyses were conducted between the real and the estimated SV.
Results:
Compared with real SVs, Ackerman and r3/2 formulas estimated volume accurately in 2/15 (13%) of stones each. No accurate measurement was reported using the sphere formula. KSC did estimate volume accurately in 4/15 (27%) stones compared with the reference SV; Horos did it in 7/15 (47%) stones. Both segmentation methods presented strong correlation coefficients (r = 0.9642 and 0.9659, p < 0.0001), while formula correlation was moderate (r = 0.7531, p < 0.0001).
Conclusion:
Formulas and segmentation methods for SV estimation resulted in divergent outcomes. Segmentation methods (Horos and KSC) presented higher accuracies in SV estimation, compared with real SV. Formulas were the least accurate.
Introduction
Urinary stones differ in their shape, size, location, risk of recurrence, etiology, and composition. 1 The most renowned urolithiasis guidelines 2,3 still consider the maximum stone diameter (MSD) one of the main factors when choosing one treatment over another. This could be inaccurate, as a small variation in diameter can translate into a large difference in stone burden. For example, a 19 × 19 × 19 mm spherical stone (3591 mm³) has double the volume of a 15 × 15 × 15 mm spherical stone (1767 mm³). The shape of the stone may also cause great variation in stone volume (SV): the same 15 × 15 × 15 mm stone has nine times the volume of a 15 × 5 × 5 mm stone (196 mm³).
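The volumes quoted in this paragraph follow from the standard sphere and ellipsoid formulas. The snippet below is a minimal sketch that reproduces them; the function names are illustrative and are not part of the study.

```python
import math


def sphere_volume(diameter_mm: float) -> float:
    """Volume (mm^3) of a sphere from its diameter: 4/3 * pi * r^3."""
    radius = diameter_mm / 2
    return 4 / 3 * math.pi * radius ** 3


def ellipsoid_volume(a_mm: float, b_mm: float, c_mm: float) -> float:
    """Volume (mm^3) of an ellipsoid from its three diameters: pi/6 * a * b * c."""
    return math.pi / 6 * a_mm * b_mm * c_mm


large = sphere_volume(19)               # 19 x 19 x 19 mm stone
small = sphere_volume(15)               # 15 x 15 x 15 mm stone
elongated = ellipsoid_volume(15, 5, 5)  # 15 x 5 x 5 mm stone

print(round(large), round(small), round(elongated))
print(round(large / small, 2), round(small / elongated, 2))
```

A 4 mm increase in diameter (15 to 19 mm) roughly doubles the volume, while keeping the 15 mm maximum diameter and shrinking the other two axes to 5 mm divides it by nine.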
Some authors 4 have proposed the use of SV to better report stone burden. Once the stone burden is known, estimating the lithotripsy duration during ureteroscopy becomes feasible, provided that pre- and intraoperative factors are also considered, such as the composition (hardness) of the stone and the surgeon’s experience, tools, and technique. 5,6 Nonetheless, some publications 7 –9 argue that SV is not a paramount variable for predicting stone-free rate after treatment.
Even a whole new terminology (laser ablation efficiency, laser ablation speed, laser efficacy, laser energy consumption, lasering time consumption, and total operative time laser consumption) has been described to better report stone treatment results. 10 All parameters in this newly adopted terminology have a common factor: SV.
Thus, having an accurate method for estimating SV is paramount for better individualization of urinary stone treatment. One classical option is to use formulas that estimate SV from a single measurement or from multiple measurements. A newer alternative is software that estimates SV by segmentation, which separates subregions of interest based on certain features (e.g., Hounsfield units [HU]). 11
The objective of this study was to evaluate different methods of estimating SV in vitro.
Methods
Fifteen stones of various shapes were assessed. Of them, nine were ex vivo human urinary stones, three were soft BegoStones (BegoStone plus, Rhode Island, USA), and three were hard BegoStones (BegoStone plus, Rhode Island, USA). The BegoStones were prepared to the desired hardness according to a validated method. 12 BegoStones were used to obtain shapes with different degrees of complexity. All BegoStones had irregular fantasy shapes (stones numbered from 10 to 15).
All stones were scanned with a Toshiba Aquilion Prime CT scanner (Tokyo, Japan). Acquisition was spiral, with a tube current of 60 mA and a tube voltage of 100 kVp. Slice thickness was 2.0 mm.
Reference SV was measured three times per stone with the water displacement method. Stones were immersed in water inside a graduated glass cylinder, and the volume of displaced water was recorded as the reference SV. We used cylinders of different sizes depending on the diameter of the stone, so that each stone could be inserted. This measurement was considered the “real SV.” Volume was estimated with three formulas: Ackerman (0.6 × π × radius²), sphere (4/3 × π × radius³), and radius³/2. In addition, two free, open-source software packages were used to estimate SV: Horos (https://horosproject.org) and Kidney Stone Calculator (KSC) (https://www.slicer.org). 13 Six estimations were performed using Horos and eight using KSC.
Stone radius was calculated by dividing the largest stone diameter by two. The diameter was measured using a Truper CALDI-6MP Vernier caliper (Jilotepec, Mexico).
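The three diameter-based estimators can be written out as a short sketch. The function names and the 20 mm example diameter are hypothetical, and the third formula is read here as r³ divided by 2, which is an assumption about the notation rather than something stated explicitly in the text.

```python
import math


def radius_from_max_diameter(max_diameter_mm: float) -> float:
    """Radius used by all three formulas: half the largest stone diameter."""
    return max_diameter_mm / 2


def ackerman_estimate(radius_mm: float) -> float:
    """Ackerman formula as given in Methods: 0.6 * pi * r^2."""
    return 0.6 * math.pi * radius_mm ** 2


def sphere_estimate(radius_mm: float) -> float:
    """Sphere formula: 4/3 * pi * r^3."""
    return 4 / 3 * math.pi * radius_mm ** 3


def half_cube_estimate(radius_mm: float) -> float:
    """Third formula, assumed to mean r^3 / 2."""
    return radius_mm ** 3 / 2


# Hypothetical stone with a 20 mm maximum diameter (not a study sample).
r = radius_from_max_diameter(20.0)
estimates = {
    "ackerman": ackerman_estimate(r),
    "sphere": sphere_estimate(r),
    "half_cube": half_cube_estimate(r),
}
for name, value in estimates.items():
    print(f"{name}: {value:.1f} mm^3")
```

For the same radius, the three formulas return values that differ by more than an order of magnitude, which is why they cannot all be close to the displaced-water volume of a given stone.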
Direct SV estimation was conducted by density-based segmentation in the bone window, using the KSC and Horos software. 14 Figure 1 shows a photograph of each real stone alongside its Horos and KSC 3D reconstructions.

Figure 1. Photographs of real stones compared with the Horos 3D reconstruction.
All raw data are in Supplementary Data.
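The idea behind density-based segmentation can be illustrated with a simple threshold-and-count sketch: voxels at or above a chosen HU value are counted and multiplied by the voxel volume. This is not the algorithm implemented in Horos or KSC. The function name, the 500 HU threshold, the 0.5 mm in-plane spacing, and the toy scan are all hypothetical; only the 2.0 mm slice thickness comes from the acquisition protocol.

```python
def segmented_volume_mm3(hu_volume, threshold_hu, voxel_size_mm):
    """Count voxels at or above the HU threshold and multiply by voxel volume.

    hu_volume is a nested list indexed as [slice][row][column].
    voxel_size_mm is (slice thickness, row spacing, column spacing) in mm.
    """
    dz, dy, dx = voxel_size_mm
    voxel_volume = dz * dy * dx
    count = sum(
        1
        for ct_slice in hu_volume
        for row in ct_slice
        for hu in row
        if hu >= threshold_hu
    )
    return count * voxel_volume


# Hypothetical toy scan: two 3 x 3 slices, background near 0 HU, stone near 1000 HU.
toy_scan = [
    [[0, 0, 0], [0, 1000, 1100], [0, 950, 0]],
    [[0, 0, 0], [0, 1200, 0], [0, 0, 0]],
]

volume = segmented_volume_mm3(
    toy_scan,
    threshold_hu=500,               # hypothetical threshold
    voxel_size_mm=(2.0, 0.5, 0.5),  # 2.0 mm slices; in-plane spacing assumed
)
print(volume)
```

In this sketch, each voxel is either fully counted or fully excluded, so the result depends on the threshold and on the voxel size, in particular the slice thickness.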
Statistical analysis
Quantitative variables were summarized with summary measures (mean and standard deviation), and categorical variables with frequency measures (percentages). The relationship between the real volumes and the volumes obtained with each method was analyzed with the Spearman correlation test. The methods were compared with the Kruskal–Wallis test, followed by a Bonferroni post hoc analysis for pairwise comparisons. The assumptions of normality and homoscedasticity were evaluated with the Shapiro–Wilk and Bartlett tests, respectively. All analyses were carried out with STATA statistical software version 17 (StataCorp, College Station, TX, United States), with a significance level of 5% (p < 0.05).
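For readers unfamiliar with the Spearman coefficient, the sketch below computes it as the Pearson correlation of the ranks, with tied values sharing their average rank. It is a plain-Python illustration and not the STATA routine used in the study; the helper names and the five volume pairs are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical volumes in mm^3 (not study data).
real = [120, 450, 300, 900, 60]
estimated = [150, 400, 420, 1100, 90]

rho = spearman_rho(real, estimated)
print(round(rho, 3))
```

With these hypothetical values, only two stones swap ranks, and rho is 0.9 even though every estimate differs from its reference value.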
Results
Median (interquartile range, IQR) maximum diameter of the 15 analyzed stones was 25.4 (14.2) mm. Human stones had a median diameter of 20.2 (19.1) mm and BegoStones 28.3 (7.6) mm.
Both formula and software volume estimates (mm³) are reported in Table 1.
Table 1. Results of Stone Volume Estimation with Different Methods
Measurements with no statistical difference compared with real volume are highlighted in green. Volume in mm³. All data are shown as median (interquartile range, IQR).
0: Real stone; 1: BegoStone soft; 2: BegoStone hard; Vol, Volume.
NA = not applicable.
The Ackerman formula accurately estimated the volume of two stones (stones 4 and 7; 2/15, 13%) compared with the reference volume. The sphere formula produced estimates that were statistically different from the reference volume for all stones, hence 0% concordance. The r³/2 formula accurately estimated the volume of two stones (stones 6 and 9; 2/15, 13%) compared with the reference volume.
Horos software produced accurate estimates compared with the reference SV in seven stones (7/15, 47%). KSC produced accurate estimates compared with the reference SV in four stones (4/15, 27%).
Considering irregular shapes (stones 10–15), Horos resulted in better estimation of SV (4/6, 67%) than KSC (0/6, 0%).
The three formulas had a moderate correlation with the reference volume (r = 0.75, p < 0.0001). Both software packages correlated strongly with the reference volume (r = 0.96, p < 0.0001) (Fig. 2).

Figure 2. Spearman correlation between real volume and the different modalities used to estimate stone volume. Interpretation of Spearman’s rho. 15
Formulas tended to overestimate volume more often than Horos and KSC. The sphere formula overestimated in all (100%) cases, r³/2 in 13 (87%) cases, and the Ackerman formula in 8 (53%) cases. Horos overestimated in 6 (40%) cases and underestimated in 2 (13%) cases; KSC overestimated in 10 (67%) cases and underestimated in 1 (7%) case.
Discussion
Stone burden estimation
There are three ways to estimate SV: manual, semiautomated, and automated methods. 16 Manual estimation of SV is difficult because it requires multiple measurements and assumes relatively spheroidal stones. A perfect method to estimate SV does not yet exist. However, software with 3D reconstruction (segmentation) seems highly accurate and is relatively user-friendly, even if not widely used. An automated artificial intelligence (AI)-based system could represent the near future of stone detection and quantification.
The ellipsoid formula has been shown to estimate SV poorly. 17 –19 These findings are consistent with ours, which show the Ackerman and sphere formulas to be inaccurate methods of measuring the stone. It must be noted that one of those studies 18 used 3D-printed models, whereas our study included both ex vivo human urinary stones and BegoStone models. CT-based algorithmic volumetric assessment of urinary stones has been reported as more accurate than ellipsoid formulas. 17
The reproducibility of KSC was previously evaluated, and the authors found a strong inter- and intraobserver correlation. Nonetheless, stone complexity significantly affected the measurements. 20 This could explain why we found that KSC was less able to estimate SV for irregularly shaped stones. Remarkably, Horos software performed well in those types of models.
Although the correlation coefficient and p value are approximately the same for the three formulas, this does not mean that the volumes they produce are equal. Rather, the coefficient indicates how well the values from each method are associated with the actual values, without measuring how close they are to them. Therefore, all three formulas are positively correlated with the actual value to a similar degree, but they differ in how closely they approximate it.
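One reason the three formulas can share a correlation coefficient while giving very different volumes is that each is an increasing function of the same radius, so each ranks the stones in the same order, and the Spearman coefficient depends only on ranks. The sketch below illustrates this under the assumption that each formula takes only the radius as input, with r³/2 read as r³ divided by 2; the radii are hypothetical.

```python
import math

# Hypothetical radii in mm (not study data), deliberately unsorted.
radii_mm = [10.0, 4.0, 15.0, 7.5, 12.5]

formulas = {
    "ackerman": lambda r: 0.6 * math.pi * r ** 2,
    "sphere": lambda r: 4 / 3 * math.pi * r ** 3,
    "half_cube": lambda r: r ** 3 / 2,  # assumed reading of r3/2
}


def rank_order(values):
    """Indices of the values sorted in ascending order."""
    return sorted(range(len(values)), key=lambda i: values[i])


estimates = {name: [f(r) for r in radii_mm] for name, f in formulas.items()}
orders = {name: rank_order(values) for name, values in estimates.items()}

# True when every formula orders the stones identically.
same_ranking = len({tuple(order) for order in orders.values()}) == 1

# Ratio between two formulas for the same stone; constant across stones.
sphere_to_half_cube = estimates["sphere"][0] / estimates["half_cube"][0]

print(same_ranking, round(sphere_to_half_cube, 2))
```

Identical rankings imply an identical Spearman coefficient against any reference, yet here the sphere estimate is always about 8.4 times the r³/2 estimate, so at most one of them can be close to the real volume.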
It must be highlighted that we defined accuracy as the absence of a statistically significant difference between an estimation method and the real SV. Some differences are statistically significant, but whether they are clinically significant is a matter of debate (e.g., stone 7). The capability to make a clinical difference probably also depends on other variables, such as hardness (composition) and the surgical approach used to pulverize the stone. The extremely strong correlation between software measurements and the real SV may be a hint of that. Horos proved more accurate, but KSC had nearly the same correlation with real SV as Horos.
AI and stone detection or quantification
AI can achieve automated stone detection. 21 AI can also quantify SV: one protocol, which uses 3D Slicer, has been shown to correlate strongly with expert volume estimation. 22 It must be noted that 3D Slicer is the base software of KSC. This study is encouraging, given that a simple method to estimate SV is probably what will expand the use of volume estimation instead of maximum diameter.
From in vitro to clinical practice
One question that arises when talking about SV is: what are the limits to indicate ureteroscopy, shockwave lithotripsy, or percutaneous nephrolithotripsy? That depends predominantly on the operator and the type of laser used, rather than on the stone itself. With an accurate SV estimation, it is easy to calculate our own laser ablation efficiency and speed. 10 With those numbers, we will be able to estimate preoperatively how long it should take us (individually) to ablate a certain SV and thus propose different treatment strategies to the patient.
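The arithmetic behind such a preoperative estimate can be sketched as follows, assuming that laser ablation speed is defined as ablated stone volume divided by lasering time. The function names and all numerical values are hypothetical and do not come from the study.

```python
def laser_ablation_speed(ablated_volume_mm3: float, lasering_time_s: float) -> float:
    """Ablation speed in mm^3/s, assumed to be volume divided by lasering time."""
    return ablated_volume_mm3 / lasering_time_s


def estimated_lasering_time_s(stone_volume_mm3: float, speed_mm3_per_s: float) -> float:
    """Expected lasering time for a given stone volume at a known ablation speed."""
    return stone_volume_mm3 / speed_mm3_per_s


# Hypothetical prior case: 900 mm^3 ablated in 1200 s of lasering.
own_speed = laser_ablation_speed(900.0, 1200.0)

# Hypothetical new stone of 1767 mm^3 (a 15 mm sphere).
expected_time = estimated_lasering_time_s(1767.0, own_speed)

print(own_speed, round(expected_time / 60, 1))
```

Because the expected time scales linearly with SV, any error in the volume estimate carries over proportionally to the predicted lasering time.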
SV measurement has gone beyond preoperative planning, and it was recently proposed for postoperative use, to report results as a volumetric stone-free rate. 23 This is relevant because it is still not clear what size of residual fragments can be considered clinically insignificant based solely on MSD.
Strengths and limitations
Our study has limitations, mainly that it is an ex vivo/in vitro study. Ideally, we should have compared in vivo and ex vivo measurements, but open surgery for stones is now extremely rare in our institution, and it would be unacceptable to perform that technique just to obtain intact stones. Another limitation is that we did not evaluate interobserver variability. Also, the number of stones evaluated is limited, which could underpower the study. Finally, we included only two software packages to estimate SV, not all of those available.
The main strength of our study is that it compares five methods to estimate SV. We used real stones to approximate real-world conditions and also included irregular shapes (BegoStones) to reproduce complex shapes and test the different methods.
Conclusion
Different modalities to estimate SV have nonconcordant results. All the formulas correlated moderately with real SV, whereas both software packages had an extremely strong correlation. Horos software was the most accurate (47% of cases), followed by KSC (27% of cases).
Footnotes
Author Disclosure Statement
Prof. O.T. is a consultant for Coloplast, Rocamed, Olympus, EMS, Boston Scientific, and IPG. F. Panthier is a consultant for Dornier Medtech.
Funding Information
No funding was received for this article.
Supplementary Material
Supplementary Data
Abbreviations Used
References
