R.E.N.A.L. Nephrometry Scoring: How Well Correlated Are Urologist, Radiologist, and Collaborator Scores?

Abstract

Purpose:

R.E.N.A.L. Nephrometry Score (NS) is an imaging-based (CT/MRI) scoring system commonly used by urologists to standardize the reporting of renal masses by enabling quantification of anatomical characteristics. We sought to examine the inter-rater correlation of NS between urologists, radiologists, and tumor-board collaborators.

Methods:

We identified adult patients undergoing partial or radical nephrectomy over 10 years (n=2450). Patients with autosomal dominant polycystic kidney disease (ADPKD), metastatic disease, masses >10 cm, and studies in which the study urologists or radiologists partook in patient care were excluded. Preoperative imaging was evaluated and patients with multiphasic CT available were included. Scans were provided to the reviewers to evaluate with a R.E.N.A.L. nephrometry questionnaire. Results were analyzed using kappa correlation coefficients.

Results:

One hundred twenty patients met inclusion criteria with mean age of 59.5 years. The majority of cases were partial nephrectomies (72%). Eighty-five percent of the tumors were malignant, with 26% having high-grade histology. The mean (standard deviation) overall NS was 6.8 (1.9) with fair correlation among reviewers (κ=0.222). Collaborators had the highest inter-rater correlation, ranging from 0.41 to 0.84 for NS component scores, compared with 0.42–0.85 for radiologists and 0.36–0.86 for urologists. “R” scores were best correlated (κ>0.8). NS correlation ranged between 0.16 and 0.31 for the groups while the NS complexity category correlation ranged between 0.50 and 0.61.

Conclusions:

Although the radiologists were naive to NS, inter-radiologist scoring patterns were better correlated than inter-urologist patterns. The urologist and radiologist collaborating in tumor board showed the highest agreement, suggesting that a multidisciplinary approach in the characterization of renal masses may provide benefit to patient management.

Introduction

Since 2009, the R.E.N.A.L. Nephrometry Score (NS) has been used to describe the complexity of renal masses in a standardized fashion between urologists. The components of the score are both objective and subjective: R, diameter of the mass; E, exophytic versus endophytic properties; N, nearness of the mass to the collecting system; A, anterior or posterior location; and L, location relative to polar lines.^1,2 Each component is associated with a numerical weight, and the weights sum to a total score ranging from 4 (low complexity) to 12 (high complexity). NSs were designed to guide management decisions and to improve both communication between care providers and comparisons of renal masses described in the literature.² Subsequent studies have suggested that NSs are associated with perioperative complications^3–6 and may be an independent predictor of surgical approach for T1a tumors.⁷

Based on the significant increase in published literature, multidisciplinary collaborations in the management of urologic malignancy, including renal cell carcinoma, are growing in popularity as clinical benefits and improved patient satisfaction are demonstrated. Regular participants in a genito-urinary (GU) tumor board can vary but typically include urologists, medical oncologists, radiation oncologists, pathologists, and radiologists. Discussion of renal tumor characteristics, normal kidney anatomy, presence or absence of tumor thrombus, lymph node disease, and metastatic spread is commonplace in this setting. As such, tumor board participants may develop similar patterns and approaches in the interpretation of renal mass imaging, although this has not been previously evaluated. Currently only urologists have implemented nephrometry scoring into their armamentarium for the assessment of kidney tumors; however, expanding nephrometry scoring to the field of radiology has the potential to improve communication and decision making in multidisciplinary settings. The effectiveness of the nephrometry scoring system is reliant upon high inter-rater correlation and reliability. Therefore, we sought to assess NS correlation between urologists, radiologists, and collaborators to determine any patterns or score distributions unique to certain groups.

Materials and Methods

After approval of the study by the Indiana University Institutional Review Board, we conducted a retrospective review of all adult patients undergoing partial or radical nephrectomy at our institution between June 2003 and June 2013 (n=2450). Patients were selected for inclusion if they had a preoperative multiphasic CT scan (dual or triphasic) in our system. This included patients whose CT was uploaded as a temporary file at our institution. We excluded patients with metastatic or locally advanced disease, tumor diameter >10 cm, and previously diagnosed polycystic kidney disease. Two urologists and two radiologists were chosen to participate in the study as reviewers. Both urologists are fellowship trained in urologic oncology (Timothy A. Masterson) and robotics (Ronald S. Boris). The radiologists focus on abdominal and GU radiology (Aashish A. Patel and Mark Tann). Any patient who received care from the study reviewers was excluded (i.e., imaging initially read by selected radiologists or surgery performed by selected urologists). One urologist and one radiologist regularly collaborate in multidisciplinary tumor boards and educational conferences and were defined for subanalysis as “collaborators.” These collaborators assessed the cases individually but, we hypothesized, would benefit from previous collaborative efforts.

Multiphasic CT scans were de-identified and made available to reviewers on an external hard drive. A computerized R.E.N.A.L. nephrometry scoring questionnaire was generated.¹ R.E.N.A.L. nephrometry is used to predict case complexity on a scale of 4–12, with 4–6 defined as low complexity, 7–9 as moderate complexity, and 10–12 as high complexity.² For the purpose of our study, we did not include the “A” variable (anterior vs posterior location) as it is not linear. Of note, no overview or introduction to nephrometry scoring was provided to the reviewers prior to this project, and neither radiologist was previously familiar with the scoring system. Study data were collected and managed using Research Electronic Data Capture (REDCap) tools hosted at Indiana University.⁸ REDCap is a secure, Web-based application designed to support data capture for research studies, providing (1) an intuitive interface for validated data entry, (2) audit trails for tracking data manipulation and export procedures, (3) automated export procedures for seamless data downloads to common statistical packages, and (4) procedures for importing data from external sources.
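The mapping from a total NS to a complexity category described above can be sketched in a few lines. This is an illustrative sketch only, not the study's REDCap questionnaire: the function names and the example component values are hypothetical, and the assumption that each of the four numeric components ("R," "E," "N," and "L") contributes 1 to 3 points follows from the published 4–12 range rather than from anything specific to this study.

```python
from typing import Dict


def nephrometry_total(components: Dict[str, int]) -> int:
    """Sum the four numeric components (R, E, N, L).

    Assumption: each component is scored 1, 2, or 3 points, which
    yields the 4-12 range described in the text. The "A" descriptor
    is non-numeric and is not part of the sum.
    """
    total = 0
    for name in ("R", "E", "N", "L"):
        points = components[name]
        if points not in (1, 2, 3):
            raise ValueError(f"{name} must be 1, 2, or 3, got {points}")
        total += points
    return total


def complexity_category(total_score: int) -> str:
    """Map a total NS to low (4-6), moderate (7-9), or high (10-12)."""
    if not 4 <= total_score <= 12:
        raise ValueError("total NS must lie between 4 and 12")
    if total_score <= 6:
        return "low"
    if total_score <= 9:
        return "moderate"
    return "high"


# Hypothetical example: component scores for a single mass.
example = {"R": 1, "E": 2, "N": 3, "L": 2}
total = nephrometry_total(example)
print(total, complexity_category(total))
```

Because three adjacent totals collapse into each category, two reviewers whose totals differ by one point will often still land in the same category.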

Descriptive analysis was performed using Pearson's chi-squared test for categorical variables. Cohen's and Fleiss' kappa correlation coefficients were used to assess overall correlation between all reviewers and correlation between the urologists, radiologists, and collaborators. Significance of the kappa scores was defined as follows: <0.01, poor correlation; 0.01–0.2, slight correlation; 0.21–0.4, fair correlation; 0.41–0.6, moderate correlation; 0.61–0.8, substantial correlation; and 0.81–1, almost perfect correlation.⁹ Scoring for one patient was not completed by reviewer 4, so it was eliminated from relevant correlative analyses. A priori, p-values<0.05 were considered statistically significant. Stata version 12.1 (Stata Corp. LP, College Station, TX) was used for all statistical analyses.
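For readers less familiar with the statistic, a minimal sketch of Cohen's kappa for two raters, together with the interpretation bands listed above, is given here. It is not the Stata routine used in the study; the function names and the ten example ratings are hypothetical and serve only to show how observed agreement is corrected for chance agreement.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters scoring the same cases."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: fraction of cases given identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per label.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single label")
    return (observed - expected) / (1.0 - expected)


def interpret_kappa(kappa: float) -> str:
    """Interpretation bands as defined in the text above."""
    if kappa < 0.01:
        return "poor"
    if kappa <= 0.2:
        return "slight"
    if kappa <= 0.4:
        return "fair"
    if kappa <= 0.6:
        return "moderate"
    if kappa <= 0.8:
        return "substantial"
    return "almost perfect"


# Hypothetical complexity categories from two reviewers for ten cases.
reviewer_1 = ["low", "low", "low", "low", "mod", "mod", "mod", "high", "high", "high"]
reviewer_2 = ["low", "low", "low", "mod", "mod", "mod", "high", "high", "high", "low"]
kappa = cohen_kappa(reviewer_1, reviewer_2)
print(round(kappa, 3), interpret_kappa(kappa))  # 0.545 moderate
```

In this example the raw agreement is 70%, yet kappa is about 0.55 once the 34% agreement expected by chance is removed.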

Results

One hundred twenty cases were identified for inclusion in the study. The average (standard deviation [SD]) age for patients included was 59.5 (13.2) years. The majority of these patients underwent partial nephrectomy for management of their mass (n=86). Tumor size ranged from 0.6 to 10.0 cm, with mean (SD) diameter of 3.3 (1.9) cm on final pathology. Eighty-five percent of the patients had malignant pathology (n=102) with clear cell being the most common histology (n=74). High-grade disease, defined as Fuhrman Score III–IV, was identified in 31 (26%) patients.

NSs assigned by three of the reviewers ranged from 4 to 11, while those assigned by the final reviewer ranged from 4 to 12. Mean (SD) NS for all reviewers was 6.8 (1.9) with fair correlation between the reviewers (κ=0.222) (Table 1). Correlations for the overall NS between the urologists, radiologists, and collaborators ranged from slight to fair, with the strongest correlation noted for the radiologists (κ=0.306) (Table 1). Moderate-to-substantial correlation was noted when comparing the assigned complexity categories, with κ=0.530 between all reviewers and κ=0.610 for the collaborators, which corresponded to 77.5% agreement between the two reviewers. The distribution of cases categorized within each complexity category was similar between the reviewers (p=0.092) (Table 2).
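The overall κ across all four reviewers is a Fleiss' kappa, which generalizes chance-corrected agreement to more than two raters. A minimal sketch follows, assuming every case has been scored by the same number of raters (cases with a missing score would need to be dropped first, as was done for one patient in this study). The function name and the five example cases are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def fleiss_kappa(ratings: Sequence[Sequence[Hashable]]) -> float:
    """Fleiss' kappa; ratings[i] holds the labels all raters gave to case i."""
    n_cases = len(ratings)
    if n_cases == 0:
        raise ValueError("at least one case is required")
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("every case needs the same number (>= 2) of ratings")

    label_totals: Counter = Counter()
    per_case_agreement = 0.0
    for row in ratings:
        row_counts = Counter(row)
        label_totals.update(row_counts)
        # Proportion of agreeing rater pairs for this case.
        pairs = sum(k * k for k in row_counts.values()) - n_raters
        per_case_agreement += pairs / (n_raters * (n_raters - 1))

    mean_agreement = per_case_agreement / n_cases
    # Chance agreement from the pooled label proportions.
    chance = sum((t / (n_cases * n_raters)) ** 2 for t in label_totals.values())
    if chance == 1.0:
        raise ValueError("kappa is undefined when only one label is ever used")
    return (mean_agreement - chance) / (1.0 - chance)


# Hypothetical complexity categories from four reviewers for five cases.
cases = [
    ["low", "low", "low", "low"],
    ["low", "low", "mod", "mod"],
    ["mod", "mod", "mod", "mod"],
    ["mod", "mod", "high", "high"],
    ["high", "high", "high", "high"],
]
print(round(fleiss_kappa(cases), 3))
```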

Table 1.

Agreement and Correlation Coefficients for Predicting R.E.N.A.L. NSs

		Urologists		Radiologists		Collaborators
	Overall κ^a	% Agreement	κ^b	% Agreement	κ^b	% Agreement	κ^b
NS	0.222	27.5	0.160	41.2	0.306	34.2	0.233
Complexity categories	0.530	70.0	0.497	75.6	0.561	77.5	0.610
“R” score	0.844	93.3	0.860	93.2	0.852	92.5	0.841
“E” score	0.443	63.3	0.357	71.8	0.422	70.0	0.442
“N” score	0.442	64.2	0.419	67.2	0.482	68.3	0.486
“L” score	0.388	55.8	0.356	62.2	0.440	60.8	0.414

^a κ represents Fleiss' kappa.

^b κ represents Cohen's kappa.

NS=nephrometry score.

Table 2.

Reviewer R.E.N.A.L. Nephrometry Complexity Category Scores

	Low (NS 4–6), n (%)	Moderate (NS 7–9), n (%)	High (NS 10–12), n (%)	p-Value
Urologist 1^a	58 (48)	49 (41)	13 (11)	0.092
Urologist 2	56 (47)	48 (40)	16 (13)
Radiologist 1^a	52 (43)	61 (51)	7 (6)
Radiologist 2	66 (55)	47 (40)	6 (5)
Sum of cases	232 (48)	205 (43)	42 (9)

^a Collaborators.

Within the individual components of the NS (i.e., “R,” “E,” etc.), there was variable correlation. Tumor diameter score (“R”) was the most highly correlated, with almost perfect correlation of κ=0.844. The percent agreement within each of the three groups was approximately 93% (Table 1). Correlation for the “E,” “N,” and “L” scores was fair to moderate, with 56%–72% agreement within each group. Aside from the overall NS and the “R” score, the highest percentage agreement was found among the collaborators.

Discussion

The interpretation of uroradiology falls primarily within the scope of training for radiology residency but it also exists within urology residency, as evidenced by the American Board of Urology continuing to incorporate radiographic interpretation in the certification process for urologists. The specific role for each is not well delineated; however, it has been advocated that radiologists and urologists bring separate interpretive and prognostic skills to a collaboration.¹⁰ Collaboration in multidisciplinary conferences and tumor boards likely provides the greatest value to patients. Within the field of endourology and renal calculus disease, previous studies have determined that both radiologists and urologists identify ureteric stones with generally equivalent results for both disciplines.^11,12 A recent study by Cho and colleagues examined imaging follow-up when guided by the ordering urologist and interpreting radiologist.¹³ Their results showed that rates of follow-up imaging were reduced when clinical correlation and radiology recommendations were collectively utilized by urologists to decide the need for a particular study. These findings underscore the collaborative role of both disciplines in the optimum management of urology patients who depend on imaging for their care.

Since its inception, nephrometry scoring has improved the descriptive abilities among urologists for various renal masses.² Although not initially intended to be a system for prediction of postoperative complications, multiple studies have investigated this potential, with generally significant results.^{3–6,14–18} Both Hew and colleagues and Simhan and colleagues reported that NSs could be useful in predicting perioperative complications following partial nephrectomies.^4,6 Broughton and colleagues evaluated clinical T1a renal cortical tumors and determined that renal mass complexity was independently associated with urologists' decision to perform partial or radical nephrectomy.⁷ Implementing a system such as nephrometry scoring among radiologists has the potential to greatly impact the urologists' surgical planning and perioperative management strategy. Familiarity with NS may improve communication and standardization between radiologists and urologists as they independently assess various renal masses. Additionally, using NS may assist with patient education regarding perioperative expectations and complication risks. While the historical cohort of patients used for evaluating NS included both open and minimally invasive radical and partial nephrectomies, recent studies have been equivocal in determining whether NS is predictive of postoperative outcomes among minimally invasive patients alone.^5,15–19 Regardless of findings, the usefulness of nephrometry scoring is critically dependent upon the assumption that the scores are reliable and reproducible, justifying our investigation of correlation across disciplines.

Calculating inter-rater correlation and variability relies on both the observed agreement between raters and the probability of agreement between the raters by chance alone.⁹ There are limitations to using the kappa correlation coefficient, including the fact that in the setting of higher inter-rater agreement, it can be relatively more difficult to achieve a higher kappa coefficient.²⁰ Because of these factors, alternative correlation measures have been suggested, such as the Gwet AC1, which calculates a potentially more robust probability of chance agreement than the kappa correlation coefficient.²⁰ Regardless, the kappa correlation coefficient remains the most widely accepted method of describing inter-rater variability. Understanding the limitations of the kappa is critical to interpreting our, and others', findings. We found it interesting that although the individual subjective scores had fair-to-moderate correlation, there was near-perfect correlation for the objective score of diameter (“R”), which has been similarly identified in previous studies.^{4,19,21–23} Weight and colleagues examined concordance for NSs, finding substantial correlation; however, among tumors >7 cm, this dropped to fair correlation.²³ Similarly, Kolla and colleagues reported substantial correlation for the individual NSs.²¹ Our study, in contrast, found only slight-to-fair correlation in the individual NSs. We hypothesize that this might be reflective of our choosing not to provide a group overview of NS prior to beginning the study; yet, as we were attempting to replicate everyday practice in which radiologists and urologists are reliant upon their interpretation of the scoring system, we believe that our results offer interesting findings. We do report moderate-to-substantial correlation for the overall NS complexity category, which implies that regardless of variation in the reported component NSs, the overall complexity of the tumor, as defined by NS, is fairly well correlated between urologists, radiologists, and collaborators.
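The kappa limitation noted above can be illustrated with a small numerical sketch. The data are hypothetical (100 cases, two raters, two labels) and chosen so that raw agreement is high but one label dominates. The functions are illustrative two-rater implementations, not the study's analysis, and for simplicity the number of categories for AC1 is taken from the labels actually observed, which is an assumption.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa: chance term is the product of each rater's marginals."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - chance) / (1.0 - chance)


def gwet_ac1(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Gwet's AC1 for two raters.

    Chance term: sum over labels of pi * (1 - pi), divided by (q - 1),
    where pi is the mean of the two raters' marginal proportions and
    q is the number of labels (here inferred from the observed data).
    """
    n = len(rater_a)
    labels = set(rater_a) | set(rater_b)
    q = len(labels)
    if q < 2:
        raise ValueError("AC1 needs at least two categories")
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    mean_props = [(counts_a[c] + counts_b[c]) / (2 * n) for c in labels]
    chance = sum(p * (1.0 - p) for p in mean_props) / (q - 1)
    return (observed - chance) / (1.0 - chance)


# Hypothetical data: 90 cases both raters call "low", 10 disagreements.
rater_1 = ["low"] * 95 + ["high"] * 5
rater_2 = ["low"] * 90 + ["high"] * 5 + ["low"] * 5

print(round(cohen_kappa(rater_1, rater_2), 3))  # -0.053
print(round(gwet_ac1(rater_1, rater_2), 3))     # 0.89
```

With 90% raw agreement, kappa is slightly negative because the expected chance agreement (0.905) exceeds the observed agreement, whereas AC1 remains close to the raw figure.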

To our knowledge, no other study has attempted to examine correlation in NS between urologists and radiologists in the evaluation of renal masses. The radiologists, with no exposure to NS aside from the nephrometry.com link, were in fact better correlated than were the urologists. We believe that this may be in part due to their formal training and greater familiarity with standard imaging practices. Counter-intuitively, the urologists were the least correlated in the study. Additionally, the urologists reported more than twice the number of high complexity cases compared to the radiologists (29 vs 13; Table 2). Although we are unsure of the reason for either of these findings, we speculate that personal biases and interpretations of the expected operative complexity may play a significant role in explaining these differences.

In interpreting our results, the consistently highest percentage agreement was found between the collaborating urologist and radiologist over all other compared groups. Correlation in scores was similar between the collaborators and the radiologists. Although this in and of itself does not advocate the benefit of a multidisciplinary approach to GU oncologic disease, it does suggest that collaboration may harbor similar patterns of analysis and interpretation among participants that could strengthen the reproducibility of interpretive systems, such as nephrometry scoring. In other GU malignancies such as prostate cancer, the potential benefit of multidisciplinary care has been frequently evaluated.^24–27 In a recent study that examined 15 years of experience with a multidisciplinary clinic for management of prostate cancer, Gomella and colleagues showed improved survival particularly for patients with high-risk, locally advanced prostate cancer.²⁶ Although not examining exclusively prostate cancer patients, Acher and colleagues found that there was a benefit to multidisciplinary evaluation of selected cases that were identified as “potential change cases” prior to the meeting but that there was little benefit to cases that were not identified as such.²⁸ The need for careful examination is further echoed in a SEER database study in which Bekelman and colleagues report that there was a significant increase in use of intensity-modulated radiation therapy among localized prostate cancer patients managed by integrated care groups with a decrease in the less-expensive androgen deprivation therapy.²⁵ Particularly in the setting of increasingly widespread Accountable Care Organizations, the importance of multidisciplinary collaborations to provide appropriate, guideline-driven care for complex patients is paramount.²⁹ Although the role of tumor board collaboration for renal cell cancer has not specifically been examined, these prior studies suggest that the potential benefits may extend outward to other GU malignancies. Whether our study results specifically support this theory is unclear, but they do suggest that collaboration may eliminate some subjectivity in the interpretation of renal masses, allowing for a more consistent, standard approach to tumor management. Because the value of NSs is dependent on their reproducibility, assessing the relevance of their correlative strength among physicians who would potentially be using them is critically important.

There are several limitations of our study. First, we utilized only four reviewers, two urologists and two radiologists. Additionally, having a reviewer-wide conversation regarding nephrometry scoring at the onset of the study may have increased our correlation of NSs, particularly the more subjective components; however, we elected to evaluate correlation only between raters who were self-interpreting the scoring system. We recognize that increasing the number of reviewers would strengthen our conclusions, and this is a direction for future work. Further, because of exclusion criteria only a limited number of scans (120) were included in our analysis, which could have impacted our results. Despite these limitations, this study both demonstrates the potential benefit of multidisciplinary collaborations in improving inter-rater description of renal masses and provides unique insight into the inter-reviewer variability for nephrometry scoring using urologists familiar with NS and radiologists with minimal prior exposure to NS.

Conclusions

Although nephrometry scoring has been previously reported to have high inter-rater correlation, we found substantial correlation only for the complexity categories of the NS. The highest agreement in scoring was seen between multidisciplinary collaborators. The role of multidisciplinary collaboration in improving description and characterization of renal masses should be further investigated and potentially incorporated into radiology and urology residency training in order to foster greater collaboration and understanding of the contributions each field can play in the management of patients with a renal mass.

Footnotes

Disclosure Statement

No competing financial interests exist for any of the authors.

References

1. R.E.N.A.L. Nephrometry Scoring System. Fox Chase Cancer Center. Available at: http://nephrometry.com. (Accessed December 2013).

2. Kutikov, Uzzo. The R.E.N.A.L. nephrometry score: A comprehensive standardized system for quantitating renal tumor size, location and depth. J Urol, 2009; 182:844–853.

3. Bruner, Breau, Lohse, Leibovich, Blute. Renal nephrometry score is associated with urine leak after partial nephrectomy. BJU Int, 2011; 108:67–72.

4. Hew, Baseskioglu, Barwari, et al. Critical appraisal of the PADUA classification and assessment of the R.E.N.A.L. nephrometry score in patients undergoing partial nephrectomy. J Urol, 2011; 186:42–46.

5. Mathieu, Verhoest, Droupy, et al. Predictive factors of complications after robot-assisted laparoscopic partial nephrectomy: A retrospective multicentre study. BJU Int, 2013; 112:E283–E289.

6. Simhan, Smaldone, Tsai, et al. Objective measures of renal mass anatomic complexity predict rates of major complications following partial nephrectomy. Eur Urol, 2011; 60:724–730.

7. Broughton, Clark, Barocas, Cookson, Smith Jr., Herrell, Chang. Tumour size, tumour complexity, and surgical approach are associated with nephrectomy type in small renal cortical tumours treated electively. BJU Int, 2012; 109:1607–1613.

8. Harris, Taylor, Thielke, Payne, Gonzalez, Conde. Research electronic data capture (REDCap)—a metadata-driven methodology and workflow process for providing translational research informatics support. J Biomed Inform, 2009; 42:377–381.

9. Landis, Koch. The measurement of observer agreement for categorical data. Biometrics, 1977; 33:159–174.

10. Shabsigh, Scardino. The urologist and medical imaging. Nat Clin Pract Urol, 2006; 3:175.

11. Connolly, Younis, Meade, et al. Can computed tomography in the protocol for renal colic be interpreted by urologists? BJU Int, 2004; 94:1332–1335.

12. Freed, Paulson, Frederick, et al. Interobserver variability in the interpretation of unenhanced helical CT for the diagnosis of ureteral stone disease. J Comput Assist Tomogr, 1998; 22:732–737.

13. Cho, Fulgham, Clark, Kavoussi. Followup imaging after urological imaging studies: Comparison of radiologist recommendation and urologist practice. J Urol, 2010; 184:254–257.

14. Long, Arnoux, Fiard, et al. External validation of the RENAL nephrometry score in renal tumours treated by partial nephrectomy. BJU Int, 2013; 111:233–239.

15. Mayer, Godoy, Choi, Goh, Bian, Link. Higher RENAL Nephrometry Score is predictive of longer warm ischemia time and collecting system entry during laparoscopic and robotic-assisted partial nephrectomy. Urology, 2012; 79:1052–1056.

16. Mufarrij, Krane, Rajamahanty, Hemal. Does nephrometry scoring of renal tumors predict outcomes in patients selected for robot-assisted partial nephrectomy? J Endourol, 2011; 25:1649–1653.

17. Sea, Bahler, Lucas, Mendonsa, Sundaram. Comparison of measured renal tumor size versus R.E.N.A.L. Nephrometry Score in predicting patient outcomes following robot assisted laparoscopic partial nephrectomy. J Endourol, 2013. DOI: 10.1089/end.2013-0202.ECC13.

18. Zhang, Tang, Li, et al. Clinical analysis of the PADUA and the RENAL scoring systems for renal neoplasms: A retrospective study of 245 patients undergoing laparoscopic partial nephrectomy. Int J Urol, 2013; 21:40–44.

19. Png, Bahler, Milgrom, Lucas, Sundaram. The role of R.E.N.A.L. nephrometry score in the era of robot-assisted partial nephrectomy. J Endourol, 2013; 27:304–308.

20. Gwet. Computing inter-rater reliability and its variance in the presence of high agreement. Br J Math Stat Psychol, 2008; 61(Pt 1):29–48.

21. Kolla, Spiess, Sexton. Interobserver reliability of the RENAL nephrometry scoring system. Urology, 2011; 78:592–594.

22. Montag, Waingankar, Sadek, Rais-Bahrami, Kavoussi, Vira. Reproducibility and fidelity of the R.E.N.A.L. nephrometry score. J Endourol, 2011; 25:1925–1928.

23. Weight, Atwell, Fazzio, et al. A multidisciplinary evaluation of inter-reviewer agreement of the nephrometry score and the prediction of long-term outcomes. J Urol, 2011; 186:1223–1228.

24. Basler, Jenkins, Swanson. Multidisciplinary management of prostate malignancy. Curr Urol Rep, 2005; 6:228–234.

25. Bekelman, Suneja, Guzzo, Pollack, Armstrong, Epstein. Effect of practice integration between urologists and radiation oncologists on prostate cancer treatment patterns. J Urol, 2013; 190:97–101.

26. Gomella, Lin, Hoffman-Censits, et al. Enhancing prostate cancer care through the multidisciplinary clinic approach: A 15-year experience. J Oncol Pract, 2010; 6:e5–e10.

27. Valicenti, Gomella, El-Gabry, et al. The multidisciplinary clinic approach to prostate cancer counseling and treatment. Semin Urol Oncol, 2000; 18:188–191.

28. Acher, Young, Etherington-Foy, McCahy, Deane. Improving outcomes in urological cancers: The impact of “multidisciplinary team meetings.” Int J Surg, 2005; 3:121–123.

29. Mehta, Macklis. Overview of accountable care organizations for oncology specialists. J Oncol Pract, 2013; 9:216–221.