Abstract
Purpose
Computed tomography angiography is used for quantifying the significance of pulmonary embolism, but its reliability has not been well defined.
Methods
The study cohort comprised 10 patients randomly selected from a 150-patient prospective trial of ultrasound-facilitated fibrinolysis for acute pulmonary embolism. Four reviewers independently evaluated the right-to-left ventricular diameter ratios using the standard multiplanar reformatted technique and a simplified (axial) method, and thrombus burden with the standard modified Miller score and a new, refined Miller scoring system.
Results
The intraclass correlation coefficient for intra-observer variability was .949 and .970 for the multiplanar reformatted and axial methods for estimating right-to-left ventricular ratios, respectively. Inter-observer agreement was high and similar for the two methods, with intraclass correlation coefficient of .969 and .976. The modified Miller score had good intra-observer agreement (intraclass correlation coefficient .820) and was similar to the refined Miller method (intraclass correlation coefficient .883) for estimating thrombus burden. Inter-observer agreement was also comparable between the techniques, with intraclass correlation coefficient of .829 and .914 for the modified Miller and refined Miller methods.
Conclusions
The reliability of computed tomography angiography for pulmonary embolism was excellent for the axial and multiplanar reformatted methods of quantifying the right-to-left ventricular ratio and for the modified Miller and refined Miller scores of pulmonary artery thrombus burden.
Introduction
Venous thromboembolic events occur annually in the United States in 1 to 2 individuals per 1000 population, or approximately 300,000–600,000 cases.1,2 Pulmonary embolism (PE) is the presenting event in about one-third of these patients and has a high case fatality rate.3 Right ventricular (RV) failure due to PE may result in hemodynamic collapse, cardiogenic shock, and early mortality.4 Ultrasound-facilitated, catheter-directed thrombolysis, which alleviates RV dysfunction, was recently approved by the US Food and Drug Administration for the treatment of PE.5–8
Effective therapy for PE requires rapid, accurate diagnosis and risk stratification, which is facilitated by computed tomography angiography (CTA).9,10 Concomitant RV dysfunction can be estimated by CTA with measurement of the right-to-left ventricular (RV/LV) diameter ratio.11 Imaging findings help clinicians to stratify patients to anticoagulation alone or anticoagulation plus advanced therapy, such as catheter-directed thrombolysis.12,13 Further, CTA is useful to quantitate thrombus dissolution and RV normalization during and after treatment.14
While CTA is valuable in the diagnosis and management of PE, the methodology for measuring the RV/LV ratio is not well standardized. Measurement techniques vary, which makes comparisons among clinical trials, among individual patients, and among longitudinal examinations of a single patient problematic. Our project assesses the reliability of different CTA measures for quantifying the functional and anatomic significance of PE by evaluating the intra-observer and inter-observer agreement of CTA endpoints in patients with PE.
Materials and methods
The study cohort was a randomly selected 10-patient subset of the SEATTLE II trial, a 150-patient prospective multicenter single-arm study of ultrasound-facilitated, catheter-directed thrombolysis for acute massive and submassive PE (NCT01513759).15 Patients aged ≥18 years with symptoms ≤14 days in duration were eligible for inclusion after a contrast-enhanced computed tomography imaging study demonstrated evidence of proximal PE. Thrombus was required to be present in at least one main or segmental pulmonary artery with an RV/LV diameter ratio of ≥0.9. The study was approved by the institutional review board of each investigational site, and written informed consent was obtained from each patient.
Each patient was required to have technically adequate CTA studies that passed internal quality assurance procedures. Each CTA was a high-quality non-gated contrast-enhanced chest computed tomographic study with adequate visualization of the ventricles and both lung fields. Images were reconstructed at ≤3.0 mm slices at ≤1.5-mm increments. CTA studies were reviewed using Aquarius iNtuition software (TeraRecon, Foster City, CA, USA). The cohort evaluated in the current study was randomly selected from the group of patients with imaging studies meeting these criteria. A random number between 0 and 1 was assigned to each of the 139 patients with technically adequate pre- and post-procedure CTA imaging studies (Microsoft Excel 2013, Redmond, WA). These patients were sequenced from low to high by the random numbers, and every 10th patient was chosen until 10 subjects were identified for the analytic subset. 16
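The selection procedure described above can be illustrated with a minimal sketch. It is not the authors' spreadsheet: the patient identifiers, the fixed seed, and the function name `select_subset` are hypothetical, and "every 10th patient" is assumed to mean positions 10, 20, 30, and so on in the sorted list.

```python
import random

def select_subset(patient_ids, step=10, n_select=10, seed=0):
    """Assign each patient a uniform random number in [0, 1), sort by it,
    and keep every `step`-th patient until `n_select` are chosen."""
    rng = random.Random(seed)  # fixed seed is an assumption, used only for repeatability
    keyed = [(rng.random(), pid) for pid in patient_ids]
    keyed.sort()  # sequence patients from low to high random number
    ordered = [pid for _, pid in keyed]
    # 1-based positions 10, 20, 30, ... correspond to 0-based indices 9, 19, 29, ...
    return ordered[step - 1::step][:n_select]

# 139 hypothetical identifiers standing in for the patients with adequate imaging
patients = [f"P{i:03d}" for i in range(1, 140)]
subset = select_subset(patients)
print(subset)
```

With 139 candidates, every 10th position yields 13 patients, so the list is truncated to the first 10, which matches the stopping rule in the text.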
While the CTA studies were received from the investigational sites in an anonymized form, each study was secondarily de-identified to remove the study subject unique identifier. Two copies of each CTA study were generated, and each copy was assigned different identifiers, one for the first read and one for the second. In this manner, complete blinding of the readers was assured for the analysis of intra-observer agreement between two reads of the same study. In all, the 10 studies (six baseline pre-intervention and four post-treatment studies) were assessed by four independent readers. Among the readers, two were core laboratory technicians, one was the core laboratory director, and one was a physician reader. Intra-observer agreement analyses were performed on the measurements of the two reads for each study, performed at least 60 days apart, to reduce the chance that a reader might recognize the images from one review to the next.
RV/LV diameter ratio
Two different methods were used to determine the RV/LV ratio. The first used non-reformatted CTA axial images without manipulation in any vector (“axial method”). The second was modified from the method reported by Quiroz, which used multiplanar reformatted reconstructions (MPR) from CTA datasets to measure the short axis of each ventricle (“MPR method”).17
In the axial method, the diameters of the right and left ventricles were measured in the cross-sectional transverse slices along the straight z-axis (craniocaudal axis), without manipulating the CTA datasets to generate images in other planes. The best single transverse slice was visually identified, scrolling through the slices to find the slice with adequate views of each ventricle approximating their largest diameters. The right and left ventricle diameters were measured on the same non-reformatted slice, accepting the discrepancies inherent in unadjusted, non-orthogonal planes. The maximum perpendicular distance from the free-wall ventricular endocardium to the interventricular septum was identified from these views.
In the MPR (modified Quiroz) technique, the RV/LV ratio was measured in three-dimensional (3D) reformatted reconstructions, adjusting the axial, coronal, and sagittal planes to view the largest diameter of each ventricle as measured perpendicular to the interventricular septum (Figure 1).17
In the MPR method, the obliquities were adjusted to visualize and measure the diameters in a plane with the short axis perpendicular to the interventricular septum and the long axis parallel to a line defined by the center of the mitral valve and apex of the heart. As in the axial method, the diameters of the right and left ventricles were measured from the ventricular endocardium of the free wall to the ventricular endocardium of the septum. In contrast to the axial method, however, the diameters of the right and left ventricles were measured on separate MPR reconstructions, adjusting the planes to obtain obliquities appropriate for each ventricle.
Right ventricle (RV) and left ventricle (LV) diameter determination from the three-dimensional reformatted views. Insets A and C are axial views and insets B and D are sagittal views. The axial planes (represented by the red line), coronal planes (represented by the green line) and sagittal planes (represented by the blue line) were adjusted to view the widest part of the ventricle perpendicular to the interventricular septum. In insets A and C, the diameter of each ventricle was taken in the plane with the short axis (blue line) perpendicular to the interventricular septum, and the long axis parallel to the line going through the center of the mitral valve and apex of the heart. The LV diameter (inset A, yellow line) and RV diameter (inset C, yellow line) were measured from the ventricular endocardium to the interventricular septum.
Pulmonary artery thrombus burden
The pulmonary arteries and assignment of modified Miller (MM) and refined Miller (RM) scores.
Mean ± Standard Deviation.
On the left side, these are replaced by the superior and inferior lingular arteries as the third-order branches.
To account for a wide spectrum of major pulmonary artery obstruction, a refined Miller (RM) scoring system was developed as a modification of the MM score. Like the MM score, the RM score regards each lung as having 10 segmental arteries, and the degree of obstruction of the segmental pulmonary arteries is measured using an ordinal scale. In contrast to the MM score, however, the RM score allows for a finer distinction of partial obstruction in seven major arteries: the main pulmonary artery, the left and right pulmonary arteries, the left and right interlobar arteries, and the left and right basal trunks. Partial obstruction is classified in three categories based upon cross-sectional diameter reduction: 0.5 for 1–33%, 1.0 for 34–66%, and 1.5 for 67–99% obstruction (Figure 2). As in the MM scoring system, the weighted score of the major supplying artery is compared with the sum of its tributaries, and the larger number is recorded. For example, if thrombus obstructed the lumen of a basal trunk vessel by 50% (score = 1.0 × 7), but its four segmental tributaries were fully occluded (score = 2.0 × 4), the higher value of 8 would be recorded as the RM score. A cumulative score is calculated by summing the scores for all arteries. Similar to the MM score, the RM score can range from 0 to 20 per lung, for a maximum of 40 bilaterally.
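The comparison rule in the worked example above can be sketched as follows. This is an illustration only, not the core laboratory's scoring tool: the function names are hypothetical, the weight of 7 is taken directly from the example in the text, and the handling of non-integer percentages between bands is an assumption.

```python
def rm_category(percent_obstruction):
    """Map diameter reduction (%) of a major artery to its RM ordinal score.
    Bands follow the text: 0.5 for 1-33%, 1.0 for 34-66%, 1.5 for 67-99%.
    Non-integer values between bands fall to the lower band (an assumption)."""
    if percent_obstruction <= 0:
        return 0.0  # no thrombus
    if percent_obstruction < 34:
        return 0.5
    if percent_obstruction < 67:
        return 1.0
    if percent_obstruction < 100:
        return 1.5
    return 2.0  # complete occlusion

def recorded_score(trunk_percent, trunk_weight, tributary_scores):
    """Record the larger of the weighted major-artery score and the
    summed scores of its segmental tributaries."""
    weighted_trunk = rm_category(trunk_percent) * trunk_weight
    return max(weighted_trunk, sum(tributary_scores))

# Example from the text: basal trunk 50% obstructed (1.0 x 7),
# four fully occluded segmental tributaries (2.0 x 4).
example = recorded_score(50, 7, [2.0, 2.0, 2.0, 2.0])
print(example)  # 8.0
```

Taking the maximum prevents thrombus from being counted twice when it is present both in a supplying artery and in that artery's own branches.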
(a) CTA with thrombus in left basal trunk with a refined Miller score of 1.5 and modified Miller score of 1. (b) Thrombus in right basal trunk with a refined Miller score of 0.5 and modified Miller score of 1.
Statistical analysis
Data analysis was performed with SPSS version 22 (International Business Machines Corporation, Armonk, New York, USA). All values are expressed as means ± standard deviation (SD), median, and range. Two-tailed p values less than 0.05 were considered statistically significant. Analyses that do not involve the first and second reads are represented by data from the first read.
The RV/LV diameter ratios and thrombus burden scores were assessed as continuous variables. Correlations were calculated with Spearman’s rho. Reliability of each measure was estimated with the intraclass correlation coefficient (ICC) and its 95% confidence interval (CI).21 The ICC ranges from 0 to 1, where 1 represents perfect reliability with no measurement error and 0 indicates no reliability. Intra-observer agreement was calculated as the ICC between the first and second measurements made by each of the four reviewers. Inter-observer agreement was calculated as the ICC among the measurements of the four reviewers on each of the 10 CTA studies. Bland–Altman plots were constructed as scatter plots in which the Y axis represented the difference between two paired measurements, and the X axis represented the average of these measurements.22 The graphical modification of Jones was used for assessing agreement for multiple observers.23 Horizontal lines depict the mean difference and two standard deviations above and below the mean.
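The two agreement statistics can be sketched in a few lines. The study used SPSS and does not state which ICC form was selected, so the two-way random-effects, absolute-agreement, single-measure form, ICC(2,1), is an assumption here. The function names and the sample ratios are hypothetical and are not study data.

```python
from statistics import mean, stdev

def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.
    `ratings` is a list of rows (subjects), each a list of k measurements."""
    n, k = len(ratings), len(ratings[0])
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)  # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)  # between raters/reads
    ss_err = sum((ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
                 for i in range(n) for j in range(k))       # residual
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)

def bland_altman(first, second):
    """Return (mean difference, lower line, upper line), with the lines
    drawn two standard deviations below and above the mean difference."""
    diffs = [a - b for a, b in zip(first, second)]
    bias, sd = mean(diffs), stdev(diffs)
    return bias, bias - 2 * sd, bias + 2 * sd

# Hypothetical RV/LV ratios from two reads of five studies (not study data)
read_1 = [1.10, 0.95, 1.42, 1.21, 0.88]
read_2 = [1.12, 0.97, 1.38, 1.25, 0.90]
icc = icc_2_1([list(pair) for pair in zip(read_1, read_2)])
bias, lower, upper = bland_altman(read_1, read_2)
print(round(icc, 3), round(bias, 3), round(lower, 3), round(upper, 3))
```

The x-coordinate of each Bland–Altman point would be the average of the two paired values, `(a + b) / 2`. The sketch returns only the three horizontal reference lines.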
Results
Site-reported and core laboratory pre-treatment right-to-left ventricular diameter measurements and ratios in the 150-patient SEATTLE-II trial.
Two-tailed paired t-test between the site-reported and core-laboratory measurements.
LV: left ventricle; MPR: multiplanar reformatted reconstruction; RV: right ventricle.
RV/LV ratio
Using the MPR method for RV/LV ratio, the mean RV/LV ratio was 1.14 ± 0.26 (median 1.11, range 0.83–1.94) on the first read and 1.16 ± 0.26 (median 1.14, range 0.81–1.83) on the second read. The intra-observer agreement was high, with an ICC of .949 (CI .906–.973). The inter-observer agreement was also excellent, with an ICC of .969 (CI .922–.991) (Table 3, Figure 3).
Bland–Altman plot of inter-rater reliability for right ventricle–left ventricle (RV/LV) diameter ratio as measured by the axial method, without multiplanar reformatting of CTA images.
Intraclass correlation coefficients (ICC) for right ventricle (RV) and left ventricle (LV) diameter measurements and RV/LV ratios measured with the multiplanar reformatting (MPR) and axial techniques.
Note: None of the differences in ICC between the MPR and axial methods were statistically significant.
When the RV/LV ratios were measured using the axial method, results were similar to those measured with the MPR method. The mean RV/LV ratio was 1.12 ± 0.24 (median 1.07, range 0.83–1.77) on the first read and 1.13 ± 0.26 (median 1.09, range 0.84–1.84) on the second read. The intra-observer agreement of the axial method was as high as or higher than that observed using the MPR method, with an ICC of .970 (CI .944–.984). The inter-observer agreement was also excellent using the axial method, with an ICC of .976 (CI .939–.993) (Figure 4). When the individual measurements for RV and LV diameter were assessed, the reliability of measurements for the LV diameter appeared slightly better than that for the RV diameter, but these differences did not attain statistical significance.
Bland–Altman plot of inter-rater reliability for right ventricle–left ventricle (RV/LV) diameter ratio as measured by the MPR (multiplanar reformatted reconstruction) method.
There was a high degree of correlation between the MPR and axial methods of measuring the RV/LV ratio (Spearman’s rho = .874, p < .001, Figure 5). However, a small difference between the two methods was observed. On average, the MPR estimate was 0.02 ± 0.08 greater than the axial estimate on the first read (p = .093) and 0.04 ± 0.11 greater on the second read (p = .048). The correlation was slightly higher for the LV measurement than the RV measurement (Spearman’s rho = .907 and .867, respectively).
Correlation between the two methods of measuring the right-to-left ventricle (RV/LV) diameter ratio. On average, the simpler axial method gave an RV/LV ratio only 0.02 lower than the MPR method.
Pulmonary artery thrombus burden
The MM score for pulmonary artery thrombus burden averaged 18.4 ± 7.1 (median 20, range 3–32) on the first read and 18.2 ± 6.6 (median 19, range 1–29) on the second read. Intra-observer agreement was good, with an ICC of .820 (CI .685–.900). Inter-observer agreement was also good, with an ICC of .829 (CI .629–.948) (Figure 6).
Bland–Altman plot of inter-rater reliability for the modified Miller (MM) scores for pulmonary artery thrombus burden.
Intra-observer agreement appeared better with the RM scoring system than with the MM method. The repeated observation ICC for the RM score was .883 (CI .789–.936). Inter-observer agreement also appeared to be slightly improved with the RM system, with an ICC of .914 (CI .796–.975) (Figure 7).
Bland–Altman plot of inter-rater reliability for the refined Miller (RM) scores for pulmonary artery thrombus burden, using five categories of major pulmonary artery obstruction. In addition to zero and 2 for no thrombus and complete obstruction, three intermediate categories included 0.5 for 1–33%, 1.0 for 34–66%, and 1.5 for 67–99% reduction in luminal diameter. MM: modified Miller score.
Thrombus burden calculated by the MM score and the RM score was highly correlated (Spearman’s rho = .937, Figure 8). The MM score tended to be slightly lower for a given patient, with a mean difference of −0.2 ± 2.6 and −0.1 ± 1.8 on the first and second reads, respectively. The differences between the two methods for scoring pulmonary artery thrombus did not attain statistical significance (p = .581 and .664, respectively).
Correlation between the two methods of measuring the pulmonary artery thrombus burden.
Correlation between RV/LV diameter ratio and thrombus burden
There was a correlation between the RV/LV diameter ratio and the pulmonary artery thrombus burden. The correlations between the RV/LV ratio as determined by the MPR method and the MM and RM pulmonary artery thrombus scores were .526 (p < .001) and .509 (p = .001), respectively (Spearman’s rho). The correlations between the RV/LV ratio as determined by the axial method and the MM and RM scores were weaker but still statistically significant, with Spearman coefficients of .341 (p = .032) and .334 (p = .035), respectively.
Discussion
The current analysis suggests that the axial measurement of the RV/LV ratio yields results similar to the more complex and more time-consuming MPR measurement. Furthermore, the RM score modification of the standard MM score results in slightly improved reliability. Contrary to the findings of others, the RV/LV ratio was significantly correlated with the degree of pulmonary artery obstruction, although the correlations were moderate (Spearman’s rho .33 to .53).
The binary diagnostic performance of CTA as an imaging test to discriminate between patients with and without PE does not encompass the full value of the test. In an era of increasing use of pharmacologic, mechanical, and combined methods to reduce the pulmonary artery thrombus burden, CTA helps to risk stratify, target the anatomic locations of occlusive thrombi, and gauge the response to therapy. Confidence in the precision of a CTA measurement can only be assured if the variability of measurements among readers is low. Reliability attains importance in the conduct of a clinical trial, but also in the real-world management of patients with PE. Evaluation of the intra- and inter-observer agreement of CTA is a crucial factor in assessing the comparative value of measurement protocols.
Intra-rater and inter-rater agreement for the MM score were acceptable, with ICC of .82 and .83, respectively, indicating that just under 20% of the variation was caused by measurement error. The RM score, intended to be a refinement of the MM score with an increased ability to detect finer grades of major pulmonary artery obstruction, was associated with slightly improved intra- and inter-observer agreement, with ICC of .88 and .91, respectively. Compared with the MM score, the RM score allows a finer discrimination of non-occlusive major pulmonary artery obstruction by adding two categories of differentiation.
The reproducibility of CTA in detecting RV dysfunction in acute PE was assessed by Kang et al.24 While ICC was not calculated in that study, the investigators found a mean difference of .014 ± .195 in axial RV/LV ratios measured by two observers. Furlan et al. used a semi-automated method of quantifying the volume of thrombus in 30 patients with PE.25 The intra- and inter-observer variability was extremely low in this series, with ICC of approximately .99. On Bland–Altman analyses, the inter-observer agreement was approximately 5%, with 95% of the readings within approximately 10% of one another.
Apart from reproducibility, a valuable imaging measurement technique must be rapid and easy to perform if it is to be readily incorporated into the routine diagnostic testing used in real-world clinical practice. In this regard, we compared the straightforward axial method of measuring the RV/LV ratio as an alternative to the more complex MPR technique. The axial method can be performed in less than a minute with unreconstructed images, while the MPR method requires sophisticated image analysis software applications, with attendant skills and time to perform the multiplanar reconstructions. Despite its relative simplicity, the axial method was as reliable as the MPR method. This finding is in agreement with the observations of Lu et al., where axial CTA images were as accurate as reformatted four-chamber views in the prediction of 30-day mortality after PE. 26 While reliability alone does not constitute the ultimate value of a test, it was also reassuring that the axial and MPR methods were highly correlated with one another. These observations suggest that the axial method for estimating the RV/LV ratio is acceptable at least in the context of clinical practice.
The current study is limited by its relatively small sample size, although the design is strengthened by the use of four individual readers. The study population included only patients in the SEATTLE II trial, where an RV/LV ratio ≥0.9 was a criterion for eligibility. Whether the findings are generalizable to patients with less severe PE or to those without PE cannot be established from the current data. The analysis performed by our dedicated imaging core laboratory may not be easy to replicate at most clinical centers, thereby limiting the ability to generalize our findings. The current study did not evaluate the validity of the CTA measurement protocols or of the indices themselves.
In summary, the current study documents excellent reliability of the MM and RM scores for the quantification of pulmonary artery thrombus and for the axial and MPR methods of quantifying the RV/LV ratio. These measures should be considered when quantification of pulmonary artery thrombus and its effects on the right ventricle are important in the research or clinical setting.
Footnotes
Declaration of conflicting interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: The employer of authors KO, RL and YL received research funds from the sponsor of the study.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was funded by Ekos Corporation, a BTG International Group company. Ekos reviewed and commented on drafts of the manuscript but was not involved in imaging data collection, data analysis, or the drafting of the manuscript.
ClinicalTrials.gov Identifier
NCT01513759.
