Abstract
Background
Although digital breast tomosynthesis (DBT) is an emerging technique yielding higher sensitivity and specificity compared to digital mammography (DM) alone, the relative contribution of prior mammograms to the interpretation of DBT combined with DM has not been investigated.
Purpose
To retrospectively compare the diagnostic performances of DM, DM + DBT, and DM + DBT with prior mammograms.
Material and Methods
Three breast radiologists independently reviewed images of 116 patients with 24 cancers in the sequential order of DM, DM + DBT, and DM + DBT with prior mammograms using Breast Imaging Reporting and Data System (BI-RADS) assessment categories.
Results
The average areas under the receiver operating characteristic curve (AUC) of DM, DM + DBT, and DM + DBT with prior mammograms were 0.712, 0.777, and 0.816, respectively. Adding prior mammograms did not significantly affect the AUC of DM + DBT (P = 0.108), whereas adding DBT significantly increased the AUC of DM (P = 0.009). Sensitivities for DM, DM + DBT, and DM + DBT with prior mammograms were 58.3%, 69.4%, and 69.4%, and specificities were 84.1%, 85.9%, and 93.8%, respectively. Adding DBT significantly increased the sensitivity of DM (P = 0.0090). Prior mammograms significantly improved the specificity of DM + DBT (P = 0.0004), whereas adding prior mammograms did not affect the sensitivity of DM + DBT (P = 1.000).
Conclusion
DBT significantly increases the overall sensitivity and diagnostic performance of DM. Prior mammograms significantly increase the specificity of DM + DBT but have no significant effect on sensitivity and overall diagnostic performance.
Introduction
In breast cancer screening using digital mammography (DM), it is customary for radiologists to read high volumes of DM, with interpretation performed in batches comparing current images with previous images, as the vast majority of women have been screened previously. Indeed, comparing the appearance of abnormalities with prior examinations during the interpretation of mammograms has been shown to substantially reduce recall rates and increase cancer detection by enhancing the ability of the observer to assess changes over time (1–3).
As a supplemental imaging tool to DM, digital breast tomosynthesis (DBT) is an emerging technique which can improve lesion visibility, consequently yielding higher sensitivity and specificity compared to DM alone (4). Recent publications have shown the benefits of adding DBT to DM in terms of detecting more invasive cancers and reducing recall rates (5), while also providing superior performance with respect to the detection, localization, and characterization of multiple abnormalities per examination compared with DM alone (6). However, although one previous study showed that these advantages of DBT were still evident even with prior mammograms (7), many studies did not include prior mammograms during the DBT interpretation (6,8–10). In addition, the relative contribution of prior mammograms to the interpretation of DBT combined with DM has not been investigated. We believe that investigating the impact of prior mammograms on the interpretation of DBT combined with DM, and obtaining information on their possible benefits, would be valuable to radiologists.
Therefore, the purpose of this study was to retrospectively compare the diagnostic performances of DM, DM + DBT, and DM + DBT with prior mammograms and to determine whether prior mammograms improve the diagnostic performance of DM + DBT interpretation.
Material and Methods
This retrospective review was approved by the Institutional Review Board (IRB) of our institution, and a waiver of informed consent was obtained.
Patients
A total of 661 women underwent DM + DBT between December 2011 and May 2012 as part of our hospital’s routine clinical practice. DM + DBT was performed for the purposes of screening asymptomatic women and for diagnostic work-up in women with clinical symptoms or positive image findings who were referred to our institution from outside facilities. Among these patients, 158 patients had prior mammograms performed at least 1 year before undergoing the current DM + DBT. From these eligible patients, 42 were excluded for the following reasons: 38 were lost to follow-up; two underwent vacuum-assisted core needle biopsy prior to DM + DBT; and two received neoadjuvant chemotherapy prior to DM + DBT. Finally, a total of 116 patients constituted our study population. The mean age of the patients was 52.0 years (age range, 31–75 years). Of the 116 patients, 93 patients (80.2%) were asymptomatic, 15 patients (12.9%) had palpable masses, and the other eight patients had symptoms of pain or nipple discharge. Breast compositions of the patients according to the Breast Imaging Reporting and Data System (BI-RADS) as recorded during the current DM were one (0.9%) almost entirely fatty case (BI-RADS composition of a), 23 (19.8%) scattered fibroglandular densities (BI-RADS composition of b), 71 (61.2%) heterogeneously dense cases (BI-RADS composition of c), and 21 (18.1%) extremely dense cases (BI-RADS composition of d).
Image acquisition
All prior mammograms were checked for proper positioning and image quality which fulfilled the standards of the Mammography Quality Standards Requirements in Korea. We used the last prior mammography with negative results obtained prior to current DM + DBT acquisition for this assessment. The mean interval between prior mammography and current DM + DBT was 78 months (standard deviation [SD], 174 months). Most of the prior mammograms (n = 113) were performed using full field digital mammography systems including Lorad Selenia (Hologic Inc., Bedford, MA, USA) (n = 37), Senographe 2000D (n = 70) or Senographe DS (n = 3) (GE Healthcare, Milwaukee, WI, USA), and Mammomat Inspiration (Siemens AG, Erlangen, Germany) (n = 1). The other prior mammograms were performed using computed radiography (n = 2) and film radiography (n = 1).
DM was performed with a full field digital mammography unit with integrated tomosynthesis acquisition (Selenia Dimensions; Hologic). Participants underwent two bilateral views (craniocaudal and mediolateral oblique) in the Combo mode; DM and DBT images were obtained with a single breast compression for each projection. The device used consists of a custom-designed high-power (mA) tungsten (W) anode X-ray tube and filters of rhodium (Rh), silver (Ag), and aluminum (Al). Different filters were used for DM and DBT image modes (0.7-mm thick aluminum filter for DBT acquisition and 50-µm thick Rh or Ag filter for DM acquisitions) producing the optimal X-ray spectra (20–49 kVp) using automatic exposure control (AEC).
Performance study
Three radiologists (HRK, MSB, and MS) from an academic institution with a median experience of 9.3 years (range, 7–13 years) in breast imaging independently interpreted the images in batch mode. All radiologists had at least 3 years of clinical experience in DBT and had gained experience in two previous reader studies with different study populations. DM and prior mammograms were displayed on individual workstations with a 5-megapixel (2560 × 2048 pixels) liquid crystal display system (ME511L, Totoku Electric, Tokyo, Japan) and DBT images were displayed on a separate workstation with a 5-megapixel (2560 × 2048 pixels) liquid crystal display system (MDMG-5221, Barco, Kortrijk, Belgium). Radiologists were allowed to use tools for window/level adjustment, zooming, and to perform cine-mode review of the tomosynthesis images using thin-section or thick slabs. They were blinded to the locations of all breast lesions as well as the results of other imaging modalities or clinical data.
The study involved three sequential readings and each radiologist independently reviewed the images without clinical information. All radiologists reviewed (i) DM, (ii) DM + DBT, and (iii) DM + DBT with prior mammograms in sequential order for each patient. They were asked to record the location of the most suspicious abnormality using a graphical interface and standardized template to prevent lesion misallocation, and to rate the probability of malignancy. For each lesion, the radiologists assigned a confidence probability for malignancy using the American College of Radiology Breast Imaging Reporting and Data System (ACR BI-RADS) categories: BI-RADS 1 (negative), BI-RADS 2 (benign), BI-RADS 3 (probably benign), BI-RADS 4A (low suspicion for malignancy), BI-RADS 4B (moderate suspicion for malignancy), BI-RADS 4C (high suspicion for malignancy), and BI-RADS 5 (highly suggestive of malignancy). A probability of malignancy (POM) score in the range of 0–100% was also recorded for each case. Cases assigned a BI-RADS category of 1, 2, or 3 were considered normal or benign, and those assigned a BI-RADS category of 4 or 5 were considered abnormal or malignant. Marks were considered positive if they correctly indicated the location on at least one view of DM and DBT.
Reference standard
All patients with malignancy underwent surgical excision. Lesions with malignant results after biopsy or surgical excision were considered positive. Lesions with concordant biopsy results and those that did not undergo biopsy with no evidence of breast malignancy after 1 year of clinical or imaging follow-up were considered negative. Lesion-matching of the breast lesions between each imaging modality and pathology was performed off-site in consensus by two radiologists (WHK and JMC) who specialize in breast imaging with 5 and 10 years of experience. Pathologic reports of the surgical specimen and biopsy samples as well as the standardized templates were used in the image review. Surgical pathology and core specimens were reviewed by one pathologist (IAP) with 25 years of experience in breast pathology.
Statistical analysis
Diagnostic sensitivity, specificity, and positive predictive value (PPV) were calculated using a BI-RADS score of 4 or 5 to indicate a positive result and a BI-RADS score of 1, 2, or 3 to indicate a negative result. The McNemar test and Fisher’s exact test were used for comparison of individual radiologists’ scores. For pooled analysis of sensitivity and specificity, we used generalized estimating equations with a logit link and an independent working correlation structure to adjust for the effect of clustering on radiologists and patients as random factors.
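The dichotomization and the paired comparison described above can be illustrated with a minimal Python sketch. It is not the analysis code used in this study: the function names, the toy reads, and the choice of the exact (binomial) form of the McNemar test are assumptions made for illustration, and the generalized estimating equation step is not shown.

```python
from math import comb

# BI-RADS 4A, 4B, 4C, and 5 are treated as positive reads; 1, 2, and 3 as negative.
POSITIVE_CATEGORIES = {"4A", "4B", "4C", "5"}


def is_positive(birads):
    """Return True when a BI-RADS category counts as a positive read."""
    return birads in POSITIVE_CATEGORIES


def diagnostic_metrics(reads, truth):
    """Sensitivity, specificity, and PPV from BI-RADS reads and cancer status.

    reads: list of BI-RADS category strings, one per patient.
    truth: list of bools, True when the reference standard is malignant.
    """
    calls = [is_positive(r) for r in reads]
    tp = sum(c and t for c, t in zip(calls, truth))
    fp = sum(c and not t for c, t in zip(calls, truth))
    fn = sum((not c) and t for c, t in zip(calls, truth))
    tn = sum((not c) and (not t) for c, t in zip(calls, truth))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }


def mcnemar_exact(calls_a, calls_b):
    """Two-sided exact McNemar P value for paired binary calls.

    Only discordant pairs contribute; under the null hypothesis they
    split evenly between the two reading conditions.
    """
    b = sum(x and not y for x, y in zip(calls_a, calls_b))
    c = sum(y and not x for x, y in zip(calls_a, calls_b))
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical toy data for six patients (not study data).
truth = [False, True, False, True, True, False]
reads_dm = ["1", "4A", "2", "5", "3", "4B"]
metrics_dm = diagnostic_metrics(reads_dm, truth)
```

To compare sensitivity between two reading conditions for one radiologist, `mcnemar_exact` would be applied to the paired calls restricted to the cancer cases; for specificity, to the non-cancer cases.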
Area under the receiver operating characteristic curve (AUC) analysis was performed using the software package OR-DBM MRMC 2.4 (http://perception.radiology.uiowa.edu). This program is based on the methods initially proposed by Berbaum et al. (11) and Obuchowski and Rockette (12), and later unified and improved by Hillis and colleagues (3–5). Pooled receiver operating characteristic (ROC) analysis assumed fixed readers and random cases. AUCs were compared using the z-test and confidence intervals for AUC differences. Two-sided P values were reported; a P value <0.05 was considered to indicate a significant difference.
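As an illustration of what a single reader's AUC measures, the sketch below computes the empirical (trapezoidal) AUC as the probability that a randomly chosen cancer case receives a higher score than a randomly chosen non-cancer case, with ties counted as one half. This is a swapped-in, simplified estimator and not the multireader multicase procedure implemented in OR-DBM MRMC; the function name and the POM scores are hypothetical.

```python
def empirical_auc(scores_pos, scores_neg):
    """Empirical AUC from scores of cancer (pos) and non-cancer (neg) cases.

    Equivalent to the Mann-Whitney U statistic divided by the number of
    positive-negative pairs; tied pairs contribute 0.5.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical POM scores (0-100%) for one reader; not study data.
pom_cancer = [80, 60, 30]
pom_benign = [10, 30, 50, 5]
auc = empirical_auc(pom_cancer, pom_benign)
```

The same function can be applied to ordinal BI-RADS categories mapped to ranks, since only the ordering of scores matters.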
Results
Of the 24 malignant cases, 21 cases were invasive ductal carcinomas (IDC) and the remaining cases were a mucinous carcinoma (n = 1), invasive lobular carcinoma (n = 1), and tubular carcinoma (n = 1). Among the 24 malignant cases, 22 cases had invasive cancers with ductal or lobular carcinomas in situ and two cases had pure invasive cancers. The mean size of the invasive component was 1.7 cm (median, 1.5 cm; range, 0.2–3.5 cm) and the mean size of the invasive component plus ductal and lobular carcinoma in situ was 2.7 cm (median, 2.7 cm; range, 1–6 cm). There were no cases of pure carcinoma in situ. Seven cases had a positive axillary lymph node. Two cases were detected by ultrasonography and were not visible on DM at the initial presentation. Patients with concordant biopsy results (n = 27) and those that did not undergo biopsy with no evidence of breast malignancy after 1 year of clinical or imaging follow-up (n = 65) were considered negative.
AUCs of DM alone, DM + DBT, and DM + DBT with prior mammograms.
AUC = area under the receiver operating characteristic curve, BI-RADS = Breast Imaging Reporting and Data System, POM = probability of malignancy.
P value was for the comparison between DM and DM + DBT.
P value was for the comparison between DM + DBT and DM + DBT with prior mammogram.
Sensitivity and specificity of DM alone, DM + DBT, and DM + DBT with prior mammograms.
P value was for the comparison between DM and DM + DBT.
P value was for the comparison between DM + DBT and DM + DBT with prior mammogram.
Distribution of BI-RADS score of DM alone, DM + DBT, and DM + DBT with prior mammograms for cancer patients.
Data are numbers of patients and numbers in parentheses are percentages for cancer patients (n = 24 for each radiologist and n = 72 for total radiologists). BI-RADS 1 (negative), BI-RADS 2 (benign), BI-RADS 3 (probably benign), BI-RADS 4 (suspicion for malignancy), and BI-RADS 5 (highly suggestive of malignancy).
Specificities for DM, DM + DBT, and DM + DBT with prior mammograms were in the range of 81.8–84.8%, 82.6–89.1%, and 87.8–95.7%, respectively, among radiologists. On pooled analysis, the average specificity for DM, DM + DBT, and DM + DBT with prior mammograms was 84.1%, 85.9%, and 93.8%, respectively. Adding prior mammograms significantly improved the specificity of DM + DBT (P = 0.0004), whereas adding DBT did not affect the specificity of DM (P = 0.4340) (Table 2, Fig. 1). Moreover, for patients with benign lesions visible at mammography (n = 32), including benign masses, asymmetries, and microcalcifications, the specificities for DM, DM + DBT, and DM + DBT with prior mammograms were in the range of 71.9–78.1%, 78.1–84.4%, and 87.5–100.0%, respectively, among radiologists. On pooled analysis, the average specificity for DM, DM + DBT, and DM + DBT with prior mammograms was 76.0%, 80.2%, and 94.8%, respectively. Adding prior mammograms significantly improved the specificity of DM + DBT (P = 0.0005), whereas adding DBT did not affect the specificity of DM (P = 0.4545).
Images of a 41-year-old woman with unchanged asymmetry. (a) Mediolateral oblique DM shows an asymmetry (arrow) mimicking a mass in the left upper breast. (b) Mediolateral oblique DBT image shows a more discrete focal asymmetry (arrow). (c) Prior mammogram taken 7 years earlier shows an asymmetrical density (arrow) at the same location in the breast, suggesting that the asymmetry is not indicative of malignancy. Ultrasonography (not shown) performed concurrently with mammography and DBT did not show any focal lesion.
The PPVs for DM, DM + DBT, and DM + DBT with prior mammograms were in the range of 48.1–50.0%, 52.9–63.0%, and 60.7–89.5%, respectively, among radiologists. On pooled analysis, the average PPV for DM, DM + DBT, and DM + DBT with prior mammograms was 48.8%, 56.2%, and 74.6%, respectively. Adding prior mammograms significantly improved the PPV of DM + DBT (P = 0.0004), whereas adding DBT did not affect the PPV of DM (P = 0.0856).
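The pooled sensitivity, specificity, and PPV reported above can be cross-checked against one another with a short Python sketch. It assumes that the pooled percentages are simple proportions over all reads (24 cancer and 92 non-cancer patients, each read by three radiologists); the read counts it derives are back-calculated from the reported percentages under that assumption and are not taken from the study tables.

```python
N_CANCER_READS = 24 * 3           # 24 cancer patients x 3 radiologists = 72
N_BENIGN_READS = (116 - 24) * 3   # 92 non-cancer patients x 3 radiologists = 276

# Reported pooled values: (sensitivity %, specificity %, PPV %).
reported = {
    "DM": (58.3, 84.1, 48.8),
    "DM + DBT": (69.4, 85.9, 56.2),
    "DM + DBT + prior": (69.4, 93.8, 74.6),
}


def implied_counts(sens_pct, spec_pct):
    """Back-calculate true-positive and false-positive read counts.

    Assumes pooled percentages are plain proportions over all reads.
    """
    tp = round(sens_pct / 100 * N_CANCER_READS)
    tn = round(spec_pct / 100 * N_BENIGN_READS)
    fp = N_BENIGN_READS - tn
    return tp, fp


implied = {}
for name, (sens, spec, ppv) in reported.items():
    tp, fp = implied_counts(sens, spec)
    implied[name] = (tp, fp, round(100 * tp / (tp + fp), 1))
    print(name, "TP =", tp, "FP =", fp, "implied PPV =", implied[name][2], "reported PPV =", ppv)
```

Under this assumption, the PPV implied by the back-calculated counts agrees with the reported pooled PPV for all three reading conditions, and the increase in PPV with prior mammograms is driven by the drop in false-positive reads rather than by any change in true-positive reads.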
Discussion
Our study showed that DM + DBT had higher sensitivity and diagnostic performance compared to DM, but there was no significant increase in the sensitivity of DM + DBT with the addition of prior mammograms, suggesting that the availability of DBT images sufficiently increased sensitivity of the DM without the need for prior mammograms. However, adding prior mammograms to DBT interpretation did contribute to improved specificity, suggesting that fewer abnormal interpretations can be made in patients without cancers when prior mammograms are available and compared, potentially resulting in less anxiety and less additional imaging in patients.
As periodic breast cancer screening through mammography has become more common, a substantial number of women have prior mammograms (13,14). The use of prior mammograms in the reading of current mammograms has previously been reported to increase specificity, especially in the screening setting (13,15), while also increasing the detection of true-positive findings in the diagnostic setting (1). Recent studies by Hakim et al. also reported that recall rates could be reduced with the addition of prior images by 34% and 32% without and with DBT images, respectively (16,17), although their studies did not describe the effect of prior mammograms on diagnostic performance in terms of sensitivity and specificity.
In our study, we did not separately assess the effect of prior mammograms on interpretation using DM. The effect of prior mammograms on DM has been investigated in many previous studies, and a recent study by Hakim et al. revealed that prior mammograms and DBT were independent contributing components in reducing recall rates during the interpretation of mammograms (16). We also observed this in our study. Prior mammograms have clear independent advantages in terms of specificity, since benign lesions that present as focal asymmetries, masses, or architectural distortions on DBT, and that would otherwise need recall, can be regarded as benign findings if they have shown no temporal change and no evidence of malignancy for many years. Conversely, additional suspicious features can be seen on DBT regardless of temporal morphological changes, which may increase sensitivity. Therefore, although the addition of prior mammograms to DBT did not improve sensitivity or diagnostic performance, it may still play an important role as it provides a distinct advantage over DBT in specificity, and DM interpretation can benefit from the synergistic effects of improved sensitivity with the addition of DBT and improved specificity with the addition of prior mammograms. This is particularly important in screening, where most women do not have cancer in any one examination cycle.
Our study has several limitations. First, this study was a retrospective study from a single institution with a limited number of reviewers. Second, due to the nature of retrospective case collection, the period between the current DM and the prior mammograms was relatively long and the age range of the examined women was wide. Third, our dataset was composed of various diagnostic situations rather than only screening. Therefore, our results may have over- or underestimated the effect of prior mammograms on diagnostic performance in actual screening situations. The specificities and PPVs of DM + DBT in our study were not significantly higher than those of DM alone, which is not concordant with the results of previous studies showing that DBT provided lower recall rates and improvement in specificity (9,10). This discordance probably stems from our study design, which was enriched with cancer cases and with cases depicting at least one abnormality of interest. In addition, the range of PPVs among readers was wide in our retrospective review, owing not only to differing lengths of experience with DBT but also to differing recall thresholds in the reader study. Fourth, we did not separately assess the effect of prior mammograms on DM as this topic has been investigated in many previous studies. Fifth, the impact of prior mammograms when added to one-view (MLO) DBT was not investigated in our study; we used two-view (CC and MLO) DBT. Several methods proposed for the use of DBT in addition to DM include the addition of two-view (7,8) or one-view DBT (18). It has been reported that the addition of one-view DBT to DM improved diagnostic accuracy and reduced the recall rate; however, the performance gain, including the reduction in recall rate, was lower with one-view DBT than with two-view DBT. In this respect, the use of prior mammograms may help to improve specificity and thereby compensate for the performance loss from using one-view DBT instead of two-view DBT.
In conclusion, DBT significantly increases the overall sensitivity and diagnostic performance of DM. Prior mammograms significantly increase the specificity of DM + DBT but have no significant effect on sensitivity and overall diagnostic performance.
Footnotes
Acknowledgements
The authors would like to thank Chris Woo for his kind assistance in editing this manuscript.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by Core Medical Device R & D Program (10043122) funded by the Ministry of Trade, Industry & Energy (MOTIE), Republic of Korea, and grant number (05-2014-0040) from Seoul National University Hospital Research Fund.
