Abstract
Background
Magnetic resonance imaging (MRI) is widely used for preoperative evaluation of breast cancer.
Purpose
The aim of this study was to evaluate inter-observer variability in measuring the sizes of non-mass enhancement (NME) breast cancers on various MRI sequences and to compare these measurements with pathology size.
Material and Methods
From January 2013 to April 2017, 70 breast cancers in 70 women showed NME on preoperative MRI. Three observers retrospectively measured the longest diameters of the cancers on subtracted early dynamic contrast-enhanced (DCE) T1 (early T1), delayed DCE T1 (delayed T1), and maximum intensity projection (MIP) images; inter-observer agreement was analyzed using the intraclass correlation coefficient (ICC). MRI sizes were compared with pathology sizes using ICCs and Bland–Altman plots, and concordance rates were calculated, with concordance defined as an MRI–pathology size difference of ≤0.5 cm.
Results
The ICC for the three observers’ breast cancer size measurements was excellent (0.945) and that between MRI and pathology size was good to excellent in all observers (0.742–0.850). Compared with pathology size, the Bland–Altman plots revealed a positive trend of differences in all MRI measurements. Three observers overestimated nearly half of the measured breast cancers on all three MRI sequences; the overestimation rates were the highest on MIP for observers 1 and 3 (68.6% and 55.7%) and on delayed T1 for observer 2 (57.1%).
Conclusion
The size assessments of NME breast cancer on MRI showed high correlation between observers and with pathology size. However, MRI measurements need a cautious approach because of their high overestimation rates. Measurements from early T1 images showed the highest concordance rate with pathology size.
Introduction
Accurate measurement of breast cancer size is important for determining the extent of surgery and predicting prognosis (1). Magnetic resonance imaging (MRI) is widely used for preoperative evaluation of breast cancer and is known to measure breast cancer size more accurately than mammography (MMG) or ultrasound (US) (2–4). However, MRI often shows discrepancies in breast cancer size compared with pathology size (5–10).
According to the Breast Imaging Reporting and Data System (BI-RADS), breast lesions can be described as having mass or non-mass enhancement (NME) on MRI (11); in the literature, discrepancies between MRI and pathology size measurements are more frequent in breast cancers that show NME (5,10,12). Further, the size measurements of NME cancers showed discrepancies both on two-dimensional (2D) anatomical planes and in three-dimensional (3D) maximum intensity projection (MIP) reconstruction images (5,9,10).
Because the borders of NME breast cancers are less clear than those of mass lesions, inter-observer variability also exists in measuring the sizes of NME breast cancers. However, to our knowledge, most previous studies have focused only on mass lesions, and no researchers have evaluated inter-observer variability in the MRI measurements of NME cancers. Thus, the purpose of this study was to evaluate inter-observer variability in the size measurement of NME breast cancers on various MRI sequences and to compare these measurements with pathology size.
Material and Methods
The institutional review board approved this retrospective study and required neither patient approval nor informed consent for our review of patient images and records.
Study population
From January 2013 to April 2017, 518 patients with core-biopsy proven breast cancers underwent breast MRI in a single tertiary medical center. Two experienced radiologists (JK and HKJ) reviewed each lesion’s MRI findings and determined by consensus whether it should be designated as a mass or a non-mass. NME was defined as an enhancement pattern discrete from the normal background parenchymal enhancement (BPE) that was neither a mass nor a focus according to the BI-RADS (11). We found that 153 (29.5%) patients had NME breast cancers. We excluded 83 of the 153 patients because they had multiple cancers, which precluded correlating them with pathology size (n = 57), they had been referred to other hospitals (n = 21), or they had received neoadjuvant chemotherapy (n = 5). Finally, we included 70 patients in this study who had undergone breast cancer surgery within four weeks after the MRI examinations. The mean age of the patients was 52.7 ± 9.5 years (age range = 37–79 years).
Image evaluation
MMG was performed using a Senographe DS unit (General Electric Healthcare, Slough, UK) with bilateral cranio-caudal (CC) and mediolateral oblique (MLO) views. Preoperative MRI was performed using a Signa HDxt 3.0 T (General Electric Medical Systems, Milwaukee, WI, USA) with a dedicated eight-channel surface breast coil and the patients in the prone position. MRI protocols included a fat-suppressed axial T2-weighted (T2W) image (TR/TE = 5400/85 ms; field of view [FOV] = 30 × 30 cm; section thickness = 0.5 cm), fat-suppressed sagittal T2W image (TR/TE = 4000/85 ms; FOV = 18 × 18 cm; section thickness = 0.4 cm), and dynamic contrast-enhanced (DCE) fat-suppressed T1-weighted (T1W) gradient echo sequence (TR/TE = 4.8/2.3 ms; FOV = 30 × 30 cm; section thickness = 0.2 cm) after intravenous injection of 0.1 mmol/kg of gadoteridol (ProHance; Bracco Diagnostics, Monroe Township, NJ, USA) over 20 s with an injector at a flow rate of 2 mL/s. Subtracted images were generated from the dynamic sequences and MIP reconstruction was obtained from the subtraction images 2 min after injection.
Image analysis
Three fellowship-trained breast imaging radiologists (HKJ, AYP, JK) served as observers in this study. The observers had 3–13 years (mean = 7 years) of clinical experience including MMG and breast MRI interpretation. The observers retrospectively and independently measured the size of the cancers on MMG and MRI; they were informed only that the cancers were NME type on DCE MRI, and they were blinded to clinical and pathology data. The observers measured the longest diameters on axial or sagittal subtracted DCE images 2 min (early T1) and 6 min (delayed T1) after injection and also measured the longest diameters on MIP images (Fig. 1). On MMG, if an abnormality was detected in the area that corresponded with the NME, the observers measured the longest diameter on either the MLO or the CC view. All observers were given data files for entering their measurements and they evaluated the images on a picture archiving and communication system workstation (m-view; Marotech, Seoul, Republic of Korea) using a high-resolution monitor (GX 1030, RadiForce, EIZO, Ishikawa, Japan). One observer (JK) evaluated mammographic breast composition and BPE on MRI according to BI-RADS terminology (11).

Fig. 1. A 65-year-old woman with a history of right mastectomy was newly diagnosed with left breast cancer. On MRI, the breast cancer appeared as NME. (a) On early T1, observers 1, 2, and 3 measured cancer size as 5.3 cm, 5 cm, and 4.3 cm, respectively. (b) On delayed T1, observers 1, 2, and 3 measured cancer size as 5.4 cm, 4.8 cm, and 4.7 cm, respectively. (c) On MIP, observers 1, 2, and 3 measured cancer size as 8.7 cm, 7.3 cm, and 5.4 cm, respectively. (d) On MMG, a mass with microcalcifications was seen. Observers 1, 2, and 3 measured cancer size as 5.4 cm, 3.7 cm, and 5.7 cm, respectively. The pathology result was a 1.6-cm invasive carcinoma surrounded by a 3.2-cm extent of ductal carcinoma in situ. NME, non-mass enhancement; MIP, maximum intensity projection; MMG, mammography.
Pathology analysis
We obtained the histopathologic information from the pathology reports. All histopathologic examinations were performed in our hospital by a breast pathology specialist. Tumor size was assessed according to gross and microscopic measurements; the size on the final pathology reports was considered the gold standard. When there was invasive carcinoma combined with carcinoma in situ, the maximum tumor size was recorded.
Statistical analysis
Inter-observer variability for the three observers’ breast cancer size measurements on three breast MRI sequences (early T1, delayed T1, MIP image) and MMG was evaluated with the intraclass correlation coefficient (ICC); when an observer did not detect the lesion, the measurement was treated as missing data. Agreement of cancer size on MRI and MMG with pathology size was also assessed with the ICC; we regarded ICC ≥ 0.75 as excellent agreement and ICC ≥ 0.60 and <0.75 as good agreement. We calculated the differences between image and pathology sizes, defining overestimation as when the size on an image exceeded the pathology size by >0.5 cm, underestimation as when the size on the image was smaller than the pathology size by >0.5 cm, and concordance as a difference of 0.5 cm or less between the two sizes. Concordance rates were compared using the Pearson chi-square test. A two-sided P value < 0.05 was considered to indicate statistical significance. We also used Bland–Altman plot analysis to quantify the agreement between MRI and pathology size for each MRI sequence. We illustrated the mean differences and limits of agreement (LOAs) between the MRI and pathology sizes using difference plots. The 95% LOAs were given as the mean difference ± 1.96 standard deviations (SD). We performed the statistical analyses using SPSS statistical software ver. 24.0 (IBM Corp., Armonk, NY, USA).
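The three size categories defined above can be illustrated with a minimal Python sketch. The function names and the example sizes are hypothetical and are not study data; the sketch also assumes that sizes are recorded to at most two decimal places, so the difference is rounded before comparison to avoid floating-point artifacts at exactly 0.5 cm.

```python
from collections import Counter

THRESHOLD_CM = 0.5  # tolerance stated in the Statistical analysis section


def classify(image_cm, pathology_cm, threshold=THRESHOLD_CM):
    """Label one image measurement relative to the pathology size."""
    # Rounding is an assumption of this sketch: it keeps a true 0.5 cm
    # difference (e.g. 3.7 vs 3.2) from being pushed over the threshold
    # by binary floating-point error.
    diff = round(image_cm - pathology_cm, 2)
    if diff > threshold:
        return "overestimation"
    if diff < -threshold:
        return "underestimation"
    return "concordance"


def rates(image_sizes, pathology_sizes):
    """Return the percentage of lesions falling in each category."""
    labels = [classify(i, p) for i, p in zip(image_sizes, pathology_sizes)]
    counts = Counter(labels)
    n = len(labels)
    categories = ("underestimation", "concordance", "overestimation")
    return {c: 100.0 * counts[c] / n for c in categories}


# Hypothetical sizes in cm, for illustration only
mri_sizes = [4.0, 2.1, 3.0, 5.6]
pathology_sizes = [3.2, 2.0, 3.8, 5.0]
print(rates(mri_sizes, pathology_sizes))
```

A lesion that an observer could not measure would simply be left out of both lists, which mirrors the treatment of undetected lesions as missing data.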
Results
General pathology features
Total mastectomy was performed in 40 patients and breast-conserving surgery was performed in 30. The histologic subtypes of the 70 breast cancers were invasive carcinoma of no special type (n = 24, 34.3%), ductal carcinoma in situ (n = 22, 31.4%), microinvasive ductal carcinoma (n = 17, 24.3%), invasive lobular carcinoma (n = 4, 5.7%), and invasive ductal carcinoma, special type (n = 3, 4.3%).
General imaging features
On MRI, BPE was minimal in 32 patients (45.7%), mild in 17 (24.3%), moderate in 12 (17.1%), and marked in nine (12.9%). All three observers could measure all included breast cancers on early and delayed T1; however, on MIP images, observer 2 could not measure four breast cancers and observer 3 could not measure five, because BPE was reported to mask the index tumor. MMG was not available for three patients. Among 67 patients with available MMG, six had grade b, 54 had grade c, and seven had grade d breast composition. On MMG, observer 1 reported five cancers, observer 2 reported six, and observer 3 reported eight as negative. The 62 cancers that observer 1 reported presented as microcalcifications (n = 42), masses with microcalcification (n = 7), asymmetry (n = 5), masses (n = 4), and focal asymmetry (n = 4).
Size measurements
The ICCs for the three observers’ measurements of the 70 NME breast cancers were excellent on both MRI and MMG, with overall coefficients of 0.945 on early T1, 0.941 on delayed T1, 0.938 on MIP, and 0.921 on MMG; the pairwise ICCs for the observers were also excellent for all subsets (0.875–0.929; Table 1). The mean size of the 70 breast cancers on pathology was 3.3 ± 2.1 cm (range = 0.2–11.0 cm); the means were greater on both MRI and MMG than on pathology for all observers (range = 3.7 ± 2.0–4.5 ± 2.2 cm; Table 2). The ICCs for the image and pathology size measurements were also good to excellent for each observer (Table 2).
Table 1. The ICCs for the three observers’ measurements of the 70 NME breast cancers on MRI and MMG.
ICC, intraclass correlation coefficient; NME, non-mass enhancement; MRI, magnetic resonance imaging; MMG, mammography; MIP, maximum intensity projection
Table 2. The ICCs for the image and pathology size measurements for each observer.
*The ICCs for the image and pathology size measurements. The mean size on pathology was 3.3 ± 2.1 cm (range = 0.2–11.0 cm).
ICC, intraclass correlation coefficient; NME, non-mass enhancement; MRI, magnetic resonance imaging; MMG, mammography; MIP, maximum intensity projection
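As an illustration of how an ICC of this kind can be computed, the following Python sketch implements one common form. The text does not state which ICC model was selected in SPSS, so the choice here of a two-way random-effects, absolute-agreement, single-measure coefficient (often written ICC(2,1)) is an assumption, as are the function names and the example sizes. The interpretation thresholds are those given in the Statistical analysis section.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a table with one row per lesion, one column per observer.

    Assumes a complete table (no missing measurements).
    """
    n = len(ratings)       # number of lesions
    k = len(ratings[0])    # number of observers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                 # between-lesion mean square
    ms_cols = ss_cols / (k - 1)                 # between-observer mean square
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def interpret(icc):
    """Apply the agreement thresholds stated in the text."""
    if icc >= 0.75:
        return "excellent"
    if icc >= 0.60:
        return "good"
    return "below good (not categorized in the text)"


# Hypothetical sizes in cm for five lesions and three observers
sizes = [
    [5.3, 5.0, 4.3],
    [2.1, 2.4, 2.0],
    [3.8, 3.5, 3.9],
    [7.0, 6.4, 6.8],
    [1.2, 1.5, 1.1],
]
value = icc_2_1(sizes)
print(round(value, 3), interpret(value))
```

The same function can be applied to a two-column table of one observer's image sizes and the pathology sizes, which corresponds to the image versus pathology comparison.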
According to the Bland–Altman plot analyses (Fig. 2), the mean difference was the lowest on early T1 for two observers (0.5029–0.6429 cm) and the highest on MIP images in all observers (0.9955–1.2143 cm). The mean difference was 0.2435–0.8717 cm on MMG. The difference plots revealed a positive trend of differences (overestimation) proportional to the magnitude of the measurement for all MRI sequences and MMG for all observers.

Fig. 2. Bland–Altman plots between MRI or MMG and pathology size for each observer on (a) subtracted DCE early T1W sequence, (b) subtracted DCE delayed T1W sequence, (c) MIP, and (d) MMG. The 95% LOAs were given as the mean difference ± 1.96 SD. MMG, mammography.
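The quantities summarized in the Bland–Altman plots can be sketched in Python as follows. The function names and the example sizes are hypothetical, and the slope of the differences against the pairwise means is included only as one simple way to express a proportional trend; the text does not state how the trend was assessed.

```python
from statistics import mean, stdev


def bland_altman(image_sizes, pathology_sizes):
    """Return (mean difference, lower 95% LOA, upper 95% LOA)."""
    diffs = [i - p for i, p in zip(image_sizes, pathology_sizes)]
    md = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return md, md - 1.96 * sd, md + 1.96 * sd


def trend_slope(image_sizes, pathology_sizes):
    """Least-squares slope of (image - pathology) against the pairwise mean.

    A positive slope means that the difference grows with lesion size.
    """
    xs = [(i + p) / 2 for i, p in zip(image_sizes, pathology_sizes)]
    ys = [i - p for i, p in zip(image_sizes, pathology_sizes)]
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical sizes in cm, for illustration only
mri_sizes = [2.0, 3.5, 4.8, 6.9, 9.0]
pathology_sizes = [1.9, 3.0, 4.0, 5.5, 7.0]
md, lower, upper = bland_altman(mri_sizes, pathology_sizes)
print(f"mean difference {md:.2f} cm, 95% LOA {lower:.2f} to {upper:.2f} cm")
print(f"trend slope {trend_slope(mri_sizes, pathology_sizes):.2f}")
```

A positive mean difference corresponds to image sizes that are, on average, larger than the pathology sizes.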
The concordance rate of image-pathology size was the highest on early T1 for all three observers (37.1–41.4%; Table 3). However, the differences in concordance rate among the three MRI sequences were not statistically significant for any observer (P = 0.308, 0.324, and 0.511). All three observers overestimated nearly half or more of the measured breast cancers on all three MRI sequences (45.7–68.6%); the overestimation rates were the highest on MIP for observers 1 and 3 (68.6% and 55.7%) and on delayed T1 for observer 2 (57.1%). Among the 67 patients who underwent MMG, the overestimation rates were 37.1–45.7% and the concordance rates were 25.7–31.4%. For comparison with MRI, we analyzed 63 cases for which observer 1 could visualize the cancers on both MRI and MMG. MMG showed a 44.4% overestimation rate, which was lower than that for MRI (46.0–68.3%); the underestimation rate was 20.6%, higher than that for MRI (11.1–15.9%, P = 0.163; Fig. 1).
Table 3. The concordance rate of image-pathology size of NME breast cancers for each observer.
Values are presented as n (%).
*Underestimation was defined as when the size on the image was smaller than the pathology size by >0.5 cm.
†Concordance was defined as a difference of ≤0.5 cm between the two sizes.
‡Overestimation was defined as when the size on an image exceeded the pathology size by >0.5 cm.
NME, non-mass enhancement; MMG, mammography; MIP, maximum intensity projection.
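The comparison of concordance rates among the three MRI sequences can be sketched as a Pearson chi-square test on a 2 × 3 table (concordant versus non-concordant, by sequence). The Python code below is a minimal illustration: the function names and the counts are hypothetical and are not the study data. With two rows and three columns the test has 2 degrees of freedom, for which the chi-square upper-tail probability has the closed form exp(−χ²/2); the sketch relies on this to stay within the standard library.

```python
import math


def pearson_chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


def p_value(chi2, df):
    """Upper-tail P value; only the df = 2 case is handled in this sketch."""
    if df != 2:
        raise ValueError("closed form used here is valid for df = 2 only")
    return math.exp(-chi2 / 2)


# Hypothetical counts out of 70 lesions per sequence (not study data).
# Columns: early T1, delayed T1, MIP
concordant = [29, 22, 20]
non_concordant = [70 - c for c in concordant]

chi2, df = pearson_chi_square([concordant, non_concordant])
p = p_value(chi2, df)
print(f"chi-square = {chi2:.3f}, df = {df}, P = {p:.3f}")
print("significant" if p < 0.05 else "not significant")
```

The final line applies the two-sided significance threshold of 0.05 stated in the Statistical analysis section.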
Discussion
In the present study, three observers measured the sizes of NME breast cancers; the ICCs were excellent on both MRI and MMG. In a published study on breast cancers that showed both mass and NME, cancer size measurements on MRI between two observers showed excellent ICCs (9); we also found high inter-observer agreement in NME breast cancer size measurements among three observers.
When we compared the sizes on images with pathology sizes, the ICCs were good to excellent. However, when concordance was defined as an image versus pathology size difference within 0.5 cm, the overestimation rates were considerable on MRI (45.7–68.6%) and MMG (37.1–45.7%). The difference plots also showed a positive trend of differences in the measurements on all MRI sequences for all observers. It was already known that MRI often overestimates breast cancer size, with reported overestimation rates of 27.1–35.2% when the differences between image and pathology sizes were >0.5 cm; these studies included both mass and NME (6,8,9). However, in our study, the overestimation rates were even higher than the reported ranges. This was consistent with the findings from one study that showed that NME breast cancers caused more overestimations than did mass lesions (9). Other studies also revealed that NME caused more discordance between image and pathology sizes than did mass lesions (5,10,12).
Previous authors have proposed several possible reasons for the overestimation with MRI, including shrinkage of pathologic specimens after formalin fixation (5,13). The prone positioning during breast MRI might also increase the sizes of breast cancers on MRI compared with pathologic specimens (12). The abundant microvessel density surrounding the cancers could also cause enhancement larger than the actual cancer sizes (14). This study showed that among the different MRI sequences, the concordance rate of measurements was the highest on early T1 for all observers, and overestimation was most prominent on MIP images for two observers. However, the chi-square test showed no statistically significant difference in concordance rate among the three MRI sequences. The study population might have been too small to reach statistical significance. According to the Bland–Altman plots, the differences between the MRI and pathology sizes were the lowest on early T1 for two observers (0.5029–0.6429 cm) and the highest on MIP images for all observers (0.9955–1.2143 cm).
Choi et al. reported that MIP could predict breast cancer pathology size more accurately than other sequences (10), especially for invasive ductal cancers that appeared as masses on MRI; however, NME comprised only a small portion (20.9%) of that study population. Moreover, Choi et al. also noted that overestimation occurred more often with MIP (122/799, 15.3%) than with early T1 (104/806, 12.9%). In another study, in which NME breast cancers comprised 28.3% of the study population, measuring breast cancers in the 2D anatomical plane was more accurate than measuring the longest axis in a 3D space (5). Authors of another study reported that measuring the size of the whole tumor showed better prediction on early subtracted images than on MIP or late subtracted images, which was similar to our study findings; however, only 8% of lesions were NME and thus exact comparisons are not possible (9). We did not include T2W imaging because most NME cancers cannot be seen on T2W imaging (9).
Because the ICCs for the three observers’ cancer size measurements were excellent, we performed subgroup analyses of cancers that were visible on both MRI and MMG to observer 1. MMG also showed a considerable overestimation rate (44.4%); its underestimation rate (20.6%) was higher than that for MRI (11.1–15.9%). A previous study also showed lower overestimation and higher underestimation rates for MMG than for MRI, but that study’s authors did not mention the exact MRI sequences they used; therefore, exact comparison with our study is not possible (2). In our study, the size measurements of NME breast cancers on MMG were less accurate than the measurements on early T1 MRI.
We acknowledge that there are several limitations in this study. First, the study was retrospectively designed and used a relatively small study population. We excluded more than half of the breast cancers that showed NME (83 of 153) and this could have caused bias; future studies with larger populations should be conducted. Second, on MIP and MMG, the observers evaluated different numbers of breast cancers because of prominent BPE on MRI or negative MMG findings. BPE was moderate to marked in approximately 30% of the population, and a number of cancers were not measured because BPE had masked the index tumors. On MMG, most of the study population had grade c or d breast composition, and therefore approximately 10% of breast cancers were missed. However, the results were similar between observers who evaluated all breast cancers on MIP and those who did not, and therefore, the effects were minimal. We also performed subgroup analysis of breast cancers that were visible on both MRI and MMG to minimize bias from missing data. Third, we did not directly compare the measurements with those for breast cancers that showed mass lesions on MRI. Although a number of authors have compared NME with mass enhancement (5,9,10,12), most have included only small numbers of NME cases, and therefore, future studies are needed with larger numbers of breast cancers that show NME.
In conclusion, the size assessments of NME breast cancers on MRI show high correlation both between observers and with pathology size. However, MRI measurements require caution because of the associated high overestimation rates. Measurements from early T1 images showed the highest concordance rate with pathology size.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
