Reliability and Diagnostic Thresholds for Ultrasound Measurements of Dermal Thickness in Breast Lymphedema

Abstract

Background:

Lymphedema of the breast, secondary to treatment for breast cancer, is difficult to assess due to the shape of the breast and the nature of the tissue. Ultrasound measurement of dermal thickness has previously been used to assess breast swelling; however, neither the reliability of the measurements nor the thickness that should be considered abnormal is currently known.

Methods and Results:

Thirty-eight women with breast edema were recruited and underwent assessment using ultrasound. During the assessment, the four quadrants (superior, inferior, medial, and lateral) of the affected and unaffected breasts were imaged three times each. Dermal thickness was then measured by two assessors, on two occasions for each captured image. The interimage, intrarater, and inter-rater reliability were all found to be excellent (Cronbach's alpha = 0.995; ICC_(3,1) = 0.962 and 0.851; and ICC_(2,1) = 0.977, respectively). A dermal thickness of >1.6 mm in the superior and lateral quadrants and >2.0 mm in the medial and inferior quadrants was determined, by receiver-operating characteristics curve analysis, as the optimal diagnostic threshold to detect breast edema.

Conclusion:

Dermal thickness measurements can be reliably completed on breasts with edema secondary to breast cancer. Future study is needed to determine the utility of the dermal thickness thresholds established as well as to investigate changes in dermal thickness as a response to treatment of breast edema.

Introduction

Breast lymphedema is a potential sequela of treatment for breast cancer. It is associated with pain and impairment as well as a reduction in quality of life¹ and is more likely to be self-reported as being severe than arm lymphedema.² The reported incidence of breast lymphedema varies between 9% and 35%.^1,3 A major reason for this variance is the lack of accepted measurement protocols or diagnostic thresholds.^1,3

Measurement techniques used to diagnose and monitor arm lymphedema, such as circumference measurements or bioimpedance spectroscopy, cannot be used for breast lymphedema due to the shape of the breast. Clinical assessment is the most common method of diagnosis of breast lymphedema but is subjective with only moderate inter-rater agreement.^3,4 Furthermore, clinical assessment may miss early signs of developing swelling.^5,6 Physical measurements, such as volume calculations⁷ and skin fold thicknesses,¹ also have poor reliability.¹ Imaging, specifically ultrasound, has therefore been suggested as an alternative, objective way of assessing breast lymphedema.

Ultrasound has been used to assess and diagnose lymphedema in the arm,^8,9 legs,^10,11 and genitals.¹² Particularly in the lower limb, it is able to distinguish between various pathologies that lead to increases in lower limb volume such as lipedema¹⁰ and dependent edema.¹¹ More importantly, specific differences in tissue presentations have been uniformly found between those with and without lymphedema regardless of the region being assessed. An increase in dermal thickness has been identified as a distinguishing feature on ultrasound assessment in limbs^8–10 and regions with lymphedema, including the breast.⁵ This appears to be a promising, objective way of assessing breast lymphedema; however, to be considered as a useful tool to assess lymphedema, the reliability and utility of the image capture and measurement protocols need to be determined.

In research studies using ultrasound, it is common to capture multiple images of each location and to average the measurements completed on each image. However, with the irregular shape of the breast, particularly when it is swollen, the consistency of the images is unknown. Furthermore, while the reliability of dermal thickness measurement is excellent in the upper limb,^13,14 it is unknown for the assessment of breast dermal thickness. Previous studies have found poor definition of the boundary between tissues, the dermal-fascial border, in both the leg¹⁰ and breast.⁶ This may lead to decreased inter- and intra-rater reliability of the measurement if the locations of the measurements are unclear. Finally, although dermal thickness has been assessed for those with breast lymphedema, the difference in thickness between the dermal tissue in an affected and unaffected breast has not been reported. The utility of this assessment protocol can only be determined if significant differences can be found between the two sides.

The aims of this study, therefore, were to (1) determine the reliability of the dermal thicknesses between images (interimage reliability); (2) determine the inter- and intra-rater reliability of dermal thickness measurements; and (3) confirm the ability of the measurements of dermal thickness to distinguish between breasts with and without lymphedema secondary to treatment for breast cancer.

Materials and Methods

Breast ultrasound images from women with breast lymphedema taking part in a larger study were assessed. All participants in the study had undergone a wide local excision and axillary surgery for early breast cancer, and had developed unilateral breast edema that had been stable for at least 3 months. Ethics approval was received from the University of Sydney's Human Research Ethics Committee (10-2011/14037) as well as from all other sites where recruitment occurred. All participants had signed consent forms before participation in the larger study.

A training set of ultrasound images from seven participants was compiled and used for practice by the assessors. The assessors also used this training set to discuss decision-making processes for challenging images, particularly when there was a disorganized lower border. A testing set of 38 participants was then evaluated with each assessor blinded to the results of the other assessors as well as to the affected sides of participants.

Ultrasound protocol

The participant was positioned supine with one pillow under the head. If the participant had large breasts, the participant's arm was placed above their head to assist with the breast position. Measurement locations 35–40 mm (dependent on breast size) superior, medial, inferior, and lateral to the nipple were marked bilaterally with an indelible pen to standardize the measurement location between image captures. Ultrasound images were captured using an Esaote MyLab™25Gold ultrasound machine (Esaote, Italy). The probe was held perpendicularly to the measurement site following the contour of the breast. A thick line of gel was applied to the surface of the 18–6 MHz linear probe (LA435; Esaote) to create a standoff effect and allow clear imaging of the surface of the dermal tissue. This gel layer and minimal probe pressure ensured that the thickness of the tissue was not altered due to compression. The depth of image captured was set at 2 cm. The gain was adjusted to ensure optimal image quality. An image of the breast tissue was then taken three times, with removal of the ultrasound probe from the tissue between each capture. Image quality focused on a clear dermal-gel border as well as a clear dermal-fascial boundary (Fig. 1A, B). A single individual, who had training with an experienced sonographer, undertook all data collection.

FIG. 1.

Sample image with clear gel-dermal and dermal-fascia borders of (A) unaffected breast and (B) affected breast.

Image analysis

Measurements of dermal thickness were taken from the superior edge of the dermal-gel border to the dermal-fascial border,¹⁵ preferentially from the center of the image. If the boundaries were not clear in the center of the image, the measurement was moved as necessary to the left or right to the location where the boundaries were most clear based on the judgment of the individual rater. If a clear boundary could not be determined (Fig. 2), no measurement was taken. Two assessors measured each image on two separate occasions. Thus, each assessor measured two sets of three images at each of the four locations on both the left and right breasts of all 38 participants. Neither assessor had previous training as a sonographer.

FIG. 2.

Sample image with disordered dermal-fascia border in an affected breast.

Data analyses

The interimage reliability was determined using Cronbach's alpha. This was calculated using both assessors' first measurement sets. The inter- and intrarater reliability was determined using the average dermal thickness found for the three images using intraclass correlation analysis.
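
As an illustration of the interimage statistic, Cronbach's alpha can be computed from a table in which each row is a measurement site and each column is one of the three repeated images. The following is a minimal Python sketch, not the SPSS procedure used in this study; the function name `cronbach_alpha` and the thickness values are hypothetical.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an (n_sites, k_images) array of measurements."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    # Sample variance of each repeated image (column) across sites
    item_variances = scores.var(axis=0, ddof=1)
    # Sample variance of the per-site totals (sum over the k images)
    total_variance = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_variances.sum() / total_variance)

# Hypothetical dermal thickness values in mm (not study data):
# rows are measurement sites, columns are the three repeated images.
thickness = np.array([
    [1.4, 1.5, 1.4],
    [2.1, 2.0, 2.2],
    [3.6, 3.7, 3.5],
    [1.2, 1.2, 1.3],
    [2.8, 2.9, 2.8],
])

alpha = cronbach_alpha(thickness)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Values close to 1 indicate that the repeated images give nearly interchangeable thickness measurements, which is the property that justifies averaging them.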

To determine whether the technique could distinguish between breasts with and without breast lymphedema, the average of the six measurements made by the two assessors was determined for each quadrant of both the affected and unaffected breasts. Two-way analysis of variance (ANOVA) was undertaken to determine whether dermal thickness differed between the affected and unaffected breast and whether there was an interaction between the side affected and location of measurement. Duncan's post hoc test was used to explore significant differences in relation to location. Pearson's correlation between age and dermal thickness in the unaffected breast for each location was also determined. Finally, the data were used to determine thresholds for detection of lymphedema. Receiver-operating characteristics (ROC) curve analyses were used to determine the best cut-point that differentiated the affected breast from the unaffected breast. The thresholds determined from this approach were compared with thresholds determined using an alternative approach of using 2 or 3 standard deviations (SDs) above the mean, derived from the unaffected breast. This alternative approach has been widely used to determine diagnostic thresholds for arm lymphedema.^16–18

All data analyses were completed using SPSS (IBM, version 22), with significance set at p < 0.05.
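
The criterion used to define the best cut-point is not stated above. The sketch below assumes the Youden index (sensitivity + specificity − 1), a common choice, and scans every observed value as a candidate threshold. It is written in plain Python rather than SPSS; the function name `youden_cutpoint` and the thickness values are hypothetical.

```python
def youden_cutpoint(unaffected, affected):
    """Return (threshold, sensitivity, specificity) maximizing the Youden index.

    A measurement is classed as positive (edema) when it is strictly
    greater than the threshold. Ties keep the lowest candidate threshold.
    """
    candidates = sorted(set(unaffected) | set(affected))
    best = None
    for t in candidates:
        sensitivity = sum(v > t for v in affected) / len(affected)
        specificity = sum(v <= t for v in unaffected) / len(unaffected)
        youden = sensitivity + specificity - 1.0
        if best is None or youden > best[0]:
            best = (youden, t, sensitivity, specificity)
    _, threshold, sensitivity, specificity = best
    return threshold, sensitivity, specificity

# Hypothetical dermal thickness values in mm (not study data)
unaffected = [1.1, 1.2, 1.3, 1.4, 1.4, 1.5, 1.6, 1.7]
affected = [1.5, 1.8, 2.0, 2.3, 2.6, 3.1, 3.4, 4.0]

threshold, sens, spec = youden_cutpoint(unaffected, affected)
print(f"cut-point > {threshold} mm, sensitivity {sens:.0%}, specificity {spec:.0%}")
```

Other criteria, such as the point closest to the top-left corner of the ROC curve, can select a different threshold from the same data.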

Results

The first measurement set for both assessors was used to assess the interimage reliability for the three images captured for each measurement location. The interimage reliability was found to be very high (Cronbach's alpha = 0.995).

Both assessors had excellent intrarater reliability [Assessor 1: ICC_(3,1): 0.962; 95% confidence interval (95% CI): 0.953–0.970 and Assessor 2: ICC_(3,1): 0.851; 95% CI: 0.817–0.880]. The inter-rater reliability was also excellent (ICC_(2,1): 0.977; 95% CI: 0.696–0.993). One or both of the assessors determined that 31 of 456 (6.8%) images could not be measured for the affected breast due to difficulties in determining the bottom boundary compared to 7 of 456 (1.5%) for the unaffected breast. For the unaffected breast, both assessors agreed that 4 of the 7 (57%) images were not assessable, while for the affected breast there was lower agreement on the images that could not be assessed (8 of 31; 26%).
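
For readers unfamiliar with the notation, ICC_(2,1) and ICC_(3,1) are the single-measure two-way random-effects and two-way mixed-effects coefficients in the Shrout and Fleiss scheme, and both can be derived from the mean squares of a two-way ANOVA. The sketch below shows that calculation in Python under those definitions; it is not the SPSS output reported here, and the function name `icc_single` and the ratings are hypothetical.

```python
import numpy as np

def icc_single(ratings):
    """Return (ICC(2,1), ICC(3,1)) for an (n_targets, k_raters) array."""
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()
    # Sums of squares for targets (rows), raters (columns), and residual
    ss_rows = k * ((x.mean(axis=1) - grand_mean) ** 2).sum()
    ss_cols = n * ((x.mean(axis=0) - grand_mean) ** 2).sum()
    ss_error = ((x - grand_mean) ** 2).sum() - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    # ICC(2,1) penalizes systematic differences between raters; ICC(3,1) does not
    icc_2_1 = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    icc_3_1 = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    return icc_2_1, icc_3_1

# Hypothetical mean dermal thickness in mm (not study data):
# rows are measurement sites, columns are the two assessors.
ratings = [
    [1.4, 1.5],
    [2.1, 2.3],
    [3.6, 3.5],
    [1.2, 1.3],
    [2.8, 3.0],
    [3.9, 4.1],
]

icc_2_1, icc_3_1 = icc_single(ratings)
print(f"ICC(2,1) = {icc_2_1:.3f}, ICC(3,1) = {icc_3_1:.3f}")
```

For intrarater reliability the two columns would instead hold the same assessor's first and second measurement occasions.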

As the interimage and inter-rater reliability was excellent, the average dermal thickness for each quadrant of each breast was calculated. ANOVA revealed that the dermal thickness was significantly larger in all quadrants of the affected breasts compared to the unaffected breasts (Table 1), and the dermal thickness differed, depending on location (Location × Side affected: F = 6.62; p < 0.01). Post hoc analysis of the unaffected breast data revealed a small but significant difference in dermal thickness among the four quadrants, with the lateral quadrant presenting with the least thickness and the medial and inferior quadrants with the greatest thickness. In contrast, analysis of only the affected breast data revealed that the dermal thicknesses for the medial and inferior quadrants were significantly greater than those for the superior and lateral quadrants (Fig. 3). Age was shown to have a negative medium correlation with dermal thickness in the unaffected breast in the medial (r = −0.41, p < 0.01), inferior (r = −0.49, p < 0.01), and lateral quadrants (r = −0.37, p = 0.02). No correlation was found between age and dermal thickness in the superior quadrant.

FIG. 3.

Comparison of mean dermal thickness for each quadrant for the unaffected (open circles) and affected (filled circles) breasts. Lines indicate no significant difference on post hoc analysis.

Table 1.

Average Dermal Thickness (in Millimeters) for Each Breast Quadrant for the Affected and Unaffected Breasts (n = 38)

	Affected Mean (SD)	Unaffected Mean (SD)	Difference Mean (SD)
Superior	2.4 (0.9)	1.4 (0.2)	1.0 (0.9)
Medial	3.7 (1.6)	1.6 (0.3)	2.1 (1.5)
Inferior	3.9 (1.6)	1.5 (0.3)	2.4 (1.6)
Lateral	2.8 (1.4)	1.2 (0.2)	1.6 (1.4)

SD, standard deviation.

The last step was to determine what would be an appropriate threshold for differentiating between tissue with and without lymphedema. As the dermal thickness differed significantly between the superior and lateral quadrants and the inferior and medial quadrants, receiver-operating characteristics (ROC) curve analysis was conducted separately for these two regions. For the superior/lateral quadrants, ROC analysis identified 1.6 mm as the cut-point for differentiating between tissue with and without lymphedema, with a sensitivity of 84% and specificity of 91%. For the medial/inferior quadrants, ROC analysis identified 2.0 mm as the cut-point for differentiating between tissue with and without lymphedema, with a sensitivity of 90% and specificity of 92%. However, 11% (16/152) of the average measurements on the unaffected breast were greater than the relevant thresholds. Notably, 10 of the 38 participants exceeded the threshold for the superior quadrant, whereas three each exceeded the thresholds for the medial and inferior quadrants; none exceeded the threshold for the lateral quadrant. In addition, the threshold was exceeded on the unaffected side in only one quadrant in nine participants, two quadrants in two participants, and three quadrants in one participant. In contrast, on the affected side, the threshold was exceeded in all four quadrants in 28 participants, in three quadrants in four participants, two quadrants in two participants, and one quadrant in three participants. Within the affected breast, there was little difference among which quadrants were affected, with the inferior quadrant most affected (92%; n = 35) and the lateral quadrant least affected (82%; n = 31). Normatively determined thresholds, set at 2SD and 3SD above the mean, were also determined for each quadrant. The resulting thresholds were, generally, higher than those determined by ROC analysis and the sensitivity and specificity lower (Table 2).
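
To make the quadrant-specific rule concrete, the sketch below applies the ROC cut-points reported above (>1.6 mm superior and lateral, >2.0 mm medial and inferior) to one set of quadrant thicknesses and lists the quadrants that exceed them. The names `THRESHOLDS_MM` and `quadrants_over_threshold` are hypothetical, and the example inputs are the group means from Table 1 rather than an individual participant.

```python
# ROC cut-points in mm, as reported in the text
THRESHOLDS_MM = {"superior": 1.6, "lateral": 1.6, "medial": 2.0, "inferior": 2.0}

def quadrants_over_threshold(thickness_mm):
    """Return the quadrants whose dermal thickness exceeds its cut-point."""
    return [q for q, cut in THRESHOLDS_MM.items() if thickness_mm[q] > cut]

# Group means from Table 1 (mm)
affected_means = {"superior": 2.4, "medial": 3.7, "inferior": 3.9, "lateral": 2.8}
unaffected_means = {"superior": 1.4, "medial": 1.6, "inferior": 1.5, "lateral": 1.2}

affected_flags = quadrants_over_threshold(affected_means)
unaffected_flags = quadrants_over_threshold(unaffected_means)
print("affected:", affected_flags)
print("unaffected:", unaffected_flags)
```

Counting the flagged quadrants per breast gives the kind of tally reported above, where most affected breasts exceeded the threshold in all four quadrants.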

Table 2.

Comparison of Three Methods of Determining Thresholds for the Classification of Lymphedema

	Superior quadrant	Medial quadrant	Inferior quadrant	Lateral quadrant
ROC analysis
Threshold	1.6 mm	2.0 mm	2.0 mm	1.6 mm
Sensitivity	84%	90%	90%	84%
Specificity	91%	92%	92%	91%
2SD above normative mean
Threshold	1.9 mm	2.2 mm	2.1 mm	1.5 mm
Sensitivity	68%	83%	83%	88%
Specificity	74%	97%	95%	83%
3SD above normative mean
Threshold	2.1 mm	2.5 mm	2.4 mm	1.8 mm
Sensitivity	60%	80%	80%	72%
Specificity	87%	87%	87%	74%

ROC, receiver-operating characteristics curve.
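
The normative thresholds follow the rule threshold = mean + k × SD, with the mean and SD taken from the unaffected breast. The sketch below applies that rule to the rounded values in Table 1; because those inputs are rounded to one decimal place, the results can differ from Table 2 by about 0.1 mm. The names `UNAFFECTED` and `normative_threshold` are hypothetical.

```python
# Unaffected-breast mean and SD in mm per quadrant, copied from Table 1
UNAFFECTED = {
    "superior": (1.4, 0.2),
    "medial": (1.6, 0.3),
    "inferior": (1.5, 0.3),
    "lateral": (1.2, 0.2),
}

def normative_threshold(mean, sd, k):
    """Threshold set k standard deviations above the normative mean."""
    return mean + k * sd

# Round to one decimal place to match the precision of the tables
thresholds_2sd = {
    q: round(normative_threshold(m, s, 2), 1) for q, (m, s) in UNAFFECTED.items()
}
thresholds_3sd = {
    q: round(normative_threshold(m, s, 3), 1) for q, (m, s) in UNAFFECTED.items()
}

for quadrant in UNAFFECTED:
    print(f"{quadrant}: 2SD {thresholds_2sd[quadrant]} mm, 3SD {thresholds_3sd[quadrant]} mm")
```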

Discussion

The use of ultrasound as an assessment tool for lymphedema is becoming increasingly widespread.¹⁹ With increases in the number of women undergoing excisional procedures combined with radiotherapy,³ the incidence of breast lymphedema may be increasing; objective ways of assessing this condition are necessary. While previous studies have used ultrasound to assess lymphedema of the breast secondary to treatment for breast cancer,⁴ this is the first study to show the robustness of the measurement protocol. The interimage reliability was excellent, as were the intra- and inter-rater reliability of the measurement of dermal thickness. This reliability was found with assessors who did not have a background in sonography but underwent training in the measurement and interpretation of ultrasound images. Furthermore, the measurement of dermal thickness was able to distinguish between breasts with and without edema; a dermal thickness of >2.0 mm in the medial or inferior quadrants of the breast or a dermal thickness of >1.6 mm in the superior or lateral quadrants of the breast should be considered indicative of lymphedema. These thresholds were found to have excellent sensitivity and specificity.

In agreement with our findings, previous research has found the thickness of normal breast skin to vary between 1 and 2 mm.^4,6 Similarly, in the arm, increasing skin thickness has been found to be correlated with increasing severity of lymphedema.¹⁵ However, even in the unaffected breast there is variation in the dermal thickness across the breast,⁴ supporting the importance of a systematic assessment of each quadrant of both the affected and unaffected breasts to determine the extent of swelling.³

We found greater variability in the affected breast than the unaffected breast as well as more images that could not be assessed. A more disordered dermal-fascial border in regions with lymphedema has previously been reported for both the breast⁶ and the leg.¹⁰ Naouri et al.¹⁰ reported that 68% of the images taken from individuals with lymphedema had a disorganized boundary region and that this was, in fact, a distinguishing feature of lymphedema from lipedema and control legs, which did not show this feature. With such precise measurements, it may have been expected that in the current study this disorganization would have a negative impact on reliability; however, although a small percentage of images were not assessable due to the disorganization, the excellent reliability found suggested this did not have a significant impact. Future research is required to determine the significance of this lack of uniformity and its presentation over time or in response to treatment.

Ultrasound imaging has been recommended for measurement and evaluation of the effectiveness of treatments for breast lymphedema.⁸ The basis for this recommendation is that changes in dermal thickness seen on ultrasound of the breast are not reflective of tissue changes due to radiotherapy and are, in fact, indicative of edema.⁴ The current study has advanced the use of ultrasound for diagnosing lymphedema by determining an appropriate threshold for detection of lymphedema in the breast as well as establishing the intra- and inter-rater reliability. However, a limitation of this study is that the ethnicity of participants was not recorded; it is therefore unclear whether these results would apply to those of varying ethnicity. As ethnicity has been shown to affect breast-related factors such as tissue density,²⁰ future research is required to elucidate these relationships. Age, which similarly affects tissue density,²⁰ was correlated with dermal thickness. With only 38 participants in a relatively narrow age range (36–70 years), the impact of this correlation also requires future study. Finally, the relationship between increasing dermal thickness and other clinical signs and symptoms of breast lymphedema is unknown. Low correlations between changes in dermal thickness and volume changes in the arm following treatment of arm lymphedema²¹ suggest that dermal thickness may capture an aspect of lymphedema not captured by measures of volume. Longitudinal studies are necessary to fully understand the contribution of ultrasound in diagnosing and monitoring breast lymphedema.

In conclusion, ultrasound measurement of breast dermal thickness appears to be a useful and reliable tool for the assessment of breast lymphedema secondary to treatment for breast cancer. Future research is required to further understand the presentation of images and the relationship with clinical signs and symptoms of breast edema.

Footnotes

Acknowledgment

The research presented in this study was funded by the National Health and Medical Research Council (Australia).

Author Disclosure Statement

No competing financial interests exist.

References

1. Jahr, Schoppe, Reisshauer. Effect of treatment with low-intensity and extremely low-frequency electrostatic fields (Deep Oscillation) on breast tissue and pain in patients with secondary breast lymphoedema. J Rehab Med, 2008; 40:645–650.

2. Sierla, Lee, Black, Kilbreath. Lymphedema following breast cancer: Regions affected, severity of symptoms and benefits of treatment from the patient's perspective. Clin J Oncol Nursing, 2013; 17:325–331.

3. Degnim, Miller, Hoskin, et al. A prospective study of breast lymphedema: Frequency, symptoms, and quality of life. Breast Cancer Res Treat, 2012; 134:915–922.

4. Wratten, O'Brien, Hamilton, Bill, Kilmurray, Denham. Breast edema in patients undergoing breast-conserving treatment for breast cancer: Assessment via high frequency ultrasound. Breast J, 2007; 13:266–273.

5. Wratten, Kilmurray, Wright, et al. Pilot study of high-frequency ultrasound to assess cutaneous oedema in the conservatively managed breast. Int J Cancer, 2000; 90:295–301.

6. Ronka, Pamilo, von Smitten KAJ, Leidenius MHK. Breast lymphedema after breast conserving treatment. Acta Oncologica, 2004; 43:551–557.

7. Kovacs, Edera, Hollweck, et al. Comparison between breast volume measurement using 3D surface imaging and classical techniques. Breast, 2007; 16:137–145.

8. Tassenoy, De Mey, De Ridder, et al. Postmastectomy lymphoedema: Different patterns of fluid distribution visualised by ultrasound imaging compared with magnetic resonance imaging. Physiotherapy, 2011; 97:234–243.

9. Devoogdt, Pans, De Groef, et al. Postoperative evolution of thickness and echogenicity of cutis and subcutis of patients with and without breast cancer-related lymphedema. Lymph Res Biol, 2014; 12:23–31.

10. Naouri, Samimi, Atlan, et al. High-resolution cutaneous ultrasonography to differentiate lipoedema from lymphoedema. Br J Dermatol, 2010; 163:296–301.

11. Suehiro, Morikage, Murakami, et al. Subcutaneous tissue ultrasonography in legs with dependent edema and secondary lymphedema. Ann Vascul Dis, 2014; 7:21–27.

12. Grainger, Hide, Elliott. The ultrasound appearances of scrotal oedema. Eur J Ultrasound, 1998; 8:33–37.

13. Han N-M, Cho Y-J, Hwang J-S, Kim H-D, Cho G-Y. Usefulness of ultrasound examination in evaluation of breast cancer-related lymphedema. Ann Rehab Med, 2011; 35:101–109.

14. Hwang, Lee, Kim. A new soft tissue volume measurement strategy using ultrasonography. Lymph Res Biol, 2014; 12:89–94.

15. Mellor, Bush, Stanton, Bamber, Levick, Mortimer. Dual-frequency ultrasound examination of skin and subcutis thickness in breast cancer-related lymphedema. Breast J, 2004; 10:496–503.

16. Dylke, Yee, Ward, Foroughi, Kilbreath. Normative volume difference between the dominant and nondominant upper limbs in healthy older women. Lymph Res Biol, 2012; 10:182–188.

17. Dylke, Schembri, Bailey, et al. Diagnosis of upper limb lymphedema: Development of an evidence-based approach. Acta Oncol (Stockholm, Sweden), 2016:1–7.

18. Cornish, Chapman, Hirst, et al. Early diagnosis of lymphedema using multiple frequency bioimpedance. Lymphology, 2001; 34:2–11.

19. Rockson. Ultrasonography in the evaluation of breast cancer-related lymphedema. Lymph Res Biol, 2016; 14:1.

20. Heller, Hudson, Wilkinson. Breast density across a regional screening population: Effects of age, ethnicity and deprivation. Br J Radiol, 2015; 88:20150242.

21. Hacard, Machet, Caille, et al. Measurement of skin thickness and skin elasticity to evaluate the effectiveness of intensive decongestive treatment in patients with lymphoedema: A prospective study. Skin Res Technol, 2014; 20:274–281.