Abstract
Background:
Circumferential measurements (CMs) every 4th cm are commonly used to assess lower limb volume (LLV), but fewer measurements would be less time-consuming. The aim of this study was therefore to establish the agreement between LLV measurements derived from CMs every 4th cm (V4), 8th cm (V8), and 12th cm (V12), and to evaluate the intrarater test–retest reliability for each of the three measurement methods in persons with lower limb lymphedema (LLL).
Methods and Results:
Forty-two persons with unilateral or bilateral LLL were measured twice, 2 weeks apart. Volume measurements for the V4, V8, and V12 methods were derived from CMs. The agreement was evaluated using the intraclass correlation coefficient (ICC3.1) and Bland–Altman graphs including 95% limits of agreement (LOA). The reliability was evaluated using ICC2.1, standard error of measurement (SEM%), and smallest real difference (SRD%). The agreement was high for the V4 and V8 methods (ICC 0.999), and for the V4 and V12 methods (ICC 0.998). The graphs revealed slightly higher agreement between the V4 and V8 methods than between the V4 and V12 methods, as visualized by the 95% LOA (−117 to 62 and −236 to 132 mL, respectively). For all three measurement methods, the test–retest reliability was high (ICC 0.993–0.995) and the measurement error low (SEM%: 1.2%–1.4% and SRD%: 3.4%–3.8%).
Conclusions:
The higher agreement between the V4 and V8 methods than between the V4 and V12 methods, together with the high test–retest reliability of the LLV measurements, supports replacing the V4 method with the V8 method in persons with LLL.
Background
Lymphedema (LE) is considered a chronic disease characterized by increased volume of the affected limb or limbs. 1 Measuring lower limb volume (LLV) and its changes over time is therefore essential in persons at risk of or diagnosed with lower limb lymphedema (LLL). Reliable measurements of LLV help clinicians to diagnose LLL, determine the stage of LE, plan appropriate management, and evaluate the effects of treatment. 1
Various measurement methods can be used to assess LLV. The water displacement method (WDM) is standard in upper limb LE, 2 but is less common in LLL because of the bulky equipment, the large amount of water needed, and the extensive cleanup required after use. An advantage of the WDM is that the measurement covers the entire limb, including the foot. A disadvantage, however, is that there is limited knowledge about the reliability of LLV measurements. 3
The Perometer, an optoelectronic measurement method for LLV measurements, has a short measurement time4,5 and a high test–retest reliability in healthy persons.5,6 The disadvantages are that the Perometer is expensive and mostly used in specialist clinics, and that there is a lack of knowledge about the reliability of LLV measurements in persons with LLL.
The tape measurement method is standard for volume measurements in LLL,1,2 and circumferential measurements (CMs) every 4th cm along the limb are commonly used.2,5,7,8 The advantages of this method are that it is low in cost, easy to clean, and requires limited space. The disadvantages are that CMs every 4th cm are time-consuming and that the measurement procedure can be challenging (e.g., positioning the first measurement point, choosing the right angle for the tape on the limb, and applying consistent tension to the tape). 9 However, by using a standardized measurement procedure, reliable CMs with small measurement errors can be obtained.7,8,10
Using fewer CMs than every 4th cm for LLV measurements would be desirable, as this would save time in the clinic. Currently, few studies have evaluated how many CMs are needed for reliable assessments of LLV. Tidhar et al 11 evaluated test–retest reliability of LLV measurements derived from CMs at only 8 predefined points in five persons (two healthy and three with LE) and found that the standard error of measurement (SEM) was low. Mayrovitz et al 12 investigated therapy-related changes in LLV measurements based on CMs every 4th cm, 8th cm, and 12th cm and found no difference in volume reductions regardless of the method used. However, when investigating whether a new measurement method can replace an old one,13,14 several statistical methods are recommended, such as calculating agreement, mean differences, and measurement errors. 14
Therefore, the aims of this study were to (1) establish the agreement between LLV measurements derived from CMs every 4th cm (V4 method; reference standard) and those derived from CMs every 8th cm (V8 method) and every 12th cm (V12 method), and (2) evaluate the intrarater test–retest reliability for each of the three methods in persons with LLL.
Methods
Research design
A test–retest design was used in the present study.
Participants
Between April 2018 and March 2019, 42 persons with LLL were recruited from the LE unit at Skåne University Hospital. The inclusion criteria were as follows: (1) 18 years or older; (2) a diagnosis of unilateral or bilateral primary or secondary LLL (assessed by lymphoscintigraphy and/or a medical specialist); (3) persistent LE for the last 6 months; (4) a total limb volume variation ≤5% for each limb the last 6 months; (5) treatment with compression stockings daytime or day and night according to usual care; and (6) a compression garment not older than 2 months at the time of inclusion.
The exclusion criteria were as follows: (1) comorbidity such as heart failure, kidney disease, or venous insufficiency that could affect swelling of the LLs; (2) prosthetic knee or hip implants; (4) muscular disorders of the LLs; (5) intake of diuretic medication or any other drug that may interfere with the volume of the LLs; and (6) inability to understand written or oral information in Swedish.
Ethics
Before inclusion, all participants received written and oral information about the study and provided written consent to participate. The study was approved by the Regional Ethics Committee Review Board in Lund, Sweden (Dnr 2016/136).
Measurements
Body mass index was calculated (kg/m²) using the weight measured on a digital scale and the body height reported by each participant.
Experience of heaviness and tightness in the LE limb/limbs over the past week was rated using a 100-mm visual analogue scale 15 ranging from “no discomfort” (0 mm) to “worst imaginable discomfort” (100 mm).16–18
Thickness of the subcutaneous tissue 19 of the LLs was assessed with the subject in the supine position with bent knees. Palpation of the tissue using the thumb and index finger was performed at the dorsal, lateral, and medial side of the lower part of the limbs and lateral, anterior, and medial side of the upper part of the limbs.8,10 Increased thickness was noted as yes or no.
Leisure time physical activity during the last 6 months was rated using a six-grade classification system, 20 ranging from very low to regular/very strenuous activity. This classification system has been validated for a Scandinavian population. 20
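Lower limb volume was calculated using the tape measurement method and CMs every 4th cm (V4 method), every 8th cm (V8 method), and every 12th cm (V12 method) along the limb. The following formula for a truncated cone was used: V = h × (C1² + C1 × C2 + C2²) / (12π), where V is the volume of one limb segment, C1 and C2 are the circumferences at the two ends of the segment, and h is the segment length.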
To ensure that the same limb length measurement was used for all methods, the length for the V4 method was used as a preference. The most proximal volume segment was therefore converted to either a 4 cm segment or an 8 cm segment for the V8 method or the V12 method (Table 1). To analyze volume measurements in both limbs for all participants, the limb with the larger volume was referred to as the more affected (MA) limb and the limb with the smaller volume was referred to as the less affected (LA) limb in participants with bilateral LE. For participants with unilateral LE, the affected limb was referred to as the MA limb and the nonaffected limb was referred to as the LA limb.
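To illustrate how the three methods relate to each other, the following minimal Python sketch sums truncated-cone segments from a list of circumferences taken every 4th cm and derives the V8 and V12 volumes by using every 2nd or every 3rd measuring point. The function names and the example circumferences are hypothetical. Always keeping the most proximal measuring point, so that all methods span the same limb length, is one reading of the procedure described above and not necessarily the exact calculation used in the study.

```python
import math


def truncated_cone_volume_ml(c_lower_cm, c_upper_cm, height_cm):
    """Volume (cm^3 = mL) of a truncated cone given its two end circumferences."""
    return height_cm * (
        c_lower_cm ** 2 + c_lower_cm * c_upper_cm + c_upper_cm ** 2
    ) / (12 * math.pi)


def limb_volume_ml(circumferences_cm, step=1, spacing_cm=4.0):
    """Sum truncated-cone segments using every `step`-th measuring point.

    step=1 corresponds to V4, step=2 to V8, and step=3 to V12.
    Assumption: the most proximal point is always kept, so the last
    segment may be shorter (4 or 8 cm) than the others.
    """
    if len(circumferences_cm) < 2:
        raise ValueError("at least two measuring points are needed")
    last = len(circumferences_cm) - 1
    indices = list(range(0, last + 1, step))
    if indices[-1] != last:
        indices.append(last)  # shorter most-proximal segment
    total = 0.0
    for lo, hi in zip(indices, indices[1:]):
        height_cm = (hi - lo) * spacing_cm
        total += truncated_cone_volume_ml(
            circumferences_cm[lo], circumferences_cm[hi], height_cm
        )
    return total


if __name__ == "__main__":
    # Hypothetical circumferences (cm) at 4-cm intervals from ankle to groin.
    example = [23.0, 24.5, 27.0, 30.5, 34.0, 36.5, 37.0, 36.0, 35.5,
               37.0, 40.0, 43.5, 47.0, 50.5, 53.0, 55.5, 57.0]
    for name, step in (("V4", 1), ("V8", 2), ("V12", 3)):
        print(name, round(limb_volume_ml(example, step=step)), "mL")
```

For a limb whose circumference changes linearly along its length, the three methods give identical volumes; differences arise only where the limb contour curves between the retained measuring points.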
Table 1. Total Length of the Lower Limbs and Number of Measuring Points for Circumferential Measurements for the V4, V8, and V12 Methods
CMs, circumferential measurements; V4, CMs every 4th cm; V8, CMs every 8th cm; V12, CMs every 12th cm.
Procedure
Measurements of body weight and LLV were performed on two occasions, 2 weeks apart, by an experienced physiotherapist (C.J.). The participants were asked to maintain their normal activity level the day before each test occasion. The measurements took place in the morning at about the same time and followed the same procedure each time, starting with the body weight, followed by 10 minutes of rest in a supine position and then the CMs.
To identify and mark the measuring points for the CMs, a 110-cm-long measuring board, a 20-cm-long ruler, a narrow measuring tape, and a water-soluble pen were used. To ensure the right position of the LL, the foot and heel were placed against the footplate so that the length measurements on the lateral side of the board were visible (Fig. 1A). The short ruler was used to correctly position the measuring points by reading the length measurements on the board and to mark the points on the limb with 4-cm intervals starting 10 cm above the heel and ending near the groin (Fig. 1B).8,10

For the CMs, the measuring tape was placed around the limb tight to the skin with the tape slightly overlapping the measuring point (Fig. 1C). The measurement was taken once at each marking, to the nearest millimeter.8,10 Only the total limb length measure was available at the second test occasion.
To characterize the participants, the following measurements were conducted on the first test occasion: body height and weight, ratings of heaviness and tightness, thickness of the subcutaneous tissue, and leisure time physical activity status.
Statistics
For statistical analysis, IBM (Armonk, NY) SPSS Statistics version 24 was used. Demographics and clinical characteristics of the participants are presented as frequencies, means, and standard deviations (SDs). LLV measurements for each of the three methods on both test occasions are presented as means and SDs.
For the agreement analysis, data from the first test occasion were used. The agreement between the V4 and V8 methods and between the V4 and V12 methods was analyzed by intraclass correlation coefficient (ICC3.1), and by quantifying the differences between the methods using Bland–Altman graphs. In the graphs, the difference between two methods was plotted against the mean of the two methods to visually demonstrate any systematic bias and outliers. 14 The 95% limits of agreement (LOA) were also calculated where 95% of the differences between the measurements by the two methods are expected to lie. 14
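The Bland–Altman quantities described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the SPSS procedure used in the study. The function name and the direction of the difference (first method minus second method) are assumptions made for the example.

```python
from statistics import mean, stdev


def bland_altman(volumes_a_ml, volumes_b_ml):
    """Return the points of a Bland-Altman graph and the 95% limits of agreement.

    Assumption: differences are computed as method A minus method B.
    """
    if len(volumes_a_ml) != len(volumes_b_ml):
        raise ValueError("both methods must have one value per limb")
    diffs = [a - b for a, b in zip(volumes_a_ml, volumes_b_ml)]
    means = [(a + b) / 2 for a, b in zip(volumes_a_ml, volumes_b_ml)]
    bias = mean(diffs)      # mean difference (systematic bias)
    sd_diff = stdev(diffs)  # sample SD of the differences
    return {
        "means": means,     # x-axis of the graph
        "diffs": diffs,     # y-axis of the graph
        "bias": bias,
        "loa_lower": bias - 1.96 * sd_diff,
        "loa_upper": bias + 1.96 * sd_diff,
    }
```

Plotting `diffs` against `means`, with horizontal lines at `bias`, `loa_lower`, and `loa_upper`, gives the graph; a funnel-shaped scatter would suggest that variability increases with limb volume.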
The intrarater test–retest reliability was analyzed by ICC2.1. According to Fleiss, 21 ICC values ≤0.40 represent poor reliability and ≥0.75 represent excellent reliability. Changes in the mean were analyzed by calculating mean differences (
The SRD represents the limit for the smallest change that indicates a real change for a single person and is defined as follows: SRD = 1.96 × SEM × √2.
Results
Participants
Characteristics of the 42 participants (30 women and 12 men) are presented in Table 2. Thirty of them (71%) had secondary cancer-related LLL. Unilateral LE was present in 24 participants (57%), while 18 (43%) had bilateral LE. The duration of the LE was on average 11 (SD 8) years. In the MA limb, increased thickness of the subcutaneous tissue was present in the lower leg (n = 35) and in the thigh (n = 33). A feeling of heaviness during the past week was reported by 18 participants (43%), while 8 (19%) reported a feeling of tightness. About half of the participants (n = 22) were working, and the level of physical activity varied widely.
Table 2. Characteristics of the 42 Participants with Lower Limb Lymphedema
LA, less affected;
In Table 3, the mean values (SD) of LLV measurements for the V4, the V8, and the V12 methods are presented. On average, there were 14 days (SD 2) between the two test occasions.
Table 3. Lower Limb Volume Measurements on Two Test Occasions in 42 Participants with Lower Limb Lymphedema Using the V4, V8, and V12 Methods
Agreement between the measurement methods
In Table 4, agreement between the three measurement methods is presented. For the V4 and V8 methods, ICC was 0.999 and the 95% CI was narrow. The mean difference for the MA limb was −31 mL (95% CI −43 to −18) and for the LA limb −28 mL (95% CI −42 to −13).
Table 4. Agreement Between Measurements for the V4 and V8 Methods, and Between the V4 and V12 Methods, in the More Affected Limb and Less Affected Limb, Respectively, in 42 Persons with Lower Limb Lymphedema
CI, confidence interval;
For the V4 and V12 methods, ICC was 0.998 for both the MA and LA limbs, and the 95% CIs were narrow. The mean difference for the MA limb was −35 mL (95% CI −61 to −9) and for the LA limb −52 mL (95% CI −81 to −23).
The Bland–Altman graphs (Fig. 2A–D) revealed that the variability between the V4 and V8 methods (Fig. 2A, B) and the V4 and V12 methods (Fig. 2C, D) was small. No systematic relationship between the differences and the mean volumes was revealed, nor was any increase in variability for larger volumes. For the V4 and V8 methods, the 95% LOA ranged between −117 and 62 mL across the MA and LA limbs. For the V4 and V12 methods, the 95% LOA ranged between −236 and 132 mL across the MA and LA limbs.

Bland–Altman graphs where the differences between the V4 and V8 methods
Intrarater test–retest reliability
In Table 5, test–retest reliability data are presented for the V4, V8, and V12 methods. The ICCs ranged from 0.993 to 0.995 and the 95% CIs were narrow for all methods. The
Table 5. Intrarater Test–Retest Reliability of Volume Measurements for the V4, V8, and V12 Methods, in the More Affected Limb and Less Affected Limb in 42 Persons with Lower Limb Lymphedema
Discussion
To the best of our knowledge, this is the first study showing that fewer CMs than every 4th cm can be used for LLV measurements in persons with mild-to-moderate LLL without compromising reliability. Overall, the agreement was high between all measurement methods, but slightly higher between the V4 and V8 methods than between the V4 and V12 methods, and the test–retest reliability was equally high for all three methods. The analyses indicate that the V8 method, in particular, can detect clinically relevant changes in LLV measurements similar to the V4 method. The V8 method can thus replace the V4 method when assessing LLV in persons with mild-to-moderate LLL.
The agreement between the V4 and V8 methods and between the V4 and V12 methods was very high (ICC 0.998–0.999), and the differences between the methods were low as visualized by the narrow 95% LOA (−117 to 62 and −236 to 132 mL, respectively). The Bland–Altman graph has been used in previous studies where agreement between the LLV measurements derived from different measurement methods has been evaluated.25,26 In these studies, a lack of agreement was found in most of the comparisons due to the wide LOA. However, for the comparison of the WDM and the tape measurement method using CMs every 3rd cm in healthy lower legs, the LOA were narrow, indicating that the more practical tape measurement method could be used instead of the WDM. 25
Even though the 95% LOA are a matter of clinical consideration, the LOA in our study for the V4 and V8 methods are in line with that of Sukul et al, 25 supporting that the agreement between these two methods is acceptable.
Furthermore, the test–retest reliability was very high for all three measurement methods in our study (ICC 0.993–0.995). The findings agree with the results in LLs of healthy persons (ICC 0.99) 8 and in persons with LLL (ICC 0.99) 10 using CMs every 4th cm for volume measurements. The findings are also in line with the results in persons with breast cancer-related upper limb LE (ICC 0.97–0.99),27–29 indicating that the tape measurement method is a reliable method for assessing the volume of both the upper and lower limbs.
In our reliability analysis, several statistical methods were used, as recommended. 22 The measurement errors (SEM% and SRD%) for the V4, V8, and V12 methods were very low, and in agreement with studies in healthy women and men (SEM% 1.1–1.3%; SRD 3.1–3.6%). 8 The measurement errors presented in the reliability studies of upper limb volume measurements27–29 were in absolute values. Relative values would have been preferred, as this would have enabled comparisons between studies and facilitated the interpretation for clinical use. 23
For a measuring method to be useful in the clinic, not only reliability but also time efficiency and feasibility must be considered.29,30 De Vrieze et al 29 concluded that the tape measurement method using CMs every 4th cm was the best measuring method considering reliability and measurement error, cost, limitations, and time consumption when different volume measurement methods in upper limb LE were compared. Surprisingly, there are very few studies focusing on these aspects even though the tape measurement method is commonly used in LLL.
In our study, between 16 and 19 measuring points were used for the V4 method, and this is rather time-consuming. The number decreased to 8–10 points for the V8 method and to 6–7 points for the V12 method. Hence, with the V8 method instead of the V4 method, fewer CMs will be used, and the time required for this measurement method becomes shorter. Brorson and Höijer 31 evaluated upper limb volume measurements based on CMs every 4th cm and CMs at 5 measuring points equivalent to a made-to-measure compression garment. They concluded that volume measurements derived from these two measurement methods did not differ significantly, and by using the same 5 measuring points for volume and for a made-to-measure compression garment, time will be saved.
Basing LLV measurements on CMs at measuring points equivalent to those of a made-to-measure compression garment may also shorten the measurement time, but this measurement method must first be investigated using a comprehensive set of statistical analyses such as those used in our study.
Strengths and limitations
A strength of the present study was that 42 participants were included. In reliability studies, a sample size of at least 30 is recommended.21,32 Another strength was the use of a highly standardized test protocol that had previously been evaluated8,10 and the comprehensive set of statistical methods for the agreement analysis 14 and the test–retest reliability,22–24 enabling a comparison between reliability studies in the future. A limitation of the present study was that LLV measurements were based only on CMs on the limb and not on the foot. A single CM on the foot has been shown to be reliable, 33 but to calculate the volume of the foot using the tape measurement method is difficult due to the foot's irregular configuration.
Another limitation is that the results are based on persons with mild-to-moderate LLL. The use of the V8 method for LLV measurements in persons with severe LLL needs further investigation, as there may be a risk of larger variability in larger volume measurements over time.
Conclusion
The agreement was high between all measurement methods, but slightly higher between the V4 and V8 methods than between the V4 and V12, and the test–retest reliability was equally high for all three methods. The V8 method can thus replace the V4 method when assessing LLV in persons with mild-to-moderate LLL.
Footnotes
Acknowledgments
We thank the persons who volunteered to participate and Michael Miller, RPT, PhD, for language editing.
Authors' Contributions
Conceptualization, methodology, formal analysis, and writing—original draft (C.J.). Conceptualization, methodology, and writing—review and editing (K.J., M.B., and C.B.).
Author Disclosure Statement
No competing financial interests exist.
Funding Information
This study was supported by grants from the Swedish Cancer Society (CAN 2015/443).
