Reliability of the Inverse Water Volumetry Method to Measure the Volume of the Upper Limb

Abstract

Background:

Lymphedema of the upper extremity is a common side effect of lymph node dissection or irradiation of the axilla. Several techniques are being applied in order to examine the presence and severity of lymphedema. Measurement of circumference of the upper extremity is most frequently performed. An alternative is the water-displacement method. The aim of this study was to determine the reliability and the reproducibility of the “Inverse Water Volumetry apparatus” (IWV-apparatus) for the measurement of arm volumes.

Patients and Methods:

The IWV-apparatus is based on the water-displacement method. Measurements were performed by three breast cancer nurse practitioners on ten healthy volunteers in three weekly sessions.

Results:

The intra-class correlation coefficient, defined as the ratio of the subject component to the total variance, equaled 0.99. The reliability index was calculated as 0.14 kg. This indicates that only changes in a patient's arm volume measurement of more than 0.14 kg would represent a true change in arm volume, which is about 6% of the mean arm volume of 2.3 kg.

Conclusion:

The IWV-apparatus proved to be a reliable and reproducible method to measure arm volume.

Introduction

Postoperative lymphedema of the upper extremity can be a significant problem for women treated for breast cancer. Lymphedema of the upper extremity is mostly seen after extensive axillary surgery (axillary lymph node dissection, ALND) but may also occur after minimally invasive axillary staging (sentinel lymph node biopsy, SLNB).

The incidence of arm lymphedema ranges from 0% to 13% in breast cancer patients who underwent an SLNB and from 7% to 77% after ALND.^1–9 This wide range is mainly due to the different definitions and measurement methods that are used to diagnose lymphedema.¹⁰ Commonly used methods to assess and to grade the presence of lymphedema are:

the Herpertz method:¹¹ This method is a combination of a four-point circumference measurement of both limbs. Afterwards a calculation is made of the relative proportional swelling compared to the healthy side.

circumference measurement at various levels on both arms: A difference between both arms of 2.0 cm or more, or sometimes 2.5 cm or more, is used to define lymphedema.^4,12–16

questionnaires to determine whether patients consider themselves as having lymphedema.^6,8,17,18

Other methods based on perometry,¹⁹ ultrasonography,¹⁰ magnetic resonance imaging,²⁰ and scintigraphy²¹ are not viable in current medical practice.

Another, not widely used, method is “Inverse Water Volumetry” (IWV). This technique is based on the water-displacement method^22–26 and is performed using an IWV-apparatus (Fig. 1). The water-displacement method is considered to be the most accurate method because it is easy to perform, is painless, and allows direct measurement of objects with an irregular form.^23–25,27 Currently, a multicenter trial is being performed in the Netherlands in which this IWV-apparatus is used to measure the presence of lymphedema. The trial includes breast cancer patients undergoing axillary reverse mapping (ARM), a new type of axillary surgery that is expected to decrease upper extremity lymphedema.²⁸ In this trial, lymphedema is defined as a 10% difference between the preoperative and the postoperative arm volume.
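The trial definition of lymphedema can be written as a simple check. The sketch below is only an illustration: it assumes that the "10% difference" means an increase of at least 10% relative to the preoperative volume, which the text does not spell out, and the function names are hypothetical.

```python
def relative_volume_change(preoperative_kg: float, postoperative_kg: float) -> float:
    """Return the postoperative change as a fraction of the preoperative volume."""
    if preoperative_kg <= 0:
        raise ValueError("preoperative volume must be positive")
    return (postoperative_kg - preoperative_kg) / preoperative_kg


def meets_trial_definition(preoperative_kg: float, postoperative_kg: float,
                           threshold: float = 0.10) -> bool:
    """True when the arm volume increased by at least `threshold`.

    Assumption: the 10% difference is taken relative to the preoperative
    volume and only an increase counts as lymphedema.
    """
    return relative_volume_change(preoperative_kg, postoperative_kg) >= threshold


# Hypothetical example: an arm of 2.30 kg water before and 2.60 kg after surgery.
print(meets_trial_definition(2.30, 2.60))
```

Volumes are given in kg water, the unit in which the IWV-apparatus reports its readings.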

FIG. 1.

The Inverse Water Volumetry apparatus

The aim of this study is to assess the reliability and the reproducibility of the IWV-apparatus for measuring arm volumes.

Patients and Methods

Study cohort

In this experiment ten healthy volunteers (5 males, 5 females) participated. All volunteers were medical students. Three observers (breast cancer nurse practitioners) performed all measurements; none of the observers acted as a subject and none of the subjects acted as an observer. All participants gave informed consent. Ethical approval for this study was provided by the local ethical committee of the Amphia Hospital. The study was conducted in accordance with the Declaration of Helsinki.

Inverse Water Volumetry apparatus²⁹

Archimedes' principle indicates that the upward buoyant force that is exerted on a body immersed in a fluid is equal to the weight of the fluid that the body displaces. The IWV-apparatus is based on Archimedes' principle (Fig. 1).

Prior to use, the apparatus is calibrated to zero by filling the water tank up to the reference point at which the water flows into the overflow tube. The overflow tube is emptied, after which the patient's arm is placed into the water tank. When the whole system is in equilibrium, the patient's arm is removed and the display of the weighing device shows the deficit of water compared with the initial situation. Because this “lack of water” is measured, the method is named “Inverse Water Volumetry”.²⁹
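The displayed water deficit is a mass, and this study reports arm volumes directly in kg water. If a volume in litres is wanted, the reading can be divided by the density of water. The following is a minimal sketch of that conversion; the function name and the default density of 0.998 kg/L (water at about 20 °C) are assumptions for illustration and are not taken from the apparatus documentation.

```python
def arm_volume_litres(water_deficit_kg: float,
                      water_density_kg_per_l: float = 0.998) -> float:
    """Convert the displayed water deficit (kg) into a displaced volume (L).

    Assumption: the default density corresponds to water at roughly 20 °C;
    pass the density that matches the actual water temperature if needed.
    """
    if water_deficit_kg < 0:
        raise ValueError("water deficit cannot be negative")
    if water_density_kg_per_l <= 0:
        raise ValueError("density must be positive")
    return water_deficit_kg / water_density_kg_per_l


# Hypothetical reading of 2.30 kg water.
print(round(arm_volume_litres(2.30), 3))
```

Because the density of water is close to 1 kg/L, the reading in kg and the volume in litres differ by well under 1%.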

Measurements

The observers performed the arm volume measurements with the IWV-apparatus during three weekly sessions. At each session the subjects were measured separately by each of the three observers. Sessions were scheduled at the same time of day for each subject. The subjects were measured in a random order at each session. For each separate measurement, the three observers were given a blank form on which to record the measurement result, the temperature of the water, and the room temperature. It was a double-blind study design; the observers were not able to see their preceding or each other's measurements and the subjects were not able to see their own outcomes or those of the other subjects.

Statistical analysis

The raw volume data was summarized through means and standard deviations by session (three sessions) and observer (three observers). After averaging the subject's volume measurements over the three observers, the volume trajectories of the ten subjects across the three sessions were depicted in a figure. In each subject, the average of the nine volume measurements and the nine deviations from that average were calculated. Then those deviations from average were plotted against the average subject's volumes using all subjects. In the resulting scatterplot, reference lines were drawn calculated as ±1.96 times the crude (unadjusted) estimate of the within-subject standard deviation of the volume measurements.
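The descriptive step above can be sketched in a few lines. The example assumes that the crude within-subject standard deviation is the pooled estimate (squared deviations from each subject's own mean, divided by the pooled degrees of freedom); the text does not state the exact estimator, and the data and function names below are hypothetical.

```python
from math import sqrt
from statistics import mean


def deviations_from_subject_mean(measurements):
    """Return (subject mean, deviation) pairs for every single measurement."""
    points = []
    for values in measurements.values():
        subject_mean = mean(values)
        points.extend((subject_mean, v - subject_mean) for v in values)
    return points


def crude_within_subject_sd(measurements):
    """Pooled within-subject SD, unadjusted for any fixed effects (assumed estimator)."""
    squared_sum = 0.0
    degrees_of_freedom = 0
    for values in measurements.values():
        subject_mean = mean(values)
        squared_sum += sum((v - subject_mean) ** 2 for v in values)
        degrees_of_freedom += len(values) - 1
    if degrees_of_freedom <= 0:
        raise ValueError("need at least one subject with two or more measurements")
    return sqrt(squared_sum / degrees_of_freedom)


def reference_lines(measurements):
    """Lower and upper reference lines at -1.96 and +1.96 times the crude SD."""
    sd = crude_within_subject_sd(measurements)
    return -1.96 * sd, 1.96 * sd


# Hypothetical volumes (kg water) for two subjects, not data from this study.
example = {"A": [2.0, 2.2], "B": [3.0, 3.2]}
print(reference_lines(example))
```

The pairs returned by `deviations_from_subject_mean` are the points of the scatterplot: subject mean on the horizontal axis, deviation on the vertical axis.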

A formal statistical analysis of the data was performed using linear mixed modeling in the total dataset of 90 observations. The following fixed effects were specified in order to account for potential systematic differences in measurements between sessions: water temperature (degrees centigrade), room temperature (degrees centigrade), and session (a three-level categorical nominal variable). Given these fixed effects, the following sources of variability (random effects) were considered for decomposing the total variance of the measurements: the subjects (1), the observers (2), and the interactions when combining these sources: observer × subject (3), observer × session (4), and subject × session (5). The remaining error component then is the interaction subject × observer × session (6). These effects were taken to represent the following variance components, respectively: between-subjects (source (1)), between-observers (source (2)), within-observers (sources (3) and (4)), within-subjects (source (5)), and the error inherent in the functioning of the measurement device (source (6)). Total variance is the sum of these six components.

The fixed and random effects were estimated using the restricted maximum likelihood (REML) method. Reliability is represented by the intra-class correlation coefficient (ICC), defined as the ratio of the first component (the subjects, source (1)) to the total variance. The ICC is a dimensionless number between 0 and 1, representing the similarity of replicate measurements on the same subject. Clinically it may be more useful to define a reliability measure in the same units as the measurement involved (kilogram water). To this end, a reliability index is defined as 1.96 times the square root of twice the sum of the remaining components (sources (2) to (6)), which together comprise all within-subject random components. It represents the maximum absolute difference in kilogram water allowed (at the 95% level) between two consecutive measurements of the same subject with an unchanged arm volume.
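In symbols, the two reliability measures can be written as follows. The notation is introduced here only for illustration: the variance component of source (k) is written as sigma_k squared, for k = 1 to 6, and RI stands for the reliability index. The factor 2 under the square root reflects that the difference between two replicate measurements on the same subject, if their within-subject errors are independent, has twice the within-subject variance.

```latex
% \sigma_k^2: variance component of source (k), k = 1, ..., 6 (assumed notation)
\[
\mathrm{ICC} = \frac{\sigma_1^2}{\sum_{k=1}^{6} \sigma_k^2},
\qquad
\mathrm{RI} = 1.96 \times \sqrt{2 \sum_{k=2}^{6} \sigma_k^2}
\]
```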

Results

The mean age of the ten subjects was 27 years and their mean Body Mass Index (BMI) was 21. Baseline characteristics are shown in Table 1.

Table 1.

Baseline Characteristics of the Ten Healthy Volunteers (5 Males and 5 Females)

	Mean	Standard deviation	Minimum	Maximum
Age (years)	26.7	2.7	24	32
Weight (kg)	67.5	11.1	53	85
Height (m)	1.78	0.06	1.71	1.86
BMI (kg/m²)	21.3	2.7	17	25

Summary statistics of measured volumes by observer and session are shown in Table 2. Notably, one volunteer was not able to attend the second session. Mean volumes appeared to be systematically lower in session 3 than in sessions 1 and 2 for each observer.

Table 2.

Measurement Results with the “Inverse Water Volumetry Apparatus” (IWV-Apparatus)

Observer	Session	Mean (kg)	N	Standard deviation (kg)	Minimum (kg)	Maximum (kg)
1	1	2.33	10	0.46	1.64	3.07
	2	2.31	9	0.45	1.66	2.90
	3	2.24	10	0.43	1.63	2.81
2	1	2.31	10	0.46	1.64	2.97
	2	2.30	9	0.45	1.60	2.87
	3	2.20	10	0.48	1.52	2.85
3	1	2.30	10	0.45	1.63	2.97
	2	2.33	9	0.45	1.67	2.96
	3	2.24	10	0.46	1.53	2.92
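As a rough check on the pattern in Table 2, the observer means can be averaged per session. This is a simple unweighted sketch, not the analysis used in this study: it ignores that session 2 is based on nine rather than ten subjects, and the variable and function names are hypothetical.

```python
from statistics import mean

# Mean volumes (kg water) from Table 2, indexed as observer -> session -> mean.
TABLE_2_MEANS = {
    1: {1: 2.33, 2: 2.31, 3: 2.24},
    2: {1: 2.31, 2: 2.30, 3: 2.20},
    3: {1: 2.30, 2: 2.33, 3: 2.24},
}


def session_means(table):
    """Average the observer means within each session (unweighted)."""
    sessions = sorted({s for by_session in table.values() for s in by_session})
    return {s: mean(by_session[s] for by_session in table.values()) for s in sessions}


for session, value in session_means(TABLE_2_MEANS).items():
    print(f"session {session}: {value:.3f} kg")
```

The averages for sessions 1 and 2 are nearly identical, while the session 3 average is a little under 0.1 kg lower.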

Figure 2 shows the volume trajectories (averaged over three observers) of all subjects. All subjects had a lower average volume at session 3 than at session 1. No intersections of the trajectories were seen, meaning that all subjects kept their relative position in the distribution of volumes across sessions, indicating a high reliability.

FIG. 2.

Measured volumes (averaged across three observers) by subject by session (X=missing value).

Figure 3 shows the 9 deviations from mean volume per subject plotted against the subject's mean volume in all subjects. The vertical scatter of this plot represents the crude within-subject variability. As expected, almost all points were lying between ±0.125 kg, equalling 1.96 times the crude within-subject standard deviation of 0.064 kg. From this plot there was no suspicion of any pattern in the variability of volume measurements in relation to volume level. No adjustment was made here for fixed effects of temperature and session on volume.

FIG. 3.

Deviation from mean volume plotted against mean volume (kg water). The upper and lower horizontal lines were drawn at a distance from the mean of 1.96 times the crude unadjusted within-subject standard deviation of 0.064 kg.

The results of the linear mixed model analysis are shown in Table 3. Three missing observations due to one subject not attending session 2 were appropriately dealt with by using the restricted maximum likelihood method for estimating the fixed and random effects. It appeared that room temperature and water temperature did not have significant effects on the measured volumes. However, adjusted for those temperatures, the effect of the different session on measured volumes was significant (p=0.001). Measurements at the third session were systematically and significantly lower than those at the preceding sessions: 0.10 kg lower than session 1 and 0.08 kg lower than session 2, corroborating the conclusions from Table 2 and Figure 2.

Table 3.

Results of Linear Mixed Model for Analyzing Reliability of Measurements With IWV-Apparatus*

Variable	Fixed effects (95% CI)	P
Intercept	1.677 (0.861 to 2.492)
Water temperature (+1°C)	−0.003 (−0.006 to +0.000)	0.084
Room temperature (+1°C)	0.027 (−0.007 to +0.061)	0.11
Session
1	0.099 (0.063 to 0.135)	0.001
2	0.076 (0.032 to 0.120)	0.002
3	0

Variance component	Random effects	%
(1) Subject	0.20331	98.71
(2) Observer	0.00000	0.00
(3) Observer×subject	0.00029	0.14
(4) Observer×session	0.00001	0.00
(5) Subject×session	0.00026	0.13
(6) Error	0.00210	1.02
(7) Total variance	0.20597	100

*Fixed effects in kg; random effects in squared kg.

ICC=(1)/(7)=0.99.

After correction for these systematic effects, the reliability of the volume measurements can be judged by calculating the intra-class correlation coefficient (ICC) as the ratio of the subject component (1) to the total variance (7), resulting in a value of 0.99 (=0.20331/0.20597), which is nearly perfect, as already illustrated by Figure 2. The reliability index was calculated from Table 3 as $1.96 \times \sqrt{2 \times (0.20597 - 0.20331)} = 0.14$ kg, meaning that the absolute difference between two volume measurements within the same subject with an unchanged arm volume is maximally 0.14 kg at the 95% level. The overall within-subject standard deviation after adjustment for the fixed effects equalled $\sqrt{0.20597 - 0.20331} = 0.052$ kg, which is smaller (as it should be) than the crude (unadjusted) within-subject standard deviation of 0.064 kg used for the reference lines in Figure 3.

The inter-observer component (2) was smaller than 0.005% of the total variance. The intra-observer components (3) and (4) added to 0.15% of total variance and the intra-subjects component (5) was 0.13% of total variance. The remaining error component (6) due to functioning of the measurement device was 1.02% of total variance. These sources of variation were estimated after correcting for the fixed effects mentioned in the upper part of Table 3.
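The figures derived from Table 3 can be reproduced with a short calculation. The sketch below takes the six variance components as printed in Table 3 and applies the definitions given in the statistical analysis section; the dictionary keys and the function name are chosen here for illustration only.

```python
from math import sqrt

# Variance components (squared kg) as printed in Table 3.
VARIANCE_COMPONENTS = {
    "subject": 0.20331,
    "observer": 0.00000,
    "observer_x_subject": 0.00029,
    "observer_x_session": 0.00001,
    "subject_x_session": 0.00026,
    "error": 0.00210,
}


def reliability_summary(components, subject_key="subject"):
    """Derive ICC, within-subject SD, reliability index and percentage shares."""
    total = sum(components.values())
    within = total - components[subject_key]  # sources (2) to (6) combined
    return {
        "total_variance": total,
        "icc": components[subject_key] / total,
        "within_subject_sd": sqrt(within),
        "reliability_index": 1.96 * sqrt(2 * within),
        "percent_of_total": {k: 100 * v / total for k, v in components.items()},
    }


summary = reliability_summary(VARIANCE_COMPONENTS)
print(round(summary["icc"], 2))                # 0.99
print(round(summary["within_subject_sd"], 3))  # 0.052
print(round(summary["reliability_index"], 2))  # 0.14
```

The percentage shares in `summary["percent_of_total"]` correspond to the last column of Table 3, up to rounding.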

Discussion

In this study, repeated measurement of arm volume with the IWV-apparatus proved to be reliable and reproducible. There was virtually no inter-observer variability and hardly any intra-observer or intra-subject variability.

The intra-class correlation coefficient defined as the ratio of the inter-subject component to the total variance reached the almost perfect value of 0.99, which is similar to the results reported in a study by Damstra et al.²⁹ The reliability index of the measurements with the IWV-apparatus was calculated as 0.14 kg. This means that only absolute changes in a patient's arm volume measurement of more than 0.14 kg would represent a true change in arm volume, which is about 6% of the mean arm volume of 2.3 kg. This reliability index can be considered a generalization to more than two replicate measurements of the limits of agreement as introduced by Bland and Altman³⁰ for the comparison of two measurement methods. In Bland and Altman, the limits are calculated around the fixed mean difference between the two methods. In our case, the reliability index represents limits calculated around zero, in order to meet the assumption of the replicate measurements on the subject being taken under the same conditions. The linear mixed model can be considered a general tool for both types of studies.
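In practice, the reliability index acts as a threshold on the difference between two measurements of the same arm. A minimal sketch of that use follows; the constant and function names are hypothetical, and the example volumes are not data from this study.

```python
RELIABILITY_INDEX_KG = 0.14  # reliability index reported in this study
MEAN_ARM_VOLUME_KG = 2.3     # mean arm volume reported in this study


def exceeds_reliability_index(first_kg: float, second_kg: float,
                              reliability_index_kg: float = RELIABILITY_INDEX_KG) -> bool:
    """True when two measurements differ by more than the reliability index."""
    return abs(second_kg - first_kg) > reliability_index_kg


# The index expressed as a fraction of the mean arm volume (about 6%).
relative_index = RELIABILITY_INDEX_KG / MEAN_ARM_VOLUME_KG
print(f"{relative_index:.1%}")

# Hypothetical pairs of measurements (kg water) on the same arm.
print(exceeds_reliability_index(2.30, 2.50))  # difference of 0.20 kg
print(exceeds_reliability_index(2.30, 2.40))  # difference of 0.10 kg
```

A difference below the threshold cannot be distinguished from measurement variability at the 95% level; it does not prove that the arm volume is unchanged.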

A puzzling result was the decrease in arm volume in all subjects at session 3 compared to session 1 (Fig. 2). Although relatively small (less than 5% of the mean volume of 2.3 kg), the mean decrease of 0.10 kg (Table 3) was highly statistically significant because of the relatively small overall within-subject standard deviation. It is very unlikely that the arm volume of all subjects would have decreased during this experiment. Hence, we concluded that this was not a real volume effect. As we did not measure a known dummy control volume along with the other measurements, we were not able to determine whether this decrease should be attributed to an artefact in the study subjects or to a technical artefact of the IWV-apparatus. Although this is speculative, we think that the decrease is the result of a learning effect in the group of subjects who, in the course of the experiment, learned to place their arms more carefully into the water tank.

Taking measurements with the IWV-apparatus has some limitations: it is time-consuming (on average 15 minutes, with most time spent on the preparation); it requires considerable handling of water and water tanks; it does not measure the upper part of the upper arm; and the device is expensive.^23,24,29 Other methods to measure arm volume have limitations as well. Brorson and Höijer described (with a limited number of patients) consistent results based on the circumferential measurement method compared to water displacement methods.¹⁶ They found the circumferential measurement method useful in evaluating lymphedema because it is cheap and quick.

In conclusion, the findings of this study suggest that the IWV-apparatus provides reliable and reproducible measurements so that its use for measuring arm volumes can be recommended.

Footnotes

Author Disclosure Statement

The authors declare no conflicting financial interests.

References

1. Mansel, Fallowfield, Kissin, Goyal, Newcombe, Dixon, Yiangou, Horgan, Bundred, Monypenny, England, Sibbering M, Abdullah, Barr, Chetty, Sinnett, Fleissig, Clarke, Ell. Randomized multicenter trial of sentinel node biopsy versus standard axillary treatment in operable breast cancer: The ALMANAC Trial. J Natl Cancer Instit, 2006; 98:599–609.

2. Blanchard, Donohue, Reynolds, Grant. Relapse and morbidity in patients undergoing sentinel lymph node biopsy alone or with axillary dissection for breast cancer. Arch Surg, 2003; 138:482–487; discussion 7–8.

3. Haid, Koberle-Wuhrer, Knauer, Burtscher, Fritzsche, Peschina, Jasarevic, Ammann, Hergan, Sturn, Zimmermann. Morbidity of breast cancer patients following complete axillary dissection or sentinel node biopsy only: A comparative evaluation. Breast Cancer Res Treat, 2002; 73:31–36.

4. Leidenius, Leivonen, Vironen, von Smitten. The consequences of long-time arm morbidity in node-negative breast cancer patients with sentinel node biopsy or axillary clearance. J Surg Oncol, 2005; 92:23–31.

5. Ronka, von Smitten, Tasmuth, Leidenius. One-year morbidity after sentinel node biopsy and breast surgery. Breast, 2005; 14:28–36.

6. Schijven, Vingerhoets, Rutten, Nieuwenhuijzen, Roumen, van Bussel, Voogd. Comparison of morbidity between axillary lymph node dissection and sentinel node biopsy. Eur J Surg Oncol, 2003; 29:341–350.

7. Schrenk, Rieger, Shamiyeh, Wayand. Morbidity following sentinel lymph node biopsy versus axillary lymph node dissection for patients with breast carcinoma. Cancer, 2000; 88:608–614.

8. Swenson, Nissen, Ceronsky, Swenson, Lee, Tuttle. Comparison of side effects between sentinel lymph node and axillary lymph node dissection for breast cancer. Ann Surg Oncol, 2002; 9:745–753.

9. Noguchi, Miwa, Michigishi, Yokoyama, Nishijima, Takanaka, Kawashima, Nakamura, Kanno, Nonomura. The role of axillary lymph node dissection in breast cancer management. Breast Cancer, 1997; 4:143–153.

10. Johnson, Kennedy, Henry. Clinical measurements of lymphedema. Lymphat Res Biol, 2014; 12:216–221.

11. Herpertz. [Measuring and documentation of edema]. Zeitschrift Lymphol J Lymphol, 1994; 18:24–30.

12. Farncombe, Daniels, Cross. Lymphedema: The seemingly forgotten complication. J Pain Sympt Manag, 1994; 9:269–276.

13. Voogd, Ververs, Vingerhoets, Roumen, Coebergh, Crommelin. Lymphoedema and reduced shoulder function as indicators of quality of life after axillary lymph node dissection for invasive breast cancer. Br J Surg, 2003; 90:76–81.

14. Coen, Taghian, Kachnic, Assaad, Powell. Risk of lymphedema after regional nodal irradiation with breast conservation therapy. Intl J Radiat Oncol Biol Phys, 2003; 55:1209–1215.

15. Armer, Stewart. A comparison of four diagnostic criteria for lymphedema in a post-breast cancer population. Lymphat Res Biol, 2005; 3:208–217.

16. Brorson, Hoijer. Standardised measurements used to order compression garments can be used to calculate arm volumes to evaluate lymphoedema treatment. J Plast Surg Hand Surg, 2012; 46:410–415.

17. Bulley, Coutts, Blyth, Jack, Chetty, Barber, Tan. A Morbidity Screening Tool for identifying fatigue, pain, upper limb dysfunction and lymphedema after breast cancer treatment: A validity study. Eur J Oncol Nurs, 2014; 18:218–227.

18. Barranger, Dubernard, Fleurence, Antoine, Darai, Uzan. Subjective morbidity and quality of life after sentinel node biopsy and axillary lymph node dissection for breast cancer. J Surg Oncol, 2005; 92:17–22.

19. Stanton, Badger, Sitzia. Non-invasive assessment of the lymphedematous limb. Lymphology, 2000; 33:122–135.

20. te Slaa, Tetteroo, Mulder, Ho, Vos, Moll, van der Laan. Magnetic resonance imaging reveals edema-like changes not only subcutaneously, but also in muscle tissue after femoropopliteal bypass surgery. Ann Vasc Surg, 2012; 26:233–241.

21. van der Laan, Oyen, Verhofstad, Tan, ter Laak, Gabreels-Festen, Hendriks, Goris. Soft tissue repair capacity after oxygen-derived free radical-induced damage in one hindlimb of the rat. J Surg Res, 1997; 72:60–69.

22. Kettle, Rundle, Oddie. Measurement of upper limb volumes: A clinical method. Austral New Zea J Surg, 1958; 27:263–270.

23. Kaulesar Sukul, den Hoed, Johannes, van Dolder, Benda. Direct and indirect methods for the quantification of leg volume: Comparison between water displacement volumetry, the disk model method and the frustum sign model method, using the correlation coefficient and the limits of agreement. J Biomed Engineer, 1993; 15:477–480.

24. Megens, Harris, Kim-Sing, McKenzie. Measurement of upper extremity volume in women after axillary dissection for breast cancer. Arch Phys Med Rehab, 2001; 82:1639–1644.

25. Sagen, Karesen, Skaane, Risberg. Validity for the simplified water displacement instrument to measure arm lymphedema as a result of breast cancer surgery. Arch Phys Med Rehab, 2009; 90:803–809.

26. Oerlemans, Goris, Oostendorp. Impairment level sumscore in reflex sympathetic dystrophy of one upper extremity. Arch Phys Med Rehab, 1998; 79:979–990.

27. Gjorup, Zerahn, Hendel. Assessment of volume measurement of breast cancer-related lymphedema by three methods: Circumference measurement, water displacement, and dual energy X-ray absorptiometry. Lymphat Res Biol, 2010; 8:111–119.

28. Klompenhouwer, Gobardhan, Beek, Voogd, Luiten. The clinical relevance of axillary reverse mapping (ARM): Study protocol for a randomized controlled trial. Trials, 2013; 14:111.

29. Damstra, Glazenburg, Hop. Validation of the inverse water volumetry method: A new gold standard for arm volume measurements. Breast Cancer Res Treat, 2006; 99:267–273.

30. Bland, Altman. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet, 1986; 1:307–310.
