Abstract
Background:
Lymphedema of the upper extremity is a common side effect of lymph node dissection or irradiation of the axilla. Several techniques are being applied in order to examine the presence and severity of lymphedema. Measurement of circumference of the upper extremity is most frequently performed. An alternative is the water-displacement method. The aim of this study was to determine the reliability and the reproducibility of the “Inverse Water Volumetry apparatus” (IWV-apparatus) for the measurement of arm volumes.
Patients and Methods:
The IWV-apparatus is based on the water-displacement method. Measurements were performed by three breast cancer nurse practitioners on ten healthy volunteers in three weekly sessions.
Results:
The intra-class correlation coefficient, defined as the ratio of the subject component to the total variance, equaled 0.99. The reliability index was calculated as 0.14 kg. This indicates that only changes in a patient's arm volume measurement of more than 0.14 kg would represent a true change in arm volume, which is about 6% of the mean arm volume of 2.3 kg.
Conclusion:
The IWV-apparatus proved to be a reliable and reproducible method to measure arm volume.
Introduction
The incidence of arm lymphedema ranges from 0%–13% in breast cancer patients who underwent a sentinel lymph node biopsy (SLNB) and from 7%–77% after axillary lymph node dissection (ALND).1–9 This wide range is mainly due to the different definitions and measurement methods that are used to diagnose lymphedema.10
Commonly used methods to assess and grade the presence of lymphedema are:
the Herpertz method:11 this method combines four-point circumference measurements of both limbs, after which the proportional swelling relative to the healthy side is calculated;
circumference measurement at various levels on both arms: a difference between the arms of 2.0 cm or more, or sometimes 2.5 cm or more, is used to define lymphedema;4,12–16
questionnaires to determine whether patients consider themselves as having lymphedema.6,8,17,18
Other methods, based on perometry,19 ultrasonography,10 magnetic resonance imaging,20 and scintigraphy,21 are not viable in current medical practice.
Another method, which is not widely used, is “Inverse Water Volumetry” (IWV). This technique is based on the water-displacement method22–26 and is performed with an IWV-apparatus (Fig. 1). The water-displacement method is considered to be the most accurate method because it is easy to perform and painless, and because it allows direct measurement of objects with an irregular form.23–25,27 Currently, a multicenter trial is being performed in the Netherlands with this IWV-apparatus to measure the presence of lymphedema. In this trial, breast cancer patients undergo axillary reverse mapping (ARM), a new type of axillary surgery that is expected to decrease upper extremity lymphedema.28 Lymphedema is defined in the trial as a 10% difference between the preoperative and the postoperative arm volume.
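The trial's 10% criterion can be illustrated with a minimal sketch. The function names, the choice of the preoperative volume as the denominator, the restriction to volume increases, and the example readings are all assumptions made here for illustration; the text does not specify how the percentage is computed.

```python
def relative_volume_change(pre_kg: float, post_kg: float) -> float:
    """Change in arm volume as a fraction of the preoperative value (assumed denominator)."""
    if pre_kg <= 0:
        raise ValueError("preoperative volume must be positive")
    return (post_kg - pre_kg) / pre_kg


def meets_lymphedema_criterion(pre_kg: float, post_kg: float, threshold: float = 0.10) -> bool:
    """True when the postoperative increase reaches the threshold (assumed: increases only)."""
    return relative_volume_change(pre_kg, post_kg) >= threshold


# Hypothetical readings in kg of displaced water
print(meets_lymphedema_criterion(2.30, 2.55))  # increase of about 10.9%
print(meets_lymphedema_criterion(2.30, 2.40))  # increase of about 4.3%
```

Under these assumptions only the first pair of readings would count as lymphedema.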

Figure 1. The Inverse Water Volumetry apparatus.
The aim of this study is to assess the reliability and the reproducibility of the IWV-apparatus for measuring arm volumes.
Patients and Methods
Study cohort
In this experiment, ten healthy volunteers (5 males, 5 females) participated. All volunteers were medical students. Three observers (breast cancer nurse practitioners) performed all measurements; no observer acted as a subject and no subject acted as an observer. All participants gave informed consent. Ethical approval for this study was provided by the local ethical committee of the Amphia Hospital. The study was conducted in accordance with the Declaration of Helsinki.
Inverse Water Volumetry apparatus 29
Archimedes' principle indicates that the upward buoyant force that is exerted on a body immersed in a fluid is equal to the weight of the fluid that the body displaces. The IWV-apparatus is based on Archimedes' principle (Fig. 1).
Prior to use, the apparatus is calibrated to zero by filling the water tank up to the reference point at which the water flows into the overflow tube. The overflow tube is emptied, after which the patient's arm is placed into the water tank. When the whole system is in equilibrium, the patient's arm is removed and the display of the weighing device shows the shortfall of water compared with the initial situation. Because the “lack of water” is measured, the method is named “Inverse Water Volumetry”.29
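The readout can be related to a geometric volume with a short sketch. The paper reports arm volumes directly in kg of water; the conversion to litres, the density value (roughly 0.998 kg/L for water near 20 °C), and the function name are assumptions added here for illustration.

```python
WATER_DENSITY_KG_PER_L = 0.998  # assumed: approximate density of water near 20 degrees C


def arm_volume_litres(missing_water_kg: float,
                      density_kg_per_l: float = WATER_DENSITY_KG_PER_L) -> float:
    """Convert the displayed water deficit (kg) into the immersed arm volume (litres)."""
    if missing_water_kg < 0:
        raise ValueError("water deficit cannot be negative")
    if density_kg_per_l <= 0:
        raise ValueError("density must be positive")
    return missing_water_kg / density_kg_per_l


# A deficit of 2.3 kg, the mean arm volume reported in this study
print(f"{arm_volume_litres(2.3):.2f} L")
```

Because the density of water is close to 1 kg/L at room temperature, the reading in kg and the volume in litres differ by well under 1%.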
Measurements
The observers performed the arm volume measurements with the IWV-apparatus during three weekly sessions. At each session, the subjects were measured separately by each of the three observers. Sessions were scheduled at the same time of day for each subject. The subjects were measured in a random order at each session. For each separate measurement, the three observers were given a blank form on which to record the measurement result, the temperature of the water, and the room temperature. The study design was double-blind: the observers were not able to see their preceding measurements or each other's measurements, and the subjects were not able to see their own outcomes or those of the other subjects.
Statistical analysis
The raw volume data was summarized through means and standard deviations by session (three sessions) and observer (three observers). After averaging the subject's volume measurements over the three observers, the volume trajectories of the ten subjects across the three sessions were depicted in a figure. In each subject, the average of the nine volume measurements and the nine deviations from that average were calculated. Then those deviations from average were plotted against the average subject's volumes using all subjects. In the resulting scatterplot, reference lines were drawn calculated as ±1.96 times the crude (unadjusted) estimate of the within-subject standard deviation of the volume measurements.
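The descriptive step above can be sketched as follows. The data are hypothetical (three subjects with four replicates each, where the study had ten subjects with up to nine), and the pooled estimate with N minus the number of subjects as degrees of freedom is an assumption about how the crude within-subject standard deviation was obtained.

```python
import math
from statistics import mean

# Hypothetical volumes in kg of water: subject -> replicate measurements
volumes = {
    "A": [2.10, 2.14, 2.08, 2.12],
    "B": [2.55, 2.49, 2.52, 2.56],
    "C": [1.90, 1.93, 1.88, 1.89],
}


def deviations_from_subject_mean(data):
    """Return (subject mean, deviation from that mean) pairs, one per measurement."""
    points = []
    for values in data.values():
        m = mean(values)
        points.extend((m, v - m) for v in values)
    return points


def crude_within_subject_sd(data):
    """Pooled SD of the deviations; degrees of freedom = N - number of subjects (assumed)."""
    points = deviations_from_subject_mean(data)
    dof = len(points) - len(data)
    return math.sqrt(sum(d * d for _, d in points) / dof)


sd = crude_within_subject_sd(volumes)
lower, upper = -1.96 * sd, 1.96 * sd
print(f"within-subject SD = {sd:.3f} kg; reference lines at {lower:.3f} and {upper:.3f} kg")
```

The pairs returned by `deviations_from_subject_mean` are the points of the scatterplot, with the subject mean on the horizontal axis and the deviation on the vertical axis.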
A formal statistical analysis of the data was performed using linear mixed modeling in the total dataset of 90 observations. The following fixed effects were specified in order to account for potential systematic differences in measurements between sessions: water temperature (degrees centigrade), room temperature (degrees centigrade), and session (a three-level categorical nominal variable). Given these fixed effects, the following sources of variability (random effects) were considered for decomposing the total variance of the measurements: the subjects (1), the observers (2), and the interactions between these sources: observer × subject (3), observer × session (4), and subject × session (5). The remaining error component is then the interaction subject × observer × session (6). These effects were taken to represent the following variance components, respectively: between-subjects (source (1)), between-observers (source (2)), within-observers (sources (3) and (4)), within-subjects (source (5)), and the error inherent in the functioning of the measurement device (source (6)). Total variance is the sum of these six components.
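Written out, the model described above may take the following form. The notation is introduced here for illustration and is not taken from the paper: $y_{ijk}$ is the volume of subject $i$ measured by observer $j$ at session $k$, $w_{ijk}$ and $r_{ijk}$ are the water and room temperatures, $\tau_k$ is the fixed session effect, and the remaining terms are random effects, assumed independent with zero mean.

```latex
y_{ijk} = \mu + \beta_w w_{ijk} + \beta_r r_{ijk} + \tau_k
          + s_i + o_j + (os)_{ij} + (ot)_{jk} + (st)_{ik} + \varepsilon_{ijk}

\sigma^2_{\text{total}} = \underbrace{\sigma^2_{s}}_{(1)}
                        + \underbrace{\sigma^2_{o}}_{(2)}
                        + \underbrace{\sigma^2_{os}}_{(3)}
                        + \underbrace{\sigma^2_{ot}}_{(4)}
                        + \underbrace{\sigma^2_{st}}_{(5)}
                        + \underbrace{\sigma^2_{\varepsilon}}_{(6)}
```

With one measurement per subject, observer, and session, the three-way interaction cannot be separated from the residual, which is why source (6) serves as the error term.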
The fixed and random effects were estimated using the restricted maximum likelihood (REML) method. Reliability is represented by the intraclass correlation coefficient (ICC), defined as the ratio of the first component (the subjects, source (1)) to the total variance. The ICC is a dimensionless number between 0 and 1, representing the similarity of replicate measurements on the same subject. Clinically, it may be more useful to define a reliability measure in the same units as the measurement involved (kilograms of water). To this end, a reliability index is defined as 1.96 times the square root of twice the sum of the remaining components (sources (2) to (6)), which together comprise all within-subject random components. It represents the maximum absolute difference in kilograms of water allowed (at the 95% level) between two consecutive measurements of the same subject with an unchanged arm volume.
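Both reliability measures follow directly from the variance components, as the minimal sketch below illustrates. The function names are hypothetical; the two input values are the subject component and the total variance reported in the Results, and the within-subject variance is taken as their difference because the total is defined as the sum of all six components.

```python
import math


def intraclass_correlation(subject_var: float, total_var: float) -> float:
    """ICC: share of the total variance attributable to differences between subjects."""
    return subject_var / total_var


def reliability_index(subject_var: float, total_var: float, z: float = 1.96) -> float:
    """z * sqrt(2 * within-subject variance), where within = total - subject (sources 2 to 6)."""
    within_var = total_var - subject_var
    return z * math.sqrt(2.0 * within_var)


# Variance components in squared kg, as reported in the Results
subject_var, total_var = 0.20331, 0.20597
print(f"ICC = {intraclass_correlation(subject_var, total_var):.2f}")
print(f"reliability index = {reliability_index(subject_var, total_var):.2f} kg")
```

The factor of two inside the square root reflects that the index concerns the difference between two measurements, each of which carries the full within-subject variance.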
Results
The mean age of the ten subjects was 27 years and their mean Body Mass Index (BMI) was 21. Baseline characteristics are shown in Table 1.
Summary statistics of measured volumes by observer and session are shown in Table 2. Notably, one volunteer was not able to attend the second session. Mean volumes appeared to be systematically lower in session 3 than in sessions 1 and 2 for each observer.
Figure 2 shows the volume trajectories (averaged over three observers) of all subjects. All subjects had a lower average volume at session 3 than at session 1. No intersections of the trajectories were seen, meaning that all subjects kept their relative position in the distribution of volumes across sessions, indicating a high reliability.

Figure 2. Measured volumes (averaged across three observers) by subject and session (X = missing value).
Figure 3 shows the nine deviations from the mean volume per subject plotted against the subject's mean volume, for all subjects. The vertical scatter of this plot represents the crude within-subject variability. As expected, almost all points lay between ±0.125 kg, which equals 1.96 times the crude within-subject standard deviation of 0.064 kg. The plot gave no suspicion of any pattern in the variability of volume measurements in relation to volume level. No adjustment was made here for the fixed effects of temperature and session on volume.

Figure 3. Deviation from mean volume plotted against mean volume (kg water). The upper and lower horizontal lines are drawn at a distance from the mean of 1.96 times the crude unadjusted within-subject standard deviation of 0.064 kg.
The results of the linear mixed model analysis are shown in Table 3. The three missing observations, due to one subject not attending session 2, were accommodated by using the restricted maximum likelihood method for estimating the fixed and random effects. Room temperature and water temperature did not have significant effects on the measured volumes. However, adjusted for those temperatures, the effect of session on measured volumes was significant (p=0.001). Measurements at the third session were systematically and significantly lower than those at the preceding sessions: 0.10 kg lower than session 1 and 0.08 kg lower than session 2, corroborating the conclusions from Table 2 and Figure 2.
Table 3 notes: Fixed effects in kg; random effects in squared kg. ICC = (1)/(7) = 0.99.
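After correction for these systematic effects, the reliability of the volume measurements can be judged by calculating the intraclass correlation coefficient (ICC) as the ratio of the subject component (1) to the total variance (7), resulting in a value of 0.99 (=0.20331/0.20597), which is nearly perfect, as already illustrated by Figure 2. The reliability index was calculated from Table 3 as 1.96 times the square root of twice the sum of components (2) to (6), a sum that equals the total variance (7) minus the subject component (1):
$$1.96 \times \sqrt{2 \times (0.20597 - 0.20331)} = 1.96 \times \sqrt{0.00532} \approx 0.14\ \text{kg}.$$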
The inter-observer component (2) was smaller than 0.005% of the total variance. The intra-observer components (3) and (4) added to 0.15% of total variance and the intra-subjects component (5) was 0.13% of total variance. The remaining error component (6) due to functioning of the measurement device was 1.02% of total variance. These sources of variation were estimated after correcting for the fixed effects mentioned in the upper part of Table 3.
Discussion
In this study, repeated arm volume measurements with the IWV-apparatus proved to be reliable and reproducible. Inter-observer variability was negligible, and there was hardly any intra-observer or intra-subject variability.
The intra-class correlation coefficient defined as the ratio of the inter-subject component to the total variance reached the almost perfect value of 0.99, which is similar to the results reported in a study by Damstra et al. 29 The reliability index of the measurements with the IWV-apparatus was calculated as 0.14 kg. This means that only absolute changes in a patient's arm volume measurement of more than 0.14 kg would represent a true change in arm volume, which is about 6% of the mean arm volume of 2.3 kg. This reliability index can be considered a generalization to more than two replicate measurements of the limits of agreement as introduced by Bland and Altman 30 for the comparison of two measurement methods. In Bland and Altman, the limits are calculated around the fixed mean difference between the two methods. In our case, the reliability index represents limits calculated around zero, in order to meet the assumption of the replicate measurements on the subject being taken under the same conditions. The linear mixed model can be considered a general tool for both types of studies.
A puzzling result was the decrease in arm volume in all subjects at session 3 compared to session 1 (Fig. 2). Although relatively small (less than 5% of the mean volume of 2.3 kg), the mean decrease of 0.10 kg (Table 3) was highly statistically significant because of the relatively small overall within-subject standard deviation. It is very unlikely that the arm volume of all subjects would have decreased during this experiment. Hence, we concluded that this was not a real volume effect. As we did not measure a known dummy control volume along with the other measurements, we were not able to determine whether this decrease should be attributed to an artefact in the study subjects or to a technical artefact of the IWV-apparatus. Although this is rather speculative, we think that the decrease is the result of a learning effect in the group of subjects who, in the course of the experiment, learned to place their arms more carefully into the water tank.
Measurement with the IWV-apparatus has some limitations: it is time-consuming (on average 15 minutes, with most time spent on preparation); it requires considerable handling of water and water tanks; it does not measure the upper part of the upper arm; and the device is expensive.23,24,29 Other methods of measuring arm volume have limitations as well. Brorson and Höijer described (in a limited number of patients) consistent results with the circumferential measurement method compared to water-displacement methods.16 They found the circumferential measurement method useful in evaluating lymphedema because it is cheap and quick.
In conclusion, the findings of this study suggest that the IWV-apparatus provides reliable and reproducible measurements so that its use for measuring arm volumes can be recommended.
Footnotes
Author Disclosure Statement
The authors declare no conflicting financial interests.
