Abstract
Objective:
Accuracy in the assessment of feed intake is important for preterm infants at risk of growth failure. Clinical observation tools are unvalidated in this population, and test weight measurement may be inaccurate in preterm infants taking small feed volumes.
Methods:
Test weights were performed to assess agreement between weights using a standardized protocol and a feed of known weight in preterm infants (born at <35 weeks gestational age [GA]) during their transition to oral feeding. Reproducibility was assessed using two repeated measurements in each participant. Agreement between test weights and known feed weights was assessed, and minimal detectable change was calculated.
Results:
Thirty-eight preterm infants (median GA 30 + 5 weeks [28–33 + 1], mean birth weight 1574 g [±671]) were recruited and had test weights performed at a mean corrected gestational age (CGA) of 35 + 3 weeks (±10 days). Each infant was weighed twice before and twice after each measured feed, and a high degree of reproducibility was found for both the paired pre-feed weights (intraclass correlation coefficient [ICC] = 0.99 [0.99–0.99]) and the paired post-feed weights (ICC = 0.99 [0.99–0.99]). The mean absolute difference between test weight and feed weight was 1.7 g (±2.2). We calculated the minimal detectable change as 0.96 g, representing the magnitude of change below which there is more than a 95% chance that no real change occurred.
Conclusions:
During the establishment of oral feeds, a standardized protocol at the bedside for repeated pre- and post-feed weights demonstrated a high degree of reproducibility. Based on our data, test weight measurements are appropriate for use in this preterm population during the establishment of oral feeds.
Introduction
Information about the precise volume of feed taken by preterm infants during feeding at the breast is clinically relevant, particularly while infants are establishing oral feeds. In many hospital settings, including our own, clinical observation is used to assess breastmilk intake and inform the volume of supplemental feeds required after direct breastfeeding. The clinical observation tools used are unvalidated and have not been accurate in predicting volumes of feed intake in preterm infants. 1
As such, the intake of preterm infants while feeding at the breast remains largely unknown. Test weights have been used in clinical settings for decades. 2 For small, preterm infants, at risk of impaired growth, it is essential that test weights are reliable when used to determine volume of supplemental feeds. Previous studies have reported larger errors in test weights for infants with lower gestational age (GA) who are likely lower in weight and require smaller feed volumes. 3 Concerns have been raised that measuring test weights may undermine parental confidence in breastfeeding if volume of milk transfer is low.4,5 Additionally, without robust protocols in place test weights can be inaccurate 6 and may result in over or under supplementation of feed.
Preterm infants usually establish oral feeding between 33 and 38 weeks, 7 and this is a key criterion for discharge from hospital.7,8 At this stage, infants are developing the neurological skills and coordination to orally feed. If feed volumes taken at the breast are assessed using unvalidated bedside observations, then vulnerable infants are at risk of over- or under-supplementation of feeds.
We hypothesized that test weights in preterm infants would show a high level of agreement with known feed weights and would be reproducible. The aim was to validate a standardized test weight protocol for use in preterm infants during the establishment of breastfeeding.
Methods
This cross-sectional, single-center validation study was performed from January 2021 to June 2022. The study was approved by the local research ethics committee. Test weights were performed to assess agreement between test weights using a standardized protocol and a known weight of feed in preterm infants (born at <35 weeks GA) during their transition to oral feeding. Reproducibility was assessed using two repeated measurements in each participant. Agreement between test weights and known feed weights was assessed, and minimal detectable change was calculated.
Study population
Infants who were born at <35 weeks GA, who required transition from nasogastric to oral feeds, and were inpatients from January 2021 to June 2022 were eligible for inclusion, regardless of location of birth. Participants were recruited after the clinical team determined that they were ready to start oral feeds, typically between 33 and 37 weeks corrected gestational age (CGA). Test weight measurements took place once infants were taking at least one oral feed per day by either breast or bottle. Infants were eligible regardless of feeding substrate and desired method of feeding, but feeding substrate (expressed breastmilk, fortified expressed breastmilk, preterm formula, or term formula) was recorded for all participants.
Sample size was calculated based on the mean absolute difference between estimated and actual volumes of 2.95 mL (
Infant characteristics including GA at birth, birth weight, sex, length of stay, CGA at discharge, weight at discharge, and death or transfer were recorded. Maternal characteristics that may influence milk supply, including maternal age, gravidity, mode of delivery, presence of maternal diabetes, hypertension or pre-eclampsia, assisted reproduction, smoking status, and previous breastfeeding experience, were recorded. Key nutritional indicators recorded were time to first feed, time to full feeds, use of donor breastmilk, and time to regain birthweight. Primary outcomes were:
Agreement between test weights and known feed weights
Reproducibility of infant test weights
Test weights were measured once participants had started to take at least one oral feed per day by either breast or bottle. For each feed, bedside nursing staff assessed feeding cues, described by Shaker et al. as signs to “stop” or “go” in terms of oral feeding, to determine whether the feed was given by bottle or nasogastric tube. 9
Each participant had test weights performed for one feed on two separate days, with two weights measured before the feed and two weights after it. A Seca 757 scale was selected for its high resolution and dampening system that minimizes the effect of movement. This scale also has a level gauge and comparable accuracy to the scales used in previous studies.3,10,11 Infant weights were measured to the nearest 1 g. Scales were stored and used on a dedicated stainless steel procedure trolley to ensure that they remained on a flat and stable surface. Prior to use, scales were plugged in or charged and leveled using the foot screws until the spirit level indicated a flat alignment without any load. The scale position was not adjusted between weight measurements to ensure consistency. Before weights were measured, scales were powered on and the display read 0.000. Scale accuracy was checked with a 1 kg calibration weight prior to use, and this check was repeated each month.
All weights were automatically transmitted via Bluetooth to dedicated laptops for accurate recording. Feed volume was estimated and recorded by the bedside nurse, and the researcher recorded feed weights before post-feed infant weights were completed.
Test weights were performed immediately before and immediately after a feed. Infants were placed fully clothed and lightly swaddled on a Seca 757 scale for two weight measurements prior to commencing a feed. Monitoring leads were temporarily disconnected and included within swaddles. Muslin cloths, bibs, or cotton pads that were used to collect possets, vomits, or spillage during the feed were included in weights to ensure that all milk transferred from the bottle or syringe to the infant was measured and that loss of milk volume due to spillage was limited.
If there was >5 g discrepancy between the two weights, repeated measurements were taken for a maximum of four weights. The two closest weights were recorded and averaged. If repeated weights were required, all were documented.
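The repeat-weighing rule can be sketched in a few lines of Python. This is an illustration only, not the study's software: the function name and the constants are hypothetical, and it assumes that when repeats are needed the closest pair is chosen from all readings taken at that time point.

```python
from itertools import combinations

DISCREPANCY_LIMIT_G = 5  # protocol: repeat if the first two weights differ by >5 g
MAX_WEIGHTS = 4          # protocol: at most four weights per time point


def averaged_weight(readings_g):
    """Return the recorded weight (g) for one time point (pre- or post-feed).

    readings_g holds the scale readings in the order they were taken.
    Assumption: if repeats were needed, the closest pair among all
    readings is averaged.
    """
    if not 2 <= len(readings_g) <= MAX_WEIGHTS:
        raise ValueError("expected between 2 and 4 readings")
    first, second = readings_g[0], readings_g[1]
    if abs(first - second) <= DISCREPANCY_LIMIT_G:
        return (first + second) / 2
    if len(readings_g) == 2:
        raise ValueError("first two weights differ by >5 g: repeat the weighing")
    # Pick the pair of readings with the smallest absolute difference.
    a, b = min(combinations(readings_g, 2), key=lambda p: abs(p[0] - p[1]))
    return (a + b) / 2
```

With hypothetical readings of 2100 g and 2110 g followed by repeats of 2101 g and 2108 g, the closest pair is 2100 g and 2101 g, so the recorded weight would be 2100.5 g.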
Infants were then reconnected to monitors and fed as normal. Feeds were given by nasogastric tube or bottle by the bedside nurse or parent according to infant feeding cues. Infants were weighed twice directly after the feed according to the same protocol. Post-feed weights also included disconnected monitoring leads, muslin cloths, bibs, or cotton pads with spillage from the feed.
Test weights were calculated as the difference between the average of 2 post-feed weights and the average of 2 pre-feed weights in grams. The following data were also recorded for all test weights: feed weight, estimated volume of feed given, the need for repeated infant weight, timing of feed, age and weight of infants, method of feeding (nasogastric or bottle), and feeding substrate.
Feed weight assessment
Exact weight of feeds was measured using a high-accuracy mini digital scale (Ascher Elite Digital Pocket Scale) designed to measure weights between 0.05 and 200 g with a selected resolution of 0.01 g.
Immediately before a feed, the bottle and teat or container of milk (including any supplements or fortifier added) were placed on the scale. The same bottle and teat or container was weighed after the feed to determine the exact weight of feed that was given. The researcher recorded this number before post-feed infant weights were completed and recorded electronically.
For feeds that were given by bottle and required nasogastric supplementation during the same feed, the supplemental volume of milk was decanted from the bottle and teat into a syringe for tube feeding. The bottle and teat with any remaining milk were weighed to calculate the feed weight.
Statistical analysis
To compare the repeated feed and test weight differences for the same baby on 2 separate days, the test-retest reliability was calculated using an absolute agreement intraclass correlation coefficient with a single-rater and two-way mixed effects model.
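For readers unfamiliar with this form of the ICC, the following sketch computes a single-measurement, absolute-agreement ICC from two-way ANOVA mean squares, often written ICC(A,1). It is a plain-Python illustration with a hypothetical function name, not the Stata routine used in the study, and the example data are invented.

```python
def icc_absolute_agreement_single(scores):
    """ICC(A,1) for a table of n subjects (rows) by k repeated measurements (columns)."""
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Absolute agreement: systematic offsets between columns lower the ICC.
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + (k / n) * (ms_cols - ms_error)
    )


# Invented data: day 2 is always exactly 1 unit higher than day 1.
offset_data = [[1, 2], [2, 3], [3, 4]]
print(round(icc_absolute_agreement_single(offset_data), 3))
```

The invented data are perfectly consistent from day to day but shifted by a constant, so the absolute-agreement ICC is about 0.667 rather than 1. This is why the absolute-agreement form suits a validation question where the measured values themselves must match.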
To compare test weights and feed weights, agreement was visualized using Bland-Altman plots based on means and mean absolute difference between the 2 methods of measurement. The 95% limits of agreement were calculated as the mean absolute difference ± 1.96 × the standard deviation of the differences.
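A minimal sketch of the conventional Bland-Altman calculation follows. It assumes the usual definition of the 95% limits of agreement (mean difference ± 1.96 standard deviations of the paired differences) and takes the difference as feed weight minus test weight. Both are assumptions for illustration, as are the function name and the example values.

```python
from statistics import mean, stdev


def limits_of_agreement(feed_weights_g, test_weights_g):
    """Return (mean difference, lower limit, upper limit) in grams.

    Assumption: difference = feed weight - test weight, with limits at
    mean difference +/- 1.96 sample standard deviations.
    """
    diffs = [f - t for f, t in zip(feed_weights_g, test_weights_g)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Invented example values, not study data.
bias, lower, upper = limits_of_agreement([30, 35, 40], [29, 33, 37])
print(f"mean difference {bias} g, limits {lower:.2f} to {upper:.2f} g")
```

Applied to the summary values reported in the Results (mean difference 1.7 g, SD 2.2 g), the same arithmetic gives limits of about −2.6 to 6.0 g, close to the reported −2.7 to 6.1 g. The small gap would be expected if the reported limits were computed from unrounded values.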
We calculated the minimal detectable change (MDC) measurable by test weights versus feed weights; we used the formula described by Kovacs et al. 12 The MDC can be interpreted as the smallest amount of change that can be detected that is not due to inherent errors in the measurement or the magnitude of change below which there is more than a 95% chance that no real change has occurred. 12
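The formula from Kovacs et al. is not reproduced in this text. As a hedged illustration, the sketch below uses a commonly cited form of the 95% minimal detectable change, MDC95 = 1.96 × √2 × SEM, where the standard error of measurement is SEM = SD × √(1 − ICC). The function name is hypothetical, and the choice of inputs (the SD of the differences and the test-retest ICC from the Results) is an assumption.

```python
from math import sqrt


def mdc95(sd, icc):
    """95% minimal detectable change, assuming MDC95 = 1.96 * sqrt(2) * SEM."""
    sem = sd * sqrt(1 - icc)  # standard error of measurement
    return 1.96 * sqrt(2) * sem


# Inputs taken from the Results: SD of differences 2.2 g, ICC 0.975.
print(round(mdc95(sd=2.2, icc=0.975), 2))
```

With these inputs the sketch gives approximately 0.96 g, which matches the reported value. That agreement is consistent with this form of the formula having been used, but it does not confirm it.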
Statistical significance for all calculations was defined as p < 0.05. All data analysis was performed using Stata/SE 17.
Results
In total, 38 infants were recruited and had test weights performed. In this cohort, the median GA at birth was 30 + 5 weeks [28–33 + 1] with mean birth weight 1574 g (±671). At the time of test weight measurement, the mean CGA was 35 + 3 weeks (±1 + 3), and all test weights were measured between CGA of 32 + 3 and 38 + 5 weeks. The mean weight at the time of test weighing was 2135 g (±354) and feed requirements were 39 mL (±10) per feed. At the time of test weighing, 26.3% of infants were taking 1–2 oral feeds per day, 23.6% were taking alternate tube and oral feeds, 31.6% were taking mostly bottles, and 15.8% were attempting full oral feeds (Table 1).
Clinical Characteristics of Preterm Test Weight Cohort
Clinical characteristics were measured at the time of assessment, during the establishment of oral feeds. Values are presented as means (±SD), medians [IQR], and absolute counts (%). Corrected gestational age (CGA). Expressed breast milk (EBM). Fortified expressed breast milk (FEBM). Nasogastric (NG).
Mothers of this preterm infant cohort had a mean age of 34.3 years (±3.6). Approximately one-third of infants (31.6%) were born via spontaneous vaginal delivery (SVD), and 68.4% were delivered via lower segment caesarean section (LSCS). In terms of maternal conditions, 7.9% had diabetes, 10.5% had preeclampsia (PET), 2.6% had pregnancy induced hypertension (PIH), and 5.3% had essential hypertension. 21.1% of pregnancies were achieved via assisted reproduction, and 13.2% had abnormal dopplers prior to delivery.
Half of the cohort (50%) were firstborn infants, 31.6% were second-born, and 18.4% were born to mothers who were para 2 or more. Of mothers who had older children, 8/19 (42.1%) had previous breastfeeding experience compared to 11/19 (57.9%) who had not previously breastfed (Table 2).
Maternal Characteristics of Test Weight Cohort
Values are presented as means (±SD), medians [IQR], and absolute counts (%).
Spontaneous vaginal delivery (SVD). Lower segment caesarean section (LSCS). Preeclampsia (PET). Pregnancy induced hypertension (PIH). Breastfeeding (BF).
Each infant was weighed twice before and twice after each measured feed, and a high degree of reproducibility was found for both the paired pre-feed weights (intraclass correlation coefficient [ICC] = 0.99 [0.99–0.99]) and the paired post-feed weights (ICC = 0.99 [0.99–0.99]). Only 1 repeated pre-feed weight differed by more than 5 g and required 4 measurements; in this case, the two closest measurements differed by 7 g. Of the other paired pre-feed measurements, 17/76 (22.4%) differed by 2–4 g, and 58/76 (76.3%) were equal.
Of post-feed measurements, 23/76 (30.3%) differed by 2 g, 53/76 (69.7%) were the same, and none differed by more than 5 g. There was no significant difference between repeated pre-feed weights (0.05 g ± 1.3) and repeated post-feed weights (0.01 g ± 1.5) with p = 0.81.
Feed volumes for the same baby on 2 separate days displayed moderate reliability but with a wide confidence interval, ICC = 0.57 [0.15–0.79]. We included both nasogastric and oral feeds, and during oral feeds, feed volumes were variable according to infant intake explaining the variation day to day.
A high degree of reproducibility was found between test weights and actual feed weights when measurements were repeated on 2 separate days. The absolute agreement intraclass correlation coefficient was 0.975 with 95% CI of 0.918–0.99 (F = 0.00), indicating excellent reliability. 13 This ICC is based on a single rater, absolute agreement, two-way mixed effects model.
The mean absolute difference between test weight and feed weight was 1.7 g (±2.2). The 95% limits of agreement were −2.7 to 6.1 g (Fig. 1). We calculated the minimal detectable change as 0.96 g, representing the magnitude of change below which there is more than a 95% chance that no real change occurred. The mean difference we observed is larger than this minimal detectable change. Therefore, in our setting, comparing test weights with feed weights showed that the minimal detectable change for test weight measurement was 0.96 g and that, for any larger difference, there is a >95% chance that it reflects a difference greater than the standard error of measurement.

Bland Altman Plot: Difference in test weights compared to feed weight.
The majority of test weights, 64/76 (84.2%), were less than the feed weight which may be due to small amounts of spillage, insensible losses, or loss of volume when measuring feed weights, for example, when NG tube and bottle were used for one feed.
The majority of test weight feeds, 43/76 (56.6%), were given by nasogastric tube, 22/76 (28.9%) were given by bottle, and 10/76 (13.2%) were given by bottle initially and supplemented with tube feeds. Larger differences in test weights versus feed weights were seen during feeds that required both bottle and tube feeding compared to either bottle or tube feeds alone (p = 0.05).
The feeding substrate for the majority of test weights was maternal fortified EBM, 52/76 (68.4%). The 24/76 (31.6%) remaining feeds were all preterm formula. There was no significant difference between test weight and feed weight when feeding substrate differed (p = 0.87). There was no evidence that test weight reliability was influenced by either infant weight or corrected GA (Figs. 2 and 3).

Relationship between test weight differences and infant weight.

Relationship between test weight differences and corrected gestational age.
Discussion
In this cohort of preterm infants, measurements were taken during the establishment of oral feeds at a mean CGA of 35 + 3 weeks and mean weight of 2135 g (±354); however, infants with a corrected GA as low as 32 weeks were included. Both oral feeding and achievement of a minimum weight target are frequently requirements for discharge, and the timing of our measurements was reflective of the period just prior to discharge that we sought to capture.
During the establishment of oral feeds, a standardized protocol at the bedside for repeated pre- and post-feed weights demonstrated a high degree of reproducibility. Infant test weights compared to actual feed weights demonstrated strong agreement, with limits of agreement of −2.7 to 6.1 g, and high reproducibility indicated by intraclass correlation coefficients. We found that 94.7% of measurements were within a pre-defined clinically acceptable margin of error (
In contrast to the study by Rankin et al., we selected infants who were transitioning to oral feeds as this is the population of interest. In addition, our study included test weights not only for orogastric or nasogastric feeds but also bottle feeds and did not show any difference in test weight agreement when method of feeding differed. We did not find a relationship between agreement of test weights and infant weight or feed volume in our cohort. Rankin et al. previously demonstrated that test weights performed at lower corrected GAs (28–33 + 6 weeks) had higher mean percent error than those performed at ≥34 weeks. 3 In our study, the mean corrected age at enrolment was 35 + 3 weeks (±1 + 3) as test weights were performed once infants demonstrated some oral feeding ability. As such, few test weights were performed at lower ranges of CGA as the majority of direct breastfeeds at these gestations do not make up a substantial portion of total milk intake. While it is considered safe and beneficial to introduce direct feeding at the breast at earlier GAs, 15 these feeds usually contribute little to nutritional intake due to minimal milk transfer and require full supplementation. It is useful to note that test weights at lower GAs may be more prone to error 3 which limits generalizability to infants who may be establishing oral feeds at earlier gestations; however, test weights are more pertinent once infants are able to achieve substantial milk transfer while feeding at the breast. We anticipate that findings would be similar for infants on full oral feeds; however, when feeding on demand, some feeds may be small, highlighting the importance of establishing test weight reliability for small feed volumes.
Additionally, we calculated the minimal detectable change for test weights and found that any difference greater than 0.96 g would be detectable with our protocol. This is more clinically relevant than the predetermined margin of error used in previous studies.3,16,17 For our cohort of preterm infants, a difference of up to 10 mL of intake is of clinical significance. For example, for a preterm infant who may weigh <2 kg and whose prescribed feed intake may be as little as 30–35 mL per feed, a measured intake of 25 mL versus 35 mL would be likely to affect overall weight gain and intake if this measurement error was applied. However, if intake in the same infant was measured with differences detectable from 0.96 g, the error would have much less clinical significance even if applied systematically.
All infant weights were performed by a single investigator, and since consecutive weights with large differences had to be repeated, it was not feasible to measure infant weights blindly. We did aim to eliminate measurement bias with automatic electronic and time-stamped recording and transfer of weight measurements. Feed weights were not blinded as a single researcher was performing these measurements; however, weights of feeds were recorded after a feed was completed but before post-feed infant weights were performed. Feed volumes were also independently recorded by bedside nursing staff who were not aware of test weights or feed weights. In addition, decisions regarding mode of feeding were completely independent as these were guided by nursing staff and were in accordance with hospital guidelines.
When infants are transitioning to directly feeding at the breast, many feed by both bottle and breast since our hospital does not have the facility for mothers to room-in (defined as enabling mothers and their infants to remain together for 24 hours a day 18 ) for a prolonged period of time prior to discharge. In practice, this means that to achieve full oral feeds prior to discharge, a number of feeds are given via bottle, and 2–6 feeds per day are given by feeding at the breast depending on individual circumstances. In addition, there may be a proportion of feeds given by bottle so that fortification of breastmilk can continue during the transition to discharge if weight gain is suboptimal. In these cases, accurate feed volumes while feeding at the breast could provide a basis for feeding plans upon discharge. For example, infants who have demonstrated appropriate volumes of milk transfer could start to directly feed at the breast more frequently following their discharge when they have unrestricted time with their parents.
Savenije et al. have argued that, due to inaccuracies in test weight measurement at the bedside, test weights are not clinically useful. 6 However, we have demonstrated that test weights performed using a standardized protocol are practical and reliable. While our protocol does take time and briefly interrupts the mother–infant dyad interaction post-feed, it has been designed to minimize this disruption while remaining accurate. Given the simplicity and accuracy of the measurement, this method would be suitable for use in low-resource settings or at home to support the transition of the preterm infant to feeding at the breast.
Conclusion
During the establishment of oral feeds, a standardized protocol at the bedside for repeated pre- and post-feed weights demonstrated a high degree of reproducibility. While 94.7% of measurements were within a pre-defined clinically acceptable margin of error (
Footnotes
Acknowledgment
The authors are grateful to the babies and families who gave their time to be involved in the study.
Authors’ Contributions
M.K. was involved with data curation, formal analysis, investigation and writing the original draft. M.J.W. was involved with conceptualization and writing (reviewing and editing) the article. A.D. was involved in conceptualization, formal analysis, methodology, writing (reviewing and editing).
Disclosure Statement
The authors declare no conflict of interest.
Funding Information
This work was supported by the European Union's European Innovation Council funding.
