Abstract
Background and aim
Neonatal respiratory distress syndrome is a leading cause of morbidity in preterm new-born babies (<37 weeks gestational age). The current diagnostic reference standard includes clinical testing and chest radiography with associated exposure to ionising radiation. The aim of this review was to compare the diagnostic accuracy of lung ultrasound against the reference standard in symptomatic neonates of ≤42 weeks gestational age.
Methods
A systematic search of literature published between 1990 and 2016 identified 803 potentially relevant studies. Six studies met the review inclusion criteria and were retrieved for analysis. Quality assessment was performed before data extraction and meta-analysis.
Results
Four prospective cohort studies and two case control studies included 480 neonates. All studies were of moderate methodological quality although heterogeneity was evident across the studies. The pooled sensitivity and specificity of lung ultrasound were 97% (95% confidence interval [CI] 94–99%) and 91% (CI: 86–95%) respectively. False positive diagnoses were made in 16 cases due to pneumonia (n = 8), transient tachypnoea (n = 3), pneumothorax (n = 1) and meconium aspiration syndrome (n = 1); the diagnoses of the remaining three false positive results were not specified. False negative diagnoses occurred in nine cases; only two were specified, both as air-leak syndromes.
Conclusions
Lung ultrasound was highly sensitive for the detection of neonatal respiratory distress syndrome although there is potential to miss co-morbid air-leak syndromes. Further research into lung ultrasound diagnostic accuracy for neonatal air-leak syndrome and economic modelling for service integration is required before lung ultrasound can replace chest radiography as the imaging component of the reference standard.
Introduction
Neonatal respiratory distress syndrome (NRDS) is a breathing disorder arising at, or shortly after, birth (<24 hours); it increases in severity during the first 48 hours of life. 1 Although full term new-borns with a gestational age (GA) between 37 and 42 weeks can be affected, approximately four out of five cases occur in those born prematurely (<37 weeks).2,3 Severity and incidence of NRDS are inversely related to GA with 92% of neonates born at 24–25 weeks affected, 88% at 26–27 weeks, 76% at 28–29 weeks and 57% at 30–31 weeks.4,5
NRDS is caused by physiological and structural pulmonary immaturity – insufficient levels of pulmonary surfactant compromise alveolar integrity, impeding normal gas exchange due to deregulation of acinar surface tension.6,7 Resulting atelectasis causes decreased lung compliance through an increase of collapsed alveoli in the terminal airways. 8 NRDS progresses through hypoventilation, hypoxemia and respiratory acidosis.6–8 It is a leading cause of morbidity in premature new-borns and is a common reason for admission to the neonatal intensive care unit (NICU).9,10
NRDS is diagnosed by a combination of clinical signs and symptoms, laboratory analysis and chest radiography (CXR).1,6 Early diagnosis is important so that interventional therapy, respiratory support and surfactant replacement, can be instigated.7,8 Follow up imaging is required to monitor therapeutic effect and reduce broncho-pulmonary dysplasia as a result of unnecessary mechanical ventilation. 11
Clinical signs and symptoms
Clinical presentations of NRDS include non-specific tachypnoea, nasal flaring, cyanosis, substernal and intercostal retraction and grunting from expiratory air colliding with a partially closed glottis. 8 The ‘Clinical Risk Index for Babies’ (CRIB) is a risk assessment tool scoring birth weight, gestational age, maximum and minimum fraction of inspired oxygen, maximum base excess during the first 12 hours of life and presence of congenital malformations. 12 In suspected NRDS, the CRIB can be used to estimate severity of NRDS and trigger administration of assisted ventilation. 12
Laboratory tests
Arterial partial oxygen pressure (PaO2) levels below 50 mmHg with cyanosis in room air, or the need for supplementary oxygen to maintain PaO2 > 50 mmHg, is indicative of NRDS. 6 A blood sample can determine levels of metabolic and respiratory acidosis which indicate anaerobic metabolism and atelectasis, respectively. 13
Swallowed lung fluid is a significant constituent of neonatal gastric aspirate. The gastric aspirate shake test (GAST) identifies the presence or a lack of surfactant. 14 GAST is reported to have 100% sensitivity and 92% specificity for NRDS. 15
Chest radiography
In a study of 59 neonates with clinically suspected NRDS, Vergine et al. 16 found CXR to have sensitivity and specificity of 91% and 84% respectively when radiologists were blinded to clinical test results. Morris 17 suggests radiological appearances correlate well with clinical disease severity, atelectasis being represented by a bi-lateral fine granular or ‘ground glass’ appearance such that extent of disease corresponds to level of lung opacity. Reduced lung expansion, dilated bronchioles and air bronchograms are also visible depending on disease stage. 7
Further to diagnostic use, CXR is used to confirm endotracheal tube (ETT) position; premature new-borns with severe NRDS frequently receive continuous positive airways pressure (CPAP) in order to improve ventilation and oxygenation as well as facilitating intratracheal administration of surfactant.1,6 Confirmation of the ETT position minimises lung damage caused by malpositioning. 1
Chest radiography involves exposure to ionising radiation. Neonates, due to their small size and the close proximity of radiosensitive tissues and organs, are at greater risk from latent effects of CXR in comparison to other age groups. 18 Although the actual risk of adverse latent effects from neonatal radiation exposure has not been quantified,19,20 the theoretical risk can be predicted using the linear no-threshold (LNT) model with relative risk increasing as absorbed dose increases. 20 With neonates undergoing multiple CXR examinations during their stay on the NICU, efforts have been made to identify an alternative diagnostic test.21,22
Lung ultrasound
In the past, lung ultrasound (LUS) has not been widely used for neonatal chest imaging due to the obscuring artefact generated by normal air-filled lung. 21
Ultrasound does not involve ionising radiation but is associated with potential risks due to mechanical (inertial cavitation) and thermal tissue damage. 23 The risk of these adverse bio-effects is low in routine clinical practice, but proportional to duration of ultrasound examination, dependent on the specific tissues under examination and the output of the ultrasound transducer. Risk is quantified in terms of mechanical and thermal indices, MI and TI respectively and displayed during scanning. 24 The ‘as low as reasonably practicable’ (ALARP) principle, along with acoustic safety guidelines are implemented to minimise risk. 25
LUS appearances of the normal and NRDS affected lung. 21
NRDS: neonatal respiratory distress syndrome.

Normal and abnormal transthoracic LUS appearances of NRDS.
Ultrasonic verification of ETT position in neonates has also shown potential. Studies have reported close correlation between ultrasound and CXR measurements, and ultrasound is comparatively much faster.30,31 Due to a lack of high quality supporting evidence CXR remains the gold standard. 32
Aim
The aim of this review was to compare the diagnostic accuracy of LUS against the reference standard clinical test and CXR in symptomatic neonates of ≤42 weeks gestational age.
Method
Search strategy
Studies were identified during August 2016 using the following databases: OVID Embase 1996–2016, OVID Medline (R) 1996–2016, PUBMED 1996–2016, Science Direct 1995–2016, Leeds University Library’s Journals/Books@OVID (full-text), CINAHL 1990–2016, The Cochrane Library 2005–2016 and Google Scholar.
Search terms
Truncation command.
Inclusion and exclusion criteria were designed in accordance with the Population, Intervention, Comparator, Outcome (PICO) framework to correlate with the research question. To increase validity and reproducibility they were defined a priori. Studies were included if they were randomised controlled trials (RCTs), cohort or case-control studies, recruited neonates ≤ 42 weeks GA in a clinical setting with signs and symptoms of NRDS within 48 hours of birth, and had NRDS diagnosed using a combination of clinical indicators (presentation, vital signs and auscultation), CXR and/or laboratory blood gas analysis. Limited resources restricted inclusion to studies published in English. Although this may introduce language bias, 33 there is little evidence to suggest that systematic bias occurs with such an approach. 35 Articles were not excluded on the basis of geographical location or publication date to limit bias and maximise retrieval of relevant material.33,34
Studies were excluded where it was not possible to extract sufficient data to populate 2 × 2 contingency tables, where they could not be obtained through the local institutional library or the British Library, where requisite permission from parents and ethical committees had not been obtained, or where studies collected non-human or cadaveric data.
After removing duplicate results, study titles, abstracts or full-papers were reviewed to determine inclusion in the review. Differences of opinion were resolved by discussion. The reference lists of included studies were examined to identify further relevant studies that had not been retrieved by the database search; forward citation tracking was performed in Google Scholar. The rigorous search and selection process limited selection bias and reduced the chance of random error.33,34
Quality assessment
Since the inclusion of studies other than RCTs can increase selection and reporting bias, 33 quality assessment was performed using the QUADAS-2 (Quality Assessment of Diagnostic Accuracy Studies 2) tool. 36 Risk of bias and applicability were assessed in four key areas relevant to the research question: patient selection, index test, reference standard and test flow and timing. Three team members individually scored each study awarding one point for each criterion where risk of bias was considered to be low. 36
Patient selection was considered to have low risk of bias if there was a consecutive sample of neonates, they were suspected to have NRDS within 48 hours of birth, and subjects had not been excluded inappropriately. Applicability concerns were considered low if neonates with congenital heart and chest disease had been excluded, studies were conducted in an appropriate clinical setting and there was no evidence of recruitment according to disease severity.
Index test bias criteria required LUS practitioners to be blinded to the results of the CXR and applicability concerns related to appropriateness of probe frequency and age and capability of equipment. Conversely for the reference standard, clinicians would ideally be blinded to the results of the LUS examination (low bias) and the clinical test had to be appropriate (applicability).
In terms of flow and timing of the reference and index tests, risk of bias was deemed low if all neonates received the same clinical test and a CXR, the interval between LUS and CXR was ≤5 hours and all recruits were included in 2 × 2 contingency table analysis.
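The flow-and-timing rule described above can be expressed as a simple predicate. The sketch below is illustrative only: the class name, field names and example studies are hypothetical and are not part of the QUADAS-2 tool or of the review's data; only the three criteria and the 5-hour threshold come from the text.

```python
from dataclasses import dataclass


@dataclass
class FlowAndTiming:
    """Hypothetical record of one study's flow-and-timing characteristics."""
    same_clinical_test_and_cxr_for_all: bool
    hours_between_lus_and_cxr: float
    all_recruits_in_2x2_table: bool


def low_risk_of_bias(study: FlowAndTiming, max_interval_hours: float = 5.0) -> bool:
    """Return True only when all three flow-and-timing criteria are met."""
    return (
        study.same_clinical_test_and_cxr_for_all
        and study.hours_between_lus_and_cxr <= max_interval_hours
        and study.all_recruits_in_2x2_table
    )


# Illustrative studies, not taken from the review's data.
prompt_study = FlowAndTiming(True, 4.0, True)
delayed_study = FlowAndTiming(True, 24.0, True)

print(low_risk_of_bias(prompt_study))   # True
print(low_risk_of_bias(delayed_study))  # False
```

A study failing any single criterion is not scored as low risk in this domain, which mirrors the one-point-per-criterion scoring only loosely; the actual scoring was performed by the reviewers.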
Data extraction and analysis
Data extraction was carried out independently by MH and CW. The following data were extracted: sample size, age range, study design, blinding, method of NRDS diagnosis, LUS operator skill level, LUS diagnostic technique, time between CXR and LUS, LUS diagnostic criteria and the number of true positives, true negatives, false positives and false negatives.
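The four extracted counts form a 2 × 2 contingency table, from which sensitivity and specificity follow directly. The sketch below shows the arithmetic with a Wilson score interval. The interval method is an assumption chosen for illustration and may differ from the one used by the analysis software, and the counts are hypothetical rather than taken from any included study.

```python
import math
from typing import Dict, NamedTuple


class Proportion(NamedTuple):
    estimate: float
    lower: float
    upper: float


def wilson_interval(successes: int, total: int, z: float = 1.959964) -> Proportion:
    """Proportion with a Wilson score interval (default z gives roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return Proportion(p, max(0.0, centre - half), min(1.0, centre + half))


def accuracy_from_2x2(tp: int, fp: int, fn: int, tn: int) -> Dict[str, Proportion]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return {
        "sensitivity": wilson_interval(tp, tp + fn),
        "specificity": wilson_interval(tn, tn + fp),
    }


# Hypothetical counts, for illustration only.
result = accuracy_from_2x2(tp=45, fp=3, fn=5, tn=47)
for name, prop in result.items():
    print(f"{name}: {prop.estimate:.2f} ({prop.lower:.2f} to {prop.upper:.2f})")
```

With these made-up counts the point estimates are 45/50 for sensitivity and 47/50 for specificity.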
Contingency tables were created to calculate test sensitivity and specificity and the DerSimonian and Laird random effects model 37 was fitted to the data to account for the heterogeneity across the studies. Use of a random-effects model is recommended in systematic reviews of diagnostic studies due to heterogeneity. 34 Confidence intervals (CIs) of 95% were calculated for individual and pooled data. The chi-squared test (χ2) was applied to assess risk of heterogeneity (p < 0.10). 33 The Inconsistency (I2) test was used to quantify heterogeneity (significance greater than 50%). 33 Statistical analysis was undertaken using Meta-DiSc® version 1.4 software. 38
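For readers unfamiliar with the DerSimonian and Laird approach, the following is a minimal sketch of the general estimator applied to logit-transformed sensitivities. It does not reproduce the internals of the software used in this review: the logit scale, the 0.5 continuity correction, the function names and the study counts are all assumptions made for illustration.

```python
import math
from typing import List, Tuple


def dersimonian_laird(effects: List[float], variances: List[float]) -> Tuple[float, float, float]:
    """Return (pooled effect, standard error, tau squared)."""
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    w = [1.0 / v for v in variances]                       # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sw - sum(wi**2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)          # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]         # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


def inv_logit(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical (true positives, total with disease) pairs, illustration only.
studies = [(38, 40), (55, 56), (27, 30)]
effects, variances = [], []
for tp, n in studies:
    a, b = tp + 0.5, (n - tp) + 0.5        # 0.5 continuity correction
    effects.append(math.log(a / b))        # logit sensitivity
    variances.append(1.0 / a + 1.0 / b)    # approximate variance of the logit

pooled_logit, se, tau2 = dersimonian_laird(effects, variances)
pooled_sens = inv_logit(pooled_logit)
ci = (inv_logit(pooled_logit - 1.96 * se), inv_logit(pooled_logit + 1.96 * se))
print(f"pooled sensitivity {pooled_sens:.3f}, 95% CI {ci[0]:.3f} to {ci[1]:.3f}")
```

When the between-study variance estimate is zero, the random-effects weights reduce to the fixed-effect weights and the two models give the same pooled value.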
Results
Identification of studies
The search returned 803 studies of which 10 full texts were assessed for eligibility against the inclusion/exclusion criteria. Six of these studies were omitted because they had insufficient detail to produce 2 × 2 contingency tables (n = 4),9,27,39,40 reported LUS results for lung zones instead of individual neonates (n = 1) 41 or assessed LUS for predicting the need for mechanical ventilation rather than diagnosing NRDS (n = 1). 11 Two further quantitative studies identified through forward and backward searching16,42 were included in the analysis (Figure 2).
Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flow diagram of search process.
Study characteristics
Primary data extracted from retrieved studies for meta-analysis
General study characteristics
LUS: lung ultrasound.
Quality assessment
QUADAS-2 Risk of bias and applicability assessment
Meta-analysis
Across the six studies, pooled sensitivity and specificity for the diagnosis of NRDS were 0.97 (CI: 0.94–0.99) and 0.91 (CI: 0.86–0.95), respectively (Figure 3(a) and (b)). The χ2 values were statistically significant (p < 0.10), indicating heterogeneity amongst the studies beyond that expected by chance: χ2 22.92 (p = 0.0003) and χ2 21.60 (p = 0.0006) for sensitivity and specificity, respectively. The corresponding I2 statistic values were 78.2% and 76.9%. Since these values were >50%, this was considered to be significant heterogeneity based on recommendations from the Cochrane handbook (2008). 33
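The reported I2 values can be checked against the χ2 statistics using the standard relationship I2 = 100% × (Q − df)/Q. The short sketch below assumes that the reported χ2 values are Cochran's Q with df = 5 (six studies minus one); the function name is illustrative.

```python
def i_squared(q: float, n_studies: int) -> float:
    """Inconsistency statistic as a percentage, truncated at zero."""
    df = n_studies - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


print(round(i_squared(22.92, 6), 1))  # 78.2
print(round(i_squared(21.60, 6), 1))  # 76.9
```

Both results agree with the values reported above, which is consistent with the assumption of five degrees of freedom.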
Forest plots describing the sensitivity (a) and specificity (b) of LUS for the diagnosis of NRDS.
Subgroup analysis of the four prospective cohort studies4,13,14,16 showed pooled sensitivity of 96% (CI: 92–98%) and specificity 86% (CI: 79–92%). For the four studies using transthoracic scanning,4,10,16,42 LUS sensitivity was 99% (CI: 95–100%) and specificity 98% (CI: 93–100%); in comparison, the pooled sensitivity of the two studies using transabdominal scanning13,14 was 96% (CI: 91–98%) and the specificity 83% (CI: 72–98%).
Discussion
Diagnostic accuracy of LUS
Meta-analysis of six studies which compared LUS to CXR and clinical information showed high sensitivity (97%) and specificity (91%) for detecting and excluding NRDS, respectively. Subgroup analysis of the four prospective cohort studies showed markedly lower specificity. Although the healthy controls underwent the same index and reference tests as the disease group in the two case-control studies, the absence of a random or a consecutive sample of participants may have resulted in over-estimation of diagnostic accuracy in this subgroup. 36 As such we feel the subgroup analysis of prospective cohort studies provides the most accurate reflection of test accuracy (sensitivity 96%, specificity 86%).
The transthoracic technique appeared to be superior to the transabdominal approach for diagnosing NRDS because subgroup analysis demonstrated it to have marginally better sensitivity (99%, 97% respectively) and better specificity (98%, 82% respectively). The increased specificity of the transthoracic technique would reduce the number of false positive diagnoses and have the clinical benefit of reducing unnecessary additional testing or intervention.
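The practical effect of a difference in specificity can be illustrated by counting expected false positives, which equal (1 − specificity) multiplied by the number of neonates without NRDS. The sketch below uses the specificities quoted in the preceding paragraph and a hypothetical group of 100 neonates without NRDS; the function name and group size are assumptions for illustration.

```python
def expected_false_positives(specificity: float, n_without_disease: int) -> float:
    """Expected number of false positive results among those without the disease."""
    if not 0.0 <= specificity <= 1.0:
        raise ValueError("specificity must lie between 0 and 1")
    return (1.0 - specificity) * n_without_disease


cohort = 100  # hypothetical number of neonates without NRDS
for label, spec in (("transthoracic", 0.98), ("transabdominal", 0.82)):
    print(label, round(expected_false_positives(spec, cohort)))
# transthoracic 2
# transabdominal 18
```

Under these assumptions, roughly nine times as many neonates without NRDS would be flagged by the transabdominal approach.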
Vergine et al. 16 measured the diagnostic accuracy of CXR without the addition of clinical information and found a sensitivity of 91% and a specificity of 84%. Based on these values, LUS appears to be a comparable test.
Timing of test performance
During the acute phase of NRDS, the clinical picture can vary significantly over time.6,7 Such changes are influenced by naturally increasing disease severity and the impact of any treatment provided. It is important when comparing a proposed new test with an existing ‘reference’ test that both are carried out within a narrow time frame to reduce performance bias. 36 Two studies4,10 specified that both tests were conducted within 5 hours. The remaining four studies13,14,16,42 completed LUS and CXR within 24 hours. This increases the risk of bias due to the possibility that changes occurred as a result of advancing disease severity or conversely, due to treatment response (Table 5). 34
Limitations of imaging
The long term biological effects of ultrasound on neonatal lung tissue are unknown. 25 Through prudent clinical use and the avoidance of ionising radiation, LUS is theoretically a safer alternative to CXR. 21 Despite an established pattern of radiological appearances in NRDS, findings often overlap with other respiratory pathologies that are common amongst premature neonates.11,21 The static, planar nature of the CXR can make differential diagnosis difficult and a degree of inter-observer disagreement is inevitable, especially in less advanced disease. 21
LUS has its own characteristic signs associated with NRDS,9,10,11,21 the identification of which is aided by real-time visualisation of lung parenchyma and the performance of numerous multi-planar sweeps across the lung fields.10,13 Ultrasound is notoriously operator dependent, an inherent source of potential error, 25 although utilisation of a standard approach helps to limit operator dependency and can improve diagnostic accuracy. 21
If LUS is to be used as a first line investigation for NRDS, it must be carried out soon after birth in order to maximise positive health outcomes. This presents economic and administrative challenges as LUS would require neonatal clinicians to spend time learning a new skill or alternatively, an LUS practitioner would be required to service the NICU 24 hours a day 7 days a week.
Consequences of diagnostic error
The relatively low (91%) pooled specificity for LUS implied a tendency for over-diagnosis of NRDS. Sixteen false positive cases were described across the studies due to pneumonia (n = 8), transient tachypnoea (n = 3), pneumothorax (n = 1) and meconium aspiration syndrome (n = 1); in three cases no alternate diagnosis was given.
Pneumonia occurs frequently in new-borns and shares many of the same sonographic and radiographic appearances of NRDS. Consolidation with air bronchograms, pleural line abnormality, and alveolar interstitial syndrome (presence of > 3 b-lines) are all associated with the disease. 43 Consolidation in severe cases of pneumonia is often large with irregular margins; in less severe cases multi-focal areas of consolidation can be mistaken for NRDS. 44 In many cases, the diagnosis of pneumonia requires bacteriologic culture to identify the presence of infection. 7
Transient tachypnoea of the new-born (TTN) occurs in approximately 1% of all new-borns due to insufficient clearance of foetal lung fluid. 16 The resulting respiratory distress is accompanied by similar clinico-radiological features to those seen in NRDS. Copetti and Cattarossi 45 described ‘the double lung point’ sign in TTN which improves the accuracy of LUS for diagnosis (sensitivity 93%, specificity 97%). The ‘double lung point’ sign features a normal pleural line with sliding lung, a difference in echogenicity between the lower and upper lung areas, and comet tail artefacts that are present in the inferior lung but largely absent in the superior lung. 45 All three false positives with TTN were from the same study 14 which utilised a transabdominal technique. Copetti et al. 42 suggest it is not possible to examine either the superior lung field or the pleural line using this approach, which may explain the failures to correctly diagnose the condition.
Of the nine false negative cases identified, seven were insufficiently reported and the eventual diagnosis is unknown. The remaining two were diagnosed by CXR as partial pneumothorax. This can be a complication of NRDS along with other associated air-leak syndromes such as interstitial emphysema, 21 pneumomediastinum and pneumopericardium.7,10 Air leaks may occur spontaneously, but more commonly occur through inadequate mechanical ventilation pressure causing alveolar rupture and subsequent escape of air beyond the terminal airways. 8 Neonates with NRDS have an increased risk of air-leaks due to the delicate nature of the surfactant-deficient lung and their frequent oxygen therapy requirement. 46 Lichtenstein et al. 47 defined a pattern of LUS features that can be used to diagnose pneumothorax; normal lung sliding and b-lines originating from the visceral pleura are obliterated at the site of pneumothorax. The point at which normal findings diminish is ‘the lung point’ which demarcates the presence of air in the pleural cavity (pneumothorax) and is associated with 79% sensitivity, 100% specificity. 47 Both instances of false negative pneumothorax were diagnosed by CXR in the study by Lovrenski, 4 the author maintaining that despite a well-defined pattern, smaller pneumothoraces remain diagnostically challenging. Bober and Swietliński 13 support this idea and suggest that an ultrasound beam can propagate through a small pneumothorax into the lung field, rendering production of the lung point sign impossible.
Pneumothoraces are also frequently encountered in cases of meconium aspiration syndrome which may explain the isolated false positive case identified in this review. Air is unable to escape upon exhalation due to airway constriction around aspirated meconium which increases the resistance of expiratory airflow. This ‘ball valve’ effect creates a volume of trapped gas causing hyperinflation and possible alveolar rupture (air-leak). 48
The use of LUS for the detection of pneumomediastinum and pneumopericardium is yet more contentious with arguments for49,50 and against.10,16 There is little high quality evidence to support or refute a role for LUS in this area. This is important, as a chief concern with suspected NRDS is the presence of leaking air due to its deleterious consequences (tension causing compression of vessels and airways). 46 With this in mind, CXR appears requisite to rule out air-leak syndromes for neonates with suspected NRDS.
Summary
This review has shown that LUS compares well with the current reference standard for the diagnosis of NRDS. With appropriate technique and knowledge of standardised findings and potential pitfalls, e.g. TTN, pneumothorax, the diagnostic accuracy of LUS could be further improved. LUS has superior diagnostic accuracy for alveolar consolidation – a major component of the NRDS pattern (90% sensitivity, 98% specificity). Reduced CXR sensitivity (68% sensitivity, 95% specificity) occurs when the radiograph is acquired in the supine position – a necessity in neonates. 44 Less intra-observer variation occurs in LUS identification of small pneumonias and air bronchograms – a problematic source of error in CXR reading. 51 This may be due to real-time visualisation of lung behaviour in synchronisation with the respiratory cycle and the ability to access multiple cross sections of the lung fields. 13 Reduced lung volume, smaller thorax diameter and a thin thoracic wall in neonates may also improve image quality.5,13,52
Review limitations
A degree of heterogeneity across studies was expected and this was confirmed statistically by I2 values greater than 50% across both forest plots (Figure 3(a) and (b)). In addition to the differences in study design and scanning technique addressed in the subgroup analysis, three other sources of heterogeneity were identified:
LUS operators were not blinded to clinico-radiologic information in 50% of the studies (Table 4). As prior knowledge can influence the interpretation of the forthcoming examination, this could have biased diagnostic accuracy favourably.
With the exception of two studies,4,10 the duration between CXR and subsequent LUS was variable. This could have inflated LUS sensitivity due to disease progression leading to increased detection of pathology in the second test. Conversely, LUS sensitivity for NRDS may have appeared diminished due to the effects of surfactant replacement therapy between tests. No study reported instigation of treatment during the test interval so the effect of this bias remains unknown.
All studies used signs and symptoms in the clinical diagnosis; only three studies included a supplementary blood test.4,10,13 Additional CRIB and GAST tests were used in only two studies.13,14 Differences in the clinical tests used across the studies could have introduced bias leading to their varying diagnostic accuracy and applicability (Table 4).34,36
The six studies included 480 participants. This sample may not reflect the full spectrum of NRDS, or diseases that mimic the appearance of NRDS, which a larger sample size might. In cases of non-advanced disease, a differential diagnosis with LUS becomes harder to define, although this is a problem that is shared with CXR.
Although used as the reference standard, the absolute diagnostic accuracy of ‘CXR and clinical tests’ has not been verified in neonates. 47
Recommendations
Owing to the frequency of NRDS admissions to NICUs and the number of CXRs performed on neonates, LUS adheres to the ALARP principle by reducing ionising radiation burden. The following recommendations are suggested:
CXR is required in suspected NRDS to assess for air-leak syndromes (ALS).
The combination of consolidation, pleural line abnormalities and bilateral white lung detected via the transthoracic technique offers the most reliable diagnostic criteria (sensitivity 99%, specificity 98%).
Future research is required to understand LUS effectiveness as an initial screening tool for NRDS and comorbid ALS, as a means of ETT assessment (comparing LUS with CXR at four hours of postnatal age), and as a follow up imaging tool for informing surfactant and ventilatory therapy in NRDS patients.
Comparison of neonatologist vs. ultrasound practitioner vs. neonatal nurse practitioner in acquiring and interpreting LUS.
Economic modelling to determine the feasibility of current neonatal staff learning a new skill, spending time practising it and interpreting the results, and the number of neonatologists, nurse practitioners or ultrasound practitioners needed to carry out LUS.
Review of the impact on 24/7 neonatal service delivery.
Conclusion
The diagnostic accuracy of LUS appears to be comparable with the reference standard of CXR and clinical tests. However, the heterogeneity amongst studies, their small sample sizes and the lack of an independently validated comparator mean the results must be treated cautiously. 33 LUS may potentially miss ALS (pneumothorax, pneumomediastinum and pneumopericardium), and therefore CXR remains necessary for suspected NRDS. LUS is a promising technique although currently in its infancy with a limited body of experimental studies to support its use. High quality RCTs are required to quantify the diagnostic accuracy of LUS for NRDS and comorbid ALS, and to assess LUS effectiveness in follow up imaging. A significant role of CXR in NRDS is verification of ETT position for neonates receiving invasive ventilation. 32 Further study into the effectiveness of ultrasound ETT confirmation is required if the absorbed dose of ionising radiation is to be reduced. Future research should address ways to integrate LUS practice into NICUs in terms of personnel to perform the examination and its economic feasibility.
Footnotes
Declaration of Conflicting Interests
The author(s) declare no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclose no receipt of financial support for the review. The views expressed in this publication are those of the author(s) and not necessarily those of the NHS or other bodies.
Ethical approval
Ethical approval was not necessary for this review.
Guarantor
SW.
Contributors
MH and SW conceived and developed the systematic review and review protocol. MH searched the databases. MH and CW screened the study titles and abstracts for inclusion and exclusion to ensure agreement and limit selection bias. MH reviewed the reference lists of included studies to identify relevant studies that were not retrieved as part of the database search. MH performed forward citation searching using Google Scholar. MH and CW independently extracted data from the included studies. MH and CW carried out quality assessment of the included studies. SW reviewed the extracted data and the quality assessment process for accuracy. Statistical analysis was performed with the assistance of a statistician (TM). MH wrote the first draft and final version of the manuscript. All authors edited drafts and reviewed and accepted the final version of the manuscript.
Acknowledgments
The authors express gratitude to the consultant paediatric radiologists from the Yorkshire region who gave their constructive criticism whilst drafting the paper.
