Abstract
Background
Early diagnosis of biliary atresia (BA) is an important clinical challenge.
Purpose
To summarize the latest diagnostic performance of different ultrasonic (US) features for BA.
Material and Methods
MeSH terms “biliary atresia” and “ultrasonography” and related hyponyms were used to search PubMed, EMBASE, and the Cochrane Central Register of Controlled Trials. Eligible articles were included and data were retrieved. The methodologic quality was assessed by version 2 of the Quality Assessment of Diagnostic Accuracy Studies tool. Estimated sensitivity and specificity of each US feature were calculated by Stata 14.0.
Results
Fifty eligible studies on 5622 patients were included. Respective summary sensitivity and specificity were 77% (95% CI=69–84) and 98% (95% CI=96–99) for triangular cord sign (TCS) in 32 studies, 86% (95% CI=78–92) and 86% (95% CI=72–94) for shear wave elastography (SWE) in seven studies, 75% (95% CI=65–83) and 92% (95% CI=86–95) for gallbladder and biliary system abnormality (GBA) in 25 studies, and 81% (95% CI=69–90) and 79% (95% CI=67–87) for hepatic artery (HA) enlargement in seven studies. The overall US features from 11 studies yielded a summary sensitivity of 84% (95% CI=72–92) and specificity of 86% (95% CI=77–92).
Conclusion
TCS and GBA were the two most widely accepted US features currently used for differential diagnosis of BA. The newly developed SWE was an objective and convenient method with good diagnostic performance. HA enlargement can be used as an auxiliary sign.
Introduction
Biliary atresia (BA) is a disease characterized by progressive inflammation and fibrosis of the intrahepatic and extrahepatic bile ducts with unknown etiology, which can lead to cholestasis, progressive liver fibrosis, and even cirrhosis, and endangers the lives of children (1). It is one of the common serious hepatobiliary diseases in infancy (2). If it is not treated in time, biliary cirrhosis, portal hypertension, and liver failure will occur.
Early diagnosis and intervention are critical to the prognosis of BA. The auxiliary examinations currently used for this disease include ultrasound, hepatobiliary scintigraphy, magnetic resonance (MR) cholangiography, and liver biopsy (3). Among these methods, ultrasound is recommended as the initial screening method because it is affordable, convenient, and radiation-free (4). Previous studies have found that triangular cord sign (TCS), gallbladder and biliary system abnormality (GBA), hepatic artery (HA) enlargement, and the newly developed liver shear wave elastography (SWE) are useful ultrasonic (US) features for the diagnosis of BA. Several meta-analyses on the diagnostic performance of ultrasound for BA have been carried out in the past, but many new studies exploring the diagnostic performance of different US features for BA have been published in recent years and have not yet been incorporated into a meta-analysis. In addition, the newly developed SWE has shown good diagnostic performance for BA but has not yet been systematically reviewed. Therefore, the aim of the present study was to conduct a meta-analysis of the latest diagnostic performance of different US features and SWE in diagnosing BA, and to evaluate their roles in identifying BA in infants with cholestasis.
Material and Methods
The study was a systematic review and meta-analysis, and institutional review board approval was not required.
Search strategy
MeSH terms “biliary atresia” and “ultrasonography” and related hyponyms were used to search PubMed, EMBASE, and the Cochrane Central Register of Controlled Trials (CENTRAL). The search was updated to March 2021. The search strategy in PubMed was as follows: (sensitiv*[Title/Abstract] OR sensitivity and specificity[MeSH Terms] OR (predictive[Title/Abstract] AND value*[Title/Abstract]) OR predictive value of tests[MeSH Terms] OR accuracy*[Title/Abstract]) AND ((((((((((((((((((((((((((("Ultrasonography"[Mesh]) OR (Diagnostic Ultrasound)) OR (Diagnostic Ultrasounds)) OR (Ultrasound, Diagnostic)) OR (Ultrasounds, Diagnostic)) OR (Ultrasound Imaging)) OR (Imaging, Ultrasound)) OR (Imagings, Ultrasound)) OR (Echotomography)) OR (Ultrasonic Imaging)) OR (Imaging, Ultrasonic)) OR (Sonography, Medical)) OR (Medical Sonography)) OR (Ultrasonographic Imaging)) OR (Imaging, Ultrasonographic)) OR (Imagings, Ultrasonographic)) OR (Ultrasonographic Imagings)) OR (Echography)) OR (Diagnosis, Ultrasonic)) OR (Diagnoses, Ultrasonic)) OR (Ultrasonic Diagnoses)) OR (Ultrasonic Diagnosis)) OR (Echotomography, Computer)) OR (Computer Echotomography)) OR (Tomography, Ultrasonic)) OR (Ultrasonic Tomography)) AND ((((((((("Biliary Atresia"[Mesh]) OR (Atresia, Biliary)) OR (Intrahepatic Biliary Atresia)) OR (Biliary Atresia, Intrahepatic)) OR (Biliary Atresia, Extrahepatic)) OR (Atresia, Extrahepatic Biliary)) OR (Extrahepatic Biliary Atresia)) OR (Idiopathic Extrahepatic Biliary Atresia)) OR (Familial Extrahepatic Biliary Atresia))).
Study eligibility
First, duplicate articles were removed using the reference manager EndNote. An article was then considered potentially eligible if the abstract described the diagnostic performance of US features in the diagnosis of BA. Thereafter, full texts were obtained for further evaluation.
The inclusion criteria were as follows: (i) explicit criteria defining US features; (ii) a reference standard confirming BA, including liver biopsy, intraoperative cholangiography (IOC), MR cholangiography, 99mTc DISIDA imaging, percutaneous transhepatic cholangiography, or endoscopic retrograde cholangiography; and (iii) sufficient data to extract the diagnostic performance of US features. The exclusion criteria were as follows: (i) case reports, case series with a sample size of <10 patients, editorials, comments, letters, review articles, animal studies, and conference proceedings; (ii) studies with insufficient data on the diagnostic performance of US features in patients with BA; and (iii) studies with overlapping patients and data.
Two investigators independently assessed the eligibility of each article, and a third investigator resolved any disagreements.
Data extraction
We extracted the following data from the selected studies: (i) study characteristics, including author(s), year of publication, affiliation, sample size, study design, and US features; (ii) diagnostic performance of US features in patients with BA, including the number of true-positive (TP), true-negative (TN), false-positive (FP), and false-negative (FN) findings; and (iii) detailed US features of BA. Two authors independently extracted the data. Any disagreement was resolved by discussion until consensus was reached or by consulting a third author.
Quality assessment
The methodological quality of the studies was assessed using version 2 of the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool. Two reviewers scored each study independently, and any disagreement was resolved by discussion until consensus was reached or by consulting a third author.
Statistical analysis
Meta-DiSc 1.4 software (Cochrane Colloquium, Barcelona, Spain) and Stata 14.0 (Stata Corporation, College Station, TX, USA) were used for statistical analysis. We extracted or reconstructed 2 × 2 contingency tables for all the US features reported in the included studies. The TP, TN, FP, and FN counts were extracted or calculated from the original articles.
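As an illustration of how a single 2 × 2 table translates into per-study accuracy measures, the following minimal Python sketch computes sensitivity, specificity, and the diagnostic odds ratio with a Wald-type 95% CI. It is not the code used in this study (the analysis was run in Meta-DiSc and Stata); the function name, the 0.5 continuity correction, and the example counts are assumptions made for illustration only.

```python
import math

def diagnostic_metrics(tp, fp, fn, tn, correction=0.5):
    """Sensitivity, specificity and DOR (with 95% CI) from one 2 x 2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Assumption: add 0.5 to every cell when any cell is empty, so that the
    # DOR and its standard error stay finite. Applied to the DOR only.
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + correction for x in (tp, fp, fn, tn))
    dor = (tp * tn) / (fp * fn)
    se_log_dor = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    ci = (math.exp(math.log(dor) - 1.96 * se_log_dor),
          math.exp(math.log(dor) + 1.96 * se_log_dor))
    return {"sensitivity": sensitivity, "specificity": specificity,
            "dor": dor, "dor_ci": ci}

# Hypothetical counts, not taken from any included study.
result = diagnostic_metrics(tp=40, fp=2, fn=10, tn=48)
print(result)
```

Pooled estimates such as those reported in the Results come from a meta-analytic model fitted across studies, so they cannot be reproduced by applying this function to summed counts.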
The summary sensitivity and specificity were calculated from the extracted data. A summary receiver operating characteristic (SROC) curve with a 95% confidence region and prediction region was also plotted to present the results graphically. The diagnostic odds ratio (DOR) and area under the curve (AUC) were calculated to evaluate diagnostic performance. Heterogeneity among studies was assessed with the I² statistic. When heterogeneity was noted, the Spearman correlation coefficient between the sensitivity and FP rate was calculated to evaluate whether a “threshold effect” existed. We used Deeks’ funnel plot to assess potential publication bias and Deeks’ asymmetry test to assess its statistical significance. A slope coefficient with P < 0.1 was regarded as statistically significant asymmetry, indicating publication bias (5).
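The two heterogeneity checks described above can be sketched in a few lines of Python. This is a simplified illustration, not the Meta-DiSc or Stata implementation: I² is derived here from Cochran's Q with inverse-variance weights on logit-transformed proportions, the Spearman coefficient is computed as the Pearson correlation of average ranks, and no P value is calculated. All function names and the example counts are hypothetical.

```python
import math

def logit_and_variance(events, total):
    """Logit of a proportion and its approximate variance (0.5 added to each cell)."""
    a = events + 0.5
    b = total - events + 0.5
    return math.log(a / b), 1 / a + 1 / b

def i_squared(estimates, variances):
    """Higgins I^2 (%) from Cochran's Q with inverse-variance weights."""
    weights = [1 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    df = len(estimates) - 1
    return max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

def ranks(values):
    """Average ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rho as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Hypothetical per-study counts (TP, FP, FN, TN), not from the included studies.
studies = [(40, 2, 10, 48), (25, 5, 5, 45), (30, 1, 20, 59), (18, 4, 2, 36)]
sens = [tp / (tp + fn) for tp, fp, fn, tn in studies]
fpr = [fp / (fp + tn) for tp, fp, fn, tn in studies]

logits, variances = zip(*(logit_and_variance(tp, tp + fn) for tp, fp, fn, tn in studies))
print("I^2 for sensitivity (%):", i_squared(logits, variances))
print("Spearman rho (sensitivity vs FP rate):", spearman(sens, fpr))
```

A strong positive correlation between sensitivity and FP rate would suggest that studies differ mainly in their positivity threshold. Because ranks are unchanged by monotone transformations, the same coefficient is obtained whether raw or logit-transformed values are used.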
Results
Literature search
The flow diagram of the study selection process is shown in Fig. 1. The literature search of the PubMed, EMBASE, and CENTRAL databases identified a total of 332 unique references; of these, 123 were deemed potentially relevant based on the titles and abstracts. Of these articles, 73 were excluded after full-text reading (37 review articles, 21 case reports, 1 letter, 3 animal studies, and 11 conference proceedings). Finally, a total of 50 articles satisfied the inclusion criteria and were selected for data extraction and analysis (6–55).

Flow diagram of the study selection process.
Study and design characteristics
The characteristics of the 50 included studies are summarized in Table 1. Of these 50 articles, 26 were prospective and 24 were retrospective. A total of 32 studies reported the diagnostic performance of TCS, 7 described the diagnostic performance of SWE, 25 described the performance of GBA, 7 described the performance of HA enlargement, and 11 described the performance of overall US features.
Characteristics of included studies.
BA, biliary atresia; GBA, gallbladder and biliary system abnormality; HA, hepatic artery; SWE, shear wave elastography; TCS, triangular cord sign; US, ultrasonic.
QUADAS-2 was used to assess the overall quality of the included studies. The quality assessment chart is shown in Fig. 2. The included studies generally met the criteria for quality assessment of diagnostic accuracy.

Quality assessment of the studies selected for the meta-analysis (QUADAS-2).
The diagnostic performance of different US features
Diagnostic performance of TCS
Data on the diagnostic performance of TCS were collected from 32 studies on 3240 patients. A coupled forest plot of the sensitivity and specificity of the diagnostic performance of TCS in the 32 included studies (1–31,45) is shown in Fig. 3a. The sensitivities and specificities of individual studies were in the range of 23%–100% and 74%–100%, respectively. The TCS showed summary sensitivity of 77% (95% confidence interval [CI] = 69%–84%), specificity of 98% (95% CI = 96%–99%), DOR of 140 (95% CI = 73–272), and AUC of 0.97 (95% CI = 0.95–0.98) for the detection of BA (Fig. 3b). The between-study heterogeneity was high for both sensitivity (I² = 94.4%, P < 0.001) and specificity (I² = 87.3%, P < 0.001). The Spearman correlation coefficient between sensitivity and FP rate was 0.22 (P = 0.22), indicating no threshold effect. No significant publication bias existed among the studies (P = 0.47) (Fig. 3c).

The diagnostic performance of the TCS from the 32 included studies (1–31,45). (a) Forest plots of sensitivity and specificity of TCS for diagnosis. (b) Summary ROC curve. (c) Funnel plot for evaluating publication bias. ROC, receiver operating characteristic; TCS, triangular cord sign.
Diagnostic performance of SWE
Data on the diagnostic performance of SWE were collected from seven studies (27,29,32–35,48) on 1044 patients. The sensitivities and specificities of individual studies were in the range of 73%–97% and 67%–100%, respectively. A coupled forest plot of the sensitivity and specificity of the diagnostic performance of SWE in the seven included studies is shown in Fig. 4a. The SWE showed summary sensitivity of 86% (95% CI = 78%–92%), specificity of 86% (95% CI = 72%–94%), DOR of 39 (95% CI = 11–142), and AUC of 0.92 (95% CI = 0.89–0.94) (Fig. 4b). Substantial heterogeneity in both sensitivity (I² = 83.0%) and specificity (I² = 89.3%) existed. The Spearman correlation coefficient between sensitivity and FP rate was −0.46 (P = 0.29), indicating no threshold effect. No significant publication bias existed among the studies (P = 0.11) (Fig. 4c).

The diagnostic performance of the SWE from the seven studies (27,29,32–35,48). (a) Forest plots of sensitivity and specificity of SWE for diagnosis. (b) Summary ROC curve. (c) Funnel plot for evaluating publication bias. ROC, receiver operating characteristic; SWE, shear wave elastography.
Diagnostic performance of GBA
Data on the diagnostic performance of GBA were collected from 25 studies (1–8, 11–14,16,17,22,23,25,27,36,45–50) on 2803 patients. Some studies analyzed the diagnostic performance of individual types of GBA, including absence of gallbladder, small gallbladder, irregular gallbladder wall, absence of gallbladder contraction, and absence of common bile duct (CBD). All these different types of GBA were collected and analyzed, and the results showed summary sensitivity of 75% (95% CI = 65%–83%), specificity of 92% (95% CI = 86%–95%), DOR of 33 (95% CI = 18–58), and AUC of 0.91 (95% CI = 0.89–0.93) for the detection of BA (Table 2). The between-study heterogeneity was high for both sensitivity (I² = 95.2%, P < 0.001) and specificity (I² = 94.5%, P < 0.001). The Spearman correlation coefficient between sensitivity and FP rate was 0.34 (P = 0.01), and the coupled forest plot of sensitivity and specificity did not reveal a threshold effect. No significant publication bias existed among the studies (P = 0.68).
The diagnostic performance of different US features used for the diagnosis of BA.
AUC, area under the curve; BA, biliary atresia; DOR, diagnostic odds ratio; GBA, gallbladder and biliary system abnormality; HA, hepatic artery; SWE, shear wave elastography; TCS, triangular cord sign; US, ultrasonic.
Subgroup analysis of GBA was performed according to different types (Table 3). Absence of gallbladder had the highest specificity (98%), but its sensitivity (29%) was very low. Irregular gallbladder wall also had a very high specificity (98%), and its sensitivity (59%) was higher than that of absence of gallbladder. Absence of gallbladder contraction and absence of CBD showed high sensitivity and low specificity. Irregular gallbladder wall, absence of gallbladder contraction, and absence of CBD had good diagnostic performance (AUC = 0.97, 0.92, and 0.94, respectively). A total of 14 articles studied the diagnostic performance of overall abnormal gallbladder for BA, with an AUC of 0.95 (95% CI = 0.92–0.96), which is similar to the result obtained by our present meta-analysis of the various types of GBA in 25 studies.
Subgroup analysis of different types of GBA.
AUC, area under the curve; CBD, common bile duct; DOR, diagnostic odds ratio; GBA, gallbladder and biliary system abnormality.
Diagnostic performance of HA enlargement
Data on the diagnostic performance of HA enlargement were collected from seven studies (4,11–13,18,22,25) on 722 patients. A coupled forest plot of sensitivity and specificity of the diagnostic performance of HA enlargement in the seven included studies is shown in Fig. 5a. HA enlargement showed summary sensitivity of 81% (95% CI = 69%–90%) and specificity of 79% (95% CI = 67%–87%). The DOR was 16 (95% CI = 6–43), and the hierarchical summary ROC curve showed an AUC of 0.87 (95% CI = 0.84–0.90) (Fig. 5b). The Higgins I² statistic showed substantial heterogeneity in both sensitivity (I² = 72.3%) and specificity (I² = 88.2%). Neither the coupled forest plot of sensitivity and specificity nor the Spearman correlation coefficient between sensitivity and FP rate (−0.36, P = 0.43) indicated a threshold effect. No significant publication bias existed among the studies (P = 0.79) (Fig. 5c).

The diagnostic performance of the HA enlargement from the seven studies (4,11–13,18,22,25). (a) Forest plots of sensitivity and specificity of HA enlargement for diagnosis. (b) Summary ROC curve. (c) Funnel plot for evaluating publication bias. HA, hepatic artery; ROC, receiver operating characteristic.
Diagnostic performance of overall US features
A total of 11 studies reported the diagnostic performance of overall US features for BA. Overall US features showed summary sensitivity of 84% (95% CI = 72%–92%) and specificity of 86% (95% CI = 77%–92%). The DOR was 39 (95% CI = 12–94). The hierarchical summary ROC curve was symmetric, and the AUC was 0.92 (95% CI = 0.89–0.94).
Discussion
Among the 50 included articles, 32 studied the diagnostic performance of TCS, making it the most studied US feature. Our meta-analysis showed that it had the highest diagnostic accuracy for BA (AUC = 0.97), which is consistent with the results of Zhou's 2015 meta-analysis of 20 studies (56), in which the summary sensitivity of TCS was 74% and the summary specificity was 97%. The results obtained after adding the 12 new studies carried out after 2015 are similar to those of Zhou's study, which further confirms that TCS is the most commonly used US feature for BA and combines very high specificity with moderate sensitivity.
SWE is an advanced US elasticity technique that offers quantitative evaluation, easily read results, and convenient operation for measuring liver tissue stiffness (57). The Young's modulus value of the liver is related to the pathological stage of liver fibrosis (58). The larger the Young's modulus value, the stiffer the tissue. Fibrous tissue proliferation in the periportal region is the main pathological characteristic of BA and manifests as increased liver stiffness. A total of seven SWE studies were included in our meta-analysis, and the results demonstrated that although its summary sensitivity and specificity were not outstanding, SWE has excellent overall diagnostic accuracy (AUC = 0.92). It is worth noting that the cutoff values used in these seven studies are not completely consistent. This may be due to differences between machines and between operators. Because only seven studies on SWE for BA are currently available, subgroup analysis could not be performed, and we could make only a rough evaluation of the diagnostic performance of SWE for BA. In the future, more relevant research is needed so that subgroup analysis can be carried out.
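For readers unfamiliar with how SWE systems report stiffness, the sketch below shows the standard conversion from shear wave speed to Young's modulus, E = 3ρc², which assumes soft tissue is incompressible, isotropic, and purely elastic. This relation is general elastography background rather than something stated in the included studies; the function name and the default density of 1000 kg/m³ are assumptions for illustration.

```python
def youngs_modulus_kpa(shear_wave_speed_m_s, density_kg_m3=1000.0):
    """Young's modulus in kPa from shear wave speed, using E = 3 * rho * c^2.

    Assumes incompressible, isotropic, purely elastic tissue; the default
    density of 1000 kg/m^3 is an approximation for soft tissue.
    """
    modulus_pa = 3.0 * density_kg_m3 * shear_wave_speed_m_s ** 2
    return modulus_pa / 1000.0

# Hypothetical shear wave speeds (m/s), not measurements from any included study.
for speed in (1.0, 1.5, 2.0):
    print(speed, "m/s ->", youngs_modulus_kpa(speed), "kPa")
```

Because the modulus grows with the square of the speed, cutoff values reported in m/s and in kPa are not linearly interchangeable, which is one reason cutoffs can be hard to compare across machines.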
Among the 50 included studies, 25 evaluated GBA, second only to TCS; GBA has been regarded as a reliable and widely used US feature. Although the diagnostic performance of GBA is lower than that of TCS, its overall diagnostic performance is still very good (AUC = 0.91). Compared with the meta-analysis on GBA in 2015 (56), our study included more studies and performed a more detailed subgroup analysis. GBA covers many conditions, including absence of gallbladder, small gallbladder, irregular gallbladder wall, absence of contraction, absence of CBD, and overall abnormal gallbladder. A subgroup analysis of the different types of GBA showed that absence of gallbladder had the highest specificity, but its sensitivity was very low. Irregular gallbladder wall also had a very high specificity. Absence of contraction and absence of CBD had high sensitivity but low specificity. Irregular gallbladder wall, absence of contraction, and absence of CBD all had good diagnostic performance. However, it is worth noting that recognizing GBA during ultrasound examination is a subjective task. For example, the CBD can be difficult to identify even in healthy infants, and judging whether the gallbladder wall is irregular depends mainly on the experience and subjective impression of the sonographer. Developing a unified, objective gallbladder classification system may improve the diagnostic accuracy of US for BA.
In addition to the commonly used TCS and GBA and the newly developed SWE, HA enlargement is also used by researchers as an indicator for the differential diagnosis of BA. Its sensitivity, specificity, and overall diagnostic performance are not particularly high, but it can still be used as an auxiliary sign to help identify BA. In addition, some studies describe only the diagnostic performance of overall US features for BA, without specifying which feature or combination of features was used. Because these studies rely on a loosely defined notion of US signs, the resulting summary sensitivity and specificity were moderate, although the overall diagnostic performance remained high (AUC = 0.92). In the future, studies on specified combinations of US features would assist the differential diagnosis of BA.
Most of the studies in this meta-analysis used TCS and GBA as diagnostic features and surgery and biopsy as reference standards, and they reported wide ranges of sensitivity and specificity. These differences may be caused by a variety of factors. First, the included studies span a long time period, and technological advances in US equipment can explain some of the variation. Second, the criteria for abnormal US features differed between studies. For example, some studies used a cutoff value of 3 mm for TCS, while others used 4 mm. The cutoff values of SWE also differed slightly between studies, which may be due to differences between machines, and the definition of GBA varied subtly across studies. In addition to the US features analyzed in this paper, researchers have also studied the diagnostic performance of other features, such as hilar lymph nodes, subhepatic blood flow, and portal vein/hepatic artery ratio (17,31). Because few studies have examined these features, they were not systematically reviewed here. These US features may also show good diagnostic accuracy for BA, but this remains to be confirmed.
In conclusion, TCS and GBA are the two most widely accepted ultrasound features currently used for the differential diagnosis of BA, and both are accurate. The newly developed SWE is an objective and convenient method with good sensitivity, specificity, and overall diagnostic performance, and it is well suited to clinical application. HA enlargement can be used as an auxiliary feature to help diagnose BA. More research on the diagnostic value of combined ultrasound signs for BA is needed in the future.
Footnotes
Authors' note
Yajie Tang and Pan Yang made substantial contributions to the conception and design of the study, including literature retrieval, acquisition of data, analysis and interpretation of data, and drafting of the manuscript.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
