Abstract
Background:
Peri-prosthetic joint infection (PJI) is a serious and frequent complication of total joint arthroplasty (TJA). Recently, synovial fluid leukocyte esterase (LE), measurement of which is convenient and fast, has been examined as a marker of PJI. We summarized the articles describing synovial fluid LE as a biomarker for the diagnosis of PJI and assessed its diagnostic value in patients suspected of having PJI.
Methods:
We searched with appropriate key words in PubMed, Embase, Web of Science, the Cochrane database, and Science Direct. Eligible studies providing sufficient data to construct 2 × 2 contingency tables were chosen on the basis of several criteria, and the quality of the chosen studies was assessed. The pooled sensitivity, specificity, and diagnostic odds ratio (DOR) were calculated for those studies. The summary receiver operating characteristic (SROC) curve and the area under the SROC (AUSROC) were used to evaluate the overall diagnostic performance of LE.
Results:
Eleven studies were found suitable for this systematic review. Among them, eight articles with a total of 1,011 participants qualified for meta-analysis. The pooled sensitivity, specificity, and DOR were 0.90 (95% confidence interval [CI] 0.76–0.96), 0.97 (95% CI 0.95–0.98), and 310.76 (95% CI 103.86–929.88), respectively. The AUSROC was 0.98 (95% CI 0.96–0.99). Sub-group analysis indicated that the sample inclusion criteria might be the main source of heterogeneity. Deeks' funnel plot asymmetry test did not suggest publication bias (p = 0.144).
Conclusion:
Although the result of synovial fluid LE assay can be influenced by sample-related factors, it is more specific as a means to exclude PJI.
Traditionally, the hematologic diagnosis of PJI relies on inflammatory markers: white blood cell (WBC) count, erythrocyte sedimentation rate (ESR), and serum C-reactive protein (CRP) concentration. In addition, examination of the synovial fluid and peri-prosthetic tissue by histologic techniques, synovial fluid culture, and imaging, including emission computed tomography (ECT) bone scanning, magnetic resonance imaging (MRI), and positron emission tomography (PET), are used for the diagnosis of PJI [6, 7]. However, some of these results are non-specific for PJI, and the test results have to be combined with the clinical history and symptoms. Clearly, a more specific and sensitive routine test for PJI is required [8]. To address inconsistencies in diagnosing a PJI with these tests, several orthopedic associations have established clinical guidelines, which are based on consensus approaches, expert opinions, and reviews [9]. The American Academy of Orthopedic Surgeons (AAOS) guidelines first appeared in 2010 as a reference for the diagnosis of PJI [10] and include ESR and CRP as screening tests and aspiration of the joint when serologic markers are elevated. Then, in 2012, the Musculoskeletal Infection Society (MSIS) issued a consensus statement providing a concise definition of PJI [11]. Although the MSIS definition provides a standard for definitive retrospective diagnosis and research, its complexity makes it difficult to execute in daily clinical practice [12]. The ideal method of diagnosis would be a single test or panel that is highly sensitive, specific, and easy to interpret [13].
In recent years, research on PJI diagnosis focused on synovial fluid, as it represents the local environment of infection, so diagnosis should be more sensitive than that of serum markers. Studies revealed that various kinds of antimicrobial peptides and inflammatory cytokines, including interleukin (IL)-1, IL-6, IL-17A, interferon (IFN)-γ, and tumor necrosis factor (TNF)-α [14], could be biomarkers but are not specific to PJI. Other markers, such as CRP, α-defensin, and cathelicidin LL-37 [15], although more specific for PJI, require enzyme-linked immunosorbent assay (ELISA) and turbidimetric immunoassay as the detecting methods, which can take several hours or even a day to yield the final result.
Leukocyte esterase (LE) was first used in patients with urinary tract infections and can be measured by a dipstick technique to semi-quantify the count of WBCs (mainly neutrophils) in urine within several minutes [16]. The enzyme is present in granulocytes and secreted by neutrophils when activated during a bacterial infection [17]. Based on MSIS and AAOS guidelines, the increase of both the polymorphonuclear cell (PMN) percentage and WBC count in synovial fluid is considered an indicator of PJI. By measuring LE in synovial fluid by lysis of neutrophils and quantifying all intracellular and extracellular esterase activity, an estimation of the synovial fluid WBC count can be obtained [18]. The more neutrophils, the more intense the color changes (in this case, to purple) on the reagent strip [19]. The LE in synovial fluid thus has been detected by colorimetric strip tests through reactions producing a color change. Several studies have demonstrated that this quick, easy-to-perform, cost-effective measurement has satisfactory sensitivity and specificity [20, 21]. Problems including bloody samples, which interfere with the colorimetric change on the dipstick, and use of different cut-offs have led to a comparison between the LE test and CRP or α-defensin.
In view of the urgent clinical requirement for a convenient, precise, and prompt diagnosis of PJI, we conducted a systematic review to summarize studies related to LE and then used a meta-analysis to investigate the diagnostic accuracy of LE in PJI. In addition, we summarize the different attitudes toward LE and the problems currently unsolved in using it for the diagnosis of PJI. To the best of our knowledge, our study is the first meta-analysis to evaluate the clinical utility of synovial fluid LE in the diagnosis of PJI.
Materials and Methods
The methodological approach to evidence searching and synthesis used in this protocol was based on the Cochrane Collaboration's diagnostic test accuracy methods [22]. We conducted a literature search, screened the studies identified, and selected those that met the eligibility criteria. Data were extracted from the selected studies, and eligible studies were assessed by the revised Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) criteria [23]. Statistical analysis, evidence synthesis, and report compilation were carried out as described below. We adhered strictly to the standards of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) in reporting the findings of this review [24].
Search strategy
We searched the databases PubMed, Embase, Web of Science, the Cochrane Library, and Science Direct for entries recorded from the time of database inception to April 2016. Vocabulary and syntax were adjusted according to the database. We used key words or MeSH terms as follows: “periprosthetic joint infection” or “prosthesis-related infections” to represent the disease, “synovial fluid” or “fluid, synovial” to represent the source of our target biomarker, and “leukocyte esterase” as our target index.
Studies describing patients who had undergone hip, knee, or shoulder arthroplasty and investigating our target biomarker were included. Animal-only studies and studies that did not report data on the diagnostic performance of our target biomarker were excluded.
Study selection
Screening was performed in a two-step process: Title/abstract and full text. Two researchers reviewed the title and abstract of each paper independently to select those that were likely to reward further screening. In the initial stage, 10 articles reached acceptable agreement between the researchers. Disagreements were resolved by consensus between the two researchers. After full-text screening, a list of excluded studies with reasons for exclusion was created.
The inclusion criteria were as follows: Patients who had undergone knee, hip, or shoulder replacement; sufficient synovial fluid had to be aspirated for the study method; leukocyte esterase was detected in the synovial fluid; the diagnosis of PJI was confirmed by the Musculoskeletal Infection Society (MSIS) or AAOS guidelines or utilizing a combination of clinical data; and sufficient data could be extracted for the construction of a 2 × 2 contingency table. Exclusion criteria were: Unrelated biomarkers; insufficient data to calculate sensitivity and specificity; and case reports, commentaries, expert opinion, narrative reviews, and duplicates.
Quality assessment
The methodologic quality of the studies was appraised according to an adapted version of QUADAS-2, which consists of four key domains, namely patient selection, index test, reference standard, and flow and timing. Risk of bias in the four domains and the clinical applicability of the first three domains were assessed with signaling questions, which were answered “yes” for a low risk of bias/concerns, “no” for a high risk of bias/concerns, or “unclear.”
Data extraction
The following information was extracted: (1) Study characteristics, including author, year of publication, country, design, sample size, and number analyzed for each outcome; (2) population characteristics, including patients' sex and mean age; (3) intervention characteristics, including method of sampling, method of measuring, and threshold; (4) gold standard, including the test results based on the definition of PJI by MSIS; and (5) outcomes, including false/true positive, false/true negative from 2 × 2 table for diagnostic studies, sensitivity and specificity, positive likelihood ratio (LR+), and negative likelihood ratio (LR-). Data were extracted by a single reviewer from all outcomes data and verified by the other reviewer.
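The outcome measures listed above all derive from the four cells of a 2 × 2 table. The following is a minimal sketch of that arithmetic, not the software used in this review. The `TwoByTwo` container, the function name, and the example counts are hypothetical, and the 0.5 continuity correction applied when a cell is zero is an assumption (a common convention), not something stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cell counts of one study's 2 x 2 table (hypothetical container)."""
    tp: float  # index test positive, reference standard positive
    fp: float  # index test positive, reference standard negative
    fn: float  # index test negative, reference standard positive
    tn: float  # index test negative, reference standard negative


def accuracy_measures(table: TwoByTwo) -> dict:
    """Return sensitivity, specificity, LR+, LR-, and DOR for one table."""
    tp, fp, fn, tn = table.tp, table.fp, table.fn, table.tn
    # Assumption: add 0.5 to every cell if any cell is zero, so that the
    # ratios below stay finite.
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
        "dor": (tp * tn) / (fp * fn),
    }


# Illustrative counts only; they are not taken from any included study.
example = accuracy_measures(TwoByTwo(tp=45, fp=3, fn=5, tn=97))
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

These are per-study values; pooling across studies requires a meta-analytic model and is not shown here.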
Statistical analysis and heterogeneity assessment
For all the studies from which we constructed the 2 × 2 table, sensitivity, specificity, LR+, LR-, and the diagnostic odds ratio (DOR) were calculated using the bivariable model [18]. A summarized receiver operating characteristic (SROC) curve was constructed. In diagnostic tests, heterogeneity was commonly caused by a threshold effect. When a threshold effect existed, there was a negative correlation between sensitivity and specificity. Heterogeneity caused by the threshold effect was evaluated by the Spearman correlation coefficient. The percentage of the total variation across studies was described by the I² statistic, which indicated the existence of significant heterogeneity when the value exceeded 50%. (The value of I² ranges from 0% to 100%, with 0% implying no observed heterogeneity and larger values indicating increasing heterogeneity [25].) The random effects model was chosen because of the expected clinical and statistical heterogeneity among the studies [26]. For all effect estimates, p < 0.05 was considered statistically significant.
All analyses were conducted using Meta-DiSc software (version 1.4; Zamora et al., Madrid, Spain).
Results
Of the 72 articles found, 34 were left for further screening after excluding the duplicates. Twenty articles were excluded after reading the title and abstract, the reasons being inappropriate article type (reviews, comments, or letters). After examining the remaining set, 11 articles were chosen for systematic review [27–37]. Among them, the reference standard of one article was inappropriate [35], one had data analysis inappropriate for construction of a 2 × 2 table [36], and another used an automated reader instead of the naked eye to determine the color change of the strip [37]. Excluding these three studies, eight articles were left for meta-analysis. The study selection process is illustrated in Figure 1.

Figure 1. Flow chart of selection process for eligible studies.
A total of 1,011 patients who had undergone hip or knee replacement were included in the meta-analysis. Seven studies [27–32, 34] were conducted prospectively, taking the synovial fluid samples prior to any clinical treatment, whereas one [33] was conducted retrospectively, and the time of sample harvest was not clarified. Five studies were from the USA, the other three from Germany, Italy, and China. Of the accepted articles, seven were in English and one in Chinese. Among the 1,011 patients, the mean age ranged from 57.2 to 69.1 years, and the proportion of males ranged from 42.8% to 60.8%. The reference standard was fulfilled in 239 patients, and 772 patients had negative findings. The optimal cut-off value was specified in all studies. In three studies, ++ was used to diagnose PJI, whereas in another three studies, both ++ and + were considered positive. Two other articles provided the results of both these cut-off values (++ or ++/+). In five studies, LE strips were purchased from Roche Diagnostics, although the subtypes were different, whereas in one study, the strip was purchased from Dirui. The other two studies did not specify the strip source. In addition, the reference standard was not limited to MSIS and AAOS. One study that used the reference standard of its own institution was included in the meta-analysis because that standard was considered to have details similar to the MSIS guidelines. Synovial fluid samples contaminated with blood were excluded in three studies, and in one further study, both readable and unreadable samples were analyzed, whereas the remaining studies reported that bloody samples were centrifuged and the supernatant liquid dropped on the strip, which provided readable results. Detailed characteristics of the individual studies are summarized in Table 1.
Bloody samples excluded.
Similar to MSIS guidelines.
Both ++ and ++/+ were used as cut-off value for sensitivity, specificity, and positive and negative predictive value.
Both bloody and non-bloody samples were analyzed for sensitivity, specificity, and positive and negative predictive value, but only non-bloody samples were included in the meta-analysis.
ELISA = enzyme-linked immunosorbent assay; P = prospective study; R = retrospective study; UA = unavailable.
A graphic summary of the methodologic assessment based on the QUADAS-2 for the eight studies is shown in Figure 2. All the studies fulfilled the requirements of an acceptable reference standard: Partial verification bias, differential verification bias, and incorporation bias were avoided; detailed description of the index test was available, blinding of investigators to the reference was clear, uninterpretable results were reported, and withdrawals were explained.

Figure 2. Quality assessment of included studies using QUADAS-2 tool criteria.
For the accepted studies, data extracted for the construction of a 2 × 2 table are provided in Table 2. The pooled sensitivity was 0.90 (95% confidence interval [CI] 0.76–0.96), and the pooled specificity was 0.97 (95% CI 0.95–0.98; Fig. 3). The pooled LR+ and LR- were 30.90 (95% CI 19.81–48.19) and 0.10 (95% CI 0.04–0.26), respectively. The area under the SROC was 0.98 (95% CI 0.96–0.99; Fig. 4), and the DOR was 310.76 (95% CI 103.86–929.88). The results are summarized in Table 3. The post-test probabilities based on various pre-test probabilities were illustrated using a Fagan nomogram (Fig. 5).
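The Fagan nomogram is a graphical form of a simple odds calculation: pre-test odds multiplied by the likelihood ratio give post-test odds. The sketch below applies that calculation to the pooled LR+ (30.90) and LR- (0.10) reported above, using the 24% pre-test probability given for the nomogram. The function name is hypothetical, and the sketch is an illustration, not the routine used to draw the figure.

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


PRE_TEST = 0.24   # pre-test probability used for the nomogram
LR_POS = 30.90    # pooled positive likelihood ratio
LR_NEG = 0.10     # pooled negative likelihood ratio

after_positive = post_test_probability(PRE_TEST, LR_POS)
after_negative = post_test_probability(PRE_TEST, LR_NEG)
print(f"Post-test probability after a positive LE result: {after_positive:.3f}")
print(f"Post-test probability after a negative LE result: {after_negative:.3f}")
```

Under these inputs, a positive strip result raises the probability of PJI to roughly 0.91, and a negative result lowers it to roughly 0.03.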

Figure 3. Pooled sensitivity and specificity of LE in diagnosis of PJI.

Figure 4. Summary receiver operating characteristic plot for the included studies with the associated 95% confidence region and the 95% prediction region.

Figure 5. Fagan's nomogram for the calculation of post-test probabilities with a fixed pre-test probability of 24%.
Threshold was ++/+. Data in parentheses were summarized with the cut-off value of ++/+.
Synovial fluid samples with blood contamination were excluded as nonreadable.
Synovial fluid with blood contamination were readable after centrifugation.
FN = false negative; FP = false positive; TN = true negative; TP = true positive.
Studies conducted by Parvizi et al. [27] and Tischler et al. [31] provided data for 2 × 2 table with two cut-off values. These studies were used twice in sub-groups, thus making the number of each sub-group study five.
DOR = diagnostic odds ratios; LR− = negative likelihood ratio; LR+ = positive likelihood ratio; Sen = sensitivity; Spe = specificity; SROC = summarized receiver operating curve.
The between-study variability (heterogeneity) was high for sensitivity, with an I² of 80.45%. The bivariable analysis revealed that the heterogeneity was not explained by the threshold effect, as variations in sensitivity and specificity were not related to differences in the cut-off value of LE in the accepted studies.
Among these studies, sub-groups were formed on the basis of cut-off value, sample inclusion criteria, and patient number to explore further the heterogeneity of the study. For sub-groups based on cut-off, five studies [27, 30–32, 34] provided data based on the cut-off of ++, and five [27–29, 31, 33] provided data based on the cut-off of ++/+. Because Parvizi et al. [27] and Tischler et al. [31] conducted their studies with both of these cut-off values, the number of studies in each sub-group was five. The pooled sensitivity and specificity were 0.86 (95% CI 0.69–0.95) and 0.97 (95% CI 0.93–0.99) in studies with ++ cut-off and 0.92 (95% CI 0.76–0.98) and 0.93 (95% CI 0.83–0.97) in studies with a ++/+ cut-off (Table 3). For sub-groups based on exclusion of bloody samples, four studies [27, 28, 30, 33] excluded such samples, whereas the other four groups of investigators [29, 31, 32, 34] centrifuged the samples to precipitate red blood cells and obtain clear synovial fluid for the LE strip test. The pooled sensitivity and specificity were 0.96 (95% CI 0.60–1.00) and 0.99 (95% CI 0.93–1.00) in studies excluding bloody samples and 0.85 (95% CI 0.67–0.94) and 0.96 (95% CI 0.93–0.98) in studies in which the bloody samples were centrifuged (Table 3). For sub-groups based on patient number, four studies [27–29, 31] had samples larger than 100, whereas in the other four [30, 32–34], the sample was smaller than 100. The pooled sensitivity and specificity were 0.89 (95% CI 0.70–0.97) and 0.96 (95% CI 0.91–0.98), respectively, in studies with more than 100 patients and 0.91 (95% CI 0.64–0.98) and 0.97 (95% CI 0.96–0.99) in studies with fewer than 100 patients (Table 3). Because the I² of sensitivity in the sub-groups excluding or including bloody samples decreased to 21.8% and 15.6%, respectively, sample inclusion criteria might account for the heterogeneity in sensitivity. Deeks' funnel plot asymmetry test suggested the absence of publication bias (p = 0.144) (Fig. 6).

Figure 6. Deeks' funnel plot for the evaluation of publication bias.
Discussion
Peri-prosthetic joint infection is a serious complication associated with high cost and a significant reduction in patient quality of life [1]. The reasons for the diagnostic difficulty include the absence of specific clinical signs and symptoms, the relative lack of accurate laboratory tests, and low isolation rate of pathogens because of prior therapy and formation of biofilms [1,38]. The MSIS recently responded to this diagnostic difficulty by developing a definition for PJI [39]. According to MSIS, the diagnosis of PJI requires either of two major criteria (sinus tract communication with a prosthesis or a pathogen isolated by culture from two separate fluid samples) or four of six minor criteria (elevated ESR, CRP, WBC, and percentage of PMN; the presence of purulence; and greater than five neutrophils per high-power field on frozen section) [40]. The AAOS guideline is similar, including the following four thresholds: ESR >30 mm/h, serum CRP concentration >10 mg/L, synovial WBC count >1,760/mcL for chronic infection or 10,700/mcL for acute infection, and synovial neutrophil differential percentage >73% for chronic infection or >89% for acute infection. A PJI should be diagnosed if three of the four threshold values are abnormal [10, 41].
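The four-threshold rule attributed to the AAOS guideline in the paragraph above can be written as a small counting function. The sketch below encodes only what that paragraph states and is for illustration, not clinical use. The `LabPanel` container and function names are hypothetical, and reading "three of the four" as "at least three" is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabPanel:
    """Laboratory values for one patient (hypothetical container)."""
    esr_mm_per_h: float
    crp_mg_per_l: float
    synovial_wbc_per_mcl: float
    synovial_pmn_percent: float


def abnormal_count(panel: LabPanel, acute: bool) -> int:
    """Count how many of the four thresholds quoted in the text are exceeded."""
    wbc_cutoff = 10_700 if acute else 1_760   # cells per microliter
    pmn_cutoff = 89 if acute else 73          # percent neutrophils
    checks = (
        panel.esr_mm_per_h > 30,
        panel.crp_mg_per_l > 10,
        panel.synovial_wbc_per_mcl > wbc_cutoff,
        panel.synovial_pmn_percent > pmn_cutoff,
    )
    return sum(checks)


def meets_threshold_rule(panel: LabPanel, acute: bool = False) -> bool:
    """Assumption: the rule is satisfied when at least three values are abnormal."""
    return abnormal_count(panel, acute) >= 3


# Illustrative values only; they do not describe a real patient.
panel = LabPanel(esr_mm_per_h=45, crp_mg_per_l=25,
                 synovial_wbc_per_mcl=3_000, synovial_pmn_percent=60)
print(meets_threshold_rule(panel, acute=False))
print(meets_threshold_rule(panel, acute=True))
```

The same panel can satisfy the rule under the chronic cut-offs and fail it under the acute ones, which shows how much the classification depends on the clinical setting.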
Although clinically useful, these definitions are complex and time consuming to apply, given the subjective interpretation of the frozen-section histology and the delay in obtaining several culture results. By contrast, synovial fluid aspirated from patients who have undergone total joint replacement provides researchers with a promising source of diagnostic information for PJI, as host proteins with direct antimicrobial activity may play an important role in the response to pathogen elimination [42, 43]. The promise of synovial fluid biomarkers to diagnose PJI has been reported during the past few years; however, the reference standard in some of these studies was not based on the latest MSIS or AAOS guidelines, making comprehensive analysis of these studies more challenging. According to our search results, no systematic review or meta-analysis of synovial fluid LE for the diagnosis of PJI has been published so far, and we considered it necessary to fill this gap.
Based on the search and evaluation results, we narrowed the scope of the current study to LE, as the test is easy to perform and can provide a final result within five minutes, which is beneficial for clinicians intending to make a final decision during arthroplasty surgery. After quality assessment, eight articles qualified for our meta-analysis. We found that LE showed high sensitivity and specificity for the diagnosis of PJI. Synovial fluid LE has a high (AUC >0.9) diagnostic ability to identify PJI based on the suggested guidelines for the interpretation of the area under the SROC (AUSROC). As a single indicator of diagnostic test performance, DOR is independent of disease prevalence. The DOR of our pooled analysis was 310.76, indicating a high diagnostic value of LE in PJI diagnosis. Likelihood ratios are often used in clinical practice, as they show how a particular test result can predict the risk of disease and indicate the extent to which a given test would raise or lower the probability of the patient having the disease. With a pooled LR+ of 30.90 and LR- of 0.10, LE is a biomarker of high diagnostic value in PJI. The disparity noticed in the accepted studies may result from the sample inclusion criteria. Studies excluding bloody samples tend to overestimate the effect size. The pooled DOR of the four studies excluding bloody samples was 185.89, whereas the DOR of the other four was 122.27. Therefore, caution is necessary when interpreting the results.
Wetters et al. [37] used synovial fluid WBC or positive culture as the reference standard of PJI and generated different sensitivity and specificity values. Because they did not rely on the MSIS or AAOS guidelines, the data were not included in our meta-analysis. Although Nelson et al. [35] used MSIS as the reference standard, all samples were from shoulder synovial fluid, and the PJI diagnosis was divided into three sub-groups: True PJI, potential PJI, and negative PJI. In addition, bloody samples were considered as indeterminate or positive results, which made the 2 × 2 table complicated to construct based on a single standard. Shafafy et al. [36] used semi-quantitative readings of the strip, which provided a different cut-off value from that of the other studies. Those investigators also provided two 2 × 2 tables based on different cut-off values from the semi-quantitative readings. All these studies also mentioned the influence of blood or debris in the samples; some authors excluded samples contaminated with blood, whereas others considered bloody samples to be positive.
There are several limitations in our study. First, despite an in-depth search of several electronic databases, there were only eight articles appropriate for our meta-analysis. This small number of studies made meta-regression impractical, so the heterogeneity of the sensitivity results could be explored only by sub-group analysis. Second, the ideal cut-off value for the synovial fluid LE strip test could not be determined, as the raw data were not provided in any of the published articles. Because there is still no standard cut-off value for the diagnosis worldwide, different laboratories used mainly two cut-off values (++ or ++/+) to diagnose PJI, and these also were used in our meta-analysis.
Although the number of studies included in our meta-analysis is small, all the studies illustrate the high sensitivity and specificity of LE for the diagnosis of PJI. This systematic review creates a foundation for evidence-based guidance on the diagnostic performance of synovial fluid LE and provides recommendations to clinicians for diagnosing PJI accurately and efficiently. Considering the low cost and high accuracy of the LE test, serious consideration should be given to including it as a standard tool for diagnosing PJI, especially for exclusion of PJI.
Footnotes
Acknowledgments
We thank Chen Chen and Shang He for helpful discussions.
Author Disclosure Statement
All authors declare that they have no financial or personal relations with other people or organizations that could have influenced our work inappropriately. There is no professional or other personal interest of any nature or kind in any product, service, or company that could be construed as influencing the position presented in or the review of the manuscript.
Informed consent was obtained from all participants in the study.
