Error Matrix Tool to Overview the Validity of Evidence on Radix Sophorae flavescentis for Chronic Hepatitis B

Abstract

Objectives:

To introduce a conceptualized visual error matrix tool to overview the validity of evidence by taking Radix Sophorae flavescentis for chronic hepatitis B as an example and to propose recommendations for improving clinical trial design and evidence quality.

Methods:

The randomized clinical trials and reviews were collected during the conduct of a Cochrane systematic review. The authors used a visual error matrix tool to overview the evidence validity by looking at systematic, random, and design error risks. Systematic errors were measured by the type of evidence. Random errors were expressed by the standard error (SE). Design errors were assessed on the priority of outcome measures and the adequacy of nine design components. Three-dimensional error matrices on benefits and harms were then constructed.

Results:

The authors included 6 meta-analyses and 28 randomized clinical trials. In terms of systematic errors, all reviews were of critically low quality, and all included randomized trials were assessed at high risk of bias. On this systematic error level, the authors found that there was substantial risk of random errors regarding all-cause mortality (SE 0.36), moderate risk regarding serious adverse events (SE 0.22), substantial risk regarding nonserious adverse events (SE 0.35), and small to moderate risk regarding surrogate outcomes such as detectable hepatitis B e-antigen (HBeAg) and detectable hepatitis B virus (HBV)-DNA (SE 0.16 and 0.21). No study reported results on quality of life, hepatitis B-related mortality, or hepatitis B-related morbidity. The design error risks were mainly misuse of outcomes (14/34), inadequate selection of participants (5/34), inadequate description of intervention (11/34) and control (9/34), single-center setting (33/34), and unclear study objective regarding superiority, equivalence, or noninferiority.

Conclusion:

The current evidence on Radix S. flavescentis for chronic hepatitis B showed high risks of systematic errors, moderate or high risks of random errors, and high risks of design errors. These findings suggest that more randomized trials at minimum risks of all three errors are needed to assess the benefits and harms of Radix S. flavescentis for chronic hepatitis B. The visual error matrix tool provides an overview of the reliability of evidence and may assist in design and conduct of future randomized trials.

Introduction

Chronic hepatitis B is a major public health issue.¹ In 2015, ∼257 million people around the world, or 3.5% of the world's population, were infected with hepatitis B virus (HBV).¹ About 887,000 people may have died because of complications of chronic hepatitis B, such as cirrhosis, liver failure, or hepatocellular carcinoma.¹ Radix Sophorae flavescentis (Chinese name: Kushen) is the dried root of the shrub Sophora flavescens Aiton. It has been claimed that Radix S. flavescentis has antibacterial, antiviral, anti-inflammatory, antitumor, and antipyretic effects, and it is one of the commonly used Traditional Chinese medicinal remedies for chronic hepatitis B.^2,3 However, the benefits and harms of Radix S. flavescentis for chronic hepatitis B remain unclear, as it has never been assessed in a systematic review with rigorous methodology. In the process of conducting the Cochrane systematic review of Radix S. flavescentis for chronic hepatitis B,⁴ the authors identified a large number of trials when searching scientific databases. Before the data from these trials are incorporated, the validity of the evidence should be considered so that the direction and the size of the intervention effect can be assessed reliably.^5–10

Research findings are less likely to be true if trials are small,^11,12 if patients are not randomized,^8,13,14 if allocation concealment is inadequate,¹⁵ if there is lack of patient and observer blinding,^16–18 and if industries or others with conflicts of interests are involved.^19,20 Empirical evidence also showed that inappropriate controls,⁹ abuse of surrogate outcomes,^8,9,21 noninferiority trial design,^9,22 single-center setting,^23–25 and poor reporting^9,26,27 may lead to overestimation of benefits and underestimation of harms. The whole array of factors that may jeopardize the reliability of clinical research is theoretically categorized into three dimensions: systematic errors, random errors, and design errors.^28,29 Accordingly, Keus et al. have conceptualized a visual error matrix tool to overview the studies and their data from the perspective of these three dimensions of error.²⁹ The tool provides an overview of the validity of the evidence at a glance and may assist in making decisions about medical interventions and their future assessment.²⁹

Based on their Cochrane systematic review literature searches,⁴ the authors used the error matrix tool to overview the validity of clinical evidence on Radix S. flavescentis for chronic hepatitis B. Furthermore, they extended the error matrix method by illustrating detailed design error components rather than assessing design errors only through the priority of outcome measures. In addition, they tried to give some recommendations on how to design high-quality randomized clinical trials on this topic.

Materials and Methods

Data sources

The data were based on the Cochrane systematic review that is under preparation.⁴ Randomized clinical trials eligible for this Cochrane systematic review were included and analyzed. The authors also included previously published reviews with meta-analyses that had similar inclusion criteria for trials.

Participants

Participants of any sex and age, diagnosed with chronic hepatitis B as defined according to guidelines or by trialists, were eligible.

Interventions

Radix S. flavescentis or its extracts (e.g., matrine, oxymatrine) at any dose, form, and regimen were compared with placebo or no intervention. The authors excluded polyherbal blends containing Radix S. flavescentis. They allowed cointerventions if provided equally to all intervention groups of a trial. The primary outcomes were all-cause mortality, serious adverse events, and health-related quality of life. The secondary outcomes were hepatitis B-related mortality, hepatitis B-related morbidity, and adverse events considered to be not serious. The exploratory outcomes were the proportion of participants with detectable HBV-DNA and the proportion of participants with detectable hepatitis B e-antigen (HBeAg).

More information about data searching and selection of reviews with meta-analyses and randomized clinical trials can be found in the protocol.⁴

Systematic errors assessment

Systematic error (i.e., bias) is a deviation from the truth in results or inferences.³⁰ The risk of systematic errors depends on the type of clinical research design (systematic review, randomized trial, cohort study, case–control study, case series, case report, and expert opinion) (Table 1).^9,29 If well designed and conducted, a meta-analysis of randomized trials at low risk of bias may have higher internal validity, while expert opinion is often biased by the authors' subjective judgment and lacks control of confounding factors.^6,31 However, randomized trials and meta-analyses are not always the preferred type of evidence; this depends on the type of clinical question. When assessing the harms of an intervention, usually one should consider quasi-randomized studies, controlled clinical studies, and other observational studies, because adverse events are rarely reported in randomized trials, and such observational studies may provide information on late occurring adverse events. The authors therefore showed the different types of evidence in the error matrix figure. The risk of systematic errors differs not only among different types of evidence but also within each type of evidence. In addition, they used the Assessment of Multiple Systematic Reviews (AMSTAR) tool³² to assess and illustrate systematic errors for each review with meta-analysis, and the Cochrane Collaboration's risk of bias tool³⁰ for each randomized clinical trial. The results are shown in separate tables and figures and are reported descriptively.

Table 1.

Types of Evidence Regarding Systematic Errors

Category	Studies
a	Meta-analysis of randomized trials with low RoB
b	Randomized trial with low risk of bias
c	Meta-analysis of all randomized trials
d	Randomized trial with high risk of bias
e	Meta-analysis of cohort studies
f	Cohort study
g	Meta-analysis of case–control studies
h	Case–control study
i	Case-series
j	Expert opinion

RoB, risk of bias.

For better understanding, the authors differentiate the concepts “systematic error,” “quality of evidence,” and “methodological quality.” “Quality of evidence” or “certainty of evidence” reflects the confidence in an estimate of the effect and can be evaluated with the Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach; systematic error (bias) is just one of the five factors that could jeopardize the “quality of evidence.” “Methodological quality” concerns flaws in the design, conduct, analysis, and reporting of different research types, which can be evaluated with the corresponding tools (e.g., AMSTAR³² or the risk of bias tool³⁰). “Systematic error” is based on the methodological quality across different types of research (e.g., meta-analyses, randomized clinical trials) and within each type of research (e.g., AMSTAR³² for meta-analyses, risk of bias tool³⁰ for randomized clinical trials).

Random errors assessment

Random error (i.e., play of chance) is present when findings are based on sparse data, and may also be caused by multiplicity.^29,30 The authors used standard error (SE) as a measure of uncertainty, which indicates the closeness of the sample mean to the true population mean, to quantify and compare the risk of random errors between different studies.²⁹ They calculated the SE of the logarithm of the risk ratio for dichotomous data and planned to calculate the SE of the mean difference for continuous data, using the algorithms according to the Cochrane Collaboration's Handbook.³⁰ They performed the assessment of random error risk as follows: SE 0.0–0.1 = ignorable risk of random error; SE 0.1–0.2 = small risk of random error; SE 0.2–0.3 = moderate risk of random error; SE 0.3–0.5 = substantial risk of random error; SE over 0.5 = high risk of random error.²⁹
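The calculation and banding described above can be sketched as follows. This is a minimal illustration, not the authors' own code: it assumes the standard formula for the SE of the log risk ratio from a 2×2 table, it treats the upper bound of each band as inclusive (the published bands share their boundary values, so this is an assumption), and the event counts in the usage line are hypothetical.

```python
import math


def se_log_risk_ratio(events_exp, n_exp, events_ctrl, n_ctrl):
    """Standard error of ln(RR) from events and group sizes in two groups."""
    if events_exp == 0 or events_ctrl == 0:
        # With zero events in a group the SE is undefined without a correction.
        raise ValueError("zero events in one group; SE of ln(RR) is undefined")
    return math.sqrt(1 / events_exp - 1 / n_exp + 1 / events_ctrl - 1 / n_ctrl)


def random_error_risk(se):
    """Map an SE to a risk band (upper bounds inclusive: an assumption)."""
    if se < 0:
        raise ValueError("SE cannot be negative")
    if se <= 0.1:
        return "ignorable"
    if se <= 0.2:
        return "small"
    if se <= 0.3:
        return "moderate"
    if se <= 0.5:
        return "substantial"
    return "high"


# Hypothetical trial: 10/100 events with the intervention, 20/100 with control.
se = se_log_risk_ratio(10, 100, 20, 100)
print(round(se, 2), random_error_risk(se))  # prints: 0.36 substantial
```

Treating the upper bounds as inclusive means an SE of exactly 0.3 is read as moderate, and "SE over 0.5" is read as strictly greater than 0.5.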

Design errors assessment

Design errors (i.e., erroneous selection of patients, doses of medication, comparators, analyses, outcomes, and so on) refer to wrong use of design components that may affect the evidence validity.^9,29 The following questions should be taken into consideration: Is the initial question valid? Is there abuse of noninferiority trials? Is the selection of participants adequate to answer the initial question? Is the dose, form, length, etc. of the intervention adequate? Is there misuse of surrogate outcomes rather than patient-centered outcomes?^9,28,29 Based on the previous studies, the authors assessed the adequacy of the following nine design components to overview the design error risks: outcomes; participants; experimental intervention; control intervention; clinical setting; exploratory or pragmatic goal; superiority, equivalence, or noninferiority design; trial structure; and unit of analysis.^9,28,29 Table 2 presents the definition of “inadequacy” for each design component. Where insufficient data were obtained for a design component even after contacting the trial authors, that component was assessed as inadequate.

Table 2.

Design Risk Components and the Definitions of Inadequacy of Each Component

Different design components that may introduce errors	The definition of inadequacy of each design component
Outcome measures	There was abuse of surrogate and composite outcomes
Participants	Unclear, too strict, or too broad diagnostic and inclusion criteria were used to identify participants
Experimental intervention	The dose, form, length, etc. of the intervention was inappropriate or unclear
Control intervention	The dose, form, length, etc. of the control intervention was inappropriate or unclear
Clinical setting	The trial was conducted in a single clinical center rather than in multiple centers
Goal (exploratory/pragmatic)	The trial was exploratory rather than pragmatic when considering the translation into practice
Objective (superior/equivalent/noninferior)	The trial had a noninferiority design, which often allows substantial harm to participants and may be unethical
Trial structure	The trial was a cross-over trial rather than a parallel-group trial
Unit of analysis	The unit of analysis differed from the unit of allocation, and no appropriate statistical method was used to account for this

Among the many variables that should be considered, the priority of outcomes lies at the core of clinical research. The authors therefore focused on outcomes from a patient's perspective and included them in the error matrix figure. Outcomes can be divided into three categories according to the GRADE classifications.³³ Primary outcomes are central in deciding the use of one intervention over another, concurring with the category of “critical for decision-making” outcomes. Secondary outcomes are additional outcome measures, referring to “important, but not critical for decision-making” outcomes. Exploratory outcomes belong to the “not important for decision-making” category. From the patients' perspective, the authors ordered the predefined outcomes on a three-category scale with primary outcomes at the highest level and exploratory outcomes at the lowest level.³⁰ That is, all-cause mortality, which is more important to participants, is at a higher level than the decrease of detectable HBeAg. According to GRADE, the assessment of the outcomes may vary according to the clinical question and the users of the information.³³

N.L. performed literature searching. Four authors in pairs (N.L., X.-H.L., M.Y., and L.-D.F.) independently selected the literature, extracted data from each study, and contacted authors for the missing data. Two authors in pair (S.-S.M. and C.-L.L.) independently assessed the risk of bias and the risk of design error for each study and calculated the random error. The authors contacted J.-P.L. to arbitrate the disagreements, before proceeding with the analysis.

Construction of error matrix

The authors used Excel to construct the three-dimensional matrix on “benefit” and “harm” separately. Outcomes with results in favor of Radix S. flavescentis were exhibited in the “benefit” matrix, and those in favor of the control intervention were shown in the “harm” matrix. Step I classifies the identified studies according to the type of evidence (shown on the y axis). Step II orders the studies according to the random errors (shown on the x axis).²⁹ Step III orders the studies according to the design errors based on the priority of outcomes (shown on the z axis). To differentiate outcomes within the same outcome category (for example, primary outcomes), they used different colors for the outcomes. The “Manhattan-like” matrix (bar chart) was then constructed accordingly, showing the results from different types of evidence together with the random error estimate and the outcomes evaluated.
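The three steps can also be expressed as a small data structure. The sketch below is illustrative only and is not the Excel workbook the authors used: the `Bar` class, the `build_matrices` function, and the numeric priority ranks are hypothetical names and choices. The example records reuse review-level SEs from Table 4; placing the reviews in Table 1 category "c" and assigning each record to the benefit or harm panel are assumptions made here that follow the description of the figure.

```python
from dataclasses import dataclass

# y axis: Table 1 categories, "a" carrying the lowest risk of systematic error.
EVIDENCE_ORDER = "abcdefghij"
# z axis: priority of the outcome (higher rank = more important to patients).
PRIORITY_RANK = {"primary": 3, "secondary": 2, "exploratory": 1}
OUTCOME_PRIORITY = {
    "all-cause mortality": "primary",
    "serious adverse events": "primary",
    "nonserious adverse events": "secondary",
    "detectable HBV-DNA": "exploratory",
    "detectable HBeAg": "exploratory",
}


@dataclass(frozen=True)
class Bar:
    study: str
    outcome: str
    evidence: str   # y axis (step I)
    se: float       # x axis (step II)
    priority: int   # z axis (step III)


def build_matrices(records):
    """Split records into benefit/harm panels and order the bars by y, x, z."""
    panels = {"benefit": [], "harm": []}
    for study, evidence, outcome, se, favors in records:
        panel = "benefit" if favors == "intervention" else "harm"
        rank = PRIORITY_RANK[OUTCOME_PRIORITY[outcome]]
        panels[panel].append(Bar(study, outcome, evidence, se, rank))
    for bars in panels.values():
        bars.sort(key=lambda b: (EVIDENCE_ORDER.index(b.evidence), b.se, -b.priority))
    return panels


records = [
    ("Wu (2011)", "c", "all-cause mortality", 0.36, "intervention"),
    ("Wang (2017)", "c", "serious adverse events", 0.22, "intervention"),
    ("He (2013)", "c", "nonserious adverse events", 0.35, "control"),
    ("Qi (2013)", "c", "detectable HBV-DNA", 0.21, "intervention"),
    ("He (2013)", "c", "detectable HBeAg", 0.16, "intervention"),
]
panels = build_matrices(records)
for name, bars in panels.items():
    print(name, [(b.study, b.outcome, b.se) for b in bars])
```

Each `Bar` then corresponds to one column of the "Manhattan-like" chart; any plotting library or spreadsheet can draw the bars from these coordinates.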

Results

The authors included 6 reviews with meta-analyses^34–39 and 28 randomized clinical trials.^{40–67} Besides, 426 references, considered to be potential randomized clinical trials, were on the waiting list for further analysis, as the authors could not obtain any response from the trial authors about their randomization method.⁶⁸

Validity of meta-analyses

Systematic errors

All six reviews with meta-analyses were appraised as critically low quality when using the seven critical domains for measurement as recommended by AMSTAR³² (Table 3). Only one review³⁸ registered the protocol before the review was undertaken and had no significant deviations from the protocol. None of the reviews used a fully comprehensive literature search. Three reviews^34–36 had unjustified publication restrictions on language or searching date, while the other three^37–39 lacked comprehensive searching for gray literature. None of the reviews provided a list of all potentially relevant studies that were read in full text but excluded, with reasons for exclusion. Two reviews^34,38 used a satisfactory technique for assessing the risk of bias, while the others did not assess risk of bias or only focused on certain domains. When this was so, it was usually unclear what led to the selection of these domains. Only one review³⁸ performed the meta-analysis using an appropriate method. All the reviews except one³⁶ discussed the limitation of risk of bias when interpreting the results. Two reviews^34,38 performed publication bias investigation and discussed the potential impact on results, while the others did not conduct publication bias analysis or just displayed the funnel plot without any explanation.

Table 3.

Critical Appraisal for Reviews with Meta-Analyses Using Assessment of Multiple Systematic Reviews Tool

	Wu (2011) ³⁹	He (2013) ³⁷	Qi (2013) ³⁵	Jiang (2013) ³⁶	Song (2016) ³⁸	Wang (2017) ³⁴
(1) Did the research questions and inclusion criteria for the review include the components of PICO?	Yes	Yes	Yes	Yes	Yes	Yes
(2) Did the report of the review contain an explicit statement that the review methods were established before the conduct of the review and did the report justify any significant deviations from the protocol?	No	No	No	No	Yes	No
(3) Did the review authors explain their selection of the study designs for inclusion in the review?	No	No	No	No	No	No
(4) Did the review authors use a comprehensive literature search strategy?	Partial yes^a	Partial yes^a	No	No	Partial yes^a	No
(5) Did the review authors perform study selection in duplicate?	No	No	No	Yes	Yes	No
(6) Did the review authors perform data extraction in duplicate?	Yes	Yes	Yes	Yes	Yes	Yes
(7) Did the review authors provide a list of excluded studies and justify the exclusions?	No	No	No	No	No	No
(8) Did the review authors describe the included studies in adequate detail?	Partial yes^b	Partial yes^b	No	No	Partial yes^b	Partial yes^b
(9) Did the review authors use a satisfactory technique for assessing the RoB in individual studies that were included in the review?	No	Partial yes^c	Partial yes^c	No	Yes	Yes
(10) Did the review authors report on the sources of funding for the studies included in the review?	No	No	No	No	No	No
(11) If meta-analysis was performed, did the review authors use appropriate methods for statistical combination of results?	No	No	No	No	Yes	No
(12) If meta-analysis was performed, did the review authors assess the potential impact of RoB in individual studies on the results of the meta-analysis or other evidence synthesis?	No	No	No	No	No	No
(13) Did the review authors account for RoB in individual studies when interpreting/discussing the results of the review?	Yes	Yes	Yes	No	Yes	Yes
(14) Did the review authors provide a satisfactory explanation for, and discussion of, any heterogeneity observed in the results of the review?	No	No	No	No	Yes	Yes
(15) If they performed quantitative synthesis did the review authors carry out an adequate investigation of publication bias (small study bias) and discuss its likely impact on the results of the review?	Yes	No	No	No	No	Yes
(16) Did the review authors report any potential sources of conflict of interest, including any funding they received for conducting the review?	No	Yes	No	No	Yes	Yes

^a The review authors searched at least two databases, provided key words and/or a search strategy, and justified publication restrictions, but lacked comprehensive searching for gray literature.

^b The review authors described populations, interventions, comparators, outcomes, and research designs of the included studies, but not in adequate detail.

^c The review authors assessed the RoB from allocation concealment and blinding of participants and assessors, but did not assess the allocation sequence generation method and/or selective reporting.
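The overall appraisal can be reproduced from Table 3 with a short sketch. It applies a simplified reading of the AMSTAR 2 rating scheme to the seven critical items (2, 4, 7, 9, 11, 13, and 15): more than one critical flaw gives "critically low" and exactly one gives "low". Counting only a "No" answer as a critical flaw, so that "Partial yes" is not counted, is an assumption made here, and noncritical items are not modeled.

```python
CRITICAL_ITEMS = (2, 4, 7, 9, 11, 13, 15)

# Answers to the critical items only, transcribed from Table 3.
TABLE_3 = {
    "Wu (2011)":    {2: "No", 4: "Partial yes", 7: "No", 9: "No", 11: "No", 13: "Yes", 15: "Yes"},
    "He (2013)":    {2: "No", 4: "Partial yes", 7: "No", 9: "Partial yes", 11: "No", 13: "Yes", 15: "No"},
    "Qi (2013)":    {2: "No", 4: "No", 7: "No", 9: "Partial yes", 11: "No", 13: "Yes", 15: "No"},
    "Jiang (2013)": {2: "No", 4: "No", 7: "No", 9: "No", 11: "No", 13: "No", 15: "No"},
    "Song (2016)":  {2: "Yes", 4: "Partial yes", 7: "No", 9: "Yes", 11: "Yes", 13: "Yes", 15: "No"},
    "Wang (2017)":  {2: "No", 4: "No", 7: "No", 9: "Yes", 11: "No", 13: "Yes", 15: "Yes"},
}


def critical_flaws(answers):
    """Critical items answered "No" (assumption: "Partial yes" is not a flaw)."""
    return [item for item in CRITICAL_ITEMS if answers[item] == "No"]


def overall_confidence(answers):
    """Simplified rating based on the number of critical flaws only."""
    n_flaws = len(critical_flaws(answers))
    if n_flaws > 1:
        return "critically low"
    if n_flaws == 1:
        return "low"
    return "moderate or high (depends on noncritical items, not modeled here)"


ratings = {review: overall_confidence(ans) for review, ans in TABLE_3.items()}
for review, rating in ratings.items():
    print(review, len(critical_flaws(TABLE_3[review])), rating)
```

Under these assumptions every review has at least two critical flaws (Song 2016 has the fewest, on items 7 and 15), which agrees with the appraisal of all six reviews as critically low quality.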

Random errors

On this systematic error level, the authors found that there was substantial risk of random errors on the outcome all-cause mortality (SE 0.36), moderate risk on serious adverse events (SE 0.22), substantial risk on nonserious adverse events (SE 0.35), and small to moderate risk on surrogate outcomes such as detectable HBeAg and detectable HBV-DNA (SE 0.16 and 0.21) (Table 4). They found no data on quality of life, hepatitis B-related mortality, or morbidity.

Table 4.

Ordering of Evidence According to Levels of Evidence and Standard Error for All Available Outcome Measures of Each Study

Study ID, first author (year)	Priority of outcomes
	Primary outcomes			Secondary outcomes			Exploratory outcomes
	All-cause mortality	Health-related quality of life	Serious adverse events	Hepatitis B-related mortality	Hepatitis B-related morbidity	Nonserious adverse events	Detectable HBV-DNA	Detectable HBeAg
He (2013)^37,a	N	N	N	N	N	0.35	0.20	0.16
Jiang (2013)^36,a	N	N	N	N	N	N	0.06	0.05
Qi (2013)^35,a	N	N	N	N	N	N	0.21	0.09
Song (2016)^38,a	Z	N	Z	N	N	N	0.04	0.04
Wang (2017)^34,a	N	N	0.22	N	N	0.10	0.05	0.05
Wu (2011)^39,a	0.36	N	N	N	N	N	0.03	0.03
Duan (2004)^61,b	N	N	N	N	N	Z	0.18	0.15
Gao (2003)^66,b	N	N	N	N	N	N	N	N
He (2013)^64,b	N	N	Z	N	N	Z	0.36	0.17
Huang (2004)^63,b	N	N	N	N	N	N	0.11	0.11
Huang (2005)^62,b	N	N	N	N	N	N	0.13	0.13
Li (2006)^59,b	N	N	N	N	N	0.18	0.16	0.14
Li (2008)^60,b	N	N	N	N	N	Z	0.21	0.16
Li (2010)^53,b	N	N	Z	N	N	Z	0.18	0.12
Liu (2005)^58,b	N	N	N	N	N	0.36	0.14	0.09
Liu (2016)^67,b	N	N	N	N	N	N	0.24	N
Lu (2004)^57,b	N	N	Z	N	N	0.65	0.11	0.09
Lv (2010)^55,b	N	N	N	N	N	N	N	N
Lv (2011)^56,b	N	N	Z	N	N	Z	0.22	0.14
Mao (2014)^54,b	N	N	N	N	N	N	N	N
Su (2014)^50,b	N	N	N	N	N	0.77	1.03	N
Sun (2011)^49,b	N	N	Z	N	N	0.36	0.31	0.15
Wang (2006)^48,b	N	N	Z	N	N	Z	0.29	0.18
Wang (2011)^47,b	N	N	N	N	N	N	N	N
Wei (2010)^46,b	N	N	Z	N	N	Z	0.30	0.18
Xi (2010)^44,b	N	N	N	N	N	0.18	0.18	0.13
Xie (2010)^65,b	N	N	N	N	N	N	0.18	N
Xue (2008)^51,b	N	N	N	N	N	N	0.20	0.18
Yan (2011)^43,b	N	N	Z	N	N	1.21	0.19	N
Ye (2015)^42,b	N	N	Z	N	N	0.57	0.13	N
Zhang (2011)^41,b	N	N	N	N	N	N	0.06	0.10
Zhang (2015)^52,b	N	N	N	N	N	N	0.47	N
Zhang (2017)^40,b	N	N	Z	N	N	Z	0.15	0.12
Zhou (2013)^45,b	N	N	N	N	N	N	0.71	0.18

^a Reviews with meta-analyses.

^b Randomized clinical trials.

HBeAg, hepatitis B e-antigen; HBV, hepatitis B virus; N, no data; Z, outcome with zero events in one or both intervention groups.

Design errors

Six reviews with meta-analyses showed high potential of design errors regarding outcomes,^34–37 experimental intervention,^{34,35,37–39} control intervention,^34–39 clinical setting,^34–39 trial goal,^34–39 trial objective,^34–39 and trial structure^34–39 (Appendix Table A1). The assessments of “inadequacy” were mostly due to unclear descriptions of the reviews' inclusion criteria for trials with respect to these components.

Validity of randomized clinical trials

Systematic errors

All trials used an adequate randomization method to allocate the participants. Two trials^54,56 used sealed opaque envelopes to conceal the allocation. Only one trial⁵⁷ blinded the participants and practitioners. Three trials^48,62,63 blinded the outcome assessors. Twenty trials had low risk of bias regarding incomplete outcome reporting, while the other eight trials^{43,48,55,57,62,63,65,66} had dropouts and did not use an appropriate method to analyze the data. All trials might have risk of selective reporting due to lack of trial protocols. Seventeen trials^{41,44,45,47,51,52,54,55,58–63,65–67} did not report assessment of patient-centered outcomes, resulting in their assessment of high risk of selective reporting bias. Nine trials^{42,47,55–57,60,62–64} reported being supported by government or hospital funding (Figs. 1 and 2).

FIG. 1.

Risk of bias graph: review authors' judgments about each risk of bias item presented as percentages across all included randomized clinical trials. Color images are available online.

FIG. 2.

Risk of bias summary: review authors' judgments about each risk of bias item for each included randomized clinical trial. Color images are available online.

Random errors

None of the trials reported data on all-cause mortality, health-related quality of life, hepatitis B-related mortality, or hepatitis B-related morbidity. Eight trials^{42–44,49,50,57–59} reported nonserious adverse events, of which 75% (6/8) had substantial or high risk of random errors (SE over 0.3), and the median SE was 0.42 (range from 0.18 to 1.21); out of the 24 trials^{40–46,48–53,56–65,67} that reported data on detectable HBV-DNA, 5^{37,46,49,50,52} showed substantial or high risk of random errors (SE = 0.31–1.03); all the 18 trials^{40,41,44–46,48,49,51,53,56–65} that reported data on detectable HBeAg showed ignorable or small random error risks (Table 4).
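The trial-level tallies can be checked against Table 4 with a few lines of code. The sketch below transcribes the SEs for nonserious adverse events and detectable HBeAg from the randomized trial rows of Table 4 and counts how many exceed the thresholds used in the text; the dictionary names are hypothetical. On these values, six of the eight trials reporting nonserious adverse events (75%) have an SE over 0.3.

```python
# SEs transcribed from the randomized trial rows of Table 4.
NONSERIOUS_AE_SE = {
    "Li (2006)": 0.18, "Liu (2005)": 0.36, "Lu (2004)": 0.65, "Su (2014)": 0.77,
    "Sun (2011)": 0.36, "Xi (2010)": 0.18, "Yan (2011)": 1.21, "Ye (2015)": 0.57,
}
HBEAG_SE = {
    "Duan (2004)": 0.15, "He (2013)": 0.17, "Huang (2004)": 0.11,
    "Huang (2005)": 0.13, "Li (2006)": 0.14, "Li (2008)": 0.16,
    "Li (2010)": 0.12, "Liu (2005)": 0.09, "Lu (2004)": 0.09,
    "Lv (2011)": 0.14, "Sun (2011)": 0.15, "Wang (2006)": 0.18,
    "Wei (2010)": 0.18, "Xi (2010)": 0.13, "Xue (2008)": 0.18,
    "Zhang (2011)": 0.10, "Zhang (2017)": 0.12, "Zhou (2013)": 0.18,
}

# Substantial or high risk of random error: SE over 0.3.
ae_over = sorted(t for t, se in NONSERIOUS_AE_SE.items() if se > 0.3)
ae_share = len(ae_over) / len(NONSERIOUS_AE_SE)
print(f"nonserious adverse events: {len(ae_over)}/{len(NONSERIOUS_AE_SE)} ({ae_share:.0%})")

# Ignorable or small risk of random error: SE up to 0.2.
hbeag_small = [t for t, se in HBEAG_SE.items() if se <= 0.2]
print(f"detectable HBeAg: {len(hbeag_small)}/{len(HBEAG_SE)} ignorable or small")
```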

Design errors

None of the trials had adequate design on all nine design components (Appendix Table A1). Eleven trials^{40,41,45,51,52,54,55,63,65–67} based their conclusions only on surrogate and/or composite outcomes. Five trials^50,58–61 did not provide clear diagnostic and inclusion/exclusion criteria for selecting participants. Seven trials^{40,51,53,54,58,59,62} were assessed as inadequate on the intervention and control regimen. Only one trial⁵⁷ was conducted in multiple centers. None of the trials could be judged as having a superiority or a pragmatic design, as the authors could not find any protocols or comparable information. All the trials were assessed as adequate on trial structure and unit of analysis, as all had a parallel-group design and used the individual as the unit of analysis.

The error matrix

From the “benefit” matrix figure, one can see at a glance that Radix S. flavescentis versus placebo/no intervention may provide benefits on all-cause mortality (red bars), serious adverse events (green bars), and surrogate outcomes such as HBV-DNA and HBeAg (yellow and blue bars) (Fig. 3). One can also see that Radix S. flavescentis may cause harm to participants in terms of nonserious adverse events (orange bars) (Fig. 3). However, the results should be interpreted in relation to the three dimensions of error. The included meta-analyses and randomized trials were assessed as being of poor quality in terms of systematic errors. On this systematic error level, the authors found there was substantial risk of random error regarding all-cause mortality (SE 0.36), moderate risk regarding serious adverse events (SE 0.22), substantial risk regarding nonserious adverse events (SE 0.35), and small to moderate risk regarding surrogate outcomes such as HBeAg and HBV-DNA (SE 0.16 and 0.21). Data were lacking on patient-centered outcomes (primary and secondary outcomes), while hypothesis-generating surrogate outcomes made up most of what the articles reported. The quick glance at the evidence using a conceptualized visual matrix tool showed that the results from available meta-analyses and randomized trials supporting Radix S. flavescentis for chronic hepatitis B seemed to be of low certainty.

FIG. 3.

Three-dimensional matrix building upon the risks of systematic errors, random errors, and design errors. (a) Outcomes with beneficial effects of Radix S. flavescentis versus placebo/no intervention. (b) Outcomes with harmful effects of Radix S. flavescentis versus placebo/no intervention. A quick guide to reading the figure: If you want to know what the evidence is for Radix S. flavescentis on all-cause mortality, go to the red bars and read (1) types of evidence (the risks of systematic errors) and (2) SE (the risks of random errors). As the authors only included randomized controlled trials and meta-analyses in this study (which was a limitation), data of other types of evidence were not shown. From the figures, one can see at a glance that Radix S. flavescentis may provide benefit to patients in terms of all-cause mortality (red bars), serious adverse events (green bars), and the exploratory outcomes proportion of people with detectable HBV-DNA (yellow bars) and proportion of people with detectable HBeAg (blue bars). However, one can also see at a glance that Radix S. flavescentis may cause harm to people in terms of nonserious adverse events (orange bars). Data on health-related quality of life, hepatitis B-related mortality, and hepatitis B-related morbidity were lacking. Reading the dimension of systematic errors, it is immediately clear that the highest level of evidence available is meta-analysis of randomized trials. Reading the dimension of random errors on this systematic error level shows that there is substantial risk of random errors regarding all-cause mortality (SE 0.36), moderate risk regarding serious adverse events (SE 0.22), substantial risk regarding nonserious adverse events (SE 0.35), and small to moderate risk regarding surrogate outcomes such as detectable HBeAg and detectable HBV-DNA (SE 0.16 and 0.21). A clear version for creating a three-dimensional matrix can be found at the Copenhagen Trial Unit's homepage.
HBeAg, hepatitis B e-antigen; HBV, hepatitis B virus; SE, standard error. Color images are available online.

Discussion

Summary of findings

The visual error matrix tool is a conceptualized tool for overviewing the studies and the data to assess the evidence from three dimensions of errors: systematic error, random error, and design error. For explanatory purposes, the authors introduced this tool in Traditional Chinese Medicine, taking Radix S. flavescentis for chronic hepatitis B as an example. According to the error risks assessment, all the reviews with meta-analyses were assessed as critically low quality and all the randomized trials were assessed at high risk of bias in terms of systematic error. Random error risks varied between the different outcomes. All the outcomes except for the surrogate outcome detectable HBeAg had moderate or high risks of random errors. None of the studies had adequate design on all design components. Data on patient-centered outcomes were lacking, in contrast to the large amount of data on exploratory surrogate outcomes. Overall, the available evidence on Radix S. flavescentis for chronic hepatitis B was at risk of multiple forms of errors (systematic errors, random errors, and design errors). Lack of high-quality evidence before implementing an intervention in patients is a phenomenon existing both in Traditional Chinese Medicine and western medicine.^9,69–71

Implications for practice and research

In addition to the 28 randomized trials that were included for analysis, there were 426 studies that claimed to be randomized trials, but the authors could not obtain any response from the trial authors about randomization methods. Considering the possibility of misuse of randomization and its risks,^72,73 before more information is obtained, they can only put these 426 studies on the waiting list for further analysis and urge the trial authors to contact them if their trial is indeed a properly randomized clinical trial. It is important that trial authors understand and use “randomization” correctly.⁷² Future trials must report the random sequence generation and allocation concealment methods clearly. Communicating research outcomes does not end with publishing a research article. Trial authors should realize the importance of research promotion, provide effective contact information in their publications, respond actively to inquiries, and thereby reduce research waste.^74–80

The included randomized trials had major problems with selective reporting and blinding. The authors could not find a protocol for any of the included trials, and almost half of the included trials reported only surrogate outcomes such as HBV-DNA and HBeAg. Such outcomes may not be valid surrogates for patient-centered outcomes such as mortality, complications, and quality of life.^21,22 Moreover, such surrogate outcomes should not be used unless there is meta-analytic validation that they are dependable surrogates not only for specific patient-centered outcomes but also for the specific intervention being assessed.^21,22,81,82 Only one trial⁵⁷ compared Radix S. flavescentis with placebo, while the other trials compared Radix S. flavescentis plus a cointervention with the cointervention alone. In the latter comparison, results may be biased due to lack of blinding of practitioners and participants. However, it is almost always possible to obtain blinded outcome assessment.²⁸ For interventions for which “double blinding” is hard or impossible to achieve, trialists should at least blind the outcome assessors.²⁸

Even though statements like the Consolidated Standards of Reporting Trials (CONSORT) have improved the reporting quality of randomized clinical trials,⁸³ most of the identified publications in the study still lacked substantial information on items such as protocols, allocation concealment, incomplete outcome data, and profit bias. Unfortunately, the authors received very few responses (7/28) about the missing information after contacting the authors of the included trials, making it difficult to distinguish poor reporting from poor methodological quality of the trials.

Most of the randomized clinical trials in Cochrane systematic reviews are underpowered.¹² Also, about 25% of meta-analyses with a small number of events may falsely report a statistically significant result.^84,85 However, there is a lack of awareness of how the risk of random error affects the reliability of the evidence.^28,29,85 In the present study, the authors used SE to quantify the level of random error. According to the criteria of Keus et al.,²⁹ all the outcomes except the proportion of participants with detectable HBeAg had moderate or high risks of random errors. The included studies were assessed to be underpowered. As there is more than one way to assess random error, the results may differ if another measure is chosen, such as the Bayes factor.⁸⁶ Nevertheless, the result in their study should remind trialists and reviewers to consider the possibility of a “play of chance” effect and to try to avoid it. Inadequate sample size may lead to random errors.⁸³ The estimation of sample size in randomized trials depends on the goal of the trial (superiority, equivalence, or noninferiority), the type of the primary outcome measure (dichotomous or continuous), and the assumed intervention effect.²⁸ Trial Sequential Analysis builds on the method of sequential analysis and may assess the information size of a meta-analysis, as well as of a single trial, to accept or reject the intervention hypothesis.^85,87,88
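How strongly the required sample size depends on the assumed intervention effect can be illustrated with a minimal sketch for a two-arm superiority trial with a dichotomous primary outcome. It uses the standard normal-approximation formula for comparing two proportions, without continuity correction. The function name, the event proportions, the significance level, and the power are illustrative assumptions and are not taken from the included trials.

```python
import math
from statistics import NormalDist


def sample_size_per_group(p_control, p_experimental, alpha=0.05, power=0.80):
    """Participants per arm for a two-sided superiority test of two proportions.

    Normal approximation, no continuity correction:
    n = (z_{1-alpha/2} + z_{power})^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2
    """
    if p_control == p_experimental:
        raise ValueError("the assumed intervention effect must be non-zero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided type I error
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - type II error
    variance = (p_control * (1 - p_control)
                + p_experimental * (1 - p_experimental))
    delta = p_control - p_experimental
    return math.ceil((z_alpha + z_beta) ** 2 * variance / delta ** 2)


# Hypothetical proportions of participants with the outcome (illustration only).
large_effect = sample_size_per_group(0.60, 0.40)  # 20 percentage point difference
small_effect = sample_size_per_group(0.60, 0.50)  # 10 percentage point difference
print(large_effect, small_effect)
```

Under these assumptions, halving the assumed absolute difference from 20 to 10 percentage points raises the requirement from 95 to 385 participants per arm, which is why an unstated or overly optimistic assumed effect can leave a trial underpowered. Equivalence and noninferiority trials need a different formula built around a prespecified margin.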

None of the included studies had an adequate design on all design components, which raises the question of whether the study designs were valid for answering the question about Radix S. flavescentis for chronic hepatitis B. Most of the evidence identified in this study reported results on unvalidated surrogate outcomes such as HBV-DNA and HBeAg. Surrogate outcomes may show apparent benefits, while patient-centered outcomes may show no effect or harm.^8,21,81,82 Whether surrogate outcome results truly reflect clinical improvement is questionable and requires demonstration.^21,81,82,89,90 The authors therefore suggest that future studies focus on patient-centered outcomes such as mortality, adverse events, and health-related quality of life.

Because of poor reporting, they could not judge whether the objective of the included trials was superiority, equivalence, or noninferiority. They found that all the studies merely performed a test of statistical difference and never considered clinical significance.^8,9,89,90 Only if trialists are clear about the goal of their trial can it be ensured that the sample size estimate and the interpretation of the results are reasonable.²⁸ The included trials in their study also had problems with participant selection, with either an unclear diagnosis or no inclusion criteria. The participants to be included in a trial should be clearly defined, and researchers should avoid using inclusion and exclusion criteria that are too broad or too strict.²⁸

They identified only one multicenter trial among the 28 included trials.⁵⁷ Empirical evidence shows that single-center trials usually report significantly larger intervention effects than multicenter trials, and the positive results of many single-center trials are frequently contradicted when tested in multicenter settings.^23–25 Therefore, trialists should always try to conduct multicenter trials when possible, which will also help to overcome the difficulty of recruiting enough trial participants.

Strengths and limitations

This is an exploratory and conceptual attempt to introduce the error matrix tool to overview the validity of evidence on Traditional Chinese Medicine for chronic hepatitis B; the authors have also used it for another Chinese formula, “Xiao Chai Hu Tang,” for chronic hepatitis B (article submitted). The error matrix tool arranges the evidence in a matrix according to systematic, random, and design errors and provides an overall visual assessment of the reliability of the evidence. Authors can then base their conclusions on the evidence with minimum risks of all three errors; or, if the overall evidence is jeopardized, the matrix may help to identify the current problems in the clinical evidence, to highlight future research directions, and to facilitate safe and correct implementation of interventions in clinical practice. The authors suggest that the error matrix tool be used as a complement to the GRADE approach, as the matrix can provide detailed visual information about each error component for a subsequent GRADE assessment of evidence quality.³³ In their research, they used SE to assess precision, whereas in GRADE, imprecision is evaluated in relation to the optimal information size and the confidence interval, or alternatively by Trial Sequential Analysis. GRADE also assesses heterogeneity and publication bias.³³
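The link between the SE used in the matrix and the confidence interval considered in GRADE can be made concrete with a small sketch. It assumes that the effect is a risk ratio and that the SE is taken on the log scale, which the text does not specify; the function names and all event counts are hypothetical and serve only to show how the same observed effect becomes more precise as the trial grows.

```python
import math


def log_risk_ratio_se(events_exp, n_exp, events_ctl, n_ctl):
    """Return (risk ratio, SE of the log risk ratio) from 2x2 counts."""
    rr = (events_exp / n_exp) / (events_ctl / n_ctl)
    se = math.sqrt(1 / events_exp - 1 / n_exp + 1 / events_ctl - 1 / n_ctl)
    return rr, se


def confidence_interval(rr, se, z=1.96):
    """Approximate 95% confidence interval for the risk ratio (log scale)."""
    return math.exp(math.log(rr) - z * se), math.exp(math.log(rr) + z * se)


# Hypothetical counts: identical proportions in a small trial and in a trial
# ten times larger.
rr_small, se_small = log_risk_ratio_se(6, 30, 12, 30)
rr_large, se_large = log_risk_ratio_se(60, 300, 120, 300)
ci_small = confidence_interval(rr_small, se_small)
ci_large = confidence_interval(rr_large, se_large)
print(round(se_small, 2), round(se_large, 2))
```

Both hypothetical trials give the same risk ratio, but the tenfold larger trial has an SE smaller by a factor of the square root of 10 and a correspondingly narrower confidence interval. The SE and the interval width therefore carry the same information about random error, expressed in two ways.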

They tried to enrich the error matrix of Keus et al. by extracting more information. They additionally used corresponding tools to assess the systematic errors for different types of evidence, for example, AMSTAR for systematic reviews and meta-analyses. Also, based on previous studies, they conceptualized nine design factors that may introduce design errors.^8,9,28 As there is no agreement on a validated tool to assess design error risks, their choice of these nine factors may be arbitrary, and more research should be done in the future. There are some other limitations that call for further improvement. When assessing random errors, they did not consider the risks that may be caused by “multiplicity” or by early stopping at an interim analysis.^91,92 The assessment of systematic errors and design errors was limited by the poor reporting of the related items.

Internal validity and external validity are both important for determining the quality of evidence and are incompatible with each other. That is why they focused on all types of evidence. However, as their research is based on a Cochrane systematic review, they included only randomized clinical trials and systematic reviews/meta-analyses of such trials; to check external validity, one needs observational studies of clinical practice, including checks of compliance as well as of disease-specific and general morbidity and mortality.

The authors' research tries to classify the factors that may affect evidence validity into three dimensions; however, there may be additional factors that were not considered and that should be researched further. External validity and model validity of study results are important issues from a clinical point of view.⁹³ Further research should focus more on external validity and can refer to published scales, checklists, or domain-based evaluations that include external validity and/or model validity assessment, such as the Model Validity of Homeopathic Treatment (MVHT) and Rapid Evidence Assessment of the Literature (REAL) tools.^93–96

The matrix tool will not replace the thorough process of systematically reviewing evidence and evaluating data in depth, but it can serve as a tool to provide a visual assessment of evidence validity with respect to systematic errors, random errors, and design errors.

Moreover, the estimates of the effects of the herb on outcomes in the randomized clinical trials are repeated to a certain extent in the meta-analyses, which consist of a sample of those randomized clinical trials. This may amplify the imprecision of effects, which is not the intention. In the present article, the authors focus on how to apply this matrix tool to assess evidence validity, and they will synthesize the data from randomized clinical trials and present the effect estimates in their Cochrane review to prevent duplication or overlap of evidence.

Conclusions

The available evidence from reviews with meta-analyses and randomized clinical trials on Radix S. flavescentis for chronic hepatitis B showed high risks of systematic errors, moderate or high risks of random errors, and high risks of design errors. The authors found no data on quality of life, hepatitis B-related mortality, or morbidity. More evidence with minimum risks of all three errors is needed to guide the future practice, and this error matrix tool may assist in design and conduct of future studies.

Footnotes

Acknowledgments

The authors acknowledge the great help of Sarah Klingenberg, the Information Specialist in the Cochrane Hepato-Biliary Group, in designing the search strategies. This work was supported by China State Scholarship Fund (No. 201706550015) and Capacity Building in Evidence-Based Chinese Medicine and Internationalization Project (No. 1000061020008).

Authors' Contributions

N.L. proposed the idea for this study, designed and organized the research, analyzed data, and drafted and revised the article; D.-Z.K. proposed the idea for this study, coordinated the research, and revised the article; N.L., X.-H.L., M.Y., and L.-D.F. performed literature searching and selection, collected data, contacted the authors for missing data, and commented on the article; S.-S.M. and C.-L.L. assessed the risk of bias and the risk of design errors, calculated the random errors, and commented on the article; C.G. interpreted data providing a methodological view, and revised the article; J.-P.L. interpreted data providing a clinical view and revised the article.

Author Disclosure Statement

No competing financial interests exist.

Appendix Table A1.

The Characteristics of Included Studies on Nine Design Error Components

First author (year)	Outcome measures	Participants	Experimental intervention	Control intervention	Cointervention	Clinical setting	Goal (exploratory/pragmatic)	Objective (superiority/equivalence/noninferiority)	Structure	Unit of analysis
He (2013)^37,a	Surrogate outcomes	Clear definition on diagnostic criteria and exclusion criteria	Matrine	No intervention	Lamivudine	NR	NR	NR	NR	NR
Jiang (2013)^36,a	Surrogate outcomes	Clear definition on diagnostic criteria and exclusion criteria	Kushensu at any form and administration way for at least 3 months	No intervention	Interferon for at least 3 months	NR	NR	NR	NR	NR
Qi (2013)^35,a	Surrogate outcomes	Clear definition on diagnostic criteria and exclusion criteria	Kushensu for at least 6 months	No intervention	Entecavir for at least 6 months	NR	NR	NR	NR	NR
Song (2016)^38,a	All-cause mortality, hepatitis B-related mortality and morbidity, adverse events, surrogate outcomes	Clear definition on diagnostic criteria and exclusion criteria	Oral oxymatrine preparation	Placebo or no intervention	Cointervention was allowed	NR	NR	NR	NR	NR
Wang (2017)^34,a	Adverse events, surrogate outcomes	Clear definition on diagnostic criteria and exclusion criteria	Mat	No intervention	Interferon and/or others	NR	NR	NR	NR	NR
Wu (2011)^39,a	All-cause mortality, hepatitis B-related mortality and morbidity, adverse events, surrogate outcomes	Clear definition on diagnostic criteria and exclusion criteria	Kushensu, matrine, or oxymatrine	Placebo or no intervention	Cointervention was allowed	NR	NR	NR	NR	NR
Duan (2004)^61,b	Adverse events, surrogate outcomes	Clear diagnostic criteria, but inclusion and exclusion criteria not mentioned	Oxymatrine, 600 mg, once daily, intravenous infusion, 2 months; three times daily, oral administration, 4 months	No intervention	Oral lamivudine, 100 mg, once daily, 6 months	Single center	NR	NR	Parallel group	Individual
Gao (2003)^66,b	Surrogate outcomes	Clear diagnostic criteria and exclusion criteria of concomitant diseases	Kushensu, 600 mg, intramuscular injection, once daily, 3 months; every other day, 3 months	No intervention	Oral glycyrrhizin, 150 mg, two times daily, 6 months	Single center	NR	NR	Parallel group	Individual
He (2013)^64,b	Adverse events, surrogate outcomes, composite outcomes	Clear diagnostic criteria and exclusion criteria of concomitant diseases and drug history during the past 6 months	Kushensu, 600 mg, three times daily, oral administration, 12 months	No intervention	Oral lamivudine tablets, 100 mg/day; oral adefovir capsules, 10 mg/day, 12 months	Single center	NR	NR	Parallel group	Individual
Huang (2004)^63,b	Surrogate outcomes	Clear diagnostic criteria and exclusion criteria of concomitant diseases	Kushensu capsules, 200 mg, three times daily, oral administration, 6 months	No intervention	Interferon α-2b, 5 MU, intramuscular injection, every other day; liver protective drugs (e.g., Silymarin and vitamin C), oral administration, 6 months	Single center	NR	NR	Parallel group	Individual
Huang (2005)^62,b	Surrogate outcomes, composite outcomes	Clear diagnostic criteria and exclusion criteria of concomitant diseases and drug history during the past 6 months	Kushensu intravenous infusion, 600 mg, once daily, 3 months	No intervention	Thymosin intravenous infusion, 80 mg, once daily; liver protective drugs (e.g., ganlixin and vitamin C), 3 months	Single center	NR	NR	Parallel group	Individual
Li (2006)^59,b	Adverse events, surrogate outcomes	Clear diagnostic criteria, but inclusion and exclusion criteria not mentioned	Kushensu injection, 600 mg, once daily, intravenous infusion, 2 months	No intervention	Ganlixin, 150 mg, intravenous infusion; liver protective drugs (e.g., intravenous infusion of vitamin C, gantaile, potassium aspartate and magnesium aspartate), 2 months	Single center	NR	NR	Parallel group	Individual
Li (2008)^60,b	Adverse events, surrogate outcomes, composite outcomes	Clear diagnostic criteria, but inclusion and exclusion criteria not mentioned	Kushensu, 600 mg, intravenous or intramuscular injection, once daily, 6 months	No intervention	Adefovir capsules, 10 mg, once daily, 6 months	Single center	NR	NR	Parallel group	Individual
Li (2010)^53,b	Adverse events, surrogate outcomes	Clear diagnostic criteria and exclusion criteria of concomitant diseases	Kushensu capsules, 200 mg, three times daily, 3 months	No intervention	Routine liver protection drugs including compound glycyrrhizin, potassium magnesium aspartate, and vitamin C, 3 months	Single center	NR	NR	Parallel group	Individual
Liu (2005)^58,b	Adverse events, surrogate outcomes	Clear diagnostic criteria, but inclusion and exclusion criteria not mentioned	Oxymatrine injection, 600 mg, once daily, intravenous infusion, 1 month; oxymatrine capsules, 200 mg, three times daily, oral administration, 5 months	No intervention	Routine liver protection treatment, including intravenous infusion of ganlixin and Panangin, oral administration of vitamin C and glucurone, 6 months	Single center	NR	NR	Parallel group	Individual
Liu (2016)^67,b	Surrogate outcomes	Clear diagnostic criteria and exclusion criteria of drug allergy history	Kushensu capsules, 200 mg, three times daily, oral administration, 6 months	No intervention	Entecavir, 0.5 mg, once daily, oral administration, 6 months	Single center	NR	NR	Parallel group	Individual
Lu (2004)^57,b	Adverse events, surrogate outcomes	Clear diagnostic criteria and exclusion criteria of concomitant diseases and drug history during the past 6 months	Oxymatrine capsules, 300 mg, three times daily, oral administration, 16 months	Placebo (consistent size, color, shape, and taste with oxymatrine capsules), 16 months	No cointervention	Multicenter	NR	NR	Parallel group	Individual
Lv (2010)^55,b	Surrogate outcomes	Clear diagnostic criteria and exclusion criteria of concomitant diseases and drug history during the past 6 months	Kushensu capsules, 200 mg, three times daily, oral administration, 12 months	No intervention	Adefovir capsules, 10 mg, once daily, oral administration, 12 months	Single center	NR	NR	Parallel group	Individual
Lv (2011)^56,b	Adverse events, surrogate outcomes	Clear diagnostic criteria and exclusion criteria of concomitant diseases and drug history during the past 6 months	Kushensu capsule, 300 mg, three times daily, oral administration, 9 months	No intervention	Adefovir capsules, 10 mg, once daily, oral administration, 9 months	Single center	NR	NR	Parallel group	Individual
Mao (2014)^54,b	Surrogate outcomes	Clear diagnostic criteria and exclusion criteria of concomitant diseases	Kushensu injection, 600 mg, intravenous infusion with 250 mL 5% glucose solution, once daily, 1 month	No intervention	Tanshinone IIA, 60 mg, intravenous infusion with 250 mL 5% glucose solution, once daily; antiviral and liver protective drugs (vitamin, liver protective tablets, oral administration; potassium aspartate and magnesium aspartate, vitamin K, intravenous infusion), 1 month	Single center	NR	NR	Parallel group	Individual
Su (2014)^50,b	All-cause mortality, adverse events, surrogate outcomes	Clear diagnostic criteria, but inclusion and exclusion criteria not mentioned	Kushensu capsules, 200 mg, 3 months daily, oral administration, 1 month	No intervention	Adefovir, 10 mg, once daily, oral administration, 1 month	Single center	NR	NR	Parallel group	Individual
Sun (2011)^49,b	Adverse events, surrogate outcomes, composite outcomes	Clear diagnostic criteria and exclusion criteria of concomitant diseases, age, and drug history during the past 1 month	Matrine injections, 150 mg, intravenous infusion with 250 mL 10% glucose solution, once daily, 3 months	No intervention	Liver protection compounds, including vitamin C, vitamin B6, inosine, Gantaile, Hutianbao, and branched chain amino acids, once daily, 3 months	Single center	NR	NR	Parallel group	Individual
Wang (2006)^48,b	Adverse events, surrogate outcomes	Clear diagnostic criteria and exclusion criteria of concomitant disease and drug history during the past 6 months	Kushensu capsules, 200 mg, three times daily, oral administration, 12 months	No intervention	Lamivudine, 100 mg, once daily, oral administration, 12 months	Single center	NR	NR	Parallel group	Individual
Wang (2011)^47,b	Adverse events, surrogate outcomes	Clear diagnostic criteria and exclusion criteria of drug history during the past 6 months	Matrine, 150 mg, intravenous infusion, once daily, 1 month	No intervention	Recombinant human interferon α-2b, 5 MU, subcutaneous injection, once daily, 3 months	Single center	NR	NR	Parallel group	Individual
Wei (2010)^46,b	Adverse events, surrogate outcomes	Clear diagnostic criteria and exclusion criteria of concomitant disease and drug history during the past 3 months	Kushensu, intravenous infusion, 600 mg, once daily, 12 months	No intervention	Adefovir tablets, 10 mg, once daily, 12 months	Single center	NR	NR	Parallel group	Individual
Xi (2010)^44,b	Adverse events, surrogate outcomes	Clear diagnostic criteria and exclusion criteria of concomitant disease and drug history during the past 6 months	Kushensu injections, 50–100 mg (5–10 years old children), 100–150 mg (11–14 years), intravenous infusion with 10% glucose solution, once daily, 2 months, Kushensu capsules, three times daily, 4 months	No intervention	Liver protective drug, one to two tablets (1–8 years old children), two to three tablets (9–14 years), oral administration, 6 months	Single center	NR	NR	Parallel group	Individual
Xie (2010)^65,b	Surrogate outcomes	Clear diagnostic criteria and exclusion criteria of concomitant disease and age	Kushensu capsules, 200 mg, three times daily, 24 months	No intervention	Entecavir, 0.5 mg, once daily, 24 months	Single center	NR	NR	Parallel group	Individual
Xue (2008)^51,b	Surrogate outcomes	Clear diagnostic criteria and exclusion criteria of drug history during the past 6 months	Acupoint injection of Kushensu solution, four acupoints (bilateral Zusanli [ST36] and bilateral Sanyinjiao [SP 6]), 1 mL for each acupoint per time, six times a week, 6 months	No intervention	Liver protective drugs (e.g., vitamin C, Yiganling, multivitamin B), 6 months	Single center	NR	NR	Parallel group	Individual
Yan (2011)^43,b	Adverse events, surrogate outcomes	Clear diagnostic criteria and exclusion criteria of concomitant disease	Kushensu capsules, 150 mg, three times daily, 13 months	No intervention	Adefovir, 10 mg, once daily, 13 months	Single center	NR	NR	Parallel group	Individual
Ye (2015)^42,b	Adverse events, surrogate outcomes	Clear diagnostic criteria and exclusion criteria of concomitant disease	Kushensu capsules, 200 mg, three times daily, 12 months	No intervention	Entecavir, 0.5 mg, once daily, 12 months	Single center	NR	NR	Parallel group	Individual
Zhang (2011)^41,b	Surrogate outcomes	Clear diagnostic criteria and exclusion criteria of concomitant disease	Matrine injections, 50–100 mg (5–10 years old children), 100–150 mg (11–14 years), intravenous injection with 10% glucose solution, once daily, 6 months	No intervention	Ganlixin capsules, 3 mg/kg/day, two times daily, Inosine tablets, 100–400 mg, three times daily, Lamivudine, 3 mg/kg/day, once daily, oral administration, 6 months	Single center	NR	NR	Parallel group	Individual
Zhang (2015)^52,b	Surrogate outcomes	Clear diagnostic criteria and exclusion criteria of concomitant disease	Kushensu dispersible tablets, 200 mg, three times daily, oral administration, 12 months	No intervention	Telbivudine, 600 mg daily, oral administration, 12 months	Single center	NR	NR	Parallel group	Individual
Zhang (2017)^40,b	Adverse events, surrogate outcomes	Clear diagnostic criteria and exclusion criteria of concomitant disease	Kushensu capsules, 200 mg, three times daily, oral administration, 6 months	No intervention	Routine liver protection and nutrition support treatments, 6 months	Single center	NR	NR	Parallel group	Individual
Zhou (2013)^45,b	Surrogate outcomes	Clear diagnostic criteria and exclusion criteria of concomitant disease	Kushensu, 300 mg, three times daily, oral administration, 24 months	No intervention	Adefovir, 10 mg daily, oral administration, 24 months	Single center	NR	NR	Parallel group	Individual

^a Reviews with meta-analyses.

^b Randomized clinical trials.

NR, not reported.

References

1. World Health Organization. Global hepatitis report. Online document at: www.who.int/hepatitis/publications/global-hepatitis-report2017/en, accessed November 22, 2017.

2. Zhu. Chinese Materia Medica-Chemistry, Pharmacology and Applications. 1st ed. Boca Raton, FL: CRC Press, 1998.

3. Tanabe, Kuboyama, Kazuma, et al. The extract of roots of Sophora flavescens enhances the recovery of motor function by axonal growth in mice with a spinal cord injury. Front Pharmacol, 2015; 6:326.

4. Liang, Kong, Nikolova, et al. Radix Sophorae flavescentis for chronic hepatitis B. Cochrane Database Syst Rev, 2018; 2018:CD013089. DOI: 10.1002/14651858.CD013089.

5. Guyatt, Cairns, Churchill. Evidence-based medicine. A new approach to teaching the practice of medicine. JAMA, 1992; 268:2420–2425.

6. Burns, Rohrich, Chung. The levels of evidence and their role in evidence-based medicine. Plast Reconstr Surg, 2011; 128:305–310.

7. Graves. Users' guides to the medical literature: A manual for evidence-based clinical practice. J Med Libr Assoc, 2002; 90:483.

8. Jakobsen, Gluud. The necessity of randomized clinical trials. Br J Med Med Res, 2013; 3:1453–1468.

9. Garattini, Jakobsen, Wetterslev, et al. Evidence-based clinical practice: Overview of threats to the validity of evidence and how to minimise them. Eur J Intern Med, 2016; 32:13–21.

10. Ioannidis JPA. Why most published research findings are false. PLoS Med, 2005; 2:e124.

11. Faber. How sample size influences research outcomes. Dental Press J Orthod, 2014; 19:27–29.

12. Turner, Bird, Higgins. The impact of study size on meta-analyses: Examination of underpowered studies in Cochrane reviews. PLoS One, 2013; 8:e59202.

13. Ioannidis, Haidich, Pappa, et al. Comparison of evidence of treatment effects in randomized and nonrandomized studies. JAMA, 2001; 286:821–830.

14. Schulz, Grimes. Generation of allocation sequences in randomised trials: Chance, not choice. Lancet, 2002; 359:515–519.

15. Pildal, Hróbjartsson, Jørgensen, et al. Impact of allocation concealment on conclusions drawn from meta-analyses of randomized trials. Int J Epidemiol, 2007; 36:847–857.

16. Hróbjartsson, Emanuelsson, Skou Thomsen, et al. Bias due to lack of patient blinding in clinical trials. A systematic review of trials randomizing patients to blind and nonblind sub-studies. Int J Epidemiol, 2014; 43:1272–1283.

17. Hróbjartsson, Thomsen, Emanuelsson, et al. Observer bias in randomized clinical trials with measurement scale outcomes: A systematic review of trials with both blinded and nonblinded assessors. CMAJ, 2013; 185:E201–E211.

18. Hróbjartsson, Thomsen, Emanuelsson, et al. Observer bias in randomised clinical trials with binary outcomes: Systematic review of trials with both blinded and non-blinded outcome assessors. BMJ, 2012; 344:e1119.

19. Amiri, Kanesalingam, Cro, Casey. Does source of funding and conflict of interest influence the outcome and quality of spinal research? Spine J, 2014; 14:308–314.

20. Lundh, Sismondo, Lexchin, et al. Industry sponsorship and research outcome. Cochrane Database Syst Rev, 2017; 2:MR000033. DOI: 10.1002/14651858.MR000033.pub3.

21. Gluud, Brok, Gong, Koretz. Hepatology may have problems with putative surrogate outcome measures. J Hepatol, 2007; 46:734–742.

22. Garattini, Bertele. Non-inferiority trials are unethical because they disregard patients' interests. Lancet, 2007; 370:1875–1877.

23. Unverzagt, Prondzinsky, Peinemann. Single-center trials tend to provide larger treatment effects than multicenter trials: A systematic review. J Clin Epidemiol, 2013; 66:1271–1280.

24. Bafeta, Dechartres, Trinquart, et al. Impact of single centre status on estimate of intervention effects in trials with continuous outcomes: Meta-epidemiological study. BMJ, 2012; 344:e813.

25. Kjaergard, Villumsen, Gluud. Reported methodologic quality and discrepancies between large and small randomized trials in meta-analyses. Ann Intern Med, 2001; 135:982–989.

26. Moher, Pham, Jones, et al. Does quality of reports of randomised trials affect estimates of intervention efficacy reported in meta-analyses? Lancet, 1998; 352:609–613.

27. Song, Parekh, Hooper, et al. Dissemination and publication of research findings: An updated review of related biases. Health Technol Assess, 2010; 14:1–193.

28. Gluud. The culture of designing hepato-biliary randomised trials. J Hepatol, 2006; 44:607–615.

29. Keus, Wetterslev, Gluud, van Laarhoven. Evidence at a glance: Error matrix approach for overviewing available evidence. BMC Med Res Methodol, 2010; 10:90.

30. Higgins JPT, Green, eds. Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0. 2011. Online document at: handbook-5-1.cochrane.org, accessed October 5, 2018.

31. Guyatt, Oxman, Kunz, et al. What is “quality of evidence” and why is it important to clinicians? BMJ, 2008; 336:995–998.

32. Shea, Reeves, Wells, et al. AMSTAR 2: A critical appraisal tool for systematic reviews that include randomised or non-randomised studies of healthcare interventions, or both. BMJ, 2017; 358:j4008.

33. Guyatt, Oxman, Vist, et al. GRADE: An emerging consensus on rating quality of evidence and strength of recommendations. BMJ, 2008; 336:924–926.

34. Wang, Lin, Zhang. The clinical efficacy and adverse effects of interferon combined with matrine in chronic hepatitis B: A systematic review and meta-analysis. Phytother Res, 2017; 31:849–857.

35. , Zuo. Entecavir plus matrine vs entecavir monotherapy for HBeAg-positive chronic hepatitis B: A meta-analysis. World Chin J Digestol, 2013; 21:1432–1436.

36. Jiang, Zhan, Cheng. Interferon combined with kushensu for chronic hepatitis B: A meta analysis. Shandong Med J, 2013; 53:34–38.

37. Lamivudine matrine combination therapy for chronic hepatitis B: A systematic review. Lishizhen Med Mater Medica Res, 2013; 24:944–946.

38. Song, Luo, Wu, Yao. Oral oxymatrine preparation for chronic hepatitis B: A systematic review of randomized controlled trials. Chin J Integr Med, 2016; 22:141–149.

39. Sophorus species for chronic hepatitis B and viremic carrier: A systematic review of randomized clinical trials. Doctoral dissertation. Beijing University of Chinese Medicine, Beijing, 2011.

40. Zhang, Xin, Liu. Effect of oxymatrine on elderly patients with hepatitis B cirrhosis and its influence on portal hemo-dynamics. Chin J Gerontol, 2017; 37:3270–3271.

41. Zhang, Niu, Hu, Liu. Effect of lamivudine combined with matrine on treating children with chronic hepatitis B. Res Integr Tradit Chin West Med, 2011; 3:121–123, 126.

42. , Wang. Efficacy of entecavir combined with oxymatrine in the treatment of HBeAg positive chronic hepatitis B patients. Mod Chin Doctor, 2015; 53:92–95.

43. Yan, Li, Chen, et al. Adefovir dipivoxil and oxymatrine combination therapy in treatment of patients with HBeAg positive chronic hepatitis B. Chin J Pract Med, 2011; 38:31–33.

44. , Li, Luo, Xi. Matrine injections combined with liver protective drugs for children with chronic hepatitis B. J Gansu Coll Tradit Chin Med, 2010; 27:36–39.

45. Zhou. Therapeutic effect of adefovir dipivoxil and oxymatrine on patients with chronic hepatitis B not responsive to interferon. Guide China Med, 2013; 11:199–200.

46. Wei, Meng, Mo. Adefovir in combination with kushensu for chronic hepatitis B in 36 cases. Guangxi Med J, 2010; 32:1529–1530.

47. Wang, Yuan, Wu, et al. The observation of anti-viral effect of matrine combined with interferon a-2b on the chronic hepatitis B patients with mild transaminase elevations. J Clin Hepatol, 2011; 27:488–489, 502.

48. Wang, Zhang, Li, et al. Kushensu capsules in combination with lamivudine for chronic hepatitis B. Zhejiang Integr Tradit Chin West Med, 2006; 16:412–414.

49. Sun, Qiu, Fan, Pu. Clinical effect of matrine application in treating viral hepatitis. China J Emerg Resuscitation Disaster Med, 2011; 6:427–429.

50. , Liu, He, Liu. Therapeutic effects of adefovir dipivoxil combined with matrine capsules on patients with chronic hepatitis B and its influence on HBV-DNA. Chin J Hosp Pharm, 2014; 34:472–475.

51.

Xue

. Kushensu acupoint injection for chronic hepatitis B in 37 cases. Shandong Med J, 2008; 48:24.

52.

Zhang

. The best course of telbivudine in the treatment of HBeAg-positive chronic hepatitis B and the feasibility of combination therapy. E J Transl Med, 2015; 2:73–74.

53.

. Clinical observation of kushensu capsules for chronic hepatitis B. Contemp Med, 2010; 16:142–143.

54.

Mao

, Xiao

, Wu

, Tao

. Effects of tanshinone IIA combined with oxymatrine on liver function and liver fibrosis in patients with chronic hepatitis B. Pract Clin Med, 2014; 15:21–23.

55.

, Zhu

. Adefovir combined with kushensu capsules for chronic hepatitis B. World Health Dig Med Period, 2010; 7:115–116.

56.

, Jia

, Liu

. Therapeutic effect of HBeAg positive chronic hepatitis B treated by kurorinone capsule combined with adefovir dipivoxil capsule. China Med Herald, 2011; 8:62–64.

57.

, Zeng

, Mao

, et al. Oxymatrine in the treatment of chronic hepatitis B for one year: A multicenter random double-blind placebo-controlled trial. Chin J Hepatol, 2004; 12:597–600.

58.

Liu

, Xie

, Sha

, Tan

. Therapeutic effect of oxymatrine on patients with chronic hepatitis B. China J Mod Med, 2005; 15:2678–2679, 2682.

59.

, Huang

, Liu

. Effect observation and nursing of kushensu combined with ganlixin for chronic hepatitis B. Chronic Pathematol J, 2006; 8:12–13.

60. , Zhang, Hu. The clinical study on the effects of adefovir dipivoxil combined with oxymatrine in patients with chronic hepatitis B. J Clin Hepatol, 2008; 11:370–372.

61. Duan. Oxymatrine combined with lamivudine for chronic hepatitis B in 60 cases. J Fourth Mil Med Univ, 2004; 25:1345.

62. Huang, Lin, Ji, Xu. Clinical study of treatment of chronic hepatitis B with kurorinone combined with thymosin. Med J Chin Pla, 2005; 30:1100–1102.

63. Huang, Lin, Ji, et al. Clinical study of the interferon a-2b combined with kurorinone in the treatment of chronic hepatitis B. Chin J Infect Dis, 2004; 22:259–262.

64. , Chen, Wang, Bao. Clinical observation of matrine in combination with lamivudine and defovir dipivoxil for the treatment of chronic hepatitis B virus. China Health Ind, 2013; 11:11–13.

65. Xie. Entecavir combined with kushensu for hepatitis B cirrhosis. Fujian J Tradit Chin Med, 2010; 41:16–17.

66. Gao, Li, Zhou, Qi. Randomized controlled trial of oxymatrine therapy for hepatic fibrosis in patients with chronic hepatitis B. J Southeast China Nat Def Med Sci, 2003; 5:94–96.

67. Liu. Kushensu capsules combined with entecavir for chronic hepatitis B. Chin Prim Health Care, 2016; 30:68–69.

68. Liang. Radix Sophorae flavescentis for chronic hepatitis B—Characteristics of potential randomised clinical trials (data set). Zenodo. Online document at: doi.org/10.5281/zenodo.1445391, accessed November 1, 2018.

69. Minozzi, Ruggiero, Capobussi, et al. EBM, guidelines, protocols: Knowledge, attitudes and utilization in the era of law on professional responsibility and safety of health care. Recenti Prog Med, 2018; 109:294–306.

70. Goodarzi, Hanson, Jette, et al. Barriers and facilitators for guidelines with depression and anxiety in Parkinson's disease or dementia. Can J Aging, 2018; 37:185–199.

71. Zhao, Han, Wang, et al. Problems and countermeasures of development of clinical practice guideline of Chinese medicine. J Tradit Chin Med, 2010; 51:119–121, 141.

72. , Li, Bian, et al. Randomized trials published in some Chinese journals: How many are randomized?. Trials, 2009; 10:46.

73. Liu, Kjaergard, Gluud. Misuse of randomization: A review of Chinese randomized trials of herbal medicines for chronic hepatitis B. Am J Chin Med, 2002; 30:173–176.

74. Chalmers, Glasziou. Avoidable waste in the production and reporting of research evidence. Lancet, 2009; 374:86–89.

75. Al-Shahi Salman, Beller, Kagan, et al. Increasing value and reducing waste in biomedical research regulation and management. Lancet, 2014; 383:176–185.

76. Chalmers, Bracken, Djulbegovic, et al. How to increase value and reduce waste when research priorities are set. Lancet, 2014; 383:156–165.

77. Glasziou, Altman, Bossuyt, et al. Reducing waste from incomplete or unusable reports of biomedical research. Lancet, 2014; 383:267–276.

78. Macleod, Michie, Roberts, et al. Biomedical research: Increasing value, reducing waste. Lancet, 2014; 383:101–104.

79. Ioannidis, Greenland, Hlatky, et al. Increasing value and reducing waste in research design, conduct, and analysis. Lancet, 2014; 383:166–175.

80. Moher, Glasziou, Chalmers, et al. Increasing value and reducing waste in biomedical research: Who's listening?. Lancet, 2016; 387:1573–1586.

81. Jakobsen, Nielsen, Feinberg, et al. Direct-acting antivirals for chronic hepatitis C. Cochrane Database Syst Rev, 2017; 9:CD012143.

82. Ciani, Buyse, Drummond, et al. Time to review the role of surrogate end points in health policy: State of the art and the way forward. Value Health, 2017; 20:487–495.

83. Schulz, Altman, Moher. CONSORT 2010 statement: Updated guidelines for reporting parallel group randomised trials. BMJ, 2010; 340:c332.

84. Trikalinos, Churchill, Ferri, et al. Effect sizes in cumulative meta-analyses of mental health randomized trials evolved over time. J Clin Epidemiol, 2004; 57:1124–1130.

85. Thorlund, Devereaux, Wetterslev, et al. Can trial sequential monitoring boundaries reduce spurious inferences from meta-analyses?. Int J Epidemiol, 2009; 38:276–286.

86. Goodman. Toward evidence-based medical statistics. 2: The Bayes factor. Ann Intern Med, 1999; 130:1005–1013.

87. Brok, Thorlund, Gluud, Wetterslev. Trial sequential analysis reveals insufficient information size and potentially false positive results in many meta-analyses. J Clin Epidemiol, 2008; 61:763–769.

88. Wetterslev, Thorlund, Brok, Gluud. Trial sequential analysis may establish when firm evidence is reached in cumulative meta-analysis. J Clin Epidemiol, 2008; 61:64–75.

89. Jakobsen, Gluud, Winkel, et al. The thresholds for statistical and clinical significance—A five-step procedure for evaluation of intervention effects in randomised clinical trials. BMC Med Res Methodol, 2014; 14:34.

90. Jakobsen, Wetterslev, Winkel, et al. Thresholds for statistical and clinical significance in systematic reviews with meta-analytic methods. BMC Med Res Methodol, 2014; 14:120.

91. Montori, Devereaux, Adhikari, et al. Randomized trials stopped early for benefit: A systematic review. JAMA, 2005; 294:2203–2209.

92. Imberger, Vejlby, Hansen, et al. Statistical multiplicity in systematic reviews of anaesthesia interventions: A quantification and comparison between Cochrane and non-Cochrane reviews. PLoS One, 2011; 6:e28422.

93. Khorsan, Crawford. How to assess the external validity and model validity of therapeutic trials: A conceptual approach to systematic review methodology. Evid Based Complement Alternat Med, 2014; 2014:694804.

94. Bornhöft, Maxion-Bergemann, Wolf, et al. Checklist for the qualitative evaluation of clinical studies with particular focus on external validity and model validity. BMC Med Res Methodol, 2006; 6:56.

95. Mathie, Roniger, Wassenhoven, et al. Method for appraising model validity of randomised controlled trials of homeopathic treatment: Multi-rater concordance study. BMC Med Res Methodol, 2012; 12:49.

96. Dyrvig, Kidholm, Gerke, Vondeling. Checklists for external validity: A systematic review. J Eval Clin Pract, 2014; 20:857–864.