Abstract
This article explores how systematic reviews can add to a general practitioner’s knowledge toolbox and describes scenarios in which they can help to inform a decision. It also considers how the trustworthiness of the information from a systematic review, or indeed any knowledge resource, can be assessed, and describes some of the ways in which systematic reviews are changing. A follow-up article will explore in more detail how to appraise, understand and use the information in a systematic review.
Clinical case scenario
Mr Brown and his wife attend the surgery to discuss his care. Mr Brown has lung cancer. They have heard from an elderly medical relative that a study has shown that use of oral ‘blood thinners’ might be effective in people with small cell lung cancer. They are particularly concerned because a friend also had a form of cancer and died due to a massive pulmonary embolism. They, therefore, wonder whether Mr Brown would benefit from taking an oral anticoagulant.
You decide to search on the National Institute for Health and Care Excellence and Clinical Knowledge Summaries websites, and BMJ Best Practice for guidance on this matter, but cannot find any reference to the use of oral anticoagulation in people with no obvious indication except cancer. You search The Cochrane Library and find a systematic review: Oral anticoagulation in people with cancer who have no therapeutic or prophylactic indication for anticoagulation. This review finds moderate-certainty evidence of little or no difference in mortality, although there is low-certainty evidence of a reduction in thromboembolism in people treated with oral anticoagulants compared with no treatment. You note that there is also moderate-certainty evidence of an increase in both major and minor bleeding, increasing the absolute risk of major bleeding from around 5% to 10%. Mr and Mrs Brown had not understood that there would be risks involved, and easily make the decision that they do not wish to pursue this treatment.
Background
Evidence has shown that clinicians identify knowledge-focused questions in over half of all patient consultations (Del Fiol et al., 2014). About half of these questions were pursued further, and an answer was found for about three quarters of those pursued. The most common questions related to drug treatment, closely followed by the cause of a symptom, physical finding or diagnostic test result. The authors stated that ‘clinicians’ lack of time and doubt that an answer exists were the main barriers to information seeking’. More recently, an Australian study found that of 126 clinical questions submitted by GPs to an evidence-based practice information service, treatment accounted for over 70% of enquiries, and diagnosis was the next most common at 15% (Muscat et al., 2020). This study concluded that there was an unmet need for better systems to identify ‘real time’ questions so that more relevant and timely evidence could be made available for use in the clinical encounter.
It is unrealistic to expect clinicians to keep abreast of the scientific literature (Bastian et al., 2010). However, high-quality systematic reviews represent the best tools we have to gain a realistic appreciation of the effects of health care interventions. They are the basis for most national and international guidelines since they pull together all the highest quality relevant research and summarise the evidence on the likelihood of beneficial and harmful effects of health care interventions. This process is frequently associated with statistical pooling of the results of similar studies, or meta-analysis. Hence, reviews have the potential to:
• Identify and analyse all the relevant evidence on the effects of health care interventions
• Identify and quantify sources of bias
• Optimise the statistical power by pooling data across studies
• Identify the presence, direction and magnitude of any observed effects (benefits or harms) and assess the certainty that they truly represent the real-world effect
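As an illustration of how pooling data across studies improves statistical power, the sketch below applies inverse-variance fixed-effect pooling, one common approach to meta-analysis, to three trials. The trial results are hypothetical and the function names are illustrative only; this is not taken from any particular review, and real reviews often use random-effects models and dedicated software.

```python
import math


def pool_fixed_effect(log_rrs, std_errs):
    """Inverse-variance fixed-effect pooling of log relative risks.

    Each study is weighted by 1 / SE^2, so larger (more precise)
    studies contribute more to the pooled estimate.
    """
    weights = [1.0 / se ** 2 for se in std_errs]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_rrs)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


def ci95(estimate, se):
    """Approximate 95% confidence interval on the log scale."""
    return estimate - 1.96 * se, estimate + 1.96 * se


# Hypothetical trials (assumed values for illustration only):
# relative risks of 0.80, 0.70 and 0.90 with their standard errors.
log_rrs = [math.log(0.80), math.log(0.70), math.log(0.90)]
std_errs = [0.30, 0.25, 0.20]

pooled, pooled_se = pool_fixed_effect(log_rrs, std_errs)
low, high = ci95(pooled, pooled_se)
print(f"Pooled RR {math.exp(pooled):.2f} "
      f"(95% CI {math.exp(low):.2f} to {math.exp(high):.2f})")
print(f"Pooled SE {pooled_se:.3f} vs smallest single-trial SE {min(std_errs):.3f}")
```

The pooled standard error is always smaller than that of the most precise single trial, which is the sense in which pooling increases power. It does not, by itself, correct for bias in the underlying studies.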
However, systematic reviews are probably infrequently consulted in GP consultations. There are many reasons for this, including perceptions around the retrievability, complexity, and applicability of the information they provide, and the extent to which the results are likely to be ‘actionable’.
There are also many other useful resources already available for health professionals to consult. These include drug formularies, electronic decision support tools such as UpToDate, DynaMed and BMJ Best Practice, and local and international guidelines from organisations such as the National Institute for Health and Care Excellence (NICE), the Scottish Intercollegiate Guidelines Network (SIGN) and the World Health Organization.
What is a systematic review?
Cochrane, a global organisation that produces, publishes and maintains high-quality systematic reviews, uses the following definition (Cochrane Library, 2019): A systematic review attempts to identify, appraise and synthesize all the empirical evidence that meets pre-specified eligibility criteria to answer a specific research question … (using) explicit, systematic methods that are selected with a view to minimizing bias, to produce more reliable findings to inform decision making.
Therefore, a key element of a systematic review is a pre-defined question and accessible research plan.
How do systematic reviews fit into the knowledge toolbox?
As we have seen, questions are frequently raised in clinical consultations. Many of these can be addressed by other information sources such as formularies, e-textbooks and guidelines. In the UK,
There are also many questions that are not well addressed in any of the knowledge resources above, most of which tend to take a linear view of the clinical process, where one action follows another and the next follows after. This can be useful in many circumstances, but as GPs know, life is not always that straightforward.
Standard framework for systematic reviews of interventions.
Traditional systematic reviews have focused on assessing the effects of health care interventions in terms of patient-important outcomes. The interventions may be aimed at prevention, screening, diagnosis or treatment. Unsurprisingly, the majority of published systematic reviews evaluate drug treatments, but systematic reviews may evaluate interventions of any kind, including therapies, vaccinations, complementary and alternative remedies, surgery, public health interventions and different models for the delivery of care. All of these research questions can be framed in terms of a ‘PICO’ (Population, Intervention, Comparison, Outcome), and indeed identifying the question in such terms is the first stage of the process. Systematic reviews that seek to assess the accuracy of diagnostic tests (sensitivity and specificity as opposed to clinical outcomes), or to evaluate prognosis, risk factors, aetiology or qualitative evidence, are outside the scope of this article.
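To make the PICO structure concrete, the sketch below frames the question raised by Mr Brown in the clinical case scenario as a small data structure. The class and field names are illustrative assumptions and not part of any standard tool; the wording of each element is taken from the scenario.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PicoQuestion:
    """Hypothetical container for the four elements of a PICO question."""
    population: str
    intervention: str
    comparison: str
    outcomes: tuple

    def as_sentence(self):
        """Render the question in plain language."""
        return (
            f"In {self.population}, what is the effect of {self.intervention} "
            f"compared with {self.comparison} on {', '.join(self.outcomes)}?"
        )


# The question from the clinical case scenario, expressed as a PICO.
mr_brown = PicoQuestion(
    population="people with cancer and no other indication for anticoagulation",
    intervention="oral anticoagulants",
    comparison="no treatment",
    outcomes=("mortality", "thromboembolism", "major bleeding", "minor bleeding"),
)

print(mr_brown.as_sentence())
```

Writing the question down in this way makes it easier to check each element of a review against the patient in front of you: the same four fields can be compared one by one with the review's stated eligibility criteria.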
What systematic reviews are not and why this matters
Systematic reviews, and indeed evidence more broadly, are sometimes perceived as being authoritarian and undermining of the expertise and autonomy of both patients and professionals. These criticisms are exemplified by the term ‘cookbook medicine’ (Genuis and Genuis, 2004). The introduction of guidelines and rules-based approaches to managing health services and budgets may have fuelled this misperception. However, from the outset a key principle of evidence-informed health care has been that evidence in isolation is insufficient to determine a clinical decision. As early as 2002, Haynes and colleagues wrote: … research evidence alone is not an adequate guide to action. Rather, clinicians must apply their expertise to assess the patient's problem and must also incorporate the research evidence and the patient's preferences or values before making a management recommendation. (Haynes et al., 2002)
The misconception of evidence-based medicine (EBM) as didactic rather than facilitative may have been a barrier to its application in health care decision making. Cochrane reviews are not permitted to provide recommendations, since despite their detailed understanding of what the evidence shows about the benefits and harms of a given intervention, researchers cannot replicate those elements of professional expertise and patient preferences, values and expectations or the geographical, economic or social context that are equally important in decision making.
In the remainder of this article I will discuss the following important elements:
• How can a reader determine whether evidence and health knowledge resources are trustworthy?
• How is evidence changing?
Can I trust what I am reading?
Red flag indicators that evidence may be suspect.
Formal tools and checklists have been developed to assess the quality of conduct and reporting of a systematic review. These include the AMSTAR (A MeaSurement Tool to Assess systematic Reviews) and ROBIS checklists and the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) standards on the quality of reporting (Liberati et al., 2009; Shea et al., 2017; Whiting et al., 2016). These are all useful reference tools but may be judged too time-consuming for regular use by busy GPs.
Perhaps the first question to ask is whether the question addressed by the review is a close match to that raised by the particular patient. This may not be as straightforward as it might appear. An understanding of the PICO framework is frequently useful. How closely do the patients in the trials resemble the patient in front of you? Differences of age, sex, ethnicity, comorbidity, country of origin or inequality may all exert an influence. For example, when a team at the BMJ was tasked with evaluating the evidence for treating Acquired Immune Deficiency Syndrome in resource-poor settings in the mid-2000s, the researchers had to consider the extent to which the evidence available from randomised controlled trials (RCTs) was applicable. All these studies were undertaken in relatively high-income countries, where the patient demographics and services were quite different from those in the target populations.
Similarly, in considering the PICO question, it is important to consider whether the interventions and comparisons match the relevant patient or population. For example, a review that focuses on a comparison of an intervention with placebo or no treatment is not useful when the question that matters to patients is how it compares with another active agent. Finally, there is the issue of outcomes. It is inevitable that patients will differ in how they prioritise the likely benefits or harms. Even outcomes such as long-term survival may be of more importance to some people than to others, who may prioritise quality of life or avoiding harm. Checking whether a review reported the outcomes it set out to assess, and the extent to which these match the patient’s values and preferences, are critical first-order questions.
There are several more checks that can be applied that will provide a guide as to the reliability or otherwise of the findings:
Transparency of authorship and the importance of a multi-professional team
For any piece of health knowledge, it should be easily possible to identify the authors, their affiliations, their conflicts of interest and any external support they received. In addition, systematic reviews and guidelines in the modern world should be a team activity, and yet some are still published with only one author. By definition this precludes the careful reflection and dialogue that should be an important element of the process, and for some elements of the review process such as study selection and data extraction, two authors working independently remains a key quality assurance factor. Finally, it is rare for one individual to combine the expertise in information science, content, methods and statistics that are required in a high-quality review.
Conflicts of interest
Many systematic reviews are produced by teams who have a clear financial interest in the outcome. At a minimum this should be reported, along with details of any specific funding for the review. Of course, any such conflict of interest (COI) can also occur in the included studies. Many reviews include studies that are dominated by industry sponsorship. There is ample evidence that this renders the studies more likely to return a positive result and conclusions that favour the intervention in question (Lundh et al., 2017). Whereas financial conflicts of interest are frequently declared, non-financial conflicts of interest (such as being a trialist or a known advocate for or opponent of an intervention) are not. Cochrane has one of the strictest COI policies across scientific journals, based on the principle that declaration in itself might not be sufficient, but this remains a highly controversial area (Bero, 2018).
Publicly accessible protocol and adherence to this in the reporting of the research
For all high-quality systematic reviews there should be an accessible protocol or research plan, published before work gets under way on the review. Protocols may be made public in a variety of ways. For example, Cochrane and the journal Systematic Reviews publish review protocols.
Comprehensive and up-to-date search
Systematic reviews that rely on only one bibliographic database (e.g. PubMed), or limit themselves to study reports in one language (e.g. English), or by date of publication may lack key data. The search date should be reported for each database or source, so that the reader can judge whether it is likely to have missed important data.
Risk of bias
Irrespective of the type of study, there should be some assessment of the risk of bias of the evidence. The Cochrane risk of bias tool for RCTs (Higgins et al., 2011) is about to be released in its second major iteration (ROB2, 2019). It is now viewed worldwide as the preferred tool. There are separate tools for assessing risk of bias in non-randomised studies (Sterne et al., 2016).
Fair and balanced reporting
Spin is a major issue in relation to the reporting of science, as it is in other aspects of public life. Spin does not necessarily represent a deliberate intention to deceive, but it may reflect a strong commitment on the part of the author or researcher to a particular view, particularly around the effectiveness of a given intervention. There are a number of checks that the reader can use to make a brief assessment of this issue:
• The conclusions of the review (or any other article) should be entirely consistent with the data.
• Relative and absolute effects should be presented for both benefits and harms. It is commonplace to present results on beneficial outcomes in relative terms only, e.g. Relative Risk of 0.5 – or ‘reducing the risk of x by 50%’. Studies have shown that this leads to overly optimistic assumptions by readers, since depending on the control group event rate this may or may not represent an important difference in absolute terms (Gigerenzer, 2002). Similarly, harms data may be presented only in absolute terms, which can have exactly the opposite effect on the reader’s assumptions.
• Outcomes should be reported on the basis of their importance to patients, and the a priori plan described in the protocol. Ordering should not be guided by whether there was or was not a statistically significant effect. Negative or inconclusive results are as important as positive ones. Statistical significance is not a proxy for the presence of a clinically important effect, and articles that focus merely on an arbitrary determination of statistical significance (e.g. p < 0.05), can be misleading (Pike, 2019).
• Harms should always be reported, even when they were sought but not found.
• Systematic reviews should not present recommendations on treatment decisions, but should simply interpret the results of the review in the context of what is already known. In addition, they should present gaps or deficiencies in the evidence.
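The point about relative and absolute effects can be illustrated with a short calculation. The sketch below applies the same relative risk of 0.5 to three hypothetical control group event rates, and then applies the approximate figures for major bleeding from the clinical case scenario (around 5% rising to 10%). The function name is illustrative, and the control group rates of 20%, 2% and 0.2% are assumptions chosen to show the contrast.

```python
def absolute_effect(control_risk, relative_risk):
    """Convert a relative risk into absolute terms for a given control risk.

    Returns the risk in the treated group, the risk difference
    (negative for a reduction, positive for an increase) and the number
    of people who need to be treated for one additional person to
    benefit (NNT) or be harmed (NNH).
    """
    treated_risk = control_risk * relative_risk
    risk_difference = treated_risk - control_risk
    if risk_difference == 0:
        number_needed = float("inf")
    else:
        number_needed = 1.0 / abs(risk_difference)
    return treated_risk, risk_difference, number_needed


# Benefit: the same 'halving of risk' at three assumed control event rates.
for control_risk in (0.20, 0.02, 0.002):
    treated, diff, nnt = absolute_effect(control_risk, 0.5)
    print(f"Control risk {control_risk:.1%}: treated risk {treated:.1%}, "
          f"absolute change {diff:+.1%}, NNT about {nnt:.0f}")

# Harm: major bleeding rising from around 5% to 10% (relative risk of 2).
treated, diff, nnh = absolute_effect(0.05, 2.0)
print(f"Major bleeding: absolute change {diff:+.1%}, NNH about {nnh:.0f}")
```

A relative risk of 0.5 therefore corresponds to treating about 10, 100 or 1000 people to prevent one event, depending on the baseline risk. In the scenario, a doubling of major bleeding from 5% to 10% corresponds to about one additional major bleed for every 20 people treated.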
Measuring quality (or certainty) of a body of evidence
The degree to which the body of evidence identified in the review provides a true estimation of the effects of the intervention(s) being tested is a crucial factor. This incorporates not only the risk of bias analysis for each study but also, importantly, other criteria that may increase or diminish the certainty that the calculated effect estimate is close to the actual effect. The prevailing method worldwide is now the approach described by the GRADE (Grading of Recommendations Assessment, Development and Evaluation) Working Group, which will be discussed in more detail in a separate article (Guyatt et al., 2008). GRADE is used by NICE and other international bodies. The principal output of the GRADE method is the Summary of Findings Table, which, once understood, provides the quickest, most accessible and actionable summary of the results of a systematic review.
How evidence is changing
Evidence synthesis approaches and methods are constantly changing and becoming more complex. Decision-makers are leading the call for many of the recent advancements, as they seek evidence-based approaches to non-traditional questions such as prognosis, the role of risk factors, more personalised medicine, or evaluations of how a service is delivered. There are also increasingly ambitious expectations for the efficient delivery of evidence. Recent changes reflect a number of different influences:
Emergence of new data sources
There has been a growing appreciation within the evidence community of important methodological limitations of traditional approaches to evidence synthesis. Foremost among these is the realisation, growing over at least two decades, of the limitations of relying on published reports of trials in scientific journals. Given the potential impact of publication bias (non-publication of results) and selective outcome reporting (usually the selection of outcome measures that are more favourable to the intervention under study in published reports), this has led to the identification and use of different data sources. These include Clinical Study Reports, Trials Registries such as clinicaltrials.gov, and individual participant data.
New methods
New and richer methods help to produce evidence that is more useful to evidence users. For example, network meta-analysis enables the comparison of many different interventions from studies where similar patients have been recruited. It combines data in a single network of interventions using methods that exploit direct and indirect comparisons from the underlying trials. Decision-makers can then obtain treatment rankings for different interventions that might be used at a particular point in the patient journey. Other methods, including the increased availability and use of individual participant data, and meta-regression, have been developed to support moves towards personalised medicine.
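The idea of an indirect comparison can be shown with a simple calculation. The sketch below uses the adjusted indirect comparison, in which two treatments that have each been compared with a common comparator are compared with one another through it. This is only a building block: a full network meta-analysis combines many such direct and indirect comparisons at once and needs specialist software. The treatment labels and trial results are hypothetical.

```python
import math


def indirect_comparison(log_rr_b_vs_a, se_b_vs_a, log_rr_c_vs_a, se_c_vs_a):
    """Indirect estimate of C versus B through a common comparator A.

    On the log scale the indirect effect is the difference between the
    two direct effects, and the variances add, so the indirect estimate
    is always less precise than either direct estimate.
    """
    log_rr_c_vs_b = log_rr_c_vs_a - log_rr_b_vs_a
    se_c_vs_b = math.sqrt(se_b_vs_a ** 2 + se_c_vs_a ** 2)
    return log_rr_c_vs_b, se_c_vs_b


# Hypothetical pooled results (assumed values for illustration only):
# drug B versus placebo, and drug C versus placebo.
log_rr_b, se_b = math.log(0.80), 0.10
log_rr_c, se_c = math.log(0.60), 0.15

log_rr_cb, se_cb = indirect_comparison(log_rr_b, se_b, log_rr_c, se_c)
low = math.exp(log_rr_cb - 1.96 * se_cb)
high = math.exp(log_rr_cb + 1.96 * se_cb)
print(f"Indirect RR, C vs B: {math.exp(log_rr_cb):.2f} "
      f"(95% CI {low:.2f} to {high:.2f})")
```

The calculation relies on the trials being similar enough in their patients and methods for the comparison through placebo to be fair, which is why the text stresses studies where similar patients have been recruited.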
Different questions
Newer methods also address questions that are important to evidence users but lie outside the scope of traditional systematic reviews of interventions, such as qualitative evidence, prognosis, reviews of rare harms, economic evidence and complex interventions.
Conclusion
Systematic reviews are widely used in the development of clinical and policy guidelines, but use in individual consultations is more unusual. There are a number of understandable reasons for this situation, including the length and complexity of many systematic reviews, their tendency (perceived or otherwise) to be out of date and the reality that many high-quality reviews are equivocal in their findings. However, as professionals seek to deliver active shared decision making, systematic reviews frequently provide the most realistic assessment of the effects of interventions in health care. This article seeks to provide a framework that health professionals can use to assess the evidence presented in terms of its reliability, applicability and the extent to which it can inform decision making.
Evidence should be facilitative, not authoritarian: informed patients can reasonably make diametrically different decisions, dependent on their personal choices, preferences and values. However, a basic understanding of the evidence from the highest quality research available can nonetheless be a facilitator of decision making. A critical approach on the part of both professional and patient is also always important. Evidence is not written in stone, and as John Ioannidis and others have reported, there are many causes of bias, poor choices and low quality that can skew results (Ioannidis, 2005).
Finally, it is clear that evidence is changing: using new and diverse data sources, addressing more complex questions, and developing new methods to improve the validity or utility of the evidence produced. It is unlikely that most health care professionals will have the time or inclination to become masters of clinical epidemiology, but a basic understanding could lead to better decision making. Evidence synthesis researchers likewise need to spend more time and effort communicating their findings in ways that are accessible and useful to the people at the sharp end of clinical decision making.
KEY POINTS
• Systematic reviews are the cornerstone of evidence-informed health care and at the pinnacle of the hierarchy of evidence
• Systematic reviews attempt to identify, appraise and synthesize evidence meeting eligibility criteria to answer a specific research question using methods aimed at minimising bias, to produce findings that can inform decision making
• An assessment of the best current evidence forms only one element of the information needed for decision making
• Decision making is also influenced by patients’ values and preferences and the expertise of the health professional
• The MRCGP curriculum emphasises the importance of skills to understand systematic reviews and critically assess whether they provide trustworthy information and credible results
• This article provides a short framework for assessing the quality and trustworthiness of systematic reviews
