Abstract

Since 2010, there has been a growing expectation that health care will be transformed by “big data” and “big data analytics”, most visibly in the form of new technologies that incorporate artificial intelligence (AI) algorithms. In 2016, Obermeyer and Emanuel 1 anticipated that there would be rapid advances in the performance of prognostic models and in the ability of AI algorithms to interpret diagnostic images and scans, advances that would support health care professionals to more reliably and accurately interpret the large amounts of information that accompany the delivery of modern medicine. And yet the transformation has been slow to materialise, with Kelly et al 2 commenting in 2019 that despite many studies reporting the superior performance of AI-based technologies compared with current practice, few had been deployed.
Kelly et al 2 described various reasons for the slow uptake of AI-based technologies, including uncertainty about (1) how AI-based tools could be implemented within existing care pathways and (2) whether the better performance reported in research studies would translate into more effective care. The authors also highlighted the limited explainability of these new tools and how this collides with an intrinsic feature of patient-centred care – the need for patients and clinicians to have information that is interpretable and trustworthy given the potential for poor decisions to lead to adverse health outcomes. This principle was given added weight in many countries by legislation (e.g., through the EU General Data Protection Regulation) that gave people the right to have an explanation for how a decision that affects their health care was made. 3 By 2021, systematic reviews that assessed the methodological quality of studies researching AI-based technologies were also raising concerns about whether adopted designs and analyses were sufficiently rigorous. For example, a review of studies that evaluated the performance of “deep learning” AI algorithms to interpret medical imaging 4 and a review of studies developing prognostic models using machine learning techniques 5 both concluded that most reviewed studies had significant methodological weaknesses and were at high risk of bias.
During the last few years, logistical challenges around the deployment of AI-based technologies seem to have diminished. In the United States of America, the Food and Drug Administration’s list of AI and Machine Learning-Enabled Medical Devices approved for use contained a total of 950 devices in August 2024. 6 In the UK, the rapid investigation into the state of the NHS by Lord Darzi noted that 56% of NHS acute trusts were using AI tools within radiology departments, 7 suggesting that hospitals had found ways to integrate these tools into diagnostic pathways. However, the 2024 guidance from the National Institute for Health and Care Excellence (NICE) on the use of AI-based software for stroke diagnosis and treatment 8 offers a more nuanced picture. The guidance examined the potential benefits of 13 AI-based tools when used by health professionals to interpret CT brain scans of patients with stroke symptoms. The quicker identification of people with an ischaemic stroke could potentially speed up patients’ access to time-sensitive treatments such as thrombolysis and thrombectomy.
The NICE guidance noted that AI-based technologies had been widely adopted within NHS stroke units despite significant gaps in the research evidence. Many studies were found not to reflect real-world use because the performance of the AI-based software was evaluated in isolation and not combined with clinician interpretation. The guidance recommended that NHS stroke services continue using three AI-based technologies with health care professional review, and that the other AI-based software should only be used within research studies. A theme within the guidance was the lack of generalisable evidence on whether AI-based technologies led to faster access to treatment for patients with ischaemic stroke, and whether their use was cost-effective.
The NICE guidance highlights some of the opportunities for health services researchers to play an important role in the adoption of AI-based technologies. It is likely that investment in AI-based tools will continue to grow; the Darzi review remarked that “there must be a major tilt towards technology to unlock productivity” in the NHS (p. 13). 7 Yet, there are large gaps in the evidence on where AI-based tools can be integrated into clinical practice to deliver the most benefit for patients and health professionals, and when such investments represent good value for money. Advice on how to report and appraise AI-based prognostic models has been published to help ensure that these are developed using robust methods and that evaluations are rigorous. 9 Nonetheless, the design of evaluations will not be straightforward in many circumstances – for example, when an AI-based tool is introduced into a complex care pathway or when its introduction is just one change in an ongoing process of service redesign. 8
Alongside these questions about effectiveness and value for money, the adoption of AI-based technologies generates many questions that fall within the scope of health services research. The breadth of these can be deduced from suggested principles for the responsible development of AI-based technologies. 9,10 These principles lead to questions about (1) ensuring equitable access to and use of these technologies, (2) whether AI-based tools are used in ways that respect patient autonomy and protect a person’s privacy, (3) the funding, staffing, and organisation of a health care system, (4) the quality of health care services, including how to implement the ongoing monitoring and updating of AI-based tools, and (5) whether AI-based technologies will increase or decrease the risk of overdiagnosis and overtreatment. More fundamentally, there are questions about what impact AI-based technologies will have on health outcomes, and how to ensure that their use reduces disparities in health rather than exacerbates differences between population groups. 10
In many situations, it is likely that these questions can be answered by studies using existing health services research methods. But the evolution of “big data” and “big data analytics” also presents health services researchers with opportunities to develop new approaches. These might arise from the creation of rich, population-based computerised health care datasets. In addition, studies have begun to explore the potential of social media as a source of data about health care use, often using machine learning algorithms to identify patterns (see the review by Walsh et al 11 for an overview of health research using spontaneously generated online patient experience (SGOPE) data). But the use of “big data” and “big data analytics” poses challenges to the conduct of robust health services research in much the same way as it does for the delivery of effective health care. For example, information produced for health care professionals, policy makers, patient representatives, and other stakeholders needs to be interpretable if it is to support decision-making. Consequently, when considering the use of machine learning algorithms, researchers will face the trade-off between performance and explainability, and making good decisions about this will require knowing when the audience will need to understand how the results were produced as well as the results themselves. As such, the growth of “big data” and “big data analytics” could be transformative for health services research as well as for health care services.
