Safeguarding AI-driven digital health – An adaptation of the Swiss cheese model for safety

Abstract

The use of artificial intelligence (AI) in digital health applications is increasing rapidly, creating new opportunities as well as safety challenges. AI-related errors can be systematic, repeating consistently across similar cases and, when deployed at scale, potentially affecting multiple patients within a short time. In this paper, we extend James Reason’s classic distinction between person-based and system-based approaches to error by incorporating a third, technology-based perspective to account for failures inherent to AI-driven digital health systems and their lifecycle, including issues related to data, model design, deployment, and monitoring. Protecting patients in an increasingly AI-driven healthcare system requires not only model-centric controls, but an integrated set of overlapping technical, organizational, and regulatory safeguards designed explicitly for AI-driven digital health. To explore how such safeguards may be structured, we conducted a qualitative, exploratory analysis of 15 real-world incidents retrieved from an AI incident database. We extracted contributing factors for these incidents and conducted a thematic analysis. Aiming for conceptual alignment with Reason’s Swiss cheese model for safety, the themes were grouped into higher-level system layers. As a result, we propose five protective layers to safeguard AI-driven digital health: (1) data governance and quality assurance; (2) model development, evaluation and change control; (3) sociotechnical integration and human oversight; (4) governance, regulatory and ethical compliance; and (5) post-market monitoring, incident response, and learning. To operationalize the layers, we also propose safeguards derived from the same incident analysis. These are not intended as an exhaustive or consensus-based set of recommendations, but as an exploratory, incident-informed framework to support analysis and system design.
Finally, we argue for a “just culture” in AI-driven digital health, where researchers, healthcare professionals, and patients are encouraged to report critical incidents and errors, as this is essential for learning and system improvement.

Keywords

patient safety; artificial intelligence; adverse effects; digitalovigilance; unexpected harm; digital health; Swiss cheese model

1. Introduction

Artificial Intelligence (AI)-driven digital health applications are becoming increasingly embedded across healthcare systems. For the purpose of this work, we understand AI-driven digital health as the use of AI within digital health technologies to analyze data, inform clinical and patient decisions, and support more personalized and efficient care.¹ We consider both AI-based systems that are regulated medical devices and non-regulated health applications. The boundary of AI-driven digital health is not regulatory status, but the functional role of AI systems in influencing health-related decisions or actions within a healthcare or health-relevant context. This includes systems that directly inform clinical decision-making as well as those that shape patient behavior or access to care. They span healthcare professional-facing tools such as AI-enabled clinical documentation systems and ambient listening technologies,² as well as patient-oriented applications including symptom-checking chatbots³ or digital therapeutics.⁴

While the adoption of AI-driven digital health has accelerated rapidly, this growth has been accompanied by substantial concerns regarding risks to patient safety and care quality.⁵ In response, new conceptual frameworks such as digitalovigilance⁶ and algorithmovigilance⁷ have emerged to study how digital technologies and AI influence clinical outcomes and health system functioning.⁸ Algorithmovigilance focuses on monitoring and managing risks associated with algorithms, a topic that specifically arose with the availability of AI and machine learning algorithms.⁷ Digitalovigilance refers to the monitoring, evaluation and management of risks associated with digital health technologies and data-driven health ecosystems.⁶ Such approaches reinforce the importance of moving beyond static validation toward dynamic, post-deployment surveillance to ensure ongoing clinical reliability and patient safety.⁹ Nevertheless, the understanding of the full range of unintended consequences associated with AI-driven digital health remains incomplete, and a systematic characterization of the potential harms and the conditions under which they occur is missing.⁵,¹⁰

Recent analyses of AI-driven digital health incidents illustrate that algorithms may fail to detect critical clinical events or produce misleading outputs in high-stakes contexts.¹¹ Beyond this, serious errors linked to AI-driven digital health have been reported, along with indications that such incidents are underreported or inconsistently documented.¹² Coiera et al. noted “little reporting of patient harms from trials.”¹³ Farrah Adegunle et al. concluded that current UK and US regulatory models “lack mechanisms to systematically detect or prevent bias.”¹⁴ Muralidharan et al. found that among 692 FDA-approved AI/Machine Learning devices, only 3.6% reported race/ethnicity data, 99.1% provided no socioeconomic data, and only 9% included a prospective study for post-market surveillance.¹⁵ Publicly available incident repositories such as the OECD AI Incidents and Hazards Monitor¹⁶ suggest that incidents are being identified with increasing frequency, although their true magnitude remains uncertain. As acknowledged by Lakhan et al., limitations of the existing, rather passive, surveillance strategies include underreporting, reporting bias, and limited capacity to capture software-specific harms.¹⁷

Traditionally, patient safety science has addressed risks through system-level learning, reporting, and layered safeguards.¹⁸ As early as 2000, James Reason distinguished between person and system approaches to error.¹⁸ Reason argued that human error is inevitable, even among skilled healthcare professionals. Therefore, safety should focus less on blaming individuals and more on designing systems that anticipate, detect, and mitigate errors. The introduction of AI, however, adds new sources of risk to healthcare systems. AI failures in healthcare manifest differently from human failures due to their systematic, scalable and often silent nature. When AI-driven digital health systems learn flawed patterns, they reproduce them consistently across similar cases, which can affect thousands of patients simultaneously. In contrast, a human would not make the same error consistently and in different places at the same time. AI-driven digital health systems can also generate outputs that are plausible yet incorrect, which makes such errors difficult to detect.¹⁹ As a result, they may fail silently and with high confidence. Beyond these core characteristics, AI-driven digital health systems are susceptible to environmental changes, i.e. models can perform well under the conditions they were trained on, but their accuracy can drop when the environment and context change.²⁰ Further, AI models often learn statistically relevant patterns that correlate with outcomes, while the actual causes remain hidden. For example, a particular diagnosis may be strongly associated with a specific combination of symptoms. However, if a patient presents with an atypical symptom profile, the model may fail to provide an accurate diagnosis.

In this paper, we extend James Reason’s models of error and safeguards¹⁸ to the emerging landscape of AI-driven digital health. We argue that protecting patients in a healthcare system that is increasingly AI-driven requires not only isolated controls, but an integrated system of overlapping technical, organizational, and regulatory safeguards designed explicitly for AI-driven digital health. Building on this perspective, we aim to complement and extend existing risk frameworks such as the National Institute of Standards and Technology AI Risk Management Framework, ISO 14971-based medical device risk management, and emerging concepts such as digitalovigilance and algorithmovigilance by providing a unified, incident-informed, and healthcare-specific safety model. In contrast to existing approaches, which often address individual dimensions of risk (e.g., technical performance, governance, or post-market monitoring) in isolation, our framework integrates these dimensions across multiple layers and explicitly captures how failures can propagate through the system. The study develops a conceptual framework informed by a qualitative, exploratory analysis of empirical cases. An exploratory incident review is used to ground the framework in observed real-world failures, rather than to derive a fully generalizable or formally validated qualitative theory.

2. An error approach to AI-driven digital health

We propose extending James Reason’s model of human error, which comprises the person and the system approach, with a technology approach. This addition better captures failure modes specific to AI-driven digital health and avoids attributing harm solely to healthcare professionals, patients or the healthcare system (see Figure 1). We elaborate on the person, technology, and system approaches below.

Figure 1.

Error approaches to AI-based systems in healthcare. Building on the two error approaches of James Reason, we introduce a third approach that refers to the technology, i.e. failures of AI-driven digital health systems.

2.1. Person approach

The person approach in James Reason’s model locates the source of error in individual healthcare professionals.¹⁸ Errors originate from mental processes such as inattention, poor motivation, forgetfulness or carelessness. With patient-facing AI-driven digital health, errors can also be caused by patients, who may commit errors, intentionally or unintentionally, when using these tools. Such errors add to those committed by healthcare professionals as described in the original model. In contexts where AI outputs are perceived as authoritative by either patients or healthcare professionals, errors attributable to individuals may be reinforced through deskilling and automation bias.²¹ Countermeasures should strengthen healthcare professionals’ and patients’ ability to calibrate trust, critically interpret AI outputs, and use AI-driven digital health systems as intended. For instance, ensuring that outputs are explainable and that the uncertainty of AI models is clearly communicated can help users recognize when additional verification is needed and thereby help prevent careless use.²²

2.2. Technology approach

As an additional analytic lens to the original safety model, we introduce the technology approach that locates the source of failures in the AI-driven digital health systems themselves. Failures can be divided into active failures and latent conditions, in parallel with Reason’s terminology.¹⁸ Active failures are visible manifestations of errors in patient care, such as unsafe inferences from input data, hallucinations, erroneous automatically generated diagnoses, and inappropriate treatment recommendations or health advice. Unlike individual human mistakes, such failures can occur consistently across similar cases, creating systematic error patterns that affect multiple patients simultaneously. They may also manifest themselves subtly or with a delay, which makes them harder to detect than conventional clinical errors. Avoiding such failures requires critical evaluation of the AI algorithms as well as a clearly defined and described scope, so that users know for which application areas and tasks accurate results can and cannot be expected.

Latent technological conditions, by contrast, are vulnerabilities embedded within the AI lifecycle. These include biased or incomplete training datasets, inadequate validation in real-world clinical contexts, a lack of explainability, insufficient cybersecurity protections and insufficient regulatory compliance or governance. Addressing these issues requires robust design, careful selection of datasets, knowledge about the underlying datasets and their biases, thorough validation and continuous post-deployment monitoring as well as consideration of regulatory requirements regarding security and privacy.

2.3. System approach

The system approach understands errors as a consequence of broader organizational and contextual factors. As discussed in the person and technology approaches, both people and AI-driven digital health systems can make mistakes. With the system approach, we acknowledge the conditions that allow failures to happen and to cause harm. In the case of AI-driven digital health, this perspective highlights that even the most advanced algorithms are fallible and that their safe use depends on the environment in which they are deployed. Depending on their purpose and functionality, AI-driven digital health applications may have to be registered as software as a medical device and comply with the corresponding regulations to ensure their safety. However, workflows that restrict opportunities for human oversight, lack of access to care for patients, time pressures that prevent healthcare professionals from questioning automated recommendations and organizational safeguards that are inadequate can transform isolated human or technology errors into systemic hazards. Therefore, responsibility does not lie with the individual healthcare professional, patient or the AI-driven digital health system alone, but with the absence of barriers and defenses that should prevent errors from propagating to patients.

3. Methodology for adapting the Swiss cheese model

To adapt Reason’s error approach and Swiss cheese model to AI-driven digital health, we conducted a qualitative, exploratory study. It combines a multiple case study design with inductive thematic analysis to enable theory-building grounded in empirical cases. A purposive sample of 15 AI-related healthcare incidents was collected from the AI, Algorithmic, and Automation Incidents and Controversies (AIAAIC) database (https://www.aiaaic.org). It is a publicly accessible repository that documents real-world incidents, failures, and controversies involving AI systems. It compiles cases from diverse sources, including academic literature, regulatory reports, investigative journalism, and credible media coverage. Each entry typically includes a description of the incident, contextual information, and references to original sources.

Incidents were included in our sample if they (i) involved an AI or machine learning system used in the health domain, (ii) described a negative outcome or identifiable risk (including near-misses), and (iii) provided sufficient detail to reconstruct contributing factors. The dataset was retrieved on March 24, 2026. In total, 38 health-related incidents were identified in the AIAAIC database and screened against these criteria. From these cases, a final sample of 15 incidents was selected using purposive sampling. Selection aimed to maximize diversity across types of AI systems, application contexts, and forms of failure or risk (e.g., clinical decision support, self-management applications, and resource allocation systems), to capture a broad range of contributing mechanisms. For example, for incidents related to the same system only one case was included. The extraction table in Appendix 1 shows all included incidents and the 15 selected cases. The screening and coding process was conducted by a single researcher.

For each case, we extracted data including the clinical context (setting and domain), AI system characteristics, incident description, outcome (harm/near miss), contributing factors, detection mechanisms (how and whether the incident was identified) and existing or missing safeguards. Contributing factors were analyzed using inductive thematic coding. Thematic coding followed an iterative and reflexive process. Initial codes were generated from a subset of cases and compiled into a preliminary codebook (Appendix 1), which was continuously refined as additional cases were analyzed. Earlier cases were revisited to ensure consistent application of codes across the dataset. Coding decisions and category definitions were documented throughout the process.

The grouping of codes into higher-level system layers was guided by both empirical patterns in the data and conceptual alignment with Reason’s Swiss cheese model. This step involved iterative comparison across cases to identify clusters of contributing factors that operated at similar levels within the sociotechnical system. While no formal inter-rater reliability assessment or independent second coding was conducted, the analysis emphasized consistency through repeated comparison and iterative refinement. The resulting framework should therefore be understood as an interpretive, theory-building contribution rather than a definitive classification.

To support practical implementation, we used the selected incidents and the information on missing or existing safeguards to describe a set of minimum and advanced safeguards for each layer.

4. The Swiss cheese model adapted for safety related to AI-driven digital health

4.1. Exploratory incident analysis

Table 1 shows the results from the incident analysis. The complete extraction table and the codebook are available in Appendix 1. The 15 selected incidents concerned a broad range of systems including clinical applications (e.g. sepsis prediction algorithm), self-management applications (e.g. blood oxygen measurement), or social/clinical support (e.g. resource allocation algorithm). Contributing factors were grouped into seven thematic groups: (1) data quality, bias, and representativeness; (2) model design and technical limitations; (3) validation, reliability, and clinical safety; (4) transparency, accountability, and governance; (5) workflow integration and sociotechnical fit; (6) privacy, security, and ethical risks; and (7) system-level and infrastructure failures. Incident impacts included equity/fairness issues, misdiagnosis, treatment delay, deterioration of health, misuse of data, mistreatment and patient harm/safety risks. The selected incidents span a heterogeneous range of AI-driven digital health applications. This diversity was intentional and reflects the study’s aim to derive a system-level framework that is not specific to a single AI modality or clinical use case. Despite differences in technical implementation and application context, all cases share a common underlying structure: they involve AI systems embedded in complex sociotechnical healthcare environments, where outcomes are shaped by interactions between model characteristics, human decision-making, workflow integration, and organizational governance. These shared features provide the basis for identifying cross-cutting patterns of failure and for adapting the layered structure of Reason’s Swiss cheese model to AI in healthcare.

Table 1.

Extraction table for the purposive sample of 15 AI incidents in healthcare, retrieved from the AIAAIC database. Multiple themes in the last column are separated by semicolons.

Identifier	Clinical context (setting)	Clinical context (domain)	AI system characteristics	Incident impact	Themes of contributing factors
AIAAIC0898	Self management	Blood oxygen measurement	Oxygen prediction	Equity/Fairness, Treatment delay	Data Quality, Bias, and Representativeness
AIAAIC0907	Consumer health app	Women’s health	Fertility prediction	Data misuse/use against assigned purpose	Model Design and Technical Limitations; Workflow Integration and Sociotechnical Fit; Privacy, Security, and Ethical Risks
AIAAIC1504	Clinical prescribing	Risk scoring	Risk scoring for addictive behavior	Equity/Fairness, Treatment delay	Model Design and Technical Limitations; Transparency, Accountability, and Governance
AIAAIC0759	Clinical screening	Ophthalmology	Image analysis and screening	Data misuse, Misdiagnosis	Workflow Integration and Sociotechnical Fit; Privacy, Security, and Ethical Risks; Validation, Reliability, and Clinical Safety
AIAAIC0758	Clinical prescribing	Risk scoring	Risk assessment algorithm for predicting drug addiction	Equity/Fairness, Treatment delay	Data Quality, Bias, and Representativeness
AIAAIC0657	In patient care	Sepsis/Critical care	Machine learning algorithm to predict sepsis infection	Delay of treatment	Data Quality, Bias, and Representativeness; Model Design and Technical Limitations
AIAAIC0160	Consumer health	Triage, Diagnosis	Diagnostic and Triage System/Symptom checker	Misdiagnosis, Equity/Fairness, Treatment delay	Validation, Reliability, and Clinical Safety; Data Quality, Bias, and Representativeness
AIAAIC0106	Clinical decision support	Oncology	Recommender system and diagnostic system	Deterioration of health	Data Quality, Bias, and Representativeness; Model Design and Technical Limitations; Transparency, Accountability, and Governance
AIAAIC0105	Urology	Diagnosis of acute kidney injury	Acute kidney injury detection system	Misuse of data	Transparency, Accountability, and Governance; Privacy, Security, and Ethical Risks
AIAAIC014	Public health	Epidemiology	Prediction algorithm for flu outbreaks	Misuse of data, Mistreatment	Model Design and Technical Limitations; Privacy, Security, and Ethical Risks
AIAAIC007	Social/clinical support	Resource allocation	Budget allocation algorithm: resource allocation model	Equity/Fairness, Treatment delay	Data Quality, Bias, and Representativeness; Model Design and Technical Limitations; Transparency, Accountability, and Governance
AIAAIC2235	Consumer AI chatbot	Mental health	LLM chatbot provides emotional support	Patient safety risk	Validation, Reliability, and Clinical Safety; Transparency, Accountability, and Governance
AIAAIC2064	Kidney transplantation	Clinical transplant	Risk scoring algorithm	Delay of treatment, Equity/Fairness	Data Quality, Bias, and Representativeness
AIAAIC1826	Consumer AI	Medical information handling	Analysis of medical records	Patient safety, data misuse	Privacy, Security, and Ethical Risks
AIAAIC2140	Home monitoring	Diabetes management	Prediction and monitoring glucose level, sensor plus algorithm	Misdiagnosis, delay of treatment	Model Design and Technical Limitations; Transparency, Accountability, and Governance; System-Level and Infrastructure Failures

From the identified contributing factors, we derived five protective layers that form the adapted Swiss cheese model described in the following subsection. Specifically, the theme “Data quality, bias, and representativeness” is mapped to a “Data governance and quality assurance” layer. The themes “Model design and technical limitations” and “Validation, reliability, and clinical safety” jointly inform a “Model development, evaluation, and change control” layer. The themes “Transparency, accountability, and governance” and “Privacy, security, and ethical risks” are addressed within a “Governance, regulatory, and ethical compliance” layer. In addition, we introduce a “Sociotechnical integration and human oversight” layer, corresponding to the theme “Workflow integration and sociotechnical fit”. Finally, a “Post-market monitoring, incident response, and learning” layer captures contributing factors related to “System-level and infrastructure failures”. When a contributing factor fit multiple themes, it was assigned to the layer representing the primary origin of the failure or the layer where intervention would most directly prevent recurrence.
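The theme-to-layer mapping described above can be written down as a simple lookup table. The following Python sketch is illustrative only: the names `THEME_TO_LAYER` and `layers_for_incident` are hypothetical and not part of the study materials, and the sketch does not reproduce the manual judgment applied when a contributing factor fit several themes.

```python
# Hypothetical encoding of the theme-to-layer mapping described in the text.
# Keys are the seven themes; values are the five protective layers.
THEME_TO_LAYER = {
    "Data quality, bias, and representativeness":
        "Data governance and quality assurance",
    "Model design and technical limitations":
        "Model development, evaluation, and change control",
    "Validation, reliability, and clinical safety":
        "Model development, evaluation, and change control",
    "Workflow integration and sociotechnical fit":
        "Sociotechnical integration and human oversight",
    "Transparency, accountability, and governance":
        "Governance, regulatory, and ethical compliance",
    "Privacy, security, and ethical risks":
        "Governance, regulatory, and ethical compliance",
    "System-level and infrastructure failures":
        "Post-market monitoring, incident response, and learning",
}


def layers_for_incident(themes):
    """Return the layers implicated by an incident's themes, in first-seen order."""
    layers = []
    for theme in themes:
        layer = THEME_TO_LAYER[theme]
        if layer not in layers:  # two themes may point to the same layer
            layers.append(layer)
    return layers


# Example: the themes listed for the sepsis prediction incident in Table 1.
sepsis_layers = layers_for_incident([
    "Data quality, bias, and representativeness",
    "Model design and technical limitations",
])
```

In this example, `sepsis_layers` contains the data governance layer and the model development layer, which illustrates that a single incident may involve failures in more than one layer.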

4.2. Adapted Swiss cheese model

The diversity of contributing factors requires several layers of safeguards to prevent patient harm in the context of AI-driven healthcare. The Swiss cheese model for safety, also introduced by James Reason, visualizes multiple protective layers against harm, each represented by a slice of Swiss cheese with “holes” symbolizing vulnerabilities.¹⁸ When these holes align, accidents or adverse outcomes can occur. Applied to AI-driven digital health, the model highlights that safety is not achieved by any single safeguard, e.g. those that can be integrated into AI systems, but by the integrity of an entire system of overlapping protections. We propose five protective layers to reduce the risk of patient harm arising from failures in AI-driven digital health or the uncritical use thereof. While each layer has potential vulnerabilities, together they can provide a resilient safety net (see Figure 2). The layers should be understood as analytically distinct but not mutually exclusive; individual incidents may involve interacting failures across multiple layers. In the following, we describe these protective layers.

Figure 2.

Swiss cheese model applied to AI-driven digital health: Five protective layers prevent occurrence of patient harm caused by errors, failures and latent conditions related to AI-driven digital health systems.

Data governance and quality assurance. AI models inherit the strengths and limitations of the data used to develop them.²³ Using high-quality, representative and sufficiently diverse datasets can reduce the risk of biased or inequitable outcomes.²⁴,²⁵ Key defenses include data governance (e.g. provenance tracking, version control and access management), systematic data quality checks (e.g. label auditing, missing data analysis and unit harmonization) and documentation of dataset composition and known limitations. Gaps in this layer can be caused by mislabeled outcomes, changes in clinical practice over time, limited representation of relevant subgroups and unrecognized differences in measurements.²⁶ For instance, early warning systems for sepsis trained predominantly on data from younger populations may underperform in older adults or patients with multimorbidity.²⁷
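As an illustration of a subgroup representation check, the following Python sketch compares the share of each subgroup in a training dataset with a reference share, for example from the intended target population. It is a minimal sketch under stated assumptions: the function name, the record format, the reference shares and the tolerance factor of 0.5 are all hypothetical choices for illustration, not values derived from the incident analysis.

```python
from collections import Counter


def underrepresented_subgroups(records, attribute, reference_shares, tolerance=0.5):
    """Flag subgroups whose dataset share is below tolerance * reference share.

    records: list of dicts, one per patient record (hypothetical format).
    attribute: key of the subgroup attribute, e.g. "age_band".
    reference_shares: expected share per subgroup in the target population.
    """
    counts = Counter(record[attribute] for record in records)
    total = sum(counts.values())
    flagged = {}
    for group, reference_share in reference_shares.items():
        share = counts.get(group, 0) / total if total else 0.0
        if share < tolerance * reference_share:
            flagged[group] = share
    return flagged


# Toy data: 90 younger and 10 older patients, while older adults are
# assumed to make up 40% of the intended population.
records = [{"age_band": "<65"}] * 90 + [{"age_band": "65+"}] * 10
flagged = underrepresented_subgroups(
    records, "age_band", {"<65": 0.6, "65+": 0.4}
)
```

With these toy numbers, only the older age band is flagged, mirroring the sepsis example in which a model trained mainly on younger patients may underperform in older adults.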

Model development, evaluation and change control. A robust design and rigorous evaluation of AI-driven digital health are essential to ensure clinical reliability.²⁸ This layer is strengthened through pre-release testing, independent external validation across settings, calibration assessment, subgroup analyses, and ongoing revalidation following software or model updates.²⁹ In line with the principles of clinical evaluation of Software as a Medical Device (SaMD),³⁰ the evidence should address the analytical validity and clinical performance of the model in its intended context of use.¹⁵,¹⁶ Because clinical knowledge evolves, a defined model lifecycle is required to prevent outdated recommendations due to outdated data in the trained model.³¹ This should include versioning, change-control criteria and timely retirement of obsolete models.³² Communicating uncertainty (e.g. calibrated confidence or ‘insufficient information’ flags) can facilitate safer interpretation. Holes in this layer arise when models are overfitted to narrow datasets, are poorly calibrated, are affected by data leakage or spurious correlations, lack transparency appropriate to risk, or are insufficiently validated under real-world conditions and workflow constraints.
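Calibration assessment can be made concrete with a binned expected calibration error (ECE), which compares the mean predicted probability with the observed event rate in each probability bin. The Python sketch below shows one common way to compute it for binary outcomes; it is an illustrative example, not the evaluation procedure of any system discussed here, and the number of bins is an assumption.

```python
def expected_calibration_error(probabilities, outcomes, n_bins=10):
    """Binned expected calibration error for binary predictions.

    probabilities: predicted probabilities in [0, 1].
    outcomes: observed outcomes (0 or 1), same length as probabilities.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probabilities, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes into the last bin
        bins[index].append((p, y))

    total = len(probabilities)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_confidence = sum(p for p, _ in members) / len(members)
        observed_rate = sum(y for _, y in members) / len(members)
        # Weight each bin's gap by the fraction of predictions it contains.
        ece += (len(members) / total) * abs(mean_confidence - observed_rate)
    return ece


# Toy example: a model that predicts 0.9 for ten patients.
well_calibrated = expected_calibration_error([0.9] * 10, [1] * 9 + [0])
overconfident = expected_calibration_error([0.9] * 10, [1] * 5 + [0] * 5)
```

In the toy example, the first model is close to perfectly calibrated (nine of ten events occur), whereas the second is overconfident (only five of ten occur) and receives a substantially larger error. Reporting such a measure per subgroup would combine calibration assessment with the subgroup analyses mentioned above.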

Sociotechnical integration and human oversight. Even well-designed AI-driven digital health systems can cause harm if they are not safely integrated into clinical pathways. Ensuring safety requires sociotechnical integration, including clear role definitions (specifying who is accountable for acting on outputs), escalation and fallback procedures, user-friendly interfaces, and workflows that allow for human oversight.³³,³⁴ For patient-facing tools (e.g. a chatbot delivering cognitive behavioral therapy), safeguards should include the ability to detect symptom deterioration, clear emergency guidance and a defined handover process to healthcare professionals or crisis services.³⁵ In contrast to the layer “Model development, evaluation and change control” that captures failures intrinsic to the AI system and its lifecycle management, this layer captures failures emerging from the interaction between the system, users, workflows, and organizational context during real-world deployment. This layer can be compromised when AI-generated recommendations are poorly timed, alert thresholds are miscalibrated (leading to false positives and fatigue³⁶), risk escalation is inadequate, or the tool increases the workload, thereby undermining a healthcare professional’s ability to verify outputs.

Governance, regulatory and ethical compliance. Another protective layer is robust governance that is aligned with regulatory, ethical and professional standards. For AI systems that qualify as medical device software, this involves complying with the relevant medical device regulations (e.g. the Medical Device Regulation (MDR)) and implementing a risk management process, for example in accordance with ISO 14971, supported by software lifecycle processes (IEC 62304) and usability engineering (IEC 62366-1). When it comes to AI-specific governance, organizations can draw on AI risk management and management system standards (e.g. ISO/IEC 23894 and ISO/IEC 42001), as well as information security management standards (ISO/IEC 27001). In the EU context, aligning with the AI Act and clarifying its interaction with medical device regulation can strengthen defenses.³⁷ However, overlaps between these frameworks can create compliance gaps if responsibilities and documentation requirements are fragmented. In addition, limited transparency and the use of opaque (“black-box”) models³⁸ can undermine healthcare professionals’ ability to verify AI-generated outputs and appropriately calibrate trust.²²,³⁹ Transparency measures, such as model cards,⁴⁰ intended-use statements⁴¹,⁴² and the explicit disclosure of AI involvement to patients, are therefore critical to support informed consent, accountability, and safe clinical use.

Post-market monitoring, incident response and learning. Ensuring safety of AI-driven digital health systems requires continuous oversight during real-world use. Defenses include live performance monitoring, drift detection, periodic benefit–risk reviews and clear triggers for retraining, revalidation, rollback and retirement.⁴³ Post-market monitoring is also a formal regulatory expectation for high-risk AI systems in the EU. Effective safeguards include prospective incident registries, automatic risk reporting, the systematic capture of near misses and ‘silent failures’, predefined incident response workflows and the clear assignment of surveillance responsibility across providers and deployers. This layer’s safety measures can be compromised by a lack of monitoring, reliance on technical metrics alone, slow detection or reporting of incidents, failure to communicate findings to all users, and unclear accountability for corrective actions.
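Drift detection can be illustrated with the population stability index (PSI), a widely used summary of how far the distribution of an input or a model score in current use has moved away from a baseline such as the validation data. The Python sketch below is a minimal example under stated assumptions: the equal-width binning, the smoothing constant `eps` and the review threshold in `needs_review` are hypothetical choices that would need to be set and justified for each system.

```python
import math


def population_stability_index(baseline, current, n_bins=10, eps=1e-6):
    """PSI between a baseline sample and a current sample of one variable."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant baselines

    def shares(values):
        counts = [0] * n_bins
        for v in values:
            index = int((v - lo) / width)
            # Values outside the baseline range fall into the edge bins.
            counts[min(max(index, 0), n_bins - 1)] += 1
        # eps keeps empty bins from producing log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    b, c = shares(baseline), shares(current)
    return sum((ci - bi) * math.log(ci / bi) for bi, ci in zip(b, c))


def needs_review(psi, threshold=0.25):
    """Hypothetical trigger: flag the model for revalidation above a threshold."""
    return psi > threshold


baseline_scores = [i / 100 for i in range(100)]        # scores at validation
shifted_scores = [0.5 + i / 200 for i in range(100)]   # scores in current use
psi_same = population_stability_index(baseline_scores, baseline_scores)
psi_shifted = population_stability_index(baseline_scores, shifted_scores)
```

Here `psi_same` is zero because the two samples are identical, while `psi_shifted` is large enough to trigger `needs_review`. Such a purely technical metric would be only one input to the triggers for retraining, revalidation, rollback and retirement described above, since this layer can be compromised by reliance on technical metrics alone.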

4.3. Safeguards for the five protective layers

Table 2 presents the safeguards for the five protective layers. Given the study’s exploratory design, the safeguards are not intended to represent an exhaustive or consensus-based set of recommendations. The proposed safeguards were derived inductively from the identified contributing factors by translating recurrent failure modes into corresponding preventive and mitigative controls across the five protective layers. In this process, observed failure modes, such as bias, calibration errors, lack of transparency, and unsafe outputs, were systematically mapped to safeguards within the respective layers.

Table 2.

Safeguards per protective layer.

Protective layer	Derived from incidents	Minimum safeguards	Advanced safeguards
Data governance and quality assurance	- Skin color/measurement location bias - Data bias - Corrupted/low-quality data - Dataset shift - Synthetic data limitations - eGFR race correction	- Subgroup representation checks - Data quality validation pipelines - Explicit documentation of data assumptions and demographic composition	- Bias auditing across protected groups - Monitoring for dataset shift post-deployment - Validation of synthetic data against real-world distributions
Model development, evaluation and change control	- Oversimplified model - Poor calibration - Weak search algorithm - NLP limitations - Algorithmic failure	- Model performance evaluation beyond accuracy - Calibration assessment - Task-specific validation	- Stress testing under edge cases - Model complexity justification - Formal change control/versioning
Sociotechnical integration and human oversight	- Poor workflow integration - Overconfidence - Unsafe outputs - Lack of guardrails	- Human-in-the-loop decision points - Basic guardrails for high-risk outputs - Workflow integration testing	- Monitoring of automation bias/overreliance - Escalation protocols for uncertain outputs - User training on limitations
Governance, regulatory, and ethical compliance	- Lack of transparency - Closed models - Governance failure - Accountability issues - Access to sensitive data (cycle tracking example)	- Clear statement of intended use and limitations - Defined accountability structures - Data access and privacy controls	- Independent auditability - Ethical risk assessment - Vendor transparency requirements
Post-market monitoring, incident response and learning	- System-level failures - Hardware errors - Dataset shift - Lack of guardrails (detected post-deployment)	- Incident reporting systems - Performance monitoring over time - Rollback procedures	- Automated drift detection - Defined incident response workflows - Feedback loops into model updates

The suggested safeguards are intended to operationalize the protective layers and support implementation in practice. We distinguish between minimum and advanced safeguards. Minimum safeguards represent baseline requirements that should be in place prior to deployment, whereas advanced safeguards provide additional risk mitigation, particularly for high-risk applications or more complex AI-driven digital health systems. The relevance and prioritization of specific safeguards may vary depending on the AI modality, clinical context, and clinical risk profile.

The incidents analyzed in this study span a heterogeneous set of modalities, including predictive models embedded in clinical workflows, generative AI systems (e.g., chatbot-based interfaces), computer vision applications, and hardware-integrated sensing systems, which exhibit distinct dominant failure modes. The adapted Swiss cheese model should therefore be applied in a modality-sensitive manner, where safeguards are not uniformly weighted but prioritized according to the dominant risk profile of the system under consideration. The layered structure remains applicable across modalities, but its practical implementation requires adaptation to the specific technical and clinical context. For example, predictive analytics systems are particularly sensitive to issues such as calibration drift, dataset shift, and alert fatigue, placing greater emphasis on safeguards related to model development, evaluation and change control (Layer 2), as well as sociotechnical integration and human oversight (Layer 3). In contrast, generative AI systems such as large language model-based chatbots introduce risks related to hallucinations, inappropriate responses, and missing escalation pathways, increasing the importance of safeguards in Layer 3 (sociotechnical integration and human oversight) and Layer 4 (governance, regulatory and ethical compliance).

4.4. Evidence-informed examples to illustrate the framework

To illustrate the adapted Swiss cheese model, we map the 15 real-world incidents to the framework. Across the analyzed cases, incidents rarely resulted from a single point of failure; rather, they emerged when multiple weaknesses aligned across the proposed protective layers. Several recurring patterns can be identified.

Limitations in the data layer were a frequent initiating factor in the incidents. Issues such as biased or unrepresentative data, dataset shift, and low data quality propagated through subsequent layers, often remaining undetected until harm occurred. Cases involving demographic bias (e.g., skin color, race-based adjustments) illustrate how upstream data assumptions can lead to systematic inequities in clinical outcomes.

Limitations in model design and evaluation, including poor calibration, oversimplification, and modality-specific algorithmic weaknesses, contributed to unreliable or unsafe outputs. These risks were amplified when models were deployed without sufficient external validation or robustness testing.

Failures in sociotechnical integration (e.g. poor workflow integration, overreliance on AI outputs, and lack of guardrails) led to inappropriate use of AI systems in practice. Even technically sound models failed when deployed in contexts for which they were not adequately designed or when users misinterpreted their outputs.

Most notably, the post-market monitoring and learning layer was consistently weak or absent. In many cases, there were no systematic mechanisms to detect performance degradation, unsafe outputs, or emerging risks in real-world use. As a result, failures such as dataset drift, bias, or hardware malfunctions persisted longer than necessary, increasing the likelihood and severity of harm.

These findings suggest that AI-related incidents in healthcare are best understood as system-level failures, arising from the interaction of technical, organizational, and human factors. Strengthening safety therefore requires not only improving individual components but ensuring that robust and complementary safeguards are implemented across all layers, with particular emphasis on continuous monitoring and governance.

We present three representative cases in detail to illustrate how failures propagate across layers. A complete mapping of all analyzed cases across the five protective layers is provided in Appendix 1.

The Apple Watch blood oximeter case (AIAAIC0898) illustrates how failures originating in the data layer can propagate across the system. The device exhibited reduced accuracy for individuals with darker skin tones, reflecting insufficient representativeness in the underlying data and limitations related to measurement location. This bias was not adequately mitigated at the model or validation stage, and no systematic subgroup monitoring was in place post-deployment. While some users mitigated the issue through double-checking, the absence of robust safeguards across layers resulted in inequitable outcomes and potential delays in treatment.

The incident concerning the Babylon symptom checker (AIAAIC0160) demonstrates how limitations in model design, combined with sociotechnical factors, can lead to harm. The system exhibited overconfidence and insufficient clinical validation, producing unsafe triage recommendations. These risks were amplified by its deployment in a consumer-facing context without adequate guardrails or escalation mechanisms. The lack of systematic monitoring further limited the detection of unsafe outputs. This case highlights how model-level weaknesses, when combined with inappropriate integration into real-world use, can result in patient harm.

The case related to the Abbott glucose monitoring system (AIAAIC2140) illustrates how failures can arise from interactions between hardware and algorithmic components. A manufacturing defect, combined with insufficient fail-safes in the algorithm, led to incorrect glucose readings and subsequent dosing errors. Although post-market reports eventually identified the issue, the absence of robust real-time monitoring and rapid response mechanisms delayed mitigation. This case underscores the importance of lifecycle quality control, system-level validation, and effective post-market surveillance.

5. Discussion

This paper argues that safeguarding AI-driven digital health requires moving beyond a metrics-centric view of quality, such as accuracy, AUROC and F1, and adopting a patient safety model that recognizes harm as the convergence of vulnerabilities in people, technology and systemic structures. We selected Reason’s model of error and the Swiss cheese model because they offer a well-established approach to patient safety that explains how patient harm rarely originates from a single mistake, but rather emerges when weaknesses in people, technology and systems align. Digital health and AI are quintessentially sociotechnical; safety depends not only on algorithmic performance, but also on data practices, workflow integration, governance and monitoring during real-world use.

5.1. Practical implications for AI-driven digital health

As mentioned earlier, the introduced layers should be understood as analytically distinct but not mutually exclusive; individual incidents may involve interacting failures across multiple layers. From a practical perspective, this suggests that contributing factors should not be forced into a single-layer classification where overlap exists. Instead, identifying all layers to which a contributing factor pertains may provide a more accurate representation of the underlying failure dynamics. Likewise, implementing safeguards across multiple layers may be more effective than isolated interventions, as this allows risks to be addressed at different points of manifestation.

One persistent challenge is that the translation of research into practice is often guided by performance metrics (e.g. accuracy, F1 score), which do not specify the necessary evidence for safe deployment. Although reporting guidelines for clinical AI studies have been introduced (e.g. CONSORT-AI/SPIRIT-AI for trials,⁴⁴,⁴⁵ TRIPOD+AI for clinical prediction models⁴⁶ and DECIDE-AI for early-stage clinical evaluation⁴⁷), inconsistent reporting and adherence continue to undermine a harmonized safety assessment of AI models and algorithms, reproducibility of evaluations and clinical validity.⁴⁸ From a Swiss cheese model perspective, harmonized reporting is particularly important for Layers 1 and 2 (data governance and quality assurance, and model development, evaluation and change control, respectively) because these layers require transparent documentation of dataset provenance and representativeness, labelling practices, subgroup analyses, calibration, intended use and update/revalidation policies.

Further, healthcare institutions deploying AI-driven digital health systems must treat their implementation as a safety-critical intervention. This requires human factors and workflow testing (Layer 3), clear accountability for responding to outputs (Layer 4), pathways for escalating safety-critical situations, and a usable monitoring infrastructure that is established and maintained over time (Layer 5, e.g. tracking alert burden, overrides, performance drift and subgroup safety signals). However, systematic learning from post-market incidents involving digital health interventions is still in its infancy. In the United States, although the FDA’s MAUDE database provides a publicly searchable collection of medical device reports, the scale and variability of the data make it challenging to conduct a systematic assessment, and the database is known to have limitations as a surveillance source.⁴⁹

Crucially, existing incident infrastructures also make it difficult to identify events specifically linked to AI functionality, as structured identifiers (e.g. an Artificial Intelligence/Machine Learning flag and software/version fields) are not consistently available. This limits the reliable retrieval and synthesis of information, and therefore adverse event identification for AI as a medical device requires targeted search strategies and remains methodologically challenging.⁵⁰,⁵¹
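To make the role of structured identifiers concrete, the following sketch shows a hypothetical incident record that carries an explicit AI/ML flag and a software version field. The class and field names are illustrative assumptions and do not reflect the schema of MAUDE or any other existing incident database.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class IncidentReport:
    """Hypothetical incident record; field names are illustrative only."""
    report_id: str
    device_name: str
    description: str
    ai_ml_enabled: Optional[bool] = None    # None means the flag was not recorded
    software_version: Optional[str] = None  # None means the version was not recorded


def ai_related(reports: Iterable[IncidentReport]) -> List[IncidentReport]:
    """Reports that can be retrieved reliably because the AI/ML flag is set."""
    return [r for r in reports if r.ai_ml_enabled is True]


def missing_identifiers(reports: Iterable[IncidentReport]) -> List[IncidentReport]:
    """Reports that would still require free-text search to classify."""
    return [r for r in reports
            if r.ai_ml_enabled is None or r.software_version is None]


reports = [
    IncidentReport("R1", "Symptom checker", "Unsafe triage advice", True, "2.4.1"),
    IncidentReport("R2", "Glucose sensor", "Incorrect reading"),
    IncidentReport("R3", "Infusion pump", "Battery fault", False, "1.0.0"),
]
print([r.report_id for r in ai_related(reports)])
print([r.report_id for r in missing_identifiers(reports)])
```

With such fields in place, retrieval becomes a simple filter; without them, records like the second one can only be classified through targeted free-text search.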

In this opaque landscape of AI-driven digital health, healthcare professionals and patients need support to calibrate their trust. This includes providing training and developing skills,¹² designing workflows that allow for verification and escalation, and ensuring that patient-facing tools offer clear guidance on when and how to seek human assistance. It is particularly important to maintain a clinician-in-the-loop pathway (or an explicit escalation route) because patients may hesitate to report unexpected events due to uncertainty, low health literacy or fear of misunderstanding, which reduces the likelihood that early warning signs will be captured and acted upon.

5.2. Comparison to other safety approaches

Reason’s model is not the only useful approach to safety; its value lies in its explanatory clarity and barrier logic. Complementary frameworks can help to specify, test and ensure those barriers. Standards for medical device risk management and software lifecycles (e.g. ISO 14971, IEC 62304 and usability engineering guidance) provide a structured hazard–control lifecycle and are particularly relevant for regulated digital therapeutics and AI medical devices.

AI systems that are classified as software as a medical device are subject to established regulatory requirements, including risk management, clinical evaluation, and post-market surveillance. Many consumer-facing or wellness tools operate outside these frameworks despite having potential health impacts. Safeguards such as transparency, user guidance, and post-market monitoring may be less consistently applied or rely on voluntary standards. This creates potential gaps, particularly where such tools are used in ways that influence health-related decisions. The proposed framework can help to bridge this divide by highlighting a common set of safety layers that are relevant across both contexts, while allowing for differences in the rigor and formalization of safeguards depending on the level of risk and regulatory oversight.

General AI governance frameworks (e.g. the NIST AI Risk Management Framework (https://airc.nist.gov/airmf-resources/airmf/) and the ISO/IEC 23894:2023 Information technology – Artificial intelligence – Guidance on risk management) provide cross-sector risk management functions and organizational capabilities. Our five-layer model can be operationalized using the NIST AI RMF functions:

• GOVERN → Layer 4 + Layer 5: define accountable roles, risk tolerances, documentation requirements, and post-deployment responsibilities (including incident response and corrective action).

• MAP → Layer 1 + Layer 3: characterize context of use, populations, workflows, and human oversight needs; identify where the system may be brittle (e.g., time pressure, handoffs, missing data).

• MEASURE → Layer 2 + Layer 5: quantify model performance in context, calibration, subgroup outcomes, robustness under shift, and real-world safety signals (alert burden, override rates, near misses).

• MANAGE → all layers (especially 2–5): implement controls, change control and revalidation triggers, escalation and fallback pathways, rollback/kill switches, and continuous improvement.

This mapping enables digital health developers to translate patient safety narratives into governance and operational actions that can be assigned, audited and improved iteratively.
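The mapping above can also be written down as a simple lookup structure, for example to check which NIST AI RMF functions touch a given protective layer when assigning responsibilities. This is a minimal sketch; the variable and function names are assumptions introduced for illustration.

```python
# Protective layers as numbered in this paper.
LAYERS = {
    1: "Data governance and quality assurance",
    2: "Model development, evaluation and change control",
    3: "Sociotechnical integration and human oversight",
    4: "Governance, regulatory and ethical compliance",
    5: "Post-market monitoring, incident response and learning",
}

# NIST AI RMF functions mapped to layers, following the bullet list above.
NIST_TO_LAYERS = {
    "GOVERN": (4, 5),
    "MAP": (1, 3),
    "MEASURE": (2, 5),
    "MANAGE": (1, 2, 3, 4, 5),  # all layers, with emphasis on 2 to 5
}


def functions_for_layer(layer: int) -> list:
    """Return the NIST AI RMF functions that address the given layer."""
    if layer not in LAYERS:
        raise ValueError(f"Unknown layer: {layer}")
    return [name for name, layers in NIST_TO_LAYERS.items() if layer in layers]


for number, title in LAYERS.items():
    print(number, title, functions_for_layer(number))
```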

Sociotechnical patient safety models (e.g., Systems Engineering Initiative for Patient Safety (SEIPS) model⁵²) are closely aligned with our integration layer, as they explain how work-system design shapes outcomes. However, our framework goes further by emphasizing AI lifecycle issues (e.g. data provenance, drift and update governance) that are not always considered in traditional healthcare workflow models.

Safety-II/resilience engineering⁵³ complements the adapted Swiss cheese model by emphasizing how work usually goes right, which is highly relevant for AI-enabled care, where healthcare professionals routinely adapt to uncertainty. This approach can enhance Layers 3 and 5 by shifting the focus from preventing failure to fostering adaptive capacity, recovery, and learning.

Table 3 shows the limitations of established approaches to AI and healthcare safety, including the National Institute of Standards and Technology AI Risk Management Framework, the SEIPS model,⁵² ISO 14971-based risk management for software as a medical device (SaMD), and emerging concepts of digitalovigilance and algorithmovigilance. It also shows how the proposed framework complements these existing approaches.

Table 3.

Comparison with other risk management frameworks.

Framework	Primary focus	Strengths	Limitations (relative to this work)	Added value of our framework
NIST AI RMF	AI risk management lifecycle	Comprehensive, cross-sector guideline	High-level, not healthcare-specific; limited linkage to clinical workflows	Operationalized for healthcare; linked to real incident patterns
SEIPS	Sociotechnical systems in healthcare	Strong human factors and workflow focus	Limited coverage of AI-specific risks (e.g. data bias, model drift)	Extends SEIPS with AI-specific technical layers
SaMD risk management/ISO 14971	Medical device risk management	Structured hazard analysis and lifecycle control	Focus on device-level risks; less emphasis on data/model bias and sociotechnical use	Integrates data, model and system-level risks
Digitalovigilance/Algorithmovigilance	Post-market monitoring of digital health/algorithms	Emphasis on real-world surveillance	Primarily reactive; less focus on upstream design and governance	Embeds monitoring within a full multi-layer safety model

While these frameworks provide important guidance on risk management, sociotechnical system design, and post-market surveillance, they typically address specific dimensions of safety in isolation. In contrast, our framework integrates these perspectives into a unified, incident-informed model that explicitly links failure modes across multiple layers, from data and model design to governance, clinical integration, and post-market monitoring.

The added value of the proposed framework lies in four key contributions. First, it introduces an explicit barrier-based, layered logic adapted from Reason’s Swiss cheese model, enabling the analysis of how failures emerge and propagate across interconnected system levels. Second, the framework is empirically grounded in real-world incident analysis, linking abstract risk categories to observed failure patterns in AI-enabled healthcare. Third, it provides a healthcare-specific sociotechnical integration that connects technical, human, organizational, and governance dimensions within a single structure, rather than addressing these domains separately. Fourth, it offers a practical mapping of safeguards across layers, including minimum and advanced safeguards, which supports implementation and prioritization in real-world settings. Rather than replacing existing approaches, the framework integrates and operationalizes them within a unified structure that reflects how risks manifest in practice.

5.3. Guidance to practical application

The suggested framework is intended to be applied prospectively, i.e., as a proactive safety assessment tool. It visualizes how errors in AI-driven digital health systems occur and which protective layers may help prevent these errors from resulting in patient harm. To apply the model to a specific AI-driven digital health solution, potential hazards first need to be identified. Subsequently, safeguards for the identified hazards need to be developed and integrated into systems and processes. Examples of such safeguards are provided in Table 2.

This assessment process can be repeated iteratively to identify remaining hazards or “holes” within the protective layers. Additional mitigation strategies can then be developed to minimize or eliminate these vulnerabilities.

Given the limited resources available in healthcare, the development of mitigation strategies has to be prioritized. Such prioritization may consider several factors, including the potential impact on patient safety, feasibility, resource requirements, interdependence of safeguards, and generalizability across settings. For example, a model trained on data that are biased or unrepresentative of the target population is likely to pose a substantial risk of patient harm; therefore, safeguards addressing data representativeness should be prioritized. Some safeguards can be implemented relatively easily, even in resource-constrained settings, such as clear documentation of the underlying dataset, intended purpose, and target population. In addition, safeguards may need to be prioritized for layers whose failure could lead to downstream system failures. For instance, poor data quality may compromise the reliability of the entire system. The relative risk associated with failures in each layer depends heavily on the specific purpose of the AI system and the healthcare environment in which it is deployed. Importantly, no single protective layer acts in isolation. Effective safety depends on a robust and interconnected system of safeguards operating across all layers. As a rule of thumb, we suggest that after addressing technical vulnerabilities, broader measures such as fostering a culture of safety and providing user training should be implemented. Even after deployment, regular reviews are essential to identify emerging vulnerabilities and maintain system safety over time.
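One way such a prioritization could be made explicit is a simple weighted score over the factors named above. The sketch below is illustrative only: the 1 to 5 rating scales, the weights, and the bonus for upstream safeguards are assumptions that would have to be agreed on locally, and the example ratings are invented for demonstration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Safeguard:
    name: str
    layer: int
    safety_impact: int      # 1 (low) to 5 (high) potential impact on patient safety
    feasibility: int        # 1 (hard) to 5 (easy) to implement
    resource_cost: int      # 1 (cheap) to 5 (expensive)
    upstream: bool = False  # True if failure would propagate to downstream layers


def priority_score(s: Safeguard) -> int:
    """Illustrative weighted score; the weights are assumptions, not validated."""
    score = 3 * s.safety_impact + 2 * s.feasibility - s.resource_cost
    if s.upstream:
        score += 2
    return score


def prioritize(safeguards: Iterable[Safeguard]) -> List[Safeguard]:
    """Order safeguards from highest to lowest priority score."""
    return sorted(safeguards, key=priority_score, reverse=True)


candidates = [
    Safeguard("Automated drift detection", 5, 4, 2, 4),
    Safeguard("Dataset and intended-use documentation", 1, 3, 5, 1, upstream=True),
    Safeguard("Subgroup representation checks", 1, 5, 4, 2, upstream=True),
]
for s in prioritize(candidates):
    print(priority_score(s), s.name)
```

A score of this kind can support discussion, but it does not replace a context-specific risk assessment.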

5.4. Limitations

This work has some limitations. First, the Swiss cheese model risks over-simplifying complex adaptive systems if interpreted as a linear chain of failures. We mitigate this by emphasizing dynamic holes and by explicitly integrating person, technology, and system lenses; nonetheless, system-theoretic and resilience approaches may be required for tightly coupled environments.

Second, AI-driven digital health is heterogeneous: predictive models, generative systems, and autonomous decision support differ in their hazard profiles and sources of error, and the incidents drawn from the incident database relate to a broad range of AI systems. For example, predictive models are particularly susceptible to calibration drift and dataset shift⁵⁴; computer vision systems often face challenges related to data representativeness and subgroup performance⁵⁵; and generative systems introduce risks such as hallucinations, unsafe outputs, and failures in escalation.⁵⁶,⁵⁷ Rather than assuming uniformity in these systems, the proposed adaptation of the Swiss cheese model is intended to accommodate modality-specific risks within shared system-level safeguards. Each layer addresses different failure modes depending on the type of AI system, thereby enabling both generalizability and specificity in risk mitigation. The proposed protective layers should be regarded as guidance; they cannot be considered complete for all AI-driven digital health technologies, and specific tailoring is needed.

Third, the exploratory study was based on 15 randomly selected cases that fulfilled our inclusion criteria. This is a limited set of incidents, although the cases covered a diverse set of technologies. The heterogeneity of the included cases, while enabling cross-cutting insights, may obscure modality-specific risk patterns. Moreover, the information available on the incidents originated mainly from online news and social media posts and may be biased by media reporting.

The proposed safeguard catalogue should be interpreted in light of the study’s exploratory design. The layers and safeguards were derived inductively from a limited and heterogeneous sample of 15 incidents and are not intended to represent an exhaustive or consensus-based set of recommendations. Rather than prescribing definitive controls, the framework offers an analytically grounded structure for identifying and organizing failure modes and safeguards across sociotechnical layers, and should be adapted to specific AI modalities, clinical contexts, and regulatory environments.

The thematic analysis was conducted by a single researcher using an inductive, iterative approach. Initial codes were generated from the first cases and compiled into a preliminary codebook, which was continuously refined as additional cases were analyzed. To promote internal consistency, earlier cases were revisited and recoded where necessary, and coding decisions were systematically documented. Through this process of constant comparison across the 15 cases, the thematic categories and their grouping into higher-level constructs were progressively stabilized. The proposed Swiss cheese model for AI safety was presented to several audiences at international conferences (e.g. keynote at AI in Health conference in Cambridge, September 8–10, 2025), and the informal feedback received was incorporated into the adapted version. Nevertheless, the absence of a second independent coder or formal inter-rater reliability assessment represents a limitation. The mapping of incident-level observations to thematic groups and subsequently to protective layers involves interpretive judgment, which may introduce subjectivity. As a result, the derived layer structure should be understood as an exploratory analytical framework rather than a definitive classification. This constraint may limit the transferability of the findings, and future research could strengthen robustness by incorporating multiple coders, formal codebook validation procedures, or external validation of the thematic structure.

6. Conclusion

The success of AI-driven digital health depends less on achieving perfect system performance than on creating robust, adaptive safeguards that can protect patients despite limitations in both humans and AI, and on taking a systemic view of how AI-driven systems are used in healthcare. We argue for a “just culture” in the context of AI in healthcare, in which researchers, healthcare professionals, and even patients are encouraged to report critical incidents and errors associated with AI-based digital health tools. Such reporting is essential for learning from incidents rather than concealing them, as only recognized failure modes can be systematically addressed through appropriate safeguards. However, a “just culture” must be carefully balanced with accountability: while most incidents may arise from system-level issues that warrant learning and improvement, cases of negligence or misconduct require an appropriate response. Moreover, effective reporting infrastructures must address practical constraints, including patient privacy, legal liability, and vendor-related limitations, particularly in the context of proprietary AI systems. Importantly, patients should also be enabled to report incidents, which requires accessible, transparent, and non-punitive reporting channels, alongside protections against blame or retaliation. Establishing such a balanced and inclusive reporting culture is essential to ensure safety and enable continuous learning from AI-related failures in healthcare.

Future work should (i) translate each layer into measurable maturity levels (minimum vs advanced practices), (ii) define and validate core safety metrics for AI-enabled workflows (calibration drift, subgroup disparity, alert burden, override patterns, incident rates), (iii) evaluate how “dynamic holes” evolve longitudinally as models and workflows change, and (iv) test whether a layered barrier approach reduces safety incidents in prospective, multicenter implementations. Finally, measures have to be taken to support and facilitate the recommended “just culture”, e.g. by establishing low-threshold reporting options for observed safety hazards related to AI-driven digital health.
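To indicate what candidate safety metrics of this kind could look like in practice, the sketch below computes an expected calibration error, a sensitivity gap between subgroups, and an alert override rate. These are illustrative choices, not validated core metrics; the function names, the equal-width binning, and the use of sensitivity as the disparity measure are assumptions.

```python
from typing import Hashable, Sequence


def expected_calibration_error(probs: Sequence[float],
                               labels: Sequence[int],
                               n_bins: int = 10) -> float:
    """Weighted mean gap between predicted probability and observed event rate."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes into the last bin
        bins[idx].append((p, y))
    total = len(probs)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        mean_prob = sum(p for p, _ in b) / len(b)
        event_rate = sum(y for _, y in b) / len(b)
        ece += len(b) / total * abs(mean_prob - event_rate)
    return ece


def subgroup_sensitivity_gap(y_true: Sequence[int],
                             y_pred: Sequence[int],
                             groups: Sequence[Hashable]) -> float:
    """Largest difference in sensitivity (true positive rate) between subgroups."""
    positives, true_positives = {}, {}
    for t, p, g in zip(y_true, y_pred, groups):
        if t == 1:
            positives[g] = positives.get(g, 0) + 1
            true_positives[g] = true_positives.get(g, 0) + (1 if p == 1 else 0)
    sensitivity = {g: true_positives[g] / positives[g] for g in positives}
    return max(sensitivity.values()) - min(sensitivity.values())


def override_rate(alerts_shown: int, alerts_overridden: int) -> float:
    """Share of displayed alerts that users overrode."""
    return alerts_overridden / alerts_shown if alerts_shown else 0.0


print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 0, 0]))
print(subgroup_sensitivity_gap([1, 1, 1, 1], [1, 1, 1, 0], ["a", "a", "b", "b"]))
print(override_rate(20, 5))
```

Tracking such values over time, rather than at a single point, is what would allow calibration drift or emerging subgroup disparities to be detected.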

Supplemental material

Supplemental material for Safeguarding AI-driven digital health – An adaptation of the Swiss cheese model for safety by Kerstin Denecke in Digital Health.

Footnotes

ORCID iD

Kerstin Denecke

Ethical considerations

No ethics approval was needed for this research.

Author contributions

Conceptualization, Methodology, Investigation, Project administration, Writing – original draft, Writing – review and editing: KD.

Funding

The author received no financial support for the research, authorship, and/or publication of this article.

Declaration of conflicting interests

The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Supplemental material

Supplemental material for this article is available online.

References

Jia

. A Scoping Review of AI-Driven Digital Interventions in Mental Health Care: Mapping Applications Across Screening, Support, Monitoring, Prevention, and Clinical Education. Healthcare 2025; 13(10): 1205. https://doi.org/10.3390/healthcare13101205

Biro

Handley

Cobb

, et al. Accuracy and Safety of AI-Enabled Scribe Technology: Instrument Validation Study. J Med Internet Res 2025; 27: e64993. https://doi.org/10.2196/64993

Knitza

Tascilar

Fuchs

, et al. Diagnostic Accuracy of a Mobile AI-Based Symptom Checker and a Web-Based Self-Referral Tool in Rheumatology: Multicenter Randomized Controlled Trial. J Med Internet Res 2024; 26: e55542. https://doi.org/10.2196/55542

Vasdev

Gupta

Pawar

, et al. Navigating the future of health care with AI-driven digital therapeutics. Drug Discov Today 2024; 29(9): 104110. https://doi.org/10.1016/j.drudis.2024.104110

Denecke

Lopez-Campos

May

. The Unintended Harm of Artificial Intelligence (AI): Exploring Critical Incidents of AI in Healthcare. Stud Health Technol Inform 2025; 329: 1013–1018. https://doi.org/10.3233/SHTI250992

Lopez-Campos

Gabarron

Martin-Sanchez

, et al.

Digital Interventions and Their Unexpected Outcomes - Time for Digitalovigilance?

Stud Health Technol Inform 2024; 310: 479–483. https://doi.org/10.3233/SHTI231011

Balendran

Benchoufi

Evgeniou

, et al. Algorithmovigilance, lessons from pharmacovigilance. NPJ Digit Med 2024; 7(1): 270. https://doi.org/10.1038/s41746-024-01237-y

Turri

Dzombak

. Why We Need to Know More: Exploring the State of AI Incident Documentation Practices. In: Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society [Internet], Montr’{e}al QC Canada: ACM, 2023, pp. 576–583. https://doi.org/10.1145/3600211.3604700

Masilamani

George

. Diving into the Regulatory Landscape of Digital Therapeutics. Ther Innov Regul Sci 2026; 5: 725–735. https://doi.org/10.1007/s43441-026-00944-w

10.

Torous

Linardon

Goldberg

, et al. The evolving field of digital mental health: current evidence and implementation issues for smartphone apps, generative artificial intelligence, and virtual reality. World Psychiatry 2025; 24(2): 156–174. https://doi.org/10.1002/wps.21299

11.

Pias

Afrose

Tuli

, et al. Low responsiveness of machine learning models to critical or deteriorating health conditions. Commun Med 2025; 5(1): 62. https://doi.org/10.1038/s43856-025-00775-0

12.

Denecke

Lopez-Campos

Gabarron

, et al. Hidden in Plain Sight: The Harmful Side of AI–Based Mental Health Interventions. In: Andrikopoulou

Gallos

Arvanitis

(eds). Studies in Health Technology and Informatics [Internet]. IOS Press, 2025. https://doi.org/10.3233/SHTI250322

13.

Coiera

Liu

. Evidence synthesis, digital scribes, and translational challenges for artificial intelligence in healthcare. Cell Rep Med 2022; 3(12): 100860. https://doi.org/10.1016/j.xcrm.2022.100860

14.

Adegunle

Chhatwal

Arab

, et al. Bias and Oversight in Clinical AI: A Review of Decision Support Tools and Equity Frameworks. J Gen Intern Med 2026; 41: 1957–1968. https://doi.org/10.1007/s11606-026-10229-5

15.

Muralidharan

Adewale

Huang

, et al. A scoping review of reporting gaps in FDA-approved AI medical devices. Npj Digit Med 2024; 7(1): 273. https://doi.org/10.1038/s41746-024-01270-x

16.

OECD . OECD AI Incidents and Hazards Monitor [Internet]. [cited 2025 Dec 26]. https://oecd.ai/en/incidents (2025).

17.

Lakhan

. Postmarket Safety Surveillance of FDA-Cleared Prescription Digital Therapeutics Using the Manufacturer and User Facility Device Experience (MAUDE) Database: A Pharmacovigilance Study. Cureus 2025; 17: e85343. https://doi.org/10.7759/cureus.85343

18. Reason. Human error: models and management. West J Med 2000; 172(6): 393–396. https://doi.org/10.1136/ewjm.172.6.393

19. Pandit, Hong, et al. MedHallu: A Comprehensive Benchmark for Detecting Medical Hallucinations in Large Language Models. In: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing [Internet]. Suzhou, China, 2025: Association for Computational Linguistics, pp. 2858–2873. https://doi.org/10.18653/v1/2025.emnlp-main.143

20. Musa, Prasad, Hernandez. Addressing cross-population domain shift in chest X-ray classification through supervised adversarial domain adaptation. Sci Rep 2025; 15(1): 11383. https://doi.org/10.1038/s41598-025-95390-3

21. Rafner, Dellermann, Hjorth, et al. Deskilling, Upskilling, and Reskilling: a Case for Hybrid Intelligence. Morals Mach 2021; 1(2): 24–39. https://doi.org/10.5771/2747-5174-2021-2-24

22. Rosenbacke, Melhus, McKee, et al. How Explainable Artificial Intelligence Can Increase or Decrease Clinicians’ Trust in AI Applications in Health Care: Systematic Review. JMIR AI 2024; 3: e53207. https://doi.org/10.2196/53207

23. Cross, Choma, Onofrey. Bias in medical AI: Implications for clinical decision-making. PLOS Digit Health 2024; 3(11): e0000651. PubMed PMID: 39509461; PubMed Central PMCID: PMC11542778. https://doi.org/10.1371/journal.pdig.0000651

24. Norori, Aellen, et al. Addressing bias in big data and AI for health care: A call for open science. Patterns (N Y) 2021; 2(10): 100347. https://doi.org/10.1016/j.patter.2021.100347

25. Obermeyer, Powers, Vogeli, et al. Dissecting racial bias in an algorithm used to manage the health of populations. Science 2019; 366(6464): 447–453. https://doi.org/10.1126/science.aax2342

26. Ghassemi, Naumann, Schulam, et al. A Review of Challenges and Opportunities in Machine Learning for Health. AMIA Jt Summits Transl Sci Proc 2020; 2020: 191–200.

27. Komolafe, Mei, Zarate, et al. Early Prediction of Sepsis: Feature-Aligned Transfer Learning [Internet]. arXiv 2025. DOI: 10.48550/ARXIV.2505.02889. https://arxiv.org/abs/2505.02889

28. Sadeghi, Alizadehsani, Cifci, et al. A review of Explainable Artificial Intelligence in healthcare. Comput Electr Eng 2024; 118: 109370. https://doi.org/10.1016/j.compeleceng.2024.109370

29. Challen, Denny, Pitt, et al. Artificial intelligence, bias and clinical safety. BMJ Qual Saf 2019; 28(3): 231–237. https://doi.org/10.1136/bmjqs-2018-008370

30. United States. Department of Health and Human Services, issuing body. Software as a medical device (SAMD): clinical evaluation: guidance for industry and Food and Drug Administration staff. https://collections.nlm.nih.gov/catalog/nlm:nlmuid-101720008-pdf (2017).

31. Subbaswamy, Saria. From development to deployment: dataset shift, causality, and shift-stable models in health AI. Biostatistics 2019; 21: kxz041–kxz352. https://doi.org/10.1093/biostatistics/kxz041

32. Saelmans, Seinen, Pera, et al. Implementation and Updating of Clinical Prediction Models: A Systematic Review. Mayo Clin Proc Digit Health 2025; 3(3): 100228. https://doi.org/10.1016/j.mcpdig.2025.100228

33. Salwei, Carayon. A Sociotechnical Systems Framework for the Application of Artificial Intelligence in Health Care Delivery. J Cogn Eng Decis Mak 2022; 16(4): 194–206. https://doi.org/10.1177/15553434221097357

34. Curcin, Delaney, Alkhatib, et al. Learning Health Systems provide a glide path to safe landing for AI in health. Artif Intell Med 2026; 173: 103346. https://doi.org/10.1016/j.artmed.2025.103346

35. Miner, Shah, Bullock, et al. Key Considerations for Incorporating Conversational AI in Psychotherapy. Front Psychiatry 2019; 10: 746. https://doi.org/10.3389/fpsyt.2019.00746

36. Ancker, Nosal, Hauser, et al. Effects of workload, work complexity, and repeated alerts on alert fatigue in a clinical decision support system. BMC Med Inform Decis Mak 2017; 17(1): 36. https://doi.org/10.1186/s12911-017-0430-8

37. European Union. Artificial Intelligence Act (Regulation (EU) 2024/1689). Official Journal 2024. https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32024R1689

38. Shuttleworth KMJ. Medical artificial intelligence and the black box problem: a view based on the ethical principle of “do no harm.” Intell Med 2024; 4(1): 52–57. https://doi.org/10.1016/j.imed.2023.08.001

39. Durán, Jongsma. Who is afraid of black box algorithms? On the epistemological and ethical basis of trust in medical AI. J Med Ethics 2021; 18. https://doi.org/10.1136/medethics-2020-106820

40. Mitchell, Zaldivar, et al. Model Cards for Model Reporting. In: Proceedings of the Conference on Fairness, Accountability, and Transparency [Internet]. Atlanta GA USA, 2019: ACM, pp. 220–229. https://doi.org/10.1145/3287560.3287596

41. Gebru, Morgenstern, Vecchione, et al. Datasheets for Datasets. 2018. https://arxiv.org/abs/1803.09010

42. McNamara, Lotter. The clinician-AI interface: intended use and explainability in FDA-cleared AI devices for medical image interpretation. Npj Digit Med 2024; 7(1): 80. https://doi.org/10.1038/s41746-024-01080-1

43. Morley, Murphy, Mishra, et al. Governing Data and Artificial Intelligence for Health Care: Developing an International Understanding. JMIR Form Res 2022; 6(1): e31623. https://doi.org/10.2196/31623

44. Ibrahim, Liu, Rivera, et al. Reporting guidelines for clinical trials of artificial intelligence interventions: the SPIRIT-AI and CONSORT-AI guidelines. Trials 2021; 22(1): 11. https://doi.org/10.1186/s13063-020-04951-6

45. Cruz Rivera, Liu, Chan, Denniston, Calvert, The SPIRIT-AI and CONSORT-AI Working Group, et al. Guidelines for clinical trial protocols for interventions involving artificial intelligence: the SPIRIT-AI extension. Nat Med 2020; 26(9): 1351–1363. https://doi.org/10.1038/s41591-020-1037-7

46. Collins, Moons KGM, Dhiman, et al. TRIPOD+AI statement: updated guidance for reporting clinical prediction models that use regression or machine learning methods. BMJ 2024; 385: e078378. https://doi.org/10.1136/bmj-2023-078378

47. Vasey, Nagendran, Campbell, et al. Reporting guideline for the early stage clinical evaluation of decision support systems driven by artificial intelligence: DECIDE-AI. BMJ 2022; 377: e070904. https://doi.org/10.1136/bmj-2022-070904

48. Kolbinger, Veldhuizen, Zhu, et al. Reporting guidelines in medical artificial intelligence: a systematic review and meta-analysis. Commun Med 2024; 4(1): 71. https://doi.org/10.1038/s43856-024-00492-0

49. Everhart, Karaca-Mandic, Redberg, et al. Late adverse event reporting from medical device manufacturers to the US Food and Drug Administration: cross sectional study. BMJ 2025; 388: e081518. https://doi.org/10.1136/bmj-2024-081518

50. Handley, Krevat, Fong, et al. Artificial intelligence related safety issues associated with FDA medical device reports. Npj Digit Med 2024; 7(1): 351. https://doi.org/10.1038/s41746-024-01357-5

51. Kale, Dattani, Tabansi, et al. AI as a Medical Device Adverse Event Reporting in Regulatory Databases: Protocol for a Systematic Review. JMIR Res Protoc 2024; 13: e48156. https://doi.org/10.2196/48156

52. Holden, Carayon. SEIPS 101 and seven simple SEIPS tools. BMJ Qual Saf 2021; 30(11): 901–910. https://doi.org/10.1136/bmjqs-2020-012538

53. Ham. Safety-II and Resilience Engineering in a Nutshell: An Introductory Guide to Their Concepts and Methods. Saf Health Work 2021; 12(1): 10–19. https://doi.org/10.1016/j.shaw.2020.11.004

54. Finlayson, Subbaswamy, Singh, et al. The Clinician and Dataset Shift in Artificial Intelligence. N Engl J Med 2021; 385(3): 283–286. https://doi.org/10.1056/NEJMc2104626

55. Seyyed-Kalantari, Zhang, McDermott MBA, et al. Underdiagnosis bias of artificial intelligence algorithms applied to chest radiographs in under-served patient populations. Nat Med 2021; 27(12): 2176–2182. https://doi.org/10.1038/s41591-021-01595-0

56. Chen, et al. Strategies for the Analysis and Elimination of Hallucinations in Artificial Intelligence Generated Medical Knowledge. J Evid-Based Med 2025; 18(3): e70075. https://doi.org/10.1111/jebm.70075

57. Howell. Generative artificial intelligence, patient safety and healthcare quality: a review. BMJ Qual Saf 2024; 33(11): 748–754. https://doi.org/10.1136/bmjqs-2023-016690

