Recommendation Statement for the Validation, Implementation, and Clinical Application of Artificial Intelligence Within a Clinical Laboratory from the Digital Pathology Association

Abstract

Background:

The rapid adoption of digital pathology and artificial intelligence (AI) in oncology creates multiple opportunities for precision diagnostics but also creates an urgent need for evidence-based standards to ensure safe and effective implementation.

Goal:

This article, developed by the Digital Pathology Association, presents recommendations for digital pathology as well as the validation and clinical utility of AI-enabled digital pathology tools in clinical practice. This guidance addresses analytical and clinical validation, algorithm reliability, and criteria for establishing clinical utility attributed to test use. Key recommendations emphasize separate validation of scanning processes and AI algorithms, concordance studies to support interscanner generalizability, rigorous assessment of accuracy and reliability in real-world settings, and clear description of algorithmic use limitations. These recommendations further provide frameworks for when AI may replace or augment existing diagnostic approaches, such as biomarker scoring, cancer diagnosis, and prognostic risk assessment.

Conclusions:

By considering payer, regulatory, and clinical perspectives, the recommendations promote transparency, trust, and reproducibility in digital pathology while encouraging value-based care delivery. We support responsible innovation in computational pathology, ensuring that AI applications achieve not only technical performance goals but also deliver measurable clinical benefit to patients.

Keywords

guideline; artificial intelligence; pathology; recommendation statement

Executive Summary

Digital pathology and artificial intelligence (AI) are transforming anatomical pathology practice, yet their adoption has outpaced the development of comprehensive implementation standards. This recommendation statement addresses that gap by establishing evidence-based frameworks for validation, deployment, and clinical use of digital pathology systems, including AI-enabled tools, in clinical laboratories.

The Challenge: Health care institutions face critical decisions about digital pathology adoption without standardized guidance on validation requirements, appropriate clinical applications, or quality assurance measures. This uncertainty creates risks: inconsistent performance across institutions, unclear regulatory pathways, variable reimbursement, and potential patient safety concerns. Meanwhile, patients in underserved settings lack access to subspecialty pathology expertise that digital systems could provide.

Core Principles: This recommendation statement establishes that digital pathology and AI tools must meet rigorous validation standards while remaining under qualified pathologist oversight. AI augments rather than replaces pathologist expertise. Validation requirements scale appropriately with clinical risk: assistive tools that highlight features require different evidence than autonomous systems that generate diagnostic conclusions.

Key Recommendations

For Digital Pathology Infrastructure:

Health care institutions should begin planning for digital pathology adoption to enable workflow optimization, improve access to subspecialty expertise, and support artificial intelligence (AI) integration.

Digital pathology is specifically indicated when traditional pathology services are unavailable, particularly for patients in Critical Access Hospitals, Rural Health Clinics, and Federally Qualified Health Centers.

All clinical AI applications require qualified pathologist oversight.

For Validation:

Slide scanning systems must be validated independently from AI algorithms.

AI algorithms must demonstrate analytical and clinical validation for each scanner on which they will be used.

Interscanner concordance studies can generalize performance across scanners when using validated systems as reference standards.

AI validation must assess both accuracy and reliability, with quality control systems to detect batch effects and technical failures.

Minimum tissue requirements and artifact limitations must be specified for each algorithm.

For Clinical Application:

AI should be deployed only within validated clinical contexts where it can improve diagnostic accuracy, reliability, or efficiency.

Three categories of appropriate use are defined: (1) replacing existing invasive or expensive tests when validated, (2) augmenting pathologist assessment of verifiable tasks, and (3) predicting outcomes using features that pathologists cannot readily assess.

Clinical utility must be demonstrated through improved patient outcomes, either via direct evidence or validated chains of evidence.

Impact and implementation

These recommendations balance innovation with patient safety by providing clear pathways for responsible AI adoption. Our hope is to provide clear validation frameworks for laboratories and algorithm developers, clinical utility standards for payers, and, most importantly, improved diagnostic accuracy and expanded access to expertise that improve patient outcomes.

The recommendations are graded by evidence certainty and strength, ranging from strong recommendations backed by robust evidence to conditional guidance where further research is needed. Use cases spanning cancer detection, biomarker scoring, prognostic assessment, quality control, and case triage demonstrate current applications meeting these standards.

Stakeholders

This document is directed to a number of stakeholders that rely on, interpret, produce, pay for, or manufacture digital pathology data, devices, and software. Table 1 lays out each stakeholder, what their role is within digital pathology, and how they can use the document’s recommendations.

Table 1.

An Overview of the Various Stakeholders Involved in Digital Pathology and How They Can Best Utilize This Document

Stakeholder	Role	Primary interest	How to use this document
Patient	Ultimate beneficiary and risk bearer of diagnostic decisions. They indirectly consume digital pathology outputs via reports and treatment decisions.	Diagnostic accuracy and safety, timeliness of results, privacy and data protection, transparency, and trust.	Benefit from the standards, which aim to promote diagnostic accuracy, safety, and expanded access to subspecialty pathology expertise (eg, in rural/underserved areas).
Treating physician	Uses pathology reports to guide clinical management.	Reliability of diagnosis, turnaround time, clarity of reports, confidence in standard of care, and use of results in medical decision-making.	Use the recommendations to gain confidence in the reliability and clarity of pathology reports, and to ensure that new AI tools are deployed within validated clinical contexts that improve patient outcomes.
Pathologists	Primary diagnostic authority; signs out cases and bears clinical responsibility.	Diagnostic confidence, image quality, workflow, validation standards, and medicolegal defensibility.	Apply the rigorous validation standards to their practice, leverage the document’s frameworks to support diagnostic confidence and workflow optimization (eg, AI-based triage), and ensure compliance with clinical AI oversight.
Diagnostic laboratory leadership	Oversees lab operations, quality, compliance, and strategy.	Regulatory compliance, risk management, operational efficiency, scalability, and cost.	Utilize the recommendations to guide planning for digital pathology adoption, ensure regulatory compliance, manage risk, and establish quality assurance measures for new systems and AI tools.
Regulators	Sets and enforces standards to protect patient safety.	Reproducibility, traceability, accountability, and evidence-based practice.	Use the recommendations on validation, reliability, and accountability to set and enforce guidelines that protect patient safety and facilitate transparency in the computational pathology space.
Payers	Reimburses pathology services and influences adoption through coverage policy.	Clinical validity and utility, cost-effectiveness, consistency of care, and fraud prevention.	Obtain clinical utility standards and cost-effectiveness criteria to develop informed coverage policies that align reimbursement with clinically meaningful, generalizable, and quality-aligned digital pathology and AI services.
Manufacturers	Develops and supplies scanners, software, and AI tools.	Clear requirements, validation expectations, interoperability, and market acceptance.	Use the clear requirements and validation expectations outlined in the document to inform the development, supply, and interoperability of scanners, software, and AI tools, ensuring market acceptance and regulatory clarity.

AI, artificial intelligence.

Scope

This document encompasses various elements of digital pathology including whole-slide image (WSI) scanning, image viewing, and AI algorithms for image interpretation in the clinical setting. This is inclusive of applications in solid formalin-fixed, paraffin-embedded tissue using brightfield microscopy and excludes cytology and immunofluorescence-based imaging.

Objectives

The objectives of this document are to provide clear, practical, and evidence-based recommendations for the safe, effective, and responsible use of digital pathology, inclusive of uses of AI in clinical digital pathology. Specifically, this guidance aims to:

Define key principles for evaluating and deploying digital pathology in clinical pathology workflows, with a focus on diagnostic support, quality assurance, and clinical utility.

Establish best practices for the validation and performance assessment of digital pathology across varied tissue types, staining protocols, and imaging systems.

Promote transparency and interpretability of digital pathology outputs to support pathologist oversight and clinical decision-making.

Support consistency and reproducibility in AI-assisted outputs across institutions and settings.

Facilitate payer and stakeholder confidence by outlining standards that ensure that digital pathology and its AI tools are clinically meaningful, generalizable, and aligned with quality care delivery.

These objectives are intended to support clinicians, developers, payers, institutions, and regulators in the thoughtful integration of digital pathology into pathology practice, ultimately advancing diagnostic excellence and patient care.

Definitions

Different fields use the same terms in distinct ways. This issue is particularly acute at the intersection of health care and computational technology. Indeed, some terms widely used in digital pathology or health care applications of AI either lack a clear and consistent definition or exhibit contradictory definitions in academic literature. In the absence of a clear and universally accepted definition, we provide the following definitions for the purposes of this recommendation statement:

Accuracy: The frequency with which a model produces the desired output (eg, “a model produces the desired result 78% of the time”). Depending on the task, the desired output or “ground truth” is typically derived from a consensus pathologist label, ancillary/reference testing, or recorded observations of patient outcomes.

Algorithm: A predefined set of computational steps that processes input data to produce a desired output, such as a classification, segmentation, or regression result. Algorithms may contain as a step the application or use of a model alongside other simple operations or hand-crafted rules.

Model: A mathematical or computational representation that performs a discrete task by mapping input data (eg, images) to outputs, such as categorical classifications, continuous values, or spatial segmentations. A model’s representation is composed of numerical parameters derived during a “training” process from historical or labeled datasets. The desired outcome of such a process is a model that accurately performs the intended task. Synonymous with “AI Model.”

AI Algorithm: Shorthand for an algorithm that uses a model as a step.

Classification: The computational process of assigning predefined labels or categories to an image, image region, or object based on its visual or quantitative features. Examples include classifying WSIs into tumor versus normal and tumor subtyping (eg, adenocarcinoma vs squamous cell carcinoma in lung tissue).

Regression: The computational process of assigning a continuous numeric value from image-derived features. Examples include tumor percentage or tumor area, continuous malignancy scores, or risk of recurrence.

Detection: The computational process of identifying a region of interest within a WSI.

Segmentation: The computational process of delineating and classifying specific regions, structures, or objects within a digital pathology image into meaningful components for analysis or interpretation. Examples include tissue types, cellular compartments, lesions, or regions of interest for subsequent analysis or interpretation.

Reliability/Precision: The degree to which repeated evaluations of the same specimen using a digital pathology system yield consistent results, independent of when, where, or by whom the evaluation is performed.

Generalizability: The extent to which the performance of an algorithm maintains accuracy and reliability across diverse, relevant patient populations, care settings, digital slide scanners, and staining protocols.

WSI: A digitized representation of an entire histology slide, often at 20× or 40× magnification, used for diagnostic review or computational pathology analysis.
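The accuracy and reliability definitions above can be made concrete with a small sketch. The labels, function names, and the use of simple percent agreement as a reliability measure are illustrative assumptions, not metrics prescribed by this statement; chance-corrected statistics such as Cohen's kappa are often preferred in practice.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], ground_truth: Sequence[str]) -> float:
    """Fraction of cases where the model output matches the reference label."""
    if len(predictions) != len(ground_truth) or not predictions:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(p == t for p, t in zip(predictions, ground_truth))
    return matches / len(predictions)


def repeat_agreement(run_a: Sequence[str], run_b: Sequence[str]) -> float:
    """Fraction of specimens given the same output in two repeated evaluations."""
    if len(run_a) != len(run_b) or not run_a:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(a == b for a, b in zip(run_a, run_b)) / len(run_a)


# Hypothetical data: five specimens, a reference label for each, and the
# output of the same algorithm on two separate scans of the same slides.
reference = ["tumor", "normal", "tumor", "normal", "tumor"]
scan_1 = ["tumor", "normal", "normal", "normal", "tumor"]
scan_2 = ["tumor", "normal", "tumor", "normal", "tumor"]

print(accuracy(scan_1, reference))       # 0.8 (4 of 5 match the reference)
print(repeat_agreement(scan_1, scan_2))  # 0.8 (4 of 5 agree across scans)
```

Note that the two quantities are independent: an algorithm can agree with itself on every repeat (high reliability) while being consistently wrong (low accuracy).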

Introduction

Digital pathology is the practice of converting glass histology slides into high-resolution digital images using specialized scanners, enabling pathologists to review and analyze tissue samples on a computer screen, mimicking the microscope but with enhanced capabilities. These digitized slides can be viewed locally or remotely, stored electronically, and analyzed using advanced software, including artificial intelligence (AI) tools. Digital pathology supports remote consultation and patient access, enhances consistency in diagnosis, and forms the foundation for computational pathology to support precision medicine and data-driven health care.

Digital pathology adoption has grown significantly in academic medical centers, integrated delivery networks, and reference laboratories. Clinical use is increasing as whole slide scanners, image management systems, and displays are used by pathologists for primary diagnosis, enabling more efficient service delivery models and improving access to subspecialty expertise through telepathology. In parallel, digitization supports downstream automation and analytics, including the use of AI-based image analysis tools.

AI in pathology has demonstrated significant promise in improving consistency^1–3 and reducing diagnostic variability.^4,5 Algorithms are increasingly used to assist in tumor grading,^6–8 cell classification,^9,10 biomarker quantification,¹¹ risk stratification,^12–14 and triage of negative cases. The vast majority of these tools are designed to augment rather than replace the expertise of a pathologist, improving overall quality and throughput in diagnostic workflows.^15–17 Importantly, AI can introduce a level of standardization that facilitates more equitable care delivery^16,18 and can serve as a foundation for methodology- and performance-based reimbursement models.

The remainder of the document provides an overview of the technologies underpinning digital pathology, including AI (Technology Background); specifies a set of concrete recommendations (Recommendations); and describes the methodology used to draft the recommendations (Methodology). Finally, it presents a motivating set of use cases.

Whole Slide Imaging in the Context of the Health Care System

An early hope for digital pathology was that it would enhance timely access to expert pathology interpretations via remote image review and interpretation. In the late 1990s, early telepathology efforts focused on expanding access to subspecialty pathology expertise. The Mediterranean Institute for Transplantation and Advanced Specialized Therapies in Palermo, Italy, collaborated with University of Pittsburgh Medical Center (UPMC) to provide continuous (24-h) expert evaluation of transplant allografts, including both primary and second-opinion interpretations by UPMC transplant pathologists.¹⁹ In a series of 78 second-opinion cases via telepathology, the UPMC pathologist disagreed with the primary pathologist 11 times, of which 3 were considered clinically significant.²⁰ This collaboration demonstrated the feasibility and clinical value of around-the-clock remote subspecialty transplant pathology services.

More recently, the National Academies of Sciences, Engineering, and Medicine convened a 2018 workshop on improving cancer diagnosis and care.²¹ A common issue raised was the increasing complexity of diagnostic assessment in cancer and the need for expertise along with patient access to that expertise.²² Digital pathology was raised as a potential solution to address this need.²²

In addition to the need for specialized pathologist opinions, some pathological interpretations also call for timely critical reviews and reports to treating clinicians. In general, conditions in which there is a need for urgent communication of results tend to be those in which there is a potentially urgent medical condition associated with the finding.

The College of American Pathologists (CAP) calls on laboratories and institutions to establish protocols and criteria for identifying and reporting urgent and significant diagnoses to the treating clinician.²³ A survey²⁴ of 1,130 CAP-accredited pathology laboratories’ policies regarding significant, unexpected, and critical diagnoses in surgical pathology found that the following specific conditions were often included in written policies: findings not expected by clinical history, malignancy, life-threatening infections, organ rejection or graft-versus-host disease, inflammatory or immunological processes, and no chorionic villi in products of conception.

As the workload for each pathologist continues to increase,²⁵ driven by expanding diagnostic criteria required for precision medicine, regulatory demands, and concomitant understaffing in many pathology departments, individual pathologists must take on more responsibilities to keep up with clinical demand. In traditional glass slide workflows, critical cases are often “hidden” among hundreds of other slides, awaiting their turn. Digital pathology introduces the possibility of AI-based triage, allowing cases with potentially critical findings to be prioritized within the pathologist’s worklist. This prioritization improves the likelihood of timely reporting and adherence to recommended turnaround times. AI systems designed to triage cancer in pathology images have already been reported, demonstrating the feasibility of identifying and prioritizing cases with critical findings for earlier review.²⁶ In radiology, for example, AI has shown the value of effective worklist prioritization: intracranial hemorrhage on CT scans can be automatically flagged, ensuring that cases with critical findings are reviewed sooner and patients receive faster care.²⁷

Access to care and specialized centers

The growing complexity of diagnostic assessments has required ever greater levels of specialization and expertise in pathology, as in much of the rest of medicine. While this affects all health care institutions to some degree, it disproportionately burdens those serving regions and populations for whom access to care is already limited. Care settings that are explicitly designed to serve underserved populations or communities include Rural Health Clinics (RHCs), Critical Access Hospitals (CAHs), and Federally Qualified Health Centers (FQHCs),²⁸ with Medicare criteria as described in the State Operations Manual.²⁸ Many health care institutions that do not have one of these designations may also serve those with limited access to care.

Limited access to specialty care in rural areas has long been recognized as a persistent challenge. Analyses of cancer incidence and mortality have shown that, despite declining cancer incidence rates in rural populations, cancer-related mortality has increased.²⁹ This disparity has been attributed in part to barriers in accessing care, including the geographic concentration of oncologists in densely populated urban centers rather than in rural areas.³⁰ Unger and colleagues studied this issue more closely by comparing outcomes of rural and nonrural patients in SWOG studies, hypothesizing that protocol-directed care delivered regardless of geography (ie, all patients had access to care) would mitigate the differences between rural and urban patients.³¹ In line with this hypothesis, they found that within the SWOG cohort, outcomes were similar among the rural and nonrural patients, suggesting that improved access to care for rural patients may help mitigate the poorer cancer outcomes observed in rural populations.

FQHCs serve patients regardless of their ability to pay, rather than based on location,²⁸ but all of these facility types are recognized as a matter of definition as entities providing care to a population that has limited health care access. Health care providers may be recognized by Centers for Medicare & Medicaid Services (CMS) as an FQHC or RHC. In 2022, such community health centers served over 30 million Americans, including 1 in 9 children, 9.6 million rural residents, and 395,000 veterans.³² These health care centers continue to serve a wide range of Medicare beneficiaries, many of whom have significant comorbid illnesses such as cancer, heart disease, and lung disease.³³ A Mathematica Policy Research Report³⁴ prepared for the HHS Office of the Assistant Secretary for Planning and Evaluation noted that providing patient access to specialty care has been an important challenge among federally funded health centers, and one of the strategies being attempted to address this is the use of electronic consultations.

WSI and digital pathology open a new opportunity to give patients access to specialized pathology interpretations even if those pathologists are geographically far removed from the site of care.

Multimedia reports for treating clinicians and patients

Medicine has developed an extensive vocabulary to describe anatomical regions, tissue architecture, and pathology. This vocabulary is helpful to facilitate communication between health care professionals and patients regarding what is seen in diagnostic imaging. However, jargon is often confusing to nonclinicians, and patients may find images of their own disease easier to understand than medical language that attempts to translate those images into verbal descriptions. In addition, seeing images may give patients a sense of engagement and better ownership of their own health and care.

Availability of digital images enables treating clinicians to better understand their diagnostic findings and to share those images with patients in discussions regarding the disease, both to educate and to assist in informed decision-making. Since the advent of digital radiology, it has been very common practice in the United States for treating clinicians, especially specialists, to review their own images and even show them to patients. A study by Nyak³⁵ and colleagues of 160 US physicians sought to understand the role of images accompanying radiology text reports. The overwhelming majority (91%) of respondents indicated that access to images helps to understand the text report, and 60% of clinicians felt strongly or very strongly that access to images accompanying text would significantly improve patient care and outcomes. Not only has this result been consistently demonstrated in radiology,^36,37 but the same trends are occurring in pathology. An increasing number of patients are relying on online patient portals to access their pathology reports,³⁸ and patients prefer to augment the text reports with the images necessary to understand them.³⁹ Ultimately, this patient need can be satisfied only through digital pathology. This benefit of digital pathology goes beyond even the patient, as research suggests that the review of digital pathology slides with patients at the clinic can reduce pathologist burnout.⁴⁰

In addition, evidence from a study of patient satisfaction in digital radiology suggests that patients prefer to see the images themselves,⁴¹ and visualization of pulmonary nodules by patients⁴² provides context that helps them understand an ambiguous diagnosis. To date, there has not been much opportunity for clinicians to review pathology images with patients. Prior to digital pathology, the practice of pathology relied on fragile glass slides that could be reviewed only with expensive microscopes, which themselves required maintenance. This contrasts with radiology, whose images could be reviewed with patients on robust silver halide film, readily viewed on inexpensive light boxes, even prior to the emergence of digital radiology. Because digital pathology obviates the need to transport fragile slides and enables the viewing of slides on digital screens, the technical capability for clinicians to begin reviewing pathology images with patients is now present. While not yet widely practiced, patient-facing pathology clinics where pathology images are reviewed with patients are emerging, an encouraging practice that would be impossible without digital pathology.

Technology Background

The Technology Background section provides a high-level overview of the foundational technologies underpinning the digital transformation of pathology. It first describes brightfield imaging for creating high-resolution WSIs. This is followed by an explanation of specialized image viewing systems, which are essential software for the smooth visualization, navigation, and clinical review of WSI files, often with integration into laboratory systems. Finally, the section delves into AI, highlighting its basis in deep learning and neural networks (eg, convolutional neural networks and Transformers) as a set of computational tools for automated image interpretation, designed to augment pathologist expertise in tasks such as classification, segmentation, and detection.

Brightfield imaging

The foundational imaging technique in clinical digital pathology is brightfield scanning, which uses transmitted white light and high-resolution color cameras to create images from stained tissue sections. These scanners are capable of handling large slide volumes, offering fast scan times, automated focus, and integration into laboratory information systems (LIS). The ability to digitize slides not only supports remote review and centralized workflows but also creates a standardized, reproducible dataset that serves as the foundation for AI-driven analysis.

Image viewing

Digital pathology viewers are software applications that allow for the visualization, navigation, and manipulation of WSIs—digitized representations of traditional glass pathology slides scanned at high magnification (typically 20× or 40×). These viewers are purpose-built to handle the extremely large file sizes and resolutions associated with WSIs, often several gigabytes per image. To ensure smooth performance, viewers commonly use dynamic image tiling and streaming, loading only the portion of the image currently in view at the desired resolution, which minimizes memory usage and latency. This allows pathologists to pan and zoom through digital slides fluidly, mimicking the experience of a traditional light microscope.
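The tiling idea described above can be sketched as a calculation of which tiles a viewer would need to fetch for the current viewport. This is a minimal illustration under stated assumptions: the 256-pixel tile size, the function name `visible_tiles`, and the example image dimensions are hypothetical, and real viewers and file formats differ in their tile layouts and resolution pyramids.

```python
from typing import List, Tuple

TILE_SIZE = 256  # assumed tile edge length in pixels (hypothetical)


def visible_tiles(x: int, y: int, width: int, height: int,
                  level_width: int, level_height: int,
                  tile_size: int = TILE_SIZE) -> List[Tuple[int, int]]:
    """Return (column, row) indices of tiles that overlap the viewport.

    (x, y) is the viewport's top-left corner in pixel coordinates of the
    current resolution level; level_width and level_height are the full
    image dimensions at that level.
    """
    # Clip the viewport to the image bounds.
    x0, y0 = max(x, 0), max(y, 0)
    x1, y1 = min(x + width, level_width), min(y + height, level_height)
    if x0 >= x1 or y0 >= y1:
        return []  # viewport lies entirely outside the image
    first_col, first_row = x0 // tile_size, y0 // tile_size
    last_col, last_row = (x1 - 1) // tile_size, (y1 - 1) // tile_size
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]


# Hypothetical example: a 1920x1080 viewport over a 100,000 x 80,000 pixel level.
needed = visible_tiles(1000, 500, 1920, 1080, 100_000, 80_000)
total = -(-100_000 // TILE_SIZE) * -(-80_000 // TILE_SIZE)  # ceiling division
print(len(needed), "of", total, "tiles")  # 54 of 122383 tiles
```

In this example only a few dozen tiles out of more than a hundred thousand need to be loaded, which is why tiling keeps memory use and latency low even for multi-gigabyte images.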

Modern digital pathology viewers often support additional functionality that enhances diagnostic accuracy and efficiency. These features can include measurement tools, annotations, side-by-side comparison of serial sections or stains, and synchronized viewing of multiple images. Many platforms are integrated with LIS, electronic health records (EHR), and picture archiving and communication systems (PACS), allowing for streamlined case review and reporting. Increasingly, viewers also incorporate, or interface with, AI algorithms, enabling automated quantification (eg, Ki-67, HER2), feature detection (eg, mitotic figures, tumor margins), and quality control (QC) checks (eg, blur, stain variation).

For clinical deployment, digital pathology viewers must meet high standards for usability, reliability, and regulatory compliance. Clinical-grade viewers often support audit trails, user access controls, and secure data transmission, which are critical for compliance with HIPAA, GDPR, and other data protection regulations. Depending on the region, systems approved by a regulatory body (such as the Food and Drug Administration in the US) may be required for use in primary diagnosis. Viewers used in regulated environments may undergo formal validation processes to ensure diagnostic equivalence to traditional microscopy. Overall, digital pathology viewers are a core technology underpinning the broader digital transformation of pathology, enabling remote diagnosis, collaboration, education, and integration of computational pathology tools into routine practice.

Artificial intelligence

AI refers to a set of computational techniques that enable machines to perform tasks typically associated with human cognition, such as visual perception, language understanding, and decision-making. In digital pathology, AI is most commonly applied through a subfield known as computer vision, which allows computers to interpret and extract information from high-resolution histopathological images. These AI systems are not intended to replace pathologists or their human cognition, but to assist them in improving diagnostic accuracy, efficiency, and consistency.

The earliest approaches to AI relied on rule-based systems in which explicit instructions were written by humans (eg, “if a cell is larger than 50 microns and has an irregular nucleus, flag it as abnormal”). These approaches were displaced by data-driven algorithms, collectively referred to as machine learning, in which algorithms automatically learned from data rather than relying on hard-coded rules. In machine learning, a clinician might label thousands of slides (eg, benign vs malignant) and indicate which features a machine should rely on (nucleus size, shape, perimeter, staining intensity, etc), and the machine would learn from the data which features were important and how to combine them to produce a correct result.
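The contrast between a hand-written rule and a rule learned from labeled data can be illustrated with a deliberately tiny sketch. Everything here is hypothetical: the function names, the six labeled measurements, and the single-feature cutoff search stand in for the far richer features and learning methods used in practice.

```python
from typing import List, Tuple


def rule_based_flag(diameter_um: float, irregular_nucleus: bool) -> bool:
    """Hand-written rule mirroring the example in the text."""
    return diameter_um > 50 and irregular_nucleus


def learn_threshold(samples: List[Tuple[float, bool]]) -> float:
    """Choose the diameter cutoff that misclassifies the fewest labeled samples.

    Each sample is (diameter_um, is_abnormal). Candidate cutoffs are the
    midpoints between neighbouring distinct diameters.
    """
    values = sorted({diameter for diameter, _ in samples})
    if len(values) < 2:
        raise ValueError("need at least two distinct diameters")
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(cutoff: float) -> int:
        # A sample is an error when "larger than cutoff" disagrees with its label.
        return sum((diameter > cutoff) != label for diameter, label in samples)

    return min(candidates, key=errors)


# Hypothetical labeled data: (cell diameter in microns, labeled abnormal?)
labeled = [(30, False), (38, False), (44, False),
           (47, True), (55, True), (62, True)]

print(rule_based_flag(60, True))  # True: the fixed 50-micron rule fires
print(learn_threshold(labeled))   # 45.5: the cutoff is derived from the data
```

The hand-written rule would miss the 47-micron abnormal cell because its cutoff was fixed in advance, whereas the learned cutoff adapts to whatever the labeled examples show.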

The vast majority of modern AI approaches in digital pathology are based on deep learning, a subfield of machine learning that takes this approach one step further. Deep learning can either rely on human-specified features or automatically discover, from the raw images, which features are most useful for performing the requested task, without needing a human to hand-pick them. For example, instead of a human specifying that an algorithm should focus on cell count or cell shape, a deep learning algorithm will discover that these are useful features by itself through the process of reviewing millions of image patches.

Deep learning algorithms rely on structures called neural networks, which interpret inputs via a series of layers of computational processing. Each layer in a neural network processes and transforms information provided to it by an earlier layer. This approach is loosely modeled on the human visual cortex in which layers of biological neurons in the brain, referred to as V1, V2, etc, process and transform visual signals from the previous layer. Like the human visual cortex, the layers of a neural network extract and represent different types of information at different stages of processing. For example, the initial layers might identify features such as edges, corners, or lines; intermediate layers might identify coarser structures such as morphological patterns; and the final layers identify more abstract concepts such as biomarkers, risk factors, and diagnoses.
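The layered processing described above can be sketched with NumPy as three stacked operations, each consuming the output of the one before it: a filter that responds to vertical edges, a nonlinearity, and a pooling step that summarizes neighbouring responses. This is only an illustration under stated assumptions: the filter values here are set by hand, whereas a trained network learns them from data, and the function names and the 6 × 6 toy image are hypothetical.

```python
import numpy as np


def conv2d_valid(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide the kernel over the image (no padding) and sum elementwise products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def relu(x: np.ndarray) -> np.ndarray:
    """Keep positive responses and set negative ones to zero."""
    return np.maximum(x, 0)


def max_pool_2x2(x: np.ndarray) -> np.ndarray:
    """Keep the strongest response in each non-overlapping 2x2 block."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))


# Toy grayscale image: dark left half, bright right half (one vertical edge).
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-set filter that responds to dark-to-bright transitions, left to right.
vertical_edge = np.array([[-1, 0, 1],
                          [-1, 0, 1],
                          [-1, 0, 1]], dtype=float)

edges = relu(conv2d_valid(image, vertical_edge))  # "early layer": edge map
pooled = max_pool_2x2(edges)                      # "later layer": coarser summary

print(edges[0])      # [0. 3. 3. 0.] -> strong response only around the edge
print(pooled.shape)  # (2, 2)
```

Each stage produces a smaller, more abstract representation than the one before it, which is the same pattern that deeper networks repeat many times over.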

In human brains, neurons are only connected to a relatively small subset of other neurons. This topology or map of connections defines how information flows. Similarly, deep learning algorithms have a topology that defines which digital neurons are connected. This map of connections is referred to as the “architecture” of the deep learning algorithm. Perhaps the most common architecture for image-based tasks is the convolutional neural network (CNN), which is specifically designed to process and analyze image data. More recently, transformer-based architectures⁴² have begun to replace CNNs in many cutting-edge models. Originally developed for natural language processing, transformers excel at capturing long-range dependencies and global context and have been adapted to vision tasks with remarkable success. This architectural shift is also evident in multiple pathology foundation models,^43–54 where vision transformers are increasingly serving as the backbone for large-scale image understanding, marking a major advance in the field.

A digital pathology image—such as a WSI—is represented mathematically as a 3D tensor, or multidimensional array. Each image can be thought of as a grid of pixels, where each pixel has associated intensity values for color channels (typically red, green, and blue), resulting in an image tensor of shape [height, width, channels]. For example, an image that is 512 pixels tall, 512 pixels wide, and has 3 color channels would be represented as a tensor of shape [512, 512, 3]. This tensor becomes the input to the AI model.
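As a minimal illustration of this representation, the sketch below builds a placeholder image tensor with NumPy. The all-zero pixel values and the variable names are assumptions made purely for illustration; in practice, the pixel data would be read from a WSI file.

```python
import numpy as np

# Hypothetical 512 x 512 RGB patch; zeros stand in for real pixel data.
patch = np.zeros((512, 512, 3), dtype=np.uint8)

# The shape follows the [height, width, channels] convention.
height, width, channels = patch.shape

# Intensity of the red channel (index 0) at row 10, column 20.
red_value = patch[10, 20, 0]

print(patch.shape)
```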

Depending on the task, an AI model may produce different types of outputs including, but not limited to, the following:

For segmentation tasks, the output is typically another image of the same spatial dimensions as the input, where each pixel is assigned a label indicating the tissue class (eg, tumor, stroma, background). This results in a pixel-wise map that outlines structures of interest.

For classification tasks, the model outputs a vector of probabilities corresponding to different diagnostic categories (eg, benign vs malignant).

For regression tasks, the model produces one or more continuous numeric values—for example, a predicted tumor grade or biomarker expression level.

For detection tasks, the output may consist of a set of bounding boxes and class labels identifying specific features (eg, mitotic figures or lymphocytes) within the image.
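The four output types above differ mainly in their shape. The following sketch shows one plausible form for each, using random or made-up numbers in place of real model outputs. The three tissue classes, the score values, and the (x_min, y_min, x_max, y_max) box convention are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # random scores stand in for a real model

# Segmentation: per-pixel scores for 3 hypothetical classes
# (tumor, stroma, background), reduced to one label per pixel.
pixel_scores = rng.normal(size=(512, 512, 3))
segmentation_mask = pixel_scores.argmax(axis=-1)  # shape (512, 512)

# Classification: raw scores converted to probabilities with a softmax.
class_scores = np.array([0.3, 1.2])  # hypothetical [benign, malignant] scores
exp_scores = np.exp(class_scores - class_scores.max())
class_probabilities = exp_scores / exp_scores.sum()

# Regression: a single continuous value (made-up expression level).
predicted_expression = 0.42

# Detection: bounding boxes as (x_min, y_min, x_max, y_max) plus a label.
detections = [
    {"box": (34, 50, 58, 74), "label": "mitotic_figure"},
    {"box": (210, 120, 230, 141), "label": "lymphocyte"},
]
```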

Due to the extremely large size of WSIs, often containing gigapixels of data, it is computationally infeasible for most AI models to process an entire image at once. Instead, a common strategy is to divide the image into smaller subregions called patches, typically ranging from 128 × 128 to 512 × 512 pixels. Each patch is analyzed individually by an AI algorithm, which generates predictions for that localized region. These predictions can then be aggregated to produce slide-level or patient-level outputs, such as:

Combining patch-level classifications to generate a heatmap showing spatial distributions of tumor and nontumor areas.

Summing detected features (eg, mitoses) across all patches to compute a total count.

Averaging or voting across patch-level predictions to generate a final diagnostic classification.
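The patch-and-aggregate strategy described above can be sketched as follows. This is an illustrative outline only: `toy_patch_model` is a hypothetical stand-in for a trained network, and the patch size, threshold, and aggregation rules are assumptions rather than recommended settings.

```python
import numpy as np

PATCH_SIZE = 256  # assumed value within the 128-512 pixel range noted above


def toy_patch_model(patch):
    """Hypothetical stand-in for a trained model.

    Returns the patch's mean intensity scaled to [0, 1] as a pseudo
    'tumor probability'; a real model would be a trained network.
    """
    return float(patch.mean()) / 255.0


def patch_heatmap(image, patch_size=PATCH_SIZE):
    """Score each non-overlapping patch; partial edge patches are ignored."""
    rows = image.shape[0] // patch_size
    cols = image.shape[1] // patch_size
    heatmap = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            patch = image[i * patch_size:(i + 1) * patch_size,
                          j * patch_size:(j + 1) * patch_size]
            heatmap[i, j] = toy_patch_model(patch)
    return heatmap


def aggregate(heatmap, threshold=0.5):
    """Combine patch-level scores into region-level summaries."""
    positive = heatmap >= threshold
    return {
        "mean_probability": float(heatmap.mean()),         # averaging
        "majority_positive": bool(positive.mean() > 0.5),  # voting
        "positive_patches": int(positive.sum()),           # summing
    }


# Toy 512 x 512 RGB region: bright top half, dark bottom half.
region = np.zeros((512, 512, 3), dtype=np.uint8)
region[:256] = 255

heatmap = patch_heatmap(region)
summary = aggregate(heatmap)
print(heatmap)
print(summary)
```

The heatmap here is simply the grid of patch-level scores; a production system would typically also handle overlapping patches, tissue detection, and more sophisticated aggregation.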

In some cases, models are trained using multiple types of input data—an approach known as multimodal learning. A multimodal AI system may simultaneously process image patches alongside clinical variables (such as patient age, tumor grade, or hormone receptor status) or textual data (such as free-text pathology reports or radiology summaries). These systems are designed to more closely reflect the real-world diagnostic process, where a pathologist synthesizes information from multiple sources to make informed decisions.

To make these outputs useful in clinical settings, the model’s predictions are typically presented as overlays (eg, false-color segmentation masks), visual heatmaps, or structured outputs (eg, scores or classifications) integrated into digital pathology platforms. The goal is to offer interpretable, reproducible results that complement the pathologist’s expertise.

By understanding how AI models process, interpret, and aggregate image data, clinicians and regulators are better equipped to assess their reliability, clinical value, and appropriate use in practice. Transparency in how models function—and how they arrive at their predictions—is essential for building trust and ensuring safe integration into diagnostic workflows.

Categories of AI applications

There are multiple ways to categorize AI applications. We discuss two that are particularly relevant in the context of validation, regulation, and payment for AI algorithms applied to digital pathology.

Regulatory status of AI laboratory tests in pathology

Regulatory agencies across countries and regions rely on distinct approaches to regulating medical software. These are rapidly changing as AI applications continue to evolve and mature. However, broadly speaking, software (with or without AI) is categorized based on its intended use.

In particular, software is considered a medical device if it is intended to diagnose, treat, or drive or inform clinical decisions for an individual patient. Digital pathology software that would clearly fall into this category includes software intended to detect or diagnose cancer, produce automatic cancer grading, or automatically select patients for the use of a drug as a companion diagnostic.

Software that stores, transfers, displays, or manages data without interpretation is generally not a medical device. Examples in digital pathology of nonmedical device software include PACS/LIS integration software, image compression or caching software, software intended to segment regions of interest but not diagnose, and WSI viewers specifically being used for nondiagnostic purposes.

United States

In the United States, regulatory oversight of AI laboratory tests in pathology involves a combination of federal agencies and legislative frameworks.

First, the Food and Drug Administration (FDA) regulates AI software and devices that meet the definition of a medical device. The FDA defines Software as a Medical Device (SaMD) as “software intended to be used for one or more medical purposes that perform these purposes without being part of a hardware medical device.”⁵⁵ The FDA has been actively developing policies for SaMD, including AI/ML-based technologies, emphasizing a total product lifecycle approach that accommodates iterative learning and updates. The agency’s framework distinguishes between locked algorithms (fixed behavior) and adaptive algorithms that change with real-world data. AI tools that aid in diagnosis or inform clinical decisions typically require premarket review and clearance or approval, depending on risk classification.

The FDA does not normally regulate a laboratory’s use of a device. Consequently, many laboratory tests are not FDA-cleared, even though the devices used to perform those tests may be. This distinction proved crucial in the 2025 Texas federal court decision American Clinical Laboratory Association v FDA,⁵⁶ which vacated the FDA’s attempt to regulate laboratory-developed tests (LDTs) as medical devices. The court’s reasoning centered on the interpretation of “test system,” holding that while physical devices are subject to FDA jurisdiction, the professional laboratory service using those devices is not. We should note that we are unaware of the use of the term “laboratory-developed test” within any regulations that are currently in force, so we are unable to provide a formal definition. However, the term is generally used to describe laboratory services that use devices that have not been FDA-cleared or that use devices in a way that does not align with their FDA labeling. This remains an evolving area of law, and future legislative or regulatory developments may alter this framework.

Second, the CMS plays a dual role in laboratory oversight: it administers the Clinical Laboratory Improvement Amendments (CLIA) program that regulates laboratories nationwide, and it operates Medicare, the largest health payment program in the United States. Clinical laboratories in the United States must obtain a CLIA certificate in order to operate, and some states have additional licensing requirements. While not required, many laboratories in the United States seek additional accreditation by the CAP.⁵⁷

Under Medicare, CMS distinguishes between physician pathology services and clinical diagnostic laboratory tests. Physician services must both be performed by a physician and ordinarily require a physician.⁵⁸ Clinical diagnostic laboratory tests, such as genetic tests and serum chemistry panels, do not ordinarily require a physician to perform the service. An AI application in digital pathology may fall into either category depending on its use case. When a physician pathologist provides the work, the AI serves as part of a physician pathology service. When no physician work is required to generate the test result, the AI functions as a clinical laboratory test.

Europe and the United Kingdom

In the European Union, in vitro diagnostics (IVDs) are regulated under the In Vitro Diagnostic Regulation (IVDR),⁵⁹ which replaced the earlier IVDD⁶⁰ and substantially strengthened requirements for clinical evidence, risk classification, postmarket surveillance, and oversight of higher-risk devices. Unlike the centralized FDA model in the United States, EU device oversight relies on notified bodies, independent organizations designated by member states, to conduct conformity assessments. Manufacturers must demonstrate compliance with the IVDR’s general safety and performance requirements, implement a quality management system (typically ISO 13485⁶¹), and obtain CE marking to place a device on the market. In the United Kingdom, most of Great Britain continues to operate under a transitional IVDD-based framework overseen by the Medicines and Healthcare products Regulatory Agency, while Northern Ireland aligns with the EU IVDR.

Software, including AI algorithms, is regulated as a medical device in Europe when its intended use involves diagnosis, prevention, monitoring, prediction, prognosis, or treatment of disease.⁶¹ AI tools used in laboratory medicine, such as diagnostic classifiers or risk-stratification algorithms, therefore qualify as IVD medical devices when they generate clinical outputs based on biological samples. Under the IVDR, such systems are risk-classified (Classes A–D), with most clinically impactful AI systems falling into higher-risk categories requiring notified body review, robust performance evaluation (scientific validity, analytical performance, and clinical performance), and ongoing postmarket performance follow-up.

Clinical laboratories themselves are regulated separately from devices. While the IVDR governs commercially marketed IVDs and imposes conditions on in-house tests, laboratories are primarily overseen through national licensing and accreditation systems. Most European medical laboratories are accredited under ISO 15189,⁶² which sets standards for quality management, personnel competence, assay validation, and external quality assessment. Thus, in Europe, AI algorithms used in laboratory medicine are regulated as medical devices under IVDR (or UK medical device law), whereas laboratories operate under parallel accreditation and health system regulatory frameworks.

Other jurisdictions

In Canada, IVDs are regulated as medical devices under the Food and Drugs Act and the Medical Devices Regulations, administered by Health Canada.⁶³ IVDs are classified into Classes I–IV based on risk, with most clinically significant diagnostic assays and AI-driven diagnostic software falling into Classes II–IV. Higher-risk devices require a Medical Device Licence⁶⁴ supported by evidence of safety and effectiveness, quality system certification under ISO 13485 through the Medical Device Single Audit Program, and ongoing postmarket surveillance. Software intended for diagnostic or clinical decision-making purposes, including AI-based tools, is regulated as SaMD. In parallel, clinical laboratories are regulated at the provincial level and are typically accredited under ISO 15189 or equivalent provincial accreditation frameworks (eg, Accreditation Canada Diagnostics), creating a dual structure in which devices are federally regulated while laboratories are provincially overseen.

Across Asia, regulatory systems vary considerably but generally treat IVDs—including AI-based diagnostic software—as medical devices subject to national regulatory authority oversight. In Japan, IVDs are regulated under the Pharmaceuticals and Medical Devices Act (PMD Act) by the Pharmaceuticals and Medical Devices Agency (PMDA)⁶⁵ and the Ministry of Health, Labour and Welfare (MHLW), with risk-based classifications and premarket review for higher-risk devices. In China, the National Medical Products Administration (NMPA) regulates IVDs under a three-class risk system, requiring local type testing and, for higher-risk products, clinical evaluation. Singapore’s Health Sciences Authority (HSA) and South Korea’s Ministry of Food and Drug Safety (MFDS) operate similar risk-based frameworks. AI algorithms intended for diagnostic use are typically regulated as SaMD and must demonstrate analytical and clinical performance. Laboratory oversight, however, is generally handled separately through national accreditation schemes, frequently aligned with ISO 15189 standards.

In Australia, IVDs are regulated by the TGA⁶⁶ under the Therapeutic Goods Act 1989 and associated medical device regulations. IVDs are classified into Classes 1–4 (low to high risk), with most clinically significant diagnostic tests and AI-based diagnostic software falling into higher classes requiring conformity assessment and inclusion in the Australian Register of Therapeutic Goods (ARTG). Australia recognizes conformity assessment evidence from comparable jurisdictions in certain circumstances but maintains its own regulatory oversight. Laboratories are regulated separately through the National Pathology Accreditation Advisory Council standards and accreditation by the National Association of Testing Authorities, typically to ISO 15189. As in Canada and much of Asia, this creates a bifurcated system in which AI diagnostic devices are regulated as medical devices at the national level, while laboratories operate under accreditation and professional regulatory frameworks.

AMA classification

The American Medical Association’s (AMA) CPT Manual classifies AI into “assistive,” “augmentative,” or “autonomous.” Autonomous AI is further classified into Levels I, II, or III based on the extent to which the AI initiates treatment. While this taxonomy was designed to fit all specialties, some specialty-specific nuance must be considered when applying this framework to a particular medical specialty.⁶⁷

The AMA’s taxonomy may be appropriate for coding but does not meet the needs of these recommendations for three reasons. First, broadly speaking, the AMA’s taxonomy focuses on the level of human interpretive effort that the provider billing for the service must provide. Second, these recommendations represent an approach to evaluating the role of AI within a patient’s entire care journey, rather than limiting it to the nature of the specific procedure provided by a billing entity, which is out of the scope of the AMA’s CPT process. Third, the AMA classification is still being very actively edited as of the writing of this article.

Validation

Validation of digital pathology systems, and more broadly, software and hardware systems that perform a clinical function, is a very broad space encompassing aspects of clinical validation, usability, information quality, accuracy, repeatability, reproducibility, robustness, and compliance. The scope of these recommendations is limited to the three areas of validation described below.

Analytical validation of AI

In the context of digital pathology, what constitutes analytical validation has been outlined by various regulatory and professional bodies, which include:

Clinical and Laboratory Standards Institute (CLSI): Provides detailed protocols in documents such as CLSI EP05-A3 (precision),⁶⁸ EP17-A2 (detection limits),⁶⁹ and EP15-A3 (verification of precision and accuracy).⁷⁰

U.S. FDA: Outlines expectations for analytical validation in the context of SaMD.⁵⁵

CMS under CLIA: Requires laboratories to verify or establish performance specifications for nonwaived tests.⁷¹

CAP: Offers accreditation checklists that require evidence of analytical validation for new or modified tests.⁷²

In the context of these recommendations, a summary of Analytical Validation inclusive of the aforementioned definitions would be the process of systematically evaluating and documenting a test or system’s technical performance to ensure it reliably and accurately measures the intended analyte or output under defined conditions. In other words: does the test accurately and reliably measure what it is supposed to measure under defined conditions, regardless of clinical meaning?

While we are not proposing an entirely new approach to analytical validation, these guidelines do propose adjustments and applications of the aforementioned principles that are particular to AI.

Clinical validation of AI

Several regulatory and standards organizations provide guidance on clinical validation, especially for SaMD, LDTs, and IVDs:

U.S. FDA: FDA defines⁵⁵ clinical validation as the process of establishing that the test output is clinically meaningful for its intended use. This is outlined in guidance such as: “Clinical performance must be demonstrated through studies that measure how accurately the test predicts a clinical condition.”

CLSI: CLSI provides⁷³ protocols for clinical performance studies and test evaluation in documents like EP24-A2.

CAP: CAP’s accreditation program⁷² requires laboratories to document clinical validation for tests, especially LDTs and modified assays, in accordance with stated clinical purpose.

In the context of these recommendations, a summary of Clinical Validation inclusive of the aforementioned definitions would be the process of systematically evaluating whether a test or system is clinically meaningful, that is, whether it accurately and reliably predicts or correlates with a clinical condition, risk, or outcome in the intended population and setting.

Clinical utility

Clinical utility is a critical aspect of evidence generation for test adoption and reimbursement. Major sources defining clinical utility include:

CMS and Medicare Administrative Contractors: CMS requires evidence of clinical utility for coverage decisions under the Medicare program.

National Academy of Medicine (formerly IOM): Emphasizes that clinical utility is a cornerstone of evidence-based medicine, particularly for genomic and personalized diagnostics.

In the context of these recommendations, a summary of Clinical Utility inclusive of the aforementioned definitions would be the evidence that improved patient outcomes can be attributed to the use of the test either directly or via a chain-of-evidence approach.

Digital Pathology Association’s approach to clinical utility

Clinical utility is a complex and often contested concept, representing one of the greatest sources of interpretation and disagreement in diagnostic testing. Accordingly, the Digital Pathology Association (DPA) seeks not only to define clinical utility and establish associated standards but also to articulate a structured process for evaluating whether clinical utility has been demonstrated.

Before we propose a formal framework to guide determinations of clinical utility, we first review four sources that appear to use differing language to address similar notions.

First, the National Cancer Institute defines⁷⁴ clinical utility as: “A term that refers to the likelihood that a test will, by prompting an intervention, result in an improved health outcome. The clinical utility of a genetic test is based on the health benefits related to the interventions offered to individuals with positive test results.” Second, the National Academy of Medicine has adopted⁷⁵ the EGAPP working group’s definition of clinical utility⁷⁶: “Simply stated, clinical utility is defined as the use of a clinical test’s result to make a treatment decision that positively changes the outcome of a patient.” Third, Medicare does not explicitly provide a definition for clinical utility but instead uses clinical utility as part of the evidentiary standard to determine whether a test is “reasonable and necessary.”^77,78 Specifically, Medicare’s criteria require that a medical intervention be safe and effective, not experimental, appropriate in duration and frequency, consistent with accepted standards of medical practice, delivered in an appropriate setting by qualified personnel, and tailored to meet the patient’s medical needs; in addition, “it should be at least as beneficial as any existing, medically appropriate alternative.” Lastly, the Agency for Healthcare Research and Quality uses a framework⁷⁹ for assessing clinical utility based on attaching patient outcomes to a particular diagnostic technology and outlining particular evidentiary questions.

A common theme among all of these approaches is the tying of a particular medical intervention (service or test) to an improved health outcome (reduced morbidity, mortality). In certain cases, the improvement in patient outcomes can be based on very direct evidence (mortality benefit of statins being a prime example⁸⁰). However, most diagnostics in routine clinical use have not used this evidentiary approach. Even FDA-approved companion claims for diagnostics^81,82 are supported by a chain of evidence via concordance to a clinical trial assay. Indeed, the chain of evidence forms the typical approach for establishing clinical utility in diagnostics, in which concordance studies to well-accepted diagnostic approaches and noninferiority studies to accepted diagnostic approaches are expected to be among the approaches to evidence generation. Figure 1 illustrates the two approaches to evidence generation.

Fig. 1.

A flowchart illustrating the manner in which evidence for clinical utility can be obtained either directly or via chain-of-evidence methods.

Finally, we can propose an approach to clinical utility that applies the definitions above to the context of digital pathology in a manner sensitive to how evidence is practically generated. In particular, clinical utility can be defined as the following: The use of a clinical test’s result to make a treatment decision that positively changes the outcome of a patient, whether demonstrated via a direct or chain-of-evidence approach.

Recommendations

The following are a set of comprehensive recommendations that we hope will assist providers, patients, algorithm developers, and payers to ensure that digital pathology systems, inclusive of uses of AI in digital pathology, are properly validated and deployed in a safe and effective manner and that they have the intended effect of ultimately improving patient outcomes. Our recommendations are subdivided into three sections.

The first section covers recommendations on how digital pathology systems should be deployed. We hope that these will assist pathology laboratories and large health care organizations in planning for, and increasing adoption of, digital pathology.

The second section covers recommendations on how digital pathology algorithms are validated. We hope that these will assist pathology laboratories in verification and validation of AI systems and device manufacturers in communicating the performance characteristics of their models.

The third section covers recommendations for how AI algorithms for digital pathology should be used clinically to maximize safety and efficacy. We hope that these will assist providers in adopting digital pathology, and payers and regulators in evaluating proper uses of digital pathology.

Recommendations for the deployment of digital pathology

R1: Pathology labs and health care institutions should begin planning for digital pathology

Digital pathology already enables significant improvement to patient care and will continue to expand opportunities in the future. The benefits of adoption of digital pathology include workflow optimization and efficiency,^83,84 reproducible image analysis,^85,86 remote access for consultation, research, and education,^85–88 the ability to align and review slides across slide stains,⁸⁹ prevention of slide loss or degradation,⁸⁹ and the ability to use AI algorithms.^83,89 Furthermore, the benefits of adopting AI algorithms on top of digital pathology include further gains in workflow efficiency,^12,90,91 greater diagnostic objectivity,^4,91,92 and more informed decision support.¹²

Unfortunately, individual physicians are often unable to implement digital pathology recommendations in their own practice until their institutions adopt digital pathology. By adopting digital pathology, institutions will enable pathologists to practice in a manner contemporary with current standards of excellence and enable better communication and coordination of care.

A critical barrier to the adoption of digital pathology is that it relies on not only equipment and software purchases but also institutional changes in both human processes and information technology (IT) infrastructure, which require significant financial and operational planning. Consequently, we recommend beginning this planning early so as to ensure that laboratories and health care institutions have adequate financial, human, and IT resources to successfully implement digital pathology.

Certainty of evidence: Low

Strength of recommendation: Conditional

R2: Access to digital pathology is specifically indicated for cases where traditional pathology is not readily available

Digital pathology, independent of AI, is recognized as a medical necessity in settings where traditional pathology services are limited or unavailable, enabling timely and accurate diagnostic support through remote access and consultation. The CAP and the American Telemedicine Association have endorsed digital pathology for remote interpretation, particularly in underserved or rural areas, to bridge gaps in access to subspecialty expertise.⁹³ In global health contexts, the World Health Organization has also highlighted digital pathology as a tool to reduce diagnostic delays and improve cancer care equity where pathology infrastructure is lacking.⁹⁴ Studies have demonstrated that WSI is noninferior to traditional glass slide review for primary diagnosis, with concordance rates exceeding 95%, thereby supporting its use for routine clinical interpretation.⁹⁵ As such, digital pathology is not merely a convenience but a critical infrastructure element that ensures diagnostic continuity and timely patient care in scenarios where in-person pathology is not feasible due to geographic, logistical, or personnel constraints.

Consequently, we recommend that the use of digital pathology be seen as particularly indicated when the specimen being reviewed is from a patient seen in one of the following settings, as documented in the patient record:

CAHs

Rural Health Center

FQHC

Independent Laboratory

The service is provided for a patient of one of the facility types above as part of a contractual arrangement with the facility.

Certainty of evidence: N/A

Strength of recommendation: Strong

R3: Clinical AI used in digital pathology should be overseen by a qualified pathologist and does not replace a pathologist

Pathologists are central to the practice of anatomical pathology and laboratory medicine, not only for their diagnostic skills but also for overseeing and ensuring high-quality results from a laboratory. Within the United States, there are varying laws governing the practice of medicine and the oversight of laboratories. This recommendation generally assumes that a pathologist is a licensed medical doctor who specializes in the diagnosis of disease primarily from in vitro specimens and who engages in the oversight of laboratories or laboratory tests. As part of training, pathologists learn to recognize a wide range of diseases and tissue types from a wide variety of specimen types.

Current AI systems may be very good at addressing specific tasks for which they were trained with well-defined inputs and outputs. However, high-quality health care demands that health care providers identify even very rare diseases and accommodate the diagnosis of patients with rare presentations of common diseases. This requires experts who are able to assess and interpret a wide range of potentially relevant information. Therefore, clinical AI systems used in pathology require pathologist oversight, even if the systems provide information that does not require pathologist interpretation. Clinical AI systems complement the skills of a pathologist rather than replace them.

Certainty of evidence: N/A

Strength of recommendation: Strong

Recommendations for algorithm validation

R4: The digital pathology scanning process should be validated separately from an AI algorithm

The process of scanning a physical histology slide to produce a digital image file is a prerequisite for both human and AI interpretation of digital pathology images. For AI interpretation of digital pathology images, we recommend that the digital pathology scanning mechanism (typically a combination of staining protocol and a digital slide scanner) be validated separately from any subsequent image analysis component. CAP has already provided guidelines⁹⁶ for validating WSIs independently of whether the images are consumed by humans or algorithms. In cases where the digital slide scanner is not part of a regulatory-approved (FDA, CE, etc) system, it should be validated in the lab in which it is being used in accordance with a consensus guideline.⁹⁶ Once a digital pathology scanner is validated, any AI algorithm intended to be used with the scanner must be subsequently validated with digital images produced from that scanner (see R5).

Certainty of evidence: N/A

Strength of recommendation: Strong

R5: Analytical and clinical validation of AI algorithms must be demonstrated for each scanner model in which the algorithm is to be clinically used

AI algorithms in digital pathology rely on analyzing WSIs, and the performance of these algorithms can vary significantly depending on the slide scanner used to digitize the pathology specimens. Differences in scanner hardware, optics, resolution, color calibration, compression algorithms, and file formats can subtly alter the appearance of histological features in ways to which AI models may not be robust, especially when models are trained on a narrow set of devices.

Multiple peer-reviewed studies have demonstrated that even state-of-the-art AI models show performance degradation when applied to images captured on a different set of scanners than those used during AI algorithm training. In a study by Aubreville et al,⁹⁷ an AI algorithm is trained to detect mitotic figures. When the algorithm is applied to the exact same slides scanned on three different digital slide scanners, the authors observe a 20–40% difference in F1 scores between scanners. In Swiderska-Chadaj et al,⁹⁸ the authors develop an algorithm for cancer detection that is tested via a multicenter, multiscanner protocol. They find that the algorithm’s accuracy varies between 5% and 15% depending on the digital slide scanner used for testing.
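For readers less familiar with the F1 score used in such comparisons, the sketch below shows how it can be computed per scanner from detection counts. The counts are entirely hypothetical and are not taken from the cited studies; the scanner names are placeholders.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        return 0.0
    return 2 * true_positives / denominator


# Hypothetical detection counts for the same slides digitized on two scanners.
counts_by_scanner = {
    "scanner_A": {"tp": 80, "fp": 10, "fn": 10},
    "scanner_B": {"tp": 60, "fp": 20, "fn": 40},
}

f1_by_scanner = {
    name: f1_score(c["tp"], c["fp"], c["fn"])
    for name, c in counts_by_scanner.items()
}

for name, value in f1_by_scanner.items():
    print(f"{name}: F1 = {value:.3f}")
```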

Regulators have recognized this risk. The FDA, in its 2021 clearance of Paige Prostate,⁹⁹ explicitly limited the algorithm’s use to images generated by the Philips Ultra Fast Scanner, reflecting that performance data had only been validated on that system. This highlights the agency’s position that AI tools must be evaluated within the specific ecosystem in which they are deployed, including scanner hardware. The VENTANA^® TROP2 (EPR20043) RxDx Device achieved breakthrough FDA status as the first computational pathology companion diagnostic.¹⁰⁰ This significant achievement clears the way for AstraZeneca’s Quantitative Continuous Scoring (QCS) AI algorithm to be used to objectively quantify TROP2 expression and subsequently support treatment decision-making for targeted therapies based on TROP2 quantification. This algorithm is validated exclusively for Roche’s DP 200 and DP 600 scanners.

Without scanner-specific validation, AI algorithms are at risk of reduced accuracy, false negatives, or overcalls, particularly in edge cases or rare morphologies. To mitigate these risks, the CAP statement for AI in pathology recommends validating AI tools on representative scanner platforms and ensuring ongoing performance monitoring as hardware evolves.¹⁰¹

Note that many commonly used digital slide scanners are not yet FDA-approved. In these cases, in order to reliably validate and utilize AI algorithms on such scanners, images from these scanners must first be validated for each indication separately from the AI algorithm (see R4), and second, each algorithm must be shown to validate on images from these scanners.

Certainty of evidence: High

Strength of recommendation: Strong

R6: Interscanner concordance studies are acceptable to generalize a validated AI algorithm’s performance from one slide scanner to another when using the original digital slide scanner as the gold standard

When the same physical slide is scanned using two different slide scanners, the resulting pair of digital slide images can be significantly distinct in appearance. This raises the natural question: if an algorithm is validated on slide scanner A, can it be reliably used on slide scanner B? To be clear, this is not a question of comparing the algorithm’s performance on slide scanner B to human performance but rather ensuring that an algorithm’s performance on slide scanner B is effectively equivalent to its performance on slide scanner A.

This question has become central to regulatory, clinical, and operational decision-making. Sufficient evidence^97,102,103 suggests that we cannot assume that an algorithm validated on scanner A will necessarily also validate on scanner B. Furthermore, while many color normalization techniques have been introduced and may improve cross-scanner performance, they do not necessarily close the performance gap.^104,105 Consequently, a scientifically grounded and pragmatic approach is to conduct concordance studies, using the original validated scanner as the reference standard to assess performance on alternate scanners.

Interscanner concordance studies aim to determine whether an algorithm, when applied to slides digitized on a new scanner, produces output values that are statistically and clinically equivalent to those it generates from its originally validated scanner. These studies typically compare diagnostic outputs, such as detection sites, scores, or classifications, from images of the same slide scanned on different systems. If high agreement is demonstrated, it can be reasonably concluded that the algorithm performs similarly on both scanners for the intended use. As an illustrative example, a study might demonstrate that Cohen’s kappa exceeds 0.8 or that categorical concordance is ≥95%. Ultimately, the choice of both the metric and the threshold is task-dependent, with higher-risk tasks requiring more stringent thresholds.
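The two agreement measures named above can be computed directly from paired algorithm outputs. The following is a minimal sketch, assuming categorical outputs for the same reference slides scanned on two scanners; the slide data, the function names, and the default thresholds are illustrative assumptions rather than values prescribed by this statement.

```python
from collections import Counter


def percent_agreement(calls_a, calls_b):
    """Fraction of slides on which the two scanners yield the same output."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(a == b for a, b in zip(calls_a, calls_b)) / len(calls_a)


def cohens_kappa(calls_a, calls_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(calls_a)
    p_observed = percent_agreement(calls_a, calls_b)
    freq_a, freq_b = Counter(calls_a), Counter(calls_b)
    # Chance agreement from each scanner's marginal category frequencies.
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # every output is the same single category on both scanners
    return (p_observed - p_chance) / (1.0 - p_chance)


def is_concordant(calls_a, calls_b, kappa_min=0.8, agreement_min=0.95):
    """Apply illustrative acceptance thresholds (task-dependent in practice)."""
    return (cohens_kappa(calls_a, calls_b) > kappa_min
            and percent_agreement(calls_a, calls_b) >= agreement_min)


# Hypothetical algorithm outputs for the same ten reference slides.
scanner_a = ["pos", "pos", "neg", "neg", "pos", "neg", "neg", "pos", "neg", "neg"]
scanner_b = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "pos", "neg", "neg"]

print(f"agreement = {percent_agreement(scanner_a, scanner_b):.2f}")
print(f"kappa     = {cohens_kappa(scanner_a, scanner_b):.2f}")
print(f"concordant under example thresholds: {is_concordant(scanner_a, scanner_b)}")
```

A real study would use far more slides than this toy set and would typically report confidence intervals alongside the point estimates.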

A critical component of such interscanner concordance studies is the use of a set of slides, typically referred to as reference slides or quality assurance slides, which are scanned on each of the digital slide scanners in the study. The number of slides involved and their content should reflect the typical variability of samples that one would expect the algorithm to be applied to.

This approach mirrors established practices in pathology and radiology. For example, assay bridging studies in laboratory medicine routinely use a reference standard as the comparator to validate performance on alternative platforms, and the FDA has accepted such bridging methodologies¹⁰⁶ for companion diagnostics and digital imaging systems. Moreover, CAP’s statement on AI validation in pathology¹⁰¹ recognizes that “analytical validation using digital concordance across systems can be sufficient if it is performed rigorously and includes pathologist review or clinical outcome linkage.”

Scientific studies support this method of concordance. A paper by Lu et al¹⁰⁷ demonstrated that AI-based Gleason scoring models maintained >95% concordance when applied to prostate biopsies scanned on three different WSI systems—after modest color normalization was applied.

Importantly, full analytical revalidation of AI tools on every scanner is impractical and unnecessarily resource-intensive, especially when high-quality reference slides and rigorous concordance protocols can demonstrate equivalency. Requiring full clinical validation on every new scanner would create significant regulatory and adoption bottlenecks, slowing innovation and access to AI-enabled care without clear gains in safety.

Consequently, interscanner concordance studies are an efficient, evidence-supported, and clinically meaningful mechanism to generalize AI performance across digital pathology scanners.

Certainty of evidence: High

Strength of recommendation: Strong

R7: When scanner settings can be adjusted, AI algorithm validation using a specific scanner should specify scanner settings and conditions

Some WSI systems are not locked with specific settings. For example, changing focus parameters, image compression, or z-stacking parameters may be sources of variability that can impact the results of an AI model. This is less of a concern on WSI systems with locked settings, but laboratories or manufacturers should ensure that they specify the relevant settings and parameters for use of AI algorithms when scanner settings are adjustable. It may be reasonable to specify a range of parameters or settings, but in such a case, validation should be done across that range.

When a WSI system has locked settings, or always runs using clearly specified settings, then the validation may simply reference the default settings, so long as these remain the default settings.

Certainty of evidence: Medium

Strength of recommendation: Strong

R8: Analytical validation should include measures of both accuracy and reliability

Accuracy refers to how closely an AI tool’s output matches the reference standard (eg, expert pathologist consensus or established clinical outcomes). Without a quantification of accuracy, it is impossible to assess the model’s ability to detect or classify pathology correctly. Numerous studies have shown that measuring diagnostic accuracy is critical in determining the clinical value of AI algorithms. For example, Campanella et al¹⁰⁸ evaluate deep learning models at diagnosing patients with prostate, breast, and lung cancer using the area under the ROC curve (AUC) in order to determine the trade-off between sensitivity and specificity. Steiner et al¹⁰⁹ similarly use AUC to evaluate AI-assisted reads in breast cancer. Coudray et al¹¹⁰ evaluate their AI algorithms for predicting lung cancer mutations using Cohen’s kappa to measure the concordance between the algorithm and the reference standard of expert pathologists.

Reliability, also sometimes referred to as precision, is often operationalized as intra- and inter-run reproducibility or consistency across varying conditions (eg, different scanners, staining batches, or image artifacts) and is equally important. AI tools that are accurate in a controlled setting but yield inconsistent results when deployed in the real world pose significant clinical risk. Lin et al¹⁰⁵ demonstrate the degree to which AI models trained on a particular batch of images failed to generalize to images in a different batch. Miranda Ruiz et al¹¹¹ demonstrate that different scanner types affect a trained AI model’s outputs in the context of Amyloid-β detection. Ochi et al¹¹² demonstrate that AI models are sensitive to variations in the hematoxylin and eosin (H&E) staining protocol. These examples underscore the need for reliability assessments in validation studies.

The FDA and CAP both emphasize the need for both accuracy and reliability in analytical validation. The FDA’s guidance⁵⁵ on SaMD explicitly recommends verification and validation procedures that include performance characteristics like accuracy and reproducibility. Likewise, CAP’s recommendations¹⁰¹ on validating AI in pathology recommend evaluation of both “diagnostic performance (accuracy)” and “consistency across relevant input and operating conditions (reliability).”

Excluding either metric introduces clinical risk: an AI tool that is not accurate may produce misleading outputs, while one that is not reliable may produce unpredictable results across cases, patients, or environments.
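To make the distinction concrete, the sketch below reports one accuracy measure (sensitivity and specificity against a reference standard) and one reliability measure (the fraction of cases whose output is identical across repeated conditions). It is an illustration under simplifying assumptions: outputs are binary, the data are hypothetical, and the function names are not drawn from any guidance document.

```python
def sensitivity_specificity(predicted, reference):
    """Accuracy against the reference standard for binary outputs (1 = positive).

    Assumes the reference contains at least one positive and one negative case.
    """
    pairs = list(zip(predicted, reference))
    tp = sum(p == 1 and r == 1 for p, r in pairs)
    tn = sum(p == 0 and r == 0 for p, r in pairs)
    fp = sum(p == 1 and r == 0 for p, r in pairs)
    fn = sum(p == 0 and r == 1 for p, r in pairs)
    return tp / (tp + fn), tn / (tn + fp)


def reproducibility(runs):
    """Fraction of cases whose output is identical under every condition."""
    per_case = list(zip(*runs))
    return sum(len(set(outputs)) == 1 for outputs in per_case) / len(per_case)


# Hypothetical data: eight cases, a reference standard, and two runs of the
# same algorithm under different conditions (eg, two scanners).
reference = [1, 1, 1, 0, 0, 0, 0, 1]
run_1 = [1, 1, 0, 0, 0, 0, 1, 1]
run_2 = [1, 1, 0, 0, 0, 1, 1, 1]

sens, spec = sensitivity_specificity(run_1, reference)
print(f"run 1 sensitivity = {sens:.2f}, specificity = {spec:.2f}")
print(f"reproducibility across runs = {reproducibility([run_1, run_2]):.3f}")
```

In this toy example the two runs disagree on one case, so the tool can look equally sensitive under both conditions while still failing a reliability criterion.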

Certainty of evidence: Low

Strength of recommendation: Conditional

R9: AI predictions that can be verified by pathologists should rely on a cohort whose size should be determined by the intended use

Many AI algorithms produce outputs that a pathologist can verify. These include AI algorithms that highlight regions of interest in WSIs, such as mitotic figures, tumor areas, or areas suspicious for metastasis in lymph nodes¹⁰⁹; segmenting and classifying nuclei into categories (tumor cells, stromal cells, and lymphocytes); computing cell-specific features (nuclear size, shape, density); counting lymphocytes and mapping their spatial proximity to tumor cells; and performing automated QC of digital pathology slides by identifying blur, pen markings, bubbles, and other artifacts. The typical approach to validation of such algorithms would be to compare an AI-generated prediction to that of a human pathologist.

The question of how many pathologists to compare with naturally arises because interobserver variability is well-documented in pathology (and in diagnostic decision-making more generally), particularly in the interpretation of complex or borderline cases. Relying on a single pathologist’s diagnosis risks encoding individual bias into the evaluation process, while a consensus among three or more provides a more stable and representative ground truth. Such individual bias is an inherent part of medical practice. Consequently, AI that mimics the bias of its sole user would be noninferior to that user. However, if the AI is intended to reduce bias, then validation on a consensus using multiple pathologists may be necessary.

As a general standard, we would recommend reliance on a minimum of three pathologists. Studies across tumor types, including prostate,^4,113–117 breast,^118–122 colon,^123–127 and lung cancer,^128,129 have shown that diagnostic agreement improves significantly when using multipathologist adjudication panels. Furthermore, using three pathologists enables majority voting in the absence of full consensus, offering a statistically stronger comparator and reducing the likelihood of idiosyncratic error influencing algorithm validation. This approach aligns with practices used in regulatory submissions (though regulatory bodies may demand more depending on the context) and high-quality AI development frameworks where multireader, multicase studies are the standard for establishing reference truth.
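The majority-voting rule described above can be sketched in a few lines. This is an illustrative example only: the case identifiers, the labels, and the choice to return None when no strict majority exists (so that the case is routed to adjudication) are assumptions, not requirements of this statement.

```python
from collections import Counter


def majority_label(reads):
    """Return the label chosen by a strict majority of readers, else None."""
    label, votes = Counter(reads).most_common(1)[0]
    return label if votes > len(reads) / 2 else None


# Hypothetical independent reads from three pathologists per case.
panel_reads = {
    "case-01": ["3+4", "3+4", "4+3"],
    "case-02": ["benign", "benign", "benign"],
    "case-03": ["benign", "3+3", "3+4"],  # no majority: needs adjudication
}

reference = {case: majority_label(reads) for case, reads in panel_reads.items()}
needs_adjudication = [case for case, label in reference.items() if label is None]

print(reference)
print("adjudicate:", needs_adjudication)
```

With three readers and more than two possible labels, a three-way split remains possible, which is why the sketch surfaces such cases explicitly rather than picking a label arbitrarily.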

That being said, there are several exceptions to this standard.

First, even once an AI tool has been validated by the device manufacturer, individual labs may want to perform their own validation to ensure that the tool performs acceptably as their users interact with it. In this case, a single pathologist may suffice, as it is often infeasible for individual labs to rely on greater numbers.

Second, deployments that are research-use only may rely on a single pathologist, but AI developers and clinical collaborators should be made aware of the aforementioned limitations of reliance on single experts, and the use of fewer than three should be adequately justified.

Third, while the intended use of many algorithms is meant to be user agnostic, other tools might be customized to assist a single individual pathologist. In this case, that single pathologist may provide the gold standard for validation. Such AI would not be used to represent a consensus diagnosis and therefore not validated for such a purpose. Examples of such algorithms include tools that might draw contours of regions of interest on digital slides or language models that help craft pathology reports in the style of a single pathologist.

Certainty of evidence: Medium

Strength of recommendation: Conditional

R10: AI predictions that are distinct from pathologist interpretations should use a reference standard that corresponds with the output being predicted

For many tasks that AI algorithms perform (cell segmentation, artifact detection), the natural reference for clinical validation performance is human pathologists. However, what if the AI algorithm is performing a task that pathologists cannot easily visually verify, such as therapeutic response?

AI tasks that pathologists cannot easily visually verify are those for which a pathologist cannot produce the same output simply by examining the contents of one or more slides. Examples include quantifying molecular biomarkers when there is insufficient tissue for performing immunohistochemical (IHC) staining, quantifying the likelihood that a particular patient’s tumor exhibits high microsatellite instability (MSI), or quantifying the likelihood that a particular patient’s tumor is characterized by HRD. A growing number of such products have been released or FDA-cleared in recent years. The FDA granted Breakthrough Device Designation to the TROP2-QCS AI¹⁰⁰ biomarker, which predicts response to Datopotamab deruxtecan by calculating a normalized membrane ratio of TROP2 IHC staining that a human pathologist cannot visualize with precision.^130–132 The Artera AI test¹² produces predictive outputs, such as whether a patient is likely to benefit from adding androgen deprivation therapy to radiation in order to guide decisions about treatment intensification versus de-escalation. A number of vendors have commercialized AI algorithms that produce prognostic risk scores^12,133–135 by analyzing digitized H&E pathology slides from a patient’s biopsy along with clinical variables. AI algorithms have also demonstrated the ability to predict molecular biomarkers from H&E^136–139 and even simulate IHC staining from H&E alone.¹⁴⁰

With this in mind, for tasks that pathologists can visually verify, one or more human pathologists should serve as the reference standard. For tasks that pathologists cannot visually verify, an appropriate reference standard must be obtained for sufficient clinical validation of the algorithm. Table 2 provides a noncomprehensive list of examples of the references that might be used to clinically validate AI algorithms that are not easily visually verifiable.

Table 2.

Examples of Reference Standards for Digital-Pathology-Based Tasks That Are Not Verifiable by a Pathologist

Task	Example reference standard
Molecular biomarker prediction	Molecular biomarker values obtained via traditional molecular tests
IHC virtual staining	IHC slide adjacent to the H&E input
Prognostic risk assessment	Observational outcomes from longitudinal patient data
Predictive therapeutic response	Observational outcomes from longitudinal patient data
MSI-H status from H&E	PCR- or NGS-based MSI testing or IHC staining for mismatch repair proteins (eg, MLH1, MSH2, etc)

H&E, hematoxylin and eosin; IHC, immunohistochemical; MSI, microsatellite instability.

Certainty of evidence: N/A

Strength of recommendation: Strong

R11: As with any laboratory test, analytical quality systems and processes should be established so as to ensure that invalid results are not reported due to laboratory errors

Responsible lab operators are already accustomed to putting into place laboratory-level countermeasures to protect patients from device-generated medical errors. In this respect, AI software should be no different. Indeed, many problems can arise within the performance of AI testing, so it is critical that systems be put in place so as to detect problems and address them. QC should be implemented to address the full end-to-end processes in the testing system to detect any problems that may arise with the test processes that could threaten the validity of results reported on patients. A growing body of evidence^97,102,103,105,111,112 demonstrates the degree to which AI models in particular are sensitive to batch effects, changes in H&E staining protocols, and scanner changes.

QC methods and acceptance criteria should be developed based on the intended use of the AI test and any established criteria on the appropriate specimen types as well as the reportable range of test output. QC may be performed using methods such as intermittent testing of control specimens or within-batch controls. To develop appropriate QC processes, laboratories implementing AI should systematically identify potential sources of variability within the end-to-end testing process and ensure that they have adequate QC methods to identify all such sources of potential test failure.
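As one possible illustration of within-batch controls, the sketch below compares the AI output on each control specimen with an expected value and withholds the batch if any control falls outside its acceptance tolerance. The control identifiers, expected values, and tolerances are hypothetical; actual acceptance criteria would be derived from the intended use and reportable range of the specific test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlResult:
    control_id: str
    expected: float   # output established for this control during validation
    observed: float   # output produced in the current batch
    tolerance: float  # hypothetical acceptance limit for |observed - expected|

    def passed(self):
        return abs(self.observed - self.expected) <= self.tolerance


def batch_is_reportable(controls):
    """A batch is reportable only if controls were run and all of them passed."""
    return bool(controls) and all(c.passed() for c in controls)


# Hypothetical control specimens run alongside one batch of patient slides.
batch_controls = [
    ControlResult("ctrl-low", expected=10.0, observed=10.5, tolerance=1.0),
    ControlResult("ctrl-high", expected=80.0, observed=85.0, tolerance=2.0),
]

failed = [c.control_id for c in batch_controls if not c.passed()]
print("reportable:", batch_is_reportable(batch_controls), "| failed:", failed)
```

Treating a batch with no controls as not reportable is a deliberate design choice in this sketch, so that a missing QC step cannot silently pass.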

Certainty of evidence: Medium

Strength of recommendation: Strong

R12: Laboratories and manufacturers should specify minimum usable sample requirements for AI algorithms

AI algorithms applied to pathology slides must specify limits of use, such as minimum usable tissue area and minimum cancer content, because their performance is highly dependent on input quality and context.

Algorithms trained on datasets with adequate tumor representation may produce unreliable outputs when applied to small biopsies or slides with sparse cancerous regions, leading to false negatives or clinically misleading quantification. For instance, FDA-cleared tools like Paige Prostate⁹⁹ explicitly limit use to prostate needle biopsies with sufficient tumor presence, noting that performance degrades in benign or low-tumor-content samples.⁹⁹ Similarly, studies evaluating PD-L1 scoring algorithms emphasize that accurate quantification requires a minimum number of tumor cells, often recommending thresholds such as ≥100 viable tumor cells to ensure reliability.^141,142 Without enforcing these boundaries, AI tools risk being misapplied outside of their validated domain, undermining diagnostic accuracy and patient safety. Thus, clear specification of use parameters is essential for clinical integration and regulatory compliance.

Labs and manufacturers should consider how to establish what constitutes usable tissue. Artifacts within tissue may impact the extent to which the tissue can be used for inferential purposes by AI algorithms. Tissue containing artifacts to which the algorithm is robust may be usable, while tissue containing artifacts that adversely impact algorithmic validity or containing artifacts with an unknown impact on algorithmic validity should not be considered usable.

Research has shown that artifacts can confound deep learning models, introducing false positives or negatives, particularly in tasks like tumor detection or cell quantification.¹⁴³ For example, a study by Tizhoosh and Pantanowitz¹⁴⁴ highlights how AI performance degrades in the presence of common slide defects unless explicitly trained on artifact-diverse datasets or equipped with preprocessing artifact detection modules. Moreover, the FDA and regulatory bodies increasingly expect manufacturers to characterize performance in the presence of real-world variability, including artifacts, to ensure safe clinical deployment.¹⁴⁵ Therefore, transparent documentation of algorithm robustness—or limitations—in artifact-laden conditions is essential to guide clinical users and avoid misapplication in suboptimal imaging scenarios.
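Limits of use such as these can be enforced as an explicit gate before an algorithm is run. The following is a minimal sketch under stated assumptions: the 100-cell threshold echoes the PD-L1 example above, the minimum usable area is a made-up value, and tissue affected by artifacts is conservatively treated as unusable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleLimits:
    min_tumor_cells: int = 100        # example PD-L1 threshold cited in the text
    min_usable_area_mm2: float = 2.0  # hypothetical value, for illustration only


def adequacy_failures(tumor_cells, tissue_area_mm2, artifact_area_mm2,
                      limits=SampleLimits()):
    """Return the reasons a sample falls outside the specified limits of use."""
    # Artifact-affected tissue is excluded from the usable area.
    usable_area = tissue_area_mm2 - artifact_area_mm2
    failures = []
    if tumor_cells < limits.min_tumor_cells:
        failures.append("insufficient tumor cells")
    if usable_area < limits.min_usable_area_mm2:
        failures.append("insufficient usable tissue area")
    return failures


# Hypothetical samples: an adequate resection and a sparse, artifact-laden biopsy.
print(adequacy_failures(tumor_cells=250, tissue_area_mm2=6.0, artifact_area_mm2=1.0))
print(adequacy_failures(tumor_cells=60, tissue_area_mm2=6.0, artifact_area_mm2=5.0))
```

An empty list means the sample is within the specified limits; a non-empty list gives the reasons the algorithm output should not be reported for that sample.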

Certainty of evidence: Medium

Strength of recommendation: Strong

Recommendations for realizing clinical utility

R13: A clinical AI algorithm should be used per its intended use

Analogous to immunohistochemistry or special stains, the ability to use AI in the laboratory does not imply that it should be universally used. AI should be used when it can improve the interpretation of the pathologist or provide new information that the pathologist is not able to provide in the absence of AI.

For example, clear-cut cases of prostate cancer are unlikely to require AI to determine if cancer is present. Conversely, if there is a question on whether the cancer should be classified as a Gleason 3 + 4 or a Gleason 4 + 3, it may be appropriate to apply AI to address this question so long as the AI has been appropriately validated. Similarly, a liver biopsy showing clear cirrhosis is unlikely to benefit from AI assessment of fibrosis status. However, when the biopsy shows fibrosis, AI that can provide a histological grade may be appropriate for use if the patient’s treatment is dependent upon accurate fibrosis measurement.

Some evidence from inside^4,146 and outside^147,148 of digital pathology suggests that while AI algorithms have demonstrated improved efficiency and accuracy at diagnostic tasks, clinicians may end up relying on the AI algorithm in a manner inconsistent with its intended use. For example, one can imagine a “second read” algorithm being overly relied on in a manner that results in the algorithm effectively being trusted as the primary read.

Ultimately, clinicians should be familiar with each algorithm’s intended use and adopt such algorithms in a manner consistent with that intended use.

Certainty of evidence: Low

Strength of recommendation: Strong

R14: AI algorithms are appropriate for use when they have been validated to complement or replace an existing test

AI algorithms have emerged as laboratory tools that are appropriate for clinical use when they are reliable complements, screens, or substitutes for more invasive, expensive, or time-consuming tests, such as molecular or genomic assays, by extracting equivalent or superior prognostic or predictive information directly from histopathology slides. Indeed, in resource-constrained settings, or when tissue quantity is limited, molecular testing may be intentionally limited.

While AI algorithms that predict molecular signatures are still emerging, an increasing number of studies have demonstrated that AI models can indeed infer molecular features like MSI, tumor mutational burden, or even specific mutations (eg, IDH1, EGFR) from routine H&E-stained slides with clinically relevant accuracy.^149,150 For example, Echle et al¹⁴⁹ showed that MSI could be predicted from colorectal cancer histology with an AUC of up to 0.89, suggesting the potential to triage cases for confirmatory molecular testing.

Commercially available AI algorithms are also emerging that produce similar outputs to existing molecular tests. For example, the ArteraAI Prostate test¹⁵¹ combines digital pathology images and clinical variables to stratify patients by risk and guide treatment decisions. Comparable molecular tests include Decipher Prostate¹⁵² and MDX Health’s GPS.¹⁵³ However, CMS’s price¹⁵⁴ for the AI-driven test, $706.26, is markedly lower than the price for the molecular tests, $3,873.

In addition, AI IHC quantification algorithms that do not require additional reflex testing (eg, in situ hybridization [ISH]) are also medically necessary. For example, a CE-marked HER2 algorithm¹⁵⁵ has demonstrated a reduction in the need for ISH testing by reclassifying HER2 IHC from 2+ to either 0/1+ or 3+. This reclassification of IHC was validated using FISH. When the algorithm downgrades an IHC 2+ (human read 2+) to a 0/1+, the FISH score is 0. When the algorithm upgrades the IHC score to 3+, the FISH is very positive.

The ability to replicate such insights from routinely acquired data such as H&E-stained pathology slides not only streamlines diagnostics and reduces costs but also shortens time to treatment decisions. However, clinical use in this context requires rigorous validation and transparent reporting of model performance to ensure AI can match or exceed the reliability of the replaced test, as recommended by recent best-practice guidelines.¹⁵⁶

Certainty of evidence: Low

Strength of recommendation: Strong

R15: AI algorithms are recommended to assist physicians when sufficiently validated and medically necessary

AI algorithms have already had a significant effect in digital pathology in several respects and, provided they are sufficiently validated and medically necessary, we recommend their use in assisting physicians.

In particular, a number of AI algorithms already assist clinicians in their workflows by detecting and surfacing clinically relevant data without generating conclusions. Examples include AI algorithms that highlight regions of interest in WSIs.¹⁰⁹ FDA-approved devices such as Paige Prostate Detect⁹⁹ assist pathologists by flagging prostate biopsy regions that may contain cancerous features, requiring expert review for final diagnosis. Similarly, algorithms can prescreen colorectal biopsies to identify slides unlikely to contain malignant findings, allowing for more efficient case triaging.¹⁰⁸ These systems have already been shown to improve case review efficiency,^4,157 reduce oversight-related errors, and support pathologists in managing rising case volumes.^144,158 While these particular algorithms do not make clinical decisions themselves, they enhance the diagnostic process by focusing attention, reducing fatigue-related error, and enabling faster, more consistent interpretations—ultimately contributing to improved patient care¹⁵⁹ with increased potential in community hospital settings.¹⁶⁰

Furthermore, a number of AI algorithms produce diagnostically relevant information that can be used to improve patient care. The TROP2 QCS algorithm developed by AstraZeneca objectively quantifies TROP2 expression on IHC WSIs by identifying tumor regions and TROP2-positive membranes to generate a continuous score reflecting staining intensity and tumor cell prevalence,^130–132 reducing interobserver variability and enabling patient stratification in trials of TROP2-targeted therapies. As the first AI-driven H&E companion diagnostic,¹⁶¹ TROP2 demonstrates that clinically actionable biomarker information can be inferred directly from routine H&E slides, reducing tissue use, cost, and turnaround time while capturing intratumoral heterogeneity and establishing AI as a regulated, treatment-enabling diagnostic modality. Another example is the algorithm used in the FDA-approved Artera AI test.¹⁵¹ The algorithm analyzes digitized H&E pathology slides from a patient’s biopsy along with clinical variables such as PSA, age, and tumor stage and produces prognostic outputs (eg, risk of metastasis or prostate cancer-specific mortality) as well as predictive outputs, such as whether a patient is likely to benefit from adding androgen deprivation therapy to radiation. Clinically, it is used to help stratify patients across low-, intermediate-, and high-risk disease and to guide decisions about treatment intensification versus de-escalation.

With this in mind, we propose that AI algorithms are considered sufficiently validated when:

1. Peer-reviewed published research shows that the AI algorithm improves the accuracy, reliability, or throughput of diagnostic assessments by pathologists.
2. Analytical validation and clinical validation follow Recommendations 4–12.
3. The AI has been verified in the laboratory and implemented with appropriate processes for QC.

In addition, AI algorithms should be considered medically necessary when:

1. There is an order for an anatomical pathology service by the treating clinician.
2. A board-eligible or board-certified pathologist or dermatopathologist is using the AI to perform a medically necessary assessment of the tissue.

Certainty of evidence: N/A

Strength of recommendation: Strong

Methodology

The DPA assembled a working group of pathologists and technologists with a mix of backgrounds in academia and industry. The goal of the cohort was to provide guidance to the aforementioned stakeholders so that digital pathology continues its rapid adoption in a manner that ultimately yields improved patient outcomes, based on rigorous approaches to validation and clinical utility.

To inform the development of these guidelines, a comprehensive search of publicly available sources was conducted to identify relevant evidence pertaining to digital pathology and AI applications. Sources included peer-reviewed journal articles, conference proceedings, publicly available preprints, regulatory guidance documents, professional society statements, and other authoritative materials.

The evidence collection process was guided by the principles of comprehensiveness and relevance. Searches were performed using a combination of key terms related to digital pathology, computational pathology, AI, machine learning, clinical and analytical validation, regulatory frameworks, and diagnostic performance. Identified sources were evaluated for applicability to the guideline topics, with priority given to materials that provided empirical data, systematic reviews, or expert consensus relevant to the safe and effective implementation of DP and AI technologies.

All evidence sources were documented and cited within the guideline to ensure transparency. While this approach did not employ a formal systematic review methodology, it aimed to capture the breadth of current literature and guidance to support evidence-informed recommendations.

For each recommendation, the group applied an adapted version of the GRADE system,¹⁶² excluding any recommendations with “Very Low” confidence. Table 3 lists the level-of-evidence categories and the definitions we relied on.

Table 3.

The Levels of Evidence Used to Evaluate Each Recommendation

Level	Description
High	Recommendation is based on large systematic reviews and meta-analyses. Further research is very unlikely to change our confidence.
Medium	Recommendation is based on evidence limited by sample size or study limitations. Further research may very well have an important impact on our confidence in the recommendation.
Low	Recommendation is based on weak evidence. Further research is likely to have an impact on our confidence.

Furthermore, we not only provide a description of our strength of recommendations in Table 4 but also make clear, for an abbreviated set of stakeholders, how they should interpret each corresponding guideline based on the strength.

Table 4.

The Strength of Each Recommendation as Well as an Abbreviated Guide for How Various Stakeholders Should Interpret Each Guideline Based on the Strength

Strength	Description
Strong	The working group is confident that the recommendation will result in improved patient outcomes. The implications of a strong recommendation are the following: • For patients: Most people in this situation would want the recommended course of action, and only a small proportion would not; request discussion if the recommendation is not followed. • For clinicians: AI usage should follow the recommended course of action. • For policymakers: The recommendation can be adopted as a policy in most situations. • For payers: Reimbursement of AI services that do not follow the recommendation would need significant justification.
Conditional	The working group has a moderate level of confidence that the benefits of the recommendation outweigh the costs. The implications of a conditional recommendation are the following: • For patients: Most people in this situation would want the recommended course of action, but many would not. • For clinicians: Recognize that different choices will be appropriate for different patients and that you must work with a patient to arrive at a decision consistent with their values and preferences. It is likely that shared decision-making will play a major role. • For policymakers: Policymaking will require substantial debate and involvement of many stakeholders. • For payers: Reimbursement of AI services that do not follow the recommendation may be justified by means external to those specified in this document.

AI, artificial intelligence.

Use cases/examples

The preceding sections established evidence-based recommendations and standards for the validation and implementation of digital pathology and AI-enabled tools. The following examples, spanning applications such as cancer detection and prognostic risk assessment, illustrate where these guidelines would be applied.

Cancer detection

AI-powered cancer detection is one of the most mature and widely adopted applications of AI in digital pathology, demonstrating clear value in diagnostic accuracy, efficiency, and QC. Solutions from companies such as Paige, Ibex Medical Analytics, PathAI, and Aiforia are being deployed in clinical settings to detect malignancy in WSIs, assist in tumor grading, and enhance safety-net functions through diagnostic discrepancy detection.

A landmark example is Paige Prostate, which in 2021 became the first AI tool for pathology to receive FDA De Novo marketing authorization⁹⁹ for clinical use in prostate cancer detection. In a prospective multireader study, Paige Prostate improved the diagnostic sensitivity of general pathologists from 88.7% to 96.6% when used as an assistive tool, without a significant decrease in specificity.⁴

Similarly, Ibex’s Galen™ Prostate and Galen™ Breast platforms are CE-marked¹⁶³ and used across Europe and the Middle East, and Ibex Prostate Detect received FDA clearance.¹⁶⁴ Pantanowitz et al⁸⁵ demonstrated Galen Prostate’s ability to achieve 99.0% sensitivity and 97.6% specificity in identifying prostate cancer in core needle biopsies.
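The sensitivity and specificity figures quoted in these studies follow the standard confusion-matrix definitions. The sketch below is a minimal illustration of that arithmetic; the function names and the counts are hypothetical and are not taken from any cited study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of truly malignant cases that the reader or tool flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of truly benign cases that are correctly called benign."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts, chosen only to illustrate the arithmetic;
# they are not taken from any cited study.
tp, fn = 99, 1    # 100 malignant biopsies
tn, fp = 244, 6   # 250 benign biopsies

print(f"sensitivity = {sensitivity(tp, fn):.1%}")
print(f"specificity = {specificity(tn, fp):.1%}")
```

Because both measures are ratios of counts, a validation report is easier to interpret when it states the underlying case numbers alongside the percentages.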

Though reimbursement mechanisms are still emerging, particularly in the United States, many of these tools are deployed under LDT frameworks or embedded in institutional workflows. Their value is evident: multiple studies report that AI-assisted triage and detection can reduce review time,¹⁵⁷ improve interobserver agreement,³ and help ensure rare or subtle cancers are not overlooked.¹⁶⁵

As validation studies continue and regulatory pathways expand, AI-driven cancer detection is transitioning from an innovative add-on to a foundational element of modern digital pathology practice.

Prognostic risk assessment

AI-driven prognostic risk assessment in digital pathology is rapidly emerging as a powerful clinical tool, enabling more precise and individualized predictions about cancer progression, treatment response, and patient outcomes. Unlike traditional diagnostic AI tools that focus on detecting the presence of malignancy, prognostic models extract quantitative histological features from digitized slides—often imperceptible to the human eye—and use them to predict disease aggressiveness, likelihood of recurrence, or survival.

A leading example is Artera, which developed the first multimodal prognostic and predictive biomarker platform for localized prostate cancer.¹² Artera’s test combines pathology image data and clinical variables to stratify patients by risk and guide treatment decisions. In a validation study,¹⁶⁶ the model successfully identified which patients benefited from intensified therapy, such as long-term androgen deprivation, and which could safely avoid it—demonstrating clinical utility beyond existing genomic or clinical risk models.

Another example is Paige’s AI-based grading and quantification tools,⁹⁹ which are being developed to not only detect cancer but also provide risk stratification information based on tumor architecture, gland morphology, and spatial features. These features are often used to refine Gleason grading or supplement prognostic indices.

Studies using open datasets such as The Cancer Genome Atlas (TCGA) and the Clinical Proteomic Tumor Analysis Consortium (CPTAC) have shown that deep learning models trained on pathology slides can predict overall survival, recurrence-free survival, and molecular subtypes in multiple cancers, including breast, lung, and colorectal cancers and glioma.¹⁶⁷–¹⁶⁹ These models often rival or surpass traditional molecular classifiers while being significantly more scalable and cost-effective, since they use routine H&E slides.

Biomarker scoring

AI-driven biomarker scoring is transforming how IHC and ISH biomarkers are quantified in digital pathology, enabling faster, more standardized assessments that support precision oncology. These tools analyze WSIs of tissue sections stained for biomarkers such as PD-L1, HER2, ER/PR, and Ki-67 to provide objective, reproducible scoring aligned with clinical guidelines or companion diagnostics (CDx).

Multiple companies now offer AI solutions that aid biomarker quantification. For example, PathAI, Aiforia, Visiopharm, and Roche have developed algorithms for PD-L1 scoring in non-small cell lung cancer (NSCLC), triple-negative breast cancer, and gastric cancers. These tools assist in calculating tumor proportion score or combined positive score, improving consistency in determining patient eligibility for immunotherapies such as pembrolizumab or atezolizumab.

PathAI’s AISight platform has supported studies in PD-L1 and Ki-67 scoring, including partnerships with pharmaceutical sponsors to enhance companion diagnostic reproducibility. In one study involving PD-L1 scoring for gastric cancer, AI-assisted pathologists showed greater concordance with expert consensus than unaided readers.¹⁷⁰

Biomarker scoring with AI also supports HER2 status determination, a critical factor in selecting HER2-targeted therapies in breast and gastric cancers. Tools from Visiopharm and Paige are being explored for HER2 quantification and HER2-low classification, which has emerged as a new clinical category.¹⁵⁵,¹⁷¹

While reimbursement for AI biomarker tools is often embedded within broader digital pathology services or companion diagnostic workflows, the tools themselves are increasingly being recognized in regulatory filings. The AMA’s CPT Editorial Panel has introduced taxonomy guidance to classify autonomous AI scoring tools (CPT Panel, 2023), and CMS has begun to consider coverage pathways for software that enhances FDA-approved CDx tests.

Ultimately, AI-powered biomarker scoring provides a scalable way to reduce variability, shorten turnaround times, and ensure more equitable access to precision medicine. By enabling consistent interpretation of complex markers, these tools help ensure the right patients get the right therapy—especially in contexts where biomarker scoring is known to be subjective or variable.

Quality control

AI-enabled QC in tissue slide preparation is an emerging yet essential application in digital pathology, addressing a persistent and often underrecognized contributor to diagnostic error: preanalytic variability. From tissue processing and sectioning to staining and slide digitization, suboptimal slide quality can compromise diagnosis, delay turnaround, and reduce the utility of downstream AI tools. AI-driven QC solutions help flag poor-quality slides in real time, enabling reprocessing before diagnostic review or AI inference.

Several efforts have shown that AI algorithms can perform various types of QC automatically at a level commensurate with human pathologists. In early work, Janowczyk et al¹⁷² showed that hand-crafted machine learning features could identify artifact-free regions of interest in digital slides, with strong agreement (94%) with human experts. Using a larger dataset from TCGA,¹⁷³ Haghighat et al¹⁷⁴ demonstrated a correlation of 0.89 with human pathologists for slide-level overall diagnostic usability using a multistage deep learning algorithm. Weng et al¹⁷⁵ and Jabar et al¹⁷⁶ demonstrated high pixel-level accuracy (Dice score of ∼94%) for segmentation of artifact-free tumor pixels across a variety of institutions and digital slide scanners.
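For readers less familiar with the Dice score reported in these segmentation studies, it measures the overlap between a predicted mask and a reference mask as twice the intersection divided by the sum of the two mask sizes. The sketch below is a minimal illustration on flattened binary masks; the function name and the toy masks are hypothetical and do not reproduce any cited method.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice = 2 * |A intersect B| / (|A| + |B|) for two binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy flattened masks (1 = usable tissue pixel); illustrative values only.
predicted = [1, 1, 1, 0, 0, 1, 0, 0]
reference = [1, 1, 0, 0, 0, 1, 1, 0]

print(f"Dice = {dice_score(predicted, reference):.2f}")
```

A Dice score of 1.0 indicates identical masks and 0.0 indicates no overlap, so a reported value near 0.94 corresponds to close, but not perfect, pixel-level agreement with the reference annotation.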

Beyond these demonstrations of viability, a number of companies have commercialized AI tools for automated quality assurance of H&E and IHC slides. Algorithms such as Conflux’s QC suite,¹⁷⁷ Aiosyn’s AiosynQC,¹⁷⁸ Leica’s SlideQC BF,¹⁷⁹ and PathAI’s Artifact Detect¹⁸⁰ identify common artifacts in WSIs, such as air bubbles, folds, debris, blurry or out-of-focus regions, and pen marks, and can highlight affected areas to flag slides that may need rescanning or manual review.

While most QC tools have not yet undergone formal FDA review, they are increasingly integrated into LDT pipelines and digital pathology platforms. For example, PathAI’s Artifact Detect can be added to its AISight Image Viewer,¹⁸¹ and AiosynQC has been integrated into Sectra’s image viewer via its app marketplace.¹⁸² Some vendors offer QC modules as part of broader whole-slide imaging systems, such as the Leica Aperio iQC Software¹⁸³ or Proscia’s Automated QC¹⁸⁴ in Concentriq LS, which monitors scan fidelity and alerts technicians in real time.

Triage

AI-powered triage systems in digital pathology can improve efficiency by automatically identifying and prioritizing slide cases based on clinical urgency and diagnostic complexity. Slides most likely to be benign or low-risk—such as benign prostate or breast biopsies—can be flagged for later review, allowing pathologists to focus first on complex or high-likelihood-of-disease cases. By ensuring that critical cases are reviewed earlier, these systems can facilitate earlier diagnosis, which in turn enables timely treatment initiation and more efficient scheduling of additional testing. This targeted prioritization helps reduce diagnostic backlog, mitigate fatigue, and streamline workflows in high-volume laboratories. It also supports pathologists who must balance multiple responsibilities, such as intraoperative consultations, fine needle aspiration procedures, and tumor boards, by ensuring that critical cases rise to the top of the worklist.
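The prioritization logic described above can be sketched as a simple worklist sort. This is a minimal illustration only: the `Case` fields, the `ai_suspicion` score, and the ordering rule are hypothetical assumptions, not the behavior of any specific product. Every case remains on the worklist; only the review order changes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Case:
    case_id: str
    ai_suspicion: float  # hypothetical model output in [0, 1]
    stat: bool = False   # hypothetical flag for clinically urgent cases


def prioritize(cases: List[Case]) -> List[Case]:
    """Urgent cases first, then descending AI suspicion score."""
    return sorted(cases, key=lambda c: (not c.stat, -c.ai_suspicion))


# Illustrative worklist; identifiers and scores are made up.
worklist = [
    Case("A", 0.05),
    Case("B", 0.92),
    Case("C", 0.40, stat=True),
    Case("D", 0.61),
]

ordered_ids = [c.case_id for c in prioritize(worklist)]
print(ordered_ids)
```

In this sketch a clinically urgent flag always outranks the AI score, reflecting the idea that triage reorders review by urgency and likelihood of disease without removing any case from pathologist review.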

The impact of triage has been demonstrated in radiology, where AI-based prioritization has led to measurable clinical benefits. At Cedars-Sinai, the implementation of AI triage software for intracranial hemorrhage and pulmonary embolism reduced hospital length of stay by 1.3 days (11.9%) and 2.07 days (26.3%), respectively, compared with pre-AI periods. These reductions were attributed to faster case flagging, earlier radiologist notification, and quicker clinical intervention, all of which are critical for patient outcomes.¹⁸⁵ Similar AI-driven triage in digital pathology could enable earlier diagnosis of high-risk cases, faster communication with clinical teams, and more efficient use of pathologists’ time, ultimately improving patient care while maintaining safety for low-risk cases.

One of the most established examples is Paige Prostate Detect,⁹⁹ which is FDA-cleared to assist in the identification of prostate cancer in WSIs. While its core function is cancer detection, the tool has been shown to triage out benign slides with high negative predictive value, enabling pathologists to spend up to 65% less time on negative cases.¹⁰⁹ The tool achieved an AUC of 0.99 for differentiating between benign and malignant cases in multireader studies.
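Negative predictive value, unlike sensitivity and specificity, depends on how common cancer is in the population being triaged. The sketch below applies Bayes' rule to show that dependence; the operating point (98% sensitivity, 90% specificity) and the prevalence values are hypothetical and are not performance figures for any cited tool.

```python
def negative_predictive_value(sens: float, spec: float, prevalence: float) -> float:
    """NPV via Bayes' rule: P(truly benign | tool calls the slide benign)."""
    true_neg = spec * (1.0 - prevalence)     # benign and called benign
    false_neg = (1.0 - sens) * prevalence    # malignant but called benign
    return true_neg / (true_neg + false_neg)


# Hypothetical operating point evaluated at several assumed prevalences.
for prevalence in (0.10, 0.30, 0.50):
    npv = negative_predictive_value(sens=0.98, spec=0.90, prevalence=prevalence)
    print(f"prevalence {prevalence:.0%}: NPV = {npv:.3f}")
```

Because NPV falls as prevalence rises, a triage tool validated in one case mix may warrant local verification before its negative calls are used to deprioritize slides in a laboratory with a different case mix.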

Similarly, Ibex’s Galen™ Prostate platform, CE-marked and in routine clinical use in Europe, provides triage and prioritization functionality, highlighting suspicious regions and enabling lab workflows that automatically escalate slides with AI-flagged abnormalities.¹⁸⁶ In real-world deployments, this has led to reductions in diagnostic turnaround times and improved case prioritization for subspecialty review.

Academic studies have replicated this utility across other cancer types. For example, AI models trained on breast cancer biopsies can triage out benign cases with >95% accuracy, reducing review burden while maintaining diagnostic safety nets. Triage models have also been evaluated for colorectal polyps, cervical cytology, and lung nodules, with high performance in excluding normal or low-risk findings.

Tools like Paige PanCancer Detect,⁹⁹ a foundation model–based system recognized by the U.S. FDA as a Breakthrough Device, are designed to flag slides suspicious for malignancy across multiple tissue and organ types. In real-world testing, Paige PanCancer achieved high sensitivity (93–95%) and 90% overall accuracy, even identifying previously missed small carcinoma foci.²⁶

While these tools are not typically reimbursed as standalone services, their operational impact is substantial. Pathology departments using triage AI report shortened average case turnaround times, lower review variability, and reduced subspecialty bottlenecks—especially valuable in under-resourced or high-volume labs.

Triage AI is increasingly seen not just as a convenience but as a critical component of safe and scalable digital pathology—automating what pathologists already do intuitively and ensuring that scarce human expertise is focused where it matters most.

Future Directions

The landscape of digital pathology continues to evolve at an unprecedented pace, driven by technological innovations in both hardware and AI systems. As we look toward the future, several key developments will likely reshape the field and present new opportunities and challenges for pathology practice.

Emerging technologies in slide scanning

The next generation of slide scanning technologies promises significant advances in imaging capabilities, throughput, and accessibility. We anticipate developments in ultra-high-resolution imaging systems that may surpass current 40× magnification standards, potentially enabling visualization of subcellular structures with greater clarity. Multispectral and hyperspectral imaging technologies¹⁸⁷ are emerging that could provide enhanced tissue characterization beyond traditional brightfield microscopy, offering new diagnostic capabilities through spectral analysis of tissue components. For example, quantification of HER2 using multispectral imaging provides superior prognostic information for invasive breast cancer compared with conventional RGB imaging, demonstrating higher predictive accuracy for 5-year disease-free survival and stronger association with patient outcomes.¹⁸⁸

Real-time scanning technologies may eliminate current batch-processing limitations, allowing for immediate digitization as slides are prepared. In addition, portable and point-of-care scanning solutions are being developed that could democratize access to digital pathology in resource-limited settings and enable rapid consultation in surgical and clinical environments.

Advances in optical technologies, including computational imaging and deep learning-enhanced image reconstruction, may also improve image quality while reducing scanning time and storage requirements. These developments could make digital pathology more efficient and cost-effective for widespread adoption.

Artificial intelligence

The integration of AI in digital pathology represents one of the most transformative aspects of the field’s future. With the vast majority of AI tools in clinical use, physicians rely on the tool to achieve a better outcome while pathologists retain oversight and final decision-making authority.

At present, there are no widely deployed Autonomous Level II (systems that can make independent diagnostic decisions with pathologist review) or Autonomous Level III (fully autonomous diagnostic systems) solutions in routine clinical practice. However, the rapid pace of AI development suggests this landscape will continue to evolve significantly.

We anticipate continued expansion of AI tools that help pathologists identify regions of interest, quantify biomarkers, and detect potential diagnostic pitfalls. Augmentative systems will likely become more sophisticated in their ability to enhance pathologist capabilities through advanced image analysis, pattern recognition, and integration of multimodal data, including genomic, clinical, and imaging information. Autonomous Level I applications may expand to cover more specialized areas of pathology, potentially including complex morphological assessments and prognostic and predictive scoring systems. As these systems mature and demonstrate consistent performance across diverse populations and institutions, the regulatory and clinical pathways for higher levels of autonomy may begin to emerge.

The development trajectory toward higher autonomy levels (Level II and Level III) remains to be seen, not just in digital pathology but also in radiological imaging and health care in general.

References

1. Bulten, Balkenhol, Belinga J-JA, et al.; ISUP Pathology Imagebase Expert Panel. Artificial intelligence assistance significantly improves Gleason grading of prostate biopsies by pathologists. Mod Pathol, 2021; 34(3):660–671.

2. Sakamoto, Furukawa, Pham HHN, et al. A collaborative workflow between pathologists and deep learning for the evaluation of tumour cellularity in lung adenocarcinoma. Histopathology, 2022; 81(6):758–769.

3. […], Nguyen N-NJ, Meyer, et al. AI improves accuracy, agreement and efficiency of pathologists for Ki67 assessments in breast cancer. Sci Rep, 2024; 14(1):1283.

4. Raciti, Sue, Retamero, et al. Clinical validation of artificial intelligence–augmented pathology diagnosis demonstrates significant gains in diagnostic accuracy in prostate cancer detection. Arch Pathol Lab Med, 2023; 147(10):1178–1185.

5. Marron-Esquivel, Duran-Lopez, Linares-Barranco, et al. A comparative study of the inter-observer variability on Gleason grading against Deep Learning-based approaches for prostate cancer. Comput Biol Med, 2023; 159:106856.

6. Wetstein, de Jong VMT, Stathonikos, et al. Deep learning-based breast cancer grading and survival analysis on whole-slide histopathology images. Sci Rep, 2022; 12(1):15102.

7. Ertosun, Rubin. Automated grading of gliomas using deep learning in digital pathology images: A modular approach with ensemble of convolutional neural networks. AMIA Annu Symp Proc, 2015; 2015:1899–1908.

8. Nagpal, Foote, Liu, et al. Development and validation of a deep learning algorithm for improving Gleason scoring of prostate cancer. NPJ Digit Med, 2019; 2(1):48.

9. Simard, Shen, Bräutigam, et al. Immunocto: A massive immune cell database auto-generated for histopathology. ArXiv, 2024; Preprint arXiv:2406.02618.

10. Graham, Vu, Raza SEA, et al. Hover-net: Simultaneous segmentation and classification of nuclei in multi-tissue histology images. Med Image Anal, 2019; 58:101563.

11. Negahbani, Sabzi, Jahromi, et al. PathoNet: Deep learning assisted evaluation of Ki-67 and tumor infiltrating lymphocytes (TILs) as prognostic factors in breast cancer; A large dataset and baseline. ArXiv, 2020; Preprint arXiv:2010.04713.

12. Esteva, Feng, van der Wal, et al.; NRG Prostate Cancer AI Consortium. Prostate cancer therapy personalization via multi-modal deep learning on randomized phase III clinical trials. NPJ Digit Med, 2022; 5(1):71.

13. Zhang, Yang, Chen, et al. Histopathology images-based deep learning prediction of prognosis and therapeutic response in small cell lung cancer. NPJ Digit Med, 2024; 7(1):15.

14. Loeffler CML, Sainath, Jiang, et al. 545P HIBRID: Histology and ct-DNA based risk-stratification with deep learning. Ann Oncol, 2024; 35:S454.

15. […], Yang, Haeri, et al. Augmenting pathologists with NaviPath: Design and evaluation of a human-AI collaborative navigation system. In: Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems. 2023. pp. 1–19.

16. Gaffney, Mirza. Pathology in the artificial intelligence era: Guiding innovation and implementation to preserve human insight. Acad Pathol, 2025; 12(1):100166.

17. Retamero, Gulturk, Bozkurt, et al. Artificial intelligence helps pathologists increase diagnostic accuracy and efficiency in the detection of breast cancer lymph node metastases. Am J Surg Pathol, 2024; 48(7):846–854.

18. Vaidya, Chen, Williamson DFK, et al. Demographic bias in misdiagnosis by computational pathology models. Nat Med, 2024; 30(4):1174–1190.

19. Pantanowitz, Wiley, Demetris, et al. Experience with multimodality telepathology at the University of Pittsburgh Medical Center. J Pathol Inform, 2012; 3(1):45.

20. Minervini, Yagi, Marino, et al. Development and experience with an integrated system for transplantation telepathology. Hum Pathol, 2001; 32(12):1334–1343.

21. Nass, Patlak, Balogh. Improving cancer diagnosis and care: Patient access to oncologic imaging and pathology expertise and technologies: proceedings of a workshop. 2018.

22. Nass, Cohen, Nayar, et al. Improving cancer diagnosis and care: Patient access to high-quality oncologic pathology. Oncologist, 2019; 24(10):1287–1290.

23. Nakhleh, Myers, Allen, et al. Consensus statement on effective communication of urgent diagnoses and significant, unexpected diagnoses in surgical pathology and cytopathology from the College of American Pathologists and Association of Directors of Anatomic and Surgical Pathology. Arch Pathol Lab Med, 2012; 136(2):148–154.

24. Nakhleh, Souers, Brown. Significant and unexpected, and critical diagnoses in surgical pathology: A College of American Pathologists’ survey of 1130 laboratories. Arch Pathol Lab Med, 2009; 133(9):1375–1378.

25. Robboy, Gross, Park, et al. Reevaluation of the US Pathologist Workforce Size. JAMA Netw Open, 2020; 3(7):e2010648.

26. Rienda, Vale, Pinto, et al. Using artificial intelligence to prioritize pathology samples: Report of a test drive. Virchows Arch, 2025; 487(1):203–208.

27. Wang, Jin, Shieh C-C, et al. Real world validation of an AI-based CT hemorrhage detection tool. Front Neurol, 2023; 14:1177723.

28. U.S. Centers for Medicare & Medicaid Services. State Operations Manual, Chapter 2: The certification process, section 2825 – Federally Qualified Health Centers (FQHCs) – citations and description [Internet]. 2024. Available from: https://www.cms.gov/regulations-and-guidance/guidance/manuals/downloads/som107c02.pdf

29. Henley, Anderson, Thomas, et al. Invasive cancer incidence, 2004–2013, and deaths, 2006–2015, in nonmetropolitan and metropolitan counties—United States. MMWR Surveill Summ, 2017; 66(14):1–13.

30. Levit, Byatt, Lyss, et al. Closing the rural cancer care gap: Three institutional approaches. JCO Oncol Pract, 2020; 16(7):422–430.

31. Unger, Moseley, Symington, et al. Geographic distribution and survival outcomes for rural patients with cancer treated in clinical trials. JAMA Netw Open, 2018; 1(4):e181235.

32. Impact of the Health Center Program | Bureau of Primary Health Care. Available from: https://bphc.hrsa.gov/about-health-center-program/impact-health-center-program [Last accessed: September 30, 2025].

33. Lavelle, Rose, Timbie, et al. Utilization of health care services among Medicare beneficiaries who visit federally qualified health centers. BMC Health Serv Res, 2018; 18(1):41.

34. […], Postman. Strategies by federally-funded Health Centers to facilitate patient access to specialty care. Available from: https://aspe.hhs.gov/sites/default/files/private/pdf/259201/SpecialtyAccess.pdf

35. Nayak, Beaulieu, Rubin, et al. A picture is worth a thousand words: Needs assessment for multimedia radiology reports in a large tertiary care medical center. Acad Radiol, 2013; 20(12):1577–1583.

36. Cabarrus, Naeger, Rybkin, et al. Patients prefer results from the ordering provider and access to their radiology reports. J Am Coll Radiol, 2015; 12(6):556–562.

37. Alarifi, Patrick, Jabour, et al. Full radiology report through patient web portal: A literature review. Int J Environ Res Public Health, 2020; 17(10):3673.

38. Ranot, Noghani, Rothwell, et al. If you provide them, they will come: An observational study of online pathology report access by patients at a large, academic, tertiary care hospital in Canada. J Clin Pathol, 2025; 78(9):636–640.

39. Hulter, Langendoen, Pluut, et al. Patients’ choices regarding online access to laboratory, radiology and pathology test results on a hospital patient portal. PLoS One, 2023; 18(2):e0280768.

40. Joseph. Pathology clinics can address pathology burnout. Clin Lab Med, 2025; 45(3):497–505.

41. Cox WAS, Cavenagh, Bello. What are the benefits and risks of sharing patients’ diagnostic radiological images with them? A cross-sectional study of the perceptions of patients and clinicians in the UK. BMJ Open, 2020; 10(1):e033835.

42. Wiener, Gould, Woloshin, et al. What do you mean, a spot?: A qualitative analysis of patients’ reactions to discussions with their physicians about pulmonary nodules. Chest, 2013; 143(3):672–677.

43. Campanella, Chen, Singh, et al. A clinical benchmark of public self-supervised pathology foundation models. Nat Commun, 2025; 16(1):3640.

44. Chen, Ding, Lu, et al. Towards a general-purpose foundation model for computational pathology. Nat Med, 2024; 30(3):850–862.

45. Vorontsov, Bozkurt, Casson, et al. Virchow: A million-slide digital pathology foundation model. ArXiv, 2023; Preprint arXiv:2309.07778.

46. Alber, Tietz, Dippel, et al. Atlas: A novel pathology foundation model by Mayo Clinic, Charité, and Aignostics. ArXiv, 2025; Preprint arXiv:2501.05409.

47. Vorontsov, Bozkurt, Casson, et al. A foundation model for clinical-grade computational pathology and rare cancers detection. Nat Med, 2024; 30(10):2924–2935.

48. […], Chen, Williamson DFK, et al. A visual-language foundation model for computational pathology. Nat Med, 2024; 30(3):863–874.

49. […], Guo, Zhou, et al. A generalizable pathology foundation model using a unified knowledge distillation pretraining framework. Nat Biomed Eng, 2026; 10(3):545–564.

50. Ding, Wagner, Song, et al. A multimodal whole-slide foundation model for pathology. Nat Med, 2025:1–13.

51. Juyal, Padigela, Shah, et al. Pluto: Pathology-universal transformer. ArXiv, 2024; Preprint arXiv:2405.07905.

52. Filiot, Jacob, Mac Kain, et al. Phikon-v2, a large and public feature extractor for biomarker prediction. ArXiv, 2024; Preprint arXiv:2409.09173.

53. Nechaev, Pchelnikov, Ivanova. Hibou: A family of foundational vision transformers for pathology. ArXiv, 2024; Preprint arXiv:2406.05074.

54. Aben, de Jong, Gatopoulos, et al. Towards large-scale training of pathology foundation models. ArXiv, 2024; Preprint arXiv:2404.15217.

55. International Medical Device Regulators Forum (IMDRF). Software as a Medical Device (SaMD): Clinical Evaluation [Internet]. International Medical Device Regulators Forum; 2017. Report No.: IMDRF/SaMD WG/N41. Available from: https://www.fda.gov/medical-devices/digital-health-center-excellence/software-medical-device-samd [Last accessed: Sep 30, 2025].

56. American Clinical Laboratory Association, et al. v. U.S. Food and Drug Administration, et al. 2025.

57. College of American Pathologists. CAP Laboratory Accreditation Program [Internet]. 2025. Available from: https://www.cap.org/laboratory-improvement/accreditation/laboratory-accreditation-program

58. Centers for Medicare & Medicaid Services. 42 C.F.R. § 415.102—General rule (Services of teaching physicians) [Internet]. 2025. Available from: https://www.ecfr.gov/current/title-42/part-415/section-415.102

59. European Parliament and Council of the European Union. Regulation (EU) 2017/746 of the European Parliament and of the Council of 5 April 2017 on in vitro diagnostic medical devices (IVDR) and repealing Directive 98/79/EC and Commission Decision 2010/227/EU [Internet]. 2017. Available from: https://eur-lex.europa.eu/eli/reg/2017/746/oj

60. European Parliament and Council of the European Union. Directive 98/79/EC of the European Parliament and of the Council on in vitro diagnostic medical devices [Internet]. 1998. Available from: https://eur-lex.europa.eu/eli/dir/1998/79/oj

61. International Organization for Standardization. ISO 13485:2016 Medical devices—Quality management systems—Requirements for regulatory purposes [Internet]. 2016. Available from: https://www.iso.org/standard/59752.html

62. International Organization for Standardization. ISO 15189:2022 Medical laboratories—Requirements for quality and competence [Internet]. 2022. Available from: https://www.iso.org/standard/76677.html

63. Legislative Services Branch. Consolidated federal laws of Canada, Medical Devices Regulations [Internet]. 2025. Available from: https://laws.justice.gc.ca/eng/regulations/SOR-98-282/ [Last accessed: Jan 26, 2026].

64. Health Canada. Medical Device Licence (MDL) under the medical devices regulations [Internet]. 2026. Available from: https://www.canada.ca/en/health-canada/services/drugs-health-products/medical-devices/about-medical-devices.html

65. Pharmaceuticals and Medical Devices Agency. Regulations and approval/certification of medical devices [Internet]. Available from: https://www.pmda.go.jp/english/review-services/reviews/0004.html [Last accessed: Jan 26, 2026].

66. Therapeutic Goods Administration (TGA). Understanding regulation of software-based medical devices [Internet]. Therapeutic Goods Administration (TGA); 2024. Available from: https://www.tga.gov.au/resources/guidance/understanding-regulation-software-based-medical-devices [Last accessed: Jan 26, 2026].

67.

Frank

, Jarrin

, Pritzker

, et al. Developing current procedural terminology codes that describe the work performed by machines. NPJ Digit Med, 2022; 5(1):177.

68.

Clinical and Laboratory Standards Institute (CLSI). Evaluation of Precision of Quantitative Measurement Procedures; Approved Guideline—Third Edition (EP05-A3) [Internet]. Wayne, PA: Clinical and Laboratory Standards Institute; 2014. Report No.: EP05-A3. Available from: https://clsi.org/standards/products/ep05/

69.

Clinical and Laboratory Standards Institute (CLSI). Evaluation of Detection Capability for Clinical Laboratory Measurement Procedures; Approved Guideline—Second Edition (EP17-A2) [Internet]. Wayne, PA: Clinical and Laboratory Standards Institute; 2012. Report No.: EP17-A2. Available from: https://clsi.org/standards/products/ep17/

70.

Clinical and Laboratory Standards Institute (CLSI). User Verification of Precision and Estimation of Bias; Approved Guideline—Third Edition (EP15-A3) [Internet]. Wayne, PA: Clinical and Laboratory Standards Institute; 2014. Report No.: EP15-A3. Available from: https://clsi.org/standards/products/ep15/

71.

Centers for Medicare & Medicaid Services. Code of Federal Regulations: 42 CFR § 493.1253 – Establishment and Verification of Performance Specifications [Internet]. 2024. Available from: https://www.ecfr.gov/current/title-42/part-493/subpart-K/section-493.1253 [Last accessed: Sep 30, 2025].

72.

College of American Pathologists. Laboratory general checklist: Laboratory Accreditation Program [Internet]. Northfield, IL: College of American Pathologists; 2017. Available from: https://documents-cloud.cap.org/appsuite/learning/LAP/BAP/2018/BAP_CL_GEN.pdf

73.

Kroll, Biswas, Budd, et al. Assessment of the diagnostic accuracy of laboratory tests using receiver operating characteristic curves; Approved Guideline—Second Edition. Clinical and Laboratory Standards Institute, 2011; 31(23):1–45. Report No

74.

Definition of clinical utility - NCI Dictionary of Genetics Terms - NCI [Internet]. 2012. Available from: https://www.cancer.gov/publications/dictionaries/genetics-dictionary/def/clinical-utility [Last accessed: Feb 3, 2026].

75.

McCormack, Billings. Clinical utility: Informing treatment decisions by changing the paradigm. NAM Perspect, 2015.

76.

Teutsch, Bradley, Palomaki, et al.; EGAPP Working Group. The evaluation of genomic applications in practice and prevention (EGAPP) initiative: Methods of the EGAPP working group. Genet Med, 2009; 11(1):3–14.

77.

Medicare Coverage of Items and Services | CMS Available from: https://www.cms.gov/cms-guide-medical-technology-companies-and-other-interested-parties/coverage/medicare-coverage-items-and-services [Last accessed: Feb 3, 2026].

78.

Medicare Program Integrity Manual.

79.

Samson, Schoelles. Chapter 2: Medical tests guidance (2) developing the topic and structuring systematic reviews of medical tests: Utility of PICOTS, analytic frameworks, decision trees, and other frameworks. J Gen Intern Med, 2012; 27(Suppl 1):S11–S19.

80.

Scandinavian Simvastatin Survival Study Group. Randomised trial of cholesterol lowering in 4444 patients with coronary heart disease: The Scandinavian Simvastatin Survival Study (4S). The Lancet, 1994; 344(8934):1383–1389.

81.

U.S. Food and Drug Administration. Summary of safety and effectiveness data (SSED) for MI cancer seek (PMA P240010) [Internet]. 2024. Available from: https://www.accessdata.fda.gov/cdrh_docs/pdf24/P240010B.pdf

82.

U.S. Food and Drug Administration. Summary of safety and effectiveness data (SSED) for therascreen PIK3CA RGQ PCR Kit (PMA P190004) [Internet]. 2019. Available from: https://www.accessdata.fda.gov/cdrh_docs/pdf19/P190004B.pdf

83.

Hanna, Ardon, Reuter, et al. Integrating digital pathology into clinical practice. Mod Pathol, 2022; 35(2):152–164.

84.

Baidoshvili, Bucur, van Leeuwen, et al. Evaluating the benefits of digital pathology implementation: Time savings in laboratory logistics. Histopathology, 2018; 73(5):784–794.

85.

Pantanowitz, Valenstein, Evans, et al. Review of the current state of whole slide imaging in pathology. J Pathol Inform, 2011; 2(1):36.

86.

Baxi, Edwards, Montalto, et al. Digital pathology and artificial intelligence in translational medicine and clinical practice. Mod Pathol, 2022; 35(1):23–32.

87.

Chaudhari, Gupta, Srivastav, et al. Digital versus conventional teaching of surgical pathology: A comparative study. Cureus, 2023; 15(9):e45747.

88.

Jahn, Plass, Moinfar. Digital pathology: Advantages, limitations and emerging perspectives. J Clin Med, 2020; 9(11):3697.

89.

Moscalu, Moscalu, Dascălu, et al. Histopathological images analysis and predictive modeling implemented in digital pathology—current affairs and perspectives. Diagnostics (Basel), 2023; 13(14):2379.

90.

…, Wang, Shang, et al. Assessment of deep learning assistance for the pathological diagnosis of gastric cancer. Mod Pathol, 2022; 35(9):1262–1268.

91.

Litjens, Sánchez, Timofeeva, et al. Deep learning as a tool for increased accuracy and efficiency of histopathological diagnosis. Sci Rep, 2016; 6(1):26286.

92.

Sandbank, Bataillon, Nudelman, et al. Validation and real-world clinical application of an artificial intelligence algorithm for breast cancer detection in biopsies. NPJ Breast Cancer, 2022; 8(1):129.

93.

Bernard, Chandrakanth, Cornell, et al.; Canadian Association of Pathologists Telepathology Guidelines Committee. Guidelines from the Canadian Association of Pathologists for establishing a telepathology service for anatomic pathology using whole-slide imaging: The Canadian Association of Pathologists Telepathology Guidelines Committee. J Pathol Inform, 2014; 5(1):15.

94.

Mukhopadhyay, Feldman, Abels, et al. Whole slide imaging versus microscopy for primary diagnosis in surgical pathology: A multicenter blinded randomized noninferiority study of 1992 cases (pivotal study). Am J Surg Pathol, 2018; 42(1):39–52.

95.

World Health Organization. Guide to Cancer Early Diagnosis [Internet]. Geneva: World Health Organization; 2017. Available from: https://www.who.int/publications/i/item/9789241511940

96.

Evans, Brown, Bui, et al. Validating whole slide imaging systems for diagnostic purposes in pathology: Guideline update from the College of American Pathologists in collaboration with the American Society for Clinical Pathology and the Association for Pathology Informatics. Arch Pathol Lab Med, 2022; 146(4):440–450.

97.

Aubreville, Bertram, Veta, et al. Quantifying the scanner-induced domain gap in mitosis detection. arXiv preprint arXiv:2103.16515, 2021.

98.

Swiderska-Chadaj, de Bel, Blanchet, et al. Impact of rescanning and normalization on convolutional neural network performance in multi-center, whole-slide classification of prostate cancer. Sci Rep, 2020; 10(1):14398.

99.

U.S. Food and Drug Administration. Paige Prostate De Novo Summary (DEN200080) [Internet]. 2021. Available from: https://www.accessdata.fda.gov/cdrh_docs/reviews/DEN200080.pdf

100.

F. Hoffmann-La Roche Ltd. Roche granted FDA Breakthrough Device Designation for first AI-driven companion diagnostic for non-small cell lung cancer [Internet]. 2025. Available from: https://www.roche.com/investors/updates/inv-update-2025-04-29

101.

Hanna, Olson, Zarella, et al. Recommendations for performance evaluation of machine learning in pathology: A concept paper from the College of American Pathologists. Arch Pathol Lab Med, 2024; 148(10):e335–e361.

102.

Duenweg, Bobholz, Lowman, et al. Whole slide imaging (WSI) scanner differences influence optical and computed properties of digitized prostate cancer histology. J Pathol Inform, 2023; 14:100321.

103.

Bandi, Geessink, Manson, et al. From detection of individual metastases to classification of lymph node status at the patient level: The CAMELYON17 challenge. IEEE Trans Med Imaging, 2019; 38(2):550–560.

104.

Boschman, Farahani, Darbandsari, et al. The utility of color normalization for AI-based diagnosis of hematoxylin and eosin-stained pathology images. J Pathol, 2022; 256(1):15–24.

105.

Lin, Zhou, Watson, et al. Impact of stain variation and color normalization for prognostic predictions in pathology. Sci Rep, 2025; 15(1):2369.

106.

U.S. Food and Drug Administration. Principles for codevelopment of an in vitro companion diagnostic device with a therapeutic product. Silver Spring, MD: U.S. FDA; 2016.

107.

Huang, Randhawa, Jain, et al. Development and validation of an artificial intelligence–powered platform for prostate cancer grading and quantification. JAMA Netw Open, 2021; 4(11):e2132554.

108.

Campanella, Hanna, Geneslaw, et al. Clinical-grade computational pathology using weakly supervised deep learning on whole slide images. Nat Med, 2019; 25(8):1301–1309.

109.

Steiner, MacDonald, Liu, et al. Impact of deep learning assistance on the histopathologic review of lymph nodes for metastatic breast cancer. Am J Surg Pathol, 2018; 42(12):1636–1646.

110.

Coudray, Ocampo, Sakellaropoulos, et al. Classification and mutation prediction from non–small cell lung cancer histopathology images using deep learning. Nat Med, 2018; 24(10):1559–1567.

111.

Miranda Ruiz, Lahrmann, Bartels, et al. CNN stability training improves robustness to scanner and IHC-based image variability for epithelium segmentation in cervical histology. Front Med (Lausanne), 2023; 10:1173616.

112.

Ochi, Komura, Onoyama, et al. Registered multi-device/staining histology image dataset for domain-agnostic machine learning models. Sci Data, 2024; 11(1):330.

113.

Persson, Wilderäng, Jiborn, et al. Interobserver variability in the pathological assessment of radical prostatectomy specimens: Findings of the Laparoscopic Prostatectomy Robot Open (LAPPRO) study. Scand J Urol, 2014; 48(2):160–167.

114.

Veloso, Lima, Salles, et al. Interobserver agreement of Gleason score and modified Gleason score in needle biopsy and in surgical specimen of prostate cancer. Int Braz J Urol, 2007; 33(5):639–651.

115.

Montironi, Lopez-Beltran, Cheng, et al. Central prostate pathology review: Should it be mandatory? Eur Urol, 2013; 64(2):199–201.

116.

Bottke, Golz, Störkel, et al. Phase 3 study of adjuvant radiotherapy versus wait and see in pT3 prostate cancer: Impact of pathology review on analysis. Eur Urol, 2013; 64(2):193–198.

117.

Ozkan, Eruyar, Cebeci, et al. Interobserver variability in Gleason histological grading of prostate cancer. Scand J Urol, 2016; 50(6):420–424.

118.

Robbins, Pinder, de Klerk, et al. Histological grading of breast carcinomas: A study of interobserver agreement. Hum Pathol, 1995; 26(8):873–879.

119.

Van Bockstal, François, Altinay, et al. Interobserver variability in the assessment of stromal tumor-infiltrating lymphocytes (sTILs) in triple-negative invasive breast carcinoma influences the association with pathological complete response: The IVITA study. Mod Pathol, 2021; 34(12):2130–2140.

120.

Rabe, Snir, Bossuyt, et al. Interobserver variability in breast carcinoma grading results in prognostic stage differences. Hum Pathol, 2019; 94:51–57.

121.

Ginter, Idress, D’Alfonso, et al. Histologic grading of breast carcinoma: A multi-institution study of interobserver variation using virtual microscopy. Mod Pathol, 2021; 34(4):701–709.

122.

Dano, Altinay, Arnould, et al. Interobserver variability in upfront dichotomous histopathological assessment of ductal carcinoma in situ of the breast: The DCISion study. Mod Pathol, 2020; 33(3):354–366.

123.

Chandler, Houlston. Interobserver agreement in grading of colorectal cancers—findings from a nationwide web-based survey of histopathologists. Histopathology, 2008; 52(4):494–499.

124.

Reis, Matsushita, Santos, et al. Assessing the applicability and interobserver variability of tumor budding and poorly differentiated clusters in colorectal cancer. Surg Exp Pathol, 2024; 7(1):1.

125.

Komuta, Batts, Jessurun, et al. Interobserver variability in the pathological assessment of malignant colorectal polyps. Br J Surg, 2004; 91(11):1479–1484.

126.

Karamchandani, Gonzalez, Lee, et al. Interobserver agreement and practice patterns for grading of colorectal carcinoma: World Health Organization (WHO) classification of tumours 5th edition versus American Joint Committee on Cancer (AJCC) staging manual. Histopathology, 2025; 86(7):1101–1111.

127.

Smits LJH, Vink-Börger, van Lijnschoten, et al. Diagnostic variability in the histopathological assessment of advanced colorectal adenomas and early colorectal cancer in a screening population. Histopathology, 2022; 80(5):790–798.

128.

Paech, Weston, Pavlakis, et al. A systematic review of the interobserver variability for histology in the differentiation between squamous and nonsquamous non-small cell lung cancer. J Thorac Oncol, 2011; 6(1):55–63.

129.

Butter, Hondelink, van Elswijk, et al. The impact of a pathologist’s personality on the interobserver variability and diagnostic accuracy of predictive PD-L1 immunohistochemistry in lung cancer. Lung Cancer, 2022; 166:143–149.

130.

Trontzas, He, Wurtz, et al. Quantitative protein expression of antibody–drug conjugate targets in EGFR mutated and wild-type non–small cell lung cancer. Clin Cancer Res, 2025; 31(13):2767–2776.

131.

Y-L, Xiao, Li, et al. 357P: Association of TROP2 quantitative continuous scoring (QCS) normalised membrane ratio (NMR) with efficacy in Chinese patients (pts) with advanced/metastatic non-small cell lung cancer (a/mNSCLC) treated with datopotamab deruxtecan (Dato-DXd) in TROPION-PanTumor02. J Thorac Oncol, 2025; 20(3):S212–S213.

132.

Garassino, Sands, Paz-Ares, et al. PL02.11 Normalized membrane ratio of TROP2 by quantitative continuous scoring is predictive of clinical outcomes in TROPION-Lung01. J Thorac Oncol, 2024; 19(10):S2–S3.

133.

Chang, Launer, Narayan, et al. Computational Histology Artificial Intelligence (CHAI) enhances risk stratification of high-grade Ta non–muscle-invasive bladder cancer in a multicenter cohort: Comparison to current European Association of Urology and American Urological Association stratification schemes. Eur Urol, 2025; 88(4):411–413.

134.

Shamai, Cohen, Binenbaum, et al. Deep learning on histopathological images to predict breast cancer recurrence risk and chemotherapy benefit. medRxiv, 2025:2025.05.15.25327686.

135.

Nair, Muhammad, Jain, et al. A novel artificial intelligence-powered tool for precise risk stratification of prostate cancer progression in patients with clinical intermediate risk. Eur Urol, 2025; 87(6):728–729.

136.

Valieris, Martins, Defelicibus, et al. Weakly-supervised deep learning models enable HER2-low prediction from H&E stained slides. Breast Cancer Res, 2024; 26(1):124.

137.

Wang, Zhang, Li, et al. Development and clinical validation of deep learning-based immunohistochemistry prediction models for subtyping and staging of gastrointestinal cancers. BMC Gastroenterol, 2025; 25(1):494–411.

138.

Liu, Li, Zheng, et al. Predict Ki-67 positive cells in H&E-stained images using deep learning independently from IHC-stained images. Front Mol Biosci, 2020; 7:183.

139.

Das, Tomita, Syme, et al. Cross-modality learning for predicting IHC biomarkers from H&E-stained whole-slide images. Am J Pathol, 2025.

140.

Levin, Alexanian, Kallen, et al. 1195 Multi-modal spatial analysis of classic Hodgkin lymphoma microenvironments utilizing multiplex immunofluorescence and virtual staining. BMJ Specialist Journals, 2024.

141.

Büttner, Gosney, Skov, et al. Programmed death-ligand 1 immunohistochemistry testing: A review of analytical assays and clinical implementation in non–small-cell lung cancer. J Clin Oncol, 2017; 35(34):3867–3876.

142.

Rimm, Han, Taube, et al. A prospective, multi-institutional, pathologist-based assessment of 4 immunohistochemistry assays for PD-L1 expression in non–small cell lung cancer. JAMA Oncol, 2017; 3(8):1051–1058.

143.

Zarella, Bowman, Aeffner, et al. A practical guide to whole slide imaging: A white paper from the Digital Pathology Association. Arch Pathol Lab Med, 2019; 143(2):222–234.

144.

Tizhoosh, Pantanowitz. Artificial intelligence and digital pathology: Challenges and opportunities. J Pathol Inform, 2018; 9(1):38.

145.

U.S. Food and Drug Administration. Digital pathology software regulatory considerations [Internet]. 2021. Available from: https://www.fda.gov/media/154409/download

146.

Meyer, Khademi, Têtu, et al. Impact of artificial intelligence on pathologists’ decisions: An experiment. J Am Med Inform Assoc, 2022; 29(10):1688–1695.

147.

Wong, Otles, Donnelly, et al. External validation of a widely implemented proprietary sepsis prediction model in hospitalized patients. JAMA Intern Med, 2021; 181(8):1065–1070.

148.

Marcu, Marcu. Examining the role of AI in cancer imaging through the lens of clinical studies. Health Technol, 2025; 15(6):1065–1074.

149.

Echle, Rindtorff, Brinker, et al. Deep learning in cancer pathology: A new generation of clinical biomarkers. Br J Cancer, 2021; 124(4):686–696.

150.

Kather, Pearson, Halama, et al. Deep learning can predict microsatellite instability directly from histology in gastrointestinal cancer. Nat Med, 2019; 25(7):1054–1056.

151.

Device Classification Under Section 513(f)(2)(De Novo) [Internet]. Available from: https://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfpmn/denovo.cfm?id=DEN240068 [Last accessed: Jan 28, 2026].

152.

Physicians—Actionable Insights | Decipher® Prostate. Decipher by Veracyte. 2025. Available from: https://decipherbio.com/decipher-prostate/physicians/decipher-prostate-overview/ [Last accessed: Oct 6, 2025].

153.

GPS Physician. mdxhealth. Available from: https://mdxhealth.com/gps-physician/ [Last accessed: Oct 6, 2025].

154.

Centers for Medicare & Medicaid Services. Clinical Laboratory Fee Schedule [Internet]. 2019. Available from: https://www.cms.gov/files/zip/25clabq4.zip

155.

Brügmann, Eld, Lelkaitis, et al. Digital image analysis of membrane connectivity is a robust measure of HER2 immunostains. Breast Cancer Res Treat, 2012; 132(1):41–49.

156.

Niazi MKK, Parwani, Gurcan. Digital pathology and artificial intelligence. Lancet Oncol, 2019; 20(5):e253–e261.

157.

Eloy, Marques, Pinto, et al. Artificial intelligence–assisted cancer diagnosis improves the efficiency of pathologists in prostatic biopsies. Virchows Arch, 2023; 482(3):595–604.

158.

Golden. Deep learning algorithms for detection of lymph node metastases from breast cancer: Helping artificial intelligence be seen. JAMA, 2017; 318(22):2184–2186.

159.

Steiner, Nagpal, Sayres, et al. Evaluation of the use of combined artificial intelligence and pathologist assessment to review and grade prostate biopsies. JAMA Netw Open, 2020; 3(11):e2023267.

160.

Perincheri, Levi, Celli, et al. An independent assessment of an artificial intelligence system for prostate cancer detection shows strong diagnostic accuracy. Mod Pathol, 2021; 34(8):1588–1595.

161.

Roche granted FDA Breakthrough Device Designation for first AI-driven companion diagnostic for non-small cell lung cancer [Internet] Available from: https://www.roche.com//media/releases/med-cor-2025-04-29 [Last accessed: Jan 28, 2026].

162.

Guyatt, Oxman, Vist, et al.; GRADE Working Group. GRADE: An emerging consensus on rating quality of evidence and strength of recommendations. BMJ, 2008; 336(7650):924–926.

163.

Ibex Prostate becomes first standalone AI-powered cancer diagnostics solution to obtain CE mark under the IVDR - IBEX [Internet]. 2023. Available from: https://ibex-ai.com/ivdr23/ [Last accessed: Jan 1, 2026].

164.

U.S. Food and Drug Administration. 510(k) Decision Summary: K241232 [Internet]. 2025 Available from: https://www.accessdata.fda.gov/cdrh_docs/pdf24/K241232.pdf

165.

Chen, Lu, Williamson, et al. Fast and scalable search of whole-slide images via self-supervised deep learning. Nat Biomed Eng, 2022; 6(12):1420–1434.

166.

Spratt, Tang, Sun, et al. Artificial intelligence predictive model for hormone therapy use in prostate cancer. NEJM Evid, 2023; 2(8):EVIDoa2300023.

167.

Chen, Lu, Wang, et al. Pathomic fusion: An integrated framework for fusing histopathology and genomic features for cancer diagnosis and prognosis. IEEE Trans Med Imaging, 2022; 41(4):757–770.

168.

Chunduru, Phillips, Molinaro. Prognostic risk stratification of gliomas using deep learning in digital pathology images. Neurooncol Adv, 2022; 4(1):vdac111.

169.

Wulczyn, Steiner, Xu, et al. Deep learning-based survival prediction for multiple cancer types using histopathology images. PLoS One, 2020; 15(6):e0233678.

170.

Choi, Kim. Artificial intelligence in the pathology of gastric cancer. J Gastric Cancer, 2023; 23(3):410–427.

171.

Oakley III, Reis-Filho, Klimstra, et al. Deep learning-based assessment of HER2-low expression on breast cancer H&E digital whole slide images. In: Cancer Research. Philadelphia, PA: American Association for Cancer Research; 2023.

172.

Janowczyk, Zuo, Gilmore, et al. HistoQC: An open-source quality control tool for digital pathology slides. JCO Clin Cancer Inform, 2019; 3:1–7.

173.

Weinstein, Collisson, Mills, et al. The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet, 2013; 45(10):1113–1120.

174.

Haghighat, Browning, Sirinukunwattana, et al. Automated quality assessment of large digitised histology cohorts by artificial intelligence. Sci Rep, 2022; 12(1):5002.

175.

Weng, Seper, Pryalukhin, et al. GrandQC: A comprehensive solution to quality control problem in digital pathology. Nat Commun, 2024; 15(1):10685.

176.

Jabar, Rasmussen Busund L-T, Ricciuti, et al. Fully automatic content-aware tiling pipeline for pathology whole slide images. Intell-Based Med, 2025; 12:100318.

177.

Conflux Consult. Conflux Available from: https://www.conflux.xyz/

178.

Automated Quality Control for digital pathology slides [Internet] Available from: https://www.aiosyn.com/automated-quality-control/ [Last accessed: Jan 1, 2026].

179.

SlideQC BF for automated digital slide quality control [Internet] Available from: https://www.leicabiosystems.com/us/digital-pathology/analyze/aperioaistore/gallery/slideqcbf/ [Last accessed: Jan 1, 2026].

180.

AP Laboratory Solutions [Internet] Available from: https://www.pathai.com/ap-lab-solutions [Last accessed: Jan 1, 2026].

181.

PathAI announces launch of ArtifactDetect model on AISight, pioneering automated slide quality analysis in pathology labs [Internet]. 2023. Available from: https://www.pathai.com/resources/pathai-launches-artifactdetect-model-on-aisight-pioneering-automated-slide-quality-analysis-in-pathology-labs [Last accessed: Jan 1, 2026].

182.

Pathology - Sectra Amplifier Marketplace [Internet] Available from: https://amplifiermarketplace.sectra.com/pathology/ [Last accessed: Jan 1, 2026].

183.

Aperio iQC: AI-powered digital pathology quality control [Internet] Available from: https://www.leicabiosystems.com/us/digital-pathology/scan/aperio-iqc-software/ [Last accessed: Jan 1, 2026].

184.

Automated QC - Proscia Available from: https://proscia.com/ai/automated-qc/ [Last accessed: Jan 1, 2026].

185.

Petry, Lansky, Chodakiewitz, et al. Decreased hospital length of stay for ICH and PE after adoption of an artificial intelligence-augmented radiological worklist triage system. Radiol Res Pract, 2022; 2022(1):2141839.

186.

Pantanowitz, Quiroga-Garza, Bien, et al. An artificial intelligence algorithm for prostate cancer diagnosis in whole slide images of core needle biopsies: A blinded clinical validation and deployment study. Lancet Digit Health, 2020; 2(8):e407–e416.

187.

…, Fei. Medical hyperspectral imaging: A review. J Biomed Opt, 2014; 19(1):10901.

188.

Liu, Wang, Liu, et al. A comparative performance analysis of multispectral and RGB imaging on HER2 status evaluation for the prediction of breast cancer prognosis. Transl Oncol, 2016; 9(6):521–530.