Abstract
Introduction:
This study is part of the broader Stem Line project Mito-Cell-UAB073, specifically focusing on “Stem Cell Lines-Quality Control,” and aims to innovate in the field of Quality Control (QC) through a unique, artificial intelligence (AI)-powered model known as Life Cell AI UAB. This model utilizes deep learning algorithms and computer vision, allowing it to make accurate viability assessments of cell and stem cell lines based solely on static images captured through standard optical microscopes.
Aim:
The aim of this study was to develop and validate an AI-driven, image-based model that reliably predicts cell line viability.
Methods:
Our methodology involved training the Life Cell AI UAB model on single static images of cell lines using advanced computer vision and deep learning techniques. Performance evaluation was conducted on three independent blind test sets sourced from various biotechnology laboratories, allowing for assessment across diverse environments.
Results:
The Life Cell AI UAB model achieved a sensitivity of 82.1% in identifying viable cell lines and a specificity of 67.5% for non-viable lines across the test sets. Each blind test set exhibited a weighted accuracy above 63%, with a combined accuracy of 64.3%. Notably, the distributions of predictions showed a clear separation between correctly and incorrectly classified cell lines. The model outperformed traditional QC methods, improving accuracy in binary classification tasks by 21.9% (p = 0.042), and demonstrated a 42.0% improvement over conventional standard operating procedure (SOP) methods (p = 0.026).
Conclusion:
The Life Cell AI UAB model represents a notable advancement in biobanking QC, offering a precise, standardized, and non-invasive method for assessing cell line viability. This model has the potential to streamline QC processes across laboratories, minimizing the need for time-lapse imaging and promoting uniformity in QC practices for both cell lines and stem cells.
Introduction
Biobanks have, for more than a decade, systematically acquired high-resolution cell line and stem cell imaging data (optical microscopy, fluorescence microscopy, etc.) from several hundred carefully screened and well-characterized healthy individuals and patients with various cancers and autoimmune disorders.1,2 Using biochemical, genetic, and imaging data, we have built an extensive cohort database (the Cells Line Database, cellosaurus.org) that contains a wide range of imaging-associated data, including demographic and cell line viability data.3–5 The Biobanks comprise the associated collection of biological specimens from these cohorts, including saliva, blood, and, in some instances, urine and hair samples, which allow for additional biochemical and genetic analyses. Stem cell products should be generated in compliance with Good Manufacturing Practices (GMP).5–8 No uniform global guidelines exist for producing and clinically applying stem cell products. The United States Food and Drug Administration, the European Medicines Agency, the Japanese Pharmaceuticals and Medical Devices Agency, and other regulatory agencies currently provide GMP guidance to promote the safe use of therapies for patients.
Artificial intelligence (AI) is one of the most dynamic and fastest-growing global trends and markets. The global AI market is expected to reach over USD 1591.03 billion by 2030. The accelerated market hype around AI has made it a buzzword in almost every industry. Regardless of their industry, businesses are interested in investing in the potential of AI to automate, assist, and augment various value-based tasks.9,10
Demonstration of comparability depends on agreement on the critical quality attributes. In other words, biological and physicochemical properties must fall within the reference limits and ranges necessary to ensure the quality and safety of the product in its use. It is also necessary to consider the methods (validation and verification) used to measure these parameters.3,11,12
The list of Quality Control (QC) criteria for stem cells and cell lines is widely acknowledged by researchers13–15 to include identity, microbiological sterility, genetic fidelity, stability, viability, characterization, and potency. There is only one quality testing program for biological products used for basic, translational, and clinical research: that of the Integrated BioBank of Luxembourg (IBBL). However, this program does not cover stem cells or cell lines. Because some biobanks hold stem cells as well as distinct cell lines, a standards organization is needed to conduct these tests.
Recent studies highlight the increasing demand for standardized quality control in stem cell research, especially as their clinical applications expand. The lack of consistency in quality control protocols has been identified as a significant barrier to the reproducibility of research and the successful translation of stem cell therapies into clinical practice.16,17 Validation and clinical evaluation of stem cells and associated biomarkers are critical for advancing global health initiatives, as these efforts rely on access to a wide array of standardized and high-quality samples. For instance, inadequate quality control measures have been linked to variability in experimental outcomes, impacting therapeutic reliability and safety.18,19 Establishing a comprehensive quality control program would facilitate interlaboratory collaboration, enhance the reliability of future collections, and optimize the use of existing specimens.
The European, Middle Eastern & African Society for Biopreservation and Biobanking is uniquely positioned to spearhead initiatives focused on improving quality control for stem cells and cell lines, given its central role in promoting biobanking standards. As biobanking increasingly incorporates advanced technologies, the digitization of biological data combined with AI presents unprecedented opportunities for improving quality control processes. AI-driven image analysis, in particular, can enhance the accuracy, reproducibility, and scalability of quality assessments, addressing long-standing challenges in the field. This effort aligns with global initiatives to improve the reliability and accessibility of stem cell-based therapies, underscoring the critical importance of robust quality control systems.20,21
This study aims to evaluate the capability of AI to assess the quality of stem cells and cell lines by processing visible-spectrum image data.
Materials and Methods
A pilot project on quality control of stem cells and cell lines was developed and implemented in partnership with the Ukrainian Association of Biobanks (UAB), Institute of Bio-Stem Cell Rehabilitation, International Biobanking, Department of Surgery No.1, Kharkiv National Medical University and the Department of Medical Genetics, Yerevan State Medical University.
The program was conducted according to international standards for interlaboratory comparison schemes, with the simultaneous participation of biorepository laboratories in different countries.
Our biosample quality control program works as an external quality assessment tool that allows the user to verify and compare the performance of biosample processing or testing methods through the following steps:
Step 1
Standardized samples
After registration in the pilot program, participating laboratories send standardized samples to the Biobank.
Step 2
Routine methods
The Biobank uses a standard processing method to retrieve samples or a routine method to characterize samples.
We used the standard STEMCELL quality control kit (STEMCELL Quality Control Kit, Catalog # 00651, STEMCELL Technologies, and BD™ Stem Cell Control CD34+ Whole Blood Process Control) for the quality control of stem cells. STEMCELL quality control kits are available with human bone marrow (STEMCELL QC-BM) or umbilical cord blood (STEMCELL QC-CB) cells to evaluate the cell type most appropriate for a particular laboratory application.
Step 3
Extracted samples and results
Participants send us the results of their tests of stem cells, obtained by fluorescence in situ hybridization, comparative genomic hybridization, Giemsa (GTG) karyotyping, or whole genome sequencing.
Step 4
Statistical analysis and benchmarking
We analyze the received samples, collect the results from all participants, and perform statistical and comparative studies.
Step 5
AI-based visible spectrum image analysis
Using AI, we analyze and characterize cell lines by processing visible spectrum (RGB) image data.
Images were acquired using an inverted microscope (Carl Zeiss IM35) and a fluorescence microscope (EVOS AME-3206 Digital Inverted Microscope). Each image was obtained at 20× and 50× optical magnification with a size of 1055 × 1450 pixels. The IT department at UAB (Ukrainian Association of Biobanks, Kharkiv, Ukraine) conducted the analysis. The cells were seeded at a density in a mixture ratio of 1:100. Microscopy images were taken daily.
Data analysis was performed at the UAB IT department using machine learning (ML) in Python. Classification models were developed and evaluated using the MATLAB module “Statistics and Machine Learning Toolbox.” ML is commonly described in terms of the three main components of any learning algorithm, namely Task (T), Performance (P), and Experience (E). In this context, the definition can be simplified as follows: ML is an area of AI consisting of learning algorithms that improve their performance (P) at some task (T) over time with experience (E).
The Life Cell AI UAB model was trained using a robust pipeline to ensure precise analysis and characterization of cell lines based on high-resolution static images acquired under controlled visible-spectrum imaging conditions. Each image was resized to a standardized resolution of 1055 × 1450 pixels to maintain uniform input dimensions and normalized to a pixel intensity range of 0–1, enabling consistency across datasets captured under varying lighting conditions. Histogram equalization was applied to enhance contrast, amplifying features such as membrane boundaries and intracellular structures, thereby improving the model’s ability to extract meaningful patterns.
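The normalization and histogram equalization steps described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: it assumes 8-bit grayscale images stored as nested lists, omits resizing, and equalizes on integer grey levels before scaling to 0–1 purely for simplicity. The function names `normalize` and `equalize_histogram` and the toy image are hypothetical.

```python
def normalize(image, max_value=255):
    """Scale integer pixel values into the 0-1 range."""
    return [[pixel / max_value for pixel in row] for row in image]


def equalize_histogram(image, levels=256):
    """Histogram-equalize a grayscale image given as rows of ints in [0, levels)."""
    flat = [pixel for row in image for pixel in row]
    hist = [0] * levels
    for pixel in flat:
        hist[pixel] += 1

    # Cumulative distribution of grey levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    n = len(flat)
    if n == cdf_min:  # constant image: nothing to stretch
        return [row[:] for row in image]

    # Look-up table that spreads the occupied grey levels over the full range.
    lut = [max(0, round((c - cdf_min) / (n - cdf_min) * (levels - 1))) for c in cdf]
    return [[lut[pixel] for pixel in row] for row in image]


raw = [[50, 50], [100, 200]]  # toy 2 x 2 image with made-up values
equalized = equalize_histogram(raw)
scaled = normalize(equalized)
```

In this toy image the darkest occupied grey level is mapped to 0 and the brightest to 255, so the scaled output spans the full 0–1 range.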
The preprocessing pipeline incorporated the Canny edge detection algorithm to eliminate background noise and isolate cellular structures. This algorithm identified high-gradient regions corresponding to cellular boundaries. Otsu’s thresholding method was then used to convert the images into binary form, separating foreground cells from the background. To further refine cell delineation, watershed segmentation was applied, facilitating the accurate separation of overlapping cells and ensuring a clean dataset for model training.
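Of the three segmentation steps named above, Otsu's thresholding is compact enough to sketch directly; Canny edge detection and watershed segmentation are omitted here. The sketch below assumes 8-bit grey levels in a flat list and picks the threshold that maximizes the between-class variance. The helper names `otsu_threshold` and `binarize` and the sample pixel values are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t maximizing between-class variance (background: p <= t)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    weight_bg, sum_bg = 0, 0
    best_t, best_var = 0, -1.0
    for t in range(levels):
        weight_bg += hist[t]
        sum_bg += t * hist[t]
        if weight_bg == 0:
            continue  # no background pixels yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break  # every pixel is already background
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        var_between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, threshold):
    """Foreground (1) where the pixel is brighter than the threshold, else 0."""
    return [1 if p > threshold else 0 for p in pixels]


pixels = [10] * 50 + [200] * 50  # made-up bimodal sample: dark background, bright cells
t = otsu_threshold(pixels)
mask = binarize(pixels, t)
```

For a clearly bimodal sample like this one, any threshold between the two modes separates foreground from background; real microscopy histograms are rarely this clean, which is why the text combines thresholding with edge detection and watershed segmentation.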
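The model’s supervised training framework employed a convolutional neural network (CNN) architecture optimized for multi-class classification and regression tasks. The loss function was minimized using stochastic gradient descent with momentum, which in its standard form is written as:
$$v_{t+1} = \mu v_t - \eta \nabla_{\theta} L(\theta_t), \qquad \theta_{t+1} = \theta_t + v_{t+1}$$
where $\theta_t$ denotes the model parameters at step $t$, $v_t$ the velocity, $\eta$ the learning rate, $\mu$ the momentum coefficient, and $L$ the loss function.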
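As an illustration of stochastic gradient descent with momentum, the sketch below applies the textbook update rule to a one-parameter toy loss. It does not reproduce the authors' CNN training; the learning rate, momentum coefficient, step count, and the quadratic loss are all assumed values chosen only to show the mechanics.

```python
def sgd_momentum_step(params, grads, velocity, lr=0.1, momentum=0.9):
    """One update: v <- momentum * v - lr * grad, then param <- param + v."""
    new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    new_params = [p + v for p, v in zip(params, new_velocity)]
    return new_params, new_velocity


def toy_gradient(params):
    """Gradient of the toy loss L(x) = (x - 3)^2, whose minimum is at x = 3."""
    return [2.0 * (params[0] - 3.0)]


params, velocity = [0.0], [0.0]
for _ in range(300):
    params, velocity = sgd_momentum_step(params, toy_gradient(params), velocity)
```

The velocity term accumulates past gradients, which damps oscillation along steep directions and speeds progress along shallow ones; on this toy problem the parameter settles close to 3.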
The dataset, comprising 10,000 annotated images, was divided into training (70%), validation (15%), and testing (15%) subsets. For example, images in the training subset ranged from proliferative cell cultures exhibiting rapid growth to senescent populations with altered morphology. The testing subset, containing unseen data, was used to evaluate the model’s generalizability.
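The 70/15/15 partition described above could be produced along the following lines. This is a sketch under assumptions: a simple shuffled split with a fixed seed, integer image identifiers standing in for the annotated images, and a hypothetical helper name `split_dataset`. The source does not state whether the split was stratified by class or laboratory.

```python
import random


def split_dataset(items, train_pct=70, val_pct=15, seed=0):
    """Shuffle items and split them into training, validation, and testing subsets."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split reproducible
    n_train = len(items) * train_pct // 100
    n_val = len(items) * val_pct // 100
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remainder is held out as unseen data
    return train, val, test


image_ids = range(10_000)  # placeholder identifiers for the annotated images
train_set, val_set, test_set = split_dataset(image_ids)
```

Integer percentages are used so that the subset sizes are computed without floating-point rounding; for 10,000 images this gives 7,000, 1,500, and 1,500 items.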
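Performance metrics included accuracy, precision, recall, F1-score, and mean squared error (MSE). For classification tasks, accuracy was computed as follows:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
where $TP$, $TN$, $FP$, and $FN$ are the numbers of true positives, true negatives, false positives, and false negatives, respectively.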
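For regression outputs such as growth rate prediction, MSE was calculated as:
$$\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
where $y_i$ is the observed value, $\hat{y}_i$ the predicted value, and $n$ the number of samples.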
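The listed metrics follow their standard definitions, and a minimal sketch of how they might be computed is given below. The function names `classification_metrics` and `mean_squared_error` and the example labels and values are hypothetical, and binary labels (1 = viable, 0 = non-viable) are assumed.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1-score for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def mean_squared_error(y_true, y_pred):
    """Average squared difference between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up labels (1 = viable, 0 = non-viable) and made-up growth rates.
metrics = classification_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1])
mse = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.5, 2.0])
```

Returning 0.0 when a denominator is empty is a convention chosen here to avoid division by zero; other conventions, such as reporting the metric as undefined, are equally defensible.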
This combination of advanced preprocessing techniques, meticulous training processes, and rigorous evaluation metrics ensured the Life Cell AI UAB model’s robustness and applicability in biobanking workflows, setting a new benchmark for cell line characterization.
The obtained data were statistically processed using Statistica v. 6.0 (Statsoft Inc.).
Results and Discussion
Using a survey of seven Biobank laboratories that manufacture stem cells and/or cell lines, we show how diverse parameters, assays, and standards are. These differences in institutional definitions and processes can result in variations in clinical-grade cell line production. To define and minimize these inconsistencies, we created the biosample quality control program containing five steps that provide a standardized approach. The steps in Figure 1 specify a systematic way to rigorously characterize biosamples for quality, safety, and compliance and represent an integrated structure applicable to various institutional practices.

Figure 1. In step 1, stem cell lines of all types were received from participating biobank laboratories. Quality control steps 1–5 (standardized samples, routine methods, extracted samples/results, statistical analysis/benchmarking, AI-based visible spectrum image analysis) are shown. AI, artificial intelligence.
Based on the results of a pilot project on quality control of stem cells and cell lines, the following parameters were analyzed: cell phenotype; viability and growth activity; purity and homogeneity; sterility and mycoplasma testing; detection of endogenous pathogens; endotoxin testing; abnormal immunological response; tumorigenicity; biological activity testing; and residual culture media.16–18
To thoroughly evaluate the integrity of the genomes of the stem cell lines, we conducted single-cell genome sequencing on a large volume of samples, as illustrated in Figure 2. This sequencing assessed chromosomal stability and identified possible genomic aberrations that can affect the functionality and therapeutic capacity of cell lines.

Figure 2. The results of a pilot project on quality control of stem cells and cell lines, showing genome integrity evaluation, metabolic activity assessment, and flow cytometry data for surface markers (CD105, CD73, CD90, etc.).
Concomitantly, hallmark phenotypic markers (CD105, CD73, and CD90) were characterized to validate mesenchymal stem cell (MSC) identity, with ≥95% of the MSC population required to express these markers. To exclude hematopoietic or other non-MSC contaminants, the expression of CD45, CD34, CD14 or CD11b, CD79a, CD19, and HLA class II was also evaluated, with a maximum of 2% positivity allowed.12,19 These evaluations were conducted via flow cytometry, yielding quantitative information on marker expression.
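The marker criteria above amount to a simple acceptance rule, sketched below. It assumes flow cytometry results are supplied as percent-positive values per marker, treats 2% as an upper limit for the exclusion markers, and checks only those exclusion markers that were actually measured. The function name `check_msc_panel` and the sample values are hypothetical.

```python
POSITIVE_MARKERS = ("CD105", "CD73", "CD90")
NEGATIVE_MARKERS = ("CD45", "CD34", "CD14", "CD11b", "CD79a", "CD19", "HLA class II")


def check_msc_panel(percent_positive, min_positive=95.0, max_negative=2.0):
    """Return (passed, failed_markers) for a dict of marker -> percent positive cells."""
    failures = []
    for marker in POSITIVE_MARKERS:
        value = percent_positive.get(marker)
        # A missing identity marker is treated as a failure.
        if value is None or value < min_positive:
            failures.append(marker)
    for marker in NEGATIVE_MARKERS:
        value = percent_positive.get(marker)
        # Exclusion markers are checked only when they were measured.
        if value is not None and value > max_negative:
            failures.append(marker)
    return not failures, failures


# Made-up flow cytometry readouts for two hypothetical batches.
clean_batch = {"CD105": 98.2, "CD73": 99.0, "CD90": 97.5, "CD45": 0.4, "CD34": 1.1}
mixed_batch = {"CD105": 98.2, "CD73": 99.0, "CD90": 97.5, "CD45": 6.3, "CD34": 1.1}

clean_ok, clean_failures = check_msc_panel(clean_batch)
mixed_ok, mixed_failures = check_msc_panel(mixed_batch)
```

Reporting the failing markers alongside the pass/fail flag makes it easier to track how expression changes across the post-thaw time points.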
Measurements were taken at multiple time points after thawing (0 hours, 2 hours, 4 hours, and 24 hours) to investigate alterations in marker expression. This enabled the identification of any trends or changes in the kinetics of surface marker expression over time, indicating cell stress or recovery dynamics. Furthermore, the viability and metabolic activity of the stem cell lines were evaluated using cellular ATP concentration as a robust indicator of cellular health and energy status after cryopreservation.
These data are integrated in Figure 2 and provide a comprehensive overview of genome integrity, phenotypic stability, and metabolic activity in stem cell lines. This approach emphasizes the need for multidimensional evaluations, which are crucial for the quality and clinical potency of MSC populations. The viability of the stem cell lines was determined by cellular ATP concentration.20–23
In our study, we found no differences between stem cells and cell lines from different laboratories when they were tested using standard operating procedures (SOPs) for purity and homogeneity, sterility, mycoplasmas, and endogenous pathogens.
When sequencing the genomes of single cells in large samples to assess genome integrity in stem cell lines, we did not find a significant difference between laboratories. In addition, most of the copy number variation found by this analysis occurred in regions identified as segmental duplications or low-complexity regions and was annotated in the genomic variance database.24,25
Below, we describe each stage of the image processing and microscopy workflow and of training and inference for cell segmentation: image reconstruction, foreground and background segmentation, cell detection, and final single-cell segmentation (image segmentation to create a database for AI; Fig. 3).

Figure 3. Workflow and implementation of the AI training pipeline for cell segmentation: preprocessing (Canny, Otsu, watershed), CNN training, and inference for stem cell classification. CNN, convolutional neural network.
A total of 3781 culture images and 456,998 cell images from 7 participating biotechnology laboratories and 2050 control cell lines were entered into the database for model training, which was done by cross-validation.
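The source does not specify the cross-validation scheme. As one common possibility, k-fold cross-validation can be sketched as follows, where each image index serves as validation data exactly once. The value of k, the seed, and the helper name `k_fold_indices` are assumptions made for illustration.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, validation_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


splits = list(k_fold_indices(n_samples=10, k=5))
```

When images from the same culture or laboratory are correlated, grouping folds by culture or laboratory rather than by individual image gives a more honest estimate of generalization; this sketch does not do that.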
In three independent blinded test sets from different biotechnology laboratories, the Life Cell AI UAB model showed a sensitivity of 82.1% for viable cell lines and a specificity of 67.5% for non-viable cell lines. The reliability of the AI model is demonstrated by the overall accuracy in each blind test set for both viable and non-viable cell lines, which was >63%, with a pooled accuracy of 64.3%. Distributions of predictions showed clear separation of correctly and incorrectly classified cell lines. Binary comparison of viable/non-viable cell line classification improved accuracy by 21.9% (p = 0.042, Student’s t-test). SOP QC comparison showed a 42.0% improvement over standard quality control kits (p = 0.026, Student’s t-test).
This study has several limitations that need further exploration. First, the classification models were trained and validated using a limited dataset from a small number of specific cell lines rather than a broader range of cell types, which may limit the generalizability of the findings. Second, although the models achieved high sensitivity and specificity when tested under controlled laboratory conditions, their performance should be validated in real-life biobanking environments with different imaging systems and varying sample quality. Finally, real-time analysis requires substantial computational infrastructure, which could limit its applicability to smaller biobanks with tighter budgets.
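To make the reported quantities concrete, the sketch below shows how sensitivity, specificity, per-set accuracy, and pooled accuracy are conventionally derived from confusion-matrix counts. The counts are invented for illustration and are not the study's data; pooling by summing counts across test sets is an assumption about how the pooled figure was obtained.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


def accuracy(counts):
    """Fraction of correctly classified cell lines in one test set."""
    total = counts["tp"] + counts["fn"] + counts["tn"] + counts["fp"]
    return (counts["tp"] + counts["tn"]) / total


def pooled_accuracy(test_sets):
    """Accuracy over all test sets combined, which weights each set by its size."""
    correct = sum(s["tp"] + s["tn"] for s in test_sets)
    total = sum(s["tp"] + s["fn"] + s["tn"] + s["fp"] for s in test_sets)
    return correct / total


# Invented counts for two hypothetical blind test sets.
set_a = {"tp": 80, "fn": 20, "tn": 60, "fp": 40}
set_b = {"tp": 40, "fn": 10, "tn": 35, "fp": 15}

sens_a, spec_a = sensitivity_specificity(**set_a)
pooled = pooled_accuracy([set_a, set_b])
```

Pooling by counts rather than averaging the per-set accuracies prevents a small test set from having the same influence as a large one.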
In the future, the dataset will be expanded to better characterize the variability in appearance across different cell types and imaging conditions to improve the robustness and generalizability of the classifier models. Moreover, the next phase will focus on testing the models under real-world biobanking settings to ensure their usability in various operational environments. In addition, the models will be made more computationally efficient, such that they will be accessible to facilities that may have limited computational resources. The user-friendly software platform that will integrate these models is expected to promote the uptake of these advancements and support the adoption of standardized biobanking practices.
At the end of the project, an open-access repository will be created to publish the classification models developed during this study. The models aid quality control procedures in biobanking because visible spectrum images allow for accurate, standardized, and non-invasive cell line characterizations. These computational models use algorithms to rate essential cellular features and properties, such as viability, proliferation rates, and phenotype markers, thereby assuring consistency and fidelity in biobanking activities.
These models help establish a standard framework for cell line characterization, an area of well-known inconsistency between biobanking facilities and research studies. Moreover, the models will be an essential resource for research in cellular biology and computational biomedicine, providing a solid foundation for further exploration and refinement of biobanking quality control applications and for greater consistency and transparency within the scientific community.
Their publicly accessible nature is anticipated to promote transparency and collaboration, catalyzing the development of harmonized protocols and tools to quantify biological materials. This project demonstrates a commitment to progress in biobanking while ensuring that the most recent technological developments are broadly accessible and applicable.
Conclusion
Based on the results of a quality control pilot project, we determined that, to ensure the safety and efficacy of stem cell products, each batch of stem cell preparation must meet existing quality requirements for stem cells, including cell identification, viability and growth activity, purity and homogeneity, sterility testing, mycoplasma testing, endogenous pathogen detection, endotoxin testing, abnormal immunological response, tumorigenicity, biological activity testing, residual culture media, and other optional components.
The Life Cell AI UAB model showed a sensitivity of 82.1% when identifying viable cell lines while maintaining a specificity of 67.5% for non-viable cell lines in three independent sets of blind tests performed in different biotechnological laboratories and Biobanks. The accuracy of the Life Cell AI UAB model can improve the Biobank’s sample quality control assessment. It could also optimize the standardization of quality control methods for cell lines and stem cells across different environmental conditions while eliminating the need for complex time-lapse imaging equipment.
Advanced techniques, such as single-cell genome sequencing in large samples, can provide a better understanding of genome integrity in stem cell lines. In addition, using AI allows the user to process a large amount of data. These evaluations should be carried out after cryopreservation during the development and testing phases to ensure no cryodamage. More resources and research must be devoted to optimizing stem cell quality control.
Authors’ Contributions
I.A.K., S.G., and Y.V.I. constructed the study design. I.A.K., K.S., and Y.V.I. contributed to data interpretation and article drafting. Y.V.I. and S.G. contributed to the statistical analysis. I.A.K., K.S., S.G., and M.N.B. prepared the figures. K.S., S.G., and Y.V.I. participated in the clinical investigation and contributed to the epidemiological data collection. S.G., K.S., and E.H. revised the article. All authors read and approved the final article.
Footnotes
Availability of Data and Materials
The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.
Ethical Approval and Consent to Participate
This multi-institutional retrospective cohort study was conducted in accordance with the Declaration of Helsinki. The use of registered data complies with the General Data Protection Regulation of the European Union. This research was part of the research program of the Kharkiv National Medical University, “Improvement and development of methods for diagnosis and surgical treatment of diseases and injuries of the abdominal cavity and chest, vessels of the upper and lower extremities using mini-invasive techniques in patients at high risk of postoperative complications” (state registration number 0116u00499). The Bioethics Committee of the Ukrainian Association of Biobanks reviewed and approved the research protocol of the pilot project on quality control of stem cells and cell lines, developed and implemented in partnership with the Ukrainian Association of Biobanks (UAB), Institute of Bio-Stem Cell Rehabilitation, International Biobanking, Department of Surgery No.1, Kharkiv National Medical University and the Department of Medical Genetics, Yerevan State Medical University (protocol No. 2003/2021, dated 11.11.2021).
