Abstract
The International Agency for Research on Cancer (IARC) is the World Health Organization's (WHO) cancer research agency. The agency conducts research on cancer with worldwide collaborations, adopting a multidisciplinary approach of epidemiology and laboratory-based studies on cancer causes, as well as preventive interventions. The IARC Biobank stores multiple collections of samples and conducts preanalytical services for studies conducted worldwide in support of the research activities. Traditionally, the multiple collections from these studies were managed by the individual research groups in different project-specific databases. In 2010, a program to centralize sample collections into a single platform was initiated by adopting a common database with the introduction of a minimum dataset for sample collections received at IARC. The process involved checking data files, verifying the storage location of samples, conducting data harmonization, and importing or migrating existing data from project-specific spreadsheets and databases into the common database. In addition to the creation of a common biobank database and an up-to-date inventory of IARC's biological resources, a governance structure was established. The Biobank Steering Committee was created and an access policy was adopted to facilitate and guide the sharing of IARC's resources in a transparent manner, while taking into account Ethical, Legal, and Social Issues.
Introduction
Advances in research employing assays and state-of-the-art technologies1,2 rely on the availability of high-quality samples and data and are driven by the goals of personalized and precision medicine.3 Biobanks play a key role in these developments and have enabled the molecular and genetic analysis of large numbers of samples from population-based studies. However, maintaining biobanks requires management of infrastructure and processes. The adoption of reliable information technology (IT) tools, quality assurance processes, and a robust governance structure with due consideration of Ethical, Legal, and Social Issues (ELSI)4 contribute to the success of biobanks.
The International Agency for Research on Cancer (IARC) Biobank is a key infrastructure supporting IARC's mission of promoting cancer research worldwide. IARC adopts a multidisciplinary approach of epidemiology and laboratory-based studies, focusing on genetics, environment and occupation, nutrition and metabolism, and infections.
IARC plays a leading role in developing international biobanking standards and guidelines, particularly in low-resource countries. The “Common Minimum Technical Standards and Protocols for Biological Resource Centers Dedicated to Cancer Research” was published in 2007 as one of the pioneering documents of standard protocols for biobanking. A follow-up publication, the “Common Minimum Technical Standards and Protocols for Biobanks Dedicated to Cancer Research” 5 was published in 2017 and provides new information on biobanking with specific sections dedicated to ELSI.
IARC's role in international biobanking is furthered by its membership of and contribution to international partnerships, such as the European and Middle Eastern Society for Biopreservation and Biobanking (ESBB), the Biobanking, BioMolecular resources Research Infrastructure-European Research Infrastructure Consortium (BBMRI-ERIC), 6 and the International Society for Biological and Environmental Repositories (ISBER). In 2013, IARC established the Low and Middle Income Country (LMIC) Biobank and Cohort Building Network (BCNet), 7 in partnership with several international organizations, including the U.S. National Cancer Institute-Center for Global Health (NCI-CGH). BCNet members benefit from training and technical support for the upgrade or establishment of biobanks.
The IARC Biobank stores samples from population- and disease-based collections, from studies focused on cancer etiology and prevention, including cohort, case–control, and case-only studies (Table 1).8–15 The samples are divided into three categories (category 1, 2, and 3): category 1 consists of collections originating from ongoing worldwide consortium studies. The main collection in this category is the European Prospective Investigation into Cancer and Nutrition (EPIC). EPIC is one of the largest population-based studies with more than half a million (521,000) participants (350,000 provided blood samples) recruited from 23 centers across 10 European countries. Other major collections include multicenter case–control studies conducted by IARC in collaboration with local partners on breast cancer in Latin America, human papilloma virus vaccination and cervical screening in Europe and in LMIC, hepatitis B virus vaccination and liver cancer studies in Africa and Asia, and case–control studies on lung, head and neck, and kidney cancer in Europe.
Table 1. IARC Sample Collection with Information on Study Design, Date, Sample Origin, and Type
FFPE, formalin-fixed paraffin-embedded; HIV, human immunodeficiency virus; HPV, human papilloma virus; IARC, International Agency for Research on Cancer; NPC, nasopharyngeal carcinoma; RBCs, red blood cells; UADT, upper aerodigestive tract.
Category 2 samples are from studies that have fulfilled the primary endpoint and the samples are available for international collaboration. Category 3 samples are human or animal cell lines established at IARC or obtained from collaborators, representing over 13,000 vials.
Prior to the centralization, existing data on the three categories of samples were kept in different places (e.g., biobank and research groups) and in different formats (Access, Excel, Word, and paper documents). The IARC Biobank also accommodates the preanalytical service platform for sample reception, storage, retrieval from storage facilities, aliquoting, DNA extraction and quantification, quality control (QC) on extracted DNA, and worldwide distribution of samples.
The biobank is a good example of a facility that started by providing storage facilities for projects managed by individual groups and evolved into a multiuser resource. Due to an increase in the number of collections, a change in scope and focus of IARC, and demand for the reuse of samples, the IARC Biobank needed to transition into a centralized facility. The resources are managed through a common database developed in-house 16 to enhance data harmonization and guided by international guidelines.
The adoption of the common database facilitated the centralization process. However, there were various challenges to overcome in terms of concept and resources. In this article, we will present the process, challenges, and solutions employed in achieving the centralization, with a view to providing a case study of a not uncommon transition.
Materials and Methods
Organizing the sample collections stored at IARC Biobank
From the early 1970s to the late 1990s, samples were stored in freezers and liquid nitrogen (LN2) tanks at multiple sites, in the research laboratories and in the basement of the main IARC building, without any overall management system for the storage facilities. The Biological Resource Center (BRC) was constructed in 1995 to host the LN2 tanks for the EPIC samples. An automatic refilling system was installed, with a 7000-L supply tank located outside the building.
Early in the 2000s, the sample storage facilities were reorganized. A dedicated storage area was allocated to accommodate 40 freezers (−80°C, −40°C, and −20°C) and 9 (30–50 L) LN2 tanks. A basic alarm system to monitor freezer temperature was installed, based on the internal freezer temperature probe and the existing IARC Centralized Technical Management system. Later, the LN2 tanks were transferred to the BRC, creating an LN2 facility of 34 (600-L) tanks (containing almost 3.5 million straws) and 9 smaller tanks. To optimize the LN2 delivery schedule and reduce the related cost, the 7000-L supply tank was replaced by a 20,000-L tank in 2012.
Over the years, the small tanks were replaced by bigger additional tanks to house new collections. Currently, over 4.3 million blood-derived products, tissue, and cell lines kept in straws and vials are stored in a total of 50 LN2 tanks. In addition, over 900,000 samples are stored in 67 freezers (e.g., blood-derived products, tissues, body fluids, exfoliated cells, and nucleic acids) and at ambient temperature (e.g., dried blood spots on filter paper, hair, nail, formalin-fixed paraffin-embedded blocks, and slides) (see Table 2 for storage conditions). Freezer space is continuously being expanded to meet growing needs and to provide adequate backup space that accommodates the heterogeneity of racks and contents as well as the increase in breakdowns anticipated with the aging freezer units.
Table 2. Storage Conditions of Samples Stored Within the IARC Biobank
H&E, hematoxylin and eosin; OCT, optimal cutting temperature; PBMC, peripheral blood mononuclear cell; RT, room temperature.
See Figure 1 and Supplementary Table S1 for geographical distribution of samples by country in the IARC Biobank; a comprehensive list of the collections is available at the IARC Biobank website and in Table 1.

Figure 1. Geographic distribution of samples by country in the IARC Biobank. The samples are from ongoing or completed studies conducted worldwide in collaboration with IARC. IARC, International Agency for Research on Cancer.
Establishing the centralized biobank
The centralization program initiated in 2010 involved the reorganization and verification of samples in defined storage locations, the introduction of a common database, and the establishment of a governance structure. The IARC Laboratory Services and Biobank (LSB) Group became the focal point to implement the centralization and manage the changes in line with the strategic direction provided by the Biobank Steering Committee (BSC). The LSB group conducted its biobanking activities in collaboration with the project sample custodians and sample managers, and was supported by the IT department.
The common biobank laboratory information system (SAmple Management for IARC Biobank [SAMI]) was developed in-house and adopted in 2011 to store and manage sample location and movement. The program stores information on sample identifiers, geographical origin, age, sex, sampling date, and links with clinical, epidemiological, and related individual identifications (IDs). Sample IDs and locations were verified for existing collections by comparing extracted data from spreadsheets and project-specific databases with actual storage locations.
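The verification step described above can be pictured as a comparison between the locations recorded in a legacy export and those observed during a physical check. The following is a minimal sketch in Python, not SAMI's actual implementation: the `SampleRecord` fields, the location string format, and the `reconcile` function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleRecord:
    """Hypothetical record; field names are illustrative, not SAMI's schema."""
    sample_id: str
    location: str  # assumed format, e.g. "tank/rack/box/position"


def reconcile(recorded, observed):
    """Compare recorded locations with those found during a physical check.

    recorded: iterable of SampleRecord exported from a legacy database.
    observed: dict mapping sample_id -> location seen in storage.
    Returns three sorted lists: IDs whose location differs, IDs missing
    from storage, and IDs found in storage but absent from the export.
    """
    recorded_by_id = {r.sample_id: r.location for r in recorded}
    mismatched = sorted(
        sid for sid, loc in recorded_by_id.items()
        if sid in observed and observed[sid] != loc
    )
    missing = sorted(sid for sid in recorded_by_id if sid not in observed)
    unrecorded = sorted(sid for sid in observed if sid not in recorded_by_id)
    return mismatched, missing, unrecorded
```

Each of the three lists would call for a different follow-up in practice: correcting the record, searching for the sample, or registering a sample that was never entered.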
Security, quality, and monitoring
Access to the storage facilities is controlled and restricted to authorized staff members by badge readers at the entrances to the storage areas. QC measures are in place, including a second person checking the worklists and the samples prepared for extraction, aliquoting, and shipment, to ascertain that the correct samples are being pulled out of LN2. In addition, routine QC checks of DNA purity and stability are performed by agarose gel electrophoresis, both immediately after extraction and monthly on randomly selected samples during storage. The biobank participates in annual proficiency testing schemes that evaluate the performance of DNA extraction and quantification and has obtained excellent results (IBBL Biospecimen Proficiency Testing).
In addition, the LSB group conducted studies to evaluate the impact of preanalytical variations during blood and tissue processing and freezing on the quality of extracted DNA and RNA in the context of studies using standard protocols.17,18 The majority of samples used for those studies came from the EPIC collection. Although the results were not shared with sample custodians, as recruitment had been completed at the time of the analysis, they contributed to the modification and confirmation of the DNA extraction protocols, which had previously been introduced to increase the number of buffy coat aliquots used for DNA extraction from a single straw to two.
Establishing a governance structure
Adopting transparent principles for sharing samples and data for international collaborations is not only a requirement of a publicly funded biobank but also fulfils IARC's goal to promote collaborative cancer research worldwide. In this regard, a governance structure was put in place with the establishment of a BSC. The committee is chaired by a senior scientist; other members include representatives of the research groups, technical and administrative support, and the LSB Group. The committee's roles include ensuring that the IARC Biobank operates in line with international technical and ethical standards, reviewing access requests when necessary, and providing advice to management in terms of strategic development.
A formal Access Policy was adopted to facilitate a transparent process of access, attract potential research collaborations, and provide the opportunity for IARC to share its biological resources at the IARC Biobank website.
Procedures for sample and data access, and material transfer were strengthened with the introduction of policies and standard operating procedures to establish material (and data) transfer agreements in line with international guidelines (IARC, 5 NCI, 19 and ISBER 20 ) and are published on the IARC Biobank website. Approval by the IARC Ethics committee is sought before the inclusion of samples in research projects.
Results
The reorganization and centralization of the IARC Biobank offered the opportunity to review and upgrade the biobank facilities. It was also possible to prioritize the stored samples based on their value, strengthen the governance structure, and increase the visibility of the IARC Biobank in the international biobanking landscape. Altogether, the creation of the IARC Biobank website in 2011 and the formal Access Policy had a significant impact on the access rate. All requests sent to the IARC Biobank are centralized, referenced, and archived.
The sample reorganization involved the verification and manual checking either of all samples (for critical collections) or of 10% of randomly selected samples. When discrepancies occurred among the 10% random samples, a larger proportion of the samples was checked. The process of verification, inventory, and upload into SAMI for samples stored in LN2, in freezers, and at ambient temperature involved three biobank technicians, took a total of ∼13,500 hours per 1 million samples, and was conducted over a period of 4 years. The process created the opportunity to (1) rectify discrepancies between the physical location of samples and the record in the original databases, (2) rationalize and gain space in the storage devices by filling the gaps left by sample usage, and (3) correctly label all storage containers (racks and boxes) with appropriate, printed labels resistant to cryogenic temperatures.
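The two-tier checking rule described above (full verification for critical collections, a 10% random draw otherwise, with a larger proportion checked when discrepancies appear) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the procedure actually coded at IARC: the function names are hypothetical, and the escalated fraction of 0.50 is an assumed placeholder, since the text does not state how much larger the second check was.

```python
import random


def select_for_verification(sample_ids, critical=False, fraction=0.10, seed=None):
    """Pick the samples to check by hand.

    Critical collections are checked in full; otherwise a random subset
    of about `fraction` of the collection (at least one sample) is drawn.
    """
    ids = list(sample_ids)
    if critical or not ids:
        return ids
    k = min(len(ids), max(1, round(fraction * len(ids))))
    return random.Random(seed).sample(ids, k)


def next_fraction(discrepancies, current=0.10, escalated=0.50):
    """Return the fraction to check in the next round.

    `escalated` is an assumed value: the source says only that a larger
    proportion was checked when discrepancies were found.
    """
    return max(current, escalated) if discrepancies > 0 else current
```

For scale, the reported effort of ∼13,500 hours per 1 million samples works out to roughly 49 seconds per sample on average.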
The freezer temperature monitoring system was upgraded in 2012 to an integrated monitoring system equipped with independent and external sensors that provides real-time temperature records and reliable alarm reporting. Trained personnel respond to alarms in a timeframe that prevents or minimizes loss or damage to samples. A second layer of backup is provided by on-call duty roster staff members, available 24 hours a day, who are called in the event of an emergency requiring intervention. The air conditioning and electrical network were also upgraded to secure the freezer facility in case of a power cut. A diesel generator is available in case of a major power cut and is tested every year.
To standardize the information received with the new collections that are shipped to IARC from international sites, the Minimum Dataset (MDS) was developed. The MDS was used to capture the different preanalytical processing conditions that could affect downstream analysis of the samples (Table 3). The “Common Minimum Technical Standards and Protocols for Biobanks Dedicated to Cancer Research” (2017) provided recommendations and guidelines for the MDS. The recommended MDS is being adopted for new collections received at IARC, as many of the older collections do not have the required information related to preanalytical processing conditions. Samples were uploaded into SAMI with the available data, and no collection was discarded for noncompliance with the new MDS.
Table 3. Minimum and Recommended “Dataset” for IARC Biobank
Items are optional, but highly recommended.
ID, individual identification; MOU, Memorandum of Understanding; SAMI, Sample Management for IARC Biobank.
Discussion
Before the centralization, samples were stored by individual research groups and managed using a variety of IT systems specific to their projects and downstream applications. This meant that harmonization between databases would require extra steps to standardize datasets and the lack of a uniform biobank platform was a challenge for adherence to best practice principles and international guidelines.
Centralization created the opportunity to organize the samples and associated data, create a common platform and protocols, develop a catalogue of available resources, and encourage adherence to best practice principles and guidelines.
Three important benefits resulted from this work. First, the centralized storage facilities are maintained under the best possible conditions, where it is easy to monitor and maintain a stable and secure environment.
Second, the common MDS and documentation of the preanalytical processing information obtained for samples stored at IARC will facilitate harmonization and interoperability between sample collections and associated databases.
Third, incorporating the heterogeneous sample collections into a global catalogue will facilitate the search for information on IARC-based collections and maximize the utility and visibility of available resources. This is in line with IARC's goal to promote international cancer research collaboration (http://internet.iarc.fr/). During this exercise, equipment replacement, facility upgrades, and a recommended 10%–15% backup capacity were identified as urgent needs for the biobank's sustainability, owing to the high proportion of aging and obsolete equipment. As a result of this work, funds were allocated in 2015 for the acquisition of new cold storage units to support existing collections and cater for expansion with adequate provision for backup facilities.
Three challenges had to be dealt with during reorganization of the IARC Biobank. First, the sample management IT system had to provide the functions to annotate, archive, track, and manage the diverse collections, sample types, and associated data, and it had to be compatible with existing IT platforms. Due to the difficulty in finding a fit-for-purpose commercial system, an in-house system (SAMI) was developed. The IT tools (Oracle software, i.e., database and application server, and dedicated servers) were already available in-house; thus, the improvement and upgrading of SAMI are performed locally at minimal extra cost.
Second, the existing project-specific databases lacked standard items, which are available in the Minimum Information About BIobank data Sharing (MIABIS), a minimum dataset for sharing biobank samples, information, and data. 21 Thus, for data harmonization between different databases, the data in the existing databases had to be standardized before migration and/or importation into SAMI, and this was time-consuming.
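To illustrate why standardization before import is laborious, the sketch below maps differently named legacy columns onto a single set of common field names and flags records that lack required items. It is a hypothetical Python example: the alias table, the field names, and the choice of required fields are assumptions made for illustration and do not reproduce the MDS, MIABIS, or the SAMI schema.

```python
# Hypothetical mapping from legacy column names to common field names.
FIELD_ALIASES = {
    "sample_id": {"sample_id", "sampleid", "id_sample", "barcode"},
    "sex": {"sex", "gender"},
    "sampling_date": {"sampling_date", "date_collected", "collection_date"},
    "country": {"country", "origin"},
}

# Assumed required items for this sketch only, not the actual MDS.
REQUIRED_FIELDS = ("sample_id", "sampling_date")


def harmonize(record):
    """Rename legacy keys to common names and list missing required fields.

    Keys with no known alias and empty values are ignored in this sketch.
    """
    lookup = {
        alias: common
        for common, aliases in FIELD_ALIASES.items()
        for alias in aliases
    }
    harmonized = {}
    for key, value in record.items():
        common = lookup.get(key.strip().lower().replace(" ", "_"))
        if common is not None and value not in (None, ""):
            harmonized[common] = value.strip() if isinstance(value, str) else value
    missing = [f for f in REQUIRED_FIELDS if f not in harmonized]
    return harmonized, missing
```

A record that comes back with a non-empty `missing` list would need to be completed from the source documents, or imported with the gap recorded, before migration.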
The third challenge concerned the acquisition of complete information on preanalytical variables from the clinical or field sites where samples are collected. The biobank is distant from the sites and preprocessed biological materials are shipped to IARC either frozen or in preservative depending on the sample type and the project protocol. The full adoption of the MDS requirement poses a challenge and requires close collaboration between the biobank, the principal investigators, and recruitment centers to ensure compliance.
A key objective of the reorganization is to increase access to and usage of IARC's resources. In this regard, access requests for category 2 and 3 collections are managed centrally, which enables the biobank to monitor requests. This has resulted in an increase in response to requests for cell lines and the establishment of new collaborations. Specifically, in 5 years (since 2013), the IARC Biobank has provided samples for a total of seven requests for category 2 and 3 samples; six of them were for cell lines, corresponding to 1.4 requests/year. This compares with four requests from 2004 to 2012, that is, 0.44 requests/year.
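The request rates quoted above follow from dividing the number of requests by the number of calendar years in each period. The short Python sketch below reproduces that arithmetic; the function name is hypothetical, and the end year of 2017 for the second period is an assumption inferred from "5 years (since 2013)".

```python
def requests_per_year(n_requests, first_year, last_year):
    """Average yearly rate over an inclusive range of calendar years."""
    years = last_year - first_year + 1
    return n_requests / years


# Before centralized request handling: 4 requests over 2004-2012 (9 years).
rate_before = requests_per_year(4, 2004, 2012)

# After: 7 requests over 5 years, assumed here to be 2013-2017.
rate_after = requests_per_year(7, 2013, 2017)

# Ratio of the two rates, i.e. the relative increase in request frequency.
fold_change = rate_after / rate_before
```

Under these assumptions the rate rises from about 0.44 to 1.4 requests per year, roughly a threefold increase, although the absolute numbers remain small.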
Access to category 1 samples is sought through the respective steering committee, which maintains the record.
The IARC Biobank is set up to benefit from participation in global catalogues such as the one established by the BBMRI-ERIC, 6 of which IARC is a member (with observer status). It is anticipated that this collaboration will continue to increase the utility of IARC stored samples.
In conclusion, a centralized biobank platform was created to address the fragmentation of data and information for stored samples and produce harmonized data sets with the potential for interoperability with related platforms and databases. The creation of a centralized biobank platform, which provides core services to support IARC's research and international collaborative studies, would not have been possible without the collaboration of the key stakeholders: the principal investigators, sample custodians, sample managers, BSC, LSB, IT department, and support from IARC management.
Continuous upgrading of the procedures and protocols ensures that the biobank maintains a high level of quality and provides the opportunity for staff training and on-site training for international collaborators, and for students enrolled in a biobank masters course at the Catholic University, Lyon.
Footnotes
Acknowledgments
IARC Biobank Steering Committee: Dr. Rolando Herrero, Dr. Rosita Accardi-Gheit, Dr. Gary Clifford, Ms. Elisabeth Françon, Dr. Ausrele Kesminiene, Dr. James McKay, Dr. Sabina Rinaldi, Dr. Augustin Scalbert, Dr. Ghislaine Scélo, Dr. Eduardo Seleiro, and Dr. Jiri Zavadil.
The authors would like to thank Dr. Catherine Voegele for her expertise and assistance in the development of SAMI and Lucile Alteyrac for her expertise and daily support for the development and upgrading of SAMI.
Author Disclosure Statement
No conflicting financial interests exist.
References
