Abstract
Cancer biobanks, when located within a comprehensive cancer center, are characterized by management and organizational peculiarities mainly related to the multidisciplinary information available in such specialized centers and the continuous collection stream of quality-assessed biospecimens. The present study summarizes the main characteristics of comprehensive cancer center biobanks and, in more detail, the procedures adopted to maintain full control over the interlinked issues that occur at every level, from patient enrolment eligibility and consenting to dissemination and utilization of specimens and associated data. Dedicated personnel, appropriate storage facilities, as well as ethical, legal, and technical requirements are among the most relevant aspects strongly conditioning the quality of these structures. Because of their location and the need to be directly connected with clinical units, such as pathology, oncology, and surgery, ad hoc information technology tools are crucial to support all aspects of biorepository operations, including (but not limited to) patient enrolment and consent; biospecimen collection, processing, storage, and distribution; quality assurance and quality control; collection of patient data; validation documentation; and management reporting functions.
Introduction
Comprehensive cancer centers (CCCs) are scientific institutions recognized at the national level and are privileged sources of biospecimens and longitudinal clinical information. These institutions hold administrative and operative autonomy and have the peculiarity of pursuing a double task: besides the organization and management of highly specialized healthcare services for cancer patients, they pursue research purposes with standards of excellence in the clinical and translational medical fields. CCCs are also involved in establishing guidelines for other national health services and in providing them with technical and operational support in order to improve patient services. CCCs also have a predominant role in carrying out the research objectives of the national health plan and in providing high-level training to healthcare professionals.
Biobanks within these institutions are integrated service infrastructures that provide biospecimens and data critical for understanding the long-term benefits and/or side effects of drugs and various treatment options.
The mandate of a CCC's biobank is purely for research in the field of cancer; therefore, it is disease focused and specifically organized in an integrated facility with multiple sections that take advantage of surgical and clinical facilities within the same location for the procurement of biological specimens and data.
Biobank Workflow
Because of the nature of a cancer-oriented biobank, its policies are designed to gather consecutive collections, meaning that there is no specific aim at the time of collection. According to this banking model, samples of potential interest are collected and stored until needed. The advantage is that the biobank can immediately provide investigators with all the samples and data required. Samples must be “fit for use,” which also encourages the collection of specimens even when variables exceed recommended limits, with the prospect of matching the requirements of future applications. As a result of this organizational model, the peculiarity of CCC biobanks is the continuous stream of collection of samples and related information, unlike individual research projects, in which samples and annotation are collected only from selected candidates with specific requirements. This means that a CCC biobank has to be considered a fully operational unit, characterized by continuous activity and the involvement of specific and well-trained staff, and not a subsidiary service, as it is in certain cases.
Figure 1 describes the ideal scheme of the complex workflow of an institutional biobank located within a CCC. Complexity arises from the fact that all stages are interlinked and require high-level organization in order to comply with required quality standards.

Institutional Biobank Giovanni Paolo II Comprehensive Cancer Center.
Specific procedures must be addressed in each stage. Figure 2 shows a flow chart of the Bari National Cancer Biobank as a progression diagram.
• Collecting patient consent for the collection and utilization of residual surgical tissues and personal and clinical data;
• Surgery activity monitoring to minimize latency between excision and processing or storage;
• Sample handling must avoid contamination and stress-induced biochemical variations;
• Storage conditions should be suitable to sample type and presumed utilization;
• Distribution of samples and data in compliance with ethical and international regulations, under supervision of a local ethical committee.

Process diagram of an institutional biobank.

Although the objective is always to reach a high-quality organizational level, a CCC biobank, as an integrated not-for-profit service provider, has to face all the technical, financial, legal, and ethical issues involved in biobanking and may, in certain cases, require “adjustments” to minimal standards that are “sufficient” for carrying out its activities. A general picture of the life of these biobanks emerged from the results of a survey conducted in 2007 among most of the existing hospital-based cancer biobanks for research cooperating with the Italian Network of Cancer Biobanks (RIBBO). This network was supported in 2007 by the Italian Ministry of Health and the Action Against Cancer, a project coordinated by Angelo Paradiso of the National Cancer Institute of Bari and Giovanni Migliaccio of the Istituto Superiore di Sanità. The main goal of the project was to develop a national network plan for the integration of existing cancer biobanks in Italy. In its initial phase, it provided an inventory of existing collections suitable for distribution among external researchers. The current phase is focusing on the development of a subsequent program that will define the final structure and management model for biobanks that meet the required standards, as well as the identification of funding sources.
All units taking part in the project submitted a questionnaire containing information about their organizational models, type of samples collected, collection, storage and processing procedures, safety procedures, as well as legal and ethical policies and information technologies (ITs). Evaluation of the questionnaires revealed substantial operational differences among the units and the need to establish common management rules and standardized operating procedures (SOPs) and to define ethical and regulatory policies. In fact, between 68% and 84% of SOPs were applied, depending on the type of SOP.
Consenting procedures differed substantially among the institutions, and the lack of a specific and internationally approved consent form stressed the need to develop consensus on shared procedures and documentation.
Clinical data annotation differed substantially according to the type of biobank, suggesting the usefulness of defining a minimum dataset that would allow homogeneous and optimal classification.
All biobanks performed sample quality control, but some of them did so only during collection and utilization (Fig. 3), suggesting the need to define shared procedures able to establish sample quality and storage conditions over time.

Sample quality controls.
Each biobank had developed its own biobank database software according to specific needs, resulting in issues concerning data sharing between them. Therefore, there is a need to develop a common web-based platform, managed by a data manager, capable of performing conversion and integration of different data formats.
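The conversion step mentioned above can be illustrated with a minimal sketch. It assumes two hypothetical biobanks that export the same information under different local field names; the names `biobank_a`, `biobank_b`, and all field names are invented for illustration and do not come from the RIBBO survey.

```python
# Hypothetical mapping from each biobank's local field names to common names.
FIELD_MAPS = {
    "biobank_a": {"id_campione": "sample_code", "organo": "organ", "diagnosi": "diagnosis"},
    "biobank_b": {"SampleID": "sample_code", "Site": "organ", "Dx": "diagnosis"},
}


def to_common(record: dict, source: str) -> dict:
    """Rename the known local fields of one record; unknown fields are dropped."""
    mapping = FIELD_MAPS[source]
    return {common: record[local] for local, common in mapping.items() if local in record}


# Records exported by two different local databases.
record_a = {"id_campione": "A-17", "organo": "colon", "diagnosi": "adenocarcinoma"}
record_b = {"SampleID": "B-03", "Site": "breast", "Dx": "ductal carcinoma"}

merged = [to_common(record_a, "biobank_a"), to_common(record_b, "biobank_b")]
for row in merged:
    print(row)
```

A real platform would also have to reconcile coding systems and units, not only field names.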
Overall, the survey showed the relevance of some specific issues in running a CCC biobank. These issues are addressed in the following discussion.
Legal and Ethical Issues
Biological samples cannot be collected and stored in a biobank unless the sample donor is informed of the reasons and objectives for which the samples may be used. The donor then may or may not provide consent. In common practice, a procedure that provides the sample donor with fairly basic information on the nature of a biobank and the regulations it follows should be sufficient for the donor to provide or refuse consent for storage and use of his or her biological material for certain generally described objectives, for instance, medical treatment and/or research. Rules governing practices within the healthcare system are produced by country councils and other responsible authorities within the healthcare system.
Consent can be withdrawn entirely or partially at any time; when the withdrawal covers every type of use, the sample and/or data must be destroyed or all identification labels removed. In Italy, the Privacy Guarantor, who is responsible for supervising the application of the legislation on personal data (law 675/96 and subsequent modifications 9/5/97 and 28/7/97), has authorized data processing only for scientific research purposes. The results obtained may be disseminated only in an anonymous and/or aggregated form.
Therefore, a consent form should be developed with characteristics that present potential human donors with sufficient information—including anticipated procedures, risks, and benefits—to make an independent decision to participate in research studies. Aspects that should be clearly addressed are those concerning donor risks and benefits. As these represent a major concern for the donor, it is essential to explain that there are no medical risks for the donor. However, although every effort is made to protect identity, there is a small risk of loss of privacy. Further, it should be clearly stated that no direct medical or personal benefit will result from tissue donation and that researchers hope to learn from research conducted using their biosamples and be able to help others in the future. The regulatory model of informed consent should be based on the type of research in which 1 investigator at 1 institution conducts a study with a limited number of patients.
Obtaining an appropriate informed consent for the collection and storage of biological specimens and data for future use, on the other hand, represents a major challenge, as there is no specific aim at the time of collection and this may affect donor confidence in granting consent. It appears difficult to completely inform potential donors about anticipated risks and benefits.
The informed consent should not be imposed rigidly and the donor must receive proper information regarding the following points:
• Presumptive scientific purposes
• Achievable results
• The possibility to restrict access to their data and samples
• The right to request and obtain access to all data resulting from the use of their samples
• The possibility that the data and the biological samples can be transferred and used in other research projects and in other facilities
• How data and biological samples will be stored
Moreover, to ensure individual rights and confidentiality of data, the informed consent should contain explicit information regarding the following points:
• The level of anonymity of samples and personal data that will be guaranteed;
• Compliance with adequate procedures to allow the identifiability of the donor exclusively during sample collection time;
• That consent is given freely and is revocable at any time;
• That the sampling procedure for scientific purposes will never interfere with proper clinical treatment, and that refusing consent will never change the attitude of doctors toward the patient;
• That in the event that consent is withdrawn, specimens will be destroyed and any distributed samples may be returned to the biorepository. Nevertheless, a processed sample and the research data generated from it cannot be rescinded;
• That in the case of violation of privacy, with consequent damage to the patient, the institution will provide adequate compensation;
• That the biospecimens, following authorization by the ethical committee, will be stored and used for scientific purposes only and never for commercial purposes;
• That donors will never be charged and will not receive compensation for any biobanking procedures.
It is worth noting that Art. 13 of the directive 2004/23/CE, concerning the definition of quality and safety rules, indicates that the collection of human tissues and cells can be performed only with the authorization of the ethical committee of the collecting institution and only after the donor has signed the informed consent forms.
Standard Operating Procedures
The absence of a uniform tissue collection procedure and standardized protocols for specimen handling may certainly limit the usefulness of many specimens. SOPs are an essential tool for minimizing preanalytical handling variability that can affect biospecimen quality.
Tissue procurement and storage: technical aspects
Technical issues are the bottom line for establishing a biobanking facility that can deliver high-quality and well-annotated biospecimens. The pathway of the institutional biobank, as described in the flowchart in Fig. 1, can involve a very large number of activities that need to be performed according to standardized procedures. Standardized procedures not only guarantee biospecimen quality but also allow samples collected at different sites to be used in the same study and results from different studies to be compared. As a matter of fact, one of the problems encountered in biospecimen research has been the difficulty in comparing studies because of their different experimental designs and specimen handling procedures, so a standard protocol is necessary to address this problem.
Several international organizations 7–10 have provided written guidelines for biorepositories, allowing biobanks to develop harmonized operating procedures that comply with good practice and regulations.
Topics addressed by our institutional biobank in the process of reallocation in the new complex are discussed in the following text.
Preanalytical variables
Specimen collection and handling practices prior to inclusion in downstream testing may cause sample degradation due to preanalytical variables. Heterogeneity and variability of preanalytical practices are a major source of error in analyzing biobanked specimens. 6
Variables must be carefully monitored and recorded in order to maintain adequate quality standards. Keeping records of preanalytical variables is a gold standard that can allow proper utilization of biospecimens according to study requirements.
Unfortunately, not all preanalytical variables can be controlled, mainly because they are out of reach of biobank personnel. In other words, all procedures that are carried out prior to delivery of specimens to qualified biobank personnel are liable to improper handling and can, therefore, compromise sample quality. The medical status of specimen donors can also cause variations in expected results, but these can only be recorded and can represent selection criteria for inclusion in a study.
Another preanalytical variable that cannot be controlled is, for instance, warm ischemia time (from blood vessel ligation to surgical excision time), because it depends on the surgical procedure, which takes absolute priority and sometimes delays delivery of samples to the biobank, causing stress-induced biochemical changes. 11
The cold ischemia time, in contrast, can be controlled. Its acceptable limit is still a matter of debate, although recent recommendations suggest that it should be kept to a minimum and not exceed 1 h. 12 In any case, biospecimens collected from surgery should be placed in sterile containers and kept on wet ice during transport to the pathology department and until preserved for long-term storage. The time the tissue was removed from the donor and the elapsed time before the tissue was preserved for long-term storage are recorded and entered into the biorepository database. 13
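As an illustration of how the two recorded times can be turned into a cold ischemia value, the following minimal sketch computes the elapsed time and flags samples beyond the 1 h recommendation cited above. The function names and the example timestamps are assumptions made for this sketch, not part of any biobank system described here.

```python
from datetime import datetime, timedelta

# Limit taken from the 1 h recommendation cited in the text.
COLD_ISCHEMIA_LIMIT = timedelta(hours=1)


def cold_ischemia_time(excised_at: datetime, preserved_at: datetime) -> timedelta:
    """Elapsed time between removal from the donor and long-term preservation."""
    if preserved_at < excised_at:
        raise ValueError("preservation time precedes excision time")
    return preserved_at - excised_at


def exceeds_limit(elapsed: timedelta, limit: timedelta = COLD_ISCHEMIA_LIMIT) -> bool:
    """True when the sample should be flagged; flagged samples are still stored."""
    return elapsed > limit


# Hypothetical timestamps for one specimen.
excised = datetime(2010, 3, 15, 10, 5)
preserved = datetime(2010, 3, 15, 10, 47)
elapsed = cold_ischemia_time(excised, preserved)
print(elapsed, exceeds_limit(elapsed))
```

Flagging instead of rejecting is consistent with the consecutive collection model, in which specimens exceeding recommended limits may still be fit for some future use.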
Pathology
A crucial role is played by pathologists, who are responsible for evaluation of sample quality and suitability for biobanking purposes. The pathologist ensures that biospecimens for banking (collected for storage in and distribution by the repository) are obtained only after all patient diagnostic needs have been met and subject to appropriate bioethical structures and procedures to ensure patient protection. 14
Cellularity can be checked by preparing H/E slides of specimens. Specimens should also be trimmed to ensure that specimens (normal appearing or tumor) are >75% pure, thereby minimizing the diluting effect attributable to tissue heterogeneity. One hundred to 500 mg of solid tissue samples should be divided into aliquots and stored, to avoid repeated thawing and refreezing for research.
Storage
Solid tissue specimens can be preserved in 4 ways:
1. Flash frozen immediately with liquid nitrogen
2. Embedded in optimal cutting temperature medium and frozen on dry ice
3. Formalin-fixed and embedded in paraffin (FFPE)
4. RNAlater
The various issues concerning each of the above preservation methods have been widely discussed in literature.
Flash freezing can be considered as the first choice, but FFPE is the choice when sample quantities are limited.
It is preferable to place specimens in cryotubes prior to flash freezing to avoid contamination.
For long-term storage of specimens, −80°C mechanical freezers are recommended; liquid nitrogen isothermal freezers would be more appropriate but involve higher running costs. 15
Several issues have been reported with RNAlater: greater mechanical degradation occurs during cell disruption; enzyme activity is completely inhibited, precluding any trypsin treatment; and RNAlater is unsuitable for comparative proteomics analyses. 16
Biospecimen identification
A unique local identifier should be assigned to each aliquot. Depending on the nature of the donation process and the terms of utilization consented to by the donor, this can be done according to procedures provided by the International Conference on Harmonization of Technical Requirements (ICH, June 1996). These procedures contemplate 4 levels of sample identification:
(1) “Identified” data and samples are labeled with personal identifiers such as name or identification numbers.
(2) “Coded” data and samples are labeled with at least 1 specific code and do not carry any personal identifiers.
(3) “Anonymized” data and samples are initially single or double coded but the link between the subjects' identifiers and the unique code(s) is subsequently deleted. Once the link has been deleted it is no longer possible to trace the data and samples back to individual subjects through the coding key(s).
(4) “Anonymous” data and samples are never labeled with personal identifiers when originally collected, neither is a coding key generated. Therefore, there is no potential to trace back genomic data and samples to individual subjects.
Because of the need to update the biobank database with patient follow-up information, we apply the “coded” procedure in our biobank. A “connection” code is generated to link samples to patient's personal and clinical information and access to patient information is restricted to authorized personnel.
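The difference between the “coded” and “anonymized” levels can be made concrete with a minimal sketch. The class `CodingKey`, its methods, and the example patient identifier are hypothetical; an in-memory dictionary stands in for what would be a separate, access-restricted store, and this is not a description of the software used in our biobank.

```python
import secrets


class CodingKey:
    """Holds the link between personal identifiers and connection codes."""

    def __init__(self):
        self._code_by_patient = {}
        self._patient_by_code = {}

    def code_for(self, patient_id):
        """Return the existing connection code or create a new random one."""
        if patient_id not in self._code_by_patient:
            code = secrets.token_hex(8)
            while code in self._patient_by_code:  # guard against a collision
                code = secrets.token_hex(8)
            self._code_by_patient[patient_id] = code
            self._patient_by_code[code] = patient_id
        return self._code_by_patient[patient_id]

    def resolve(self, code, authorized):
        """Trace a code back to the patient; only for authorized personnel."""
        if not authorized:
            raise PermissionError("access to patient identity is restricted")
        return self._patient_by_code[code]

    def anonymize(self, code):
        """Delete the link, turning 'coded' material into 'anonymized'."""
        patient_id = self._patient_by_code.pop(code)
        del self._code_by_patient[patient_id]


key = CodingKey()
code = key.code_for("hospital-record-0001")  # hypothetical identifier
aliquot_label = f"{code}-A1"  # the label carries the code, never the name
print(aliquot_label)
```

As long as the key exists, follow-up information can be attached to the same code; once `anonymize` has been called, that is no longer possible.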
Annotation
Prior to the collection or removal of biospecimens, a plan should be in place to allow for the appropriate annotation of the biospecimens. This annotation should include information about the donor and the timing of collection and processing activities. 10 The data should be maintained in a database that can be linked to the specimen at all times.
A problem encountered within the Italian Network of Cancer Biobanks (RIBBO), of which the Institutional Biobank of the Giovanni Paolo II NCC is coordinator, was the amount and type of information to be annotated. Each of the 20 participating institutions had its own dataset, which did not match those of the others.
Minimum dataset
As for the kind of data collected, obviously this will depend on the characteristics of the single biobank and of the single project associated with a given sample collection. Nevertheless, the definition of a communal minimum dataset is fundamental for sample and data sharing. In Italy, the RIBBO network of biobanks (the network of biobanks of Alleanza Contro il Cancro) has developed, after a long discussion, a common minimum dataset for cancer biobanks.
The dataset consists of a number of variables, some of which are compulsory and some not. The dataset is organized with variables (attributes) associated with the entity “donor” and others associated with the entity “sample.” Donor variables include information about age, sex, vital status, and the donor local code. Sample variables include a section on general sample information (date of collection, local code, organ description, cold ischemia time, etc.) and specific sample data (diagnosis, topography, TNM, grade, etc.). Conservation information and quantity are also requested. Moreover, each biobank has to flag its samples, indicating the type of usage allowed (no restriction, restricted to certain types of projects, depending on a collaboration agreement). One important feature of this minimum dataset is that it does not require detailed demographic, clinical, or epidemiological information, or the genotypic and/or phenotypic characterization of the samples; the biobank only has to indicate, for every sample, whether such information is available. In this way, the scientist querying the communal database will be able to identify samples of interest, and the associated information can be retrieved on a case-by-case basis.
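A minimum dataset of this kind can be sketched as two record types plus a check for compulsory variables. The field names follow the variables listed above, but the choice of which fields are compulsory and the usage labels are assumptions made for this sketch, since the RIBBO specification is not reproduced here.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Donor:
    local_code: str
    age: Optional[int] = None
    sex: Optional[str] = None
    vital_status: Optional[str] = None


@dataclass
class Sample:
    local_code: str
    donor_code: str
    collection_date: Optional[str] = None
    organ: Optional[str] = None
    cold_ischemia_minutes: Optional[int] = None
    diagnosis: Optional[str] = None
    tnm: Optional[str] = None
    grade: Optional[str] = None
    usage: str = "no restriction"
    follow_up_available: bool = False  # availability flag, not the data itself


# Assumption: the text does not list which variables are compulsory.
COMPULSORY = {"local_code", "donor_code", "collection_date", "organ", "diagnosis"}


def missing_compulsory(record) -> list:
    """Names of compulsory fields that are unset on a Donor or Sample."""
    return sorted(
        f.name
        for f in fields(record)
        if f.name in COMPULSORY and getattr(record, f.name) is None
    )


sample = Sample(local_code="S-001", donor_code="D-001", organ="breast")
print(missing_compulsory(sample))  # ['collection_date', 'diagnosis']
```

The `follow_up_available` flag mirrors the idea that the communal database records only whether richer information exists, leaving the information itself with the local biobank.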
A biobank is defined as a collection of biological samples and of their associated demographic, clinical, and biomolecular data. It has been suggested that the scientific value of a collection correlates with the relative availability of data associated with a given sample: samples with only “basic” information, such as diagnosis or stage, will be more abundant but less “valuable” than samples with more complex data, such as treatment outcomes or long-term follow-up. 12 Although collecting detailed information increases the value of a collection, international cooperation among biobanks and research institutions increasingly requires the adoption of compatible IT solutions for data management and the definition of a common minimum dataset and of a common coding system for topography and diagnosis. Although international, universally used coding systems such as ICD-10 and SNOMED are available for the latter, the minimum dataset has to be established on a case-by-case basis for different biobank networks.
IT Solutions
Collecting and managing clinical data
Generally, IT has been defined as the study, design, development, implementation, support, and/or management of any computer-based information system. IT deals with using dedicated software to convert, store, protect, process, securely retrieve, or transmit any information.
Extensive annotation of tissue specimens is crucial to the overall usefulness of a biorepository as a resource for scientific research. In particular, the data recorded by tumor biobanks, which are utilized for a variety of purposes, including target treatment discovery and validation, prevention research, research on early detection, and genetic and epidemiologic analyses, will depend on the types of collected samples and the studies they support. Therefore, IT systems, which are becoming increasingly critical to the research purposes of tumor biobanks, must be robust and reliable and able to meet changing needs while remaining interoperable. In this context, they should cover 3 main fields:
(1) Data management
(2) Inventory control
(3) Tracking
An IT system should support all aspects of biorepository operations, including (but not limited to) patient enrolment and consent; biospecimen collection, processing, storage, and distribution; quality assurance and quality control; collection of patient data; validation documentation; and management reporting functions.
The IT system should also manage specimen clinical annotations and, where possible, provide patient follow-up information in compliance with ethical considerations and appropriate regulations.
The IT system should also be interoperable with different endpoint data (eg, proteomics and genomics) to ensure that integration of data from multiple sources is possible.
General informatics guidelines
(1) A unique identifier should be assigned to each sample (number and/or barcode) at the time of collection.
(2) The same identifier should be assigned to specific clinical and epidemiological data.
(3) The same number or code should be used to track a tissue specimen from the biobank through processing, storage and distribution.
(4) The database should be updated each time the specimen is moved within or out of the biorepository.
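The four guidelines above can be sketched as a single record that keeps one identifier for the whole life of a specimen and logs every movement. The class `Specimen`, the identifier format, and the location and operator names are hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Specimen:
    identifier: str  # assigned once at collection and never changed
    location: str
    history: list = field(default_factory=list)

    def move(self, new_location: str, operator: str) -> None:
        """Record every movement within or out of the biorepository."""
        self.history.append(
            {
                "at": datetime.now(timezone.utc).isoformat(),
                "from": self.location,
                "to": new_location,
                "operator": operator,
            }
        )
        self.location = new_location


specimen = Specimen(identifier="BB-000123", location="pathology bench")
specimen.move("freezer 2, rack 4", operator="tech01")
specimen.move("shipped to researcher", operator="tech02")
print(specimen.location, len(specimen.history))
```

Because clinical and epidemiological records would carry the same identifier, a query on `BB-000123` can retrieve both the annotation and the full chain of custody.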
Functionality of IT
IT should be based on the specific needs and purposes of the biobank. SOPs for the activities carried out in a biorepository should largely drive the design of the informatics systems. Moreover, IT should focus on inventory functions, tracking all phases of sample acquisition, processing, handling, quality control, and distribution from collection site (patient) to utilization (researcher). Tracking should be also provided in case of restocking of returned, unused samples from the researcher.
IT must be able to link the information to the specimen (paper labels/bar codes) and should be able to track clinical data associated to a specimen. To this end, a minimal clinical dataset should be collected for all stored samples.
Because of the difficulty and time needed to transfer surgical pathology reports into the biobank database, a helpful tool can be software that is able to extract structured information contained in pathology reports and send it to the biobank database. Obviously, the performance of such tools should be routinely monitored in order to validate their accuracy. After initial development, IT should be periodically revised.
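What such an extraction tool does can be shown, in a deliberately simplified form, with regular expressions. The patterns and the example report below are assumptions made for this sketch; real pathology reports are far more variable, which is exactly why routine accuracy monitoring is needed.

```python
import re

# Simplified patterns: a TNM triplet such as "pT2 N1 M0" and a grade such as "G2".
TNM_PATTERN = re.compile(
    r"\b(p?T(?:[0-4]|is|[xX]))\s*(N(?:[0-3]|[xX]))\s*(M(?:[01]|[xX]))\b"
)
GRADE_PATTERN = re.compile(r"\bG([1-4])\b")


def extract_fields(report: str) -> dict:
    """Return the TNM string and grade found in free text, or None for each."""
    result = {"tnm": None, "grade": None}
    tnm = TNM_PATTERN.search(report)
    if tnm:
        result["tnm"] = "".join(tnm.groups())
    grade = GRADE_PATTERN.search(report)
    if grade:
        result["grade"] = "G" + grade.group(1)
    return result


# Hypothetical report text.
report = "Invasive ductal carcinoma, G2. Staging: pT2 N1 M0."
print(extract_fields(report))
```

Fields that cannot be found are returned as `None`, so that missing values are reviewed by staff and never silently guessed.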
Access to the database should have restriction levels in order to minimize the risk of improper use of personal data.
Assessing biobank IT
Existing biobanks should be evaluated on the basis of their respective levels of informatics capabilities, including the use of common data elements, access to data through standard queries, data accuracy, and adherence to other stated guidelines. The IT systems should provide reporting capabilities that allow biobank managers to monitor status in terms of the scientific best practices. The system should be able to provide information in order to guarantee specimen quality.
An efficient IT system should also be able to provide full system statistics and audit logs regarding all accesses in order to protect health information in the database.
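Restriction levels and audit logging can be combined in a few lines, as in the minimal sketch below. The role names, resource names, and numeric levels are hypothetical; the only point illustrated is that every access attempt, allowed or denied, leaves a log entry.

```python
from datetime import datetime, timezone

# Hypothetical access levels; a higher number grants access to more resources.
LEVELS = {"researcher": 1, "biobank_staff": 2, "data_manager": 3}
REQUIRED = {"sample_data": 1, "clinical_data": 2, "donor_identity": 3}

audit_log = []


def access(user: str, role: str, resource: str) -> bool:
    """Decide whether access is allowed and log the attempt either way."""
    allowed = LEVELS[role] >= REQUIRED[resource]
    audit_log.append(
        {
            "at": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "resource": resource,
            "allowed": allowed,
        }
    )
    return allowed


access("rossi", "researcher", "donor_identity")
access("bianchi", "data_manager", "donor_identity")
denied = [entry["user"] for entry in audit_log if not entry["allowed"]]
print(denied)  # ['rossi']
```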
Commercial software for biobank management [Biobank Information Management Systems (BIMS)] has mostly been developed de novo, but in several cases previously developed Laboratory Information Management Systems have been adopted, and several biobanks have developed their own management systems internally to better respond to their local management and organizational requirements.
In the absence of legislation that defines correct criteria for biobank accreditation and functioning, it is hard to define minimum requirements for IT systems: in Italy, the references have so far been 2 acts of “soft law”: the authorization for genetic data handling of the Authority for Protection of Personal Data and the guidelines of the National Committee for Biosecurity, Biotechnology, and Life Sciences. In particular, the authorization for genetic data handling requires donor identity to be kept separate from clinical and molecular data. From the technical point of view, this can be done either by using 2 separate databases, one for subject IDs and one for clinical and molecular data, linked by a code, or by using specific software that automatically records the donor ID and the clinical/molecular data in 2 separate tables/databases. Moreover, although they do not represent specific requirements for BIMS, many products have been developed to be compliant with the FDA 21 CFR Part 11 regulation (the rules that allow the use of electronic records and electronic signatures for any record that is required to be kept and maintained by other FDA regulations) and with Good Automated Manufacturing Practice 5 (GAMP5).
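The first of the two technical options, 2 separate databases linked by a code, can be sketched with the Python standard library. The table names, column names, link code, and example values are hypothetical, and in-memory databases stand in for what would be 2 physically separate, differently protected stores.

```python
import sqlite3

# Two independent databases: identities on one side, research data on the other.
identity_db = sqlite3.connect(":memory:")
research_db = sqlite3.connect(":memory:")

identity_db.execute(
    "CREATE TABLE donor_identity (link_code TEXT PRIMARY KEY, full_name TEXT NOT NULL)"
)
research_db.execute(
    "CREATE TABLE clinical_data (link_code TEXT NOT NULL, diagnosis TEXT, grade TEXT)"
)

# The link code is the only value stored in both databases.
identity_db.execute(
    "INSERT INTO donor_identity VALUES (?, ?)", ("a1b2c3", "Jane Doe")
)
research_db.execute(
    "INSERT INTO clinical_data VALUES (?, ?, ?)", ("a1b2c3", "ductal carcinoma", "G2")
)

# A researcher querying the research database never sees a name.
rows = research_db.execute(
    "SELECT link_code, diagnosis, grade FROM clinical_data"
).fetchall()
print(rows)
```

Re-identification requires a second query against `identity_db`, which can be restricted to authorized personnel.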
On a technical level, an IT system for the management of a biobank should consist of 2 layers: 1 layer for the management of the biological samples capable of recording information such as storage, the cold ischemia time, as well as sample processing, and a second layer for the management of the clinical, demographic, epidemiological, and biomolecular data associated to a sample.
A useful way of designing a biobank database is by applying an entity/relationship (E/R) model 17; in simpler words, entities are elements that are recognized as being capable of an independent existence and that can be uniquely identified (in our case, entities normally are “donor,” “biological sample,” and “aliquot”). Relationships are the description of how 2 entities are related to one another (in our case, the entity “patient” “owns” the entity “sample,” which in turn “owns” the entity “aliquot”). This E/R model can be easily described with a diagram as shown in Fig. 4.

Entity/relationship model.
Each entity will have a number of attributes that represent the associated data. For example, the entity “patient” will have attributes such as birth date, ethnic origin of the donor, family history of a certain disease, and so on. In the same way, the attributes of the entity “sample” will be the sample type (solid tissue, blood, urine, serum, and so on), the organ from which the sample was obtained, the lag time between resection and freezing, etc., whereas the attributes of the “aliquot” entity will be the storage location, specimen availability flag, or the study in which the aliquot has been used.
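The E/R model described above translates directly into 3 relational tables, with the “owns” relationships expressed as foreign keys. The sketch below uses SQLite from the Python standard library; the table and column names are assumptions chosen to mirror the attributes listed in the text and do not reproduce the schema shown in Fig. 4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript(
    """
    CREATE TABLE donor (
        donor_code TEXT PRIMARY KEY,
        birth_date TEXT
    );
    CREATE TABLE sample (
        sample_code TEXT PRIMARY KEY,
        donor_code TEXT NOT NULL REFERENCES donor(donor_code),
        sample_type TEXT,
        organ TEXT,
        minutes_to_freezing INTEGER
    );
    CREATE TABLE aliquot (
        aliquot_code TEXT PRIMARY KEY,
        sample_code TEXT NOT NULL REFERENCES sample(sample_code),
        storage_location TEXT,
        available INTEGER NOT NULL DEFAULT 1
    );
    """
)

conn.execute("INSERT INTO donor VALUES ('D-001', '1950-01-01')")
conn.execute("INSERT INTO sample VALUES ('S-001', 'D-001', 'solid tissue', 'colon', 25)")
conn.executemany(
    "INSERT INTO aliquot (aliquot_code, sample_code, storage_location) VALUES (?, ?, ?)",
    [("S-001-A1", "S-001", "freezer 2, rack 4"), ("S-001-A2", "S-001", "freezer 2, rack 4")],
)

# Follow the relationships donor -> sample -> aliquot.
available = conn.execute(
    """
    SELECT a.aliquot_code
    FROM aliquot AS a
    JOIN sample AS s ON s.sample_code = a.sample_code
    JOIN donor AS d ON d.donor_code = s.donor_code
    WHERE d.donor_code = ? AND a.available = 1
    ORDER BY a.aliquot_code
    """,
    ("D-001",),
).fetchall()
print(available)
```

With foreign keys enabled, an aliquot cannot be registered for a sample that does not exist, which enforces the “owns” relationship at the database level.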
Networking
Research studies require not only high-quality samples but also adequate numbers. The rarer the disease, or specific subtype of interest, the more difficult this is to provide. As a CCC, the number of samples is restricted not only by the number of cancer cases seen per year, but also by factors such as the diminishing size of cancers as a result of successful early detection through screening programs. Often, cancers are too small to permit sampling for research. To increase the sample numbers available for research studies, tumor banks need to collaborate and pool their collections. Building the capacity of biological resources is one of the main goals of national and international networks along with exchange of data and material at the international level. Again this means that SOPs are mandatory in order to obtain comparable results when using samples delivered from multiple collection sites.
Data associated with research samples and patients from whom they are donated must be structured, comparable, and machine readable. Therefore, one of the key issues in national and international biobanking organizations is to define and apply standardized annotation and management of clinical data. Data integration is important on multiple levels. As the situation exists today, data integration tools and appropriate methodologies are important to maximize scientific value from the multiple datasets at biobanks. As discussed earlier in this article, biobanks should agree on a minimum dataset to be interchangeable between biobanks and identify the complete ontology and multilingual definition for this dataset.
Conclusions
A CCC biobank, as described earlier, is an extremely complex structure that can present issues at every level of its activities. These issues must be addressed at the outset of the biobank, but in the case of biobanks deriving from the evolution of a repository collection, “adjustments” are necessary in order to provide the biobank with the correct instruments for conducting activities with high-quality standards.
Unlike project-based collections, in which donors are identified and located in the effort to gather follow-up information, the consecutive collection biobank model of CCCs has an added value because of the opportunity to collect high-quality cancer biospecimens with extended clinical annotation including follow-up information from patients who undergo surgery and treatment within the same hospital.
The advantage of being able to monitor and control every process during collection, storage, and distribution of samples and data represents a high-level quality control tool that can rely on procedures carried out in the same facility.
Sample distribution and utilization are guaranteed owing to supervision of the ethical and the scientific committees within the biobank institution.
List of RIBBO Biobanks and Coauthors of the Article
P. Roazzi ISS Roma, Italy; G. Botti Fondazione G. Pascale, Napoli, Italy; A. Steffan CRO Aviano, Italy; M. Truini IST Genova, Italy; S. Pece IEO Milano, Italy; M. Rugge IOV Padova, Italy; T. Faraggiana IDI Roma, Italy; M. Alberghini IOR Bologna, Italy; A. Albini Multimedica Milano, Italy; G. Finocchiaro Neurologico Besta Milano, Italy.
Footnotes
Author Disclosure Statement
No competing financial interests exist.
