Abstract
The overall goal of the Qatar Biobank (QBB) is to collect, manage, and distribute high-quality human biospecimens with appropriate clinical and/or research annotation and associated phenotypic data, aiming to be an essential resource for medical research and evidence-based health care system policies in Qatar. To collect and manage large volumes of data, the QBB has been investing in a number of information management solutions, trying to avoid the inflexibility of traditional systems and to accommodate changes in data sources and workflows. This article presents the information technology solutions of QBB, which are based on a free, open-source software solution considered a reliable alternative to commercial solutions. After evaluating the free, open-source software solutions available for biobanks, QBB selected Onyx from OBiBa and used it to develop custom components to interface with various clinical devices, the Laboratory Information Management System (LIMS), and the Picture Archiving and Communication System (PACS), which have varying integration capabilities. This serves as a showcase for biobanks on how to carefully evaluate and select hardware and software to automate their operations and provide the functions required for business continuity.
Introduction
Since 2012, the Qatar Biobank (QBB) has collected biological samples and health data from adult (≥18 years of age) men and women who are Qatari or long-term residents (≥15 years living in Qatar), and follows up on these participants every 5 years.1,2 At present, QBB has enrolled more than 25,000 participants and is aiming to reach a total of 60,000 participants. QBB acts as Qatar's national research platform, providing the scientific community with biological samples and annotated data (i.e., phenotypic, clinical, and multiomic data) and encouraging collaborative practices that hold the potential to produce novel health care interventions and improve our knowledge of the development of human diseases.2
In addition to the data and sample repository, QBB supports researchers by providing facility services (i.e., examination rooms), analysis services (i.e., DNA and RNA extraction and Complete Blood Count (CBC) analysis), storage services (i.e., −80°C, liquid nitrogen), and information technology (IT) services (i.e., customized solutions) according to the needs of different research initiatives.2
Modern biobanking has become an essential part of biomedical research.3,4 The mission of QBB is to manage and distribute high-quality biospecimens with appropriate clinical and/or research annotation and the associated phenotypic data to its stakeholders. Moreover, QBB is linked with Qatar's national health care system to exchange information, either by receiving medical data or by referring participants to the Hamad General Hospital. In this context, efficient, robust, and scalable computing or informatics systems should be in place to support the various key functionalities of QBB.
At the time QBB was established, there were limited available IT solutions for biobanking. Acknowledging the importance of the IT department, the QBB invested in it by hiring many new staff. The QBB IT team then developed a dynamic information infrastructure consisting of heterogeneous solutions using open-source, in-house developed, commercial, and customized solutions to support the processes and workflows between QBB and internal departments and/or external stakeholders.
This article aims to outline the current QBB IT infrastructure and describe our approach to data integration, methods for data extraction, de-identification, and the consolidation of data in a shared data model, the data marts, and abstraction layers, as well as the query system for the researchers; and discuss the major challenges to build and maintain such infrastructure.
QBB IT Infrastructure Design and Implementation
Conceptual basis of QBB IT infrastructure
The hardware and software at QBB support the management of large volumes of data, maintain the privacy and confidentiality of participants, facilitate efficiencies in the laboratory, and enable the management of multi-omics data as well as data integration with the Qatar national health care system. The QBB Informatics team designed the IT infrastructure by first delineating the critical requirements needed to meet departmental expectations. The conceptual basis of the QBB IT infrastructure design is presented in Figure 1. QBB operates within two different networks: the Clinic network, which is a closed (air-gapped), secured network, and the Admin network, which is open and exposed to the internet and manages communication and data integration with external systems (Fig. 1). This design is built to comply with the QBB data management and information security policy, minimizing the risk of a breach of sensitive information. More specifically, the Clinic network encompasses systems such as the production application servers of the clinic and laboratory, the production Active Directory (AD) servers and file-sharing server, production databases, production Picture Archiving and Communication System (PACS) servers, the Clinic network management system, development/test servers for clinic/laboratory systems, the Cantab server (cognitive function application), the Automated Biorepository server, PACS for magnetic resonance imaging (MRI) and ultrasound, Clinic personal computers (PCs) and Clinic instruments, DICOM modalities, and Laboratory PCs and instruments.
The Admin network incorporates production systems such as temperature monitoring solutions, Door Access, the File Server and Active Directory (user access management), the Participant Appointment Management System (PAMB), development PCs used by system developers, the MRI modality, the Automated Biostore, the connection to health care providers (i.e., Hamad General Hospital and Sidra hospital) through the Qatar Government Network, and the test and development servers for the various applications. Access to the internet and shared services is restricted to administration PCs, wireless networks, and iPads through an MPLS line (bandwidth 50 Mbps) to the Qatar Foundation Headquarters (Fig. 1).

Figure 1. Conceptual basis of QBB IT infrastructure. A schematic presentation of the QBB IT infrastructure design, showing the air-gapped Clinic network, the Admin network, the DMZs, and links with external entities. AD, active directory; DMZ, demilitarized zone; GN, Government Network; IT, Information Technology; Prod, Production; QBB, Qatar Biobank; SSID, Service Set Identifier; VLAN, Virtual Local Area Network.
QBB IT department structure, competencies, and role
The biobanking landscape is continuously evolving from local biorepositories to robust organizations.3,5–7 Likewise, QBB has been evolving over the years by using high-throughput technologies; adding different types of biospecimens to the data repository to meet the requirements of the different research initiatives in sample collection, processing, and storage; handling large amounts of data; ensuring compliance with ethical laws, policies, and standards; and providing data sharing services. The QBB IT department consists of several staff with distinct roles (Table 1) to support QBB business continuity and evolution.
Table 1. Qatar Biobank Information Technology Team Competencies and Role Description
IT, Information Technology; PCs, personal computers; QBB, Qatar Biobank.
QBB hardware and basic software
Key components for the success of the QBB IT infrastructure are appropriate operating systems to run the software, with sufficient memory and acceptable processor speed for efficient operations. As the QBB research platform grows, system and integration needs are evolving as well, requiring greater processing power and extra storage capacity. Table 2 describes the current hardware and basic software specifications that support the QBB IT infrastructure. The QBB operating system was selected based on its broad compatibility with the systems in use or potentially to be used. An important element of the QBB IT infrastructure for the protection of sensitive participant information is the security software against malware. Security systems and processes for regularly scheduled system scans and updates are in place to protect QBB data from malware, including viruses, spyware, adware, and ransomware. In addition, hardware firewalls in the two different networks (Clinic and Admin) monitor network traffic and block data transmission when malicious actions are detected (Fig. 1). Another main component of QBB data integrity is the system and data backup. QBB backs up data through Direct Attached Storage and Network Attached Storage in physical storage forms. No cloud storage is used, to minimize the possibility of a data breach and to comply with Qatari Government Regulation (Data Management Policy).8 The primary backup data setup is located at QBB facilities, including servers, storage, and backup tapes.
Table 2. Sample of Qatar Biobank Clinic and Admin Server Specifications
CANTAB, Cambridge Neuropsychological Test Automated Battery; LIMS, Laboratory Information Management System; PACS, Picture Archiving and Communication System.
Periodically, encrypted backup tapes are transferred outside the QBB building and stored in a secured environment within a fire-safe cabinet. Furthermore, QBB established a secondary data center site to support business continuity in case of failure of the primary data center site. The data center is built on a virtualization environment to ensure optimal usage of IT hardware resources and fast response to business requirements.
QBB application landscape: functional components and integrations
The core of the QBB IT infrastructure consists of four main functional components: (1) data management, (2) biorepository management, (3) reporting system, and (4) participant management. All functional components are critical for QBB business continuity and success. These functional areas are operated by various distinct software applications with a high degree of integration between them. Figure 2 presents the QBB application landscape and the level of integration between the distinct internal and external applications. Table 3 describes the different software applications, their scope, and their sources. The QBB data management supported by the IT infrastructure links thousands of participants with more than 2.5 million biological samples, clinical data, and medical records data (Diagnosis Records) that are integrated through Qatar's national health care system. The QBB Clinical Information System (CIS) is located within the Clinic network and tracks participant registration, informed consent forms, data collection through questionnaire software, and physical measurements through the Onyx system, using two-way communication (Fig. 2). The CIS is integrated with the Clinical Device Software (CDS) using the Onyx integration library to exchange the information captured by the different device software systems at the participant level (Fig. 2). CDS is integrated (two way) with the PACS, providing convenient access to images from the relevant devices (Ultrasound machines, iDXA, and so on). An important component of the data management functional area is the integration of the CIS with the Laboratory Information Management System (LIMS) and the Medical Review System (MRS). LIMS is the main system for sample management and is integrated with the Laboratory Device Software, including all automated systems within the laboratory environment (i.e., Brooks automation biostore, Fluidigm, and flow cytometry) (Fig. 2).
MRS is an application developed in-house that allows the QBB medical review office to view and evaluate the results of certain measurements, laboratory results, and participants' medical and family medical history. MRS also supports reporting on participants and referral to Hamad Medical Corporation (Fig. 2). Participant management is based on two systems: the participant management system for actual QBB participants (those who have consented to participate in the study) is hosted in the Clinic network, and the system that manages potential participants and their appointments is hosted in the Admin network. More specifically, the Admin network hosts all the applications connected outside the QBB facilities. The participant appointment scheduling application (PAMP Booking) manages participant bookings and appointments (through the QBB website or by phone call), while participant registration is managed through the CIS (Fig. 2). The Admin network supports temperature monitoring systems (hosted on the cloud), integration with the Qatar national health care system (Cerner), Online Feedback Surveys, the Research Portal, and Qatar Foundation services, that is, e-mail, Enterprise Resource Planning (ERP), and the SharePoint portal (Fig. 2).

Figure 2. QBB application landscape. A schematic of the QBB applications and integration level between the various internal and external systems.
Table 3. Qatar Biobank Information Technology Infrastructure Systems
DB, database; ECG, electrocardiogram; HMC, Hamad Medical Corporation; RCMS, Research Collaboration Management System.
The data management component starts with data collection on the clinic side. These data are stored in secure, access-controlled database management systems, where critical changes (such as edits or deletions) to operational data within the CIS are tracked using a timestamped audit trail. When researchers request certain groups of data, the data are retrieved from the data mart and anonymized before being handed over for research use. This anonymization also applies to the different image types (retinal images, ultrasound, etc.). The method/tool used for anonymization depends on the type of data to be anonymized. For example, for radiology images, QBB uses the CTP/RSNA toolset to remove personally identifiable information (PII).
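As an illustration of the timestamped audit trail described above, the following is a minimal sketch, not the QBB implementation. It assumes a relational store (SQLite is used here only because it ships with Python), and the table names `measurement` and `audit_log`, their columns, and the helper `open_audited_db` are hypothetical. Database triggers record every edit or deletion together with the time of the change.

```python
import sqlite3

# Hypothetical schema: one operational table plus an audit table.
# Triggers write a timestamped row for every UPDATE or DELETE.
SCHEMA = """
CREATE TABLE measurement (
    id INTEGER PRIMARY KEY,
    participant_id TEXT NOT NULL,
    name TEXT NOT NULL,
    value REAL
);
CREATE TABLE audit_log (
    id INTEGER PRIMARY KEY,
    table_name TEXT NOT NULL,
    row_id INTEGER NOT NULL,
    action TEXT NOT NULL,
    old_value TEXT,
    new_value TEXT,
    changed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TRIGGER measurement_update AFTER UPDATE ON measurement
BEGIN
    INSERT INTO audit_log (table_name, row_id, action, old_value, new_value)
    VALUES ('measurement', OLD.id, 'UPDATE', OLD.value, NEW.value);
END;
CREATE TRIGGER measurement_delete AFTER DELETE ON measurement
BEGIN
    INSERT INTO audit_log (table_name, row_id, action, old_value, new_value)
    VALUES ('measurement', OLD.id, 'DELETE', OLD.value, NULL);
END;
"""


def open_audited_db() -> sqlite3.Connection:
    """Create an in-memory database with the audit triggers installed."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn
```

Placing the logging in the database rather than in application code means that a change is recorded regardless of which client made it; a production system would typically also record the user who made the change.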
The Medical office in QBB can retrieve participants' reports through MRS and update the status of each participant's review process in the system (i.e., referrals to Hamad Medical Corporation (HMC)). Medical office performance is measured through a business intelligence dashboard, which shows various statistical charts and reports on the clinical review processes. The business intelligence tool depends on a centralized database that aggregates data from the live MRS and CIS systems daily (at night) so as not to degrade the performance of the operational systems. The Biostore facility, liquid nitrogen tank, freezer, and room temperatures within the QBB are monitored through a cloud-based temperature monitoring service, which lets laboratory staff view live temperature data on their mobile phones. An automatic phone call alert is sent to the responsible team in the case of a temperature drop. Samples are collected and managed through a laboratory information system developed in-house and a set of laboratory devices that are integrated with it. The integration is accomplished through files or databases (DBs), depending on the integration method supported by each device. The same applies to the CIS and clinical instruments: 95% of the data generated by clinical devices are stored in database management systems in a structured format, while the remaining 5% are stored as received from the clinical devices.
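The nightly aggregation into a centralized reporting database can be sketched as follows. This is an illustrative example under stated assumptions rather than a description of the QBB business intelligence tool: the table names `medical_review` and `daily_review_summary`, their columns, and both function names are hypothetical, and SQLite stands in for whatever database systems are actually used. The job reads one day of review records from the operational database and writes per-status counts to the reporting database, so that dashboards never query the live system.

```python
import sqlite3


def init_reporting_db(reporting: sqlite3.Connection) -> None:
    """Create the (hypothetical) summary table used by the dashboard."""
    reporting.execute(
        "CREATE TABLE IF NOT EXISTS daily_review_summary ("
        "day TEXT NOT NULL, status TEXT NOT NULL, n INTEGER NOT NULL, "
        "PRIMARY KEY (day, status))"
    )


def aggregate_reviews(operational: sqlite3.Connection,
                      reporting: sqlite3.Connection,
                      day: str) -> int:
    """Copy per-status review counts for one day into the reporting DB.

    Returns the number of summary rows written. Re-running the job for
    the same day replaces the earlier summary instead of duplicating it.
    """
    rows = operational.execute(
        "SELECT status, COUNT(*) FROM medical_review "
        "WHERE review_date = ? GROUP BY status",
        (day,),
    ).fetchall()
    with reporting:  # one transaction: delete the old summary, insert the new one
        reporting.execute(
            "DELETE FROM daily_review_summary WHERE day = ?", (day,)
        )
        reporting.executemany(
            "INSERT INTO daily_review_summary (day, status, n) VALUES (?, ?, ?)",
            [(day, status, n) for status, n in rows],
        )
    return len(rows)
```

In practice such a function would be invoked by a scheduler outside clinic hours; making the job safe to re-run simplifies recovery after a failed night.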
QBB applies both physical and digital security controls. Access to the QBB data center is restricted to authorized personnel and controlled through access cards managed by a Door Access Control System. Access to clinic, laboratory, and medical office systems is restricted to authorized users, who are granted different levels of privilege depending on their role (Normal user, Supervisor, or IT Admin). Login to the network to access the applications uses operating system user credentials and is governed by Active Directory password protection policies.
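Role-dependent privileges of this kind are commonly expressed as a mapping from roles to permitted actions. The sketch below uses the three roles named above; the action names and the assignment of actions to roles are assumptions made purely for illustration and do not reflect the actual QBB privilege matrix.

```python
from enum import Enum


class Role(Enum):
    NORMAL_USER = "normal_user"
    SUPERVISOR = "supervisor"
    IT_ADMIN = "it_admin"


# Hypothetical permission matrix: which actions each role may perform.
PERMISSIONS = {
    Role.NORMAL_USER: {"view_record", "enter_measurement"},
    Role.SUPERVISOR: {"view_record", "enter_measurement", "edit_record"},
    Role.IT_ADMIN: {"manage_users", "configure_system"},
}


def is_allowed(role: Role, action: str) -> bool:
    """Return True only if the action is explicitly granted to the role."""
    return action in PERMISSIONS.get(role, set())
```

For example, `is_allowed(Role.SUPERVISOR, "edit_record")` returns `True`, while the same check for `Role.NORMAL_USER` returns `False`. Anything not listed is denied by default, which is the safer choice when new actions are introduced.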
Data extraction for researchers is processed manually through an in-house developed data mart system, which reads data from different data sources, including legacy systems that hold participants' clinical and laboratory test data. QBB is in the process of automating the receipt of researchers' requests and allowing researchers to browse the QBB data catalogue online through a portal. The project is in the planning/implementation stage and is expected to improve the turnaround time and quality of the service provided by QBB to researchers.
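A simplified view of such an extraction step is sketched below: records from a clinical source and a laboratory source are merged per participant, only the requested fields are kept, direct identifiers are dropped, and the internal participant ID is replaced by a keyed-hash study code. This is an assumption-laden illustration, not the QBB data mart: the field names, the `DIRECT_IDENTIFIERS` set, and the functions `pseudonym` and `build_extract` are hypothetical, and keyed hashing is one possible pseudonymization technique chosen here for brevity rather than the method QBB uses.

```python
import hashlib
import hmac

# Hypothetical names of fields that must never leave the biobank.
DIRECT_IDENTIFIERS = {"participant_id", "name", "national_id", "phone"}


def pseudonym(participant_id: str, secret: bytes) -> str:
    """Derive a stable study code from the internal ID using HMAC-SHA256."""
    digest = hmac.new(secret, participant_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def build_extract(clinical_rows, lab_rows, requested_fields, secret):
    """Merge two sources per participant and return de-identified records."""
    merged = {}
    for row in list(clinical_rows) + list(lab_rows):
        merged.setdefault(row["participant_id"], {}).update(row)

    extract = []
    for pid, record in merged.items():
        out = {"study_code": pseudonym(pid, secret)}
        for field in requested_fields:
            if field in DIRECT_IDENTIFIERS:
                continue  # silently drop identifiers even if requested
            out[field] = record.get(field)
        extract.append(out)
    return extract
```

Because the study code depends on a secret key held by the biobank, the same participant receives the same code across extracts produced with that key, while recipients cannot reverse it without the key.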
Participants' relationship management is an important focus of the QBB; therefore, to enhance the experience of participants, QBB initiated a new project to develop an online portal where participants can manage their appointments, access their reports, and withdraw from the studies if they wish. The portal will be integrated with the clinical system to improve data integrity and accuracy.
In addition to the aforementioned components, QBB utilizes several IT system utilities to manage the data center, end users' security, and communication facilities, such as conference room management system, VMware, a backup system, and Anti-virus software.
Discussion
Available informatics and computing solutions related to biobank science have emerged and matured over recent years.5,7,9,10 The challenges faced while building the QBB IT infrastructure were similar to those faced in comparable industries. The main examples were managing the QBB scope and finding cost-effective solutions to prevent budget overrun, deciding whether to purchase commercially available applications or build applications in-house, and ensuring system security, usability, performance, and compatibility with other systems.
The principal design feature of the QBB IT infrastructure is to ensure the security and integrity of participant data. In this context, controls for data security are provided at several levels. Two secure network connections (Clinic and Admin, Figs. 1 and 2) are in place to support data transfer between client machines and application servers. Internal staff access to critical systems is controlled through user credentials. Transport-level security is provided by digital certificates, which ensure that data are encrypted in transit when users access QBB services online (through the web). The main challenge QBB faces with this infrastructure design is data integration, particularly in real time, between the systems sited in the two different networks. QBB currently handles data integration between these two networks manually. Additional security measures are applied to facilitate system-to-system data integration, such as redesigning both networks, establishing demilitarized zones that are isolated and positioned between the internet and the air-gapped network, and upgrading the firewalls within QBB IT to next-generation firewalls. These measures allow QBB extra time to detect and address breaches before they penetrate further into local networks. QBB follows the ISBER Best Practices; to evaluate the QBB information system, we reviewed it against the ISBER Information System Evaluation checklist, as presented in Table 4.
Table 4. Qatar Biobank Information System Review Against ISBER Information System Evaluation Checklist
API, application program interface; DBMS, database management system; MRI, magnetic resonance imaging.
Decisions to build in-house or purchase commercial products were reached by analyzing factors such as cost, data security, maintainability, and supportability. One of the core systems in QBB is the CIS, which collects the phenotypic data of participants during their visit to the QBB clinic. When QBB decided to replace the legacy CIS for performance and expandability reasons, it evaluated many commercial solutions for biobanks but was not able to find any solution that met the functional requirements. Most of the solutions were extensions of LIMS and mainly lacked flexible and extensible clinical device integration capabilities. QBB also considered an Electronic Medical Records solution and found that it would be overkill, as it would require a bigger team and higher costs for maintenance and support. QBB then looked into the open-source space and evaluated some of the open-source solutions. After evaluation and a proof of concept, QBB decided to implement the Onyx application from OBiBa. Onyx is a solution built for a similar setup and has a good questionnaire module and an extensible framework for device integration. With support from the OBiBa team, the legacy CIS was replaced with Onyx. Although Onyx was built for similar institutes, it lacked functionalities needed for the biobank, and QBB's in-house team extended Onyx to introduce them. The main advantages of in-house built applications are dynamic maintenance and system enhancement based on QBB's ongoing operations. However, our main challenge has been the shortage of IT staff to develop in-house solutions. We tried to overcome this by hiring temporary staff or by customizing commercial solutions according to our needs.
The main challenge with ready-made commercial biobanking products was a mismatch with either the compliance requirements of QBB security policies or our operational model. QBB has various clinical and laboratory devices that are capable of transferring clinical data electronically. As most of the devices have custom interfacing capabilities and do not support international standards such as those of the American Society for Testing and Materials (ASTM) or Health Level Seven International (HL7), interfacing new devices with the legacy (commercial) CIS was a challenge. Because Onyx has a simple and flexible interfacing library, QBB is now able to interface new devices with less effort. QBB has developed custom components to interface with various clinical devices, LIMS, and PACS, which have varying integration capabilities. Migration of data from legacy systems was a challenge due to data quality issues and a lack of documentation of the source system database. For new systems, QBB ensures that the data sources are well documented and rigorously tested for data quality issues before going live.
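The general idea of interfacing devices that each export data in their own format can be illustrated with a small adapter registry. The sketch below is not the Onyx interfacing library (Onyx is a separate software product with its own API); the device types `scale_csv` and `bp_keyvalue`, the export formats, and all function names are hypothetical. Each adapter converts one device-specific payload into a common list of measurement records, so that adding a new device means writing one new adapter rather than changing the core system.

```python
import csv
import io

# Registry mapping a device type to the function that parses its export.
PARSERS = {}


def register(device_type):
    """Decorator that registers a parser for one device type."""
    def decorator(func):
        PARSERS[device_type] = func
        return func
    return decorator


@register("scale_csv")
def parse_scale_csv(payload):
    # Hypothetical export: CSV with header "pid,weight".
    reader = csv.DictReader(io.StringIO(payload))
    return [
        {"participant_id": r["pid"], "measure": "weight_kg",
         "value": float(r["weight"])}
        for r in reader
    ]


@register("bp_keyvalue")
def parse_bp_keyvalue(payload):
    # Hypothetical export: "ID=P1;SYS=120;DIA=80;"
    fields = dict(
        item.split("=", 1) for item in payload.strip().split(";") if item
    )
    return [
        {"participant_id": fields["ID"], "measure": "systolic_mmHg",
         "value": float(fields["SYS"])},
        {"participant_id": fields["ID"], "measure": "diastolic_mmHg",
         "value": float(fields["DIA"])},
    ]


def ingest(device_type, payload):
    """Convert a device export into common measurement records."""
    try:
        parser = PARSERS[device_type]
    except KeyError:
        raise ValueError(f"no adapter registered for {device_type!r}") from None
    return parser(payload)
```

With this structure, `ingest("scale_csv", "pid,weight\nP1,70.5\n")` yields a single weight record for participant `P1`, and an unregistered device type fails loudly instead of being silently ignored.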
To conclude, we believe that a competent IT department with a flexible and scalable IT system is vital for the biobanking industry. Moreover, the ISBER Best Practices are a valuable tool that biobanks can use as a starting point to develop or upgrade their IT infrastructure.
Footnotes
Author Disclosure Statement
No conflicting financial interests exist.
Funding Information
No funding was received.
