Abstract

The clinical use of 'omics technology and the digitalization trend are fueling dramatic growth in the volume of healthcare data. This growth in volume and complexity is driving the need for artificial intelligence (AI) decision-support tools. Simply put, AI is a generic term for the different ways computers learn to recognize patterns in vast amounts of data. Although human oversight may be required to arrive at a workable model, AI can also work alone, especially when data volume makes direct human analysis infeasible.
To date, decision-support AI tools have been applied only in certain clinical situations. "We are not, nor do we need to be, at a point where AI makes the final decision in terms of therapy or diagnostics," says Hans Cobben, the CEO of BlueBee, a Dutch genomic data analysis company. A current constraint is that systems are trained on narrow data sets, which makes them very specific.
Implementing AI
According to Vlad-Mihai Sima, Ph.D., BlueBee co-founder and Head of Research, instead of a narrow focus, BlueBee has taken a generic path to build an AI platform for genomics decision support to leverage what AI excels at—hypothesis generation.
To make a prediction or generate a hypothesis, AI works best with thousands of samples and really takes off when input reaches tens of thousands; larger data volumes allow more learning. A sample is a collection of information in multiple dimensions, such as imaging, 'omics data, and health records. For example, in many countries, biobank projects collect and standardize patient data over time and provide access to these large data sets for making predictions in particular situations.
Combining different dimensions is complex. AI identifies patterns, clusters, and characteristics in data that a human would not perceive, and uses them to generate hypotheses and make predictions. More complete data increases the probability of a better result; the broader the view the better, says Sima.
Managing metadata, such as procedures followed, instruments used, and data types, is essential for the system. It allows the AI platform to get a good view of the data, understand and transform it, and use it in multiple instances. The data warehouse solution is therefore an important component in any AI platform.
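As a rough illustration of what tracking such metadata can look like, the Python sketch below keeps one record per data item and lets the records be looked up by data type. It is not BlueBee's implementation: the class names `SampleMetadata` and `MetadataCatalog`, the field names, and the example values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SampleMetadata:
    """Hypothetical metadata record; field names are illustrative only."""
    sample_id: str
    data_type: str    # e.g. "imaging", "omics", "health_record"
    instrument: str   # instrument used to produce the data
    procedure: str    # procedure or protocol followed


@dataclass
class MetadataCatalog:
    """Toy in-memory catalog that can filter records by data type."""
    records: list = field(default_factory=list)

    def register(self, meta: SampleMetadata) -> None:
        self.records.append(meta)

    def by_data_type(self, data_type: str) -> list:
        return [m for m in self.records if m.data_type == data_type]


# Made-up example entries
catalog = MetadataCatalog()
catalog.register(SampleMetadata("S1", "omics", "sequencer-A", "protocol-1"))
catalog.register(SampleMetadata("S1", "imaging", "scanner-B", "protocol-2"))
catalog.register(SampleMetadata("S2", "omics", "sequencer-A", "protocol-1"))

# Sample IDs that have 'omics data attached
omics_ids = [m.sample_id for m in catalog.by_data_type("omics")]
```

A production data warehouse would add persistence, access control, and an audit trail; the sketch only shows the idea of describing each data item well enough that it can be found and reused.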
BlueBee's advice to someone considering an AI strategy is to focus on the science and let the computers do the work (whenever possible) in data aggregation, exploration, and management, emphasizes Sima. You should not have to manually work with the data, struggle with the infrastructure, or learn about all the small details. At the same time, you should have access to all of the small details, including a full audit trail of all actions taken in the platform and on the data.
BlueBee users collect the data and define the metadata. No laborious manual data preparation or processing is required; instead, the data are put into a large data store, BlueBase. The system aggregates the data, and the BlueBrain AI system then examines the data collection to suggest approaches or hypotheses. The platform clusters and characterizes the multi-dimensional information with a high degree of granularity, and from this complexity, patterns emerge.
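The article does not say which algorithms BlueBrain uses. To give a feel for what clustering multi-dimensional samples means in general, here is a minimal Python sketch that uses plain k-means as a stand-in technique. The feature vectors, the helper names `standardize` and `kmeans`, and the choice of two clusters are assumptions made purely for illustration.

```python
import math
import random


def standardize(rows):
    """Scale each column to zero mean and unit variance so no dimension dominates."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for constant columns to avoid division by zero
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def kmeans(rows, k, iterations=20, seed=0):
    """Plain k-means: assign each row to its nearest centroid, then recompute centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(rows, k)
    labels = [0] * len(rows)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: math.dist(row, centroids[j]))
                  for row in rows]
        for j in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(c) / len(c) for c in zip(*members)]
    return labels


# Made-up feature vectors: [expression score, imaging score, age]
samples = [
    [0.9, 0.8, 62], [1.0, 0.7, 65], [0.95, 0.9, 60],
    [0.1, 0.2, 34], [0.2, 0.1, 30], [0.15, 0.3, 36],
]
labels = kmeans(standardize(samples), k=2)
```

On this toy input the two groups are well separated, so the first three and the last three samples should land in different clusters. Real 'omics, imaging, and health-record data would have far more dimensions and samples, and would call for more careful feature engineering and validation.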
AI's Evolving Role
Cobben says his team has built an environment that is hyper-secure, controlled, audited, completely regulated, and compliant worldwide. Applications are built on top of that infrastructure. This, he says, is the only way the AI domain will mature over the coming years.
Today the BlueBee platform is available globally for real-time genomics analysis at high service levels and provides actionable information within a couple of hours. Adding AI gives researchers and product managers more insight and developmental capability. Soon this approach may also facilitate precision medicine by providing therapy suggestions based on an ever-growing dataset that is managed and exploited by AI.
