Abstract

Jarret Glasscock, Ph.D., CEO, Cofactor Genomics
Jarret Glasscock is the founder and CEO of Cofactor Genomics, a diagnostics company leveraging the power of RNA and Predictive Immune Modeling to build multidimensional biomarkers that help better predict responses to drugs and therapy. Previously, Jarret was involved in the Human Genome Project at Washington University, published the first Cancer Genome, led the Institute’s Computational Biology Group, and was part of the Institute’s Technology Development Group.
As scientists, we tend to focus on the technical aspects of our mission statement, but the potential to improve human health is what drives our team. Making complex technologies robust enough to benefit patients is how we can advance the impact of RNA. We’re building the ideal assay: a fully vertical solution that moves easily from tissue to clinical insights. More specifically, we’re applying our work in the new discipline of Predictive Immune Modeling to enable advancement in the field of immuno-oncology. We have invested nearly five years and significant resources in our data models, called Health Expression Models. These transcriptome models of key immune cell types determine response and activity for immunotherapies. Inspired by the work of others, who demonstrated the power of a more global approach to modeling RNA, we’re taking a very different approach to traditional RNA-seq and moving beyond ranked-gene list associations. This multidimensional modeling has unlocked the promise and power we were seeking in RNA data and will help us accomplish our mission.
As a computational biologist, I see three very common misconceptions in our field: (1) the more data the better, (2) biological data plus machine learning will unlock all answers in precision medicine, and (3) data science follows the same dynamics as 19th-century labor. For #1: simply put, biological data is “dirty.” So many variables can be responsible for observations in the data. It is more useful to have 100 samples treated in a standardized way, which allows discovery of a biological signal, than 10,000 samples with high variability that may mask a useful signal. For #2: although it may sound blasphemous, biological data (like RNA) may be very useful in defining and predicting certain diseases and treatment outcomes, but it will not provide a solution in all cases. The same holds if you substitute any other analyte or measurement for RNA in that statement. For #3: it is important to realize we are working in a time of unscaled industry, as Hemant Taneja and Kevin Maney describe in their book Unscaled: How AI and a New Generation of Upstarts are Creating the Economy of the Future. This means a different approach, fueled by unique insight, will win over a factory of 100 data engineers. To summarize, we believe that innovative uses of big data, machine learning, and genomics technologies can deliver unprecedented advances in human health, but they must be subject to the same scrutiny as other approaches and appreciated for their unique strengths and weaknesses.
One of the most important lessons I have learned from the last two decades of genomics is that data needs context. While the last ten years have focused on generating data, I am excited that these next five years will focus on deriving insight and, more importantly, utility. This requires us to take the voluminous and complex data we have and place each additional sample in the context of all that data, and preferably in the context of models that represent the biological signals or features we want to understand. Ideally, this will translate all our data into simple utilitarian tools for use in precision medicine. Isn’t that the hallmark of great technology after all, making the complex simple and useful?
