Abstract

Most of the world’s tumor samples stored by medical centers for pathological diagnosis are preserved as formalin-fixed, paraffin-embedded (FFPE) tissue specimens, often paired with clinical and outcome data.
The formalin cross-links proteins, creating a web that complicates efforts to extract DNA efficiently, typically resulting in small DNA fragments of poor quality that are difficult to sequence. While some medical centers opt for preserving samples as fresh frozen tissue, frozen samples are harder to obtain and store and yield smaller numbers of samples, making them a costlier alternative to FFPE.
Late last year, WuXi NextCODE introduced what it says is a solution to these limitations of FFPE: a novel extraction method that has proven capable of yielding DNA fragments of higher molecular weight.
Dubbed SeqPlus, the technology is designed to enable whole-genome sequencing of FFPE samples by producing sequence alignments that cover 98% of the genome, at a depth of 20x—coming close to the results obtained with fresh frozen samples at comparable levels of sequencing, with similar numbers of heterozygous and homozygous calls.
SeqPlus offers a whole-genome alternative to the panel-based sequencing commonly used for FFPE samples, which is run at a median depth of coverage of >500x for sensitive and specific detection of low-frequency alterations, compared with the typical 30x for whole-genome sequencing.
[Image: WuXi NextCODE team viewing a genome at its research facility in Shanghai, China.]
“Instead of going to 1,000 or 2,000-fold coverage for a whole genome, we’re able to achieve very good coverage of the whole genome with just 60x or 70x coverage, very reasonable coverage for a tumor sample,” WuXi NextCODE CSO Jeffrey Gulcher, M.D., Ph.D., said.
‘More Gently Pulling the DNA Out’
Gulcher said SeqPlus extracts larger DNA fragments through a method the company won’t disclose, except to say that it avoids the traditional approach of breaking up the proteins through long-term incubation in proteinase K.
“The method basically is a way of more gently pulling the DNA out to get results, much larger fragments, and we have about a 99% success rate at whole-genome sequencing,” Gulcher said. “The result is, you can get more efficient sequencing, which results in getting better coverage at a much smaller price. Normally you’d have to do 500x to 1,000x to get good coverage on FFPE and even that would generally fail to provide broad coverage over the entire genome. Here you can do it at 40x to 90x.”
The company has launched SeqPlus pilot studies with the NIH’s National Cancer Institute (NCI), biopharma companies, and academic medical centers in the U.S. and Europe, aimed at testing SeqPlus on their samples. One undisclosed biopharma, according to WuXi NextCODE, achieved a three-fold increase in efficiency compared to conventional methods, even with whole-exome sequencing.
“With whole-exome sequencing, what most people do is they increase the sequencing coverage by 500 to 1,000 to get good coverage,” Gulcher said. “Of course, it ends up being much more costly than when you normally do whole exome sequencing at about 100x coverage. Furthermore, it is too costly to do 500x coverage with whole-genome sequencing. This could also reduce the price of sequencing as well as improve the success rates.”
After a study found that FFPE whole-genome sequencing via SeqPlus generated call data similar to that of fresh frozen samples, using measures that included allele frequency and heterozygous-to-homozygous call ratio, WuXi NextCODE tested its method in the largest study of FFPE samples to date.
The company sequenced 516 tissue pairs, each consisting of one tumor sample and one normal sample. The tumor samples, sequenced at a targeted depth of 70x, showed average coverages of 99% and 98% at 10x and 20x, respectively. The paired normal samples, sequenced at a lower targeted depth of 30x, showed average coverages of 98.1% and 92.6% at 10x and 20x, respectively. Both sets of samples yielded data comparable to the fresh-frozen controls.
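The percentages quoted at 10x and 20x are breadth-of-coverage figures: the share of genome positions sequenced to at least that depth. As a rough illustration only, the Python sketch below computes that metric from a list of per-position depths. The function name and the toy depth values are hypothetical and are not taken from WuXi NextCODE's pipeline or data.

```python
from typing import Dict, Iterable, Sequence


def breadth_of_coverage(
    depths: Sequence[int], thresholds: Iterable[int] = (10, 20)
) -> Dict[int, float]:
    """Return, per threshold, the fraction of positions with depth >= threshold."""
    total = len(depths)
    if total == 0:
        raise ValueError("depths must contain at least one position")
    return {t: sum(d >= t for d in depths) / total for t in thresholds}


# Hypothetical per-base depths for a ten-position toy region (not real data).
toy_depths = [0, 8, 12, 19, 20, 25, 31, 40, 66, 72]
coverage = breadth_of_coverage(toy_depths)
print(coverage)
```

In this toy region, eight of ten positions reach 10x and six of ten reach 20x. A real analysis would read per-base depths from an alignment file rather than a hand-written list.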
10 to 15 Med Center Partnerships
Gulcher anticipates WuXi NextCODE will work with at least 10 to 15 different academic medical centers to help it collect a critical mass of pertinent data.
“I don’t think it’s going to be as simple as being able to use just one medical center for a given cancer. We’re probably going to have to use two or three medical centers, just to get enough samples for human cancer for the matched metastatic vs primary tumor projects. With some centers, we’ll work with certain cancer types. With others, we’ll work with different cancer types, depending on their expertise.”
In some of WuXi NextCODE’s partnerships, he added, the medical centers use the company’s tools to access the cancer datasets of The Cancer Genome Atlas (TCGA), which houses data from approximately 11,000 patients and 33 cancer types, after gaining permission from the NCI. One such tool is the Tumor Mutation Analyzer (TMA), designed to provide a fast, intuitive and visual means of analyzing and comparing next-generation sequence data from tumor-normal pairs and tumor cohorts.
TMA integrates both the MuTect algorithm from the Broad Institute and the VarScan2 algorithm from Washington University in St. Louis to pick out the most promising variants. TMA uses the company’s Genomically Ordered Relational (GOR) database architecture and is fully integrated with its germline Clinical Sequence Analyzer and Sequence Miner.
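The article does not say how TMA combines the output of the two callers. One common way to use two somatic callers together is to separate the variants both report from those only one reports, and the Python sketch below illustrates that idea under that assumption. The `Variant` type, the `combine_calls` function, and the toy calls are all hypothetical; nothing here reflects the actual MuTect, VarScan2, or TMA interfaces.

```python
from typing import Dict, Iterable, NamedTuple, Set


class Variant(NamedTuple):
    """Hypothetical minimal variant record: position plus ref/alt alleles."""
    chrom: str
    pos: int
    ref: str
    alt: str


def combine_calls(
    caller_a: Iterable[Variant], caller_b: Iterable[Variant]
) -> Dict[str, Set[Variant]]:
    """Split calls into those made by both callers and those made by only one."""
    a, b = set(caller_a), set(caller_b)
    return {"both": a & b, "only_a": a - b, "only_b": b - a}


# Toy calls (invented for illustration, not real tumor data).
calls_a = [Variant("chr1", 100, "A", "T"), Variant("chr2", 250, "G", "C")]
calls_b = [Variant("chr1", 100, "A", "T"), Variant("chr3", 75, "C", "G")]
combined = combine_calls(calls_a, calls_b)
print(sorted(combined["both"]))
```

Variants in the `both` set would typically be treated as higher-confidence candidates, while single-caller variants could be kept for manual review.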
WuXi NextCODE will work with the multiple medical centers to gain insights into how primary tumors progress to metastatic cancer, matching 500 metastatic tumors with 500 primary tumors from the same patients in the same cancer. The most common cancers will be studied, including lung adenocarcinoma, non-squamous cell lung, breast, colon, pancreatic, ovarian, and liver cancers.
The company will use artificial intelligence to carry out an integrated multi-omics analysis comparing metastatic FFPE tumor samples with primary tumor samples from the medical centers. It will generate whole-genome sequencing data (SNPs, indels, and CNVs) along with RNA-seq, microRNA, and DNA methylation data, the same types of data previously gathered from the primary tumors stored by TCGA.
Looking for the Difference
“What we really want to know is what’s the difference between the metastatic and the primary tumors in patients who have recurrence and have bad outcomes?” Gulcher said. “The TCGA collection doesn’t tell us the pathways that separate those who do really well with treatment of the primary tumor and those that progress to metastatic disease. Just imagine if you were able to unlock those samples, to allow you to make those comparisons. Our efficient FFPE sequencing provides a way to do these large studies for the first time.”
He added that efficient formalin-fixed sequencing has opened up further studies of large sample collections outside oncology, from central nervous system tissue to liver samples, where the company is proceeding with a comparison of 1,000 non-alcoholic steatohepatitis (NASH) biopsies versus 1,000 fatty liver samples without NASH, to be followed by multi-omics analysis with deep learning.
WuXi NextCODE has developed two feature reduction methods to make it feasible to apply deep learning to large genomics datasets, whose feature counts are orders of magnitude larger than those used in facial recognition and image-analysis algorithms. One is a modified version of GOSeq designed to cluster genes with similar functions or similar pathways, developed by Thomas Chittenden, Ph.D., D.Phil., founding director of the WuXi NextCODE advanced artificial intelligence research laboratory. The other method entails clustering genes based on co-expression or co-correlation.
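The article gives no details of either method. To illustrate the general idea of clustering genes by co-expression to reduce feature count, the Python sketch below groups genes whose expression profiles correlate strongly and replaces each group with its mean profile. The greedy grouping rule, the 0.9 threshold, and the gene names and values are assumptions made for this example and should not be read as the company's algorithm.

```python
from math import sqrt
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def cluster_by_coexpression(
    expr: Dict[str, List[float]], threshold: float = 0.9
) -> List[List[str]]:
    """Greedy grouping: a gene joins the first cluster whose seed gene it
    correlates with at >= threshold; otherwise it seeds a new cluster."""
    clusters: List[List[str]] = []  # first gene in each list is the seed
    for gene, values in expr.items():
        for members in clusters:
            if pearson(values, expr[members[0]]) >= threshold:
                members.append(gene)
                break
        else:
            clusters.append([gene])
    return clusters


def reduce_features(
    expr: Dict[str, List[float]], clusters: List[List[str]]
) -> List[List[float]]:
    """Collapse each cluster to one feature: the per-sample mean of its genes."""
    reduced = []
    for members in clusters:
        n_samples = len(expr[members[0]])
        reduced.append(
            [sum(expr[g][i] for g in members) / len(members) for i in range(n_samples)]
        )
    return reduced


# Hypothetical expression values for three genes across four samples.
toy_expr = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],
    "GENE_C": [4.0, 1.0, 3.0, 2.0],
}
clusters = cluster_by_coexpression(toy_expr)
features = reduce_features(toy_expr, clusters)
print(clusters, features)
```

Here the three toy genes collapse into two features, because GENE_A and GENE_B rise and fall together. On real data, the same idea would shrink tens of thousands of gene-level inputs to a much smaller set of cluster-level inputs before model training.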
WuXi NextCODE markets three SeqPlus offerings:
SeqPlus Lab, which offers whole-genome sequencing of FFPE samples with a FASTQ file returned.
SeqPlus Secondary, which delivers sequencing of FFPE samples as well as aligned data, with output presented as a BAM file. As with SeqPlus Lab, customers can process as little as a single sample, with pricing based on number of samples—but must conduct their own analysis.
SeqPlus Interpretation, which offers sequencing of 10 or 20 slides, with full analysis. Users get back a report with sample and sequencing QC metrics including DNA quality, coverage, PCR duplicates, numbers of mutations, SNP and indel counts, and sample profile—including mutational signatures and other sample/cohort specific information.
WuXi NextCODE operates from offices in Shanghai; Cambridge, MA; and Reykjavik, Iceland. The company took its present form in 2015 when NextCODE Health was acquired by WuXi PharmaTech for $65 million; NextCODE—a spinout of personalized medicine pioneer deCODE Genetics, acquired by Amgen in 2012—was merged with WuXi’s Genome Center to form WuXi NextCODE.
Last year WuXi NextCODE completed a $240 million Series B financing, with proceeds intended to accelerate the extension of its platform infrastructure by attracting new users and data through precision medicine and diagnostics partnerships.
