Preface: Special Issue

Abstract

In this special issue of the Journal of Computational Biology, we take great pleasure in celebrating the landmark birthdays of two leaders in our field—Mike Waterman and Simon Tavaré—who this year are 70 and 60 years old, respectively.

Mike Waterman is often referred to as “the grandfather of computational biology.” His achievements in the field are too numerous to cover adequately, but he is perhaps most famous for developing, in collaboration with Temple Smith, the Smith-Waterman algorithm for local sequence alignment in 1981. As of this writing, the original article has been cited approximately 6000 times, and the algorithm remains one of the foundations of the field. A further landmark is the famous Lander-Waterman model for physical mapping, a model that has played a significant role in the Human Genome Project, as well as in the current wave of next-generation sequencing (NGS) technologies.

Simon Tavaré wears many hats these days, but one of his landmark contributions to the field is a body of articles regarding the theoretical underpinnings of the coalescent, a mathematical model of evolution that exploits the old adage of Søren Kierkegaard: “Life can only be understood backwards; but it must be lived forwards,” the wisdom of which becomes apparent to us all as we age. Of course, Kierkegaard is also quoted as having said: “Far from idleness being the root of all evil, it is rather the only true good.” Happily for the field, Simon embraced this latter quote with rather less enthusiasm. Most recently, Simon has been a pioneer in the development of “approximate Bayesian computation.”

Over the last few decades, Mike and Simon have trained and influenced many colleagues, postdoctoral fellows, and graduate students in the field of computational biology. In addition, Mike founded the Journal of Computational Biology in 1994 and has been an Editor-in-Chief of that journal. He also founded one of the premier computational biology conferences, the Annual International Conference on Research in Computational Molecular Biology (RECOMB), which is closely related to the journal. The articles in this issue are written by colleagues who have been mentored by, have collaborated with, or have simply shared a beer with Mike and Simon. They cover a range of subjects in the field that Mike and Simon have helped nurture over the years. In particular:

• Aguiar and Istrail develop an efficient, novel algorithm for haplotype assembly of densely sequenced human genome data.

• Bayzid and Warnow consider estimating species trees from gene trees, where not all the genes contain sequences for all the species.

• Frazier and Alber provide an algorithm that can increase the time scales of typical Brownian dynamics simulations by an order of magnitude, while maintaining similar accuracy in the reaction-diffusion modeling.

• Gao et al. introduce a powerful new algorithm (CLiP) that captures biclusters with a local linear pattern.

• Helmkamp et al. estimate species trees from gene trees, correcting biases present in the estimation of divergence times for a variety of popular methods.

• Joyce et al. provide a computationally tractable method for simulating and analyzing data under a class of non-neutral population-genetic models.

• Kinsella and Bafna propose a rigorous mathematical model for the seven-decade-old breakage-fusion-bridge (BFB) mechanism to explain genome variability and gene amplification in cancer.

• Lai proposes a statistical framework, based on change point analysis, to perform integrative analysis of chromosomal genotype and DNA copy number variations in allele-specific copy number variation data.

• Lavi et al. develop a new kernel-based computational method to integrate gene expression data with protein-protein interaction networks to classify gene expression profiles into distinct disease phenotypes.

• Li et al. develop the first computational method for pattern mining across many two-layered graphs, with the two layers representing coupled biological networks of different types.

• Luo et al. provide novel genome-information content-based statistics for testing association between the entire allele frequency spectrum of genomic variation and disease status.

• Manolopoulou and Emerson develop a fast and flexible heuristic algorithm for inferring the geographic origin of a sample of non-recombining haplotypes.

• McPeek develops a powerful method for imputation in case-control studies using related individuals.

• Ni and Vingron provide a new score to effectively measure the similarity between ranked lists of genes.

• Nunez-Iglesias and Grover introduce a new approach to optimal sampling strategies for high-throughput screening data.

• Rito et al. describe patterns of relative ages of proteins in protein-protein interaction networks and show that pairwise interactions and triangle interactions among old proteins are over-represented.

• Schbath et al. carefully compare nine algorithms for mapping next-generation sequencing reads, providing practical guidelines for their use.

• Wang et al. develop a novel algorithm to represent similar microbial genome sequences by a de Bruijn graph, and a maximum likelihood estimation method to compute genus abundance levels from metagenome shotgun sequencing data sets.

• Xu and Zhang propose a novel generalized linear model for accurate detection of peaks in ChIP-Seq experiments.

• Zhai et al. extend studies of the approximate distribution of the number of occurrences of word patterns in long sequences to next-generation sequencing reads.

• Zhang et al. develop a powerful new approach, VERSE, to predict splicing regulatory elements.

Finally, we would like to end this preface on a personal note. To have spent the bulk of one's working life in the company of somebody who is any one of (a) “a pioneer in the field,” (b) “an outstanding mentor,” or (c) “a great friend as well as a great colleague” is a lucky position in which to find oneself. To have shared one's career with not one but two colleagues who are all of the above is a rare privilege indeed, and it is one that all of us have been very lucky to have enjoyed during our time at the University of Southern California. We work in a field they helped create, in a building that they helped build (figuratively if not literally), and we are very grateful to have this opportunity to publicly acknowledge our debt of thanks to them. We hope this collection of articles is a fitting celebration of their achievements, and we also offer our sincere thanks to those who have contributed articles to this special issue.