Toward a Taxonomy for Multi-Omics Science? Terminology Development for Whole Genome Study Approaches by Omics Technology and Hierarchy

Abstract

Omics is a form of high-throughput systems science. However, taxonomies for omics studies are limited, inviting us to rethink the ways in which we classify, prioritize, and rank various omics systems science studies. In this overarching context, genome-wide study approaches have proliferated in number and popularity over the past decade. However, their hierarchy is not well organized and the development of attendant terminology is not controlled. In the present study, we searched the literature in the PubMed and Web of Science databases published from March 1999 to September 2016 using keywords including genome-wide, association, whole genome, transcriptome-wide, metabolome, epigenome, and phenome. We identified the whole genome study approaches and sorted them according to omics technology type (genomics, proteomics, and so on) and hierarchy. Thirty-four studies from over 90 publications were sorted into 10 omics groups: DNA level, transcriptomics, proteomics, interactomics, metabolomics, epigenomics, miRNomics/ncRNomics, phenomics, environmental omics, and pharmacogenomics. We suggest here modifications of terminology for study approaches that share the same acronyms, such as EWAS for epigenome-wide association and environment-wide association studies, and MWAS for methylome-wide association and metabolome-wide association studies. Taken together, the present study provides the first systematic review and analysis of whole genome approaches and presents a baseline for further controlled terminology development, with a view to a new taxonomy for omics and multi-omics studies in the future. Finally, we call for greater dialogue and collaboration across diverse omics knowledge domains and applications, for example, across plants, animals, clinical medicine, and ecology.

Introduction

Omics is a form of high-throughput systems science. Within the omics research space, many study approaches have been developed and share the same or similar aims to annotate and edit the information content of genomes and to understand genomic causes of complex diseases. However, taxonomies for omics studies are limited, inviting us to rethink new ways in which we classify, prioritize, and rank various omics systems science studies.

In this overarching context, genome-wide (whole genome) study approaches have proliferated in number and popularity over the past decade. One of the first whole genome study approaches to be developed was the genome-wide association study (GWAS), which seeks significant associations between single-nucleotide polymorphisms (SNPs) and studied traits (Manolio, 2010). Although GWAS looked promising at first, several reported findings have only small effects on observed phenotypes and explain only a small proportion of the variance in traits known to be influenced by genetics. Therefore, several complementary approaches to GWAS have been developed, such as the genome-wide association interaction study (GWAI) (Gusareva et al., 2014), pathway-based GWAS (Jiao et al., 2015), metabolome-wide association study (MWAS) (Chadeau-Hyam et al., 2010), epigenome-wide association study (EWAS) (Flanagan, 2015), and phenome-wide association study (PheWAS) (Hebbring, 2014). The development of numerous whole genome study approaches has been enabled by advances in both experimental and computational technologies.

For example, next-generation sequencing (NGS) makes it possible to sequence DNA and RNA much more quickly and cheaply, and has thus laid the foundation for several study approaches (Metzker, 2010). Furthermore, many modifications of NGS have been developed, such as bisulfite sequencing (Li and Tollefsbol, 2011) and chromatin immunoprecipitation sequencing (ChIP-seq) (Hoffman and Jones, 2009). However, the hierarchy of whole genome study approaches and technologies is currently not organized, and the development of terminology is consequently lagging behind. As a result, certain study approaches have been given several different names, and the same acronyms are used for different approaches. This causes problems in literature searching, and newly developed study approaches lack guidance on terminology, which leads to even greater disorganization.

The aim of the present study was, therefore, to provide an overview of the genome-wide study approaches and establish a baseline for terminology development. We have organized study approaches hierarchically and attempted to organize the current terminology.

Materials and Methods

We systematically searched the scientific literature in PubMed (www.ncbi.nlm.nih.gov/pubmed) and Web of Science (http://apps.webofknowledge.com) for types of genome-wide study approaches published from March 1999 to September 2016. Articles were identified using keywords such as genome-wide, association study, whole genome, whole transcriptome, transcriptome-wide, metabolome, epigenome, phenome, interactions, and environment. The collected study approaches were sorted by omics type into 10 groups: genomics/DNA level, transcriptomics/RNA level, proteomics, interactomics, metabolomics, epigenomics, miRNomics/ncRNomics, phenomics, environmental omics, and pharmacogenomics. In addition, we extracted the names of the study approaches and organized them hierarchically. Furthermore, we briefly described the collected study approaches.

Results

We collected over 90 publications describing study approaches and methods on a genome-wide scale. Study approaches were then sorted by omics type (Fig. 1). Examples of reports using whole genome study approaches found in the literature, their acronyms, and references are presented in Table 1. The most common experimental techniques used at different omics levels are listed in Table 2. Based on the collected studies, we created a hierarchical organization of genome-wide study approaches, which is presented in Table 3. This suggested list of study approaches serves as a baseline that can be extended and complemented with approaches from future reports.

FIG. 1.

Presentation of different study approaches and omics types used in genomics. Studies could be performed on a single locus or on a genome-wide scale.

Table 1.

Examples of Published Original and Review Studies Describing Genome-wide Study Approaches

Acronym	Study approach	Reference
DNA level
WGA	Whole genome analysis	Dolled-Filhart et al. (2013)
WGS	Whole genome sequencing	Chrystoja and Diamandis (2014)
WES	Whole exome sequencing	Rabbani et al. (2014)
GWAS	Genome-wide association study	Manolio (2010)
	Whole genome comparative QTL analysis	Chaim et al. (2006)
RNA level, transcriptomics
WTA	Whole transcriptome analysis, transcriptome-wide analysis	Jiang et al. (2015)
WTS, WTSS	Whole transcriptome sequencing, whole transcriptome shotgun sequencing, RNA sequencing	Chu and Corey (2012)
Proteomics
	Whole proteome analysis	Fan et al. (2014)
Interactomics
	Genome-wide analysis of physical interactions between DNA molecules	van Berkum et al. (2010)
GWIA	Genome-wide interaction analysis	Hibar et al. (2015)
GWAI	Genome-wide association interaction analysis	Gusareva et al. (2014)
	Genome-wide gene–gene interaction analysis	Chu et al. (2014)
	Genome-wide search for SNP-SNP interactions	Murk and DeWan (2016)
	Genome-wide analysis of protein–DNA interactions	Kim and Ren (2006)
	Genome-wide mapping of transcription factor binding sites	Greenald et al. (2015)
	Genome-wide analysis of RNA–protein interactions	Barkan (2009)
	Transcriptome-wide identification of RNA-binding protein and miRNA target sites	Hafner et al. (2010)
	Genome-wide protein–protein interactions analysis	Lv et al. (2015)
	Network-based analysis of genome-wide association studies	Cho et al. (2012)
PINBA	Protein interaction network-based analysis	Yu et al. (2014)
INAS	Integrative network-based association study	Califano et al. (2012)
	Pathway-based genome-wide association study	Wang et al. (2010)
Metabolomics
MWAS	Metabolome-wide association studies	Bictash et al. (2010)
mGWAS	Metabolite-based genome-wide association study	Luo (2015)
Epigenomics
EWAS	Epigenome-wide association study	Flanagan (2015)
MWAS	Methylome-wide association study	Aberg et al. (2012)
WGBS	Whole genome bisulfite sequencing	Li and Tollefsbol (2011)
	Genome-wide analysis of histone methylation	Araki et al. (2009)
	Genome-wide analysis of histone acetylation	Agrawal-Singh et al. (2012)
	Genome-wide nucleosome mapping	Jansen and Verstrepen (2011)
miRNomics, ncRNomics
	Genome-wide miRNA profiling	Pritchard et al. (2012)
	Genome-wide analysis of long noncoding RNA expression	Yarmishyn and Kurochkin (2015)
Phenomics
PheWAS	Phenome-wide association study	Hebbring (2014)
Environmental omics
EWAS	Environment-wide association study	Hall et al. (2014)
GWEI	Genome-wide environmental interaction study	Winham and Biernacka (2013)
Pharmacogenomics
	Genome-wide mapping of drug–DNA interactions	Erwin et al. (2016)

miRNA, microRNA; QTL, quantitative trait locus; SNP, single-nucleotide polymorphism.

Table 2.

Examples of Experimental Methods Sorted According to the Omics Types

DNA level

DNA sequencing

Next-generation sequencing

Whole genome sequencing

Whole exome sequencing

Chip-based microarray technology

RNA level, transcriptomics

RNA sequencing

Whole transcriptome sequencing

Whole transcriptome shotgun sequencing (WTSS)

Whole transcriptome tag/target sequencing

DNA microarrays

Proteomics

Protein microarray, proteome microarray

Interactomics

DNA–DNA interactions

Hi-C method

DNA–protein interactions

Chromatin immunoprecipitation (ChIP)

ChIP combined with DNA microarrays (ChIP-chip)

ChIP combined with next-generation sequencing (ChIP-seq)

RNA–protein interactions

Crosslinking and immunoprecipitation (CLIP)

Photoactivatable ribonucleoside-enhanced crosslinking and immunoprecipitation (PAR-CLIP)

Crosslinking immunoprecipitation coupled with high-throughput sequencing (HITS-CLIP, CLIP-seq)

Individual-nucleotide resolution UV crosslinking and immunoprecipitation (iCLIP)

Microarray profiling of RNAs associated with immunopurified RNA-binding proteins (RIP-Chip)

Protein–protein interactions

Protein microarray, proteome microarrays

Mass spectrometry

Metabolomics

Nuclear magnetic resonance (NMR) spectroscopy

Mass spectrometry (MS)

Liquid chromatography–mass spectrometry (LC-MS)

Gas chromatography–mass spectrometry (GC-MS)

Flow injection analysis–mass spectrometry (FIA-MS/MS)

Epigenomics

Analysis of DNA methylation

Whole genome bisulfite sequencing (WGBS)

Targeted bisulfite sequencing (TBS-seq)

Methyl-CpG binding domain (MBD) protein-enriched genome sequencing (MBD-seq)

Methylated DNA immunoprecipitation (MeDIP)

DNA methylation analysis using restriction enzyme digestion

Array-based genome-wide DNA methylation analysis

Immunoprecipitation

Analysis of histone methylation

ChIP-Seq

Analysis of histone acetylation

ChIP-Seq

ChIP-Chip

Analysis of nucleosome position

MNase ChIP-Seq

miRNomics, ncRNomics

miRNA microarrays

RNA-seq

Pharmacogenomics

The chemical affinity capture and massively-parallel DNA sequencing (Chem-seq)

The crosslinking of small molecules to isolate chromatin (COSMIC)

Table 3.

Proposed Hierarchy Organization of the Whole Genome Study Approaches According to the Omics Types

1. Genomics/DNA level

Whole genome analysis (WGA), genome-wide analysis

Whole genome sequencing (WGS)

Whole exome sequencing (WES)

Genome-wide association study (GWAS)

Whole genome comparative QTL analysis

2. Transcriptomics/RNA level

Whole transcriptome analysis (WTA), transcriptome-wide analysis

Whole transcriptome sequencing (WTS)

Whole transcriptome shotgun sequencing (WTSS)

RNA sequencing (RNA-seq)

3. Proteomics

Whole proteome analysis

4. Interactomics

DNA–DNA interactions

Physical interactions

Genome-wide analysis of physical interactions between DNA molecules

Statistical/epistatic interactions

Genome-wide interaction analysis (GWIA)

Genome-wide association interaction analysis (GWAI)

Genome-wide gene–gene interaction analysis

Genome-wide search for SNP-SNP interactions

DNA–protein interactions

Genome-wide analysis of protein–DNA interactions

Genome-wide mapping of transcription factor binding sites

RNA–protein interactions

Genome-wide analysis of RNA–protein interactions

Transcriptome-wide identification of RNA-binding protein and miRNA target sites

Protein–protein interactions

Genome-wide protein–protein interactions analysis

Network-based studies

Network-based analysis of genome-wide association study

Protein interaction network-based analysis (PINBA)

Protein interaction network-based pathway analysis (PINBPA)

Integrative network-based association study (INAS)

Pathway-based studies

Pathway-based genome-wide association study

Pathway-based analysis (PBA)

Pathway-wide association study (PWAS)

5. Metabolomics

Metabolome-wide association studies (MWAS)

Metabolite-based genome-wide association study (mGWAS)

Metabolome-based genome-wide association study

6. Epigenomics

Epigenome-wide association study (EWAS)

Methylome-wide association study (MWAS)

Whole genome bisulfite sequencing (WGBS)

Genome-wide analysis of histone modifications

Genome-wide analysis of histone methylation

Genome-wide analysis of histone acetylation

Genome-wide nucleosome mapping

7. miRNomics, ncRNomics

Genome-wide miRNA profiling

Genome-wide analysis of long noncoding RNA (lncRNA) expression

8. Phenomics

Phenome-wide association study (PheWAS)

9. Environmental omics

Environment-wide association study (EWAS)

Genome-wide environmental interaction study (GWEI)

Genome-wide by environment interaction study (GWEIS)

10. Pharmacogenomics

Genome-wide mapping of drug–DNA interactions

Analysis of the collected literature revealed considerable disorganization in the terminology of this research area. For example, certain study approaches have been given several different names, some share the same acronym, and some have no acronym at all. To address this disorganization, we suggest some changes in terminology.
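The shared-acronym problem can be checked mechanically. The following is a minimal Python sketch, not part of the original study: it takes the acronym-bearing rows of Table 1 (the combined WTS/WTSS row is split into two entries, and rows without an acronym are omitted) and reports every acronym attached to more than one study approach. The function name shared_acronyms is hypothetical and introduced only for illustration.

```python
from collections import defaultdict

# (acronym, study approach) pairs copied from the acronym-bearing rows of Table 1.
TABLE_1 = [
    ("WGA", "Whole genome analysis"),
    ("WGS", "Whole genome sequencing"),
    ("WES", "Whole exome sequencing"),
    ("GWAS", "Genome-wide association study"),
    ("WTA", "Whole transcriptome analysis"),
    ("WTS", "Whole transcriptome sequencing"),
    ("WTSS", "Whole transcriptome shotgun sequencing"),
    ("GWIA", "Genome-wide interaction analysis"),
    ("GWAI", "Genome-wide association interaction analysis"),
    ("PINBA", "Protein interaction network-based analysis"),
    ("INAS", "Integrative network-based association study"),
    ("MWAS", "Metabolome-wide association studies"),
    ("mGWAS", "Metabolite-based genome-wide association study"),
    ("EWAS", "Epigenome-wide association study"),
    ("MWAS", "Methylome-wide association study"),
    ("WGBS", "Whole genome bisulfite sequencing"),
    ("PheWAS", "Phenome-wide association study"),
    ("EWAS", "Environment-wide association study"),
    ("GWEI", "Genome-wide environmental interaction study"),
]


def shared_acronyms(rows):
    """Return {acronym: [approaches]} for acronyms used by more than one approach."""
    by_acronym = defaultdict(list)
    for acronym, approach in rows:
        by_acronym[acronym].append(approach)
    return {a: names for a, names in by_acronym.items() if len(names) > 1}


collisions = shared_acronyms(TABLE_1)
for acronym, names in sorted(collisions.items()):
    print(acronym, "->", "; ".join(names))
```

On the rows above, only EWAS and MWAS are reported, which matches the two collisions discussed in the text. The comparison is case-sensitive, so mGWAS and GWAS are treated as distinct.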

Study approaches used for whole genome analyses

The term whole genome analysis (WGA) is used in two senses: in a narrow sense, WGA focuses on a genome at the DNA level; in a broad sense, WGA considers all omics approaches and the genome as a whole.

DNA level

Whole genome sequencing (genome-wide sequencing)

Whole genome sequencing (WGS) is the process of determining the entire DNA sequence of an organism's genome. The result is the precise order of nucleotides within the DNA molecules of the entire genome. Many different WGS methods exist. The first to be developed was traditional WGS, which involves cloning DNA fragments in microbial cells and sequencing these fragments with the Sanger dideoxy technique. Newly developed technologies, known as next-generation WGS, are cell-free methods and have replaced the traditional ones.

The cloned fragments are sequenced using NGS methods. Most WGS methods share the same basic steps: (1) breaking the DNA molecules of a genome into thousands to millions of more or less random, overlapping small segments; (2) sequencing each small segment; (3) computationally finding the overlaps among the small segments where their sequences are identical and reassembling the sequenced fragments into contigs; and (4) merging ever larger overlapping contigs until the whole sequence is complete (Bick and Dimmock, 2011; El-Metwally et al., 2013; Pareek et al., 2011).
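The overlap-and-merge logic of steps (3) and (4) can be illustrated with a toy greedy assembler. This is a minimal Python sketch under strong simplifying assumptions (error-free reads, a single strand, no repeats); production assemblers use far more elaborate algorithms. The function names overlap and greedy_assemble and the example reads are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap into a larger contig."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # no overlaps left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)


# Three overlapping fragments of a hypothetical 14-base sequence.
reads = ["GTTAGC", "ATGCGTAC", "GTACGTTA"]
assembled = greedy_assemble(reads)
print(assembled)
```

With these reads, the two 4-base overlaps are merged in turn and a single contig covering all three fragments remains.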

There are two main WGS strategies: de novo genome sequencing and whole genome resequencing. De novo sequencing is the initial determination of the primary genomic sequence of a specific organism and is crucial for detailed genetic analysis of any organism. In contrast, whole genome resequencing is one of the most frequently used applications of NGS technology; it is used to identify SNPs, indels, copy number variations, and structural variations, as well as for identification of multiple individuals or strains (Zhou et al., 2010). The bioinformatics analysis that follows DNA resequencing consists of three basic steps: alignment to a reference genome, identification of sequence variations (variant calling), and filtering and annotation of the identified variants (Dolled-Filhart et al., 2013).
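The variant calling and filtering steps can be sketched on toy data. The minimal Python example below assumes that alignment has already been done, so each read is given as a (start position, sequence) pair against a short hypothetical reference, and it handles only single-base substitutions. The function name call_variants and the thresholds min_depth and min_fraction are illustrative assumptions, not values taken from the cited work.

```python
from collections import Counter, defaultdict


def call_variants(reference, aligned_reads, min_depth=3, min_fraction=0.8):
    """Pile up aligned reads and report positions where a non-reference base dominates.

    Positions are zero-based. Returns (position, reference base, alternate base, depth).
    """
    pileup = defaultdict(Counter)
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pileup[start + offset][base] += 1

    variants = []
    for pos in sorted(pileup):
        counts = pileup[pos]
        depth = sum(counts.values())
        base, n = counts.most_common(1)[0]
        # Filtering: require enough coverage and a dominant non-reference base.
        if base != reference[pos] and depth >= min_depth and n / depth >= min_fraction:
            variants.append((pos, reference[pos], base, depth))
    return variants


reference = "ACGTACGTAC"
aligned_reads = [
    (0, "ACGTTCG"),  # carries T instead of A at position 4
    (2, "GTTCGTA"),  # carries T at position 4
    (3, "TTCGTAC"),  # carries T at position 4
    (5, "CGAAC"),    # single mismatch at position 7, treated as a sequencing error
]
variants = call_variants(reference, aligned_reads)
print(variants)
```

The substitution at position 4 is supported by all three covering reads and is reported, whereas the isolated mismatch at position 7 is filtered out because the reference base remains the most common one there.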

Genome sequencing projects were long limited to model organisms, but progress in high-throughput sequencing technology and the development of many bioinformatics tools have expanded their scope (Ekblom and Wolf, 2014). Many successful applications of WGS in identifying the etiology of complex diseases have been reported. For example, WGS results have helped in choosing the right therapeutic approach for neoplastic and nonneoplastic diseases and in various aspects of reproductive health (Chrystoja and Diamandis, 2014).

Whole exome sequencing (exome-wide sequencing)

Whole exome sequencing (WES) is used for sequencing the coding regions, or exons, of known genes, which constitute about 1–2% of the human genome (Volk et al., 2015). WES covers more than 95% of the exons, which contain 85% of disease-causing mutations. This is one of the reasons that sequencing of the entire human exome has the potential to reveal the main genetic causes of rare, mostly monogenic diseases (Mendelian disorders) as well as variants that predispose to disease. WES has been successfully used to identify the causative variants in several conditions, including hearing loss, intellectual disabilities, autism spectrum disorders, retinitis pigmentosa, diabetes, cardiovascular disease, hypertension, obesity, and cancer. Furthermore, in comparison to WGS, it is less costly and consequently more common in clinical and diagnostic tests (Rabbani et al., 2014).

GWAS (whole genome association study)

GWAS is a study approach that examines many common genetic variants, usually SNPs, across the genomes of different individuals to find genetic variants associated with a particular phenotypic trait or disease (Manolio, 2010). Studies using the GWAS approach have identified many genetic variants associated with human diseases such as age-related macular degeneration, Crohn's disease, schizophrenia, multiple sclerosis, ankylosing spondylitis, and rheumatoid arthritis (Hebbring, 2014; Manolio, 2010).

There are two main types of GWAS design: (1) case/control studies, which compare the DNA of individuals with and without the disease, and (2) quantitative trait association studies (Dubé and Hegele, 2013). The most commonly used genotyping technology is chip-based microarray technology, which can genotype over one million SNPs across the human genome in a single assay. Genotyping is followed by statistical analysis consisting of independent single-locus statistical tests that examine each SNP for association with the trait (Bush and Moore, 2012). GWAS results are usually presented as a Manhattan plot (Dubé and Hegele, 2013).
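A single-locus test in the case/control design can be sketched as an allelic chi-square test on a 2×2 table of allele counts, repeated independently for each SNP. The Python example below is a minimal illustration, not the procedure of any cited study: the SNP names and counts are hypothetical, and the Bonferroni correction is one common way, assumed here, to account for testing many SNPs. The p-value uses the identity that a chi-square variable with one degree of freedom has survival function erfc(sqrt(x/2)).

```python
import math


def allelic_chi2(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 table of allele counts."""
    a, b, c, d = case_alt, case_ref, ctrl_alt, ctrl_ref
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0  # degenerate table: no evidence of association
    chi2 = n * (a * d - b * c) ** 2 / denom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts: (case alt, case ref, control alt, control ref).
snps = {
    "snp_1": (300, 700, 200, 800),
    "snp_2": (250, 750, 245, 755),
}

alpha = 0.05
threshold = alpha / len(snps)  # Bonferroni-corrected significance threshold

results = {}
for name, counts in snps.items():
    chi2, p = allelic_chi2(*counts)
    # -log10(p) is the quantity plotted on the y-axis of a Manhattan plot.
    neg_log_p = -math.log10(p) if p > 0 else float("inf")
    results[name] = (chi2, p, p < threshold)
    print(f"{name}: chi2={chi2:.2f}, p={p:.3g}, -log10(p)={neg_log_p:.2f}")
```

With the same correction applied to one million SNPs, the threshold becomes 0.05 / 1,000,000 = 5 × 10^-8, which shows why obtaining significant results requires large samples.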

Even though GWASs have identified thousands of genes and genetic variants that contribute to complex traits, especially diseases, many limitations remain to be overcome. For instance, statistically significant results are difficult to obtain, GWAS results often explain only part of the heritable variation, and the vast majority of GWAS-significant SNPs are intergenic, which makes identifying and characterizing these SNPs even more challenging (Hebbring, 2014).

Whole genome comparative quantitative trait locus analysis

Whole genome comparative quantitative trait locus (QTL) analysis is a study approach that compares the locations of QTL controlling the studied traits in at least two species. For example, a whole genome comparative QTL analysis has been performed for fruit weight, pericarp thickness, and fruit shape in tomato and pepper. The authors identified 95 QTL, including two QTL for fruit weight that were common to both species (Chaim et al., 2006).

RNA level, transcriptomics

Whole transcriptome analysis, transcriptome-wide analysis, genome-wide transcriptome analysis

Whole transcriptome analysis (WTA), also known as transcriptome-wide analysis or gene expression profiling, is a study approach that measures the expression of thousands of genes at once. The aim of WTA is to capture coding and noncoding RNA (ncRNA) to quantify gene expression and identify heterogeneity of gene expression profiles in cells, tissues, organs, and even a whole body. It is used for decoding genome structure, characterizing and annotating genes/genomes previously revealed by DNA sequencing, identifying genetic networks of cellular, physiological, biochemical, and biological systems, and determining molecular biomarkers that respond to diseases, pathogens, and environmental challenges (Jiang et al., 2015). The most common techniques that are used for gene expression profiling are DNA microarrays and RNA sequencing (RNA-seq) (Mantione et al., 2014; Qian et al., 2014).

Whole transcriptome sequencing, whole transcriptome shotgun sequencing, RNA-seq

Whole transcriptome sequencing (WTS), also known as whole transcriptome shotgun sequencing (WTSS) or RNA-seq, is the application of any type of NGS technique to the study of RNA (Chu and Corey, 2012). This study approach has become more widely available with the development of NGS techniques. RNA-seq technology is used for (1) the quantitative analysis of transcript expression levels, (2) the annotation of transcripts, (3) the detection of low-abundance transcripts, (4) the study of mRNA processing, (5) the analysis of alternative splicing, and (6) the study of polyadenylation (Marguerat and Bähler, 2010). The sequencing process is almost the same for RNA-seq and DNA sequencing, but the library preparation and the data analysis differ. The preparation of an RNA-seq library usually includes reverse transcription and labeling with adapter sequences (Chu and Corey, 2012; Mantione et al., 2014).
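For the first of these uses, the quantitative analysis of transcript expression levels, read counts are usually normalized before samples or genes are compared. The Python sketch below shows one widely used normalization, transcripts per million (TPM), which is an assumption of this example and is not discussed in the cited reports. The gene names, counts, and lengths are hypothetical.

```python
def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalized read rates scaled to sum to one million."""
    # Reads per kilobase of transcript corrects for longer transcripts yielding more reads.
    rates = {gene: counts[gene] / (lengths_bp[gene] / 1000) for gene in counts}
    total = sum(rates.values())
    if total == 0:
        raise ValueError("no reads mapped to any gene")
    return {gene: rate / total * 1_000_000 for gene, rate in rates.items()}


counts = {"gene_A": 100, "gene_B": 200, "gene_C": 0}
lengths_bp = {"gene_A": 1000, "gene_B": 4000, "gene_C": 2000}

expression = tpm(counts, lengths_bp)
for gene, value in expression.items():
    print(f"{gene}: {value:.1f} TPM")
```

In this example gene_B has twice as many reads as gene_A but is four times longer, so its TPM value is half that of gene_A.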

Proteomics

Whole proteome analysis

Proteomics is the large-scale study of proteins and their characteristics, including expression level, posttranslational modifications, and interactions. The aim of this study approach is to obtain a global integrated view of disease processes, cellular processes, and networks at the protein level (Blackstock and Weir, 1999). There are several methods for the detection, identification, and functional investigation of the proteome, such as two-dimensional gel electrophoresis (Rabilloud et al., 2010), mass spectrometry (Lesur and Domon, 2015), and protein microarrays (Uzoma and Zhu, 2013). Protein microarrays, also known as proteome microarrays, offer a wide variety of applications and have been used to study several types of interactions, such as protein–protein, protein–DNA, protein–lipid, protein–drug, protein–receptor, and antigen–antibody interactions (Hall et al., 2007), as well as to identify protein posttranslational modifications such as phosphorylation, ubiquitylation, acetylation, and S-nitrosylation (Zhu and Qian, 2012). For example, the human proteome microarray has been used to identify proteins that interact with alpha A-crystallin (Fan et al., 2014). Furthermore, the human S-nitrosoproteome has been analyzed using a high-density protein microarray chip, resulting in the identification of 834 potentially S-nitrosylated human proteins (Lee et al., 2014).

Interactomics

Interactomics includes several study approaches that consider various types of interactions such as DNA–DNA, DNA–protein, RNA–protein, and protein–protein.

DNA–DNA interactions

Published reports describe two types of DNA–DNA interactions, physical interactions and statistical (epistatic) interactions.

Physical interactions

Genome-wide analysis of physical interactions between DNA molecules

Chromosome conformation capture (3C) is a method that detects short- and long-range physical DNA–DNA interactions. It was first developed for analysis of chromosome conformation in yeast (Dekker et al., 2002). This technique has been rapidly adopted as a standard research tool for studying chromosomal interactions (Li, 2016). An extension of the 3C method is the Hi-C method, which can identify long-range interactions in an unbiased genome-wide manner and is used for studying the three-dimensional architecture of genomes. The three-dimensional organization of a genome plays important roles in the regulation of genes and the functional state of the cell, yet it is poorly understood how genomes are spatially organized and how this affects gene expression. The Hi-C method provides a promising tool for deciphering the relationship between chromosome organization and genome activity, and it will contribute to the understanding of genomic processes such as transcription and replication (Lieberman-Aiden et al., 2009; van Berkum et al., 2010).

Statistical/epistatic interactions

Several reported discoveries from GWAS have small effects on the observed phenotype and explain only a small proportion of the variance in a trait that is known to be influenced by genetics (Hibar et al., 2015). It has been hypothesized that multivariate approaches such as analyses of genetic interactions may resolve the problem of “missing heritability” observed in GWAS analysis. Therefore, several study approaches have been reported to identify DNA–DNA interactions on a genome-wide scale, including genome-wide interaction analysis (GWIA) (Hibar et al., 2015), GWAI (Gusareva et al., 2014), genome-wide search for SNP–SNP interactions (Murk and DeWan, 2016), and genome-wide gene–gene interaction analysis (Chu et al., 2014).

Methods developed for the detection of interactions typically focus on SNP interactions (epistatic interactions) (Goudey et al., 2013). These methods have been used for various analyses, for example: testing all possible SNP–SNP interactions affecting regional brain volumes (Hibar et al., 2015), identification of interacting SNP pairs associated with Alzheimer's disease (Gusareva et al., 2014), identification of SNP–SNP interactions associated with 10 common human diseases (Murk and DeWan, 2016), and identification of an epistatic gene pair that was significantly associated with the risk of developing lung cancer (Chu et al., 2014). Although improved processor speeds have brought two-way interaction analyses into common use, there are still limitations due to underpowered statistical approaches (Goudey et al., 2013; Hibar et al., 2015). Currently, software development is heading toward algorithms that will also allow analysis of three-way and higher-order interactions (Goudey et al., 2013).
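The computational and statistical burden of exhaustive interaction testing follows from simple combinatorics: the number of k-way combinations among n SNPs is the binomial coefficient C(n, k). The short Python sketch below illustrates this for the roughly one million SNPs that a genotyping array can assay; the function name n_interaction_tests and the toy SNP labels are hypothetical.

```python
import math
from itertools import combinations


def n_interaction_tests(n_snps, order=2):
    """Number of distinct SNP combinations of the given order (2 = pairs, 3 = triples)."""
    return math.comb(n_snps, order)


# Exhaustive enumeration is only feasible for small panels.
panel = ["snp_1", "snp_2", "snp_3", "snp_4", "snp_5"]
pairs = list(combinations(panel, 2))
print(len(pairs), "pairs in a 5-SNP panel")

# Counts for a one-million-SNP array.
n = 1_000_000
pair_tests = n_interaction_tests(n, 2)
triple_tests = n_interaction_tests(n, 3)
print(f"two-way tests:   {pair_tests:.3e}")
print(f"three-way tests: {triple_tests:.3e}")
```

Moving from pairs to triples multiplies the number of tests by roughly n/3, which is consistent with the limited statistical power noted above once the results are corrected for multiple testing.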

DNA–protein interactions

Genome-wide analysis of protein–DNA interactions

Genome-wide analysis of protein–DNA interactions is a study approach that identifies interactions between proteins and specific regions of the genome. ChIP is a common method for detecting interactions between a protein and DNA sequence in vivo. In addition, this method has been combined with DNA microarrays (ChIP–chip) and NGS (ChIP-seq) and thus has enabled genome-wide identification of DNA-binding sites for numerous nuclear proteins (Hoffman and Jones, 2009; Kim and Ren, 2006; Pugh and Gilmour, 2001).

Genome-wide mapping of transcription factor binding sites

To understand the regulation of gene expression, it is also crucial to identify gene-specific transcription factors and transcription factor binding sites (TFBSs). Genome-wide mapping of TFBSs is a study approach that identifies the specific genomic regions where transcription factors bind. Genome-wide maps of TFBSs in primary tissues can increase our knowledge about genome function, transcriptional regulation, and genetic alterations that contribute to disease risk (Savic et al., 2013). For example, Hif-1α binding sites in the zebrafish genome have been identified using the ChIP-seq method, enabling development of zebrafish as an emergent model organism for research into the hypoxic response in a physiological context (Greenald et al., 2015).

RNA–protein interactions

Genome-wide analysis of RNA–protein interactions

Genome-wide analysis of RNA–protein interactions is a study approach that identifies the interactions between RNA-binding proteins (RBPs) and specific RNA molecules (Barkan, 2009). Common methods for studying interactions between proteins and RNA molecules are crosslinking and immunoprecipitation (CLIP) (Ule et al., 2003) and its variations, including photoactivatable ribonucleoside-enhanced CLIP (PAR-CLIP) (Hafner et al., 2010a) and individual-nucleotide resolution UV CLIP (iCLIP) (König et al., 2010). For instance, transcriptome-wide identification of RBP and microRNA (miRNA) target sites has been performed using the PAR-CLIP method (Hafner et al., 2010b).

Protein–protein interactions

Genome-wide protein–protein interaction analysis

Genome-wide protein–protein interaction (PPI) analysis is a study approach that combines data from different sources to predict PPIs and establish PPI networks (Lv et al., 2015). PPIs are generally identified by pull-down experiments or similar techniques. However, this experimental approach is too slow and too expensive to meet the goal of identifying all the PPIs necessary to understand the functional and dynamic properties of the cell. Therefore, several computational methods have been developed to complement experimental methods. They can efficiently integrate data from several sources to predict the probability of interaction between two proteins (McDowall et al., 2009). The main aim of genome-wide PPI analysis is to investigate the functions of proteins and thus provide a better understanding of the biological mechanisms of proteins, protein complexes, and their molecular interactions (Lv et al., 2015).

Network-based studies

Network-based analysis of GWASs

Network-based approaches have become powerful tools for studying complex diseases. It is known that genes, gene products, and small molecules interact with each other and form a complex interaction network (Cho et al., 2012). The aim of network-based analysis of GWASs is the understanding of the way multiple modestly-associated genes interact to influence a phenotype (Chang et al., 2015a). Different intermolecular interactions exist such as PPIs, protein–DNA interactions, and RNA interactions. These interactions are represented as networks with nodes that symbolize molecules and links that symbolize their interactions.
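The node-and-link representation described above maps directly onto an adjacency structure. The following minimal Python sketch stores each molecule as a node and each interaction as an undirected, typed link; the molecule names and the function name build_network are hypothetical and serve only as illustration.

```python
from collections import defaultdict


def build_network(links):
    """Return {node: {neighbor: interaction type}} for an undirected interaction network."""
    network = defaultdict(dict)
    for a, b, kind in links:
        network[a][b] = kind
        network[b][a] = kind
    return dict(network)


# Each link is (molecule, molecule, type of interaction).
links = [
    ("protein_A", "protein_B", "protein-protein"),
    ("protein_B", "gene_C", "protein-DNA"),
    ("protein_A", "rna_D", "RNA"),
]

network = build_network(links)
# The degree of a node is the number of molecules it interacts with.
degree = {node: len(neighbors) for node, neighbors in network.items()}

for node in sorted(network):
    print(node, "->", sorted(network[node]), f"(degree {degree[node]})")
```

Keeping the interaction type on each link allows the same structure to hold PPIs, protein–DNA interactions, and RNA interactions side by side.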

There are two types of interaction networks: physical and functional. Physical networks represent contacts between proteins, whereas functional interaction networks aim to connect genes with similar or related functions. In studies of complex diseases, researchers usually focus on groups of related genes, referred to as modules or subnetworks. It is often easier to predict the function of a module than the function of a single gene. Modules are identified by using diverse genomics information, including genotypic data and expression profiles identified in disease samples (Cho et al., 2012). Several software tools have been developed to generate networks and modules, including network interface miner for multigenic interactions (NIMMI) (Akula et al., 2011) and ARACNE (Margolin et al., 2006). Most network-based analyses combine GWAS data with PPI data (Akula et al., 2011; Baranzini et al., 2009; Consortium, 2013; Zhang et al., 2015a). A simplified workflow of the network-based study approach is presented in Figure 3a.

FIG. 3.

The comparison of the network-based and pathway-based genome-wide association study approaches.

Protein interaction network-based analysis

Protein interaction network-based analysis (PINBA) of GWAS data is a network-based method for identifying susceptibility genes, which investigates whether a set of genes with related function is jointly associated with a trait or disease of interest. PINBA combines GWAS data and prior biological knowledge about protein interactions to determine associability (Yu et al., 2014). Protein interaction network-based pathway analysis (PINBPA) has been used to investigate missing heritability in multiple sclerosis. The authors combined two largely independent multiple sclerosis GWAS data sets and discovered that proteins encoded by genes that contain risk variants are more likely to interact with each other and to take part in the same or related pathways (Consortium, 2013).

Integrative network-based association study

Integrative network-based association study (INAS) integrates different types of data, including GWAS data and protein–protein and protein–DNA interactions, to generate regulatory networks and thus provides an understanding of the roles of genetics and epigenetics in disease predisposition and etiology (Califano et al., 2012). For example, Lage et al. performed a systematic analysis of human protein complexes consisting of gene products involved in several categories of human diseases to create a phenome–interactome network. They integrated quality-controlled data on human protein interactions with a validated, computationally derived phenotype similarity score (Lage et al., 2007).

Pathway-based studies, gene set enrichment analysis

Pathway-based GWAS, pathway-based analysis

Pathway-based study approaches examine whether a group of related genes in the same functional pathway is jointly associated with a studied trait (Wang et al., 2010). Several different names have been used for this study approach, such as pathway-based GWAS (Wang et al., 2010), pathway-based analysis (PBA) (Chang et al., 2015b), and pathway-wide association study (PWAS) (Califano et al., 2012). The term “pathway” in GWASs usually means a gene set rather than an interconnected biological process (Jiao et al., 2015). Pathway-based studies are complementary to GWAS. GWASs have focused on the analysis of single markers and have thus often overlooked genetic variants with relatively small effect sizes.

In contrast, pathway-based GWAS can detect the additive effects of multiple minor genes. It therefore has a better chance of identifying novel genes, sets of genes, and mechanisms that are involved in disease pathogenesis. Pathway-based GWASs usually examine a collection of predefined gene sets for pathways based on prior biological knowledge. In recent years, a variety of methods have been developed for pathway-based association analysis. The disadvantage of PBA is that it treats a pathway as a single unit and thus cannot detect a small portion of the pathway, or other new combinations of genes, that may be associated with a trait (Chang et al., 2015b). A simplified workflow of the pathway-based study approach is presented in Figure 3b.
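One simple way to ask whether a predefined gene set is jointly associated with a trait is an over-representation test based on the hypergeometric distribution. The sketch below is an illustration under that assumption; it is not necessarily the method used in the studies cited above, and all counts are hypothetical.

```python
from math import comb


def enrichment_p_value(total_genes, pathway_size, hits, hits_in_pathway):
    """Probability of observing at least `hits_in_pathway` associated genes
    in a pathway of `pathway_size` genes, when `hits` associated genes are
    drawn at random from `total_genes` (hypergeometric upper tail)."""
    upper = min(pathway_size, hits)
    favourable = sum(
        comb(pathway_size, k) * comb(total_genes - pathway_size, hits - k)
        for k in range(hits_in_pathway, upper + 1)
    )
    return favourable / comb(total_genes, hits)


# Hypothetical counts: 20 genes tested, 5 in the pathway,
# 4 trait-associated genes overall, 3 of them inside the pathway.
p_value = enrichment_p_value(total_genes=20, pathway_size=5, hits=4, hits_in_pathway=3)
print(f"enrichment p-value: {p_value:.4f}")
```

Because the test treats the gene set as one unit, it shares the limitation noted above: a small associated portion of a large pathway can be diluted by the remaining genes.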

Metabolomics

Metabolite-based GWAS (metabolome-based GWAS)

Metabolite-based GWAS (mGWAS), also called metabolome-based GWAS, is a combination of GWAS and metabolomics (Luo, 2015). It is defined as a genome-wide association study with metabolic traits as phenotypic traits (Gieger et al., 2008). The metabolite spectrum and the metabolites' quantities may be viewed as the metabolic phenotype, or metabotype. If the metabotype provides a link between gene sequence and visible phenotypes, metabolites can be used as biomarkers for prediction of the traits (Wen et al., 2014). The first mGWAS investigated associations between genetic variants and metabolite profiles in human serum (Gieger et al., 2008). In plants, this approach has been used to discover which DNA variants have a significant impact on metabolic changes. This has extended knowledge of the genetic contribution to plant metabolism and of previously undiscovered biochemical pathways or pathway interactions (Luo, 2015).
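Treating a metabolite concentration as the phenotype can be sketched as a per-SNP regression of metabolite level on allele dosage. The example below assumes a simple additive model fitted by ordinary least squares; the SNP names and measurements are hypothetical, and a real mGWAS would also adjust for covariates and correct for multiple testing.

```python
from statistics import mean


def additive_effect(dosages, metabolite_levels):
    """Least-squares slope of metabolite level on allele dosage (0, 1, or 2)."""
    g_mean = mean(dosages)
    m_mean = mean(metabolite_levels)
    sxy = sum((g - g_mean) * (m - m_mean) for g, m in zip(dosages, metabolite_levels))
    sxx = sum((g - g_mean) ** 2 for g in dosages)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")
    return sxy / sxx


# Hypothetical genotypes (allele dosages) for six individuals at two SNPs,
# and one hypothetical metabolite measured in the same individuals.
dosages_by_snp = {
    "snp_a": [0, 1, 2, 0, 1, 2],
    "snp_b": [0, 0, 1, 1, 2, 2],
}
metabolite = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]

effects = {snp: additive_effect(d, metabolite) for snp, d in dosages_by_snp.items()}
for snp, beta in effects.items():
    print(f"{snp}: change in metabolite level per allele = {beta:.2f}")
```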

Metabolome-wide association studies

Metabolome-wide association study (MWAS) is a study approach that investigates associations between metabolic phenotypes and disease risk and can thus be used to identify disease-related biomarkers (Chadeau-Hyam et al., 2010). MWAS shares several features with GWAS, such as the case/control study design, high throughput, discovery of novel associations, large complex data arrays, and the potential for false-positive associations (Bictash et al., 2010; Osborn et al., 2013). It starts with metabolic profiling, which applies metabolomics techniques such as nuclear magnetic resonance spectroscopy and mass spectrometry (including liquid chromatography–mass spectrometry, gas chromatography–mass spectrometry, and flow injection analysis mass spectrometry) to measure hundreds or thousands of metabolites in cells, biofluids, or tissues (Chadeau-Hyam et al., 2010; Petersen et al., 2012). The next step is statistical analysis of the data, which detects statistically significant relationships between molecular variables and phenotype (Chadeau-Hyam et al., 2010).

Epigenomics

Epigenome-wide association study

Epigenome-wide association study (EWAS) is a study approach used to analyze epigenetic marks, most often DNA methylation and histone modifications (Flanagan, 2015). Using the EWAS approach, numerous significant associations have been identified for several diseases, especially in the context of cancer, exposures, and lifestyle factors (Flanagan, 2015; Rakyan et al., 2011).

Methylome-wide association study

Methylome-wide association study (MWAS) is a subclass of EWAS. DNA methylation is associated with gene silencing (Bird, 2002). Methylation studies are a promising complement to genetic studies. First, methylation affects gene expression, and knowledge about DNA methylation can therefore contribute to the understanding of disease etiology. Dysregulation of DNA methylation has been associated with many human diseases, such as Alzheimer's and Parkinson's diseases, systemic lupus erythematosus, cardiovascular disease, and several types of cancer. Second, methylation is itself affected by many factors that can influence the development of diseases, such as age, sex differences, and genotype–environment interactions. Finally, methylation sites are potential new drug targets (Aberg et al., 2012; Chen et al., 2013). The most common techniques for profiling DNA methylation are sequencing and array-based profiling technologies (Rakyan et al., 2011).

Whole genome bisulfite sequencing (genome-wide bisulfite sequencing)

Bisulfite genomic sequencing is a technology for the detection of DNA methylation that provides a qualitative, quantitative, and efficient approach to identify 5-methylcytosine at single base-pair resolution (Li and Tollefsbol, 2011). Whole genome bisulfite sequencing (WGBS or WGBS-seq) is a combination of bisulfite conversion of genomic DNA and NGS (Lee et al., 2013). After treatment with sodium bisulfite, unmethylated cytosines are converted into uracils, while methylated cytosines are resistant to this conversion and remain as cytosines (Li and Tollefsbol, 2011). The next step is sequencing, which usually includes the pyrosequencing method, a quantitative method for DNA methylation analysis (Lee et al., 2013).
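The logic of bisulfite conversion can be illustrated with a short simulation. The sketch below assumes a single strand, complete conversion, and that uracil is read as thymine (T) after amplification and sequencing; the sequence and methylated positions are hypothetical.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Simulate bisulfite treatment plus sequencing of one strand.

    Unmethylated cytosines are read as T (uracil after conversion),
    whereas methylated cytosines remain C.
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and index not in methylated else base
        for index, base in enumerate(sequence)
    )


def call_methylation(reference, converted_read):
    """Infer methylation status at every reference cytosine.

    Returns {position: True} for methylated and {position: False}
    for unmethylated cytosines.
    """
    if len(reference) != len(converted_read):
        raise ValueError("reference and read must have the same length")
    calls = {}
    for index, (ref_base, read_base) in enumerate(zip(reference, converted_read)):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[index] = True
        elif read_base == "T":
            calls[index] = False
    return calls


# Hypothetical reference with methylated cytosines at positions 1 and 5.
reference = "ACGTCCGA"
read = bisulfite_convert(reference, methylated_positions=[1, 5])
calls = call_methylation(reference, read)
print(read)
print(calls)
```

Comparing the converted read with the reference recovers the methylation status of each cytosine, which is what gives the approach its single base-pair resolution.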

WGBS-seq has been successfully used to map the complete methylomes of several human embryonic stem cell (ESC) lines, human peripheral mononuclear cells, and hematopoietic progenitor cells. WGBS-seq analyses have also contributed to the discovery of large partially methylated domains in cancer cells. Although WGBS-seq has many advantages, it is not economically feasible for large sample sizes. Therefore, alternative sequencing approaches have been developed, such as targeted bisulfite sequencing (Lee et al., 2013), methyl-CpG binding domain (MBD) protein-enriched genome sequencing (MBD-seq), and methylated DNA immunoprecipitation (Chen et al., 2013; McClay et al., 2014).

Genome-wide analysis of histone modifications

Histone modifications are post-translational modifications of histone proteins, including methylation, phosphorylation, acetylation, ubiquitylation, and SUMOylation. These modifications can affect gene expression either by directly altering chromatin structure or by providing binding sites for the recruitment of nonhistone proteins to chromatin (Zhang et al., 2015b).

Genome-wide analysis of histone methylation

Methylation of histone N-terminal tails can act to regulate chromatin states. Methylation of lysines 4, 36, and 79 of histone H3 is associated with euchromatin, while the methylation of histone H3 lysines 9, 27, and histone H4 lysine 20 is associated with heterochromatin and gene silencing (Barski et al., 2007; Guenther et al., 2007).

A genome-wide analysis of histone methylation at two histone H3 lysine residues (H3K4me3 and H3K27me3), together with gene expression profiling, has been performed in naive and memory CD8⁺ T cells. It identified a correlation between gene expression and the amounts of H3K4me3 (positive correlation) and H3K27me3 (negative correlation) across the gene body. Histone methylation was analyzed using ChIP-seq, and the global gene expression profile was obtained using a human whole genome chip. The correlation between the amount of H3K4me3 or H3K27me3 and mRNA levels was confirmed by real-time polymerase chain reaction (PCR) or quantitative PCR (qPCR) and reverse transcription PCR (RT-PCR) (Araki et al., 2009).

Genome-wide analysis of histone acetylation

Epigenetic alterations, including histone acetylation and deacetylation, are important parts of gene regulation. Acetylation of histone H3 lysines 9 and 14 is associated with euchromatin and active gene transcription (Araki et al., 2009). Genome-wide histone H3 acetylation patterns in acute myeloid leukemia (AML) have been analyzed using ChIP-chip in a group of patients with AML and controls to identify epigenetic alterations in AML. The authors discovered that, in AML, core promoter regions have lower histone H3 acetylation levels than in CD34⁺ progenitor cells (Agrawal-Singh et al., 2012). In addition, Liu et al. (2015) performed a genome-wide analysis of histone acetylation dynamics during mouse ESC (mESC) neural differentiation by ChIP-seq and revealed that, during mESC neural differentiation, the histone H3 acetylation level increases at neural gene loci and decreases at neural inhibitory gene loci.

Genome-wide nucleosome mapping

The availability of genome-wide nucleosome maps has contributed to knowledge about the factors that affect nucleosome positions and about how these positions in turn influence and regulate various cellular processes, including regulation of gene expression, chromosome segregation, DNA replication and repair, and recombination. These findings provide an overall picture of the complicated interactions between DNA sequence, various regulatory processes, and the three-dimensional chromatin structure. Most nucleosome mapping techniques share the same basic principles. First, the unprotected linker DNA is hydrolyzed using micrococcal nuclease (MNase), followed by isolation, purification, and analysis of the undigested, nucleosomal DNA fragments. The available techniques provide different levels of resolution and accuracy for nucleosome positioning analysis. Genome-wide studies often use microarray hybridization or high-throughput sequencing to identify nucleosome positions. Nucleosomes have been mapped both in vivo and in vitro. In vivo studies represent the chromatin organization in living cells, whereas in vitro studies focus only on DNA-encoded sequence preferences and exclude the effects of other trans factors (Jansen and Verstrepen, 2011).

miRNomics, ncRNomics

Noncoding RNAs (ncRNAs) are functional RNAs that are not translated into proteins. Based on a length threshold of 200 nucleotides (nt), they are divided into short and long ncRNAs (lncRNAs). miRNAs are around 22 nt in length. They bind to different target regions and silence or activate expression of target genes (Kunej et al., 2012). They play an important role in the regulation of several processes, including organismal development and the establishment and maintenance of tissue differentiation (Pritchard et al., 2012). lncRNAs are still poorly characterized, but it has recently been shown that they play an important role in several biological processes, including gene expression, immune response, cellular development, and metabolism, through comprehensive mechanisms (Yarmishyn and Kurochkin, 2015; Zhu et al., 2014).

Genome-wide miRNA profiling (whole genome miRNA profiling)

Genome-wide miRNA profiling is a study approach that measures the expression of miRNA molecules in selected cells or tissues. The importance of miRNAs in the regulation of gene expression has led to the development of miRNAs as biomarkers for various molecular diagnostic applications in cancer and in cardiovascular and autoimmune diseases, as well as to the development of miRNA-based methods for forensic analysis (Pritchard et al., 2012). The most common approaches used for genome-wide miRNA profiling are miRNA microarrays and NGS (RNA-seq) (Liu et al., 2004; Munker and Calin, 2011; Pritchard et al., 2012). RNA-seq has several advantages over other approaches, including the discovery of novel miRNAs and the identification of low-abundance miRNAs (Morin et al., 2008).

Genome-wide analysis of lncRNA expression

Genome-wide analysis of lncRNA expression is a study approach that measures the expression of lncRNA molecules in selected cells or tissues. The most common methods used for genome-wide analysis of lncRNA expression are microarray assays, usually validated by qPCR analysis, and high-throughput sequencing technologies (Sui et al., 2013; Yarmishyn and Kurochkin, 2015).

Phenomics

Phenome-wide association study

Phenome-wide association study (PheWAS) is a study approach that investigates the association between SNPs and a wide spectrum of phenotypes, known as the phenome (Pendergrass et al., 2012). It is complementary to GWAS and is also known as “reverse GWAS” (Denny et al., 2013; Pendergrass et al., 2012). GWAS is a phenotype-to-genotype approach: it begins with a specific phenotype and tests for associations over a wide spectrum of genetic variants across the genome. PheWAS uses the reverse, genotype-to-phenotype strategy: it starts with a genotype and tests for associations over a wide spectrum of human phenotypes. The comparison between the PheWAS and GWAS approaches is presented in Figure 2.

Most PheWASs have in common the use of an electronic medical record (EMR) to define the phenome. The EMR includes many types of data, such as prescription records, family histories, laboratory and imaging test results, physician notes, procedure codes, and, most importantly, the International Classification of Disease (ICD) codes, which are internationally standardized codes used for defining disease status. Patients coded for a specific ICD-9 code are usually identified as “cases” for that code, whereas patients not coded for that ICD-9 code become “controls.” The success of a PheWAS depends on how well the phenome can be defined. Once the phenome is constructed, association testing can begin (Hebbring, 2014). PheWAS results may be useful for detecting novel relationships between SNPs and phenotypes, as well as for identifying pleiotropy (Pendergrass et al., 2012).
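The case/control definition described above can be sketched in a few lines of Python. For each code, patients carrying the code are cases and all others are controls, and carriers of a variant are compared with non-carriers. The patient records and code labels are hypothetical placeholders, and the odds ratio with a 0.5 continuity correction (Haldane-Anscombe) is an assumption chosen for brevity; published PheWASs typically use regression models with covariates.

```python
def phewas_odds_ratios(patient_codes, carriers):
    """Compute one odds ratio per diagnosis code.

    patient_codes: dict mapping patient id -> set of diagnosis codes
    carriers: set of patient ids carrying the variant of interest
    Each cell of the 2x2 table receives a 0.5 continuity correction.
    """
    all_codes = set().union(*patient_codes.values())
    results = {}
    for code in sorted(all_codes):
        case_carrier = case_noncarrier = control_carrier = control_noncarrier = 0
        for patient, codes in patient_codes.items():
            is_case = code in codes          # coded for this diagnosis -> case
            is_carrier = patient in carriers
            if is_case and is_carrier:
                case_carrier += 1
            elif is_case:
                case_noncarrier += 1
            elif is_carrier:
                control_carrier += 1
            else:
                control_noncarrier += 1
        results[code] = ((case_carrier + 0.5) * (control_noncarrier + 0.5)) / (
            (case_noncarrier + 0.5) * (control_carrier + 0.5)
        )
    return results


# Hypothetical EMR extract: placeholder codes, not real ICD-9 entries.
patient_codes = {
    "p1": {"CODE_A"},
    "p2": {"CODE_A", "CODE_B"},
    "p3": {"CODE_B"},
    "p4": set(),
    "p5": {"CODE_A"},
    "p6": set(),
}
carriers = {"p1", "p2", "p5"}

odds_ratios = phewas_odds_ratios(patient_codes, carriers)
for code, odds_ratio in odds_ratios.items():
    print(code, round(odds_ratio, 2))
```

Running the same genotype against every code in the phenome is what distinguishes this genotype-to-phenotype design from a GWAS.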

FIG. 2.

A schematic diagram of the comparison between PheWAS and GWAS approaches. GWAS, genome-wide association study; PheWAS, phenome-wide association study.

Environmental omics

Environmental omics is the application of different omics technologies to better understand the effect of environmental and genetic factors on phenotypes (Ge et al., 2013; Morrison et al., 2006).

Environment-wide association study

It has become clear that measuring both genetic and environmental factors is crucial for a better understanding of complex diseases. Environment-wide association study (EWAS) is a study approach that searches, on a broad scale, for environmental factors associated with disease development. Two EWASs for type 2 diabetes have been performed to date. Patel et al. conducted their study using cross-sectional epidemiological data from the National Health and Nutrition Examination Survey (NHANES). This survey collects data by querying participants about their health status, and an extensive set of clinical and laboratory tests is performed on a subset of these individuals. Several environmental characteristics are assayed, such as chemical toxicants, pollutants, allergens, bacterial/viral organisms, and nutrients.

In contrast, Hall et al. used four tools for measuring environmental exposures and outcome traits: the PhenX Toolkit, the Diet History Questionnaire (DHQ), the Measurement of a Person's Habitual Physical Activity, and electronic health records. The results of the statistical analysis are usually presented as a Manhattan plot, with the X-axis representing the different environmental classes and the Y-axis indicating the −log10(p-value) of the adjusted logistic regression coefficient for each environmental factor. Integration of EWAS with GWASs and PheWASs will increase knowledge about the complex interplay of genes and environment in complex traits and will also elucidate the ways in which exposures modulate pleiotropy (Hall et al., 2014; Patel et al., 2010).
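The coordinates of such a Manhattan plot follow directly from the description above: one X position per environmental class and −log10(p-value) on the Y-axis. The sketch below only computes the coordinates and leaves drawing to any plotting library; the class names, factor names, and p-values are hypothetical.

```python
from math import log10


def manhattan_points(p_values_by_class):
    """Convert {environmental class: {factor: p-value}} into plot rows.

    Each row is (x, factor, y), where x is the index of the environmental
    class in alphabetical order and y is -log10(p-value).
    """
    points = []
    for x, env_class in enumerate(sorted(p_values_by_class)):
        for factor, p_value in sorted(p_values_by_class[env_class].items()):
            points.append((x, factor, -log10(p_value)))
    return points


# Hypothetical p-values grouped by environmental class.
example = {
    "nutrients": {"factor_1": 0.01},
    "pollutants": {"factor_2": 1e-5, "factor_3": 0.5},
}

points = manhattan_points(example)
for x, factor, y in points:
    print(x, factor, round(y, 2))
```

Smaller p-values map to taller points, so the strongest associations stand out above the rest of each class.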

Genome-wide environmental interaction study, genome-wide by environment interaction study

Genome-wide environmental interaction study (GWEI) (Aschard et al., 2012) or genome-wide by environment interaction study (GWEIS) (Dunn et al., 2016) is a study approach that investigates the gene–environment (G-E) interactions on a genome-wide scale.

It is believed that complex diseases result from the combined effects of genes, environmental factors, and their interactions; therefore, the study of G-E interactions plays an important role in discovering disease-causing mechanisms (Aschard et al., 2012; Winham and Biernacka, 2013). A G-E interaction is generally considered to occur if the genotype at a locus modifies the effect of an environmental factor on the phenotype or, conversely, if an environmental factor modifies the effect of a genetic variant on the phenotype. Genome-wide environmental interaction studies are conducted by testing the interaction between each genotyped SNP and the environmental factor. This approach has been recognized as severely underpowered for the analysis of G-E interactions. Therefore, several screening approaches, including two-step/filtering methods, have been developed to reduce the number of G-E effects tested and to improve the power of G-E studies. Genome-wide environmental interaction studies still have many limitations, and further development of novel approaches and software will be critical to the identification of genome-wide G-E interaction effects in complex traits (Winham and Biernacka, 2013).
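The per-SNP interaction test described above is commonly formalized as a regression model with a product term. The formulation below is a sketch under that assumption; the symbols are introduced here for illustration and do not appear in the cited studies. Here y denotes the phenotype (or, for a binary trait, the log-odds of disease), G the genotype at one SNP, and E the environmental exposure.

```latex
y = \beta_0 + \beta_G\, G + \beta_E\, E + \beta_{GE}\,(G \times E) + \varepsilon,
\qquad H_0 : \beta_{GE} = 0
```

Under this formulation, a G-E interaction corresponds to a nonzero coefficient \beta_{GE}: the effect of E on y is \beta_E + \beta_{GE} G and therefore depends on the genotype, and vice versa. Fitting the model once per genotyped SNP is what makes the multiple-testing burden, and hence the loss of power, so severe.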

Pharmacogenomics

Genome-wide mapping of drug–DNA interactions

Genome-wide mapping of drug–DNA interactions is a study approach that identifies binding sites for small molecules across the genome. In recent years, there has been great interest in the development of molecules that can selectively bind to the genome and modulate gene regulatory networks and thus increase the therapeutic index and reduce off-target effects. Recently, several methods for studying interactions between small molecules and nucleic acids have been developed, such as chemical affinity capture and massively parallel DNA sequencing (Chem-seq) and crosslinking of small molecules to isolate chromatin (COSMIC). COSMIC is a method that identifies polyamide binding sites across the genome. Polyamides composed of N-methylpyrrole and N-methylimidazole are molecules that can target DNA and have shown antiviral and anticancer properties (Erwin et al., 2016).

Discussion

We analyzed research publications in genomics, collected the existing genome-wide study approaches, and organized them hierarchically. This study provides a first, preliminary review of omics study approaches and a starting point for further, more comprehensive analyses. The analysis of the literature revealed disorganized terminology for genome-wide study approaches, which causes problems in literature searching and experimental work. The main challenges are that different study approaches share the same acronyms and that a single study approach is named with several terms. In the future, experts from the field will need to define the differences between similar study approaches and standardize the terminology. Examples of study approaches that need standardization are the analysis of DNA–DNA interactions and network- and pathway-based studies.

In addition, we found the following examples of study approaches that share the same acronyms. First, the acronym MWAS is used for methylome-wide association study as well as for metabolome-wide association study. Second, the acronym EWAS is used for epigenome-wide association study and environment-wide association study. Rather than replacing these acronyms with entirely new ones, extending the existing ones might offer a compromise. To this end, we propose the extended acronyms presented in Table 4.

Table 4.

Same Acronyms Used for Different Methods and Suggestions for Solutions of the Terminology

Study approach	Current acronym	Suggested acronym
Metabolome-wide association study	MWAS	MetabolWAS
Methylome-wide association study	MWAS	MethylWAS
Epigenome-wide association study	EWAS	EpiWAS
Environment-wide association study	EWAS	EnvWAS or EnviroWAS
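The suggestions in Table 4 can also be expressed as a small lookup that resolves an ambiguous acronym from the surrounding text, for example when curating literature search results. This is a hypothetical sketch: the function name and the keyword-matching rule are assumptions, and EnvWAS is chosen arbitrarily from the two suggested forms for environment-wide association studies.

```python
# Suggested acronyms from Table 4, keyed by the ambiguous current acronym.
SUGGESTED_ACRONYMS = {
    "MWAS": {
        "metabolome-wide association study": "MetabolWAS",
        "methylome-wide association study": "MethylWAS",
    },
    "EWAS": {
        "epigenome-wide association study": "EpiWAS",
        "environment-wide association study": "EnvWAS",
    },
}


def disambiguate(acronym, context):
    """Return the suggested acronym that matches the context text.

    The keyword is the first word of the full study name, for example
    'methylome'. Acronyms not listed in the table are returned unchanged;
    None is returned when the context does not settle the choice.
    """
    options = SUGGESTED_ACRONYMS.get(acronym.upper())
    if options is None:
        return acronym
    text = context.lower()
    matches = [
        suggested
        for full_name, suggested in options.items()
        if full_name.split("-")[0] in text
    ]
    return matches[0] if len(matches) == 1 else None


print(disambiguate("MWAS", "Serum metabolome and disease risk"))
print(disambiguate("EWAS", "Environment-wide exposures in type 2 diabetes"))
```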

Furthermore, we suggest that the term epigenome-wide association study (EWAS) be used as a hypernym for all types of genome-wide studies of epigenetic factors and that the term methylome-wide association study (MWAS) be used for studies of DNA methylation.

In addition, some study approaches can be classified in more than one omics type. For example, study approaches that analyze expression of miRNA genes may fall under miRNomics as well as transcriptomics. Furthermore, studies that investigate PPIs may be part of proteomics and interactomics.

A plausible explanation for the differences in terminology or acronyms among related approaches is that study approaches, as well as omics approaches, may depend on the field of study (e.g., biomedicine, microbiology, environment, food science, and cell engineering). Scientists engaged in one field may therefore pay little attention to study approaches in other research areas, which leads to inconsistencies in terminology.

Conclusions and Outlook

Taken together, our study presented here provides the first systematic review and analyses of whole genome approaches and presents a baseline for further controlled terminology development, with a view to a new taxonomy for omics and multiomics studies in the future.

We have curated and organized the hierarchy of several reported genome-wide study approaches and, moving forward, suggest terminology solutions for acronyms that are shared by different methods. Moreover, our study provides an overview of the field of genomics and can be instrumental for all who are engaged in building a robust taxonomy in this research area. It provides guidelines on terminology, and, by taking our suggestions into account, scholars in omics can develop terminology for novel study approaches in the future.

Owing to the rapid development and breadth of the field of genomics, we did not manage to include all reported study approaches and omics types, for example, study approaches in the fields of functional genomics and RNA-editomics. Furthermore, environmental omics includes several different omics approaches, such as nutrigenomics and metagenomics. Therefore, the presented hierarchical organization and terminology should be further extended and complemented in the future by experts in these fields. In the present study, we focused mainly on the different study approaches; for an even more complete overview of the field of genomics, it will also be necessary to review in more detail the methods and techniques that are used in specific study approaches.

Footnotes

Acknowledgments

This work was supported by the Slovenian Research Agency (ARRS) through the Research program (P4-0220). The authors thank the editor and anonymous reviewers for their constructive comments, which helped us to improve the article.

Author Disclosure Statement

The authors declare that no conflicting financial interests exist.

Abbreviations Used

References

Aberg, McClay, Nerella, et al. (2012). MBD-seq as a cost-effective approach for methylome-wide association studies: Demonstration in 1500 case–control samples. Epigenomics, 4, 605–621.

Agrawal-Singh, Isken, Agelopoulos, et al. (2012). Genome-wide analysis of histone H3 acetylation patterns in AML identifies PRDX2 as an epigenetically silenced tumor suppressor gene. Blood, 119, 2346–2357.

Akula, Baranova, Seto, et al. (2011). A network-based approach to prioritize results from genome-wide association studies. PLoS One, 6, e24220.

Araki, Wang, Zang, et al. (2009). Genome-wide analysis of histone methylation reveals chromatin state-based regulation of gene transcription and function of memory CD8+ T cells. Immunity, 30, 912–925.

Aschard, Lutz, Maus, et al. (2012). Challenges and opportunities in genome-wide environmental interaction (GWEI) studies. Hum Genet, 131, 1591–1613.

Baranzini, Galwey, Wang, et al. (2009). Pathway and network-based analysis of genome-wide association studies in multiple sclerosis. Hum Mol Genet, 18, 2078–2090.

Barkan (2009). Genome-wide analysis of RNA-protein interactions in plants. Methods Mol Biol, 553, 13–37.

Barski, Cuddapah, Cui, et al. (2007). High-resolution profiling of histone methylations in the human genome. Cell, 129, 823–837.

Bick and Dimmock (2011). Whole exome and whole genome sequencing. Curr Opin Pediatr, 23, 594–600.

Bictash, Ebbels, Chan, et al. (2010). Opening up the “Black Box”: Metabolic phenotyping and metabolome-wide association studies in epidemiology. J Clin Epidemiol, 63, 970–979.

Bird (2002). DNA methylation patterns and epigenetic memory. Genes Dev, 16, 6–21.

Blackstock and Weir (1999). Proteomics: Quantitative and physical mapping of cellular proteins. Trends Biotechnol, 17, 121–127.

Bush and Moore (2012). Chapter 11: Genome-wide association studies. PLoS Comput Biol, 8, e1002822.

Califano, Butte, Friend, Ideker, and Schadt (2012). Leveraging models of cell regulation and GWAS data in integrative network-based association studies. Nat Genet, 44, 841–847.

Chadeau-Hyam, Ebbels, Brown, et al. (2010). Metabolic profiling and the metabolome-wide association study: Significance level for biomarker identification. J Proteome Res, 9, 4620–4627.

Chaim, Borovsky, Rao, Gur, Zamir, and Paran (2006). Comparative QTL mapping of fruit size and shape in tomato and pepper. Israel J Plant Sci, 54, 191–203.

Chang, Fang, Zhang, and Wang (2015a). Network-based analysis of schizophrenia genome-wide association data to detect the joint functional association signals. PLoS One, 10, e0133404.

Chang, Wang, and Zhang (2015b). Pathway-based analysis for genome-wide association study data of bipolar disorder provides new insights for genetic study. Protein Cell, 6, 912–915.

Chen, Gao, Nerella, et al. (2013). MethylPCA: A toolkit to control for confounders in methylome-wide association studies. BMC Bioinformatics, 14, 74.

Cho, Kim, and Przytycka (2012). Chapter 5: Network biology approach to complex diseases. PLoS Comput Biol, 8, e1002820.

Chrystoja and Diamandis (2014). Whole genome sequencing as a diagnostic test: Challenges and opportunities. Clin Chem, 60, 724–733.

Chu, Zhang, Zhao, et al. (2014). A genome-wide gene-gene interaction analysis identifies an epistatic gene pair for lung cancer susceptibility in Han Chinese. Carcinogenesis, 35, 572–577.

Chu and Corey (2012). RNA sequencing: Platform selection, experimental design, and data interpretation. Nucleic Acid Ther, 22, 271–274.

Consortium IMSG (2013). Network-based multiple sclerosis pathway analysis with GWAS data from 15,000 cases and 30,000 controls. Am J Hum Genet, 92, 854–865.

Dekker, Rippe, Dekker, and Kleckner (2002). Capturing chromosome conformation. Science, 295, 1306–1311.

Denny, Bastarache, Ritchie, et al. (2013). Systematic comparison of phenome-wide association study of electronic medical record data and genome-wide association study data. Nat Biotechnol, 31, 1102–1110.

Dolled-Filhart, Lee, Ou-Yang, Haraksingh, and Lin (2013). Computational and bioinformatics frameworks for next-generation whole exome and genome sequencing. ScientificWorldJournal, 2013, 730210.

Dubé and Hegele (2013). Genetics 100 for cardiologists: Basics of genome-wide association studies. Can J Cardiol, 29, 10–17.

Dunn, Wiste, Radmanesh, et al. (2016). Genome-wide association study (GWAS) and genome-wide by environment interaction study (GWEIS) of depressive symptoms in African American and Hispanic/Latina women. Depress Anxiety, 33, 265–280.

Ekblom and Wolf (2014). A field guide to whole-genome sequencing, assembly and annotation. Evol Appl, 7, 1026–1042.

El-Metwally, Hamza, Zakaria, and Helmy (2013). Next-generation sequence assembly: Four stages of data processing and computational challenges. PLoS Comput Biol, 9, e1003345.

Erwin, Grieshop, Bhimsaria, Eguchi, Rodríguez-Martínez, and Ansari (2016). Genome-wide mapping of drug-DNA interactions in cells with COSMIC (crosslinking of small molecules to isolate chromatin). J Vis Exp, 107, e53510.

Fan, Huang, Zhu, et al. (2014). Identification of proteins that interact with alpha A-crystallin using a human proteome microarray. Mol Vis, 20, 117–124.

Flanagan (2015). Epigenome-wide association studies (EWAS): Past, present, and future. Methods Mol Biol, 1238, 51–63.

Ge, Wang, Chiu, et al. (2013). Environmental OMICS: Current status and future directions. JIOMICS, 3, 75–87.

Gieger, Geistlinger, Altmaier, et al. (2008). Genetics meets metabolomics: A genome-wide association study of metabolite profiles in human serum. PLoS Genet, 4, e1000282.

Goudey, Rawlinson, Wang, et al. (2013). GWIS - model-free, fast and exhaustive search for epistatic interactions in case-control GWAS. BMC Genomics, 14 Suppl 3, S10.

Greenald, Jeyakani, Pelster, Sealy, Mathavan, and van Eeden (2015). Genome-wide mapping of Hif-1α binding sites in zebrafish. BMC Genomics, 16, 923.

Guenther, Levine, Boyer, Jaenisch, and Young (2007). A chromatin landmark and transcription initiation at most promoters in human cells. Cell, 130, 77–88.

Gusareva, Carrasquillo, Bellenguez, et al. (2014). Genome-wide association interaction analysis for Alzheimer's disease. Neurobiol Aging, 35, 2436–2443.

Hafner, Landthaler, Burger, et al. (2010a). PAR-CliP - a method to identify transcriptome-wide the binding sites of RNA binding proteins. J Vis Exp, 41, e2034.

Hafner, Landthaler, Burger, et al. (2010b). Transcriptome-wide identification of RNA-binding protein and microRNA target sites by PAR-CLIP. Cell, 141, 129–141.

Hall, Ptacek, and Snyder (2007). Protein microarray technology. Mech Ageing Dev, 128, 161–167.

Hall, Dudek, Goodloe, et al. (2014). Environment-wide association study (EWAS) for type 2 diabetes in the Marshfield Personalized Medicine Research Project Biobank. Pac Symp Biocomput, 200–211.

Hebbring (2014). The challenges, advantages and future of phenome-wide association studies. Immunology, 141, 157–165.

Hibar, Stein, Jahanshad, et al. (2015). Genome-wide interaction analysis reveals replicated epistatic effects on brain structure. Neurobiol Aging, 36 Suppl 1, S151–S158.

Hoffman and Jones (2009). Genome-wide identification of DNA-protein interactions using chromatin immunoprecipitation coupled with flow cell sequencing. J Endocrinol, 201, 1–13.

Jansen and Verstrepen (2011). Nucleosome positioning in Saccharomyces cerevisiae. Microbiol Mol Biol Rev, 75, 301–320.

Jiang, Zhou, Li, et al. (2015). Whole transcriptome analysis with sequencing: Methods, challenges and potential solutions. Cell Mol Life Sci, 72, 3425–3439.

Jiao, Wang, Yang, et al. (2015). Pathway-based genome-wide association studies for plasma triglycerides in obese females and normal-weight controls. PLoS One, 10, e0134923.

Kim and Ren (2006). Genome-wide analysis of protein-DNA interactions. Annu Rev Genomics Hum Genet, 7, 81–102.

König, Zarnack, Rot, et al. (2010). iCLIP reveals the function of hnRNP particles in splicing at individual nucleotide resolution. Nat Struct Mol Biol, 17, 909–915.

Kunej, Godnic, Horvat, Zorc, and Calin (2012). Cross talk between microRNA and coding cancer genes. Cancer J, 18, 223–231.

Lage, Karlberg, Størling, et al. (2007). A human phenome-interactome network of protein complexes implicated in genetic disorders. Nat Biotechnol, 25, 309–316.

Lee, Luo, Wilson, and Shi (2013). Analyzing the cancer methylome through targeted bisulfite sequencing. Cancer Lett, 340, 171–178.

Lee, Giovinazzo, Kang, et al. (2014). Protein microarray characterization of the S-nitrosoproteome. Mol Cell Proteomics, 13, 63–72.

Lesur and Domon (2015). Advances in high-resolution accurate mass spectrometry application to targeted proteomics. Proteomics, 15, 880–890.

58.

. (2016). Chromosome conformation capture in Drosophila. Methods Mol Biol, 1480, 207–212.

59.

, and Tollefsbol

. (2011). DNA methylation detection: Bisulfite genomic sequencing analysis. Methods Mol Biol, 791, 11–21.

60.

Lieberman-Aiden

, van Berkum

, Williams

, et al. (2009). Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science, 326, 289–293.

61.

Liu

, Calin

, Meloon

, et al. (2004). An oligonucleotide microchip for genome-wide microRNA profiling in human and mouse tissues. Proc Natl Acad Sci U S A, 101, 9740–9744.

62.

Liu

, Dou

, Peng

, Han

, and Jing

. (2015). Genome-wide analysis of histone acetylation dynamics during mouse embryonic stem cell neural differentiation. Genom Data, 5, 15–16.

63.

Luo

. (2015). Metabolite-based genome-wide association studies in plants. Curr Opin Plant Biol, 24, 31–38.

64.

, Ma

, Liu

, et al. (2015). Genome-wide protein-protein interactions and protein function exploration in cyanobacteria. Sci Rep, 5, 15519.

65.

Manolio

. (2010). Genomewide association studies and assessment of the risk of disease. N Engl J Med, 363, 166–176.

66.

Mantione

, Kream

, Kuzelova

, et al. (2014). Comparing bioinformatic gene expression profiling methods: Microarray and RNA-Seq. Med Sci Monit Basic Res, 20, 138–142.

67.

Margolin

, Nemenman

, Basso

, et al. (2006). ARACNE: An algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics, 7 Suppl 1, S7.

68.

Marguerat

, and Bähler

. (2010). RNA-seq: From technology to biology. Cell Mol Life Sci, 67, 569–579.

69.

McClay

, Aberg

, Clark

, et al. (2014). A methylome-wide study of aging using massively parallel sequencing of the methyl-CpG-enriched genomic fraction from blood in over 700 subjects. Hum Mol Genet, 23, 1175–1185.

70.

McDowall

, Scott

, and Barton

. (2009). PIPs: Human protein-protein interaction prediction database. Nucleic Acids Res, 37,(Database issue), D651–D656.

71.

Metzker

. (2010). Sequencing technologies—The next generation. Nat Rev Genet, 11, 31–46.

72.

Morin

, O'Connor

, Griffith

, et al. (2008). Application of massively parallel sequencing to microRNA profiling and discovery in human embryonic stem cells. Genome Res, 18, 610–621.

73.

Morrison

, Wood

, Hancock

, et al. (2006). Annotation of environmental OMICS data: Application to the transcriptomics domain. OMICS, 10, 172–178.

74.

Munker

, and Calin

. (2011). MicroRNA profiling in cancer. Clin Sci (Lond), 121, 141–158.

75.

Murk

, and DeWan

. (2016). Exhaustive genome-wide search for SNP-SNP interactions across 10 human diseases. G3 (Bethesda), 6, 2043–2050.

76.

Osborn

, Park

, Parks

, et al. (2013). Metabolome-wide association study of neovascular age-related macular degeneration. PLoS One, 8, e72737.

77.

Pareek

, Smoczynski

, and Tretyn

. (2011). Sequencing technologies and genome sequencing. J Appl Genet, 52, 413–435.

78.

Patel

, Bhattacharya

, and Butte

. (2010). An environment-wide association study (EWAS) on type 2 diabetes mellitus. PLoS One, 5, e10746.

79.

Pendergrass

, Dudek

, Crawford

, and Ritchie

. (2012). Visually integrating and exploring high throughput phenome-wide association study (PheWAS) results using PheWAS-view. BioData Min, 5, 5.

80.

Petersen

, Krumsiek

, Wägele

, et al. (2012). On the hypothesis-free testing of metabolite ratios in genome-wide and metabolome-wide association studies. BMC Bioinformatics, 13, 120.

81.

Pritchard

, Cheng

, and Tewari

. (2012). MicroRNA profiling: approaches and considerations. Nat Rev Genet, 13, 358–369.

82.

Pugh

, and Gilmour

. (2001). Genome-wide analysis of protein-DNA interactions in living cells. Genome Biol, 2, REVIEWS1013.

83.

Qian

, Ba

, Zhuang

, and Zhong

. (2014). RNA-Seq technology and its application in fish transcriptomics. OMICS, 18, 98–110.

84.

Rabbani

, Tekin

, and Mahdieh

. (2014). The promise of whole-exome sequencing in medical genetics. J Hum Genet, 59, 5–15.

85.

Rabilloud

, Chevallet

, Luche

, and Lelong

. (2010). Two-dimensional gel electrophoresis in proteomics: Past, present and future. J Proteomics, 73, 2064–2077.

86.

Rakyan

, Down

, Balding

, and Beck

. (2011). Epigenome-wide association studies for common human diseases. Nat Rev Genet, 12, 529–541.

87.

Savic

, Gertz

, Jain

, Cooper

, and Myers

. (2013). Mapping genome-wide transcription factor binding sites in frozen tissues. Epigenetics Chromatin, 6, 30.

88.

Sui

, Yan

, Li

, et al. (2013). Genome-wide analysis of long noncoding RNA expression in peripheral blood mononuclear cells of uremia patients. J Nephrol, 26, 731–738.

89.

Ule

, Jensen

, Ruggiu

, Mele

, Ule

, and Darnell

. (2003). CLIP identifies Nova-regulated RNA networks in the brain. Science, 302, 1212–1215.

90.

Uzoma

, and Zhu

. (2013). Interactome mapping: Using protein microarray technology to reconstruct diverse protein networks. Genomics Proteomics Bioinformatics, 11, 18–28.

91.

van Berkum

, Lieberman-Aiden

, Williams

, et al. (2010). Hi-C: A method to study the three-dimensional architecture of genomes. J Vis Exp, 39, e1869.

92.

Volk

, Conboy

, Wical

, Patterson

, and Kirmani

. (2015). Whole-exome sequencing in the clinic: Lessons from six consecutive cases from the clinician's perspective. Mol Syndromol, 6, 23–31.

93.

Wang

, Li

, and Hakonarson

. (2010). Analysing biological pathways in genome-wide association studies. Nat Rev Genet, 11, 843–854.

94.

Wen

, Li

, et al. (2014). Metabolome-based genome-wide association study of maize kernel leads to novel biochemical insights. Nat Commun, 5, 3438.

95.

Winham

, and Biernacka

. (2013). Gene-environment interactions in genome-wide association studies: Current approaches and new directions. J Child Psychol Psychiatry, 54, 1120–1134.

96.

Yarmishyn

, and Kurochkin

. (2015). Long noncoding RNAs: A potential novel class of cancer biomarkers. Front Genet, 6, 145.

97.

, Bi

, Liu

, et al. (2014). Protein-interaction-network-based analysis for genome-wide association analysis of schizophrenia in Han Chinese population. J Psychiatr Res, 50, 73–78.

98.

Zhang

, Wang

, Shi

, and Li

. (2015a). Network-based association study of obesity and type 2 diabetes with gene expression profiles. Biomed Res Int, 2015, 619730.

99.

Zhang

, Cooper

, and Brockdorff

. (2015b). The interplay of histone modifications—writers that read. EMBO Rep, 16, 1467–1481.

100.

Zhou

, Ren

, Meng

, Li

, Yu

, and Yu

. (2010). The next-generation sequencing technology and application. Protein Cell, 1, 520–536.

101.

Zhu

, and Qian

. (2012). Applications of functional protein microarrays in basic and clinical research. Adv Genet, 79, 123–155.

102.

Zhu

, Liu

, Ye

, et al. (2014). The long noncoding RNA expression profile of hepatocellular carcinoma identified by microarray analysis. PLoS One, 9, e101707.