Emerging Trends in Genomic Approaches for Microbial Bioprospecting

Abstract

Microorganisms constitute two out of the three domains of life on earth. They exhibit vast biodiversity and metabolic versatility. This enables microorganisms to inhabit and thrive in even the most extreme environmental conditions, making them all-pervading. The magnitude of biodiversity observed among microorganisms substantially exceeds that exhibited by the eukaryotes. These characteristics make the microbial world a very lucrative and inexhaustible resource for prospecting novel bioactive molecules. Despite this vast potential, over 99% of the microbial world still remains to be explored. The primary reason is that the culture-dependent methods used in laboratories are grossly insufficient, as they support the growth of under 1% of the microorganisms found in nature. This limitation necessitated the development of techniques to circumvent culture dependency and gain access to the remaining majority of microorganisms. The development of culture-independent techniques has essentially reshaped the study of microbial diversity and community dynamics. Application of genomic and metagenomic approaches is contributing substantially towards characterization of the real microbial diversity. The amenability of these techniques to high throughput has opened the doors to exploring the vast number of “uncultivable” microbial forms in substantially less time. The present article provides an update on the recent technological advances and emerging trends in exploring microbial communities.

Introduction

Microorganisms occupy every habitat on earth and determine the biogeochemistry of the planet. For more than a century, the identification and characterization of microorganisms has been carried out exclusively via traditional/axenic culturing techniques. These studies contributed substantially towards the establishment and development of public healthcare practices. They also revealed the key roles that microorganisms play in geochemical cycles and bioremediation, thus offering insights into the vast potential of the microbial world. As rich sources of novel bioactive molecules, microorganisms continue to attract substantial interest for their pharmaceutical and therapeutic applications (Joint et al., 2010; Van Hamme et al., 2003; Widada et al., 2002). Despite providing considerable insights into the microbial world, culture-dependent techniques have some major limitations. They show a considerable bias towards organisms that are best adapted to laboratory conditions. Microorganisms coexist as mixed communities in nature. The composition and dynamics of each community are determined by biotic and abiotic factors. Environmental factors such as pH, temperature, and salinity play a vital role in determining the type of microflora. Numerous studies have found that the cultivable microorganisms constitute only a fraction of the overall community diversity. For example, the microorganisms initially identified via culture techniques as playing a predominant role in bioremediation were later found to contribute only marginally. Further, when the gene pools of various xenobiotic-degrading microbial communities were compared, the ‘uncultivable organisms’ were found to constitute a significant fraction of total diversity (Ferrer et al., 2009; Handelsman, 2004, 2005). Vast disparities were also apparent between the number and range of cultivable microorganisms and the overall estimate of microbial diversity in marine bioprospecting studies.
Characterizing the biodiversity of marine microflora has been especially challenging due to the complexity and dynamic nature of marine environments (Joint et al., 2010). These and many other studies have revealed that preferential growth of specific microorganisms under laboratory culturing conditions gives a distorted picture of the constitution of microbial communities (Eyers et al., 2004; Malik et al., 2008; Torsvik et al., 2002). Hence, investigations employing traditional culturing approaches alone are increasingly regarded as inadequate for bioprospecting and cannot be relied upon to determine community composition accurately, either qualitatively or quantitatively (Daniel, 2005; Malik et al., 2008; Torsvik et al., 2002).

To address this issue, novel culturing techniques that simulate the natural habitat of various organisms in the laboratory are being developed and optimized to enhance microbial growth (Widada et al., 2002). In addition, culture-independent methods (CIMs) employing current molecular approaches are also contributing significantly towards improving our understanding of the biodiversity of microorganisms (Malik et al., 2008; Stenuit et al., 2008). The various strategies being adopted for studying bacterial bioprospecting and biodiversity are summarized in Figure 1. Collectively, these approaches serve as powerful tools to tap into the biotechnological applications of the vast resource of uncultured microorganisms.

FIG. 1.

Current approaches for investigating the biodiversity and bioprospecting of microorganisms. Microorganisms from environmental samples can be isolated using traditional plate culture methods, but only a limited number of microorganisms are able to adapt to the culture conditions. Culture enrichment techniques enhance the possibility of isolating organisms if selective conditions are provided to encourage growth of the desired organisms. High-throughput methods allow simultaneous testing of a wide range of conditions and allow isolation of novel organisms of biological interest. The culture-independent approaches are based on DNA isolation and generation of genomic or metagenomic libraries to explore the vast diversity of uncultivable, as well as cultivable, microflora. These libraries can be further screened for novel bioactive molecules and functional characteristics.

Emergence of Culture-Independent Methods

The most significant surge in the application of molecular tools to study microbial ecology and biodiversity began in 1995. Analysis of specific cell constituents such as nucleic acids, proteins, and lipids extracted directly from environmental samples allows simultaneous study of both the cultivable and noncultivable microorganisms (Greene and Voordouw, 2003; Neufeld et al., 2004). Application of these strategies revealed a startling diversity of the ‘uncultivable’ microbiota. Their abundance renewed the excitement towards exploring these unknown microorganisms. The initial CIMs were based on PCR sequencing of clones from 5S rRNA cDNA libraries and were found to be unreliable. Analyzing the 1500 bp long 16S rRNA gene, however, was much more successful. This is because these sequences are ubiquitously distributed and show high levels of conservation even among evolutionarily distant species. 16S rRNA typing is widely applied for taxonomic grouping of bacteria. The well-accepted criterion for defining a new bacterial species is a less than 97% similarity in the 16S rRNA gene sequence compared to the closest known bacterial species (Lane et al., 1985; Nocker et al., 2007; Pace et al., 1985). 16S rRNA typing also allows comparison of microbial distribution among different samples, in addition to quantifying the relative abundance of each taxonomic group. The robustness and versatility of this technique make it a good tool for carrying out phylogenetic inferences and gaining insights into the metabolic diversity of microorganisms. This technique has provided the most convincing evidence of the vastness of microbial diversity (Handelsman, 2004; Ludwig, 2010; Sanz and Kochling, 2007).
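
The 97% similarity criterion can be illustrated with a minimal Python sketch. It assumes the two 16S rRNA sequences have already been aligned to equal length by an external tool, with '-' marking gaps. The function names percent_identity and is_candidate_new_species are hypothetical, the toy sequences are not real 16S data, and skipping gapped columns is only one of several conventions for computing identity.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment,
    so they have equal length and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gapped columns are ignored in this sketch
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def is_candidate_new_species(identity_pct: float, threshold: float = 97.0) -> bool:
    """True when identity to the closest known species falls below the threshold."""
    return identity_pct < threshold


if __name__ == "__main__":
    # Toy 10-column alignment with one mismatch (illustrative only).
    identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(identity, is_candidate_new_species(identity))
```

In practice the comparison would run over the full-length gene against a curated reference database; the threshold is exposed as a parameter because it is a convention rather than a fixed constant.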

In spite of the above advantages, relying solely on 16S rRNA gene sequences for phylogenetic analyses does have some limitations. First, the differentiation of closely related bacterial species can become ambiguous at times, since the 16S rRNA genes are highly conserved (Gürtler and Stanisich, 1996; Kolbert and Persing, 1999). The presence of multiple 16S rRNA gene copies among different organisms (varying from 1 to 15), heterogeneity due to paralogous copies, and the occurrence of lateral gene transfer in some cases can also result in an inconclusive phylogeny (Klappenbach et al., 2001). To address some of these challenges, Multi-Locus Sequence Analysis (MLSA) has been developed. This technique is similar in principle to Multi-Locus Sequence Typing (MLST), which is widely used for bacterial typing in epidemiological studies (Enright and Spratt, 1999; Platonov et al., 2000). MLSA utilizes ubiquitous housekeeping and other functional genes present as single copies to complement 16S rRNA data. Some of the genes targeted in the MLSA genotyping approach, in addition to the 16S rRNA sequence, include 23S rRNA, rpoB, gyrB, pheS, recA (DNA repair protein), dnaK, atpD, EF-Tu, EF-G, hsp70, and hsp60 (Lee and Cote, 2006; Ludwig, 2010). Additional specific functional genes like pmoA, mxaF, and nod are helpful when analyzing environmental samples to deduce the correlation between the extent of diversity and metabolic function (Horz et al., 2001). To strike a balance between acceptable identification power and the time/cost of strain typing, internal fragments (450–500 bases in length) of multiple housekeeping/functional genes (about 7–8 genes) are commonly used in laboratories (Ludwig, 2010). Genetic fingerprinting techniques based on the separation of the phylogenetic markers provide a reliable and specific profiling pattern for a given microbial community.
The application potential, advantages, and limitations of the various fingerprinting techniques available have been extensively reviewed earlier (Cardenas and Tiedje, 2008; Fromin et al., 2002; Gabor et al., 2007; Nocker et al., 2007). The application of these approaches has significantly widened the horizon of phylogenetic and taxonomical characterization of uncultured bacteria from diverse environments (Ferrer et al., 2009).

The ability to conduct genome-wide analysis of large communities of microbial phyla in the post-genomic era has substantiated the complete dependence of the biosphere on the metabolic activities of microorganisms. The total number of characterized bacterial species to date is limited to only around 6000 (Rappe and Giovannoni, 2003; Vinuesa, 2010). However, the genome representation per gram of soil is estimated to be about 4000–7000, and total prokaryotic cell diversity is predicted to be an incredible 4–6×10³⁰ (Joint et al., 2010; Kaeberlein et al., 2002). Advancements in rRNA-based phylogenetic approaches currently allow monitoring of 50–200 microbial forms/g (Pontes et al., 2007; Malik et al., 2008). Even at this stage, the current GenBank entries of 16S rRNA genes from uncultured prokaryotes significantly outnumber (by more than twofold) those identified via culturing techniques. Several new divisions of bacteria with little or no affiliation to known organisms have been characterized from the fraction of uncultured organisms using CIMs (Hallam et al., 2006; Kunin et al., 2008b). Thus, the bulk of the microbiota still remains a vast untapped resource for application (Handelsman, 2004, 2005; Lovely, 2003; Vinuesa, 2010). Several studies have further revealed the presence of huge genomic diversity even within a single bacterial species (Malik et al., 2008; Mira et al., 2010; Nocker et al., 2007; Tettelin et al., 2008). Advances in sequencing technologies and the subsequent reduction in cost are permitting complete genome sequencing of several bacterial strains of individual species. The gene repertoire obtained by comparative analysis of the genomes of multiple strains of a single bacterial species is collectively termed the pan-genome. The pan-genomic repertoire is larger in magnitude by many orders than any single genome. It comprises a “core genome” containing the genes that are present in all characterized strains.
Other “dispensable genome” sets contain genes that are present in two or more strains or genes that are unique to a specific strain (Mira et al., 2010; Tettelin et al., 2008). Thus, CIMs have become indispensable tools for investigating bacterial genetic diversity, population structures, and understanding their ecological role in various habitats.
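
The core/dispensable split of a pan-genome described above maps naturally onto set operations. The sketch below is a simplified illustration in Python: it assumes gene presence has already been reduced to one set of gene-family identifiers per strain (in practice this requires ortholog clustering, which is not shown), and the function name partition_pan_genome and the strain and gene labels are hypothetical.

```python
from collections import Counter
from typing import Dict, Set


def partition_pan_genome(strain_genes: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Split a pan-genome into core, shared-dispensable and strain-unique genes.

    strain_genes maps a strain name to the set of gene families it carries.
    """
    if len(strain_genes) < 2:
        raise ValueError("at least two strains are needed for a comparison")
    n_strains = len(strain_genes)
    # Count in how many strains each gene family occurs.
    counts = Counter(gene for genes in strain_genes.values() for gene in genes)
    pan = set(counts)
    core = {g for g, c in counts.items() if c == n_strains}    # present in all strains
    unique = {g for g, c in counts.items() if c == 1}          # present in one strain only
    shared = pan - core - unique                               # two or more, but not all
    return {"pan": pan, "core": core, "shared": shared, "unique": unique}


if __name__ == "__main__":
    # Invented example data: three strains, five gene families.
    example = {
        "strain_A": {"g1", "g2", "g3"},
        "strain_B": {"g1", "g2", "g4"},
        "strain_C": {"g1", "g3", "g5"},
    }
    for name, genes in partition_pan_genome(example).items():
        print(name, sorted(genes))
```

Here the dispensable genome corresponds to the union of the "shared" and "unique" sets; keeping them separate makes it easy to see how each newly sequenced strain changes the totals.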

DNA Microarrays

Microarray technology is yet another important taxonomical and functional tool that is widely used for genome and proteome analysis of mixed microbial communities. The property of a single-stranded DNA or RNA molecule to hybridize with complementary probe molecules attached to a solid support forms the basic principle of microarrays. Readouts are detected as signals given off by the fluorescent dyes incorporated in the sample (Zhou, 2003). The microarray slides/chips are prepared by spotting an ex situ synthesized probe or by directly assembling the probe in situ. In the latter approach, thousands of probes are placed on a single slide using photolithographic masks and electrochemical reactions (e.g., “Gene Chip” arrays). The microchip types differ in the immobilization technology used, the length and nature of the probes, and the labeling of the targets. Factors such as probe density, specificity, sensitivity, quantification, and cost decide the technique selected for a study. Compared to traditional nucleic acid hybridization techniques, microarrays provide a rapid and sensitive detection system capable of detecting even a single mismatch (Gao et al., 2007; Liu and Zhu, 2005; Zhou, 2003).

Three major classes of environmental microarray formats are used for bioprospecting microbial communities. These are the community genome arrays (CGA), the phylogenetic oligonucleotide arrays (POA), and the functional gene arrays (FGA) (Chandler et al., 2006; Rhee et al., 2004; Wu et al., 2001, 2004, 2006). Combining microarray types increases the versatility with which complex microbial communities can be probed. For example, comprehensive FGAs, or Geochips, that also carry several phylogenetic probes have facilitated the study of microbial community dynamics in situ (He et al., 2007; Wu et al., 2001). Coupling whole community RNA amplification (WCRA) with CGA allowed monitoring of the functional activities of microbial communities in environments contaminated with organic solvents, hydrocarbons, and uranium (Gao et al., 2007; Wu et al., 2006). Another prominent step forward in the application of DNA arrays to community analysis is the use of probes produced directly from environmental DNA without any cultivation steps. Such metagenomic arrays (MGA) hold potential for high-throughput screening and have been successfully applied to characterize the microbial community of a groundwater microcosm and other natural environments (Gentry et al., 2006; Sebat et al., 2003). These strategies reveal many direct linkages between biogeochemical processes and functional activities among the microbial communities of these environments. Thus, microarrays offer the advantage of miniaturization for simultaneous gene function analysis in real time (Chandler et al., 2006; Rhee et al., 2004).

Real-Time PCR

Real-time PCR (rt-PCR) is a robust and powerful tool widely employed in microbial ecology for profiling and bioprospecting of environmental samples. This technique allows real-time quantification of amplicons during or at the end of each cycle using fluorescent markers. In the early exponential phase of PCR, the amplification of targets is directly proportional to the initial template concentration. This allows reliable quantification, as the observed intensity of fluorescence is directly proportional to the product concentration. rt-PCR is found to be 100- to 10,000-fold more sensitive than microarray-based methods, with a quantification limit of 1–2 genome copies (Eyers et al., 2004; Inglis and Kalischuk, 2004). Further, its real-time monitoring ability avoids all the time-consuming post-PCR quantification steps. The high analytical sensitivity for identification of specific genes in complex DNA mixtures also makes rt-PCR highly suitable for analysis of environmental samples (Powell et al., 2006; Ritalahti et al., 2006). Its amenability to high throughput allows scale-up for genomic and metagenomic level analyses. Community profiling of challenging samples such as hydrocarbon-contaminated Antarctic soil has been achieved using this approach. In this study, real-time changes in gene expression and the influence of biotic and abiotic factors were successfully recorded across both spatial and temporal levels (Ritalahti et al., 2006).
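
The proportionality between early-phase product and initial template can be sketched numerically. The idealized model below assumes a constant per-cycle efficiency E (E = 1 meaning perfect doubling), so that the product after n cycles is N0 × (1 + E)^n. The function names and the example numbers are illustrative assumptions, not values from the studies cited.

```python
def copies_after_cycles(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Amplicon copies after a number of cycles in the exponential phase.

    Assumes a constant per-cycle efficiency (1.0 = every template is duplicated).
    """
    return initial_copies * (1.0 + efficiency) ** cycles


def initial_copies_from_threshold(threshold_copies: float, threshold_cycle: float,
                                  efficiency: float = 1.0) -> float:
    """Back-calculate the starting template from the cycle at which a fixed
    product level (the detection threshold) is reached."""
    return threshold_copies / (1.0 + efficiency) ** threshold_cycle


if __name__ == "__main__":
    # Doubling the template doubles the product at any given early cycle.
    low = copies_after_cycles(100, 10)
    high = copies_after_cycles(200, 10)
    print(low, high, high / low)
```

Real assays deviate from this model once reagents become limiting, which is why quantification relies on the early exponential phase.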

Numerous variants of PCR have been used for DNA amplification from complex environmental samples to address microbial community profiling from diverse facets, including biodegradation of organic contaminants (Gabor et al., 2007; Gilbride et al., 2006; Stenuit et al., 2006, 2008). Adopting these techniques in conjunction with real-time analysis increases their versatility and allows quantification. Multiplex rt-PCR is one such variant that utilizes multiple primer sets within a single PCR mix. As multiple genes are targeted at once, several amplicons are formed simultaneously. Each primer set is labeled with a distinct fluorescent dye having a nonoverlapping excitation range. The resulting spectra permit independent detection of target amplification rates with good correlation. Thus, careful design of primer sets, coupled with the use of optimized annealing conditions, can provide a lot of additional information in a single test run. This helps to conserve the sample material and avoids well-to-well variation. Another important benefit of multiplex rt-PCR is the ease with which normalization can be carried out to increase the reliability of the results. To achieve this, reference genes (such as housekeeping genes) are amplified along with the target genes as internal controls to obtain more accurate quantification of the target genes. Normalization also permits reliable comparison between results obtained from different experiments.
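
One commonly used way to carry out such normalization is the comparative threshold-cycle (2^-ΔΔCt) calculation. It is not described in this article and is shown here only as an example of the idea. Ct denotes the cycle at which fluorescence crosses a fixed threshold. The sketch assumes close to 100% amplification efficiency for both the target and the reference gene, and all Ct values and names are invented for illustration.

```python
def relative_quantity(ct_target_sample: float, ct_ref_sample: float,
                      ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of a target gene in a sample relative to a control,
    normalized to a reference (e.g., housekeeping) gene.

    Assumption: both amplicons double every cycle (efficiency ~100%).
    """
    delta_sample = ct_target_sample - ct_ref_sample      # normalize sample to reference
    delta_control = ct_target_control - ct_ref_control   # normalize control to reference
    delta_delta = delta_sample - delta_control
    return 2.0 ** -delta_delta


if __name__ == "__main__":
    # Target crosses threshold 3 cycles earlier in the sample than in the control,
    # while the reference gene is unchanged.
    print(relative_quantity(22.0, 18.0, 25.0, 18.0))
```

Because the reference gene is measured in the same reaction in a multiplex assay, the subtraction cancels differences in template loading between wells.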

Multiplex PCR has been widely used to save time and resources in several bioprospecting and bioremediation studies. Some of these include detection of mono- or dioxygenase enzymes attacking polycyclic aromatic hydrocarbons (PAH) (Baldwin et al., 2003; Dionisi et al., 2004; Gilbride et al., 2006; Harms et al., 2003; Wilson et al., 1999). Targeting the gene encoding the Rieske iron-sulfur center (which is common in dioxygenase enzymes) using generic primers and rt-PCR made it possible to track the population shifts of PAH-degrading microorganisms (Cébron et al., 2008; Chandain et al., 2006). This approach was also used to characterize the microbial heterogeneity and functionality of the Anaeromyxobacter community at the Oak Ridge IFC uranium-contaminated subsurface environment (Thomas et al., 2009). In another study, good correlation was observed between data obtained from multiplex rt-PCR and dot blot hybridization (validation test), as well as ¹⁴C-mineralization (direct indicator), for naphthalene degradation by Proteobacteria sp., thereby validating the rt-PCR results (Nyyssonen et al., 2006). These studies illustrate the use of rt-PCR and its variants as powerful and reliable tools for assessing bioremediation potential in an environment, as well as for bioprospecting of novel enzymes without isolation/cultivation of bacteria. Thus, speed, sensitivity, accuracy, and amenability to robotic automation give rt-PCR a prominent position among the molecular tools (Cardenas and Tiedje, 2008; Lerat et al., 2005; Powell et al., 2006).

PCR-Independent Amplification Techniques

New PCR-independent amplification techniques are emerging as popular tools to access genomic information from very low abundance microbial sources that otherwise remain inaccessible. This approach avoids generic biases associated with PCR-dependent methods, such as artifacts/errors resulting from PCR or skewing due to unequal amplification. Whole genome amplification using multiple displacement amplification (MDA) techniques exhibited remarkably uniform amplification across the genomic targets in comparison to PCR-based whole genome amplification. As MDA uses φ29 DNA polymerase and random exonuclease-resistant primers, there is no need for thermal cycling (Abulencia et al., 2006; Binga et al., 2008; Dean et al., 2002). This method generated larger products with a lower error frequency and was amenable to single-cell genome sequencing as well. A combination of the MDA and CGA techniques was successful in analyzing oligotrophic microbial communities in groundwater contaminated with uranium and other metals (Wu et al., 2006). A T7 polymerase-based linear amplification approach using fusion primers has been utilized for mRNA-based metatranscriptome analyses (Gao et al., 2007). These and other emerging techniques overcome the various limitations of PCR and open vistas to explore low density/uncultivable organisms present in a microbial community.

Innovations in Culturing Techniques

In an effort to enhance the cultivability of more microbial types in the laboratory, novel culturing techniques are being actively developed. These techniques are based on mimicking the natural habitat in which the microorganisms of interest grow and thrive. Techniques such as dilution-to-extinction, culturing in arrays, diffusion chambers, and micro-droplet encapsulation are being successfully applied to significantly improve the cultivability of as-yet uncultivable marine organisms in low-nutrient media (Connon and Giovannoni, 2002; Dionisi et al., 2012; Kaeberlein et al., 2002; Nicholas et al., 2008; Zengler et al., 2002, 2005). Providing nutrient-poor media increased the recovery percentage of the cultured forms by several orders of magnitude compared to what was achieved earlier using nutrient-rich media. The studies resulted in the isolation of several previously uncultured marine bacteria and bacterioplankton (Joint et al., 2010; Penesyan et al., 2009). Further, the novel cultivation methods increased the proportion of microorganisms recovered from marine environments from 10% to 25% in the last decade alone (Connon and Giovannoni, 2002; Lee et al., 2010). Optimizing physical parameters such as temperature, pH, and pressure has permitted the laboratory isolation of several extremophiles that produce novel cold/heat adaptive enzymes (Cowan et al., 2005; Dionisi et al., 2012). More recently, the availability of information regarding cell attachment characteristics, cellular signaling pathways, and alternate electron acceptor requirements is also aiding further optimization of the culture parameters (Joint et al., 2010; Lee et al., 2009; Maldonado et al., 2005). Second-generation automated high-throughput systems such as isolation chips (ichips) and micro-Petri dishes allow enhanced cultivability. They have from a few hundred to a million growth compartments that can be inoculated even with single cells.
Integration of these novel culturing techniques with fluorescence microscopy allows high-throughput screening of multiple cell arrays on a large scale (Lee et al., 2010). Another approach uses gel micro-droplets to encapsulate single cells for large-scale parallel microbial cultivation under low-nutrient flux conditions (Keller and Zengler, 2004). As the micro-colonies are formed in agarose, its porous nature facilitates easy diffusion of nutrients, signaling molecules, and waste metabolites. Growth within these microcapsules is detected by flow cytometry (Toledo et al., 2002; Zengler et al., 2002, 2005). The advantage of the micro-droplet and microbial trap methods is that the beads formed are easier to handle because they are physically distinct and much larger than bacterial cells. Zengler and colleagues (2005) employed the micro-droplet encapsulation procedure to enrich actinomycetes by allowing their efficient colonization. The study identified several new clades of marine actinomycetes that could not be detected previously in the environmental gene library. These approaches have substantially widened the scope of microbial bioprospecting (Ingham et al., 2007; Nicholas et al., 2010; Sprenkels et al., 2007). Although novel culturing techniques allow isolation of several novel organisms, many of them are found to undergo only a limited number of divisions in the laboratory (Dionisi et al., 2012; Joint et al., 2010). Adaptability/domestication of the recovered strains to laboratory conditions is a bottleneck that needs to be addressed for successful large-scale cultivation. Co-culturing with helper organisms and/or identifying signal peptides has been found to aid culturing of hitherto uncultivable strains (Lewis et al., 2010; Nicholas et al., 2008).

The Metagenomic Shift

All the genomic approaches described thus far focus on deciphering the complete genetic complement of a single organism. However, microorganisms exist in nature as communities of varying complexity. The role of an individual organism in an ecosystem is dictated by the composition of its surrounding microbial community. Metagenomics focuses on microbial community profiling and transcends the limitations of studying individual organisms. The term “metagenomics” was coined by Handelsman and associates in 1998; the field is also known as community genomics, ecogenomics, or environmental genomics. In metagenomics, genome sequences from an entire community of organisms inhabiting a common environment are sampled. The process requires no prior separation of organisms from their habitat or maintenance of the microorganisms as pure or mixed cultures in the laboratory. Metagenomic strategies also circumvent several limitations arising from direct DNA cloning. The approach minimizes the improper representation of the microbial community that is observed when screening a finite number of clones (Cowan et al., 2005). As nucleic acids are extracted directly from the environmental sample, the genomic data obtained include both characterized and novel microbial forms of the community. In principle, any environment is amenable to metagenomic analysis, provided good quality nucleic acids can be extracted from the sample material. Advancements in sequencing technology are now enabling massive-scale sequencing of the vast metagenomic repertoire.

The strength of metagenomics lies in its potential for serendipitous discovery, as the approach bypasses the major limitations of classical approaches in microbiology (Binga et al., 2008; Cowan et al., 2005; Ferrer et al., 2009). This field has gained considerable significance in the last decade, becoming the focus of several studies. Numerous metagenomics projects have been initiated for analysis of microbial communities in diverse environments, including oceans, soils, thermal vents, hot springs, and the human microbiome (Sebat et al., 2003; Hugenholtz and Tyson, 2008; Langer et al., 2006). Metagenomics studies can be conducted on three different scales. Small-scale studies selectively investigate a function of interest in a given microbial community. These projects are mainly initiated by a single investigator or laboratory. Middle-scale projects are collaborative efforts employing multidisciplinary approaches to thoroughly investigate the community of interest. Large-scale projects, on the other hand, are global initiatives undertaken to understand select microbial communities in depth and to obtain detailed profiles (Ferrer et al., 2009; Langer et al., 2006).

The design of metagenomics projects is crucial, as the results obtained largely depend on the design strength of the study. The design process is broadly categorized into pre-sequencing, sampling, data generation, sequence processing, gene prediction, annotation, and data analysis stages (Kunin et al., 2008a, b). Prior knowledge about the dominant population in the community of interest is helpful for optimizing the design selection, as well as during subsequent interpretation of the results. A crucial factor that needs to be ascertained at the very beginning is the sequence coverage requirement for the proposed study. This is because, unlike complete genome sequencing studies where the genome size is already known, metagenomic analyses do not have a fixed end point. Determining the quantum of genome to be sequenced is especially challenging when studying diverse microbial communities, as the abundance of different organisms is not uniform within the community. Further, the gene coding densities also vary significantly among different species. In view of these issues, to ensure a decent representation of the genomes of the community, coverage of 6× to 8× is taken as the standard (Dinsdale et al., 2008; Ferrer et al., 2009; Kunin et al., 2008a). Certain studies have shown that even an extremely low coverage of <0.01× was sufficient to detect genetic gradients in the dominant population of a stratified hypersaline mat community (Goldberg et al., 2006; Kunin et al., 2008a, b). Ultimately, the objectives of the study guide the decisions made on coverage parameters, with aggregate genome sizes reaching over 100 Mb for moderate metagenomics libraries. The Sanger (dye terminator) sequencing technique has been employed as the method of choice for obtaining metagenomic sequence data for many years. However, in recent times, pyrosequencing is also gaining wider applicability.
The advantage of this method over the Sanger method is that it requires no prior cloning, which avoids cloning bias. Further, the sequencing cost per base is much lower in pyrosequencing, thus permitting sequencing of large repertoires at affordable cost (Edwards et al., 2006; Shendure and Ji, 2008). The major limitation of pyrosequencing is its short average read length. Due to this, the approach relies heavily on similarity searches against reference databases, as gene calling or assemblies are usually not feasible (Wommack et al., 2008). Combining both sequencing technologies has also been explored for producing high-quality draft assemblies (Cardenas and Tiedje, 2008; Gabor et al., 2007; Goldberg et al., 2006).
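
The arithmetic behind a coverage target can be sketched in a few lines of Python. Coverage is taken here as total sequenced bases divided by the size of the sequence being covered. The 100 Mb size and the 6× to 8× range are used as illustrative inputs, the read length of 800 bases is an assumption chosen only for the example, and the function names are hypothetical. The last function uses the idealized Lander-Waterman approximation (1 − e^−c), which assumes reads fall uniformly at random on a single target and therefore overestimates what is achieved for low-abundance community members.

```python
import math


def bases_required(target_size_bp: int, coverage: float) -> float:
    """Total bases that must be sequenced to reach a given average coverage."""
    return target_size_bp * coverage


def reads_required(target_size_bp: int, coverage: float, read_length_bp: int) -> int:
    """Number of reads of a fixed length needed for the coverage target."""
    return math.ceil(target_size_bp * coverage / read_length_bp)


def expected_fraction_covered(coverage: float) -> float:
    """Idealized fraction of the target hit by at least one read
    (uniform random sampling assumed)."""
    return 1.0 - math.exp(-coverage)


if __name__ == "__main__":
    size = 100_000_000  # 100 Mb, illustrative
    for c in (6, 8):
        print(c, bases_required(size, c), reads_required(size, c, 800),
              round(expected_fraction_covered(c), 5))
```

Because organism abundance is uneven in real communities, the average coverage computed this way applies mainly to the dominant populations.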

The metagenomics data sets obtained are annotated by mapping the genes and gene fragments into families. This provides an estimate of their relative representation. Assigning roles to the proteins encoded by the sequenced genes is the next crucial step. The two principal strategies currently used for gene prediction are the evidence-based gene calling method and the “ab initio” approach (Kunin et al., 2008a; Raes et al., 2007). Gene prediction is generally followed by functional annotation, which is similar to that in genomic annotation. To enhance the function assignment of the sequenced data, gene-centric trends can also be adopted in metagenomics (Tringe and Rubin, 2005). The annotated data are further supplemented with collateral nonsequence data (i.e., “metadata”). For environmental studies, this information generally includes geographical data such as global positioning system coordinates, the depth/height from which samples are collected, and environmental parameters such as the pH, temperature, and salinity of the site (Kunin et al., 2008a, b; Urich et al., 2008; Venter et al., 2004). It is crucial to record the metadata during the initial sampling process, as results from re-sampling are not always directly comparable. The metadata significantly aid in interpreting the sequence data and are particularly useful during comparative analysis of temporal or spatial distribution. As mentioned earlier, genome closure is not possible for most metagenomes due to their vast and unknown size. Hence, finishing becomes a viable option only for those data sets pertaining to dominant populations within the metagenome (Hallam et al., 2006). Some of the notable studies that obtained complete or near-complete draft-level coverage of dominant genome assemblies include biofilms in acid mine drainage, activated sludge, and hypersaline environments (Hugenholtz and Tyson, 2008; Kunin et al., 2008b).

Metagenomics is being extensively applied to explore the biosynthetic diversity of microorganisms from varied environments. These studies have led to the identification of new genes, novel biosynthetic pathways, and bioremediation mechanisms for important xenobiotic compounds (Baldwin et al., 2003; Cowan et al., 2005; Dionisi et al., 2012; Lewis et al., 2010). The light-driven proton pump mediated by proteorhodopsin was first identified in bacterioplankton from environmental DNA samples using the metagenomics approach (Harms et al., 2003; Robertson and Steer, 2004; Sanz and Kochling, 2007). A more recent discovery implicates Archaea among the main ammonia oxidizers. Using bioinformatics tools, an ammonia mono-oxygenase gene was initially identified next to the gene encoding small subunit ribosomal RNA, and the role of Archaea in ammonia oxidation was later confirmed experimentally in both marine and terrestrial ecosystems (Baldwin et al., 2003; Daniel, 2005; Schneiker et al., 2006). Co-localization studies combining FISH and digital image analysis are adding comparative temporal and spatial information on structured ecosystems to metagenome analysis (Handelsman, 2005; Malik et al., 2008; Sanz and Kochling, 2007; Wagner et al., 2006). Intracellular fluorescent biosensors and whole-cell-based biosensors are also being widely applied for environmental sensing and for detecting bioremediation of specific contaminants (Bhattacharyya et al., 2005; Williamson et al., 2005). These studies show the presence of extensive metabolic interactions and high interdependency between members of the community. Further, they provide means for assessing the ‘true’ biodiversity of microbial communities by circumventing the limitations imposed by the low cultivability of the microorganisms (Cowan et al., 2005; Daniel, 2005; Dinsdale et al., 2008; Ferrer et al., 2009).
Metagenomic assemblages of microorganisms are also providing answers to several fundamental questions in microbial ecology by aiding the objective and quantitative assessment of the diversity and functionality of microorganisms in situ.

Although constructing metagenomic libraries from complex environmental samples is conceptually simple, it can be technically very challenging. Co-extraction of inhibitory substances such as humic acids, organic matter, and clay particles can significantly interfere with the amplification step (Gabor et al., 2007; Pontes et al., 2007). As discussed earlier, determining the minimum size of the metagenomic library is a major challenge. The high sequencing cost associated with a large metagenome repertoire from complex environments can still be prohibitive (Malik et al., 2008; Stenuit et al., 2006, 2008; Wommack et al., 2008). In addition, improvements in three aspects, namely resolution, classification of short metagenomic fragments, and robust functional assignment and verification, can further improve the application potential of metagenomics (Handelsman, 2004, 2005; Kunin et al., 2008a, b; Lewis et al., 2010). Despite these challenges, metagenomics has emerged as a valuable tool with applications in diverse fields including medicine, alternative energy, environmental remediation, biotechnology, agriculture, biodefense, and forensics (Cowan et al., 2005; Dinsdale et al., 2008; Dionisi et al., 2012).
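
The difficulty of sizing a metagenomic library can be made concrete with a rough estimate. The sketch below uses the classical Clarke and Carbon relation, N = ln(1 − P) / ln(1 − f), which is not taken from this article and is swapped in here purely for illustration; f is the fraction of the target DNA carried by one insert and P is the desired probability that a given sequence is present in the library. The insert size, community size, and function name are hypothetical, and the relation assumes random, unbiased cloning of equally abundant genomes, which real communities rarely satisfy.

```python
import math


def clones_needed(insert_bp, target_bp, probability=0.99):
    """Clarke-Carbon estimate of the number of clones required so that a
    given sequence is represented with the stated probability.

    Assumes random, unbiased cloning (an idealization).
    """
    if not 0 < probability < 1:
        raise ValueError("probability must lie strictly between 0 and 1")
    fraction = insert_bp / target_bp
    if not 0 < fraction < 1:
        raise ValueError("insert must be smaller than the target DNA")
    # log1p(-x) computes ln(1 - x) accurately for very small fractions.
    return math.ceil(math.log(1 - probability) / math.log1p(-fraction))


# Hypothetical community: 1,000 genomes of 4 Mb each, 40 kb inserts.
community_bp = 1_000 * 4_000_000
print(clones_needed(insert_bp=40_000, target_bp=community_bp))
```

Because community size and evenness are usually unknown, the result is at best an order-of-magnitude guide; rare members would require far larger libraries than this idealized figure suggests.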

Integration of “Omic” Approaches for Microbial Bioprospecting

Elucidating the phylogenetic diversity of microorganisms is only the first step in understanding biodiversity. In addition to gathering sequence data, it is vital to determine the function of each gene at the protein or metabolic level. Around 30%–40% of the genes in fully sequenced genomes from various organisms have no assigned function, as they are unrelated to any known gene. Thus, the greater challenge lies in assigning functions at the biochemical level to the newly identified genes. Bridging this void is the focus of the rapidly emerging disciplines of functional genomics and proteomics (Ferrer et al., 2009; Langer et al., 2006; Valenzuela et al., 2006). Three different types of function-driven approaches are widely recognized: phenotypic detection of gene activity, heterologous complementation of host strains, and induced gene expression (Ferrer et al., 2009). At the genomic level, these functional approaches have led to the discovery of several new enzymes, antibiotics, and other biomolecules with therapeutic and biotechnological applications (Eyers et al., 2004; Langer et al., 2006; Lee et al., 2010). As function-driven approaches typically involve low-throughput screens based on visual detection, there were initial difficulties in adapting them to a metagenomic scale. However, automated colony picking, pipetting robotics, use of microtiter plates, availability of sensitive activity assays, targeted biomolecule screening, and informatics-assisted data management have aided the automation of functional screens, thereby making them viable at the metagenomic level (Dionisi et al., 2012; Gentry et al., 2006; Kunin et al., 2008a). The capacity for high throughput is vital, as the number of “hits” obtained during metagenome screening is typically very low (<2 out of 10,000 clones screened).
Other high-throughput techniques such as microarrays, real-time analysis of expressed genes, and the PCR-independent analyses already discussed above are also employed in functional metagenome analyses. As an alternative to elaborate functional screens, the induced gene expression approach uses diverse strategies to enrich or select community genomes with desired traits through promoter activity rather than via phenotypic expression (Lorenz and Eck, 2005). Substrate-induced gene expression (SIGEX) and its variants, which use a promoter-trap gfp-expression vector in combination with fluorescence-activated cell sorting, have been highly successful in liquid cultures for efficient, large-scale selection of clones (Uchiyama and Miyazaki, 2010; Yun and Ryu, 2005).
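
To see why throughput matters at hit rates as low as the one quoted above, the arithmetic can be sketched as follows. This is only an illustration: it treats the figure of 2 hits per 10,000 clones as an upper bound, and it assumes that clones are independent and that the hit rate is constant across the library, neither of which is stated in the text. The function names are hypothetical.

```python
HIT_RATE = 2 / 10_000  # upper bound quoted in the text (<2 per 10,000 clones)


def expected_hits(clones_screened, rate=HIT_RATE):
    """Expected number of positive clones under a constant hit rate."""
    return clones_screened * rate


def prob_at_least_one_hit(clones_screened, rate=HIT_RATE):
    """Probability of one or more hits, assuming independent clones."""
    return 1 - (1 - rate) ** clones_screened


# Compare a manual-scale screen with robot-assisted screens.
for n in (1_000, 10_000, 100_000):
    print(
        f"{n:>7} clones: expected hits = {expected_hits(n):.1f}, "
        f"P(at least one hit) = {prob_at_least_one_hit(n):.3f}"
    )
```

Under these assumptions a screen of about a thousand clones will more often than not return nothing, which is consistent with the emphasis on automation in the preceding discussion.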

Metagenomic libraries have thus become good sources for bioprospecting novel biocatalysts and biomolecules, even from organisms that have yet to be cultivated. Combining metagenomic approaches with heterologous expression systems aids in substantial utilization of microbial biodiversity for human welfare. However, a single gene can generate a number of distinguishable functional entities at the protein level as a result of differential splicing (Benndorf et al., 2007). Metabolomics, in turn, examines the entire set of metabolites produced by cellular proteins in response to various environmental stimuli. As all the metabolites present in a system are targeted, there is little scope for bias related to the metabolites examined. Thus, in addition to gene profiling, global protein and metabolite profiling is crucial, especially when investigating the mechanisms of complex processes such as bioremediation (Malik et al., 2008; Stenuit et al., 2008). This complexity makes the comprehensive characterization of physiological functions and the elucidation of the ecological roles of novel organisms a challenging task. Progressive integration of various high-throughput approaches is particularly fruitful in the field of environmental microbiology. These approaches offer the advantages of miniaturization, automation, and massive parallelization of time-consuming steps, along with “real-time” analysis (Dinsdale et al., 2008; Paul et al., 2005; Zhao and Poh, 2008).

Achieving synergy between the emerging “omic” approaches can, in the long run, offer comprehensive characterization of the biological and ecological functions of microbial communities in an environment. Integration of complementary “omics” techniques is envisaged to provide greater insight into genome structure, micro-heterogeneity, lateral gene transfer, and nutrient cycling among the members of the microbial community of a region. This ‘systems biology’ outlook thus holds promise of providing the holistic overview required to understand microbial ecosystems and to resolve several fundamental questions in microbial ecology and evolution.

Footnotes

Author Disclosure Statement

No competing financial interests exist.

References

Abulencia, Wyborski, Garcia et al. 2006. Environmental whole-genome amplification to access microbial populations in contaminated sediments. Appl Environ Microbiol, 72:3291–3301.

Baldwin, Nakatsu, Nies. 2003. Detection and enumeration of aromatic oxygenase genes by multiplex and real-time PCR. Appl Environ Microbiol, 69:3350–3358.

Benndorf, Balcke, Harms, Von Bergen. 2007. Functional metaproteome analysis of protein extracts from contaminated soil and groundwater. ISME J, 1:224–234.

Bhattacharyya, Read, Amos, Dooley, Killham, Paton. 2005. Biosensor-based diagnostics of contaminated groundwater: Assessment and remediation strategy. Environ Pollution, 134:485–492.

Binga, Lasken, Neufeld. 2008. Something from nothing: The impact of multiple displacement amplification on microbial ecology. ISME J, 2:233–241.

Cardenas, Tiedje. 2008. New tools for discovering and characterizing microbial diversity. Curr Opin Biotechnol, 19:544–549.

Cébron, Norini, Beguiristain, Leyval. 2008. Real-time PCR quantification of PAH-ring hydroxylating dioxygenase (PAH-RHD [alpha]) genes from Gram positive and Gram negative bacteria in soil and sediment samples. J Microbiol Methods, 73:148–159.

Chadhain SMN, Norman, Pesce, Kukor, Zylstra. 2006. Microbial dioxygenase gene population shifts during polycyclic aromatic hydrocarbon biodegradation. Appl Environ Microbiol, 72:4078–4087.

Chandler, Jarrell, Roden et al. 2006. Suspension array analysis of 16S rRNA from Fe⁺² and SO₄²⁻-reducing bacteria in uranium contaminated sediments undergoing bioremediation. Appl Environ Microbiol, 72:4672–4687.

Connon, Giovannoni. 2002. High-throughput methods for culturing microorganisms in very-low-nutrient media yield diverse new marine isolates. Appl Environ Microbiol, 68:3878–3885.

Cowan, Meyer, Stafford, Muyanga, Cameron, Wittwer. 2005. Metagenomic gene discovery: Past, present and future. Trends Biotechnol, 23:321–329.

Daniel. 2005. The metagenomics of soil. Nature Rev Microbiol, 3:470–478.

Dean, Hosono, Fang et al. 2002. Comprehensive human genome amplification using multiple displacement amplification. Proc Natl Acad Sci USA, 99:5261–5266.

Dinsdale, Edwards, Hall et al. 2008. Functional metagenomic profiling of nine biomes. Nature, 452:629–633.

Dionisi, Chewning, Morgan, Menn, Easter, Sayler. 2004. Abundance of dioxygenase genes similar to Ralstonia sp. strain U2 nagAc is correlated with naphthalene concentrations in coal tar-contaminated freshwater sediments. Appl Environ Microbiol, 70:3988–3995.

Dionisi, Lozada, Nelda, Olivera. 2012. Bioprospection of marine micro-organisms: Biotechnological applications and methods. Revista Argentina Microbiol, 44:49–60.

Edwards, Rodriguez-Brito, Wegley et al. 2006. Using pyrosequencing to shed light on deep mine microbial ecology. BMC Genom, 7:57–70.

Enright, Spratt. 1999. Multilocus sequence typing. Trends Microbiol, 7:482–487.

Eyers, George, Schuler, Stenuit, Agathos, Fantroussi. 2004. Environmental genomics: Exploring the unmined richness of microbes to degrade xenobiotics. Appl Microbiol Biotechnol, 66:123–130.

Ferrer, Beloqui, Timmis, Golyshin. 2009. Metagenomics for mining new genetic resources of microbial communities. J Mol Microbiol Biotechnol, 16:109–123.

Fromin, Hamelin, Tarnawski et al. 2002. Statistical analysis of denaturing gel electrophoresis (DGE) fingerprinting patterns. Environ Microbiol, 4:634–643.

Gabor, Liebeton, Niehaus, Eck, Lorenz. 2007. Updating the metagenomics toolbox. Biotechnol J, 2:201–206.

Gao, Yang, Gentry, Wu, Schadt, Zhou. 2007. Microarray-based analysis of microbial community RNAs by whole-community RNA amplification. Appl Environ Microbiol, 73:563–571.

Gentry, Wickham, Schadt, He, Zhou. 2006. Microarray applications in microbial ecology research. Microb Ecol, 52:159–175.

Gilbride, Lee, Beaudette. 2006. Molecular techniques in wastewater: Understanding microbial communities, detecting pathogens, and real-time process control. J Microbiol Methods, 66:1–20.

Goldberg, Johnson, Busam et al. 2006. A Sanger/pyrosequencing hybrid approach for the generation of high-quality draft assemblies of marine microbial genomes. Proc Natl Acad Sci USA, 103:11240–11245.

Greene, Voordouw. 2003. Analysis of environmental microbial community by reverse sample genome probing. J Microbiol Methods, 53:211–219.

Gürtler, Stanisich. 1996. New approaches to typing and identification of bacteria using the 16S-23S rDNA spacer region. Microbiology, 142:3–16.

Hallam, Konstantinidis, Putnam et al. 2006. Genomic analysis of the uncultivated marine crenarchaeote Cenarchaeum symbiosum. Proc Natl Acad Sci USA, 103:18296–18301.

Handelsman, Rondon, Brady, Clardy, Goodman. 1998. Molecular biological access to the chemistry of unknown soil microbes: A new frontier for natural products. Chem Biol, 5:245–249.

Handelsman. 2004. Metagenomics: Application of genomics to uncultured microorganisms. Microbiol Mol Biol Rev, 68:669–685.

Handelsman. 2005. Sorting out metagenomes. Nature Biotechnol, 23:38–39.

Harms, Layton, Dionisi et al. 2003. Real-time PCR quantification of nitrifying bacteria in a municipal waste water treatment plant. Environ Sci Technol, 37:343–351.

He, Gentry, Schadt et al. 2007. GeoChip: A comprehensive microarray for investigating biogeochemical, ecological and environmental processes. ISME J, 1:67–77.

Horz, Yimga, Liesack. 2001. Detection of methanotroph diversity on roots of submerged rice plants by molecular retrieval of pmoA, mmoX, mxaF, and 16S rRNA and ribosomal DNA, including pmoA-based terminal restriction fragment length polymorphism profiling. Appl Environ Microbiol, 67:4177–4185.

Hugenholtz, Tyson. 2008. Metagenomics. Nature, 455:481–483.

Ingham, Sprenkels, Bomer et al. 2007. The micro-petri dish, a million-well growth chip for the culture and high-throughput screening of microorganisms. Proc Natl Acad Sci USA, 104:17–22.

Inglis, Kalischuk. 2004. Direct quantification of Campylobacter jejuni and Campylobacter lanienae in faeces of cattle by real-time quantitative PCR. Appl Environ Microbiol, 70:2296–2306.

Joint, Muhling, Querellou. 2010. Culturing marine bacteria. An essential prerequisite for biodiscovery. Microbiol Biotechnol, 3:564–575.

Kaeberlein, Lewis, Epstein. 2002. Isolating “uncultivable” microorganisms in pure culture in a simulated natural environment. Science, 296:1127–1129.

Keller, Zengler. 2004. Tapping into microbial diversity. Nature Rev Microbiol, 2:141–150.

Klappenbach, Saxman, Cole, Schmidt. 2001. Rrndb: The ribosomal RNA operon copy number database. Nucleic Acids Res, 29:181–184.

Kolbert, Persing. 1999. Ribosomal DNA sequencing as a tool for identification of bacterial pathogens. Curr Opin Microbiol, 2:299–305.

Kunin, Copeland, Lapidus, Konstantinos, Hugenholtz. 2008a. Bioinformatician's guide to metagenomics. Microbiol Mol Biol Rev, 72:557–578.

Kunin, Raes, Harris et al. 2008b. Millimeter-scale genetic gradients and community-level molecular convergence in a hypersaline microbial mat. Mol Syst Biol, 4:198–204.

Lane, Pace, Olsen, Stahl, Sogin, Pace. 1985. Rapid determination of 16S ribosomal RNA sequences for phylogenetic analyses. Proc Natl Acad Sci USA, 82:6955–6959.

Langer, Gabor, Liebeton et al. 2006. Metagenomics: An inexhaustible access to nature's diversity. Biotechnol J, 1:815–821.

Lee, Cote. 2006. Phylogenetic analysis of γ-proteobacteria inferred from nucleotide sequence comparisons of the house-keeping genes adk, aroE and gdh: Comparisons with phylogeny inferred from 16S rRNA gene sequences. J Gen Appl Microbiol, 52:147–158.

Lee, Kwon, Kang, Cha, Kim, Lee. 2010. Approaches for novel enzyme discovery from marine environments. Curr Opin Biotechnol, 21:353–357.

Lee, Lewis, Ashman. 2009. Microbial flocculation, a potentially low-cost harvesting technique for marine microalgae for production of biodiesel. J Appl Phycol, 21:559–567.

Lerat, England, Vincent et al. 2005. Real-time polymerase chain reaction quantification of the transgenes for roundup ready corn and roundup ready soybean in soil samples. J Agricult Food Chem, 53:1337–1342.

Lewis, Epstein, D'onofrio, Ling. 2010. Uncultured microorganisms as a source of secondary metabolites. J Antibiot, 63:1–9.

Liu, Zhu. 2005. Environmental microbiology-on-a-chip and its future impacts. Trends Biotechnol, 23:174–179.

Lorenz, Eck. 2005. Metagenomics and industrial applications. Nature Rev Microbiol, 3:510–516.

Lovely. 2003. Cleaning up with genomics: Applying molecular biology to bioremediation. Nature Rev Microbiol, 1:35–44.

Ludwig. 2010. Molecular phylogeny of microorganisms: Is rRNA still a useful marker? In: Molecular Phylogeny of Microorganisms. Oren, Papke, eds. Caister Academic Press: Norfolk, U.K., 65–85.

Maldonado, Stach, Pathom-Aree, Ward, Bull, Goodfellow. 2005. Diversity of cultivable Actinobacteria in geographically widespread marine sediments. Antonie van Leeuwenhoek, 87:11–18.

Malik Seidu, Michael Megharaj, Naidu. 2008. The use of molecular techniques to characterize the microbial communities in contaminated soil and water. Environ Intl, 34:265–276.

Mira, Martín-Cuadrado, D'auria, Rodríguez-Valera. 2010. The bacterial pan-genome: A new paradigm in microbiology. Intl Microbiol, 13:45–57.

Neufeld, Yu, Lam, Mohn. 2004. Serial analysis of ribosomal sequence tags (SARST): A high-throughput method for profiling complex microbial communities. Environ Microbiol, 6:131–144.

Nichols, Lewis, Orjala et al. 2008. Short peptide induces an “uncultivable” microorganism to grow in vitro. Appl Environ Microbiol, 74:4889–4897.

Nichols, Cahoon, Trakhtenberg et al. 2010. Use of ichip for high-throughput in situ cultivation of “uncultivable” microbial species. Appl Environ Microbiol, 76:2445–2450.

Nocker, Burn, Camper. 2007. Genotypic microbial community profiling: A critical technical review. Microbial Ecol, 54:276–289.

Nyyssonen, Piskonen, Itavaara. 2006. A targeted real-time PCR assay for studying naphthalene degradation in the environment. Microb Ecol, 52:533–543.

Pace, Stahl, Lane, Olsen. 1985. Analyzing natural microbial populations by rRNA sequences. ASM News, 51:4–12.

Paul, Pandey, Jain. 2005. Accessing microbial diversity for bioremediation and environmental restoration. Trends Biotechnol, 23:135–142.

Penesyan, Marshall-Jones, Holmstrom, Kjelleberg, Egan. 2009. Antimicrobial activity observed among cultured marine epiphytic bacteria reflects their potential as a source of new drugs. FEMS Microbiol Ecol, 69:113–124.

Platonov, Shipulin, Platonova. 2000. Multilocus sequence typing: A new method and the first results in the genotyping of bacteria. Russ J Genet, 36:481–487.

Powell, Ferguson, Bowman, Snape. 2006. Using real-time PCR to assess changes in the hydrocarbon-degrading microbial community in antarctic soil during bioremediation. Microb Ecol, 52:523–532.

Pontes, Lima BCI, Chartone, Maral Nascimento. 2007. Molecular approaches: Advantages and artifacts in assessing bacterial diversity. J Indust Microbiol Biotechnol, 34:463–473.

Raes, Foerstner, Bork. 2007. Get the most out of your metagenome: Computational analysis of environmental sequence data. Curr Opin Microbiol, 10:490–498.

Rappe, Giovannoni. 2003. The uncultured microbial majority. Ann Rev Microbiol, 57:369–394.

Rhee, Liu, Wu, Chong, Wan, Zhou. 2004. Detection of genes involved in biodegradation and biotransformation in microbial communities by using 50-mer oligonucleotide microarrays. Appl Environ Microbiol, 70:4303–4317.

Ritalahti, Amos, Sung, Wu, Koenigsberg, Loffler. 2006. Quantitative PCR targeting 16S rRNA and reductive dehalogenase genes simultaneously monitors multiple Dehalococcoides strains. Appl Environ Microbiol, 72:2765–2774.

Robertson, Steer. 2004. Recent progress in biocatalyst discovery and optimization. Curr Opin Chem Biol, 8:141–149.

Sanz, Kochling. 2007. Molecular biology techniques used in waste water treatment: An overview. Process Biochem, 42:119–133.

Schneiker, Dos Santos, Bartels et al. 2006. Genome sequence of the ubiquitous hydrocarbon-degrading marine bacterium Alcanivorax borkumensis. Nature Biotechnol, 24:997–1004.

Sebat, Colwell, Crawford. 2003. Metagenomic profiling: Microarray analysis of an environmental genomic library. Appl Environ Microbiol, 69:4927–4934.

Shendure, Ji. 2008. Next-generation DNA sequencing. Nature Biotechnol, 26:1135–1145.

Sprenkels CJA, Bomer, Molenaar et al. 2007. The micro-petri dish, a million-well growth chip for the culture and high-throughput screening of microorganisms. Proc Natl Acad Sci USA, 104:18217–18222.

Stenuit, Eyers, Rozenberg, Habib-Jiwan, Agathos. 2006. Aerobic growth of Escherichia coli with 2,4,6-trinitrotoluene (TNT) as the sole nitrogen source and evidence of TNT denitration by whole cells and cell-free extracts. Appl Environ Microbiol, 72:7945–7948.

Stenuit, Eyers, Schuler, Agathos, George. 2008. Emerging high-throughput approaches to analyze bioremediation of sites contaminated with hazardous and/or recalcitrant wastes. Biotechnol Adv, 26:561–575.

Tettelin, Riley, Cattuto, Medini. 2008. Comparative genomics: The bacterial pan-genome. Curr Opin Microbiol, 11:472–477.

Thomas, Padilla-Crespo, Jardine, Sanford, Loffler. 2009. Diversity and distribution of Anaeromyxobacter strains in a uranium-contaminated subsurface environment with a non-uniform groundwater flow. Appl Environ Microbiol, 75:3679–3687.

Toledo, Rappe, Elkins, Mathur, Short, Keller. 2002. Cultivating the uncultured. Proc Natl Acad Sci USA, 99:15681–15686.

Torsvik, Ovreas, Thingstad. 2002. Prokaryotic diversity-magnitude, dynamics, and controlling factors. Science, 296:1064–1066.

Tringe, Rubin. 2005. Metagenomics: DNA sequencing of environmental samples. Nature Rev Genet, 6:805–814.

Urich, Lanzen, Qi, Huson, Schleper, Schuster. 2008. Simultaneous assessment of soil microbial community structure and function through analysis of the meta-transcriptome. PLoS ONE, 3:e2527.

Uchiyama, Miyazaki. 2010. Product-induced gene expression, a product-responsive reporter assay used to screen metagenomics libraries for enzyme-encoding genes. Appl Environ Microbiol, 76:7029–7035.

Valenzuela, Chian, Orell et al. 2006. Genomics, metagenomics and proteomics in biomining microorganisms. Biotechnol Adv, 24:197–211.

Van Hamme, Singh, Ward. 2003. Recent advances in petroleum microbiology. Microbiol Mol Biol Rev, 67:503–549.

Venter, Remington, Heidelberg et al. 2004. Environmental genome shotgun sequencing of the Sargasso sea. Science, 304:66–74.

Vinuesa. 2010. Multilocus sequence analysis and bacterial species phylogeny estimation. In: Molecular Phylogeny of Microorganisms. Oren, Papke, eds. Academic Press.

Wagner, Nielsen, Loy, Nielsen, Daims. 2006. Linking microbial community structure with function: Fluorescence in situ hybridization-micro-autoradiography and isotope arrays. Curr Opin Biotechnol, 17:83–91.

Widada, Nojiri, Omori. 2002. Recent developments in molecular techniques for identification and monitoring of xenobiotic-degrading bacteria and their catabolic genes in bioremediation. Appl Microbiol Biotechnol, 60:45–59.

Wilson, Tatford, Yin, Rajki, Walsh, Larock. 1999. Species specific detection of hydrocarbon-utilizing bacteria. J Microbiol Methods, 39:59–78.

Williamson, Borlee, Schloss, Guan, Allen, Handelsman. 2005. Intracellular screen to identify metagenomic clones that induce or inhibit a quorum-sensing biosensor. Appl Environ Microbiol, 71:6335–6344.

Wommack, Bhavsar, Ravel. 2008. Metagenomics: Read length matters. Appl Environ Microbiol, 74:1453–1463.

Wu, Liu, Schadt, Zhou. 2006. Microarray-based analysis of subnanogram quantities of microbial community DNAs by using whole-community genome amplification. Appl Environ Microbiol, 72:4931–4941.

Wu, Thompson, Liu et al. 2004. Development and evaluation of microarray-based whole-genome hybridization for detection of microorganisms within the context of environmental applications. Environ Sci Technol, 38:6775–6782.

Wu, Thompson, Li, Hurt, Tiedje, Zhou. 2001. Development and evaluation of functional gene arrays for detection of selected genes in the environment. Appl Environ Microbiol, 67:5780–5790.

Yun, Ryu. 2005. Screening for novel enzymes from metagenome and SIGEX, as a way to improve it. Microb Cell Factories, 4:8.

Zengler, Toledo, Rappe, Elkins, Mathur, Short, Keller. 2002. Cultivating the uncultured. Proc Natl Acad Sci USA, 99:15681–15686.

Zengler, Walcher, Clark et al. 2005. High throughput cultivation of microorganisms using microcapsules. Methods Enzymol, 397:124–130.

Zhao, Poh. 2008. Insights into environmental bioremediation by microorganisms through functional genomics and proteomics. Proteomics, 8:874–881.

Zhou. 2003. Microarrays for bacterial detection and microbial community analysis. Curr Opin Microbiol, 6:288–294.