Microbial community analysis: Metagenomics
Thanks to recent technological advances, methods for elucidating microbial community structure have shifted from indirect approaches, such as DGGE, T-RFLP and DNA microarrays, to the direct approach known as metagenomics (Rondon et al., 2000; Schmidt et al., 1991). Metagenomics is the study of the collective genetic material extracted directly from environmental samples, and it does not rely on cultivation or prior knowledge of the microbial communities (Riesenfeld et al., 2004). It is therefore a powerful tool for unravelling environmental genetic diversity without the biases that can result from culturing or isolation. Metagenomics is also known as environmental genomics, community genomics or microbial ecogenomics (Rastogi and Sani, 2011). Its two major questions are which organisms are present and what metabolic processes are possible in the community (Allen and Banfield, 2005). The former is addressed mainly by profiling the 16S rRNA gene, the prevalent marker gene for identifying prokaryotic species (Weisburg et al., 1991). Metagenomic investigations have been conducted in many environments, including the oceans, soil, the phyllosphere and acid mine drainage, and have provided access to the phylogenetic and functional diversity of uncultured micro-organisms (Handelsman, 2004).
Metagenomics has long faced several major technical limitations. In early studies, PCR was typically used to selectively amplify target genes, which were then cloned into vectors for sequencing (Lane et al., 1985). This approach can amplify a minute amount of a target gene from bulk DNA to a quantity sufficient for analysis, but it is subject to PCR-inherent bias (Polz and Cavanaugh, 1998) and thus may not reflect the actual microbial community structure. With advances in biotechnology and bioinformatics, the need for PCR could be avoided by applying shotgun sequencing to metagenomics (Breitbart et al., 2002; Tyson et al., 2004). This was made feasible by using randomly sheared environmental DNA directly as the inserts to be sequenced, but the potential bias imposed by cloning remained a significant concern in shotgun metagenomics (Handelsman, 2004).
As described above, NGS methods such as Roche 454 pyrosequencing have revolutionized metagenomics, not only by producing large amounts of data at low cost but also by eliminating time-consuming and bias-imposing steps such as clone library construction.
To collect metagenomic data of this kind, DNA is extracted from an entire microbial community, and a target region flanked by highly conserved primer sites is amplified by PCR before sequencing. This generates a mixture of amplicons in which every read stems from a homologous region, so the sequence variation among the reads reflects the phylogenetic diversity of the community (Quince et al., 2009). The hypervariable regions of the 16S rRNA gene are the usual targets for pyrosequencing. The resulting reads are short (400 to 500 bp) but provide useful phylogenetic information. For example, spatial changes in soil bacterial communities were investigated using 88 soil samples and a large-scale bar-coded pyrosequencing technique (Lauber et al., 2009). The V1 and V2 hypervariable regions of the 16S rRNA gene were the sequencing target. The results showed that soil bacterial communities are extremely diverse, containing at least 1000 species per soil sample. A large “rare biosphere”, represented by an enormous number of low-abundance unique taxa, also supports this finding. Such studies highlight the importance of large-scale sequencing techniques for investigating highly diverse soil microbial communities (Rastogi and Sani, 2011). Microbial metagenomic sequencing has now become generally affordable, and researchers are flooded with an unprecedented amount of DNA sequence data from various environments (Huber et al., 2007; Jones et al., 2009; Warnecke et al., 2007; Wegley et al., 2007).
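The bar-coded amplicon workflow described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the cited studies. It assumes that each read begins with a short sample barcode, and it treats every distinct sequence as one taxon, whereas real analyses usually denoise the reads and cluster them into operational taxonomic units. The barcodes, sample names, toy reads and the `rare_max` threshold are all hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical barcodes and sample names (illustration only).
BARCODES = {"ACGT": "soil_A", "TGCA": "soil_B"}

# Toy reads: a 4-nt barcode followed by a (very short) amplicon sequence.
TOY_READS = [
    "ACGTAAAA", "ACGTAAAA", "ACGTCCCC",  # soil_A: two distinct sequences
    "TGCAGGGG", "TGCAGGGG",              # soil_B: one distinct sequence
    "NNNNAAAA",                          # unknown barcode
]


def match_barcode(read, barcodes):
    """Return (sample, trimmed_read), or None if no barcode matches."""
    for tag, sample in barcodes.items():
        if read.startswith(tag):
            return sample, read[len(tag):]
    return None


def demultiplex(reads, barcodes):
    """Group reads by sample; reads without a known barcode are skipped."""
    by_sample = defaultdict(list)
    for read in reads:
        hit = match_barcode(read, barcodes)
        if hit is not None:
            sample, trimmed = hit
            by_sample[sample].append(trimmed)
    return dict(by_sample)


def profile(seqs, rare_max=1):
    """Summarise one sample, treating each distinct sequence as a taxon."""
    counts = Counter(seqs)
    total = sum(counts.values())
    # Shannon index H = -sum(p_i * ln p_i) over relative abundances p_i.
    shannon = -sum((n / total) * math.log(n / total) for n in counts.values())
    # Taxa seen at most rare_max times stand in for the "rare biosphere".
    rare = sum(1 for n in counts.values() if n <= rare_max)
    return {
        "reads": total,
        "richness": len(counts),
        "shannon": shannon,
        "rare_taxa": rare,
    }


if __name__ == "__main__":
    for sample, seqs in sorted(demultiplex(TOY_READS, BARCODES).items()):
        print(sample, profile(seqs))
```

Richness (the number of distinct sequences) and the count of low-abundance sequences are only crude stand-ins for the species counts and the rare biosphere discussed above, because sequencing errors inflate both unless the reads are denoised first.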