Overview
MetaPathways [CIT2002] is a meta’omic pipeline for the annotation and analysis of environmental sequence information.
MetaPathways accepts metagenomic or metatranscriptomic sequence data in one of several file formats
(.fasta, .gff, or .gbk). The pipeline consists of five operational stages, described below.
Pipeline Overview
MetaPathways is composed of five general stages, encompassing a number of analytical or data handling steps (Figure 1):
- QC and ORF Prediction: MetaPathways performs basic quality control (QC), including duplicate-sequence
removal and sequence trimming. Open Reading Frame (ORF) prediction is then performed on the QC’ed sequences
using Prodigal [PRODIGAL2010] or GeneMark [GeneMark12]. The final translated ORFs are
also trimmed according to a user-defined setting.
- MetaPathways Steps: PREPROCESS INPUT, ORF PREDICTION, and FILTER AMINOS
- Functional and Taxonomic Annotation: Using the seed-and-extend homology search algorithms BLAST
[BLAST90] or LAST [LAST11], MetaPathways conducts searches against functional and taxonomic databases.
- MetaPathways Steps: FUNC SEARCH, PARSE FUNC SEARCH, SCAN rRNA, and ANNOTATE ORFS
- Analyses: After sequence annotation, MetaPathways performs further analyses, including taxonomic
assignment with the Lowest Common Ancestor (LCA) algorithm [MEGAN07] and tRNA detection with
tRNAscan-SE [TRNASCAN97], and prepares the detected annotations for environmental Pathway/Genome
Database (ePGDB) creation via Pathway Tools.
- MetaPathways Steps: PATHOLOGIC INPUT, CREATE ANNOT REPORTS, and COMPUTE RPKM
- ePGDB Creation: MetaPathways then predicts MetaCyc pathways using the Pathway Tools software and its
pathway prediction algorithm, PathoLogic [KARP11]. The result is an environmental Pathway/Genome
Database (ePGDB), a data structure that integrates sequences, genes, pathways, and literature
annotations for interpretation.
- MetaPathways Steps: BUILD ePGDB
- Pathway Export: MetaCyc pathways or reactions are exported in a tabular format for downstream
analysis. As of the v2.5 release, MetaPathways performs this step automatically.
- MetaPathways Steps: BUILD ePGDB
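Two of the computations named in the stages above, LCA taxonomic assignment and RPKM, can be illustrated with a short Python sketch. This is not MetaPathways' own code: the function names, the child-to-parent dictionary used as a taxonomy, and the toy taxa are assumptions made for illustration, and the RPKM formula shown is the standard definition (reads per kilobase of sequence per million mapped reads), which may differ in detail from what the COMPUTE RPKM step does.

```python
from typing import Dict, Iterable, List


def lineage(taxon: str, parent: Dict[str, str]) -> List[str]:
    """Return the path from the root down to `taxon`.

    `parent` maps each taxon to its parent; the root has no entry.
    The mapping is assumed to describe a tree (no cycles).
    """
    path = [taxon]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path[::-1]


def lowest_common_ancestor(taxa: Iterable[str], parent: Dict[str, str]) -> str:
    """Return the deepest taxon shared by the lineages of all `taxa`."""
    lineages = [lineage(t, parent) for t in taxa]
    if not lineages:
        raise ValueError("at least one taxon is required")
    lca = None
    # Walk all lineages from the root downward until they diverge.
    for level in zip(*lineages):
        if len(set(level)) != 1:
            break
        lca = level[0]
    if lca is None:
        raise ValueError("taxa do not share a common root")
    return lca


def rpkm(mapped_reads: int, length_bp: int, total_mapped_reads: int) -> float:
    """Standard RPKM: reads per kilobase per million mapped reads."""
    if length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total read count must be positive")
    return mapped_reads * 1e9 / (length_bp * total_mapped_reads)


# Hypothetical toy taxonomy (child -> parent), for illustration only.
TOY_TAXONOMY = {
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Gammaproteobacteria": "Proteobacteria",
    "Alphaproteobacteria": "Proteobacteria",
    "Firmicutes": "Bacteria",
}

if __name__ == "__main__":
    hits = ["Gammaproteobacteria", "Alphaproteobacteria"]
    print(lowest_common_ancestor(hits, TOY_TAXONOMY))
    print(rpkm(mapped_reads=50, length_bp=1000, total_mapped_reads=1_000_000))
```

In this sketch, an ORF whose hits fall in two different classes of Proteobacteria is assigned to the phylum, while hits spanning Proteobacteria and Firmicutes fall back to Bacteria. Assigning to the deepest shared taxon trades resolution for a lower risk of over-specific taxonomic calls.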
Output Format
Visualizing Output
| [PRODIGAL2010] | D. Hyatt et al., Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11, 119 (2010). |
| [GeneMark12] | D. Hyatt, P. F. LoCascio, L. J. Hauser, E. C. Uberbacher, Gene and translation initiation site prediction in metagenomic sequences. Bioinformatics 28, 2223–2230 (2012). |
| [BLAST90] | S. F. Altschul, W. Gish, W. Miller, E. W. Myers, D. J. Lipman, Basic local alignment search tool. J Mol Biol 215, 403–410 (1990). |
| [LAST11] | S. M. Kiełbasa, R. Wan, K. Sato, P. Horton, M. C. Frith, Adaptive seeds tame genomic sequence comparison. Genome Res 21, 487–493 (2011). |
| [MEGAN07] | D. H. Huson, A. F. Auch, J. Qi, S. C. Schuster, MEGAN analysis of metagenomic data. Genome Res 17, 377–386 (2007). |
| [TRNASCAN97] | T. M. Lowe, S. R. Eddy, tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25, 955–964 (1997). |
| [KARP11] | P. D. Karp, M. Latendresse, R. Caspi, The pathway tools pathway prediction algorithm. Stand Genomic Sci 5, 424–429 (2011). |