Phylogenetic classification of short environmental DNA fragments

Krause L, Diaz NN, Goesmann A, Kelley S, Nattkemper TW, Rohwer F, Edwards RA, Stoye J (2008)
Nucleic Acids Research 36(7): 2230-2239.

Journal Article | Published | English
Metagenomics is providing striking insights into the ecology of microbial communities. The recently developed massively parallel 454 pyrosequencing technique makes it possible to obtain metagenomic sequences rapidly, at low cost, and without cloning bias. However, the phylogenetic analysis of the short reads it produces is a significant computational challenge. This article describes CARMA, a phylogenetic algorithm for predicting the source organisms of environmental 454 reads. The algorithm searches the unassembled reads of a sample for conserved Pfam domain and protein families. These gene fragments (environmental gene tags, EGTs) are classified into a higher-order taxonomy based on the reconstruction of a phylogenetic tree for each matching Pfam family. The method exhibits high accuracy for a wide range of taxonomic groups, and EGTs as short as 27 amino acids can be phylogenetically classified up to the rank of genus. The algorithm was applied in a comparative study of three aquatic microbial samples obtained by 454 pyrosequencing, which clearly revealed profound differences in their taxonomic composition.
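The abstract does not say how a tree placement is turned into a taxonomic label, so the following is only a minimal sketch of one plausible final step, not CARMA's actual procedure. It assumes that the reference family members clustering with an EGT in the reconstructed tree are already known, and it assigns the EGT the deepest rank on which all of those neighbors agree. The function name `classify_egt`, the `RANKS` tuple, and the example lineages are hypothetical and chosen for illustration.

```python
from typing import List, Optional, Sequence, Tuple

# Assumed rank order, from most general to most specific (illustrative only).
RANKS = ("superkingdom", "phylum", "class", "order", "family", "genus")


def classify_egt(
    neighbor_lineages: Sequence[Sequence[str]],
) -> Tuple[str, Optional[str]]:
    """Assign the deepest rank shared by all tree neighbors of an EGT.

    Each lineage lists taxon names in the order given by RANKS.
    Returns (rank, taxon), or ("unclassified", None) if nothing is shared.
    """
    if not neighbor_lineages:
        return ("unclassified", None)

    assigned: Tuple[str, Optional[str]] = ("unclassified", None)
    # Walk the ranks from general to specific; stop at the first disagreement.
    for rank, *names in zip(RANKS, *neighbor_lineages):
        if len(set(names)) != 1:
            break
        assigned = (rank, names[0])
    return assigned


# Hypothetical neighbors of one EGT in a Pfam family tree.
neighbors: List[List[str]] = [
    ["Bacteria", "Proteobacteria", "Gammaproteobacteria",
     "Enterobacterales", "Enterobacteriaceae", "Escherichia"],
    ["Bacteria", "Proteobacteria", "Gammaproteobacteria",
     "Enterobacterales", "Enterobacteriaceae", "Salmonella"],
]
print(classify_egt(neighbors))  # ('family', 'Enterobacteriaceae')
```

Under this assumption, an EGT is classified to genus only when every neighbor shares the same genus; otherwise it falls back to the most specific rank the neighbors have in common.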
Cite this

Krause L, Diaz NN, Goesmann A, et al. Phylogenetic classification of short environmental DNA fragments. Nucleic Acids Research. 2008;36(7):2230-2239.
Krause, L., Diaz, N. N., Goesmann, A., Kelley, S., Nattkemper, T. W., Rohwer, F., Edwards, R. A., et al. (2008). Phylogenetic classification of short environmental DNA fragments. Nucleic Acids Research, 36(7), 2230-2239.
Krause, L., Diaz, N. N., Goesmann, A., Kelley, S., Nattkemper, T. W., Rohwer, F., Edwards, R. A., and Stoye, J. (2008). Phylogenetic classification of short environmental DNA fragments. Nucleic Acids Research 36, 2230-2239.
Krause, L., et al., 2008. Phylogenetic classification of short environmental DNA fragments. Nucleic Acids Research, 36(7), p 2230-2239.
L. Krause, et al., “Phylogenetic classification of short environmental DNA fragments”, Nucleic Acids Research, vol. 36, 2008, pp. 2230-2239.
Krause, L., Diaz, N.N., Goesmann, A., Kelley, S., Nattkemper, T.W., Rohwer, F., Edwards, R.A., Stoye, J.: Phylogenetic classification of short environmental DNA fragments. Nucleic Acids Research. 36, 2230-2239 (2008).
Krause, Lutz, Diaz, Naryttza N., Goesmann, Alexander, Kelley, Scott, Nattkemper, Tim Wilhelm, Rohwer, Forest, Edwards, Robert A., and Stoye, Jens. “Phylogenetic classification of short environmental DNA fragments”. Nucleic Acids Research 36.7 (2008): 2230-2239.
Access Level: Open Access






PMID: 18285365