AKE - The Accelerated k-mer Exploration Web-Tool for Rapid Taxonomic Classification and Visualization

Langenkämper D, Goesmann A, Nattkemper TW (2014)
BMC Bioinformatics 15(1): 384.

Journal article | Published | English
 
Abstract / Note
Background: The advent of low-cost, fast sequencing technologies has made metagenomic analyses possible. However, the large data volumes gathered by these techniques, and the unpredictable diversity captured in them, remain a challenge for computational biology. Results: In this paper we address the problem of rapid taxonomic assignment with small, adaptive data models (< 5 MB) and present the accelerated k-mer explorer (AKE). AKE accelerates taxonomic assignment by means of a special machine learning architecture that is well suited to modeling intrinsically hierarchical data collections. We report reasonably good classification accuracy for ranks down to order, observed in a study on real-world data (Acid Mine Drainage, Cow Rumen). Conclusion: We show that the execution time of this approach is orders of magnitude shorter than that of competing approaches, while its accuracy is comparable. The tool is available to the public as a web application.
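As an illustration of the kind of k-mer feature the abstract alludes to, the following is a minimal Python sketch that turns a DNA read into a normalized k-mer frequency profile. It is not AKE's implementation: the function name `kmer_profile`, the choice of k = 4, the skipping of windows with ambiguous bases, and the normalization are all assumptions made for this example, and the hierarchical model that AKE uses for classification is not shown.

```python
from collections import Counter
from itertools import product


def kmer_profile(sequence, k=4):
    """Return the relative frequency of every DNA k-mer in `sequence`.

    Hypothetical helper for illustration only; k=4 is an assumed default.
    Windows containing characters other than A, C, G, T are skipped.
    """
    alphabet = "ACGT"
    valid = set(alphabet)
    sequence = sequence.upper()

    # Count every overlapping window of length k that is unambiguous.
    counts = Counter(
        sequence[i:i + k]
        for i in range(len(sequence) - k + 1)
        if set(sequence[i:i + k]) <= valid
    )
    total = sum(counts.values())

    # Emit all 4**k possible k-mers so every read yields a vector
    # with the same keys in the same (lexicographic) order.
    all_kmers = ("".join(p) for p in product(alphabet, repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}


if __name__ == "__main__":
    profile = kmer_profile("ACGTACGT", k=2)
    print(profile["AC"], profile["TA"])
```

A fixed-length vector of this kind is a common input for composition-based classifiers; which k and which normalization the paper uses should be taken from the full text.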
Keywords
machine learning; classification; genomics; metagenomics; big data; H2SOM; high performance computing; web-based; acceleration; bioinformatics
Publication year
2014
Journal title
BMC Bioinformatics
Volume
15
Issue
1
Article number
384
ISSN
1471-2105
eISSN
1471-2105
Funding information
Open access publication costs were funded by the Deutsche Forschungsgemeinschaft and Bielefeld University.
Page URI
https://pub.uni-bielefeld.de/record/2705712

Cite

Langenkämper D, Goesmann A, Nattkemper TW. AKE - The Accelerated k-mer Exploration Web-Tool for Rapid Taxonomic Classification and Visualization. BMC Bioinformatics. 2014;15(1): 384.
Langenkämper, D., Goesmann, A., & Nattkemper, T. W. (2014). AKE - The Accelerated k-mer Exploration Web-Tool for Rapid Taxonomic Classification and Visualization. BMC Bioinformatics, 15(1), 384. doi:10.1186/s12859-014-0384-0
Langenkämper, Daniel, Goesmann, Alexander, and Nattkemper, Tim Wilhelm. 2014. “AKE - The Accelerated k-mer Exploration Web-Tool for Rapid Taxonomic Classification and Visualization”. BMC Bioinformatics 15 (1): 384.
Langenkämper, D., Goesmann, A., and Nattkemper, T. W. (2014). AKE - The Accelerated k-mer Exploration Web-Tool for Rapid Taxonomic Classification and Visualization. BMC Bioinformatics 15:384.
Langenkämper, D., Goesmann, A., & Nattkemper, T.W., 2014. AKE - The Accelerated k-mer Exploration Web-Tool for Rapid Taxonomic Classification and Visualization. BMC Bioinformatics, 15(1): 384.
D. Langenkämper, A. Goesmann, and T.W. Nattkemper, “AKE - The Accelerated k-mer Exploration Web-Tool for Rapid Taxonomic Classification and Visualization”, BMC Bioinformatics, vol. 15, 2014, 384.
Langenkämper, D., Goesmann, A., Nattkemper, T.W.: AKE - The Accelerated k-mer Exploration Web-Tool for Rapid Taxonomic Classification and Visualization. BMC Bioinformatics. 15, 384 (2014).
Langenkämper, Daniel, Goesmann, Alexander, and Nattkemper, Tim Wilhelm. “AKE - The Accelerated k-mer Exploration Web-Tool for Rapid Taxonomic Classification and Visualization”. BMC Bioinformatics 15.1 (2014): 384.
All files available under the following license(s):
Copyright Statement:
This object is protected by copyright and/or related rights. [...]
Full text(s)
Access Level
OA Open Access
Last uploaded
2019-09-06T09:18:28Z
MD5 checksum
8ea9017509f3173c02aa3c9fc6cfbea0


Link(s) to full text(s)
Access Level
Restricted Closed Access

3 citations in Europe PMC

Data provided by Europe PubMed Central.

Considerations for Optimization of High-Throughput Sequencing Bioinformatics Pipelines for Virus Detection.
Lambert C, Braxton C, Charlebois RL, Deyati A, Duncan P, La Neve F, Malicki HD, Ribrioux S, Rozelle DK, Michaels B, Sun W, Yang Z, Khan AS., Viruses 10(10), 2018
PMID: 30262776
Assessment of k-mer spectrum applicability for metagenomic dissimilarity analysis.
Dubinkina VB, Ischenko DS, Ulyantsev VI, Tyakht AV, Alexeev DG., BMC Bioinformatics 17, 2016
PMID: 26774270
Comparison of Acceleration Techniques for Selected Low-Level Bioinformatics Operations.
Langenkämper D, Jakobi T, Feld D, Jelonek L, Goesmann A, Nattkemper TW., Front Genet 7, 2016
PMID: 26904094

Sources

PMID: 25495116
PubMed | Europe PMC
