Analyzing large scale genomic data on the cloud with Sparkhit

Huang L, Krüger J, Sczyrba A (2018)
Bioinformatics 34(9): 1457-1465.

Journal article | Published | English
 
Download
No files have been uploaded. Publication record only.
Publication year
2018
Journal title
Bioinformatics
Volume
34
Issue
9
Page(s)
1457-1465
ISSN
1367-4803
Page URI
https://pub.uni-bielefeld.de/record/2915890

Cite

Huang L, Krüger J, Sczyrba A. Analyzing large scale genomic data on the cloud with Sparkhit. Bioinformatics. 2018;34(9):1457-1465.
Huang, L., Krüger, J., & Sczyrba, A. (2018). Analyzing large scale genomic data on the cloud with Sparkhit. Bioinformatics, 34(9), 1457-1465. doi:10.1093/bioinformatics/btx808
Huang, Liren, Krüger, Jan, and Sczyrba, Alexander. 2018. “Analyzing large scale genomic data on the cloud with Sparkhit”. Bioinformatics 34 (9): 1457-1465.
Huang, L., Krüger, J., and Sczyrba, A. (2018). Analyzing large scale genomic data on the cloud with Sparkhit. Bioinformatics 34, 1457-1465.
Huang, L., Krüger, J., & Sczyrba, A., 2018. Analyzing large scale genomic data on the cloud with Sparkhit. Bioinformatics, 34(9), p 1457-1465.
L. Huang, J. Krüger, and A. Sczyrba, “Analyzing large scale genomic data on the cloud with Sparkhit”, Bioinformatics, vol. 34, 2018, pp. 1457-1465.
Huang, L., Krüger, J., Sczyrba, A.: Analyzing large scale genomic data on the cloud with Sparkhit. Bioinformatics. 34, 1457-1465 (2018).
Huang, Liren, Krüger, Jan, and Sczyrba, Alexander. “Analyzing large scale genomic data on the cloud with Sparkhit”. Bioinformatics 34.9 (2018): 1457-1465.
All files are available under the following license(s):
Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)

Link(s) to full text(s)
Access Level
OA Open Access

1 citation in Europe PMC

Data provided by Europe PubMed Central.

gcMeta: a Global Catalogue of Metagenomics platform to support the archiving, standardization and analysis of microbiome data.
Shi W, Qi H, Sun Q, Fan G, Liu S, Wang J, Zhu B, Liu H, Zhao F, Wang X, Hu X, Li W, Liu J, Tian Y, Wu L, Ma J., Nucleic Acids Res 47(D1), 2019
PMID: 30365027

Sources

PMID: 29253074
PubMed | Europe PMC
