Increasing the coverage of a metapopulation consensus genome by iterative read mapping and assembly

Dutilh BE, Huynen MA, Strous M (2009)
BIOINFORMATICS 25(21): 2878-2881.

Journal Article | Published | English

Author
Dutilh BE; Huynen MA; Strous M
Abstract
Motivation: Most microbial species cannot be cultured in the laboratory. Metagenomic sequencing may still yield a complete genome if the sequenced community is enriched and the sequencing coverage is high. However, the complexity in a natural population may cause the enrichment culture to contain multiple related strains. This diversity can confound existing strict assembly programs and lead to a fragmented assembly, which is unnecessary if a related reference genome is available that can function as a scaffold.

Results: Here, we map short metagenomic sequencing reads from a population of strains to a related reference genome, and compose a genome that captures the consensus of the population's sequences. We show that by iteration of the mapping and assembly procedure, the coverage increases while the similarity with the reference genome decreases. This indicates that the assembly becomes less dependent on the reference genome and approaches the consensus genome of the multi-strain population.
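
The iteration described in the abstract can be illustrated with a toy sketch. This is not the authors' pipeline: the abstract does not name the mapper or assembler here, so the ungapped mismatch-counting mapper and the per-position majority vote below are simple stand-ins, and every name (`best_hit`, `map_and_call`, `iterate`, `max_mismatches`) and all sequences are hypothetical. Each round maps the reads to the current sequence, calls a consensus, and uses that consensus as the reference for the next round.

```python
from collections import Counter

def best_hit(read, ref, max_mismatches):
    """Return (position, mismatches) of the best ungapped placement, or None."""
    best = None
    for pos in range(len(ref) - len(read) + 1):
        mm = sum(1 for a, b in zip(read, ref[pos:pos + len(read)]) if a != b)
        if mm <= max_mismatches and (best is None or mm < best[1]):
            best = (pos, mm)
    return best

def map_and_call(reads, ref, max_mismatches):
    """Map reads to ref, then call a majority-vote consensus.

    Positions not covered by any read keep the base of the current reference.
    """
    columns = [Counter() for _ in ref]
    mapped = 0
    for read in reads:
        hit = best_hit(read, ref, max_mismatches)
        if hit is None:
            continue  # too divergent from the current reference in this round
        mapped += 1
        for offset, base in enumerate(read):
            columns[hit[0] + offset][base] += 1
    consensus = "".join(
        col.most_common(1)[0][0] if col else ref_base
        for col, ref_base in zip(columns, ref)
    )
    covered = sum(1 for col in columns if col)
    return consensus, mapped, covered

def iterate(reads, reference, max_mismatches=1, max_rounds=10):
    """Repeat mapping and consensus calling until the consensus stops changing."""
    history = []
    current = reference
    for _ in range(max_rounds):
        consensus, mapped, covered = map_and_call(reads, current, max_mismatches)
        # Identity is always measured against the ORIGINAL reference.
        identity = sum(a == b for a, b in zip(consensus, reference)) / len(reference)
        history.append((mapped, covered, identity))
        if consensus == current:
            break
        current = consensus
    return current, history

# Hypothetical toy data: the reference differs from the population at 3 positions.
population = "ACGTTGCAAGCTTAGGCATC"
reference = "ACGTTGTAAGATTATGCATC"
reads = [population[i:i + 8] for i in (0, 4, 8, 12)]

final, history = iterate(reads, reference)
for round_no, (mapped, covered, identity) in enumerate(history, 1):
    print(f"round {round_no}: mapped={mapped} covered={covered} identity={identity:.2f}")
```

On this toy input, two reads carry two mismatches against the starting reference and fail to map in the first round. Once the first consensus corrects the neighbouring positions, those reads map in the second round, so the number of covered positions rises from 16 to 20 while identity to the starting reference falls from 0.90 to 0.85, mirroring the trend the abstract reports.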

20 Citations in Europe PMC

Data provided by Europe PubMed Central.

Freshwater Metaviromics and Bacteriophages: A Current Assessment of the State of the Art in Relation to Bioinformatic Challenges.
Bruder K, Malki K, Cooper A, Sible E, Shapiro JW, Watkins SC, Putonti C., Evol. Bioinform. Online 12(Suppl 1), 2016
PMID: 27375355
Construction of a virtual Mycobacterium tuberculosis consensus genome and its application to data from a next generation sequencer.
Okumura K, Kato M, Kirikae T, Kayano M, Miyoshi-Akiyama T., BMC Genomics 16, 2015
PMID: 25879806
A highly abundant bacteriophage discovered in the unknown sequences of human faecal metagenomes.
Dutilh BE, Cassman N, McNair K, Sanchez SE, Silva GG, Boling L, Barr JJ, Speth DR, Seguritan V, Aziz RK, Felts B, Dinsdale EA, Mokili JL, Edwards RA., Nat Commun 5, 2014
PMID: 25058116
Explaining microbial phenotypes on a genomic scale: GWAS for microbes.
Dutilh BE, Backus L, Edwards RA, Wels M, Bayjanov JR, van Hijum SA., Brief Funct Genomics 12(4), 2013
PMID: 23625995
Coverage theories for metagenomic DNA sequencing based on a generalization of Stevens' theorem.
Wendl MC, Kota K, Weinstock GM, Mitreva M., J Math Biol 67(5), 2013
PMID: 22965653
Bacterial oxygen production in the dark.
Ettwig KF, Speth DR, Reimann J, Wu ML, Jetten MS, Keltjens JT., Front Microbiol 3, 2012
PMID: 22891064
Environmental microbiology through the lens of high-throughput DNA sequencing: synopsis of current platforms and bioinformatics approaches.
Logares R, Haverkamp TH, Kumar S, Lanzen A, Nederbragt AJ, Quince C, Kauserud H., J. Microbiol. Methods 91(1), 2012
PMID: 22849829
A post-assembly genome-improvement toolkit (PAGIT) to obtain annotated genomes from contigs.
Swain MT, Tsai IJ, Assefa SA, Newbold C, Berriman M, Otto TD., Nat Protoc 7(7), 2012
PMID: 22678431
NetCmpt: a network-based tool for calculating the metabolic competition between bacterial species.
Kreimer A, Doron-Faigenboim A, Borenstein E, Freilich S., Bioinformatics 28(16), 2012
PMID: 22668793
CAPRG: sequence assembling pipeline for next generation sequencing of non-model organisms.
Rawat A, Elasri MO, Gust KA, George G, Pham D, Scanlan LD, Vulpe C, Perkins EJ., PLoS ONE 7(2), 2012
PMID: 22319566
Pathogen comparative genomics in the next-generation sequencing era: genome alignments, pangenomics and metagenomics.
Hu B, Xie G, Lo CC, Starkenburg SR, Chain PS., Brief Funct Genomics 10(6), 2011
PMID: 22199376
EMIRGE: reconstruction of full-length ribosomal genes from microbial community short read sequencing data.
Miller CS, Baker BJ, Thomas BC, Singer SW, Banfield JF., Genome Biol. 12(5), 2011
PMID: 21595876
Genomic sequencing and analysis of a Chinese hamster ovary cell line using Illumina sequencing technology.
Hammond S, Swanberg JC, Kaplarevic M, Lee KH., BMC Genomics 12, 2011
PMID: 21269493
Iterative Correction of Reference Nucleotides (iCORN) using second generation sequencing technology.
Otto TD, Sanders M, Berriman M, Newbold C., Bioinformatics 26(14), 2010
PMID: 20562415
Nitrite-driven anaerobic methane oxidation by oxygenic bacteria.
Ettwig KF, Butler MK, Le Paslier D, Pelletier E, Mangenot S, Kuypers MM, Schreiber F, Dutilh BE, Zedelius J, de Beer D, Gloerich J, Wessels HJ, van Alen T, Luesken F, Wu ML, van de Pas-Schoonen KT, Op den Camp HJ, Janssen-Megens EM, Francoijs KJ, Stunnenberg H, Weissenbach J, Jetten MS, Strous M., Nature 464(7288), 2010
PMID: 20336137

Sources

PMID: 19542148