Rapid hybrid de novo assembly of a microbial genome using only short reads: Corynebacterium pseudotuberculosis I19 as a case study

Cerdeira LT, Carneiro AR, Juca Ramos RT, de Almeida SS, D'Afonseca V, Cruz Schneider MP, Baumbach J, Tauch A, McCulloch JA, Carvalho Azevedo VA, Silva A (2011)
Journal of Microbiological Methods 86(2): 218-223.

Journal article | Published | English
 
Authors
Cerdeira, Louise Teixeira; Carneiro, Adriana Ribeiro; Juca Ramos, Rommel Thiago; de Almeida, Sintia Silva; D'Afonseca, Vivian; Cruz Schneider, Maria Paula; Baumbach, Jan; Tauch, Andreas; McCulloch, John Anthony; Carvalho Azevedo, Vasco Ariston; Silva, Artur
Abstract / Note
Due to the advent of the so-called Next-Generation Sequencing (NGS) technologies, the amount of monetary and temporal resources required for whole-genome sequencing has been reduced by several orders of magnitude. Sequence reads can be assembled either by anchoring them directly onto an available reference genome (classical reference assembly), or by concatenating them by overlap (de novo assembly). The latter strategy is preferable because it tends to maintain the architecture of the genome sequence. However, depending on the NGS platform used, the shortness of read lengths causes tremendous problems in the subsequent genome assembly phase, impeding closing of the entire genome sequence. To address the problem, we developed a multi-pronged hybrid de novo strategy combining de Bruijn graph and Overlap-Layout-Consensus methods, which was used to assemble from short reads the entire genome of Corynebacterium pseudotuberculosis strain I19, a bacterium with immense importance in veterinary medicine that causes caseous lymphadenitis in ruminants, principally ovines and caprines. Briefly, contigs were assembled de novo from the short reads and were only oriented using a reference genome by anchoring. Remaining gaps were closed using iterative anchoring of short reads by craning to gap flanks. Finally, we compare the genome sequence assembled using our hybrid strategy to a classical reference assembly using the same data as input and show that, even with the availability of a reference genome, it pays off to use the hybrid de novo strategy rather than a classical reference assembly, because more genome sequences are preserved using the former. © 2011 Elsevier B.V. All rights reserved.
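The abstract names two assembly paradigms, de Bruijn graph and Overlap-Layout-Consensus (OLC), without giving implementation details. As a rough illustration only, the toy Python sketch below shows the core data operation behind each: splitting reads into k-mers to build de Bruijn graph edges, and finding the longest suffix-prefix overlap between two reads as in the overlap step of OLC. The function names `de_bruijn_edges` and `longest_overlap`, the example reads, and the parameter values are hypothetical and are not taken from the paper; this is not the authors' pipeline.

```python
from collections import defaultdict


def de_bruijn_edges(reads, k):
    """Map each (k-1)-mer node to the list of (k-1)-mer successors seen in the reads."""
    graph = defaultdict(list)
    for read in reads:
        # Slide a window of length k over the read; each k-mer is one edge.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


def longest_overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` equal to a prefix of `b` (0 if below min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


# Hypothetical toy reads, far shorter than real sequencing reads.
reads = ["ACGTAC", "GTACGG", "ACGGTT"]
graph = de_bruijn_edges(reads, k=4)
overlap = longest_overlap(reads[0], reads[1])
```

In this toy input the node `ACG` has two distinct successors (`CGT` and `CGG`), the kind of branch a de Bruijn assembler has to resolve, while the overlap function reports how many bases two reads share end to start.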
Keywords
SOLiD; Corynebacterium; Next generation; Short read; Assembly; sequencing; De novo
Publication year
2011
Journal title
Journal of Microbiological Methods
Volume
86
Issue
2
Page(s)
218-223
ISSN
0167-7012
Page URI
https://pub.uni-bielefeld.de/record/2326568

Cite

Cerdeira LT, Carneiro AR, Juca Ramos RT, et al. Rapid hybrid de novo assembly of a microbial genome using only short reads: Corynebacterium pseudotuberculosis I19 as a case study. Journal of Microbiological Methods. 2011;86(2):218-223.
Cerdeira, L. T., Carneiro, A. R., Juca Ramos, R. T., de Almeida, S. S., D'Afonseca, V., Cruz Schneider, M. P., Baumbach, J., et al. (2011). Rapid hybrid de novo assembly of a microbial genome using only short reads: Corynebacterium pseudotuberculosis I19 as a case study. Journal of Microbiological Methods, 86(2), 218-223. doi:10.1016/j.mimet.2011.05.008
Cerdeira, L. T., Carneiro, A. R., Juca Ramos, R. T., de Almeida, S. S., D'Afonseca, V., Cruz Schneider, M. P., Baumbach, J., Tauch, A., McCulloch, J. A., Carvalho Azevedo, V. A., et al. (2011). Rapid hybrid de novo assembly of a microbial genome using only short reads: Corynebacterium pseudotuberculosis I19 as a case study. Journal of Microbiological Methods 86, 218-223.
Cerdeira, L.T., et al., 2011. Rapid hybrid de novo assembly of a microbial genome using only short reads: Corynebacterium pseudotuberculosis I19 as a case study. Journal of Microbiological Methods, 86(2), pp. 218-223.
L.T. Cerdeira, et al., “Rapid hybrid de novo assembly of a microbial genome using only short reads: Corynebacterium pseudotuberculosis I19 as a case study”, Journal of Microbiological Methods, vol. 86, 2011, pp. 218-223.
Cerdeira, L.T., Carneiro, A.R., Juca Ramos, R.T., de Almeida, S.S., D'Afonseca, V., Cruz Schneider, M.P., Baumbach, J., Tauch, A., McCulloch, J.A., Carvalho Azevedo, V.A., Silva, A.: Rapid hybrid de novo assembly of a microbial genome using only short reads: Corynebacterium pseudotuberculosis I19 as a case study. Journal of Microbiological Methods. 86, 218-223 (2011).
Cerdeira, Louise Teixeira, Carneiro, Adriana Ribeiro, Juca Ramos, Rommel Thiago, de Almeida, Sintia Silva, D'Afonseca, Vivian, Cruz Schneider, Maria Paula, Baumbach, Jan, Tauch, Andreas, McCulloch, John Anthony, Carvalho Azevedo, Vasco Ariston, and Silva, Artur. “Rapid hybrid de novo assembly of a microbial genome using only short reads: Corynebacterium pseudotuberculosis I19 as a case study”. Journal of Microbiological Methods 86.2 (2011): 218-223.

27 citations in Europe PMC

Data provided by Europe PubMed Central.

Whole-genome optical mapping reveals a mis-assembly between two rRNA operons of Corynebacterium pseudotuberculosis strain 1002.
Mariano DC, Sousa Tde J, Pereira FL, Aburjaile F, Barh D, Rocha F, Pinto AC, Hassan SS, Saraiva TD, Dorella FA, de Carvalho AF, Leal CA, Figueiredo HC, Silva A, Ramos RT, Azevedo VA., BMC Genomics 17, 2016
PMID: 27129708
Draft genome sequence of a CTX-M-15-producing Klebsiella pneumoniae sequence type 340 (clonal complex 258) isolate from a food-producing animal.
Cerdeira L, Silva KC, Fernandes MR, Ienne S, de Souza TA, de Oliveira Garcia D, Moreno AM, Lincopan N., J Glob Antimicrob Resist 7, 2016
PMID: 27664870
Complete Genome Sequence of an F8-Like Lytic Myovirus (φSPM-1) That Infects Metallo-β-Lactamase-Producing Pseudomonas aeruginosa.
Neves PR, Cerdeira LT, Mitne-Neto M, Oliveira TG, McCulloch JA, Sampaio JL, Mamizuka EM, Levy CE, Sato MI, Lincopan N., Genome Announc 2(2), 2014
PMID: 24699949
Value of a newly sequenced bacterial genome.
Barbosa EG, Aburjaile FF, Ramos RT, Carneiro AR, Le Loir Y, Baumbach J, Miyoshi A, Silva A, Azevedo V., World J Biol Chem 5(2), 2014
PMID: 24921006
Complete Genome Sequence of Lactococcus lactis Strain AI06, an Endophyte of the Amazonian Açaí Palm.
McCulloch JA, de Oliveira VM, de Almeida Pina AV, Pérez-Chaparro PJ, de Almeida LM, de Vasconcelos JM, de Oliveira LF, da Silva DE, Rogez HL, Cretenet M, Mamizuka EM, Nunes MR., Genome Announc 2(6), 2014
PMID: 25414513
Genome sequence of Corynebacterium pseudotuberculosis biovar equi strain 258 and prediction of antigenic targets to improve biotechnological vaccine production.
Soares SC, Trost E, Ramos RT, Carneiro AR, Santos AR, Pinto AC, Barbosa E, Aburjaile F, Ali A, Diniz CA, Hassan SS, Fiaux K, Guimarães LC, Bakhtiar SM, Pereira U, Almeida SS, Abreu VA, Rocha FS, Dorella FA, Miyoshi A, Silva A, Azevedo V, Tauch A., J Biotechnol 167(2), 2013
PMID: 23201561
Tips and tricks for the assembly of a Corynebacterium pseudotuberculosis genome using a semiconductor sequencer.
Ramos RT, Carneiro AR, Soares Sde C, dos Santos AR, Almeida S, Guimarães L, Figueira F, Barbosa E, Tauch A, Azevedo V, Silva A., Microb Biotechnol 6(2), 2013
PMID: 23199210
Next-generation sequencing technologies and their impact on microbial genomics.
Forde BM, O'Toole PW., Brief Funct Genomics 12(5), 2013
PMID: 23314033
Fine de novo sequencing of a fungal genome using only SOLiD short read data: verification on Aspergillus oryzae RIB40.
Umemura M, Koyama Y, Takeda I, Hagiwara H, Ikegami T, Koike H, Machida M., PLoS One 8(5), 2013
PMID: 23667655
Complete Genome of a Methanosarcina mazei Strain Isolated from Sediment Samples from an Amazonian Flooded Area.
Assis das Graças D, Thiago Jucá Ramos R, Vieira Araújo AC, Zahlouth R, Ribeiro Carneiro A, Souza Lopes T, Azevedo Baraúna R, Azevedo V, Cruz Schneider MP, Pellizari VH, Silva A., Genome Announc 1(3), 2013
PMID: 23704185
High efficiency application of a mate-paired library from next-generation sequencing to postlight sequencing: Corynebacterium pseudotuberculosis as a case study for microbial de novo genome assembly.
Ramos RT, Carneiro AR, de Castro Soares S, Barbosa S, Varuzza L, Orabona G, Tauch A, Azevedo V, Schneider MP, Silva A., J Microbiol Methods 95(3), 2013
PMID: 23792707
Graphical contig analyzer for all sequencing platforms (G4ALL): a new stand-alone tool for finishing and draft generation of bacterial genomes.
Ramos RT, Carneiro AR, Caracciolo PH, Azevedo V, Schneider MP, Barh D, Silva A., Bioinformation 9(11), 2013
PMID: 23888102
AutoAssemblyD: a graphical user interface system for several genome assemblers.
Veras AA, de Sá PH, Azevedo V, Silva A, Ramos RT., Bioinformation 9(16), 2013
PMID: 24143057
Progression of 'OMICS' methodologies for understanding the pathogenicity of Corynebacterium pseudotuberculosis: the Brazilian experience.
Dorella FA, Gala-Garcia A, Pinto AC, Sarrouh B, Antunes CA, Ribeiro D, Aburjaile FF, Fiaux KK, Guimarães LC, Seyffert N, El-Aouar RA, Silva R, Hassan SS, Castro TL, Marques WS, Ramos R, Carneiro A, de Sá P, Miyoshi A, Azevedo V, Silva A., Comput Struct Biotechnol J 6, 2013
PMID: 24688721
Next-generation sequence assembly: four stages of data processing and computational challenges.
El-Metwally S, Hamza T, Zakaria M, Helmy M., PLoS Comput Biol 9(12), 2013
PMID: 24348224
CoryneRegNet 6.0--Updated database content, new analysis methods and novel features focusing on community demands.
Pauling J, Röttger R, Tauch A, Azevedo V, Baumbach J., Nucleic Acids Res 40(database issue), 2012
PMID: 22080556
Whole-genome sequence of Corynebacterium pseudotuberculosis strain Cp162, isolated from camel.
Hassan SS, Schneider MP, Ramos RT, Carneiro AR, Ranieri A, Guimarães LC, Ali A, Bakhtiar SM, Pereira Ude P, dos Santos AR, Soares Sde C, Dorella F, Pinto AC, Ribeiro D, Barbosa MS, Almeida S, Abreu V, Aburjaile F, Fiaux K, Barbosa E, Diniz C, Rocha FS, Saxena R, Tiwari S, Zambare V, Ghosh P, Pacheco LG, Dowson CG, Kumar A, Barh D, Miyoshi A, Azevedo V, Silva A., J Bacteriol 194(20), 2012
PMID: 23012291
Genome sequence of the Corynebacterium pseudotuberculosis Cp316 strain, isolated from the abscess of a Californian horse.
Ramos RT, Silva A, Carneiro AR, Pinto AC, Soares Sde C, Santos AR, Almeida SS, Guimarães LC, Aburjaile FF, Barbosa EG, Dorella FA, Rocha FS, Cerdeira LT, Barbosa MS, Tauch A, Edman J, Spier S, Miyoshi A, Schneider MP, Azevedo V., J Bacteriol 194(23), 2012
PMID: 23144380
Complete genome sequence of Corynebacterium pseudotuberculosis Cp31, isolated from an Egyptian buffalo.
Silva A, Ramos RT, Ribeiro Carneiro A, Cybelle Pinto A, de Castro Soares S, Rodrigues Santos A, Silva Almeida S, Guimarães LC, Figueira Aburjaile F, Vieira Barbosa EG, Alves Dorella F, Souza Rocha F, Souza Lopes T, Kawasaki R, Gomes Sá P, da Rocha Coimbra NA, Teixeira Cerdeira L, Silvanira Barbosa M, Cruz Schneider MP, Miyoshi A, Selim SA, Moawad MS, Azevedo V., J Bacteriol 194(23), 2012
PMID: 23144408
Genome sequence of Exiguobacterium antarcticum B7, isolated from a biofilm in Ginger Lake, King George Island, Antarctica.
Carneiro AR, Ramos RT, Dall'Agnol H, Pinto AC, de Castro Soares S, Santos AR, Guimarães LC, Almeida SS, Baraúna RA, das Graças DA, Franco LC, Ali A, Hassan SS, Nunes CI, Barbosa MS, Fiaux KK, Aburjaile FF, Barbosa EG, Bakhtiar SM, Vilela D, Nóbrega F, dos Santos AL, Carepo MS, Azevedo V, Schneider MP, Pellizari VH, Silva A., J Bacteriol 194(23), 2012
PMID: 23144424
Complete genome sequence of Corynebacterium pseudotuberculosis biovar ovis strain P54B96 isolated from antelope in South Africa obtained by rapid next generation sequencing technology.
Hassan SS, Guimarães LC, Pereira Ude P, Islam A, Ali A, Bakhtiar SM, Ribeiro D, Rodrigues Dos Santos A, Soares Sde C, Dorella F, Pinto AC, Schneider MP, Barbosa MS, Almeida S, Abreu V, Aburjaile F, Carneiro AR, Cerdeira LT, Fiaux K, Barbosa E, Diniz C, Rocha FS, Ramos RT, Jain N, Tiwari S, Barh D, Miyoshi A, Müller B, Silva A, Azevedo V., Stand Genomic Sci 7(2), 2012
PMID: 23408795
Whole genome sequencing of environmental Vibrio cholerae O1 from 10 nanograms of DNA using short reads.
Pérez Chaparro PJ, McCulloch JA, Cerdeira LT, Al-Dilaimi A, Canto de Sá LL, de Oliveira R, Tauch A, de Carvalho Azevedo VA, Cruz Schneider MP, da Silva AL., J Microbiol Methods 87(2), 2011
PMID: 21871929
Whole-genome sequence of Corynebacterium pseudotuberculosis PAT10 strain isolated from sheep in Patagonia, Argentina.
Cerdeira LT, Pinto AC, Schneider MP, de Almeida SS, dos Santos AR, Barbosa EG, Ali A, Barbosa MS, Carneiro AR, Ramos RT, de Oliveira RS, Barh D, Barve N, Zambare V, Belchior SE, Guimarães LC, de Castro Soares S, Dorella FA, Rocha FS, de Abreu VA, Tauch A, Trost E, Miyoshi A, Azevedo V, Silva A., J Bacteriol 193(22), 2011
PMID: 22038974
Complete genome sequence of Corynebacterium pseudotuberculosis strain CIP 52.97, isolated from a horse in Kenya.
Cerdeira LT, Schneider MP, Pinto AC, de Almeida SS, dos Santos AR, Barbosa EG, Ali A, Aburjaile FF, de Abreu VA, Guimarães LC, Soares Sde C, Dorella FA, Rocha FS, Bol E, Gomes de Sá PH, Lopes TS, Barbosa MS, Carneiro AR, Jucá Ramos RT, Coimbra NA, Lima AR, Barh D, Jain N, Tiwari S, Raja R, Zambare V, Ghosh P, Trost E, Tauch A, Miyoshi A, Azevedo V, Silva A., J Bacteriol 193(24), 2011
PMID: 22123771

19 References

Data provided by Europe PubMed Central.

ALLPATHS: de novo assembly of whole-genome shotgun microreads.
Butler J, MacCallum I, Kleber M, Shlyakhter IA, Belmonte MK, Lander ES, Nusbaum C, Jaffe DB., Genome Res. 18(5), 2008
PMID: 18340039
De novo fragment assembly with short mate-paired reads: Does the read length matter?
Chaisson MJ, Brinza D, Pevzner PA., Genome Res. 19(2), 2008
PMID: 19056694
De novo bacterial genome sequencing: millions of very short reads assembled on a desktop computer.
Hernandez D, Francois P, Farinelli L, Osteras M, Schrenzel J., Genome Res. 18(5), 2008
PMID: 18332092
Crystallizing short-read assemblies around seeds.
Hossain MS, Azimi N, Skiena S., BMC Bioinformatics 10 Suppl 1, 2009
PMID: 19208115
Genomic mapping by fingerprinting random clones: a mathematical analysis.
Lander ES, Waterman MS., Genomics 2(3), 1988
PMID: 3294162
The sequence and de novo assembly of the giant panda genome.
Li R, Fan W, Tian G, Zhu H, He L, Cai J, Huang Q, Cai Q, Li B, Bai Y, Zhang Z, Zhang Y, Wang W, Li J, Wei F, Li H, Jian M, Li J, Zhang Z, Nielsen R, Li D, Gu W, Yang Z, Xuan Z, Ryder OA, Leung FC, Zhou Y, Cao J, Sun X, Fu Y, Fang X, Guo X, Wang B, Hou R, Shen F, Mu B, Ni P, Lin R, Qian W, Wang G, Yu C, Nie W, Wang J, Wu Z, Liang H, Min J, Wu Q, Cheng S, Ruan J, Wang M, Shi Z, Wen M, Liu B, Ren X, Zheng H, Dong D, Cook K, Shan G, Zhang H, Kosiol C, Xie X, Lu Z, Zheng H, Li Y, Steiner CC, Lam TT, Lin S, Zhang Q, Li G, Tian J, Gong T, Liu H, Zhang D, Fang L, Ye C, Zhang J, Hu W, Xu A, Ren Y, Zhang G, Bruford MW, Li Q, Ma L, Guo Y, An N, Hu Y, Zheng Y, Shi Y, Li Z, Liu Q, Chen Y, Zhao J, Qu N, Zhao S, Tian F, Wang X, Wang H, Xu L, Liu X, Vinar T, Wang Y, Lam TW, Yiu SM, Liu S, Zhang H, Li D, Huang Y, Wang X, Yang G, Jiang Z, Wang J, Qin N, Li L, Li J, Bolund L, Kristiansen K, Wong GK, Olson M, Zhang X, Li S, Yang H, Wang J, Wang J., Nature 463(7279), 2009
PMID: 20010809
Next-generation DNA sequencing methods
Mardis, Annu. Rev. Genom. Human Genet. 9, 2008
Assembly algorithms for next-generation sequencing data.
Miller JR, Koren S, Sutton G., Genomics 95(6), 2010
PMID: 20211242

Nagarajan, 2010
DNA sequencing with chain-terminating inhibitors.
Sanger F, Nicklen S, Coulson AR., Proc. Natl. Acad. Sci. U.S.A. 74(12), 1977
PMID: 271968
Next-generation DNA sequencing.
Shendure J, Ji H., Nat. Biotechnol. 26(10), 2008
PMID: 18846087
Complete genome sequence of Corynebacterium pseudotuberculosis I19, a strain isolated from a cow in Israel with bovine mastitis
Silva, Journal of Bacteriology. 193(1), 2010
ABySS: a parallel assembler for short read sequence data.
Simpson JT, Wong K, Jackman SD, Schein JE, Jones SJ, Birol I., Genome Res. 19(6), 2009
PMID: 19251739
Caseous lymphadenitis in small ruminants.
Williamson LH., Vet. Clin. North Am. Food Anim. Pract. 17(2), 2001
PMID: 11515406
Velvet: algorithms for de novo short read assembly using de Bruijn graphs.
Zerbino DR, Birney E., Genome Res. 18(5), 2008
PMID: 18349386

Sources

PMID: 21620904