Rapid hybrid de novo assembly of a microbial genome using only short reads: Corynebacterium pseudotuberculosis I19 as a case study

Cerdeira LT, Carneiro AR, Juca Ramos RT, de Almeida SS, D'Afonseca V, Cruz Schneider MP, Baumbach J, Tauch A, McCulloch JA, Carvalho Azevedo VA, Silva A (2011)
Journal of Microbiological Methods 86(2): 218-223.

Journal Article | Published | English

Abstract
Due to the advent of the so-called Next-Generation Sequencing (NGS) technologies, the amount of monetary and temporal resources required for whole-genome sequencing has been reduced by several orders of magnitude. Sequence reads can be assembled either by anchoring them directly onto an available reference genome (classical reference assembly) or by concatenating them by overlap (de novo assembly). The latter strategy is preferable because it tends to maintain the architecture of the genome sequence. However, depending on the NGS platform used, short read lengths cause tremendous problems in the subsequent genome assembly phase, impeding closing of the entire genome sequence. To address the problem, we developed a multi-pronged hybrid de novo strategy combining De Bruijn graph and Overlap-Layout-Consensus methods, which was used to assemble from short reads the entire genome of Corynebacterium pseudotuberculosis strain I19, a bacterium of immense importance in veterinary medicine that causes caseous lymphadenitis in ruminants, principally ovines and caprines. Briefly, contigs were assembled de novo from the short reads and were only oriented using a reference genome by anchoring. Remaining gaps were closed using iterative anchoring of short reads by craning to gap flanks. Finally, we compare the genome sequence assembled using our hybrid strategy to a classical reference assembly using the same data as input and show that, even when a reference genome is available, it pays off to use the hybrid de novo strategy rather than a classical reference assembly, because more genome sequences are preserved using the former. © 2011 Elsevier B.V. All rights reserved.
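The two assembly paradigms named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy reads, and the parameters `k` and `min_len` are assumptions chosen for illustration only. The first function builds the edges of a De Bruijn graph from k-mers; the second computes the suffix-prefix overlap that the Overlap-Layout-Consensus approach starts from.

```python
from collections import defaultdict


def de_bruijn_edges(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer contributes one edge: its prefix -> its suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def suffix_prefix_overlap(a, b, min_len):
    """Length of the longest suffix of a equal to a prefix of b, or 0 if shorter than min_len."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-n:] == b[:n]:
            return n
    return 0


# Hypothetical toy reads, far shorter than real NGS reads.
reads = ["ATGGCG", "GGCGTG", "CGTGCA"]
graph = de_bruijn_edges(reads, k=4)
overlap = suffix_prefix_overlap(reads[0], reads[1], min_len=3)
```

In this toy input the first two reads overlap by four bases, and both contribute the same De Bruijn edge from `GGC` to `GCG`. This shared edge is how the graph approach merges reads without comparing them pairwise.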

Cite this

Cerdeira, Louise Teixeira, Carneiro, Adriana Ribeiro, Juca Ramos, Rommel Thiago, de Almeida, Sintia Silva, D'Afonseca, Vivian, Cruz Schneider, Maria Paula, Baumbach, Jan, Tauch, Andreas, McCulloch, John Anthony, Carvalho Azevedo, Vasco Ariston, and Silva, Artur. “Rapid hybrid de novo assembly of a microbial genome using only short reads: Corynebacterium pseudotuberculosis I19 as a case study”. Journal of Microbiological Methods 86.2 (2011): 218-223.

23 Citations in Europe PMC

Data provided by Europe PubMed Central.

Complete Genome Sequence of Lactococcus lactis Strain AI06, an Endophyte of the Amazonian Acai Palm.
McCulloch JA, de Oliveira VM, de Almeida Pina AV, Perez-Chaparro PJ, de Almeida LM, de Vasconcelos JM, de Oliveira LF, da Silva DE, Rogez HL, Cretenet M, Mamizuka EM, Nunes MR., Genome Announc 2(6), 2014
PMID: 25414513
Value of a newly sequenced bacterial genome.
Barbosa EG, Aburjaile FF, Ramos RT, Carneiro AR, Le Loir Y, Baumbach J, Miyoshi A, Silva A, Azevedo V., World J Biol Chem 5(2), 2014
PMID: 24921006
Complete Genome Sequence of an F8-Like Lytic Myovirus (φSPM-1) That Infects Metallo-β-Lactamase-Producing Pseudomonas aeruginosa.
Neves PR, Cerdeira LT, Mitne-Neto M, Oliveira TG, McCulloch JA, Sampaio JL, Mamizuka EM, Levy CE, Sato MI, Lincopan N., Genome Announc 2(2), 2014
PMID: 24699949
Progression of 'OMICS' methodologies for understanding the pathogenicity of Corynebacterium pseudotuberculosis: the Brazilian experience.
Dorella FA, Gala-Garcia A, Pinto AC, Sarrouh B, Antunes CA, Ribeiro D, Aburjaile FF, Fiaux KK, Guimaraes LC, Seyffert N, El-Aouar RA, Silva R, Hassan SS, Castro TL, Marques WS, Ramos R, Carneiro A, de Sa P, Miyoshi A, Azevedo V, Silva A., Comput Struct Biotechnol J 6(), 2013
PMID: 24688721
AutoAssemblyD: a graphical user interface system for several genome assemblers.
Veras AA, de Sa PH, Azevedo V, Silva A, Ramos RT., Bioinformation 9(16), 2013
PMID: 24143057
Graphical contig analyzer for all sequencing platforms (G4ALL): a new stand-alone tool for finishing and draft generation of bacterial genomes.
Ramos RT, Carneiro AR, Caracciolo PH, Azevedo V, Schneider MP, Barh D, Silva A., Bioinformation 9(11), 2013
PMID: 23888102
High efficiency application of a mate-paired library from next-generation sequencing to postlight sequencing: Corynebacterium pseudotuberculosis as a case study for microbial de novo genome assembly.
Ramos RT, Carneiro AR, de Castro Soares S, Barbosa S, Varuzza L, Orabona G, Tauch A, Azevedo V, Schneider MP, Silva A., J. Microbiol. Methods 95(3), 2013
PMID: 23792707
Complete Genome of a Methanosarcina mazei Strain Isolated from Sediment Samples from an Amazonian Flooded Area.
Assis das Gracas D, Thiago Juca Ramos R, Vieira Araujo AC, Zahlouth R, Ribeiro Carneiro A, Souza Lopes T, Azevedo Barauna R, Azevedo V, Cruz Schneider MP, Pellizari VH, Silva A., Genome Announc 1(3), 2013
PMID: 23704185
Fine de novo sequencing of a fungal genome using only SOLiD short read data: verification on Aspergillus oryzae RIB40.
Umemura M, Koyama Y, Takeda I, Hagiwara H, Ikegami T, Koike H, Machida M., PLoS ONE 8(5), 2013
PMID: 23667655
Next-generation sequencing technologies and their impact on microbial genomics.
Forde BM, O'Toole PW., Brief Funct Genomics 12(5), 2013
PMID: 23314033
Genome sequence of Corynebacterium pseudotuberculosis biovar equi strain 258 and prediction of antigenic targets to improve biotechnological vaccine production.
Soares SC, Trost E, Ramos RT, Carneiro AR, Santos AR, Pinto AC, Barbosa E, Aburjaile F, Ali A, Diniz CA, Hassan SS, Fiaux K, Guimaraes LC, Bakhtiar SM, Pereira U, Almeida SS, Abreu VA, Rocha FS, Dorella FA, Miyoshi A, Silva A, Azevedo V, Tauch A., J. Biotechnol. 167(2), 2013
PMID: 23201561
Tips and tricks for the assembly of a Corynebacterium pseudotuberculosis genome using a semiconductor sequencer.
Ramos RT, Carneiro AR, Soares Sde C, dos Santos AR, Almeida S, Guimaraes L, Figueira F, Barbosa E, Tauch A, Azevedo V, Silva A., Microb Biotechnol 6(2), 2013
PMID: 23199210
Genome sequence of Exiguobacterium antarcticum B7, isolated from a biofilm in Ginger Lake, King George Island, Antarctica.
Carneiro AR, Ramos RT, Dall'Agnol H, Pinto AC, de Castro Soares S, Santos AR, Guimaraes LC, Almeida SS, Barauna RA, das Gracas DA, Franco LC, Ali A, Hassan SS, Nunes CI, Barbosa MS, Fiaux KK, Aburjaile FF, Barbosa EG, Bakhtiar SM, Vilela D, Nobrega F, dos Santos AL, Carepo MS, Azevedo V, Schneider MP, Pellizari VH, Silva A., J. Bacteriol. 194(23), 2012
PMID: 23144424
Complete genome sequence of Corynebacterium pseudotuberculosis Cp31, isolated from an Egyptian buffalo.
Silva A, Ramos RT, Ribeiro Carneiro A, Cybelle Pinto A, de Castro Soares S, Rodrigues Santos A, Silva Almeida S, Guimaraes LC, Figueira Aburjaile F, Vieira Barbosa EG, Alves Dorella F, Souza Rocha F, Souza Lopes T, Kawasaki R, Gomes Sa P, da Rocha Coimbra NA, Teixeira Cerdeira L, Silvanira Barbosa M, Cruz Schneider MP, Miyoshi A, Selim SA, Moawad MS, Azevedo V., J. Bacteriol. 194(23), 2012
PMID: 23144408
Genome sequence of the Corynebacterium pseudotuberculosis Cp316 strain, isolated from the abscess of a Californian horse.
Ramos RT, Silva A, Carneiro AR, Pinto AC, Soares Sde C, Santos AR, Almeida SS, Guimaraes LC, Aburjaile FF, Barbosa EG, Dorella FA, Rocha FS, Cerdeira LT, Barbosa MS, Tauch A, Edman J, Spier S, Miyoshi A, Schneider MP, Azevedo V., J. Bacteriol. 194(23), 2012
PMID: 23144380
Whole-genome sequence of Corynebacterium pseudotuberculosis strain Cp162, isolated from camel.
Hassan SS, Schneider MP, Ramos RT, Carneiro AR, Ranieri A, Guimaraes LC, Ali A, Bakhtiar SM, Pereira Ude P, dos Santos AR, Soares Sde C, Dorella F, Pinto AC, Ribeiro D, Barbosa MS, Almeida S, Abreu V, Aburjaile F, Fiaux K, Barbosa E, Diniz C, Rocha FS, Saxena R, Tiwari S, Zambare V, Ghosh P, Pacheco LG, Dowson CG, Kumar A, Barh D, Miyoshi A, Azevedo V, Silva A., J. Bacteriol. 194(20), 2012
PMID: 23012291
CoryneRegNet 6.0--Updated database content, new analysis methods and novel features focusing on community demands.
Pauling J, Rottger R, Tauch A, Azevedo V, Baumbach J., Nucleic Acids Res. 40(Database issue), 2012
PMID: 22080556
Complete genome sequence of Corynebacterium pseudotuberculosis strain CIP 52.97, isolated from a horse in Kenya.
Cerdeira LT, Schneider MP, Pinto AC, de Almeida SS, dos Santos AR, Barbosa EG, Ali A, Aburjaile FF, de Abreu VA, Guimaraes LC, Soares Sde C, Dorella FA, Rocha FS, Bol E, Gomes de Sa PH, Lopes TS, Barbosa MS, Carneiro AR, Juca Ramos RT, Coimbra NA, Lima AR, Barh D, Jain N, Tiwari S, Raja R, Zambare V, Ghosh P, Trost E, Tauch A, Miyoshi A, Azevedo V, Silva A., J. Bacteriol. 193(24), 2011
PMID: 22123771
Whole-genome sequence of Corynebacterium pseudotuberculosis PAT10 strain isolated from sheep in Patagonia, Argentina.
Cerdeira LT, Pinto AC, Schneider MP, de Almeida SS, dos Santos AR, Barbosa EG, Ali A, Barbosa MS, Carneiro AR, Ramos RT, de Oliveira RS, Barh D, Barve N, Zambare V, Belchior SE, Guimaraes LC, de Castro Soares S, Dorella FA, Rocha FS, de Abreu VA, Tauch A, Trost E, Miyoshi A, Azevedo V, Silva A., J. Bacteriol. 193(22), 2011
PMID: 22038974
Whole genome sequencing of environmental Vibrio cholerae O1 from 10 nanograms of DNA using short reads.
Perez Chaparro PJ, McCulloch JA, Cerdeira LT, Al-Dilaimi A, Canto de Sa LL, de Oliveira R, Tauch A, de Carvalho Azevedo VA, Cruz Schneider MP, da Silva AL., J. Microbiol. Methods 87(2), 2011
PMID: 21871929

Sources

PMID: 21620904