Comprehensive transcriptome analysis of the highly complex Pisum sativum genome using next generation sequencing

Franssen SU, Shrestha RP, Bräutigam A, Bornberg-Bauer E, Weber APM (2011)
BMC Genomics 12: 227.

Journal Article | Original Article | Published | English
Abstract
Background: The garden pea, Pisum sativum, is among the best-investigated legume plants and is of significant agro-commercial relevance. Pisum sativum has a large and complex genome, and accordingly few comprehensive genomic resources exist.

Results: We analyzed the pea transcriptome at the highest accuracy possible with current technology. We used next generation sequencing with the Roche/454 platform and evaluated and compared a variety of approaches, including diverse tissue libraries, normalization, alternative sequencing technologies, saturation estimation and diverse assembly strategies. We generated libraries from flowers, leaves, cotyledons, epi- and hypocotyl, and etiolated and light-treated etiolated seedlings, comprising a total of 450 megabases. Libraries were assembled into 324,428 unigenes in a first-pass assembly. A second-pass assembly reduced the number to 81,449 unigenes but produced a significant number of chimeras. Analyses of the assemblies identified the assembly step as a major opportunity for improvement. By recording frequencies of Arabidopsis orthologs hit by randomly drawn reads and fitting parameters of the saturation curve, we concluded that sequencing was exhaustive. For leaf libraries, we found that normalization allows partial recovery of expression strength in addition to the desired effect of increased coverage. Based on theoretical and biological considerations, we concluded that the sequence reads in the database tagged the vast majority of transcripts in the aerial tissues. A pathway representation analysis showed the merits of sampling multiple aerial tissues to increase the number of tagged genes. All results have been made available as a fully annotated database in fasta format.

Conclusions: We conclude that the approach taken resulted in a high-quality dataset which serves well as a first comprehensive reference set for the model legume pea. We suggest that future deep-sequencing transcriptome projects of species lacking a genomics backbone will need to concentrate mainly on resolving the issues of redundancy and paralogy during transcriptome assembly.
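The saturation estimate mentioned in the Results (drawing reads at random, recording which Arabidopsis orthologs they hit, and fitting a saturation curve) can be illustrated with a minimal sketch. The abstract does not state which curve model or fitting procedure the authors used; the hyperbolic model y = a*x / (b + x), the linearised least-squares fit, the function names and the simulated read data below are all assumptions made for illustration only.

```python
import random


def rarefaction_curve(read_hits, steps, rng):
    """Shuffle the reads and count distinct orthologs hit at evenly spaced checkpoints."""
    order = list(read_hits)
    rng.shuffle(order)
    total = len(order)
    checkpoints = {max(1, round(total * k / steps)) for k in range(1, steps + 1)}
    seen = set()
    points = []
    for n_reads, ortholog in enumerate(order, start=1):
        seen.add(ortholog)
        if n_reads in checkpoints:
            points.append((n_reads, len(seen)))
    return points


def fit_saturation(points):
    """Fit y = a*x / (b + x) (assumed model) via the linear form 1/y = 1/a + (b/a) * (1/x)."""
    us = [1.0 / x for x, _ in points]
    vs = [1.0 / y for _, y in points]
    n = len(points)
    mean_u = sum(us) / n
    mean_v = sum(vs) / n
    sxx = sum((u - mean_u) ** 2 for u in us)
    sxy = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_u
    a = 1.0 / intercept  # estimated plateau: orthologs reachable at infinite depth
    b = slope * a        # number of reads at which half the plateau is reached
    return a, b


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical data: each read is labelled with the ortholog it hits.
    orthologs = [f"ortholog_{i}" for i in range(500)]
    reads = rng.choices(orthologs, k=20000)
    curve = rarefaction_curve(reads, steps=20, rng=rng)
    plateau, half_depth = fit_saturation(curve)
    observed = curve[-1][1]
    print(f"observed orthologs: {observed}, fitted plateau: {plateau:.1f}")
    print(f"fraction of fitted plateau reached: {observed / plateau:.3f}")
```

If the number of orthologs observed at full depth is close to the fitted plateau, additional sequencing would be expected to tag few new genes. The linearised fit is a crude choice that gives heavy weight to the shallowest checkpoints; a nonlinear least-squares fit would be the more usual option for real data.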

Cite this

Franssen SU, Shrestha RP, Bräutigam A, Bornberg-Bauer E, Weber APM. Comprehensive transcriptome analysis of the highly complex Pisum sativum genome using next generation sequencing. BMC Genomics. 2011;12: 227.
Franssen, S. U., Shrestha, R. P., Bräutigam, A., Bornberg-Bauer, E., & Weber, A. P. M. (2011). Comprehensive transcriptome analysis of the highly complex Pisum sativum genome using next generation sequencing. BMC Genomics, 12, 227. doi:10.1186/1471-2164-12-227
Franssen, S. U., Shrestha, R. P., Bräutigam, A., Bornberg-Bauer, E., and Weber, A. P. M. (2011). Comprehensive transcriptome analysis of the highly complex Pisum sativum genome using next generation sequencing. BMC Genomics 12:227.
Franssen, S.U., et al., 2011. Comprehensive transcriptome analysis of the highly complex Pisum sativum genome using next generation sequencing. BMC Genomics, 12: 227.
S.U. Franssen, et al., “Comprehensive transcriptome analysis of the highly complex Pisum sativum genome using next generation sequencing”, BMC Genomics, vol. 12, 2011, 227.
Franssen, S.U., Shrestha, R.P., Bräutigam, A., Bornberg-Bauer, E., Weber, A.P.M.: Comprehensive transcriptome analysis of the highly complex Pisum sativum genome using next generation sequencing. BMC Genomics. 12, 227 (2011).
Franssen, Susanne U., Shrestha, Roshan P., Bräutigam, Andrea, Bornberg-Bauer, Erich, and Weber, Andreas P. M. “Comprehensive transcriptome analysis of the highly complex Pisum sativum genome using next generation sequencing”. BMC Genomics 12 (2011): 227.
Access Level: Open Access (OA)
Last Uploaded: 2017-12-19T09:17:21Z


64 Citations in Europe PMC

Data provided by Europe PubMed Central.

Transcriptome de novo assembly from next-generation sequencing and comparative analyses in the hexaploid salt marsh species Spartina maritima and Spartina alterniflora (Poaceae).
Ferreira de Carvalho J, Poulain J, Da Silva C, Wincker P, Michon-Coudouel S, Dheilly A, Naquin D, Boutte J, Salmon A, Ainouche M., Heredity (Edinb) 110(2), 2013
PMID: 23149455
Current state-of-art of sequencing technologies for plant genomics research.
Thudi M, Li Y, Jackson SA, May GD, Varshney RK., Brief Funct Genomics 11(1), 2012
PMID: 22345601
SNP markers retrieval for a non-model species: a practical approach.
Shahin A, van Gurp T, Peters SA, Visser RG, van Tuyl JM, Arens P., BMC Res Notes 5, 2012
PMID: 22284269
Evaluating characteristics of de novo assembly software on 454 transcriptome data: a simulation approach.
Mundry M, Bornberg-Bauer E, Sammeth M, Feulner PG., PLoS One 7(2), 2012
PMID: 22384018
Transcriptome sequencing of field pea and faba bean for discovery and validation of SSR genetic markers.
Kaur S, Pembleton LW, Cogan NO, Savin KW, Leonforte T, Paull J, Materne M, Forster JW., BMC Genomics 13, 2012
PMID: 22433453
Using nuclear gene data for plant phylogenetics: progress and prospects.
Zimmer EA, Wen J., Mol Phylogenet Evol 65(2), 2012
PMID: 22842093
RNA-Seq Assembly - Are We There Yet?
Schliesky S, Gowik U, Weber AP, Bräutigam A., Front Plant Sci 3, 2012
PMID: 23056003
Transcriptomic resilience to global warming in the seagrass Zostera marina, a marine foundation species.
Franssen SU, Gu J, Bergmann N, Winters G, Klostermeier UC, Rosenstiel P, Bornberg-Bauer E, Reusch TB., Proc Natl Acad Sci U S A 108(48), 2011
PMID: 22084086


Sources

PMID: 21569327