Exploiting single-molecule transcript sequencing for eukaryotic gene prediction

Minoche AE, Dohm JC, Schneider J, Holtgräwe D, Viehöver P, Montfort M, Rosleff Sörensen T, Weisshaar B, Himmelbauer H (2015)
Genome Biology 16: 184.

Journal Article | Published | English
Abstract
We develop a method to predict and validate gene models using PacBio single-molecule, real-time (SMRT) cDNA reads. Ninety-eight percent of full-insert SMRT reads span complete open reading frames. Gene model validation using SMRT reads is developed as an automated process. Optimized training and prediction settings, together with mRNA-seq noise reduction of the assisting Illumina reads, result in increased gene prediction sensitivity and precision. Additionally, we present an improved gene set for sugar beet (Beta vulgaris) and the first genome-wide gene set for spinach (Spinacia oleracea). The workflow and guidelines are a valuable resource for obtaining comprehensive gene sets for newly sequenced genomes of non-model eukaryotes.
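The abstract does not spell out how a gene model is checked against a SMRT read, so the following is only a minimal sketch of one plausible criterion, not the authors' procedure. It assumes a read supports a model when its alignment covers the whole coding span and its introns inside that span match the model's introns exactly. The names `Transcript` and `read_supports_model` and the 1-based inclusive coordinate convention are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumption: 1-based inclusive (start, end) on the same sequence and strand.
Interval = Tuple[int, int]


@dataclass
class Transcript:
    """A gene model or an aligned read, given as sorted, non-overlapping exons."""
    exons: List[Interval]

    def introns(self) -> List[Interval]:
        # An intron is the gap between two consecutive exons.
        return [(prev_end + 1, next_start - 1)
                for (_, prev_end), (next_start, _) in zip(self.exons, self.exons[1:])]


def read_supports_model(model: Transcript, cds: Interval, read: Transcript) -> bool:
    """Hypothetical support criterion (not taken from the paper).

    The read must span the complete coding region, and the introns lying
    inside that region must be identical in the model and in the read.
    """
    read_start, read_end = read.exons[0][0], read.exons[-1][1]
    if read_start > cds[0] or read_end < cds[1]:
        return False  # read does not cover the full coding span

    def inside_cds(intron: Interval) -> bool:
        return cds[0] <= intron[0] and intron[1] <= cds[1]

    model_introns = [i for i in model.introns() if inside_cds(i)]
    read_introns = [i for i in read.introns() if inside_cds(i)]
    return model_introns == read_introns


if __name__ == "__main__":
    model = Transcript(exons=[(100, 200), (301, 400), (501, 600)])
    cds = (150, 550)
    full_read = Transcript(exons=[(120, 200), (301, 400), (501, 580)])
    print(read_supports_model(model, cds, full_read))
```

Requiring an exact intron match is deliberately strict. A real pipeline would likely need to tolerate alignment ambiguity near splice sites and handle alternative isoforms, which this sketch ignores.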



4 Citations in Europe PMC

Data provided by Europe PubMed Central.

A survey of the sorghum transcriptome using single-molecule long reads.
Abdel-Ghany SE, Hamilton M, Jacobi JL, Ngam P, Devitt N, Schilkey F, Ben-Hur A, Reddy AS., Nat Commun 7, 2016
PMID: 27339290
cDNA Library Enrichment of Full Length Transcripts for SMRT Long Read Sequencing.
Cartolano M, Huettel B, Hartwig B, Reinhardt R, Schneeberger K., PLoS ONE 11(6), 2016
PMID: 27327613
Single-molecule real-time transcript sequencing facilitates common wheat genome annotation and grain transcriptome research.
Dong L, Liu H, Zhang J, Yang S, Kong G, Chu JS, Chen N, Wang D., BMC Genomics 16, 2015
PMID: 26645802
De novo and comparative transcriptome analysis of cultivated and wild spinach.
Xu C, Jiao C, Zheng Y, Sun H, Liu W, Cai X, Wang X, Liu S, Xu Y, Mou B, Dai S, Fei Z, Wang Q., Sci Rep 5, 2015
PMID: 26635144


Sources

PMID: 26328666