Consideration of non-canonical splice sites improves gene prediction on the Arabidopsis thaliana Niederzenz-1 genome sequence

Pucker B, Holtgräwe D, Weisshaar B (2017)
BMC Research Notes 10: 667.

Journal Article | Original Article | Published | English

No fulltext has been uploaded

Abstract
Abstract Objective The Arabidopsis thaliana Niederzenz-1 genome sequence was recently published with an ab initio gene prediction. In depth analysis of the predicted gene set revealed some errors involving genes with non-canonical splice sites in their introns. Since non-canonical splice sites are difficult to predict ab initio, we checked for options to improve the annotation by transferring annotation information from the recently released Columbia-0 reference genome sequence annotation Araport11. Results Incorporation of hints generated from Araport11 enabled the precise prediction of non-canonical splice sites. Manual inspection of RNA-Seq read mapping and RT-PCR were applied to validate the structural annotations of non-canonical splice sites. Predictions of untranslated regions were also updated by harnessing the potential of Araport11’s information, which was generated by using high coverage RNA-Seq data. The improved gene set of the Nd-1 genome assembly (GeneSet_Nd-1_v1.1) was evaluated via comparison to the initial gene prediction (GeneSet_Nd-1_v1.0) as well as against Araport11 for the Col-0 reference genome sequence. GeneSet_Nd-1_v1.1 contains previously missed non-canonical splice sites in 1256 genes. Reciprocal best hits for 24,527 (89.4%) of all nuclear Col-0 genes against the GeneSet_Nd-1_v1.1 indicate a high gene prediction quality.
Publishing Year
ISSN
Financial disclosure
Article Processing Charge funded by the Deutsche Forschungsgemeinschaft and the Open Access Publication Fund of Bielefeld University.
PUB-ID

Cite this

Pucker B, Holtgräwe D, Weisshaar B. Consideration of non-canonical splice sites improves gene prediction on the Arabidopsis thaliana Niederzenz-1 genome sequence. BMC Research Notes. 2017;10: 667.
Pucker, B., Holtgräwe, D., & Weisshaar, B. (2017). Consideration of non-canonical splice sites improves gene prediction on the Arabidopsis thaliana Niederzenz-1 genome sequence. BMC Research Notes, 10, 667. doi:10.1186/s13104-017-2985-y
Pucker, B., Holtgräwe, D., and Weisshaar, B. (2017). Consideration of non-canonical splice sites improves gene prediction on the Arabidopsis thaliana Niederzenz-1 genome sequence. BMC Research Notes 10:667.
Pucker, B., Holtgräwe, D., & Weisshaar, B., 2017. Consideration of non-canonical splice sites improves gene prediction on the Arabidopsis thaliana Niederzenz-1 genome sequence. BMC Research Notes, 10: 667.
B. Pucker, D. Holtgräwe, and B. Weisshaar, “Consideration of non-canonical splice sites improves gene prediction on the Arabidopsis thaliana Niederzenz-1 genome sequence”, BMC Research Notes, vol. 10, 2017, : 667.
Pucker, B., Holtgräwe, D., Weisshaar, B.: Consideration of non-canonical splice sites improves gene prediction on the Arabidopsis thaliana Niederzenz-1 genome sequence. BMC Research Notes. 10, : 667 (2017).
Pucker, Boas, Holtgräwe, Daniela, and Weisshaar, Bernd. “Consideration of non-canonical splice sites improves gene prediction on the Arabidopsis thaliana Niederzenz-1 genome sequence”. BMC Research Notes 10 (2017): 667.
This data publication is cited in the following publications:
This publication cites the following data publications:

Export

0 Marked Publications

Open Data PUB

Sources

PMID: 29202864
PubMed | Europe PMC

Search this title in

Google Scholar