Ensemble approach combining multiple methods improves human transcription start site prediction

Dineen DG, Schroeder M, Higgins DG, Cunningham P (2010)
BMC Genomics 11(1): 677.

Journal article | Published | English
Abstract
Background: The computational prediction of transcription start sites is an important unsolved problem. Some recent progress has been made, but many promoters, particularly those not associated with CpG islands, are still difficult to locate using current methods. These methods use different features and training sets, along with a variety of machine learning techniques, and therefore produce different prediction sets. Results: We demonstrate the heterogeneity of current prediction sets, and take advantage of this heterogeneity to construct a two-level classifier ('Profisi Ensemble') using predictions from 7 programs, along with 2 other data sources. Support vector machines using 'full' and 'reduced' data sets are combined in an either/or approach. We achieve a 14% increase in performance over the current state-of-the-art, as benchmarked by a third-party tool. Conclusions: Supervised learning methods are a useful way to combine predictions from diverse sources.
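The two-level idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: a minimal linear classifier trained by hinge-loss sub-gradient descent stands in for the support vector machines, the toy scores and the choice of 'reduced' features are hypothetical, and 'either/or' is read here, as an assumption, to mean that a window is called positive when either second-level classifier says so.

```python
# Illustrative sketch only. A small linear classifier trained by hinge-loss
# sub-gradient descent stands in for the support vector machines in the abstract.

def train_linear(rows, labels, epochs=200, lr=0.1, reg=0.01):
    """Fit (weights, bias) on feature rows with labels in {-1, +1}."""
    w = [0.0] * len(rows[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [wi * (1.0 - lr * reg) for wi in w]  # L2 shrinkage
            if margin < 1.0:  # hinge loss is active: step towards the label
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(model, x):
    """Return True when the linear score is positive."""
    w, b = model
    return sum(wi * xi for wi, xi in zip(w, x)) + b > 0


def either_or(full_model, reduced_model, x_full, reduced_idx):
    """Assumed reading of 'either/or': positive if either classifier says so."""
    x_reduced = [x_full[i] for i in reduced_idx]
    return predict(full_model, x_full) or predict(reduced_model, x_reduced)


# Hypothetical toy data: each row holds scores from three base predictors
# plus one extra data source for a genomic window; +1 marks a TSS window.
ROWS = [[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0],
        [0.9, 0.8, 1.0, 0.9], [0.1, 0.2, 0.0, 0.1]]
LABELS = [1, -1, 1, -1]
REDUCED_IDX = [0, 1]  # hypothetical choice of 'reduced' features

# Second level: one classifier on all features, one on the reduced subset.
full_model = train_linear(ROWS, LABELS)
reduced_model = train_linear([[r[i] for i in REDUCED_IDX] for r in ROWS], LABELS)

for row in ROWS:
    print(row, either_or(full_model, reduced_model, row, REDUCED_IDX))
```

The first level here is simply the vector of base-predictor scores; the second level learns how to weight them. Keeping a separate 'reduced' classifier lets a window be called even when the full feature vector is uninformative, which is one plausible motivation for combining the two.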
Publication year
2010
Journal title
BMC Genomics
Volume
11
Issue
1
Page
677

Cite

Dineen, D. G., Schroeder, M., Higgins, D. G., & Cunningham, P. (2010). Ensemble approach combining multiple methods improves human transcription start site prediction. BMC Genomics, 11(1), 677. doi:10.1186/1471-2164-11-677
All files are available under the following license(s):
Copyright Statement:
This Item is protected by copyright and/or related rights. [...]
Full text(s)
Access Level
OA Open Access
Last Uploaded
2012-02-16T08:37:57Z

2 citations in Europe PMC

Data provided by Europe PubMed Central.

The impact of sequence length and number of sequences on promoter prediction performance.
Carvalho SG, Guerra-Sá R, de C Merschmann LH. BMC Bioinformatics 16 Suppl 19, 2015
PMID: 26695879


Sources

PMID: 21118509