Ensemble approach combining multiple methods improves human transcription start site prediction
Dineen DG, Schroeder M, Higgins DG, Cunningham P (2010)
BMC Genomics 11(1): 677.
Journal article
| Published | English
Authors
Dineen, David G.;
Schroeder, Markus (UniBi);
Higgins, Desmond G.;
Cunningham, Padraig
Abstract / Note
Background: The computational prediction of transcription start sites is an important unsolved problem. Some recent progress has been made, but many promoters, particularly those not associated with CpG islands, are still difficult to locate using current methods. These methods use different features and training sets, along with a variety of machine learning techniques, and result in different prediction sets.
Results: We demonstrate the heterogeneity of current prediction sets, and take advantage of this heterogeneity to construct a two-level classifier ('Profisi Ensemble') using predictions from 7 programs, along with 2 other data sources. Support vector machines using 'full' and 'reduced' data sets are combined in an either/or approach. We achieve a 14% increase in performance over the current state-of-the-art, as benchmarked by a third-party tool.
Conclusions: Supervised learning methods are a useful way to combine predictions from diverse sources.
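The two-level, either/or combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the weights, biases, feature layout, and function names below are hypothetical, linear decision functions stand in for trained support vector machines, and "either/or" is assumed here to mean that a candidate region is called a transcription start site when either the 'full' or the 'reduced' classifier flags it.

```python
from typing import List, Sequence

# Hypothetical second-level models. A linear SVM scores an input as w.x + b,
# so fixed weights stand in for classifiers that would normally be trained.
FULL_WEIGHTS = [0.6, 0.5, 0.4, 0.7]  # assumed layout: 3 program scores + 1 other data source
FULL_BIAS = -1.0
REDUCED_INDICES = [0, 1]             # assumed: the 'reduced' set uses only the first two features
REDUCED_WEIGHTS = [0.9, 0.8]
REDUCED_BIAS = -1.0


def linear_decision(weights: Sequence[float], bias: float, features: Sequence[float]) -> float:
    """Return the signed decision value w.x + b of a linear classifier."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def ensemble_predict(first_level_scores: Sequence[float]) -> bool:
    """Combine the 'full' and 'reduced' classifiers in an either/or fashion.

    first_level_scores holds the outputs of the first-level predictors
    (plus any extra data sources) for one candidate region.
    """
    full_hit = linear_decision(FULL_WEIGHTS, FULL_BIAS, first_level_scores) > 0
    reduced_features: List[float] = [first_level_scores[i] for i in REDUCED_INDICES]
    reduced_hit = linear_decision(REDUCED_WEIGHTS, REDUCED_BIAS, reduced_features) > 0
    # Either classifier firing is enough to report a positive call.
    return full_hit or reduced_hit
```

With these made-up weights, a region supported only by the last two features is picked up by the 'full' classifier, while a region with moderate support from the first two programs is picked up only by the 'reduced' classifier, which is the kind of complementarity an either/or rule is meant to exploit.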
Publication year
2010
Journal title
BMC Genomics
Volume
11
Issue
1
Article number
677
ISSN
1471-2164
Page URI
https://pub.uni-bielefeld.de/record/2003511
Cite
Dineen DG, Schroeder M, Higgins DG, Cunningham P. Ensemble approach combining multiple methods improves human transcription start site prediction. BMC Genomics. 2010;11(1): 677.
Dineen, D. G., Schroeder, M., Higgins, D. G., & Cunningham, P. (2010). Ensemble approach combining multiple methods improves human transcription start site prediction. BMC Genomics, 11(1), 677. https://doi.org/10.1186/1471-2164-11-677
Dineen, David G., Schroeder, Markus, Higgins, Desmond G., and Cunningham, Padraig. 2010. “Ensemble approach combining multiple methods improves human transcription start site prediction”. BMC Genomics 11 (1): 677.
Dineen, D. G., Schroeder, M., Higgins, D. G., and Cunningham, P. (2010). Ensemble approach combining multiple methods improves human transcription start site prediction. BMC Genomics 11:677.
Dineen, D.G., et al., 2010. Ensemble approach combining multiple methods improves human transcription start site prediction. BMC Genomics, 11(1): 677.
D.G. Dineen, et al., “Ensemble approach combining multiple methods improves human transcription start site prediction”, BMC Genomics, vol. 11, 2010, 677.
Dineen, D.G., Schroeder, M., Higgins, D.G., Cunningham, P.: Ensemble approach combining multiple methods improves human transcription start site prediction. BMC Genomics. 11, 677 (2010).
Dineen, David G., Schroeder, Markus, Higgins, Desmond G., and Cunningham, Padraig. “Ensemble approach combining multiple methods improves human transcription start site prediction”. BMC Genomics 11.1 (2010): 677.
All files available under the following license(s):
Copyright Statement:
This object is protected by copyright and/or related rights. [...]
Full text(s)
Access Level
Open Access
Last uploaded
2019-09-06T08:57:19Z
MD5 checksum
d6d0c965494c82dd62505a445c9cfded
Data provided by European Bioinformatics Institute (EBI)
2 citations in Europe PMC
Data provided by Europe PubMed Central.
The impact of sequence length and number of sequences on promoter prediction performance.
Carvalho SG, Guerra-Sá R, de C Merschmann LH., BMC Bioinformatics 16 Suppl 19(), 2015
PMID: 26695879
Melanocortin 3 receptor has a 5' exon that directs translation of apically localized protein from the second in-frame ATG.
Park J, Sharma N, Cutting GR., Mol Endocrinol 28(9), 2014
PMID: 25051171
33 References
Data provided by Europe PubMed Central.
The transcriptional landscape of the mammalian genome.
Carninci P, Kasukawa T, Katayama S, Gough J, Frith MC, Maeda N, Oyama R, Ravasi T, Lenhard B, Wells C, Kodzius R, Shimokawa K, Bajic VB, Brenner SE, Batalov S, Forrest AR, Zavolan M, Davis MJ, Wilming LG, Aidinis V, Allen JE, Ambesi-Impiombato A, Apweiler R, Aturaliya RN, Bailey TL, Bansal M, Baxter L, Beisel KW, Bersano T, Bono H, Chalk AM, Chiu KP, Choudhary V, Christoffels A, Clutterbuck DR, Crowe ML, Dalla E, Dalrymple BP, de Bono B, Della Gatta G, di Bernardo D, Down T, Engstrom P, Fagiolini M, Faulkner G, Fletcher CF, Fukushima T, Furuno M, Futaki S, Gariboldi M, Georgii-Hemming P, Gingeras TR, Gojobori T, Green RE, Gustincich S, Harbers M, Hayashi Y, Hensch TK, Hirokawa N, Hill D, Huminiecki L, Iacono M, Ikeo K, Iwama A, Ishikawa T, Jakt M, Kanapin A, Katoh M, Kawasawa Y, Kelso J, Kitamura H, Kitano H, Kollias G, Krishnan SP, Kruger A, Kummerfeld SK, Kurochkin IV, Lareau LF, Lazarevic D, Lipovich L, Liu J, Liuni S, McWilliam S, Madan Babu M, Madera M, Marchionni L, Matsuda H, Matsuzawa S, Miki H, Mignone F, Miyake S, Morris K, Mottagui-Tabar S, Mulder N, Nakano N, Nakauchi H, Ng P, Nilsson R, Nishiguchi S, Nishikawa S, Nori F, Ohara O, Okazaki Y, Orlando V, Pang KC, Pavan WJ, Pavesi G, Pesole G, Petrovsky N, Piazza S, Reed J, Reid JF, Ring BZ, Ringwald M, Rost B, Ruan Y, Salzberg SL, Sandelin A, Schneider C, Schonbach C, Sekiguchi K, Semple CA, Seno S, Sessa L, Sheng Y, Shibata Y, Shimada H, Shimada K, Silva D, Sinclair B, Sperling S, Stupka E, Sugiura K, Sultana R, Takenaka Y, Taki K, Tammoja K, Tan SL, Tang S, Taylor MS, Tegner J, Teichmann SA, Ueda HR, van Nimwegen E, Verardo R, Wei CL, Yagi K, Yamanishi H, Zabarovsky E, Zhu S, Zimmer A, Hide W, Bult C, Grimmond SM, Teasdale RD, Liu ET, Brusic V, Quackenbush J, Wahlestedt C, Mattick JS, Hume DA, Kai C, Sasaki D, Tomaru Y, Fukuda S, Kanamori-Katayama M, Suzuki M, Aoki J, Arakawa T, Iida J, Imamura K, Itoh M, Kato T, Kawaji H, Kawagashira N, Kawashima T, Kojima M, Kondo S, Konno H, Nakano K, Ninomiya N, 
Nishio T, Okada M, Plessy C, Shibata K, Shiraki T, Suzuki S, Tagami M, Waki K, Watahiki A, Okamura-Oho Y, Suzuki H, Kawai J, Hayashizaki Y; FANTOM Consortium; RIKEN Genome Exploration Research Group and Genome Science Group (Genome Network Project Core Group)., Science 309(5740), 2005
PMID: 16141072
Transcriptional maps of 10 human chromosomes at 5-nucleotide resolution.
Cheng J, Kapranov P, Drenkow J, Dike S, Brubaker S, Patel S, Long J, Stern D, Tammana H, Helt G, Sementchenko V, Piccolboni A, Bekiranov S, Bailey DK, Ganesh M, Ghosh S, Bell I, Gerhard DS, Gingeras TR., Science 308(5725), 2005
PMID: 15790807
RNA maps reveal new RNA classes and a possible function for pervasive transcription.
Kapranov P, Cheng J, Dike S, Nix DA, Duttagupta R, Willingham AT, Stadler PF, Hertel J, Hackermuller J, Hofacker IL, Bell I, Cheung E, Drenkow J, Dumais E, Patel S, Helt G, Ganesh M, Ghosh S, Piccolboni A, Sementchenko V, Tammana H, Gingeras TR., Science 316(5830), 2007
PMID: 17510325
Genome-wide analysis of mammalian promoter architecture and evolution.
Carninci P, Sandelin A, Lenhard B, Katayama S, Shimokawa K, Ponjavic J, Semple CA, Taylor MS, Engstrom PG, Frith MC, Forrest AR, Alkema WB, Tan SL, Plessy C, Kodzius R, Ravasi T, Kasukawa T, Fukuda S, Kanamori-Katayama M, Kitazume Y, Kawaji H, Kai C, Nakamura M, Konno H, Nakano K, Mottagui-Tabar S, Arner P, Chesi A, Gustincich S, Persichetti F, Suzuki H, Grimmond SM, Wells CA, Orlando V, Wahlestedt C, Liu ET, Harbers M, Kawai J, Bajic VB, Hume DA, Hayashizaki Y., Nat. Genet. 38(6), 2006
PMID: 16645617
Toward a gold standard for promoter prediction evaluation.
Abeel T, Van de Peer Y, Saeys Y., Bioinformatics 25(12), 2009
PMID: 19478005
A genome-wide analysis of CpG dinucleotides in the human genome distinguishes two distinct classes of promoters.
Saxonov S, Berg P, Brutlag DL., Proc. Natl. Acad. Sci. U.S.A. 103(5), 2006
PMID: 16432200
CpG-depleted promoters harbor tissue-specific transcription factor binding signals--implications for motif overrepresentation analyses.
Roider HG, Lenhard B, Kanhere A, Haas SA, Vingron M., Nucleic Acids Res. 37(19), 2009
PMID: 19736212
Determining promoter location based on DNA structure first-principles calculations.
Goni JR, Perez A, Torrents D, Orozco M., Genome Biol. 8(12), 2007
PMID: 18072969
ARTS: accurate recognition of transcription starts in human.
Sonnenburg S, Zien A, Ratsch G., Bioinformatics 22(14), 2006
PMID: 16873509
High DNA melting temperature predicts transcription start site location in human and mouse
AUTHOR UNKNOWN, 2009
MetaProm: a neural network based meta-predictor for alternative human promoter prediction.
Wang J, Ungar LH, Tseng H, Hannenhalli S., BMC Genomics 8(), 2007
PMID: 17941982
EnsemPro: an ensemble approach to predicting transcription start sites in human genomic DNA sequences.
Won HH, Kim MJ, Kim S, Kim JW., Genomics 91(3), 2008
PMID: 18164178
Meta-prediction of phosphorylation sites with weighted voting and restricted grid search parameter selection
AUTHOR UNKNOWN, 2008
Microarray-based classification and clinical predictors: on combined classifiers and additional predictive value.
Boulesteix AL, Porzelius C, Daumer M., Bioinformatics 24(15), 2008
PMID: 18544547
Classifier ensembles for protein structural class prediction with varying homology.
Kedarisetti KD, Kurgan L, Dick S., Biochem. Biophys. Res. Commun. 348(3), 2006
PMID: 16904630
Ensemble Based Systems in Decision Making
AUTHOR UNKNOWN, 2006
Is Combining Classifiers Better than Selecting the Best One
AUTHOR UNKNOWN, 2004
Performance assessment of promoter predictions on ENCODE regions in the EGASP experiment.
Bajic VB, Brent MR, Brown RH, Frankish A, Harrow J, Ohler U, Solovyev VV, Tan SL., Genome Biol. 7 Suppl 1(), 2006
PMID: 16925837
Developmental programming of CpG island methylation profiles in the human genome.
Straussman R, Nejman D, Roberts D, Steinfeld I, Blum B, Benvenisty N, Simon I, Yakhini Z, Cedar H., Nat. Struct. Mol. Biol. 16(5), 2009
PMID: 19377480
The effect of methylation on some biological parameters in Salmonella enterica serovar Typhimurium.
Aloui A, Tagourti J, El May A, Joseleau Petit D, Landoulsi A., Pathol. Biol. 59(4), 2009
PMID: 19477083
The significance of DNA methylation patterns: promoter inhibition by sequence-specific methylation is one functional consequence.
Doerfler W., Philos. Trans. R. Soc. Lond., B, Biol. Sci. 326(1235), 1990
PMID: 1968662
Evaluation of regulatory potential and conservation scores for detecting cis-regulatory modules in aligned mammalian genome sequences.
King DC, Taylor J, Elnitski L, Chiaromonte F, Miller W, Hardison RC., Genome Res. 15(8), 2005
PMID: 16024817
Genome-wide analysis of core promoter elements from conserved human and mouse orthologous pairs.
Jin VX, Singer GA, Agosto-Perez FJ, Liyanarachchi S, Davuluri RV., BMC Bioinformatics 7(), 2006
PMID: 16522199
ProSOM: core promoter prediction based on unsupervised clustering of DNA physical profiles.
Abeel T, Saeys Y, Rouze P, Van de Peer Y., Bioinformatics 24(13), 2008
PMID: 18586720
Generic eukaryotic core promoter prediction using structural features of DNA.
Abeel T, Saeys Y, Bonnet E, Rouze P, Van de Peer Y., Genome Res. 18(2), 2007
PMID: 18096745
Computational identification of promoters and first exons in the human genome.
Davuluri RV, Grosse I, Zhang MQ., Nat. Genet. 29(4), 2001
PMID: 11726928
Computational detection and location of transcription start sites in mammalian genomic DNA.
Down TA, Hubbard TJ., Genome Res. 12(3), 2002
PMID: 11875034
Using multiple alignments to improve gene prediction.
Gross SS, Brent MR., J. Comput. Biol. 13(2), 2006
PMID: 16597247
Combining classifiers for improved classification of proteins from sequence or structure.
Melvin I, Weston J, Leslie CS, Noble WS., BMC Bioinformatics 9(), 2008
PMID: 18808707
The human genome browser at UCSC.
Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, Haussler D., Genome Res. 12(6), 2002
PMID: 12045153
DBTSS: DataBase of human Transcriptional Start Sites and full-length cDNAs.
Suzuki Y, Yamashita R, Nakai K, Sugano S., Nucleic Acids Res. 30(1), 2002
PMID: 11752328
The WEKA data mining software: an update
AUTHOR UNKNOWN, 2009
LIBSVM: a library for support vector machines
AUTHOR UNKNOWN, 2001
Sources
PMID: 21118509