Identifying Gene Clusters by Discovering Common Intervals in Indeterminate Strings
Dörr D, Stoye J, Böcker S, Jahn K (2014)
BMC Genomics 15(Suppl. 6: Proc. of RECOMB-CG 2014): S2.
Journal article
Published | English
Abstract / Note
Background: Comparative analyses of chromosomal gene orders are successfully used to predict gene clusters in
bacterial and fungal genomes. Present models for detecting sets of co-localized genes in chromosomal sequences
require prior knowledge of gene family assignments of genes in the dataset of interest. These families are often
computationally predicted on the basis of sequence similarity or higher order features of gene products. Errors
introduced in this process amplify in subsequent gene order analyses and thus may deteriorate gene cluster
prediction.
Results: In this work, we present a new dynamic model and efficient computational approaches for gene cluster
prediction suitable in scenarios ranging from traditional gene family-based gene cluster prediction, via multiple
conflicting gene family annotations, to gene family-free analysis, in which gene clusters are predicted solely on the
basis of a pairwise similarity measure of the genes of different genomes. We evaluate our gene family-free model
against a gene family-based model on a dataset of 93 bacterial genomes.
Conclusions: Our model is able to detect gene clusters that would also be detected with well-established gene
family-based approaches. Moreover, we show that it is able to detect conserved regions which are missed by gene
family-based methods due to wrong or deficient gene family assignments.
Year of publication
2014
Journal title
BMC Genomics
Volume
15
Issue
Suppl. 6: Proc. of RECOMB-CG 2014
Article number
S2
ISSN
1471-2164
Funding information
Open-access publication costs were funded by the Deutsche Forschungsgemeinschaft and Bielefeld University.
Page URI
https://pub.uni-bielefeld.de/record/2687033
Cite
Dörr D, Stoye J, Böcker S, Jahn K. Identifying Gene Clusters by Discovering Common Intervals in Indeterminate Strings. BMC Genomics. 2014;15(Suppl. 6: Proc. of RECOMB-CG 2014): S2.
Dörr, D., Stoye, J., Böcker, S., & Jahn, K. (2014). Identifying Gene Clusters by Discovering Common Intervals in Indeterminate Strings. BMC Genomics, 15(Suppl. 6: Proc. of RECOMB-CG 2014), S2. doi:10.1186/1471-2164-15-S6-S2
Dörr, Daniel, Stoye, Jens, Böcker, Sebastian, and Jahn, Katharina. 2014. “Identifying Gene Clusters by Discovering Common Intervals in Indeterminate Strings”. BMC Genomics 15 (Suppl. 6: Proc. of RECOMB-CG 2014): S2.
Dörr, D., Stoye, J., Böcker, S., and Jahn, K. (2014). Identifying Gene Clusters by Discovering Common Intervals in Indeterminate Strings. BMC Genomics 15:S2.
Dörr, D., et al., 2014. Identifying Gene Clusters by Discovering Common Intervals in Indeterminate Strings. BMC Genomics, 15(Suppl. 6: Proc. of RECOMB-CG 2014): S2.
D. Dörr, et al., “Identifying Gene Clusters by Discovering Common Intervals in Indeterminate Strings”, BMC Genomics, vol. 15, 2014, S2.
Dörr, D., Stoye, J., Böcker, S., Jahn, K.: Identifying Gene Clusters by Discovering Common Intervals in Indeterminate Strings. BMC Genomics. 15, S2 (2014).
Dörr, Daniel, Stoye, Jens, Böcker, Sebastian, and Jahn, Katharina. “Identifying Gene Clusters by Discovering Common Intervals in Indeterminate Strings”. BMC Genomics 15.Suppl. 6: Proc. of RECOMB-CG 2014 (2014): S2.
All files are available under the following license(s):
Copyright Statement:
This object is protected by copyright and/or related rights. [...]
Full text(s)
Access Level
Open Access
Last uploaded
2019-09-06T09:18:25Z
MD5 checksum
084536c32a0dcb8d56920d30f53c67e3
Data provided by the European Bioinformatics Institute (EBI)
1 citation in Europe PMC
Data provided by Europe PubMed Central.
Finding approximate gene clusters with Gecko 3.
Winter S, Jahn K, Wehner S, Kuchenbecker L, Marz M, Stoye J, Bocker S., Nucleic Acids Res. 44(20), 2016
PMID: 27679480
Sources
PMID: 25571793