Finding approximate gene clusters with GECKO 3

Winter S, Jahn K, Wehner S, Kuchenbecker L, Marz M, Stoye J, Böcker S (2016)
Nucleic Acids Research 44(20): 9600-9610.

Journal article | Published | English
No files have been uploaded. Publication record only!
Winter, Sascha; Jahn, Katharina; Wehner, Stefanie; Kuchenbecker, Leon; Marz, Manja; Stoye, Jens; Böcker, Sebastian
Abstract / Note
Gene-order-based comparison of multiple genomes provides signals for functional analysis of genes and the evolutionary process of genome organization. Gene clusters are regions of co-localized genes on genomes of different species. The rapid increase in sequenced genomes necessitates bioinformatics tools for finding gene clusters in hundreds of genomes. Existing tools are often restricted to few (in many cases, only two) genomes, and often make restrictive assumptions such as short perfect conservation, conserved gene order or monophyletic gene clusters. We present Gecko 3, an open-source software for finding gene clusters in hundreds of bacterial genomes, that comes with an easy-to-use graphical user interface. The underlying gene cluster model is intuitive, can cope with low degrees of conservation as well as misannotations and is complemented by a sound statistical evaluation. To evaluate the biological benefit of Gecko 3 and to exemplify our method, we search for gene clusters in a dataset of 678 bacterial genomes using Synechocystis sp. PCC 6803 as a reference. We confirm detected gene clusters reviewing the literature and comparing them to a database of operons; we detect two novel clusters, which were confirmed by publicly available experimental RNA-Seq data. The computational analysis is carried out on a laptop computer in <40 min.
Nucleic Acids Research


Winter, S., Jahn, K., Wehner, S., Kuchenbecker, L., Marz, M., Stoye, J., & Böcker, S. (2016). Finding approximate gene clusters with GECKO 3. Nucleic Acids Research, 44(20), 9600-9610. doi:10.1093/nar/gkw843

2 citations in Europe PMC

Data provided by Europe PubMed Central.

GraphTeams: a method for discovering spatial gene clusters in Hi-C sequencing data.
Schulz T, Stoye J, Doerr D., BMC Genomics 19(suppl 5), 2018
PMID: 29745835

87 References


Synteny and collinearity in plant genomes.
Tang H, Bowers JE, Wang X, Ming R, Alam M, Paterson AH., Science 320(5875), 2008
PMID: 18436778
The use of gene clusters to infer functional coupling.
Overbeek R, Fonstein M, D'Souza M, Pusch GD, Maltsev N., Proc. Natl. Acad. Sci. U.S.A. 96(6), 1999
PMID: 10077608
Gene expansion shapes genome architecture in the human pathogen Lichtheimia corymbifera: an evolutionary genomics analysis in the ancient terrestrial mucorales (Mucoromycotina).
Schwartze VU, Winter S, Shelest E, Marcet-Houben M, Horn F, Wehner S, Linde J, Valiante V, Sammeth M, Riege K, Nowrousian M, Kaerger K, Jacobsen ID, Marz M, Brakhage AA, Gabaldon T, Bocker S, Voigt K., PLoS Genet. 10(8), 2014
PMID: 25121733
Fast algorithms to enumerate all common intervals of two permutations
Uno T., Yagiura M., 2000
Algorithms for finding gene clusters
Heber S., Stoye J., 2001
The algorithmic of gene teams
Bergeron A., Corteel S., Raffinot M., 2002
GRIMM: genome rearrangements web server.
Tesler G., Bioinformatics 18(3), 2002
PMID: 11934753
The automatic detection of homologous regions (ADHoRe) and its application to microcolinearity between Arabidopsis and rice.
Vandepoele K, Saeys Y, Simillion C, Raes J, Van De Peer Y., Genome Res. 12(11), 2002
PMID: 12421767
Common intervals of two sequences
Didier G., 2003
Fast identification and statistical evaluation of segmental homologies in comparative maps.
Calabrese PP, Chakravarty S, Vision TJ., Bioinformatics 19(Suppl 1), 2003
PMID: 12855440
DAGchainer: a tool for mining segmental genome duplications and synteny.
Haas BJ, Delcher AL, Wortman JR, Salzberg SL., Bioinformatics 20(18), 2004
PMID: 15247098
Building genomic profiles for uncovering segmental homology in the twilight zone.
Simillion C, Vandepoele K, Saeys Y, Van de Peer Y., Genome Res. 14(6), 2004
PMID: 15173115
Identifying conserved gene clusters in the presence of homology families
He X., Goldwasser M.H., 2005
Identification of genomic features using microsyntenies of domains: domain teams.
Pasek S, Bergeron A, Risler JL, Louis A, Ollivier E, Raffinot M., Genome Res. 15(6), 2005
PMID: 15899966
Gene teams with relaxed proximity constraint
Kim S., Choi J.-H., Yang J., 2005
Statistical inference of chromosomal homology based on gene colinearity and applications to Arabidopsis and rice.
Wang X, Shi X, Li Z, Zhu Q, Kong L, Tang W, Ge S, Luo J., BMC Bioinformatics 7, 2006
PMID: 17038171
Gecko and GhostFam—rigorous and efficient gene cluster detection in prokaryotic genomes
Schmidt T., Stoye J., 2007
Efficiently identifying max-gap clusters in pairwise genome comparison.
Ling X, He X, Xin D, Han J, Han J., J. Comput. Biol. 15(6), 2008
PMID: 18631023
Syntenator: multiple gene order alignments with a gene-specific scoring function.
Rodelsperger C, Dieterich C., Algorithms Mol Biol 3, 2008
PMID: 18990215
CYNTENATOR: progressive gene order alignment of 17 vertebrate genomes.
Rodelsperger C, Dieterich C., PLoS ONE 5(1), 2010
PMID: 20126624
Bacterial syntenies: an exact approach with gene quorum.
Denielou YP, Sagot MF, Boyer F, Viari A., BMC Bioinformatics 12, 2011
PMID: 21605461
i-ADHoRe 3.0--fast and sensitive detection of genomic homology in extremely large data sets.
Proost S, Fostier J, De Witte D, Dhoedt B, Demeester P, Van de Peer Y, Vandepoele K., Nucleic Acids Res. 40(2), 2011
PMID: 22102584
Identifying gene clusters by discovering common intervals in indeterminate strings
Doerr D., Stoye J., Böcker S., Jahn K., 2014
Basic local alignment search tool.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ., J. Mol. Biol. 215(3), 1990
PMID: 2231712
Partitioning biological data with transitivity clustering.
Wittkop T, Emig D, Lange S, Rahmann S, Albrecht M, Morris JH, Bocker S, Stoye J, Baumbach J., Nat. Methods 7(6), 2010
PMID: 20508635
Computation of median gene clusters.
Bocker S, Jahn K, Mixtacki J, Stoye J., J. Comput. Biol. 16(8), 2009
PMID: 19689215

Jahn K., 2010
Statistics for approximate gene clusters
Jahn K., Winter S., Stoye J., Böcker S., 2013
Controlling the false discovery rate: a practical and powerful approach to multiple testing
Benjamini Y., Hochberg Y., 1995
STRING v9.1: protein-protein interaction networks, with increased coverage and integration.
Franceschini A, Szklarczyk D, Frankild S, Kuhn M, Simonovic M, Roth A, Lin J, Minguez P, Bork P, von Mering C, Jensen LJ., Nucleic Acids Res. 41(Database issue), 2012
PMID: 23203871
The COG database: an updated version includes eukaryotes.
Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin EV, Krylov DM, Mazumder R, Mekhedov SL, Nikolskaya AN, Rao BS, Smirnov S, Sverdlov AV, Vasudevan S, Wolf YI, Yin JJ, Natale DA., BMC Bioinformatics 4(), 2003
PMID: 12969510
RefSeq: an update on mammalian reference sequences.
Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, Farrell CM, Hart J, Landrum MJ, McGarvey KM, Murphy MR, O'Leary NA, Pujar S, Rajput B, Rangwala SH, Riddick LD, Shkeda A, Sun H, Tamez P, Tully RE, Wallin C, Webb D, Weber J, Wu W, DiCuccio M, Kitts P, Maglott DR, Murphy TD, Ostell JM., Nucleic Acids Res. 42(Database issue), 2013
PMID: 24259432
Comparative analysis of the primary transcriptome of Synechocystis sp. PCC 6803.
Kopf M, Klahn S, Scholz I, Matthiessen JK, Hess WR, Voß B., DNA Res. 21(5), 2014
PMID: 24935866
The metabolic network of Synechocystis sp. PCC 6803: systemic properties of autotrophic growth.
Knoop H, Zilliges Y, Lockau W, Steuer R., Plant Physiol. 154(1), 2010
PMID: 20616194
Flux balance analysis of cyanobacterial metabolism: the metabolic network of Synechocystis sp. PCC 6803.
Knoop H, Grundel M, Zilliges Y, Lehmann R, Hoffmann S, Lockau W, Steuer R., PLoS Comput. Biol. 9(6), 2013
PMID: 23843751
Stress sensors and signal transducers in cyanobacteria.
Los DA, Zorina A, Sinetova M, Kryazhov S, Mironov K, Zinchenko VV., Sensors (Basel) 10(3), 2010
PMID: 22294932
Transformation in the cyanobacterium Synechocystis sp. 6803
Grigorieva G., Shestakov S., 1982
Optimum conditions for transformation of Synechocystis sp. PCC 6803.
Zang X, Liu B, Liu S, Arunakumara KK, Zhang X., J. Microbiol. 45(3), 2007
PMID: 17618230
DOOR 2.0: presenting operons and their functions through dynamic and integrated views.
Mao X, Ma Q, Zhou C, Chen X, Zhang H, Yang J, Mao F, Lai W, Xu Y., Nucleic Acids Res. 42(Database issue), 2013
PMID: 24214966
Transcript mapping based on dRNA-seq data.
Bischler T, Kopf M, Voß B., BMC Bioinformatics 15, 2014
PMID: 24780064
Transcription regulation of plastid genes involved in sulfate transport in Viridiplantae.
Lyubetsky VA, Seliverstov AV, Zverkov OA., Biomed Res Int 2013, 2013
PMID: 24073405
The tufB-secE-nusG-rplKAJL-rpoB gene cluster of the liberibacters: sequence comparisons, phylogeny and speciation.
Teixeira DC, Eveillard S, Sirand-Pugnet P, Wulff A, Saillard C, Ayres AJ, Bove JM., Int. J. Syst. Evol. Microbiol. 58(Pt 6), 2008
PMID: 18523188
Prokaryotic genomes: the emerging paradigm of genome-based microbiology.
Koonin EV, Galperin MY., Curr. Opin. Genet. Dev. 7(6), 1997
PMID: 9468784
Phylogenomic analysis of the Chlamydomonas genome unmasks proteins potentially involved in photosynthetic function and regulation.
Grossman AR, Karpowicz SJ, Heinnickel M, Dewez D, Hamel B, Dent R, Niyogi KK, Johnson X, Alric J, Wollman FA, Li H, Merchant SS., Photosyn. Res. 106(1-2), 2010
PMID: 20490922
Cyanobacteria contain a mitochondrial complex I-homologous NADH-dehydrogenase.
Berger S, Ellersiek U, Steinmuller K., FEBS Lett. 286(1-2), 1991
PMID: 1907569
Identification and transcriptional control of the genes encoding the Caulobacter crescentus ClpXP protease.
Osteras M, Stotz A, Schmid Nuoffer S, Jenal U., J. Bacteriol. 181(10), 1999
PMID: 10322004
Genes essential to iron transport in the cyanobacterium Synechocystis sp. strain PCC 6803.
Katoh H, Hagino N, Grossman AR, Ogawa T., J. Bacteriol. 183(9), 2001
PMID: 11292796
Posttranslational regulation of nitrate assimilation in the cyanobacterium Synechocystis sp. strain PCC 6803.
Kobayashi M, Takatani N, Tanigawa M, Omata T., J. Bacteriol. 187(2), 2005
PMID: 15629921
Comparative genome analysis of the closely related Synechocystis strains PCC 6714 and PCC 6803.
Kopf M, Klahn S, Pade N, Weingartner C, Hagemann M, Voß B, Hess WR., DNA Res. 21(3), 2014
PMID: 24408876
The Kdp-ATPase system and its regulation.
Ballal A, Basu B, Apte SK., J. Biosci. 32(3), 2007
PMID: 17536175
Identification of cyanobacterial cell division genes by comparative and mutational analyses.
Miyagishima SY, Wolk CP, Osteryoung KW., Mol. Microbiol. 56(1), 2005
PMID: 15773984
Highly expressed and alien genes of the Synechocystis genome.
Mrazek J, Bhaya D, Grossman AR, Karlin S., Nucleic Acids Res. 29(7), 2001
PMID: 11266562
Gene expression patterns of sulfur starvation in Synechocystis sp. PCC 6803.
Zhang Z, Pendse ND, Phillips KN, Cotner JB, Khodursky A., BMC Genomics 9, 2008
PMID: 18644144
Comparative genomic analyses of nickel, cobalt and vitamin B12 utilization.
Zhang Y, Rodionov DA, Gelfand MS, Gladyshev VN., BMC Genomics 10, 2009
PMID: 19208259
A conserved rubredoxin is necessary for photosystem II accumulation in diverse oxygenic photoautotrophs.
Calderon RH, Garcia-Cerdan JG, Malnoe A, Cook R, Russell JJ, Gaw C, Dent RM, de Vitry C, Niyogi KK., J. Biol. Chem. 288(37), 2013
PMID: 23900844

Peschek G., Löffelhardt W., Schmetterer G., 1999
Carboxysome genomics: a status report
Cannon G.C., Heinhorst S., Bradburne C.E., Shively J.M., 2002
Export of extracellular polysaccharides modulates adherence of the Cyanobacterium synechocystis.
Fisher ML, Allen R, Luo Y, Curtiss R 3rd., PLoS ONE 8(9), 2013
PMID: 24040267
Distinct constitutive and low-CO2-induced CO2 uptake systems in cyanobacteria: genes involved and their phylogenetic relationship with homologous genes in other organisms.
Shibata M, Ohkawa H, Kaneko T, Fukuzawa H, Tabata S, Kaplan A, Ogawa T., Proc. Natl. Acad. Sci. U.S.A. 98(20), 2001
PMID: 11562454
The "anchor polypeptide" of cyanobacterial phycobilisomes. Molecular characterization of the Synechococcus sp. PCC 6301 apce gene.
Capuano V, Braux AS, Tandeau de Marsac N, Houmard J., J. Biol. Chem. 266(11), 1991
PMID: 1901865
Biochemical analysis of three putative KaiC clock proteins from Synechocystis sp. PCC 6803 suggests their functional divergence.
Wiegard A, Dorrich AK, Deinzer HT, Beck C, Wilde A, Holtzendorff J, Axmann IM., Microbiology (Reading, Engl.) 159(Pt 5), 2013
PMID: 23449916

Hirt H., Shinozaki K., 2004
Analysis and manipulation of aspartate pathway genes for L-lysine overproduction from methanol by Bacillus methanolicus.
Nærdal I, Netzer R, Ellingsen TE, Brautaset T., Appl. Environ. Microbiol. 77(17), 2011
PMID: 21724876
Mutational analysis of genes involved in pilus structure, motility and transformation competency in the unicellular motile cyanobacterium Synechocystis sp. PCC 6803.
Yoshihara S, Geng X, Okamoto S, Yura K, Murata T, Go M, Ohmori M, Ikeuchi M., Plant Cell Physiol. 42(1), 2001
PMID: 11158445
Near-UV cyanobacteriochrome signaling system elicits negative phototaxis in the cyanobacterium Synechocystis sp. PCC 6803.
Song JY, Cho HS, Cho JI, Jeon JS, Lagarias JC, Park YI., Proc. Natl. Acad. Sci. U.S.A. 108(26), 2011
PMID: 21670284
Cold-regulated genes under control of the cold sensor Hik33 in Synechocystis.
Suzuki I, Kanesaki Y, Mikami K, Kanehisa M, Murata N., Mol. Microbiol. 40(1), 2001
PMID: 11298290
The CopRS two-component system is responsible for resistance to copper in the cyanobacterium Synechocystis sp. PCC 6803.
Giner-Lamia J, Lopez-Maury L, Reyes JC, Florencio FJ., Plant Physiol. 159(4), 2012
PMID: 22715108


PMID: 27679480