GraphTeams. A method for discovering spatial gene clusters in Hi-C sequencing data

Schulz T, Stoye J, Dörr D (2018)
BMC Genomics 19(Suppl. 5): 308.

Journal article | Published | English
OA 1.42 MB
Abstract / Notes
Background: Hi-C sequencing offers novel, cost-effective means to study the spatial conformation of chromosomes. We use data obtained from Hi-C experiments to provide new evidence for the existence of spatial gene clusters. These are sets of genes with associated functionality that exhibit close proximity to each other in the spatial conformation of chromosomes across several related species.

Results: We present the first gene cluster model capable of handling spatial data. Our model generalizes a popular computational model for gene cluster prediction, called δ-teams, from sequences to graphs. Following previous lines of research, we subsequently extend our model to allow for several vertices being associated with the same label. This model, called δ-teams with families, is particularly suitable for our application as it enables handling of gene duplicates. We develop algorithmic solutions for both models. We implemented the algorithm for discovering δ-teams with families and integrated it into a fully automated workflow for discovering gene clusters in Hi-C data, called GraphTeams. We applied it to human and mouse data to find intra- and interchromosomal gene cluster candidates. The results include intrachromosomal clusters that seem to exhibit a closer proximity in space than on their chromosomal DNA sequence. We further discovered interchromosomal gene clusters that contain genes from different chromosomes within the human genome, but are located on a single chromosome in mouse.

Conclusions: By identifying δ-teams with families, we provide a flexible model to discover gene cluster candidates in Hi-C data. Our analysis of Hi-C data from human and mouse reveals several known gene clusters (thus validating our approach), but also a few sparsely studied or possibly unknown gene cluster candidates that could be the source of further experimental investigations.
Spatial gene cluster; Gene teams; Single-linkage clustering; Graph teams; Hi-C data
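
The idea of δ-teams on graphs can be illustrated with a small sketch. It is not the GraphTeams implementation and it does not handle gene families. It assumes a simplified reading of the model: each genome is a weighted graph over the same set of genes, two genes are "linked" if their shortest-path distance in the whole graph is at most δ, and a δ-team is a maximal gene set that is connected by such links (single linkage) in both graphs. The function names `all_pairs_distances`, `delta_components`, and `delta_teams`, as well as the toy graphs, are hypothetical and chosen only for illustration.

```python
def all_pairs_distances(nodes, edges):
    """Floyd-Warshall shortest-path distances on an undirected weighted graph."""
    inf = float("inf")
    dist = {u: {v: (0 if u == v else inf) for v in nodes} for u in nodes}
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def delta_components(genes, dist, delta):
    """Single-linkage grouping: genes u, v are linked if dist[u][v] <= delta."""
    remaining = set(genes)
    components = []
    while remaining:
        seed = remaining.pop()
        comp = {seed}
        stack = [seed]
        while stack:
            u = stack.pop()
            near = {v for v in remaining if dist[u][v] <= delta}
            remaining -= near
            comp |= near
            stack.extend(near)
        components.append(frozenset(comp))
    return components


def delta_teams(genes, dist_g, dist_h, delta):
    """Refine the gene set until every part is delta-connected in both graphs."""
    teams = []
    todo = [frozenset(genes)]
    while todo:
        s = todo.pop()
        for dist in (dist_g, dist_h):
            parts = delta_components(s, dist, delta)
            if len(parts) > 1:
                todo.extend(parts)  # s is not a team; examine its parts instead
                break
        else:
            teams.append(s)  # s stayed in one piece in both graphs
    return teams


# Toy data (made up): the same four genes in two graphs with different layouts.
nodes = ["a", "b", "c", "d"]
dist_g = all_pairs_distances(nodes, [("a", "b", 1), ("b", "c", 1), ("c", "d", 5)])
dist_h = all_pairs_distances(nodes, [("a", "b", 1), ("b", "c", 4), ("c", "d", 1)])

teams = delta_teams(nodes, dist_g, dist_h, delta=2)
print([sorted(t) for t in teams if len(t) > 1])  # prints [['a', 'b']]
```

Every gene set that is δ-connected in both graphs stays inside a single part at each split, so the parts that survive are exactly the maximal ones. This simple refinement recomputes components from scratch at every step and is meant for readability, not efficiency.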
BMC Genomics
Suppl. 5
Open access publication costs were funded by the Deutsche Forschungsgemeinschaft and Bielefeld University.
Schulz T, Stoye J, Dörr D. GraphTeams. A method for discovering spatial gene clusters in Hi-C sequencing data. BMC Genomics. 2018;19(Suppl. 5): 308.
Schulz, T., Stoye, J., & Dörr, D. (2018). GraphTeams. A method for discovering spatial gene clusters in Hi-C sequencing data. BMC Genomics, 19(Suppl. 5), 308. doi:10.1186/s12864-018-4622-0
Schulz, Tizian, Stoye, Jens, and Dörr, Daniel. 2018. “GraphTeams. A method for discovering spatial gene clusters in Hi-C sequencing data”. BMC Genomics 19 (Suppl. 5): 308.
Schulz, T., Stoye, J., and Dörr, D. (2018). GraphTeams. A method for discovering spatial gene clusters in Hi-C sequencing data. BMC Genomics 19:308.
Schulz, T., Stoye, J., & Dörr, D., 2018. GraphTeams. A method for discovering spatial gene clusters in Hi-C sequencing data. BMC Genomics, 19(Suppl. 5): 308.
T. Schulz, J. Stoye, and D. Dörr, “GraphTeams. A method for discovering spatial gene clusters in Hi-C sequencing data”, BMC Genomics, vol. 19, Suppl. 5, 2018, 308.
Schulz, T., Stoye, J., Dörr, D.: GraphTeams. A method for discovering spatial gene clusters in Hi-C sequencing data. BMC Genomics. 19(Suppl. 5), 308 (2018).
Schulz, Tizian, Stoye, Jens, and Dörr, Daniel. “GraphTeams. A method for discovering spatial gene clusters in Hi-C sequencing data”. BMC Genomics 19.Suppl. 5 (2018): 308.
All files are available under the following license(s):
Copyright Statement:
This object is protected by copyright and/or related rights. [...]
Access Level
OA Open Access

PMID: 29745835