Abstract / Note
Background: Genes that occur co-localized in multiple genomes can be strong indicators of either functional constraints on genome organization or remnant ancestral gene order. The computational detection of these patterns, usually referred to as gene clusters, has become increasingly sensitive over the past decade. The most powerful approaches allow for various types of imperfect cluster conservation. Cluster locations may be internally rearranged. Individual cluster locations may contain only a subset of the cluster genes and may be disrupted by uninvolved genes. Moreover, a cluster may not occur at all in some or even most of the studied genomes. Detecting such low-quality clusters increases the risk of mistaking faint patterns that occur merely by chance for genuine findings. It is therefore crucial to estimate the significance of computational gene cluster predictions and to discriminate between true conservation and coincidental clustering.

Results: In this paper, we present an efficient and accurate approach to estimating the significance of gene cluster predictions under the approximate common intervals model. Given a single gene cluster prediction, we calculate the probability of observing it with the same or a higher degree of conservation under the null hypothesis of random gene order, and add a correction factor to account for multiple testing. Our approach considers all parameters that define the quality of gene cluster conservation: the number of genomes in which the cluster occurs, the number of involved genes, the degree of conservation in the different genomes, and the frequency of the clustered genes within each genome. We apply our approach to evaluate gene cluster predictions in a large set of well-annotated genomes.
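To make the idea of a significance estimate under random gene order concrete, the following is a minimal Monte Carlo sketch. It is not the method of the paper, and every name and modelling choice in it is an assumption made for illustration: each genome is a linear sequence in which every gene occurs exactly once, an approximate occurrence is any window of `window` consecutive positions holding at least `min_hits` of the cluster genes, genomes are treated as independent and equally sized, and the multiple-testing correction is a plain Bonferroni factor.

```python
import random
from math import comb


def has_occurrence(positions, window, min_hits):
    """Return True if some window of `window` consecutive positions
    contains at least `min_hits` of the given gene positions."""
    pos = sorted(positions)
    # min_hits genes fit into one window iff the span from the first
    # to the last of them is shorter than the window length.
    for i in range(len(pos) - min_hits + 1):
        if pos[i + min_hits - 1] - pos[i] < window:
            return True
    return False


def occurrence_probability(n_genes, cluster_size, window, min_hits,
                           trials=20000, seed=0):
    """Estimate the chance of an approximate occurrence in one genome
    with random gene order (hypothetical, simplified null model)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # In a random permutation, the cluster genes occupy a uniformly
        # random set of distinct positions.
        positions = rng.sample(range(n_genes), cluster_size)
        if has_occurrence(positions, window, min_hits):
            hits += 1
    return hits / trials


def binomial_tail(p, n_genomes, min_genomes):
    """Probability that at least `min_genomes` of `n_genomes` independent
    genomes show an occurrence, each with probability p."""
    return sum(comb(n_genomes, i) * p**i * (1 - p)**(n_genomes - i)
               for i in range(min_genomes, n_genomes + 1))


def corrected_p_value(p, n_tests):
    """Bonferroni correction (an assumption, not the paper's factor)."""
    return min(1.0, p * n_tests)


if __name__ == "__main__":
    # Illustrative parameters only; none of these come from the paper.
    p_single = occurrence_probability(n_genes=2000, cluster_size=6,
                                      window=10, min_hits=4)
    p_cluster = binomial_tail(p_single, n_genomes=20, min_genomes=5)
    print(corrected_p_value(p_cluster, n_tests=1000))
```

The sketch only shows how the quantities named in the abstract (number of genomes, number of genes, degree of conservation) enter a null-model probability; simulation of this kind becomes impractical for the very small probabilities that matter in practice.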
Suppl 15: Proc. of RECOMB-CG 2013
Jahn K, Winter S, Stoye J, Böcker S. Statistics for approximate gene clusters. BMC Bioinformatics. 2013;14(Suppl 15: Proc. of RECOMB-CG 2013):S14.
Jahn, K., Winter, S., Stoye, J., & Böcker, S. (2013). Statistics for approximate gene clusters. BMC Bioinformatics, 14(Suppl 15: Proc. of RECOMB-CG 2013), S14. doi:10.1186/1471-2105-14-S15-S14
All files are available under the following license(s):
This Item is protected by copyright and/or related rights. [...]