Approximating the DCJ distance of balanced genomes in linear time

Rubert DP, Feijão P, Dias Vieira Braga M, Stoye J, Martinez FHV (2017)
Algorithms for Molecular Biology 12: 3.

Journal Article | Original Article | Published | English
Abstract
Background: Rearrangements are large-scale mutations in genomes, responsible for complex changes and structural variations. Most rearrangements that modify the organization of a genome can be represented by the double cut and join (DCJ) operation. Given two balanced genomes, i.e., two genomes that have exactly the same number of occurrences of each gene in each genome, we are interested in the problem of computing the rearrangement distance between them, i.e., finding the minimum number of DCJ operations that transform one genome into the other. This problem is known to be NP-hard.
Results: We propose a linear time approximation algorithm with approximation factor O(k) for the DCJ distance problem, where k is the maximum number of occurrences of any gene in the input genomes. Our algorithm works for linear and circular unichromosomal balanced genomes and uses as an intermediate step an O(k)-approximation for the minimum common string partition problem, which is closely related to the DCJ distance problem.
Conclusions: Experiments on simulated data sets show that our approximation algorithm is very competitive both in efficiency and in quality of the solutions.
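To make the two input conditions from the abstract concrete, the following minimal sketch checks whether two genomes are balanced and computes k, the maximum number of occurrences of any gene. It is only an illustration and not the authors' algorithm. The representation of a genome as a list of signed integers (sign standing for gene orientation) and the function names gene_counts, is_balanced and max_occurrence are assumptions introduced here.

```python
from collections import Counter


def gene_counts(genome):
    """Count occurrences of each gene, ignoring orientation (the sign).

    Assumption: a genome is a list of non-zero signed integers.
    """
    return Counter(abs(gene) for gene in genome)


def is_balanced(genome_a, genome_b):
    """Return True if every gene occurs equally often in both genomes."""
    return gene_counts(genome_a) == gene_counts(genome_b)


def max_occurrence(*genomes):
    """Return k, the largest number of occurrences of any gene in any genome."""
    return max(
        (count for genome in genomes for count in gene_counts(genome).values()),
        default=0,
    )


if __name__ == "__main__":
    a = [1, 2, -3, 1, 2]
    b = [2, 1, 1, -2, 3]
    print(is_balanced(a, b))     # both contain gene 1 twice, gene 2 twice, gene 3 once
    print(max_occurrence(a, b))  # value of k for this pair
```

Both helpers run in time linear in the genome lengths, so checking the preconditions does not dominate a linear time approximation.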
Financial disclosure
Article Processing Charge funded by the Deutsche Forschungsgemeinschaft and the Open Access Publication Fund of Bielefeld University.

Cite this

Rubert DP, Feijão P, Dias Vieira Braga M, Stoye J, Martinez FHV. Approximating the DCJ distance of balanced genomes in linear time. Algorithms for Molecular Biology. 2017;12: 3.
Rubert, D. P., Feijão, P., Dias Vieira Braga, M., Stoye, J., & Martinez, F. H. V. (2017). Approximating the DCJ distance of balanced genomes in linear time. Algorithms for Molecular Biology, 12, 3. doi:10.1186/s13015-017-0095-y
Rubert, D. P., Feijão, P., Dias Vieira Braga, M., Stoye, J., and Martinez, F. H. V. (2017). Approximating the DCJ distance of balanced genomes in linear time. Algorithms for Molecular Biology 12:3.
Rubert, D.P., et al., 2017. Approximating the DCJ distance of balanced genomes in linear time. Algorithms for Molecular Biology, 12: 3.
D.P. Rubert, et al., “Approximating the DCJ distance of balanced genomes in linear time”, Algorithms for Molecular Biology, vol. 12, 2017, 3.
Rubert, D.P., Feijão, P., Dias Vieira Braga, M., Stoye, J., Martinez, F.H.V.: Approximating the DCJ distance of balanced genomes in linear time. Algorithms for Molecular Biology. 12, 3 (2017).
Rubert, Diego P., Feijão, Pedro, Dias Vieira Braga, Marília, Stoye, Jens, and Martinez, Fábio Henrique Viduani. “Approximating the DCJ distance of balanced genomes in linear time”. Algorithms for Molecular Biology 12 (2017): 3.
All files available under the following license(s):
Copyright Statement:
This Item is protected by copyright and/or related rights. [...]
Main File(s)
Access Level
OA Open Access
Last Uploaded
2017-09-27T12:37:24Z



Sources

PMID: 28293275