Natural Family-Free Genomic Distance

Rubert D, Martinez FV, Dias Vieira Braga M (2020)
In: 20th International Workshop on Algorithms in Bioinformatics (WABI 2020). Kingsford C, Pisanti N (Eds); Leibniz International Proceedings in Informatics (LIPIcs), 172. Dagstuhl, Germany: Schloss Dagstuhl - Leibniz-Zentrum für Informatik: 3:1-3:23.

Conference paper | Published | English
 
Authors
Rubert, Diego; Martinez, Fábio V.; Dias Vieira Braga, Marília
Editors
Kingsford, Carl; Pisanti, Nadia
Abstract / Note
A classical problem in comparative genomics is to compute the rearrangement distance, that is, the minimum number of large-scale rearrangements required to transform a given genome into another given genome. While the most traditional approaches in this area are family-based, i.e., require the classification of DNA fragments of both genomes into families, more recently an alternative model was proposed which, instead of family classification, simply uses the pairwise similarities between DNA fragments of both genomes to compute their rearrangement distance. This model represents structural rearrangements by the generic double cut and join (DCJ) operation and is therefore called the family-free DCJ distance. It computes the DCJ distance between the two genomes by searching for a matching of their genes based on the given pairwise similarities, thereby helping to find gene homologies. The drawback is that its computation is NP-hard. Another point is that the family-free DCJ distance must correspond to a maximal matching of the genes, because unmatched genes are simply ignored: maximizing the matching prevents the free lunch artifact of empty or almost empty matchings yielding the smallest distances. In this paper, besides DCJ operations, we allow content-modifying operations of insertions and deletions of DNA segments and propose a new and more general family-free genomic distance. In our model we use the pairwise similarities to assign weights to both matched and unmatched genes, so that an optimal solution does not necessarily maximize the matching. Our model then results in a natural family-free genomic distance that takes into consideration all given genes and has a search space composed of matchings of any size. We provide an efficient ILP formulation to solve it, by extending the previous formulations for computing family-based genomic distances from Shao et al. (J. Comput. Biol., 2015) and Bohnenkämper et al. (Proc. of RECOMB, 2020).
Our experiments show that the ILP can handle not only bacterial genomes, but also fungi and insects, or sets of chromosomes of mammals and plants. In a comparison study of six fruit fly genomes, we obtained accurate results.
Publication year
2020
Proceedings title
20th International Workshop on Algorithms in Bioinformatics (WABI 2020)
Series or journal title
Leibniz International Proceedings in Informatics (LIPIcs)
Volume
172
Page(s)
3:1-3:23
Conference
20th International Workshop on Algorithms in Bioinformatics (WABI 2020)
Conference location
Pisa, Italy
Conference date
2020-09-07 – 2020-09-09
ISBN
978-3-95977-161-0
ISSN
1868-8969
Page URI
https://pub.uni-bielefeld.de/record/2965072

Cite

Rubert D, Martinez FV, Dias Vieira Braga M. Natural Family-Free Genomic Distance. In: Kingsford C, Pisanti N, eds. 20th International Workshop on Algorithms in Bioinformatics (WABI 2020). Leibniz International Proceedings in Informatics (LIPIcs). Vol 172. Dagstuhl, Germany: Schloss Dagstuhl - Leibniz-Zentrum für Informatik; 2020: 3:1-3:23.
Rubert, D., Martinez, F. V., & Dias Vieira Braga, M. (2020). Natural Family-Free Genomic Distance. In C. Kingsford & N. Pisanti (Eds.), Leibniz International Proceedings in Informatics (LIPIcs): Vol. 172. 20th International Workshop on Algorithms in Bioinformatics (WABI 2020) (pp. 3:1-3:23). Dagstuhl, Germany: Schloss Dagstuhl - Leibniz-Zentrum für Informatik. https://doi.org/10.4230/LIPIcs.WABI.2020.3
Rubert, Diego, Martinez, Fábio V., and Dias Vieira Braga, Marília. 2020. “Natural Family-Free Genomic Distance”. In 20th International Workshop on Algorithms in Bioinformatics (WABI 2020), ed. Carl Kingsford and Nadia Pisanti, 172:3:1-3:23. Leibniz International Proceedings in Informatics (LIPIcs). Dagstuhl, Germany: Schloss Dagstuhl - Leibniz-Zentrum für Informatik.
Rubert, D., Martinez, F. V., and Dias Vieira Braga, M. (2020). “Natural Family-Free Genomic Distance” in 20th International Workshop on Algorithms in Bioinformatics (WABI 2020), Kingsford, C., and Pisanti, N. eds. Leibniz International Proceedings in Informatics (LIPIcs), vol. 172, (Dagstuhl, Germany: Schloss Dagstuhl - Leibniz-Zentrum für Informatik), 3:1-3:23.
Rubert, D., Martinez, F.V., & Dias Vieira Braga, M., 2020. Natural Family-Free Genomic Distance. In C. Kingsford & N. Pisanti, eds. 20th International Workshop on Algorithms in Bioinformatics (WABI 2020). Leibniz International Proceedings in Informatics (LIPIcs). no.172 Dagstuhl, Germany: Schloss Dagstuhl - Leibniz-Zentrum für Informatik, pp. 3:1-3:23.
D. Rubert, F.V. Martinez, and M. Dias Vieira Braga, “Natural Family-Free Genomic Distance”, 20th International Workshop on Algorithms in Bioinformatics (WABI 2020), C. Kingsford and N. Pisanti, eds., Leibniz International Proceedings in Informatics (LIPIcs), vol. 172, Dagstuhl, Germany: Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2020, pp.3:1-3:23.
Rubert, D., Martinez, F.V., Dias Vieira Braga, M.: Natural Family-Free Genomic Distance. In: Kingsford, C. and Pisanti, N. (eds.) 20th International Workshop on Algorithms in Bioinformatics (WABI 2020). Leibniz International Proceedings in Informatics (LIPIcs). 172, p. 3:1-3:23. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, Dagstuhl, Germany (2020).
Rubert, Diego, Martinez, Fábio V., and Dias Vieira Braga, Marília. “Natural Family-Free Genomic Distance”. 20th International Workshop on Algorithms in Bioinformatics (WABI 2020). Ed. Carl Kingsford and Nadia Pisanti. Dagstuhl, Germany: Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2020. Vol. 172. Leibniz International Proceedings in Informatics (LIPIcs). 3:1-3:23.
All files available under the following license(s):
Creative Commons Attribution 4.0 International Public License (CC BY 4.0)
Full text(s)
Access Level
OA Open Access
Last uploaded
2022-08-12T11:15:53Z
MD5 checksum
d3e63111bc692e559ec2bb3e9240a3d1

