Gene Orthology Inference via Large-Scale Rearrangements for Partially Assembled Genomes
Rubert D, Dias Vieira Braga M (2022)
In: 22nd International Workshop on Algorithms in Bioinformatics (WABI 2022). Boucher C, Rahmann S (Eds); Leibniz International Proceedings in Informatics (LIPIcs), 242. Dagstuhl: Schloss Dagstuhl, Leibniz-Zentrum für Informatik.
Conference paper
| Published | English
Editors
Boucher, Christina;
Rahmann, Sven
Abstract / Note
Recently we developed a gene orthology inference tool based on genome rearrangements (Journal of
Bioinformatics and Computational Biology 19:6, 2021). Given a set of genomes, our method first
computes all pairwise gene similarities. Then it runs pairwise ILP comparisons to compute optimal
gene matchings, which minimize, by taking the similarities into account, the weighted rearrangement
distance between the analyzed genomes (a problem that is NP-hard). The gene matchings are
then integrated into gene families in the final step. Although the ILP is quite efficient and could
conceptually analyze genomes that are not completely assembled but split into several contigs, our
tool failed to complete that task. The main reason is that each pairwise ILP comparison includes
an optimal capping that connects each end of a linear segment of one genome to an end of a linear
segment in the other genome, producing an exponential increase of the search space.
In this work, we design and implement a heuristic capping algorithm that replaces the optimal
capping by clustering (based on their gene content intersections) the linear segments into m ≥ 1
subsets, whose ends are capped independently. Furthermore, in each subset, instead of allowing all
possible connections, we let only the ends of content-related segments be connected. Although there
is no guarantee that m is much bigger than one, and with the possible side effect of resulting in
suboptimal instead of optimal gene matchings, the heuristic works very well in practice, in terms of both
speed and the quality of the computed solutions. Our experiments on real data show that
we can now efficiently analyze fruit fly genomes with unfinished assemblies distributed in hundreds or
even thousands of contigs, obtaining orthologies that are more similar to FlyBase orthologies than
those computed by other inference tools. Moreover, for complete assemblies the
version with heuristic capping reports orthologies that are very similar to the orthologies computed
by the optimal version of our tool. Our approach is implemented in a pipeline incorporating the
pre-computation of gene similarities.
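The clustering idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract only states that linear segments are clustered by their gene content intersections, so the sketch assumes that each segment is given as a set of gene labels (similar genes sharing a label), that two segments are content-related when these sets intersect, and that the subsets are the connected components of that relation. The function names `cluster_segments` and `candidate_end_pairs` and the toy data are hypothetical.

```python
from itertools import combinations


def cluster_segments(segments):
    """Group segments into subsets with a small union-find.

    `segments` maps a segment name to the set of gene labels it contains.
    Assumption: two segments are content-related when their sets intersect.
    """
    parent = {name: name for name in segments}

    def find(x):
        # Follow parent pointers to the representative, halving the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(segments, 2):
        if segments[a] & segments[b]:
            parent[find(a)] = find(b)

    clusters = {}
    for name in segments:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(sorted(c) for c in clusters.values())


def candidate_end_pairs(genome_a, genome_b):
    """Segment pairs (one per genome) whose ends may be connected.

    Only content-related pairs are kept, instead of all possible pairs.
    """
    return sorted(
        (a, b)
        for a, genes_a in genome_a.items()
        for b, genes_b in genome_b.items()
        if genes_a & genes_b
    )


# Toy input: three contigs in genome A, two in genome B (hypothetical data).
genome_a = {"A1": {"g1", "g2"}, "A2": {"g3"}, "A3": {"g4", "g5"}}
genome_b = {"B1": {"g2", "g3"}, "B2": {"g5"}}

clusters = cluster_segments({**genome_a, **genome_b})
pairs = candidate_end_pairs(genome_a, genome_b)

print(clusters)  # [['A1', 'A2', 'B1'], ['A3', 'B2']], so m = 2
print(pairs)     # [('A1', 'B1'), ('A2', 'B1'), ('A3', 'B2')]
```

In this toy case the five segments fall into m = 2 subsets, and only 3 of the 6 possible segment pairs remain as candidates for capping. If all segments share content, the sketch returns a single subset, which matches the caveat that m is not guaranteed to be much bigger than one.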
Publication year
2022
Title of conference proceedings
22nd International Workshop on Algorithms in Bioinformatics (WABI 2022)
Series or journal title
Leibniz International Proceedings in Informatics (LIPIcs)
Volume
242
Article no.
24
Copyright / Licenses
Conference
22nd International Workshop on Algorithms in Bioinformatics (WABI 2022)
Conference location
Dagstuhl, Germany
Conference date
2022-09-05 – 2022-09-07
ISBN
978-3-95977-243-3
Page URI
https://pub.uni-bielefeld.de/record/2968237
Cite
Rubert D, Dias Vieira Braga M. Gene Orthology Inference via Large-Scale Rearrangements for Partially Assembled Genomes. In: Boucher C, Rahmann S, eds. 22nd International Workshop on Algorithms in Bioinformatics (WABI 2022). Leibniz International Proceedings in Informatics (LIPIcs). Vol 242. Dagstuhl: Schloss Dagstuhl, Leibniz-Zentrum für Informatik; 2022.
Rubert, D., & Dias Vieira Braga, M. (2022). Gene Orthology Inference via Large-Scale Rearrangements for Partially Assembled Genomes. In C. Boucher & S. Rahmann (Eds.), Leibniz International Proceedings in Informatics (LIPIcs): Vol. 242. 22nd International Workshop on Algorithms in Bioinformatics (WABI 2022). Dagstuhl: Schloss Dagstuhl, Leibniz-Zentrum für Informatik. https://doi.org/10.4230/LIPIcs.WABI.2022.24
Rubert, Diego, and Dias Vieira Braga, Marília. 2022. “Gene Orthology Inference via Large-Scale Rearrangements for Partially Assembled Genomes”. In 22nd International Workshop on Algorithms in Bioinformatics (WABI 2022), ed. Christina Boucher and Sven Rahmann. Vol. 242. Leibniz International Proceedings in Informatics (LIPIcs). Dagstuhl: Schloss Dagstuhl, Leibniz-Zentrum für Informatik: 24.
Rubert, D., and Dias Vieira Braga, M. (2022). “Gene Orthology Inference via Large-Scale Rearrangements for Partially Assembled Genomes” in 22nd International Workshop on Algorithms in Bioinformatics (WABI 2022), Boucher, C., and Rahmann, S. eds. Leibniz International Proceedings in Informatics (LIPIcs), vol. 242, (Dagstuhl: Schloss Dagstuhl, Leibniz-Zentrum für Informatik).
Rubert, D., & Dias Vieira Braga, M., 2022. Gene Orthology Inference via Large-Scale Rearrangements for Partially Assembled Genomes. In C. Boucher & S. Rahmann, eds. 22nd International Workshop on Algorithms in Bioinformatics (WABI 2022). Leibniz International Proceedings in Informatics (LIPIcs). no. 242. Dagstuhl: Schloss Dagstuhl, Leibniz-Zentrum für Informatik.
D. Rubert and M. Dias Vieira Braga, “Gene Orthology Inference via Large-Scale Rearrangements for Partially Assembled Genomes”, 22nd International Workshop on Algorithms in Bioinformatics (WABI 2022), C. Boucher and S. Rahmann, eds., Leibniz International Proceedings in Informatics (LIPIcs), vol. 242, Dagstuhl: Schloss Dagstuhl, Leibniz-Zentrum für Informatik, 2022.
Rubert, D., Dias Vieira Braga, M.: Gene Orthology Inference via Large-Scale Rearrangements for Partially Assembled Genomes. In: Boucher, C. and Rahmann, S. (eds.) 22nd International Workshop on Algorithms in Bioinformatics (WABI 2022). Leibniz International Proceedings in Informatics (LIPIcs). 242, Schloss Dagstuhl, Leibniz-Zentrum für Informatik, Dagstuhl (2022).
Rubert, Diego, and Dias Vieira Braga, Marília. “Gene Orthology Inference via Large-Scale Rearrangements for Partially Assembled Genomes”. 22nd International Workshop on Algorithms in Bioinformatics (WABI 2022). Ed. Christina Boucher and Sven Rahmann. Dagstuhl: Schloss Dagstuhl, Leibniz-Zentrum für Informatik, 2022. Vol. 242. Leibniz International Proceedings in Informatics (LIPIcs).
All files available under the following license(s):
Creative Commons Attribution 4.0 International Public License (CC BY 4.0)
Full text(s)
Name
LIPIcs-WABI-2022-24.pdf
1.37 MB
Access Level
Open Access
Last uploaded
2023-01-16T13:35:35Z
MD5 checksum
f4991b526e495eb274dced62d435540b