Toward the Reconciliation of Inconsistent Molecular Structures from Biochemical Databases
Eriksen CA, Andersen JL, Fagerberg R, Merkle D (2024)
Journal of Computational Biology.
Journal article
| Epub ahead of print | English
Authors
Eriksen, Casper Asbjorn;
Andersen, Jakob Lykke;
Fagerberg, Rolf;
Merkle, Daniel (UniBi)
Abstract / Notes
Information on the structure of molecules, retrieved via biochemical databases, plays a pivotal role in various disciplines, including metabolomics, systems biology, and drug discovery. No such database can be complete, and it is often necessary to incorporate data from several sources. However, the molecular structure for a given compound is not necessarily consistent between databases. This article presents StructRecon, a novel tool for resolving unique molecular structures from database identifiers. Currently, identifiers from BiGG, ChEBI, Escherichia coli Metabolome Database (ECMDB), MetaNetX, and PubChem are supported. StructRecon traverses the cross-links between entries in different databases to construct what we call identifier graphs. The goal of these graphs is to offer a more complete view of the total information available on a given compound across all the supported databases. To reconcile discrepancies met during the traversal of the databases, we develop an extensible model for molecular structure supporting multiple independent levels of detail, which allows standardization of the structure to be applied iteratively. In some cases, our standardization approach results in multiple candidate structures for a given compound, in which case a random walk-based algorithm is used to select the most likely structure among incompatible alternatives. As a case study, we applied StructRecon to the EColiCore2 model. We found at least one structure for 98.66% of its compounds, which is more than twice as many as can be found when using the databases in more standard ways that do not consider the complex network of cross-database references captured by our identifier graphs. StructRecon is open-source and modular, which enables support for more databases in the future.
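The abstract names two steps: traversing cross-links to build an identifier graph, and using a random walk to choose among incompatible candidate structures. The sketch below illustrates that general idea only. It is not StructRecon's code or its published algorithm: the random walk is replaced here by a generic PageRank-style power iteration, and all identifiers (`db_a:1`, `structure:X`, and so on), function names, and the damping value are hypothetical, chosen purely for illustration.

```python
from collections import deque


def build_identifier_graph(start, cross_refs):
    """Breadth-first traversal of cross-references from one identifier.

    `cross_refs` maps an identifier to the identifiers (or structures)
    its database entry links to. This mapping is a hypothetical stand-in
    for real database lookups.
    """
    graph = {}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node in graph:
            continue  # already visited
        graph[node] = list(cross_refs.get(node, []))
        queue.extend(graph[node])
    return graph


def rank_by_random_walk(graph, damping=0.85, iterations=100):
    """PageRank-style power iteration (an assumed stand-in technique)."""
    nodes = list(graph)
    n = len(nodes)
    score = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        nxt = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            # A node without outgoing links spreads its mass uniformly.
            targets = graph[v] or nodes
            share = damping * score[v] / len(targets)
            for t in targets:
                nxt[t] += share
        score = nxt
    return score


# Toy data: two databases point (directly or indirectly) at structure X,
# one points at the conflicting structure Y.
cross_refs = {
    "db_a:1": ["db_b:1", "db_c:1"],
    "db_b:1": ["structure:X"],
    "db_c:1": ["db_b:1", "db_d:1"],
    "db_d:1": ["structure:Y"],
}

graph = build_identifier_graph("db_a:1", cross_refs)
scores = rank_by_random_walk(graph)
candidates = [v for v in graph if v.startswith("structure:")]
best = max(candidates, key=scores.get)
print(best)
```

In this toy graph the structure reached by more cross-reference paths receives the higher score, which is the intuition behind preferring the candidate that the network of database links supports most strongly.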
Publication year
2024
Journal title
Journal of Computational Biology
eISSN
1557-8666
Page URI
https://pub.uni-bielefeld.de/record/2989895
Cite
Eriksen, C. A., Andersen, J. L., Fagerberg, R., & Merkle, D. (2024). Toward the Reconciliation of Inconsistent Molecular Structures from Biochemical Databases. Journal of Computational Biology. https://doi.org/10.1089/cmb.2024.0520
Sources
PMID: 38758924