Automation in Graph-Based Data Integration and Mapping
Friedrichs M (2022)
In: Integrative Bioinformatics. History and Future. Chen M, Hofestädt R (Eds); Singapore: Springer Singapore: 97-110.
Book chapter
| Published | English
Author
Friedrichs, Marcel
Editors
Chen, Ming;
Hofestädt, Ralf
Institution
Abstract / Note
Data integration plays a vital role in scientific research. In biomedical research, OMICS fields such as proteomics and pharmacogenomics, and even newer fields like foodomics, have shown the need for increasingly large datasets. In 2019, Nucleic Acids Research counted 1637 databases, which account for only a fraction of all data sources available online. Data integration efforts need to process large amounts of heterogeneous data in different formats, ranging from simple files to complex relational databases and, increasingly, graph databases. Aside from data formats, availability is another obstacle: files may be available for direct download, require a user account, or be accessible only through an application programming interface (API). Keeping data sources up to date is important in order to make use of the latest discoveries in the respective fields, retrieve error corrections, and potentially mitigate issues with other data sources that reference newly added entities. Finally, all data sources provide information on certain entities and in most cases use specific identification systems. In the best case, data sources provide cross-references to other data sources. To generate robust mappings between all required data sources, identifiers of good quality need to be selected to form new connections between the entities. All of these vital steps and issues of data integration and mapping benefit from automation and can for the most part be fully automated. Workflow systems and integration tools are capable of automating different elements of the aforementioned steps and require varying levels of computer science skills. This chapter describes these issues and explores the potential of BioDWH2, a fully automated, graph-based data integration and mapping tool.
Year of publication
2022
Book title
Integrative Bioinformatics. History and Future
Page(s)
97-110
ISBN
978-981-16-6794-7
eISBN
978-981-16-6795-4
Page URI
https://pub.uni-bielefeld.de/record/2963200
Cite
Friedrichs, M. (2022). Automation in Graph-Based Data Integration and Mapping. In M. Chen & R. Hofestädt (Eds.), Integrative Bioinformatics. History and Future (pp. 97-110). Singapore: Springer Singapore. https://doi.org/10.1007/978-981-16-6795-4_5