Information extraction from text for deep domain knowledge graph population. Extracting pre-clinical outcomes in the domain of spinal cord injury

ter Horst H (2021)
Bielefeld: Universität Bielefeld.

Bielefeld E-Dissertation | English
 
Reviewer / Supervisor
Cimiano, Philipp
Abstract / Note
Every year, a vast amount of unstructured medical knowledge is described in thousands of pre-clinical studies made available through publicly accessible resources such as PubMed. The aggregation of such knowledge plays an important role in various medical applications such as therapy development in evidence-based medicine, where decisions are made on the basis of the best available evidence published in the literature so far. However, because these studies are written in natural language, manually aggregating the available information is tedious and time-consuming, and researchers can hardly perform it at scale. To address this issue, we are concerned with the automatic extraction of structured knowledge at a level of detail that supports evidence-based decision making. Specifically, we focus on automatically populating a deep domain knowledge graph with information from pre-clinical studies that describe experimental results in the area of spinal cord injury. An important challenge is that a single study contains multiple outcomes described by a total of up to 7,816 (dependent) study parameters. Since the problem of extracting all these parameters jointly is so far intractable, we propose a hierarchical architecture that incrementally predicts feasible substructures in a bottom-up fashion, relying on statistical inference and conditional random fields at the heart of our system. The main contribution of this work is the development of machine learning methods integrated into a holistic, domain-adapted information extraction system that is capable of predicting the full details of experimental outcomes as described in pre-clinical studies written in natural language. We present a general methodology for the extraction of deeply nested structures rooted in the paradigm of structure prediction and model-complete text comprehension. We further identify domain-specific challenges and provide adapted solutions.
We show how to efficiently evaluate complex nested structures predicted by our system and present a comprehensive evaluation to understand the extent to which it can be used with the depth required to support aggregation of evidence. We show that the information extraction results are satisfactory for many classes of our domain ontology and identify those which require further research.
Year
2021
Page(s)
222
Page URI
https://pub.uni-bielefeld.de/record/2959813

Cite

ter Horst H. Information extraction from text for deep domain knowledge graph population. Extracting pre-clinical outcomes in the domain of spinal cord injury. Bielefeld: Universität Bielefeld; 2021.
ter Horst, H. (2021). Information extraction from text for deep domain knowledge graph population. Extracting pre-clinical outcomes in the domain of spinal cord injury. Bielefeld: Universität Bielefeld. https://doi.org/10.4119/unibi/2959813
ter Horst, Hendrik. 2021. Information extraction from text for deep domain knowledge graph population. Extracting pre-clinical outcomes in the domain of spinal cord injury. Bielefeld: Universität Bielefeld.
ter Horst, H. (2021). Information extraction from text for deep domain knowledge graph population. Extracting pre-clinical outcomes in the domain of spinal cord injury. Bielefeld: Universität Bielefeld.
ter Horst, H., 2021. Information extraction from text for deep domain knowledge graph population. Extracting pre-clinical outcomes in the domain of spinal cord injury, Bielefeld: Universität Bielefeld.
H. ter Horst, Information extraction from text for deep domain knowledge graph population. Extracting pre-clinical outcomes in the domain of spinal cord injury, Bielefeld: Universität Bielefeld, 2021.
ter Horst, H.: Information extraction from text for deep domain knowledge graph population. Extracting pre-clinical outcomes in the domain of spinal cord injury. Universität Bielefeld, Bielefeld (2021).
ter Horst, Hendrik. Information extraction from text for deep domain knowledge graph population. Extracting pre-clinical outcomes in the domain of spinal cord injury. Bielefeld: Universität Bielefeld, 2021.
All files available under the following license(s):
Creative Commons Attribution-ShareAlike 4.0 International Public License (CC BY-SA 4.0)
Full text(s)
Name
Access Level
OA Open Access
Last uploaded
2021-12-08T18:33:01Z
MD5 checksum
c40b700d2f894a309febd173aecf4bc7

