Generation of multilingual ontology lexica with M-ATOLL : a corpus-based approach for the induction of ontology lexica

Walter S (2017)
Bielefeld: Universität Bielefeld.

Bielefeld Dissertation | English
Abstract
There is increasing interest in giving ordinary web users access to structured knowledge bases such as DBpedia, for example by means of question answering systems.
All such question answering systems have to map natural language input, whether spoken or written, to a formal representation in order to extract the correct answer from the target knowledge base. Systems that generate natural language text from a given knowledge base face the same mapping problem in the opposite direction. The main challenge is therefore how to map natural language to structured data and vice versa. To this end, such systems require knowledge about how the vocabulary elements used in the available datasets are verbalized in natural language, covering different verbalization variants. Multilinguality further increases the complexity of this challenge.
In this thesis we introduce M-ATOLL, a framework for automatically inducing ontology lexica in multiple languages, to find such verbalization variants.
We have instantiated the system for three languages (English, German and Spanish) by exploiting a set of language-specific dependency patterns for finding lexicalizations in text corpora. Additionally, we extended our framework to extract complex adjective lexicalizations with a machine-learning-based approach.
M-ATOLL is the first open-source and multilingual approach for the generation of ontology lexica. In this thesis we present grammatical patterns for three different languages, on which the extraction of lexicalizations relies. We provide an analysis of these patterns as well as a comparison with those proposed by other state-of-the-art systems. Additionally, we present a detailed evaluation comparing the different approaches with different settings on a publicly available gold standard, and discuss their potential and limitations.

Cite this

Walter S. Generation of multilingual ontology lexica with M-ATOLL : a corpus-based approach for the induction of ontology lexica. Bielefeld: Universität Bielefeld; 2017.
Walter, S. (2017). Generation of multilingual ontology lexica with M-ATOLL : a corpus-based approach for the induction of ontology lexica. Bielefeld: Universität Bielefeld.
Walter, S., 2017. Generation of multilingual ontology lexica with M-ATOLL : a corpus-based approach for the induction of ontology lexica, Bielefeld: Universität Bielefeld.
S. Walter, Generation of multilingual ontology lexica with M-ATOLL : a corpus-based approach for the induction of ontology lexica, Bielefeld: Universität Bielefeld, 2017.
Walter, S.: Generation of multilingual ontology lexica with M-ATOLL : a corpus-based approach for the induction of ontology lexica. Universität Bielefeld, Bielefeld (2017).
Walter, Sebastian. Generation of multilingual ontology lexica with M-ATOLL : a corpus-based approach for the induction of ontology lexica. Bielefeld: Universität Bielefeld, 2017.
All files available under the following license(s):
Copyright Statement:
This Item is protected by copyright and/or related rights. [...]
Main File(s)
Access Level: Open Access
Last Uploaded: 2017-02-02T10:07:23Z
MD5 Checksum: 4a5fcb1d7093ec11d8faa6ef20901a56

