Frankfurt Latin Lexicon: From Morphological Expansion to Latin Word Embeddings and Lexical Networks

Mehler A, Geelhaar T (2019)
Presented at the LiLa: Linking Latin, Milano, Italy.

Conference contribution | English
 
Authors
Mehler, Alexander; Geelhaar, Tim (UniBi)
Abstract / Note
We present the Frankfurt Latin Lexicon (FLL) as a lexical resource used by us in a number of NLP tasks of preprocessing Latin texts such as morphological tagging, lemmatization, and POS tagging. FLL was developed with the help of several source lexicons and taggers. First, a large number of so-called superlemmas were collected, then variants (lemmata) were differentiated for each superlemma, and finally a rule-based morphological expansion was carried out for each of these lemmas. The resulting lexicon is used for human computation, according to which its entries are continuously checked and, if necessary, corrected by registered expert users. FLL serves as a reference lexicon of TEILex, a system for integrating lexica and text corpora, in which the tokens of a corpus are linked with their lexicon entries in such a way that updates of the lexicon are immediately transferred to the linked corpora and vice versa. In this way, expert-based lexicon modeling becomes independent of indexing the underlying corpus. The paper describes the use of FLL and TEILex as two text-technological resources in different tasks such as morphological tagging, lemmatization and POS tagging as well as in the calculation of so-called Wikiditions of Latin corpora and their distribution via the website CompHistSem (http://www.comphistsem.org/home.html). Furthermore, the extension of FLL by various methods for the calculation of word embeddings is presented and illustrated by lexical networks and their analysis for the purpose of text classification. In this sense, the paper spans a spectrum from resource development (FLL) and NLP (tagging and lemmatization) to text mining (word embeddings and lexical networks) and the provision of research infrastructures and NLP pipelines (TextImager, eHumanities Desktop and CompHistSem).
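The three construction steps named in the abstract (collect superlemmas, differentiate their lemma variants, expand each lemma by rule) can be pictured with a small Python sketch. It is not the FLL implementation: the `Superlemma` class, the `expand` function, the `FIRST_DECLENSION` table, the `NN` tag and the example entry are all hypothetical, and the rule set covers only the Latin first declension.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Toy rule set (assumption): endings of the Latin first declension only.
FIRST_DECLENSION = {
    ("nom", "sg"): "a", ("gen", "sg"): "ae", ("dat", "sg"): "ae",
    ("acc", "sg"): "am", ("abl", "sg"): "a",
    ("nom", "pl"): "ae", ("gen", "pl"): "arum", ("dat", "pl"): "is",
    ("acc", "pl"): "as", ("abl", "pl"): "is",
}


@dataclass
class Superlemma:
    """Hypothetical record: one lexeme plus its spelling variants (lemmas)."""
    name: str
    pos: str
    lemmas: list = field(default_factory=list)


def expand(superlemma):
    """Map every generated word form to all analyses that produce it."""
    index = defaultdict(list)
    for lemma in superlemma.lemmas:
        if not lemma.endswith("a"):
            raise ValueError(f"not a first-declension lemma: {lemma}")
        stem = lemma[:-1]  # strip the nominative singular ending
        for (case, number), ending in FIRST_DECLENSION.items():
            index[stem + ending].append((superlemma.name, lemma, case, number))
    return dict(index)


# Illustrative entry: a classical spelling and a medieval variant.
entry = Superlemma("ecclesia", "NN", ["ecclesia", "aecclesia"])
index = expand(entry)

print(index["aecclesiam"])  # the variant form resolves to the superlemma
print(index["ecclesiae"])   # one form, several analyses (syncretism)
```

Keeping a list of analyses per form, rather than a single one, reflects that a Latin word form is often ambiguous between several case and number readings; a tagger or an expert user would then choose among them in context.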
Year of publication
2019
Conference
LiLa: Linking Latin
Conference location
Milano, Italy
Conference date
2019-06-03 – 2019-06-04
Page URI
https://pub.uni-bielefeld.de/record/2955256

Cite

Mehler A, Geelhaar T. Frankfurt Latin Lexicon: From Morphological Expansion to Latin Word Embeddings and Lexical Networks. Presented at the LiLa: Linking Latin, Milano, Italy.
Mehler, A., & Geelhaar, T. (2019). Frankfurt Latin Lexicon: From Morphological Expansion to Latin Word Embeddings and Lexical Networks. Presented at the LiLa: Linking Latin, Milano, Italy. https://doi.org/10.13140/RG.2.2.24342.40007
Mehler, A., and Geelhaar, T. (2019). “Frankfurt Latin Lexicon: From Morphological Expansion to Latin Word Embeddings and Lexical Networks”. Presented at the LiLa: Linking Latin, Milano, Italy.
Mehler, A., & Geelhaar, T., 2019. Frankfurt Latin Lexicon: From Morphological Expansion to Latin Word Embeddings and Lexical Networks. Presented at the LiLa: Linking Latin, Milano, Italy.
A. Mehler and T. Geelhaar, “Frankfurt Latin Lexicon: From Morphological Expansion to Latin Word Embeddings and Lexical Networks”, Presented at the LiLa: Linking Latin, Milano, Italy, 2019.
Mehler, A., Geelhaar, T.: Frankfurt Latin Lexicon: From Morphological Expansion to Latin Word Embeddings and Lexical Networks. Presented at the LiLa: Linking Latin, Milano, Italy (2019).
Mehler, Alexander, and Geelhaar, Tim. “Frankfurt Latin Lexicon: From Morphological Expansion to Latin Word Embeddings and Lexical Networks”. Presented at the LiLa: Linking Latin, Milano, Italy, 2019.
