Design and Development of Part-of-Speech-Tagging Resources for Wolof (Niger-Congo, spoken in Senegal)

Dione CMB, Kuhn J, Zarrieß S (2010)
In: Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10). Valletta, Malta: European Language Resources Association (ELRA).

Conference paper | English
 
Author(s)
Dione, Cheikh M. Bamba; Kuhn, Jonas; Zarrieß, Sina (UniBi)
Abstract / Note
In this paper, we report on the design of a part-of-speech-tagset for Wolof and on the creation of a semi-automatically annotated gold standard. In order to achieve high-quality annotation relatively fast, we first generated an accurate lexicon that draws on existing word and name lists and takes into account inflectional and derivational morphology. The main motivation for the tagged corpus is to obtain data for training automatic taggers with machine learning approaches. Hence, we took machine learning considerations into account during tagset design and we present training experiments as part of this paper. The best automatic tagger achieves an accuracy of 95.2% in cross-validation experiments. We also wanted to create a basis for experimenting with annotation projection techniques, which exploit parallel corpora. For this reason, it was useful to use a part of the Bible as the gold standard corpus, for which sentence-aligned parallel versions in many languages are easy to obtain. We also report on preliminary experiments exploiting a statistical word alignment of the parallel text.
Year of publication
2010
Title of the proceedings
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Page URI
https://pub.uni-bielefeld.de/record/2955833

Access Level
Open Access
