Enhancing document modeling by means of open topic models: Crossing the frontier of classification schemes in digital libraries by example of the DDC
Mehler A, Waltinger U (2009)
Library Hi Tech 27(4): 520-539.
Journal article
| Published | English
Abstract / Note
Purpose - The purpose of this paper is to present a topic classification model using the Dewey Decimal Classification (DDC) as the target scheme. This is to be done by exploring metadata as provided by the Open Archives Initiative (OAI) to derive document snippets as minimal document representations. The reason is to reduce the effort of document processing in digital libraries. Further, the paper seeks to perform feature selection and extension by means of social ontologies and related web-based lexical resources. This is done to provide reliable topic-related classifications while circumventing the problem of data sparseness. Finally, the paper aims to evaluate the model by means of two language-specific corpora. The paper bridges digital libraries, on the one hand, and computational linguistics, on the other. The aim is to make computational linguistic methods accessible in order to provide thematic classifications in digital libraries based on closed topic models such as the DDC.
Design/methodology/approach - The approach takes the form of text classification, text-technology, computational linguistics, computational semantics, and social semantics.
Findings - It is shown that SVM-based classifiers perform best by exploring certain selections of OAI document metadata.
Research limitations/implications - The findings show that it is necessary to further develop SVM-based DDC-classifiers by using larger training sets, possibly for more than two languages, in order to get better F-measure values.
Originality/value - Algorithmic and formal-mathematical information is provided on how to build DDC-classifiers for digital libraries.
Keywords
Modelling;
Digital libraries;
Document management
Publication year
2009
Journal title
Library Hi Tech
Volume
27
Issue
4
Page(s)
520-539
Conference
9th International Bielefeld Conference "Upgrading the eLibrary"
Conference location
Bielefeld, Germany
Conference date
2009-02-03 – 2009-02-05
ISSN
0737-8831
Page URI
https://pub.uni-bielefeld.de/record/1588836
Cite
Mehler A, Waltinger U. Enhancing document modeling by means of open topic models: Crossing the frontier of classification schemes in digital libraries by example of the DDC. Library Hi Tech. 2009;27(4):520-539.
Mehler, A., & Waltinger, U. (2009). Enhancing document modeling by means of open topic models: Crossing the frontier of classification schemes in digital libraries by example of the DDC. Library Hi Tech, 27(4), 520-539. https://doi.org/10.1108/07378830911007646
Mehler, Alexander, and Waltinger, Ulli. 2009. “Enhancing document modeling by means of open topic models: Crossing the frontier of classification schemes in digital libraries by example of the DDC”. Library Hi Tech 27 (4): 520-539.
Mehler, A., and Waltinger, U. (2009). Enhancing document modeling by means of open topic models: Crossing the frontier of classification schemes in digital libraries by example of the DDC. Library Hi Tech 27, 520-539.
Mehler, A., & Waltinger, U., 2009. Enhancing document modeling by means of open topic models: Crossing the frontier of classification schemes in digital libraries by example of the DDC. Library Hi Tech, 27(4), p 520-539.
A. Mehler and U. Waltinger, “Enhancing document modeling by means of open topic models: Crossing the frontier of classification schemes in digital libraries by example of the DDC”, Library Hi Tech, vol. 27, 2009, pp. 520-539.
Mehler, A., Waltinger, U.: Enhancing document modeling by means of open topic models: Crossing the frontier of classification schemes in digital libraries by example of the DDC. Library Hi Tech. 27, 520-539 (2009).
Mehler, Alexander, and Waltinger, Ulli. “Enhancing document modeling by means of open topic models: Crossing the frontier of classification schemes in digital libraries by example of the DDC”. Library Hi Tech 27.4 (2009): 520-539.