Structural classifiers of text types: Towards a novel model of text representation
Mehler A, Geibel P, Pustylnikov O (2007)
LDV-Forum : Zeitschrift für Computerlinguistik und Sprachtechnologie ; GLDV-Journal for Computational Linguistics and Language Technology 22(2): 51-66.
Journal article
| Published | English
Authors
Mehler, Alexander (UniBi);
Geibel, Peter;
Pustylnikov, Olga
Abstract / Note
Texts can be distinguished in terms of their content, function, structure
or layout (Brinker, 1992; Bateman et al., 2001; Joachims, 2002; Power
et al., 2003). These reference points do not necessarily open orthogonal
perspectives on text classification. As part of explorative data analysis,
text classification aims at automatically dividing sets of textual objects into
classes of maximum internal homogeneity and external heterogeneity. This
paper deals with classifying texts into text types whose instances serve more
or less homogeneous functions. Unlike mainstream approaches, which
rely on the vector space model (Sebastiani, 2002) or some of its descendants
(Baeza-Yates and Ribeiro-Neto, 1999) and, thus, on content-related lexical
features, we refer solely to structural differentiae. That is, we explore
patterns of text structure as determinants of class membership. Our starting
point is tree-like text representations, which induce feature vectors and
tree kernels. These kernels are utilized in supervised learning based on
cross-validation as a method of model selection (Hastie et al., 2001), using
a corpus of press communication as an example. For a subset of categories we
show that classification can be performed very well by structural differentiae
alone.
Year of publication
2007
Journal title
LDV-Forum : Zeitschrift für Computerlinguistik und Sprachtechnologie ; GLDV-Journal for Computational Linguistics and Language Technology
Volume
22
Issue
2
Page(s)
51-66
ISSN
0175-1336
Page URI
https://pub.uni-bielefeld.de/record/2480145
Cite
Mehler A, Geibel P, Pustylnikov O. Structural classifiers of text types: Towards a novel model of text representation. LDV-Forum : Zeitschrift für Computerlinguistik und Sprachtechnologie ; GLDV-Journal for Computational Linguistics and Language Technology. 2007;22(2):51-66.
Mehler, A., Geibel, P., & Pustylnikov, O. (2007). Structural classifiers of text types: Towards a novel model of text representation. LDV-Forum : Zeitschrift für Computerlinguistik und Sprachtechnologie ; GLDV-Journal for Computational Linguistics and Language Technology, 22(2), 51-66.
Mehler, Alexander, Geibel, Peter, and Pustylnikov, Olga. 2007. “Structural classifiers of text types: Towards a novel model of text representation”. LDV-Forum : Zeitschrift für Computerlinguistik und Sprachtechnologie ; GLDV-Journal for Computational Linguistics and Language Technology 22 (2): 51-66.
Mehler, A., Geibel, P., and Pustylnikov, O. (2007). Structural classifiers of text types: Towards a novel model of text representation. LDV-Forum : Zeitschrift für Computerlinguistik und Sprachtechnologie ; GLDV-Journal for Computational Linguistics and Language Technology 22, 51-66.
Mehler, A., Geibel, P., & Pustylnikov, O., 2007. Structural classifiers of text types: Towards a novel model of text representation. LDV-Forum : Zeitschrift für Computerlinguistik und Sprachtechnologie ; GLDV-Journal for Computational Linguistics and Language Technology, 22(2), p 51-66.
A. Mehler, P. Geibel, and O. Pustylnikov, “Structural classifiers of text types: Towards a novel model of text representation”, LDV-Forum : Zeitschrift für Computerlinguistik und Sprachtechnologie ; GLDV-Journal for Computational Linguistics and Language Technology, vol. 22, 2007, pp. 51-66.
Mehler, A., Geibel, P., Pustylnikov, O.: Structural classifiers of text types: Towards a novel model of text representation. LDV-Forum : Zeitschrift für Computerlinguistik und Sprachtechnologie ; GLDV-Journal for Computational Linguistics and Language Technology. 22, 51-66 (2007).
Mehler, Alexander, Geibel, Peter, and Pustylnikov, Olga. “Structural classifiers of text types: Towards a novel model of text representation”. LDV-Forum : Zeitschrift für Computerlinguistik und Sprachtechnologie ; GLDV-Journal for Computational Linguistics and Language Technology 22.2 (2007): 51-66.
Full text(s)
Name
mehler_geibel_pustylnikov_2007.pdf
Access Level
Closed Access
Last uploaded
2019-09-06T09:18:00Z
MD5 checksum
45f648bad93a568e98011a24ce454f39