Leveraging the Wikipedia Graph for Evaluating Word Embeddings

Giesen J, Kahlmeyer P, Nussbaum F, Zarrieß S (2022)
In: Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22). De Raedt L (Ed); California: International Joint Conferences on Artificial Intelligence Organization: 4136-4142.

Conference paper | Published | English
 
Download
No files have been uploaded. Publication record only.
Author
Giesen, Joachim; Kahlmeyer, Paul; Nussbaum, Frank; Zarrieß, Sina (UniBi)
Editor
De Raedt, Luc
Abstract / Note
Deep learning models for different NLP tasks often rely on pre-trained word embeddings, that is, vector representations of words. Therefore, it is crucial to evaluate pre-trained word embeddings independently of downstream tasks. Such evaluations try to assess whether the geometry induced by a word embedding captures connections made in natural language, such as analogies, clustering of words, or word similarities. Here, traditionally, similarity is measured by comparison to human judgment. However, explicitly annotating word pairs with similarity scores by surveying humans is expensive. We tackle this problem by formulating a similarity measure that is based on an agent for routing the Wikipedia hyperlink graph. In this graph, word similarities are implicitly encoded by edges between articles. We show on the English Wikipedia that our measure correlates well with a large group of traditional similarity measures, while covering a much larger proportion of words and avoiding explicit human labeling. Moreover, since Wikipedia is available in more than 300 languages, our measure can easily be adapted to other languages, in contrast to traditional similarity measures.
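The kind of comparison the abstract describes, checking whether embedding similarities agree with a reference similarity measure, can be sketched as follows. This is a minimal illustration only, not the authors' method: it does not implement the Wikipedia routing agent, and the choice of cosine similarity and Spearman rank correlation is an assumption, since the abstract says only that the measure "correlates well" with traditional measures. The toy vectors, word pairs, and reference scores are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranks(values):
    """1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation, computed as Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (std_x * std_y)


# Hypothetical 2-d embedding and reference scores (illustrative values only).
embedding = {
    "cat": [1.0, 0.1],
    "dog": [0.9, 0.2],
    "car": [0.1, 1.0],
    "bus": [0.3, 0.9],
}
reference = {
    ("cat", "dog"): 0.9,
    ("car", "bus"): 0.8,
    ("cat", "car"): 0.1,
    ("dog", "bus"): 0.2,
}

pairs = list(reference)
model_scores = [cosine(embedding[a], embedding[b]) for a, b in pairs]
reference_scores = [reference[p] for p in pairs]
rho = spearman(model_scores, reference_scores)
print(f"Spearman correlation: {rho:.3f}")
```

In such a setup, the reference scores could come from human annotation or from any automatically derived measure; only the ranking of the word pairs affects the result.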
Year of publication
2022
Proceedings title
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22)
Page(s)
4136-4142
Conference
Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22)
Conference location
Vienna, Austria
Conference date
2022-07-23 – 2022-07-29
ISBN
978-1-956792-00-3
Page URI
https://pub.uni-bielefeld.de/record/2965223

Cite

Giesen J, Kahlmeyer P, Nussbaum F, Zarrieß S. Leveraging the Wikipedia Graph for Evaluating Word Embeddings. In: De Raedt L, ed. Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22). California: International Joint Conferences on Artificial Intelligence Organization; 2022: 4136-4142.
Giesen, J., Kahlmeyer, P., Nussbaum, F., & Zarrieß, S. (2022). Leveraging the Wikipedia Graph for Evaluating Word Embeddings. In L. De Raedt (Ed.), Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22) (pp. 4136-4142). California: International Joint Conferences on Artificial Intelligence Organization. https://doi.org/10.24963/ijcai.2022/574
Giesen, Joachim, Kahlmeyer, Paul, Nussbaum, Frank, and Zarrieß, Sina. 2022. “Leveraging the Wikipedia Graph for Evaluating Word Embeddings”. In Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22), ed. Luc De Raedt, 4136-4142. California: International Joint Conferences on Artificial Intelligence Organization.
Giesen, J., Kahlmeyer, P., Nussbaum, F., and Zarrieß, S. (2022). “Leveraging the Wikipedia Graph for Evaluating Word Embeddings” in Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22), De Raedt, L. ed. (California: International Joint Conferences on Artificial Intelligence Organization), 4136-4142.
Giesen, J., et al., 2022. Leveraging the Wikipedia Graph for Evaluating Word Embeddings. In L. De Raedt, ed. Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22). California: International Joint Conferences on Artificial Intelligence Organization, pp. 4136-4142.
J. Giesen, et al., “Leveraging the Wikipedia Graph for Evaluating Word Embeddings”, Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22), L. De Raedt, ed., California: International Joint Conferences on Artificial Intelligence Organization, 2022, pp. 4136-4142.
Giesen, J., Kahlmeyer, P., Nussbaum, F., Zarrieß, S.: Leveraging the Wikipedia Graph for Evaluating Word Embeddings. In: De Raedt, L. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22). p. 4136-4142. International Joint Conferences on Artificial Intelligence Organization, California (2022).
Giesen, Joachim, Kahlmeyer, Paul, Nussbaum, Frank, and Zarrieß, Sina. “Leveraging the Wikipedia Graph for Evaluating Word Embeddings”. Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22). Ed. Luc De Raedt. California: International Joint Conferences on Artificial Intelligence Organization, 2022. 4136-4142.