Large Language Models on Wikipedia-Style Survey Generation: an Evaluation in NLP Concepts

Gao F, Jiang H, Blum M, Lu J, Liu D, Jiang Y, Li I (2023)
arXiv:2308.10410.

Preprint | English
 
Download
No files have been uploaded. Publication record only.
Authors
Gao, Fan; Jiang, Hang; Blum, Moritz (UniBi); Lu, Jinghui; Liu, Dairui; Jiang, Yuang; Li, Irene
Abstract / Note
Large Language Models (LLMs) have achieved significant success across various natural language processing (NLP) tasks, encompassing question-answering, summarization, and machine translation, among others. While LLMs excel in general tasks, their efficacy in domain-specific applications remains under exploration. Additionally, LLM-generated text sometimes exhibits issues like hallucination and disinformation. In this study, we assess LLMs' capability of producing concise survey articles within the computer science-NLP domain, focusing on 20 chosen topics. Automated evaluations indicate that GPT-4 outperforms GPT-3.5 when benchmarked against the ground truth. Furthermore, four human evaluators provide insights from six perspectives across four model configurations. Through case studies, we demonstrate that while GPT often yields commendable results, there are instances of shortcomings, such as incomplete information and the exhibition of lapses in factual accuracy.
Year of publication
2023
Journal title
arXiv:2308.10410
Page URI
https://pub.uni-bielefeld.de/record/2983360

Cite

Gao F, Jiang H, Blum M, et al. Large Language Models on Wikipedia-Style Survey Generation: an Evaluation in NLP Concepts. arXiv:2308.10410. 2023.
Gao, F., Jiang, H., Blum, M., Lu, J., Liu, D., Jiang, Y., & Li, I. (2023). Large Language Models on Wikipedia-Style Survey Generation: an Evaluation in NLP Concepts. arXiv:2308.10410
Gao, Fan, Jiang, Hang, Blum, Moritz, Lu, Jinghui, Liu, Dairui, Jiang, Yuang, and Li, Irene. 2023. “Large Language Models on Wikipedia-Style Survey Generation: an Evaluation in NLP Concepts”. arXiv:2308.10410.
Gao, F., Jiang, H., Blum, M., Lu, J., Liu, D., Jiang, Y., and Li, I. (2023). Large Language Models on Wikipedia-Style Survey Generation: an Evaluation in NLP Concepts. arXiv:2308.10410.
Gao, F., et al., 2023. Large Language Models on Wikipedia-Style Survey Generation: an Evaluation in NLP Concepts. arXiv:2308.10410.
F. Gao, et al., “Large Language Models on Wikipedia-Style Survey Generation: an Evaluation in NLP Concepts”, arXiv:2308.10410, 2023.
Gao, F., Jiang, H., Blum, M., Lu, J., Liu, D., Jiang, Y., Li, I.: Large Language Models on Wikipedia-Style Survey Generation: an Evaluation in NLP Concepts. arXiv:2308.10410. (2023).
Gao, Fan, Jiang, Hang, Blum, Moritz, Lu, Jinghui, Liu, Dairui, Jiang, Yuang, and Li, Irene. “Large Language Models on Wikipedia-Style Survey Generation: an Evaluation in NLP Concepts”. arXiv:2308.10410 (2023).

Sources

arXiv: 2308.10410
