Large Language Models on Wikipedia-Style Survey Generation: an Evaluation in NLP Concepts
Gao F, Jiang H, Blum M, Lu J, Liu D, Jiang Y, Li I (2023)
arXiv:2308.10410.
Preprint | English
Authors
Gao, Fan;
Jiang, Hang;
Blum, Moritz (UniBi);
Lu, Jinghui;
Liu, Dairui;
Jiang, Yuang;
Li, Irene
Abstract / Note
Large Language Models (LLMs) have achieved significant success across various
natural language processing (NLP) tasks, encompassing question-answering,
summarization, and machine translation, among others. While LLMs excel in
general tasks, their efficacy in domain-specific applications remains under
exploration. Additionally, LLM-generated text sometimes exhibits issues like
hallucination and disinformation. In this study, we assess LLMs' capability of
producing concise survey articles within the computer science-NLP domain,
focusing on 20 chosen topics. Automated evaluations indicate that GPT-4
outperforms GPT-3.5 when benchmarked against the ground truth. Furthermore,
four human evaluators provide insights from six perspectives across four model
configurations. Through case studies, we demonstrate that while GPT often
yields commendable results, there are instances of shortcomings, such as
incomplete information and the exhibition of lapses in factual accuracy.
Year of publication
2023
Journal title
arXiv:2308.10410
Page URI
https://pub.uni-bielefeld.de/record/2983360
Cite
Gao F, Jiang H, Blum M, et al. Large Language Models on Wikipedia-Style Survey Generation: an Evaluation in NLP Concepts. arXiv:2308.10410. 2023.
Gao, F., Jiang, H., Blum, M., Lu, J., Liu, D., Jiang, Y., & Li, I. (2023). Large Language Models on Wikipedia-Style Survey Generation: an Evaluation in NLP Concepts. arXiv:2308.10410
Gao, Fan, Jiang, Hang, Blum, Moritz, Lu, Jinghui, Liu, Dairui, Jiang, Yuang, and Li, Irene. 2023. “Large Language Models on Wikipedia-Style Survey Generation: an Evaluation in NLP Concepts”. arXiv:2308.10410.
Gao, F., Jiang, H., Blum, M., Lu, J., Liu, D., Jiang, Y., and Li, I. (2023). Large Language Models on Wikipedia-Style Survey Generation: an Evaluation in NLP Concepts. arXiv:2308.10410.
Gao, F., et al., 2023. Large Language Models on Wikipedia-Style Survey Generation: an Evaluation in NLP Concepts. arXiv:2308.10410.
F. Gao, et al., “Large Language Models on Wikipedia-Style Survey Generation: an Evaluation in NLP Concepts”, arXiv:2308.10410, 2023.
Gao, F., Jiang, H., Blum, M., Lu, J., Liu, D., Jiang, Y., Li, I.: Large Language Models on Wikipedia-Style Survey Generation: an Evaluation in NLP Concepts. arXiv:2308.10410. (2023).
Gao, Fan, Jiang, Hang, Blum, Moritz, Lu, Jinghui, Liu, Dairui, Jiang, Yuang, and Li, Irene. “Large Language Models on Wikipedia-Style Survey Generation: an Evaluation in NLP Concepts”. arXiv:2308.10410 (2023).