8 Publications
-
2024 | Preprint | PUB-ID: 2993396
Small Language Models Like Small Vocabularies: Probing the Linguistic Abilities of Grapheme- and Phoneme-Based Baby Llamas
Bunzeck B, Duran D, Schade L, Zarrieß S (2024)
arXiv:2410.01487.
-
2024 | Conference paper | PUB-ID: 2994136
The SlayQA benchmark of social reasoning: testing gender-inclusive generalization with neopronouns
Bunzeck B, Zarrieß S (2024)
In: Proceedings of the 2nd GenBench Workshop on Generalisation (Benchmarking) in NLP. Hupkes D, Dankers V, Batsuren K, Kazemnejad A, Christodoulopoulos C, Giulianelli M, Cotterell R (Eds); Miami, Florida, USA: Association for Computational Linguistics: 42-53.
-
2024 | Conference paper | PUB-ID: 2993430
Fifty shapes of BLiMP: syntactic learning curves in language models are not uniform, but sometimes unruly
Bunzeck B, Zarrieß S (2024)
In: Proceedings of the 2024 CLASP Conference on Multimodality and Interaction in Language Learning. Qiu A, Noble B, Pagmar D, Maraev V, Ilinykh N (Eds); Kerrville, TX: Association for Computational Linguistics: 39-55.
-
2023 | Data publication | PUB-ID: 2993810
Replication Data for: "The Wikipedia Republic of Literary Characters"
Wojcik P, Bunzeck B, Zarrieß S (2023)
Harvard Dataverse.
-
2023 | Journal article | Published | PUB-ID: 2980942
The Wikipedia Republic of Literary Characters
Wojcik P, Bunzeck B, Zarrieß S (2023)
Journal of Cultural Analytics 8(2).
-
2023 | Conference paper | Published | PUB-ID: 2985109
GPT-wee: How Small Can a Small Language Model Really Get?
Bunzeck B, Zarrieß S (2023)
In: Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Warstadt A, Mueller A, Choshen L, Wilcox E, Zhuang C, Ciro J, Mosquera R, Paranjabe B, Williams A, Linzen T, Cotterell R (Eds); Stroudsburg, PA: Association for Computational Linguistics: 35-46.
-
2023 | Journal article | Published | PUB-ID: 2980943
Hexatomic: An extensible, OS-independent platform for deep multi-layer linguistic annotation of corpora
Druskat S, Krause T, Lachenmaier C, Bunzeck B (2023)
Journal of Open Source Software 8(86): 4825.
-
2023 | Conference paper | Published | PUB-ID: 2982902
Entrenchment Matters: Investigating Positional and Constructional Sensitivity in Small and Large Language Models
Bunzeck B, Zarrieß S (2023)
In: Proceedings of the 2023 CLASP Conference on Learning with Small Data (LSD). Breitholtz E, Lappin S, Loaiciga S, Ilinykh N, Dobnik S (Eds); Stroudsburg, PA: Association for Computational Linguistics: 25-37.