12 Publications
-
2025 | Preprint | PUB-ID: 3001572 | Do Construction Distributions Shape Formal Language Learning In German BabyLMs?
Bunzeck B, Duran D, Zarrieß S (2025)
arXiv:2503.11593. -
2025 | Preprint | PUB-ID: 3000929 | Subword models struggle with word learning, but surprisal hides it
Bunzeck B, Zarrieß S (2025)
arXiv:2502.12835. -
2025 | Conference paper | PUB-ID: 3000275 | Small Language Models Also Work With Small Vocabularies: Probing the Linguistic Abilities of Grapheme- and Phoneme-Based Baby Llamas
Bunzeck B, Duran D, Schade L, Zarrieß S (2025)
In: Proceedings of the 31st International Conference on Computational Linguistics. Rambow O, Wanner L, Apidianaki M, Al-Khalifa H, Eugenio BD, Schockaert S (Eds); Abu Dhabi, UAE: Association for Computational Linguistics: 6039-6048. -
2024 | Conference paper | PUB-ID: 3001254 | Graphemes vs. phonemes: battling it out in character-based language models
Bunzeck B, Duran D, Schade L, Zarrieß S (2024)
In: The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning. Hu MY, Mueller A, Ross C, Williams A, Linzen T, Zhuang C, Choshen L, Cotterell R, Warstadt A, Wilcox EG (Eds); Miami, FL, USA: Association for Computational Linguistics: 54-64. -
2024 | Conference paper | PUB-ID: 2993430 | Fifty shapes of BLiMP: syntactic learning curves in language models are not uniform, but sometimes unruly
Bunzeck B, Zarrieß S (2024)
In: Proceedings of the 2024 CLASP Conference on Multimodality and Interaction in Language Learning. Qiu A, Noble B, Pagmar D, Maraev V, Ilinykh N (Eds); Kerrville, TX: Association for Computational Linguistics: 39-55. -
2024 | Journal article | Published | PUB-ID: 2999608 | The richness of the stimulus: Constructional variation and development in child-directed speech
Bunzeck B, Diessel H (2024)
First Language. -
2024 | Conference paper | PUB-ID: 2994136 | The SlayQA benchmark of social reasoning: testing gender-inclusive generalization with neopronouns
Bunzeck B, Zarrieß S (2024)
In: Proceedings of the 2nd GenBench Workshop on Generalisation (Benchmarking) in NLP. Hupkes D, Dankers V, Batsuren K, Kazemnejad A, Christodoulopoulos C, Giulianelli M, Cotterell R (Eds); Miami, Florida, USA: Association for Computational Linguistics: 42-53. -
2023 | Data publication | PUB-ID: 2993810 | Replication Data for: "The Wikipedia Republic of Literary Characters"
Wojcik P, Bunzeck B, Zarrieß S (2023)
Harvard Dataverse. -
2023 | Journal article | Published | PUB-ID: 2980942 | The Wikipedia Republic of Literary Characters
Wojcik P, Bunzeck B, Zarrieß S (2023)
Journal of Cultural Analytics 8(2). -
2023 | Conference paper | Published | PUB-ID: 2985109 | GPT-wee: How Small Can a Small Language Model Really Get?
Bunzeck B, Zarrieß S (2023)
In: Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Warstadt A, Mueller A, Choshen L, Wilcox E, Zhuang C, Ciro J, Mosquera R, Paranjabe B, Williams A, Linzen T, Cotterell R (Eds); Stroudsburg, PA: Association for Computational Linguistics: 35-46. -
2023 | Journal article | Published | PUB-ID: 2980943 | Hexatomic: An extensible, OS-independent platform for deep multi-layer linguistic annotation of corpora
Druskat S, Krause T, Lachenmaier C, Bunzeck B (2023)
Journal of Open Source Software 8(86): 4825. -
2023 | Conference paper | Published | PUB-ID: 2982902 | Entrenchment Matters: Investigating Positional and Constructional Sensitivity in Small and Large Language Models
Bunzeck B, Zarrieß S (2023)
In: Proceedings of the 2023 CLASP Conference on Learning with Small Data (LSD). Breitholtz E, Lappin S, Loaiciga S, Ilinykh N, Dobnik S (Eds); Stroudsburg, PA: Association for Computational Linguistics: 25-37.