12 Publications
2025 | Preprint | PUB-ID: 3001572
Bunzeck, Bastian, Do Construction Distributions Shape Formal Language Learning In German BabyLMs? arXiv:2503.11593, 2025.
2025 | Preprint | PUB-ID: 3000929
Bunzeck, Bastian, Subword models struggle with word learning, but surprisal hides it. arXiv:2502.12835, 2025.
2025 | Conference paper | PUB-ID: 3000275
Bunzeck, Bastian, Small Language Models Also Work With Small Vocabularies: Probing the Linguistic Abilities of Grapheme- and Phoneme-Based Baby Llamas. Proceedings of the 31st International Conference on Computational Linguistics. Abu Dhabi, UAE, 2025.
2024 | Conference paper | PUB-ID: 3001254
Bunzeck, Bastian, Graphemes vs. phonemes: battling it out in character-based language models. The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning. Miami, FL, USA, 2024.
2024 | Conference paper | PUB-ID: 2993430
Bunzeck, Bastian, Fifty shapes of BLiMP: syntactic learning curves in language models are not uniform, but sometimes unruly. Proceedings of the 2024 CLASP Conference on Multimodality and Interaction in Language Learning. Kerrville, TX, 2024.
2024 | Journal article | Published | PUB-ID: 2999608
Bunzeck, Bastian, The richness of the stimulus: Constructional variation and development in child-directed speech. First Language, 2024.
2024 | Conference paper | PUB-ID: 2994136
Bunzeck, Bastian, The SlayQA benchmark of social reasoning: testing gender-inclusive generalization with neopronouns. Proceedings of the 2nd GenBench Workshop on Generalisation (Benchmarking) in NLP. Miami, Florida, USA, 2024.
2023 | Data publication | PUB-ID: 2993810
Wojcik, Paula, Replication Data for: "The Wikipedia Republic of Literary Characters". 2023.
2023 | Journal article | Published | PUB-ID: 2980942
Wojcik, Paula, The Wikipedia Republic of Literary Characters. Journal of Cultural Analytics 8 (2), 2023.
2023 | Conference paper | Published | PUB-ID: 2985109
Bunzeck, Bastian, GPT-wee: How Small Can a Small Language Model Really Get? Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Stroudsburg, PA, 2023.
2023 | Journal article | Published | PUB-ID: 2980943
Druskat, Stephan, Hexatomic: An extensible, OS-independent platform for deep multi-layer linguistic annotation of corpora. Journal of Open Source Software 8 (86), 2023.
2023 | Conference paper | Published | PUB-ID: 2982902
Bunzeck, Bastian, Entrenchment Matters: Investigating Positional and Constructional Sensitivity in Small and Large Language Models. Proceedings of the 2023 CLASP Conference on Learning with Small Data (LSD). Stroudsburg, PA, 2023.