12 Publications

  • [12]
    2025 | Preprint | PUB-ID: 3001572 OA
    Bunzeck, Bastian, Duran, Daniel, and Zarrieß, Sina. “Do Construction Distributions Shape Formal Language Learning In German BabyLMs?”. arXiv:2503.11593 (2025).
     
  • [11]
    2025 | Preprint | PUB-ID: 3000929 OA
    Bunzeck, Bastian, and Zarrieß, Sina. “Subword models struggle with word learning, but surprisal hides it”. arXiv:2502.12835 (2025).
     
  • [10]
    2025 | Conference paper | PUB-ID: 3000275 OA
    Bunzeck, Bastian, Duran, Daniel, Schade, Leonie, and Zarrieß, Sina. “Small Language Models Also Work With Small Vocabularies: Probing the Linguistic Abilities of Grapheme- and Phoneme-Based Baby Llamas”. Proceedings of the 31st International Conference on Computational Linguistics. Ed. Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, and Steven Schockaert. Abu Dhabi, UAE: Association for Computational Linguistics, 2025. 6039-6048.
     
  • [9]
    2024 | Conference paper | PUB-ID: 3001254 OA
    Bunzeck, Bastian, Duran, Daniel, Schade, Leonie, and Zarrieß, Sina. “Graphemes vs. phonemes: battling it out in character-based language models”. The 2nd BabyLM Challenge at the 28th Conference on Computational Natural Language Learning. Ed. Michael Y. Hu, Aaron Mueller, Candace Ross, Adina Williams, Tal Linzen, Chengxu Zhuang, Leshem Choshen, Ryan Cotterell, Alex Warstadt, and Ethan Gotlieb Wilcox. Miami, FL, USA: Association for Computational Linguistics, 2024. 54-64.
     
  • [8]
    2024 | Conference paper | PUB-ID: 2993430 OA
    Bunzeck, Bastian, and Zarrieß, Sina. “Fifty shapes of BLiMP: syntactic learning curves in language models are not uniform, but sometimes unruly”. Proceedings of the 2024 CLASP Conference on Multimodality and Interaction in Language Learning. Ed. Amy Qiu, Bill Noble, David Pagmar, Vladislav Maraev, and Nikolai Ilinykh. Kerrville, TX: Association for Computational Linguistics, 2024. 39-55.
     
  • [7]
    2024 | Journal article | Published | PUB-ID: 2999608
    Bunzeck, Bastian, and Diessel, Holger. “The richness of the stimulus: Constructional variation and development in child-directed speech”. First Language (2024).
     
  • [6]
    2024 | Conference paper | PUB-ID: 2994136
    Bunzeck, Bastian, and Zarrieß, Sina. “The SlayQA benchmark of social reasoning: testing gender-inclusive generalization with neopronouns”. Proceedings of the 2nd GenBench Workshop on Generalisation (Benchmarking) in NLP. Ed. Dieuwke Hupkes, Verna Dankers, Khuyagbaatar Batsuren, Amirhossein Kazemnejad, Christos Christodoulopoulos, Mario Giulianelli, and Ryan Cotterell. Miami, Florida, USA: Association for Computational Linguistics, 2024. 42-53.
     
  • [5]
    2023 | Data publication | PUB-ID: 2993810
    Wojcik, Paula, Bunzeck, Bastian, and Zarrieß, Sina. Replication Data for: "The Wikipedia Republic of Literary Characters". Harvard Dataverse, 2023.
     
  • [4]
    2023 | Journal article | Published | PUB-ID: 2980942 OA
    Wojcik, Paula, Bunzeck, Bastian, and Zarrieß, Sina. “The Wikipedia Republic of Literary Characters”. Journal of Cultural Analytics 8.2 (2023).
     
  • [3]
    2023 | Conference paper | Published | PUB-ID: 2985109 OA
    Bunzeck, Bastian, and Zarrieß, Sina. “GPT-wee: How Small Can a Small Language Model Really Get?”. Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning. Ed. Alex Warstadt, Aaron Mueller, Leshem Choshen, Ethan Wilcox, Chengxu Zhuang, Juan Ciro, Rafael Mosquera, Bhargavi Paranjape, Adina Williams, Tal Linzen, and Ryan Cotterell. Stroudsburg, PA: Association for Computational Linguistics, 2023. 35-46.
     
  • [2]
    2023 | Journal article | Published | PUB-ID: 2980943 OA
    Druskat, Stephan, Krause, Thomas, Lachenmaier, Clara, and Bunzeck, Bastian. “Hexatomic: An extensible, OS-independent platform for deep multi-layer linguistic annotation of corpora”. Journal of Open Source Software 8.86 (2023): 4825.
     
  • [1]
    2023 | Conference paper | Published | PUB-ID: 2982902 OA
    Bunzeck, Bastian, and Zarrieß, Sina. “Entrenchment Matters: Investigating Positional and Constructional Sensitivity in Small and Large Language Models”. Proceedings of the 2023 CLASP Conference on Learning with Small Data (LSD). Ed. Ellen Breitholtz, Shalom Lappin, Sharid Loaiciga, Nikolai Ilinykh, and Simon Dobnik. Stroudsburg, PA: Association for Computational Linguistics, 2023. 25-37.
     
