Do Construction Distributions Shape Formal Language Learning In German BabyLMs?

Bunzeck B, Duran D, Zarrieß S (2025)
arXiv:2503.11593.

Preprint | English
 
Abstract / Note
We analyze the influence of utterance-level construction distributions in German child-directed speech on the resulting formal linguistic competence and the underlying learning trajectories for small language models trained on a novel collection of developmentally plausible language data for German. We find that trajectories are surprisingly robust for markedly different distributions of constructions in the training data, which have little effect on final accuracies and almost no effect on global learning trajectories. While syntax learning benefits from more complex utterances, lexical learning culminates in better scores with more fragmentary data. We argue that LMs trained on developmentally plausible data can contribute to debates on how rich or impoverished linguistic stimuli actually are.
Year of publication
2025
Journal title
arXiv:2503.11593
Page URI
https://pub.uni-bielefeld.de/record/3001572

Cite

Bunzeck B, Duran D, Zarrieß S. Do Construction Distributions Shape Formal Language Learning In German BabyLMs? arXiv:2503.11593. 2025.
Bunzeck, B., Duran, D., & Zarrieß, S. (2025). Do Construction Distributions Shape Formal Language Learning In German BabyLMs? arXiv:2503.11593. https://doi.org/10.48550/arXiv.2503.11593
Bunzeck, Bastian, Duran, Daniel, and Zarrieß, Sina. 2025. “Do Construction Distributions Shape Formal Language Learning In German BabyLMs?”. arXiv:2503.11593.
Bunzeck, B., Duran, D., and Zarrieß, S. (2025). Do Construction Distributions Shape Formal Language Learning In German BabyLMs? arXiv:2503.11593.
Bunzeck, B., Duran, D., & Zarrieß, S., 2025. Do Construction Distributions Shape Formal Language Learning In German BabyLMs? arXiv:2503.11593.
B. Bunzeck, D. Duran, and S. Zarrieß, “Do Construction Distributions Shape Formal Language Learning In German BabyLMs?”, arXiv:2503.11593, 2025.
Bunzeck, B., Duran, D., Zarrieß, S.: Do Construction Distributions Shape Formal Language Learning In German BabyLMs? arXiv:2503.11593. (2025).
Bunzeck, Bastian, Duran, Daniel, and Zarrieß, Sina. “Do Construction Distributions Shape Formal Language Learning In German BabyLMs?”. arXiv:2503.11593 (2025).
All files available under the following license(s):
Creative Commons Attribution-ShareAlike 4.0 International Public License (CC BY-SA 4.0)
Full text(s)
Name
Access Level
OA Open Access
Last uploaded
2025-03-17T09:08:17Z
MD5 checksum
8e669f2a67b81de809d924b9e35dfb8a



Sources

arXiv: 2503.11593
