Learning How to Speak. Goal Space Exploration for Articulatory Skill Acquisition
Philippsen AK (2018)
Bielefeld: Universität Bielefeld.
Bielefeld E-Dissertation | English
Author
Philippsen, Anja Kristina
Abstract / Note
Recent speech recognition and speech synthesis systems still lack the flexibility and adaptivity of human beings. There are two main reasons for this. First, they handle recognition and synthesis as separate problems. The link between perception and production is missing, whereas in infants' learning those two modalities are closely interconnected. Second, state-of-the-art systems model the mature system. Infants' speech processing skills, in contrast, develop gradually over time.
In this thesis, I present a model for learning articulatory control that is inspired by the babbling behavior of infants. The implemented system explores in a goal-directed way and learns to produce vowel and syllable sounds with an articulatory speech synthesizer that is based on a three-dimensional vocal tract model.
One focus of this thesis is modeling the influence of ambient language on the learning process. A low-dimensional embedding is derived from a set of ambient speech sounds.
In this way, the system's perception is shaped by the speech it is exposed to in its environment.
Using this low-dimensional embedding space as a goal space for goal-directed exploration, the system autonomously learns to produce vowel and syllable sounds that are present in ambient speech. Articulatory trajectories are represented with a dynamic system using articulatory motor primitives. Time variation in the acoustic features is integrated using the model space of a recurrent neural network.
Because of its developmental nature, the implemented framework is not only applicable to speech production in robots, but also a valuable tool for reproducing and investigating aspects of human speech acquisition. Specifically, I examine the influence of infant-directed speech on articulatory learning by training models with different ambient speech backgrounds, and I demonstrate that the learned models, analogously to human listeners, exhibit categorical speech perception.
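The goal-directed exploration described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the `forward` function is a hypothetical two-dimensional toy standing in for the articulatory synthesizer plus the embedding into goal space, the goal coordinates are invented, and the inverse estimate is a plain nearest-neighbor lookup chosen only for brevity.

```python
import numpy as np

def forward(motor):
    """Hypothetical toy mapping from a motor command to a point in goal space.
    It stands in for 'synthesize a sound, then embed it'; it is an assumption."""
    m = np.asarray(motor, dtype=float)
    return np.array([np.tanh(m[0] + 0.5 * m[1]), np.tanh(m[1] - 0.5 * m[0])])

def nearest_motor(goal, motors, outcomes):
    """Inverse estimate: the stored motor command whose outcome is closest to the goal."""
    dists = np.linalg.norm(np.asarray(outcomes) - goal, axis=1)
    return motors[int(np.argmin(dists))]

def goal_babbling(goals, n_iter=500, noise=0.1, seed=0):
    """Pick a goal, perturb the current best guess for it, execute, and remember the result."""
    rng = np.random.default_rng(seed)
    home = np.zeros(2)                      # start from a neutral posture
    motors, outcomes = [home], [forward(home)]
    for _ in range(n_iter):
        goal = goals[rng.integers(len(goals))]
        guess = nearest_motor(goal, motors, outcomes)
        trial = guess + rng.normal(scale=noise, size=2)  # exploratory noise
        motors.append(trial)
        outcomes.append(forward(trial))
    return motors, outcomes

def mean_goal_error(goals, motors, outcomes):
    """Average distance between each goal and the best outcome reachable from memory."""
    errors = [np.linalg.norm(forward(nearest_motor(g, motors, outcomes)) - g) for g in goals]
    return float(np.mean(errors))

# Invented goal-space targets standing in for embedded ambient speech sounds.
goals = np.array([[0.6, 0.2], [-0.5, 0.5], [0.1, -0.7]])
motors, outcomes = goal_babbling(goals)
error_before = mean_goal_error(goals, motors[:1], outcomes[:1])
error_after = mean_goal_error(goals, motors, outcomes)
```

Comparing `error_before` with `error_after` shows how exploration that is steered by goals, rather than by random motor commands, concentrates samples around the targets.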
Year
2018
Page(s)
206
Page URI
https://pub.uni-bielefeld.de/record/2921296
Cite
Philippsen AK. Learning How to Speak. Goal Space Exploration for Articulatory Skill Acquisition. Bielefeld: Universität Bielefeld; 2018.
Philippsen, A. K. (2018). Learning How to Speak. Goal Space Exploration for Articulatory Skill Acquisition. Bielefeld: Universität Bielefeld.
Philippsen, Anja Kristina. 2018. Learning How to Speak. Goal Space Exploration for Articulatory Skill Acquisition. Bielefeld: Universität Bielefeld.
Philippsen, A.K., 2018. Learning How to Speak. Goal Space Exploration for Articulatory Skill Acquisition, Bielefeld: Universität Bielefeld.
A.K. Philippsen, Learning How to Speak. Goal Space Exploration for Articulatory Skill Acquisition, Bielefeld: Universität Bielefeld, 2018.
Philippsen, A.K.: Learning How to Speak. Goal Space Exploration for Articulatory Skill Acquisition. Universität Bielefeld, Bielefeld (2018).
Philippsen, Anja Kristina. Learning How to Speak. Goal Space Exploration for Articulatory Skill Acquisition. Bielefeld: Universität Bielefeld, 2018.
All files are available under the following license(s):
Copyright Statement:
This object is protected by copyright and/or related rights. [...]
Full text(s)
Access Level
Open Access
Last uploaded
2019-09-06T09:19:00Z
MD5 checksum
d2839b79c9396564c0f782bf214219b1
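A downloaded copy of the full text can be compared against the checksum above. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5_hex` and `verify` are illustrative, and the path passed to `verify` is whatever local file name the download was saved under.

```python
import hashlib
import io

def md5_hex(stream, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()

# Checksum as listed in this record.
EXPECTED_MD5 = "d2839b79c9396564c0f782bf214219b1"

def verify(path, expected=EXPECTED_MD5):
    """True if the file at `path` (a local copy, name chosen by the user) matches."""
    with open(path, "rb") as handle:
        return md5_hex(handle) == expected.lower()

# Self-check of the helper on in-memory data.
sample_digest = md5_hex(io.BytesIO(b"hello"))
```

MD5 is adequate for detecting a corrupted or incomplete download, but it is not a safeguard against deliberate tampering.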