Learning How to Speak. Goal Space Exploration for Articulatory Skill Acquisition

Philippsen, Anja Kristina

Philippsen AK (2018)
Bielefeld: Universität Bielefeld.

Bielefeld E-Dissertation | English

Download

Philippsen_Dissertation.zip

URN

urn:nbn:de:0070-pub-29212968

Author

Philippsen, Anja Kristina

Reviewer / Supervisor

Wrede, Britta (UniBi)

Institution

Technische Fakultät

Abstract / Note

Recent speech recognition and speech synthesis systems still lack the flexibility and adaptivity of human beings. There are two main reasons for this. First, they handle recognition and synthesis as separate problems: the link between perception and production is missing, whereas in infants' learning these two modalities are closely interconnected. Second, state-of-the-art systems model the mature system, whereas infants' speech processing skills develop gradually over time. In this thesis, I present a model for learning articulatory control that is inspired by the babbling behavior of infants. The implemented system explores in a goal-directed way and learns to produce vowel and syllable sounds with an articulatory speech synthesizer that is based on a three-dimensional vocal tract model. One focus of this thesis is modeling the influence of ambient language on the learning process. A low-dimensional embedding is derived from a set of ambient speech sounds, so that the system's perception is shaped by the speech it is exposed to in its environment. Using this embedding space as the goal space for goal-directed exploration, the system autonomously learns to produce vowel and syllable sounds that are present in ambient speech. Articulatory trajectories are represented with a dynamic system using articulatory motor primitives. Time variation in the acoustic features is integrated using the model space of a recurrent neural network. Because of its developmental nature, the implemented framework is not only applicable to speech production in robots but is also a valuable tool for reproducing and investigating aspects of human speech acquisition. Specifically, I examine the influence of infant-directed speech on articulatory learning by training models with different ambient speech backgrounds, and I demonstrate that the learned models, like human listeners, exhibit categorical speech perception.

Year

2018

Page(s)

206

Page URI

https://pub.uni-bielefeld.de/record/2921296

Cite

Philippsen AK. Learning How to Speak. Goal Space Exploration for Articulatory Skill Acquisition. Bielefeld: Universität Bielefeld; 2018.

Philippsen, A. K. (2018). Learning How to Speak. Goal Space Exploration for Articulatory Skill Acquisition. Bielefeld: Universität Bielefeld.

Philippsen, Anja Kristina. 2018. Learning How to Speak. Goal Space Exploration for Articulatory Skill Acquisition. Bielefeld: Universität Bielefeld.

Philippsen, A.K., 2018. Learning How to Speak. Goal Space Exploration for Articulatory Skill Acquisition, Bielefeld: Universität Bielefeld.

A.K. Philippsen, Learning How to Speak. Goal Space Exploration for Articulatory Skill Acquisition, Bielefeld: Universität Bielefeld, 2018.

Philippsen, A.K.: Learning How to Speak. Goal Space Exploration for Articulatory Skill Acquisition. Universität Bielefeld, Bielefeld (2018).

Philippsen, Anja Kristina. Learning How to Speak. Goal Space Exploration for Articulatory Skill Acquisition. Bielefeld: Universität Bielefeld, 2018.

All files are available under the following license(s):

Copyright Statement: