Robust speech recognition using articulatory information

Kirchhoff K (1999)
Bielefeld (Germany): Bielefeld University.

Bielefeld E-Dissertation | English
 
Author
Kirchhoff, Katrin
Reviewer / Supervisor
Sagerer, Gerhard (Prof. Dr.-Ing.)
Abstract / Note
Current automatic speech recognition systems make use of a single source of information about their input, viz. a preprocessed form of the acoustic speech signal, which encodes the time-frequency distribution of signal energy. The goal of this thesis is to investigate the benefits of integrating articulatory information into state-of-the-art speech recognizers, either as a genuine alternative to standard acoustic representations or as an additional source of information. Articulatory information is represented in terms of abstract articulatory classes or "features", which are extracted from the speech signal by means of statistical classifiers. A higher-level classifier then combines the scores for these features and maps them to standard subword unit probabilities. The main motivation for this approach is to improve the robustness of speech recognition systems in adverse acoustic environments, such as those with background noise. Typically, recognition systems show a sharp decline in performance under these conditions. We argue and demonstrate empirically that the articulatory feature approach can lead to greater robustness by enhancing the accuracy of the bottom-up acoustic modeling component in a speech recognition system. The second focus of this thesis is to provide detailed analyses of the different types of information provided by the acoustic and the articulatory representations, respectively, and to develop strategies to optimally combine them. To this end, we investigate combination methods at the levels of feature extraction, subword unit probability estimation, and word recognition. The feasibility of this approach is demonstrated with respect to two different speech recognition tasks. The first of these is an American English corpus of telephone-bandwidth speech; the recognition domain is continuous numbers. The second is a German database of studio-quality speech consisting of spontaneous dialogues. In both cases, recognition performance is tested not only under clean acoustic conditions but also under deteriorated conditions.
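The two-stage architecture described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the feature groups, dimensions, random weights, and the log-linear combination rule below are all assumptions chosen only to show the data flow from acoustic frames, through per-feature classifiers, to subword unit probabilities.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


# Hypothetical articulatory feature groups and their class counts (assumed).
FEATURE_GROUPS = {"voicing": 2, "manner": 4, "place": 5}
N_ACOUSTIC = 12  # assumed size of one preprocessed acoustic frame
N_PHONES = 10    # assumed number of subword units

rng = np.random.default_rng(0)

# Stage 1: one classifier per feature group. Random linear-softmax models
# stand in for trained statistical classifiers.
feature_weights = {
    name: rng.normal(size=(N_ACOUSTIC, n_classes))
    for name, n_classes in FEATURE_GROUPS.items()
}

# Stage 2: a higher-level classifier over the concatenated feature scores.
combiner_weights = rng.normal(size=(sum(FEATURE_GROUPS.values()), N_PHONES))


def articulatory_scores(frames):
    """Concatenate the per-group class posteriors for each frame."""
    return np.concatenate(
        [softmax(frames @ w) for w in feature_weights.values()], axis=1
    )


def phone_posteriors(frames):
    """Map the articulatory feature scores to subword unit probabilities."""
    return softmax(articulatory_scores(frames) @ combiner_weights)


def combine_log_linear(p_a, p_b, weight=0.5):
    """Assumed combination rule: weighted product of two posterior streams,
    renormalized per frame (one possible choice at the probability level)."""
    return softmax(weight * np.log(p_a) + (1.0 - weight) * np.log(p_b))


frames = rng.normal(size=(3, N_ACOUSTIC))          # three dummy frames
art_post = phone_posteriors(frames)                # articulatory stream
ac_post = softmax(rng.normal(size=(3, N_PHONES)))  # stand-in acoustic stream
combined = combine_log_linear(ac_post, art_post)
```

Each row of `combined` is a probability distribution over the assumed subword units for one frame. In a full recognizer, such frame-level probabilities would be passed on to the word recognition stage.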
Keywords
Automatic speech recognition, Robustness, Articulation, Acoustic signal, Automatic classification, Speech recognition, Pattern recognition
Year
1999
Page URI
https://pub.uni-bielefeld.de/record/2302713

Cite

Kirchhoff K. Robust speech recognition using articulatory information. Bielefeld (Germany): Bielefeld University; 1999.
Kirchhoff, K. (1999). Robust speech recognition using articulatory information. Bielefeld (Germany): Bielefeld University.
Kirchhoff, Katrin. 1999. Robust speech recognition using articulatory information. Bielefeld (Germany): Bielefeld University.
Kirchhoff, K., 1999. Robust speech recognition using articulatory information, Bielefeld (Germany): Bielefeld University.
K. Kirchhoff, Robust speech recognition using articulatory information, Bielefeld (Germany): Bielefeld University, 1999.
Kirchhoff, K.: Robust speech recognition using articulatory information. Bielefeld University, Bielefeld (Germany) (1999).
Kirchhoff, Katrin. Robust speech recognition using articulatory information. Bielefeld (Germany): Bielefeld University, 1999.
All files available under the following license(s):
Copyright Statement:
This object is protected by copyright and/or related rights. [...]
Full text(s)
Access Level
OA Open Access
Last uploaded
2019-09-06T08:57:41Z
MD5 checksum
673bbc764b8aeed8aa06a6d829a463ea

