The shrink point: audiovisual integration of speech-gesture synchrony

Kirchhof C (2017)
Bielefeld: Universität Bielefeld.

Bielefeld E-Dissertation | English
Abstract / Note
The focus in gesture research has long been on the production of speech-accompanying gestures and on how speech-gesture utterances contribute to communication. A largely neglected issue is to what extent listeners even perceive the gestural part of a multimodal utterance. Gesture research has, for instance, concentrated on the lexico-semiotic connection between speech and spontaneously coproduced gestures (e.g., de Ruiter, 2007; Kita & Özyürek, 2003; Krauss, Chen & Gottesman, 2000). Because the prosodic peak in speech is timed rather precisely with the most prominent stroke of the gesture phrase in production, Schegloff (1984) and Krauss, Morrel-Samuels and Colasante (1991; also Rauscher, Krauss & Chen, 1996), among others, coined the term lexical affiliation for this phenomenon. Following Krauss et al. (1991), the first empirical study of this dissertation investigates the nature of the semiotic relation between speech and gestures, focusing on its applicability to temporal perception and comprehension.
When speech and lip movements diverge too far from the original production synchrony, this can be highly irritating to the viewer, even when audio and video stem from the same original recording (e.g., Vatakis, Navarra, Soto-Faraco & Spence, 2008; Feyereisen, 2007). There is only a small temporal window of audiovisual integration (AVI) within which viewer-listeners can internally align discrepancies between lip movements and the speech supposedly produced by them (e.g., McGurk & MacDonald, 1976). Several studies in psychophysics (e.g., Nishida, 2006; Fujisaki & Nishida, 2005) found that there is also a time window for the perceptual alignment of nonspeech visual and auditory signals. These and further studies on the AVI of speech-lip asynchronies have inspired research on the perception of speech-gesture utterances. McNeill, Cassell, and McCullough (1994; Cassell, McNeill & McCullough, 1999), for instance, discovered that listeners take up information even from artificially combined speech and gestures. More recent studies of the AVI of speech and gestures have employed event-related potential (ERP) monitoring to investigate the perception of multimodal utterances (e.g., Gullberg & Holmqvist, 1999; 2006; Özyürek, Willems, Kita & Hagoort, 2007; Habets, Kita, Shao, Özyürek & Hagoort, 2011).
While these studies from psychophysics, speech-only research, and speech-gesture research have contributed greatly to theories of how listeners perceive multimodal signals, natural data and dyadic situations have scarcely been explored. This dissertation investigates the perception of naturally produced speech-gesture utterances by having participants rate the naturalness of synchronous and asynchronous versions of such utterances, using qualitative and quantitative methods such as an online rating study and a preference task. Drawing on speech-gesture production models based on Levelt's (1989) model of speech production (e.g., de Ruiter, 1998; 2007; Krauss et al., 2000; Kita & Özyürek, 2003) and building on the results and analyses of the studies conducted for this dissertation, I finally propose a draft model of a possible transmission cycle between the Growth Point (e.g., McNeill, 1985; 1992) and the Shrink Point, its perceptual counterpart. This model covers the temporal and semantic alignment of speech and different gesture types as well as their audiovisual and conceptual integration during perception. The perceptual studies conducted within the scope of this dissertation have revealed varying temporal ranges within which listeners can integrate asynchronies in speech-gesture utterances, especially those containing iconic gestures.
Year
2017
Page URI
https://pub.uni-bielefeld.de/record/2908762

Cite

Kirchhof, Carolin. 2017. The shrink point: audiovisual integration of speech-gesture synchrony. Bielefeld: Universität Bielefeld.
All files available under the following license(s):
Copyright Statement:
This object is protected by copyright and/or related rights. [...]
Full Text(s)
Access Level
Open Access
Last Uploaded
2019-09-06T09:18:43Z
MD5 Checksum
15d669b3fc658eae4a9e0e7d5559a2ad

