Incrementally resolving references in order to identify visually present objects in a situated dialogue setting
Kennington C (2016)
Bielefeld: Universität Bielefeld.
Bielefeld E-Dissertation | English
Abstract / Note
The primary concern of this thesis is to model the resolution of spoken referring expressions
made in order to identify objects; in particular, everyday objects that can be perceived visually
and distinctly from other objects. The practical goal of such a model is for it to be implemented
as a component for use in a live, interactive, autonomous spoken dialogue system. The requirement of interaction imposes an added complication, one that has been ignored in previous
models and approaches to automatic reference resolution: the model must attempt to resolve
the reference incrementally as it unfolds, rather than waiting until the end of the referring expression to
begin the resolution process.
Beyond components in dialogue systems, reference has played a major role in the philosophy of meaning for more than a century. For example, Gottlob Frege (1892) distinguished
between Sinn (sense) and Bedeutung (reference) and discussed how they are related and how they
relate to the meaning of words and expressions. It has furthermore been argued (e.g., Dahlgren
(1976)) that reference to entities in the actual world is not just a fundamental notion of semantic theory, but the fundamental notion; an individual acquiring a language comes to understand
the meaning of many words and concepts through the task of reference, beginning in early
childhood. In this thesis, we pursue an account of word meaning that is based on perception of
objects; for example, the meaning of the word red is based on visual features that are selected
as distinguishing red objects from non-red ones.
This thesis proposes two statistical models of incremental reference resolution. Given examples
of referring expressions and visual aspects of the objects to which those expressions
referred, both models learn a functional mapping between the words of the referring
expressions and the visual aspects. A generative model, the simple incremental update
model, presented in Chapter 5, uses a mediating variable to learn the mapping, whereas a
discriminative model, the words-as-classifiers model, presented in Chapter 6, learns the mapping
directly and improves over the generative model. Both models have been evaluated in various
reference resolution tasks involving objects in virtual scenes as well as real, tangible objects. This
thesis shows that both models work robustly and are able to resolve referring expressions made
in reference to visually present objects despite realistic, noisy conditions of speech and object
recognition. A theoretical and practical comparison is also provided.
Special emphasis is given to the discriminative model in this thesis because of its simplicity
and ability to represent word meanings. The learning and application of this model
give credence to the above claim that reference is the fundamental notion for semantic theory
and that the meanings of (visual) words are learned through experiencing referring expressions made
to objects that are visually perceivable.
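
The idea of a discriminative model that maps words directly to visual aspects, applied one word at a time, can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the abstract does not state the classifier type, the visual features, the training procedure, or how per-word scores are combined. The sketch assumes one logistic classifier per word, a four-value feature vector (red, green, blue, roundness), negative examples drawn from the other objects in the scene, and a product of per-word probabilities renormalised after every word. The names `WordClassifier`, `train`, and `resolve_incrementally` are hypothetical.

```python
import math
from collections import defaultdict


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class WordClassifier:
    """Hypothetical per-word binary classifier: does this word fit this object?"""

    def __init__(self, n_features):
        self.w = [0.0] * n_features
        self.b = 0.0

    def prob(self, x):
        return sigmoid(sum(wi * xi for wi, xi in zip(self.w, x)) + self.b)

    def update(self, x, label, lr=0.5):
        # One stochastic gradient step on the logistic loss.
        err = label - self.prob(x)
        self.w = [wi + lr * err * xi for wi, xi in zip(self.w, x)]
        self.b += lr * err


def train(examples, n_features, epochs=200):
    """examples: list of (words, scene, target_id); scene maps id -> features."""
    classifiers = defaultdict(lambda: WordClassifier(n_features))
    for _ in range(epochs):
        for words, scene, target in examples:
            for word in words:
                # Assumption: the referred object is the positive example,
                # every other object in the scene is a negative example.
                for obj_id, feats in scene.items():
                    label = 1.0 if obj_id == target else 0.0
                    classifiers[word].update(feats, label)
    return classifiers


def resolve_incrementally(words, scene, classifiers):
    """Yield (word, distribution over object ids) after each word."""
    log_score = {obj_id: 0.0 for obj_id in scene}
    for word in words:
        clf = classifiers.get(word)
        if clf is not None:  # unknown words leave the belief unchanged
            for obj_id, feats in scene.items():
                log_score[obj_id] += math.log(max(clf.prob(feats), 1e-9))
        top = max(log_score.values())
        total = sum(math.exp(s - top) for s in log_score.values())
        yield word, {o: math.exp(s - top) / total for o, s in log_score.items()}


# Toy data (invented): features are [red, green, blue, roundness].
train_scene = {
    "o1": [1, 0, 0, 1],  # red circle
    "o2": [0, 0, 1, 1],  # blue circle
    "o3": [1, 0, 0, 0],  # red square
}
examples = [
    (["the", "red", "circle"], train_scene, "o1"),
    (["the", "blue", "circle"], train_scene, "o2"),
    (["the", "red", "square"], train_scene, "o3"),
]
classifiers = train(examples, n_features=4)

test_scene = {
    "a": [0, 0, 1, 0],  # blue square
    "b": [1, 0, 0, 1],  # red circle
    "c": [0, 0, 1, 1],  # blue circle
}
steps = list(resolve_incrementally(["the", "blue", "circle"], test_scene, classifiers))
for word, dist in steps:
    print(word, {obj: round(p, 3) for obj, p in dist.items()})
```

In this toy setting the distribution over candidate objects is available after every word, so a dialogue system could act on a partial expression such as "the blue" before the speaker finishes. The product-and-renormalise step is only one simple way to combine evidence and was chosen here for brevity.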
Year
2016
Page(s)
220
Page URI
https://pub.uni-bielefeld.de/record/2902194
Cite
Kennington C. Incrementally resolving references in order to identify visually present objects in a situated dialogue setting. Bielefeld: Universität Bielefeld; 2016.
Kennington, C. (2016). Incrementally resolving references in order to identify visually present objects in a situated dialogue setting. Bielefeld: Universität Bielefeld.
Kennington, Casey. 2016. Incrementally resolving references in order to identify visually present objects in a situated dialogue setting. Bielefeld: Universität Bielefeld.
Kennington, C., 2016. Incrementally resolving references in order to identify visually present objects in a situated dialogue setting, Bielefeld: Universität Bielefeld.
C. Kennington, Incrementally resolving references in order to identify visually present objects in a situated dialogue setting, Bielefeld: Universität Bielefeld, 2016.
Kennington, C.: Incrementally resolving references in order to identify visually present objects in a situated dialogue setting. Universität Bielefeld, Bielefeld (2016).
Kennington, Casey. Incrementally resolving references in order to identify visually present objects in a situated dialogue setting. Bielefeld: Universität Bielefeld, 2016.
All files available under the following license(s):
Copyright Statement:
This object is protected by copyright and/or related rights. [...]
Full text(s)
Name
Access Level
Open Access
Last uploaded
2019-09-25T06:47:03Z
MD5 checksum
20159f2e305a4b64401d3766378c111e