Simple auditory and visual features for human-robot dialog scene analysis

Yan R, Rodemann T, Wrede B (2013)
In: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway, NJ: IEEE: 700-706.

Conference paper | Published | English
Author
Yan, Rujiao; Rodemann, Tobias; Wrede, Britta
Abstract / Note
This paper presents a system that uses various simple auditory and visual features to achieve human-robot dialog scene analysis. Our scene analysis system is able to learn how many speakers are in the scenario, where the speakers are, and who is currently speaking. Speakers are unknown in advance. A visual short-term memory (STM) helps to memorize persons, even if they disappear from the camera's field of view for a while due to movements of the persons or the robot head. In comparison to our previous work, we apply more visual features, such as height, color, and texture features of different upper body parts, to improve the scene representation performance. We show that our system is able to assign words to the corresponding speakers. A speaker is recognized again when he leaves and re-enters the scene, or changes his position, even when a new person appears.
Publication year
2013
Proceedings title
2012 IEEE/RSJ International Conference on Intelligent Robots and Systems
Pages
700-706
Conference
2012 IEEE/RSJ International Conference on Intelligent Robots and Systems
Conference location
Vilamoura, Portugal
Conference date
2012-10-07 – 2012-10-12
PUB-ID

Cite

Yan R, Rodemann T, Wrede B. Simple auditory and visual features for human-robot dialog scene analysis. In: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway, NJ: IEEE; 2013: 700-706.
Yan, R., Rodemann, T., & Wrede, B. (2013). Simple auditory and visual features for human-robot dialog scene analysis. 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, 700-706. Piscataway, NJ: IEEE. doi:10.1109/iros.2012.6385534