Simple auditory and visual features for human-robot dialog scene analysis

Yan R, Rodemann T, Wrede B (2013)
In: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway, NJ: IEEE: 700-706.

Conference Paper | Published | English

Author
Yan, Rujiao; Rodemann, Tobias; Wrede, Britta
Abstract
This paper presents a system that uses various simple auditory and visual features to achieve human-robot dialog scene analysis. Our scene analysis system is able to learn how many speakers are in the scenario, where the speakers are, and who is currently speaking. Speakers are unknown in advance. A visual short-term memory (STM) helps to memorize persons, even if they disappear from the camera's field of view for a while due to movements of the persons or the robot head. In comparison to our previous work, we apply more visual features, such as height, color, and texture features of different upper body parts, to improve the scene representation performance. We show that our system is able to assign words to the corresponding speakers. A speaker is recognized again after leaving and re-entering the scene, or after changing position, even when a new person has appeared.
Publishing Year
2013
Conference
2012 IEEE/RSJ International Conference on Intelligent Robots and Systems
Location
Vilamoura, Portugal
Conference Date
2012-10-07 – 2012-10-12
PUB-ID

Cite this

Yan, R., Rodemann, T., & Wrede, B. (2013). Simple auditory and visual features for human-robot dialog scene analysis. 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, 700-706. Piscataway, NJ: IEEE. doi:10.1109/iros.2012.6385534