Multimodal Human Machine Interactions in Virtual and Augmented Reality

Gérard Chollet, Anna Esposito, Annie Gentes, Patrick Horain, Walid Karam, Zhenbo Li, Catherine Pelachaud, Patrick Perrot, Dijana Petrovska-Delacrétaz, Dianle Zhou and Leila Zouari

Publication reference:

G. Chollet, A. Esposito, A. Gentes, P. Horain, W. Karam, Z. Li, C. Pelachaud, P. Perrot, D. Petrovska-Delacrétaz, D. Zhou, L. Zouari, "Multimodal Human Machine Interactions in Virtual and Augmented Reality", in Multimodal Signals: Cognitive and Algorithmic Issues. Selected papers from COST Action 2102 1st International Training School 2008. Anna Esposito, Amir Hussain, Maria Marinaro (Editors), LNCS Volume 5398/2009, Springer-Verlag, pp. 1-23, 2008 [doi:10.1007/978-3-642-00525-1_1].


Abstract:

Virtual worlds are developing rapidly over the Internet. They are visited by avatars and staffed with Embodied Conversational Agents (ECAs). An avatar is a representation of a physical person. Each person controls one or several avatars and usually receives feedback from the virtual world on an audio-visual display. Ideally, all senses should be used to feel fully embedded in a virtual world. Sound, vision and sometimes touch are the available modalities. This paper reviews the technological developments which enable audio-visual interactions in virtual and augmented reality worlds. Emphasis is placed on speech and gesture interfaces, including talking face analysis and synthesis.

Full text:

Copyright 2008 Springer-Verlag. The copyright to this contribution has been transferred to Springer-Verlag GmbH Berlin Heidelberg. The article is published on Springer's website.
PDF (377 kbytes).