J. Richarz and G. A. Fink
Journal of Ambient Intelligence and Smart Environments, Thematic Issue on Computer Vision for Ambient Intelligence, 3(3), pages 193-211, 2011.
In human-machine interaction, gestures play an important role as an input modality for natural and intuitive interfaces. While the use of sign languages or crafted command gestures typically requires special user training, the class of gestural actions called "emblems" represents more intuitive yet expressive signs that seem well suited to the task. Following this, an approach for the visual recognition of 3D emblematic arm gestures in a realistic smart room scenario is presented. Hand and head positions are extracted from multiple unsynchronized monocular camera streams, combined into spatiotemporal 3D gesture trajectories, and classified in a Hidden Markov Model (HMM) classification and detection framework. The contributions of this article are threefold: First, a solution is proposed for the 3D combination of trajectories obtained from unsynchronized cameras with varying frame rates. Second, the suitability of different alternative feature representations derived from a hand trajectory is assessed, and it is shown that intuitive gestures can be represented by projection onto their principal plane of motion. Third, it is demonstrated that a rejection model for gesture spotting and segmentation can be constructed using out-of-domain data. The approach is evaluated on a challenging, realistic data set.
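To illustrate the problem behind the first contribution, the following is a minimal sketch of one simple way to bring 2D tracks from unsynchronized cameras with different frame rates onto a shared time base before any 3D combination. It assumes linear interpolation onto a common time grid; the abstract does not state that the article uses this method, and the names resample, common_grid, cam_a and cam_b as well as all sample values are hypothetical.

```python
from bisect import bisect_right


def resample(timestamps, points, grid):
    """Linearly interpolate points sampled at `timestamps` onto `grid`.

    `timestamps` must be strictly increasing; values outside the sampled
    interval are clamped to the first or last point.
    """
    out = []
    for t in grid:
        if t <= timestamps[0]:
            out.append(tuple(points[0]))
            continue
        if t >= timestamps[-1]:
            out.append(tuple(points[-1]))
            continue
        # Index of the first timestamp strictly greater than t.
        i = bisect_right(timestamps, t)
        t0, t1 = timestamps[i - 1], timestamps[i]
        w = (t - t0) / (t1 - t0)
        p0, p1 = points[i - 1], points[i]
        out.append(tuple(a + w * (b - a) for a, b in zip(p0, p1)))
    return out


def common_grid(streams, step):
    """Regular time grid covering the interval where all streams overlap."""
    start = max(ts[0] for ts, _ in streams)
    end = min(ts[-1] for ts, _ in streams)
    n = int((end - start) / step + 1e-9) + 1
    return [start + k * step for k in range(n)]


# Hypothetical 2D hand positions (pixels) with per-camera timestamps (seconds).
cam_a = ([0.0, 0.5, 1.0, 1.5], [(0, 0), (1, 0), (2, 0), (3, 0)])
cam_b = ([0.25, 0.75, 1.75], [(0, 0), (2, 2), (6, 6)])

grid = common_grid([cam_a, cam_b], step=0.25)
a_on_grid = resample(cam_a[0], cam_a[1], grid)
b_on_grid = resample(cam_b[0], cam_b[1], grid)
```

After resampling, a_on_grid[k] and b_on_grid[k] refer to the same instant grid[k], so corresponding observations could be passed to a triangulation step. Linear interpolation is only a reasonable assumption when the hand moves smoothly between frames.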