TY - GEN
T1 - Human gesture analysis using multimodal features
AU - Luo, Dan
AU - Ekenel, Hazim Kemal
AU - Ohya, Jun
PY - 2012
Y1 - 2012
N2 - As a natural interface, human gesture plays a crucial role in achieving intelligent human-computer interaction (HCI). Human gestures convey meaning through several visual components, such as hand motion, facial expression, and torso movement. So far, most work in gesture recognition has focused on the manual component of gestures. In this paper, we present an appearance-based multimodal gesture recognition framework that combines different groups of features, such as facial expression features and hand motion features, extracted from image frames captured by a single web camera. We consider 12 classes of human gestures with facial expressions, covering neutral, negative, and positive meanings, drawn from American Sign Language (ASL). We combine the features at two levels by employing two fusion strategies. At the feature level, early combination is performed by concatenating and weighting the different feature groups, and partial least squares (PLS) is used to select the most discriminative elements by projecting the features onto a discriminative expression space. The second strategy is applied at the decision level: weighted decisions from the single modalities are fused at a later stage. A condensation-based algorithm is adopted for classification. We collected a data set with three to seven recording sessions and conducted experiments with the combination techniques. Experimental results show that facial analysis improves hand gesture recognition and that decision-level fusion performs better than feature-level fusion.
AB - As a natural interface, human gesture plays a crucial role in achieving intelligent human-computer interaction (HCI). Human gestures convey meaning through several visual components, such as hand motion, facial expression, and torso movement. So far, most work in gesture recognition has focused on the manual component of gestures. In this paper, we present an appearance-based multimodal gesture recognition framework that combines different groups of features, such as facial expression features and hand motion features, extracted from image frames captured by a single web camera. We consider 12 classes of human gestures with facial expressions, covering neutral, negative, and positive meanings, drawn from American Sign Language (ASL). We combine the features at two levels by employing two fusion strategies. At the feature level, early combination is performed by concatenating and weighting the different feature groups, and partial least squares (PLS) is used to select the most discriminative elements by projecting the features onto a discriminative expression space. The second strategy is applied at the decision level: weighted decisions from the single modalities are fused at a later stage. A condensation-based algorithm is adopted for classification. We collected a data set with three to seven recording sessions and conducted experiments with the combination techniques. Experimental results show that facial analysis improves hand gesture recognition and that decision-level fusion performs better than feature-level fusion.
KW - Condensation Algorithm
KW - Facial Expression
KW - Gesture Recognition
UR - http://www.scopus.com/inward/record.url?scp=84866843771&partnerID=8YFLogxK
U2 - 10.1109/ICMEW.2012.88
DO - 10.1109/ICMEW.2012.88
M3 - Conference contribution
AN - SCOPUS:84866843771
SN - 9780769547299
T3 - Proceedings of the 2012 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2012
SP - 471
EP - 476
BT - Proceedings of the 2012 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2012
T2 - 2012 IEEE International Conference on Multimedia and Expo Workshops, ICMEW 2012
Y2 - 9 July 2012 through 13 July 2012
ER -