TY - GEN
T1 - Robot audition for dynamic environments
AU - Nakadai, Kazuhiro
AU - Ince, Gokhan
AU - Nakamura, Keisuke
AU - Nakajima, Hirofumi
PY - 2012
Y1 - 2012
N2 - This paper addresses robot audition for dynamic environments, where speakers and/or the robot are moving within a dynamically changing acoustic environment. Robot audition studies so far have assumed only stationary human-robot interaction scenes, and thus have difficulty coping with such dynamic environments. We recently developed new techniques that let a robot listen to several things simultaneously using its own ears, even in dynamic environments: MUltiple SIgnal Classification based on Generalized Eigen-Value Decomposition (GEVD-MUSIC), Geometrically constrained High-order Decorrelation based Source Separation with Adaptive Step-size control (GHDSS-AS), Histogram-based Recursive Level Estimation (HRLE), and Template-based Ego Noise Suppression (TENS). GEVD-MUSIC provides noise-robust sound source localization. GHDSS-AS is a new sound source separation method that quickly adapts its separation parameters to dynamic changes. HRLE is a practical post-filtering method with a small number of parameters. TENS estimates the robot's motor noise using templates recorded in advance and eliminates it. These methods are implemented as modules for our open-source robot audition software HARK so that they can be easily integrated. Through off-line experiments and on-line real-time demonstrations, we show that each of these methods, and their combinations, cope effectively with dynamic environments.
AB - This paper addresses robot audition for dynamic environments, where speakers and/or the robot are moving within a dynamically changing acoustic environment. Robot audition studies so far have assumed only stationary human-robot interaction scenes, and thus have difficulty coping with such dynamic environments. We recently developed new techniques that let a robot listen to several things simultaneously using its own ears, even in dynamic environments: MUltiple SIgnal Classification based on Generalized Eigen-Value Decomposition (GEVD-MUSIC), Geometrically constrained High-order Decorrelation based Source Separation with Adaptive Step-size control (GHDSS-AS), Histogram-based Recursive Level Estimation (HRLE), and Template-based Ego Noise Suppression (TENS). GEVD-MUSIC provides noise-robust sound source localization. GHDSS-AS is a new sound source separation method that quickly adapts its separation parameters to dynamic changes. HRLE is a practical post-filtering method with a small number of parameters. TENS estimates the robot's motor noise using templates recorded in advance and eliminates it. These methods are implemented as modules for our open-source robot audition software HARK so that they can be easily integrated. Through off-line experiments and on-line real-time demonstrations, we show that each of these methods, and their combinations, cope effectively with dynamic environments.
KW - Dynamic environment
KW - Ego noise suppression
KW - Microphone array
KW - Robot audition
KW - Sound source localization
KW - Sound source separation
UR - http://www.scopus.com/inward/record.url?scp=84869471260&partnerID=8YFLogxK
U2 - 10.1109/ICSPCC.2012.6335729
DO - 10.1109/ICSPCC.2012.6335729
M3 - Conference contribution
AN - SCOPUS:84869471260
SN - 9781467321938
T3 - 2012 IEEE International Conference on Signal Processing, Communications and Computing, ICSPCC 2012
SP - 125
EP - 130
BT - 2012 IEEE International Conference on Signal Processing, Communications and Computing, ICSPCC 2012
T2 - 2012 2nd IEEE International Conference on Signal Processing, Communications and Computing, ICSPCC 2012
Y2 - 12 August 2012 through 15 August 2012
ER -