Assessment of general applicability of ego noise estimation: Applications to automatic speech recognition and sound source localization

Gökhan Ince*, Keisuke Nakamura, Futoshi Asano, Hirofumi Nakajima, Kazuhiro Nakadai

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

21 Citations (Scopus)

Abstract

Noise generated by a robot's own motion degrades the quality of the desired sounds recorded by robot-embedded microphones. In addition, a moving robot is affected by its loud fan noise, whose direction changes relative to the moving limbs on which the microphones are mounted. To tackle the non-stationary ego-motion noise and the direction changes of fan noise, we propose an estimation method based on instantaneous prediction of ego noise using parameterized templates. We verify the ego noise suppression capability of the proposed method on a humanoid robot by evaluating it in two important applications within the framework of robot audition: (1) automatic speech recognition and (2) sound source localization. We demonstrate that our method considerably improves recognition and localization performance during both head and arm motions.

Original language: English
Title of host publication: 2011 IEEE International Conference on Robotics and Automation, ICRA 2011
Pages: 3517-3522
Number of pages: 6
Publication status: Published - 2011
Externally published: Yes
Event: 2011 IEEE International Conference on Robotics and Automation, ICRA 2011 - Shanghai, China
Duration: 9 May 2011 to 13 May 2011

Publication series

Name: Proceedings - IEEE International Conference on Robotics and Automation
ISSN (Print): 1050-4729

Conference

Conference: 2011 IEEE International Conference on Robotics and Automation, ICRA 2011
Country/Territory: China
City: Shanghai
Period: 9/05/11 to 13/05/11
