Whole body motion noise cancellation of a robot for improved automatic speech recognition

Gökhan Ince*, Kazuhiro Nakadai, Tobias Rodemann, Hiroshi Tsujino, Jun Ichi Imura

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

10 Citations (Scopus)

Abstract

The motors of a robot produce ego-motion noise that degrades the quality of recorded sounds. This paper describes an architecture that enhances the capability of a robot to perform automatic speech recognition (ASR) even while its entire body is moving. The architecture consists of three blocks: (i) a multichannel noise reduction block, consisting of microphone-array-based sound localization, geometric source separation and post-filtering, (ii) a single-channel template subtraction block and (iii) an ASR block. As the first step of our analysis strategy, we divided the whole-body motion noise problem into three subdomains, namely arm, leg and head motion noise, according to their intensity levels and spatial locations. Subsequently, by following a synthesis-by-analysis approach, we determined the best method for suppressing each type of ego-motion noise. Finally, we proposed a control module for our ASR framework; this module makes decisions based on instantaneously detected motions, allowing it to switch to the most appropriate method for the current type of noise. The proposed system improved word correct rates by up to 50 points compared with single-microphone recognition during arm, leg and head motions.
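The switching idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. All names (`detect_motion`, `select_method`, `subtract_template`, `EXAMPLE_POLICY`), the speed threshold, and the priority order are assumptions made for illustration. In particular, the abstract does not state which method was found best for arm, leg or head noise, so the policy table below holds placeholders only. The subtraction step is a generic magnitude-spectrum subtraction with a floor, standing in for the template subtraction block.

```python
from typing import List, Mapping, Set

# Hypothetical labels for the two suppression blocks named in the abstract.
MULTICHANNEL = "multichannel_noise_reduction"
TEMPLATE_SUBTRACTION = "template_subtraction"

# Placeholder policy: NOT the paper's findings, only an example mapping.
EXAMPLE_POLICY = {
    "head": TEMPLATE_SUBTRACTION,
    "arm": MULTICHANNEL,
    "leg": MULTICHANNEL,
}


def detect_motion(joint_speeds: Mapping[str, float], threshold: float = 0.01) -> Set[str]:
    """Return the body parts whose absolute joint speed exceeds an assumed threshold."""
    return {part for part, speed in joint_speeds.items() if abs(speed) > threshold}


def select_method(moving: Set[str], policy: Mapping[str, str], default: str) -> str:
    """Choose a suppression method for the parts that are currently moving.

    When several parts move at once, the first match in an assumed
    priority order (head, arm, leg) wins.
    """
    for part in ("head", "arm", "leg"):
        if part in moving:
            return policy[part]
    return default


def subtract_template(noisy_mag: List[float], template_mag: List[float],
                      floor: float = 0.0) -> List[float]:
    """Subtract a noise template from a magnitude spectrum, bin by bin,
    clamping each result at `floor` so no bin goes negative."""
    return [max(n - t, floor) for n, t in zip(noisy_mag, template_mag)]


# Usage: only the arm is moving, so the policy entry for "arm" is selected.
moving_parts = detect_motion({"head": 0.0, "arm": 0.5, "leg": 0.0})
method = select_method(moving_parts, EXAMPLE_POLICY, default=MULTICHANNEL)
```

Passing the policy as a parameter keeps the decision logic separate from the empirical question of which method suits which motion, which is what the paper's analysis is reported to determine.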

Original language: English
Pages (from-to): 1405-1426
Number of pages: 22
Journal: Advanced Robotics
Volume: 25
Issue number: 11
Publication status: Published - 2011
Externally published: Yes

Keywords

  • Robot audition
  • automatic speech recognition
  • noise reduction
  • template subtraction
  • whole-body motion noise
