Multi-talker speech recognition under ego-motion noise using missing feature theory

Gökhan Ince*, Kazuhiro Nakadai, Tobias Rodemann, Hiroshi Tsujino, Jun Ichi Imura

*Corresponding author for this work

Research output: Conference contribution, peer-reviewed

3 Citations (Scopus)

Abstract

This paper presents a system that enables a mobile robot to recognize a target speaker's speech even while the robot is performing an action and multiple speakers are talking in the room. The system must address two problems: (1) while the robot is moving, the motors in its joints inevitably generate ego-motion noise; (2) recognizing the target speech against interfering speech signals is difficult. The typical solutions to (1) and (2), motor noise suppression and sound source separation, both distort the processed signals, which degrades automatic speech recognition (ASR) performance. Instead of removing the ego-motion noise with conventional noise suppression methods, we investigate methods that eliminate the unreliable parts of the audio features contaminated by ego-motion noise. For this purpose, we model masks that filter out unreliable speech features based on the ratio of speech energy to motor noise energy. We analyze the performance of the proposed technique under various test conditions and compare it with existing Missing Feature Theory-based ASR implementations. Finally, we propose a framework that integrates two masks, one designed to eliminate ego noise and the other to filter the leakage energy of interfering sound sources. We demonstrate that the proposed methods achieve high ASR accuracy.
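The masking idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual mask formulation, so everything below is an assumption made for illustration: the hard (binary) mask, the threshold value, the elementwise-minimum rule for combining the two masks, and the function names `energy_ratio_mask` and `combine_masks`. Rows stand for time frames and columns for frequency bands.

```python
from typing import List

Matrix = List[List[float]]


def energy_ratio_mask(speech_energy: Matrix, noise_energy: Matrix,
                      threshold: float = 1.0) -> Matrix:
    """Return 1.0 where speech energy exceeds threshold * noise energy, else 0.0.

    Assumption: a hard mask with a fixed threshold; the paper may use a
    different (for example soft) formulation.
    """
    mask = []
    for s_row, n_row in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(s_row, n_row):
            # Comparing s > threshold * n avoids dividing by a zero noise energy.
            row.append(1.0 if s > threshold * n else 0.0)
        mask.append(row)
    return mask


def combine_masks(mask_a: Matrix, mask_b: Matrix) -> Matrix:
    """Keep a feature only if both masks mark it reliable (elementwise minimum).

    Assumption: this combination rule is illustrative, not the paper's framework.
    """
    return [[min(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(mask_a, mask_b)]


# Toy energies: 2 time frames x 3 frequency bands (made-up numbers).
speech = [[4.0, 1.0, 9.0], [0.5, 6.0, 2.0]]
motor = [[1.0, 2.0, 3.0], [1.0, 1.0, 2.0]]

ego_mask = energy_ratio_mask(speech, motor, threshold=2.0)
# Hypothetical mask flagging leakage from an interfering speaker.
leak_mask = [[1.0, 1.0, 0.0], [1.0, 1.0, 1.0]]
combined = combine_masks(ego_mask, leak_mask)

print(ego_mask)   # [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
print(combined)   # [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
```

In a missing-feature recognizer, features with mask value 0 would be ignored or down-weighted during decoding rather than being "cleaned", which is why no noise suppression step appears in the sketch.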

Original language: English
Title of host publication: IEEE/RSJ 2010 International Conference on Intelligent Robots and Systems, IROS 2010 - Conference Proceedings
Pages: 982-987
Number of pages: 6
Publication status: Published - 2010
Externally published: Yes
Event: 23rd IEEE/RSJ 2010 International Conference on Intelligent Robots and Systems, IROS 2010 - Taipei, Taiwan, Province of China
Duration: 18 Oct 2010 to 22 Oct 2010

Publication series

Name: IEEE/RSJ 2010 International Conference on Intelligent Robots and Systems, IROS 2010 - Conference Proceedings

Conference

Conference: 23rd IEEE/RSJ 2010 International Conference on Intelligent Robots and Systems, IROS 2010
Country/Territory: Taiwan, Province of China
City: Taipei
Period: 18/10/10 to 22/10/10
