Sound source separation and automatic speech recognition for moving sources

Kazuhiro Nakadai*, Hirofumi Nakajima, Gökhan Ince, Yuji Hasegawa

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

6 Citations (Scopus)

Abstract

This paper addresses sound source separation and speech recognition for moving sound sources. Real-world applications such as robots must cope with both moving and stationary sound sources, yet most studies assume only stationary sources. We introduce three key techniques to cope with moving sources: Adaptive Step-size control (AS), Optima Controlled Recursive Average (OCRA), and Separation Parameter Switching (SPS). We implemented a real-time robot audition system with these techniques for our humanoid robot, which has an 8-channel microphone array, using HARK, our open-source software for robot audition. Preliminary results show that recognition performance for moving sound sources improved drastically. We also demonstrate the system through two speech dialog scenarios that require sound source separation and automatic speech recognition for moving sources.

Original language: English
Title of host publication: IEEE/RSJ 2010 International Conference on Intelligent Robots and Systems, IROS 2010 - Conference Proceedings
Pages: 976-981
Number of pages: 6
Publication status: Published - 2010
Externally published: Yes
Event: 23rd IEEE/RSJ 2010 International Conference on Intelligent Robots and Systems, IROS 2010 - Taipei, Taiwan, Province of China
Duration: 18 Oct 2010 to 22 Oct 2010

Publication series

Name: IEEE/RSJ 2010 International Conference on Intelligent Robots and Systems, IROS 2010 - Conference Proceedings

Conference

Conference: 23rd IEEE/RSJ 2010 International Conference on Intelligent Robots and Systems, IROS 2010
Country/Territory: Taiwan, Province of China
City: Taipei
Period: 18/10/10 to 22/10/10
