Real-time super-resolution Sound Source Localization for robots

Keisuke Nakamura*, Kazuhiro Nakadai, Gokhan Ince

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

56 Citations (Scopus)


Sound Source Localization (SSL) is an essential function for robot audition: it yields the number and locations of sound sources, which are used by post-processes such as sound source separation. SSL for a robot in a real environment mainly requires noise robustness, high resolution, and real-time processing. Multiple Signal Classification based on Standard Eigenvalue Decomposition (SEVD-MUSIC), a microphone array processing technique, is commonly used for localization. We improved its robustness against high-power noise by incorporating Generalized Eigenvalue Decomposition (GEVD). However, GEVD-based MUSIC (GEVD-MUSIC) has two main issues: 1) the resolution of the pre-measured Transfer Functions (TFs) determines the resolution of SSL, and 2) its computational cost is too high for real-time processing. For the first issue, we propose a TF interpolation method that integrates time-domain-based and frequency-domain-based interpolation. The interpolation achieves super-resolution SSL, whose resolution is higher than that of the pre-measured TFs. For the second issue, we propose two methods: MUSIC based on Generalized Singular Value Decomposition (GSVD-MUSIC) and Hierarchical SSL (H-SSL). GSVD-MUSIC drastically reduces the computational cost while maintaining noise robustness in localization. H-SSL also reduces the computational cost by using a hierarchical search algorithm instead of a greedy search in localization. These techniques are integrated into an SSL system using a robot-embedded microphone array. The experimental results showed that the proposed interpolation achieved approximately 1-degree resolution although TFs were available only at 30-degree intervals; GSVD-MUSIC required 46.4% and 40.6% of the computational cost of SEVD-MUSIC and GEVD-MUSIC, respectively; and H-SSL reduced the computational cost by 59.2% when localizing a single sound source.
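The hierarchical search idea in the abstract can be illustrated with a minimal coarse-to-fine sketch. This is not the authors' H-SSL implementation. The function names, the 30-degree and 1-degree grid steps (borrowed from the reported TF interval and resolution), and `toy_spectrum` (a smooth single-peak stand-in for a MUSIC spatial spectrum) are all assumptions made for illustration. The baseline here is an exhaustive scan of the fine grid, which may differ from the search the paper compares against.

```python
import math


def toy_spectrum(azimuth_deg, source_deg=47.0, width_deg=20.0):
    """Hypothetical single-peak stand-in for a MUSIC spatial spectrum."""
    # Wrap the angular difference into [-180, 180) so the peak is circular.
    diff = (azimuth_deg - source_deg + 180.0) % 360.0 - 180.0
    return math.exp(-((diff / width_deg) ** 2))


def exhaustive_search(spectrum, step_deg=1):
    """Evaluate every direction on the fine grid.

    Returns (best_azimuth_deg, number_of_spectrum_evaluations).
    """
    candidates = range(0, 360, step_deg)
    best = max(candidates, key=spectrum)
    return best, len(candidates)


def hierarchical_search(spectrum, coarse_step=30, fine_step=1):
    """Scan a coarse grid, then refine only around the coarse peak.

    Assumes a single source whose spectral peak is broad enough to be
    visible on the coarse grid; several sources would need one refinement
    window per coarse peak.
    """
    coarse = range(0, 360, coarse_step)
    coarse_best = max(coarse, key=spectrum)
    # Refine within one coarse cell on either side of the coarse peak.
    fine = [
        (coarse_best + offset) % 360
        for offset in range(-coarse_step, coarse_step + 1, fine_step)
    ]
    best = max(fine, key=spectrum)
    return best, len(coarse) + len(fine)


if __name__ == "__main__":
    print(exhaustive_search(toy_spectrum))
    print(hierarchical_search(toy_spectrum))
```

On this toy spectrum both searches return the same direction, but the hierarchical version evaluates the spectrum far fewer times. In a MUSIC-based system each evaluation costs a steering-vector projection per frequency bin, so fewer evaluations plausibly mean less computation. The saving measured here says nothing about the 59.2% figure reported in the abstract, which comes from the authors' own system.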

Original language: English
Title of host publication: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2012
Number of pages: 6
Publication status: Published - 2012
Externally published: Yes
Event: 25th IEEE/RSJ International Conference on Robotics and Intelligent Systems, IROS 2012 - Vilamoura, Algarve, Portugal
Duration: 7 Oct 2012 to 12 Oct 2012

Publication series

Name: IEEE International Conference on Intelligent Robots and Systems
ISSN (Print): 2153-0858
ISSN (Electronic): 2153-0866


Conference: 25th IEEE/RSJ International Conference on Robotics and Intelligent Systems, IROS 2012
City: Vilamoura, Algarve

