Robust Neuro-Fuzzy Speaker Localization Using a Circular Microphone Array

Axel Plinge, Marius H. Hennecke and Gernot A. Fink
Proc. 12th International Workshop on Acoustic Echo and Noise Control (IWAENC), 2010.

Tel Aviv, Israel

Abstract

A major application area of microphone array processing is the localization of sound sources, mainly of speaking persons. In contrast to most state-of-the-art approaches, which are based on correlation measures, we propose a neurologically inspired system that generalizes findings about human spatial hearing to the multi-channel case. It mimics the processing in the human cochlea and the auditory midbrain. To enhance the localization quality, a new spike generation approach is introduced, termed peak-over-average position (PoAP). A fuzzy combination is used to remove putative artifacts. In contrast to a human listener, we employ multiple sensors to gain robustness in reverberant and noisy environments. Post-processing estimates the locations of concurrent speakers. The robustness of the proposed system is shown by comparison with the well-known steered response power approach. Finally, we show the applicability of our real-time neuro-fuzzy model to the concurrent speaker localization task using real reverberant recordings.
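For readers unfamiliar with the steered response power (SRP) baseline named in the abstract, the following is a minimal sketch of the general idea, not the authors' neuro-fuzzy method and not their exact baseline configuration. It steers a delay-and-sum beam over a grid of azimuths and picks the direction with the highest output power. The array geometry (8 microphones, 0.2 m radius), the 48 kHz sampling rate, the far-field assumption, the integer-sample delays, and all function names are assumptions chosen for illustration.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def circular_array(n_mics, radius):
    """Return (x, y) microphone positions evenly spaced on a circle."""
    return [(radius * math.cos(2 * math.pi * m / n_mics),
             radius * math.sin(2 * math.pi * m / n_mics))
            for m in range(n_mics)]


def sample_advances(mics, azimuth, fs):
    """Samples by which a far-field plane wave from `azimuth` reaches each
    microphone earlier than the array centre (rounded to whole samples)."""
    ux, uy = math.cos(azimuth), math.sin(azimuth)
    return [round((x * ux + y * uy) / SPEED_OF_SOUND * fs) for x, y in mics]


def simulate_plane_wave(source, mics, azimuth, fs):
    """Toy anechoic simulation: each channel is a circularly shifted copy
    of the source signal (no reverberation, no noise)."""
    n = len(source)
    return [[source[(t + a) % n] for t in range(n)]
            for a in sample_advances(mics, azimuth, fs)]


def steered_response_power(signals, mics, fs, n_angles=72):
    """Delay-and-sum output power for each candidate azimuth on a grid."""
    n = len(signals[0])
    powers = []
    for k in range(n_angles):
        azimuth = 2 * math.pi * k / n_angles
        advances = sample_advances(mics, azimuth, fs)
        power = 0.0
        for t in range(n):
            # Undo the hypothesised advance of each channel, then sum.
            beam = sum(sig[(t - a) % n] for sig, a in zip(signals, advances))
            power += beam * beam
        powers.append(power)
    return powers


# Usage: a white-noise source at 60 degrees azimuth.
rng = random.Random(0)
source = [rng.gauss(0.0, 1.0) for _ in range(512)]
fs = 48000
mics = circular_array(8, 0.2)
signals = simulate_plane_wave(source, mics, math.radians(60), fs)
powers = steered_response_power(signals, mics, fs)
best = max(range(len(powers)), key=powers.__getitem__)
estimated_azimuth_deg = best * 360 / len(powers)
```

In this idealized setting the power peaks at the true direction because all channels add coherently there. Practical SRP implementations typically work in the frequency domain with fractional delays and often apply a phase transform weighting; reverberation and noise, which the abstract names as the challenging conditions, are not modelled here.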