Axel Plinge, Rene Grzeszick and Gernot A. Fink
Proc. Int. Conf. on Acoustics, Speech and Signal Processing, 2014.
The classification of acoustic events in indoor environments is an important task for many practical applications in smart environments. This paper proposes a novel approach for classifying acoustic events that is based on a Bag-of-Features representation. Mel and gammatone frequency cepstral coefficients, which originate from psychoacoustic models, are used as input features for the Bag-of-Features representation. Rather than using a prior classification or segmentation step to eliminate silence and background noise, Bag-of-Features representations are learned for a background class. Supervised learning of codebooks and temporal coding are shown to improve the recognition rates. Three different databases are used for the experiments: the CLEAR sound event dataset, the D-CASE event dataset and a new set of smart room recordings.
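
The Bag-of-Features idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: random Gaussian vectors stand in for real MFCC/GFCC frames, the class names are hypothetical, "supervised codebook learning" is read as clustering each class separately and concatenating the centroids, classification uses a simple histogram intersection, and temporal coding is omitted. All function names are made up for illustration.

```python
import numpy as np

def kmeans(frames, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns k centroids of shape (k, dim)."""
    rng = np.random.default_rng(seed)
    centroids = frames[rng.choice(len(frames), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(frames[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = frames[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids

def learn_supervised_codebook(frames_by_class, k_per_class):
    """Assumed reading of 'supervised': one k-means per class, concatenated."""
    return np.vstack([kmeans(f, k_per_class)
                      for _, f in sorted(frames_by_class.items())])

def bag_of_features(frames, codebook):
    """Hard-quantize frames and return a normalized codeword histogram."""
    dists = np.linalg.norm(frames[:, None, :] - codebook[None, :, :], axis=2)
    counts = np.bincount(dists.argmin(axis=1), minlength=len(codebook))
    return counts / counts.sum()

# Synthetic stand-in data: 13-dimensional "cepstral" frames per class.
# The background is modeled as an ordinary class instead of being removed.
rng = np.random.default_rng(0)
dim = 13
class_means = {"background": 0.0, "door_knock": 5.0, "speech": -5.0}
train = {c: rng.normal(m, 1.0, size=(200, dim)) for c, m in class_means.items()}

codebook = learn_supervised_codebook(train, k_per_class=4)
models = {c: bag_of_features(f, codebook) for c, f in train.items()}

def classify(window):
    """Pick the class whose model histogram overlaps most with the window's."""
    h = bag_of_features(window, codebook)
    return max(models, key=lambda c: np.minimum(h, models[c]).sum())

test_window = rng.normal(0.0, 1.0, size=(50, dim))  # a window of background frames
predicted = classify(test_window)
print(predicted)
```

Because the background has its own histogram model, a window of silence or noise is simply assigned to that class, so no separate segmentation step is needed in this sketch.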