Integrating Speaker Identification and Learning with Adaptive Speech Recognition

Gernot A. Fink and Thomas Plötz
2004: A Speaker Odyssey -- The Speaker and Language Recognition Workshop, pages 185-192, 2004.

Toledo


Abstract

Presently, speaker-adaptive systems are the state of the art in automatic speech recognition. A general baseline model is adapted to the current speaker during recognition in order to improve the quality of the results. However, the adaptation procedure needs to be able to distinguish between data from different speakers. Therefore, a general speaker-adaptive recognizer has to perform speaker recognition implicitly. The resulting information about the identity of the person speaking can be of great importance in many applications of speech recognition, e.g., in man-machine communication. We therefore propose an integrated framework for speech and speaker recognition. Our system is able to detect new speakers and to identify already known ones. For a new speaker, both an identification model and an adapted recognition model are learned from limited data. The latter is then used to recognize utterances attributed to this speaker. We present evaluation results on speaker identification performance for two non-trivial speech recognition tasks, which demonstrate the effectiveness of our integrated approach.