Bag-of-Features HMMs for Segmentation-Free Word Spotting in Handwritten Documents

Leonard Rothacker, Marçal Rusiñol and Gernot A. Fink
Proc. Int. Conf. on Document Analysis and Recognition, 2013.

Washington DC, USA

Abstract

Recent HMM-based approaches to handwritten word spotting require large amounts of training samples and mostly rely on a prior segmentation of the document. We propose Bag-of-Features HMMs, each estimated from a single sample, for use in a patch-based, segmentation-free framework. Bag-of-Features HMMs use statistics of local image feature representatives. They can therefore be considered a variant of discrete HMMs that models the observation of several features at each point in time. The discrete nature enables us to estimate a query model from only a single example of the query provided by the user. This makes our method very flexible with respect to the availability of training data. Furthermore, we are able to outperform state-of-the-art results on the George Washington dataset.
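
The idea of estimating a discrete query model from a single example can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the model details, so the uniform assignment of frames to states, the additive smoothing constant `alpha`, the multinomial scoring of each frame, the omission of transition probabilities, and all function names are assumptions made for illustration. Each frame is taken to be a histogram of visual-word counts observed at one sliding-window position.

```python
import math


def estimate_query_model(frames, n_states, vocab_size, alpha=1.0):
    """Estimate per-state log emission probabilities from ONE example.

    frames: list of visual-word count histograms, one per window position.
    Assumption: frames are split uniformly over a left-to-right state chain
    and counts are smoothed additively with `alpha`.
    """
    if n_states > len(frames):
        raise ValueError("need at least one frame per state")
    emissions = []
    for s in range(n_states):
        start = s * len(frames) // n_states
        end = (s + 1) * len(frames) // n_states
        counts = [alpha] * vocab_size
        for frame in frames[start:end]:
            for word, count in enumerate(frame):
                counts[word] += count
        total = sum(counts)
        emissions.append([math.log(c / total) for c in counts])
    return emissions


def frame_log_likelihood(log_probs, frame):
    """Multinomial log-likelihood of one histogram, up to a constant term."""
    return sum(count * lp for count, lp in zip(frame, log_probs))


def viterbi_score(emissions, frames):
    """Best-path log score through a linear left-to-right model.

    Assumption: only self-loops and steps to the next state are allowed,
    transition probabilities are ignored, and the path must start in the
    first state and end in the last one.
    """
    n_states = len(emissions)
    neg_inf = float("-inf")
    score = [neg_inf] * n_states
    score[0] = frame_log_likelihood(emissions[0], frames[0])
    for frame in frames[1:]:
        updated = [neg_inf] * n_states
        for s in range(n_states):
            best_prev = score[s] if s == 0 else max(score[s], score[s - 1])
            if best_prev > neg_inf:
                updated[s] = best_prev + frame_log_likelihood(emissions[s], frame)
        score = updated
    return score[-1]


# Toy data: a vocabulary of 3 visual words and a query spanning 4 frames.
query = [[3, 0, 0], [3, 0, 0], [0, 3, 0], [0, 3, 0]]
model = estimate_query_model(query, n_states=2, vocab_size=3)

matching_patch = [[3, 0, 0], [3, 0, 0], [0, 3, 0], [0, 3, 0]]
reversed_patch = [[0, 3, 0], [0, 3, 0], [3, 0, 0], [3, 0, 0]]
match_score = viterbi_score(model, matching_patch)
mismatch_score = viterbi_score(model, reversed_patch)
```

In this toy setting the patch whose visual words appear in the same order as the query scores higher than the reversed one. In a segmentation-free setting, such a score would be computed for patches sampled across the whole page and the patches ranked by it.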