Robust Output Modeling in Bag-of-Features HMMs for Handwriting Recognition

Leonard Rothacker and Gernot A. Fink
Proc. Int. Conf. on Frontiers in Handwriting Recognition, 2016.

Shenzhen, China

Abstract

Bag-of-Features HMMs have been successfully applied to handwriting recognition and word spotting. In this paper, we extend our previous work and present methods for modeling sequences of Bag-of-Features representations with Hidden Markov Models. We first discuss our previous approach, which uses a pseudo-discrete model, and then present a novel semi-continuous integration. The method is effective for probabilistic text clustering and is suitable for statistically modeling the characteristics of Bag-of-Features representations extracted from document images. Furthermore, its statistical expectation-maximization estimation can be integrated directly into Baum-Welch HMM training. In our experiments, we present competitive results on the IfN/ENIT word recognition benchmark and state-of-the-art results for word spotting on the George Washington benchmark. Our evaluation gives insights into the properties of the models from the perspectives of both modern and historical document analysis.