Department of Computer Science LS XII - Pattern Recognition Group

Exploring Semantic Word Representations for Recognition-Free NLP on Handwritten Document Images

Oliver Tueselmann and Gernot A. Fink
Proc. Int. Conf. on Document Analysis and Recognition, pages 85-100, 2023.

San Jose, CA, USA

Abstract

A semantic analysis of documents offers a wide range of practical application scenarios. Combining a handwriting recognizer with textual NLP models constitutes an intuitive solution. However, due to the difficulty of recognizing handwriting and the error propagation problem, optimized architectures are required. Recognition-free approaches have proved to be robust, but often produce poorer results than recognition-based methods. In our opinion, a major reason for this is that recognition-free approaches do not use extensively pre-trained semantic word embeddings, which have proven to be one of the most powerful methods in the textual domain. To overcome this limitation, we explore and evaluate several semantic embeddings for word image representation. We show that context-based embedding methods are well suited for static word representations and that they are more predictive at the word image level than classical static embedding methods. Furthermore, our recognition-free approach with pre-trained semantic information outperforms recognition-free as well as recognition-based approaches from the literature on several Named Entity Recognition benchmark datasets.