Leonard Rothacker, Fabian Wolf and Gernot A. Fink
Int. Journal of Pattern Recognition and Artificial Intelligence, 35(4), pages 2153001, 2020.
The segmentation-free word spotting method proposed in this paper makes document images searchable without requiring any annotated training dataset. It operates in the query-by-example scenario, in which the user selects an exemplary occurrence of the query word. The entire collection of document images is then searched according to visual similarity to the query. The proposed method requires only minimal assumptions about the visual appearance of text. This is achieved by processing document images as a whole, without requiring a prior segmentation of the images at word level or at line level. Detecting potentially relevant document regions does not require recognizing words in general. Variability in word size is handled by representing the sequential structure of text with a statistical sequence model. To make the computationally costly application of the sequence model feasible in practice, regions are first retrieved according to approximate similarity with an efficient model decoding algorithm. Re-ranking these regions according to the visual similarity obtained with the sequence model leads to highly accurate word spotting results. The method is evaluated on five benchmark datasets. In the segmentation-free query-by-example scenario where no annotated training data is available, the method outperforms all other methods that have been evaluated on any of these five benchmarks.
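The retrieve-then-re-rank idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: dynamic time warping (DTW) is swapped in for the paper's statistical sequence model, and a mean-pooled descriptor is swapped in for the efficient approximate decoding step. The function names (`mean_vector`, `dtw_distance`, `spot`), the toy one-dimensional features, and the region names are all hypothetical and chosen only for illustration.

```python
from math import sqrt


def mean_vector(seq):
    """Pool a sequence of feature frames into one vector (cheap, order-free)."""
    n = len(seq)
    dim = len(seq[0])
    return [sum(frame[d] for frame in seq) / n for d in range(dim)]


def euclidean(a, b):
    """Euclidean distance between two equally sized vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(query, region):
    """Length-normalized DTW cost; a stand-in for a sequence model score.

    The warping lets a query match a region of different width, which is
    the role the sequence model plays for word size variability.
    """
    inf = float("inf")
    rows, cols = len(query), len(region)
    cost = [[inf] * (cols + 1) for _ in range(rows + 1)]
    cost[0][0] = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            step = euclidean(query[i - 1], region[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[rows][cols] / (rows + cols)


def spot(query, regions, shortlist_size):
    """Two-stage search: cheap approximate retrieval, then costly re-ranking.

    `regions` maps a (hypothetical) region name to its frame sequence.
    """
    q_mean = mean_vector(query)
    # Stage 1: rank all regions by an inexpensive pooled-descriptor distance.
    coarse = sorted(
        regions, key=lambda name: euclidean(q_mean, mean_vector(regions[name]))
    )
    shortlist = coarse[:shortlist_size]
    # Stage 2: re-rank only the shortlist with the expensive sequence score.
    return sorted(shortlist, key=lambda name: dtw_distance(query, regions[name]))


# Toy example with one-dimensional frames (assumed data, not from the paper).
query = [[0.0], [1.0], [0.0]]
regions = {
    "stretched_match": [[0.0], [0.0], [1.0], [1.0], [0.0], [0.0]],
    "different_word": [[1.0], [0.0], [1.0]],
    "blank": [[5.0], [5.0], [5.0]],
}
ranking = spot(query, regions, shortlist_size=2)
```

In this toy setup the expensive score is computed for only two of the three regions, which mirrors the motivation stated in the abstract: the costly sequence comparison is applied only to regions that survive the approximate retrieval step.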