Lampung - A New Handwritten Character Benchmark: Database, Labeling and Recognition

A. Junaidi, S. Vajda and G. A. Fink
International Workshop on Multilingual OCR (MOCR), pages 105-112, 2011.

Beijing, China

Abstract

This paper describes our effort to create a dataset of isolated Lampung characters, a script originating from Indonesia, and to recognize them. The aim is to describe this new script with all its peculiarities, to propose a labeling scheme for managing a large isolated character dataset, and finally to present a recognition scheme based on the water reservoir concept. The Lampung script, which descends from the Brahmi script, is used in Lampung Province and is close to extinction unless initiatives such as ours direct attention to this cultural heritage. The collected dataset contains isolated characters taken from transcriptions of fairy tales, which were annotated with a semi-automatic labeling method requiring only limited human effort. Our attention is focused not only on the database collection but on recognition as well. For this purpose, a water reservoir based feature set is proposed that exploits the different cavities of the character shapes and the measures derived from them. The experimental results (94.27%) demonstrate the effectiveness of the method on a brand new script and feature set.