Einfluss von Bildformaten auf die KI-basierte Analyse von digitalisierten Dokumentbeständen (The influence of image formats on the AI-based analysis of digitized document collections)

Oliver Tüselmann, Fabian Wolf, Tim Raven and Gernot A. Fink
ARCHIV. theorie & praxis, 2024, to appear.

Abstract

This study examines how image formats and compression affect the AI-based analysis of digitized documents. Focusing on handwritten text recognition, writer retrieval, and information extraction, it evaluates the effect of compression quality on model performance. The results show that high-quality compression (e.g., JPEG 50+) has minimal impact on AI performance, while extreme compression degrades it. The findings indicate that storage-efficient formats are feasible for long-term archival use without compromising future AI analyses.
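
The kind of compression sweep summarized in the abstract could be set up roughly as follows. This is a minimal sketch, not the authors' pipeline: it assumes the Pillow library for JPEG encoding, and the helper names encode_jpeg and size_by_quality, the quality levels, and the file name are all hypothetical. It only re-encodes a page image at several quality settings and records the resulting file sizes; in a full experiment each re-encoded image would then be passed to the recognition or retrieval model under test.

```python
import io
from typing import Callable, Dict, Iterable


def encode_jpeg(image, quality: int) -> bytes:
    """Re-encode a Pillow image as JPEG at the given quality setting.

    `image` is assumed to be a PIL.Image.Image; it is converted to RGB
    first because JPEG cannot store palette or alpha-channel modes.
    """
    buffer = io.BytesIO()
    image.convert("RGB").save(buffer, format="JPEG", quality=quality)
    return buffer.getvalue()


def size_by_quality(
    image,
    qualities: Iterable[int],
    encode: Callable[[object, int], bytes] = encode_jpeg,
) -> Dict[int, int]:
    """Map each quality level to the size in bytes of the encoded image."""
    return {quality: len(encode(image, quality)) for quality in qualities}


# Usage (hypothetical file name and quality levels):
#
#   from PIL import Image
#   page = Image.open("page.png")
#   for quality, n_bytes in size_by_quality(page, [95, 75, 50, 25, 10]).items():
#       print(f"quality={quality:3d}  size={n_bytes} bytes")
```

Passing the encoder as a parameter keeps the sweep independent of a specific format, so the same loop could be reused for other codecs under comparison.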