Recognizing degraded text in real-world images is a difficult problem, because information lost to low resolution or defocus has to be reconstructed. Combining deep learning models for text super-resolution and optical character recognition provides a way to enlarge and enhance text images for better readability and to recognize the text automatically afterward. On the real-world low-resolution dataset TextZoom, we achieve higher peak signal-to-noise ratio and text recognition accuracy than the state-of-the-art text super-resolution model TBSRN, while having a smaller theoretical model size owing to the use of quantization techniques. This could allow the model to run on low-power devices. In addition, we show how different training strategies influence the performance of the resulting model.
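For readers unfamiliar with the image-quality metric named above, the following is a minimal sketch of the standard peak signal-to-noise ratio definition, PSNR = 10 log10(MAX^2 / MSE), for grayscale images. The function name `psnr`, its parameters, and the assumed 8-bit pixel range (`max_value=255.0`) are illustrative choices and not taken from the paper's evaluation code.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Return the PSNR in dB between two equally sized grayscale images.

    Images are given as lists of rows of pixel intensities.
    max_value is the largest possible pixel value (assumed 8-bit here).
    """
    ref_flat = [p for row in reference for p in row]
    est_flat = [p for row in estimate for p in row]
    if len(ref_flat) != len(est_flat):
        raise ValueError("images must have the same number of pixels")

    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(ref_flat, est_flat)) / len(ref_flat)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded

    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value means the super-resolved image is closer to the high-resolution reference; identical images give an infinite PSNR under this definition.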
Philipp Hildebrandt, Maximilian Schulze, Sarel Cohen, Vanja Doskoč, Raid Saabni, and Tobias Friedrich. 2022. Optical Character Recognition Guided Image Super Resolution. In Proceedings of The 22nd ACM Symposium on Document Engineering (DocEng). ACM, New York, NY, USA, 7 pages.