Adi Azran, Alon Schclar, Raid Saabni
Accurate text line extraction is a vital prerequisite for efficient and successful text recognition systems ranging from keywords/phrases searching to complete conversion to text. In many cases, the proposed algorithms target binary pre-processed versions of the image, which may cause insufficient results due to poor quality document images. Recently, more papers present solutions that work directly on gray-level images [1,2,7,12,15]. In this paper, we present a novel robust, and efficient algorithm to extract text-lines directly from gray-level document images. The proposed approach uses a combination of two variants of Convolutional Neural Network (CNNs), followed by minimal energy seam extraction. The first ConvNet is a modified version of the autoencoder used for biomedical image segmentation . The second is a deep convolutional Neural Network, working on overlapping vertical slices of the original image. The two variants are combined to one neural net after re-attaching the resulting slices of the second net.
The merged results of the two nets are used as a preprocessed image to obtain an energy map for a second phase. In the second step, we use the algorithm presented in , to track minimal energy sub-seams accumulated to perform a full local minimal/maximal separating and medial seam defining the text baselines and the text line regions. We have tested our approach on multi-lingual various datasets written at a range of image quality based on the ICDAR datasets.
|Azran, A., Schclar, A., & Saabni, R. (2021, August). Text line extraction using deep learning and minimal sub seams. In Proceedings of the 21st ACM Symposium on Document Engineering (pp. 1-4).