Ahmad Droby, Berat Kurar Barakat, Raid Saabni, Reem Alaasam, Boraq Madi, and Jihad El-Sana
Abstract: We propose an unsupervised feature learning approach for segmenting text lines of handwritten document images with no labelling effort. Humans can easily group local text line features into global coarse patterns. We leverage this coherent visual perception of text lines as a supervising signal
by formulating feature learning as a global pattern differentiation task. The machine is trained
to recognize that a document patch shares a similar global text line pattern with itself or its neighbours, and a different global text line pattern with the 90-degree-rotated version of itself or its neighbours.
Clustering the central windows of document image patches using their extracted features forms blob lines that strike through the text lines. The blob lines guide an energy minimization function for extracting text lines in a binary image and guide a seam carving function for detecting baselines in a colour image. By identifying the aspects of the input patch that support the actual prediction and clustering, we contribute toward the understanding of input patch functionality.
We evaluate the method on several variants of text line segmentation datasets to demonstrate its effectiveness, visualize what it has learned, and make its clustering strategy comprehensible from a human perspective.
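The pattern differentiation task described in the abstract can be illustrated with a minimal sketch of how training pairs might be sampled. This is not the authors' implementation: the function name `sample_pair`, the use of only the right-hand neighbour, the square patch size, and the 50/50 label split are all assumptions made for illustration.

```python
import numpy as np


def sample_pair(image, patch_size, rng):
    """Sample one (patch_a, patch_b, label) triple from a grayscale page.

    label 1: patch_b is patch_a itself or its right-hand neighbour
             (assumed to share the same global text line pattern).
    label 0: patch_b is that same candidate rotated by 90 degrees
             (assumed to show a different global pattern).
    """
    h, w = image.shape
    p = patch_size
    # Leave room for a right-hand neighbour (an assumption of this sketch).
    y = int(rng.integers(0, h - p + 1))
    x = int(rng.integers(0, w - 2 * p + 1))
    patch_a = image[y:y + p, x:x + p]

    # Choose the identity (offset 0) or the right neighbour (offset p).
    offset = p * int(rng.integers(0, 2))
    candidate = image[y:y + p, x + offset:x + offset + p]

    # Draw the label, then rotate the candidate for the "different" class.
    label = int(rng.integers(0, 2))
    patch_b = candidate if label == 1 else np.rot90(candidate)
    return patch_a, patch_b, label
```

For example, `sample_pair(page, 16, np.random.default_rng(0))` on a 2-D array `page` at least 16 pixels high and 32 pixels wide returns two 16×16 patches and a binary label. A pair classifier trained on such triples would need no manual annotation, which is the point of the labelling-free formulation above.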
Keywords: text line segmentation; text line extraction; text line detection; unsupervised deep learning
Droby, A.; Kurar Barakat, B.; Saabni, R.; Alaasam, R.; Madi, B.; El-Sana, J. Understanding Unsupervised Deep Learning for Text Line Segmentation. Appl. Sci. 2022, 12, 9528. https://doi.org/10.3390/app12199528