Text-Line Extraction by Minimal Sub-Seams

Dr. Raid Saabni

In this research, we sought a robust and efficient algorithm to extract text lines directly from gray-level document images by tracking minimal-energy sub-seams to extract the medial seams that define the text lines. For this task, we consider lines as collections of areas with a high density of foreground pixels lying on near-horizontal imaginary paths. Often, these lines are not completely horizontal and can be skewed or even fluctuating. The minimal sub-seam tracking approach, using dynamic programming and multi-scale anisotropic filters, has been applied by Saabni et al. to extract lines from binary and gray-level images. The authors proposed a binarization-free text line extraction method based on seam carving. In another paper, the author uses an efficient method to compute an adaptive local density histogram, horizontally emphasized using a multi-scale anisotropic second-derivative-of-Gaussian filter bank, to produce a more reliable energy map for minimal/maximal seam computation. In this work, we present a language-independent method for automatic text line extraction that works directly on gray-level images. The proposed algorithm uses a variety of adaptive local density histograms to follow fluctuating and near-horizontal lines. We use an Anisotropic Gaussian Filter Bank (AGFB) to produce an energy map that emphasizes the flow of text lines and closes gaps between components. The results of preprocessing and of the previous step will be used as input to a long short-term memory (LSTM) recurrent neural network, which automatically learns to extract the medial seam of each line and to recognize the starting and ending points of each text line.
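As a rough illustration of the seam computation mentioned above, the following sketch finds one minimal-energy left-to-right seam with dynamic programming, in the style of generic seam carving. It is not the method of the paper: the function name `minimal_horizontal_seam` is hypothetical, the energy map is assumed to be already computed (here, low values are assumed to mark text), and neither the sub-seam tracking, the AGFB filtering, nor the LSTM stage is reproduced.

```python
import numpy as np

def minimal_horizontal_seam(energy):
    """Return, for each column, the row of one minimal-energy left-to-right seam.

    Assumption: `energy` is a 2D array in which low values mark the path to follow.
    The seam may move at most one row up or down between adjacent columns.
    """
    energy = np.asarray(energy, dtype=float)
    rows, cols = energy.shape
    cost = energy.copy()                       # cumulative cost of the best path ending at each pixel
    back = np.zeros((rows, cols), dtype=int)   # row of the predecessor in the previous column
    row_idx = np.arange(rows)

    for x in range(1, cols):
        prev = cost[:, x - 1]
        # Cumulative costs of the three possible predecessors (rows y-1, y, y+1).
        up = np.concatenate(([np.inf], prev[:-1]))
        down = np.concatenate((prev[1:], [np.inf]))
        stacked = np.stack((up, prev, down))   # shape (3, rows)
        choice = np.argmin(stacked, axis=0)    # 0 -> y-1, 1 -> y, 2 -> y+1
        cost[:, x] += stacked[choice, row_idx]
        back[:, x] = row_idx + choice - 1

    # Backtrack from the cheapest end point in the last column.
    seam = np.empty(cols, dtype=int)
    seam[-1] = int(np.argmin(cost[:, -1]))
    for x in range(cols - 1, 0, -1):
        seam[x - 1] = back[seam[x], x]
    return seam
```

Calling the function on an energy map of shape (rows, cols) returns an integer array of length cols. If the energy map instead has high values along text lines, negating it before the call gives the corresponding maximal seam.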

 

Raid Saabni, Robust and Efficient Text: Line Extraction by Local Minimal Sub-Seams, ISCSIC ’18: Proceedings of the 2nd International Symposium on Computer Science and Intelligent Control, Stockholm, Sweden, September 21-23, 2018
