
Literature and Resources for Urdu OCR

In this post I have attempted to collect an informal literature review of the development of, and the state of the art in, Urdu OCR. I was introduced to most of this work during my DFKI internship in 2016. The list is by no means exhaustive, but I think the following papers cover the development of this area fairly well, from hand-coded features to newer multi-dimensional LSTM based models (see my post on LSTM resources, if interested).

The two most important people in the field to know about (in my personal view) are Dr. Faisal Shafait and Dr. Adnan Ul-Hasan. In fact, Dr. Adnan’s PhD thesis is well worth a read for anyone wanting to get up to speed with the field, and it serves as a great starting resource. Most of the work mentioned below was done by their students.

Adnan Ul-Hasan and Faisal Shafait also have a research project on ResearchGate titled ‘Printed and Handwritten Urdu Text Recognition’.

The paper that introduced the Urdu Printed Text Image (UPTI) dataset, by Nazly Sabbour and Faisal Shafait, is titled ‘A segmentation free approach to Arabic and Urdu OCR’ and can be accessed here. This paper also lists DFKI’s IUPR as Dr. Faisal’s affiliation.

An interesting effort to generate a dataset of handwritten Urdu script can be explored in ‘UCOM Offline Dataset – a Urdu Handwritten Dataset Generation’. This work introduces a labelled dataset comprising 600 pages of handwritten Urdu script in the Nastaleeq style.

More recently, inspired by the groundbreaking success demonstrated by the CTC-LSTM of Graves et al., the same techniques have been applied to Urdu OCR. Some of the interesting work is as follows.

‘Offline Printed Urdu Nastaleeq script recognition with BLSTM’ (Aug. 2013) uses a bi-directional LSTM, and ‘Urdu Nasta’liq text recognition using implicit segmentation based on multi-dimensional long short term memory neural networks’ (Nov. 2016) uses a multi-dimensional LSTM. The papers can be accessed here and here.
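To give a feel for what the CTC output layer in such models produces, here is a minimal sketch of best-path (greedy) CTC decoding in plain Python. It is not taken from any of the papers above. The function name `ctc_greedy_decode`, the toy alphabet, the made-up per-frame scores, and the choice of index 0 for the blank label are all my own assumptions for illustration.

```python
from itertools import groupby

BLANK = 0  # assumption: index 0 is reserved for the CTC blank label


def ctc_greedy_decode(frame_scores, alphabet, blank=BLANK):
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # Pick the highest-scoring label index for every time step (frame).
    best_path = [max(range(len(frame)), key=frame.__getitem__)
                 for frame in frame_scores]
    # Collapse runs of the same label into a single label.
    collapsed = [label for label, _ in groupby(best_path)]
    # Remove blanks and map the remaining indices to characters.
    return "".join(alphabet[label] for label in collapsed if label != blank)


# Hypothetical toy alphabet: blank plus three Urdu letters.
alphabet = ["", "ا", "ب", "پ"]

# Made-up scores for five frames (one row per frame, one column per label).
frames = [
    [0.10, 0.70, 0.10, 0.10],  # label 1
    [0.10, 0.60, 0.20, 0.10],  # label 1 again (merged with the previous frame)
    [0.80, 0.10, 0.05, 0.05],  # blank
    [0.10, 0.10, 0.70, 0.10],  # label 2
    [0.10, 0.10, 0.10, 0.70],  # label 3
]

decoded = ctc_greedy_decode(frames, alphabet)
print(decoded)
```

The blank label is what lets the network emit the same character twice in a row: two runs of the same label separated by a blank are kept as two characters, while an unbroken run collapses to one. Real systems often use beam search or a language model instead of this greedy pass.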

Another interesting work, by Mayce Al Azawi et al., compares RNN-LSTM and WFSTs for correcting OCR errors. It is titled ‘Character-Level Alignment Using WFST and LSTM for Post-processing in Multi-script Recognition Systems – A Comparative Study’ and can be previewed here.

Lastly, some interesting software packages for general OCR research are OCRoRACT, Alex Graves’ RNNLib, Tesseract OCR, and NabOCR.

 

 
