Authors: Muhammad Shafi; Kashif Zia
Addresses: Faculty of Computing and Information Technology, Sohar University, P.O. Box: 44, P.C 311, Sohar, Sultanate of Oman; Faculty of Computing and Information Technology, Sohar University, P.O. Box: 44, P.C 311, Sohar, Sultanate of Oman
Abstract: The cursive nature of Urdu text, in which a character takes different shapes depending on the word in which it is joined, makes the language difficult for automatic character recognition. This paper aims to present a systematic literature survey of automatic text recognition schemes for Urdu. A standard systematic literature review protocol is followed, comprising devising the research questions, selecting search terms and search sources, defining the search process, finalising inclusion and exclusion criteria, assessing quality and making the final selection of studies, extracting and synthesising data, and answering the research questions. We believe this work will serve as a useful reference for current researchers and a starting point for future researchers in this field, helping them find answers to questions about the terms used in Urdu OCR, state-of-the-art algorithms and techniques, benchmark datasets, limitations of current work, and challenges in this area.
Keywords: character recognition; ligature; literature review; review; Urdu.
International Journal of Applied Pattern Recognition, 2021 Vol.6 No.4, pp.283-307
Received: 10 Apr 2020
Accepted: 23 Mar 2021
Published online: 11 Nov 2021