Authors: Kilvisharam Oziuddeen Mohammed Aarif; P. Sivakumar
Addresses: Department of Electronics and Communication Engineering, C. Abdul Hakeem College of Engineering and Technology, Melvisharam-632509, India; Department of Electronics and Communication Engineering, Karpagam College of Engineering, Coimbatore-641032, India
Abstract: Script identification is one of the challenging stages of an optical character recognition system for bilingual or multilingual document images. Significant research on script identification has been reported in the last two decades, concentrated largely on languages such as Latin, Chinese, Hindi, French and so forth. Very little effort has been made on script identification of cursive languages such as Arabic, Urdu and Pashto. Most of the ancient Urdu literature that is yet to be digitised includes both Urdu and Arabic text. In this paper, we present a method for script identification of Urdu and Arabic text at the word level using Gabor features with suitable orientations and frequencies. The proposed model is trained using a support vector machine (SVM) classifier, and the results achieved are very promising.
Keywords: script identification; cursive language; character recognition; Gabor filter; support vector machine; SVM.
International Journal of Computer Aided Engineering and Technology, 2020 Vol.12 No.3, pp.328 - 335
Received: 07 Sep 2017
Accepted: 29 Jan 2018
Published online: 05 Mar 2020
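
The abstract describes word-level Gabor features computed at several orientations and frequencies. The following is a minimal sketch of that kind of feature extraction, not the authors' implementation: the function names `gabor_kernel` and `gabor_features`, the kernel size, the Gaussian width `sigma`, the three frequencies and the four orientations are all illustrative assumptions, since the abstract does not state the values used in the paper.

```python
import numpy as np


def gabor_kernel(frequency, theta, sigma=4.0, size=21):
    """Complex 2-D Gabor kernel (hypothetical helper; parameters are assumed)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Project coordinates onto the direction theta of the sinusoidal carrier.
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    carrier = np.exp(2j * np.pi * frequency * x_rot)
    return envelope * carrier


def gabor_features(word_image, frequencies=(0.1, 0.2, 0.3), n_orientations=4):
    """Mean and standard deviation of Gabor response magnitudes for one word image.

    The choice of frequencies and orientations is an assumption for
    illustration; the source only says they are "suitable".
    """
    image = np.asarray(word_image, dtype=float)
    image = image - image.mean()  # remove the DC component
    features = []
    for frequency in frequencies:
        for k in range(n_orientations):
            theta = k * np.pi / n_orientations
            kernel = gabor_kernel(frequency, theta)
            # Zero-pad both arrays to the full linear-convolution size.
            shape = (image.shape[0] + kernel.shape[0] - 1,
                     image.shape[1] + kernel.shape[1] - 1)
            response = np.fft.ifft2(np.fft.fft2(image, shape) *
                                    np.fft.fft2(kernel, shape))
            magnitude = np.abs(response)
            features.extend([magnitude.mean(), magnitude.std()])
    return np.array(features)


# Usage: a synthetic striped image standing in for a segmented word image.
x = np.arange(64)
stripes = np.tile(np.cos(2 * np.pi * 0.2 * x), (32, 1))
feature_vector = gabor_features(stripes)
print(feature_vector.shape)
```

Each word image yields one fixed-length vector (two statistics per filter). Such vectors, with Urdu/Arabic labels, could then be passed to an SVM implementation such as scikit-learn's `SVC`; the kernel and hyperparameters would need to be chosen experimentally, as the abstract does not specify them.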