Title: A new dataset of word-level offline handwritten numeral images from four official Indic scripts and its benchmarking using image transform fusion
Authors: Sk Md Obaidullah; Chayan Halder; Nibaran Das; Kaushik Roy
Addresses: Department of Computer Science and Engineering, Aliah University, New Town Campus, Kolkata-700156, W.B, India; Department of Computer Science, West Bengal State University, Barasat, Kolkata-700126, W.B, India; Department of Computer Science and Engineering, Jadavpur University, Jadavpur, Kolkata-700032, W.B, India; Department of Computer Science, West Bengal State University, Barasat, Kolkata-700126, W.B, India
Abstract: Handwritten document image dataset development is one of the most tedious and time-consuming tasks in optical character recogniser (OCR) related experimental work. Special attention needs to be given to feasibility, realness, clarity, etc. while collecting real-life data from different writers. Few efforts to develop a handwritten numeral image dataset (NIdb) can be found in the literature, and those were restricted to a single script, the local script of the researchers who prepared the database. In this paper, an approach to developing a word-level handwritten NIdb of four popular Indic scripts, namely Bangla, Devanagari, Roman and Urdu, is proposed. A benchmark result is established for the handwritten numeral script identification (HNSI) problem by applying a novel image transform fusion (ITF) based technique. The proposed dataset will be freely available to researchers for non-commercial use.
Keywords: document image processing; optical character recognition; OCR; handwritten numerals; handwritten numeral image database; numeral image binarisation; feature sets; image transform fusion; ITF; script identification; classification; multilayer perceptron; benchmarking; word-level offline images; official Indic scripts; Bangla; Devanagari; Roman; Urdu; India; document images.
International Journal of Intelligent Engineering Informatics, 2016 Vol.4 No.1, pp.1 - 20
Received: 16 Apr 2015
Accepted: 07 Jul 2015
Published online: 01 Feb 2016