Int. J. of Intelligent Engineering Informatics   »   2016 Vol.4, No.1

 

 


 

 

Title: A new dataset of word-level offline handwritten numeral images from four official Indic scripts and its benchmarking using image transform fusion

 

Authors: Sk Md Obaidullah; Chayan Halder; Nibaran Das; Kaushik Roy

 

Addresses:
Department of Computer Science and Engineering, Aliah University, New Town Campus, Kolkata-700156, W.B, India
Department of Computer Science, West Bengal State University, Barasat, Kolkata-700126, W.B, India
Department of Computer Science and Engineering, Jadavpur University, Jadavpur, Kolkata-700032, W.B, India
Department of Computer Science, West Bengal State University, Barasat, Kolkata-700126, W.B, India

 

Abstract: Developing handwritten document image datasets is one of the most tedious and time-consuming tasks in experimental work on optical character recognisers (OCR). Special attention needs to be given to feasibility, realness, clarity, etc. while collecting real-life data from different writers. A few efforts to develop a handwritten numeral image dataset (NIdb) can be found in the literature, but they were restricted to a single script, the local script of the researchers who prepared the database. In this paper, an approach to developing a word-level handwritten NIdb of four popular Indic scripts, namely Bangla, Devanagari, Roman and Urdu, is proposed. A benchmark result is established for the handwritten numeral script identification (HNSI) problem by applying a novel image transform fusion (ITF) based technique. The proposed dataset will be freely available to researchers for non-commercial use.

 

Keywords: document image processing; optical character recognition; OCR; handwritten numerals; handwritten numeral image database; numeral image binarisation; feature sets; image transform fusion; ITF; script identification; classification; multilayer perceptron; benchmarking; word-level offline images; official Indic scripts; Bangla; Devanagari; Roman; Urdu; India; document images.

 

DOI: 10.1504/IJIEI.2016.074497

 

Int. J. of Intelligent Engineering Informatics, 2016 Vol.4, No.1, pp.1 - 20

 

Submission date: 08 Apr 2015
Date of acceptance: 07 Jul 2015
Available online: 01 Feb 2016

 

 
