Title: Old document recognition using fuzzy methods

Authors: J.M.C. Sousa, J.M. Gil, C.S. Ribeiro, J.R. Caldas Pinto

Addresses (all four authors): Department of Mechanical Engineering, Instituto Superior Tecnico, Technical University of Lisbon, GCAR/IDMEC 1049-001 Lisbon, Portugal

Abstract: This paper proposes an expert system based on fuzzy logic for optical character recognition (OCR) of old printed documents. Such documents often suffer from distortion, poor printing quality, faded or misprinted characters, speckles and smudges. The recognition process consists of two stages: training on example character images, followed by classification of new character images. The proposed OCR system builds fuzzy membership functions from oriented features extracted using Gabor filter banks. The methodology is tested on three 17th-century books written in Portuguese. The fuzzy recogniser achieves a very high character recognition success rate, which confirms the advantage of using expert systems in image-based decision systems.
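
The two-stage process described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the filter parameters, the shape of the membership functions, or how memberships are combined. The sketch assumes Gaussian membership functions fitted to the per-class mean and spread of each feature, minimum aggregation as the fuzzy AND, and the mean absolute Gabor response per orientation as the oriented feature. All function names (`gabor_kernel`, `oriented_features`, `train`, `classify`, `make_stripes`) and the synthetic stripe images are hypothetical and used only for illustration.

```python
import numpy as np


def gabor_kernel(size, theta, wavelength=4.0, sigma=2.0):
    """Real-valued Gabor kernel at orientation theta (parameters are assumed)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate coordinates so the cosine carrier varies along direction theta.
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + yr ** 2) / (2.0 * sigma ** 2))
    kernel = envelope * np.cos(2.0 * np.pi * xr / wavelength)
    return kernel - kernel.mean()  # zero mean: ignore uniform brightness


def oriented_features(image, n_orientations=4, size=9):
    """Mean absolute filter response for each orientation in the bank."""
    image = np.asarray(image, dtype=float)
    spectrum = np.fft.fft2(image)
    features = []
    for k in range(n_orientations):
        kernel = gabor_kernel(size, k * np.pi / n_orientations)
        # Circular convolution via the FFT; the kernel is zero-padded.
        kernel_spectrum = np.fft.fft2(kernel, s=image.shape)
        response = np.real(np.fft.ifft2(spectrum * kernel_spectrum))
        features.append(np.mean(np.abs(response)))
    return np.array(features)


def train(samples):
    """Stage 1: fit one Gaussian membership function per class and feature.

    samples maps a class label to a list of example images.
    """
    model = {}
    for label, images in samples.items():
        feats = np.array([oriented_features(im) for im in images])
        # A small constant keeps the spread strictly positive.
        model[label] = (feats.mean(axis=0), feats.std(axis=0) + 1e-6)
    return model


def classify(model, image):
    """Stage 2: return the class with the highest aggregated membership."""
    x = oriented_features(image)
    scores = {}
    for label, (centre, spread) in model.items():
        membership = np.exp(-0.5 * ((x - centre) / spread) ** 2)
        scores[label] = float(membership.min())  # fuzzy AND (assumed)
    return max(scores, key=scores.get), scores


def make_stripes(vertical, rng, size=16, noise=0.1):
    """Synthetic stand-in for a character image: noisy striped pattern."""
    ramp = (np.arange(size) // 2) % 2
    img = np.tile(ramp, (size, 1)).astype(float)
    if not vertical:
        img = img.T
    return img + noise * rng.normal(size=(size, size))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    training = {
        "vertical": [make_stripes(True, rng) for _ in range(20)],
        "horizontal": [make_stripes(False, rng) for _ in range(20)],
    }
    model = train(training)
    label, scores = classify(model, make_stripes(True, rng))
    print(label, scores)
```

In a real setting the stripe images would be replaced by segmented character images, and the number of orientations, the kernel size and the aggregation operator would need tuning; none of those choices are given in the abstract.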

Keywords: fuzzy logic; optical character recognition; OCR; character recognition; old documents; image processing; uncertainty.

DOI: 10.1504/IJISTA.2006.009908

International Journal of Intelligent Systems Technologies and Applications, 2006 Vol.1 No.3/4, pp.263 - 279

Published online: 01 Jun 2006
