Title: New architectural optical character recognition approach for cursive fonts: the historical Maghrebian font as an example

Authors: Ilyes Ouled Omar; Sofiene Haboubi; Faouzi Benzarti

Addresses: LR11ES17 – Signals, Images and Information Technologies Laboratory, National Engineering School of Tunis, University of Tunis El Manar, BP 37, 1002, Tunis, Tunisia (all three authors)

Abstract: The historical Maghrebian font is an Arabic font that was dominant across several North African regions. Many cultural and scientific works of major importance were written in this font. This paper presents a full OCR architecture able to handle the specific characteristics of the historical Maghrebian font, together with a complete design and the accuracy of each module. The novel OCR architecture includes a binarisation module based on deep neural networks, with an accuracy of 98.1%. It also involves three segmentation tasks based on deep learning approaches: text/non-text separation, column division and connected-component segmentation. The classification task is based on the DenseNet model, with an accuracy of 98.95%. The post-processing module, which relies on deep learning with sequential modelling, has an accuracy of 81.3%. The architecture also includes a user-feedback stage with an accuracy of 94.7%. The total system accuracy is 89.06%.
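
As an illustration only, the module chain named in the abstract can be sketched as a sequence of stages, each consuming the output of the previous one. The stage names are taken from the abstract; their ordering, the `Stage` and `build_pipeline` helpers, and the identity placeholders are assumptions made for this sketch and do not reproduce the authors' models or code.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

# Stage names follow the abstract; the ordering shown here is an assumption.
STAGE_NAMES = [
    "binarisation",
    "text_non_text_separation",
    "column_division",
    "connected_component_segmentation",
    "classification",
    "post_processing",
    "user_feedback",
]


@dataclass
class Stage:
    """Hypothetical wrapper pairing a stage name with its processing function."""
    name: str
    run: Callable[[Any], Any]


def build_pipeline(stages: List[Stage]) -> Callable[[Any], Tuple[Any, List[str]]]:
    """Compose stages so that each one consumes the previous stage's output."""
    def pipeline(page: Any) -> Tuple[Any, List[str]]:
        trace: List[str] = []
        data = page
        for stage in stages:
            data = stage.run(data)
            trace.append(stage.name)  # record the order in which stages ran
        return data, trace
    return pipeline


# Identity placeholders stand in for the trained models, which are not shown here.
placeholder_pipeline = build_pipeline(
    [Stage(name, lambda x: x) for name in STAGE_NAMES]
)
result, trace = placeholder_pipeline("scanned page placeholder")
```

In a real system each placeholder would be replaced by the corresponding trained module; the sketch only shows how the stages could be chained.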

Keywords: optical character recognition; OCR; cursive historical documents; Maghrebian font database; deep learning.

DOI: 10.1504/IJICA.2023.129361

International Journal of Innovative Computing and Applications, 2023 Vol.14 No.1/2, pp.91 - 103

Received: 29 Mar 2021
Accepted: 26 May 2021

Published online: 07 Mar 2023
