Title: SCAT: a system of classification for Arabic texts

Authors: Rami Ayadi, Mohsen Maraoui, Mounir Zrigui

Addresses: The Research Unit of Technologies of Information and Communication, Higher School of Sciences and Technologies of Tunis, 56, Bab Menara, 1008 Tunis, Tunisia; Department of Computer Science, Faculty of Science of Monastir, University of Monastir, Monastir, 5019, Tunisia; Department of Computer Science, Faculty of Science of Monastir, University of Monastir, Monastir, 5019, Tunisia

Abstract: This work presents a system of classification for Arabic texts (SCAT) based on inter-textual distance theory applied to the Arabic language. The theory classifies texts according to criteria drawn from lexical statistics and rests on the lexical connection approach. Our objective is to integrate this theory as a tool for classifying Arabic-language texts. Doing so requires a classification metric together with a database of lemmatised, identified corpora that serves as a literary reference for periods, genres, literary themes and authors, so that anonymous texts can be classified.

Keywords: text classification; Arabic texts; SCAT; stemming; inter-textual distance theory; Arabic language; Arabia; lexical connection.

DOI: 10.1504/IJITST.2011.039679

International Journal of Internet Technology and Secured Transactions, 2011 Vol.3 No.1, pp.63 - 80

Published online: 29 Nov 2014
