
Title: A survey of Arabic text classification approaches

Authors: Mostafa Sayed; Rashed K. Salem; Ayman E. Khder

Addresses: Faculty of Computers and Information, Beni-Suef University, Beni Suef, Egypt; Faculty of Computers and Information, Menoufia University, Al Minufya, Egypt; Faculty of Computers and Information Technology, Future University, New Cairo, Egypt

Abstract: Text categorisation has become a significant research trend as the internet revolution has produced enormous amounts of data in many languages. Arabic is one of the most widely used languages in the world; it is considered the fifth most spoken. Processing and classifying Arabic text poses various challenges, since it requires more sophisticated techniques than English text does. These challenges stem from the Arabic language's variation in shape, structure and components; in addition, there is a lack of adequate studies on Arabic text classification. This survey seeks to give a general view of the field by categorising the different techniques of Arabic text classification, to help new research in this domain. It also presents some background information and innovative designs for Arabic text classification. It reviews various works on classifying Arabic text with regard to their data sets, categories, pre-processing steps, classification mechanisms and assessment procedures. These discussions aim to provide a comprehensive overview and a general framework for researchers in this domain by examining the shortcomings of prior studies and then the possibility of presenting more advanced directions.

Keywords: Arabic text classification; text classification techniques; Arabic text mining.

DOI: 10.1504/IJCAT.2019.098601

International Journal of Computer Applications in Technology, 2019 Vol.59 No.3, pp.236 - 251

Received: 07 Sep 2017
Accepted: 23 Apr 2018
Published online: 28 Mar 2019
