Title: Classifiers for Arabic NLP: survey
Authors: Marwan Al Omari; Moustafa Al-Hajj
Addresses: Centre for Language Sciences and Communication, Lebanese University, Celine Centre, Tayouneh, Beirut, Lebanon; Centre for Language Sciences and Communication, Lebanese University, Celine Centre, Tayouneh, Beirut, Lebanon
Abstract: In this paper, we review the most commonly used models and classifiers applied to the Arabic language to classify texts into categories, classes, or topics in the tasks of opinion mining, sentence categorisation, part-of-speech tagging, language identification, named entity recognition, authorship attribution, word sense disambiguation, and text classification. The classification tasks are compared in terms of the models' performance and accuracy. Classification approaches fall into three types: lexicon-based, corpus-based (machine and deep learning), and hybrid. The research sample comprises 34 articles in the classification domain. Challenges facing the Arabic language are discussed, along with proposed solutions: 1) solid research training in both approaches, lexicon-based and corpus-based (machine and deep learning); 2) research contributions, mainly in corpora, approaches and techniques, and free accessibility; 3) increased funding for research development in the Arab world.
Keywords: lexicon-based approach; corpus-based approach; machine learning; deep learning; classification; big data; classifier; Arabic NLP; natural language processing; NLP; classification approach; NLP lexicon-based; NLP machine learning.
DOI: 10.1504/IJCCIA.2020.105538
International Journal of Computational Complexity and Intelligent Algorithms, 2020 Vol.1 No.3, pp.231 - 258
Received: 11 Oct 2018
Accepted: 31 Dec 2018
Published online: 03 Mar 2020