Authors: Sari Awwad; Mustafa Hammad; Safaa Al-Haj Saleh
Addresses: Computer Science and Applications, The Hashemite University, Zarqa, Jordan; Department of Information Technology, Mutah University, Al-Karak, Jordan; Department of Software Engineering, The Hashemite University, Zarqa, Jordan
Abstract: Arabic word classification is a challenging problem owing to the cursive nature of the language and its modulation marks. Existing approaches rely on databases and dictionaries to classify Arabic words, which slows the classification process. This paper therefore investigates Arabic word classification in non-vocalised Arabic text using affix features alone, and explores the extent to which these features can determine the class of an Arabic word without dictionaries or word lists. The proposed approach is based mainly on affix features and a Support Vector Machine (SVM). Fisher encoding is also applied to remove redundancy and preserve important information. The approach is tested on a data set with two main classes (noun and verb) and six different noun sub-classes. The results indicate that the approach achieves a success rate approaching 64% of the total words in the articles used in this study. Misclassifications arise when the target Arabic word has no affixes or when some of its original characters are mistaken for affixes.
Keywords: affixes features; word classification; SVM; support vector machine; Fisher encoding; Arabic language.
International Journal of Computer Applications in Technology, 2019 Vol.59 No.4, pp.347 - 353
Received: 07 Mar 2018
Accepted: 16 Apr 2018
Published online: 16 Apr 2019
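
The affix-only idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the prefix and suffix inventories, the function name `affix_features` and the minimum-stem rule are all assumptions made for illustration, and the Fisher encoding and SVM stages are not shown. The sketch only turns a non-vocalised word into a binary affix vector of the kind a classifier could consume.

```python
# Hypothetical affix inventories, chosen for illustration only;
# the abstract does not list the affixes actually used.
PREFIXES = ["ال", "سي", "ي", "ت", "ن"]
SUFFIXES = ["ون", "ات", "ين", "ة", "وا"]

MIN_STEM = 2  # assumed: at least two characters must remain after stripping


def affix_features(word):
    """Return a binary vector: one slot per prefix, then one per suffix."""
    prefix_bits = [
        int(word.startswith(p) and len(word) - len(p) >= MIN_STEM)
        for p in PREFIXES
    ]
    suffix_bits = [
        int(word.endswith(s) and len(word) - len(s) >= MIN_STEM)
        for s in SUFFIXES
    ]
    return prefix_bits + suffix_bits


if __name__ == "__main__":
    # Definite noun, imperfect plural verb, bare root, and a noun whose
    # first root letter happens to coincide with a prefix.
    for w in ["الكتاب", "يكتبون", "كتب", "تمر"]:
        print(w, affix_features(w))
```

The last two sample words mirror the two failure cases named in the abstract: the bare root yields an all-zero vector because it carries no affixes, while the final word triggers a prefix slot even though its first letter belongs to the root.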