Title: Unsupervised generation of Arabic words

Authors: Ahmed Khorsi; Abeer Saad Alsheddi

Addresses: Al-Imam Mohammad Ibn Saud Islamic University, Riyadh, Saudi Arabia; Al-Imam Mohammad Ibn Saud Islamic University, Riyadh, Saudi Arabia

Abstract: Automated word generation can be seen as the reverse of morphology learning: the aim is to automatically coin valid words in the target language. As with many other challenges in natural language processing (NLP), the generation engine can be built using a supervised or an unsupervised approach. The former requires a clean training data set of a decent size, whereas the latter needs no more than plain text. Nonetheless, unsupervised approaches are often criticised for their low accuracy. The present article reports the results of an investigation into context-free generation of classical Arabic words. Although unsupervised and relatively simple, the proposed approach easily reached an accuracy of 90%.

Keywords: Arabic language; classical vocabulary; computational linguistics; corpus expansion; linguistic corpora; morphology learning; natural language processing; unsupervised learning; statistical linguistics; word generation.

DOI: 10.1504/IJISTA.2019.100793

International Journal of Intelligent Systems Technologies and Applications, 2019 Vol.18 No.4, pp.340 - 352

Received: 28 May 2017
Accepted: 23 Nov 2017

Published online: 18 Jul 2019
