Title: Automatic identification of rhetorical relations among intra-sentence discourse segments in Arabic

Authors: Samira Lagrini; Nabiha Azizi; Mohammed Redjimi; Monther Al Dwairi

Addresses: Labged Laboratory, Computer Science Department, Badji Mokhtar University, P.O. Box 12, Annaba, 23000, Algeria; Labged Laboratory, Computer Science Department, Badji Mokhtar University, P.O. Box 12, Annaba, 23000, Algeria; Universite 20 Aout 1955 – Skikda, 21000, Algeria; College of Technological Innovation, Zayed University, P.O. Box 144534, Abu Dhabi, UAE

Abstract: Identifying discourse relations, whether implicit or explicit, has seen renewed interest and remains an open challenge. We present the first model that automatically identifies both explicit and implicit rhetorical relations among intra-sentence discourse segments in Arabic text. We build a large discourse-annotated corpus following the rhetorical structure theory framework. Our list of rhetorical relations is organised into a three-level hierarchy of 23 fine-grained relations, grouped into seven classes. To automatically learn these relations, we evaluate and reuse features from the literature, and contribute three additional features: accusative of purpose, specific connectives and the number of antonym words. We perform experiments on identifying fine-grained and coarse-grained relations. The results show that, compared with all the baselines, our model achieves the best performance in most cases, with an accuracy of 91.05%.

Keywords: discourse relations; rhetorical structure theory; Arabic language.

DOI: 10.1504/IJISTA.2019.099345

International Journal of Intelligent Systems Technologies and Applications, 2019 Vol.18 No.3, pp.281-302

Received: 19 Oct 2017
Accepted: 23 Nov 2017

Published online: 29 Apr 2019
