Title: Exploiting discourse relations to produce Arabic extracts

Authors: Samira Lagrini; Nabiha Azizi; Mohammed Redjimi

Addresses: Labged Laboratory, Computer Science Department, Badji Mokhtar University, P.O. Box 12, Annaba, 23000, Algeria; Labged Laboratory, Computer Science Department, Badji Mokhtar University, P.O. Box 12, Annaba, 23000, Algeria; LICUS Laboratory, Computer Science Department, University 20 Aout 1955, Skikda, 21000, Algeria

Abstract: Text summarisation is a useful tool for quickly and effectively exploiting the huge amount of textual documents available online. Several approaches have been proposed to date for producing extractive summaries in Arabic. In most cases, however, the linguistic quality of the generated summaries is not satisfactory. In this paper, we attempt to overcome this limitation by proposing a new approach to single-document summarisation that combines discourse analysis within the rhetorical structure theory (RST) framework with a score-based method. Unlike traditional RST-based approaches, the proposed approach exploits intra-sentence discourse relations, rather than the discourse structure of the whole text, to produce a primary summary. Each sentence in the primary summary is then scored using a combination of statistical and linguistic features, and the final summary is produced according to the compression rate specified by the user. The proposed approach was evaluated on the Essex Arabic Summaries Corpus (EASC) using the ROUGE-1 and ROUGE-2 measures and compared against other existing methods. A human evaluation was also conducted to assess the linguistic quality of the generated summaries. The experimental results are very encouraging and show that exploiting discourse relations is very useful for producing Arabic extractive summaries of good linguistic quality.

Keywords: extractive single document summarisation; Arabic discourse analysis; Arabic discourse relations; score based; statistical features.

DOI: 10.1504/IJRIS.2022.10047369

International Journal of Reasoning-based Intelligent Systems, 2022 Vol.14 No.2/3, pp.130 - 143

Received: 23 Oct 2021
Accepted: 15 Mar 2022

Published online: 09 Sep 2022
