Title: A hybrid classification-based model for automatic text summarisation using machine learning approaches: CBS-ID3MV

Authors: M. Esther Hannah

Addresses: Department of Information Technology, St. Joseph's College of Engineering, Sholinganallur, Chennai, TN, India

Abstract: This paper presents CBS-ID3MV, a hybrid approach to automatic text summarisation. CBS-ID3MV is a classification-based model that combines ID3 with a multivariate approach, producing summaries from text documents through classification and multiple linear regression. Efficient feature selection and extraction methods identify text features in each sentence, which are then used to classify summary sentences. The model is trained on DUC 2002 training documents, and its performance is measured on the text data at compression rates of 10%, 20% and 30%. Evaluated with ROUGE metrics, the proposed framework performs better than the other summarisers it is compared with.
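The abstract does not say how a compression rate is turned into a summary, so the following is only a minimal sketch under stated assumptions: each sentence already has a numeric score (for example from a regression over its features), and a compression rate of N% means keeping roughly N% of the sentences. The function name `select_summary` and the example scores are hypothetical and are not taken from the paper.

```python
def select_summary(sentences, scores, compression_percent):
    """Keep the top-scoring share of sentences, in original document order.

    Assumption (not from the paper): compression_percent is the percentage
    of sentences to keep, and a higher score means a better summary sentence.
    """
    if len(sentences) != len(scores):
        raise ValueError("sentences and scores must have the same length")
    if not sentences:
        return []
    # Integer arithmetic avoids floating-point surprises; keep at least one.
    k = max(1, (len(sentences) * compression_percent) // 100)
    # Rank sentence indices by descending score; ties go to the earlier one.
    ranked = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    # Restore document order so the summary reads naturally.
    return [sentences[i] for i in sorted(ranked[:k])]


# Hypothetical ten-sentence document with made-up scores.
sentences = [f"s{i}" for i in range(10)]
scores = [0.1, 0.9, 0.3, 0.8, 0.2, 0.5, 0.4, 0.7, 0.6, 0.0]

for percent in (10, 20, 30):
    print(percent, select_summary(sentences, scores, percent))
```

Restoring document order after ranking is a design choice of this sketch: it keeps the extracted sentences in the sequence in which they appear in the source text.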

Keywords: training; classification; machine learning; decision trees; feature extraction; summarisation; regression.

DOI: 10.1504/IJPD.2019.099242

International Journal of Product Development, 2019 Vol. 23, No. 2/3, pp. 201-211

Received: 14 Feb 2018
Accepted: 28 Jan 2019

Published online: 23 Apr 2019
