Title: Multi-label text classification using optimised feature sets

Authors: J. Maruthupandi; K. Vimala Devi

Addresses: Department of Information Technology, Mepco Schlenk Engineering College, Mepco Nagar, Post-626005, Sivakasi, Tamilnadu, India; Department of Computer Science Engineering, Velammal Engineering College, Velammal Nagar, Ambattur-Red Hills Road, Chennai – 600 066, Tamilnadu, India

Abstract: Multi-label text classification is the process of assigning multiple labels to an instance. A significant aspect of the text classification problem is the high dimensionality of the data, which hinders the performance of the classifier. Hence, feature selection, which removes irrelevant data, plays a significant role in the classification process. In this paper, a wrapper-based hybrid artificial bee colony and bacterial foraging optimisation (HABBFO) approach is proposed to select the most appropriate feature subset for prediction. Initially, pre-processing such as tokenisation, stop word removal and stemming is performed to extract the features (words). Experiments are conducted on a benchmark dataset, and the results show that the proposed approach achieves better performance than the other feature selection techniques.
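
The pre-processing stage named in the abstract (tokenisation, stop word removal, stemming) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the tiny stop-word list and the naive suffix-stripping rule are hypothetical stand-ins, since the abstract does not say which tokeniser, stop-word list or stemmer the authors used. The HABBFO feature selection step itself is not shown.

```python
import re

# Hypothetical, deliberately small stop-word list (an assumption, not the paper's list).
STOP_WORDS = {"a", "an", "and", "are", "is", "of", "the", "to", "in", "on", "for"}

# Naive suffix rules standing in for a real stemmer such as Porter's;
# this is an illustrative assumption, not the stemmer used in the paper.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenise(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def naive_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def extract_features(text):
    """Run the three steps in order and return the candidate features (words)."""
    return [naive_stem(t) for t in remove_stop_words(tokenise(text))]
```

Calling `extract_features` on a document yields the word-level candidate features that a feature selection method would then search over; a production pipeline would likely replace `naive_stem` with an established stemmer.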

Keywords: multi-label; text classification; feature selection.

DOI: 10.1504/IJDMMM.2017.086583

International Journal of Data Mining, Modelling and Management, 2017 Vol.9 No.3, pp.237-248

Received: 06 Apr 2016
Accepted: 22 Feb 2017

Published online: 12 Sep 2017
