Title: A clustering-based hybrid approach for dual data reduction

Authors: Saroj Ratnoo; Seema Rathee; Jyoti Ahuja

Addresses: Department of Computer Science and Engineering, Guru Jambheshwar University of Science and Technology, Hisar-125001, India; Department of Computer Science and Engineering, Guru Jambheshwar University of Science and Technology, Hisar-125001, India; Department of Computer Science, Government Post Graduate College for Women, Rohtak-124001, India

Abstract: Research on data reduction techniques has become important for enhancing the efficacy and efficiency of data mining algorithms, which may otherwise be compromised by a large number of irrelevant attributes and redundant instances. Data can be reduced by selecting a subset of either attributes or instances. Dual selection treats feature selection and instance selection together as a single optimisation problem. This problem is relatively difficult because it involves an enormously large search space. In this paper, we propose a hybrid instance-feature selection method, HIFS-CHC, which uses the CHC adaptive search genetic algorithm (heterogeneous recombination and cataclysmic mutation) to solve the problem of dual selection. The proposed approach works in two stages. In the first stage, the K-means clustering algorithm is used to reduce the search space. The second stage incorporates stratified prototype selection and the CHC algorithm for data reduction. The clustering-based hybrid scheme is experimentally tested on sixteen benchmark datasets and compared with other similar data reduction algorithms with respect to predictive accuracy, reduction rate and execution time. Experimental results show that the proposed method outperforms the other methods in terms of reduction rate and execution time while preserving predictive accuracy at almost the same level.
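
As an illustration of what treating dual selection as a single optimisation problem can look like, the sketch below scores one candidate solution made of two binary masks, one over instances and one over features. The abstract does not state the encoding or the fitness function used by HIFS-CHC, so everything here is an assumption made for illustration: the function names, the 1-NN classifier, and the weight `alpha` that trades accuracy against reduction rate are all hypothetical.

```python
from typing import List, Sequence


def reduction_rate(instance_mask: Sequence[int], feature_mask: Sequence[int]) -> float:
    """Fraction of the original data matrix cells removed by the two masks."""
    kept = sum(instance_mask) * sum(feature_mask)
    total = len(instance_mask) * len(feature_mask)
    return 1.0 - kept / total


def one_nn_accuracy(X: List[List[float]], y: List[int],
                    instance_mask: Sequence[int], feature_mask: Sequence[int]) -> float:
    """Accuracy of a 1-NN classifier that only sees the selected instances
    (prototypes) and the selected features. Assumption: every instance is
    classified, including the prototypes themselves (no leave-one-out)."""
    protos = [i for i, keep in enumerate(instance_mask) if keep]
    feats = [j for j, keep in enumerate(feature_mask) if keep]
    if not protos or not feats:
        return 0.0  # an empty selection cannot classify anything
    correct = 0
    for row, label in zip(X, y):
        # Squared Euclidean distance restricted to the selected features.
        nearest = min(protos, key=lambda p: sum((row[j] - X[p][j]) ** 2 for j in feats))
        correct += int(y[nearest] == label)
    return correct / len(X)


def fitness(X: List[List[float]], y: List[int],
            instance_mask: Sequence[int], feature_mask: Sequence[int],
            alpha: float = 0.5) -> float:
    """Hypothetical fitness: weighted sum of accuracy and reduction rate."""
    acc = one_nn_accuracy(X, y, instance_mask, feature_mask)
    red = reduction_rate(instance_mask, feature_mask)
    return alpha * acc + (1.0 - alpha) * red
```

A search procedure such as a genetic algorithm would evaluate many such mask pairs and keep the fittest ones. With `n` instances and `m` features there are `2 ** (n + m)` possible mask pairs, which is why shrinking the candidate set before the search starts matters.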

Keywords: Feature selection; instance selection; dual selection; data reduction; hybrid evolutionary approach.

DOI: 10.1504/IJIEI.2018.094511

International Journal of Intelligent Engineering Informatics, 2018 Vol.6 No.5, pp.468 - 490

Received: 26 Jan 2018
Accepted: 15 Mar 2018

Published online: 24 Aug 2018
