Title: Non-persistent stratified sampling based IQRA_IG for scalable reduct generation
Authors: P.S.V.S. Sai Prasad; C. Raghavendra Rao
Addresses: School of Computer and Information Sciences, University of Hyderabad, Hyderabad, Andhra Pradesh, 500046, India; School of Computer and Information Sciences, University of Hyderabad, Hyderabad, Andhra Pradesh, 500046, India
Abstract: Feature selection in large datasets using reducts based on rough set principles is computationally expensive. The existing scalable sampling-based reduct computation algorithms suffer from limitations such as redundancy and inadequacy. This paper develops two algorithms for finding a reduct, based on stratified sampling and non-persistent stratified sampling techniques, which address adequacy and, to a certain extent, redundancy. The paper compares the performance of these algorithms against the discernibility matrix-based sampling approximate reduct algorithm (SARA) and the sample guided improved quick reduct algorithm with information gain heuristic (SGIQRA_IG). The performance of these algorithms is demonstrated on the benchmark large dataset repository of Arizona State University.
Keywords: rough sets; quick reduct; scalability; stratified sampling; fidelity; feature selection.
DOI: 10.1504/IJGCRSIS.2014.060844
International Journal of Granular Computing, Rough Sets and Intelligent Systems, 2014 Vol.3 No.3, pp.234 - 255
Received: 24 Sep 2012
Accepted: 07 Jan 2014
Published online: 29 Jul 2014