Title: Data analytics: feature extraction for application with small sample in classification algorithms

Authors: L. Kamatchi Priya; M.K. Kavitha Devi; S. Nagarajan

Addresses: Department of Computer Science and Engineering, Vickram College of Engineering Enathi, Sivagangai, Madurai – 630561, India; Department of Computer Science and Engineering, Thiagarajar College of Engineering, Madurai – 625005, India; Department of Mechanical Engineering, Vickram College of Engineering Enathi, Sivagangai, Madurai – 630561, India

Abstract: This paper focuses on improving classification accuracy for supervised learning in application areas with very few training samples and extremely high dimensionality. It proposes a framework that acts as a decision support system, incorporating both feature selection and feature extraction to improve classification accuracy. The feature selection technique comprises redundancy elimination and relevance analysis. Feature subset selection eliminates redundant features by using a correlation-based maximum spanning tree. However, the eliminated features may contain useful information that contributes to determining the target or class labels. The principal components are therefore extracted from the eliminated features and combined with the selected features to perform classification. The superiority of the proposed method over other feature selection methods, in terms of computational complexity and classification accuracy, is established extensively on various datasets.
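The pipeline outlined in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify how the maximum spanning tree is used to decide which features are redundant, how relevance is measured, or how many principal components are kept. The sketch assumes absolute Pearson correlation as both the edge weight and the relevance score, a hypothetical `threshold` parameter for declaring a tree edge redundant, and a single principal component computed by power iteration. All function names are illustrative.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def maximum_spanning_tree(weights):
    """Prim's algorithm on a dense symmetric weight matrix; returns edges (i, j, w)."""
    n = len(weights)
    in_tree, edges = {0}, []
    while len(in_tree) < n:
        i, j = max(
            ((i, j) for i in in_tree for j in range(n) if j not in in_tree),
            key=lambda e: weights[e[0]][e[1]],
        )
        edges.append((i, j, weights[i][j]))
        in_tree.add(j)
    return edges


def split_features(columns, labels, threshold=0.9):
    """Assumed rule: on each strongly correlated tree edge, drop the less relevant endpoint."""
    n = len(columns)
    corr = [[abs(pearson(columns[i], columns[j])) for j in range(n)] for i in range(n)]
    relevance = [abs(pearson(col, labels)) for col in columns]
    dropped = set()
    for i, j, w in sorted(maximum_spanning_tree(corr), key=lambda e: -e[2]):
        if w >= threshold and i not in dropped and j not in dropped:
            dropped.add(i if relevance[i] < relevance[j] else j)
    selected = [k for k in range(n) if k not in dropped]
    return selected, sorted(dropped)


def first_principal_component(columns, iterations=200):
    """Per-sample scores on the leading principal component (needs at least two samples)."""
    centered = []
    for col in columns:
        mean = sum(col) / len(col)
        centered.append([x - mean for x in col])
    d, m = len(centered), len(centered[0])
    cov = [
        [sum(a * b for a, b in zip(centered[i], centered[j])) / (m - 1) for j in range(d)]
        for i in range(d)
    ]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return [sum(v[i] * centered[i][s] for i in range(d)) for s in range(m)]


def build_feature_matrix(columns, labels, threshold=0.9):
    """Selected features plus one principal component of the eliminated ones."""
    selected, dropped = split_features(columns, labels, threshold)
    out = [columns[k] for k in selected]
    if dropped:
        out.append(first_principal_component([columns[k] for k in dropped]))
    return selected, dropped, out


if __name__ == "__main__":
    # Toy data: feature 1 is nearly a multiple of feature 0; feature 2 is unrelated.
    features = [[1, 2, 3, 4, 5, 6], [2.1, 3.9, 6.2, 8.0, 9.9, 12.1], [5, 1, 4, 2, 6, 3]]
    target = [0, 0, 0, 1, 1, 1]
    print(build_feature_matrix(features, target))
```

The returned columns (selected features followed by the extracted component) would then be passed to any classifier. Features are stored column-wise here to keep the correlation code short; a real implementation would more likely use a numerical library and retain several components.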

Keywords: classification; feature selection; feature extraction; redundancy elimination; relevance analysis; maximum spanning tree; MST; supervised learning.

DOI: 10.1504/IJBIS.2017.087108

International Journal of Business Information Systems, 2017 Vol.26 No.3, pp.378 - 401

Received: 07 Mar 2016
Accepted: 13 May 2016

Published online: 06 Oct 2017
