Title: Efficient data selection approach in projected feature space for fast training support vector machines
Authors: Sonia Chaibi; Mohamed Tayeb Laskri
Addresses: LRS Laboratory, Computer Science Department, Badji Mokhtar-Annaba University – UBMA, P.O. Box 12, Annaba, 23000 Annaba, Algeria; LRS Laboratory, Computer Science Department, Badji Mokhtar-Annaba University – UBMA, P.O. Box 12, Annaba, 23000 Annaba, Algeria
Abstract: Support vector machines (SVMs) have shown superior performance compared to other machine learning techniques, especially in classification problems. One limitation, however, is their long training time, which increases with the size of the dataset. This problem has been investigated thoroughly, and different algorithms for classification have been used with varying degrees of success. Among them, clustering techniques have shown considerable success in reducing the SVM training data. However, when these solutions are applied to large-scale datasets, it becomes clear that clustering alone is insufficient. In this paper, we tackle the problem of how to combine clustering methods with feature reduction techniques to efficiently minimise SVM complexity. Several experiments on different datasets show that the proposed solution can be a promising approach for fast SVM training on large-scale datasets.
Keywords: support vector machines; K-means clustering; SVM training; linear discriminant analysis; feature reduction; data mining; data selection; projected feature space; fast training SVM.
DOI: 10.1504/IJBIDM.2014.068349
International Journal of Business Intelligence and Data Mining, 2014 Vol.9 No.3, pp.179 - 194
Received: 11 Jun 2014
Accepted: 10 Jul 2014
Published online: 10 Apr 2015