Title: A new feature selection method based on support vector machines for text categorisation

Authors: Yaquan Xu, Haibo Wang

Addresses: School of Business Administration, Virginia State University, Petersburg, VA 23806, USA; AR Sanchez Jr. School of Business, Texas A&M International University, Laredo, TX 78041, USA

Abstract: As a machine intelligence paradigm, support vector machines (SVMs) have tremendous potential for helping people classify text documents into a fixed number of predefined categories. This paper discusses a new feature selection method that combines principal component analysis with class profile-based features to form the input vector for an SVM classifier, and demonstrates the effectiveness of this process. It also demonstrates that the method, applied with SVMs, improves categorisation performance and reduces the time required to configure a learning machine.

Keywords: support vector machines; SVM; text categorisation; principal component analysis; PCA; class profile-based feature selection; text documents; document classification.

DOI: 10.1504/IJDATS.2011.038803

International Journal of Data Analysis Techniques and Strategies, 2011 Vol.3 No.1, pp.1 - 20

Published online: 29 Nov 2014
