Title: Using Laplace and angular measures for Feature Selection in Text Categorisation
Authors: Elena Montanes, Pedro Alonso, Elias F. Combarro, Irene Diaz, Raquel Cortina, Jose Ranilla
Addresses: Computer Science Department, University of Oviedo, Spain; Mathematics Department, University of Oviedo, Spain; Computer Science Department, University of Oviedo, Spain; Computer Science Department, University of Oviedo, Spain; Computer Science Department, University of Oviedo, Spain; Computer Science Department, University of Oviedo, Spain
Abstract: Text Categorisation (TC) consists of automatically assigning documents to a set of predefined categories. It usually involves managing a huge number of features, some of which are irrelevant or noisy and mislead the classifiers. The feature set is therefore reduced to increase the efficiency and effectiveness of the classification. In this paper we propose to select relevant features using two different families of filtering measures, which are simpler than the measures usually applied for this purpose. Experiments over three corpora show that, in general, the proposed measures perform as well as or better than the existing ones, sometimes allowing greater reductions.
Keywords: feature selection; text categorisation; polynomial filtering measures.
DOI: 10.1504/IJAIP.2008.020819
International Journal of Advanced Intelligence Paradigms, 2008 Vol.1 No.1, pp.40 - 59
Published online: 17 Oct 2008