Title: Classical and swarm-based approaches for feature selection in spam filtering

Authors: Kamilia Menghour; Labiba Souici-Meslati

Addresses: LISCO Laboratory, Badji Mokhtar-Annaba University, P.O. Box 12, 23000, Annaba, Algeria (both authors)

Abstract: Feature selection is an important stage in classification and data mining systems. As a preprocessing step for machine learning, it is effective at reducing dimensionality, removing irrelevant data, increasing learning accuracy, and improving the comprehensibility of results. Swarm intelligence, which deals with natural and artificial systems composed of many individuals that coordinate through decentralised control and self-organisation, is an interesting recent trend in feature selection. In this article, we propose several feature selection algorithms for e-mail classification, based on classical feature ranking methods as well as different variants of ant colony optimisation and binary particle swarm optimisation. We present a comparative study of the performance of these approaches when applied in conjunction with Naive Bayes classifiers. Our experimental results show that the proposed feature selection approaches can achieve significant predictive error rates in spam filtering while significantly reducing the number of selected features. These results were confirmed by our experiments on other public databases.
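The abstract names binary particle swarm optimisation but does not give its update rules. As an illustration only, the following is a minimal sketch of a standard binary PSO with a sigmoid transfer function, searching over 0/1 feature masks. It is not the authors' implementation: the function names (`binary_pso`, `toy_fitness`), the parameter values, and the toy fitness are all assumptions. In a wrapper setting such as the one described above, the fitness would presumably be derived from the performance of a Naive Bayes classifier trained on the selected features.

```python
import math
import random


def binary_pso(fitness, n_features, n_particles=10, n_iters=30,
               w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Maximise fitness(mask) over 0/1 masks of length n_features.

    All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Random initial masks (positions) and velocities.
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)]
           for _ in range(n_particles)]
    # Personal bests and the global best seen so far.
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                v = max(-v_max, min(v_max, v))  # clamp the velocity
                vel[i][d] = v
                # Sigmoid transfer: velocity -> probability that the bit is 1.
                prob = 1.0 / (1.0 + math.exp(-v))
                pos[i][d] = 1 if rng.random() < prob else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


# Hypothetical stand-in for a classifier-based fitness: reward two
# "relevant" features and penalise the size of the selected subset.
RELEVANT = {0, 3}


def toy_fitness(mask):
    hits = sum(1 for i in RELEVANT if mask[i])
    return hits - 0.1 * sum(mask)


best_mask, best_fit = binary_pso(toy_fitness, n_features=8)
print(best_mask, best_fit)
```

The size penalty in `toy_fitness` mirrors the two goals stated in the abstract, a low error rate and a small feature subset, but its weight of 0.1 is arbitrary.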

Keywords: feature selection; swarm intelligence; ant colony optimisation; ACO; particle swarm optimisation; PSO; spam filtering; feature ranking; naive Bayes classifiers; email classification.

DOI: 10.1504/IJAIP.2014.065225

International Journal of Advanced Intelligence Paradigms, 2014, Vol. 6, No. 3, pp. 214-234

Received: 20 May 2013
Accepted: 17 Mar 2014

Published online: 29 Oct 2014
