Title: A comparative study of cuckoo search and bat algorithm for Bloom filter optimisation in spam filtering

Authors: Arulanand Natarajan; S. Subramanian; K. Premalatha

Addresses: Anna University of Technology Coimbatore, Academic Campus, Jothipuram Post, Veerapandi Pirivu, Coimbatore – 641 047, Tamil Nadu, India. ' Sri Krishna College of Engineering and Technology, Kuniamuthur, Coimbatore – 641 008, Tamil Nadu, India. ' Department of Computer Science and Engineering, Bannari Amman Institute of Technology, Alathukombai Post, Sathyamangalam, Erode Dt #150; 638 401, Tamil Nadu, India

Abstract: Bloom filter (BF) is a simple but powerful data structure that can check membership to a static set. The trade-off to use Bloom filter is a certain configurable risk of false positives. The odds of a false positive can be made very low if the hash bitmap is sufficiently large. Spam is an irrelevant or inappropriate message sent on the internet to a large number of newsgroups or users. A spam word is a list of well-known words that often appear in spam mails. The proposed system of bin Bloom filter (BBF) groups the words into number of bins with different false positive rates based on the weights of the spam words. Cuckoo search (CS) and bat algorithm are bio-inspired algorithms that imitate the way cuckoo breeding and microbat foraging behaviours respectively. This paper demonstrates the CS and bat algorithm for minimising the total membership invalidation cost of the BBFs by finding the optimal false positive rates and number of elements stored in every bin. The experimental results demonstrate the application of CS and bat algorithm for various numbers of bins and strings.

Keywords: bat algorithm; bin Bloom filter; BBF; cuckoo search; false positive rates; hash function; spam words; bio-inspired computation.

DOI: 10.1504/IJBIC.2012.047179

International Journal of Bio-Inspired Computation, 2012 Vol.4 No.2, pp.89 - 99

Received: 07 Sep 2011
Accepted: 18 Dec 2011

Published online: 22 Sep 2014 *

Full-text access for editors Full-text access for subscribers Purchase this article Comment on this article