Authors: Athraa Jasim Mohammed; Yuhanis Yusof; Husniza Husni
Addresses: School of Computing, Universiti Utara Malaysia, 06010 Sintok, Kedah, Malaysia; University of Technology, Baghdad, Iraq ' School of Computing, Universiti Utara Malaysia, 06010 Sintok, Kedah, Malaysia ' School of Computing, Universiti Utara Malaysia, 06010 Sintok, Kedah, Malaysia
Abstract: Conventional clustering techniques require a pre-determined number of clusters; unfortunately, missing information about real-world problems makes that number hard to choose. A newer direction in data clustering is to cluster a given set of items automatically by identifying the appropriate number of clusters and the optimal centre for each cluster. In this paper, we present the WFA_selection algorithm, which is derived from the weight-based firefly algorithm (WFA). The proposed WFA_selection merges selected clusters in order to produce clusters of better quality. Experiments utilising the WFA and WFA_selection algorithms were conducted on the 20Newsgroups and Reuters-21578 benchmark datasets, and the outputs were compared against bisect K-means and the general stochastic clustering method (GSCM). Results demonstrate that WFA_selection generates more robust and compact clusters than the WFA, bisect K-means and GSCM.
Keywords: partitional clustering; dynamic clustering; hierarchical clustering; text clustering; firefly algorithm; cluster discovering; optimal clusters; data clustering; bisect K-means clustering; general stochastic clustering.
International Journal of Data Mining, Modelling and Management, 2016 Vol.8 No.4, pp.330 - 347
Received: 09 Sep 2014
Accepted: 02 May 2015
Published online: 29 Dec 2016