Title: Investigating different fitness criteria for swarm-based clustering

Authors: Maria Priscila Da Silva Souza; Telmo De Menezes e Silva Filho; Getulio José Amorim Do Amaral; Renata Maria Cardoso Rodrigues De Souza

Addresses: Departamento de Estatística, Centro de Ciências Exatas e da Natureza (CCEN), Universidade Federal de Pernambuco, Av. Jornalista Aníbal Fernandes, s/n, 50.740-560 Recife (PE), Brazil; Centro de Informática (CIn), Universidade Federal de Pernambuco, Av. Jornalista Aníbal Fernandes, s/n, 50.740-560 Recife (PE), Brazil; Departamento de Estatística, Centro de Ciências Exatas e da Natureza (CCEN), Universidade Federal de Pernambuco, Av. Jornalista Aníbal Fernandes, s/n, 50.740-560 Recife (PE), Brazil; Centro de Informática (CIn), Universidade Federal de Pernambuco, Av. Jornalista Aníbal Fernandes, s/n, 50.740-560 Recife (PE), Brazil

Abstract: Swarm-based optimisation methods have previously been applied to clustering tasks with good results. However, the results obtained by this kind of algorithm depend heavily on the chosen fitness criterion. In this work, we investigate the influence of four different fitness criteria on swarm-based clustering performance. The first criterion is the sum of distances between instances and their cluster centroids, which is the most widely used clustering criterion. The remaining three are based on different types of data dispersion: total dispersion, within-group dispersion and between-groups dispersion. We use a swarm-based algorithm to optimise these criteria and perform clustering tasks on nine real and artificial datasets. For each dataset, we select the best criterion in terms of adjusted Rand index and compare it with three state-of-the-art swarm-based clustering algorithms, each trained with its originally proposed criterion. Numerical results confirm the importance of selecting an appropriate fitness criterion for each clustering task.
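
The four quantities named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so the definitions below are assumptions: Euclidean distance for the first criterion, and the scalar (trace) form of the standard total, within-group and between-groups scatter for the other three. The function name `fitness_criteria` and the dictionary keys are hypothetical, chosen only for this illustration.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fitness_criteria(data, labels):
    """Return four candidate fitness values for a given partition.

    Assumed definitions (not taken from the paper):
      sum_of_distances: sum of Euclidean distances to the assigned centroid
      within:  sum of squared distances to the assigned centroid
      between: size-weighted squared distances from centroids to the grand mean
      total:   sum of squared distances to the grand mean
    """
    # Group the points by cluster label.
    clusters = {}
    for point, label in zip(data, labels):
        clusters.setdefault(label, []).append(point)

    centroids = {k: centroid(pts) for k, pts in clusters.items()}
    grand_mean = centroid(data)

    sum_of_distances = sum(
        math.sqrt(sq_dist(p, centroids[l])) for p, l in zip(data, labels)
    )
    within = sum(sq_dist(p, centroids[l]) for p, l in zip(data, labels))
    between = sum(
        len(pts) * sq_dist(centroids[k], grand_mean)
        for k, pts in clusters.items()
    )
    total = sum(sq_dist(p, grand_mean) for p in data)

    return {
        "sum_of_distances": sum_of_distances,
        "within": within,
        "between": between,
        "total": total,
    }


# Toy example: two well-separated pairs of points.
data = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
labels = [0, 0, 1, 1]
result = fitness_criteria(data, labels)
```

Under these assumed definitions the total dispersion equals the sum of the within-group and between-groups terms, so a swarm optimiser would typically minimise the within-group value or maximise the between-groups value; how the paper turns each dispersion into a fitness function is not stated in the abstract.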

Keywords: swarm optimisation; fitness criterion; clustering; artificial bee colony; particle swarm optimisation.

DOI: 10.1504/IJBIDM.2019.100455

International Journal of Business Intelligence and Data Mining, 2019 Vol.15 No.1, pp.117 - 131

Received: 17 Feb 2017
Accepted: 08 May 2017

Published online: 29 Jun 2019
