Title: Effective hybrid PSO and K-means clustering algorithm for gene expression data

Authors: T. Geetha, Michael Arock

Addresses: Department of Computer Applications, National Institute of Technology, Tiruchirappalli, Tamil Nadu-620015, India (both authors)

Abstract: DNA microarray technology helps monitor the expression levels of thousands of genes. This paper presents a method for clustering gene expression data that combines particle swarm optimisation (PSO) with the K-means algorithm. Recent studies have shown that partition-based clustering algorithms are more suitable for clustering large datasets. Among these, K-means is the most widely used because it is simple to implement and converges quickly, but it is prone to getting trapped in local optima. PSO is a population-based stochastic search process that automatically searches the search space for the optimal solution, so it is combined with the K-means algorithm for clustering. In previous PSO-based work, particles tend to flock at the boundary of the search space. Our technique removes these boundary blocks by moving the boundary particles towards the global best particle, which improves the effectiveness of PSO. The hybrid PSO, K-means and PSO algorithms are compared on several datasets. Of the three algorithms, the hybrid PSO algorithm performs well on most of the datasets.
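The combination described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the particle encoding, the fitness function, the PSO parameters or the exact boundary rule, so everything below is an assumption made for illustration. Here each particle is assumed to hold k centroids flattened into one vector, fitness is assumed to be the sum of squared distances to the nearest centroid, a coordinate that leaves the data range is assumed to be pulled halfway towards the global best particle, and K-means is assumed to refine the best particle at the end. The function name hybrid_pso_kmeans and the defaults for w, c1 and c2 are hypothetical.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def sse(data, centroids):
    # Sum of squared distances from each point to its nearest centroid.
    return sum(min(sq_dist(p, c) for c in centroids) for p in data)

def kmeans(data, centroids, iters=20):
    # Plain Lloyd iterations starting from the given centroids.
    centroids = [list(c) for c in centroids]
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in data:
            j = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
            groups[j].append(p)
        for j, g in enumerate(groups):
            if g:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(g) for col in zip(*g)]
    return centroids

def hybrid_pso_kmeans(data, k, n_particles=10, iters=30,
                      w=0.7, c1=1.5, c2=1.5, seed=0):
    # Hypothetical parameter defaults; the abstract does not specify any.
    rng = random.Random(seed)
    dim = len(data[0])
    lo = [min(p[d] for p in data) for d in range(dim)]
    hi = [max(p[d] for p in data) for d in range(dim)]
    n = k * dim  # assumed encoding: k centroids flattened into one vector

    def unflatten(x):
        return [x[i * dim:(i + 1) * dim] for i in range(k)]

    def fitness(x):
        return sse(data, unflatten(x))

    # Initialise every particle from k distinct data points.
    pos = [[c for p in rng.sample(data, k) for c in p] for _ in range(n_particles)]
    vel = [[0.0] * n for _ in range(n_particles)]
    pbest = [list(x) for x in pos]
    pbest_f = [fitness(x) for x in pos]
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = list(pbest[g]), pbest_f[g]

    for _ in range(iters):
        for i in range(n_particles):
            for j in range(n):
                r1, r2 = rng.random(), rng.random()
                vel[i][j] = (w * vel[i][j]
                             + c1 * r1 * (pbest[i][j] - pos[i][j])
                             + c2 * r2 * (gbest[j] - pos[i][j]))
                pos[i][j] += vel[i][j]
                d = j % dim
                if pos[i][j] < lo[d] or pos[i][j] > hi[d]:
                    # Assumed boundary rule: clamp to the data range, then
                    # move halfway towards the global best particle.
                    clamped = min(max(pos[i][j], lo[d]), hi[d])
                    pos[i][j] = 0.5 * (clamped + gbest[j])
                    vel[i][j] = 0.0
            f = fitness(pos[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = list(pos[i]), f
                if f < gbest_f:
                    gbest, gbest_f = list(pos[i]), f

    # Refine the best particle found by PSO with K-means.
    return kmeans(data, unflatten(gbest))
```

Calling hybrid_pso_kmeans(data, k) on a list of equal-length numeric tuples returns k centroids. Running PSO first and K-means second is only one possible way to combine the two; the abstract does not say in which order or how often the two steps alternate.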

Keywords: gene expressions; DNA microarrays; K-means clustering; particle swarm optimisation; PSO.

DOI: 10.1504/IJRAPIDM.2009.029381

International Journal of Rapid Manufacturing, 2009 Vol.1 No.2, pp.173 - 188

Published online: 28 Nov 2009
