Title: Inference of number of prototypes with a framework approach to K-means clustering

Authors: Simon J. Chambers; Ian H. Jarman; Terence A. Etchells; Paulo J.G. Lisboa

Addresses: School of Computing and Mathematical Science, Liverpool John Moores University, Byrom Street, Liverpool L3 3AF, UK (all authors)

Abstract: The selection of an appropriate value of the number of prototypes, K, is an important component in the use of partitioning algorithms such as K-means where such selection is not automatic. This is partly because the purpose of the algorithm is to identify clusters of interest and also because the choice of K is important for ensuring that the resulting partition reflects the underlying structure of the data. This paper introduces a method for guiding the identification of the number of clusters, K, by building upon a larger framework for stabilising partitions using cluster separation and stability. The method is compared with several frequently used algorithms in the published literature, demonstrating the utility of the proposed approach.

Keywords: K-means clustering; partitioning; cluster separation; gap statistic; number of prototypes.

DOI: 10.1504/IJBET.2013.058538

International Journal of Biomedical Engineering and Technology, 2013 Vol.13 No.4, pp.323 - 340

Received: 21 Dec 2012
Accepted: 28 Sep 2013

Published online: 27 Sep 2014
