Title: Modified CURE algorithm with enhancement to identify number of clusters
Authors: Alka Tripathi; Kirtee Panwar
Addresses: Department of Mathematics, Jaypee Institute of Information Technology, A-10, Sector-62, Noida, UP-201307, India (both authors)
Abstract: In this paper, we present an effective way of identifying the number of clusters (k) based on the density of the data in a given dataset and the optimality of the clusters formed. We use internal evaluation of clustering to choose the optimal set of clusters after narrowing the selection space to a small ideal range. This range is identified by partitioning the dataset with a kd-tree so that the partitions created contain densely packed data objects. We use the concept of multiple representative points of a cluster in both partitioning and clustering evaluation, and implement it by modifying the CURE algorithm. A linear transformation method, principal component analysis (PCA), is applied to reduce high-dimensional data to lower dimensions.
Keywords: clustering evaluation; clustering using representatives; CURE; hierarchical clustering; kd-tree; density based partitioning; optimal clusters; principal component analysis; PCA; linear transformation.
International Journal of Artificial Intelligence and Soft Computing, 2016, Vol. 5, No. 3, pp. 226-240
Received: 03 May 2015
Accepted: 22 Jan 2016
Published online: 22 Aug 2016
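
Illustrative sketch: the abstract mentions partitioning the dataset with a kd-tree so that each partition holds densely packed objects. The abstract does not give the authors' splitting or stopping criteria, so the following is only a minimal sketch under stated assumptions: the function name `kd_partition`, the `max_leaf_size` stopping rule, and the choice of splitting at the median of the widest dimension are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Point = Sequence[float]


def kd_partition(points: List[Point], max_leaf_size: int = 4) -> List[List[Point]]:
    """Split points kd-tree style and return the leaf partitions.

    Assumptions (not from the paper): split at the median of the
    dimension with the largest spread, and stop once a partition
    holds at most `max_leaf_size` points.
    """
    if max_leaf_size < 1:
        raise ValueError("max_leaf_size must be at least 1")
    if len(points) <= max_leaf_size:
        return [list(points)]

    dims = len(points[0])
    # Spread (max - min) of the points along each dimension.
    spreads = [
        max(p[d] for p in points) - min(p[d] for p in points)
        for d in range(dims)
    ]
    axis = spreads.index(max(spreads))
    if spreads[axis] == 0:
        # All points coincide, so no further split is possible.
        return [list(points)]

    # Median split along the chosen axis; both halves are non-empty.
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return (
        kd_partition(ordered[:mid], max_leaf_size)
        + kd_partition(ordered[mid:], max_leaf_size)
    )
```

Under these assumptions, the number of leaf partitions could serve as a rough starting point for the range of candidate values of k, which an internal evaluation measure would then narrow down. How the paper actually derives that range is not stated in the abstract.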