Title: Modified CURE algorithm with enhancement to identify number of clusters

Authors: Alka Tripathi; Kirtee Panwar

Addresses: Department of Mathematics, Jaypee Institute of Information Technology, A-10, Sector-62, Noida, UP-201307, India; Department of Mathematics, Jaypee Institute of Information Technology, A-10, Sector-62, Noida, UP-201307, India

Abstract: In this paper, we present an effective way of identifying the number of clusters (k) based on the density of the data in a given dataset and the optimality of the clusters formed. We use internal evaluation of clustering to choose the optimal set of clusters after narrowing the selection space to a small ideal range. This range is identified by partitioning the dataset with a kd-tree so that the partitions created contain densely packed data objects. We use the concept of multiple representative points of a cluster in both partitioning and clustering evaluation, and implement it by modifying the CURE algorithm. A linear transformation method, principal component analysis (PCA), is applied to reduce high-dimensional data to lower dimensions.
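
The abstract does not spell out how the kd-tree partitioning is carried out. As a rough illustration only, the sketch below splits a dataset recursively at the median of its widest dimension until each partition holds at most `max_leaf_size` points. The function name `kd_partition`, the `max_leaf_size` threshold, and the widest-dimension splitting rule are all assumptions made for this example; they stand in for the paper's density criterion and are not the authors' implementation.

```python
from typing import List, Sequence

Point = Sequence[float]


def kd_partition(points: List[Point], max_leaf_size: int) -> List[List[Point]]:
    """Split points kd-tree style and return the leaf partitions.

    Assumed rule (not taken from the paper): split at the median of the
    dimension with the largest spread until a leaf is small enough.
    """
    if max_leaf_size < 1:
        raise ValueError("max_leaf_size must be at least 1")
    if len(points) <= max_leaf_size:
        return [list(points)]

    dims = len(points[0])
    # Spread (max - min) of the data along each dimension.
    spreads = [
        max(p[d] for p in points) - min(p[d] for p in points)
        for d in range(dims)
    ]
    axis = spreads.index(max(spreads))
    if spreads[axis] == 0:
        # All points coincide, so no further split is possible.
        return [list(points)]

    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return (
        kd_partition(ordered[:mid], max_leaf_size)
        + kd_partition(ordered[mid:], max_leaf_size)
    )


# Two well-separated groups of three points each (made-up data).
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
leaves = kd_partition(points, max_leaf_size=3)
```

Under these assumptions, the number of leaves could serve as one rough way to bound the candidate values of k before an internal evaluation measure picks among them.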

Keywords: clustering evaluation; clustering using representatives; CURE; hierarchical clustering; kd-tree; density based partitioning; optimal clusters; principal component analysis; PCA; linear transformation.

DOI: 10.1504/IJAISC.2016.078517

International Journal of Artificial Intelligence and Soft Computing, 2016 Vol.5 No.3, pp.226 - 240

Received: 03 May 2015
Accepted: 22 Jan 2016

Published online: 22 Aug 2016
