Title: A density oriented fuzzy C-means clustering algorithm for recognising original cluster shapes from noisy data

Authors: Prabhjot Kaur, Anjana Gosain

Addresses: Department of Information Technology, Maharaja Surajmal Institute of Technology, C4, Janakpuri, New Delhi 110058, India; Department of Information Technology, University School of Information Technology, Guru Gobind Singh Indraprastha University, New Delhi 110058, India

Abstract: Many clustering algorithms in the literature are robust against outliers. They are robust because they reduce the effect of outliers on the cluster centroid locations, but they do not produce efficient clusters because they still include outliers in the final clusters. The limitation of these algorithms is that they do not identify outliers. In this paper, we propose an algorithm, density oriented fuzzy C-means (DOFCM), which identifies outliers based on the density of points in the dataset before creating clusters and produces n + 1 clusters: n good clusters and one invalid cluster containing noise and outliers. The proposed technique is based on the concept that if these outliers are not required in clustering, then their memberships should not be involved during clustering. We try to nullify the effect of outliers by assigning them zero membership during clustering. The algorithm is applied to various synthetic datasets and to Bensaid's data, and is compared with well-known robust clustering techniques, namely PFCM, CFCM, and NC. The comparison indicates that DOFCM is the best of these methods at recognising the original shape of clusters in noisy datasets.
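
The two-stage idea in the abstract (flag low-density points first, then run fuzzy C-means with those points held at zero membership) can be illustrated with a minimal Python sketch. The abstract does not give the density criterion or any parameter values, so everything specific here is an assumption: the neighbour-count test in `flag_outliers`, the `radius` and `alpha` parameters, the farthest-point initialisation, and all function names are hypothetical and chosen only for illustration. It is not the authors' implementation.

```python
import math

def flag_outliers(points, radius, alpha):
    """Assumed density test: a point is an outlier if its neighbour count
    within `radius` is below `alpha` times the largest neighbour count."""
    counts = [sum(1 for j, q in enumerate(points)
                  if j != i and math.dist(p, q) <= radius)
              for i, p in enumerate(points)]
    max_count = max(counts) or 1
    return [c / max_count < alpha for c in counts]

def init_centroids(points, k):
    """Deterministic farthest-point initialisation (an illustrative choice)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(far))
    return centroids

def memberships_for(point, centroids, m):
    """Standard fuzzy C-means membership of one point in every cluster."""
    d = [math.dist(point, c) for c in centroids]
    zeros = [x == 0.0 for x in d]
    if any(zeros):  # point coincides with a centroid
        return [float(z) / sum(zeros) for z in zeros]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d[i] / d[j]) ** power for j in range(len(d)))
            for i in range(len(d))]

def fuzzy_c_means(points, k, m=2.0, n_iter=50):
    """Plain fuzzy C-means: alternate membership and centroid updates."""
    centroids = init_centroids(points, k)
    dim = len(points[0])
    for _ in range(n_iter):
        u = [memberships_for(p, centroids, m) for p in points]
        for i in range(k):
            w = [row[i] ** m for row in u]
            total = sum(w)
            if total > 0:
                centroids[i] = [sum(wj * p[a] for wj, p in zip(w, points)) / total
                                for a in range(dim)]
    return centroids, [memberships_for(p, centroids, m) for p in points]

def dofcm_sketch(points, k, radius, alpha, m=2.0):
    """Cluster the inliers only; outliers get zero membership and label k,
    which plays the role of the extra (n + 1)-th noise cluster."""
    is_outlier = flag_outliers(points, radius, alpha)
    inliers = [p for p, out in zip(points, is_outlier) if not out]
    centroids, u_in = fuzzy_c_means(inliers, k, m)
    rows = iter(u_in)
    memberships, labels = [], []
    for out in is_outlier:
        if out:
            memberships.append([0.0] * k)
            labels.append(k)
        else:
            row = next(rows)
            memberships.append(row)
            labels.append(row.index(max(row)))
    return centroids, memberships, labels
```

Because the flagged points never enter the centroid update, they cannot pull the centroids away from the dense regions, which is the effect the abstract attributes to assigning them zero membership.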

Keywords: fuzzy clustering; outlier identification; data mining; density-oriented approach; noise clustering; robust clustering; shape recognition; cluster shapes; C-means clustering; point density; noisy data.

DOI: 10.1504/IJICA.2011.039591

International Journal of Innovative Computing and Applications, 2011 Vol.3 No.2, pp.77 - 87

Published online: 21 Mar 2015
