Title: An improved algorithm to handle noise objects in the process of clustering

Authors: Hasanthi A. Pathberiya; Chandima D. Tilakaratne; Liwan L. Hansen

Addresses: Department of Statistics, University of Sri Jayewardenepura, Nugegoda, Sri Lanka; Department of Statistics, University of Colombo, Colombo, Sri Lanka; School of Computing, Engineering and Mathematics, Western Sydney University, Locked Bag 1797, Penrith NSW 2751, Australia

Abstract: Cluster analysis is an approach to unsupervised learning. It seeks to recognise the hidden grouping structure in a set of objects using a predefined set of rules. Objects with unusual characteristics add noise to the data space, which complicates the clustering structure and can lead to its misinterpretation. This study proposes a novel iterative approach to eradicate the effect of noise objects in the process of deriving clusters of data. The performance of the proposed approach is tested on partitioning, hierarchical and neural network based clustering algorithms using both simulated and standard datasets supplemented with noise. Compared with conventional clustering algorithms, the proposed approach yields an improvement in the quality of the clustering structure.

Keywords: clustering algorithms; handling noise data; mining methods and algorithms; k-means; Ward's method; self-organising map.

DOI: 10.1504/IJDS.2019.098358

International Journal of Data Science, 2019 Vol.4 No.1, pp.1-17

Available online: 11 Mar 2019
