Title: Summarisation of subspace clusters based on similarity connectedness

Authors: B. Jaya Lakshmi; M. Shashi; K.B. Madhuri

Addresses: Department of Information Technology, GVP College of Engineering (A), Visakhapatnam, Andhra Pradesh 530048, India; Department of CS & SE, College of Engineering, Andhra University, Visakhapatnam, Andhra Pradesh 530003, India; Department of Information Technology, GVP College of Engineering (A), Visakhapatnam, Andhra Pradesh 530048, India

Abstract: Subspace clustering is an emerging area that explores clusters of objects in various subspaces. Existing subspace clustering algorithms are computationally expensive because they generate a large number of possibly redundant subspace clusters, which limits the interpretability of the results. The problem worsens as the dimensionality of the data increases. This calls for an efficient summarisation framework that generates a limited number of interesting subspace clusters. A novel algorithm, Similarity Connectedness Based clustering on subspace clusters (SCoC), is proposed to form natural groupings of lower-dimensional subspace clusters. The concept of similarity connectedness is introduced to group and merge the subspace clusters formed in different lower-dimensional subspaces, leaping through the lattice of dimensions. The resulting compact, summarised high-dimensional subspace clusters can be easily interpreted for making sound decisions. The SCoC algorithm is thoroughly tested on various benchmark datasets and is found to outperform PCoC and SUBCLU in both cluster quality and execution time.
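
The grouping idea in the abstract can be illustrated with a minimal sketch. This is not the published SCoC algorithm: the abstract does not define the similarity measure or the merge rule, so the sketch assumes Jaccard similarity over the clusters' object sets, treats two clusters as similarity-connected when a chain of pairwise similarities at or above a threshold links them, and merges each group by taking the union of dimensions and the intersection of objects. The function names `jaccard`, `group_by_similarity_connectedness` and `summarise` are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two object-id sets (an assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_by_similarity_connectedness(clusters, threshold):
    """Group clusters that are linked by a chain of similar pairs.

    clusters: list of (dimensions, objects) pairs, each a set or frozenset.
    Returns a list of groups, each a list of indices into `clusters`.
    """
    parent = list(range(len(clusters)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(clusters)), 2):
        if jaccard(clusters[i][1], clusters[j][1]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(clusters)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def summarise(clusters, threshold):
    """Merge each group into one higher-dimensional cluster.

    Assumed merge rule: union of dimensions, intersection of objects.
    """
    summary = []
    for members in group_by_similarity_connectedness(clusters, threshold):
        dims = frozenset().union(*(clusters[i][0] for i in members))
        objs = frozenset(set.intersection(*(set(clusters[i][1]) for i in members)))
        summary.append((dims, objs))
    return summary
```

Under these assumptions, three one-dimensional clusters that share most of their objects collapse into a single three-dimensional cluster, while an unrelated cluster is left as it is. The similarity threshold controls how aggressively clusters are merged.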

Keywords: subspace clusters; summarisation; similarity; similarity connectedness; similarity threshold; groups of subspace clusters.

DOI: 10.1504/IJDS.2018.094504

International Journal of Data Science, 2018, Vol. 3, No. 3, pp. 255-265

Received: 23 Nov 2016
Accepted: 03 Feb 2017

Published online: 04 Sep 2018
