Title: Optimisation of sub-space clustering in a high dimension data using Laplacian graph and machine learning

Authors: P.R. Ambika; A. Bharathi Malakreddy

Addresses: Department of Computer Science and Engineering, City Engineering College, Bengaluru – 560062, Karnataka, India; Department of Artificial Intelligence and Machine Learning, BMS Institute of Technology and Management, Bengaluru – 560064, Karnataka, India

Abstract: In many applications, such as business analytics, computer vision and medical data analytics, unsupervised learning is used to cluster high-dimensional data (HDD). Subspace clustering is modelled as a graph problem that must retain the critical features of the N-dimensional data while applying a dimension reduction technique, balancing higher accuracy against lower computational overhead. Most traditional approaches suffer from efficiency degradation when applied to HDD. This paper proposes an optimisation of subspace clustering for learning models using a Laplacian graph on HDD. The proposed model addresses the curse of dimensionality through a Laplacian matrix function that minimises data redundancy within each subspace. The traditional K-nearest neighbour (KNN) algorithm is improved for non-linear classification in subspace clustering on HDD of clinical importance. The proposed system offers a significant improvement in clustering accuracy, reported as 99%.
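As a rough illustration of the graph formulation the abstract mentions, the sketch below builds a K-nearest-neighbour graph over high-dimensional points, forms the unnormalised graph Laplacian L = D - W, and uses its lowest eigenvectors as a low-dimensional embedding. This is a generic textbook construction and not the authors' method; the function names (`knn_affinity`, `graph_laplacian`, `spectral_embedding`), the 0/1 edge weights, the synthetic data and the parameter values are all assumptions made for illustration.

```python
import numpy as np


def knn_affinity(X, k):
    """Symmetric 0/1 affinity matrix of a k-nearest-neighbour graph.

    Hypothetical helper: an edge (i, j) is kept if j is among the k nearest
    neighbours of i, or i is among the k nearest neighbours of j.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbour
    nearest = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    W[rows, nearest.ravel()] = 1.0
    return np.maximum(W, W.T)  # symmetrise


def graph_laplacian(W):
    """Unnormalised graph Laplacian L = D - W, with D the degree matrix."""
    return np.diag(W.sum(axis=1)) - W


def spectral_embedding(L, dim):
    """Eigenvalues of L (ascending) and the eigenvectors of the `dim` smallest."""
    vals, vecs = np.linalg.eigh(L)
    return vals, vecs[:, :dim]


# Synthetic stand-in for high-dimensional data: two well-separated groups
# of 10 points each in 50 dimensions (assumed values, not from the paper).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(10, 50)),
    rng.normal(100.0, 1.0, size=(10, 50)),
])

W = knn_affinity(X, k=3)
L = graph_laplacian(W)
vals, Z = spectral_embedding(L, dim=2)  # Z is a 20 x 2 reduced representation
```

The rows of `Z` can then be passed to any standard clustering or nearest-neighbour classifier. The number of near-zero eigenvalues in `vals` equals the number of connected components of the graph, which is why the low end of the Laplacian spectrum carries the cluster structure.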

Keywords: subspace clustering; high dimensional data; dimensionality reduction; curse of dimensionality; Laplacian graph; graph partition problem; KNN algorithm.

DOI: 10.1504/IJBRA.2022.121763

International Journal of Bioinformatics Research and Applications, 2022 Vol.18 No.1/2, pp.68 - 83

Received: 26 Jun 2019
Accepted: 16 Apr 2020

Published online: 07 Apr 2022
