Title: Clustering algorithm based on asymmetric similarity and paradigmatic features

Authors: Julio Santisteban; Javier Tejada-Cárcamo

Addresses: Universidad Católica San Pablo, Campus Campiña Paisajista s/n, Quinta Vivanco, Barrio de San Lázaro, Arequipa, Peru; Universidad Católica San Pablo, Campus Campiña Paisajista s/n, Quinta Vivanco, Barrio de San Lázaro, Arequipa, Peru

Abstract: Similarity measures are essential for solving many pattern recognition problems, such as classification, clustering, and information retrieval. Various similarity measures are categorised according to syntactic and semantic relationships. In this paper, we present a novel similarity measure, the unilateral Jaccard similarity coefficient (uJaccard), which takes into consideration not only the space between two points but also the semantics between them. How can we retrieve meaningful information from a large and sparse graph? Traditional approaches focus on generic clustering techniques for network graphs. However, they tend to omit interesting patterns such as paradigmatic relations. We therefore propose a novel graph clustering technique that models the relations of a node using paradigmatic analysis. Our proposed algorithm, paradigmatic clustering (PaC), performs graph clustering using paradigmatic analysis supported by an asymmetric similarity, uJaccard. We evaluate the algorithm through extensive experiments and empirical analysis on synthetic and real data.
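
The abstract does not state the uJaccard formula, so the following is only a minimal sketch of what an asymmetric, Jaccard-like similarity can look like. It assumes each node is represented by its set of neighbours and that the overlap is normalised by the size of the first node's set alone; the function names `jaccard` and `unilateral_overlap` and the toy graph are hypothetical and are not taken from the paper.

```python
def jaccard(a: set, b: set) -> float:
    """Classic symmetric Jaccard coefficient: |A & B| / |A | B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def unilateral_overlap(a: set, b: set) -> float:
    """Hypothetical asymmetric variant: |A & B| / |A|.

    Normalising by the first set only makes the score depend on
    the direction of the comparison. This is an assumption made
    for illustration, not the paper's definition of uJaccard.
    """
    return len(a & b) / len(a) if a else 0.0


# Toy neighbour sets for two nodes (illustrative data only).
neighbours = {
    "a": {"c", "d"},
    "b": {"c", "d", "e", "f"},
}

sym = jaccard(neighbours["a"], neighbours["b"])
a_to_b = unilateral_overlap(neighbours["a"], neighbours["b"])
b_to_a = unilateral_overlap(neighbours["b"], neighbours["a"])
print(sym, a_to_b, b_to_a)
```

In this toy case every neighbour of node a is also a neighbour of node b, but not the other way round, so the two directed scores differ while the symmetric coefficient returns a single value for the pair.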

Keywords: clustering algorithms; paradigmatic similarity; asymmetric similarity; pattern recognition; Jaccard similarity coefficient; semantics; graph clustering; modelling.

DOI: 10.1504/IJICA.2016.080871

International Journal of Innovative Computing and Applications, 2016 Vol.7 No.4, pp.243 - 256

Received: 03 Feb 2016
Accepted: 29 Jul 2016

Published online: 09 Dec 2016
