Title: Privacy-preserving smart contracts for fuzzy WordNet-based document representation and clustering using regularised K-means method

Authors: Venkata Nagaraju Thatha; A. Sudhir Babu; D. Haritha

Addresses: Department of CSE, JNTUK, Kakinada, AP, India; Department of CSE, PVPSIT, Vijayawada, AP, India; Department of CSE, JNTUK, Kakinada, AP, India

Abstract: Document clustering is a key technology for the unsupervised intelligent classification of textual content. Unlike document classification, document clustering is an unsupervised learning method and requires little or no prior knowledge of the documents. The crucial challenges of document clustering are high dimensionality, measurability, preciseness, the extraction of semantic relationships from texts, and the generation of meaningful cluster labels. To improve the quality of document clustering, this paper introduces an efficient framework that combines fuzzy WordNet-based document representation with clustering by the regularised K-means method. To evaluate the framework, we carried out experiments on different datasets. The experimental results show that the framework improves the quality of document clustering compared with existing methods. Furthermore, the framework gives generalised and concrete labels for documents and speeds up clustering by reducing document size.
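
For orientation, the following is a minimal sketch of the generic baseline that the keywords point to: TF-IDF document vectors clustered with plain K-means. It is not the framework proposed in the paper. The fuzzy WordNet weighting and the regularisation term are not specified in this abstract and are deliberately left out; the function names, the toy documents, and the fixed initial centroids are all illustrative assumptions.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) using raw term frequency times log(N / df)."""
    tokenised = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenised for term in tokens})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        vectors.append([tf[t] * math.log(n_docs / df[t]) for t in vocab])
    return vocab, vectors


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, init_centroids, n_iter=20):
    """Plain (unregularised) Lloyd's K-means with caller-supplied centroids."""
    centroids = [list(c) for c in init_centroids]
    labels = [0] * len(vectors)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(v, centroids[k]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy corpus: two documents about pets, two about finance.
docs = [
    "cat dog pet animal",
    "dog pet animal vet",
    "stock market trade price",
    "market price trade bank",
]
vocab, vectors = tfidf_vectors(docs)
# Assumption: seed one centroid from each topic so the run is deterministic.
labels = kmeans(vectors, init_centroids=[vectors[0], vectors[2]])
```

On this toy corpus the two pet documents share one label and the two finance documents share the other. Even here the vector length equals the vocabulary size, which illustrates the high-dimensionality challenge named in the abstract.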

Keywords: document clustering; regularised K-means; WordNet; fuzzy weighting score; TF-IDF.

DOI: 10.1504/IJAHUC.2022.10043491

International Journal of Ad Hoc and Ubiquitous Computing, 2022 Vol.40 No.1/2/3, pp.2 - 9

Received: 04 Jun 2020
Accepted: 07 Sep 2020

Published online: 27 Jun 2022
