Title: An efficient document clustering using hybridised harmony search K-means algorithm with multi-view point
Authors: S. Siamala Devi; S. Anto; S.P. Siddique Ibrahim
Addresses: Department of Computer Science and Engineering, Sri Krishna College of Technology, Coimbatore, India; School of Computer Science and Engineering, VIT University, Vellore, India; Department of Computer Science and Engineering, Kumaraguru College of Technology, Coimbatore, India
Abstract: Document clustering is one of the most needed processes in data mining, where large numbers of documents with different methodologies are scattered. Meaningful information can be extracted from a collection of documents by grouping them effectively. Various previous studies concentrate on clustering real-world documents. In previous work, document clustering was performed using term weight-based hybridised harmony search K-means (TW HHKM), coverage factor-based hybridised harmony search K-means (CF HHKM), and the concept-based, kernel and weighted feature-based clustering algorithm (CKW HHKM). Clustering is normally done with the K-means algorithm, and the cluster centroids are found optimally with the harmony search algorithm. The problem with these existing methods is poor clustering accuracy: unrelated documents are grouped together. To overcome this problem, a multi-view point HHKM (MP HHKM) approach is introduced, with which clustering can be done more accurately. In this work, multi-point analysis is performed based on similarity measurement. Experiments were conducted on the newsgroup and TREC datasets, and the results show that the proposed MP HHKM technique outperforms the existing techniques with better accuracy values.
Keywords: clustering; harmony search; multi-view point; optimal.
International Journal of Cloud Computing, 2021 Vol.10 No.1/2, pp.129 - 143
Received: 18 Jul 2019
Accepted: 09 Feb 2020
Published online: 15 Mar 2021