Title: Reformulated query-based document retrieval using optimised kernel fuzzy clustering algorithm

Authors: M.M. Gowthul Alam; S. Baulkani

Addresses: Department of CSE, National College of Engineering, Tirunelveli, India; Department of ECE, Government College of Engineering, Tirunelveli, India

Abstract: Clustering-based document retrieval systems find documents similar to a given user's query. This study explores the use of kernel fuzzy c-means (KFCM) combined with a genetic algorithm for document retrieval. First, a genetic algorithm-based kernel fuzzy c-means algorithm (GKFCM) is proposed to cluster the documents in the library. For each cluster, an index is created that contains the significant keywords common to the documents in that cluster. Once the user enters keywords as input, the system processes them with the WORDNET ontology to obtain neighbourhood keywords and related synset keywords. Finally, the clusters with the maximum matching score values are identified, and the documents inside them are returned first as the documents related to the query keywords. Experimental results show that the proposed GKFCM-based system outperforms existing methods.
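
The retrieval step described in the abstract (expand the query, score each cluster index, return the best-matching clusters first) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the `RELATED_TERMS` table is a hypothetical stand-in for WORDNET synset lookups, the `CLUSTER_INDEX` contents are invented sample data, and the matching score used here (fraction of expanded query terms found in a cluster's index) is an assumed measure, since the abstract does not define the score.

```python
def expand_query(keywords, related_terms):
    """Return the lower-cased query keywords plus any related terms from the table."""
    expanded = set()
    for word in keywords:
        word = word.lower()
        expanded.add(word)
        expanded.update(related_terms.get(word, ()))
    return expanded


def rank_clusters(expanded_query, cluster_index):
    """Rank clusters by the fraction of expanded query terms present in each index.

    The score is an assumed measure chosen for illustration only.
    """
    if not expanded_query:
        return []
    scores = {
        cluster_id: len(expanded_query & index_terms) / len(expanded_query)
        for cluster_id, index_terms in cluster_index.items()
    }
    # Highest score first, so the best-matching cluster's documents come out first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical stand-in for WORDNET neighbourhood and synset lookups.
RELATED_TERMS = {"car": {"automobile", "vehicle"}, "engine": {"motor"}}

# Hypothetical per-cluster indexes of significant keywords.
CLUSTER_INDEX = {
    "c0": {"automobile", "vehicle", "motor", "road"},
    "c1": {"genome", "protein", "cell"},
    "c2": {"engine", "turbine", "fuel"},
}

query = expand_query(["Car", "engine"], RELATED_TERMS)
ranking = rank_clusters(query, CLUSTER_INDEX)
best_cluster = ranking[0][0]
```

In this sketch the clustering itself (GKFCM) is assumed to have already produced the cluster indexes; only the query expansion and cluster ranking are shown.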

Keywords: document clustering; WORDNET; ontology; genetic algorithm; kernel fuzzy c-means; KFCM; Gaussian.

DOI: 10.1504/IJBIDM.2017.085089

International Journal of Business Intelligence and Data Mining, 2017 Vol.12 No.3, pp.299 - 318

Received: 05 Oct 2016
Accepted: 20 Dec 2016

Published online: 10 Jul 2017
