Title: Reformulated query-based document retrieval using optimised kernel fuzzy clustering algorithm
Authors: M.M. Gowthul Alam; S. Baulkani
Addresses: Department of CSE, National College of Engineering, Tirunelveli, India; Department of ECE, Government College of Engineering, Tirunelveli, India
Abstract: A clustering-based document retrieval system finds documents similar to a given user query. This study explores the use of kernel fuzzy c-means (KFCM) combined with a genetic algorithm for document retrieval. First, a genetic algorithm-based kernel fuzzy c-means algorithm (GKFCM) is proposed to cluster the documents in the library. For each cluster, an index is created that contains the common significant keywords of the documents in that cluster. When the user enters query keywords, the system processes them with the WORDNET ontology to obtain neighbourhood keywords and related synset keywords. Finally, the documents in the clusters with the maximum matching score values are returned first as the documents related to the query keywords. Experimental results indicate that the proposed GKFCM-based system outperforms existing methods.
Keywords: document clustering; WORDNET; ontology; genetic algorithm; kernel fuzzy c-means; KFCM; Gaussian.
International Journal of Business Intelligence and Data Mining, 2017 Vol.12 No.3, pp.299 - 318
Received: 05 Oct 2016
Accepted: 20 Dec 2016
Published online: 10 Jul 2017
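
The abstract names kernel fuzzy c-means with a Gaussian kernel but does not give its equations. As a minimal sketch, assuming the commonly used Gaussian-kernel KFCM membership rule (membership proportional to (1 - K(x, v))^(-1/(m-1))), the fuzzy assignment of documents to clusters could look like the following. The function names, the fuzzifier m, the width sigma, and the toy vectors are all hypothetical choices for illustration. The genetic algorithm optimisation and the cluster-centre update are not shown, and this is not claimed to be the authors' exact formulation.

```python
import math


def gaussian_kernel(x, v, sigma):
    """Gaussian kernel K(x, v) = exp(-||x - v||^2 / sigma^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq_dist / (sigma ** 2))


def kfcm_memberships(points, centres, m=2.0, sigma=1.0, eps=1e-12):
    """Return u[k][i], the fuzzy membership of point k in cluster i.

    Assumed rule: u[k][i] is proportional to (1 - K(x_k, v_i))^(-1/(m-1)),
    normalised so that each point's memberships sum to 1.
    """
    power = 1.0 / (m - 1.0)
    memberships = []
    for x in points:
        weights = []
        for v in centres:
            # 1 - K(x, v) acts as the kernel-induced distance; eps guards
            # against division by zero when a point sits on a centre.
            dist = max(1.0 - gaussian_kernel(x, v, sigma), eps)
            weights.append((1.0 / dist) ** power)
        total = sum(weights)
        memberships.append([w / total for w in weights])
    return memberships


# Toy 2-D "document vectors" (hypothetical data, for illustration only).
points = [(0.0, 0.1), (0.2, 0.0), (1.0, 0.9)]
centres = [(0.0, 0.0), (1.0, 1.0)]
u = kfcm_memberships(points, centres)
```

Here each row of `u` holds one document's memberships across the clusters. In a full pipeline, real term-weight vectors would replace the toy points, and the centres would be refined iteratively rather than fixed.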