Correlation-based concept-oriented bisecting k-means clustering and topic detection for scientific literature and news tracks
by J. Jayabharathy; S. Kanmani
International Journal of Knowledge Engineering and Data Mining (IJKEDM), Vol. 3, No. 2, 2015
Abstract: Extracting relevant documents from a large document corpus is a challenging task. Clustering groups together documents that share similar topics, and incorporating semantic features improves the accuracy of document clustering methods. Topic detection deals with discovering meaningful and concise labels for the clusters. In this paper, we propose a clustering algorithm, the correlation-based concept-oriented bisecting k-means algorithm, which uses a semantic-based similarity measure. The algorithm builds on our existing modified semantic-based model, in which related terms are extracted as concepts for concept-based document clustering and topic discovery. The performance of the proposed work is compared with the existing term-based method and with our earlier work on a concept-based algorithm. Additional experiments evaluate the proposed algorithm under three settings (terms only, synonyms and hyponyms, and correlated terms), using F-measure and purity as evaluation metrics. Experimental results demonstrate the performance enhancement of the proposed algorithm.
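For orientation, the sketch below shows a plain term-based bisecting k-means together with purity and F-measure, the two evaluation metrics named in the abstract. It is not the authors' correlation-based concept-oriented algorithm, whose details the abstract does not give. The cosine similarity, the deterministic seeding, the split-the-largest-cluster rule, the metric formulas (the definitions commonly used in document clustering) and the toy term-frequency vectors are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def two_means(indices, data, iterations=10):
    """Split one cluster in two with 2-means under cosine similarity."""
    # Assumed seeding: the first member and the member least similar to it.
    first = indices[0]
    second = min(indices, key=lambda i: cosine(data[first], data[i]))
    centers = [data[first], data[second]]
    left, right = [], []
    for _ in range(iterations):
        left, right = [], []
        for i in indices:
            if cosine(data[i], centers[0]) >= cosine(data[i], centers[1]):
                left.append(i)
            else:
                right.append(i)
        if not left or not right:
            break  # degenerate split; the caller handles it
        centers = [centroid([data[i] for i in left]),
                   centroid([data[i] for i in right])]
    return left, right


def bisecting_kmeans(data, k):
    """Repeatedly bisect the largest cluster until k clusters exist."""
    clusters = [list(range(len(data)))]
    while len(clusters) < k:
        clusters.sort(key=len)
        target = clusters.pop()  # assumed rule: split the largest cluster
        left, right = two_means(target, data)
        if not left or not right:
            clusters.append(target)  # cannot be split further
            break
        clusters += [left, right]
    return clusters


def purity(clusters, labels):
    """Fraction of documents that belong to their cluster's majority class."""
    total = sum(len(c) for c in clusters)
    majority = sum(max(Counter(labels[i] for i in c).values()) for c in clusters)
    return majority / total


def f_measure(clusters, labels):
    """Class-size-weighted best F1 of each class over all clusters."""
    total = sum(len(c) for c in clusters)
    class_sizes = Counter(labels[i] for c in clusters for i in c)
    score = 0.0
    for cls, n_cls in class_sizes.items():
        best = 0.0
        for c in clusters:
            hits = sum(1 for i in c if labels[i] == cls)
            if hits:
                p, r = hits / len(c), hits / n_cls
                best = max(best, 2 * p * r / (p + r))
        score += n_cls / total * best
    return score


# Hypothetical toy corpus: six documents over a four-term vocabulary.
data = [[2, 1, 0, 0], [3, 2, 0, 0], [1, 1, 0, 1],
        [0, 0, 2, 3], [0, 1, 3, 2], [0, 0, 1, 1]]
labels = ["a", "a", "a", "b", "b", "b"]

clusters = bisecting_kmeans(data, k=2)
print(clusters, purity(clusters, labels), f_measure(clusters, labels))
```

A concept-based variant along the lines the abstract describes would presumably replace the raw term vectors with concept vectors built from related terms before clustering. That step is left out here because the abstract does not specify it.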
Online publication date: Wed, 19-Aug-2015