Clustering of text documents with keyword weighting function
by A. Christy; G. Meera Gandhi; S. Vaithyasubramanian
International Journal of Intelligent Enterprise (IJIE), Vol. 6, No. 1, 2019

Abstract: In the digital world, data is abundant and growing at a phenomenal rate. Making data readily available for decision making is an important task for the data analyst. In this article, we propose an unsupervised learning algorithm for text document clustering that adopts a keyword weighting function. Documents are pre-processed, and relevant keywords are grouped together based on their weights. Clustered keyword weighting (CKW) takes each class in the training collection as a known cluster and searches for feature weights iteratively to optimise the clustering objective function, in order to retrieve the best clustering result. The performance of CKW is validated by clustering the BBC news text collection. Experiments were conducted with simple K-means and hierarchical clustering algorithms, and our keyword weighting and clustering approach showed improved cluster quality compared with the other methods.
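
To make the general idea of weighting keywords before clustering concrete, the following is a minimal sketch of the simple K-means baseline mentioned in the abstract, applied to TF-IDF weighted documents. It is not the authors' CKW algorithm, whose weighting function and objective are not given here. The stopword list, the four toy sentences, the fixed initial centroids, and the function names `tokenize`, `tfidf_vectors`, `sq_dist`, and `kmeans` are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical stopword list; the abstract does not specify the pre-processing steps.
STOPWORDS = {"the", "a", "in", "with", "as", "and"}


def tokenize(text):
    """Lowercase, split on whitespace, keep alphabetic non-stopword tokens."""
    return [w for w in text.lower().split() if w.isalpha() and w not in STOPWORDS]


def tfidf_vectors(docs):
    """Return (vocab, vectors), where each vector holds unit-length TF-IDF weights."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    vocab = sorted(df)
    # Plain IDF (an assumption): terms found in every document get weight 0.
    idf = {t: math.log(n_docs / df[t]) for t in vocab}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0  # guard against all-zero vectors
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(vectors, init_indices, max_iter=20):
    """Plain K-means; initial centroids are copies of the vectors at init_indices."""
    centroids = [list(vectors[i]) for i in init_indices]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each document goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy corpus (invented for this sketch, not the BBC news collection).
docs = [
    "the striker scored a goal in the football match",
    "the football team won the match with a late goal",
    "the bank raised interest rates as the economy slowed",
    "interest rates and the economy worried the bank",
]
vocab, vectors = tfidf_vectors(docs)
labels = kmeans(vectors, init_indices=(0, 2))
print(labels)
```

On this toy corpus the two sports sentences end up with one label and the two finance sentences with the other, because the weighted vectors of the two topics share no keywords. A learned weighting scheme such as the one the abstract describes would replace the fixed TF-IDF weights used here.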

Online publication date: Tue, 04-Jun-2019
