FPST: a new term weighting algorithm for long running and short lived events
Online publication date: Thu, 24-Dec-2015
by Y. Jahnavi; Y. Radhika
International Journal of Data Analysis Techniques and Strategies (IJDATS), Vol. 7, No. 4, 2015
Abstract: Term weighting is a technique for extracting important features from textual documents, and it provides a basis for many text mining approaches. Several term weighting algorithms based on term frequency and other statistical measures have been proposed in the past, but they are inaccurate at extracting hot terms from internet-based digitised news documents. To overcome this problem, this paper presents a term weighting algorithm that considers position, scattering and topicality along with frequency. Frequency is the number of occurrences of a term; position captures where the term appears; scattering captures how the term is distributed across the entire document. Topicality is calculated for both short lived events and long running events. Experimental evaluation shows that the proposed algorithm outperforms the existing term weighting algorithms.
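The abstract names the four factors but does not give their formulas, so the following Python sketch is only an illustration of how frequency, position and scattering might be measured for a single term. Every formula in it is an assumption made for this example, not the authors' definition: frequency is a relative count, position rewards an early first occurrence, and scattering is the normalised entropy of the term's occurrences over equal-sized document segments. The names `term_features`, `combined_weight` and `n_segments` are hypothetical, and topicality is passed in as an externally supplied number because the abstract does not say how it is computed.

```python
import math
from collections import Counter


def term_features(tokens, term, n_segments=4):
    """Return illustrative (frequency, position, scattering) scores for one term.

    All three formulas are assumptions for this sketch, not the paper's.
    """
    n = len(tokens)
    positions = [i for i, tok in enumerate(tokens) if tok == term]
    if not positions:
        return 0.0, 0.0, 0.0

    # Frequency: share of tokens in the document that are this term.
    frequency = len(positions) / n

    # Position (assumed): 1.0 if the term opens the document, lower if it
    # first appears later.
    position = 1.0 - positions[0] / n

    # Scattering (assumed): entropy of the occurrences over equal-sized
    # segments, normalised to [0, 1]. Higher means more evenly spread.
    counts = Counter(min(p * n_segments // n, n_segments - 1) for p in positions)
    total = len(positions)
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    scattering = entropy / math.log(n_segments) if n_segments > 1 else 0.0

    return frequency, position, scattering


def combined_weight(tokens, term, topicality=1.0):
    """Combine the factors into one weight (the combination rule is assumed)."""
    f, p, s = term_features(tokens, term)
    return f * (1.0 + p + s) * topicality


if __name__ == "__main__":
    doc = ["quake", "a", "b", "c", "quake", "d", "e", "f"]
    print(term_features(doc, "quake"))
    print(combined_weight(doc, "quake"))
```

A multiplicative combination is used here only so that a term that never occurs gets a weight of zero; the paper may combine the factors differently.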