Authors: Shufang Wu; Jie Zhu; Jianmin Xu
Addresses: College of Management and Economics, Tianjin University, Tianjin, China; College of Management, Hebei University, Baoding, China ' Department of Information Management, The Central Institute for Correctional Police, Baoding, China ' College of Computer Science and Technology, Hebei University, Baoding, China
Abstract: Constructing a discriminative topic dictionary is important for describing a topic and for increasing the accuracy of topic detection and tracking. In the proposed method, words are ranked by mutual information, and the top-ranked words are selected to construct the discriminative topic dictionaries. Because context words can express a topic more accurately, word selection considers both the differences between topics and the context words that appear in the stories. Since a news topic changes over time, keeping the topic dictionary fixed is not reasonable, so a dictionary updating method is also proposed. Experiments were carried out on the TDT4 corpus, using miss probability and false alarm probability as evaluation criteria to compare incremental TF-IDF with the proposed method. The experiments show that the proposed method can provide better results.
Keywords: discriminative dictionary; context word; topic representation; word selection.
International Journal of Computational Science and Engineering, 2019 Vol.19 No.3, pp.400 - 406
Received: 28 Dec 2016
Accepted: 15 Mar 2017
Published online: 05 Aug 2019
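
The word-selection step summarised in the abstract (rank words by mutual information, keep the top few) might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact mutual-information formula, so the sketch assumes the standard mutual information between word presence and topic membership over story counts. The function names `mutual_information` and `build_topic_dictionary`, the restriction to words that are relatively more frequent on-topic than off-topic, and the toy stories are all hypothetical. Context-word weighting and dictionary updating are not modelled.

```python
import math
from collections import Counter


def mutual_information(n11, n10, n01, n00):
    """Mutual information (bits) between word presence and topic membership.

    n11: on-topic stories containing the word
    n10: off-topic stories containing the word
    n01: on-topic stories without the word
    n00: off-topic stories without the word
    """
    n = n11 + n10 + n01 + n00
    cells = (
        (n11, n11 + n10, n11 + n01),  # word present, on-topic
        (n10, n11 + n10, n10 + n00),  # word present, off-topic
        (n01, n01 + n00, n11 + n01),  # word absent, on-topic
        (n00, n01 + n00, n10 + n00),  # word absent, off-topic
    )
    mi = 0.0
    for n_wt, n_w, n_t in cells:
        if n_wt > 0:  # empty cells contribute nothing (0 * log 0 := 0)
            mi += (n_wt / n) * math.log2(n * n_wt / (n_w * n_t))
    return mi


def build_topic_dictionary(stories, labels, topic, k):
    """Return the k words with the highest mutual information for `topic`.

    stories: list of token lists; labels: one topic id per story.
    Assumption (not from the paper): only words that occur at a higher
    rate in on-topic stories than in off-topic stories are candidates.
    """
    n_topic = sum(1 for lab in labels if lab == topic)
    n_other = len(labels) - n_topic
    in_topic, in_other = Counter(), Counter()
    for tokens, lab in zip(stories, labels):
        # Count each word once per story (document frequency).
        (in_topic if lab == topic else in_other).update(set(tokens))

    scores = {}
    for word in in_topic:
        n11, n10 = in_topic[word], in_other[word]
        # Cross-multiplied rate comparison avoids division by zero.
        if n11 * n_other > n10 * n_topic:
            scores[word] = mutual_information(
                n11, n10, n_topic - n11, n_other - n10
            )
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:k]


# Toy data, invented purely for illustration.
stories = [
    ["quake", "rescue", "city"],
    ["quake", "damage", "city"],
    ["vote", "party", "city"],
    ["vote", "poll", "city"],
]
labels = ["quake", "quake", "election", "election"]
quake_dictionary = build_topic_dictionary(stories, labels, "quake", k=2)
```

In this toy corpus the word "city" appears in every story, so it carries no information about the topic and is never selected, which is the sense in which such a dictionary is discriminative.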