Title: Building Chinese field association knowledge base from Wikipedia

Authors: Li Wang; Min Yao; Yuanpeng Zhang; Danmin Qian; Xinyun Geng; Kui Jiang; Jiancheng Dong

Addresses: Department of Medical Informatics, Nantong University, Qi Xiu Road #19, Nantong, 226001, China (all authors)

Abstract: Field association (FA) terms are a limited set of discriminating terms that give humans the knowledge to identify the fields existing in a document (text). A field association knowledge base is composed of FA terms and the potential hierarchical relationships among the fields they belong to. The main purpose of this research is to build a Chinese FA knowledge base. The new knowledge base is then tested through a system that imitates the process whereby humans recognise fields by looking at a few special terms. To build the knowledge base, a novel approach makes use of the structured knowledge in Chinese Wikipedia. The result is an entirely new Chinese FA knowledge base containing 115,696 FA terms. The FA knowledge from this knowledge base is applied to text categorisation. The average accuracies, 97.7% and 89%, are both higher than the values obtained by SVM.
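
As a rough illustration of the idea described in the abstract, recognising a field from a few special terms, the following Python sketch looks up FA terms in a text and votes for a top-level field. It is a minimal sketch under stated assumptions and not the authors' system: the tiny `FA_KNOWLEDGE_BASE`, its four entries, the function name `identify_field`, and the simple frequency vote are all hypothetical, and plain substring matching stands in for proper Chinese word segmentation.

```python
from collections import Counter

# Hypothetical toy knowledge base (illustrative only, not from the paper):
# each FA term maps to a field path, ordered from general to specific.
FA_KNOWLEDGE_BASE = {
    "疫苗": ("医学", "免疫学"),
    "抗体": ("医学", "免疫学"),
    "进球": ("体育", "足球"),
    "点球": ("体育", "足球"),
}


def identify_field(text, knowledge_base=FA_KNOWLEDGE_BASE):
    """Return the top-level field whose FA terms occur most often in text.

    Returns None when no FA term is found. Substring counting is a
    simplification; a real system would segment the text into words first.
    """
    votes = Counter()
    for term, field_path in knowledge_base.items():
        occurrences = text.count(term)
        if occurrences:
            # Vote for the most general field on the term's path.
            votes[field_path[0]] += occurrences
    if not votes:
        return None
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(identify_field("这种疫苗能刺激人体产生抗体"))
    print(identify_field("他在比赛中罚进点球"))
```

Voting on the first element of the field path keeps the sketch short; the deeper levels of the path could be used in the same way to pick a more specific field.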

Keywords: field association terms; Wikipedia; structured knowledge; topic fields; text categorisation; Chinese knowledge base; China.

DOI: 10.1504/IJCAT.2015.071978

International Journal of Computer Applications in Technology, 2015 Vol.52 No.2/3, pp.168 - 176

Published online: 26 Sep 2015
