Title: The analysis and recognition of Chinese temporal expressions based on a mixtured model using statistics and rules

Authors: Zhao Dandan; Huang Degen; Wang Yuzhe; Wu Qiong; Zhao Ge

Addresses: School of Computer Science and Technology, Dalian University of Technology, No. 2 Linggong Road, Ganjingzi District, Dalian, China; School of Computer Science and Engineering, Dalian Nationalities University, No. 18, Liaohe West Road, Jinzhou New District, Dalian, China ' School of Computer Science and Technology, Dalian University of Technology, No. 2 Linggong Road, Ganjingzi District, Dalian, China ' College of Electromechanical and Information Engineering, Dalian Nationalities University, No. 18, Liaohe West Road, Jinzhou New District, Dalian, China ' School of Computer Science and Technology, Dalian University of Technology, No. 2 Linggong Road, Ganjingzi District, Dalian, China ' School of Computer Science and Engineering, Dalian Nationalities University, No. 18, Liaohe West Road, Jinzhou New District, Dalian, China

Abstract: As the first step in understanding temporal information, temporal expression recognition directly affects how that information can be used downstream. Compared with Western languages, Chinese temporal expressions have many distinctive characteristics in both word morphology and syntax. This paper analyses the classifications and constructions of Chinese temporal expressions and presents an approach for extracting them from Chinese texts. The model comprises a cascade of rule-based and machine-learning pattern recognition procedures. Conditional random fields (CRFs) are applied to recognise time units rather than whole temporal expressions, which avoids the boundary localisation problems of Chinese temporal expressions. Rules for localising temporal expression boundaries are formulated from a thesaurus of time triggers and a thesaurus of time affix words. The F-measure for temporal expression identification was 95.93% on the temporal 2010 Chinese corpus. The experimental results demonstrate the validity of the proposed approach.
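
The cascade described in the abstract can be illustrated with a minimal sketch of the rule stage only. It assumes an upstream tagger (a CRF in the paper) has already flagged each token as a time unit or not, and it then merges adjacent time units and absorbs a neighbouring affix word to fix the expression boundary. The function name `merge_time_units`, the two tiny affix sets, and the example tokens are hypothetical stand-ins, not the authors' thesauri or rules; the time trigger thesaurus is not modelled here because the abstract does not specify how it is applied.

```python
from typing import List, Tuple

# Hypothetical, tiny stand-ins for a time affix word thesaurus (assumption).
PREFIX_AFFIXES = {"上", "下", "本"}
SUFFIX_AFFIXES = {"前", "后", "初", "末", "左右"}


def merge_time_units(tokens: List[str], is_time_unit: List[bool]) -> List[Tuple[int, int]]:
    """Return (start, end) token spans, end exclusive, for temporal expressions.

    Each maximal run of tokens flagged as time units becomes one span. The span
    is then widened by at most one affix word on each side.
    """
    spans: List[Tuple[int, int]] = []
    i, n = 0, len(tokens)
    while i < n:
        if not is_time_unit[i]:
            i += 1
            continue
        start = i
        while i < n and is_time_unit[i]:
            i += 1
        end = i
        # Do not reach back into a token already claimed by the previous span.
        prev_end = spans[-1][1] if spans else 0
        if start > prev_end and tokens[start - 1] in PREFIX_AFFIXES:
            start -= 1
        if end < n and tokens[end] in SUFFIX_AFFIXES:
            end += 1
        spans.append((start, end))
    return spans


# Example: the two time units are merged and the suffix word is absorbed.
tokens = ["他", "2010年", "5月", "前", "到达"]
flags = [False, True, True, False, False]
for start, end in merge_time_units(tokens, flags):
    print("".join(tokens[start:end]))
```

Tagging short time units and leaving boundaries to rules means the statistical model never has to decide where a long, nested expression starts or ends, which is the motivation the abstract gives for the split.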

Keywords: temporal expressions; TEs; conditional random fields; CRFs; time units; TUs; rules; time trigger; time affix word; Chinese information processing; thesaurus; China.

DOI: 10.1504/IJCSE.2017.085969

International Journal of Computational Science and Engineering, 2017 Vol. 15 No. 1/2, pp. 153-161

Received: 06 Aug 2015
Accepted: 22 Oct 2015

Published online: 21 Aug 2017
