Title: Towards patent text analysis based on semantic role labelling

Authors: Yanqing He; Ying Li; Ling'en Meng; Hongjiao Xu

Addresses: Research Center for Information Science Theory and Methodology, Institute of Scientific and Technical Information of China, Beijing, 100038, China; Research Center for Information Science Theory and Methodology, Institute of Scientific and Technical Information of China, Beijing, 100038, China; Information Center, Beijing Dance Academy, Beijing, 100081, China; Research Center for Information Science Theory and Methodology, Institute of Scientific and Technical Information of China, Beijing, 100038, China

Abstract: Mining patent texts yields valuable technical information and competitive intelligence, both of which are important for the development of technology and business. Current patent text mining approaches suffer from a lack of effective, automatic, accurate and wide-coverage techniques for annotating natural language texts with semantic argument structure. Deriving more meaningful semantic relationships from the semantic role labelling (SRL) results of patents is helpful for text mining. This paper uses Word2Vec to learn real-valued word vectors and designs word-vector-based features to train an SRL parser. Based on the SRL parser, two patent text mining methods are then presented: patent topic extraction and automatic construction of a patent technical effect matrix (PTEM). Experiments show that semantic role labelling helps achieve satisfactory results and saves manpower.
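As a rough illustration of how SRL output could feed a patent technical effect matrix, the sketch below counts distinct patents per (technology, effect) pair. It is not the paper's method: the frame format, the `build_ptem` helper, the sample data, and the mapping of the A0 argument to "technology" and of predicate plus A1 to "effect" are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical SRL frames, one per predicate, with PropBank-style role labels.
# The data and the dictionary layout are invented for illustration only.
frames = [
    {"patent": "P1", "predicate": "reduce", "A0": "heat sink", "A1": "temperature"},
    {"patent": "P2", "predicate": "reduce", "A0": "heat sink", "A1": "temperature"},
    {"patent": "P2", "predicate": "improve", "A0": "graphene coating", "A1": "conductivity"},
    {"patent": "P3", "predicate": "comprise", "A0": "device"},  # no A1: skipped
]


def build_ptem(frames):
    """Count distinct patents for each (technology, effect) cell.

    Assumption: A0 names the technology and "predicate + A1" names the effect.
    """
    cells = defaultdict(lambda: defaultdict(set))
    for frame in frames:
        technology = frame.get("A0")
        patient = frame.get("A1")
        if not technology or not patient:
            continue  # frames without both arguments carry no usable pair
        effect = f"{frame['predicate']} {patient}"
        cells[technology][effect].add(frame["patent"])
    # Convert the sets of patent IDs into plain counts.
    return {
        technology: {effect: len(patents) for effect, patents in row.items()}
        for technology, row in cells.items()
    }


ptem = build_ptem(frames)
for technology, row in ptem.items():
    print(technology, row)
```

Storing patent IDs in sets before counting means a technology and effect pair mentioned several times in one patent is counted only once for that patent.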

Keywords: patent technical effect matrix; PTEM; semantic role labelling; SRL; International Patent Classification; IPC; patent analysis; word vector; patent topic extraction; semantic analysis; text mining.

DOI: 10.1504/IJCSE.2017.087415

International Journal of Computational Science and Engineering, 2017 Vol.15 No.3/4, pp.256 - 266

Received: 16 Apr 2016
Accepted: 27 Aug 2016

Published online: 15 Oct 2017
