Title: Augmenting keyword-based patent prior art search using weighted classification code hierarchies

Authors: Alok Khode; Sagar Jambhorkar

Addresses: Symbiosis Centre for Research and Innovation, Symbiosis International University, Pune, India; Department of Computer Science, National Defence Academy, Pune, India

Abstract: Patents are critical intellectual assets for any business. With the rapid increase in patent filings, patent prior art retrieval has become an important task. The goal of prior art retrieval is to find documents relevant to a patent application. Due to the special nature of patent documents, relying only on keyword-based queries is not effective for patent retrieval. Previous work has used the International Patent Classification (IPC) to improve the effectiveness of keyword-based search. However, these systems use a two-stage retrieval process, in which IPC codes mostly serve to filter patent documents or to re-rank the documents retrieved by a keyword-based query. The approach proposed in this paper explores weighted IPC code hierarchies to augment keyword-based search, thereby eliminating the additional processing step. Experiments on the CLEF-IP 2011 benchmark dataset show that the proposed approach outperforms the baseline on MAP, recall and PRES.
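
Illustrative sketch (not taken from the paper): the abstract does not give the weighting scheme or the query syntax, so the following Python fragment only shows one plausible way to expand an IPC symbol into its hierarchy levels and attach a weight to each level alongside the keyword terms, so that a single query carries both. The names `LEVEL_WEIGHTS`, `ipc_hierarchy` and `augmented_query`, the field names, and all weight values are hypothetical.

```python
import re

# Hypothetical per-level weights; the abstract does not state the values used.
LEVEL_WEIGHTS = {
    "section": 0.1,
    "class": 0.2,
    "subclass": 0.4,
    "main_group": 0.7,
    "subgroup": 1.0,
}

# Full IPC symbol: section letter, two-digit class, subclass letter,
# main group number, "/", subgroup number (e.g. "G06F 17/30").
IPC_PATTERN = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def ipc_hierarchy(code):
    """Split a full IPC symbol into its five hierarchy levels."""
    m = IPC_PATTERN.match(code.strip().upper())
    if m is None:
        raise ValueError(f"not a full IPC symbol: {code!r}")
    section, cls, subclass, group, subgroup = m.groups()
    prefix = section + cls + subclass
    return {
        "section": section,
        "class": section + cls,
        "subclass": prefix,
        # Shortened label for the main-group level (assumption of this sketch).
        "main_group": f"{prefix} {group}",
        "subgroup": f"{prefix} {group}/{subgroup}",
    }


def augmented_query(keywords, ipc_codes):
    """Return (field, term, weight) clauses: keywords plus weighted IPC levels."""
    clauses = [("text", kw, 1.0) for kw in keywords]
    seen = set()
    for code in ipc_codes:
        for level, value in ipc_hierarchy(code).items():
            # Shared ancestors (e.g. the same subclass) are added only once.
            if (level, value) not in seen:
                seen.add((level, value))
                clauses.append((level, value, LEVEL_WEIGHTS[level]))
    return clauses


clauses = augmented_query(["query", "expansion"], ["G06F 17/30", "G06F 16/33"])
for field, term, weight in clauses:
    print(f"{field}:{term!r} weight={weight}")
```

In this sketch the more specific levels receive higher weights, which is a design assumption; the resulting clauses could be translated into whatever boosted-term syntax the underlying search engine supports, so that retrieval runs in one pass instead of a separate filter or re-rank stage.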

Keywords: patent retrieval; prior art search; international patent classification; IPC; query formulation; query expansion; information retrieval; IPC hierarchy; weighted IPC.

DOI: 10.1504/IJBIDM.2022.126500

International Journal of Business Intelligence and Data Mining, 2022 Vol.21 No.4, pp.397 - 418

Received: 17 Apr 2021
Accepted: 25 Jun 2021

Published online: 27 Oct 2022
