Title: Semantics-based key concepts identification for documents indexing and retrieval on the web

Authors: Mohammed Maree

Addresses: Department of Multimedia Technology, Faculty of Engineering and Information Technology, Arab American University, P.O. Box 240 Jenin, 13 Zababdeh, Palestine

Abstract: Bridging the semantic gap on the web remains one of the crucial challenges for current horizontal and domain-specific information retrieval systems. This challenge becomes even more pronounced when users express their information needs in short queries consisting of only a few keywords; consequently, relying on keywords alone to index web documents degrades the quality of the returned results. In this article, we introduce an approach that employs knowledge captured by large-scale knowledge resources to identify key query terms and retrieve semantically relevant documents. In this approach, key terms are mapped to their semantic correspondences, and variable term weights are assigned based on each term's semantic and taxonomic relations. To demonstrate the effectiveness of the proposed approach, we conducted an experimental evaluation using Glasgow's NPL test collections. The findings indicate that the approach improves retrieval effectiveness over four conventional similarity metrics based on the bag-of-words model.

Keywords: key concepts; large-scale ontologies; semantic matching; information indexing; data analysis; precision measures.

DOI: 10.1504/IJICA.2021.113608

International Journal of Innovative Computing and Applications, 2021 Vol.12 No.1, pp.1 - 12

Received: 21 Sep 2019
Accepted: 26 Nov 2019

Published online: 15 Mar 2021
