Title: Incorporating noun compounds in distributional-based semantic representation approaches for measuring semantic relatedness

Authors: Abdulgabbar Saif; Nazlia Omar; Ummi Zakiah Zainodin

Addresses: Faculty of Information Technology and Computer Science, University of Saba Region, Marib, Yemen; Centre for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, 43600 Bangi, Selangor, Malaysia; Centre for Artificial Intelligence Technology, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, 43600 Bangi, Selangor, Malaysia

Abstract: Identifying noun compounds in natural language documents is important for handling their semantic, syntactic, and pragmatic features. In this study, we introduce a knowledge-based method for incorporating noun compounds into distributional-based semantic representation approaches. Wikipedia is used as the knowledge resource, and noun compounds are extracted based on its structural features. Its categories are then used to classify the extracted noun compounds as linguistic terms or named entities. Next, a look-up list technique identifies the noun compounds when the semantics of terms are extracted using the corpus-based approach for semantic representation. To obtain the semantic representation, we use five well-known distributional-based approaches: latent semantic analysis (LSA), hyperspace analogue to language (HAL), correlated occurrence analogue to lexical semantics (COALS), bound encoding of the aggregate language environment (BEAGLE), and explicit semantic analysis (ESA). The proposed method was evaluated by measuring semantic relatedness on five benchmark datasets employed in previous studies. The experimental results demonstrate that incorporating noun compounds in the distributional-based semantic representation improves the semantic evidence for the relationships among words.
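
The look-up list step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the compound list, the greedy longest-match tokenizer, and the windowed co-occurrence vectors (loosely in the spirit of HAL) are all assumptions made for illustration, and every name below is hypothetical. In the paper the list is derived from Wikipedia, not hard-coded.

```python
import math
from collections import Counter, defaultdict

# Hypothetical look-up list (assumption): the paper builds its list from Wikipedia.
COMPOUNDS = {("machine", "learning"), ("neural", "network")}
MAX_LEN = max(len(c) for c in COMPOUNDS)


def tokenize(text, compounds=COMPOUNDS, max_len=MAX_LEN):
    """Merge known noun compounds into single tokens (greedy longest match)."""
    words = text.lower().split()
    tokens, i = [], 0
    while i < len(words):
        # Try the longest candidate span first, down to two words.
        for n in range(min(max_len, len(words) - i), 1, -1):
            if tuple(words[i:i + n]) in compounds:
                tokens.append("_".join(words[i:i + n]))
                i += n
                break
        else:
            # No compound starts here, so keep the single word.
            tokens.append(words[i])
            i += 1
    return tokens


def cooccurrence_vectors(docs, window=2):
    """Count neighbours within a symmetric window around each token."""
    vectors = defaultdict(Counter)
    for doc in docs:
        tokens = tokenize(doc)
        for i, target in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[target][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0
```

With this tokenizer, "machine learning" becomes one vocabulary entry, so its vector is built from the contexts of the compound as a unit and not from the separate contexts of "machine" and "learning". Relatedness between two terms is then estimated as the cosine of their vectors.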

Keywords: distributional-based approach; noun compound; semantic analysis; semantic relatedness.

DOI: 10.1504/IJRIS.2019.098059

International Journal of Reasoning-based Intelligent Systems, 2019 Vol.11 No.1, pp.11 - 23

Received: 15 Apr 2017
Accepted: 23 Jan 2018

Published online: 01 Mar 2019
