Title: Scalable information retrieval system in semantic web by query expansion and ontological-based LSA ranking similarity measurement
Authors: M. Uma Devi; G. Meera Gandhi
Addresses: SRM Institute of Science and Technology, SRM Nagar, Kattankulathur, Chennai 603203, Tamil Nadu, India; Faculty of Computing, Sathyabama Institute of Science and Technology, Chennai 600119, Tamil Nadu, India
Abstract: The semantic web plays a key role in intelligent information retrieval, resolving the vocabulary mismatch problem through query expansion. However, achieving scalable information retrieval (IR) in the semantic web is challenging for large datasets. This work addresses the semantic IR problem with an ontology-based semantic similarity measurement that uses natural language processing. Two novel algorithms, syntactic correlation coefficient (SCC) and mapping-based K-nearest neighbour (M-KNN), are proposed for semantic similarity measurement; they improve the accuracy of the relevant results. Constructing the ontology for the document repository with a word sense disambiguation (WSD) algorithm improves the conceptual relationships, reduces ambiguities in the ontology, and improves scalability by closely analysing semantic relationships and dynamically reconstructing the ontology when the number of documents is updated. After semantic similarity analysis, results are ranked with latent semantic analysis (LSA), which improves the retrieved results and reduces the complexity of determining relevance. System performance is analysed using processing time, time complexity, space complexity, precision (0.95), recall (0.98) and F-measure (0.97).
Keywords: information retrieval; IR; semantic similarity; ontology; k-nearest neighbour; latent semantic analysis; LSA; word sense disambiguation; WSD; SPARQL; singular value decomposition.
International Journal of Advanced Intelligence Paradigms, 2020 Vol.17 No.1/2, pp.44 - 66
Received: 29 Nov 2017
Accepted: 23 Jan 2018
Published online: 03 Aug 2020