Title: Secure and privacy-preserving multi-keyword ranked information retrieval from encrypted big data

Authors: Lija Mohan; Sudheep Elayidom M.

Addresses: Division of Computer Science, School of Engineering, Cochin University of Science and Technology (CUSAT), Kerala, India; Division of Computer Science, School of Engineering, Cochin University of Science and Technology (CUSAT), Kerala, India

Abstract: Cloud deployment raises security challenges for the confidentiality of data and the privacy of users. These challenges, together with the pressing demand for big data technologies, call for the development of stronger encryption algorithms. However, encrypting the data makes it difficult to retrieve the documents that best match the query keywords. The authors therefore propose a solution for ranked retrieval of encrypted information using a modified homomorphic encryption (MHE) scheme that still preserves the user's privacy. The scheme uses the processing power of the cloud server to compute the similarity scores and leaves decryption and ranking to the client side, thus ensuring the security of the data. The vector space model and term frequency-inverse document frequency (TF-IDF) concepts are used for similarity matching. Execution is then accelerated using a Hadoop cluster, and the scheme is found to be accurate, efficient, scalable and practical for real-world applications.
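
As a rough illustration of the similarity matching mentioned in the abstract, the following Python sketch ranks documents against a multi-keyword query using TF-IDF weights and cosine similarity in the vector space model. It works on plaintext only: the MHE encryption layer, the split between server-side scoring and client-side ranking, and the Hadoop acceleration are all omitted. The function names, the specific TF and IDF formulas, and the sample documents are assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Build IDF weights and one sparse TF-IDF vector (dict) per tokenised document."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    # Assumed IDF variant: log(N / df). Other variants exist.
    idf = {term: math.log(n / count) for term, count in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Assumed TF variant: raw count normalised by document length.
        vectors.append({t: (c / len(doc)) * idf[t] for t, c in tf.items()})
    return idf, vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return (doc_index, score) pairs sorted by descending similarity to the query."""
    idf, vectors = tf_idf_vectors(docs)
    # Query terms that never occur in the collection are ignored.
    q_tf = Counter(t for t in query if t in idf)
    q_vec = {t: c * idf[t] for t, c in q_tf.items()}
    scores = [(i, cosine(q_vec, vec)) for i, vec in enumerate(vectors)]
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Hypothetical sample collection and query.
docs = [
    ["cloud", "security", "encryption"],
    ["big", "data", "hadoop", "cluster"],
    ["homomorphic", "encryption", "cloud", "privacy"],
]
ranking = rank(["encryption", "privacy"], docs)
```

In the setting the abstract describes, the score computation would run on the cloud server over encrypted values, and only the client would decrypt and sort the scores; the sketch above merges those steps for readability.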

Keywords: ranked information retrieval; big data security; privacy; cloud; homomorphic encryption; similarity matching.

DOI: 10.1504/IJICS.2020.108839

International Journal of Information and Computer Security, 2020 Vol.13 No.2, pp.141 - 165

Received: 15 Sep 2017
Accepted: 23 Feb 2018

Published online: 05 Aug 2020
