Title: A fast and scalable similarity search in high-dimensional image datasets

Authors: Youssef Hanyf; Hassan Silkan

Addresses: Computer Science Department, Faculty of Sciences, Chouaib Doukkali University, El Jadida, Morocco; Computer Science Department, Faculty of Sciences, Chouaib Doukkali University, El Jadida, Morocco

Abstract: As the production and use of image data have grown, the volume of image datasets has increased exponentially over the last decade. Consequently, the cost of similarity search in image datasets has become a severe problem that limits the efficiency of similarity search engines for this data type. In this paper, we address the problem of reducing the similarity search cost in large, high-dimensional and scalable image datasets by proposing an improvement to the D-index method that lowers the search cost and handles scalable datasets efficiently. The improvement has two parts. First, we propose criteria and algorithms for choosing effective separation values that reduce the search cost. Second, we propose an algorithm for updating the structure of scalable datasets so that object insertions have less impact on the search cost. Experiments show that the proposed D-index version achieves good search performance in comparison with the classical D-index and is significantly more resistant to dataset growth than the original D-index.

Keywords: similarity search; high-dimensional image datasets; D-index; image dataset indexing; scalable datasets; content-based retrieval; metric spaces; data structure.

DOI: 10.1504/IJCAT.2019.097126

International Journal of Computer Applications in Technology, 2019 Vol.59 No.1, pp.95 - 104

Received: 06 Jun 2017
Accepted: 04 Jun 2018

Published online: 21 Dec 2018
