Title: AIR-IA: an analogous image removal approach using the intelligent archive

Authors: Jyoti Malhotra; Jagdish Bakal

Addresses: Department of CSE, G.H. Raisoni College of Engineering, RTM University Nagpur, India; MIT School of Engineering, MIT ADT University, Pune, India | Department of CSE, G.H. Raisoni College of Engineering, RTM University Nagpur, India; S.S. Jondhale College of Engineering, Mumbai, India

Abstract: Deduplication is maturing into a standard feature of backup and archive systems, where the aim is to free storage space by removing duplicates. Considering the demand for storage space and the need for justifiable deletion, this paper proposes a multi-container intelligent deduplication image archive system in which analogous images are removed from the system based on a similarity approach. Similarity-aware image deduplication is achieved by calculating image fingerprints, and images are deleted when their Hamming distance matches the predefined threshold. A probability model is presented for the overall probability of finding similar images in the respective containers, based on their relative storage and the similarity scores of the images. In addition, a linear optimisation model is formulated to target data minimisation and storage space maximisation, and it is further verified with the dataset. We evaluate our work on existing as well as synthesised datasets, and report accuracy in terms of precision, recall, F-score and deduplication ratio. It is observed that the binary hashes used in our system make a fair contribution to removing similar images.
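
The fingerprint-and-threshold step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which hash the authors use, so the average hash below is an assumption chosen for simplicity. The function names, the threshold value, and the reading of "matches the threshold" as "at or below the threshold" are also hypothetical.

```python
from typing import List, Sequence


def average_hash(pixels: Sequence[Sequence[int]]) -> int:
    """Assumed fingerprint: one bit per pixel, set when the pixel exceeds the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def deduplicate(fingerprints: List[int], threshold: int) -> List[int]:
    """Return indices of images to keep.

    An image is dropped when its fingerprint lies within `threshold` bits
    of an image that has already been kept (hypothetical rule).
    """
    kept: List[int] = []
    for i, fp in enumerate(fingerprints):
        if all(hamming_distance(fp, fingerprints[k]) > threshold for k in kept):
            kept.append(i)
    return kept


# Tiny 2x2 grayscale "images": img_b is a near-copy of img_a, img_c is different.
img_a = [[10, 200], [220, 30]]
img_b = [[12, 198], [225, 28]]
img_c = [[200, 10], [30, 220]]
hashes = [average_hash(img) for img in (img_a, img_b, img_c)]
kept_indices = deduplicate(hashes, threshold=1)
```

With these toy inputs the near-copy shares a fingerprint with the first image and is dropped, while the dissimilar image is retained. A real system would first resize and convert images to grayscale before hashing.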

Keywords: deduplication; image; fingerprints; similarity; storage space; optimisation.

DOI: 10.1504/IJAC.2020.114357

International Journal of Autonomic Computing, 2020 Vol.3 No.3/4, pp.290 - 307

Accepted: 24 Dec 2018
Published online: 20 Apr 2021