Title: A new bottom-up l-diversity method for Apache Spark

Authors: Salheddine Kabou; Laid Gasmi; Abdelbaset Kabou; Sidi Mohamed Benslimane

Addresses: Higher Normal School of Bechar, Bechar, Algeria ' Ahmed Draia University, Adrar, Algeria ' Scientific and Technical Information Center, Algiers, Algeria ' Ecole Superieure en Informatique, LabRI-SBA Lab, Sidi Bel Abbes, Algeria

Abstract: This paper addresses the critical issue of preserving privacy in data sharing by introducing 'ImBLd', an enhanced multidimensional bottom-up anonymisation approach based on the l-diversity privacy model. Implemented on the Apache Spark framework, ImBLd optimises data utility through efficient data insertion and splitting processes. Our method integrates data records from distributed workers into an enhanced bottom-up R-tree, significantly minimising data loss. Experimental results demonstrate that ImBLd outperforms traditional top-down methods, achieving superior data utility and reduced execution times. These findings highlight ImBLd's potential for efficient and scalable privacy-preserving data publishing in large-scale datasets.

Keywords: data anonymisation; bottom-up generalisation; l-diversity; Apache Spark.

DOI: 10.1504/IJICS.2025.146160

International Journal of Information and Computer Security, 2025 Vol.26 No.3, pp.272 - 290

Received: 08 Oct 2023
Accepted: 16 Oct 2024

Published online: 08 May 2025
