Title: Parallel computing for preserving privacy using k-anonymisation algorithms from big data

Authors: Sharath Yaji; B. Neelima

Addresses: Department of Information Science and Engineering, NMAM Institute Of Technology, Nitte, Karkala (Taluk), Karnataka, Post Box No. 574118, India; Department of Information Science and Engineering, NMAM Institute Of Technology, Nitte, Karkala (Taluk), Karnataka, Post Box No. 574118, India

Abstract: Many organisations still consider preserving privacy in big data a major challenge. Parallel computation can be used to optimise big data analysis. This paper proposes parallelising k-anonymisation algorithms, based on a comparative study and survey. The k-anonymisation algorithms considered are MinGen, DataFly, Incognito and Mondrian. The results show that the parallel versions of the algorithms perform better than their sequential counterparts as data size increases. For a small dataset, MinGen in sequential mode is 71.83% faster than the parallel version; DataFly also performed well in sequential mode, and Incognito in parallel mode. For a large dataset, Incognito in parallel mode is 101.186% faster than the sequential version; MinGen and DataFly performed well in sequential mode, and Incognito, DataFly and MinGen performed well in parallel mode. The paper acts as a single point of reference for choosing k-anonymisation algorithms for big data mining, and gives direction for applying HPC concepts such as parallelisation to privacy-preserving algorithms.

Keywords: big data; k-anonymisation; privacy preserving; big data analysis; parallel computing in big data.

DOI: 10.1504/IJBDI.2018.092659

International Journal of Big Data Intelligence, 2018 Vol.5 No.3, pp.191 - 200

Received: 22 Jun 2016
Accepted: 31 Oct 2016

Published online: 27 Nov 2017
