Title: A fuzzy programming approach for data reduction and privacy in distance-based mining

Authors: Shibnath Mukherjee, Zhiyuan Chen, Aryya Gangopadhyay

Addresses: Department of Information Systems, University of Maryland, Baltimore County (UMBC), 1000 Hilltop Circle, Baltimore, MD 21250, USA; Department of Information Systems, University of Maryland, Baltimore County (UMBC), 1000 Hilltop Circle, Baltimore, MD 21250, USA; Department of Information Systems, University of Maryland, Baltimore County (UMBC), 1000 Hilltop Circle, Baltimore, MD 21250, USA

Abstract: With the explosive growth of data and its distributed sources, there is an increasing need for secure cooperative data analysis. Two issues are becoming important: reducing data to decrease communication overhead, and preserving the privacy of shared data. However, existing privacy-preserving techniques do not work well for distance-based mining because they do not preserve distances. Moreover, most of them either do not reduce data or are tied to very specific mining algorithms. Using the unitarity and energy compaction properties of Fourier transforms, this paper proposes a novel framework that preserves privacy and reduces data size while preserving Euclidean distances. A fuzzy programming approach for selecting Fourier coefficients is proposed to optimise the objectives of preserving Euclidean distances and obtaining privacy and data reduction through coefficient suppression. Experiments demonstrate the superiority of the proposed approach over existing ones.
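
The two Fourier properties named in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the fuzzy programming selection step is replaced here by a plain "keep the k highest-energy coefficients" rule, and the signals, the value of `k`, and the helper names `unitary_dft` and `suppress` are assumptions made only for illustration. The sketch shows that a unitary DFT leaves the Euclidean distance unchanged, and that suppressing low-energy coefficients shrinks the data while keeping the distance close to the original.

```python
import numpy as np

def unitary_dft(x):
    # norm="ortho" scales the DFT by 1/sqrt(n), which makes the transform unitary.
    return np.fft.fft(x, norm="ortho")

def suppress(coeffs, keep):
    # Zero every coefficient except those at the positions listed in `keep`.
    out = np.zeros_like(coeffs)
    out[keep] = coeffs[keep]
    return out

# Two hypothetical records: smooth signals plus a little noise (illustrative data only).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 64, endpoint=False)
x = np.sin(2 * np.pi * 3 * t) + 0.05 * rng.standard_normal(64)
y = np.cos(2 * np.pi * 3 * t) + 0.05 * rng.standard_normal(64)

X, Y = unitary_dft(x), unitary_dft(y)

# Unitarity: the distance is the same in the original and the Fourier domain.
d_time = np.linalg.norm(x - y)
d_freq = np.linalg.norm(X - Y)

# Energy compaction: most energy sits in a few coefficients, so keep only
# the k positions with the largest combined energy (a simple stand-in for
# the coefficient selection step, NOT the fuzzy programming approach).
k = 8
energy = np.abs(X) ** 2 + np.abs(Y) ** 2
keep = np.argsort(energy)[-k:]
d_reduced = np.linalg.norm(suppress(X, keep) - suppress(Y, keep))

print(d_time, d_freq, d_reduced)
```

Because the same coefficient positions are kept for both records, the reduced distance can never exceed the true one; how close it stays depends on how much of the energy the retained coefficients carry. The suppressed coefficients are also what prevents an exact reconstruction of the original records, which is the intuition behind using suppression for privacy.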

Keywords: privacy protection; data mining; fuzzy programming; distance learning; information security; computer security; data reduction; privacy preservation; coefficient suppression.

DOI: 10.1504/IJICS.2008.016820

International Journal of Information and Computer Security, 2008 Vol.2 No.1, pp.27-47

Published online: 24 Jan 2008
