Title: Fuzzy based clustering algorithm for privacy preserving data mining

Authors: Pradeep Kumar, Kishore Indukuri Varma, Ashish Sureka

Addresses: 219, Faculty Block, IIM Campus, Prabandh Nagar, Off Sitapur Road, Lucknow – 226013, India; Infosys Technologies Ltd., Survey No:210, Manikonda Village, Rajendranagar Mandal, Lingampally, Rangareddy District, Hyderabad 500019, India; Indraprastha Institute of Information Technology (IIIT), 3rd Floor, Library Building, NSIT Campus, Dwarka, Sector 3, New Delhi – 110078, India

Abstract: Sharing data among multiple organisations is required in many situations. The shared data may contain sensitive information about individuals which, if disclosed, may lead to a privacy breach, so maintaining individual privacy is a major challenge. Privacy preserving data mining (PPDM) has evolved as a solution to the challenges that arise when data to be mined must be shared. The objective of PPDM is to mine interesting knowledge from the data while maintaining individual privacy. This paper addresses the problem of PPDM by transforming the attributes into fuzzy attributes. Individual privacy is maintained because the exact value cannot be predicted, while better accuracy of the mining results is achieved. Experiments with the ID3 and Naive Bayes classification algorithms on three different datasets show the effectiveness of the approach.
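
The idea of replacing exact attribute values with fuzzy attributes can be illustrated with a minimal sketch. The abstract does not state which membership functions or linguistic labels the paper uses, so everything below is an assumption made for illustration: evenly spaced triangular membership functions, the labels "low", "medium" and "high", and the hypothetical helper names `triangular`, `make_fuzzy_sets`, `fuzzify` and `fuzzify_column`. Each numeric value is replaced by the label in which it has the highest membership, so the shared column no longer reveals the exact value.

```python
from typing import Dict, List, Tuple

FuzzySet = Tuple[float, float, float]  # (left foot, peak, right foot)


def triangular(x: float, a: float, b: float, c: float) -> float:
    """Membership of x in a triangular fuzzy set with feet a, c and peak b."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def make_fuzzy_sets(lo: float, hi: float, labels: List[str]) -> Dict[str, FuzzySet]:
    """Build evenly spaced triangular sets over [lo, hi] (assumed layout).

    Requires at least two labels; the first and last sets are clipped
    at the ends of the range.
    """
    step = (hi - lo) / (len(labels) - 1)
    sets: Dict[str, FuzzySet] = {}
    for i, label in enumerate(labels):
        peak = lo + i * step
        sets[label] = (max(lo, peak - step), peak, min(hi, peak + step))
    return sets


def fuzzify(x: float, sets: Dict[str, FuzzySet]) -> str:
    """Return the label whose fuzzy set gives x the highest membership."""
    return max(sets, key=lambda label: triangular(x, *sets[label]))


def fuzzify_column(values: List[float], lo: float, hi: float,
                   labels: List[str]) -> List[str]:
    """Replace every exact value in a column by its fuzzy label."""
    sets = make_fuzzy_sets(lo, hi, labels)
    return [fuzzify(v, sets) for v in values]


if __name__ == "__main__":
    # Illustrative values only, not data from the paper.
    ages = [23, 37, 45, 61, 78]
    print(fuzzify_column(ages, 20, 80, ["low", "medium", "high"]))
```

The transformed column could then be passed to a classifier that accepts categorical attributes, such as ID3 or Naive Bayes. How the range bounds and the number of labels are chosen affects both the privacy and the accuracy of the result, and the sketch leaves that choice to the caller.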

Keywords: privacy preserving data mining; PPDM; fuzzy sets; decision trees; clustering algorithms; data sharing; multiple organisations; sensitive information; privacy breaches; data protection; individual privacy; fuzzy attributes; ID3; iterative dichotomisers; Naive Bayes classification; business information systems.

DOI: 10.1504/IJBIS.2011.037295

International Journal of Business Information Systems, 2011 Vol.7 No.1, pp.27-40

Available online: 02 Dec 2010
