Authors: Mary A. Geetha
Addresses: School of Computing Science and Engineering, VIT University, Vellore-632014, India
Abstract: Privacy preservation is a major concern when real-time datasets are handled. In particular, when data are accessed from the cloud, strict privacy algorithms have to be applied to ensure that no sensitive data is leaked. Privacy-preserving data mining (PPDM) is the field that deals with modifying data to protect privacy while limiting information loss. Data perturbation is one of the PPDM techniques; it mostly deals with numerical data and concentrates on retaining the statistical properties of the data. Perturbation is of two types, additive and multiplicative, in which randomly generated data is either added to or multiplied with the original data, yielding randomly modified data. In this paper we propose a model in which the perturbation is done by randomisation, where the data is generated in intervals based on a level of privacy produced by a fuzzy system from various inputs. Experimental analysis, in which the model is validated using classification algorithms, shows the model to be successful.
Keywords: C5.0 algorithm; classification; data mining; privacy preservation; PPDM; fuzzy logic; random perturbation; medical datasets; clinical practice; data modification; information loss; data perturbation; cloud computing.
International Journal of Telemedicine and Clinical Practices, 2015 Vol.1 No.2, pp.111 - 124
Received: 19 Oct 2013
Accepted: 20 Apr 2014
Published online: 09 Jun 2015
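
The two perturbation types named in the abstract can be illustrated with a minimal Python sketch. It shows generic additive and multiplicative perturbation only, not the fuzzy, interval-based model the paper proposes. The function names, the Gaussian noise, the noise parameters, and the synthetic `ages` column are all assumptions made for illustration and do not come from the paper.

```python
import random
import statistics


def additive_perturbation(values, noise_sd, rng):
    """Add zero-mean Gaussian noise to each value (assumed noise model)."""
    return [v + rng.gauss(0.0, noise_sd) for v in values]


def multiplicative_perturbation(values, noise_sd, rng):
    """Multiply each value by Gaussian noise centred on 1 (assumed noise model)."""
    return [v * rng.gauss(1.0, noise_sd) for v in values]


# Hypothetical numerical attribute, e.g. patient ages (synthetic data).
rng = random.Random(0)
ages = [rng.uniform(20, 80) for _ in range(10_000)]

added = additive_perturbation(ages, noise_sd=5.0, rng=rng)
scaled = multiplicative_perturbation(ages, noise_sd=0.1, rng=rng)

# Individual values change, but the column mean stays close to the original.
print("original mean:      ", statistics.mean(ages))
print("additive mean:      ", statistics.mean(added))
print("multiplicative mean:", statistics.mean(scaled))
```

Because the noise in this sketch is centred on 0 (additive) or 1 (multiplicative), individual records are masked while aggregate statistics such as the mean remain approximately intact, which is the property the abstract describes as retaining the statistical properties of the data.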