Title: Privacy preservation in data mining using hybrid perturbation methods: an application to bankruptcy prediction in banks

Authors: Kunta Ramu, V. Ravi

Addresses: Institute for Development and Research in Banking Technology, Castle Hills Road #1, Masab Tank, Hyderabad – 500 007 (AP) India; Institute for Development and Research in Banking Technology, Castle Hills Road #1, Masab Tank, Hyderabad – 500 007 (AP) India

Abstract: Today, the data related to business, finance and healthcare pose problems for Privacy-Preserving Data Mining (PPDM). Privacy regulations and concerns prevent data owners from sharing data for mining purposes. To circumvent this problem, data owners must design strategies to meet privacy requirements and ensure valid data mining results. This paper proposes the hybridisation of the random projection and random rotation methods for privacy-preserving classification. The hybrid method is tested on six benchmark data sets and four bank bankruptcy data sets. These methods ensure the privacy and secrecy of bank data and the resulting data set is mined without a considerable loss of accuracy. A multilayer perceptron, decision tree J48 and logistic regression are used as classifiers. The results of a tenfold cross-validation and t-test indicate improved average accuracies for the hybrid privacy preservation method compared to when random projection is used alone. The reasons for the superior performance of the hybrid privacy preservation method are also highlighted.
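As an illustration of the idea in the abstract, the following is a minimal sketch of combining a random rotation with a random projection before the data are handed to a classifier. It is not the authors' implementation: the order of the two steps, the Gaussian projection matrix, the Gram-Schmidt construction of the rotation, and all function names (random_orthogonal, random_projection_matrix, transform, hybrid_perturb) are assumptions made for this example.

```python
import math
import random


def random_orthogonal(d, rng):
    """Return a d x d orthogonal matrix (list of rows).

    Built by Gram-Schmidt orthonormalisation of Gaussian vectors
    (an assumed construction, chosen for simplicity).
    """
    basis = []
    while len(basis) < d:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        # Remove the components along the vectors already accepted.
        for b in basis:
            dot = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - dot * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm > 1e-8:  # skip the (unlikely) degenerate draw
            basis.append([vi / norm for vi in v])
    return basis


def random_projection_matrix(k, d, rng):
    """Return a k x d Gaussian matrix whose entries have variance 1/k."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def transform(matrix, rows):
    """Map every record x (a list of floats) to matrix @ x."""
    return [[sum(m * x for m, x in zip(mrow, row)) for mrow in matrix]
            for row in rows]


def hybrid_perturb(rows, k, seed=0):
    """Rotate the records, then project them down to k dimensions.

    Assumption: rotation first, projection second; the abstract does
    not state the order used in the paper.
    """
    rng = random.Random(seed)
    d = len(rows[0])
    rotation = random_orthogonal(d, rng)
    projection = random_projection_matrix(k, d, rng)
    return transform(projection, transform(rotation, rows))


if __name__ == "__main__":
    records = [[1.0, 2.0, 3.0, 4.0],
               [0.5, -1.0, 2.0, 0.0],
               [3.0, 3.0, -2.0, 1.0]]
    perturbed = hybrid_perturb(records, k=2)
    print(len(perturbed), "records,", len(perturbed[0]), "attributes each")
```

In this sketch the rotation preserves pairwise Euclidean distances exactly, while the projection reduces dimensionality and preserves distances only approximately. The perturbed records, rather than the originals, would then be passed to a classifier.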

Keywords: privacy preservation; data mining; PPDM; random projection; random rotation; stress function; bankruptcy prediction; classification; multilayer perceptron; decision tree J48; logistic regression; banks.

DOI: 10.1504/IJDATS.2009.027509

International Journal of Data Analysis Techniques and Strategies, 2009 Vol.1 No.4, pp.313 - 331

Published online: 28 Jul 2009
