Title: Privacy-preserving multi-party decision tree induction

Authors: Justin Z. Zhan, Stan Matwin, LiWu Chang

Addresses: School of Information Technology and Engineering, University of Ottawa, 800 King Edward Ave., P.O. Box 450 Stn A, Ottawa, Ontario, K1N 6N5, Canada; School of Information Technology and Engineering, University of Ottawa, 800 King Edward Ave., P.O. Box 450 Stn A, Ottawa, Ontario, K1N 6N5, Canada; Center for High Assurance Computer Systems, Naval Research Laboratory, Washington, DC 20375, USA

Abstract: Data mining is the process of extracting useful knowledge from large amounts of data. Conducting data mining often requires collecting data, but sometimes the data are distributed among several parties. Privacy concerns may prevent the parties from directly sharing the data, or even some types of information about the data. How multiple parties can collaboratively conduct data mining without breaching data privacy presents a grand challenge. In this paper, we propose a randomisation-based scheme that lets multiple parties conduct data mining computations without disclosing their actual data sets to each other.

Keywords: data mining; decision tree classification; privacy preservation; randomisation; data privacy.

DOI: 10.1504/IJBIDM.2007.013937

International Journal of Business Intelligence and Data Mining, 2007 Vol.2 No.2, pp.197 - 212

Published online: 04 Jun 2007
