Authors: Xiaoqian Liu; Qianmu Li; Tao Li; Ming Wu
Addresses: Department of Computer Information and Cyber Security, Jiangsu Police Institute, Nanjing, China; School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China; School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, China; School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
Abstract: As a representative classification model, the decision tree has been extensively applied in data mining. It generates a series of if-then rules based on the homogeneity of the class distribution. In a society where data is widely shared for knowledge discovery, the privacy of data respondents is likely to be leaked and abused. Motivated by this concern, we present an overview of the rapidly evolving research on privacy-preserving decision tree induction. The research results are summarised according to the characteristics of the underlying privacy preservation techniques, which include data perturbation, cryptography, and data anonymisation. In addition, we compare the merits and demerits of these methods with respect to the specific properties of decision tree induction. Finally, we discuss future trends in privacy-preserving techniques.
Keywords: decision tree; privacy preservation; ensemble; differential privacy; data perturbation; cryptography; data anonymisation.
International Journal of Information and Computer Security, 2021, Vol. 16, No. 3/4, pp. 255-271
Accepted: 30 Dec 2018
Published online: 15 Nov 2021