Title: Protection of health information in data mining

Authors: Jingquan Li, Michael J. Shaw

Addresses: Department of Business Administration, College of Business, University of Illinois at Urbana-Champaign, 350 Wohlers Hall, 1206 S. Sixth, Champaign, IL 61820, USA; Department of Business Administration, College of Business, University of Illinois at Urbana-Champaign, 350 Wohlers Hall, 1206 S. Sixth, Champaign, IL 61820, USA

Abstract: This paper studies privacy protection of health information in data mining. We introduce the implications of the Health Insurance Portability and Accountability Act (HIPAA) for the privacy of protected health information (PHI). We present an attribute analysis framework for health information and a technological approach to protecting PHI in data mining. Specifically, we develop three privacy-enhancing techniques (data filtering, discretisation and randomisation) and give an example of inducing decision trees from training data in which the values of sensitive attributes have been either removed or modified using these techniques. The results show that comparable predictive accuracy can be achieved without access to the original values of the sensitive attributes.
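
The abstract names three techniques without showing what they do to a record. The following is a minimal illustrative sketch in Python, not the authors' implementation: the function names, the field names (`name`, `ssn`, `age`, `diagnosis`), the age cut points, and the 0.8 retention probability are all assumptions chosen for illustration. Randomisation is shown here as a simple "keep or replace uniformly" scheme, which may differ from the scheme used in the paper.

```python
import random
from bisect import bisect_right


def data_filter(record, drop):
    """Data filter: remove the listed attributes (e.g. direct identifiers)."""
    return {k: v for k, v in record.items() if k not in drop}


def discretise(value, edges, labels):
    """Discretisation: map a numeric value to a coarse interval label.

    `edges` are ascending cut points; `labels` must have len(edges) + 1 entries.
    """
    return labels[bisect_right(edges, value)]


def randomise(value, domain, keep_prob, rng):
    """Randomisation: keep the true value with probability `keep_prob`,
    otherwise replace it with a value drawn uniformly from `domain`."""
    if rng.random() < keep_prob:
        return value
    return rng.choice(domain)


# Hypothetical settings, assumed for this sketch only.
AGE_EDGES = [30, 50, 70]
AGE_LABELS = ["<30", "30-49", "50-69", "70+"]
DIAGNOSES = ["A", "B", "C"]


def protect(record, rng):
    """Apply all three steps to one record before it is used for training."""
    out = data_filter(record, drop={"name", "ssn"})
    out["age"] = discretise(out["age"], AGE_EDGES, AGE_LABELS)
    out["diagnosis"] = randomise(out["diagnosis"], DIAGNOSES, 0.8, rng)
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    patient = {"name": "Jane Doe", "ssn": "000-00-0000", "age": 45, "diagnosis": "B"}
    print(protect(patient, rng))
```

A decision tree would then be induced from the protected records rather than the originals. The sketch only illustrates the data transformations and makes no claim about HIPAA compliance.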

Keywords: classification; data mining; HIPAA; privacy; privacy-enhancing techniques; protected health information; Health Insurance Portability and Accountability Act; information protection; healthcare.

DOI: 10.1504/IJHTM.2004.004977

International Journal of Healthcare Technology and Management, 2004 Vol.6 No.2, pp.210 - 222

Published online: 04 Aug 2004
