Title: Privacy-preserving data warehousing

Authors: Benjamin Fabian; Tom Göthling

Addresses: Institute of Information Systems, Humboldt-Universität zu Berlin, Spandauer Str. 1, 10178 Berlin, Germany; Institute of Information Systems, Humboldt-Universität zu Berlin, Spandauer Str. 1, 10178 Berlin, Germany

Abstract: Data warehouses are an important element of business intelligence and decision support in many companies and inter-organisational data infrastructures. However, where personal information about individuals is concerned, sufficient protection mechanisms are critical to preserving privacy. In addition to classical access control, database anonymisation is an important element of an encompassing strategy for privacy-preserving data storage. This article gives an overview of selected anonymisation concepts and techniques and investigates whether they are suitable for a data warehouse context. Furthermore, a process of privacy-preserving data integration and provisioning is presented, and the impact of architecture, privacy criteria, and further parameter choices is discussed. Finally, we compare the impact of these parameters on data utility after anonymisation in several experiments on multiple datasets and derive corresponding recommendations.

Keywords: data warehousing; data integration; privacy preservation; privacy protection; anonymity; personal information; anonymisation; data warehouses; data utility.

DOI: 10.1504/IJBIDM.2015.072210

International Journal of Business Intelligence and Data Mining, 2015 Vol.10 No.4, pp.297 - 336

Received: 02 Jun 2015
Accepted: 06 Jun 2015

Published online: 04 Oct 2015
