Title: Using grouping strategy and pattern discovery for delta extraction in a limited collaborative environment

Authors: Zheng Lu; Jun Yan; Xinqin Wang

Addresses: School of Computing and Information Technology, University of Wollongong, NSW, Australia; School of Computing and Information Technology, University of Wollongong, NSW, Australia; Planning Service, University of Wollongong, NSW, Australia

Abstract: This work considers delta extraction in a distributed environment where collaboration from highly autonomous operational database management systems is limited to granting read-only access to a set of selected relational tables. Because of the inherently huge volume of data in a data warehouse system, it is critical to minimise communication costs as much as possible. Based on the observation that two consecutive snapshots usually do not differ much, a statistics-based group hash method is developed to minimise the volume of data required to complete the data extraction. In addition, to relax the assumption that changes to remote data are caused only by random events, we define a progression pattern to describe data changes with temporal regularities and propose a method for progression pattern discovery.
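
The general idea behind comparing snapshots by group hash can be illustrated with a minimal sketch. This is not the statistical method developed in the article: the fixed key-range grouping, the in-memory dictionaries standing in for remote tables, and the names `group_digests`, `changed_groups` and `extract_delta` are all assumptions made for illustration. Rows are grouped by key range, one digest is computed per group, and only groups whose digests differ are compared row by row.

```python
import hashlib
from collections import defaultdict


def group_digests(snapshot, group_size):
    """Return {group id: digest} for a snapshot given as {integer key: row tuple}.

    Assumption: groups are fixed key ranges of width group_size.
    """
    groups = defaultdict(list)
    for key in sorted(snapshot):
        groups[key // group_size].append((key, snapshot[key]))
    return {
        gid: hashlib.sha256(repr(rows).encode("utf-8")).hexdigest()
        for gid, rows in groups.items()
    }


def changed_groups(old_snapshot, new_snapshot, group_size):
    """List the group ids whose digests differ between the two snapshots."""
    old = group_digests(old_snapshot, group_size)
    new = group_digests(new_snapshot, group_size)
    return sorted(g for g in old.keys() | new.keys() if old.get(g) != new.get(g))


def extract_delta(old_snapshot, new_snapshot, group_size):
    """Compare rows only inside changed groups; return (inserted, deleted, updated)."""
    inserted, deleted, updated = {}, {}, {}
    for gid in changed_groups(old_snapshot, new_snapshot, group_size):
        lo, hi = gid * group_size, (gid + 1) * group_size
        old_rows = {k: v for k, v in old_snapshot.items() if lo <= k < hi}
        new_rows = {k: v for k, v in new_snapshot.items() if lo <= k < hi}
        for k in new_rows.keys() - old_rows.keys():
            inserted[k] = new_rows[k]
        for k in old_rows.keys() - new_rows.keys():
            deleted[k] = old_rows[k]
        for k in old_rows.keys() & new_rows.keys():
            if old_rows[k] != new_rows[k]:
                updated[k] = (old_rows[k], new_rows[k])
    return inserted, deleted, updated
```

In this sketch, when two consecutive snapshots differ in only a few rows, most group digests match and the corresponding rows never need to be transferred or compared. The choice of `group_size` trades the number of digests exchanged against the number of rows fetched per changed group.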

Keywords: delta extraction; data integration; data synchronisation; data change patterns; pattern discovery; collaborative environments; grouping strategy; limited collaboration; autonomous DBMS; operational DBMS; database management systems; group hash; data warehouses.

DOI: 10.1504/IJBIDM.2015.072213

International Journal of Business Intelligence and Data Mining, 2015 Vol.10 No.4, pp.378 - 405

Received: 06 May 2015
Accepted: 06 Jun 2015

Published online: 04 Oct 2015
