Title: Approximate queries on distributed data marts

Authors: Francis A. Mendez Mediavilla, Hsun-Ming Lee

Addresses: Department of Computer Information Systems and Quantitative Methods, Texas State University, 601 University Dr, San Marcos, TX 78666, USA; Department of Computer Information Systems and Quantitative Methods, Texas State University, 601 University Dr, San Marcos, TX 78666, USA

Abstract: Global businesses deal with large amounts of business data stored in potentially hundreds of distributed systems. It is challenging to allow end-users to issue online analytical processing (OLAP) queries that retrieve suitable information across a worldwide network. This article presents the idea of using statistical methods to model federated data marts. Once the data marts are modelled, reduced sets of distributed data can be imported and used to approximately reconstruct a federated data mart. Approximate query answers can then be obtained from the reconstructed federated data mart. Advantages of this design include quick query responses without accessing external servers, user-defined accuracy of the approximate query answers, and a network-efficient method for periodic updates. A proof of concept is presented using large data sets used for marketing analysis.
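
Illustration: The abstract does not say which statistical model the authors use, so the following is only a minimal sketch of the general idea, assuming an equi-width histogram as a stand-in model. Each remote data mart reduces a measure column to bin counts, the local site merges those counts into an approximate federated view, and aggregate queries are answered from the merged counts without contacting the remote servers. All names here (HistogramSummary, summarize, merge, approx_sum, approx_avg) are hypothetical and do not come from the article.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class HistogramSummary:
    """Reduced representation of one measure column (assumed model, not the article's)."""
    low: float
    high: float
    counts: List[int]

    @property
    def bin_width(self) -> float:
        return (self.high - self.low) / len(self.counts)


def summarize(values: Sequence[float], low: float, high: float, bins: int) -> HistogramSummary:
    """Run at a remote data mart: reduce raw values to per-bin counts."""
    counts = [0] * bins
    width = (high - low) / bins
    for v in values:
        # Clamp the index so out-of-range values land in the first or last bin.
        idx = max(0, min(int((v - low) / width), bins - 1))
        counts[idx] += 1
    return HistogramSummary(low, high, counts)


def merge(summaries: Sequence[HistogramSummary]) -> HistogramSummary:
    """Run locally: combine per-mart summaries into one federated summary."""
    first = summaries[0]
    layout = (first.low, first.high, len(first.counts))
    merged = [0] * len(first.counts)
    for s in summaries:
        if (s.low, s.high, len(s.counts)) != layout:
            raise ValueError("summaries must share the same bin layout")
        for i, c in enumerate(s.counts):
            merged[i] += c
    return HistogramSummary(first.low, first.high, merged)


def approx_sum(summary: HistogramSummary) -> float:
    """Approximate SUM by treating every row as its bin midpoint."""
    w = summary.bin_width
    return sum(c * (summary.low + (i + 0.5) * w) for i, c in enumerate(summary.counts))


def approx_avg(summary: HistogramSummary) -> float:
    """Approximate AVG from the merged counts; 0.0 for an empty summary."""
    n = sum(summary.counts)
    return approx_sum(summary) / n if n else 0.0


if __name__ == "__main__":
    # Two hypothetical marts holding sales amounts in the range [0, 100).
    mart_a = [10.0, 20.0, 30.0]
    mart_b = [40.0, 50.0]
    for bins in (10, 100):
        federated = merge([summarize(m, 0.0, 100.0, bins) for m in (mart_a, mart_b)])
        print(bins, approx_sum(federated), approx_avg(federated))
```

In this sketch the number of bins plays the role of a user-chosen accuracy setting: for in-range values, each row contributes an error of at most half a bin width, so more bins give tighter answers at the cost of larger summaries. A periodic update would only need to resend the bin counts, not the underlying rows.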

Keywords: approximate queries; federated data marts; data reduction; distributed systems; OLAP; online analytical processing; modelling; query response; marketing analysis.

DOI: 10.1504/IJIDS.2009.027757

International Journal of Information and Decision Sciences, 2009 Vol.1 No.4, pp.366 - 381

Published online: 10 Aug 2009
