Authors: Kun Ma; Bo Yang
Addresses: Shandong Provincial Key Laboratory of Network Based Intelligent Computing, University of Jinan, Jinan 250022, Shandong, China; Shandong Provincial Key Laboratory of Network Based Intelligent Computing, University of Jinan, Jinan 250022, Shandong, China
Abstract: Data warehousing has been embraced by organisations of all sizes. However, there are few publications on NoSQL data warehouses. In this paper, an extreme data storage middleware (EDSM) for schema-free document stores is presented. It uses the MapReduce framework to address the problem of building a data warehouse that is free of redundant data and requires only a small amount of storage space. The experiment shows that EDSM successfully builds a NoSQL data warehouse with less data redundancy than the document-with-timestamp and lifecycle-tag solutions. Our experiment also provides insight into some of the key challenges and shortcomings that researchers and engineers face when designing data warehouse middleware.
Keywords: extreme data storage; historical data; CDC; change data capture; MapReduce; NoSQL; lifecycle tag; data warehousing; data redundancy; data warehouse middleware; schema-free document stores.
International Journal of Ad Hoc and Ubiquitous Computing, 2015 Vol.20 No.4, pp.274 - 284
Received: 02 Sep 2013
Accepted: 11 Feb 2014
Published online: 08 Dec 2015