Title: Introducing extreme data storage middleware of schema-free document stores using MapReduce

Authors: Kun Ma; Bo Yang

Addresses: Shandong Provincial Key Laboratory of Network Based Intelligent Computing, University of Jinan, Jinan 250022, Shandong, China; Shandong Provincial Key Laboratory of Network Based Intelligent Computing, University of Jinan, Jinan 250022, Shandong, China

Abstract: Data warehousing has been embraced by organisations of all sizes. However, there are few publications on NoSQL data warehouses. In this paper, an extreme data storage middleware (EDSM) for schema-free document stores is presented. It uses the MapReduce framework to address the problem of building a data warehouse that holds no redundant data and requires only a small amount of storage space. The experiment shows that EDSM successfully builds a NoSQL data warehouse with less data redundancy than the document-with-timestamp and lifecycle-tag solutions. Our experiment also provides insight into some of the key challenges and shortcomings that researchers and engineers face when designing data warehouse middleware.

Keywords: extreme data storage; historical data; CDC; change data capture; MapReduce; NoSQL; lifecycle tag; data warehousing; data redundancy; data warehouse middleware; schema-free document stores.

DOI: 10.1504/IJAHUC.2015.073439

International Journal of Ad Hoc and Ubiquitous Computing, 2015 Vol.20 No.4, pp.274 - 284

Received: 02 Sep 2013
Accepted: 11 Feb 2014

Published online: 08 Dec 2015
