Title: MR-LEGOS: a refined MapReduce model


Author: Ahmed Radwan; Akmal Younis; Santhosh Srinivasan; Abhay Gupta


Yahoo! Inc., Santa Clara, CA, 95054, USA.
College of Engineering, University of Miami, Coral Gables, FL 33124, USA.
Yahoo! Inc., Santa Clara, CA, 95054, USA.
Yahoo! Inc., Santa Clara, CA, 95054, USA.


Journal: Int. J. of Cloud Computing, 2011 Vol.1, No.1, pp.58 - 80


Abstract: MapReduce is a parallel programming model that is proven to scale. However, using low-level MapReduce for general data processing tasks poses the problem of developing, maintaining and reusing custom low-level user code. Several frameworks have emerged to address this problem. We highlight several issues in these approaches and propose an alternative, a novel refined MapReduce model (MR-LEGOS): an explicit model for composing MapReduce constructs from simpler components, namely 'Maplets', 'Reducelets' and, optionally, 'Combinelets'. This composition can be viewed as defining a micro-workflow inside the MapReduce job. Using MR-LEGOS, complex problem semantics can be defined in the encompassing micro-workflow while keeping the building blocks simple. The model is analogous to LEGO bricks: having a collection of these standard, reusable, predefined bricks helps define complex processing tasks efficiently. We present the design details, usage scenarios and performance experiments, and highlight the main features of MR-LEGOS.
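
The composition idea in the abstract can be illustrated with a minimal, single-process sketch. It is not the MR-LEGOS API: the names `compose`, `tokenize`, `lowercase_key`, `sum_values`, `keep_at_least` and `run_job` are hypothetical, the "job" is an in-memory simulation rather than a Hadoop job, and Combinelets are left out for brevity. The sketch only shows how small, reusable pieces might be chained into a micro-workflow on the map side and on the reduce side.

```python
from collections import defaultdict


def compose(*steps):
    """Chain small steps into one stage (an assumed stand-in for a micro-workflow)."""
    def pipeline(pairs):
        for step in steps:
            pairs = step(pairs)
        return pairs
    return pipeline


# Hypothetical maplets: each consumes and produces (key, value) pairs.
def tokenize(pairs):
    for _, line in pairs:
        for word in line.split():
            yield word, 1


def lowercase_key(pairs):
    for key, value in pairs:
        yield key.lower(), value


# Hypothetical reducelets: the first consumes (key, list_of_values) groups,
# the second filters the (key, total) pairs produced by the first.
def sum_values(groups):
    for key, values in groups:
        yield key, sum(values)


def keep_at_least(threshold):
    def reducelet(pairs):
        for key, value in pairs:
            if value >= threshold:
                yield key, value
    return reducelet


def run_job(records, mapper, reducer):
    """Simulate map, shuffle (group by key) and reduce in memory."""
    groups = defaultdict(list)
    for key, value in mapper(records):
        groups[key].append(value)
    return dict(reducer(sorted(groups.items())))


records = [(0, "Map reduce map"), (1, "reduce MAP lego")]
mapper = compose(tokenize, lowercase_key)          # two maplets, one map stage
reducer = compose(sum_values, keep_at_least(2))    # two reducelets, one reduce stage
result = run_job(records, mapper, reducer)
print(result)
```

In this sketch each brick stays trivial, and the problem-specific behaviour (a case-insensitive word count that keeps only words seen at least twice) lives entirely in how the bricks are chained.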


Keywords: cloud computing; MapReduce; Hadoop; data management; grid computing; LEGOS; extract transform load; parallel programming; semantics.


DOI: 10.1504/IJCC.2011.043246



