Title: MR-LEGOS: a refined MapReduce model
Author: Ahmed Radwan; Akmal Younis; Santhosh Srinivasan; Abhay Gupta
Address: Yahoo! Inc., Santa Clara, CA, 95054, USA; College of Engineering, University of Miami, Coral Gables, FL 33124, USA; Yahoo! Inc., Santa Clara, CA, 95054, USA; Yahoo! Inc., Santa Clara, CA, 95054, USA
Journal: Int. J. of Cloud Computing, 2011 Vol.1, No.1, pp.58-80
Abstract: MapReduce is a parallel programming model that has been proven to scale. However, using low-level MapReduce for general data processing tasks poses the problem of developing, maintaining and reusing custom low-level user code. Several frameworks have emerged to address this problem. We highlight several issues in these approaches and propose, as an alternative, a novel refined MapReduce model (MR-LEGOS): an explicit model for composing MapReduce constructs from simpler components, namely 'Maplets', 'Reducelets' and, optionally, 'Combinelets'. This composition can be viewed as defining a micro-workflow inside the MapReduce job. Using MR-LEGOS, complex problem semantics can be defined in the encompassing micro-workflow while the building blocks stay simple. The model is analogous to LEGO bricks: a collection of standard, reusable, predefined bricks helps define complex processing tasks efficiently. We present the design details, usage scenarios and performance experiments, and highlight the main features of MR-LEGOS.
Keywords: cloud computing; MapReduce; Hadoop; data management; grid computing; LEGOS; extract transform load; parallel programming; semantics.