Data analysis on big data: improving the map and shuffle phases in Hadoop Map Reduce
by J.V.N. Lakshmi
International Journal of Data Analysis Techniques and Strategies (IJDATS), Vol. 10, No. 3, 2018

Abstract: Data management has become a challenging issue for network-centric applications that need to process large volumes of data. Such systems require advanced tools to analyse these datasets. As an efficient parallel programming model, Map Reduce, together with Hadoop, is used for large-scale data analysis. However, Map Reduce still suffers from performance problems. Map Reduce uses a shuffle phase as an individual shuffle service component with an efficient I/O policy. The map phase requires an improvement in its performance, as this phase's output acts as the input to the next phase. Its result reveals the efficiency, so the map phase needs intermediate checkpoints that regularly monitor all the splits generated by intermediate phases. This acts as a barrier to effective resource utilisation. This paper implements shuffle as a service component to decrease the overall execution time of jobs, to monitor the map phase through skew handling, and to increase resource utilisation in a cluster.

Online publication date: Fri, 17-Aug-2018
