Title: A review of data mining algorithms on Hadoop's MapReduce

Authors: Sikha Bagui; Sean Spratlin

Addresses: Department of Computer Science, University of West Florida, Pensacola, FL 32514, USA; Department of Computer Science, University of West Florida, Pensacola, FL 32514, USA

Abstract: This paper reviews the most frequently used data mining algorithms on Hadoop's MapReduce. We describe each algorithm with respect to its implementation and performance on Hadoop's MapReduce. We also discuss the similarities and differences between the parallel or distributed MapReduce implementations and the original sequential implementations.

Keywords: Hadoop; MapReduce; Classification; Clustering; KNN; SVM; Regression; Association Rule Mining.

DOI: 10.1504/IJDS.2018.092285

International Journal of Data Science, 2018 Vol.3 No.2, pp.146-169

Received: 15 Aug 2016
Accepted: 10 Feb 2017

Published online: 14 Jun 2018
