Title: A review of data mining algorithms on Hadoop's MapReduce

Authors: Sikha Bagui; Sean Spratlin

Addresses: Department of Computer Science, University of West Florida, Pensacola, FL 32514, USA; Department of Computer Science, University of West Florida, Pensacola, FL 32514, USA

Abstract: This paper is a review of the most frequently used data mining algorithms on Hadoop's MapReduce. We describe the algorithms with respect to their implementation and performance on Hadoop's MapReduce. We also discuss the similarities and differences between MapReduce's parallel or distributed implementations and the original standard sequential implementations.

Keywords: Hadoop; MapReduce; Classification; Clustering; KNN; SVM; Regression; Association Rule Mining.

DOI: 10.1504/IJDS.2018.092285

International Journal of Data Science, 2018 Vol.3 No.2, pp.146 - 169

Available online: 27 May 2018
