Authors: Sikha Bagui; Sean Spratlin
Addresses: Department of Computer Science, University of West Florida, Pensacola, FL 32514, USA (both authors)
Abstract: This paper reviews the most frequently used data mining algorithms on Hadoop's MapReduce, describing each algorithm's implementation and performance on that platform. We also discuss the similarities and differences between the parallel or distributed MapReduce implementations and the original sequential implementations.
Keywords: Hadoop; MapReduce; Classification; Clustering; KNN; SVM; Regression; Association Rule Mining.
International Journal of Data Science, 2018 Vol.3 No.2, pp.146 - 169
Available online: 27 May 2018