Title: Parallel naïve Bayes regression model-based collaborative filtering recommendation algorithm and its realisation on Hadoop for big data

Authors: Shiqi Wen; Cheng Wang; Haibo Li; Guoqi Zheng

Addresses: College of Computer Science and Technology, Huaqiao University, Xiamen 361021, China ' College of Computer Science and Technology, Huaqiao University, Xiamen 361021, China; The Xiamen Engineering Research Centre of Enterprise Interoperability and Business Intelligence, Xiamen 361021, China ' College of Computer Science and Technology, Huaqiao University, Xiamen 361021, China; The Xiamen Engineering Research Centre of Enterprise Interoperability and Business Intelligence, Xiamen 361021, China ' Huaqiao University-Yardi Big Data Research Centre, Xiamen 361021, China

Abstract: Collaborative filtering (CF) algorithms are widely used in recommender systems. However, their space-time overhead and high computational complexity hinder their use in large-scale systems. This paper implements a parallel naïve Bayes regression model-based collaborative filtering recommendation algorithm on the Hadoop computing platform to address the scalability problem of CF. Firstly, the paper analyses the inherent parallelism of the naïve Bayes regression model and constructs a theoretical model of naïve Bayes parallelisation. Secondly, the algorithm is realised on the Hadoop platform, with the Hadoop distributed file system (HDFS) and MapReduce as the transparent distributed infrastructure, and its space-time overhead and speedup are discussed. Finally, the algorithm is applied to a large dataset. Experimental results on the Netflix dataset show that the method scales well and incurs less space-time overhead, making it suitable for real-time recommendation on large datasets.
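
The abstract does not spell out the map and reduce steps, so the following is only a minimal in-process sketch of how naïve Bayes count statistics for CF might be split into a map phase and a reduce phase. It is not the authors' implementation. The function names (`map_user`, `reduce_counts`, `train`, `predict`), the key layout, the Laplace smoothing constant `alpha`, and the use of the posterior-expected rating as the "regression" output are all assumptions made for illustration. No Hadoop API is used; on a cluster the map step would presumably run once per input split.

```python
import math
from collections import defaultdict
from itertools import permutations

RATINGS = (1, 2, 3, 4, 5)  # assumed rating scale (Netflix uses 1-5 stars)


def map_user(user_ratings):
    """Map step (hypothetical): one user's {item: rating} -> (key, 1) pairs."""
    for item, r in user_ratings.items():
        yield ("prior", item, r), 1
    for (i, ri), (j, rj) in permutations(user_ratings.items(), 2):
        # users who gave item i rating ri and also rated item j at all
        yield ("pair", i, ri, j), 1
        # users who gave item i rating ri and item j rating rj
        yield ("joint", i, ri, j, rj), 1


def reduce_counts(pairs):
    """Reduce step (hypothetical): sum the emitted values per key."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)


def train(users):
    """Run map over every user independently, then reduce all emitted pairs."""
    emitted = (kv for ratings in users for kv in map_user(ratings))
    return reduce_counts(emitted)


def predict(counts, target, user_ratings, alpha=1.0):
    """Posterior-expected rating of `target` under the naive Bayes assumption."""
    log_post = {}
    for r in RATINGS:
        # Unnormalised smoothed prior; the shared normaliser cancels below.
        lp = math.log(counts.get(("prior", target, r), 0) + alpha)
        for j, rj in user_ratings.items():
            if j == target:
                continue
            joint = counts.get(("joint", target, r, j, rj), 0)
            pair = counts.get(("pair", target, r, j), 0)
            # Laplace-smoothed estimate of P(rating_j = rj | rating_target = r)
            lp += math.log((joint + alpha) / (pair + alpha * len(RATINGS)))
        log_post[r] = lp
    top = max(log_post.values())
    weights = {r: math.exp(lp - top) for r, lp in log_post.items()}
    total = sum(weights.values())
    return sum(r * w for r, w in weights.items()) / total


if __name__ == "__main__":
    # Toy data (made up): items A and B always receive the same rating.
    users = [{"A": 5, "B": 5}] * 3 + [{"A": 1, "B": 1}] * 3
    model = train(users)
    print(predict(model, "B", {"A": 5}))
    print(predict(model, "B", {"A": 1}))
```

Because each user's record is mapped independently and the reduce step only sums integers, the counting phase parallelises naturally across users, which is the kind of inherent parallelism the abstract refers to.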

Keywords: parallel naïve Bayes regression model; model-based collaborative filtering; big data; Hadoop; MapReduce.

DOI: 10.1504/IJITM.2019.099818

International Journal of Information Technology and Management, 2019 Vol.18 No.2/3, pp.129 - 142

Received: 22 Jul 2017
Accepted: 05 Nov 2017

Published online: 23 May 2019
