Title: An empirical study of alternating least squares collaborative filtering recommendation for Movielens on Apache Hadoop and Spark

Authors: Jung-Bin Li; Szu-Yin Lin; Yu-Hsiang Hsu; Ying-Chu Huang

Addresses: Department of Statistics and Information Science, Fu Jen Catholic University, New Taipei City, Taiwan; Department of Computer Science and Information Engineering, National Ilan University, Yilan County, Taiwan; Atrust Fintech Corp., Taipei, Taiwan; Smart Channel & Logistics Department, Industrial Technology Research Institute, New Taipei City, Taiwan

Abstract: In recent years, both consumers and businesses have faced an explosion of information, for which recommendation systems offer a possible solution. This study implements a movie recommendation system that provides recommendations to consumers in an effort to increase consumer spending while reducing the time spent on film selection. The system is a prototype collaborative filtering recommender based on the Alternating Least Squares (ALS) algorithm. An advantage of collaborative filtering is that it avoids possible violations of the Personal Data Protection Act and reduces the possibility of errors due to poor-quality personal data. Our research addresses the limited scalability of ALS by using a platform that combines Spark with Hadoop YARN, so that computing movie recommendations and storing data are handled separately. The results show that the proposed system architecture provides recommendations with satisfactory accuracy while maintaining acceptable computational time with limited resources.
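For readers unfamiliar with ALS, the following is a minimal single-machine sketch of the alternating update the abstract refers to. It is not the authors' Spark/Hadoop implementation: it assumes plain L2 regularisation, and the rating triples, the function names (`solve`, `update`, `als`, `objective`, `predict`) and the parameter values are all hypothetical and chosen only for illustration.

```python
import random

def solve(a, b):
    """Solve the small linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        # Partial pivoting: move the largest entry of column c to the diagonal.
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / m[i][i]
    return x

def update(fixed, observed, rank, reg):
    """Recompute one side's factors by ridge regression, other side fixed."""
    out = []
    for obs in observed:  # obs: list of (index on the fixed side, rating)
        a = [[reg if p == q else 0.0 for q in range(rank)] for p in range(rank)]
        b = [0.0] * rank
        for j, r in obs:
            f = fixed[j]
            for p in range(rank):
                b[p] += r * f[p]
                for q in range(rank):
                    a[p][q] += f[p] * f[q]
        out.append(solve(a, b))
    return out

def als(ratings, n_users, n_items, rank=2, reg=0.1, iters=20, seed=0):
    """Alternate between solving for user factors and item factors."""
    rng = random.Random(seed)
    users = [[rng.random() for _ in range(rank)] for _ in range(n_users)]
    items = [[rng.random() for _ in range(rank)] for _ in range(n_items)]
    by_user = [[] for _ in range(n_users)]
    by_item = [[] for _ in range(n_items)]
    for u, i, r in ratings:
        by_user[u].append((i, r))
        by_item[i].append((u, r))
    for _ in range(iters):
        users = update(items, by_user, rank, reg)
        items = update(users, by_item, rank, reg)
    return users, items

def predict(users, items, u, i):
    """Predicted rating: dot product of user and item factor vectors."""
    return sum(x * y for x, y in zip(users[u], items[i]))

def objective(ratings, users, items, reg):
    """Squared error on observed ratings plus the L2 penalty on all factors."""
    err = sum((r - predict(users, items, u, i)) ** 2 for u, i, r in ratings)
    pen = sum(x * x for row in users + items for x in row)
    return err + reg * pen

# Hypothetical toy data: (user, item, rating) triples on a 1-5 scale.
ratings = [(0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 2, 1), (2, 1, 5),
           (2, 3, 2), (3, 2, 2), (3, 3, 1), (0, 3, 1), (1, 1, 5)]

if __name__ == "__main__":
    users, items = als(ratings, n_users=4, n_items=4)
    print("objective:", objective(ratings, users, items, 0.1))
    print("predicted rating for user 3, item 0:", predict(users, items, 3, 0))
```

Each half-step solves a small regularised least-squares problem per user or per item, and these per-row solves are independent of one another, which is what makes the method a natural fit for a distributed engine such as Spark.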

Keywords: recommendation system; alternating least squares; collaborative filtering; Movielens; Hadoop; Spark; content-based filtering.

DOI: 10.1504/IJGUC.2020.110053

International Journal of Grid and Utility Computing, 2020 Vol.11 No.5, pp.674 - 682

Received: 12 Apr 2019
Accepted: 12 Aug 2019

Published online: 02 Oct 2020
