Title: HDFS-based parallel and scalable pattern mining using clouds for incremental data

Authors: S. Sountharrajan; E. Suganya; N. Aravindhraj; S. Sankarananth; C. Rajan

Addresses: Department of Computer Science and Engineering, Bannari Amman Institute of Technology, Sathyamangalam, India; Department of Computer Science and Engineering, Bannari Amman Institute of Technology, Sathyamangalam, India; Department of Computer Science and Engineering, Bannari Amman Institute of Technology, Sathyamangalam, India; Department of Electrical and Electronics Engineering, Excel College of Engineering and Technology, Namakkal, Tamil Nadu, India; Department of Computer Science and Engineering, KS Rangasamy College of Technology, Tiruchengode, Tamil Nadu, India

Abstract: Increased internet usage has led to the migration of large amounts of data to cloud environments, which use Hadoop and the MapReduce framework to manage various mining applications in a distributed setting. Earlier research in distributed mining focused on solving complex problems using distributed computational techniques and new algorithmic designs. But as the nature of the data and user requirements become more complex and demanding, the existing distributed algorithms fail in multiple respects. In our work, a new distributed frequent pattern algorithm, named Hadoop-based parallel frequent pattern mining (HPFP), is proposed to utilise clusters efficiently and mine repeated patterns from large databases effectively. The empirical evaluation shows that the HPFP algorithm improves the performance of the mining operation by increasing the level of parallelism and execution efficiency. HPFP achieves complete parallelism and delivers better performance in HDFS than existing distributed pattern mining algorithms.

Keywords: cloud computing; Hadoop distributed file system; HDFS; map reduce; association rules; frequent pattern growth algorithm; distributed mining; parallel pattern mining.

DOI: 10.1504/IJCAET.2020.108102

International Journal of Computer Aided Engineering and Technology, 2020 Vol.13 No.1/2, pp.28 - 45

Received: 14 Aug 2017
Accepted: 31 Oct 2017

Published online: 03 Jul 2020
