Title: Data mining of unstructured big data in cloud computing

Authors: A.K. Reshmy; D. Paulraj

Addresses: Anna University, Chennai, India; RMD Engineering College, Chennai, India

Abstract: The Hadoop Distributed File System (HDFS), Talend, MapReduce (MR), YARN and the Cloudera model have become popular technologies for large-scale data organisation and analysis. In this work, we identify the requirements of overlapped data organisation and propose an extension to the existing programming model, called Comprehensive Hadoop Distributed File System along with MapReduce (C-HDFS-MR), to address them. The extended interface is presented as an application programming interface (API) and implemented in the context of the image processing application domain. We demonstrate the effectiveness of C-HDFS-MR through case studies of image processing functions, together with their results. Although C-HDFS-MR incurs a small overhead in data storage and I/O operations, it greatly improves system performance and simplifies the application development process. The proposed system works without any changes to the existing models and can be used by many applications that require overlapped data.

Keywords: big data; MapReduce; MR; Hadoop; Comprehensive Hadoop Distributed File System along with MapReduce; C-HDFS-MR; medical image processing, analysis, and visualisation; MIPAV.

DOI: 10.1504/IJBIDM.2018.088430

International Journal of Business Intelligence and Data Mining, 2018 Vol.13 No.1/2/3, pp.147 - 162

Received: 28 Sep 2016
Accepted: 29 Nov 2016

Published online: 03 Nov 2017
