Title: A decentralised framework for efficient storage and processing of big data using HDFS and IPFS
Authors: Franklin John; Suji Gopinath; Elizabeth Sherly
Addresses: Indian Institute of Information Technology and Management – Kerala, Technopark, Thiruvananthapuram, Kerala, 695581, India; University of Kerala, Thiruvananthapuram, Kerala, 695581, India; Indian Institute of Information Technology and Management – Kerala, Technopark, Thiruvananthapuram, Kerala, 695581, India
Abstract: The big data revolution has brought greater opportunities as well as challenges. Among the major challenges are capturing, storing, transferring, analysing, processing and updating these large and complex datasets. Traditional data handling techniques cannot manage this fast-growing data. Apache Hadoop is one of the leading technologies for addressing the challenges of big data handling; it provides a distributed data storage model with centralised coordination. The InterPlanetary File System (IPFS) is an emerging technology that provides decentralised distributed storage. By integrating these two technologies, we can create a better framework for the distributed storage and processing of big data. In the proposed work, we formulate a model for big data placement, replication and processing by combining the features of Hadoop and IPFS. The Hadoop distributed file system and IPFS jointly handle the data placement and replication tasks, while Hadoop's MapReduce programming framework handles the data processing task. Experimental results show that the proposed framework achieves cost-effective storage as well as faster processing of big data.
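The two ideas the abstract combines can be illustrated in miniature: IPFS addresses each block by the hash of its contents (so identical blocks deduplicate), and MapReduce aggregates per-record results into a final answer. The sketch below is purely illustrative and is not the authors' implementation; `ContentStore` and `map_reduce_word_count` are hypothetical names, and real IPFS uses multihash-based CIDs rather than plain SHA-256 hex digests.

```python
import hashlib
from collections import Counter

class ContentStore:
    """Toy content-addressed store: each block's key is the SHA-256
    digest of its contents, mimicking IPFS-style addressing."""
    def __init__(self):
        self.blocks = {}

    def put(self, data: bytes) -> str:
        cid = hashlib.sha256(data).hexdigest()
        self.blocks[cid] = data  # identical blocks deduplicate automatically
        return cid

    def get(self, cid: str) -> bytes:
        return self.blocks[cid]

def map_reduce_word_count(store: ContentStore, cids: list) -> dict:
    """Simplified MapReduce: the map phase emits (word, 1) pairs per
    block; the reduce phase sums the counts for each word."""
    counts = Counter()
    for cid in cids:                              # map over stored blocks
        for word in store.get(cid).decode().split():
            counts[word] += 1                     # reduce by summation
    return dict(counts)

store = ContentStore()
cids = [store.put(b"big data"), store.put(b"big storage")]
print(map_reduce_word_count(store, cids))  # {'big': 2, 'data': 1, 'storage': 1}
```

In the actual framework the content-addressed layer is the real IPFS network and the aggregation runs as a Hadoop MapReduce job over HDFS; this toy version only shows how the two roles fit together.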
Keywords: big data management; cloud computing; Hadoop distributed file system; HDFS; InterPlanetary File System; IPFS; erasure coding.
International Journal of Humanitarian Technology, 2020 Vol.1 No.2, pp.131 - 143
Received: 04 Oct 2017
Accepted: 06 Jul 2018
Published online: 18 Jan 2021