Int. J. of Big Data Intelligence   »   2016 Vol.3, No.2

 

 

Title: Towards cost-effective and high-performance caching middleware for distributed systems

 

Authors: Dongfang Zhao; Kan Qiao; Ioan Raicu

 

Addresses:
Department of Computer Science, Illinois Institute of Technology, Chicago, IL 60616, USA
Department of Computer Science, Illinois Institute of Technology, Chicago, IL 60616, USA
Department of Computer Science, Illinois Institute of Technology, Chicago, IL 60616, USA; Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439, USA

 

Abstract: One performance bottleneck of distributed systems lies in the hard disk drive (HDD), whose single read/write head physically limits its ability to serve concurrent I/O requests. Although solid-state drives (SSDs) have been available for years, HDDs remain the dominant storage medium because of their large capacity and low cost. This paper proposes a caching middleware that manages the underlying heterogeneous storage devices so that distributed file systems can achieve both high performance and low cost. Specifically, we design and implement a user-level caching system that offers SSD-like performance at a cost similar to that of an HDD. We demonstrate how such middleware improves the performance of distributed file systems such as HDFS. Experimental results show that the caching system delivers up to 7X higher throughput and 76X higher IOPS than the Linux Ext4 file system, and accelerates HDFS by 28% on 32 nodes.

 

Keywords: distributed file systems; user level file systems; hybrid file systems; heterogeneous storage; solid-state drives; SSD; cost-effective caching middleware; high-performance caching middleware; distributed systems; hard disk drives; HDD.

 

DOI: 10.1504/IJBDI.2016.077358

 

Int. J. of Big Data Intelligence, 2016 Vol.3, No.2, pp.92 - 110

 

Available online: 28 Jun 2016

 

 
