Title: Towards cost-effective and high-performance caching middleware for distributed systems

Authors: Dongfang Zhao; Kan Qiao; Ioan Raicu

Addresses: Department of Computer Science, Illinois Institute of Technology, Chicago, IL 60616, USA (Dongfang Zhao; Kan Qiao; Ioan Raicu); Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439, USA (Ioan Raicu)

Abstract: One performance bottleneck of distributed systems lies in the hard disk drive (HDD), whose single read/write head physically limits its ability to serve concurrent I/O requests. Although the solid-state drive (SSD) has been available for years, HDDs remain the dominant storage medium owing to their large capacity and low cost. This paper proposes a caching middleware that manages the underlying heterogeneous storage devices so that distributed file systems can achieve both high performance and low cost. Specifically, we design and implement a user-level caching system that offers SSD-like performance at a cost similar to that of an HDD. We demonstrate how such a middleware improves the performance of distributed file systems such as HDFS. Experimental results show that the caching system delivers up to 7X higher throughput and 76X higher IOPS than the Linux Ext4 file system, and accelerates HDFS by 28% on 32 nodes.

Keywords: distributed file systems; user level file systems; hybrid file systems; heterogeneous storage; solid-state drives; SSD; cost-effective caching middleware; high-performance caching middleware; distributed systems; hard disk drives; HDD.

DOI: 10.1504/IJBDI.2016.077358

International Journal of Big Data Intelligence, 2016 Vol.3 No.2, pp.92 - 110

Received: 26 Aug 2014
Accepted: 10 Feb 2015

Published online: 29 Jun 2016
