Title: MapReduce-based fuzzy very fast decision tree for constructing prediction intervals

Authors: Ojha Manish Kumar; Kumar Ravi; Vadlamani Ravi

Addresses: Center of Excellence in Analytics, Institute for Development and Research in Banking Technology, Castle Hills Road No. 1, Masab Tank, Hyderabad-500057, India; School of Computer and Information Sciences, University of Hyderabad, Hyderabad-500046, India ' Center of Excellence in Analytics, Institute for Development and Research in Banking Technology, Castle Hills Road No. 1, Masab Tank, Hyderabad-500057, India; School of Computer and Information Sciences, University of Hyderabad, Hyderabad-500046, India ' Center of Excellence in Analytics, Institute for Development and Research in Banking Technology, Castle Hills Road No. 1, Masab Tank, Hyderabad-500057, India

Abstract: We propose a fuzzy version of the very fast decision tree (VFDT) for constructing prediction intervals and compare these intervals with those generated by the traditional VFDT. The proposed fuzzy VFDT captures the intrinsic features of VFDT as well as the uncertainties present in the data. Both VFDT and fuzzy VFDT were trained using the lower upper bound estimation (LUBE) method in order to generate prediction intervals. We implemented VFDT, and developed and implemented fuzzy VFDT, using the Apache Hadoop MapReduce framework, where multiple slave nodes build the VFDT and fuzzy VFDT models. The developed models were tested on six datasets taken from the web. We conducted a sensitivity analysis by studying the influence of the window size of the data stream and the number of bins used in discretisation on the final results. The results demonstrate that the proposed MapReduce-based fuzzy VFDT and VFDT can construct high-quality prediction intervals precisely and quickly.

Keywords: very fast decision tree; VFDT; fuzzy VFDT; MapReduce; prediction interval; big data; Hadoop; stream data.

DOI: 10.1504/IJBDI.2019.100894

International Journal of Big Data Intelligence, 2019 Vol.6 No.3/4, pp.234 - 247

Received: 15 Feb 2018
Accepted: 22 Jun 2018

Published online: 04 Jun 2019
