Title: On the estimation of optimal number of clusters for the induction of fuzzy decision trees

Authors: Swathi Jamjala Narayanan; Ilango Paramasivam; Rajen B. Bhatt

Addresses: School of Computing Science and Engineering, VIT University, Vellore – 632014, India; School of Computing Science and Engineering, VIT University, Vellore – 632014, India; Robert Bosch Research and Technology Center, Pittsburgh, PA 15203, USA

Abstract: Fuzzy decision tree (FDT) induction is a powerful methodology for extracting human-interpretable fuzzy classification rules. To the best of our knowledge, there is no recent comparative study of fuzzy cluster validity indices aimed at estimating the optimal number of clusters for each continuous attribute during FDT induction. In this paper, we study the performance of FDTs built with an optimal number of partitions at each node of the tree. Obtaining the optimal number of fuzzy clusters captures the intrinsic structure of the attribute values when the fuzzy partitions are formed, which in turn improves the classification accuracy of the FDT. Extensive computational experiments are conducted on FDTs developed using Fuzzy ID3 and eight fuzzy cluster validity indices over 30 publicly available pattern classification datasets. Non-parametric statistical tests are conducted to test the null hypothesis.
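
The idea of choosing the number of fuzzy clusters for one continuous attribute with a validity index can be sketched as follows. This is only an illustration under stated assumptions, not the authors' procedure: the abstract does not name the eight indices studied, so Bezdek's partition coefficient is used here as a stand-in, and the function names, the quantile-based initialisation, the candidate range 2 to 5, and the toy data are all hypothetical.

```python
import numpy as np

def fuzzy_c_means(x, c, m=2.0, iters=200, tol=1e-8):
    """Plain fuzzy c-means on a 1-D attribute; returns (centers, memberships)."""
    x = np.asarray(x, dtype=float)
    # Assumption: deterministic start from evenly spaced quantiles of the attribute.
    centers = np.quantile(x, (np.arange(c) + 0.5) / c)
    for _ in range(iters):
        # Distance of every value to every center; the floor avoids division by zero.
        d = np.maximum(np.abs(x[None, :] - centers[:, None]), 1e-12)
        inv = d ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=0)          # membership matrix (c x n), columns sum to 1
        w = u ** m
        new_centers = (w @ x) / w.sum(axis=1)
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:
            break
    return centers, u

def partition_coefficient(u):
    """Bezdek's partition coefficient: mean squared membership, in [1/c, 1]."""
    return float((u ** 2).sum() / u.shape[1])

def estimate_num_clusters(x, candidates=range(2, 6)):
    """Return the candidate c with the highest partition coefficient, plus all scores."""
    scores = {c: partition_coefficient(fuzzy_c_means(x, c)[1]) for c in candidates}
    return max(scores, key=scores.get), scores

# Toy attribute (hypothetical data): three well-separated groups of values.
x = np.concatenate([g + np.linspace(-0.5, 0.5, 30) for g in (0.0, 10.0, 20.0)])
best_c, scores = estimate_num_clusters(x)
print(best_c, scores)
```

In an FDT setting, a routine of this kind would be run on the values of each continuous attribute, and the resulting membership matrix would define the fuzzy partitions for that attribute. A different validity index can be swapped in by replacing `partition_coefficient`.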

Keywords: FDT; fuzzy decision tree; fuzzy ID3; fuzzy c-means; cluster analysis; cluster validity; non-parametric statistical test; optimal clusters; data science.

DOI: 10.1504/IJDS.2017.086255

International Journal of Data Science, 2017 Vol.2 No.3, pp.221 - 245

Received: 09 Jul 2014
Accepted: 16 Nov 2014

Published online: 27 Aug 2017
