Int. J. of Data Science   »   2017 Vol.2, No.3



Title: On the estimation of optimal number of clusters for the induction of fuzzy decision trees


Authors: Swathi Jamjala Narayanan; Ilango Paramasivam; Rajen B. Bhatt


School of Computing Science and Engineering, VIT University, Vellore – 632014, India
School of Computing Science and Engineering, VIT University, Vellore – 632014, India
Robert Bosch Research and Technology Center, Pittsburgh, PA 15203, USA


Abstract: Fuzzy decision tree (FDT) induction is a powerful methodology for extracting human-interpretable fuzzy classification rules. To the best of our knowledge, there is no recent comparative study of fuzzy cluster validity indices aimed at estimating the optimal number of clusters for each continuous attribute during FDT induction. In this paper, we study the performance of the FDT when the optimal number of partitions is used at each node of the tree. By obtaining the optimal number of fuzzy clusters, we capture the intrinsic structure of the attribute values during the formation of fuzzy partitions, which in turn improves the classification accuracy of the FDT. Extensive computational experiments are conducted on FDTs developed using Fuzzy ID3 and eight fuzzy cluster validity indices over 30 publicly available pattern classification datasets. Non-parametric statistical tests are conducted to test the null hypothesis.
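As a rough illustration of the idea in the abstract, the sketch below runs fuzzy c-means on a single continuous attribute for several candidate cluster counts and keeps the count with the best validity score. It is not the authors' procedure: the abstract does not name the eight indices studied, so the Xie-Beni index is used here purely as one well-known example, and the function names (memberships, fuzzy_c_means_1d, xie_beni, estimate_num_clusters), the rank-based initialisation, and the default parameters are all assumptions made for this example.

```python
def memberships(x, centers, m=2.0):
    """Standard FCM membership update; u[i][k] is point k's degree in cluster i."""
    power = 1.0 / (m - 1.0)
    u = [[0.0] * len(x) for _ in centers]
    for k, xk in enumerate(x):
        # Inverse squared distances, floored to avoid division by zero.
        inv = [(1.0 / max((xk - v) ** 2, 1e-12)) ** power for v in centers]
        total = sum(inv)
        for i, w in enumerate(inv):
            u[i][k] = w / total
    return u


def fuzzy_c_means_1d(x, c, m=2.0, max_iter=200, tol=1e-6):
    """Fuzzy c-means on one attribute; returns (centers, memberships)."""
    xs = sorted(x)
    n = len(xs)
    # Assumed deterministic start: initial centers at evenly spaced ranks.
    centers = [xs[int((i + 0.5) * n / c)] for i in range(c)]
    for _ in range(max_iter):
        u = memberships(x, centers, m)
        new_centers = []
        for i in range(c):
            weights = [uik ** m for uik in u[i]]
            new_centers.append(
                sum(w * xk for w, xk in zip(weights, x)) / sum(weights)
            )
        shift = max(abs(a - b) for a, b in zip(new_centers, centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(x, centers, m)


def xie_beni(x, centers, u, m=2.0):
    """Xie-Beni index: fuzzy compactness over separation (lower is better)."""
    compactness = sum(
        (u[i][k] ** m) * (xk - v) ** 2
        for i, v in enumerate(centers)
        for k, xk in enumerate(x)
    )
    separation = min(
        (a - b) ** 2
        for i, a in enumerate(centers)
        for j, b in enumerate(centers)
        if i < j
    )
    return compactness / (len(x) * max(separation, 1e-12))


def estimate_num_clusters(x, c_max=6, m=2.0):
    """Try c = 2..c_max and return (best c, all scores) under Xie-Beni."""
    scores = {}
    for c in range(2, c_max + 1):
        centers, u = fuzzy_c_means_1d(x, c, m)
        scores[c] = xie_beni(x, centers, u, m)
    return min(scores, key=scores.get), scores
```

In an FDT setting, a routine like estimate_num_clusters would presumably be applied to each continuous attribute at a node, and the resulting memberships would serve as that attribute's fuzzy partitions. Swapping in a different validity index only requires replacing xie_beni and, for indices that are maximised, changing min to max.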


Keywords: FDT; fuzzy decision tree; fuzzy ID3; fuzzy c-means; cluster analysis; cluster validity; non-parametric statistical test; optimal clusters; data science.


DOI: 10.1504/IJDS.2017.10007390


Int. J. of Data Science, 2017 Vol.2, No.3, pp.221 - 245


Available online: 27 Aug 2017


