Title: A review: The effects of imperfect data on incremental decision tree

Authors: Hang Yang; Peng Li; Xiaobin Guo; Huajun Chen; Zhiqiang Lin

Addresses: Electric Power Research Institute, China Southern Power Grid, Guangzhou, 510000, China; Electric Power Research Institute, China Southern Power Grid, Guangzhou, 510000, China; Electric Power Research Institute, China Southern Power Grid, Guangzhou, 510000, China; Electric Power Research Institute, China Southern Power Grid, Guangzhou, 510000, China; Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China

Abstract: The decision tree, one of the most widely used methods in data mining, has been applied in many real-world applications. Incremental decision trees handle streaming data, which makes them suitable for big data analysis. However, imperfect data are unavoidable in real-world applications. By studying state-of-the-art incremental decision tree induction based on the Hoeffding bound, we investigated the influence of imperfect data on the decision tree model. We found that imperfect data worsen the performance of decision tree learning, resulting in lower accuracy and higher resource consumption. This paper should serve as a useful reference for future research: the design of a new generation of incremental decision trees should aim to overcome the negative effects of imperfect data.
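
For readers unfamiliar with the Hoeffding bound mentioned in the abstract, the following is a minimal sketch of the standard split test used in Hoeffding-tree-style learners; it is not taken from the paper. The bound is epsilon = sqrt(R^2 ln(1/delta) / (2n)), where R is the range of the split measure, delta is the allowed error probability, and n is the number of samples seen at a leaf. The function names, the default `delta`, and the `tie_threshold` parameter are assumptions chosen for illustration.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon = sqrt(R^2 * ln(1/delta) / (2n)).

    value_range (R): range of the split measure, e.g. log2(num_classes)
                     for information gain.
    delta: allowed probability that the true mean differs from the
           observed mean by more than epsilon.
    n: number of samples observed at the leaf.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be in (0, 1)")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float, n: int,
                 num_classes: int = 2, delta: float = 1e-7,
                 tie_threshold: float = 0.05) -> bool:
    """Illustrative split decision (parameter defaults are assumptions).

    Split when the best attribute beats the runner-up by more than
    epsilon, or when epsilon has shrunk below the tie threshold.
    """
    value_range = math.log2(num_classes)  # range of information gain
    eps = hoeffding_bound(value_range, delta, n)
    return (best_gain - second_gain) > eps or eps < tie_threshold


if __name__ == "__main__":
    # A small gap between the two best attributes needs more samples.
    print(should_split(0.30, 0.25, n=200))   # gap 0.05 vs. epsilon ~0.20
    print(should_split(0.30, 0.05, n=200))   # gap 0.25 vs. epsilon ~0.20
```

Under this test, anything that narrows the observed gap between the two best attributes delays the split until more samples arrive. That is one plausible mechanism by which noisy data could increase resource use, offered here as an illustration rather than as the paper's finding.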

Keywords: incremental decision tree; data mining; data stream mining; classification.

DOI: 10.1504/IJICT.2018.089029

International Journal of Information and Communication Technology, 2018 Vol.12 No.1/2, pp.162 - 174

Received: 20 Dec 2014
Accepted: 26 Jun 2015

Published online: 04 Jan 2018
