Title: Introduction of pruning based on measures of association in the expansion phase of the induction algorithm CART: a case study of pre-university orientation for Sidi Mohamed Ben Abdelah University
Authors: Imane Satauri; Omar El Beqqali
Addresses: Department of Computer Science, Faculty of Sciences Fez, Morocco; Department of Computer Science, Faculty of Sciences Fez, Morocco
Abstract: Determining the right size of the tree is a crucial step in building a decision tree from a large volume of data, because it largely determines how well the tree performs once it is deployed on the population. The goal is to avoid two extremes: under-learning, where a tree that is too small captures the relevant information in the learning data poorly; and over-learning, where a tree that is too large captures specifics of the learning data that do not carry over to the population. In both cases, the resulting prediction model performs worse. This paper presents an indirect pre-pruning approach introduced within the expansion phase of the classification and regression tree (CART) algorithm; it is based on the rules generated from the decision tree and uses validation criteria inspired by the data mining techniques used to discover association rules.
Keywords: over-learning; indirect pre-pruning; CART expansion phase; association rules.
International Journal of Society Systems Science, 2017 Vol.9 No.2, pp.165 - 180
Received: 16 Mar 2016
Accepted: 13 Nov 2016
Published online: 28 Jul 2017