Authors: Lior Rokach
Addresses: Department of Information System Engineering, Ben-Gurion University of the Negev, Beer Sheva 84104, Israel
Abstract: Data mining methods can be used to discover interesting patterns in manufacturing databases, and these patterns can in turn be used to improve manufacturing processes. However, data accumulated in manufacturing plants usually suffer from the "curse of dimensionality", that is, a relatively small number of records compared to a large number of input features. As a result, conventional data mining methods may be inaccurate in these cases. This paper presents a new feature set decomposition approach that is based on a genetic algorithm. For this purpose, a new encoding schema is proposed and its properties are discussed. Moreover, we examine the effectiveness of using a Vapnik-Chervonenkis dimension bound for evaluating the fitness function of multiple oblivious tree classifiers. The new algorithm was tested on various real-world manufacturing data sets. The results were compared with those of other methods and indicate the superiority of the proposed algorithm.
Keywords: genetic algorithms; data mining; quality engineering; feature set decomposition; manufacturing data.
International Journal of Intelligent Systems Technologies and Applications, 2008 Vol.4 No.1/2, pp.57 - 78
Published online: 22 Dec 2007