Title: Genetic algorithm-based clustering ensemble: determination number of clusters

Authors: Mehdi Mohammadi, Ali Azadeh, Morteza Saberi, Amir Azaron

Addresses: Department of Computer Engineering, Iran University of Science and Technology, University Road, Hengam Street, Resalat Square, Tehran, Iran. | Department of Industrial Engineering, Department of Engineering Optimization Research, Research Institute of Energy Management and Planning, Center of Excellence for Intelligent Experimental Mechanics, Faculty of Engineering, University of Tehran, P.O. Box 11365-4563, Iran. | Department of Industrial Engineering, University of Tafresh, Tafresh, Iran; Institute for Digital Ecosystems & Business Intelligence, CBS, Curtin University of Technology, GPO Box U1987, Perth, WA 6845, Australia. | Department of Financial Engineering and Engineering Management, School of Science and Engineering, Reykjavik University, Reykjavik, Iceland

Abstract: Genetic algorithms (GAs) have been applied to clustering problems. A clustering ensemble, an accepted clustering approach, combines the results of multiple clustering methods on a given dataset to produce a final clustering of that dataset. In this paper, a genetic algorithm based on clustering ensemble (GACE) is introduced for finding optimal clusters. The most important property of the method is its ability to determine the number of clusters. This removes the need to examine the data beforehand, so solving related problems is not time-consuming. GACE is applied to eight datasets. Experimental results are compared with four other clustering methods: co-association function and average link (CAL), co-association function and K-means (CK), hypergraph-partitioning algorithm (HGPA) and cluster-based similarity partitioning (CSPA). Data envelopment analysis (DEA) is used to compare the methods, and its results indicate that GACE is the best method.
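The co-association function named in the abstract can be illustrated with a small sketch. The code below is not GACE and not the paper's implementation: it assumes the common definition of a co-association matrix (the fraction of base clusterings that place two points in the same cluster), and the helper names, the toy labelings and the 0.5 threshold are hypothetical choices made only for illustration. The consensus step groups points by thresholded connectivity, a simple stand-in for the average-link or K-means step used by CAL and CK.

```python
def co_association(labelings):
    """Return an n x n matrix whose (i, j) entry is the fraction of
    base clusterings that put points i and j in the same cluster."""
    n = len(labelings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        if len(labels) != n:
            raise ValueError("all labelings must cover the same points")
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(labelings)
    return [[c / m for c in row] for row in counts]


def consensus_by_threshold(matrix, threshold=0.5):
    """Assumed consensus rule (not from the paper): link two points when
    their co-association exceeds `threshold`, then label each connected
    group. The number of clusters is not given in advance."""
    n = len(matrix)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Three hypothetical base clusterings of five points; label ids are arbitrary.
base = [
    [0, 0, 1, 1, 1],
    [1, 1, 0, 0, 2],
    [0, 0, 0, 1, 1],
]
C = co_association(base)
final = consensus_by_threshold(C)
print(final)
```

In this toy case the base clusterings disagree on the number of clusters (two or three), and the consensus step settles on two groups without being told how many to find.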

Keywords: genetic algorithms; GAs; clustering ensembles; data envelopment analysis; DEA; co-association function; average link; K-means; hypergraph-partitioning algorithms; HGPA; cluster-based similarity partitioning; CSPA.

DOI: 10.1504/IJBFMI.2010.036004

International Journal of Business Forecasting and Marketing Intelligence, 2010 Vol.1 No.3/4, pp.201 - 216

Published online: 12 Oct 2010
