Title: EDA-USL: unsupervised clustering algorithm based on estimation of distribution algorithm

Authors: Jiancong Fan; Yongquan Liang; Qiang Xu; Ruisheng Jia; Zhihua Cui

Addresses: College of Information Science and Engineering, Shandong University of Science and Technology, Qingdao City, China; College of Information Science and Engineering, Shandong University of Science and Technology, Qingdao City, China; College of Information Science and Engineering, Shandong University of Science and Technology, Qingdao City, China; College of Information Science and Engineering, Shandong University of Science and Technology, Qingdao City, China; Complex System and Computational Intelligence Laboratory, Taiyuan University of Science and Technology, Taiyuan City, China

Abstract: Clustering analysis is primarily concerned with the classification of data points into different clusters. Estimation of distribution algorithms (EDAs) use machine learning techniques to solve optimisation problems by learning the locations of the more promising regions of the search space. In EDAs, a population is approximated by a probability distribution, and new candidate solutions are obtained by sampling from this distribution instead of combining and modifying existing solutions stochastically. An unsupervised clustering algorithm based on estimation of distribution (EDA-USL) is designed to analyse datasets without labels. EDA-USL randomly selects a few data points as individuals to construct the initial population. The probability distribution of the population is computed to estimate the distribution of the dataset. The optimal individuals in the population are selected by a purpose-designed fitness function. New individuals, which are combined with the optimal ones to form the next generation, are then selected according to the classification patterns of the optimal individuals. EDA-USL is validated and analysed on benchmark datasets. The experimental results show that EDA-USL has high stability and performs well in classification accuracy.
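
The general EDA loop described in the abstract (select the best individuals, estimate a probability distribution from them, sample new candidates) can be illustrated with a minimal sketch. This is not the EDA-USL algorithm itself: the abstract does not give its fitness function or its pattern-based selection rule, so the sketch assumes a generic stand-in in which each individual encodes k candidate cluster centres, fitness is the negative sum of squared distances to the nearest centre, and the distribution is an independent Gaussian per coordinate. The names `sse` and `eda_cluster` and all parameter values are hypothetical.

```python
import random
import statistics


def sse(centroids, data):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(
        min(sum((x - c) ** 2 for x, c in zip(point, cen)) for cen in centroids)
        for point in data
    )


def eda_cluster(data, k, pop_size=30, elite=10, generations=40, seed=0):
    """Generic Gaussian EDA over cluster centres (illustrative, not EDA-USL)."""
    rng = random.Random(seed)
    dim = len(data[0])

    def decode(ind):
        # Split the flat vector back into k centroids of `dim` coordinates.
        return [ind[i * dim:(i + 1) * dim] for i in range(k)]

    def fitness(ind):
        # Assumed fitness: higher is better, so negate the clustering error.
        return -sse(decode(ind), data)

    # Initial population: each individual is k randomly chosen data points,
    # flattened into one vector of k * dim coordinates.
    population = [
        [x for point in rng.sample(data, k) for x in point]
        for _ in range(pop_size)
    ]

    for _ in range(generations):
        # Select the best individuals of the current population.
        population.sort(key=fitness, reverse=True)
        best = population[:elite]
        # Estimate an independent Gaussian per coordinate from the elite.
        means = [statistics.fmean(col) for col in zip(*best)]
        stds = [statistics.pstdev(col) + 1e-6 for col in zip(*best)]
        # Sample new individuals and combine them with the retained elite.
        sampled = [
            [rng.gauss(m, s) for m, s in zip(means, stds)]
            for _ in range(pop_size - elite)
        ]
        population = best + sampled

    return decode(max(population, key=fitness))
```

Because the elite individuals are carried over unchanged, the best fitness in the population never decreases from one generation to the next. Calling `eda_cluster(points, 2)` on a list of coordinate tuples returns two candidate centres; the per-coordinate Gaussian model is a deliberate simplification and ignores the ordering ambiguity between centres.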

Keywords: estimation of distribution; evolutionary computation; unsupervised learning; clustering analysis; machine learning; optimisation; stability; classification accuracy.

DOI: 10.1504/IJWMC.2011.044111

International Journal of Wireless and Mobile Computing, 2011 Vol.5 No.1, pp.88 - 97

Available online: 09 Dec 2011
