Title: Multifractal-based cluster hierarchy optimisation algorithm

Authors: Guang-Hui Yan, Li-Song Liu, Lin-Na Du, Xia-Xia Yang, Zhi-Cheng Ma, Xiao-Min Zhang

Addresses: School of Information and Electrical Engineering, Lanzhou Jiaotong University, No. 88 West Anning Road, Lanzhou City, Gansu Province, 730070, P.R. China (Guang-Hui Yan, Li-Song Liu, Lin-Na Du, Xia-Xia Yang); Gansu Electric Power Information & Communication Centre, No. 629 West Xijin Road, Lanzhou City, Gansu Province, 730050, P.R. China (Zhi-Cheng Ma, Xiao-Min Zhang)

Abstract: A cluster is a collection of data objects that are similar to one another and dissimilar to the objects in other clusters. In a real-life data set, however, the large number of initial clusters will themselves be more or less similar to one another. An analyst who knows nothing about these similarities will have difficulty carrying out further analysis. It is therefore valuable to analyse these similarities and construct the hierarchy structure of the initial clusters. Traditional clustering methods are ill-suited to this cluster post-processing problem because they favour spherical clusters, rely on impractical assumptions and require multiple scans of the data set. Based on multifractal theory, we propose the MultiFractal-based Cluster Hierarchy Optimisation (MFCHO) algorithm, which integrates cluster similarity with cluster shape and cluster distribution to construct the cluster hierarchy tree from the disjoint initial clusters. A basic time and space complexity analysis of the MFCHO algorithm is presented. Several comparative experiments using synthetic and real-life data sets show the performance and effectiveness of MFCHO.

Keywords: data mining; cluster hierarchy optimisation; multifractal cluster hierarchy; fractal dimension; cluster similarity; cluster shape; cluster distribution.

DOI: 10.1504/IJBIDM.2008.022734

International Journal of Business Intelligence and Data Mining, 2008 Vol.3 No.4, pp.353 - 374

Available online: 25 Jan 2009
