Title: New cluster ensemble approach to integrative biological data analysis

Authors: Natthakan Iam-On; Tossapon Boongoen; Simon Garrett; Chris Price

Addresses: School of Information Technology, Mae Fah Luang University, 57100, Thailand; Department of Mathematics and Computer Science, Royal Thai Air Force Academy, 10220, Thailand; Aispire Consulting Ltd., Tanyrallt, Aberystwyth, SY23 3PG, UK; Department of Computer Science, Aberystwyth University, SY23 3DB, UK

Abstract: Clinical data have traditionally been the main basis for cancer prognosis. However, this classic approach may be ineffective for analysing morphologically indistinguishable tumour subtypes. Microarray technology has therefore emerged as a promising alternative. Despite a large number of microarray studies, the actual clinical application of gene expression data analysis remains limited owing to the complexity and noise level of the generated data. Recently, integrative cluster analysis of both clinical and gene expression data has been shown to be an effective way to overcome these problems. This paper presents a novel cluster ensemble method for accurately analysing heterogeneous biological data. Evaluation against real biological and benchmark data sets suggests that the quality of the proposed model is higher than that of many state-of-the-art cluster ensemble techniques and standard clustering algorithms.

Keywords: clustering; cluster ensembles; heterogeneous biological data; link analysis; gene expression data; data analysis; cluster analysis; clinical data; bioinformatics; cancer prognosis; tumour subtypes.

DOI: 10.1504/IJDMB.2013.055495

International Journal of Data Mining and Bioinformatics, 2013 Vol.8 No.2, pp.150 - 168

Received: 02 May 2011
Accepted: 02 May 2011

Published online: 20 Oct 2014
