Title: Bagged gene shaving for the robust clustering of high-throughput data

Authors: Bradley M. Broom, Erik P. Sulman, Kim-Anh Do, Mary E. Edgerton, Kenneth D. Aldape

Addresses: Department of Bioinformatics and Computational Biology, UT MD Anderson Cancer Center, Houston, Texas 77030, USA; Department of Radiation Oncology, UT MD Anderson Cancer Center, Houston, Texas 77030, USA; Department of Biostatistics, UT MD Anderson Cancer Center, Houston, Texas 77030, USA; Department of Pathology, UT MD Anderson Cancer Center, Houston, Texas 77030, USA; Department of Pathology, UT MD Anderson Cancer Center, Houston, Texas 77030, USA

Abstract: The analysis of high-throughput data sets, such as microarray data, often requires that individual variables (genes, for example) be grouped into clusters of variables with highly correlated values across all samples. Gene shaving is an established method for generating such clusters, but is overly sensitive to the input data: changing just one sample can determine whether or not an entire cluster is found. This paper describes a clustering method based on the bootstrap aggregation of gene shaving clusters, which overcomes this and other problems, and applies the new method to a large gene expression microarray dataset from brain tumour samples.
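
Illustrative sketch: the abstract does not spell out the aggregation procedure, so the following Python sketch only illustrates the general idea of bagging a shaving-style clusterer. It is not the authors' implementation. Several pieces are assumptions made for the example: the fixed target cluster size (standard gene shaving selects the size with a gap statistic), the majority-vote threshold of 0.5, the power-iteration estimate of the leading principal component, and all function names (`shave_cluster`, `bagged_cluster`, `make_toy_data`).

```python
import random


def center_rows(matrix):
    """Subtract each gene's mean across samples (rows are genes)."""
    return [[x - sum(row) / len(row) for x in row] for row in matrix]


def leading_component(rows, iterations=30):
    """Estimate the leading principal component over samples by power iteration."""
    n = len(rows[0])
    v = [float(j + 1) for j in range(n)]  # arbitrary non-constant start vector
    for _ in range(iterations):
        scores = [sum(r[j] * v[j] for j in range(n)) for r in rows]
        w = [sum(scores[i] * rows[i][j] for i in range(len(rows))) for j in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:  # degenerate data: keep the previous estimate
            break
        v = [x / norm for x in w]
    return v


def shave_cluster(matrix, size, fraction=0.1):
    """Repeatedly drop the genes least aligned with the leading component.

    Assumption: shaving stops at a fixed `size` rather than using a gap statistic.
    """
    rows = center_rows(matrix)
    active = list(range(len(rows)))
    while len(active) > size:
        pc = leading_component([rows[g] for g in active])
        strength = {g: abs(sum(a * b for a, b in zip(rows[g], pc))) for g in active}
        n_drop = min(max(1, int(fraction * len(active))), len(active) - size)
        active.sort(key=lambda g: strength[g], reverse=True)
        active = active[: len(active) - n_drop]
    return set(active)


def bagged_cluster(matrix, size, replicates=20, threshold=0.5, seed=0):
    """Resample samples with replacement, shave each replicate, keep frequent genes.

    Assumption: a gene is kept if it appears in at least `threshold` of replicates.
    """
    rng = random.Random(seed)
    n_samples = len(matrix[0])
    counts = [0] * len(matrix)
    for _ in range(replicates):
        cols = [rng.randrange(n_samples) for _ in range(n_samples)]
        boot = [[row[j] for j in cols] for row in matrix]
        for g in shave_cluster(boot, size):
            counts[g] += 1
    return {g for g, c in enumerate(counts) if c / replicates >= threshold}


def make_toy_data(n_signal=5, n_noise=15, n_samples=16, seed=1):
    """Synthetic data: the first `n_signal` genes share one strong pattern."""
    rng = random.Random(seed)
    signal = [3.0 if j % 2 == 0 else -3.0 for j in range(n_samples)]
    data = [[s + rng.gauss(0, 0.3) for s in signal] for _ in range(n_signal)]
    data += [[rng.gauss(0, 1.0) for _ in range(n_samples)] for _ in range(n_noise)]
    return data


data = make_toy_data()
print(sorted(bagged_cluster(data, size=5)))
```

Because each replicate resamples the samples, no single sample decides whether a gene enters the final cluster, which is the kind of sensitivity the abstract attributes to plain gene shaving. The synthetic data and parameter values are chosen only to keep the example small.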

Keywords: bootstrap aggregation; clustering; gene shaving clusters; glioblastoma; bioinformatics; microarray data; brain tumour samples; brain tumours.

DOI: 10.1504/IJBRA.2010.035997

International Journal of Bioinformatics Research and Applications, 2010 Vol.6 No.4, pp.326-343

Published online: 11 Oct 2010
