Authors: Leif E. Peterson, Matthew A. Coleman
Addresses: Center for Biostatistics, The Methodist Hospital Research Institute, Houston, TX 77030, USA; Biology and Biotechnology Research Program, Lawrence Livermore National Laboratory, Livermore, CA 94550, USA
Abstract: Random Spherical Linear Oracles (RSLO) are proposed for classifier fusion on DNA microarray gene expression data. RSLO employs random hyperplane splits of samples in the principal component score space defined by the first three principal components (X, Y, Z) of the input feature set. The hyperplane splits assign training (testing) samples to separate logistic regression mini-classifiers, which increases the diversity of voting results since errors are not shared across mini-classifiers. We recommend using RSLO with three to four repetitions of 10-fold cross-validation (CV), randomly re-partitioning the samples before each 10-fold CV, i.e., every ten iterations. This equates to a total of 30-40 iterations.
Keywords: ensemble classifier fusion; random linear oracles; principal directions; hyperplanes; PCs; principal components; microarrays; RSLO; random spherical linear oracles; microarray classification; logistic ensembles; bioinformatics.
International Journal of Data Mining and Bioinformatics, 2009 Vol.3 No.4, pp.382 - 397
Published online: 09 Nov 2009
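
The splitting-and-voting idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the "spherical" hyperplanes are drawn, how many oracles are used, or how the logistic models are fitted. The sketch assumes that the three principal component scores are already computed and mean-centred, that each oracle is a hyperplane through the origin with a normal drawn uniformly from the unit sphere, and that each side of the split gets its own small logistic regression trained by plain gradient descent. All function names (`random_hyperplane`, `fit_rslo`, `predict_rslo`, and so on) and default parameters are hypothetical.

```python
import math
import random


def predict_proba(w, x):
    """Logistic probability of class 1; w holds one weight per feature, bias last."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + w[-1]
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Batch gradient-descent logistic regression (assumed fitting method)."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def random_hyperplane(rng):
    """Unit normal drawn uniformly from the sphere (assumed meaning of 'spherical')."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:
            return [c / norm for c in v]


def side(normal, point):
    """Return 1 if the point lies on the positive side of the hyperplane, else 0."""
    return 1 if sum(n * p for n, p in zip(normal, point)) >= 0.0 else 0


def fit_rslo(scores, labels, n_oracles=25, seed=0):
    """Fit one pair of mini-classifiers per random hyperplane split."""
    rng = random.Random(seed)
    fallback = fit_logistic(scores, labels)  # used only if a side is empty
    ensemble = []
    for _ in range(n_oracles):
        normal = random_hyperplane(rng)
        models = []
        for s in (0, 1):
            idx = [i for i, p in enumerate(scores) if side(normal, p) == s]
            if idx:
                models.append(fit_logistic([scores[i] for i in idx],
                                           [labels[i] for i in idx]))
            else:
                models.append(fallback)
        ensemble.append((normal, models))
    return ensemble


def predict_rslo(ensemble, point):
    """Route the sample through each oracle and take a majority vote."""
    votes = sum(
        1 for normal, models in ensemble
        if predict_proba(models[side(normal, point)], point) >= 0.5
    )
    return 1 if 2 * votes >= len(ensemble) else 0
```

Here `scores` is a list of 3-element lists (the X, Y, Z scores of each sample) and `labels` a list of 0/1 class labels. A test sample is routed to the same side of each hyperplane as it would have been during training, so every oracle contributes exactly one vote.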