Title: On a solution for the high-dimensionality-small-sample-size regression problem with several different microarrays

Authors: Vladimir Nikulin

Addresses: Department of Mathematical Methods in Economy, Vyatka State University, Kirov, 610000, Russia

Abstract: A common problem in biological experiments is that complete measurements cannot be obtained for all samples. Some microarrays are very informative but too expensive to obtain for every sample. However, we can use publicly available background knowledge about the potential links between the components of different microarrays (also known as genes). Using these links, we express each selected gene in terms of other genes. These secondary genes are then included in the regression models automatically, giving the learning process the right initial direction. The proposed method was tested online during the e-LICO data-mining contest, where it achieved the second-best score.

Keywords: microarrays; regression modelling; LOO; leave-one-out; relevance vector machines; regularisation; random permutations; learning; bioinformatics; secondary genes.

DOI: 10.1504/IJDMB.2014.060049

International Journal of Data Mining and Bioinformatics, 2014, Vol. 9, No. 3, pp. 221-234

Received: 07 Aug 2011
Accepted: 29 Dec 2011

Published online: 21 Oct 2014
