Perturbation and candidate analysis to combat overfitting of gene expression microarray data
by Ravi Mathur; J. David Schaffer; Walker H. Land Jr.; John J. Heine; Jonathan M. Hernandez; Timothy Yeatman
International Journal of Computational Biology and Drug Design (IJCBDD), Vol. 4, No. 4, 2011

Abstract: Analysis of gene expression microarray datasets carries a high risk of over-fitting (detecting spurious patterns) because the datasets are feature-rich but case-poor. This paper describes our ongoing efforts to develop a method that combats over-fitting and identifies the strongest signal in the dataset. A GA-SVM hybrid, together with Gaussian noise (manual noise gain), is used to discover feature sets of minimal size that accurately classify the cases under cross-validation. Initial results on a colorectal cancer dataset show that the strongest signal (a modest number of candidates) can be found by a binary search.
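The perturbation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' GA-SVM implementation: the classifier is a nearest-centroid stand-in rather than an SVM, there is no genetic algorithm or binary search, and every name here (`add_gaussian_noise`, `loo_accuracy`, `accuracy_under_noise`, `X_toy`, `y_toy`) and the toy data are hypothetical. It only shows how a candidate feature subset might be scored under cross-validation after Gaussian noise with a chosen gain is added to the expression values.

```python
import random


def add_gaussian_noise(X, gain, rng):
    """Return a copy of X with zero-mean Gaussian noise (std = gain) added to every value."""
    return [[v + rng.gauss(0.0, gain) for v in row] for row in X]


def nearest_centroid_predict(train_X, train_y, x, features):
    """Stand-in classifier (NOT an SVM): choose the class whose centroid is closest to x."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, t in zip(train_X, train_y) if t == label]
        dist = 0.0
        for f in features:
            centroid = sum(r[f] for r in rows) / len(rows)
            dist += (x[f] - centroid) ** 2
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_accuracy(X, y, features):
    """Leave-one-out cross-validated accuracy using only the given feature indices."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, X[i], features) == y[i]:
            hits += 1
    return hits / len(X)


def accuracy_under_noise(X, y, features, gain, repeats=20, seed=0):
    """Average cross-validated accuracy over several noisy copies of the data."""
    rng = random.Random(seed)
    scores = [
        loo_accuracy(add_gaussian_noise(X, gain, rng), y, features)
        for _ in range(repeats)
    ]
    return sum(scores) / len(scores)


# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
X_toy = [
    [0.0, 1.0], [0.1, 2.0], [0.2, 1.0], [0.3, 2.0],
    [5.0, 1.0], [5.1, 2.0], [5.2, 1.0], [5.3, 2.0],
]
y_toy = [0, 0, 0, 0, 1, 1, 1, 1]

if __name__ == "__main__":
    for gain in (0.0, 1.0, 4.0):
        print(gain, accuracy_under_noise(X_toy, y_toy, [0], gain))
```

In this sketch, a feature subset whose accuracy holds up as the noise gain increases would be treated as carrying a stronger signal than one whose accuracy collapses under small perturbations. How the paper itself sets the gain and selects candidates is not specified in the abstract.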

Online publication date: Sat, 24-Jan-2015
