Title: Distance metric learning and support vector machines for classification of mass spectrometry proteomics data

Authors: Qingzhong Liu, Mengyu Qiao, Andrew H. Sung

Addresses: Department of Computer Science and Engineering, Institute for Complex Additive Systems Analysis, New Mexico Tech, 801 Leroy Place, Socorro, NM 87801, USA (same address for all three authors)

Abstract: Mass spectrometry has become a widely used measurement technique in proteomics research. High feature dimensionality and small datasets are two major limitations that hinder classification accuracy in mass spectrum data analysis; consequently, feature extraction and feature selection are especially important for obtaining good results. The quality of the feature set determines the reliability of the prediction of disease status. A well-known approach is to detect peak values and then apply support vector machine recursive feature elimination (SVMRFE) to choose feature sets for classification. In this paper, we apply distance metric learning to classify proteomics mass spectrometry data. Experimental results show that distance metric learning can be successfully applied to the classification of proteomics data, and that the results are comparable to, or better than, the best results obtained by applying SVM to feature sets chosen with SVMRFE. We also perform feature reduction using manifold learning, and the experimental results indicate that it is promising for this application.
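
For readers unfamiliar with the SVMRFE baseline named in the abstract, the following is a minimal sketch of the general idea: train a linear SVM, drop the feature with the smallest squared weight, and repeat. It is an illustration only and not the authors' implementation. The toy data, the simple subgradient trainer, the function names (`train_linear_svm`, `svm_rfe`) and all hyperparameters are assumptions made for this example.

```python
import random


def train_linear_svm(X, y, epochs=200, lr=0.01, C=1.0):
    """Fit a linear SVM (w, b) by per-sample subgradient descent on the
    L2-regularised hinge loss. Labels in y must be +1 or -1."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            violated = margin < 1
            for j in range(n_features):
                # Regularisation term w[j], plus hinge term if the margin is violated.
                grad = w[j] - (C * yi * xi[j] if violated else 0.0)
                w[j] -= lr * grad
            if violated:
                b += lr * C * yi
    return w, b


def svm_rfe(X, y, n_keep):
    """Recursive feature elimination: return the original indices of the
    n_keep features that survive, removing one feature per round."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        X_sub = [[row[j] for j in active] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        # Eliminate the feature whose squared weight is smallest.
        worst = min(range(len(active)), key=lambda k: w[k] ** 2)
        del active[worst]
    return active


# Hypothetical toy data: 40 samples, 6 features, only feature 2 carries signal.
rng = random.Random(0)
y = [1 if i % 2 == 0 else -1 for i in range(40)]
X = []
for label in y:
    row = [rng.gauss(0.0, 0.5) for _ in range(6)]
    row[2] = 3.0 * label + rng.gauss(0.0, 0.3)  # stand-in for an informative peak
    X.append(row)

selected = svm_rfe(X, y, n_keep=2)
print(selected)
```

In practice a library implementation (for example a linear SVM combined with a recursive feature elimination wrapper) would replace the hand-written trainer; the pure-Python version is used here only to keep the sketch self-contained.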

Keywords: proteomics; mass spectrometry; feature selection; classification; distance metric learning; manifold learning; support vector machines; SVM recursive features; recursive feature elimination; SVMRFE.

DOI: 10.1504/IJKESDP.2009.028815

International Journal of Knowledge Engineering and Soft Data Paradigms, 2009 Vol.1 No.3, pp.216 - 226

Published online: 03 Oct 2009
