Title: Feature selection and classification of metabolomics data using artificial bee colony programming (ABCP)

Authors: Celal Öztürk; Mustafa Tarım; Sibel Arslan

Addresses: Department of Computer Engineering, Erciyes University, Melikgazi, 38039 Kayseri, Turkey (all authors)

Abstract: One area of metabolic data analysis involves the detection and discovery of biomarkers used in the early diagnosis of diseases and the development of alternative treatments. Classification and feature selection are frequently used in the statistical analysis of metabolomics data for this purpose. Recently, automatic programming methods have begun to be used instead of conventional methods. In this paper, three conventional classification and feature selection methods, partial least squares discriminant analysis (PLS-DA), random forest (RF) and support vector machine (SVM), and two automatic programming methods, artificial bee colony programming (ABCP) and genetic programming (GP), are applied to classification problems and evaluated on synthetic and real data sets. The feature selection performance of the algorithms in biomarker discovery is compared. The results indicate that automatic programming methods are more successful in classifying metabolic data and that ABCP is superior to GP in biomarker discovery.

Keywords: metabolomics data; biomarker discovery; feature selection; classification; artificial bee colony programming; genetic programming; bioinformatics.

DOI: 10.1504/IJDMB.2020.10029552

International Journal of Data Mining and Bioinformatics, 2020 Vol.23 No.2, pp.101 - 118

Received: 28 Jan 2019
Accepted: 27 Jan 2020

Published online: 22 May 2020
