Title: Subtree selection in kernels for graph classification

Authors: Mehmet Tan; Faruk Polat; Reda Alhajj

Addresses: Department of Computer Engineering, TOBB University of Economics and Technology, Ankara, Turkey; Department of Computer Engineering, Middle East Technical University, Ankara, Turkey; Department of Computer Science, University of Calgary, Calgary, AB, Canada

Abstract: Classification of structured data is essential for a wide range of problems in bioinformatics and cheminformatics. One such problem is in silico prediction of small molecule properties such as toxicity, mutagenicity and activity. In this paper, we propose a new feature selection method for graph kernels that use the subtrees of graphs as their feature sets. For this purpose, we propose a masking procedure that amounts to feature selection. We present experiments on several data sets, together with a comparison of our method with some frequent-subgraph-based approaches.

Keywords: feature selection; graph kernels; bioinformatics; cheminformatics; subtree selection; graph classification.

DOI: 10.1504/IJDMB.2013.056080

International Journal of Data Mining and Bioinformatics, 2013 Vol.8 No.3, pp.294-310

Received: 03 May 2011
Accepted: 04 May 2011

Published online: 20 Oct 2014
