Title: Protein interaction detection in sentences via Gaussian Processes: a preliminary evaluation

Authors: Tamara Polajnar, Simon Rogers, Mark Girolami

Addresses: Department of Computing Science, University of Glasgow, Glasgow, G12 8QQ, Scotland; Department of Computing Science, University of Glasgow, Glasgow, G12 8QQ, Scotland; Department of Computing Science, University of Glasgow, Glasgow, G12 8QQ, Scotland

Abstract: The non-parametric, deterministic Support Vector Machines (SVMs) achieve high performance in text classification. This article offers a much-needed evaluation of the Gaussian Process (GP) classifier, a non-parametric probabilistic analogue to SVMs that has rarely been applied to text classification. We provide an extensive experimental comparison of the performance and properties of these competing classifiers on the challenging problem of protein interaction detection in biomedical publications. Our results show that GPs can match the performance of SVMs without the need for costly margin parameter tuning, whilst offering the advantage of an extendable probabilistic framework for text classification.

Keywords: text mining; Gaussian process classifiers; SVM; support vector machines; protein interaction detection; biomedical publications; sentence classification; text classification.

DOI: 10.1504/IJDMB.2011.038577

International Journal of Data Mining and Bioinformatics, 2011 Vol.5 No.1, pp.52-72

Received: 09 Jul 2009
Accepted: 14 Jul 2009

Published online: 24 Jan 2015
