Title: Evaluating the contributions of GO term properties to semantic similarity measurement

Authors: Jan Sladek; Young-Rae Cho

Addresses: Department of Computer Science, Baylor University, Waco, Texas 76798, USA; Department of Computer Science, Baylor University, Waco, Texas 76798, USA

Abstract: Recent systematic approaches to semantic analysis and annotation in bio-ontology databases have advanced the understanding of molecular functions on a genomic scale. Over the last decade, various semantic similarity measures have been proposed to quantify functional similarity between genes using Gene Ontology (GO) and its annotations. However, the major challenges in these applications are the increasing complexity of ontology structures and the inconsistency of annotation data. In this study, we explore term properties in GO, such as term specificity and the term balancing effect, and evaluate the contributions of these properties to semantic similarity measurement. Our experiment is designed to predict positive protein-protein interactions (PPIs) using semantic similarity measures to which these term properties are applied. The experimental results show that prediction accuracy improves when GO terms are weighted by term specificity. They also show that balancing terms with respect to their specificity is a significant factor in measuring semantic similarity between proteins.
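The idea of weighting GO terms by specificity can be illustrated with a minimal sketch. The abstract does not give the authors' formulas, so this is not their method: it assumes specificity is measured by information content, IC(t) = -log p(t), and uses an IC-weighted Jaccard overlap as the similarity. The function names, term labels, and annotation counts are all hypothetical.

```python
import math

def information_content(term_counts, total):
    """Assumed specificity measure: IC(t) = -log(count(t) / total).
    Rarely annotated terms get a higher (more specific) weight."""
    return {t: -math.log(c / total) for t, c in term_counts.items()}

def weighted_jaccard(terms_a, terms_b, ic):
    """Overlap of two annotation sets, each term weighted by its IC."""
    shared = sum(ic[t] for t in terms_a & terms_b)
    union = sum(ic[t] for t in terms_a | terms_b)
    return shared / union if union > 0 else 0.0

# Hypothetical annotation counts for three illustrative terms.
counts = {"GO:root": 100, "GO:broad": 50, "GO:specific": 5}
ic = information_content(counts, total=100)

# Two hypothetical proteins that share only the general terms.
p1 = {"GO:root", "GO:broad", "GO:specific"}
p2 = {"GO:root", "GO:broad"}

sim = weighted_jaccard(p1, p2, ic)
unweighted = len(p1 & p2) / len(p1 | p2)
print(f"weighted: {sim:.3f}, unweighted: {unweighted:.3f}")
```

In this toy case the two proteins share only general terms, so the weighted score falls below the plain Jaccard score: the unshared specific term dominates the denominator.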

Keywords: gene ontology; semantic similarity; annotations; term specificity; GO; PPIs.

DOI: 10.1504/IJDMB.2017.087181

International Journal of Data Mining and Bioinformatics, 2017 Vol.18 No.3, pp.240 - 251

Received: 27 May 2017
Accepted: 13 Jun 2017

Published online: 06 Oct 2017
