Authors: Jan Sladek; Young-Rae Cho
Addresses: Department of Computer Science, Baylor University, Waco, Texas 76798, USA; Department of Computer Science, Baylor University, Waco, Texas 76798, USA
Abstract: Recent systematic approaches to semantic analysis and annotation in bio-ontology databases have advanced the understanding of molecular functions on a genomic scale. Over the last decade, various semantic similarity measures have been proposed to quantify functional similarity between genes using Gene Ontology (GO) and its annotations. However, the major challenges in these applications are the increasing complexity of ontology structures and the inconsistency of annotation data. In this study, we explore term properties in GO, such as term specificity and the term balancing effect, and evaluate the contributions of these properties to semantic similarity measurement. Our experiment is designed to predict positive protein-protein interactions (PPIs) using semantic similarity measures to which these term properties are applied. The experimental results show that prediction accuracy improves when GO terms are weighted by term specificity. The results also show that balancing terms with respect to their specificity is a significant factor in measuring semantic similarity between proteins.
Keywords: gene ontology; semantic similarity; annotations; term specificity; GO; PPIs.
International Journal of Data Mining and Bioinformatics, 2017 Vol.18 No.3, pp.240 - 251
Received: 27 May 2017
Accepted: 13 Jun 2017
Published online: 03 Oct 2017