Title: The family based variability in protein family expansion

Authors: Anasua Sarkar; Macha Nikolski; Pascal Durrens

Addresses: CNRS/LaBRI, Universite Bordeaux 1, 351 cours de la Liberation, 33405 Talence Cedex, France (all authors)

Abstract: In this paper we propose an automatic protein family expansion approach for recruiting new members from among the protein-coding genes of newly sequenced genomes. The criteria for adding a new member to a family depend on the structure of each individual family rather than being globally uniform. Detecting a threshold in the ROC space of all sorted iterative profile sets defines the alignment selection criteria for each family. Furthermore, statistical estimation of the most frequent optimal sorting criteria yields the optimal filtering strategy in a learning-parameter set for profile-based homology search.

Keywords: protein family expansion; sequence profiles; protein specific scoring matrix; remote homologues; ROC analysis; alignment selection criteria; optimal filtering strategy; proteins; protein families; bioinformatics; gene sequences.

DOI: 10.1504/IJBRA.2013.052473

International Journal of Bioinformatics Research and Applications, 2013 Vol.9 No.2, pp.121 - 133

Published online: 06 Sep 2014
