Title: Sequence-based protein superfamily classification using computational intelligence techniques: a review

Authors: Swati Vipsita; Santanu Kumar Rath

Addresses: Department of Computer Science, IIIT Bhubaneswar, Bhubaneswar 751003, Odisha, India; Department of Computer Science, IIIT Bhubaneswar, Bhubaneswar 751003, Odisha, India

Abstract: Protein superfamily classification deals with the problem of predicting the family membership of a newly discovered amino acid sequence. Although previous researchers have already developed many trivial alignment methods, the present trend demands the application of computational intelligence techniques. With the exponential growth in the size of biological databases, retrieving and inferring essential knowledge in the biological domain has become a very cumbersome task. This problem can be easily handled using intelligent techniques because of their tolerance for imprecision, uncertainty, approximate reasoning, and partial truth. This paper discusses the various global and local features extracted from full-length protein sequences, which are used for the approximation and generalisation of the classifier. The various parameters used for evaluating the performance of the classifiers are also discussed. This review article can therefore point current researchers in the right direction for improving on existing methods.

Keywords: bi-gram feature; feature selection; feature extraction; dimensionality reduction; global features; motifs; optimisation; amino acid sequences; kernels; protein sequences; protein superfamily classification; computational intelligence; bioinformatics.

DOI: 10.1504/IJDMB.2015.067957

International Journal of Data Mining and Bioinformatics, 2015 Vol.11 No.4, pp.424 - 457

Received: 05 Apr 2012
Accepted: 20 Apr 2013

Published online: 12 Mar 2015
