Title: Sequence-based protein superfamily classification using computational intelligence techniques: a review
Authors: Swati Vipsita; Santanu Kumar Rath
Addresses: Department of Computer Science, IIIT Bhubaneswar, Bhubaneswar 751003, Odisha, India; Department of Computer Science, IIIT Bhubaneswar, Bhubaneswar 751003, Odisha, India
Abstract: Protein superfamily classification deals with the problem of predicting the family membership of a newly discovered amino acid sequence. Although previous researchers have already developed many trivial alignment methods, the present trend demands the application of computational intelligence techniques. As biological databases grow exponentially in size, retrieving and inferring essential knowledge in the biological domain becomes a very cumbersome task. Intelligent techniques can handle this problem because of their tolerance for imprecision, uncertainty, approximate reasoning, and partial truth. This paper discusses the various global and local features extracted from full-length protein sequences, which are used for the approximation and generalisation of the classifier. The various parameters used for evaluating the performance of the classifiers are also discussed. This review article can therefore point present researchers towards improvements over the existing methods.
Keywords: bi-gram feature; feature selection; feature extraction; dimensionality reduction; global features; motifs; optimisation; amino acid sequences; kernels; protein sequences; protein superfamily classification; computational intelligence; bioinformatics.
DOI: 10.1504/IJDMB.2015.067957
International Journal of Data Mining and Bioinformatics, 2015 Vol.11 No.4, pp.424 - 457
Received: 05 Apr 2012
Accepted: 20 Apr 2013
Published online: 12 Mar 2015