Title: Identification of protein sequence motifs using modified and adaptive fuzzy C-means granular models

Authors: M. Chitralegha; K. Thangavel

Addresses: Department of Computer Science, Periyar University, Salem, India; Department of Computer Science, Periyar University, Salem, India

Abstract: The activities and functions of proteins are determined by protein sequence motifs. Not all protein sequence segments produce potential motif patterns, and the generated sequence segments have no standard labels. Hence, an unsupervised segment selection technique based on Singular Value Decomposition (SVD) entropy is adopted to select significant protein sequence segments. In the proposed work, SVD is combined with two Granular Computing models, Modified Fuzzy C-Means and Adaptive Fuzzy C-Means, to generate protein motif information efficiently. The two proposed models are compared with the Fuzzy C-Means granular computing model. The experimental results show that the Adaptive Fuzzy C-Means granular technique outperforms both Modified Fuzzy C-Means and Fuzzy C-Means. A new evaluation method for sequence motif information, the 'Information-Gain' measure, is adopted.

Keywords: protein sequence motifs; clustering; HSSP-BLOSUM62; adaptive fuzzy C-means; modified fuzzy C-means; SVD; singular value decomposition; protein sequences; granular computing; unsupervised segment selection; bioinformatics.

DOI: 10.1504/IJBRA.2016.080716

International Journal of Bioinformatics Research and Applications, 2016 Vol.12 No.4, pp.281-298

Received: 12 Jul 2015
Accepted: 06 Sep 2015

Published online: 05 Dec 2016
