Title: Identification of protein sequence motifs using modified and adaptive fuzzy C-means granular models
Authors: M. Chitralegha; K. Thangavel
Addresses: Department of Computer Science, Periyar University, Salem, India; Department of Computer Science, Periyar University, Salem, India
Abstract: Protein sequence motifs determine the activities and functions of proteins. Not all protein sequence segments produce potential motif patterns, and the generated sequence segments carry no standard labels. An unsupervised segment selection technique based on Singular Value Decomposition (SVD) entropy is therefore adopted to select significant protein sequence segments. In the proposed work, SVD is combined with two Granular Computing models, Modified Fuzzy C-Means and Adaptive Fuzzy C-Means, to generate protein motif information efficiently. The two proposed models are compared with the Fuzzy C-Means granular computing model. The experimental results show that the Adaptive Fuzzy C-Means granular technique outperforms both Modified Fuzzy C-Means and Fuzzy C-Means. A new evaluation measure for sequence motif information, 'Information-Gain', is also adopted.
Keywords: protein sequence motifs; clustering; HSSP-BLOSUM62; adaptive fuzzy C-means; modified fuzzy C-means; SVD; singular value decomposition; protein sequences; granular computing; unsupervised segment selection; bioinformatics.
International Journal of Bioinformatics Research and Applications, 2016 Vol.12 No.4, pp.281 - 298
Received: 12 Jul 2015
Accepted: 06 Sep 2015
Published online: 03 Dec 2016
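
Illustrative note: the abstract uses a standard Fuzzy C-Means model as the baseline for comparison. The abstract does not describe the Modified or Adaptive variants, so the sketch below shows only textbook Fuzzy C-Means, not the authors' method. The function name `fuzzy_c_means`, its parameters (`m`, `n_iter`, `tol`, `seed`), and the toy data in the usage note are assumptions made for illustration.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook fuzzy C-means on a list of equal-length numeric vectors.

    Hypothetical helper for illustration; returns (centers, memberships).
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships; each row is normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-12 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for j in range(n_clusters):
            w = [u[i][j] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum
                 for d in range(dim)]
            )

        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            dist = [
                max(sum((points[i][d] - c[d]) ** 2 for d in range(dim)) ** 0.5,
                    1e-12)  # floor avoids division by zero
                for c in centers
            ]
            new_row = [
                1.0 / sum((dist[j] / dist[k]) ** (2.0 / (m - 1.0))
                          for k in range(n_clusters))
                for j in range(n_clusters)
            ]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row

        # Stop once no membership changed by more than tol.
        if shift < tol:
            break

    return centers, u
```

A call such as `fuzzy_c_means([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]], 2)` returns the cluster centers and a membership row per point. Each row sums to 1, which is what separates fuzzy clustering from hard assignment.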