Title: Condensing position-specific scoring matrices by the Kidera factors for ligand-binding site prediction

Authors: Chun Fang; Tamotsu Noguchi; Hayato Yamana

Addresses: Department of Computer Science and Engineering, Waseda University, Tokyo 169-8555, Japan; Computational Biology Research Center (CBRC), Tokyo 135-0064, Japan ' Computational Biology Research Center (CBRC), Tokyo 135-0064, Japan; Pharmaceutical Education Research Center, Meiji Pharmaceutical University, Tokyo 204-0004, Japan ' Department of Computer Science and Engineering, Waseda University, Tokyo 169-8555, Japan

Abstract: Position-specific scoring matrices (PSSMs) have been widely used for identifying protein functional sites. However, a PSSM is 20-dimensional and contains many redundant features. The Kidera factors were reported to contain information relating to almost all physical properties of amino acids, but they require appropriate weighting coefficients to express those properties. We developed a novel method, named KSPSSMpred, which integrates the PSSM and the Kidera factors into a 10-dimensional matrix (KSPSSM) for ligand-binding site prediction. Flavin adenine dinucleotide (FAD) was chosen as a representative ligand for this study. When compared with five other feature-based methods on a benchmark dataset, KSPSSMpred performed the best. This study demonstrates that KSPSSM is an effective feature extraction method that enriches the PSSM with information relating to 188 physical properties of residues and reduces the feature dimensionality by 50% without losing the information contained in the PSSM.
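
The abstract does not say how the PSSM and the Kidera factors are combined. One plausible reading, shown here only as a minimal sketch, is a matrix product: an L x 20 PSSM (one row per residue position) multiplied by a 20 x 10 table of Kidera factors yields an L x 10 matrix. The function name `condense_pssm`, the column order in `AMINO_ACIDS`, and all numeric values are assumptions made for illustration; random placeholders stand in for both real PSSM scores and the published Kidera factor values.

```python
import random

# Assumed column order for the 20 standard amino acids (hypothetical choice).
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"
N_FACTORS = 10  # the Kidera factors are a set of 10 values per amino acid


def condense_pssm(pssm, factors):
    """Multiply an L x 20 PSSM by a 20 x 10 factor table, giving L x 10.

    This matrix product is an assumed combination rule, not necessarily
    the weighting scheme used by KSPSSMpred.
    """
    if any(len(row) != len(factors) for row in pssm):
        raise ValueError("each PSSM row needs one score per amino acid")
    n_out = len(factors[0])
    return [
        [sum(row[a] * factors[a][k] for a in range(len(factors)))
         for k in range(n_out)]
        for row in pssm
    ]


# Placeholder data only: these are NOT real Kidera factors or PSSM scores.
rng = random.Random(0)
placeholder_factors = [
    [rng.uniform(-2.0, 2.0) for _ in range(N_FACTORS)] for _ in AMINO_ACIDS
]
toy_pssm = [[rng.randint(-5, 9) for _ in AMINO_ACIDS] for _ in range(6)]

kspssm_like = condense_pssm(toy_pssm, placeholder_factors)
print(len(kspssm_like), len(kspssm_like[0]))  # 6 10
```

With a real PSSM (for example, one produced by PSI-BLAST) and the published factor table substituted for the placeholders, each residue would be described by 10 values instead of 20, which matches the 50% reduction stated in the abstract.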

Keywords: Kidera factors; position specific scoring matrix; PSSM; ligand binding sites; site prediction; bioinformatics; protein functional sites; feature extraction.

DOI: 10.1504/IJDMB.2015.068954

International Journal of Data Mining and Bioinformatics, 2015 Vol.12 No.1, pp.70 - 84

Received: 21 Feb 2013
Accepted: 19 Jul 2013

Published online: 22 Apr 2015
