Title: Cysteine and tryptophan anomalies found when scanning all the binding sites in the Protein Data Bank
Authors: Gabor Ivan, Zoltan Szabadka, Vince Grolmusz
Addresses: Protein Information Technology Group, Eotvos University, H-1117 Budapest, Hungary; Uratim Ltd. Ugron 8, Budapest, Hungary (same affiliation for all three authors)
Abstract: The Protein Data Bank (PDB) is one of the richest sources of structural biological information in the world. It began as a computer-readable depository of crystallographic data complementing printed papers. Properly interpreting the content of individual PDB files still requires the detailed information found in the citing publication. An advanced graph theoretical method is presented here for automatically repairing, re-organising and re-structuring PDB data, yielding the identification of all the protein-ligand complexes and all the binding sites in the PDB. As an application, we identified strong cysteine and tryptophan irregularities in the data.
Keywords: PDB; protein data bank; mmCIF; data mining; residue composition; binding sites; bioinformatics; graph theory; protein-ligand complexes; tryptophan irregularities; cysteine irregularities.
DOI: 10.1504/IJBRA.2010.038740
International Journal of Bioinformatics Research and Applications, 2010 Vol.6 No.6, pp.594-608
Received: 12 Feb 2010
Accepted: 23 Jun 2010
Published online: 24 Feb 2011