Title: Cysteine and tryptophan anomalies found when scanning all the binding sites in the Protein Data Bank

Authors: Gabor Ivan, Zoltan Szabadka, Vince Grolmusz

Addresses: Protein Information Technology Group, Eotvos University, H-1117 Budapest, Hungary; Uratim Ltd. Ugron 8, Budapest, Hungary (all three authors)

Abstract: The Protein Data Bank (PDB) is one of the richest sources of structural biological information in the world. It began as a computer-readable depository of crystallographic data complementing printed papers. Properly interpreting the content of the individual files in the PDB still requires the detailed information found in the citing publication. An advanced graph-theoretical method is presented here for automatically repairing, re-organising and re-structuring PDB data, yielding the identification of all the protein-ligand complexes and all the binding sites in the PDB. As an application, we identified strong cysteine and tryptophan irregularities in the data.

Keywords: PDB; protein data bank; mmCIF; data mining; residue composition; binding sites; bioinformatics; graph theory; protein-ligand complexes; tryptophan irregularities; cysteine irregularities.

DOI: 10.1504/IJBRA.2010.038740

International Journal of Bioinformatics Research and Applications, 2010 Vol.6 No.6, pp.594 - 608

Received: 12 Feb 2010
Accepted: 23 Jun 2010

Published online: 24 Feb 2011
