Title: BioHCVKD: A bioinformatics knowledge discovery system for HCV drug discovery – identifying proteins, ligands and active residues, in biological literature

Authors: Rania Ahmed Abdel Azzem Abdel Rahman Abul Seoud

Addresses: Faculty of Engineering, Department of Electrical Engineering, Communication and Electronics Section, El Fayoum University, Fayoum 63514, Egypt

Abstract: Hepatitis C Virus (HCV) causes significant morbidity worldwide. Treatment options are restricted and there is no universal cure, which necessitates the design of novel drugs. Researchers face enormous growth in the literature, with only a very small portion of HCV knowledge accessible in a structured way. This paper proposes BioHCVKD, a system that helps researchers annotate relevant HCV information in order to accelerate HCV drug discovery. BioHCVKD combines dictionary-based filtering with a conditional random field (CRF) based gene mention tagger, and is supported by two modules: the Abstract Insertion module and the Protein Insertion module. BioHCVKD achieves a recall of 73.25%, a precision of 70.5% and an F-score of 71.85%, which improves the performance of the named entity tagger.
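
As a quick consistency check on the reported figures, the sketch below recomputes the F-score from the stated precision and recall. It assumes the abstract's F-score is the balanced F1 measure (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `f_score` is illustrative only.

```python
def f_score(precision, recall):
    """Balanced F-score (F1): harmonic mean of precision and recall.

    Assumption: the abstract's "F-score" is the balanced F1 measure.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract, in percent.
precision = 70.5
recall = 73.25

f1 = f_score(precision, recall)
print(f"F-score: {f1:.2f}%")
```

Under that assumption, the result rounds to the 71.85% given in the abstract.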

Keywords: named entity recognition; text mining; protein normalisation; HCV drug discovery; hepatitis C virus; drug delivery; bioinformatics; knowledge discovery; proteins; ligands; active residues.

DOI: 10.1504/IJBRA.2011.041741

International Journal of Bioinformatics Research and Applications, 2011 Vol.7 No.3, pp.317 - 333

Published online: 02 Aug 2011
