Title: Dynamic visual data mining: biological sequence analysis and annotation using SeqVISTA

Authors: Tianhua Niu, Zhenjun Hu

Addresses: Division of Preventive Medicine, Department of Medicine, Brigham and Women's Hospital, 900 Commonwealth Ave. East, Boston, MA 02215, USA; Center for Advanced Genomic Technology, Bioinformatics Program, Boston University, 44 Cummington St., Boston, MA 02215, USA

Abstract: In the post-genomic era, the volume of public sequence databases is increasing exponentially and visualisation-centric techniques have become more and more important in biological sequence analysis and annotation. In this paper, we present a methodology called dynamic visual data mining (DVDM), which combines biological object modelling, interactive display, and data analysis tools into one integrative platform. Using Java Development Kit v1.4, an object-oriented software named SeqVISTA has been developed based on DVDM. To illustrate the application of SeqVISTA, the following examples are shown: regular expression pattern matching; comparative analysis of alternative exon splicing patterns; Fourier analyses; exon prediction (MZEF and GENSCAN). Overall, we argue that DVDM is an important technique for biologists to unveil the information hidden behind the large genomic and proteomic databases, and SeqVISTA provides a versatile tool that integrates multiple computational algorithms for meeting biologists' data mining needs.

Keywords: visualisation; dynamic visual data mining; GenBank; SWISS-PROT; biological sequence analysis; annotation; biological object modelling; interactive display; data analysis; object-oriented software; DNA sequence; protein sequence; computational biology; bioinformatics.

DOI: 10.1504/IJBRA.2005.006900

International Journal of Bioinformatics Research and Applications, 2005 Vol.1 No.1, pp.18 - 30

Published online: 21 Apr 2005
