Title: Segmentation of short human exons based on spectral features of double curves

Authors: Rong Jiang, Hong Yan

Addresses: School of Electrical and Information Engineering, University of Sydney, NSW 2006, Australia; Department of Computer Engineering and Information Technology, City University of Hong Kong, 83 Tat Chee Ave., Kowloon, Hong Kong

Abstract: This paper presents a new segmentation method based on spectral analysis to locate the borders between short protein coding regions and non-coding regions. We formulate a double curve representation of a DNA sequence and apply a local three-codon measurement to the discrete Fourier spectral features at frequency 1/3 to identify short protein coding regions. The proposed spectral segmentation method requires no prior knowledge of the DNA data. Our simulation results show that the proposed method greatly improves the accuracy of identifying short coding regions in DNA sequences compared with other methods that analyse DNA sequences directly.
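
To make the "spectral features at frequency 1/3" concrete, the sketch below computes the widely used period-3 Fourier power of a DNA window from four binary base-indicator sequences. This is not the authors' double curve representation, which the abstract does not define; it is the direct-sequence style of analysis that the abstract compares against. The function names, the 9-base window (one reading of "three-codon"), and the step of 3 are assumptions made for illustration only.

```python
import cmath


def period3_power(seq):
    """Sum of |DFT coefficient at frequency 1/3|^2 over the A, C, G, T
    indicator sequences of `seq` (assumed illustrative measure)."""
    total = 0.0
    for base in "ACGT":
        # DFT of the 0/1 indicator sequence for `base`, evaluated at f = 1/3
        coeff = sum(
            cmath.exp(-2j * cmath.pi * k / 3)
            for k, b in enumerate(seq)
            if b == base
        )
        total += abs(coeff) ** 2
    return total


def sliding_period3(seq, window=9, step=3):
    """Return (start, power) pairs for windows slid along `seq`.
    window=9 corresponds to three codons; both defaults are assumptions."""
    return [
        (start, period3_power(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    for start, power in sliding_period3("ATGATGATGAAAAAAAAA"):
        print(start, round(power, 3))
```

A window with strong codon periodicity (for example, a repeated triplet) yields a high value, while a homopolymer run yields a value near zero. A segmentation method would then look for positions where this local measure changes sharply.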

Keywords: double curves; DNA sequence analysis; Fourier spectrum; triplets; gene identification; data mining; bioinformatics; short human exons; spectral analysis; segmentation; simulation; short protein coding regions.

DOI: 10.1504/IJDMB.2008.016754

International Journal of Data Mining and Bioinformatics, 2008 Vol.2 No.1, pp.15 - 35

Published online: 21 Jan 2008
