Title: Improvement in protein-coding region identification based on sliding window trigonometric fast transforms using Singular Value Decomposition

Authors: Malaya Kumar Hota, Vinay Kumar Srivastava

Addresses: Department of Electronics and Communication Engineering, Motilal Nehru National Institute of Technology, Allahabad 211004, Uttar Pradesh, India; Department of Electronics and Communication Engineering, Motilal Nehru National Institute of Technology, Allahabad 211004, Uttar Pradesh, India

Abstract: In this paper, the performance of various sliding window trigonometric fast transforms for the identification of protein-coding regions is analysed at the nucleotide level. The Short-Time Discrete Fourier Transform (ST-DFT) is found to give better identification accuracy than the Short-Time Discrete Cosine Transform (ST-DCT), the Short-Time Discrete Sine Transform (ST-DST) and the Short-Time Discrete Hartley Transform (ST-DHT). In the proposed method, the identification accuracy of protein-coding regions is improved by applying Singular Value Decomposition (SVD) to the DNA spectrum obtained using sliding window trigonometric fast transforms. The results show that, with the proposed method, all of the trigonometric fast transforms give nearly the same results in terms of area under the ROC curve for the GENSCAN test set.
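
As a rough illustration of the kind of pipeline the abstract describes, the following Python sketch computes an ST-DFT spectrum of a DNA sequence, reads off the period-3 bin, and applies a truncated SVD to the spectrum. It is not the authors' implementation. The binary indicator (Voss) mapping, the window length of 351, the zeroing of the DC bin, the rank-1 truncation, and all function names are assumptions made for illustration, since the abstract does not specify them. The test sequence is a synthetic toy, not GENSCAN data.

```python
import numpy as np

BASES = "ACGT"

def indicator_sequences(seq):
    """Assumed mapping: one binary indicator sequence per base (4 x L array)."""
    arr = np.frombuffer(seq.upper().encode("ascii"), dtype=np.uint8)
    return np.stack([(arr == ord(b)).astype(float) for b in BASES])

def st_dft_spectrum(seq, window=351, step=1):
    """Sliding-window DFT power, summed over the four indicator sequences.

    Returns an array of shape (num_windows, window // 2 + 1).
    The window length should be a multiple of 3 so that bin window // 3
    corresponds exactly to period 3.
    """
    u = indicator_sequences(seq)
    rows = []
    for start in range(0, u.shape[1] - window + 1, step):
        X = np.fft.rfft(u[:, start:start + window], axis=1)
        power = (np.abs(X) ** 2).sum(axis=0)
        power[0] = 0.0  # assumption: discard the DC bin, which only reflects base counts
        rows.append(power)
    return np.array(rows)

def truncated_svd(spectrum, rank=1):
    """Low-rank approximation of the spectrum from its leading singular triplets."""
    U, s, Vt = np.linalg.svd(spectrum, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

def period3_track(spectrum, window=351):
    """Power at frequency bin window / 3 for every window position."""
    return spectrum[:, window // 3]

# Toy example: random flanks around an exactly period-3 region (hypothetical data).
rng = np.random.default_rng(0)
flank = lambda n: "".join(rng.choice(list(BASES), size=n))
seq = flank(600) + "ATG" * 200 + flank(600)

spec = st_dft_spectrum(seq)
raw = period3_track(spec)
smooth = period3_track(truncated_svd(spec, rank=1))
```

Here `raw` and `smooth` hold one score per window start position; thresholding such a score at different levels is one way to trace out an ROC curve. The same structure would apply to the other transforms by swapping the `np.fft.rfft` call for a DCT, DST or DHT.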

Keywords: protein coding regions; period-3 property; sliding window trigonometric fast transforms; ST-DFT; short-time discrete Fourier transform; ST-DCT; short-time discrete cosine transform; ST-DST; short-time discrete sine transform; ST-DHT; short-time discrete Hartley transform; SVD; singular value decomposition; bioinformatics.

DOI: 10.1504/IJDMB.2011.038580

International Journal of Data Mining and Bioinformatics, 2011 Vol.5 No.1, pp.110 - 127

Published online: 24 Jan 2015
