Authors: Haiying Wang, Huiru Zheng, Jinglu Hu
Addresses: School of Computing and Mathematics, University of Ulster at Jordanstown, N. Ireland, UK; School of Computing and Mathematics, University of Ulster at Jordanstown, N. Ireland, UK; Waseda University, Japan
Abstract: The presence of similar patterns in regulatory sequences may aid users in identifying co-regulated genes or inferring regulatory modules. By modelling pattern occurrences in regulatory regions with Poisson statistics, this paper presents a log likelihood ratio statistics-based distance measure to calculate pair-wise similarities between regulatory sequences. We employed it within three clustering algorithms: hierarchical clustering, Self-Organising Map, and a self-adaptive neural network. The results indicate that, in comparison to traditional clustering algorithms, the incorporation of the log likelihood ratio statistics-based distance into the learning process may offer considerable improvements in the process of regulatory sequence-based classification of genes.
Keywords: regulatory sequences; log likelihood ratio statistics; Poisson distribution; hierarchical clustering; neural networks; self-organising maps; gene classification; coregulated genes.
International Journal of Computational Biology and Drug Design, 2008, Vol. 1, No. 2, pp. 141-157
Available online: 08 Sep 2008