Title: Poisson approach to clustering analysis of regulatory sequences

Authors: Haiying Wang, Huiru Zheng, Jinglu Hu

Addresses: School of Computing and Mathematics, University of Ulster at Jordanstown, N. Ireland, UK; School of Computing and Mathematics, University of Ulster at Jordanstown, N. Ireland, UK; Waseda University, Japan

Abstract: The presence of similar patterns in regulatory sequences may help in identifying co-regulated genes or inferring regulatory modules. By modelling pattern occurrences in regulatory regions with Poisson statistics, this paper presents a distance measure, based on log likelihood ratio statistics, for calculating pairwise similarities between regulatory sequences. The measure is employed within three clustering algorithms: hierarchical clustering, Self-Organising Map, and a self-adaptive neural network. The results indicate that, in comparison to traditional clustering algorithms, incorporating the log likelihood ratio statistics-based distance into the learning process may offer considerable improvements in classifying genes by their regulatory sequences.
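
The abstract does not give the formula for the distance, so the following is only an illustrative sketch of the general idea, not the paper's definition. It assumes that each sequence is summarised as a vector of pattern counts, that each count is Poisson distributed, and that the distance is the standard two-sample Poisson log likelihood ratio statistic (shared rate versus separate rates) summed over patterns. The function names, the `len_a`/`len_b` parameters, and the toy counts are hypothetical.

```python
import math


def poisson_llr_distance(counts_a, counts_b, len_a=1.0, len_b=1.0):
    """Illustrative distance: two-sample Poisson log likelihood ratio
    statistic, summed over patterns (an assumption, not the paper's formula).

    counts_a, counts_b: occurrence counts of the same patterns in two sequences.
    len_a, len_b: sequence lengths (exposures); hypothetical parameters.
    """
    total = 0.0
    for x_a, x_b in zip(counts_a, counts_b):
        pooled = x_a + x_b
        if pooled == 0:
            continue  # pattern absent from both sequences: contributes nothing
        rate = pooled / (len_a + len_b)  # shared rate under the null hypothesis
        for x, n in ((x_a, len_a), (x_b, len_b)):
            if x > 0:  # x * log(x / expected) tends to 0 as x -> 0
                total += 2.0 * x * math.log(x / (n * rate))
    return total


def pairwise_distances(count_table):
    """Distances for every unordered pair of sequences in a {name: counts} table."""
    names = sorted(count_table)
    return {
        (p, q): poisson_llr_distance(count_table[p], count_table[q])
        for i, p in enumerate(names)
        for q in names[i + 1:]
    }


# Toy pattern counts for three hypothetical regulatory sequences.
toy_counts = {
    "geneA": [4, 0, 1],
    "geneB": [4, 0, 1],
    "geneC": [0, 5, 1],
}
distances = pairwise_distances(toy_counts)
```

In this sketch, sequences with identical pattern counts are at distance zero, and the distance grows as the counts become harder to explain with a single shared rate. A pairwise table of this kind could then be passed to any clustering routine that accepts precomputed distances.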

Keywords: regulatory sequences; log likelihood ratio statistics; Poisson distribution; hierarchical clustering; neural networks; self-organising maps; gene classification; coregulated genes.

DOI: 10.1504/IJCBDD.2008.020206

International Journal of Computational Biology and Drug Design, 2008 Vol.1 No.2, pp.141 - 157

Published online: 08 Sep 2008
