Title: Probabilistic models for biological sequences: selection and Maximum Likelihood estimation

Authors: Svetlana Ekisheva, Mark Borodovsky

Addresses: School of Biology, Georgia Institute of Technology, Atlanta, GA 30332-0230, USA; Department of Biomedical Engineering, School of Biology, Georgia Institute of Technology, Atlanta, GA 30332-0230, USA

Abstract: Probabilistic models of biological sequences (DNA and proteins) are frequently used in bioinformatics. We describe statistical tests designed to detect the order of dependency among the elements of a sequence and to select the most appropriate probabilistic model for an experimental biological sequence. For a model of a given type (the independence model, the first-order Markov chain, or the hidden Markov model (HMM)), we derive a uniform lower bound on the rate of decay of the errors of the maximum likelihood (ML) estimates of the model parameters and, subsequently, uniform confidence intervals for the parameters.
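
As an illustration of the kind of model selection the abstract refers to, the sketch below computes ML estimates for the independence model and the first-order Markov chain on a DNA sequence, and then the classical likelihood ratio (G) statistic for independence against first-order dependence. This is a textbook test, not necessarily the tests or bounds derived in the article. The function names, the four-letter alphabet, and the assumption that the sequence contains only the letters A, C, G, T are choices made for this example.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # assumption: the sequence contains only these letters


def ml_independence(seq):
    """ML letter probabilities under the independence model (relative frequencies)."""
    counts = Counter(seq)
    return {a: counts[a] / len(seq) for a in ALPHABET}


def ml_first_order(seq):
    """ML transition probabilities of a first-order Markov chain (row-normalized pair counts)."""
    pair_counts = Counter(zip(seq, seq[1:]))
    trans = {}
    for a in ALPHABET:
        row_total = sum(pair_counts[(a, b)] for b in ALPHABET)
        trans[a] = {
            b: (pair_counts[(a, b)] / row_total if row_total else 0.0)
            for b in ALPHABET
        }
    return trans


def lr_statistic(seq):
    """G statistic: 2 * sum n_ab * log(p(b|a) / p(b)) over observed adjacent pairs."""
    pair_counts = Counter(zip(seq, seq[1:]))
    n_pairs = len(seq) - 1
    second_counts = Counter(seq[1:])  # marginal counts of the second letter of each pair
    trans = ml_first_order(seq)
    stat = 0.0
    for (a, b), n_ab in pair_counts.items():
        stat += n_ab * math.log(trans[a][b] / (second_counts[b] / n_pairs))
    return 2.0 * stat


if __name__ == "__main__":
    seq = "ACGT" * 50  # toy, strongly dependent sequence
    print(ml_independence(seq))
    print(ml_first_order(seq)["A"])
    print(lr_statistic(seq))
```

Under the independence hypothesis the statistic is asymptotically chi-square distributed with (4 - 1)^2 = 9 degrees of freedom, so a value above roughly 16.92 rejects independence at the 5% level. The same comparison can be repeated between orders k and k + 1 to estimate the order of dependency.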

Keywords: statistical models; biological sequences; parameter estimation; maximum likelihood; asymptotic properties; bioinformatics.

DOI: 10.1504/IJBRA.2006.010607

International Journal of Bioinformatics Research and Applications, 2006 Vol.2 No.3, pp.305 - 324

Published online: 07 Aug 2006
