Using naïve Bayesian classification as a meta-predictor to improve start codon prediction accuracy in prokaryotic organisms
Online publication date: Tue, 29-Jul-2014
by Sean Landman; Imad Rahal
International Journal of Data Mining, Modelling and Management (IJDMMM), Vol. 5, No. 3, 2013
Abstract: Modern gene location prediction techniques are able to achieve near-perfect accuracy for prokaryotic organisms, but this reported accuracy is generally only for the stop codon locations. Accurate prediction of the start codon locations is more difficult to attain, and different approaches often produce conflicting predictions for the same gene. In this paper, we describe a new approach to resolve these conflicts and improve start codon prediction accuracy. Our approach uses a set of gene location prediction results from other popular prediction approaches to find consistently predicted gene locations. It then uses these consistent genes as a training set for a naïve Bayesian classifier to improve accuracy in the ambiguous genes, those in which there are some inconsistencies in the predicted start codon location among the original predictions. The methods detailed here apply to prokaryotic organisms, using E. coli and the EcoGene Verified Set database as a case study.
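The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: each tool's output is keyed by stop codon position, a gene counts as consistent only when every tool predicts it with the same start, candidate starts are described by hypothetical categorical features (here a start codon triplet and a ribosome binding site flag), and the classifier is a plain categorical naïve Bayes with Laplace smoothing. The names split_consistent, NaiveBayes and resolve are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def split_consistent(predictions):
    """Partition genes by agreement among tools.

    predictions: {tool_name: {stop_pos: start_pos}} (assumed layout).
    Returns (consistent, ambiguous), both keyed by stop position.
    Genes missed by some tool but otherwise agreed on fall in neither set.
    """
    starts_by_stop = defaultdict(list)
    for tool_preds in predictions.values():
        for stop, start in tool_preds.items():
            starts_by_stop[stop].append(start)
    n_tools = len(predictions)
    consistent, ambiguous = {}, {}
    for stop, starts in starts_by_stop.items():
        distinct = sorted(set(starts))
        if len(distinct) > 1:
            ambiguous[stop] = distinct
        elif len(starts) == n_tools:
            consistent[stop] = distinct[0]
    return consistent, ambiguous


class NaiveBayes:
    """Categorical naive Bayes with Laplace smoothing (alpha)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter()
        self.feature_counts = defaultdict(Counter)  # (label, index) -> value counts
        self.values = defaultdict(set)              # index -> values seen in training

    def fit(self, rows, labels):
        for row, label in zip(rows, labels):
            self.class_counts[label] += 1
            for i, value in enumerate(row):
                self.feature_counts[(label, i)][value] += 1
                self.values[i].add(value)
        return self

    def log_score(self, row, label):
        # log P(label) + sum_i log P(feature_i | label); assumes label was seen in fit
        total = sum(self.class_counts.values())
        score = math.log(self.class_counts[label] / total)
        for i, value in enumerate(row):
            num = self.feature_counts[(label, i)][value] + self.alpha
            den = self.class_counts[label] + self.alpha * len(self.values[i])
            score += math.log(num / den)
        return score


def resolve(model, candidates):
    """Pick the candidate start with the highest log-odds of being a true start.

    candidates: {start_pos: feature_row} for one ambiguous gene.
    """
    def log_odds(row):
        return model.log_score(row, True) - model.log_score(row, False)
    return max(candidates, key=lambda start: log_odds(candidates[start]))
```

In such a sketch, the training rows would come from the consistent genes (for example, the agreed start labelled True and other in-frame candidate starts labelled False), and resolve would then be called once per ambiguous gene. How training examples and features are actually derived is left to the full paper.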