Predicting alternatively spliced exons using semi-supervised learning
Online publication date: Mon, 30-Nov-2015
by Ana Stanescu; Karthik Tangirala; Doina Caragea
International Journal of Data Mining and Bioinformatics (IJDMB), Vol. 14, No. 1, 2016
Abstract: Cost-efficient next-generation sequencers can now produce unprecedented volumes of raw DNA data, posing challenges for annotation. Supervised machine learning approaches have traditionally been used to analyse and annotate complex genomic information. However, such approaches require labelled data for training, which in practice is scarce or expensive to obtain, while unlabelled data is abundant. For some problems, semi-supervised learning can improve supervised classifiers by exploiting large amounts of unlabelled data and the latent information within them. We evaluate the applicability of semi-supervised learning algorithms to the problem of DNA sequence annotation, specifically to the prediction of alternatively spliced exons. We employ Expectation Maximisation, Self-training, and Co-training algorithms to assess the strengths and limitations of these techniques in the context of alternative splicing.
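To make the Self-training idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a nearest-centroid base classifier over 2-mer frequency features, a distance-margin confidence score, and a fixed confidence threshold; the abstract specifies none of these. The function names, the toy sequences, and the labels "alternative" and "constitutive" are hypothetical and serve only as illustration.

```python
from itertools import product

# All 16 dinucleotides, used as a fixed feature order (an assumption of this sketch).
KMERS = ["".join(p) for p in product("ACGT", repeat=2)]

def kmer_features(seq, k=2):
    """Return normalised 2-mer frequencies for a DNA sequence over A, C, G, T."""
    counts = {kmer: 0 for kmer in KMERS}
    for i in range(len(seq) - k + 1):
        counts[seq[i:i + k]] += 1
    total = max(1, len(seq) - k + 1)
    return [counts[kmer] / total for kmer in KMERS]

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def fit(X, y):
    """Nearest-centroid model: one mean feature vector per class."""
    model = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        model[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return model

def predict_with_confidence(model, x):
    """Return (label, margin); the margin between the two nearest centroids acts as confidence."""
    ranked = sorted((distance(x, c), label) for label, c in model.items())
    (d_best, label), (d_second, _) = ranked[0], ranked[1]
    return label, d_second - d_best

def self_train(X_lab, y_lab, X_unlab, threshold=0.1, max_rounds=10):
    """Repeatedly pseudo-label confident unlabelled examples and refit the model."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    model = fit(X_lab, y_lab)
    for _ in range(max_rounds):
        remaining, added = [], 0
        for x in pool:
            label, conf = predict_with_confidence(model, x)
            if conf >= threshold:
                # Confident prediction: treat the pseudo-label as a true label.
                X_lab.append(x)
                y_lab.append(label)
                added += 1
            else:
                remaining.append(x)
        pool = remaining
        if added == 0:
            break  # nothing confident enough remains, so stop early
        model = fit(X_lab, y_lab)
    return model, len(pool)

# Hypothetical toy data: two labelled sequences and three unlabelled ones.
labelled = [("GCGCGCGCGC", "alternative"), ("ATATATATAT", "constitutive")]
unlabelled = ["GCGCGCGGCC", "ATATATTATA", "ACGTACGTAC"]

model, remaining = self_train(
    [kmer_features(s) for s, _ in labelled],
    [lab for _, lab in labelled],
    [kmer_features(s) for s in unlabelled],
)
prediction, _ = predict_with_confidence(model, kmer_features("GCGCGCGC"))
```

The threshold governs the usual trade-off in Self-training: a low value admits more pseudo-labels but risks reinforcing early mistakes, while a high value leaves most of the unlabelled pool unused.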