Sequence similarity using composition method
by Geetika Munjal; Pooja Sharma; Deepti Gaur
International Journal of Data Science (IJDS), Vol. 3, No. 1, 2018

Abstract: Deoxyribonucleic acid (DNA) has an enormous capacity to carry important information in the form of character strings. Sequence analysis is the process of applying a wide range of methods to DNA sequences to understand the structure, features or evolution of these nucleotide strings. The analysis uses mathematical methods to convert these character strings to numerical values, which are then used to find similarity between the sequences. DNA sequences contain only four nucleotides (A, C, G and T), but extracting information from them makes sequence comparison essential. In this paper, various methods to analyse DNA sequences, including the use of entropy, divergence, LZ complexity and the role of hybridisation, are explored. A hybrid model based on the composition vector and distance methods is proposed to find dissimilarity between sequences, and this hybrid model is tested on sequences of species downloaded from the National Center for Biotechnology Information (NCBI).
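
To make the general idea of a composition vector combined with a distance measure concrete, the following is a minimal Python sketch. It is not the authors' hybrid model, whose details are not given in the abstract. It assumes a plain k-mer frequency vector over the alphabet A, C, G, T and a Euclidean distance; the function names `composition_vector` and `euclidean_distance`, the choice of k, and the example sequences are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import product
import math


def composition_vector(seq, k=2):
    """Return the relative frequency of every k-mer over A, C, G, T.

    Assumption: k-mers containing other symbols (e.g. N) are ignored.
    The vector has 4**k entries in lexicographic k-mer order.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k in the sequence.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Illustrative sequences (made up, not taken from NCBI).
seq_a = "ACGTACGTACGT"
seq_b = "ACGTTTGTACGA"
dissimilarity = euclidean_distance(
    composition_vector(seq_a, k=2),
    composition_vector(seq_b, k=2),
)
print(dissimilarity)
```

A larger distance indicates more dissimilar k-mer composition. Other distance measures, or entropy- and divergence-based scores such as those the abstract mentions, could be substituted for the Euclidean distance in the same structure.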

Online publication date: Sun, 25-Mar-2018
