DNA data clustering by combination of 3D cellular automata and n-grams for structure molecule prediction
Online publication date: Mon, 05-Dec-2016
by Fatima Kabli; Reda Mohamed Hamou; Abdelmalek Amine
International Journal of Bioinformatics Research and Applications (IJBRA), Vol. 12, No. 4, 2016
Abstract: Knowledge extraction from genomic data is an important activity for biologists. To mine the underlying biological knowledge, we follow the Knowledge Discovery in Databases (KDD) process. In this paper, we transform DNA sequences into texts, which are indexed using the TF-IDF and n-grams approach. To group similar DNA sequences, we apply a bio-inspired clustering method based on 3D cellular automata. To analyse the clustering results, we translate each DNA sequence into an amino acid sequence according to the standard genetic code. We conclude that the clusters help biologists select DNA sequences that can produce a type of medicament (molecule) and its various derivatives (low concentration in their composition).
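The indexing and translation steps mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`char_ngrams`, `tfidf_vectors`, `translate`), the n-gram length of 3, and the particular TF-IDF variant (relative term frequency multiplied by log(N/df)) are assumptions made for illustration, since the abstract does not specify them. The 3D cellular automata clustering step is not sketched here because the abstract gives no details of it.

```python
import itertools
import math
from collections import Counter


def char_ngrams(sequence, n):
    """Return the overlapping character n-grams of a DNA sequence."""
    seq = sequence.upper()
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


def tfidf_vectors(sequences, n=3):
    """Build one sparse TF-IDF vector (a dict) per sequence.

    Assumed weighting: tf = count / total n-grams in the sequence,
    idf = log(number of sequences / number of sequences containing the n-gram).
    """
    counts = [Counter(char_ngrams(s, n)) for s in sequences]
    doc_freq = Counter()
    for c in counts:
        doc_freq.update(c.keys())  # each sequence counts once per n-gram
    total_docs = len(sequences)
    vectors = []
    for c in counts:
        total = sum(c.values())
        vectors.append({
            gram: (k / total) * math.log(total_docs / doc_freq[gram])
            for gram, k in c.items()
        })
    return vectors


# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(itertools.product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA sequence into one-letter amino acid codes.

    Reads from position 0 in steps of three, drops a trailing incomplete
    codon, writes stop codons as '*' and unrecognised codons as 'X'.
    """
    seq = dna.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, usable, 3))
```

For example, `tfidf_vectors(["ATGATG", "ATGCCC"])` gives the shared trigram `ATG` a weight of zero in both sequences, because under this weighting an n-gram that occurs in every sequence carries no discriminating information. The resulting vectors could then be passed to whichever clustering method is used.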