Symbolic approach to reduced bio-basis
by Mohamed A. Mahfouz; Yasser El-Sonbaty; M.A. Ismail
International Journal of Data Mining and Bioinformatics (IJDMB), Vol. 20, No. 1, 2018

Abstract: A reduced bio-basis is the minimal set of fixed-length sub-sequences of a biological sequence that carries maximum information. Sequence data are not numerical, so centroid-based clustering algorithms are not directly applicable. The main contribution of this paper is to show how to apply centroid-based algorithms to biological sequences. The average similarity between a sub-sequence and the other sub-sequences in a cluster is reduced to the similarity between that sub-sequence and an artificial centre, formed in a similar way to the centre of symbolic objects. After the hard version of the proposed symbolic clustering algorithm is applied, a possibilistic membership is computed for each sub-sequence, which gives the algorithm a strong outlier-rejection capability. Well-studied issues for the centroid-based approach, such as parallelism or scalability, can be applied to the proposed approach. Experimental results on several real datasets show that the proposed approach is superior to traditional methods in several respects.
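
The reduction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the similarity between two equal-length sub-sequences is the sum of per-position substitution scores, and that the artificial centre stores, for each position, the relative frequency of every symbol in the cluster. The toy score table and all function names below are hypothetical. Under these assumptions, the similarity to the centre equals the average pairwise similarity, so one comparison against the centre replaces a comparison against every cluster member.

```python
from collections import Counter

# Toy substitution scores (hypothetical values, not a real PAM/BLOSUM matrix).
SCORES = {
    ("A", "A"): 4, ("A", "C"): 0, ("A", "G"): 1,
    ("C", "C"): 9, ("C", "G"): -3,
    ("G", "G"): 6,
}


def score(a, b):
    """Symmetric lookup in the toy substitution table."""
    return SCORES[(a, b)] if (a, b) in SCORES else SCORES[(b, a)]


def similarity(x, y):
    """Assumed additive similarity: sum of per-position substitution scores."""
    return sum(score(a, b) for a, b in zip(x, y))


def symbolic_centre(cluster):
    """Per-position relative symbol frequencies over the cluster members."""
    n = len(cluster)
    length = len(cluster[0])
    return [
        {sym: count / n
         for sym, count in Counter(seq[i] for seq in cluster).items()}
        for i in range(length)
    ]


def similarity_to_centre(x, centre):
    """Frequency-weighted score of x against the artificial centre."""
    return sum(
        freq * score(a, sym)
        for a, dist in zip(x, centre)
        for sym, freq in dist.items()
    )


def average_similarity(x, cluster):
    """Brute-force average similarity of x to every cluster member."""
    return sum(similarity(x, y) for y in cluster) / len(cluster)


cluster = ["ACG", "AGG", "CCG", "ACA"]
centre = symbolic_centre(cluster)
query = "AGG"

via_centre = similarity_to_centre(query, centre)
via_pairs = average_similarity(query, cluster)
print(via_centre, via_pairs)
```

The two printed values agree because an additive similarity lets the sum over cluster members be moved inside the sum over positions. With a non-additive similarity measure this equivalence would not hold in this simple form.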

Online publication date: Tue, 05-Jun-2018
