Title: Proposals for classification methods dedicated to biological data

Authors: Anne-Muriel Arigon, Guy Perriere, Manolo Gouy

Addresses: Universite de Lyon, Universite Lyon 1, CNRS, Laboratoire de Biometrie et Biologie Evolutive, 43 boulevard du 11 novembre 1918, Villeurbanne F-69622, France (same address for all three authors)

Abstract: The number of available genomic sequences is growing very fast owing to the development of massive sequencing techniques. Sequence classification contributes to the assessment of evolutionary relationships among genes and species, so classification methods that carry out these identification operations quickly and accurately are needed. We developed a classification method dedicated to databases of homologous sequence families; it assigns a new sequence to a cluster using similarity measures. We used this method to implement two applications, Homologous Sequence Identification (HoSeqI) and MultiHoSeqI. More recently, we developed a chimera detection method and implemented an application, Chimeric Sequence Identification (ChiSeqI), which automates both the classification of a specific kind of biological data, bacterial 16S ribosomal RNA sequences, and the detection of chimeric sequences.

Keywords: biological data classification; similarity measures; alignment; phylogeny; chimera detection method; clusters; genomic sequences; sequencing; sequence classification; bioinformatics; RNA sequences; chimeric sequences.

DOI: 10.1504/IJBET.2010.029649

International Journal of Biomedical Engineering and Technology, 2010 Vol.3 No.1/2, pp.4-21

Published online: 30 Nov 2009
