Title: Operon prediction by Markov clustering

Authors: Wei Du; Zhongbo Cao; Yan Wang; Enrico Blanzieri; Chen Zhang; Yanchun Liang

Addresses:
Key Laboratory of Symbol Computation and Knowledge Engineering of the Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China; College of Chemistry, Jilin University, Changchun 130012, China
Key Laboratory of Symbol Computation and Knowledge Engineering of the Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
Key Laboratory of Symbol Computation and Knowledge Engineering of the Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China; College of Mathematics, Jilin University, Changchun 130012, China
Department of Information and Communication Technology, University of Trento, Povo 38050, Italy
Key Laboratory of Symbol Computation and Knowledge Engineering of the Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
Key Laboratory of Symbol Computation and Knowledge Engineering of the Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China

Abstract: Operon prediction is a critical step in reconstructing biochemical and regulatory networks at the whole-genome level. This paper proposes a novel operon prediction model based on Markov Clustering (MCL). The model uses MCL as a graph-clustering method for prediction and does not need a classifier. In cross-species validation, the accuracies for E. coli K12, Bacillus subtilis and P. furiosus are 92.1%, 86.9% and 87.3%, respectively. The experimental results indicate that the proposed method predicts operons effectively. The compiled program and test data sets are publicly available at http://ccst.jlu.edu.cn/JCSB/OPMC/.
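
To make the graph-clustering idea concrete, the following is a minimal sketch of generic Markov Clustering on a small weighted graph. It is not the authors' implementation: the abstract does not say how the gene graph is built, so the function names (`normalize_columns`, `matmul`, `mcl`), the `inflation` default of 2.0, the added self-loops, and the example edge weights are all assumptions chosen for illustration. Nodes stand in for genes and edge weights for some pairwise score.

```python
def normalize_columns(m):
    """Scale each column to sum to 1 (column-stochastic matrix)."""
    n = len(m)
    col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain square-matrix product, kept dependency-free for clarity."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Generic MCL: alternate expansion and inflation until the matrix settles.

    adjacency: symmetric list-of-lists of non-negative edge weights.
    Returns a sorted list of clusters, each a sorted list of node indices.
    """
    n = len(adjacency)
    # Self-loops (an assumed, commonly used convention) keep every column non-empty.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        expanded = matmul(m, m)  # expansion: two-step random walk
        inflated = normalize_columns(
            [[v ** inflation for v in row] for row in expanded]
        )  # inflation: strengthen strong flows, weaken weak ones
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each row that still carries flow lists the members of one cluster.
    clusters = []
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members and members not in clusters:
            clusters.append(members)
    return sorted(sorted(c) for c in clusters)


# Hypothetical scores: genes 0-2 and genes 3-4 are tightly linked,
# with a weak link between gene 2 and gene 3.
example = [
    [0.0, 1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.05, 0.0],
    [0.0, 0.0, 0.05, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
]
print(mcl(example))
```

In this sketch each resulting cluster would be read as a candidate operon. A larger `inflation` value generally yields smaller, tighter clusters, so it is the main parameter to tune.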

Keywords: genome analysis; structural genomics; operon prediction; cluster analysis; graph clustering; MCL; Markov clustering; intergenic distance; conserved gene clusters; gene ontology; MFE; minimum free energy; bioinformatics; operons; E. coli; Bacillus subtilis; P. furiosus.

DOI: 10.1504/IJDMB.2014.062149

International Journal of Data Mining and Bioinformatics, 2014 Vol.9 No.4, pp.424 - 443

Received: 06 Nov 2010
Accepted: 30 Apr 2011

Published online: 21 Oct 2014
