Title: Pupylation sites prediction with ensemble classification model

Authors: Wenzheng Bao; Zhenhua Huang; Chang-An Yuan; De-Shuang Huang

Addresses: Institute of Machine Learning and Systems Biology, Tongji University, Shanghai, China; Institute of Machine Learning and Systems Biology, Tongji University, Shanghai, China; Science Computing and Intelligent Information Processing of GuangXi Higher Education Key Laboratory, Guangxi Teachers Education University, Nanning, Guangxi, China; Institute of Machine Learning and Systems Biology, Tongji University, Shanghai, China

Abstract: Post-translational modification of proteins is one of the most important biological processes studied in proteomics and bioinformatics. Pupylation is a novel post-translational modification in which the small, intrinsically disordered prokaryotic ubiquitin-like protein (Pup) is conjugated to lysine residues of potential segments. Identifying such modified sites has proved challenging for both experimental and computational prediction methods. Computational methods have mainly aimed at extracting effective features from the potential protein segments. In this paper, a statistical feature of adjacent amino acid residues is proposed; this novel feature combines the co-occurrence of adjacent amino acids with the BLOSUM62 matrix. A neural network and a Naïve Bayes model are employed as the classification models in this work. Such a model could also be applied to many other problems in computational biology.
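The abstract does not give the exact form of the adjacent-residue feature, so the following is only a minimal sketch of one plausible reading: the relative frequency of each ordered pair of neighbouring residues in a window centred on a candidate lysine. The window half-width `flank`, the padding character `X`, and the function names `extract_window` and `adjacent_pair_frequencies` are illustrative assumptions, not details taken from the paper. The combination with BLOSUM62 and the two classifiers are left out because the abstract does not specify them.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs of the 20 standard residues, in a fixed order.
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def extract_window(sequence, center, flank=10, pad="X"):
    """Return the segment of length 2*flank+1 centred on a lysine.

    `center` is a 0-based index; `flank` and `pad` are assumed values,
    not parameters reported in the abstract.
    """
    if sequence[center] != "K":
        raise ValueError("center must point at a lysine (K)")
    left = sequence[max(0, center - flank):center]
    right = sequence[center + 1:center + 1 + flank]
    # Pad both sides so every segment has the same length.
    return left.rjust(flank, pad) + "K" + right.ljust(flank, pad)


def adjacent_pair_frequencies(segment):
    """Relative frequency of each adjacent residue pair in `segment`.

    Pairs that contain a non-standard character (such as padding)
    are ignored, so the returned vector always has 400 entries.
    """
    counts = Counter(segment[i:i + 2] for i in range(len(segment) - 1))
    total = sum(counts[p] for p in PAIRS)
    if total == 0:
        return [0.0] * len(PAIRS)
    return [counts[p] / total for p in PAIRS]
```

A vector produced this way could be fed to any standard classifier; how the paper weights it with BLOSUM62 scores would need to be taken from the full text.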

Keywords: lysine pupylation; neural network; Naïve Bayes; post-translational modification.

DOI: 10.1504/IJDMB.2017.086441

International Journal of Data Mining and Bioinformatics, 2017 Vol.18 No.2, pp.91 - 104

Received: 25 Mar 2017
Accepted: 29 Mar 2017

Published online: 10 Sep 2017
