Title: Predicting transmission of avian influenza A viruses from avian to human by using informative physicochemical properties
Authors: Jia Wang; Chuang Ma; Zheng Kou; Yan-Hong Zhou; Huai-Lan Liu
Addresses: Hubei Bioinformatics and Molecular Imaging Key Laboratory, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China; School of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA; State Key Laboratory of Virology, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan 430071, China; Hubei Bioinformatics and Molecular Imaging Key Laboratory, School of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China; School of Mechanical Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
Abstract: Some strains of avian influenza A virus (AIV) can be transmitted directly from their natural avian hosts to humans. Such avian-to-human transmissions have been reported continually since 1997 and have caused human deaths worldwide. Predicting whether an AIV strain can be transmitted from birds to humans is valuable for early warning of strains with human pandemic potential. In this study, we constructed a computational model that predicts avian-to-human transmission of AIV from physicochemical properties. First, ninety signature positions in the inner protein sequences were extracted with the entropy method. These positions were then encoded with 531 physicochemical features. Next, an optimal subset of these features was selected using several feature selection methods. Finally, a support vector machine (SVM) model named A2H was built on the selected features. Cross-validation and an independent test show that A2H can predict avian-to-human transmission of AIV.
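As an illustration of the first step only, the sketch below computes per-position Shannon entropy over a set of aligned protein sequences. This is one common way to score alignment columns; the abstract does not state the exact entropy criterion used to pick the ninety signature positions, so the function names, the toy sequences, and the choice of plain column entropy are all assumptions rather than the authors' method.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def position_entropies(aligned_seqs):
    """Entropy of every column in a list of equal-length aligned sequences."""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to equal length")
    return [
        column_entropy([seq[i] for seq in aligned_seqs])
        for i in range(length)
    ]


# Hypothetical toy alignment, not data from the paper.
toy_alignment = ["MKTA", "MKSA", "MRTA", "MKTG"]
entropies = position_entropies(toy_alignment)
```

A fully conserved column scores zero and more variable columns score higher. A signature-position search would presumably compare such scores between strain groups, but that comparison is not specified here and is left out of the sketch.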
Keywords: avian influenza A virus; AIV strains; bird flu; bioinformatics; feature selection; genetic algorithms; information gain; interspecies transmission; mRMR; physicochemical properties; relief; support vector machines; SVM; avian-to-human transmission; A2H; computational modelling; protein sequences; feature extraction; data mining.
International Journal of Data Mining and Bioinformatics, 2013 Vol.7 No.2, pp.166 - 179
Received: 22 Apr 2012
Accepted: 23 Jul 2012
Published online: 10 Apr 2013