Title: Consensus RNA secondary structure prediction using information of neighbouring columns and principal component analysis
Authors: Tianhang Liu; Jianping Yin; Long Gao; Wei Chen; Minghui Qiu
Addresses: Computer School, National University of Defence Technology, Changsha, China; Computer School, National University of Defence Technology, Changsha, China; Computer School, National University of Defence Technology, Changsha, China; Computer School, National University of Defence Technology, Changsha, China; Medical Informatics Institute, Chinese PLA General Hospital, Beijing, China
Abstract: RNA is a family of biological macromolecules that plays an important role in a wide range of biological processes. Because the structure of an RNA is closely related to its function, determining that structure is invaluable for understanding genetic diseases and developing drugs. RNA secondary structure prediction nevertheless remains an open research problem. In this paper, we present a novel method that uses an RNA sequence alignment to predict a consensus RNA secondary structure. In essence, the method predicts, for any two columns of an alignment, whether they correspond to a base pair, using the information provided by the alignment. This information comprises the covariation score, the fraction of complementary nucleotides and the consensus probability matrix of the column pair, together with the same quantities for its neighbouring column pairs. Principal component analysis is then applied to overcome the problem of over-fitting. We compare our method with other consensus RNA secondary structure prediction methods, including NeCFold, ELMFold, KnetFold, PFold and RNAalifold, on 47 families from Rfam (version 11.0). The results show that our method surpasses the other methods in terms of Matthews correlation coefficient, sensitivity and selectivity.
Keywords: RNA secondary structure prediction; comparative sequence analysis; principal component analysis; PCA; information of neighbouring columns.
International Journal of Computational Science and Engineering, 2019 Vol.19 No.3, pp.430 - 439
Received: 26 Jul 2016
Accepted: 15 Mar 2017
Published online: 26 Jul 2019
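
Two of the quantities named in the abstract can be illustrated with a short sketch: the fraction of complementary nucleotides for a column pair, and the three evaluation measures (sensitivity, selectivity, Matthews correlation coefficient). This is not the authors' implementation. The function names, the choice to count G-U wobble pairs as complementary, and the use of the standard confusion-matrix definitions over all column pairs are assumptions made for illustration; the paper may define these quantities differently.

```python
from math import sqrt

# Assumption: Watson-Crick pairs plus the G-U wobble pair count as
# complementary; the abstract does not state which pairs are used.
COMPLEMENTARY = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def complementary_fraction(alignment, i, j):
    """Fraction of aligned sequences whose bases at columns i and j can pair."""
    if not alignment:
        return 0.0
    hits = sum(
        (seq[i].upper(), seq[j].upper()) in COMPLEMENTARY for seq in alignment
    )
    return hits / len(alignment)


def pair_metrics(predicted, reference, n_columns):
    """Return (sensitivity, selectivity, MCC) for predicted base pairs.

    Pairs are (i, j) tuples with i < j. True negatives are counted over
    all n_columns * (n_columns - 1) / 2 possible column pairs
    (hypothetical convention, chosen for this sketch).
    """
    predicted, reference = set(predicted), set(reference)
    tp = len(predicted & reference)
    fp = len(predicted - reference)
    fn = len(reference - predicted)
    tn = n_columns * (n_columns - 1) // 2 - tp - fp - fn

    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    selectivity = tp / (tp + fp) if tp + fp else 0.0
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, selectivity, mcc
```

In a pipeline of the kind the abstract describes, a value such as `complementary_fraction` would be one entry of the feature vector for a column pair, alongside the same value for neighbouring column pairs, before dimensionality reduction with principal component analysis.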