Title: Predicting protein-RNA interaction using sequence derived features and machine learning approach

Authors: Chandan Pandey; Rokkam Sandeep; Aikansh Priyam; Satyajit Mahapatra; Sitanshu Sekhar Sahu

Addresses: Department of Electronics and Communication, Birla Institute of Technology, Mesra, Ranchi, Jharkhand 835215, India (all authors)

Abstract: Protein-RNA interactions play a crucial role in various cellular processes. Several computational methods are being developed to predict these interactions from primary, secondary and tertiary information of proteins and RNA. In this paper, various sequence-based features of proteins and RNA are explored to predict the interactions using a machine learning approach. The conjoint ternion feature is found to be superior to the other composition-based features. It provides an accuracy of 89.67% and a Matthews correlation coefficient (MCC) of 0.79 on a standard database. When tested on an independent dataset, it provides a prediction accuracy of 83.23%.
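
For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how accuracy and MCC are conventionally computed from the confusion counts of a binary classifier (interacting vs. non-interacting pairs). The function name and the example counts are illustrative assumptions, not values or code from the paper.

```python
import math


def accuracy_and_mcc(tp, tn, fp, fn):
    """Return (accuracy, MCC) from binary confusion-matrix counts.

    tp/tn/fp/fn are true positives, true negatives, false positives
    and false negatives. The name and signature are hypothetical.
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one prediction is required")

    accuracy = (tp + tn) / total

    # Standard MCC definition; by convention it is set to 0 when any
    # marginal sum is zero and the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


# Made-up counts for illustration only (not taken from the paper).
acc, mcc = accuracy_and_mcc(tp=45, tn=40, fp=10, fn=5)
print(f"accuracy = {acc:.2%}, MCC = {mcc:.2f}")
```

MCC ranges from -1 to 1 and, unlike accuracy, stays informative when the interacting and non-interacting classes are unbalanced, which is presumably why both figures are reported.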

Keywords: protein; RNA; feature extraction; SVM; machine learning.

DOI: 10.1504/IJDMB.2017.090991

International Journal of Data Mining and Bioinformatics, 2017 Vol.19 No.3, pp.270 - 282

Received: 19 Sep 2017
Accepted: 12 Feb 2018

Published online: 05 Apr 2018
