Title: Determining protein conformation using vibrational frequencies: an ensemble approach

Authors: Charu Kathuria; Deepti Mehrotra; Navnit Kumar Misra

Addresses: Amity School of Engineering and Technology, Amity University Uttar Pradesh, Noida, Uttar Pradesh, India; Amity School of Engineering and Technology, Amity University Uttar Pradesh, Noida, Uttar Pradesh, India; Brahmanand P.G. College, The Mall, Kanpur, India

Abstract: The vibrational frequencies of amides are widely used to characterise the secondary structure of proteins. Classification approaches that predict protein structure from the amino acid sequence have improved in accuracy, but FTIR frequencies are well suited to predicting how structure changes as environmental conditions vary. In this paper, an ensemble model is constructed from the seven basic amide frequencies (Amide A, Amide I, Amide II, Amide III, Amide IV, Amide V and Amide VI) to predict the secondary structure of proteins from the conformation of amino acids, which are categorised as alpha helices, parallel beta sheets, anti-parallel beta sheets or omega helices. Cross-validation and several performance measures are used to assess the accuracy of the model. The experiment shows that the ensemble approach outperforms the other classification techniques tested: a single k-NN model achieved an accuracy of 55.33%, whereas the same model combined with the Bagging or AdaBoost ensemble techniques reached 94.67%.
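
The bagging idea described in the abstract can be illustrated with a minimal sketch: several k-NN classifiers are trained on bootstrap resamples of the training set, their predictions are combined by majority vote, and accuracy is estimated by k-fold cross-validation. Everything concrete here is an assumption made for illustration, not taken from the paper: the function names, k = 3, ten estimators, ten folds, and above all the data, which are synthetic seven-feature points standing in for the seven amide frequencies. The printed accuracy therefore has no relation to the 55.33% and 94.67% figures reported above, and AdaBoost is not shown.

```python
import random
from collections import Counter

# The four conformation classes named in the abstract.
CLASSES = ["alpha", "parallel_beta", "antiparallel_beta", "omega"]


def make_synthetic(n_per_class=30, n_features=7, seed=0):
    """Synthetic stand-in data (ASSUMPTION): NOT real amide frequencies."""
    rng = random.Random(seed)
    X, y = [], []
    for c, label in enumerate(CLASSES):
        for _ in range(n_per_class):
            # Each class is a Gaussian cluster centred at 10 * class index.
            X.append([rng.gauss(10.0 * c, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


def knn_predict(train_X, train_y, x, k=3):
    """Plain k-NN: majority label among the k nearest training points."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def fit_bagging(train_X, train_y, n_estimators=10, seed=0):
    """Draw one bootstrap resample (with replacement) per base k-NN model."""
    rng = random.Random(seed)
    n = len(train_X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(([train_X[i] for i in idx], [train_y[i] for i in idx]))
    return models


def bagging_predict(models, x, k=3):
    """Majority vote over the base k-NN models."""
    votes = Counter(knn_predict(bx, by, x, k) for bx, by in models)
    return votes.most_common(1)[0][0]


def cross_val_accuracy(X, y, n_folds=10, seed=0):
    """k-fold cross-validation accuracy of the bagged k-NN ensemble."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        models = fit_bagging([X[i] for i in train], [y[i] for i in train])
        correct += sum(bagging_predict(models, X[i]) == y[i] for i in fold)
    return correct / len(X)


X, y = make_synthetic()
accuracy = cross_val_accuracy(X, y)
print(f"10-fold CV accuracy on synthetic data: {accuracy:.2%}")
```

Replacing `make_synthetic` with a loader for measured frequencies would be the natural next step; how the paper prepared its data and tuned its models is not stated in the abstract.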

Keywords: secondary structure of proteins; amino amides; vibrational frequencies; polypeptides; classification; ensemble; bagging; adaboost.

DOI: 10.1504/IJDMB.2020.107380

International Journal of Data Mining and Bioinformatics, 2020 Vol.23 No.2, pp.142 - 159

Accepted: 28 Feb 2020
Published online: 22 May 2020