Title: A new edge effect correction for sequence alignment

Authors: Amirhossein Karami; Afshin Fayyaz Movaghar

Addresses: Faculty of Mathematical Sciences, Department of Statistics, University of Mazandaran, Babolsar, 47416-13534, Iran; Faculty of Mathematical Sciences, Department of Statistics, University of Mazandaran, Babolsar, 47416-13534, Iran

Abstract: Comparing two sequences is generally based on their local score and the corresponding p-value. Normally, the local score is obtained from an alignment that does not start near the end of either sequence, so this fact must be taken into account when calculating the p-value. The edge effect correction (EEC), also called finite-size correction, is an appropriate way to improve the significance assessment of sequence alignments. In this paper, the EEC is applied to the h-tuple method, a method of evaluating sequence similarity. The results of the corrected h-tuple method are then compared with those based on extreme value theory on a real database. The receiver operating characteristic (ROC) curve reveals that the corrected h-tuple method is more accurate.
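
Illustration: the abstract does not spell out the h-tuple method or its corrected form, so the following is only a minimal sketch of the general idea behind an edge effect correction, using the classical extreme value (Gumbel) approximation for local scores rather than the authors' method. The function names, the scoring scheme (match 1, mismatch -1, gap -2) and the parameter values for lam, K and H are all hypothetical and chosen for illustration. In practice these parameters would have to be estimated for the scoring scheme in use. The correction shown shortens both sequence lengths by the expected alignment length ln(Kmn)/H before computing the p-value.

```python
import math


def local_score(a, b, match=1, mismatch=-1, gap=-2):
    """Smith-Waterman local score with a linear gap penalty (illustrative scheme)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
            best = max(best, cur[j])
        prev = cur
    return best


def gumbel_p_value(score, m, n, lam, K, H=None):
    """P(S >= score) under the extreme value approximation.

    If H (relative entropy of the scoring scheme) is given, both lengths are
    reduced by the expected alignment length ln(K*m*n)/H, which is one common
    form of finite-size (edge effect) correction.
    """
    if H is not None:
        ell = math.log(K * m * n) / H
        m = max(m - ell, 1.0)
        n = max(n - ell, 1.0)
    expected_hits = K * m * n * math.exp(-lam * score)
    return -math.expm1(-expected_hits)  # 1 - exp(-expected_hits)


if __name__ == "__main__":
    s = local_score("TTACGTT", "GGACGGG")
    # Hypothetical parameter values, for illustration only.
    p_raw = gumbel_p_value(10, 100, 100, lam=1.0, K=0.1)
    p_eec = gumbel_p_value(10, 100, 100, lam=1.0, K=0.1, H=0.5)
    print(s, p_raw, p_eec)
```

Because the corrected lengths are shorter, the corrected p-value is smaller than the uncorrected one for the same score whenever K*m*n exceeds 1. This reflects the point made in the abstract that an optimal alignment cannot begin close to the end of either sequence.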

Keywords: EEC; edge effect correction; h-tuple method; extreme value theory; sequence alignment; local score; scoring scheme; optimal local alignment; bivariate normal distribution; ROC curve.

DOI: 10.1504/IJCBDD.2021.117182

International Journal of Computational Biology and Drug Design, 2021 Vol.14 No.3, pp.159 - 165

Accepted: 10 Dec 2020
Published online: 20 Aug 2021
