Title: Analysing large biological data sets with an improved algorithm for MIC

Authors: Shuliang Wang; Yiping Zhao

Addresses: School of Software, Beijing Institute of Technology, 5 South Zhongguancun Street, Haidian District, Beijing 100081, China (both authors)

Abstract: Existing computational frameworks use traditional similarity measures to find significant relationships among biological annotations, but they rest on a restrictive prerequisite: that the annotations do not co-occur with one another. To overcome this limitation, this paper proposes a new method, the Improved Algorithm for Maximal Information Coefficient (IAMIC), for discovering hidden regularities between biological annotations. IAMIC approximates a novel similarity coefficient based on the maximal information coefficient, which offers generality and equitability, and it improves the axis partition through quadratic optimisation instead of brute-force search. The experimental results show that, compared with other similarity measures, IAMIC is more appropriate for identifying associations between biological annotations and for extracting novel associations hidden in the collected data sets.
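
For readers unfamiliar with the maximal information coefficient, the following is a minimal sketch of the general idea behind MIC, not the IAMIC algorithm described in the paper. It assumes the commonly used grid budget of n^0.6 cells and, as a simplification, uses fixed equal-frequency bins on each axis; it performs neither the partition optimisation of standard MIC nor the quadratic optimisation that IAMIC introduces. All function names (`equal_frequency_bins`, `mutual_information`, `mic_sketch`) are hypothetical and chosen for illustration only.

```python
import math
from collections import Counter


def equal_frequency_bins(values, k):
    """Assign each value to one of k bins holding roughly equal counts.

    Simplification: tied values may fall into different bins.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * k // len(values)
    return bins


def mutual_information(bx, by):
    """Mutual information (in nats) between two discrete label sequences."""
    n = len(bx)
    joint = Counter(zip(bx, by))
    px = Counter(bx)
    py = Counter(by)
    mi = 0.0
    for (a, b), c in joint.items():
        mi += (c / n) * math.log((c * n) / (px[a] * py[b]))
    return mi


def mic_sketch(x, y, alpha=0.6):
    """Rough MIC-style score: the best normalised mutual information
    over all nx-by-ny grids with nx * ny <= n ** alpha.

    Assumption: each axis is cut into equal-frequency bins, so this
    only approximates the optimised partitions used by real MIC.
    """
    n = len(x)
    budget = max(4, int(n ** alpha))
    best = 0.0
    for nx in range(2, budget // 2 + 1):
        for ny in range(2, budget // nx + 1):
            mi = mutual_information(
                equal_frequency_bins(x, nx),
                equal_frequency_bins(y, ny),
            )
            # Normalise so that the score lies between 0 and 1.
            best = max(best, mi / math.log(min(nx, ny)))
    return best


xs = list(range(100))
linear = mic_sketch(xs, [2 * v + 1 for v in xs])
parabola = mic_sketch(xs, [(v - 50) ** 2 for v in xs])
```

In this sketch, a noiseless linear relationship and a noiseless parabolic one both reach the maximum score, which illustrates the generality property the abstract refers to. The nested loop over grid sizes is the part that becomes expensive once each partition must also be optimised, which is the cost the paper aims to reduce.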

Keywords: maximal information coefficient; MIC; biological annotations; big data; similarity measures; bioinformatics.

DOI: 10.1504/IJDMB.2015.071548

International Journal of Data Mining and Bioinformatics, 2015 Vol. 13 No. 2, pp. 158-170

Received: 22 Nov 2013
Accepted: 06 Sep 2014

Published online: 31 Aug 2015
