Title: Analysing large biological data sets with an improved algorithm for MIC
Authors: Shuliang Wang; Yiping Zhao
Addresses: School of Software, Beijing Institute of Technology, 5 South Zhongguancun Street, Haidian District, Beijing 100081, China (both authors)
Abstract: The existing computational framework uses traditional similarity measures to find significant relationships among biological annotations. However, it rests on a restrictive prerequisite: that the biological annotations do not co-occur with each other. To overcome this limitation, this paper proposes a new method, the Improved Algorithm for Maximal Information Coefficient (IAMIC), to discover hidden regularities between biological annotations. IAMIC approximates a novel similarity coefficient based on the maximal information coefficient (MIC), retaining its generality and equitability, and improves the axis partition by using quadratic optimisation instead of brute-force search. The experimental results show that IAMIC is more appropriate than other similarity measures for identifying associations between biological annotations and for extracting novel associations hidden in the collected data sets.
Keywords: maximal information coefficient; MIC; biological annotations; big data; similarity measures; bioinformatics.
DOI: 10.1504/IJDMB.2015.071548
International Journal of Data Mining and Bioinformatics, 2015 Vol. 13 No. 2, pp. 158-170
Received: 22 Nov 2013
Accepted: 06 Sep 2014
Published online: 31 Aug 2015