Title: Annealing-based model-free expectation maximisation for multi-colour flow cytometry data clustering

Authors: Başak Esin Köktürk; Bilge Karaçalı

Addresses: Department of Electrical and Electronics Engineering, İzmir Institute of Technology, İzmir, Turkey; Department of Electrical and Electronics Engineering, İzmir Institute of Technology, İzmir, Turkey

Abstract: This paper proposes an optimised model-free expectation maximisation method for automated clustering of high-dimensional datasets. The method is based on a recursive binary division strategy that successively divides the original dataset into distinct clusters. Each binary division is carried out using a model-free expectation maximisation scheme that exploits the posterior probability computation capability of the quasi-supervised learning algorithm, with a line-search optimisation over the reference set size parameter that is analogous to simulated annealing. Divisions continue until the division cost exceeds an adaptively determined limit. Experimental results on synthetic and real multi-colour flow cytometry datasets show that the proposed method accurately captures the prominent clusters without requiring any prior knowledge of the number of clusters or their distribution models.
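
The recursive binary division strategy described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the binary splitter below is plain 2-means standing in for the quasi-supervised expectation maximisation scheme, the division cost (within-cluster scatter after the split divided by scatter before it) is a hypothetical choice, and the limit is a fixed number rather than an adaptively determined one. All function and parameter names are assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two coordinate tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def scatter(points):
    """Sum of squared distances of the points to their centroid."""
    c = centroid(points)
    return sum(sq_dist(p, c) for p in points)


def two_means(points, iters=20):
    """Stand-in binary splitter (plain 2-means).

    ASSUMPTION: the paper uses a quasi-supervised, model-free EM scheme
    here; 2-means is swapped in only to keep the sketch self-contained.
    """
    c0 = points[0]
    c1 = max(points, key=lambda p: sq_dist(p, c0))  # farthest point from c0
    left, right = list(points), []
    for _ in range(iters):
        left = [p for p in points if sq_dist(p, c0) <= sq_dist(p, c1)]
        right = [p for p in points if sq_dist(p, c0) > sq_dist(p, c1)]
        if not left or not right:
            break  # degenerate split; the caller rejects it
        c0, c1 = centroid(left), centroid(right)
    return left, right


def recursive_divide(points, cost_limit=0.2, min_size=2):
    """Split recursively until the division cost exceeds cost_limit.

    ASSUMPTION: cost = scatter after split / scatter before split, and
    cost_limit is fixed; the paper determines its limit adaptively.
    """
    if len(points) < 2 * min_size:
        return [points]
    left, right = two_means(points)
    if len(left) < min_size or len(right) < min_size:
        return [points]
    total = scatter(points)
    if total == 0:
        return [points]  # all points identical, nothing to divide
    cost = (scatter(left) + scatter(right)) / total
    if cost > cost_limit:
        return [points]  # division too costly: keep as one cluster
    return (recursive_divide(left, cost_limit, min_size)
            + recursive_divide(right, cost_limit, min_size))
```

Calling `recursive_divide` on a list of coordinate tuples returns a list of clusters, each a list of points. The number of clusters is not supplied in advance; it follows from where the cost test stops the recursion, which is the property the abstract emphasises.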

Keywords: expectation maximisation; quasi-supervised learning; data clustering; gating; multi-colour flow cytometry data; simulated annealing; data analysis; bioinformatics; high-dimensional datasets; prominent clusters.

DOI: 10.1504/IJDMB.2016.073365

International Journal of Data Mining and Bioinformatics, 2016 Vol.14 No.1, pp.86-99

Received: 29 Apr 2015
Accepted: 15 May 2015

Published online: 30 Nov 2015
