Title: Dual stream data exploration

Authors: Xi Wang; Malcolm Crowe; Colin Fyfe

Addresses: School of Computing, University of the West of Scotland, High Street, Paisley, PA1 2BE, Scotland, UK (all three authors).

Abstract: We consider means of extracting information from two data streams simultaneously when each data stream contains information about the other, i.e., there is redundancy between the data streams and we wish to identify what they have in common. The standard statistical method for doing this is canonical correlation analysis, so we consider two extensions of this method. In the first advance, we use Bregman divergences to create methods of extracting information from the dual data streams which are optimal when the data has a distribution other than the Gaussian distribution. In the second advance, we use the method of reservoir computing in order to extract non-linear relationships. Finally, we combine the two methods and illustrate the combined method on a database of student marks.

Keywords: canonical correlation analysis; CCA; Bregman divergence; reservoir computing; dual streams; data exploration; data streams; information extraction; data redundancy; commonality; Gaussian distribution; Carl Friedrich Gauss; non-linear relationships; databases; student marks; data mining; data modelling; data management; intelligent data analysis.

DOI: 10.1504/IJDMMM.2012.046810

International Journal of Data Mining, Modelling and Management, 2012 Vol.4 No.2, pp.188-202

Published online: 23 Aug 2014
