Authors: Tsung Tai Yeh; Tseng Yi Chen; Yen Chiu Chen; Hsin Wen Wei
Addresses: Institute of Information Science, Academia Sinica, 128 Academia Road, Section 2, Nankang, Taipei, Taiwan; 101, Section 2, Kuang-Fu Road, Hsinchu, Taiwan; Computer Science Department, National Tsing Hua University, 101, Section 2, Kuang-Fu Road, Hsinchu, Taiwan; Institute of Information Science, Academia Sinica, 128 Academia Road, Section 2, Nankang, Taipei, Taiwan
Abstract: Advances in non-linear dimensionality reduction provide a way to understand and visualise the underlying structure of complex datasets. The performance of large-scale non-linear dimensionality reduction is of key importance in data mining, machine learning, and data analysis. In this paper, we concentrate on improving the performance of non-linear dimensionality reduction on large-scale datasets using the GPU. In particular, we focus on solving the problems of k-nearest neighbour (KNN) search and sparse spectral decomposition for large-scale data, and propose an efficient framework for local linear embedding (LLE). We implement a k-d tree-based KNN algorithm and a Krylov subspace method on the GPU to accelerate non-linear dimensionality reduction for large-scale data. Our results show that the GPU-based k-d tree LLE runs up to 30-60× faster than the brute-force KNN (Hernandez et al., 2007) LLE model on the CPU. Overall, our methods save O(n² - 6n - 2k - 3) memory space.
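The LLE pipeline the abstract refers to has three stages: KNN search (here via a k-d tree rather than brute force), per-point reconstruction-weight solves, and a sparse spectral decomposition (for which Krylov subspace solvers such as Lanczos are the standard tool). The following is a minimal CPU sketch of that pipeline, assuming SciPy's `cKDTree` for the tree-based KNN and `eigsh` (a Lanczos/Krylov eigensolver) for the sparse eigenproblem; it is an illustrative reconstruction, not the authors' GPU implementation, and the regularisation constant `reg` is a conventional choice, not taken from the paper.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.sparse import lil_matrix, identity
from scipy.sparse.linalg import eigsh

def lle(X, k=10, d=2, reg=1e-3):
    """Local linear embedding of the rows of X into d dimensions."""
    n = X.shape[0]

    # Stage 1: KNN via a k-d tree (avoids the O(n^2) brute-force search).
    tree = cKDTree(X)
    _, idx = tree.query(X, k=k + 1)
    idx = idx[:, 1:]  # drop each point's first neighbour (itself)

    # Stage 2: solve a small regularised linear system per point for the
    # weights that best reconstruct it from its neighbours.
    W = lil_matrix((n, n))
    for i in range(n):
        Z = X[idx[i]] - X[i]                  # neighbours centred at x_i
        C = Z @ Z.T                           # local covariance (k x k)
        C += reg * np.trace(C) * np.eye(k)    # regularise (C is rank-deficient)
        w = np.linalg.solve(C, np.ones(k))
        W[i, idx[i]] = w / w.sum()            # weights sum to 1

    # Stage 3: sparse spectral step. eigsh is a Lanczos (Krylov subspace)
    # solver; we take the smallest eigenvectors of M = (I - W)^T (I - W).
    W = W.tocsr()
    M = (identity(n) - W).T @ (identity(n) - W)
    vals, vecs = eigsh(M, k=d + 1, sigma=0.0)
    return vecs[:, 1:d + 1]  # discard the constant eigenvector
```

Because both W and M are sparse (at most k non-zeros per row), the Krylov solver only ever touches matrix-vector products, which is what makes the spectral step tractable at large n and maps well onto GPU sparse kernels.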
Keywords: nonlinear dimension reduction; dimensionality reduction; GPU; complex datasets; memory space; graphics processing unit.
International Journal of Granular Computing, Rough Sets and Intelligent Systems, 2011 Vol.2 No.2, pp.149 - 165
Available online: 26 Oct 2011