Authors: Alaa Tharwat
Addresses: Electrical Department, Faculty of Engineering, Suez Canal University, Ismailia, Egypt
Abstract: Dimensionality reduction is a common preprocessing step in many machine learning applications; it transforms the features into a lower-dimensional space. Principal component analysis (PCA) is one of the most widely used unsupervised dimensionality reduction techniques. Its goal is to find the PCA space, which represents the directions of maximum variance in the given data. This paper presents the basic background needed to understand and implement PCA. It starts with basic definitions of the PCA technique and the algorithms of two methods of calculating it, namely, the covariance matrix and singular value decomposition (SVD) methods. Moreover, a number of numerical examples illustrate, in easy steps, how the PCA space is calculated. Three experiments show how to apply PCA in real applications, including biometrics, image compression, and visualisation of high-dimensional datasets.
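The two methods the abstract names can be sketched side by side. The following is a minimal NumPy illustration, not the paper's own code: the data matrix is invented for demonstration, and it checks that eigendecomposition of the covariance matrix and SVD of the centred data recover the same PCA space (up to the sign of each axis).

```python
import numpy as np

# Hypothetical sample data: 5 observations, 3 features (illustrative only).
X = np.array([[2.5, 2.4, 1.0],
              [0.5, 0.7, 0.2],
              [2.2, 2.9, 0.9],
              [1.9, 2.2, 0.8],
              [3.1, 3.0, 1.3]])

# Centre the data: PCA directions are defined on mean-centred features.
Xc = X - X.mean(axis=0)

# Method 1: eigendecomposition of the covariance matrix.
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending order
order = np.argsort(eigvals)[::-1]           # sort descending by variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Method 2: singular value decomposition of the centred data matrix.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
svd_vals = S**2 / (len(X) - 1)              # singular values -> variances

# Both methods yield the same variances and (up to sign) the same axes.
assert np.allclose(eigvals, svd_vals)
assert np.allclose(np.abs(eigvecs), np.abs(Vt.T))

# Dimensionality reduction: project onto the top two principal components.
X_reduced = Xc @ eigvecs[:, :2]
print(X_reduced.shape)   # (5, 2)
```

The sign ambiguity is expected: each principal axis is a direction, so either orientation spans the same PCA space.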
Keywords: principal component analysis; PCA; dimensionality reduction; feature extraction; covariance matrix; singular value decomposition; SVD; PCA space; biometrics; image compression; tutorial; machine learning; visualisation; high-dimensional datasets.
International Journal of Applied Pattern Recognition, 2016 Vol.3 No.3, pp.197 - 240
Received: 04 Jan 2016
Accepted: 03 Mar 2016
Published online: 13 Oct 2016