Feature Transformations: A Tutorial on PCA and LDA

Introduction

When dealing with high-dimensional data, it is common to use methods such as Principal Component Analysis (PCA) to reduce the dimension of the data. PCA converts the data to a different, lower-dimensional set of features. This contrasts with feature subset selection, which keeps a subset of the original features unchanged (see [1] for a tutorial on feature selection).
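The difference between the two approaches can be sketched in a few lines of NumPy. This is only an illustration on synthetic data: the array `X`, the chosen column indices, and the weight matrix `W` are all assumptions made for the example (here `W` is random, whereas a method such as PCA would derive it from the data).

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))  # synthetic data: 6 samples, 4 original features

# Feature subset selection: keep some of the original columns unchanged.
selected = X[:, [0, 2]]

# Feature transformation: each new feature is a weighted mix of all originals.
# W is a hypothetical weight matrix; PCA would learn it from X instead.
W = rng.normal(size=(4, 2))
transformed = X @ W

print(selected.shape, transformed.shape)
```

Both results have two columns, but the selected columns are copies of original features, while each transformed column combines all four.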

PCA is a linear transformation of the data to a lower-dimensional space. In this article we start by explaining what a linear transformation is. Then we show, with Python examples, how PCA works. The article concludes with a description of Linear Discriminant Analysis (LDA), a supervised linear transformation method. Python code for the methods presented in that paper is available on GitHub.
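As a preview of the idea, here is a minimal sketch of PCA as a linear transformation, written with plain NumPy. It is not the code from the GitHub repository mentioned above; the helper name `pca_project` and the synthetic data are assumptions made for illustration. The sketch centers the data, takes the eigenvectors of the covariance matrix with the largest eigenvalues as the columns of a projection matrix `W`, and multiplies the centered data by `W`.

```python
import numpy as np

def pca_project(X, n_components):
    """Project X onto its top principal components (illustrative helper)."""
    # Center each feature so the covariance is computed about the mean.
    X_centered = X - X.mean(axis=0)
    # Covariance matrix of the features (columns are variables).
    cov = np.cov(X_centered, rowvar=False)
    # eigh suits symmetric matrices; it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Keep the directions with the largest variance.
    order = np.argsort(eigvals)[::-1][:n_components]
    W = eigvecs[:, order]
    # The linear transformation itself: one matrix multiplication.
    return X_centered @ W, W, eigvals[order]

# Synthetic 3-D data that mostly varies along a single direction.
rng = np.random.default_rng(42)
latent = rng.normal(size=(200, 1))
X = np.hstack([latent, 2 * latent, -latent]) + 0.05 * rng.normal(size=(200, 3))

Z, W, variances = pca_project(X, n_components=2)
print(Z.shape)     # reduced data: 200 samples, 2 features
print(variances)   # variance captured by each retained component
```

The columns of `W` are orthonormal, and the variance of each column of `Z` equals the corresponding eigenvalue, which is why the eigenvalues are commonly read as the variance explained by each component.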


Tags: Tutorial