chapter eleven
This chapter covers:
- Why do we need dimension reduction?
- What are the problems of high dimensionality and collinearity?
- What is principal component analysis?
- What is t-SNE?
- What is UMAP?
Our first stop in dimension reduction brings us to three powerful and popular algorithms: principal component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE), and uniform manifold approximation and projection (UMAP). All three of these dimension reduction algorithms turn a set of (potentially many) variables into a smaller number of variables that retain as much of the original, multidimensional information as possible. Each algorithm does this in a different way.
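To make "fewer variables that retain most of the information" concrete, here is a minimal sketch of the idea behind PCA, reducing two correlated variables to one. It is only an illustration under stated assumptions: the function name `pca_2d_to_1d` and the toy data are hypothetical, it uses plain Python rather than any particular library, and it relies on a closed-form shortcut that works only for two variables.

```python
import math


def pca_2d_to_1d(points):
    """Project 2-D points onto their direction of greatest variance.

    Returns the 1-D scores, the unit axis used for the projection, and the
    fraction of the total variance that the single new variable retains.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2x2 sample covariance matrix.
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)

    # For a 2x2 covariance matrix, the angle of the first principal axis
    # has a closed form; more variables need a full eigendecomposition.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    axis = (math.cos(theta), math.sin(theta))

    # Each point is now described by one number instead of two.
    scores = [x * axis[0] + y * axis[1] for x, y in centered]
    var_pc1 = sum(s * s for s in scores) / (n - 1)
    retained = var_pc1 / (sxx + syy)
    return scores, axis, retained


# Hypothetical toy data: the second variable is roughly twice the first.
points = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
scores, axis, retained = pca_2d_to_1d(points)
print(f"variance retained by one variable: {retained:.4f}")
```

Because the two toy variables are almost perfectly collinear, a single new variable keeps nearly all of the variation in the data. t-SNE and UMAP pursue the same goal of fewer variables, but through different means that this sketch does not cover.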
Note
One of the earliest examples of dimension reduction is the two-dimensional map, which represents the three-dimensional world on a flat surface! Another form of dimension reduction that we encounter in our daily lives is the compression of audio into formats like .mp3 and .aac.