Dimensionality in R

This is an excerpt from Manning's book Machine Learning with R, the tidyverse, and mlr.
Dimension-reduction algorithms take unlabeled data (they are unsupervised learning methods) with many variables (high-dimensional data) and learn a way of representing it in a smaller number of dimensions. Dimension reduction may be used as an exploratory technique (because it’s very difficult for humans to visually interpret data in more than two or three dimensions at once) or as a preprocessing step in the machine learning pipeline (it can help mitigate problems such as collinearity and the curse of dimensionality, terms I’ll define in later chapters). Dimension reduction can also help us visually confirm the performance of classification and clustering algorithms, by allowing us to plot the data in two or three dimensions.
Mitigating the curse of dimensionality
In chapter 5, I discussed the curse of dimensionality. This slightly dramatic-sounding phenomenon describes a set of challenges we encounter when trying to identify patterns in a dataset with many variables. One aspect of the curse of dimensionality is that, for a fixed number of cases, as we increase the number of dimensions in the dataset (expand the feature space), the cases get farther and farther apart. To reiterate this point, figure 13.1 reproduces figure 5.2 from chapter 5. In this situation, the data is said to become sparse. Many machine learning algorithms struggle to learn patterns from sparse data and may start to learn from the noise in the dataset instead.
Figure 13.1. Data becomes more sparse as the number of dimensions increases. Two classes are shown in one-, two-, and three-dimensional feature spaces. The dotted lines in the three-dimensional representation are to clarify the position of the points along the z-axis. Note the increasing empty space with increased dimensions.
Another aspect of the curse of dimensionality is that as the number of dimensions increases, the distances between the cases begin to converge to a single value. Put another way, for a particular case, the ratio between the distance to its nearest neighbor and its furthest neighbor tends toward 1 in high dimensions. This presents a challenge to algorithms that rely on measuring distances (particularly Euclidean distance), such as k-nearest neighbors, because distance starts to become meaningless.
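We can see both effects with a quick simulation. The following is a minimal sketch using only base R (the sample size and dimensions are arbitrary illustrative choices, not taken from the book): we draw a fixed number of uniformly distributed points in an increasing number of dimensions, then track the mean distance from each case to its nearest neighbor (which grows as the data becomes sparse) and the mean ratio of nearest- to farthest-neighbor distance (which tends toward 1).

```r
# A minimal sketch in base R: draw n uniform points in d dimensions,
# then measure sparsity and distance concentration.
set.seed(42)

neighbour_stats <- function(d, n = 100) {
  points <- matrix(runif(n * d), nrow = n, ncol = d)
  dists  <- as.matrix(dist(points))   # pairwise Euclidean distances
  diag(dists) <- NA                   # ignore each case's distance to itself
  nearest  <- apply(dists, 1, min, na.rm = TRUE)
  farthest <- apply(dists, 1, max, na.rm = TRUE)
  c(mean_nearest = mean(nearest),            # grows: the data becomes sparse
    mean_ratio   = mean(nearest / farthest)) # tends toward 1: distances converge
}

sapply(c(2, 10, 100, 1000), neighbour_stats)
```

Running this shows the mean nearest-neighbor distance increasing and the nearest-to-farthest ratio creeping toward 1 as the number of dimensions grows, which is exactly why distance-based algorithms struggle in high-dimensional feature spaces.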
How can you mitigate the impact of the curse of dimensionality and collinearity on the predictive performance of your models? Why, with dimension reduction, of course! If you can compress most of the information from 100 variables into just 2 or 3, the problems of data sparsity and near-equal distances disappear. If you can turn two collinear variables into one new variable that captures all the information of both, the problem of dependence between them disappears.
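As a concrete illustration, here is a minimal sketch using base R’s prcomp() with simulated data of my own (not an example from the book): two highly collinear variables are compressed into a single principal component that captures nearly all of their joint information.

```r
# A minimal sketch: compress two collinear variables into one component.
set.seed(42)

x1 <- rnorm(100)
x2 <- x1 + rnorm(100, sd = 0.1)   # x2 is almost a copy of x1 (collinear)

pca <- prcomp(cbind(x1, x2), scale. = TRUE)
summary(pca)
# PC1 explains ~99% of the variance, so we can keep just the first
# component and discard the collinearity problem along with PC2.
```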
But we’ve already encountered another set of techniques that can mitigate the curse of dimensionality and collinearity: regularization. As we saw in chapter 11, regularization can shrink parameter estimates and even completely remove weakly contributing predictors. Regularization can therefore mitigate the sparsity that results from the curse of dimensionality and remove variables that are collinear with others.
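For instance, here is a minimal sketch assuming the glmnet package is available (the data is simulated purely for illustration): LASSO regularization (alpha = 1) shrinks the coefficients of weakly contributing predictors all the way to zero, effectively removing them from the model.

```r
# A minimal sketch with the glmnet package: LASSO shrinks coefficients
# and sets weakly contributing ones exactly to zero.
library(glmnet)
set.seed(42)

n <- 100
x <- matrix(rnorm(n * 10), nrow = n)      # 10 predictors...
y <- 2 * x[, 1] - 3 * x[, 2] + rnorm(n)   # ...but only 2 carry signal

cv <- cv.glmnet(x, y, alpha = 1)          # cross-validate the penalty strength
coef(cv, s = "lambda.1se")
# Most coefficients come back as exactly zero, leaving a sparser,
# less collinear set of predictors in the model.
```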