
This is an excerpt from Manning's book Machine Learning with R, the tidyverse, and mlr.
Tree-based models can be used for both classification and regression tasks, so you may see them described as classification and regression trees (CART). However, CART is a trademarked algorithm whose code is proprietary. The rpart algorithm is simply an open source implementation of CART. You’ll learn how to use trees for regression tasks in chapter 12.
At each stage of the tree-building process, the rpart algorithm considers all of the predictor variables and selects the one that does the best job of discriminating the classes. It starts at the root and then, at each branch, looks again for the feature that best discriminates the classes among the cases that took that branch. But how does rpart decide on the best feature at each split? This can be done a few different ways, and rpart offers two approaches: the reduction in entropy (called the information gain) and the reduction in Gini index (called the Gini gain). The two methods usually give very similar results, but the Gini index (named after the sociologist and statistician Corrado Gini) is slightly faster to compute, so we’ll focus on it.
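To make the Gini gain concrete, here is a minimal sketch in base R. The helper names gini_index and gini_gain, and the class counts in the example call, are my own illustrations rather than anything from the rpart package:

gini_index <- function(class_counts) {
  # Gini index of a node: 1 - sum(p^2), where p is the vector of
  # class proportions at that node
  p <- class_counts / sum(class_counts)
  1 - sum(p^2)
}

gini_gain <- function(parent, left, right) {
  # Gini gain of a candidate split: the parent's Gini index minus
  # the size-weighted average of the two child nodes' Gini indices
  n <- sum(parent)
  gini_index(parent) -
    (sum(left)  / n) * gini_index(left) -
    (sum(right) / n) * gini_index(right)
}

# A parent node with 40 cases of class A and 60 of class B,
# split into two purer child nodes
gini_gain(parent = c(40, 60), left = c(35, 10), right = c(5, 50))
# ~0.233: this split reduces class impurity by that much

At each node, rpart evaluates candidate splits in this fashion and keeps the one with the largest gain. When fitting a tree yourself, you choose between the two criteria through the parms argument, for example rpart(..., parms = list(split = "gini")) or parms = list(split = "information").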
In this section, I’ll show you which hyperparameters need to be tuned for the rpart algorithm, what they do, and why we need to tune them to get the best-performing tree possible.

Decision tree algorithms are described as greedy. By greedy, I don’t mean they take an extra helping at the buffet line; I mean they search for the split that performs best at the current node, rather than the one that would produce the best result globally. For example, a particular split might discriminate the classes best at the current node but result in poor separation further down that branch. Conversely, a split that results in poor separation at the current node may yield better separation further down the tree. A decision tree algorithm would never pick this second split, because it considers only locally optimal splits, not globally optimal ones. There are three issues with this approach: