chapter twenty

20 Finishing up

Our purpose in these last few pages is to survey the results from chapters 2 through 19 and to review the techniques we used along the way. Rather than retrace our journey chapter by chapter, we’ll consolidate our findings into nine “learning areas,” each further broken down by packages, applied techniques, and chapter references. For instance, between chapters 5 and 14, we developed four types of models using a mix of base R and packaged functions: linear regression, regression tree, analysis of variance (ANOVA), and logistic regression. Modeling is accordingly one of our nine learning areas. In section 20.4, we’ll review which models were applied where and to what ends.
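As a quick reminder of what those four model types look like in code, here is a minimal sketch. It is only an illustration under stated assumptions: it uses the built-in mtcars data set rather than any data from earlier chapters, and it fits the regression tree with the rpart package, which may not be the package used elsewhere in the book.

```r
# Illustrative only: mtcars ships with R and is not the book's data.
data(mtcars)

# Linear regression: continuous response, continuous predictors
fit_lm <- lm(mpg ~ wt + hp, data = mtcars)

# ANOVA: compare mean mpg across the cylinder groups
fit_aov <- aov(mpg ~ factor(cyl), data = mtcars)

# Logistic regression: binary response (am is coded 0 or 1)
fit_glm <- glm(am ~ wt, data = mtcars, family = binomial)

# Regression tree: rpart is an assumption here, chosen because it is
# bundled with standard R installations
if (requireNamespace("rpart", quietly = TRUE)) {
  fit_tree <- rpart::rpart(mpg ~ wt + hp, data = mtcars, method = "anova")
}

# Inspect any fit the usual way, for example:
summary(fit_lm)
```

The first three calls come from base R (the stats package); only the tree requires a packaged function, which mirrors the mix of base R and packaged functions described above.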

The following learning areas are listed in the order in which they will be presented:

Cluster analysis (20.1)
Significance testing (20.2)
Effect size testing (20.3)
Modeling (20.4)
Operations research (20.5)
Probability (20.6)
Statistical dispersion (20.7)
Standardization (20.8)
Summary statistics and visualization (20.9)

In addition, we’ve created a series of Sankey diagrams (see chapter 3), one for each learning area, that plot the relationships between learning areas, packages and base R functions, techniques, and chapter numbers. These final visualizations offer snapshots of the same confluences.
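A diagram of this kind can be sketched roughly as follows. This is a hedged illustration, not the code behind the book’s figures: it assumes the networkD3 package, and the node labels, the chapter placeholders ("chapter X", "chapter Y"), and the link values are all hypothetical.

```r
# Hypothetical nodes: one learning area, two sources of functions,
# two techniques, and two placeholder chapters.
nodes <- data.frame(
  name = c("Modeling",            # 0
           "base R",              # 1
           "packaged functions",  # 2
           "linear regression",   # 3
           "regression tree",     # 4
           "chapter X",           # 5 (placeholder)
           "chapter Y"),          # 6 (placeholder)
  stringsAsFactors = FALSE
)

# Links use zero-based row positions in `nodes`, as networkD3 expects.
# The values are arbitrary weights that set the width of each band.
links <- data.frame(
  source = c(0, 0, 1, 2, 3, 4),
  target = c(1, 2, 3, 4, 5, 6),
  value  = c(1, 1, 1, 1, 1, 1)
)

# Build the widget only if networkD3 is installed (an assumption).
if (requireNamespace("networkD3", quietly = TRUE)) {
  sankey <- networkD3::sankeyNetwork(
    Links = links, Nodes = nodes,
    Source = "source", Target = "target",
    Value = "value", NodeID = "name",
    fontSize = 12, nodeWidth = 30
  )
  # Type `sankey` at the console to render the diagram.
}
```

Reading left to right, each band traces one path from the learning area through a package or base R, to a technique, and finally to the chapter where that technique was applied.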
