13 Advanced transformations of data frames
This chapter covers
- Performing advanced transformations of data frames and grouped data frames
- Chaining transformation operations to create data processing pipelines
- Sorting, joining, and reshaping data frames
- Working with categorical data
- Evaluating classification models
In chapter 12, you learned how to perform basic transformations of data frames by using operation-specification syntax with the combine function. In this chapter, you will learn more advanced ways of using this syntax, along with more functions that accept it: select, select!, transform, transform!, subset, and subset!. With these functions, you can conveniently perform any operation on columns that you need. At the same time, these functions are optimized for speed and can optionally use multiple threads to perform computations. As in chapter 12, I also show you how to specify these transformations by using the DataFramesMeta.jl domain-specific language.
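To give a first impression of how the functions named above differ, here is a minimal sketch on a small made-up data frame. The column names id, x, and g and their values are assumptions chosen only for illustration; they do not come from the data used later in the chapter.

```julia
using DataFrames

# Hypothetical toy data, used only for this illustration
df = DataFrame(id = 1:4, x = [1.0, 2.0, 3.0, 4.0], g = ["a", "b", "a", "b"])

# select keeps only the columns produced by the requested operations
s = select(df, :id, :x => ByRow(v -> 2v) => :x2)

# transform keeps all source columns and appends the new column
t = transform(df, :x => (v -> v .- sum(v) / length(v)) => :x_centered)

# subset keeps the rows for which the condition is true
r = subset(df, :x => ByRow(>(2.0)))

# on a grouped data frame, the operation is applied separately to each group
gt = transform(groupby(df, :g), :x => sum => :x_group_sum)
```

All four calls use the same source => function => target pattern that combine accepts. The variants with the ! suffix (select!, transform!, subset!) follow the same pattern but modify the data frame passed to them in place instead of returning a new one.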
In this chapter, you will also learn how to combine multiple tables by using join operations. DataFrames.jl has efficient implementations of all standard joins: inner join, left and right joins, outer join, semi and anti joins, and cross join. I will also show you how to reshape data frames with the stack and unstack functions.
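The following sketch shows a few of these joins and a round trip through stack and unstack. The two tables, people and scores, are hypothetical and exist only for this illustration.

```julia
using DataFrames

# Hypothetical toy tables, used only for this illustration
people = DataFrame(id = [1, 2, 3], name = ["Ann", "Bob", "Cy"])
scores = DataFrame(id = [2, 3, 4], math = [90, 75, 60], art = [80, 85, 70])

# inner join: only ids present in both tables
inner = innerjoin(people, scores, on = :id)

# left join: all rows of people; score columns are missing where there is no match
left = leftjoin(people, scores, on = :id)

# anti join: rows of people that have no match in scores
anti = antijoin(people, scores, on = :id)

# stack: wide to long format, producing the columns id, variable, and value
long = stack(scores, [:math, :art])

# unstack: long back to wide format, one column per distinct value of variable
wide = unstack(long, :id, :variable, :value)
```

In this sketch, wide holds the same data as scores, which shows that stack and unstack are inverse operations here. Do not rely on the row order of the join results unless you sort them explicitly.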