When performing data analysis or feature engineering (which is my favorite part of machine learning; see chapter 13), nothing makes me quite as happy as window functions. At first glance, they look like a watered-down version of the split-apply-combine pattern introduced in chapter 9. Then you open the blinds and bam: powerful manipulations in a short, expressive body of code.
Those who don’t know window functions are bound to reimplement their functionality poorly. This has been my experience coaching data analysts, scientists, and engineers. If you find yourself struggling to
- Rank records
- Identify the top/bottom record according to a set of conditions
- Get a value from a previous observation in a table (e.g., using our temperature data frame from chapter 9 and asking “What was the temperature yesterday?”)
- Build trended features (i.e., features that summarize past observations, such as the average of the observations for the previous week)