This chapter covers
- Window functions and the kinds of data transformations they enable.
- Summarizing, ranking, and analyzing data using the different classes of window functions.
- Building static, growing, and unbounded windows for your functions.
- Applying UDFs to windows as custom window functions.
When performing data analysis or feature engineering (my favorite part of machine learning; see chapter 12), nothing makes me happier than window functions. At first glance, they look like a watered-down version of the split-apply-combine pattern introduced in chapter 9. Then you open the blinds and bam! Powerful manipulations in a short, expressive body of code.
Those who don’t know window functions are bound to reimplement their functionality, poorly. This has been my experience coaching data analysts, scientists, and engineers. If you find yourself struggling to
- Rank records
- Identify the top/bottom record according to a predicate
- Get a value from a previous observation
- Build trended features (meaning features that summarize past observations, such as the average of the observations for the previous week)
you will find that window functions will multiply your productivity and simplify your code.