10 Your data under a different lens: Window functions

 

This chapter covers:

  • Window functions and the kind of data transformation they enable
  • Summarizing, ranking, and analyzing data using the different classes of window functions
  • Building static, growing, and unbounded windows to your functions
  • Apply UDF to windows as custom window functions

When performing data analysis or feature engineering (which is my favorite part of machine learning; see chapter 13), nothing makes me quite as happy as window functions. On first glance, they look like a watered-down version of the split-apply-combine pattern introduced in chapter 9. Then you open the blinds and bam—powerful manipulations in a short, expressive body of code.

Those who don’t know window functions are bound to reimplement its functionality poorly. This has been my experience coaching data analysts, scientists, and engineers. If you find yourself struggling to

  • Rank records
  • Identify the top/bottom record according to a set of conditions
  • Get a value from a previous observation in a table (e.g., using our temperature data frame from chapter 9 and asking “What was the temperature yesterday?”)
  • Build trended features (i.e., features that summarize past observations, such as the average of the observations for the previous week)

you will find that window functions will multiply your productivity and simplify your code.

10.1 Growing and using a simple window function

10.1.1 Identifying the coldest day of each year, the long way

10.1.2 Creating and using a simple window function to get the coldest days

10.1.3 Comparing both approaches

10.2 Beyond summarizing: Using ranking and analytical functions

10.2.1 Ranking functions: Quick, who’s first?

10.2.2 Analytic functions: Looking back and ahead

10.3 Flex those windows! Using row and range boundaries

10.3.1 Counting, window style: Static, growing, unbounded

10.3.2 What you are vs. where you are: Range vs. rows

10.4 Going full circle: Using UDFs within windows

10.5 Look in the window: The main steps to a successful window function

Summary