chapter ten

10 Reshaping and Pivoting

This chapter covers:

The differences between wide and narrow data
Generating a pivot table from a DataFrame
Aggregating pivot table values by sum, average, count and more
Stacking and unstacking index levels in a DataFrame
Melting an aggregate DataFrame

As we've seen many times before, a dataset can arrive in a format that is not ideal for the analysis that we'd like to perform on it. One study even estimates that 80% of "analysis" consists of cleaning up data and contorting it into the proper shape[1].

Sometimes, issues can be confined to a specific column, row, or cell. For example, a column may have the wrong data type, a row may have missing values, and a cell may have incorrect character casing. Other times, there can be larger, structural problems with a dataset that extend beyond the data itself. As one example, the values may be stored in a format that makes it easy to extract a single row but difficult to perform an aggregate summary of the data, or vice versa.

To reshape a dataset means to manipulate it into a different shape, one that tells a new story that could not be properly told from its original presentation. Reshaping offers a new "view" or "perspective" on the data.

10.1 Wide vs. Narrow Data

10.2 Creating a Pivot Table from a DataFrame

10 Reshaping and Pivoting

This chapter covers:

10.1 Wide vs. Narrow Data

10.2 Creating a Pivot Table from a DataFrame

10.2.1 The pivot_table Method

10.2.2 Additional Options for Pivot Tables

10.3 Stacking and Unstacking Index Levels

10.4 Melting a Dataset

10.4.1 Melting a Dataset

10.5 Exploding a List of Values

10.6 Coding Challenge

10.7 Summary