1 Series

 

If you have any experience with pandas, you know that we typically work with data in two-dimensional tables known as data frames, with rows and columns. But each column in a data frame is built from a series, a one-dimensional data structure (figure 1.1), which means you can think of a data frame as a collection of series.

Figure 1.1 Each of a data frame’s columns is a series.
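To make the relationship concrete, here is a minimal sketch. The column names and values are invented for illustration and do not come from the book's exercises. It shows that selecting a single column from a data frame gives you a series that shares the data frame's index.

```python
import pandas as pd

# Hypothetical data, invented for this illustration
df = pd.DataFrame({
    "math": [90, 80, 70],
    "history": [85, 75, 95],
})

# Selecting one column with square brackets returns a Series
col = df["math"]

print(type(col).__name__)          # the class name of the selected column
print(col.index.equals(df.index))  # the column shares the data frame's index
```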

This perspective is particularly useful once you learn what methods are available on a series, because most of those methods are also available on data frames—but instead of getting a single result, we get one result for each column in the data frame. For example, when applied to a series, the mean method returns the mean of the values in the series (figure 1.2). If you invoke mean on a data frame, pandas invokes the mean method on each column, returning a collection of mean values. Moreover, those values are themselves returned as a series on which you can invoke further methods.

Figure 1.2 Invoking a series method (such as mean) on a data frame often returns one value for each column.
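The following sketch, again using made-up scores, compares the two calls. Invoking mean on a series returns a single number, whereas invoking it on a data frame returns a series with one mean per column, on which you can call further series methods.

```python
import pandas as pd

# Hypothetical data, invented for this illustration
scores = pd.DataFrame({
    "math": [90, 80, 70],
    "history": [85, 75, 95],
})

# On a series: a single value
series_mean = scores["math"].mean()

# On a data frame: one mean per column, returned as a series
# whose index holds the column names
frame_means = scores.mean()

# Because the result is a series, series methods work on it
best_column = frame_means.idxmax()   # label of the largest mean

print(series_mean)
print(frame_means)
print(best_column)
```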

A deep understanding of series can be useful in other ways, too. In particular, a boolean index (also known as a mask index) is typically itself a series of True and False values, and with it we can retrieve selected rows and columns of a data frame. (If you aren’t familiar with boolean indexes, see the sidebar “Selecting values with booleans,” later in this chapter.)
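As a rough sketch of the idea, the example below builds boolean series from comparisons and uses them to select rows and then columns. The names and scores are hypothetical, and using .loc for the column selection is one reasonable approach, not necessarily the one the book takes.

```python
import pandas as pd

# Hypothetical data, invented for this illustration
scores = pd.DataFrame(
    {"math": [90, 80, 70], "history": [85, 75, 95]},
    index=["Ana", "Ben", "Cy"],
)

# A comparison on a series produces a boolean series (the mask)
row_mask = scores["math"] >= 80

# Passing the mask in square brackets keeps the rows where it is True
selected_rows = scores[row_mask]

# A mask indexed by column names can select columns via .loc
col_mask = scores.mean() > 80
selected_cols = scores.loc[:, col_mask]

print(selected_rows)
print(selected_cols)
```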

Useful references

Exercise 1 Test scores

Working it out

Solution

Beyond the exercise

Exercise 2 Scaling test scores

Working it out

Solution

Beyond the exercise

Exercise 3 Counting tens digits

Working it out

Solution

Beyond the exercise

Exercise 4 Descriptive statistics
