1 Series

 

If you have any experience with Pandas, then you know that we typically work with data in two-dimensional tables, known as "data frames," with rows and columns. But each column in a data frame is built from a "series," a one-dimensional data structure, which means that you can think of a data frame as a collection of series.

Figure 1.1. Each of a data frame’s columns is a series
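To make the "collection of series" idea concrete, here is a minimal sketch. The column names and scores are invented for illustration and don't come from the exercises in this chapter. It shows that retrieving a single column from a data frame gives us back a series.

```python
import pandas as pd

# Hypothetical data, invented for this illustration
df = pd.DataFrame({
    "english": [90, 85, 70],
    "math": [80, 95, 60],
})

# Selecting a single column with square brackets returns a series
english = df["english"]
is_series = isinstance(english, pd.Series)

# Every column in the data frame is a series
column_types = {name: type(df[name]).__name__ for name in df.columns}

print(is_series)
print(column_types)
```

Each of these series shares the data frame's index, which is how the columns stay aligned row by row.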

This perspective is particularly useful once you learn what methods are available on a series, because most of those methods are also available on data frames—only instead of getting a single result, we’ll get one result for each column in the data frame. For example, the mean method, when applied to a series, returns the mean of the values in the series. If you invoke mean on a data frame, then Pandas will invoke the mean method on each column, returning a collection of mean values. Moreover, those values are themselves returned as a series, on which you can invoke further methods.

Figure 1.2. Invoking a series method (such as mean) on a data frame often returns one value for each column
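The following short sketch, again using made-up numbers, compares mean on a series with mean on a data frame. The variable names are our own choices, not anything required by Pandas.

```python
import pandas as pd

# Hypothetical values, invented for this illustration
s = pd.Series([10, 20, 30])
series_mean = s.mean()          # a single number

df = pd.DataFrame({
    "a": [10, 20, 30],
    "b": [1, 2, 3],
})
column_means = df.mean()        # a series, with one mean per column

# Because the result is itself a series, we can call further methods on it
highest_mean = column_means.max()

print(series_mean)
print(column_means)
print(highest_mean)
```

The index of column_means contains the data frame's column names, so we can retrieve the mean for a particular column with column_means["a"].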

A deep understanding of series can be useful in other ways, too. In particular, with a "boolean index" (also known as a "mask index"), we can retrieve selected rows and columns of a data frame. (If you aren’t familiar with boolean indexes, see the sidebar, "Selecting values with booleans," below.)
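As a brief preview, here is one way a boolean index might look in practice. The scores, the names, and the cutoff of 70 are all assumptions made for this sketch.

```python
import pandas as pd

# Hypothetical scores, invented for this illustration
scores = pd.Series([55, 82, 91, 67])

# Comparing a series with a value returns a boolean series (the "mask")
mask = scores > 70

# Applying the mask keeps only the elements where the mask is True
passing = scores[mask]

# The same technique selects rows, and optionally columns, of a data frame
df = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "score": [55, 82, 91, 67],
})
passing_rows = df.loc[df["score"] > 70]
passing_names = df.loc[df["score"] > 70, "name"]

print(passing)
print(passing_rows)
```

Note that the mask is itself a series, which is one reason why knowing how series behave pays off when working with data frames.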

1.1 Useful references

1.2 Exercise 1: Test scores

1.2.1 Discussion

1.2.2 Solution

1.2.3 Beyond the exercise

1.3 Exercise 2: Scaling test scores

1.3.1 Discussion

1.3.2 Solution

1.3.3 Beyond the exercise

1.4 Exercise 3: Counting 10s digits

1.4.1 Discussion

1.4.2 Solution

1.4.3 Beyond the exercise

1.5 Exercise 4: Descriptive statistics

1.5.1 Discussion

1.5.2 Solution

1.5.3 Beyond the exercise
