8 Analyzing tables using Pandas

 

This section covers

  • Storing 2D tables using the Pandas library
  • Summarizing 2D table content
  • Manipulating row and column content
  • Visualizing tables using the Seaborn library

The ad-click data for case study 2 is saved in a two-dimensional table. Data tables are commonly used to store information, and they come in different formats: some are saved as Excel spreadsheets, and others are text-based CSV files in which the columns are separated by commas. The format of a table isn’t important. What is important is its structure. All tables share the same structural features: every table contains horizontal rows and vertical columns, and quite often, a header row assigns an explicit name to each column.
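To make that shared structure concrete, here is a minimal sketch that splits a tiny piece of comma-separated text into a header, rows, and columns using Python's built-in csv module. The text itself (the `Name` and `Count` columns and their values) is made up purely for illustration and is not part of the case study data.

```python
import csv
import io

# Hypothetical comma-separated text, invented for illustration only.
csv_text = "Name,Count\nA,1\nB,2\n"

reader = csv.reader(io.StringIO(csv_text))
header = next(reader)   # The first line holds the column names.
rows = list(reader)     # Every remaining line is one horizontal row.

# Each vertical column gathers the values found at one position in every row.
columns = {name: [row[i] for row in rows] for i, name in enumerate(header)}

print(header)
print(rows)
print(columns)
```

Note that the csv module returns every value as a string. Converting the values to numbers is left to the caller.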

8.1 Storing tables using basic Python

Let’s define a sample table in Python. The table stores measurements, in centimeters, for various species of fish. Our measurement table contains three columns: Fish, Length, and Width. The Fish column stores the name of each species, and the Length and Width columns specify the length and width of that species. We represent this table as a dictionary. The column names serve as dictionary keys, and each key maps to a list of that column’s values.

Listing 8.1 Storing a table using Python data structures
fish_measures = {'Fish': ['Angelfish', 'Zebrafish', 'Killifish', 'Swordtail'],
                 'Length': [15.2, 6.5, 9, 6],
                 'Width': [7.7, 2.1, 4.5, 2]}
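
The dictionary representation makes columns easy to reach but rows less so. The sketch below illustrates this under the assumption that all column lists have equal length. The `get_row` helper is a hypothetical function introduced here for illustration and is not part of any library.

```python
fish_measures = {'Fish': ['Angelfish', 'Zebrafish', 'Killifish', 'Swordtail'],
                 'Length': [15.2, 6.5, 9, 6],
                 'Width': [7.7, 2.1, 4.5, 2]}

# A column is retrieved with a single dictionary lookup.
lengths = fish_measures['Length']

def get_row(table, index):
    """Hypothetical helper: build one row by reading the same index in every column."""
    return {name: values[index] for name, values in table.items()}

zebrafish_row = get_row(fish_measures, 1)

print(lengths)        # [15.2, 6.5, 9, 6]
print(zebrafish_row)  # {'Fish': 'Zebrafish', 'Length': 6.5, 'Width': 2.1}
```

Retrieving a row requires touching every column list, which is one reason a dedicated table structure is more convenient than a plain dictionary.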

8.2 Exploring tables using Pandas

8.3 Retrieving table columns

8.4 Retrieving table rows

8.5 Modifying table rows and columns

8.6 Saving and loading table data

8.7 Visualizing tables using Seaborn

Summary