12 Imports and exports

 

This chapter covers

  • Importing JSON data
  • Flattening a nested collection of records
  • Downloading a CSV from an online website
  • Reading from and writing to Excel workbooks

Data sets come in a variety of file formats: comma-separated values (CSV), tab-separated values (TSV), Excel workbooks (XLSX), and more. Some data formats do not store data in tabular format; instead, they nest collections of related data inside a key-value store. Consider the following two examples. Figure 12.1 stores data in a table, and figure 12.2 stores the same data in a Python dictionary.

Figure 12.1 A table of Oscar winners

Python’s dictionary is an example of a key-value data structure:

Figure 12.2 A Python dictionary (key-value store) with the same data
{
    2000: [
        {
            "Award": "Best Actor",
            "Winner": "Russell Crowe"
        },
        {
            "Award": "Best Actress",
            "Winner": "Julia Roberts"
        }
    ],
    2001: [
        {
            "Award": "Best Actor",
            "Winner": "Denzel Washington"
        },
        {
            "Award": "Best Actress",
            "Winner": "Halle Berry"
        }
    ]
}

12.1 Reading from and writing to JSON files

12.1.1 Loading a JSON file Into a DataFrame

12.1.2 Exporting a DataFrame to a JSON file

12.2 Reading from and writing to CSV files

12.3 Reading from and writing to Excel workbooks

12.3.1 Installing the xlrd and openpyxl libraries in an Anaconda environment

12.3.2 Importing Excel workbooks

12.3.3 Exporting Excel workbooks

12.4 Coding challenge

12.4.1 Problems

12.4.2 Solutions

Summary