4 The DataFrame Object
This chapter covers:
- Instantiating a DataFrame object from a dictionary and a NumPy ndarray
- Importing a multidimensional dataset with the read_csv function
- Sorting a DataFrame by one or more columns
- Accessing rows and columns from a DataFrame
- Setting and resetting the index of a DataFrame
- Renaming column and index values
4.1 Overview of a DataFrame
The workhorse of the Pandas library, the DataFrame is a 2-dimensional data structure consisting of rows and columns. Two points of reference, a row and a column, are needed to extract any given value from the dataset. A DataFrame can be described as a grid or a table of data, similar to one you'd find in a spreadsheet application like Excel.
4.1.1 Creating a DataFrame from a Dictionary
As always, let's begin by importing Pandas. We'll also be using the NumPy library for some random data generation. It is commonly assigned the alias np.
In [1] import pandas as pd
import numpy as np
Before we import our first dataset, let's practice instantiating a DataFrame from some native Python objects. One suitable data structure is a dictionary; its keys will serve as the column names, and the value stored under each key (a list, for example) will supply that column's values.
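As a minimal sketch of that idea, the snippet below builds a small DataFrame from a dictionary of lists. The variable names and the sample values are made up purely for illustration; they are not part of any dataset used in this chapter.

```python
import pandas as pd

# Hypothetical sample data: each key becomes a column name,
# and each list supplies that column's values from top to bottom.
inventory_data = {
    "Product": ["Pencil", "Notebook", "Eraser"],
    "Quantity": [12, 5, 8],
}

# Pass the dictionary to the DataFrame constructor.
inventory = pd.DataFrame(inventory_data)

# Pandas assigns a default numeric index (0, 1, 2) to the rows.
print(inventory)

# Two points of reference, a row label and a column label,
# identify a single value.
print(inventory.loc[1, "Quantity"])
```

Each list must have the same length; otherwise Pandas raises a ValueError, because every column in a DataFrame needs one value per row.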