chapter four

4 The DataFrame Object

 

This chapter covers:

  • Instantiating a DataFrame object from a dictionary and a NumPy ndarray
  • Importing a multidimensional dataset with the read_csv function
  • Sorting one or more columns in a DataFrame
  • Accessing rows and columns from a DataFrame
  • Setting and resetting the index of a DataFrame
  • Renaming column and index values

4.1      Overview of a DataFrame

The workhorse of the Pandas library, the DataFrame is a 2-dimensional data structure consisting of rows and columns. Because it has two dimensions, extracting any single value requires two points of reference: a row and a column. A DataFrame can be described as a grid or table of data, similar to one you'd find in a spreadsheet application like Excel.

4.1.1   Creating a DataFrame from a Dictionary

As always, let's begin by importing Pandas. We'll also be using the NumPy library for some random data generation. NumPy is commonly assigned the alias np.

In  [1] import pandas as pd
        import numpy as np

Before we import our first dataset, let's practice instantiating a DataFrame from some native Python objects. One suitable data structure is a dictionary: each key serves as a column name, and its corresponding value (a list, for example) supplies that column's values.
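Here is a minimal sketch of that idea. The variable names and the inventory values are made up purely for illustration; any dictionary whose values are equal-length lists should behave the same way.

```python
import pandas as pd

# Hypothetical sample data: each key becomes a column name,
# and each list supplies that column's values from top to bottom.
inventory_data = {
    "Item": ["Apple", "Banana", "Cherry"],
    "Price": [0.50, 0.25, 3.00],
    "Quantity": [10, 24, 5],
}

# Pass the dictionary to the DataFrame constructor.
inventory = pd.DataFrame(data=inventory_data)

# The columns follow the insertion order of the dictionary's keys.
print(inventory.columns.tolist())  # ['Item', 'Price', 'Quantity']

# shape reports (number of rows, number of columns).
print(inventory.shape)  # (3, 3)
```

Because we did not supply an index, Pandas assigns each row a numeric label starting at 0. Note that the lists must all have the same length; if they do not, the constructor raises a ValueError.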

4.1.2   Creating a DataFrame from a NumPy ndarray

4.2      Similarities between Series and DataFrames

4.2.1   Importing a CSV File with the read_csv Function

4.2.2   Shared and Exclusive Attributes between Series and DataFrames

4.2.3   Shared Methods between Series and DataFrames

4.3      Sorting a DataFrame

4.3.1   Sort by Single Column

4.3.2   Sort by Multiple Columns

4.4      Sort by Index

4.4.1   Sort by Row Index

4.4.2   Sort by Column Index

4.5      Setting a New Index

4.6      Selecting Columns or Rows from a DataFrame

4.6.1   Select a Single Column from a DataFrame

4.6.2   Select Multiple Columns from a DataFrame

4.7      Select Rows from a DataFrame

4.7.1   Extract Rows by Index Label

4.7.2   Extract Rows by Index Position

4.7.3   Extract Values from Specific Columns

4.8      Extract Value from Series

4.9      Rename Column or Row