1 Introducing Pandas

 

This chapter covers:

·      The growth of data science in the 21st century

·      The introduction of the pandas library for data analysis

·      The advantages and disadvantages of pandas relative to its competitors

·      The differences between working in Excel vs. a programming language

·      The basics of the DataFrame and the Series, the two primary objects in pandas

·      A tour of the library's features through a working example

Welcome to Pandas In Action! Pandas is a popular library for data analysis built on top of the Python programming language. A library is a collection of code designed to solve a specific but common business problem. You can think of pandas as a digital toolbox that holds various tools for working with data. One piece of Python's vast data science ecosystem, pandas pairs well with other libraries for statistics, natural language processing, machine learning, visualization, and more.
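To make the toolbox idea a little more concrete, here is a minimal sketch of the library's two primary objects, the Series and the DataFrame. The snack names and prices are invented purely for illustration, and the variable names are hypothetical; the example assumes pandas is installed and imported under its conventional alias `pd`.

```python
import pandas as pd

# A Series is a one-dimensional, labeled sequence of values.
# The labels and prices here are made-up illustration data.
prices = pd.Series(
    [1.50, 0.75, 2.25],
    index=["pretzel", "apple", "granola bar"],
    name="price",
)

# A DataFrame is a two-dimensional table of rows and columns.
# Pulling out a single column gives back a Series.
snacks = pd.DataFrame(
    {
        "snack": ["pretzel", "apple", "granola bar"],
        "category": ["salty", "fruit", "sweet"],
        "price": [1.50, 0.75, 2.25],
    }
)

# One tool from the toolbox: find the label of the largest value.
most_expensive = prices.idxmax()

# Another: ask the table for its dimensions as (rows, columns).
n_rows, n_cols = snacks.shape

print(most_expensive)
print(n_rows, n_cols)
```

Both objects carry labels alongside their values, which is what lets pandas look up data by name rather than only by position.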

In this chapter, we’ll take a look at the history and evolution of tools for working with big data. We’ll explore how pandas grew from one financial analyst’s pet project to an industry standard used by companies like Netflix[1], Stripe[2], Google, Facebook, and J.P. Morgan[3]. We'll compare the library's strengths and weaknesses with those of its competitors. Finally, we'll see what pandas is capable of by analyzing a real-world dataset; consider it a sneak peek of the concepts covered throughout the book.

1.1      Data in the 21st Century

 
 
 

1.2      Introducing pandas

 
 
 
 

1.2.1   Pandas vs Graphical Spreadsheet Applications

 

1.2.2   Pandas vs Its Competitors

 
 

1.3      Importing a Dataset

 

1.4      Manipulating a DataFrame

 
 
 

1.5      Counting Values in a Series

 
 
 

1.6      Filtering a Column by One or More Criteria

 

1.7      Grouping Data

 
 

1.8      Summary

 