Chapter 24. Exploring data
- Python’s advantages for handling data
- Jupyter Notebook
- Data aggregation
- Plots with matplotlib
Over the past few chapters, I’ve dealt with some aspects of using Python to get and clean data. Now it’s time to look at a few of the things that Python can help you do to manipulate and explore data.
In this chapter, we’ll look at some common Python tools for data exploration: Jupyter Notebook, pandas, and matplotlib. I can touch only briefly on a few features of these tools, but the aim is to give you an idea of what’s possible and some initial tools to use in exploring data with Python.
Python has become one of the leading languages for data science and continues to grow in that area. As I’ve mentioned, however, Python isn’t always the fastest language in terms of raw performance. On the other hand, some data-crunching libraries, such as NumPy, are largely written in C and so heavily optimized that speed is rarely an issue. In addition, considerations such as readability and accessibility often outweigh pure speed; minimizing the amount of developer time needed is often more important. Python is readable and accessible, and both on its own and in combination with tools developed in the Python community, it’s an enormously powerful tool for manipulating and exploring data.
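To make the point about optimized libraries a little more concrete, here is a minimal sketch that computes the same sum of squares twice: once with a plain Python loop and once with NumPy, where the loop runs inside compiled code. The function names and the array size are illustrative choices of mine, and the timings you see will depend on your machine and your NumPy version, so treat this as a rough demonstration rather than a benchmark.

```python
import timeit

import numpy as np

N = 1_000_000  # illustrative size, chosen arbitrarily


def sum_squares_python(n):
    """Sum the squares of 0..n-1 with a plain Python generator expression."""
    return sum(i * i for i in range(n))


def sum_squares_numpy(n):
    """Same computation, but the element-wise work happens inside NumPy."""
    values = np.arange(n, dtype=np.int64)
    return int((values * values).sum())


# Time three runs of each version; the absolute numbers vary by machine.
python_time = timeit.timeit(lambda: sum_squares_python(N), number=3)
numpy_time = timeit.timeit(lambda: sum_squares_numpy(N), number=3)
print(f"pure Python: {python_time:.3f}s  NumPy: {numpy_time:.3f}s")
```

Both functions return the same value; the difference is only in where the looping happens. Note also that the NumPy version is no harder to read than the pure Python one, which is the other half of the argument above.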