11 Visualization

Data analysis, as you’ve seen throughout this book, is largely about numbers. A typical pandas data frame contains columns and rows full of numbers, and data analysis involves lots of mathematical methods and statistical techniques.

That’s fine, except that we humans are typically bad at understanding large collections of numbers. We’re generally much better at comprehending visual depictions of numbers, especially if we’re trying to understand relationships among our data. So, although we often think of visualization as a way to explain technical ideas in simple terms to non-experts, the fact is that visualization can also be helpful for the experts working on a problem. Seeing a chart or graph can help us put the numbers in perspective, improve our understanding of a problem we’re working on, and thus inform the very analysis that created the visualization.

The 900-pound gorilla in the world of Python data visualization is Matplotlib. There’s no doubt that Matplotlib is powerful—but it’s also overwhelming to many people. Fortunately, pandas provides a visualization API that allows us to create plots from our data without having to use Matplotlib explicitly. We thus get the best of both worlds: the ability to plot information in our data frame, without having to learn too much about Matplotlib’s API. However, if and when you need more power, Matplotlib is there, under the hood.
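To make the idea concrete, here is a minimal sketch of plotting through the pandas API. The data frame, its column names, and its values are invented for illustration and do not come from any dataset used in this chapter. The `Agg` backend line is an assumption that lets the code run without a display; in Jupyter you would normally leave it out.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is required
import pandas as pd

# Hypothetical data, invented purely for illustration
df = pd.DataFrame(
    {"high": [12, 15, 19, 23], "low": [3, 5, 9, 12]},
    index=["Jan", "Feb", "Mar", "Apr"],
)

# The pandas plotting API: one method call, no explicit Matplotlib code
ax = df.plot.line(title="Monthly temperatures")

# The call returns a Matplotlib Axes object, so Matplotlib's own
# methods are still available when you need finer control
ax.set_ylabel("Degrees Celsius")
```

Each numeric column becomes its own line on the plot, and the index supplies the x axis. Because `ax` is an ordinary Matplotlib object, anything you later learn about Matplotlib can be applied to a plot that pandas created.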

Exercise 43 Cities

Working it out

Solution

Beyond the exercise

Exercise 44 Boxplotting weather

Working it out

Solution

Beyond the exercise

Exercise 45 Taxi fare breakdown

Working it out

Solution

Beyond the exercise

Exercise 46 Cars, oil, and ice cream

Working it out

Solution

Beyond the exercise

Exercise 47 Seaborn taxi plots

Working it out