6 Basic graphs

 

This chapter covers

  • Plotting data with bar, box, and dot plots
  • Creating pie charts and tree maps
  • Using histograms and kernel density plots

Whenever we analyze data, the first thing we should do is look at it. For each variable, what are the most common values? How much variability is present? Are there any unusual observations? R provides a wealth of functions for visualizing data. In this chapter, we’ll look at graphs that help you understand a single categorical or continuous variable. This topic includes

  • Visualizing the distribution of a variable
  • Comparing the distribution of a variable across two or more groups

In both cases, the variable can be continuous (for example, car mileage as miles per gallon) or categorical (for example, treatment outcome as none, some, or marked). In later chapters, we’ll explore graphs that display more complex relationships among variables.

The following sections explore the use of bar charts, pie charts, tree maps, histograms, kernel density plots, box plots, violin plots, and dot plots. Some of these may be familiar to you, whereas others (such as tree maps or violin plots) may be new. The goal, as always, is to understand your data better and to communicate this understanding to others. Let’s start with bar charts.

6.1 Bar charts

A bar chart displays the distribution (frequency) of a categorical variable through vertical or horizontal bars. Using the ggplot2 package, we can create a bar chart with the geom_bar() function.
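As a minimal sketch of the idea, the following code draws a frequency bar chart with ggplot2. The dataset and variable are assumptions chosen for illustration, not the chapter's own example: it uses the built-in mtcars data frame and treats cyl (number of cylinders) as a categorical variable.

```r
library(ggplot2)

# Assumption: mtcars (shipped with base R) stands in for your own data frame.
# cyl is stored as a number, so convert it to a factor to treat it as categorical.
cars <- transform(mtcars, cyl = factor(cyl))

# geom_bar() counts the rows in each category and draws one bar per level.
p <- ggplot(cars, aes(x = cyl)) +
  geom_bar() +
  labs(x = "Number of cylinders", y = "Frequency",
       title = "Simple bar chart")

print(p)
```

Adding coord_flip() to the plot would turn the vertical bars into horizontal ones.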

6.1.1 Simple bar charts

6.1.2 Stacked, grouped, and filled bar charts

6.1.3 Mean bar charts

6.1.4 Tweaking bar charts

6.2 Pie charts

6.3 Tree maps

6.4 Histograms

6.5 Kernel density plots

6.6 Box plots

6.6.1 Using parallel box plots to compare groups

6.6.2 Violin plots