This chapter covers
- Starting to work with R and data
- Mastering R’s data frame structure
- Loading data into R
- Recoding data for later analysis
This chapter shows how to start working with R and how to import data into R from diverse sources. This will prepare you to work through the examples in the rest of the book.
Figure 2.1 is a diagram representing a mental model for the book, reshaded to emphasize the purpose of this chapter: starting to work with R and importing data into R. The overall diagram combines the data science process diagram from chapter 1 with a rebus form of the book title. In each chapter, we will reshade this mental model to indicate the parts of the data science process we are emphasizing. For example, in this chapter we are mastering the initial steps of collecting and managing data, and touching on issues of practicality, data, and R (but not yet the art of science).
Figure 2.1. Chapter 2 mental model
Many data science projects start when someone points the analyst toward a bunch of data, and the analyst is left to make sense of it.[5] Your first thought may be to use ad hoc tools and spreadsheets to sort through it, but you will quickly realize that you’re spending more time tinkering with the tools than actually analyzing the data. Luckily, there’s a better way: using R. By the end of the chapter, you’ll be able to confidently use R to extract, transform, and load data for analysis.