2 Starting with R and data
This chapter shows how to start working with R and how to import data into R from diverse sources. This will prepare you to work through the examples in the rest of this book. In this chapter we will:
- Start working with R and data
- Master R’s data frame structure
- Load data into R
- Start re-coding data for later analysis
Figure 2.1 is a diagram representing a mental model for this book, re-shaded to emphasize the purpose of this chapter: starting to work with R and importing data into R. The overall diagram is the data science process diagram from chapter 1 combined with a rebus form of this book's title. In each chapter we will re-shade this mental model to indicate the parts of the data science process we are emphasizing. For example, in this chapter we are mastering the initial steps of collecting and managing data, and touching on issues of practicality, data, and R (but not yet the art of science).
Figure 2.1. Chapter 2 Mental Model
Many data science projects start when someone points the analyst toward a bunch of data and the analyst is left to make sense of it.[5] A first thought may be to use ad hoc tools and spreadsheets to sort through it, but you will quickly realize that you’re spending more time tinkering with the tools than actually analyzing the data. Luckily, there’s a better way: using R. By the end of this chapter, you’ll be able to confidently use R to extract, transform, and load data for analysis.