7 Doing things with lots of data
This chapter covers
- How to read data into R from different sources
- How to inspect your data once it’s in R
- How to write R data to a file
Working with the example data sets has been useful: they’re not overly complex, yet they have features that help demonstrate the tools you’ve been learning to use. But they’re not your data, and your data is what you really want to work with. So let’s explore how to get your data into your workspace, ready to be manipulated and interrogated, and then how to store it for safekeeping once you’re done.
7.1 Tidy data principles
If you’ve ever received data from someone else, you’ve likely found that they don’t share your mental model of how that data should look. Even something as simple as a table of values can be structured in many different ways, depending on who created it.
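To make the point concrete, here is a minimal sketch in base R of one set of measurements laid out in two different ways. The person labels and all the numbers are made up purely for illustration (they are not taken from any real data set), and the object names wide and long are just convenient assumptions.

```r
# Hypothetical values, invented for illustration only
# Layout 1: one row per person, one column per measured variable
wide <- data.frame(
  person = c("A", "B"),
  height = c(150, 140),
  weight = c(48, 36),
  age    = c(63, 41)
)

# Layout 2: the same values, one row per person-variable pair
long <- data.frame(
  person   = rep(wide$person, times = 3),
  variable = rep(c("height", "weight", "age"), each = nrow(wide)),
  value    = c(wide$height, wide$weight, wide$age)
)

print(wide)
print(long)
```

Both data frames hold exactly the same information; only the arrangement differs. Which one is easier to work with depends on what you want to do next.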
If you have, say, the heights, weights, and ages of a group of people,1 you might find these stored as a table with a repeated label identifying each person, a descriptive label for the thing being measured (the variable), and a value corresponding to that variable. This could be entered as follows:
1 See the Wikipedia article at http://mng.bz/We00. The data is partial census data for the Dobe area !Kung San, compiled from interviews conducted by Nancy Howell in the late 1960s, available from http://mng.bz/JAKV.