5 Importing Data from Common Formats
This chapter covers
- Managing files with RStudio Projects
- The basics of the CSV file format
- Using the read_csv() function from the readr package
- Reading Excel data into R
In earlier chapters, we used datasets from the edr package to get the data we needed. That works well for learning to use data transformation and visualization functions. However, real-world data is rarely found in R packages. You may encounter useful data in text files, Excel files, or on the web in myriad forms. While there are hundreds of ways data can be structured and stored, a few formats are especially common, so this chapter focuses on methods for importing two of them, CSV and Excel files, into R (as tibbles).
Because we are going to work with files in this chapter, we’ll first learn how to manage them in our file system by creating a project in RStudio. The idea is to use a dedicated directory as the ‘home base’ for a named project, which has the benefit of isolating each project and its files from other projects (and their associated files).
The edr package, for its part, offers some custom functions to make the examples flow a bit better. We need some CSV and Excel files to import, so a few functions from edr are available for generating these files in our project directory.
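As a brief preview of where the chapter is headed, here is a minimal sketch of both kinds of import. It does not use the edr helper functions. As stand-ins, it writes a small, made-up CSV to a temporary file and borrows an example workbook that ships with the readxl package. The object names (`csv_path`, `fruit_tbl`, `xlsx_path`, `sheet_tbl`) and the data values are illustrative assumptions only.

```r
library(readr)
library(readxl)

# Write a tiny, made-up CSV to a temporary file so the sketch is self-contained
csv_path <- tempfile(fileext = ".csv")
write_csv(
  data.frame(id = 1:3, fruit = c("apple", "banana", "cherry")),
  csv_path
)

# read_csv() guesses the column types and returns a tibble
fruit_tbl <- read_csv(csv_path, show_col_types = FALSE)

# readxl bundles a few example workbooks; readxl_example() returns a file path
xlsx_path <- readxl_example("datasets.xlsx")

# read_excel() reads the first sheet by default and also returns a tibble
sheet_tbl <- read_excel(xlsx_path)

fruit_tbl
sheet_tbl
```

Both functions take a file path as their first argument, which is why knowing where a project’s files live matters before any importing begins.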