2 Creating a dataset

 

This chapter covers

  • Exploring R data structures
  • Using data entry
  • Importing data
  • Annotating datasets

The first step in any data analysis is creating a dataset containing the information to be studied in a format that meets your needs. In R, this task involves

  • Selecting a data structure to hold your data
  • Entering or importing your data into the data structure

Sections 2.1 and 2.2 of this chapter describe the wealth of structures that R can use to hold data. In particular, section 2.2 describes vectors, matrices, arrays, data frames, factors, lists, and tibbles. Familiarizing yourself with these structures (and the notation used to access elements within them) will help you tremendously in understanding how R works. You might want to take your time working through this section.
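As a quick preview of those structures and their access notation, here is a minimal sketch in base R. The object names (`ages`, `status`, `m`, `patients`, `record`) and all values are made up for illustration and don't come from the chapter. Tibbles are left out because they require the add-on tibble package.

```r
# Illustrative values only; none of these objects come from the chapter.
ages   <- c(25, 34, 28)                          # numeric vector
status <- factor(c("Poor", "Improved", "Poor"))  # factor (categorical variable)
m      <- matrix(1:6, nrow = 2, ncol = 3)        # 2 x 3 matrix, filled by column

# A data frame holds columns of different types, one row per observation
patients <- data.frame(id = 1:3, age = ages, status = status)

# A list can hold objects of any type and size
record <- list(title = "demo", ages = ages, m = m)

# Notation for accessing elements
ages[2]            # second element of the vector
m[2, 3]            # element in row 2, column 3 of the matrix
patients$age       # the age column of the data frame
levels(status)     # the distinct categories of the factor
record[["title"]]  # a list component, extracted by name
```

Each structure has its own indexing style: single brackets for vectors, `[row, column]` for matrices, `$` for data frame columns, and double brackets for list components.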

Section 2.3 covers the many methods for importing data into R. Data can be entered manually or imported from an external source. These data sources can include text files, spreadsheets, statistical packages, and database-management systems. For example, the data that I work with typically comes as comma-delimited text files or Excel spreadsheets. On occasion, though, I receive data as SAS and SPSS datasets or through connections to SQL databases. You’ll likely need only one or two of the methods described in this section, so feel free to choose those that fit your situation.
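To give a sense of what importing comma-delimited data looks like, the following sketch uses base R's `read.csv()`. So that it runs without an external file, the data is supplied inline through the `text` argument. The column names and values are invented for illustration. With a real file, you would pass its path as the first argument instead.

```r
# Hypothetical comma-delimited data, embedded as a string so the example
# runs without an external file.
csv_text <- "id,age,status
1,25,Poor
2,34,Improved
3,28,Poor"

# header = TRUE treats the first line as column names;
# stringsAsFactors = FALSE keeps text columns as character vectors.
patients <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)

str(patients)  # inspect the structure of the imported data frame
```

The result is an ordinary data frame, so the access notation for data frames applies to it directly.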

2.1 Understanding datasets

 
 
 

2.2 Data structures

 
 

2.2.1 Vectors

 
 
 
 

2.2.2 Matrices

 
 
 

2.2.3 Arrays

 
 

2.2.4 Data frames

 

2.2.5 Factors

 
 

2.2.6 Lists

 
 
 

2.2.7 Tibbles

 
 
 

2.3 Data input

 
 
 
 

2.3.1 Entering data from the keyboard

 
 
 

2.3.2 Importing data from a delimited text file

 
 

2.3.3 Importing data from Excel

 
 
 