Chapter 1. The data science process
Listing 1.1. Building a decision tree
Listing 1.2. Plotting the confusion matrix
Listing 1.3. Plotting the relation between disposable income and loan outcome
Chapter 2. Loading data into R
Listing 2.1. Reading the UCI car data
Listing 2.2. Exploring the car data
Listing 2.3. Loading the credit dataset
Listing 2.4. Setting column names
Listing 2.5. Building a map to interpret loan use codes
Listing 2.6. Transforming the credit data
Listing 2.7. Summary of Good.Loan and Purpose
Listing 2.8. PUMS data provenance documentation
Listing 2.9. SQL Screwdriver XML configuration file
Listing 2.10. Loading data with SQL Screwdriver
Listing 2.11. Loading data into R from a relational database
Listing 2.12. Selecting a subset of the Census data
Listing 2.13. Recoding variables
Listing 2.14. Summarizing the classifications of work
Chapter 3. Exploring data
Listing 3.1. The summary() command
Listing 3.2. Will the variable is.employed be useful for modeling?
Listing 3.3. Examples of invalid values and outliers
Listing 3.4. Looking at the data range of a variable
Listing 3.5. Checking units can prevent inaccurate results later
Listing 3.6. Plotting a histogram
Listing 3.7. Producing a density plot
Listing 3.8. Creating a log-scaled density plot
Listing 3.9. Producing a horizontal bar chart
Listing 3.10. Producing a bar chart with sorted categories
Listing 3.11. Producing a line plot
Listing 3.12. Examining the correlation between age and income