Introduction

Data is everywhere, and it’s used in practically every industry in one way or another. One of the most common ways to interact with data, whether numbers or text, is with spreadsheet software. This approach offers several useful features: presenting data in a tabular view, allowing calculations to be performed on those values, and producing summaries of the data. What spreadsheets don’t tend to provide is a way to do this repeatedly, reproducibly, or programmatically (without clicking or copying and pasting). Spreadsheets can be great for displaying data (including limited data summaries), but when you want to do something truly powerful with data, you need to go beyond them to a programming language. If you’ve ever had to copy results from a spreadsheet program into a word processor to write a report, you’ll appreciate that there’s a better way: doing all of this within a single technology, R.

This sampler brings together chapters from three Manning books to address these issues. First, we have an overview of data stored in various structures from my own book, Beyond Spreadsheets with R, which details how different spreadsheet concepts (e.g., columns of data and tables) are represented within R. We then progress to a chapter from Practical Data Science with R, 2nd Edition by Nina Zumel and John Mount, which demonstrates how to use R to check the quality of this data with summary statistics and graphics. The last chapter, from R in Action, 2nd Edition by Robert I. Kabacoff, walks through how to generate reports of your results in several formats directly from R, so there is no more copying data between software products to write up your findings.

Moving away from spreadsheets may feel daunting, but reproducible reporting and the sophistication of a programming language make for a superior workflow that, once you are familiar with it, saves time and anguish compared to dealing with miscopied data and out-of-date results. My goal for this sampler is to highlight the benefits of transitioning from spreadsheets to the R language to create reproducible processes for meaningful data analysis and reports. If you’re inspired to dig deeper into this worthy topic, I highly recommend continuing with the full versions of all three books featured in this sampler. Thanks for reading!
