Chapter 5. Advanced data management

 

This chapter covers

  • Mathematical and statistical functions
  • Character functions
  • Looping and conditional execution
  • User-written functions
  • Ways to aggregate and reshape data

In chapter 4, we reviewed the basic techniques used for managing datasets in R. In this chapter, we’ll focus on advanced topics. The chapter is divided into three parts. In the first part, we’ll take a whirlwind tour of R’s many functions for mathematical, statistical, and character manipulation. To make this tour relevant, we begin with a data-management problem that can be solved using these functions. After covering the functions themselves, we’ll look at one possible solution to the problem.

Next, we cover how to write your own functions to accomplish data-management and data-analysis tasks. First, we’ll explore ways of controlling program flow, including looping and conditional execution. Then we’ll investigate the structure of user-written functions and how to invoke them once created.

Finally, we’ll look at ways of aggregating and summarizing data, along with methods of reshaping and restructuring datasets. When aggregating data, you can summarize with any appropriate built-in or user-written function, so the topics covered in the first two parts of the chapter apply directly here.
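
To make that last point concrete, here is a minimal sketch rather than the chapter's own example. It assumes base R's aggregate() function and the mtcars dataset that ships with R; the helper name spread is hypothetical and introduced only for illustration.

```r
# Hypothetical user-written helper (not from the chapter):
# the distance between the largest and smallest value in a vector
spread <- function(x) max(x) - min(x)

# Summarize miles per gallon within each cylinder group.
# The same call accepts a built-in function ...
by_mean <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

# ... or a user-written one, passed the same way through FUN
by_spread <- aggregate(mpg ~ cyl, data = mtcars, FUN = spread)

print(by_mean)
print(by_spread)
```

Both calls return a data frame with one row per cylinder group; only the function supplied to FUN changes.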

5.1. A data-management challenge

5.2. Numerical and character functions

5.3. A solution for the data-management challenge

5.4. Control flow

5.5. User-written functions

5.6. Aggregation and reshaping

5.7. Summary