Chapter 5. Advanced data management
This chapter covers
- Mathematical and statistical functions
- Character functions
- Looping and conditional execution
- User-written functions
- Ways to aggregate and reshape data
In chapter 4, we reviewed the basic techniques used for managing datasets within R. In this chapter, we’ll focus on advanced topics. The chapter is divided into three basic parts. In the first part, we’ll take a whirlwind tour of R’s many functions for mathematical, statistical, and character manipulation. To give this part relevance, we’ll begin with a data management problem that can be solved using these functions. After covering the functions themselves, we’ll look at one possible solution to that problem.
Next, we’ll cover how to write your own functions to accomplish data management and analysis tasks. First, we’ll explore ways of controlling program flow, including looping and conditional statement execution. Then we’ll investigate the structure of user-written functions and how to invoke them once they’re created.
Finally, we’ll look at ways of aggregating and summarizing data, along with methods of reshaping and restructuring datasets. When aggregating data, you can use any appropriate built-in or user-written function to perform the summarization, so the topics covered in the first two parts of the chapter will provide a real benefit.
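As a small preview of that last point, the sketch below passes first a built-in function and then a user-written one to the same aggregation call. It assumes base R's aggregate() function and the mtcars dataset that ships with R; the helper name spread is hypothetical and chosen only for illustration, and this isn't necessarily the approach taken later in the chapter.

```r
# Hypothetical user-written summary function: the range of a numeric vector
spread <- function(x) max(x) - min(x)

# Built-in function as the summary: mean miles per gallon by cylinder count
by_mean <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

# User-written function in the same slot: spread of mpg by cylinder count
by_spread <- aggregate(mpg ~ cyl, data = mtcars, FUN = spread)

print(by_mean)
print(by_spread)
```

Only the FUN argument changes between the two calls, which is why the function-writing skills from the second part of the chapter carry over directly to aggregation.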