8 Time series data: Data preparation
This chapter covers
- Preparing time series data for analysis
- Determining what subset of time series data to use
- Cleaning time series data by handling gaps and missing values
- Analyzing patterns in time series data
Most datasets you will come across have a time component. If the process that generates the data involves taking the same measurement at recurring intervals, the data is called time series data. Examples include the yearly GDP of a country or the output of machinery on a production line. However, even something seemingly static, such as a customer database, has a time component if we look at the date each customer record was created. We might not explicitly think of the data as a time series, but using this time component allows us to unlock additional insights. For example, you could analyze the rate at which new customer records are being created or the times of day at which your operations team enters data into the database.
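To make the customer database example concrete, here is a minimal sketch that uses only the Python standard library. The timestamps are made up for illustration, and the names `created_at`, `records_per_month`, and `records_per_hour` are hypothetical rather than taken from any real dataset. The sketch assumes each customer record carries a creation timestamp and counts records per calendar month (the rate of new records) and per hour of the day (when data is entered).

```python
from collections import Counter
from datetime import datetime

# Hypothetical creation timestamps for five customer records (made-up values).
created_at = [
    datetime(2023, 1, 5, 9, 15),
    datetime(2023, 1, 20, 14, 30),
    datetime(2023, 2, 3, 9, 45),
    datetime(2023, 2, 17, 16, 5),
    datetime(2023, 2, 28, 9, 10),
]

# Rate of new records: count the records created in each calendar month.
records_per_month = Counter(ts.strftime("%Y-%m") for ts in created_at)

# Time of day of data entry: count the records created in each hour of the day.
records_per_hour = Counter(ts.hour for ts in created_at)

print(dict(records_per_month))  # {'2023-01': 2, '2023-02': 3}
print(dict(records_per_hour))   # {9: 3, 14: 1, 16: 1}
```

Even in this tiny sample, the hourly counts suggest that most records were entered around 9 a.m., which is the kind of pattern the creation timestamp can reveal in a table that otherwise looks static.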