4 Using regression for call-center volume prediction

 

This chapter covers

  • Applying linear regression to real-world data
  • Cleaning data to fit curves and models you have not seen before
  • Using Gaussian distributions and predicting points along them
  • Evaluating how well your linear regression predicts the expected values

Armed with the power of regression-based prediction and TensorFlow, you can start working on real-world problems that involve more of the steps in the machine-learning process, such as cleaning data, fitting models to unseen data, and identifying models that aren’t necessarily an easy-to-spot best-fit line or polynomial curve. In chapter 3, I showed you how to use regression when you control every step of the machine-learning process, including using NumPy to generate fake data points that nicely fit a linear function (a line) or a polynomial function (a curve). But what happens in real life, when the data points don’t fit one of the patterns you’ve seen before, such as the set of points shown in figure 4.1? Take a close look at figure 4.1. Is a linear regression model a good predictor here?

Figure 4.1 A set of data points with the week of the year (0-51) on the x-axis and the normalized call volume (number of calls in a particular week divided by the maximum number of calls in any week) on the y-axis
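The normalization described in the caption of figure 4.1 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the chapter's own code: the function name `normalize_call_volume` is hypothetical, and the weekly counts are randomly generated placeholders rather than real 311 data.

```python
import numpy as np


def normalize_call_volume(weekly_calls):
    """Divide each week's call count by the busiest week's count.

    Hypothetical helper for illustration; the result lies in [0, 1],
    and the busiest week maps to exactly 1.0.
    """
    calls = np.asarray(weekly_calls, dtype=float)
    max_calls = calls.max()
    if max_calls == 0:
        raise ValueError("cannot normalize: every week has zero calls")
    return calls / max_calls


# Placeholder data: one synthetic call count per week (weeks 0-51).
# These numbers are made up and do not come from any 311 dataset.
weeks = np.arange(52)
rng = np.random.default_rng(0)
fake_calls = rng.integers(low=100, high=1000, size=52)

normalized = normalize_call_volume(fake_calls)
```

Dividing by the maximum keeps the shape of the data intact while putting the y-axis on a 0-to-1 scale, which tends to make it easier to compare weeks and to train a regressor on the values.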

4.1 What is 311?

4.2 Cleaning the data for regression

4.3 What’s in a bell curve? Predicting Gaussian distributions

4.4 Training your call prediction regressor

4.5 Visualizing the results and plotting the error

4.6 Regularization and training test splits

Summary