3 Customizing a Gaussian process with the mean and covariance functions

This chapter covers

Controlling the expected behavior of a GP using mean functions
Controlling the smoothness of a GP using covariance functions
Learning the optimal hyperparameters of a GP using gradient descent

In chapter 2, we saw that the mean and covariance functions are the two core components of a Gaussian process (GP). Although we used the zero mean function and the RBF covariance function when implementing our GP, many other options are available for both components.

By choosing a specific mean or covariance function, we are effectively specifying prior knowledge for our GP. Incorporating prior knowledge into predictions is something we need to do with any Bayesian model, including GPs. Although I say we need to do it, being able to incorporate prior knowledge into a model is always a good thing, especially in settings where data acquisition is expensive, like BayesOpt.
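To make the idea of "choosing a prior" concrete, here is a minimal sketch in plain Python. It is not the implementation from chapter 2. The names zero_mean, constant_mean, rbf_kernel, and gp_prior are hypothetical helpers introduced only for illustration, and the constant 2.0 and the length scale 0.3 are arbitrary values. The sketch assumes the common form of the RBF covariance, output_scale * exp(-(x1 - x2)^2 / (2 * length_scale^2)). It shows that swapping the mean or covariance function changes the prior mean vector and covariance matrix before any data is observed.

```python
import math


def zero_mean(x):
    # Zero mean function: no prior belief about the level of the function.
    return 0.0


def constant_mean(x, c=2.0):
    # Constant mean function: prior belief that values hover around c.
    # The value 2.0 is an arbitrary choice for illustration.
    return c


def rbf_kernel(x1, x2, length_scale=1.0, output_scale=1.0):
    # RBF covariance (assumed form): nearby inputs are strongly correlated,
    # and the correlation decays faster when length_scale is small.
    return output_scale * math.exp(-((x1 - x2) ** 2) / (2.0 * length_scale ** 2))


def gp_prior(xs, mean_fn, cov_fn):
    # A GP prior at a finite set of inputs is a multivariate Gaussian
    # described by a mean vector and a covariance matrix.
    mean = [mean_fn(x) for x in xs]
    cov = [[cov_fn(a, b) for b in xs] for a in xs]
    return mean, cov


xs = [0.0, 0.5, 2.0]

# Prior 1: zero mean, RBF covariance with the default length scale.
mean_zero, cov_default = gp_prior(xs, zero_mean, rbf_kernel)

# Prior 2: constant mean, RBF covariance with a shorter length scale.
mean_const, cov_short = gp_prior(
    xs, constant_mean, lambda a, b: rbf_kernel(a, b, length_scale=0.3)
)

print("zero-mean prior mean:    ", mean_zero)
print("constant-mean prior mean:", mean_const)
print("cov(0.0, 0.5), length scale 1.0:", round(cov_default[0][1], 3))
print("cov(0.0, 0.5), length scale 0.3:", round(cov_short[0][1], 3))
```

With the shorter length scale, the covariance between the inputs 0.0 and 0.5 is smaller, so the prior allows the function to vary more quickly between them. The mean function, in contrast, only shifts where the prior expects the function values to lie.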

3.1 The importance of priors in Bayesian models

3.2 Incorporating what you already know into a GP

3.3 Defining the functional behavior with the mean function

3.3.1 Using the zero mean function as the base strategy

3.3.2 Using the constant function with gradient descent

3.3.3 Using the linear function with gradient descent

3.3.4 Using the quadratic function by implementing a custom mean function

3.4 Defining variability and smoothness with the covariance function

3.4.1 Setting the scales of the covariance function

3.4.2 Controlling smoothness with different covariance functions

3.4.3 Modeling different levels of variability with multiple length scales

3.5 Exercise

Summary