3 Customizing a Gaussian process with the mean and covariance functions

This chapter covers

Controlling the trend of a Gaussian process using mean functions
Controlling the smoothness of a Gaussian process using covariance functions
Learning the optimal hyperparameters of a Gaussian process using gradient descent

In chapter 2, we saw that the mean and covariance functions are the two core components of a Gaussian process (GP). Even though we used the zero mean and the RBF covariance function when implementing our GP, you can choose from many options when it comes to these two components.

By choosing a specific mean or covariance function, we are effectively specifying prior knowledge for our GP. Incorporating prior knowledge into prediction is something we have to do with any Bayesian model, including GPs. Although I say we "have to" do it, being able to incorporate prior knowledge into a model is always a good thing, especially in settings where data acquisition is expensive, such as Bayesian optimization.
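To make the idea of "choosing" these two components concrete, here is a minimal sketch in plain Python. It is not the library code used elsewhere in the book; the helper names (`zero_mean`, `constant_mean`, `rbf_kernel`, `gp_prior`) and the specific parameter values are assumptions made up for illustration. It evaluates a GP prior at three points under two different choices of mean and length scale, so you can see how each choice changes the prior belief.

```python
import math

# Hypothetical helpers for illustration only; not a library API.

def zero_mean(x):
    """Zero mean function: no prior belief about the trend."""
    return 0.0

def constant_mean(x, c=2.0):
    """Constant mean function: prior belief that values hover around c."""
    return c

def rbf_kernel(x1, x2, lengthscale=1.0, outputscale=1.0):
    """RBF covariance between two scalar inputs."""
    return outputscale * math.exp(-((x1 - x2) ** 2) / (2.0 * lengthscale ** 2))

def gp_prior(xs, mean_fn, cov_fn):
    """Return the prior mean vector and covariance matrix at the points xs."""
    mean = [mean_fn(x) for x in xs]
    cov = [[cov_fn(a, b) for b in xs] for a in xs]
    return mean, cov

xs = [0.0, 0.5, 1.0]

# Choice 1: zero mean, short length scale (assumed value 0.3).
mean_zero, cov_short = gp_prior(
    xs, zero_mean, lambda a, b: rbf_kernel(a, b, lengthscale=0.3)
)

# Choice 2: constant mean, long length scale (assumed value 3.0).
mean_const, cov_long = gp_prior(
    xs, constant_mean, lambda a, b: rbf_kernel(a, b, lengthscale=3.0)
)

print("short length scale, cov(x=0, x=1):", cov_short[0][2])
print("long length scale,  cov(x=0, x=1):", cov_long[0][2])
```

With the longer length scale, the covariance between the two endpoints is much closer to 1, which encodes a prior belief that the function varies slowly. Swapping the mean function shifts where the prior expects the function values to lie before any data is observed.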

3.1 Question: Why can’t you seem to change some people’s minds? Answer: Because of their priors.

3.2 Incorporating what you already know into a Gaussian process

3.3 Defining the functional trend with the mean function

3.3.1 Using the zero mean function as the base strategy

3.3.2 Using the constant function with gradient descent

3.3.3 Using the linear function with gradient descent

3.3.4 Using the quadratic function by implementing a custom mean function

3.4 Defining variability and smoothness with the covariance function

3.4.1 Setting the scales of the covariance function

3.4.2 Controlling smoothness with different covariance functions

3.4.3 Modeling different levels of variability with multiple length scales

3.5 Summary

3.6 Exercise