13 Combining Gaussian processes with neural networks

 

This chapter covers

  • The difficulty of processing complex, structured data with common covariance functions
  • Using neural networks to handle complex, structured data
  • Combining neural networks with GPs

In chapter 2, we learned that the mean and covariance functions of a Gaussian process (GP) act as prior information that we’d like to incorporate into the model when making predictions. For this reason, the choice of these functions greatly affects how the trained GP behaves. Consequently, if the mean and covariance functions are misspecified or inappropriate for the task at hand, the resulting predictions won’t be useful.
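To make this concrete, here's a minimal sketch of how the mean and covariance functions define a prior over functions, using a zero mean and the common RBF (squared-exponential) kernel in plain NumPy. The function and variable names here are our own illustration, not from the chapter:

```python
import numpy as np

def rbf_kernel(x1, x2, length_scale=1.0):
    """Squared-exponential (RBF) covariance between two scalar inputs."""
    return np.exp(-((x1 - x2) ** 2) / (2 * length_scale ** 2))

# Build the prior covariance matrix over a grid of test inputs.
xs = np.linspace(0, 5, 50)
K = np.array([[rbf_kernel(a, b) for b in xs] for a in xs])

# With a zero mean function, the GP prior over these inputs is the
# multivariate normal N(0, K); samples drawn from it show which
# functions the prior considers plausible before seeing any data.
rng = np.random.default_rng(0)
prior_samples = rng.multivariate_normal(np.zeros(len(xs)), K, size=3)
```

Swapping in a different kernel (or a nonzero mean) changes `K`, and with it the shape of the sampled functions — which is exactly why a misspecified covariance function leads to unhelpful predictions.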

As an example, remember that a covariance function, or kernel, expresses the correlation—that is, similarity—between two points. The more similar the two points are, the more likely they are to have similar values for the labels we’re trying to predict. In our housing price prediction example, similar houses are likely to go for similar prices.

How exactly does a kernel compute the similarity between any two given houses? Let’s consider two cases. In the first, a kernel considers only the color of the front door and outputs 1 for any two houses with the same door color and 0 otherwise. In other words, this kernel thinks two houses are similar if and only if their front doors are the same color.
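The door-color kernel described above can be sketched as a small Python function. The dictionary representation of a house is our own illustration:

```python
def door_color_kernel(house1, house2):
    """Toy kernel: two houses are maximally similar (1.0) if their
    front doors share a color, and completely dissimilar (0.0)
    otherwise. All other features are ignored."""
    return 1.0 if house1["door_color"] == house2["door_color"] else 0.0

blue_house = {"door_color": "blue", "bedrooms": 3}
other_blue = {"door_color": "blue", "bedrooms": 5}
red_house = {"door_color": "red", "bedrooms": 3}

door_color_kernel(blue_house, other_blue)  # same color -> 1.0
door_color_kernel(blue_house, red_house)   # different color -> 0.0
```

Note how crude this kernel is: the two blue houses are judged identical even though they differ in size, while the red house is judged entirely unlike them. This is the sense in which a poorly chosen covariance function throws away information the model needs.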

13.1 Data that contains structures

13.2 Capturing similarity within structured data

13.2.1 Using a kernel with GPyTorch

13.2.2 Working with images in PyTorch

13.2.3 Computing the covariance of two images

13.2.4 Training a GP on image data

13.3 Using neural networks to process complex structured data

13.3.1 Why use neural networks for modeling?

13.3.2 Implementing the combined model in GPyTorch

Summary