10 Bayesian neural networks: Bringing uncertainty to deep learning
This chapter covers
- Introducing uncertainty in neural network weights and biases
- MCMC and variational inference for Bayesian neural networks
- Dropout as a Bayesian inference technique
Neural networks are extraordinary function approximators. Given enough data and parameters, they can model complex data such as images, speech, language, protein structures, and more. Traditional neural networks, however, aren't probabilistic; they can't quantify their predictive uncertainty. Their parameters are fixed after training, and for each input the network produces a single prediction, with no measure of how confident it is in that prediction.
The Bayesian approach treats model parameters differently. We place a distribution over each unknown variable (parameter) and update that distribution in light of the data. When making predictions, a Bayesian model produces predictive distributions, not single numbers.
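Before tackling networks, it may help to see this workflow on the simplest possible model. The sketch below (plain NumPy; the prior, noise level, and true mean are arbitrary choices for illustration) places a normal prior over a single unknown mean, updates it with data via the conjugate normal-normal formulas, and reads off the posterior predictive distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Unknown mean mu with a Normal(0, 1) prior; the observation noise
# sigma and the true mean 1.2 are assumptions made for this sketch.
prior_mean, prior_var = 0.0, 1.0
sigma = 0.5
data = rng.normal(1.2, sigma, size=10)  # 10 noisy observations of mu

# Conjugate normal-normal update: the posterior over mu is available
# in closed form (precisions add; means combine precision-weighted).
post_var = 1.0 / (1.0 / prior_var + len(data) / sigma**2)
post_mean = post_var * (prior_mean / prior_var + data.sum() / sigma**2)

# The posterior predictive for a new observation is also normal; its
# variance adds parameter uncertainty to the observation noise.
pred_mean, pred_var = post_mean, post_var + sigma**2
print(f"posterior over mu: N({post_mean:.3f}, {post_var:.3f})")
print(f"predictive:        N({pred_mean:.3f}, {pred_var:.3f})")
```

For a model this simple the update has a closed form. Neural networks admit no such formula, which is why this chapter turns to approximate inference: MCMC, variational inference, and dropout.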
But a neural network is still a parameterized function: a collection of weights and biases that approximates an unknown relationship. Nothing prevents us from placing a prior over each parameter of a network and inferring a posterior. A Bayesian neural network does exactly this. Instead of committing to one set of parameters, it maintains a distribution over many plausible ones, which gives it the crucial ability to express uncertainty, especially in regions where data are noisy or sparse.
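To make this concrete, the sketch below builds a hypothetical one-hidden-layer network in NumPy and samples every weight and bias from a standard normal prior. We haven't conditioned on any data yet, so the spread it reports is the prior predictive distribution, but the mechanism is the same after inference: a distribution over parameters induces a distribution over predictions.

```python
import numpy as np

rng = np.random.default_rng(1)

def mlp(x, w1, b1, w2, b2):
    """One-hidden-layer network: linear -> tanh -> linear."""
    return np.tanh(x @ w1 + b1) @ w2 + b2

# Hypothetical sizes: 1 input, 16 hidden units, 1 output.
n_hidden, n_samples = 16, 200
x = np.linspace(-3, 3, 50).reshape(-1, 1)

# Draw many full parameter sets from a standard normal prior and
# record the prediction each plausible network makes.
preds = np.empty((n_samples, len(x)))
for s in range(n_samples):
    w1 = rng.normal(0.0, 1.0, (1, n_hidden))
    b1 = rng.normal(0.0, 1.0, n_hidden)
    w2 = rng.normal(0.0, 1.0, (n_hidden, 1))
    b2 = rng.normal(0.0, 1.0)
    preds[s] = mlp(x, w1, b1, w2, b2).ravel()

# The spread across sampled networks is the predictive uncertainty.
i = np.argmin(np.abs(x))  # grid point nearest x = 0
print(f"predictive mean near x=0: {preds[:, i].mean():.3f}")
print(f"predictive std  near x=0: {preds[:, i].std():.3f}")
```

Once we have a posterior, the loop is identical except that the parameter sets come from posterior samples rather than prior draws, and the predictive spread tightens wherever the data constrain the function.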