10 Bayesian neural networks: Bringing uncertainty to deep learning
This chapter covers
- Introducing uncertainty in neural network weights and biases
- MCMC and variational inference for Bayesian neural networks
- Dropout as a Bayesian inference technique
Neural networks are extraordinary function approximators. Given enough data and parameters, they can model complex data such as images, speech, language, protein structures, and more. Traditional neural networks, however, aren't probabilistic; they can't quantify their predictive uncertainty. Their parameters are fixed after training, and for each input the network produces a single prediction, with no measure of how confident it is in that prediction.
The Bayesian approach treats model parameters differently. We place a distribution over each unknown variable (parameter) and update that distribution in light of the data. When making predictions, a Bayesian model produces predictive distributions, not single numbers.
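Before tackling networks, it may help to see this workflow on the simplest possible model. The sketch below (plain NumPy; the prior, noise level, and true mean are arbitrary choices for illustration) places a normal prior over a single unknown mean, updates it with data via the conjugate normal-normal formulas, and reads off the posterior predictive distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Unknown mean mu with a Normal(0, 1) prior; the observation noise
# sigma and the true mean 1.2 are assumptions made for this sketch.
prior_mean, prior_var = 0.0, 1.0
sigma = 0.5
data = rng.normal(1.2, sigma, size=10)  # 10 noisy observations of mu

# Conjugate normal-normal update: the posterior over mu is available
# in closed form (precisions add; means combine precision-weighted).
post_var = 1.0 / (1.0 / prior_var + len(data) / sigma**2)
post_mean = post_var * (prior_mean / prior_var + data.sum() / sigma**2)

# The posterior predictive for a new observation is also normal; its
# variance adds parameter uncertainty to the observation noise.
pred_mean, pred_var = post_mean, post_var + sigma**2
print(f"posterior over mu: N({post_mean:.3f}, {post_var:.3f})")
print(f"predictive:        N({pred_mean:.3f}, {pred_var:.3f})")
```

For a model this simple the update has a closed form. Neural networks admit no such formula, which is why this chapter turns to approximate inference: MCMC, variational inference, and dropout.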
But a neural network is still a parameterized function: a collection of weights and biases that approximates an unknown relationship. Nothing prevents us from placing a prior over each parameter of a network and inferring a posterior. A Bayesian neural network does exactly this. Instead of committing to one set of parameters, it maintains a distribution over many plausible ones, which gives it the crucial ability to express uncertainty, especially in regions where data are noisy or sparse.
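To make this concrete, the sketch below builds a hypothetical one-hidden-layer network in NumPy and samples every weight and bias from a standard normal prior. We haven't conditioned on any data yet, so the spread it reports is the prior predictive distribution, but the mechanism is the same after inference: a distribution over parameters induces a distribution over predictions.

```python
import numpy as np

rng = np.random.default_rng(1)

def mlp(x, w1, b1, w2, b2):
    """One-hidden-layer network: linear -> tanh -> linear."""
    return np.tanh(x @ w1 + b1) @ w2 + b2

# Hypothetical sizes: 1 input, 16 hidden units, 1 output.
n_hidden, n_samples = 16, 200
x = np.linspace(-3, 3, 50).reshape(-1, 1)

# Draw many full parameter sets from a standard normal prior and
# record the prediction each plausible network makes.
preds = np.empty((n_samples, len(x)))
for s in range(n_samples):
    w1 = rng.normal(0.0, 1.0, (1, n_hidden))
    b1 = rng.normal(0.0, 1.0, n_hidden)
    w2 = rng.normal(0.0, 1.0, (n_hidden, 1))
    b2 = rng.normal(0.0, 1.0)
    preds[s] = mlp(x, w1, b1, w2, b2).ravel()

# The spread across sampled networks is the predictive uncertainty.
i = np.argmin(np.abs(x))  # grid point nearest x = 0
print(f"predictive mean near x=0: {preds[:, i].mean():.3f}")
print(f"predictive std  near x=0: {preds[:, i].std():.3f}")
```

Once we have a posterior, the loop is identical except that the parameter sets come from posterior samples rather than prior draws, and the predictive spread tightens wherever the data constrain the function.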