6 Bayesian tools for machine learning
This chapter covers
- Unsupervised machine learning models
- Bayes’ theorem, conditional probability, entropy, cross-entropy, and conditional entropy
- Maximum likelihood estimation (MLE) and maximum a posteriori (MAP) estimation of model parameters
- Evidence maximization
- Kullback–Leibler divergence (KLD)
- Gaussian mixture models (GMMs) and maximum likelihood estimation of their parameters
The Bayesian approach to statistics models the world by quantifying the uncertainties, prevailing beliefs, and prior knowledge about a system. This contrasts with the frequentist paradigm, where probability is measured strictly by observing a phenomenon repeatedly and recording the fraction of trials in which an event occurs. Machine learning, and unsupervised machine learning in particular, sits much closer to the Bayesian paradigm of statistics, which is the subject of this chapter.
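To make the contrast concrete, here is a minimal sketch (not from the chapter) that estimates a coin's bias both ways. The true bias of 0.7, the 20 flips, and the Beta(2, 2) prior are illustrative assumptions; the frequentist estimate is just the observed fraction of heads, while the Bayesian estimate blends that evidence with a prior belief that the coin is roughly fair.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical data: 20 flips of a biased coin (True = heads), with an
# assumed true P(heads) = 0.7.
flips = rng.random(20) < 0.7
heads, tails = flips.sum(), (~flips).sum()

# Frequentist estimate: the fraction of observed flips that came up heads.
p_frequentist = heads / (heads + tails)

# Bayesian estimate: start from a Beta(2, 2) prior (a mild belief that the
# coin is roughly fair) and update it with the observed counts. The
# posterior is Beta(2 + heads, 2 + tails); we report its mean.
alpha, beta = 2 + heads, 2 + tails
p_bayesian = alpha / (alpha + beta)

print(f"frequentist: {p_frequentist:.3f}, Bayesian posterior mean: {p_bayesian:.3f}")
```

With few flips, the prior pulls the Bayesian estimate toward 0.5; as the number of flips grows, the two estimates converge.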
In chapter 1, we primarily discussed supervised machine learning, where the training data is labeled: each training input is paired with a manually created desired output. Labeling training inputs is a manual, labor-intensive process and often the worst pain point in building a machine learning–based system. This has led to considerable recent interest in unsupervised machine learning, where we build a model from unlabeled training data. How is this done?