Chapter 19. Clustering based on distributions with mixture modeling

 

This chapter covers

  • Understanding mixture model clustering
  • Understanding the difference between hard and soft clustering
  • Building a Gaussian mixture model for clustering
  • Weighing the strengths and weaknesses of mixture model clustering

Our final stop on our tour of unsupervised learning techniques brings us to one more approach to finding clusters in data: mixture model clustering. Just like the other clustering algorithms we’ve covered, mixture model clustering aims to partition a dataset into a finite set of clusters.

In chapter 18, I showed you the DBSCAN and OPTICS algorithms, which find clusters by learning regions of high and low density in the feature space. Mixture model clustering takes yet another approach to identifying clusters. A mixture model is any model that describes a dataset as a combination of two or more probability distributions. In the context of clustering, mixture models identify clusters by fitting a finite number of probability distributions to the data and iteratively updating the parameters of those distributions until they best fit the underlying data. Each case is then assigned to the cluster of the distribution under which it is most likely (a hard cluster assignment); and because the model also estimates the probability of each case under every distribution, mixture models naturally support soft clustering as well. The most common form of mixture modeling is Gaussian mixture modeling, which fits Gaussian (or normal) distributions to the data.
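To make this concrete, here is a minimal sketch of Gaussian mixture clustering. It uses Python and scikit-learn’s GaussianMixture class purely for illustration (an assumption on my part; the worked examples later in this chapter are the authoritative implementation). The sketch fits three Gaussian components to simulated data, then extracts both hard cluster assignments and soft, per-cluster membership probabilities.

# A minimal sketch of Gaussian mixture clustering, assuming Python and
# scikit-learn; any EM-based mixture model implementation works the same way.
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

# Simulate data drawn from three Gaussian "blobs".
X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

# Fit a mixture of three Gaussians: the algorithm iteratively updates each
# component's mean, covariance, and mixing weight to best fit the data.
gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=42)
gmm.fit(X)

# Hard clustering: each case is assigned to its most likely component.
hard_labels = gmm.predict(X)

# Soft clustering: the probability of each case under each component.
soft_probs = gmm.predict_proba(X)

print(hard_labels[:5])
print(soft_probs[:5].round(3))

Notice that predict() returns a single cluster label per case (hard clustering), while predict_proba() returns a probability for each case under each component (soft clustering); this is exactly the distinction the chapter explores next.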

19.1. What is mixture model clustering?

 
 

19.2. Building your first Gaussian mixture model for clustering

 
 

19.3. Strengths and weaknesses of mixture model clustering

 
 

Summary

 
 

Solutions to exercises

 
 
 
 