Chapter 19. Clustering based on distributions with mixture modeling

 

This chapter covers

  • Understanding mixture model clustering
  • Understanding the difference between hard and soft clustering
  • Building a Gaussian mixture model for clustering
  • Weighing the strengths and weaknesses of mixture model clustering

Our final stop on our tour of unsupervised learning techniques brings us to one more approach to finding clusters in data: mixture model clustering. Just like the other clustering algorithms we’ve covered, mixture model clustering aims to partition a dataset into a finite set of clusters.

In chapter 18, I showed you the DBSCAN and OPTICS algorithms, which find clusters by learning regions of high and low density in the feature space. Mixture model clustering takes yet another approach to identifying clusters. A mixture model is any model that describes a dataset as a combination of two or more probability distributions. In the context of clustering, mixture models identify clusters by fitting a finite number of probability distributions to the data and iteratively updating the parameters of those distributions until they best fit the underlying data. Each case is then assigned to the cluster of the distribution under which it is most likely (a hard cluster assignment); and because the model also estimates the probability of each case under every distribution, mixture models naturally support soft clustering as well. The most common form of mixture modeling is Gaussian mixture modeling, which fits Gaussian (or normal) distributions to the data.
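To make this concrete, here is a minimal sketch of Gaussian mixture clustering. It uses Python and scikit-learn’s GaussianMixture class purely for illustration (an assumption on my part; the worked examples later in this chapter are the authoritative implementation). The sketch fits three Gaussian components to simulated data, then extracts both hard cluster assignments and soft, per-cluster membership probabilities.

# A minimal sketch of Gaussian mixture clustering, assuming Python and
# scikit-learn; any EM-based mixture model implementation works the same way.
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

# Simulate data drawn from three Gaussian "blobs".
X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

# Fit a mixture of three Gaussians: the algorithm iteratively updates each
# component's mean, covariance, and mixing weight to best fit the data.
gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=42)
gmm.fit(X)

# Hard clustering: each case is assigned to its most likely component.
hard_labels = gmm.predict(X)

# Soft clustering: the probability of each case under each component.
soft_probs = gmm.predict_proba(X)

print(hard_labels[:5])
print(soft_probs[:5].round(3))

Notice that predict() returns a single cluster label per case (hard clustering), while predict_proba() returns a probability for each case under each component (soft clustering); this is exactly the distinction the chapter explores next.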

19.1. What is mixture model clustering?

 
 

19.2. Building your first Gaussian mixture model for clustering

 
 

19.3. Strengths and weaknesses of mixture model clustering

 
 

Summary

 
 

Solutions to exercises

 
 
 
 