Machine Learning with R, the tidyverse, and mlr

This is an excerpt from Manning's book Machine Learning with R, the tidyverse, and mlr.

18.1.2. How does the OPTICS algorithm learn?

In this section, I’ll show you how the OPTICS algorithm learns regions of high density in a dataset, how it’s similar to DBSCAN, and how it differs. Technically speaking, OPTICS isn’t a clustering algorithm. Instead, it orders the cases in the data in such a way that we can extract clusters from that ordering. That sounds a little abstract, so let’s walk through how OPTICS works.

The DBSCAN algorithm has one important drawback: it struggles to identify clusters whose densities differ from one another. The OPTICS algorithm is an attempt to alleviate that drawback and identify clusters with varying densities. It does this by allowing the search radius around each case to expand dynamically instead of being fixed at a predetermined value.
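To make the idea of an expanding search radius concrete, here is a minimal Python sketch. It assumes the usual OPTICS notion of a core distance: the radius around a case has to grow until it contains min_pts cases, counting the case itself (a convention that varies between implementations). The function name core_distance, the parameters min_pts and eps, and the toy data are all illustrative and not taken from the book's code.

```python
import math

def core_distance(data, i, min_pts, eps=math.inf):
    """Distance from case i to its min_pts-th nearest case (itself included).

    Returns None when fewer than min_pts cases lie within eps,
    meaning case i is not a core point at that maximum radius.
    """
    # Sorted distances from case i to every case; the first entry is 0.0 (itself).
    dists = sorted(math.dist(data[i], q) for q in data)
    d = dists[min_pts - 1]
    return d if d <= eps else None

# Hypothetical toy data: one tight cluster and one loose cluster.
dense = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
sparse = [(10.0, 10.0), (12.0, 10.0), (10.0, 12.0), (12.0, 12.0)]
data = dense + sparse

dense_core = core_distance(data, 0, min_pts=3)   # small radius suffices
sparse_core = core_distance(data, 4, min_pts=3)  # radius must grow much further
print(dense_core, sparse_core)
```

In this toy example the radius needed in the loose cluster is about twenty times larger than in the tight one, which is why a single fixed radius of the DBSCAN kind cannot suit both.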

Once all cases have been visited, the algorithm returns both the processing order (the order in which each case was visited) and the reachability score of each case. If we plot processing order against reachability score, we get something like the top plot in figure 18.6. To generate this plot, I applied the OPTICS algorithm to a simulated dataset with four clusters (you can find the code to reproduce this at www.manning.com/books/machine-learning-with-r-the-tidyverse-and-mlr). Notice the four shallow troughs in the plot, each separated by spikes of high reachability. Each trough corresponds to a region of high density, while each spike indicates that these regions are separated by a region of low density.

Figure 18.6. Reachability plot of a simulated dataset. The top plot shows the reachability plot learned by the OPTICS algorithm from the data shown in the bottom plot. The plots are shaded to indicate where each cluster in the feature space maps onto the reachability plot.
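The relationship between processing order, reachability score, and troughs can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the code behind figure 18.6: it assumes the standard definition in which a case's reachability is the larger of its predecessor's core distance and its distance to that predecessor, it stores an undefined reachability as infinity, and it uses a naive split_at_spikes helper (a hypothetical name) in place of the cluster-extraction methods real libraries provide. The toy data has two clusters rather than four.

```python
import math

def optics(data, min_pts, eps=math.inf):
    """Return (order, reach): the processing order and each case's reachability."""
    n = len(data)
    dist = [[math.dist(p, q) for q in data] for p in data]
    reach = [math.inf] * n   # undefined reachability is stored as infinity
    done = [False] * n
    order = []

    def core(i):
        # Radius needed to enclose min_pts cases (case i counts as its own neighbor).
        d = sorted(dist[i])[min_pts - 1]
        return d if d <= eps else None

    def update(i, seeds):
        # Lower the reachability of unvisited cases that case i can reach.
        c = core(i)
        if c is None:
            return
        for j in range(n):
            if not done[j] and dist[i][j] <= eps:
                new_reach = max(c, dist[i][j])
                if new_reach < reach[j]:
                    reach[j] = new_reach
                    seeds.add(j)

    for start in range(n):
        if done[start]:
            continue
        done[start] = True
        order.append(start)
        seeds = set()
        update(start, seeds)
        while seeds:
            # Always visit the most reachable unvisited case next.
            nxt = min(seeds, key=lambda j: (reach[j], j))
            seeds.remove(nxt)
            done[nxt] = True
            order.append(nxt)
            update(nxt, seeds)
    return order, reach

def split_at_spikes(order, reach, threshold):
    """Naive extraction: start a new group whenever reachability spikes above threshold."""
    groups = []
    for i in order:
        if reach[i] > threshold or not groups:
            groups.append([])
        groups[-1].append(i)
    return groups

# Hypothetical toy data: a tight cluster followed by a looser one.
tight = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
loose = [(10.0, 10.0), (11.0, 10.0), (10.0, 11.0), (11.0, 11.0)]
data = tight + loose

order, reach = optics(data, min_pts=3)
for i in order:
    print(i, reach[i])   # the rows of a reachability plot, in processing order
groups = split_at_spikes(order, reach, threshold=5.0)
print(groups)
```

Reading the printed values in processing order gives a low trough for the tight cluster, a single spike where the algorithm jumps across the gap, and then a second, shallower trough for the loose cluster. Splitting at the spike recovers both clusters even though their densities differ by a factor of ten. With noisy data, a naive split like this would also turn isolated cases into single-case groups.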