concept `density` in category `R`

appears as: density, densities

Machine Learning with R, the tidyverse, and mlr

This is an excerpt from Manning's book Machine Learning with R, the tidyverse, and mlr.

Figure 4.10. Faceted plot of Survived against FamSize and Fare. Violin plots show the density of data along the y-axis. The lines on each violin represent the first quartile, median, and third quartile (from lowest to highest).

to see more go to Chapter 4. Classifying based on odds with logistic regression

In the ggplot() function call, we supply Survived as the x aesthetic and Value as the y aesthetic (coercing it into a numeric vector with as.numeric() because it was converted into a character by our gather() function call earlier). Next—and here’s the cool bit—we ask ggplot2 to facet by the Variable column, using the facet_wrap() function, and allow the y-axis to vary between the facets. Faceting allows us to draw subplots of our data, indexed by some faceting variable. Finally, we add a violin geometric object, which is similar to a box plot but also shows the density of data along the y-axis. The resulting plot is shown in figure 4.10.
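The call described above can be sketched roughly as follows. This is a minimal illustration rather than the book's listing: `toyData` is a hypothetical stand-in for the cleaned Titanic data, and the names `toyUntidy`, `toyNumeric`, and `violinPlot` are assumptions made for this example. The character `Sex` column is included only so that `gather()` produces a character `Value` column, which is why `as.numeric()` is needed.

```r
library(ggplot2)
library(tidyr)

# Hypothetical toy data standing in for the cleaned Titanic data (not the real dataset)
set.seed(1)
toyData <- data.frame(
  Survived = factor(sample(c("0", "1"), 60, replace = TRUE)),
  Sex      = sample(c("female", "male"), 60, replace = TRUE),
  FamSize  = rpois(60, lambda = 1),
  Fare     = rexp(60, rate = 1 / 30),
  stringsAsFactors = FALSE
)

# gather() stacks the predictors into Variable/Value pairs; because Sex is
# character, the numeric columns are coerced to character in Value as well
toyUntidy <- gather(toyData, key = "Variable", value = "Value", -Survived)

# Keep only the continuous predictors for the violin plot
toyNumeric <- toyUntidy[toyUntidy$Variable != "Sex", ]

# Survived on x, Value (converted back to numeric) on y, one facet per
# Variable, with a separate y-axis scale for each facet
violinPlot <- ggplot(toyNumeric, aes(Survived, as.numeric(Value))) +
  facet_wrap(~ Variable, scales = "free_y") +
  # draw_quantiles adds lines at the quartiles; recent ggplot2 releases may
  # emit a deprecation warning for this argument
  geom_violin(draw_quantiles = c(0.25, 0.5, 0.75)) +
  theme_bw()

print(violinPlot)
```

Setting `scales = "free_y"` matters here because `FamSize` and `Fare` are on very different scales; with a shared y-axis, the violins for the smaller-valued variable would be squashed flat.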

to see more go to 4.2.3. Plotting the data

In the last two chapters, we saw how k-means and hierarchical clustering identify clusters using distance: distance between cases, and distance between cases and their centroids. Density-based clustering comprises a set of algorithms that, as the name suggests, uses the density of cases to assign cluster membership. There are multiple ways of measuring density, but we can define it as the number of cases per unit volume of our feature space. Areas of the feature space containing many cases packed closely together can be said to have high density, whereas areas of the feature space that contain few or no cases can be said to have low density. Our intuition here states that distinct clusters in a dataset will be represented by regions of high density, separated by regions of low density. Density-based clustering algorithms attempt to learn these distinct regions of high density and partition them into clusters. Density-based clustering algorithms have several nice properties that circumvent some of the limitations of k-means and hierarchical clustering.
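To make "number of cases per unit volume" concrete, here is a small base-R sketch. It is not the DBSCAN or OPTICS algorithm itself, only one possible way to estimate local density: count the neighbors that fall within a fixed radius of each case and divide by the area of that disc. The simulated data, the radius `eps`, and all variable names are assumptions chosen for illustration.

```r
set.seed(42)

# Hypothetical 2-D feature space: one tightly packed group of cases
# plus a handful of cases scattered widely across the space
dense  <- matrix(rnorm(100, mean = 0, sd = 0.2), ncol = 2)  # 50 cases
sparse <- matrix(runif(40, min = -5, max = 5), ncol = 2)    # 20 cases
cases  <- rbind(dense, sparse)

eps <- 0.5  # assumed search radius around each case

# Pairwise Euclidean distances between all cases
distances <- as.matrix(dist(cases))

# Number of other cases within eps of each case (subtract 1 for the case itself)
neighborCount <- rowSums(distances <= eps) - 1

# Local density: neighbors per unit area of the eps-radius disc
localDensity <- neighborCount / (pi * eps^2)

# Average local density in the packed group versus the scattered cases
denseMean  <- mean(localDensity[1:50])
sparseMean <- mean(localDensity[51:70])
print(c(dense = denseMean, sparse = sparseMean))
```

The packed group should come out with a much higher average density than the scattered cases, which is the kind of contrast a density-based clustering algorithm relies on to separate clusters from the low-density space between them.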

to see more go to Chapter 18. Clustering based on density: DBSCAN and OPTICS
