concept shape in category R

appears as: The shape, shape, shapes
Machine Learning with R, the tidyverse, and mlr

This is an excerpt from Manning's book Machine Learning with R, the tidyverse, and mlr.

When we measure a variable, it’s often desirable to examine the range of values taken on by the variable. We can do this, for example, using a histogram, where we plot the possible values of our variable against the frequency with which we observed each of them. The shape we get from plotting such a histogram represents the distribution of our variable and tells us information such as where our variable is centered, how dispersed it is, whether its values are symmetrically distributed around its center, and how many peaks it has.

We can summarize distributions of variables using a variety of statistics, such as those that summarize the central tendency of the distribution, those that summarize the dispersion, and those that summarize the shape and symmetry. Visually inspecting the distributions of our variables is important, however, to help us decide the best way to handle different variables.
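
As a rough illustration of the summaries described above, the following sketch uses only base R and the built-in iris data. The choice of Sepal.Length and the moment-based skewness formula are assumptions made for the example, not something the excerpt prescribes.

```r
# Variable chosen purely for illustration
x <- iris$Sepal.Length

# Central tendency
center <- c(mean = mean(x), median = median(x))

# Dispersion
spread <- c(sd = sd(x), iqr = IQR(x))

# Symmetry: moment-based skewness (0 for a perfectly symmetric sample)
skew <- mean((x - mean(x))^3) / mean((x - mean(x))^2)^1.5

# Visual check: hist() draws the plot and returns the bin counts
h <- hist(x, main = "Distribution of Sepal.Length", xlab = "Sepal.Length")

print(center)
print(spread)
print(skew)
```

The numeric summaries are compact, but the histogram is what reveals features such as multiple peaks that a mean or standard deviation would hide.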

Figure 2.3. The same scatter plot as in figure 2.1, with the Species variable mapped to the shape and col aesthetics
Listing 2.12. Mapping species to the shape and color aesthetics
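# First plot: each species drawn with a different point shape
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, shape = Species)) +
  geom_point() +
  theme_bw()

# Second plot: each species drawn in a different color
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, col = Species)) +
  geom_point() +
  theme_bw()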
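
Listing 2.12 draws two separate plots, one per aesthetic. A single plot that maps Species to both aesthetics at once, as the caption of figure 2.3 describes, might look like the sketch below. The object name p is hypothetical, and the code assumes the ggplot2 package is installed.

```r
library(ggplot2)

# Map Species to both point shape and color in one plot
p <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width,
                      shape = Species, col = Species)) +
  geom_point() +
  theme_bw()

print(p)
```

Because both aesthetics are mapped to the same variable, ggplot2 merges them into a single legend.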

Each of the model types is represented by a different line in figure 19.3, and each has a strange three-letter code identifying it. The first letter of each code refers to the volume of each Gaussian, the second letter refers to the shape, and the third letter refers to the orientation. Each of these components can take one of the following:

  • Constraints can be placed on the covariance matrix to control the volume, shape, and orientation of the Gaussians.
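
One common way to make volume, shape, and orientation concrete is to factor a covariance matrix through its eigendecomposition, writing it as volume × orientation × shape × orientation-transposed. The sketch below does this in base R. The matrix values are made up for illustration, and treating this factorization as the one behind the three-letter codes is an assumption on our part.

```r
# A 2 x 2 covariance matrix chosen purely for illustration
sigma <- matrix(c(4, 1.5,
                  1.5, 1), nrow = 2, byrow = TRUE)

eig <- eigen(sigma, symmetric = TRUE)
d <- nrow(sigma)

# Volume: a single scalar, the d-th root of the determinant
volume <- prod(eig$values)^(1 / d)

# Shape: diagonal matrix of scaled eigenvalues, determinant 1
shape <- diag(eig$values / volume)

# Orientation: orthogonal matrix of eigenvectors
orientation <- eig$vectors

# Recombining the three components recovers the original matrix
rebuilt <- volume * orientation %*% shape %*% t(orientation)
print(rebuilt)
```

Constraining a component then amounts to forcing it to be equal across clusters or letting it vary, independently of the other two.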

Deep Learning with R

    This is an excerpt from Manning's book Deep Learning with R.

    Before training, we’ll preprocess the data by reshaping it into the shape the network expects and scaling it so that all values are in the [0, 1] interval. Previously, our training images, for instance, were stored in an array of shape (60000, 28, 28) of type integer with values in the [0, 255] interval. We transform it into a double array of shape (60000, 28 * 28) with values between 0 and 1.
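
The reshaping and scaling step can be sketched in base R on a small stand-in array. Everything here is an assumption for illustration: the data are random integers rather than real MNIST images, only 6 images are used instead of 60000, and the names images, flat, and scaled are hypothetical. The excerpt does not show which helper function the book itself uses for this step.

```r
set.seed(1)

# Stand-in for the training images: 6 integer images of 28 x 28 pixels
n <- 6
images <- array(sample(0:255, n * 28 * 28, replace = TRUE),
                dim = c(n, 28, 28))

# Flatten each image row by row into a vector of 28 * 28 values.
# apply() returns one column per image, so transpose to get (n, 784).
flat <- t(apply(images, 1, function(img) as.vector(t(img))))

# Dividing by 255 converts to double and rescales to the [0, 1] interval
scaled <- flat / 255

print(dim(scaled))
print(range(scaled))
```

Flattening row by row is a deliberate choice here. Assigning to dim() directly would also give a 6 × 784 matrix, but R fills arrays column by column, so the pixels of different images would be interleaved.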

    Images typically have three dimensions: height, width, and color depth. Although grayscale images (like our MNIST digits) have only a single color channel and could thus be stored in 2D tensors, by convention image tensors are always 3D, with a one-dimensional color channel for grayscale images. A batch of 128 grayscale images of size 256 × 256 could thus be stored in a tensor of shape (128, 256, 256, 1), and a batch of 128 color images could be stored in a tensor of shape (128, 256, 256, 3) (see figure 2.4).

    Figure 2.4. A 4D image data tensor (channels-first convention)

    There are two conventions for shapes of image tensors: the channels-last convention (used by TensorFlow) and the channels-first convention (used by Theano). The TensorFlow machine-learning framework, from Google, places the color-depth axis at the end: (samples, height, width, color_depth). Meanwhile, Theano places the color-depth axis right after the batch axis: (samples, color_depth, height, width). With the Theano convention, the previous examples would become (128, 1, 256, 256) and (128, 3, 256, 256). The Keras framework provides support for both formats.
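
Converting between the two conventions is a matter of permuting axes. A minimal base R sketch using aperm() follows. The batch is deliberately small (8 color images of 16 × 16), and the names batch_last and batch_first are hypothetical.

```r
# Channels-last batch: (samples, height, width, color_depth)
batch_last <- array(seq_len(8 * 16 * 16 * 3), dim = c(8, 16, 16, 3))

# Move the color-depth axis to sit right after the batch axis:
# (samples, color_depth, height, width)
batch_first <- aperm(batch_last, c(1, 4, 2, 3))

print(dim(batch_last))
print(dim(batch_first))

# The same pixel is reachable under both layouts
same_pixel <- batch_last[2, 5, 7, 3] == batch_first[2, 3, 5, 7]
print(same_pixel)
```

The permutation changes only how the values are indexed, not the values themselves.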

  • Vector data—2D tensors of shape (samples, features)
  • Timeseries data or sequence data—3D tensors of shape (samples, timesteps, features)
  • Images—4D tensors of shape (samples, height, width, channels) or (samples, channels, height, width)
  • Video—5D tensors of shape (samples, frames, height, width, channels) or (samples, frames, channels, height, width)
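
The shapes listed above map directly onto the dim attribute of R arrays. The sketch below builds one zero-filled array per data type. All sizes are arbitrary small values chosen for illustration, and the variable names are hypothetical.

```r
vector_data <- array(0, dim = c(10, 4))             # (samples, features)
series_data <- array(0, dim = c(10, 20, 4))         # (samples, timesteps, features)
image_data  <- array(0, dim = c(10, 32, 32, 3))     # (samples, height, width, channels)
video_data  <- array(0, dim = c(10, 5, 32, 32, 3))  # (samples, frames, height, width, channels)

# The number of axes of each tensor is the length of its dim attribute
ranks <- sapply(
  list(vector = vector_data, series = series_data,
       image = image_data, video = video_data),
  function(a) length(dim(a))
)
print(ranks)
```

In every case the first axis indexes the samples, which is why batches can be sliced the same way regardless of data type.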
    Listing 8.12. Running gradient ascent over different successive scales

R in Action, Second Edition: Data analysis and graphics with R

    This is an excerpt from Manning's book R in Action, Second Edition: Data analysis and graphics with R.

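    library(ggplot2)
    library(cluster)
    # df is assumed to be a data frame with numeric columns V1 and V2 (not shown in this excerpt)
    fit <- pam(df, k=2)                        # partition the data around two medoids
    df$clustering <- factor(fit$clustering)    # store cluster membership as a factor
    # map cluster membership to both the color and shape aesthetics
    ggplot(data=df, aes(x=V1, y=V2, color=clustering, shape=clustering)) +
       geom_point() + ggtitle("Clustering of Bivariate Normal Data")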

    The ggplot2 package provides methods for grouping and faceting. Grouping displays two or more groups of observations in a single plot. Groups are usually differentiated by color, shape, or shading. Faceting displays groups of observations in separate, side-by-side plots. The ggplot2 package uses factors when defining groups or facets.
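
The difference between grouping and faceting can be sketched with the built-in mtcars data. The dataset, the choice of cyl as the grouping factor, and the object names are all assumptions made for this example, and the code assumes the ggplot2 package is installed.

```r
library(ggplot2)

# ggplot2 uses factors to define groups, so convert cyl first
cars <- mtcars
cars$cyl <- factor(cars$cyl)

# Grouping: all cylinder groups in one plot, told apart by color and shape
grouped <- ggplot(cars, aes(x = wt, y = mpg, color = cyl, shape = cyl)) +
  geom_point()

# Faceting: one side-by-side panel per cylinder group
faceted <- ggplot(cars, aes(x = wt, y = mpg)) +
  geom_point() +
  facet_wrap(~ cyl)

print(grouped)
print(faceted)
```

Grouping makes direct comparison between groups easier, while faceting avoids overplotting when the groups overlap heavily.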

    Figure 19.9. Scatterplot of years since graduation and salary. Academic rank is represented by color, and sex is represented by shape.