Concept: latent space (category: GAN)


This is an excerpt from Manning's book GANs in Action: Deep learning with Generative Adversarial Networks.

A bit more formally, we take a certain prescription (z)—for this simple case, let’s say it is a number between 0 and 9—and try to arrive at a generated sample (x*). Ideally, this x* would look as realistic as a real sample, x. The prescription, z, lives in a latent space and serves as an inspiration, so that we do not always get the same output, x*. This latent space is a learned representation, ideally one that is meaningful to people in the way we ourselves think about the data (“disentangled”). Different models will learn a different latent representation of the same data.

The random noise vector we saw in chapter 1 is often referred to as a sample from the latent space. The latent space is a simpler, hidden representation of a data point. In our context, it is denoted by z, and simpler just means lower-dimensional—for example, a vector or array of 100 numbers rather than the 784 numbers that make up the samples we will use. In many ways, a good latent representation of a data point will group similar things close together in this space. We will get to what latent means in the context of an autoencoder in figure 2.3 and show you how this affects our generated samples in figures 2.6 and 2.7, but before we can do that, we’ll describe how autoencoders function.
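To make the idea of sampling from the latent space concrete, here is a minimal sketch. The 100-dimensional noise vector matches the number above, but the generator architecture is an illustrative assumption (a small, untrained network), not a model from the book:

```python
import numpy as np
from tensorflow.keras import layers, models

# A toy, untrained generator mapping a 100-dimensional latent vector z
# to a 784-dimensional output (a flattened 28 x 28 image).
generator = models.Sequential([
    layers.Input(shape=(100,)),
    layers.Dense(128, activation="relu"),
    layers.Dense(784, activation="sigmoid"),
])

# Sample z from the latent space (here, a standard normal distribution)
# and map it to a generated sample x*.
z = np.random.normal(size=(1, 100))
x_star = generator.predict(z)

print(z.shape)       # (1, 100) -- the simpler, lower-dimensional representation
print(x_star.shape)  # (1, 784) -- the same dimensionality as a real sample x
```

Every fresh draw of z yields a different x*, which is exactly why z is described as an inspiration rather than a fixed input.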

  • Latent space (z): As we train, we try to establish a latent space that has some meaning. The latent space is typically a lower-dimensional representation of the data and acts as an intermediate step between the encoder and the decoder. In this representation of our data, the autoencoder is trying to “organize its thoughts.”
  • Decoder network: Using the decoder, we reconstruct the object in its original dimensionality. This is typically done by a neural network that is a mirror image of the encoder. It is the step from z to x*: we apply the reverse of the encoding process to get back, for example, a 784-value reconstructed vector (a flattened 28 × 28 image) from the 256-value latent vector (see the code sketch below).
Figure 2.2. Using an autoencoder in our letter example follows these steps: (1) You compress all the things you know about being a machine learning engineer, and then (2) compose that into the letter to your grandmother (the latent space). When she, using her understanding of words as a decoder, (3) reconstructs a (lossy) version of what that means, you get a representation of the idea in the same space (your grandmother’s head) as the original input, which was your thoughts.
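The encoder/latent space/decoder division of labor can be sketched directly in code. The sizes (784 inputs, a 256-dimensional latent vector z, 784 outputs) follow the numbers above; the layer count, activations, and optimizer are illustrative assumptions, not the book’s exact model:

```python
from tensorflow.keras import layers, models

# Encoder: compress a 784-dimensional input (a flattened 28 x 28 image)
# down to a 256-dimensional latent vector z.
encoder = models.Sequential([
    layers.Input(shape=(784,)),
    layers.Dense(256, activation="relu"),
])

# Decoder: mirror image of the encoder, mapping z back up to 784 values (x*).
decoder = models.Sequential([
    layers.Input(shape=(256,)),
    layers.Dense(784, activation="sigmoid"),
])

# Autoencoder: x -> z -> x*, trained so that x* matches x as closely as possible.
inputs = layers.Input(shape=(784,))
z = encoder(inputs)
x_star = decoder(z)
autoencoder = models.Model(inputs, x_star)
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")
# autoencoder.fit(x_train, x_train, ...)  # note: the input is also the target
```

The key design point is the last commented line: the network is trained to reproduce its own input, so everything it needs to do that must be squeezed through the 256-dimensional bottleneck z.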

Interestingly, the point estimate will also be wrong and can even live in an area where there is no actual data sampled from the true distribution. When you look at the samples (black crosses), no real samples occur where we have estimated our mean. This is, again, quite troubling. To tie it back to the autoencoder, recall how in figure 2.6 we learned a 2D normal distribution in the latent space centered around the origin. But what if we had thrown images of celebrity faces into the training data? We would no longer have an easy center to estimate, because the combined data distribution would have more modes than we anticipated. As a result, even around the center of the distribution, the VAE could produce odd hybrids of the two datasets, because the VAE would try to somehow separate the two datasets.
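The failure of a single point estimate on multimodal data is easy to demonstrate numerically. The sketch below is an illustrative assumption (a toy 2D latent space with two well-separated clusters standing in for the two datasets), not code from the book; it shows that the estimated mean lands in a region containing no real samples at all:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two well-separated clusters in a toy 2D "latent space": one standing in
# for handwritten digits, the other for celebrity faces.
digits = rng.normal(loc=[-4.0, 0.0], scale=0.5, size=(500, 2))
faces = rng.normal(loc=[4.0, 0.0], scale=0.5, size=(500, 2))
data = np.vstack([digits, faces])

# The single point estimate (the mean) falls between the two modes ...
mean = data.mean(axis=0)
print("estimated mean:", mean)  # roughly [0, 0]

# ... in an empty region: the nearest real sample is far away compared
# with the spread (0.5) of either cluster.
nearest = np.linalg.norm(data - mean, axis=1).min()
print("distance to nearest real sample:", nearest)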
