
7 Generate high-resolution images with diffusion models


This chapter covers

  • The Denoising Diffusion Implicit Model noise scheduler
  • Adding the attention mechanism in denoising U-Net models
  • Generating high-resolution images with advanced diffusion models
  • Interpolating initial noise tensors to generate a series of images

In the previous two chapters, you gained a foundational understanding of diffusion models: how they add noise to clean images and then reverse that process to generate new images from pure noise. Using the denoising U-Net architecture, you saw how a model can be trained to transform pure noise into grayscale clothing-item images, step by step.
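As a quick refresher, the noise-adding (forward) process described above can be sketched in a few lines of NumPy. This is a minimal illustration assuming a linear beta schedule with 1,000 timesteps; the schedule and hyperparameters used in this book's projects may differ.

```python
import numpy as np

# Assumed linear noise schedule: beta grows from 1e-4 to 0.02 over T steps
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas_bar = np.cumprod(1.0 - betas)  # cumulative product, shrinks toward 0

def add_noise(x0, t, rng=None):
    """Return the noisy image x_t and the noise added to the clean image x0."""
    rng = np.random.default_rng(0) if rng is None else rng
    noise = rng.standard_normal(x0.shape)
    a_bar = alphas_bar[t]
    # x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise
    return np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * noise, noise
```

At small `t` the output stays close to the clean image; near `t = T - 1`, `alphas_bar[t]` is nearly zero, so the output is almost pure noise — exactly the signal the denoising U-Net learns to reverse.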

But what does it take to move from simple grayscale images to richly detailed, high-resolution color images? And how can we make these models not only more accurate but also faster and more efficient at generating such images? This chapter tackles these questions by introducing advanced tools and techniques that are now the backbone of state-of-the-art text-to-image generators.
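To preview one of these techniques: interpolating between two initial noise tensors lets a trained model generate a smooth series of in-between images. A common choice in diffusion pipelines is spherical linear interpolation (slerp), sketched below with NumPy; the chapter's own interpolation method may differ in detail.

```python
import numpy as np

def slerp(z1, z2, t):
    """Spherical linear interpolation between two noise tensors, t in [0, 1]."""
    z1f, z2f = z1.ravel(), z2.ravel()
    cos = np.dot(z1f, z2f) / (np.linalg.norm(z1f) * np.linalg.norm(z2f))
    theta = np.arccos(np.clip(cos, -1.0, 1.0))  # angle between the tensors
    s = np.sin(theta)
    if s < 1e-8:                 # nearly parallel: fall back to linear blend
        return (1.0 - t) * z1 + t * z2
    return (np.sin((1.0 - t) * theta) * z1 + np.sin(t * theta) * z2) / s

# A series of intermediate noise tensors between two random starting points;
# feeding each one through the trained denoiser yields a morphing image sequence.
rng = np.random.default_rng(0)
a, b = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))
frames = [slerp(a, b, t) for t in np.linspace(0.0, 1.0, 8)]
```

Slerp is preferred over plain linear interpolation here because it keeps the intermediate tensors on (approximately) the same spherical shell as Gaussian noise, which the model expects as input.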

7.1 Attention in U-Net, DDIM, and image interpolation

7.1.1 Incorporating the attention mechanism in the U-Net model

7.1.2 Denoising Diffusion Implicit Models

7.1.3 Image interpolation in diffusion models

7.2 High-resolution flower images as training data

7.2.1 Visualizing images in the training dataset

7.2.2 Applying forward diffusion on flower images

7.3 Building and training a U-Net for high-resolution images

7.3.1 Building the denoising U-Net model

7.3.2 Training the denoising U-Net model

7.4 Image generation and interpolation

7.4.1 Using the trained denoising U-Net to generate images

7.4.2 Transitioning from one image to another

Summary