chapter four

4 From pixels to pictures – generating images

 

This chapter covers

  • Generative AI vision models, their architectures, and key enterprise use cases
  • Using Stable Diffusion's GUI and APIs for image generation and editing
  • Using advanced editing techniques: inpainting, outpainting, and image variations
  • Practical image generation tips for enterprises to consider

Image generation is one of the many uses of Generative AI, which aims to create unique and realistic content from nothing more than a prompt. Enterprises are increasingly adopting Generative AI to build image generation and editing solutions. This enables many new use cases, from AI-assisted architectural design to fashion design, avatar generation, virtual clothing try-on, and virtual patients for medical training, to name a few. Alongside these use cases are products that enable some of them, such as Microsoft Designer and Adobe Firefly, which we will cover in this chapter.

In the preceding chapters, we covered the fundamentals of Generative AI and the technology that enables us to generate text, including completions and chats. In this chapter, we shift gears and explore how Generative AI can be used to produce and edit images. We will see that creating an image is simple, and we will highlight some of the complexities of getting it right.

4.1 Vision Models

4.1.1 Variational Autoencoders (VAEs)

4.1.2 Generative Adversarial Networks (GANs)

4.1.3 Vision Transformer Models (ViT)

4.1.4 Diffusion models

4.1.5 Multi-Modal Models

4.2 Image Generation with Stable Diffusion

4.2.1 Dependencies

4.2.2 Generating an Image

4.3 Image generation with other providers

4.3.1 OpenAI DALL·E 3

4.3.2 Bing Image Creator

4.3.3 Adobe Firefly

4.4 Editing and enhancing images using Stable Diffusion

4.4.1 Generating using Image-to-Image API

4.4.2 Using the Masking API

4.4.3 Resize using the Upscale API

4.4.4 Image Generation Tips

4.5 Summary

4.6 References