This chapter covers
- What generative deep learning is, its applications, and how it differs from the deep-learning tasks we’ve seen so far
- How to generate text using an RNN
- What latent space is and how it can form the basis of generating novel images, through the example of variational autoencoders
- The basics of generative adversarial networks
Some of the most impressive feats of deep neural networks involve generating images, sounds, and text that look, sound, or read as if they were real. Deep neural networks can now create highly realistic images of human faces,[1] synthesize natural-sounding speech,[2] and compose compellingly coherent text,[3] to name just a few achievements. Such generative models are useful for a number of reasons, including aiding artistic creation, conditionally modifying existing content, and augmenting existing datasets to support other deep-learning tasks.[4]
[1] Tero Karras, Samuli Laine, and Timo Aila, “A Style-Based Generator Architecture for Generative Adversarial Networks,” submitted 12 Dec. 2018, https://arxiv.org/abs/1812.04948. See a live demo at https://thispersondoesnotexist.com/.
[2] Aäron van den Oord and Sander Dieleman, “WaveNet: A Generative Model for Raw Audio,” blog, 8 Sept. 2016, http://mng.bz/MOrn.
[3] “Better Language Models and Their Implications,” OpenAI, 2019, https://openai.com/blog/better-language-models/.