7 Autoencoding and self-supervision
This chapter covers
- Training without labels
- Autoencoding to project data
- Constraining networks with bottlenecks
- Adding noise to improve performance
- Predicting the next item to make generative models
You now know several approaches to specifying a neural network for classification and regression problems. These are the classic machine learning problems, where for each data point x (e.g., a picture of a fruit), we have an associated answer y (e.g., fresh or rotten). But what if we do not have a label y? Is there any useful way for us to learn? You should recognize this as an unsupervised learning scenario.
People are interested in self-supervision because labels are expensive. It is often easy to collect lots of data, but labeling each data point takes real work. Think about a sentiment classification problem, where you try to predict whether a sentence expresses something positive (e.g., “I love this deep learning book I’m reading.”) or negative (e.g., “The author of this book is bad at making jokes.”). Reading one sentence, making a determination, and recording the answer is not hard. But to build a good sentiment classifier, you might need to label hundreds of thousands to millions of sentences. Do you really want to spend days or weeks labeling that many? If we could somehow learn without these labels, it would make our lives much easier.
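One answer, and the core idea of this chapter, is to make the data supervise itself: train a network to reproduce its own input x, so that x acts as its own label y. The following is a minimal sketch of that idea in PyTorch, using randomly generated stand-in data; the dataset, layer sizes, and training settings here are illustrative assumptions, not a recipe from later in the chapter.

```python
import torch
from torch import nn

# Stand-in unlabeled data: 1,000 points with 64 features each and no y.
# (Random data keeps the sketch short; the chapter uses real datasets.)
X = torch.randn(1000, 64)

# An autoencoder needs no label y: it learns to reconstruct its own input.
# The narrow middle layer (the "bottleneck") forces a compressed representation.
model = nn.Sequential(
    nn.Linear(64, 8),   # encoder: squeeze 64 features down to 8
    nn.Tanh(),
    nn.Linear(8, 64),   # decoder: rebuild all 64 features from the 8
)

loss_fn = nn.MSELoss()  # how far is the reconstruction from the original?
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(20):
    optimizer.zero_grad()
    reconstruction = model(X)
    loss = loss_fn(reconstruction, X)  # the input x is its own target
    loss.backward()
    optimizer.step()
```

Because the middle layer has only 8 units, the network cannot simply copy its input through; it must learn a compressed projection of the data, which is exactly the bottleneck idea listed at the start of this chapter.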