8 Simplicity Hidden in Complexity
This chapter covers
- Compression reveals hidden structure
- Complexity peaks between order and noise
- Simplicity beats overfitting
- Grokking: memorize, then simplify
- Are language models just blurry JPEGs of the Internet?
Ilya Sutskever has argued that any good prediction model is implicitly a good compressor, and vice versa.[1] In one talk, he cited 2017 research on the "sentiment neuron," in which OpenAI researchers, Sutskever among them, trained an LSTM to predict the next character of Amazon product reviews and discovered that a single neuron had come to capture each review's sentiment.[2] The sentiment neuron arises because sentiment is genuinely predictive of the characters to come: a model squeezed for next-character accuracy is therefore pushed to encode sentiment as a compact latent variable.
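To make the mechanism concrete, here is a minimal sketch of that setup, not a reproduction of the original experiment: the 2017 model was a 4096-unit multiplicative LSTM trained on roughly 82 million Amazon reviews, whereas this toy trains a small standard LSTM in PyTorch on two hypothetical strings and then probes which hidden unit correlates best with sentiment. The reviews, labels, and class names below are all illustrative.

```python
import torch
import torch.nn as nn

# Hypothetical toy data: each review is a string, each label is 0/1 sentiment.
reviews = ["great product, loved it", "terrible, broke in a day"]
labels = [1, 0]

chars = sorted(set("".join(reviews)))
stoi = {c: i for i, c in enumerate(chars)}

class CharLSTM(nn.Module):
    def __init__(self, vocab_size, hidden_size=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, 16)
        self.lstm = nn.LSTM(16, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, vocab_size)

    def forward(self, x):
        h, _ = self.lstm(self.embed(x))  # h: (batch, seq, hidden)
        return self.head(h), h           # next-char logits + hidden states

model = CharLSTM(len(chars))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Train purely on next-character prediction: no sentiment label is ever used.
for step in range(200):
    for text in reviews:
        ids = torch.tensor([[stoi[c] for c in text]])
        logits, _ = model(ids[:, :-1])
        loss = loss_fn(logits.squeeze(0), ids[0, 1:])
        opt.zero_grad()
        loss.backward()
        opt.step()

# Probe: after training, check each hidden unit's correlation with sentiment.
# In the OpenAI result, one unit's activation tracked sentiment so well that
# thresholding it alone yielded a strong classifier.
with torch.no_grad():
    finals = []
    for text in reviews:
        ids = torch.tensor([[stoi[c] for c in text]])
        _, h = model(ids)
        finals.append(h[0, -1])          # hidden state after the last char
    acts = torch.stack(finals)           # (num_reviews, hidden)
    y = torch.tensor(labels, dtype=torch.float)
    # Per-unit correlation with the label; the argmax is the "sentiment
    # neuron" candidate. (Two reviews make this a toy; the real probe used
    # thousands of labeled examples.)
    centered = acts - acts.mean(0)
    corr = (centered * (y - y.mean()).unsqueeze(1)).mean(0) / (
        acts.std(0) * y.std() + 1e-8)
    print("most sentiment-aligned unit:", corr.abs().argmax().item())
```

The key point survives the shrinkage: the training loop never sees a sentiment label, so any unit that ends up tracking sentiment did so only because sentiment helps predict the next character.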