chapter eight
8 Simplicity Hidden in Complexity
This chapter covers
- Compression reveals secret structure
- Complexity peaks between order and noise
- Simplicity beats overfitting
- Grokking: memorize, then simplify
- Are language models just blurry JPEGs of the Internet?
Sutskever has argued that any good prediction model is implicitly a good compressor, and vice versa.[1] In one talk, he cites 2017 research on the “sentiment neuron,” in which OpenAI researchers, including Sutskever himself, trained an LSTM to predict the next character in Amazon product reviews and discovered that a single neuron tracked each review’s sentiment.[2] Nobody told the model what sentiment was. The usual explanation is that sentiment helps with prediction: a review that opens warmly tends to continue warmly, so a model trained only to predict the next character is pushed to encode sentiment as a latent variable.
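The link between prediction and compression can be made concrete with a small sketch. Under an ideal entropy coder such as arithmetic coding, a symbol that the model predicted with probability p costs about -log2(p) bits, so summing that quantity over a text gives the size of the compressed file. The code below is only an illustration under stated assumptions: the function names `code_length_bits` and `uniform_bits`, the sample sentence, and the simple count-based character model are all hypothetical choices made for this example, not anything taken from Sutskever's talk or the sentiment neuron work.

```python
import math
from collections import defaultdict


def code_length_bits(text, order):
    """Ideal code length, in bits, of `text` under an adaptive character model.

    The model predicts each character from the previous `order` characters,
    using add-one (Laplace) smoothed counts gathered from the text seen so far.
    An arithmetic coder driven by these predictions would need roughly
    sum(-log2 p) bits, so better prediction means a shorter encoding.
    """
    # Assumption: encoder and decoder agree on the alphabet in advance.
    alphabet_size = len(set(text))
    counts = defaultdict(lambda: defaultdict(int))  # context -> char -> count
    totals = defaultdict(int)                       # context -> total count
    bits = 0.0
    for i, ch in enumerate(text):
        context = text[max(0, i - order):i]
        # Probability the model assigned to the character that actually came next.
        p = (counts[context][ch] + 1) / (totals[context] + alphabet_size)
        bits += -math.log2(p)
        # Update the model only after "encoding", as a decoder would.
        counts[context][ch] += 1
        totals[context] += 1
    return bits


def uniform_bits(text):
    """Code length when every character in the alphabet is equally likely."""
    return len(text) * math.log2(len(set(text)))


if __name__ == "__main__":
    sample = "the cat sat on the mat. " * 40  # hypothetical toy corpus
    print(f"no prediction (uniform): {uniform_bits(sample):8.0f} bits")
    print(f"order-0 model:           {code_length_bits(sample, 0):8.0f} bits")
    print(f"order-2 model:           {code_length_bits(sample, 2):8.0f} bits")
```

On this repetitive sample, the model that conditions on the two preceding characters should need the fewest bits, because it guesses the next character more confidently and more often correctly. The same accounting applies to a neural network: any structure it learns that improves next-character prediction, sentiment included, translates directly into a shorter encoding.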