12 Transformers for Protein Structure Prediction
This chapter covers
- Predicting protein structure directly from sequence
- Why proteins fold into stable 3D shapes
- Generating structures with modern generative models
- Capturing long-range residue interactions with attention
- Using protein language models in downstream biology tasks
In Chapter 11, we used graph neural networks (GNNs) to predict how small molecules bind to proteins. We treated protein structures as fixed scaffolds and relied on multiple sequence alignments and position-specific scoring matrices to determine contacts between residues and build a protein graph. As we saw, predicting drug-target binding affinity depends critically on knowing the three-dimensional structure of the protein.
The importance of protein structure prediction extends to drug discovery as a whole. For example, when investigating why a cancer becomes resistant to treatment, we may need to understand how mutations alter the shape of the target protein. Structure determines function, and without structure, drug design drifts toward guesswork.