11 Graph Neural Networks for Predicting Drug-Target Affinity
This chapter covers
- Transforming SMILES strings and proteins into graph representations.
- Mastering core GNN theory and message-passing mechanisms.
- Building a dual-stream GNN to predict drug-target binding affinity.
- Training and evaluating the model using real-world benchmark datasets.
- Interpreting the model's performance and prediction results.
In previous chapters, we treated molecules as sequences of characters in a SMILES string. While powerful, this simplification forces a linear structure onto objects that are inherently three-dimensional. A SMILES string does record which atoms are bonded, but only implicitly: branches and ring closures are flattened into parentheses and digits, so two atoms that share a bond can sit far apart in the string. A sequence model has to rediscover this topology on its own, which can impair its predictive power and the functional relevance of its learned representations.
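To make the contrast concrete, here is a minimal sketch that turns a SMILES string into an explicit atom list and edge list. The function `toy_smiles_to_graph` is a hypothetical helper written only for this illustration; it is not a library API and it is not how GraphDTA builds its graphs. It assumes a tiny SMILES subset: single-letter atoms, parenthesized branches, and single-digit ring closures. A real pipeline would use a cheminformatics toolkit instead.

```python
def toy_smiles_to_graph(smiles):
    """Parse a tiny SMILES subset into (atoms, edges).

    Illustrative only. Handles single-letter atoms, branches in
    parentheses, and single-digit ring closures. Bond orders, charges,
    aromaticity, and bracket atoms are NOT supported.
    """
    atoms = []          # atom symbols, indexed by order of appearance
    edges = []          # undirected bonds as (i, j) index pairs
    prev = None         # index of the atom the next atom will bond to
    branch_stack = []   # atoms to return to when a branch closes
    open_rings = {}     # ring-closure digit -> atom index that opened it

    for ch in smiles:
        if ch.isalpha():
            atoms.append(ch)
            idx = len(atoms) - 1
            if prev is not None:
                edges.append((prev, idx))
            prev = idx
        elif ch == "(":
            branch_stack.append(prev)
        elif ch == ")":
            prev = branch_stack.pop()
        elif ch.isdigit():
            if ch in open_rings:
                # Second occurrence of the digit closes the ring.
                edges.append((open_rings.pop(ch), prev))
            else:
                # First occurrence marks where the ring opens.
                open_rings[ch] = prev
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return atoms, edges


# Cyclohexane: the first and last carbon are bonded, even though they
# sit at opposite ends of the string.
atoms, edges = toy_smiles_to_graph("C1CCCCC1")
ring_bond = edges[-1]  # (0, 5)
```

In the edge list, the ring-closing bond between atoms 0 and 5 is stored exactly like every other bond. In the string, the same bond is implied only by a matching pair of digits separated by the rest of the ring.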
In this chapter, we embrace a more natural representation of molecules as graphs. We will tackle one of the most critical tasks in computational drug discovery: predicting the binding affinity between a drug and a target protein. This task is central to reducing the immense cost and time associated with de novo drug development.
Our starting point for this journey will be the influential GraphDTA model, which demonstrated the effectiveness of graph neural networks (GNNs) for learning from drug structures [1]. The original study showed that, by representing drugs as graphs, the model could outperform contemporary deep learning approaches that relied on 1D string representations.