6 Case Study: Small Molecule Binding to an RNA Target

 

This chapter covers

  • An exemplary quantitative structure-activity relationship (QSAR) pipeline for understanding small molecule binding to RNA targets
  • Advanced molecular representation and descriptor calculation methods, especially when data are scarce
  • Representative data splitting with the Kennard-Stone algorithm and dimensionality reduction algorithms like principal component analysis (PCA)
  • Sequential ensemble learning with gradient boosting
  • Advanced methods for model-specific and model-agnostic interpretability

In drug discovery, targeting RNA (ribonucleic acid) with small molecules has emerged as a promising strategy for developing novel therapeutics. RNA is a versatile molecule that not only carries genetic information but also plays key regulatory roles in cells. The binding of a small molecule to an RNA target can lead to a variety of outcomes, such as inhibiting a specific RNA-protein interaction, altering RNA splicing patterns, or promoting RNA degradation. For example, small molecules that modulate RNA structures involved in cancer progression have been investigated as potential anticancer agents.

6.1 Small Molecule Binding to an RNA Target

6.1.1 The HIV-1 Trans-Activation Response (TAR) RNA Model System

6.1.2 Structure: Computing Descriptors

6.1.3 Activity: Experimentally Measuring Binding Profiles

6.2 Representative Data Splitting & Dimensionality Reduction

6.2.1 Data Refinement

6.2.2 Representative Data Splitting: Kennard-Stone Algorithm

6.2.3 Dimensionality Reduction: Principal Component Analysis

6.3 QSAR Modeling: Mapping Descriptors to Measurements

6.3.1 Exemplary QSAR Modeling Workflow

6.3.2 QSAR Model Interpretation

6.4 Gradient Boosting Machines

6.4.1 Informing the RNA-Binding Chemical Space

6.4.2 The XGBoost Magic Trick

6.4.3 Model-Agnostic Interpretation

6.5 Summary

6.6 Exercises

6.7 References