5 Classification: Cytochrome P450 Inhibition

This chapter covers

  • How to use machine learning to model metabolism via Cytochrome P450 inhibition.
  • Fundamental classification models such as logistic regression and decision trees.
  • Ensemble learning with bagging and random forests.
  • Different evaluation methods for classification problems.
  • Additional methods for model interpretation and applicability domain assessment.

In the realm of ligand-based virtual screening, we’ve walked through modeling scenarios for similarity searching and screening for antimalarial hits (chapter 2), filtering out compounds due to hERG blockage (chapter 3), and assessing a compound’s absorption based on its solubility (chapter 4). Before we move on to tasks outside of ligand-based property prediction, we’ll direct our attention to one last challenge: metabolism!

In drug discovery, metabolism refers to the body’s ability to break down foreign molecules called xenobiotics, which include drugs. Drugs are metabolized so that the molecules and their effects can be eliminated from the body; metabolism is the main clearance pathway for 75 to 90% of all drugs. Naturally, the duration and intensity of a drug’s action depend on how the drug is metabolized. Metabolism also has implications for toxicity, as at least 7% of drug metabolites have known mechanisms that provoke adverse drug reactions [1].

5.1 Binary Classification of CYP3A4 Inhibition

5.1.1 Logistic Regression in Theory

5.1.2 Logistic Regression in Practice

5.1.3 (Mis)calibration: Questioning Probabilistic Output Assumptions

5.2 Tree-based Models

5.2.1 Decision Trees

5.2.2 Dealing with Data Set Imbalance

5.3 Ensemble Learning: A Preview

5.3.1 Decision Tree Bias-Variance Trade-offs

5.3.2 Random Forests: A Strong Collective of Weak Decision Trees

5.3.3 Multi-Instance Learning: Model-agnostic Ensemble Learning

5.4 Multiclass & Multilabel Classification

5.4.1 Multiclass Classification

5.4.2 Multilabel Classification

5.5 Summary

5.6 Exercises

5.7 References