13 Training a classification model to detect suspected tumors
This chapter covers
- Using PyTorch DataLoaders to load data
- Implementing a model that performs classification on our CT data
- Setting up the basic skeleton for our application
- Adding logging and displaying metrics during training
In the previous chapters, we set the stage for our cancer-detection project. We covered medical details of lung cancer, took a look at the main data sources we will use for our project, and transformed our raw CT scans into a PyTorch Dataset instance. Now that we have a dataset, we can easily consume our training data. So let’s do that!
13.1 A foundational model and training loop
We’re going to do two main things in this chapter. We’ll start by building the nodule classification model and training loop that will be the foundation that the rest of part 2 uses to explore the larger project. To do that, we’ll use the Ct and LunaDataset classes we implemented in chapter 12 to feed DataLoader instances. Those instances, in turn, will feed our classification model with data via training and validation loops.
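To make that data flow concrete before we build the real thing, here is a minimal sketch of a Dataset feeding DataLoader instances, which in turn feed a training loop and a validation loop. Everything specific in it is an assumption made for illustration: ToyNoduleDataset is a hypothetical stand-in that returns random tensors rather than real CT data from LunaDataset, the tensor sizes and the tiny linear model are arbitrary, and run_epoch is a helper name invented here. It is not the model or training loop this chapter develops.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, Dataset


class ToyNoduleDataset(Dataset):
    """Hypothetical stand-in for LunaDataset: random 'CT chunks' with 0/1 labels."""

    def __init__(self, n_samples=32, seed=0):
        gen = torch.Generator().manual_seed(seed)
        # One channel and a small 3D volume per sample (sizes are illustrative).
        self.chunks = torch.randn(n_samples, 1, 8, 12, 12, generator=gen)
        self.labels = torch.randint(0, 2, (n_samples,), generator=gen)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, ndx):
        return self.chunks[ndx], self.labels[ndx]


def run_epoch(model, loader, loss_fn, optimizer=None):
    """One pass over loader: trains if an optimizer is given, otherwise validates."""
    training = optimizer is not None
    model.train(training)  # train mode when training, eval mode when validating
    total_loss, n_seen = 0.0, 0
    with torch.set_grad_enabled(training):
        for chunks, labels in loader:  # the DataLoader handles batching
            logits = model(chunks)
            loss = loss_fn(logits, labels)
            if training:
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
            total_loss += loss.item() * labels.size(0)
            n_seen += labels.size(0)
    return total_loss / n_seen  # mean per-sample loss for the epoch


# Placeholder classifier: flatten each chunk and map it to two class logits.
model = nn.Sequential(nn.Flatten(), nn.Linear(1 * 8 * 12 * 12, 2))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

train_dl = DataLoader(ToyNoduleDataset(32, seed=0), batch_size=8, shuffle=True)
val_dl = DataLoader(ToyNoduleDataset(16, seed=1), batch_size=8, shuffle=False)

history = []
for epoch in range(2):
    train_loss = run_epoch(model, train_dl, loss_fn, optimizer)
    val_loss = run_epoch(model, val_dl, loss_fn)  # no optimizer: validation only
    history.append((train_loss, val_loss))
    print(f"epoch {epoch}: train loss {train_loss:.4f}, val loss {val_loss:.4f}")
```

Because the labels here are random, the printed losses carry no meaning. The point is the shape of the pipeline: the same per-epoch routine consumes either loader, and only the training pass updates the model's weights.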
We’ll finish the chapter by using the results from running that training loop to introduce one of the hardest challenges in this part of the book: how to get high-quality results from messy, limited data. In later chapters, we’ll explore the specific ways in which our data is limited and work to mitigate those limitations.