chapter eleven

11 Training A Classification Model To Detect Suspected Tumors

 

This chapter covers:

  • Using DataLoaders to load data
  • Implementing a model that performs classification on our CT data from chapter 10
  • Setting up the basic skeleton for our training and validation application
  • Logging and displaying metrics to evaluate the model’s performance

In the previous chapters we set the stage for our cancer detection project. We covered medical details of lung cancer, took a look at the main data sources we will use for our project, and transformed our raw CT scans into a PyTorch Dataset. Now that we have a Dataset, we can easily consume our training data.

So let’s do that!

We’re going to do two main things in this chapter. We’ll start by building the classification model and training loop that the rest of Part 2 will use as the foundation for exploring the larger project. To do that, we’ll take the Ct and LunaDataset classes we implemented in the last chapter and use them to feed DataLoader instances. Those instances, in turn, feed our classification model with data via training and validation loops.
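
To make that data flow concrete before we build the real thing, here is a minimal sketch of a Dataset feeding DataLoader instances, which in turn feed a model through training and validation passes. Everything project-specific in it is an assumption made for illustration: ToyCandidateDataset is a hypothetical stand-in for LunaDataset that returns random tensors, the single linear layer is a placeholder rather than the model this chapter develops, and run_epoch is a made-up helper name. Only the PyTorch calls themselves are real.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, Dataset


class ToyCandidateDataset(Dataset):
    """Hypothetical stand-in for LunaDataset: random 'CT chunks' with 0/1 labels."""

    def __init__(self, n_samples, seed):
        gen = torch.Generator().manual_seed(seed)
        # One channel and a tiny 8x8x8 volume per sample (sizes are illustrative).
        self.chunks = torch.randn(n_samples, 1, 8, 8, 8, generator=gen)
        self.labels = torch.randint(0, 2, (n_samples,), generator=gen)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, ndx):
        return self.chunks[ndx], self.labels[ndx]


def run_epoch(model, loader, loss_fn, optimizer=None):
    """One pass over loader: trains if an optimizer is given, otherwise validates."""
    is_training = optimizer is not None
    model.train(is_training)
    total_loss, n_seen = 0.0, 0
    # Gradients are only tracked during the training pass.
    with torch.set_grad_enabled(is_training):
        for chunks, labels in loader:
            logits = model(chunks)
            loss = loss_fn(logits, labels)
            if is_training:
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
            # loss is a per-batch mean, so weight it by the batch size.
            total_loss += loss.item() * labels.size(0)
            n_seen += labels.size(0)
    return total_loss / n_seen, n_seen


torch.manual_seed(0)
train_dl = DataLoader(ToyCandidateDataset(32, seed=1), batch_size=8, shuffle=True)
val_dl = DataLoader(ToyCandidateDataset(16, seed=2), batch_size=8)

# Placeholder classifier: flatten each chunk and map it to two class scores.
model = nn.Sequential(nn.Flatten(), nn.Linear(8 * 8 * 8, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

train_loss, n_train = run_epoch(model, train_dl, loss_fn, optimizer)
val_loss, n_val = run_epoch(model, val_dl, loss_fn)
print(f"train loss {train_loss:.4f} over {n_train} samples")
print(f"val loss {val_loss:.4f} over {n_val} samples")
```

Sharing one function between the two passes is just a way to keep the sketch short; the point to notice is that the loops never touch the Dataset directly and only ever see the batches that the DataLoader hands them.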

We’ll finish the chapter by using the results of running that training loop to introduce one of the hardest challenges of Part 2: how to get high-quality results from messy, limited data. In later chapters, we’ll explore the specific ways in which our data is limited and work on mitigating those limitations.

11.1  The main entrypoint for our application

11.2  Pre-training setup and initialization

11.2.1  Initializing the model and optimizer

11.2.2  Care and feeding of DataLoaders

11.3  Our first-pass neural network design

11.3.1  The Core Convolutions

11.3.2  The Full Model

11.4  Training and validating the model

11.4.1  Deleting the loss variable

11.4.2  The computeBatchLoss function

11.4.3  The validation loop is similar

11.5  Outputting performance metrics

11.5.1  The logMetrics function

11.6  Running the training script

11.6.1  Needed data for training

11.6.2  Interlude: the enumerateWithEstimate function

11.8.1  Running TensorBoard