This chapter covers:
- Using DataLoaders to load data
- Implementing a model that performs classification on our CT data from chapter 10
- Setting up the basic skeleton for our training and validation application
- Logging and displaying metrics to evaluate the model’s performance
In the previous chapters, we set the stage for our cancer detection project. We covered the medical details of lung cancer, took a look at the main data sources we will use for our project, and transformed our raw CT scans into a PyTorch Dataset. Now that we have a Dataset, we can easily consume our training data.
So let’s do that!
We’re going to do two main things in this chapter. First, we’ll build the classification model and training loop that the rest of Part 2 uses as the foundation for exploring the larger project. To do that, we’ll take the Ct and LunaDataset classes we implemented in the last chapter and use them to feed DataLoader instances. Those instances, in turn, feed data to our classification model via training and validation loops.
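Before we get to the real thing, here is a minimal sketch of that flow: a Dataset feeds two DataLoader instances, which feed a model inside a training loop and a validation loop. Everything specific in it is an assumption made for illustration: ToyCandidateDataset is a hypothetical stand-in for LunaDataset that returns random tensors, the one-layer model is a placeholder rather than the classifier we will build, and run_epoch, the batch size, and the learning rate are arbitrary choices.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, Dataset


class ToyCandidateDataset(Dataset):
    """Hypothetical stand-in for LunaDataset: random 'CT chunks' with 0/1 labels."""

    def __init__(self, n_samples=32, seed=0):
        gen = torch.Generator().manual_seed(seed)
        # Shape is (N, channels, depth, height, width); the sizes are arbitrary.
        self.chunks = torch.randn(n_samples, 1, 8, 8, 8, generator=gen)
        self.labels = torch.randint(0, 2, (n_samples,), generator=gen)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, ndx):
        return self.chunks[ndx], self.labels[ndx]


def run_epoch(model, loader, loss_fn, optimizer=None):
    """One pass over loader. Trains if an optimizer is given, otherwise validates."""
    is_training = optimizer is not None
    model.train(is_training)  # switches between train and eval mode
    total_loss, n_seen = 0.0, 0
    with torch.set_grad_enabled(is_training):
        for chunks, labels in loader:
            loss = loss_fn(model(chunks), labels)
            if is_training:
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
            # Weight each batch's mean loss by its size to get a per-sample average.
            total_loss += loss.item() * labels.size(0)
            n_seen += labels.size(0)
    return total_loss / n_seen


torch.manual_seed(0)
train_dl = DataLoader(ToyCandidateDataset(32, seed=0), batch_size=8, shuffle=True)
val_dl = DataLoader(ToyCandidateDataset(16, seed=1), batch_size=8)

# Placeholder model: flatten each chunk and map it to two class scores.
model = nn.Sequential(nn.Flatten(), nn.Linear(8 * 8 * 8, 2))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(1, 3):
    train_loss = run_epoch(model, train_dl, loss_fn, optimizer)
    val_loss = run_epoch(model, val_dl, loss_fn)
    print(f"epoch {epoch}: train loss {train_loss:.4f}, val loss {val_loss:.4f}")
```

Because the toy labels are random, the losses printed here carry no meaning. The point is only the shape of the pipeline: the same per-epoch function serves both loops, and the validation pass runs without gradient tracking or weight updates.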
We’ll finish out the chapter by using the results from running that training loop to introduce one of the hardest challenges of Part 2: how to get high-quality results from messy, limited data. In later chapters, we’ll explore the specific ways that our data is limited and work to mitigate those limitations.