This chapter covers:
- Loading and processing our raw data files; these are the annotations that describe the location of potentially malignant parts of a CT scan
- Implementing a Python class to represent our data to the rest of our project; for us, this will be the Ct class
- Converting our data into a format usable by PyTorch by implementing a Dataset subclass; the LunaDataset class will combine the CT and annotation data and convert it into tensors
- Visualizing the data we will be using as training and validation data for the project
Now that we’ve covered the larger project for part 2, let’s get into specifics about what we’re going to do here in chapter 10. It’s time to implement basic data loading and processing routines for our raw data. Nearly every significant project you work on will need something analogous to what we cover here.[95] Figure 10.1 shows the high-level map of our project from chapter 9. We’ll focus on step 1, data loading, for the rest of this chapter.