2 Fully connected networks
This chapter covers
- Implementing a training loop in PyTorch
- Changing loss functions for regression and classification problems
- Implementing and training a fully connected network
- Training faster using smaller batches of data
Now that we understand how PyTorch gives us tensors to represent our data and parameters, we can progress to building our first neural networks. We start by showing how learning happens in PyTorch. As described in chapter 1, learning is based on the principle of optimization: we compute a loss that measures how well we are doing and use gradients to minimize that loss. This is how the parameters of a network are “learned” from the data, and it is also the basis of many other machine learning (ML) algorithms. For these reasons, optimization of loss functions is the foundation on which PyTorch is built. To implement any kind of neural network in PyTorch, we must therefore phrase the problem as an optimization problem (remember that this is also called function minimization).
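To make the loss-plus-gradients idea concrete before we build a full network, here is a minimal sketch of function minimization in PyTorch. The toy function `f`, the starting value, the learning rate, and the step count are all illustrative assumptions, not code from this book; the point is only the pattern of computing a loss, calling `backward()`, and taking a gradient step.

```python
import torch

# A toy loss function: f(x) = (x - 2)^2, which is minimized at x = 2.
# In later sections, the "loss" will measure a network's error on data.
def f(x):
    return (x - 2) ** 2

# A parameter to learn; requires_grad=True tells PyTorch to track gradients
x = torch.tensor([-3.0], requires_grad=True)
optimizer = torch.optim.SGD([x], lr=0.1)  # gradient descent on x

for step in range(50):
    optimizer.zero_grad()  # clear gradients left over from the previous step
    loss = f(x)            # compute the loss: how badly are we doing?
    loss.backward()        # compute the gradient of the loss w.r.t. x
    optimizer.step()       # move x a small step downhill

print(x.item())  # prints a value close to 2.0, the minimizer of f
```

Every training loop we write in this chapter follows this same skeleton; only the function being minimized changes, from a one-variable toy to a loss over a network's parameters and a dataset.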