concept autograd in category deep learning

This is an excerpt from Manning's book Deep Learning with PyTorch.
JAX, a library by Google that was developed independently of TensorFlow, has started gaining traction as a NumPy equivalent with GPU, autograd, and JIT capabilities.
This is when PyTorch tensors come to the rescue, with a PyTorch component called autograd. Chapter 3 presented a comprehensive overview of what tensors are and what functions we can call on them. We left out one very interesting aspect, however: PyTorch tensors can remember where they come from, in terms of the operations and parent tensors that originated them, and they can automatically provide the chain of derivatives of such operations with respect to their inputs. This means we won’t need to derive our model by hand; given a forward expression, no matter how nested, PyTorch will automatically provide the gradient of that expression with respect to its input parameters.
Figure 5.10 The forward graph and backward graph of the model as computed with autograd
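To make this concrete, here is a small, self-contained sketch (the toy expression and variable names are illustrative, not one of the book's listings) of what autograd does for us: we mark a tensor with `requires_grad=True`, build an expression from it, and call `backward()` to have PyTorch populate the `.grad` attribute with the derivative of the expression with respect to that tensor.

```python
import torch

# Ask autograd to track the operations performed on x
x = torch.tensor(2.0, requires_grad=True)

# A forward expression built from x; PyTorch records the operations as a graph
y = 3 * x**2 + 5 * x

# Walk the graph backward to compute dy/dx and accumulate it into x.grad
y.backward()

print(x.grad)  # tensor(17.) since dy/dx = 6*x + 5 = 17 at x = 2
```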
In order to address this, PyTorch allows us to switch off autograd when we don’t need it, using the `torch.no_grad` context manager. We won’t see any meaningful advantage in terms of speed or memory consumption on our small problem. However, for larger models, the differences can add up. We can make sure this works by checking the value of the `requires_grad` attribute on the `val_loss` tensor:

```python
# In[16]:
def training_loop(n_epochs, optimizer, params, train_t_u, val_t_u,
                  train_t_c, val_t_c):
    for epoch in range(1, n_epochs + 1):
        train_t_p = model(train_t_u, *params)
        train_loss = loss_fn(train_t_p, train_t_c)

        with torch.no_grad():  #1
            val_t_p = model(val_t_u, *params)
            val_loss = loss_fn(val_t_p, val_t_c)
            assert val_loss.requires_grad == False  #2

        optimizer.zero_grad()
        train_loss.backward()
        optimizer.step()
```

Using the related `set_grad_enabled` context, we can also condition the code to run with autograd enabled or disabled, according to a Boolean expression--typically indicating whether we are running in training or inference mode. We could, for instance, define a `calc_forward` function that takes data as input and runs `model` and `loss_fn` with or without autograd according to a Boolean `is_train` argument:
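The listing that follows in the book is not reproduced in this excerpt, but a minimal sketch of such a function, based directly on the description above (reusing the `model`, `loss_fn`, and `params` names from the training loop; the `t_u`/`t_c` argument names are illustrative), could look like this:

```python
# Sketch only: run the forward pass and loss computation, tracking
# gradients only when is_train is True.
def calc_forward(t_u, t_c, is_train):
    with torch.set_grad_enabled(is_train):  # enables/disables autograd based on the flag
        t_p = model(t_u, *params)
        loss = loss_fn(t_p, t_c)
    return loss
```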

This is an excerpt from Manning's book Grokking Deep Learning.
Perhaps the most elegant part of this form of autograd is that it works recursively as well, because each vector calls .backward() on all of its self.creators:
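The class it refers to is not reproduced in this excerpt. As a rough, self-contained sketch of the recursive idea (heavily simplified relative to the book's `Tensor` class: only addition is supported and gradients are not accumulated), each tensor records the `creators` that produced it, and `backward()` hands the incoming gradient back to each of them, which recurse in turn:

```python
import numpy as np

class Tensor:
    def __init__(self, data, creators=None, creation_op=None):
        self.data = np.array(data)
        self.creators = creators        # parent tensors that produced this one
        self.creation_op = creation_op  # the operation that produced it
        self.grad = None

    def __add__(self, other):
        return Tensor(self.data + other.data,
                      creators=[self, other], creation_op="add")

    def backward(self, grad):
        self.grad = grad
        # The recursion: forward the gradient to every creator, each of which
        # calls backward() on its own creators, and so on up the graph.
        if self.creation_op == "add":
            for creator in self.creators:
                creator.backward(grad)

a = Tensor([1, 2, 3])
b = Tensor([4, 5, 6])
c = Tensor([7, 8, 9])
d = (a + b) + c
d.backward(Tensor(np.ones_like(d.data)))
print(a.grad.data)  # [1 1 1]: the gradient reached a through two levels of creators
```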